FLOPS

剧毒术士马文, 2017-07-13 (last updated: 2017-07-13)

Intel Core 2 and Nehalem:

4 DP FLOPs/cycle: 2-wide SSE2 addition + 2-wide SSE2 multiplication

8 SP FLOPs/cycle: 4-wide SSE addition + 4-wide SSE multiplication

Intel Sandy Bridge/Ivy Bridge:

8 DP FLOPs/cycle: 4-wide AVX addition + 4-wide AVX multiplication

16 SP FLOPs/cycle: 8-wide AVX addition + 8-wide AVX multiplication

Intel Haswell/Broadwell/Skylake/Kaby Lake:

16 DP FLOPs/cycle: two 4-wide FMA (fused multiply-add) instructions

32 SP FLOPs/cycle: two 8-wide FMA (fused multiply-add) instructions
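
The Intel figures above follow from simple arithmetic: each instruction issued in a cycle contributes one FLOP per vector lane for an addition or a multiplication, and two FLOPs per lane for an FMA. The sketch below reproduces that arithmetic; the helper name `flops_per_cycle` and the constants are illustrative assumptions, not part of any library.

```python
def flops_per_cycle(instructions):
    """Sum peak FLOPs/cycle over the instructions issued each cycle.

    Each entry is (lanes, flops_per_lane). An add or a multiply performs
    one FLOP per lane; an FMA performs two (a multiply and an add).
    """
    return sum(lanes * flops_per_lane for lanes, flops_per_lane in instructions)


ADD = MUL = 1  # one FLOP per lane
FMA = 2        # two FLOPs per lane

# One tuple per instruction that can issue in the same cycle.
core2_dp = flops_per_cycle([(2, ADD), (2, MUL)])    # 2-wide SSE2 add + mul
sandy_dp = flops_per_cycle([(4, ADD), (4, MUL)])    # 4-wide AVX add + mul
haswell_dp = flops_per_cycle([(4, FMA), (4, FMA)])  # two 4-wide FMAs
haswell_sp = flops_per_cycle([(8, FMA), (8, FMA)])  # two 8-wide FMAs

print(core2_dp, sandy_dp, haswell_dp, haswell_sp)  # 4 8 16 32
```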

Intel Xeon Phi (Knights Corner), per core:

16 DP FLOPs/cycle: 8-wide FMA every cycle

32 SP FLOPs/cycle: 16-wide FMA every cycle

Intel Xeon Phi (Knights Corner), per thread:

8 DP FLOPs/cycle: 8-wide FMA every other cycle

16 SP FLOPs/cycle: 16-wide FMA every other cycle

Intel Xeon Phi (Knights Landing), per core:

32 DP FLOPs/cycle: two 8-wide FMA every cycle

64 SP FLOPs/cycle: two 16-wide FMA every cycle
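
The Xeon Phi entries differ only in vector width and in how often an FMA can issue: once per cycle per core, once every other cycle for a single thread on Knights Corner, and twice per cycle per core on Knights Landing. A minimal sketch of that relationship, with `fma_flops_per_cycle` as a hypothetical helper name:

```python
from fractions import Fraction


def fma_flops_per_cycle(lanes, fmas_per_cycle):
    """Peak FLOPs/cycle for FMA-only code: 2 FLOPs per lane per FMA."""
    return lanes * 2 * Fraction(fmas_per_cycle)


# Knights Corner: 8 DP lanes (16 SP lanes) per vector.
knc_core_dp = fma_flops_per_cycle(8, 1)                 # one FMA every cycle
knc_thread_dp = fma_flops_per_cycle(8, Fraction(1, 2))  # one FMA every other cycle
knc_thread_sp = fma_flops_per_cycle(16, Fraction(1, 2))

# Knights Landing: two FMAs per cycle per core.
knl_core_dp = fma_flops_per_cycle(8, 2)
knl_core_sp = fma_flops_per_cycle(16, 2)

print(knc_core_dp, knc_thread_dp, knc_thread_sp, knl_core_dp, knl_core_sp)
# 16 8 16 32 64
```

On these figures, a Knights Corner core needs at least two active threads to reach its per-core peak.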

AMD K10:

4 DP FLOPs/cycle: 2-wide SSE2 addition + 2-wide SSE2 multiplication

8 SP FLOPs/cycle: 4-wide SSE addition + 4-wide SSE multiplication

AMD Bulldozer/Piledriver/Steamroller/Excavator, per module (two cores):

8 DP FLOPs/cycle: 4-wide FMA

16 SP FLOPs/cycle: 8-wide FMA

AMD Ryzen:

8 DP FLOPs/cycle: 4-wide FMA

16 SP FLOPs/cycle: 8-wide FMA

Intel Atom (Bonnell/45nm, Saltwell/32nm, Silvermont/22nm):

1.5 DP FLOPs/cycle: scalar SSE2 addition every cycle + scalar SSE2 multiplication every other cycle

6 SP FLOPs/cycle: 4-wide SSE addition every cycle + 4-wide SSE multiplication every other cycle

AMD Bobcat:

1.5 DP FLOPs/cycle: scalar SSE2 addition every cycle + scalar SSE2 multiplication every other cycle

4 SP FLOPs/cycle: 4-wide SSE addition every other cycle + 4-wide SSE multiplication every other cycle

AMD Jaguar:

3 DP FLOPs/cycle: 4-wide AVX addition every other cycle + 4-wide AVX multiplication every four cycles

8 SP FLOPs/cycle: 8-wide AVX addition every other cycle + 8-wide AVX multiplication every other cycle

ARM Cortex-A9:

1.5 DP FLOPs/cycle: scalar addition every cycle + scalar multiplication every other cycle

4 SP FLOPs/cycle: 4-wide NEON addition every other cycle + 4-wide NEON multiplication every other cycle
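
The fractional figures for the low-power cores above (Atom, Bobcat, Jaguar, Cortex-A9) come from instructions that cannot issue every cycle. An instruction with N lanes that issues once every C cycles contributes N/C FLOPs/cycle. The sketch below reproduces a few entries under that reading; the helper name and the tuple layout are assumptions made for illustration.

```python
from fractions import Fraction


def flops_per_cycle(instructions):
    """Sum lanes / cycles_per_issue over non-FMA instructions.

    Each entry is (lanes, cycles_per_issue): an add or multiply with
    `lanes` lanes that can start once every `cycles_per_issue` cycles.
    """
    return sum(Fraction(lanes, cycles) for lanes, cycles in instructions)


# Atom: add every cycle, multiply every other cycle.
atom_dp = flops_per_cycle([(1, 1), (1, 2)])    # scalar add + scalar mul
atom_sp = flops_per_cycle([(4, 1), (4, 2)])    # 4-wide add + 4-wide mul

# Bobcat SP: both 4-wide instructions issue every other cycle.
bobcat_sp = flops_per_cycle([(4, 2), (4, 2)])

# Jaguar: 4-wide DP add every other cycle, 4-wide DP mul every four cycles.
jaguar_dp = flops_per_cycle([(4, 2), (4, 4)])
jaguar_sp = flops_per_cycle([(8, 2), (8, 2)])

print(float(atom_dp), atom_sp, bobcat_sp, jaguar_dp, jaguar_sp)
# 1.5 6 4 3 8
```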

ARM Cortex-A15:

2 DP FLOPs/cycle: scalar FMA or scalar multiply-add

8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add

Qualcomm Krait:

2 DP FLOPs/cycle: scalar FMA or scalar multiply-add

8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add

IBM PowerPC A2 (Blue Gene/Q), per core:

8 DP FLOPs/cycle: 4-wide QPX FMA every cycle

SP elements are extended to DP and processed on the same units

IBM PowerPC A2 (Blue Gene/Q), per thread:

4 DP FLOPs/cycle: 4-wide QPX FMA every other cycle

SP elements are extended to DP and processed on the same units
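
The per-cycle figures in this list convert to a theoretical peak in FLOPS by multiplying by the clock frequency and the number of cores (or modules, or threads, matching the unit the figure is quoted in). A minimal sketch follows; the 3.0 GHz clock and the core count of 4 are hypothetical values chosen only to show the arithmetic, and it assumes the per-cycle figure is per core.

```python
def peak_gflops(flops_per_cycle, clock_ghz, units):
    """Theoretical peak in GFLOPS.

    flops_per_cycle: figure from the list, per core (or module/thread)
    clock_ghz:       clock frequency in GHz (hypothetical input)
    units:           number of cores (or modules/threads) of that kind
    """
    return flops_per_cycle * clock_ghz * units


# Hypothetical machine: 4 cores at 3.0 GHz, 16 DP FLOPs/cycle per core
# (the Haswell-class figure from the list).
example_dp = peak_gflops(16, 3.0, 4)
print(example_dp)  # 192.0
```

This is an upper bound: it assumes every cycle issues the full-width instructions named in the list, which real code rarely sustains.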
