K20: updated Kepler architecture
With the GTX 680 and GK104, nVidia unveiled the Kepler architecture, which had been presented years earlier as GPGPU-oriented. When launching the first Kepler GPU, however, nVidia presented it as gaming-oriented, claiming that a future GK110 would be launched for GPGPU and that GK104 is not “true Kepler”.
The difference between GK104 and GK110
In short: more L2 cache and more double-precision floating-point units, with exactly the same SMx architecture.
With more SMx units, each having 192 CUDA cores, GK110 (and the K20 card) will deliver more than 1 TFLOPS in double precision (DP), and probably around 4 TFLOPS in single precision (SP).
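As a rough check on those numbers, peak throughput can be estimated as units × 2 FLOPs per cycle (one fused multiply-add) × clock. The sketch below assumes illustrative values that are not from this post: 13 active SMx, a 0.706 GHz clock, and 64 double-precision units per SMx. Only the 192 CUDA cores per SMx comes from the text above.

```python
# Back-of-the-envelope peak throughput estimate.
# ASSUMPTIONS (illustrative, not from the post): SMX_COUNT, CLOCK_GHZ, DP_UNITS_PER_SMX.
SMX_COUNT = 13           # assumed number of active SMx on the card
CLOCK_GHZ = 0.706        # assumed core clock in GHz
SP_CORES_PER_SMX = 192   # from the text: 192 CUDA cores per SMx
DP_UNITS_PER_SMX = 64    # assumed double-precision units per SMx


def peak_gflops(smx_count: int, units_per_smx: int, clock_ghz: float) -> float:
    """Peak GFLOPS, counting one fused multiply-add as 2 FLOPs per unit per cycle."""
    return smx_count * units_per_smx * 2 * clock_ghz


sp_gflops = peak_gflops(SMX_COUNT, SP_CORES_PER_SMX, CLOCK_GHZ)
dp_gflops = peak_gflops(SMX_COUNT, DP_UNITS_PER_SMX, CLOCK_GHZ)

print(f"SP peak: {sp_gflops / 1000:.2f} TFLOPS")
print(f"DP peak: {dp_gflops / 1000:.2f} TFLOPS")
```

Under these assumptions the estimate lands at roughly 3.5 TFLOPS SP and a little over 1 TFLOPS DP. These are theoretical peaks; real kernels only approach them when they keep every unit busy.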
Is K20 the “true” Kepler?
With so few differences (essentially the L2 cache size) and exactly the same SMx organisation, GK110 has exactly the same problems that impair GK104 performance in GPGPU: too few registers and a limited L1 cache. The result is a GPU that will struggle with any complex workload such as LuxMark, but will shine on DGEMM, CUBLAS, etc.
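To put the register and L1 complaint into numbers, here is a small sketch comparing on-chip resources per CUDA core. The per-multiprocessor figures are assumptions taken from commonly published specifications, not from this post: a Fermi SM (GF100/GF110 class) with 32 cores, 32768 registers and 64 KB of L1/shared memory, against a Kepler SMx with 192 cores, 65536 registers and the same 64 KB.

```python
# Per-core on-chip resources, Fermi SM vs Kepler SMx.
# ASSUMPTION: per-multiprocessor figures below come from published specs, not from the post.
FERMI_SM = {"cores": 32, "registers": 32768, "l1_shared_kb": 64}
KEPLER_SMX = {"cores": 192, "registers": 65536, "l1_shared_kb": 64}


def per_core(sm: dict) -> dict:
    """Divide each per-multiprocessor resource by the number of cores sharing it."""
    return {
        "registers": sm["registers"] / sm["cores"],
        "l1_shared_kb": sm["l1_shared_kb"] / sm["cores"],
    }


fermi = per_core(FERMI_SM)
kepler = per_core(KEPLER_SMX)

for name, stats in (("Fermi SM", fermi), ("Kepler SMx", kepler)):
    print(f"{name}: {stats['registers']:.0f} registers/core, "
          f"{stats['l1_shared_kb']:.2f} KB L1+shared/core")
```

With these figures, each Kepler core has about a third of the registers and a sixth of the L1/shared memory that a Fermi core had. That is the kind of squeeze that hurts register-hungry kernels while leaving regular, well-tiled code such as DGEMM largely unaffected.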
Why I am disappointed
The peak performance of the $2000+ K20, expected in Q1 2013, should be on a par with the $459 Radeon HD 7970 from Q1 2012, while the limited register count and small L1 cache will put it in the same class as the $329 Radeon HD 7870!
The GP in GPGPU means General Purpose, and nVidia created this category with the GeForce 8800 (G80). Now nVidia is pushing Kepler only for very specific purposes, instead of enabling a new class of applications to be developed for the desktop, using nVidia GPUs to accelerate our daily work.