

No more CUDA in the tag line…

Yes, I removed CUDA from the tag line, and there are reasons behind this choice.

No, CUDA is not an obsolete or bad technology. In fact, CUDA is cutting-edge for GPGPU computing, and from my point of view, the best for High-Performance Computing, clusters, research, etc. There's no doubt for me that CUDA is, and will remain, the platform for HPC for the decade to come!

As a Mac user, all my Macs now have an AMD GPU and/or an Intel iGPU. My PC development box has an AMD GPU installed, and no nVidia GPU. That was my choice for this one, because I am switching to OpenCL.

I am not targeting HPC, I am targeting everyone's computer, and in that sense, CUDA is not appropriate anymore. OpenCL runs very well on Intel iGPUs, AMD GPUs and also nVidia GPUs. It also runs perfectly on AMD and Intel CPUs (using AVX2 on Haswell with OS X!).

Portability is a concern, and an open platform is also a plus (including for personal choices), so I am using OpenCL, and no more CUDA. Don't think CUDA is a bad technology: it's the best GPGPU solution for HPC, and the best proprietary GPGPU solution (but limited to nVidia GPUs, and that's the point!).


Fractal Forward

Fractal Forward is the name of my current Chess Engine. It's a strange beast: it doesn't do things the way they used to be done, and that's interesting in many ways.

Forward, like the preceding Chess Engine "Fast Forward", means going deep into the tree. That's totally classical, and you will find it in any major current Chess Engine, nothing to worry about; but at current GPU speeds, it goes deeper and deeper. Fast Forward is dumber than the other engines, though. It sees more, but doesn't understand. "The sage points to the moon and the idiot looks at the finger"?

Fractal because it considers the tree as dynamic, and moreover its understanding of the tree itself is dynamic, changing over time, over iterations, with each in-depth iteration being identical to its tree iteration. What does that mean in practice?

Every major Chess Engine today evaluates positions and intermediate nodes with algorithms different from the one used for the tree itself: position evaluation, quiescence search, quick exchange evaluation, selectively searching some nodes more in depth. Each of these is a different algorithm, while all clearly try to do the same thing: see deeper into the tree without parsing it. If Quick Exchange Evaluation, or any of the others, worked so well, why not use it at the root of the tree? Because they don't work at all; they just hide the fact that we don't have the processing resources to parse the tree and get a correct view of what's happening. They do marvels at this, but at the cost of algorithmic and implementation complexity, something that translates badly to GPU!

On the other side, if we have a good algorithm to traverse the tree, why not apply it recursively at each node? Recursion, with a deeper and deeper view? Exactly as we could view a sponge closer and closer: it's an endless task, but what's interesting is that if we have MORE processing power using GPU, and a simple, effective algorithm, then instead of implementing specific algorithms for differently characterized nodes of the tree, we could just throw the tree at it, and it will grow naturally with a simple, unique view that is homogeneous whether you are at the root or considering an 18-ply-deep move.

Interesting idea?


NSA unleashed

I have been spied on by local and foreign agencies since the mid-2000s; I had to strengthen security on my network, and to isolate some of my development computers. Given the recent information Edward Snowden has given us all, I might just have lost my time, since the NSA and other agencies have incredibly efficient spying tools.

I am going back to OpenCL development, and I don't want my developments handed to foreign companies, or even examined by foreign agencies…

Through nVidia K20x (and now K40) programs, I have received proposals to run my OpenCL developments on supercomputers based in the USA, and paid for by US agencies or the US army. They never proposed computers running in my country (Canada). If I had accepted, to put my (virtual) hands on a K20x or K40, I would have offered my code to any US agency, with a possible "leak" to US companies, which might use it, or even patent it on their own.

They are working hard to gain insight into my personal work. They might already be able to spy on my main computers. I am seriously thinking about setting up a strongly protected computer, non-networked, bought used, without sound I/O, to work on my personal projects in 2014. My code is MY code, not theirs. Happy 2014, NSA year!



2014 will be the OpenCL Year

I am back on OpenCL development. Having worked for a big Canadian media company in 2013, I will have time in 2014 to work on OpenCL, and I think it's time for OpenCL to go mainstream!

The signal is Apple's commitment to OpenCL technology, with the new Mac Pro and its dual GPUs. Maybe these 2 GPUs are overstated or overrated on many websites, with performance levels ranging from the (current) Radeon R9 280X ($300 street price) to the Radeon HD 7990 on a customized Mac Pro. This is not the level of performance you would expect from a $4000+ computer, especially with non-pro hardware (no ECC, for example).

The main point is software. Apple's new Final Cut knows how to use OpenCL across at least 2 GPUs, and this is big news, as it is much more complex to handle multiple OpenCL devices than just one: synchronizing them, and using them at their best. Apple is making a strong point with Final Cut, showing how OpenCL and multiple GPUs can unload the CPU, and may offer an unprecedented level of performance for software that makes good use of OpenCL.

At the same time, Intel's offering for their Haswell integrated GPUs is mature, with impressive hardware, the Iris Pro 5200 being an incredible iGPU for small datasets, and solid OpenCL drivers. Yes, AMD is offering good iGPUs, but we are all waiting for them to be built on GCN 1.1, with the same incredible memory/cache bandwidth (sorry AMD, you seem to lag behind on iGPUs).

nVidia is still playing its game with Kepler, which is anything but impressive in real-world GENERAL PURPOSE GPU usage, but may unleash a new architecture in 2014 that could put them back in the game. CUDA is dead outside the HPC world; OpenCL is leading the GPGPU world, which is what I was expecting. nVidia must come back with strong OpenCL development tools (based on their current impressive CUDA tools) to re-establish itself as a leader in GPGPU for all. I remember my GeForce 8800 GTS 320MB, the first generation of GPGPU, and an impressive performer for its time: a game changer.

I wish you all an awesome 2014. I know that for me and other OpenCL developers, there will be incredible opportunities :)


R.I.P Paul Morphy

I was on vacation in New Orleans two weeks ago, and I had the chance to visit Saint Louis Cemetery #1, where you can find the Voodoo queen Marie Laveau, but also the chess prodigy and unofficial world chess champion Paul Morphy.

I appreciated the chess pieces that were offered to his memory. This cemetery is part of the history of New Orleans; you must absolutely visit it, during the day, with a guide!