Recent Articles


nVidia and OpenCL support

nVidia was the first major GPU designer to jump on the OpenCL bandwagon, a project initiated by Apple to enable cross-platform GPGPU development that is OS- and vendor-agnostic, and now maintained by the Khronos Group.

Today AMD and Intel are big players in OpenCL support, for both their CPUs and GPUs. The new AMD Radeon GCN architecture is clearly the performance leader on OpenCL when you use complex algorithms, while the new nVidia Kepler architecture lags far behind the old Fermi architecture! Intel’s 2013 CPU+GPU architecture, Haswell, is expected to beat the entry-level Kepler GT640 in any usage (Intel GT3 will beat it, trust me!).

nVidia is having a hard time, with an ineffective and disappointing Kepler architecture that is slower than AMD’s new architecture for both 3D and GPGPU, and cannot even compete with nVidia’s own 2010 architecture for GPGPU on the open playing field that is OpenCL.

nVidia, once the OpenCL leader, is now trying everything it can to stop supporting it, including removing comments or documentation from EXISTING OpenCL examples, not updating them, removing them from the SDK, etc.

Please read this open letter to nVidia, and sign the petition!


nVidia Forums seriously hacked

One month ago, the nVidia forums were hacked and all the credentials were stolen, including salted password hashes (but there is no confirmation at this time that the salt is unique to each account!).

One month later the forums are still down, and it has been 3 weeks since nVidia promised to send new credentials to reset our accounts and passwords. I wonder what’s happening: do they have web developers, or do they use third-party closed-source software that cannot be fixed by nVidia or consultants?!?

The nVidia Forums were one of the main places for CUDA and OpenCL developers to exchange information and ideas, and it’s really sad to see them down for so long…


For chess lovers

This is a website I discovered, with the help of Charle, an incredible guy :)

Bryan Whitby created his own Computer Chess, using micro-controllers, existing Computer Chess, and any piece that fits in between! It gave me so many ideas!

I have a side-side-project to connect my Novag Citrine to my Macs, but now I want to go further: give it autonomy (batteries) and eventually replace the electronics with a quad-core ARM micro-controller, to be able to run the latest chess software on it, with a probable Elo of 2600-2700 while running on a battery pack! Sexxxxyyyyy!


Branching factor is the key for Chess Engines

The branching factor is the average number of positions that are searched deeper from each node.

Combinatorial explosion is the main problem a Chess developer should consider when trying to search the game tree: with an average of 30 possible moves for each side in the middle-game, the tree to search for 8 ply (4 white moves and 4 black moves) might contain more than 650 billion nodes (30^8 is about 656 billion). This is a branching factor of 30, where each possibility is checked.

Current chess engines can go as low as 2.1 for their branching factor, meaning they only consider an average of 2.1 moves for each position. They will search the same 8-ply tree by examining only about 378 final nodes (2.1^8) instead of 650 billion. This works by cutting branches as early as possible using alpha-beta pruning, null-move heuristics, etc.
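To make the arithmetic above concrete, here is a minimal sketch that assumes the simplest possible model: a uniform tree where every position has exactly the same number of searched moves, so the number of final nodes is the branching factor raised to the depth in ply. The function name `leaf_count` is mine, purely for illustration.

```python
def leaf_count(branching_factor, depth_ply):
    """Final nodes of a uniform tree: branching_factor ** depth_ply."""
    return branching_factor ** depth_ply

full_width = leaf_count(30, 8)   # every move checked
pruned = leaf_count(2.1, 8)      # effective branching factor of a strong engine

print(f"branching factor 30,  8 ply: {full_width:,.0f} final nodes")
print(f"branching factor 2.1, 8 ply: {pruned:,.0f} final nodes")
```

Real trees are not uniform, so these figures are only orders of magnitude, but they show why shaving the branching factor matters far more than raw speed.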

Brute-weakness of Hydra

Hydra was a clustered Chess Engine, able to examine 150 million nodes per second, but with a branching factor of 3.7, so to go as deep as 18 ply it had to consider about 18 billion positions in tournament time. With their lower branching factor, the best engines don’t need a cluster to go deeper, and on the same positions they are able to reach 24 ply while computing far fewer positions per second: the branching factor is the key!
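The Hydra comparison can be checked with the same kind of back-of-the-envelope model, assuming a uniform tree where the node count is simply the branching factor raised to the depth. The 3.7, 2.1, 18-ply, 24-ply and 150 million nodes per second figures come from the text; the function name `nodes_for_depth` is hypothetical.

```python
def nodes_for_depth(branching_factor, depth_ply):
    """Rough node count for a uniform tree (an idealized model)."""
    return branching_factor ** depth_ply

HYDRA_NODES_PER_SECOND = 150e6  # figure quoted in the text

hydra_nodes = nodes_for_depth(3.7, 18)  # Hydra: 18 ply at 3.7
lean_nodes = nodes_for_depth(2.1, 24)   # strong engine: 24 ply at 2.1
hydra_seconds = hydra_nodes / HYDRA_NODES_PER_SECOND

print(f"Hydra, 18 ply at 3.7 : {hydra_nodes:.2e} nodes, "
      f"about {hydra_seconds:.0f} s at 150M nodes/s")
print(f"Engine, 24 ply at 2.1: {lean_nodes:.2e} nodes")
```

Under this model Hydra needs roughly 17 billion nodes for 18 ply, while a branching factor of 2.1 reaches 24 ply with roughly 54 million nodes, a few hundred times less work for 6 more ply.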

How to improve branching factor

The best way to improve the branching factor is to order the possible moves in a position so that the first move is the most interesting one. By “the most interesting one”, I mean the one that will generate the best possible evaluation, and this is the point: to have a low branching factor, your move ordering function should correlate strongly with your full evaluation function.

I tend to think that the evaluation function and the one used to order moves should be identical, and that the move ordering function itself is probably the most crucial one for increasing the playing strength of a Chess Engine.
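The effect of move ordering on alpha-beta can be seen on a toy example. This is only a sketch under strong assumptions: the "game" is a hypothetical random tree of nested lists with integer scores at the leaves (not chess), and the ordering function is an oracle that ranks moves by their true minimax value, standing in for an ordering function that agrees perfectly with the evaluation. All names here are illustrative.

```python
import random

def build_tree(rng, branching, depth):
    """Hypothetical toy game tree: nested lists, integer scores at the leaves."""
    if depth == 0:
        return rng.randint(-100, 100)
    return [build_tree(rng, branching, depth - 1) for _ in range(branching)]

def minimax(node, maximizing):
    """Plain minimax, used here only as an 'oracle' to rank moves."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, alpha, beta, maximizing, best_first, counter):
    """Alpha-beta search; counter[0] counts the leaves actually evaluated."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    # best_first=True : the side to move tries its strongest move first.
    # best_first=False: the opposite order, to show the cost of bad ordering.
    children = sorted(node, key=lambda c: minimax(c, not maximizing),
                      reverse=(maximizing == best_first))
    if maximizing:
        value = -float("inf")
        for child in children:
            value = max(value, alphabeta(child, alpha, beta, False,
                                         best_first, counter))
            alpha = max(alpha, value)
            if alpha >= beta:  # the opponent will avoid this line: cut
                break
        return value
    value = float("inf")
    for child in children:
        value = min(value, alphabeta(child, alpha, beta, True,
                                     best_first, counter))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

def search(tree, best_first):
    """Return (root value, number of leaves evaluated)."""
    counter = [0]
    value = alphabeta(tree, -float("inf"), float("inf"), True,
                      best_first, counter)
    return value, counter[0]

tree = build_tree(random.Random(0), branching=4, depth=4)
best_value, best_leaves = search(tree, best_first=True)
worst_value, worst_leaves = search(tree, best_first=False)
print(f"root value: {best_value}")
print(f"leaves evaluated, best-first : {best_leaves} of {4 ** 4}")
print(f"leaves evaluated, worst-first: {worst_leaves} of {4 ** 4}")
```

Both searches return the same root value, since ordering changes only the amount of work. A real engine cannot afford an oracle, of course; the point is that the closer a cheap ordering function gets to the full evaluation, the closer the search gets to the best-first leaf count.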


nVidia GT 640 : first impressions

The GT 640 is in the same league as the AMD Radeon HD7750, the $99 graphics card (still a little over that, let’s wait a month or two), based on the new Kepler architecture that was unveiled with the GTX 680. And it is powered by the PCI-Express bus: no hassle, it will work in 90% of recent PCs, even low-end ones!

This card is the perfect Kepler development card, with 2GB of DDR3 memory (I would have paid a premium for GDDR5, but…). It’s Kepler for the masses, and for the CUDA or OpenCL developer, this is the card of choice to tune their code for Kepler.

It was simple to install and worked flawlessly. I will probably never run a 3D game on it, but it’s an interesting GPGPU device.

Benchmarking with LuxMark 2.0

Room = 63

Sala = 132

Luxball HDR = 872

It’s 2.5X SLOWER than the Radeon HD7750, but compared to other OpenCL devices, it’s 20% faster than my old GTX 260, and nearly 2X faster than the HD6750M.

My goals

My goals are to understand the Kepler architecture and the way warp instructions are scheduled on the different SPs inside an SMX, and to optimize my current OpenCL code for the Kepler architecture.