Our just-submitted algorithm, "Fractal AI", played 100 consecutive games (the minimum required for an official score) and got an average score over its best 10 games of 11,543 +/- 492, well above the previous record of 9,106 +/- 143, so we are now #1 on this particular Atari game:
The previous record, held by MontrealAI, was an average score of 9,106 +/- 143, achieved after playing about 300,000 consecutive games. When you inspect its learning curve, you notice how different our approach is:
As you can see, it is a pure learning algorithm: it starts with zero knowledge and a near-zero score, and as it learns from more games its score gets better and better, so after learning from 300,000 games it can achieve scores of about 9,000 points.
In contrast, Fractal AI is a pure-intelligence algorithm: in its simplest incarnation it does not learn at all, so to get a better score you need more thinking power (more CPU, or a better implementation).
If we superimpose both graphs, the difference is quite evident:
Adding learning is, of course, the next step, as it would make the algorithm orders of magnitude better (given time) and faster (learning would let us cut down the number of walkers over time, saving most of the CPU that is needed without learning). Until then, we will try to beat some more Atari games and other OpenAI environments we have already worked on in the past (but never submitted), like the classic control pole-balancing and hill-climbing ones.
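To make the "more walkers = more thinking power, no learning" trade-off concrete, here is a minimal sketch of a generic sampling-based planner on a toy problem. This is an illustration only, not the actual Fractal AI algorithm: every function name, parameter, and the toy environment are hypothetical.

```python
import random

def step(state, action):
    """Toy deterministic environment: 1-D position, actions are -1, 0 or +1."""
    return state + action

def reward(state, target=10):
    """Closer to the hypothetical target position is better."""
    return -abs(target - state)

def plan(state, rng, n_walkers=200, horizon=12):
    """Simulate n_walkers random rollouts ("walkers") from the current state
    and return the first action of the best-scoring one. Nothing is learned:
    all the work is redone from scratch at every decision, so quality scales
    with n_walkers and horizon (i.e. with CPU), not with experience."""
    best_score, best_first = float("-inf"), 0
    for _ in range(n_walkers):
        s = state
        actions = [rng.choice((-1, 0, 1)) for _ in range(horizon)]
        for a in actions:
            s = step(s, a)
        if reward(s) > best_score:
            best_score, best_first = reward(s), actions[0]
    return best_first

# Greedy control loop: replan at every step with a fresh batch of walkers.
rng = random.Random(0)
state = 0
for _ in range(20):
    state = step(state, plan(state, rng))
print(state)
```

Raising `n_walkers` or `horizon` buys better decisions at the cost of more CPU per step, which is the sense in which a pure-intelligence approach trades scoring for thinking power; a learned model would instead let you shrink the walker budget over time.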
Update (27/6/2017): Qbert also has an official score now!