'Computer taught itself Space Invaders'

In this 1997 picture, chess champion Garry Kasparov, right, pauses before making his first move in a game against the IBM Deep Blue computer. AFP PHOTO/Stan Honda

Published Feb 26, 2015

London - A new kind of artificial intelligence has taught itself to play vintage video games without any prior instructions, reaching human-level scores, scientists claim.

The machine learns from scratch by trial and error, with the game's score acting as the reward that reinforces successful play. This is fundamentally different from previous game-playing "intelligent" computers.

The system, a set of software algorithms called the Deep Q-network, has learned to play 49 classic Atari games such as Space Invaders and Breakout using only the raw pixels on the screen and the game's score.
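
To make that trial-and-error loop concrete, here is a minimal, hypothetical sketch in Python of tabular Q-learning - the textbook rule that the Deep Q-network scales up with a neural network reading raw pixels. The six-cell "game", reward values and learning parameters below are invented purely for illustration; this is not DeepMind's code.

    import random

    # A toy stand-in for a game: a corridor of six cells. The agent starts at
    # cell 0 and scores +1 only on reaching the rightmost cell. Illustrative
    # only - NOT DeepMind's system, just the reward-reinforced trial-and-error
    # rule the article describes, in its simplest tabular form.
    N_STATES = 6
    ACTIONS = [-1, +1]                        # move left or move right
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1     # learning rate, discount, exploration

    # Q[state][action] is the learned estimate of future score for that move.
    Q = [[0.0, 0.0] for _ in range(N_STATES)]

    for episode in range(500):                # "play the game" many times over
        state = 0
        while state != N_STATES - 1:
            if random.random() < EPSILON:     # trial and error: sometimes explore...
                action = random.randrange(2)
            else:                             # ...otherwise pick the best-known move
                best = max(Q[state])
                action = random.choice([a for a in (0, 1) if Q[state][a] == best])
            next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # The score reinforces behaviour: nudge the estimate towards the
            # reward plus the discounted best value achievable afterwards.
            target = reward + GAMMA * max(Q[next_state])
            Q[state][action] += ALPHA * (target - Q[state][action])
            state = next_state

    for s in range(N_STATES - 1):
        print(f"cell {s}: left={Q[s][0]:.2f}  right={Q[s][1]:.2f}")

After a few hundred such episodes the values in Q point consistently rightwards: the agent has "figured out" how to score without ever being told the rules - the same principle, vastly scaled up, behind the Atari results.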

Scientists behind the development said the software represents a breakthrough: artificial intelligence capable of learning without being fed instructions from human experts - the classic approach behind chess-playing machines such as IBM's Deep Blue computer.

"This work is the first time anyone has built a single, general learning system that can learn directly from experience to master a wide range of challenging tasks, in this case a set of Atari games, and to perform at or better than human level," said Demis Hassabis, a former neuroscientist and founder of DeepMind Technologies, which was bought by Google for £400m in 2014.

"It can learn to play dozens of the games straight out of the box. What that means is we don't pre-program it between each game. All it gets access to is the raw pixel inputs and the game's score. From there it has to figure out what it controls in the game world, how to get points and how to master the game, just by playing the game," Mr Hassabis, a former chess prodigy, said.

"The ultimate goal here is to build smart, general purpose machines but we're many decades off from doing that, but I do think this is the first significant rung on the ladder," he added.

The Deep Q-network played the same game hundreds of times to learn the best way of achieving high scores. In some games it outperformed humans by learning smart tactics.

In more than half the games, the system was able to achieve more than 75 per cent of the human scoring ability just by trial and error, according to a study published in the journal Nature.

In 1997, Deep Blue beat Garry Kasparov, the world chess champion, while IBM's Watson computer beat champions of the quiz show Jeopardy! in 2011. However, Mr Hassabis said the Deep Q-network works in a fundamentally different way.

He said: "The key difference between those kinds of algorithms is that they are largely pre-programmed with their abilities. What we've done is to build programs that learn from the ground up.

"These type of systems are more human-like in the way they learn in the sense that it's how humans learn. We learn from experiencing the world around us, from our senses and our brains then make models of the world that allow us to make decisions and plan what to do in the world, and that's exactly the type of system we are trying to design here," Mr Hassabis said.

"The advantage of these kind of systems is that they can learn and adapt to unexpected things and the programmers and systems designers don't have to know the solution themselves in order for the machine to master that task," he added.

The Independent
