Rajendra

I write columns on news related to bots, especially in the categories of artificial intelligence, bot startups, and bot funding. I am also interested in recent developments in data science, machine learning, and natural language processing.

When AI learns to sumo wrestle, it starts to act like a human

Oct 11, 2017

OpenAI, the artificial intelligence research lab backed by Elon Musk, found that an AI pitted against another AI could continuously learn and adapt to its opponent, as well as explore new and different ways to capitalize on weaknesses.

The experiments took two versions of the same AI code, built to learn from attempting a simple task many times, and made them compete in simple virtual challenges that require complex movements, like sumo wrestling. One of the two related research papers released today focused on the method of learning between rounds, while the other studied the interaction between AI agents as they competed.

When the AI was humanoid, it figured out techniques that mirror how humans perform the activity, like crouching to gain better stability, without any coaching or prompting to do so. The AI even figured out how to deceive its opponents, luring them to the edge of the ring and then dodging out of the way as the opponent's momentum caused it to fall.

"It does seem to be matching what humans might be doing in a similar wrestling setting, and additionally [learning] strategies like deception," says Igor Mordatch, who led the research at OpenAI.

The basis of this research is a subset of artificial intelligence research called reinforcement learning. An AI agent is made to repeat a task over and over with slight variations until the task can be completed. Researchers tinker with how the agent carries experience from one attempt to the next, but a large part of the research is figuring out how the agent is told that an action was good or bad, a signal called a reward. In the sumo wrestling test, the fighters were programmed to get +1,000 points if they won, -1,000 points if they lost, and -1,000 points if the match ended in a tie.
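The reward scheme described above can be sketched in a few lines. This is an illustrative toy, not OpenAI's actual code; only the +1,000 / -1,000 values come from the article, and the function name and signature are assumptions.

```python
# Toy sketch of the sumo reward scheme described in the article.
# The WIN/LOSS/TIE magnitudes match the article; the rest is illustrative.

WIN, LOSS, TIE = 1000, -1000, -1000

def sumo_reward(agent_won: bool, opponent_won: bool) -> int:
    """Return the terminal reward for one sumo agent."""
    if agent_won and not opponent_won:
        return WIN
    if opponent_won and not agent_won:
        return LOSS
    return TIE  # a tie is penalized as heavily as a loss

print(sumo_reward(True, False))   # → 1000
print(sumo_reward(False, True))   # → -1000
print(sumo_reward(False, False))  # → -1000
```

Note that penalizing ties as heavily as losses pushes both agents to fight for a decisive win rather than stall out the round.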

In order to win, the agents naturally learned stances that made them more stable. They lowered their heads and torsos, and extended their arms to the side, similar to the stance associated with human sumo wrestlers. Arms were also used to hook and move opponents towards the ledge.

As the agents learned, the rewards in some tests needed to be changed. In a soccer-like game, researchers first rewarded the agents for learning to walk. After thousands of tries, the agents learned to walk, and then the reward was switched to +1,000 points for successfully defending or scoring (depending on the agent's role), plus bonus points for standing at the end of the round.
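The two-stage reward switch described above (a dense "learn to walk" reward early on, then a sparse competitive reward) might look like the sketch below. This is an assumption-laden illustration, not OpenAI's implementation: the standing-bonus value, the curriculum threshold, and the function signature are all hypothetical, since the article gives no numbers for them.

```python
STAND_BONUS = 100.0  # hypothetical value; the article does not specify it

def reward(step: int, walk_progress: float, won: bool, standing: bool,
           curriculum_steps: int = 10_000) -> float:
    """Two-stage reward curriculum (illustrative).

    Early training: a dense reward for forward walking progress.
    Later training: the sparse +/-1000 competitive reward from the
    article, plus a bonus for still standing at the end of the round.
    """
    if step < curriculum_steps:
        return walk_progress  # dense shaping reward while learning to walk
    result = 1000.0 if won else -1000.0
    return result + (STAND_BONUS if standing else 0.0)

# Early in training, only walking matters:
print(reward(step=0, walk_progress=0.5, won=False, standing=False))   # → 0.5
# After the switch, winning and staying upright dominate:
print(reward(step=10_000, walk_progress=0.0, won=True, standing=True))  # → 1100.0
```

The design idea is that the dense walking reward gives the agent a learnable signal from the start, and is withdrawn once the harder competitive objective becomes reachable.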

But since the AI's knowledge is learned slowly over thousands of iterations, researchers say it's difficult to track exactly how or why the learning takes place. As the AI learned to sumo wrestle, one of the agents figured out how to fake out its opponent, luring it into lunging forward near the edge of the ring and then stepping out of the way. But what the team doesn't know is whether the agent predicted that the strategy would help it win, or whether it was merely an accident that got rewarded into a successful behavior.

While these exact skills, or knowledge of how to walk in a specific simulation, might not be useful on their own, Mordatch says this research furthers the understanding of how agents learn complex goals in competitive games, like the lab's work on mastering the competitive video game Dota 2.

Source: Quartz