In progress

Monte Carlo Algorithm

Consider a 2x2 grid world (see attachment).

The cells S1, S2, S3, S4 are the states.

In each state the agent can choose one of the following actions: up, down, left, right.

S1 is the terminal state. In any other state, the agent moves to the adjacent cell in the direction of the chosen action.

For example, if the agent is in S3 and chooses the action "Right", it moves to S4 with probability 1 and receives reward -1.

If the selected action would take the agent outside the grid, it hits the wall and instead moves to the cell in the opposite direction, with reward -2. For example, choosing "Right" in S4 moves the agent left to S3.
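The transition rules above can be sketched as a small step function. This is only an illustration: the attachment is not reproduced here, so the layout (S1 and S2 on the top row, S3 and S4 on the bottom row) is an assumption chosen to agree with the S3/S4 examples, and the names `POS`, `MOVES`, and `step` are hypothetical.

```python
# Assumed layout (the attachment is not available; this is a guess that
# matches the examples "S3 + Right -> S4" and "S4 + Right -> S3"):
#   S1 S2
#   S3 S4
POS = {"S1": (0, 0), "S2": (0, 1), "S3": (1, 0), "S4": (1, 1)}
STATE_AT = {pos: name for name, pos in POS.items()}
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
TERMINAL = "S1"


def step(state, action):
    """Return (next_state, reward) for a deterministic move."""
    if state == TERMINAL:
        raise ValueError("no actions are taken from the terminal state")
    row, col = POS[state]
    d_row, d_col = MOVES[action]
    target = (row + d_row, col + d_col)
    if target in STATE_AT:
        # Ordinary move inside the grid: reward -1.
        return STATE_AT[target], -1
    # Wall hit: the agent moves in the opposite direction, reward -2.
    return STATE_AT[(row - d_row, col - d_col)], -2


print(step("S3", "right"))  # move inside the grid
print(step("S4", "right"))  # bounce off the wall
```

In a 2x2 grid the cell in the opposite direction always exists, so the bounce never leaves the grid.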

Initially, Q(S, a) = 0 for every state S and action a.

Run the every-visit Monte Carlo algorithm with exploring starts for one episode of 3 steps.
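As a rough sketch of how an every-visit Monte Carlo update could be computed for one 3-step episode, the snippet below averages the returns of every occurrence of each (state, action) pair and then reads off the greedy actions. The posting specifies neither the episode nor a discount factor, so the episode used here (built only from the S3/S4 transitions described above) and `GAMMA = 1.0` are assumptions, and the function names are hypothetical.

```python
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
GAMMA = 1.0  # assumption: the posting does not give a discount factor


def every_visit_mc(episode, gamma=GAMMA):
    """Average the return of every visit to each (state, action) pair.

    `episode` is a list of (state, action, reward) tuples in time order.
    Q starts at 0 everywhere, so pairs not in the result keep Q = 0.
    """
    returns = defaultdict(list)
    g = 0.0
    # Walk backwards so the return accumulates step by step.
    for state, action, reward in reversed(episode):
        g = reward + gamma * g
        returns[(state, action)].append(g)
    return {pair: sum(gs) / len(gs) for pair, gs in returns.items()}


def greedy_actions(q, state):
    """All actions tied for the highest Q value in `state`."""
    values = {a: q.get((state, a), 0.0) for a in ACTIONS}
    best = max(values.values())
    return [a for a in ACTIONS if values[a] == best]


# Hypothetical episode: S4 -Right-> S3 (wall, -2), S3 -Right-> S4 (-1),
# S4 -Right-> S3 (wall, -2).
episode = [("S4", "right", -2), ("S3", "right", -1), ("S4", "right", -2)]
q = every_visit_mc(episode)
print(q)
print(greedy_actions(q, "S4"))
```

Because every reward is negative and Q starts at 0, any pair visited in the episode ends up with a negative estimate, so a greedy policy in that state prefers the actions that were not tried; states that were never visited remain tied across all four actions.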

What will be the policy of the agent after the episode and why?

Skills: Algorithm


About the employer:
(7 reviews) Athens, Greece

Project ID: #1085542

Awarded to:

dobreiiita

Hi, please check your inbox. Thanks.

$35 USD in 0 days
(18 reviews)
4.8

3 freelancers bid an average of $35 for this job

ronobir1

Hello friend, please check your PM.

$40 USD in 1 day
(6 reviews)
3.4
topcoder0

I can do it

$30 USD in 1 day
(0 reviews)
0.0