Bachelor Thesis from the year 2021 in the subject Engineering - Computer Engineering, grade: 1,3, Hamburg University of Technology (Embedded Systems), language: English, abstract: This bachelor thesis aims to illustrate the idea behind Markov Decision Processes (MDP) and to present a few basic methods of Reinforcement Learning (RL) namely Monte Carlo Learning and Q-Learning, which are the solutions for decision problems modelled by MDPs. For the last section we apply these methods on an application and in the end discuss the results. Let us imagine the scenario where we put a hamster inside a maze, we expect the hamster to go through the maze till it reaches some point we considered as the goal. Well, it may randomly work but most of the time it won't. At this place, the hamster does not know how important this particular point remains namely the goal. But how will it be, when we remunerate the hamster once the goal is reached, he receives a reward for example a piece of cheese. The hamster will start to remember the route, which leads to the cheese and he maybe will learn to go the easy and quick way to achieve this goal. What we did, is that we reinforce the good behavior of the hamster by giving it some reward.
ThriftBooks sells millions of used books at the lowest
everyday prices. We personally assess every book's quality and offer rare, out-of-print treasures. We
deliver the joy of reading in recyclable packaging with free standard shipping on US orders over $15.
ThriftBooks.com. Read more. Spend less.