Lecture 5: Nondeterministic world

궁선이 2019. 1. 23. 18:01

This post summarizes Professor Sung Kim's lecture series "RL for Everyone" (모두를 위한 RL).

https://youtu.be/6KSf-j4LL-c

---------------------------------------------------------------------------------------------------------------------------------------

Every world we have dealt with so far has been a deterministic world. The real world, however, contains non-deterministic elements.

Deterministic model : the output of the model is fully determined by the parameter values and the initial conditions.

Stochastic model (non-deterministic) : possesses some inherent randomness. The same set of parameter values and initial conditions will lead to an ensemble of different outputs.
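The two definitions can be contrasted with a small sketch. The 1-D world, the function names, and the slip probability of 0.3 are all hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
import random

# Hypothetical 1-D world: states are integers, an action is +1 (right) or -1 (left).
def deterministic_step(state, action):
    """The same state and action always give the same next state."""
    return state + action

def stochastic_step(state, action, rng, slip_prob=0.3):
    """With probability slip_prob the agent slips and moves the opposite way."""
    if rng.random() < slip_prob:
        return state - action
    return state + action

rng = random.Random(0)
# Repeat the same (state, action) pair many times and collect the distinct outcomes.
det_outcomes = {deterministic_step(0, 1) for _ in range(1000)}
sto_outcomes = {stochastic_step(0, 1, rng) for _ in range(1000)}
print(det_outcomes)  # a single next state
print(sto_outcomes)  # more than one next state
```

Repeating the same input is the simplest way to tell the two kinds of model apart: the deterministic function yields one outcome, the stochastic one yields a set of outcomes.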

Summary

Suppose I take action A in state S:

- Deterministic world: choosing A in S always produces the same result A'.

- Non-deterministic world: besides A', other results such as A+, A-, A* ... can also occur.

-> The Q-learning update used so far, which fully trusts each new result, cannot solve problems in a non-deterministic world.
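One way to see the problem is a minimal sketch of a one-step task in which the same action pays off only some of the time. The task, the success probability of 0.7, and the name `sample_reward` are assumptions made for illustration. If Q is simply overwritten by each new outcome, it only ever holds the most recent sample and never settles on the average.

```python
import random

rng = random.Random(0)

def sample_reward(rng, success_prob=0.7):
    """Hypothetical one-step task: the action succeeds (reward 1) only some of the time."""
    return 1.0 if rng.random() < success_prob else 0.0

q = 0.0
history = []
for _ in range(1000):
    r = sample_reward(rng)
    # Deterministic-world update: the next state is terminal, so the target is just r,
    # and Q is replaced by it outright.
    q = r
    history.append(q)

mean_outcome = sum(history) / len(history)
print(set(history))   # only the raw outcomes ever appear as Q values
print(mean_outcome)   # the average outcome, a value Q itself never holds
```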

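Solution

Update Q(s) by accepting Q(s') only in proportion to the learning rate: keep most of the current estimate and move it a small step toward the new one.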
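Written out, this is the usual tabular Q-learning update with a learning rate. The symbols are standard notation assumed here rather than copied from the post: \alpha is the learning rate, \gamma the discount factor, and r the reward received on moving from s to s'.

```latex
Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') \right]
```

Setting \alpha = 1 recovers the deterministic-world update, in which the old value is discarded entirely.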

Non-deterministic Q-learning algorithm
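A minimal sketch of the algorithm follows, under stated assumptions. The five-state corridor, the slip probability, the epsilon-greedy exploration, and all hyperparameter values are hypothetical stand-ins chosen so the example runs with the standard library alone; the lecture's own code uses a different environment.

```python
import random

N_STATES, GOAL = 5, 4      # hypothetical corridor: states 0..4, goal at the right end
LEFT, RIGHT = 0, 1

def step(state, action, rng, slip_prob=0.1):
    """Move in the chosen direction, but slip the opposite way with slip_prob."""
    if rng.random() < slip_prob:
        action = 1 - action
    move = 1 if action == RIGHT else -1
    next_state = min(max(state + move, 0), N_STATES - 1)  # walls at both ends
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done

def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Epsilon-greedy action selection, with random tie-breaking.
            if rng.random() < epsilon or Q[state][LEFT] == Q[state][RIGHT]:
                action = rng.choice([LEFT, RIGHT])
            else:
                action = LEFT if Q[state][LEFT] > Q[state][RIGHT] else RIGHT
            next_state, reward, done = step(state, action, rng)
            # Accept the new target only in proportion to the learning rate alpha.
            target = reward + gamma * max(Q[next_state])
            Q[state][action] = (1 - alpha) * Q[state][action] + alpha * target
            state = next_state
            if done:
                break
    return Q

Q = train()
for s, (q_left, q_right) in enumerate(Q):
    print(s, round(q_left, 3), round(q_right, 3))
```

Because each update blends the old estimate with the new target, a single unlucky slip shifts Q only slightly, and the table tends toward the average return of each action.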

The code from the lectures is implemented in the following GitHub repository.

https://github.com/whitesoil/ReinforceLearningZeroToAll

License: Attribution, NonCommercial, NoDerivatives
