Browsing by Author "Saadat, Kimiya"
Item Open Access
Exploring Adaptive MCTS with TD Learning in miniXCOM (2022-10-24)
Saadat, Kimiya; Zhao, Richard
In recent years, Monte Carlo tree search (MCTS) has achieved widespread adoption within the game community. Its use in conjunction with deep reinforcement learning has produced success stories in many applications. While these approaches have been implemented in various games, from simple board games to more complicated video games such as StarCraft, the use of deep neural networks requires a substantial training period. In this work, we explore on-line adaptivity in MCTS without requiring pre-training. We present MCTS-TD, an adaptive MCTS algorithm improved with temporal difference learning. We demonstrate our new approach on the game miniXCOM, a simplified version of XCOM, a popular commercial franchise consisting of several turn-based tactical games, and show how adaptivity in MCTS-TD allows for improved performance against opponents.

Item Open Access
Single-player to Two-player Knowledge Transfer in Atari 2600 Games (2024-11-18)
Saadat, Kimiya; Zhao, Richard; Abou-Zeid, Hatem; Aycock, John
Playing two-player games using reinforcement learning and self-play can be challenging due to the complexity of two-player environments and the potential instability in the training process. It is proposed that a reinforcement learning algorithm can train more efficiently and achieve improved performance in a two-player game by leveraging the knowledge from the single-player version of the same game. This study examines the proposed idea in ten different Atari 2600 environments using the Atari 2600 RAM as the input state. The advantages of using transfer learning from a single-player training process over training in a two-player setting from scratch are discussed, and the results are reported on several metrics, such as the training time and average total reward. Finally, a method for calculating RAM complexity and its relationship to performance after transfer is discussed. Results show that in most cases the transferred agent performs better than the agent trained from scratch while taking less time to train. Moreover, it is shown that RAM complexity can be used as a weak predictor of the transfer's effectiveness.