Abstract:
Reinforcement Learning (RL) has emerged as a transformative method for portfolio management because it can learn from, and adapt to, the dynamic nature of financial markets. The approach is distinctive in that it mirrors how humans make decisions: by learning from the consequences of their actions and adapting their behaviour accordingly. Similarly, an RL agent learns from experience when it enters an unfamiliar environment and adapts to select the best decisions in new situations. This makes RL well suited to financial data, especially in portfolio management, where conditions are constantly changing.
This thesis begins with an explanation of the main concepts in Reinforcement Learning, setting the stage for a comparison of three major classes of RL methods: Value-Based, Policy-Based, and Actor-Critic. The objective of this study is to analyse each of these methods and compare their advantages and limitations in the context of portfolio optimization, providing insight into their effectiveness and usefulness in financial investment. Finally, the analysis is applied in Python to develop and compare practical models, demonstrating how these RL methods can be used to optimize financial portfolios.