Abstract:
The aim of this thesis is to explore the potential of a Deep Reinforcement Learning approach to the portfolio optimization problem. Four Reinforcement Learning algorithms, namely Advantage Actor-Critic (A2C), Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), and Twin Delayed Deep Deterministic Policy Gradient (TD3), will be tested on the thirty Dow Jones constituents and compared against the index’s performance as a baseline. We will also assess whether these algorithms can detect crisis patterns and act accordingly. To do so, we will provide, as additional input, indices designed to capture financial stress and volatility; their impact will be assessed alongside the algorithms’ performance.