Abstract:
In this thesis I consider a Reinforcement Learning (RL) approach to policy evaluation, in particular the Q-Learning algorithm (QLa). The QLa dynamically optimizes its behaviour in real time on the basis of the feedback it receives from the surrounding environment. First, I introduce the Adaptive Market Hypothesis (AMH), on which active portfolio management is based, as an evolution of the Efficient Market Hypothesis (EMH). Then, I explain the essential aspects of the RL method. Next, I present different parameters and values for Financial Trading Systems (FTSs), which are used to configure different QLas. Finally, I present the application of these FTSs to stock price time series and the results obtained.