Abstract:
Many authors have shown that investors are not fully rational, contrary to what the traditional Efficient Markets Hypothesis assumes, and that investor sentiment can affect stock prices. Because investor sentiment is not directly measurable, researchers have relied on different proxies. In addition, progress in natural language processing has contributed to the development of new sentiment measures based on text from news providers and social media. This work addresses a classification problem on financial news data and defines a reliable proxy for investor sentiment using both dictionary-based and supervised machine learning techniques; in particular, Long Short-Term Memory (LSTM) networks are adopted. The resulting sentiment proxies are used as exogenous variables in the mean and variance equations of a Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model in order to provide evidence of a relationship between sentiment and stock returns and between sentiment and volatility.