PhD Thesis Defense Presentation: Chengyu Zhang
Mr. Chengyu Zhang, a doctoral student at 海角社区 in the Finance area, will be presenting his thesis defense entitled:
Three Essays on the Applications of Machine Learning in Finance
Wednesday, July 31, 2024 at 10:00 a.m.
(The defense will be conducted on Zoom)
Student Committee Chair: Professor Ruslan Goyenko
Please note that the presentation will be conducted on Zoom. If you wish to attend the presentation, kindly contact the PhD Office.
ABSTRACT
This thesis consists of three essays that cover various topics in the application of machine learning methods to finance.
The first essay introduces a machine learning framework that non-parametrically estimates the optimal dynamic portfolio strategy subject to realistic and predictable trading costs. Conditioning on a comprehensive set of stock characteristics and macroeconomic indicators, the trading-cost-aware portfolio strategy substantially outperforms market benchmarks in out-of-sample tests and is robust to various limits-to-arbitrage constraints. I demonstrate that incorporating an explicit trading-cost penalty is critical to avoiding the extraction of performance from small and illiquid stocks, better capturing market stress periods, and allocating assets based on more fundamental signals.
In the second essay, we study whether the stock market or the options market has the leading informational advantage. By conducting a horse race using a large set of stock and option characteristics with machine learning, we find that option characteristics dominate return predictability for both stocks and options. Among option characteristics, option illiquidity is the most important predictor of both stock and option returns, and we uncover a positive option illiquidity premium in stock returns, as an increase in derivatives trading decreases information asymmetry and stock price uncertainty.
In the third essay, we train classic machine learning algorithms and large language models (LLMs) to predict future earnings surprises using textual data extracted from the quarterly and annual filings of U.S. corporations. We observe a negative correlation between the length of the MD&A section and both future earnings surprises and firm returns. In addition, conventional machine learning methods that rely on sentiment analysis or bag-of-words techniques fail to effectively leverage past managerial discussions for accurate predictions of future earnings. We find that only LLMs trained with a finance objective have the capacity to comprehend the contextual information embedded in 10-Q and 10-K filings and thereby predict both positive and negative earnings surprises, as well as future firm returns.