Neural Landscape for Option Modelling
Machine Learning in Finance I
Traditionally, pricing and hedging methods were selected for their tractability rather than their accuracy. Indeed, the speed of calibration is a fundamental constraint, especially when one wants to move beyond the simplest Black-Scholes setting. The Black-Scholes formula is concise and simple to understand and is therefore widely applied; similarly, the Heston model benefits from convenient Fourier pricing. The use of these models comes at the expense of better-performing but slow-to-calibrate models.
Machine learning can be used to price derivatives faster. Hutchinson et al. (1994) trained a neural network on simulated data to learn the Black-Scholes option pricing formula, and more recently a number of efficient algorithms have been developed along these lines to approximate parametric pricing operators. This in turn can eliminate the calibration bottlenecks found in more realistic pricing models.
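The idea of learning a pricing formula from simulated data can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a reproduction of Hutchinson et al. (1994): the rate and volatility are held fixed at assumed values, the inputs are moneyness S/K and maturity T, the target is C/K, and the network size, learning rate, and all function names are hypothetical choices made for this sketch. It uses only the Python standard library.

```python
import math
import random

R, SIGMA = 0.05, 0.2  # assumed constant interest rate and volatility

def bs_call(s, k, t, r=R, sigma=SIGMA):
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

# Simulated training set on a grid: inputs (S/K, T), target C/K.
data = [((m, t), bs_call(m, 1.0, t))
        for m in [0.8 + 0.4 * i / 14 for i in range(15)]
        for t in [0.1 + 0.9 * j / 7 for j in range(8)]]

def init_params(hidden=4, seed=0):
    """One hidden tanh layer with a linear output; sizes are illustrative."""
    rng = random.Random(seed)
    return {"w1": [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(hidden)],
            "b1": [0.0] * hidden,
            "w2": [rng.uniform(-1, 1) for _ in range(hidden)],
            "b2": 0.0}

def forward(p, x):
    """Return the network output and the hidden activations."""
    h = [math.tanh(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(p["w1"], p["b1"])]
    return sum(w * hi for w, hi in zip(p["w2"], h)) + p["b2"], h

def mse(p, samples):
    return sum((forward(p, x)[0] - y) ** 2 for x, y in samples) / len(samples)

def train(p, samples, epochs=300, lr=0.05):
    """Full-batch gradient descent on the mean squared error."""
    n, hidden = len(samples), len(p["w2"])
    for _ in range(epochs):
        g_w1 = [[0.0, 0.0] for _ in range(hidden)]
        g_b1, g_w2, g_b2 = [0.0] * hidden, [0.0] * hidden, 0.0
        for x, y in samples:
            pred, h = forward(p, x)
            d_pred = 2.0 * (pred - y) / n  # derivative of the MSE w.r.t. the output
            g_b2 += d_pred
            for i in range(hidden):
                g_w2[i] += d_pred * h[i]
                d_pre = d_pred * p["w2"][i] * (1.0 - h[i] ** 2)  # back through tanh
                g_b1[i] += d_pre
                g_w1[i][0] += d_pre * x[0]
                g_w1[i][1] += d_pre * x[1]
        for i in range(hidden):
            p["w1"][i][0] -= lr * g_w1[i][0]
            p["w1"][i][1] -= lr * g_w1[i][1]
            p["b1"][i] -= lr * g_b1[i]
            p["w2"][i] -= lr * g_w2[i]
        p["b2"] -= lr * g_b2
    return p

params = init_params()
initial_mse = mse(params, data)
train(params, data)
final_mse = mse(params, data)
print(f"MSE before training: {initial_mse:.5f}, after: {final_mse:.5f}")
```

Once trained, a call to `forward(params, (moneyness, maturity))` replaces a call to the pricing routine. The speed benefit is negligible for a closed-form formula like this one and only becomes material when the target is an expensive PDE or Monte Carlo pricer.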
Another way to use machine learning is to avoid simplified models altogether and calibrate models directly to market data, relying on the tools of machine learning to control overfitting. The problem with calibrating directly to market data is that it becomes hard to understand what is driving the price of the derivative, which can be a cause of unease for regulators and risk managers. Data modelling and preprocessing may also introduce their own set of risks.
Various approaches exist to assess the risk associated with a machine learning solution, each with different advantages. Backtesting provides a holistic view of the performance of the system, but is constrained to historical data and struggles with feedback effects. Market simulators allow these effects to be included, but at the cost of limiting the risks considered to those within a model. Sensitivity analysis and explainable machine learning give some indication of the broad features that drive a machine learning system, but can struggle with extreme events. Adversarial methods produce robust results, but do not give a clear understanding of the underlying system and require modelling of the types of adversarial interactions that are plausible in order to give practical prices and actions.
Neural Network Applications
Hutchinson et al. (1994) was one of the first papers to use neural networks in options pricing, and the area has since become an active topic of research. The precise role of machine learning in options pricing, and the data used to support it, can vary significantly. We list the innovations roughly in the order in which they were introduced to the literature.
Functional models: Some models rely on computationally expensive procedures like solving a partial differential equation (PDE) or performing Monte-Carlo simulations to estimate the option price, implied volatility, or hedging ratio. For these models we can use offline neural networks to approximate a pricing or hedging function through parametric simulations (Hutchinson, Lo, & Poggio, 1994; Carverhill & Cheuk, 2003).
Hybrid models: Other models use a hybrid approach: they first use a parametric model to estimate the price and then build a data-driven model to learn the residuals between the observed price and the parametric model estimate (Lajbcygier & Connor, 1997).
Solver models: A range of parametric models require the solution of a PDE, and neural networks, which can handle high-dimensional equations, are well suited to solving them (Barucci, Cherubini, & Landi, 1997; Beck, Becker, Cheridito, Jentzen, & Neufeld, 2019).
Data-driven models: Other models disregard parametric models entirely and simply use historical or synthetic data of any type to train an unconstrained model that is free to explore new relationships (Ghaziri, Elfakhani, & Assi, 2000; Montesdeoca & Niranjan, 2016).
Knowledge models: These models constrain a universal neural network by adding domain knowledge to the architecture so that it learns more realistic relationships, which also increases the interpretability of the model, e.g., enforcing monotonic relationships in one direction by adding penalties to the loss function (Garcia & Gençay, 2000; Nadeau, & Garcia, 2009).
Calibration models: These models use price or other outputs to calibrate an existing model and obtain the resulting parameters. This method also provides enhanced interpretability because the neural network model is simply used in the calibration step of existing parametric models (Andreou, Charalambous, & Martzoukos, 2010; Bayer, Horvath, Muguruza, Stemper, & Tomas, 2019).
Activity models: A number of option types, such as American options, benefit from learning an optimal stopping rule with neural networks in a reinforcement learning framework, or from learning a value function or a hedging strategy under temporal optimal control, i.e., with a model that takes evolving market frictions into account (Buehler et al., 2019).
Generative models: A generative model can take any data as input and generate new data that either looks similar to the original data or, when the inputs are conditioned on other attributes, looks different from it. The purpose of the generated data is simply to aid the performance of traditional parametric models and of the seven model types above, as a form of regularisation and interpolation (Bühler, Horvath, Lyons, Perez Arribas, & Wood, 2020; Ni, Szpruch, Wiese, Liao, & Xiao, 2020).
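As an illustration of the knowledge-model idea in the list above, the following sketch adds a soft monotonicity penalty to a fitting loss. It is a minimal example under stated assumptions, not the method of any of the cited papers: the finite-difference check, the penalty weight, and all function names (`monotonicity_penalty`, `penalised_loss`, and the two toy candidate pricing functions) are hypothetical. The constraint encoded is that a call price should not decrease as the spot price rises.

```python
def monotonicity_penalty(price_fn, spots, strike, maturity, step=0.01):
    """Mean squared negative slope of the price in spot; zero if non-decreasing."""
    total = 0.0
    for s in spots:
        slope = (price_fn(s + step, strike, maturity)
                 - price_fn(s, strike, maturity)) / step
        total += max(0.0, -slope) ** 2  # only downward slopes are penalised
    return total / len(spots)

def penalised_loss(price_fn, samples, spots, strike, maturity, weight=10.0):
    """Mean squared fitting error plus the weighted monotonicity penalty."""
    fit = sum((price_fn(s, k, t) - y) ** 2 for (s, k, t), y in samples) / len(samples)
    return fit + weight * monotonicity_penalty(price_fn, spots, strike, maturity)

# Two toy candidate pricing functions standing in for a network's output.
def monotone_candidate(s, k, t):
    return max(s - k, 0.0) + 0.1 * t

def non_monotone_candidate(s, k, t):
    return abs(s - k) + 0.1 * t  # wrongly decreases in spot below the strike

spots = [80.0, 90.0, 100.0, 110.0, 120.0]
# Toy targets (illustrative only): intrinsic value plus a small time value.
samples = [((s, 100.0, 0.5), max(s - 100.0, 0.0) + 0.05) for s in spots]

penalty_good = monotonicity_penalty(monotone_candidate, spots, 100.0, 0.5)
penalty_bad = monotonicity_penalty(non_monotone_candidate, spots, 100.0, 0.5)
loss_good = penalised_loss(monotone_candidate, samples, spots, 100.0, 0.5)
loss_bad = penalised_loss(non_monotone_candidate, samples, spots, 100.0, 0.5)
print(penalty_good, penalty_bad, loss_good, loss_bad)
```

In training, `price_fn` would be the network itself and the penalised loss would be minimised in place of the plain fitting error. A penalty only discourages violations; guaranteeing monotonicity requires building the constraint into the architecture instead.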
Many of the use cases for neural networks boil down to their ability to learn non-linear relationships and cope with high dimensionality, which allows them to solve PDEs, support data-driven models with large feature sets, and find optimal policies in large state spaces via reinforcement learning. This ability to cope with high dimensionality further allows modellers to create offline input-to-output mappings that can significantly increase inference speed. We have seen why these models are valuable, but as always, there is no free lunch. We have to ask what costs these methods impose.
Valuation and Hedging Users
Most of the recent derivative pricing and hedging innovation is coming from calibration, solver, activity, generative, and knowledge neural networks. Before we investigate the user implications of the meteoric rise of neural networks, let us first consider the entities that rely on accurate and timely valuations and hedging models.
The users are sorted by the typical speeds required:
Market makers use pricing theory to identify discrepancies between underlying assets and option contracts, as well as between the same option with different maturities, or even the same underlying and term across different venues.
Brokers and dealers of OTC options want to establish a derivative business that implements a successful and profitable hedging strategy.
Exchanges and central clearing parties would like to understand the true value of options, to assess initial and variation margin requirements throughout the life of the contract.
Corporate treasurers and institutional investors want to calculate the value of options, especially OTC options with little to no market data.
Regulatory and supervisory bodies want to value, or at least review, the methodologies used to value options, to assess whether institutions and banks meet their capital requirements.
Revenue services rely on option valuation for taxation purposes, owing to the recent use of options as compensation and hedging vehicles.
Market makers would like access to models with the lowest possible latency, possibly milliseconds. Exchanges would want prices on at least a minute-to-minute basis to assess the stability of the order book, and brokers and dealers of OTC derivatives would price derivatives on a minute-to-hourly basis. Central clearing parties need to reassess option prices daily to set variation margin requirements. Somewhere between daily and weekly, corporate treasurers and institutional investors would want to know the value of their options for risk management purposes. Lastly, regulators, revenue services, and shareholders will want to know the value of options on a quarterly to annual basis. These requirements are important to consider, because many machine learning models improve speed at the expense of accuracy.
In truth, the speed advances that machine learning offers mostly lead to short-term pecuniary benefits for competing market makers. The adoption of machine learning would also reduce the computational cost for other entities that might not need real-time valuation but instead perform millions of different calculations per day. Whether for latency or computational reasons, machine learning has some role to play, but at what expense? The three elephants in the room are accuracy, explainability, and robustness. The functional model, because it only approximates a parametric model, is guaranteed to be less accurate and less robust than that model but faster, whereas the hybrid model, say, is almost guaranteed to be more accurate than a parametric model but at the expense of comprehensibility. We will keep these questions in mind as we address these three concerns in future sections.