**Alain BENSOUSSAN** *The University of Texas at Dallas*

Stochastic Control and Limited Commitment

The theory of investment and growth of firms has been an important source of stochastic control problems. The issue of CEO compensation has been addressed more recently. A seminal paper by H. Ai and R. Li proposes a model of CEO compensation under limited commitment. It leads to a new type of stochastic control problem, in which a stochastic constraint captures the limited commitment. The authors introduce a Bellman equation with unusual boundary conditions. Their proof relies on many formal arguments, although the amount of intuition is impressive. The objective of this work is to provide a rigorous and complete theory for this Bellman equation and to solve the corresponding stochastic control problem.

**Ning CAI** *The Hong Kong University of Science and Technology (Guangzhou)*

Sensitivity Estimates with Computable Bias Bounds

The likelihood ratio method (LRM) is widely used to estimate sensitivities in risk management. Constructing LRM estimators depends heavily on computing the probability density functions (and their derivatives) of the underlying models, which under many popular financial models are known only through their Laplace transforms. We propose a Laplace-inversion-based LRM with computable bias bounds under these models. By selecting the algorithm parameters appropriately, we can obtain LRM estimators with any desired bias level. In addition, we investigate some asymptotic properties of our LRM estimators. Numerical experiments indicate that our method performs well under a broad range of popular financial models.

This is joint work with Ziyang Hao.
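For orientation, the basic likelihood ratio idea can be sketched in a setting where the density is available in closed form. This is only an illustration, not the Laplace-inversion-based estimator of the talk: it assumes a Black-Scholes model, a European call payoff, and hypothetical function names. The sensitivity (delta) is estimated as the mean of the discounted payoff times the score of the terminal density with respect to the initial price.

```python
import math
import numpy as np


def bs_delta_exact(s0, k, r, sigma, t):
    """Closed-form Black-Scholes call delta, used only as a reference value."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    return 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))


def lrm_delta(s0, k, r, sigma, t, n_paths, seed=0):
    """Likelihood ratio estimate of the call delta (illustrative assumption:
    the lognormal density of S_T is known explicitly)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    # Terminal price under geometric Brownian motion.
    s_t = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)
    payoff = math.exp(-r * t) * np.maximum(s_t - k, 0.0)
    # Score: derivative of the log-density of S_T with respect to s0.
    score = z / (s0 * sigma * math.sqrt(t))
    return float(np.mean(payoff * score))


if __name__ == "__main__":
    est = lrm_delta(100.0, 100.0, 0.05, 0.2, 1.0, 400_000)
    print(est, bs_delta_exact(100.0, 100.0, 0.05, 0.2, 1.0))
```

The payoff is never differentiated here, only the density. That is why the method needs the density and its derivatives, and why models known only through Laplace transforms call for a numerical inversion step whose bias must be controlled.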

**Huyên PHAM** *Université Paris Cité*

Actor-Critic Learning for Mean-field Control in Continuous Time

We study policy gradient for mean-field control in continuous time in a reinforcement learning setting. By considering randomised policies with entropy regularisation, we derive a gradient expectation representation of the value function, which is amenable to actor-critic type algorithms, where the value functions and the policies are learnt alternately based on observation samples of the state and model-free estimation of the population state distribution. In the linear-quadratic mean-field framework, we obtain an exact parametrisation of the actor and critic functions defined on the Wasserstein space. Finally, we illustrate the results of our algorithms with some numerical experiments on concrete examples.

**Chi Seng PUN** *Nanyang Technological University*

Bayesian Estimation and Optimization for Learning Sequential Regularized Portfolios

This paper incorporates Bayesian estimation and optimization into the portfolio selection framework, particularly for high-dimensional portfolios in which the number of assets is larger than the number of observations. We leverage a constrained ℓ1 minimization approach, called the linear programming optimal (LPO) portfolio, to directly estimate the effective parameters appearing in the optimal portfolio. We propose two refinements for the LPO strategy. First, we explore improved Bayesian estimates, instead of sample estimates, of the covariance matrix of asset returns. Second, we introduce Bayesian optimization (BO) to replace traditional grid-search cross-validation (CV) in tuning the hyperparameters of the LPO strategy. We further propose modifications to the BO algorithm by (1) taking into account the time-dependent nature of financial problems and (2) extending the commonly used expected improvement (EI) acquisition function to include a tunable trade-off with the improvement's variance (EIVar). Allowing for the general case of noisy observations, we theoretically derive the sub-linear convergence rate of BO under the newly proposed EIVar, and thus our algorithm has no regret. Our empirical studies confirm that the adjusted BO results in portfolios with a higher out-of-sample Sharpe ratio, a higher certainty equivalent, and lower turnover than those tuned with CV. This superior performance is achieved with a significant reduction in elapsed time, thus also addressing the time cost of CV. Furthermore, LPO with Bayesian estimates outperforms the original LPO proposal, as well as the benchmark equally weighted and plug-in strategies.

This is joint work with Godeliva Petrina Marisu.
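As background for the acquisition functions mentioned above, the two ingredients can be sketched under the usual assumption of a Gaussian posterior at a candidate point. The snippet computes the standard expected improvement and the variance of the improvement in closed form. How the talk's EIVar combines the two is not stated in the abstract, so no combination rule is given here; the function name and the maximisation convention are assumptions.

```python
import math


def _pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def improvement_moments(mu, sigma, best):
    """Return (E[I], Var[I]) for I = max(Y - best, 0) with Y ~ N(mu, sigma^2).

    Assumes maximisation: 'best' is the largest objective value seen so far.
    """
    if sigma <= 0.0:
        # Degenerate posterior: the improvement is deterministic.
        return max(mu - best, 0.0), 0.0
    z = (mu - best) / sigma
    ei = sigma * (z * _cdf(z) + _pdf(z))
    second_moment = sigma**2 * ((z * z + 1.0) * _cdf(z) + z * _pdf(z))
    return ei, max(second_moment - ei * ei, 0.0)


if __name__ == "__main__":
    print(improvement_moments(mu=0.1, sigma=0.5, best=0.0))
```

An acquisition rule that trades the first moment off against the second can then be built on top of these two quantities, with the trade-off weight treated as a tunable parameter.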

**Neil SHEPHARD** *Harvard University*

Some Properties of the Sample Weighted Median of an In-fill Sequence with an Application to High Frequency Financial Econometrics

Using an in-fill argument, the properties of the sample median of a sequence of events are established both for the case of a fixed period of time and for a period which shrinks as the sample size grows. The results are used to study the properties of the sample median of absolute returns under stochastic volatility. This estimator is invariant, asymptotically pivotal and a 1/2 breakdown estimator. In practice it has deep robustness to jump processes even when there are jumps of α-stable type.
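
The robustness claim can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not taken from the talk: constant volatility, Gaussian returns on an equally spaced grid over one unit of time, a handful of fixed-size jumps, and an unweighted median. The median of absolute returns is rescaled by the median of |Z| for a standard normal Z and compared with the square root of realized variance.

```python
from statistics import NormalDist

import numpy as np

# Median of |Z| for Z ~ N(0, 1); converts a median absolute return to a volatility.
MEDIAN_ABS_NORMAL = NormalDist().inv_cdf(0.75)


def simulate_returns(n, sigma, n_jumps, jump_size, seed=0):
    """n equally spaced returns over one unit of time, plus a few jumps."""
    rng = np.random.default_rng(seed)
    returns = sigma * np.sqrt(1.0 / n) * rng.standard_normal(n)
    jump_idx = rng.choice(n, size=n_jumps, replace=False)
    returns[jump_idx] += jump_size
    return returns


def median_vol(returns):
    """Volatility estimate from the sample median of absolute returns."""
    n = returns.size
    return float(np.median(np.abs(returns)) * np.sqrt(n) / MEDIAN_ABS_NORMAL)


def realized_vol(returns):
    """Square root of realized variance; picks up the jumps as well."""
    return float(np.sqrt(np.sum(returns**2)))


if __name__ == "__main__":
    r = simulate_returns(n=23_400, sigma=0.2, n_jumps=5, jump_size=0.05)
    print("median-based:", median_vol(r), "realized:", realized_vol(r))
```

Because only a vanishing fraction of the returns contain a jump, the median barely moves, whereas the sum of squared returns absorbs the full squared jump sizes.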