Title: S.Y. Lee’s Lagrange Multiplier Test in Structural Modeling: Still Useful?
Abstract: Professor S.Y. Lee introduced constrained estimation of parameters subject to nonlinear restrictions in structural equation models (SEM) (Lee & Bentler, 1980). Based on the normal theory generalized least squares (GLS) function, Lee proved a variety of relevant theorems such as consistency and asymptotic normality of estimators and their asymptotic equivalence to maximum likelihood estimators. He also developed a GLS chi-square test of the adequacy of the model, of differences between nested sets of restrictions, and a Lagrange Multiplier (LM) test for evaluating correctness of model restrictions. This talk reviews Lee’s results and presents an overview of further developments and trends in constrained SEM model testing. Although Lee developed the LM test as a confirmatory parameter testing methodology, its main contemporary use in SEM seems to be as an exploratory tool for adding parameters to improve statistically inadequate models. In that context, its use is standard.
Presentation File (1.2MB PDF) 
Peter BENTLER
University of California, Los Angeles

Title: On Some Functional Characterizations of (Fuzzy) Set-valued Random Elements
Abstract: Numerous experimental studies involve semiquantitative expert information, or data measured imprecisely, which can be modeled with interval (fluctuations, grouped data, etc.) or fuzzy (ratings, opinions, perceptions, etc.) data. A general framework to analyze these kinds of inexact data with statistical tools developed for Hilbertian random variables will be presented. The space of nonempty convex and compact (fuzzy) subsets of R^p has traditionally been used to handle this kind of imprecise data. Mathematically, these elements can be characterized via the support function, which agrees with the usual Minkowski addition and naturally embeds the considered space into a cone of a separable Hilbert space. The support function embedding has interesting properties, but it lacks an intuitive interpretation for imprecise data. Moreover, although the Minkowski addition is very natural when p = 1, if p > 1 the shapes obtained when two sets are aggregated are apparently unrelated to the original sets, because the addition tends to convexify. An alternative and more intuitive functional representation will be introduced in order to circumvent these difficulties. The imprecise data will be modeled using star-shaped sets in R^p. These sets will be characterized through a center and the corresponding polar coordinates, which have a clear interpretation in terms of location and imprecision and lead to a natural directional extension of the Minkowski addition.

Ana COLUBI
Justus Liebig University Giessen
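
As a small illustration of the statement that the support function agrees with the Minkowski addition, the following Python sketch checks numerically that h_{A+B}(u) = h_A(u) + h_B(u) for two convex polygons in R^2. The polygons, the helper names support and minkowski_sum, and the chosen directions are assumptions made for this example only; they are not taken from the talk.

```python
import numpy as np

def support(points, u):
    """Support function h_K(u) = max over x in K of <u, x>, for K = conv(points)."""
    return float(np.max(points @ u))

def minkowski_sum(a, b):
    """Points generating A + B: all pairwise sums of the generating points."""
    return (a[:, None, :] + b[None, :, :]).reshape(-1, a.shape[1])

# Two illustrative convex sets in R^2, given by their vertices.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangle = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
total = minkowski_sum(square, triangle)

# Compare h_{A+B}(u) with h_A(u) + h_B(u) on a set of unit directions.
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
directions = [np.array([np.cos(t), np.sin(t)]) for t in angles]
max_gap = max(
    abs(support(total, u) - support(square, u) - support(triangle, u))
    for u in directions
)
print(max_gap)  # zero up to floating-point rounding
```

Because the support function turns Minkowski sums into ordinary sums of functions, averages of random sets can be handled as averages of functions, which is what the Hilbert-space embedding exploits.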

Title: Statistical Inference on Membership Profiles in Large Networks
Abstract: Network data is prevalent in many contemporary big data applications in which a common interest is to unveil important latent links between different pairs of nodes. The nodes can be broadly defined such as individuals, economic entities, documents, or medical disorders in social, economic, text, or health networks. Yet a simple question of how to precisely quantify the statistical uncertainty associated with the identification of latent links still remains largely unexplored. In this talk, we suggest the method of statistical inference on membership profiles in large networks (SIMPLE) in the setting of degree-corrected mixed membership model, where the null hypothesis assumes that the pair of nodes share the same profile of community memberships. In the simpler case of no degree heterogeneity, the model reduces to the mixed membership model and an alternative more robust test is proposed. Under some mild regularity conditions, we establish the exact limiting distributions of the two forms of SIMPLE test statistics under the null hypothesis and their asymptotic properties under the alternative hypothesis. Both forms of SIMPLE tests are pivotal and have asymptotic size at the desired level and asymptotic power one. The advantages and practical utility of our new method in terms of both size and power are demonstrated through several simulation examples and real network applications.
(Joint work with Yingying Fan and Jinchi Lv)
Presentation File (2.5MB PDF) 
Jianqing FAN
Princeton University

Title: Group Inference in High Dimensions with Applications to Hierarchical Testing
Abstract: Group inference has been a longstanding question in statistics and the development of high-dimensional group inference is an essential part of statistical methods for analyzing complex data sets, including hierarchical testing, tests of interaction, detection of heterogeneous treatment effects and local heritability. Group inference in regression models can be measured with respect to a weighted quadratic functional of the regression subvector corresponding to the group. Asymptotically unbiased estimators of these weighted quadratic functionals are constructed and a procedure using these estimators for inference is proposed. We derive its asymptotic Gaussian distribution, which allows us to construct asymptotically valid confidence intervals and tests that perform well in terms of length or power. The results simultaneously address four challenges encountered in the literature: controlling coverage or type I error even when the variables inside the group are highly correlated, achieving good power when there are many small coefficients inside the group, computational efficiency even for a large group, and no requirements on the group size. We apply the methodology to several interesting statistical problems and demonstrate its strength and usefulness on simulated and real data.
This is based on the joint work with Claude Renaux, Peter Bühlmann and T. Tony Cai.
Presentation File (1.4MB PDF) 
Zijian GUO
Rutgers University

Title: Separation of Interindividual Differences, Intraindividual Changes, and Time-specific Effects in Intensive Longitudinal Data using the NDLC-SEM Framework
Abstract: In this talk, we propose a nonlinear dynamic latent class structural equation model (NDLC-SEM; Kelava & Brandt, 2019). It can be used to examine intraindividual processes of observed or latent variables. These processes are decomposed into parts which include individual and time-specific components. Unobserved heterogeneity of the intraindividual processes is modeled via a latent Markov process that can be predicted by individual-specific and time-specific variables as random effects. We discuss examples of submodels which are special cases of the more general NDLC-SEM framework. Furthermore, we provide empirical examples and illustrate how to estimate this model in a Bayesian framework. Finally, we discuss essential properties of the proposed framework, give recommendations for applications, and highlight some general problems in the estimation of parameters in comprehensive frameworks for intensive longitudinal data.
Kelava, A. & Brandt, H. (2019). A nonlinear dynamic latent class structural equation model. Structural Equation Modeling: A Multidisciplinary Journal, 26(4), 509-528. doi: 10.1080/10705511.2018.1555692
Presentation File (750KB PDF) 
Augustin KELAVA
University of Tübingen

Title: Computing the Best Subset Regression Model
Abstract: Several regression-tree strategies for computing all-subsets regression models are presented. Branch-and-bound techniques are employed to reduce the number of generated nodes. To improve the efficiency of the branch-and-bound algorithms, the variables can be preordered in the root node or in nodes deeper inside the tree. Approximation algorithms make it possible to tackle large-scale problems while giving guarantees on the error bounds. If the desired subset sizes are known in advance, the recursive structure of the regression tree can be exploited to generate a minimal covering subtree. Given a predetermined statistical search criterion, the various algorithms can be adapted to select the single best subset model, drastically reducing the number of generated nodes and thus improving execution times. An R package which efficiently implements the algorithms is described and its performance assessed.
E. J. KONTOGHIORGHES
Cyprus University of Technology / Birkbeck, University of London, UK
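
The branch-and-bound idea described in the abstract above can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of the R package mentioned in the talk: it searches for the best subset of one fixed size k by residual sum of squares (RSS), walks a tree in which each child drops one variable from its parent, and prunes a node when its RSS is already no better than the best size-k model found so far (dropping variables can never lower the RSS). The function names, the simulated data, and the absence of an intercept are assumptions made for the example.

```python
from itertools import combinations
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit of y on X[:, cols]."""
    if len(cols) == 0:
        return float(y @ y)
    beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    resid = y - X[:, cols] @ beta
    return float(resid @ resid)

def best_subset_bnb(X, y, k):
    """Best size-k subset by RSS, using a drop-one-variable tree with a bound."""
    best = {"rss": np.inf, "subset": None, "fits": 0}

    def visit(cols, start):
        # Positions before `start` are kept for good; later ones may be dropped.
        if start > k:
            return  # too few droppable variables left to reach size k
        best["fits"] += 1
        node_rss = rss(X, y, cols)
        # Bound: no subset of `cols` can have a smaller RSS than `cols` itself.
        if node_rss >= best["rss"]:
            return
        if len(cols) == k:
            best["rss"], best["subset"] = node_rss, tuple(cols)
            return
        for i in range(start, len(cols)):
            visit(cols[:i] + cols[i + 1:], i)

    visit(list(range(X.shape[1])), 0)
    return best

def best_subset_brute(X, y, k):
    """Exhaustive reference search over all size-k subsets."""
    return min(combinations(range(X.shape[1]), k),
               key=lambda c: rss(X, y, list(c)))

# Simulated data (hypothetical): only columns 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=60)

result = best_subset_bnb(X, y, k=2)
print(result["subset"], result["fits"])
```

The "fits" counter records how many least-squares problems were solved, which gives a rough sense of how much of the tree the bound removes. Preordering the variables, as the abstract suggests, would change the order in which children are visited and so tighten the bound earlier.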

Title: On Matrix Factor Models
Abstract: Some recently proposed time series models for the so-called realized volatility matrices (RCOV) are introduced. Estimated from high-frequency trading data, RCOV can be used as a promising measure of the underlying covariance structure of low-frequency returns. This motivates the need for modeling and forecasting RCOVs. A Bayesian approach for the factor model used in the finance literature, proposed by S.Y. Lee et al. (2007), is reviewed. The Bayesian approach could have great potential for factor models defined for the realized volatility matrices.
Presentation File (183KB PDF) 
Wai Keung LI
The Education University of Hong Kong
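
For readers unfamiliar with realized covariance matrices, the following Python sketch shows one common way such a matrix is computed from high-frequency data: as the sum of outer products of the intraday return vectors of a single day. The abstract does not state which estimator is used, so this definition, the function name realized_covariance, and the simulated returns are assumptions for illustration.

```python
import numpy as np

def realized_covariance(intraday_returns):
    """Sum of outer products r_t r_t' over the intraday intervals of one day.

    intraday_returns has shape (number of intervals, number of assets).
    """
    r = np.asarray(intraday_returns, dtype=float)
    return r.T @ r

# Hypothetical day: 78 five-minute returns for 3 assets.
rng = np.random.default_rng(1)
true_cov = 1e-6 * np.array([[1.0, 0.3, 0.1],
                            [0.3, 1.0, 0.2],
                            [0.1, 0.2, 1.0]])
returns = rng.multivariate_normal(np.zeros(3), true_cov, size=78)

rcov = realized_covariance(returns)
print(rcov)
```

Repeating this for every trading day yields a time series of symmetric positive semidefinite matrices, which is the kind of object a matrix time series or factor model would then describe.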

Title: Financial Systemic Risk Prediction with Non-Gaussian Orthogonal-GARCH Models
Abstract: There are several aspects of financial asset portfolio construction relevant for success. First, the methodology should be applicable to a reasonably large number of assets, at least on the order of 100. Second, calculations should be computationally feasible, straightforward, and fast. Third, realistic transaction costs need to be taken into account for the modeling paradigm to be genuinely applicable. Fourth, and arguably most importantly, the proposed methods should demonstrably outperform benchmark models such as the equally weighted portfolio, Markowitz IID and Markowitz using the DCC-GARCH model. A fifth "icing on the cake" is that the underlying stochastic process assumption is mathematically elegant, statistically coherent, and allows analytic computation of relevant risk measures for both passive and active risk management. The model structure to be shown, referred to as "COMFORT", satisfies all these criteria. Various potential new ideas will also be discussed, with the aim of enticing and motivating other researchers to collaborate and/or improve upon the shown investment vehicles.
Presentation File (7.4MB PDF) 
Marc PAOLELLA
University of Zurich

Title: Modelling Function-valued Processes with Nonseparable and/or Nonstationary Covariance Structure
Abstract: Separability of the covariance structure is a common assumption for function-valued processes defined on two- or higher-dimensional domains. This assumption is often made to obtain an interpretable model or due to difficulties in modelling a potentially complex covariance structure, especially in the case of sparse designs. We propose using Gaussian processes with flexible parametric covariance kernels which allow interactions between the inputs in the covariance structure. When we use suitable covariance kernels, the leading eigensurfaces of the covariance operator can explain well the main modes of variation in the functional data, including the interactions between the inputs. The results are demonstrated by simulation studies and by applications to real-world data.
Presentation File (5.3MB PDF) 
Jian Qing SHI
Newcastle University and
The Alan Turing Institute
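
To make the separability assumption concrete, the Python sketch below evaluates a squared-exponential kernel on a small two-dimensional grid. With a diagonal scale matrix the covariance factorizes into a product of one-dimensional kernels, so on the grid it equals a Kronecker product; adding an off-diagonal term introduces an interaction between the two inputs and breaks that factorization. The kernel form, the matrices A_sep and A_int, and the grid are hypothetical choices for illustration and are not the kernels used in the talk.

```python
import numpy as np

def se_kernel(points, A):
    """k(x, x') = exp(-0.5 * (x - x')' A (x - x')) for all pairs of rows in points."""
    diff = points[:, None, :] - points[None, :, :]
    quad = np.einsum("ijk,kl,ijl->ij", diff, A, diff)
    return np.exp(-0.5 * quad)

# Grid over two inputs (s, t), ordered with s as the slow index.
s = np.linspace(0.0, 1.0, 4)
t = np.linspace(0.0, 1.0, 5)
grid = np.array([(si, tj) for si in s for tj in t])

# One-dimensional kernels matching the diagonal entries 4 and 9 below.
K_s = np.exp(-0.5 * 4.0 * (s[:, None] - s[None, :]) ** 2)
K_t = np.exp(-0.5 * 9.0 * (t[:, None] - t[None, :]) ** 2)

A_sep = np.diag([4.0, 9.0])                 # no interaction between s and t
A_int = np.array([[4.0, 3.0], [3.0, 9.0]])  # off-diagonal term couples s and t

K_sep = se_kernel(grid, A_sep)
K_int = se_kernel(grid, A_int)

separable_matches = np.allclose(K_sep, np.kron(K_s, K_t))
interaction_matches = np.allclose(K_int, np.kron(K_s, K_t))
print(separable_matches, interaction_matches)
```

The second comparison only shows that the interaction kernel differs from this particular Kronecker product; it is meant as an intuition for what an input interaction in the covariance looks like, not as a formal test of separability.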

Title: Differential Item Functioning Analysis without A Priori Information on Anchor Items: Scree Plots and Graphical Test
Abstract: The detection of differential item functioning (DIF) is an important step in establishing the validity of measurements. Most traditional methods detect DIF using an item-by-item strategy, via anchor items that are assumed DIF-free. If anchor items are contaminated, the methods will yield misleading results due to biased scales. In this article, based on the fact that the item’s relative change of difficulty difference (RCD) does not depend on the mean ability of individual groups, a new DIF detection method (RCD-DIF) is proposed under the true null hypothesis, without a priori knowledge of anchor items. The RCD-DIF method consists of an RCD-scree plot that facilitates visual examination of DIF, and an RCD confidence interval that facilitates a formal test of DIF at the test level. Two simulation studies indicate that the RCD confidence interval performs better than three widely used methods in controlling the Type I error rate and has greater power, especially under unbalanced DIF conditions. Moreover, the RCD-scree plot displays the results in graphics, thereby visually revealing the overall pattern of DIF in the test and the size of DIF for each item. A real data analysis is conducted to illustrate the rationality and effectiveness of the RCD-DIF method.
Presentation File (1.8MB PDF) 
Ke-Hai YUAN
University of Notre Dame

Title: Challenges in Analyzing Two-sided Market and Its Application on Ride-sourcing Platform
Abstract: In this talk, we will introduce a general analytical framework for large-scale data obtained from two-sided markets, especially ride-sourcing platforms like DiDi. This framework integrates classical methods including Experiment Design, Causal Inference and Reinforcement Learning, with modern machine learning methods, such as Graph Convolutional Models, Deep Learning, Transfer Learning and Generative Adversarial Networks. We aim to develop fast and efficient approaches to address five major challenges for ride-sharing platforms, ranging from demand-supply forecasting, demand-supply diagnosis, MDP-based policy optimization, A/B testing, to business operation simulation. Each challenge requires substantial methodological developments and inspires many researchers from both industry and academia to participate in this endeavor. Based on our preliminary results for the policy optimization challenge, we received the Daniel Wagner Prize for Excellence in Operations Research Practice in 2019. All the research accomplishments presented in this talk are joint work by a group of researchers at Didi Chuxing and our international collaborators.
Hongtu ZHU
University of North Carolina at Chapel Hill

