Seminar Schedule






January 20, 2023

MSCS 310


Xiulin Xie

Transparent Sequential Learning: A Powerful Tool for Monitoring Sequential Processes

ZOOM Link:

Abstract: Sequential process monitoring has received considerable attention due to its broad applications, including the manufacturing industry, spatial-temporal disease surveillance, environmental monitoring, and many more. To sequentially monitor a process, a major statistical tool is the statistical process control (SPC) chart, whose major goal is to check whether a process has a significant distributional shift over time. Traditional SPC charts, however, are developed mainly for monitoring production lines in the manufacturing industry under the assumptions that process observations at different observation times are independent and identically distributed with a parametric (e.g., normal) distribution when the process is stable. These assumptions are rarely valid in applications. In this talk, we introduce a new learning framework, called “Transparent Sequential Learning”, for monitoring sequential processes. The new method can properly accommodate the longitudinal pattern of the process under monitoring and serial correlation in the observed data. It is also not limited to parametric distributional families. These properties make it an effective and powerful tool for monitoring sequential processes.
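
For readers unfamiliar with SPC charts, here is a minimal sketch of one traditional chart, a two-sided CUSUM, which signals when observations drift away from a target mean. It is not the speaker's Transparent Sequential Learning method. It relies on the independence assumptions that the abstract calls rarely valid, and the function name, parameters, and data are all hypothetical.

```python
def cusum_alarm(observations, target_mean, allowance, threshold):
    """Return the index of the first CUSUM alarm, or None if no alarm occurs.

    Illustrative only: assumes independent observations with a known
    in-control mean, which is the traditional SPC setting.
    """
    upper, lower = 0.0, 0.0
    for t, x in enumerate(observations):
        # Accumulate deviations above and below the target, minus an allowance.
        upper = max(0.0, upper + (x - target_mean) - allowance)
        lower = max(0.0, lower - (x - target_mean) - allowance)
        if upper > threshold or lower > threshold:
            return t
    return None


# Hypothetical data: five stable observations followed by an upward mean shift.
stable = [0.1, -0.2, 0.0, 0.15, -0.1]
shifted = stable + [2.0] * 5

first_alarm_stable = cusum_alarm(stable, target_mean=0.0, allowance=0.5, threshold=4.0)
first_alarm_shifted = cusum_alarm(shifted, target_mean=0.0, allowance=0.5, threshold=4.0)
```

In this toy example the stable series raises no alarm, while the shifted series triggers one a few observations after the shift begins.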


January 27, 2023

MSCS 310


Yan Li

Statistical Learning for Compositional Data with Application to Microbiome Analysis

Abstract: In recent years, microbiome studies have become increasingly prevalent and large-scale. Through high-throughput sequencing techniques and well-established analytical pipelines, relative abundance data of operational taxonomic units and their associated taxonomic structures are routinely produced. Such data do not admit the familiar Euclidean geometry and are typically of high dimension, inflated with excessive zeros, and subject to measurement errors. Due to these challenging data features, analytical methods for microbiome analysis are still in their infancy. We argue that an effective and interpretable learning approach for microbiome data is through aggregating the compositional components, i.e., through the so-called amalgamation operation. In this talk, we introduce novel amalgamation-based, taxonomy-guided learning paradigms for analyzing microbiome data in both supervised and unsupervised settings. The new frameworks can properly accommodate all the unique data features and achieve superior biological interpretation. The efficacy of the proposed methods is demonstrated in finite-sample theoretical properties, extensive simulation studies, and multiple real-world applications.


February 3, 2023

MSCS 310


Zhongyuan Chen

Data-guided Statistical Methods for Treatment Recommendations, Cancer Association Studies, and Beyond

Abstract: Despite the availability of large amounts of genomic-clinical data, medical treatment recommendations have yet to take full advantage of them. In this talk, I will first introduce a data-guided statistical machine learning approach for treatment recommendation with feature scores by applying a dimension reduction method (Sliced Inverse Regression). It allows highly general regression models for the treatment response, a large number of covariates, and convenient visualization of the optimal treatment recommendation. I will also show that, for some data, the optimal treatment recommendation can be achieved by accurate estimation of the conditional average treatment effect, which offers practical guidance on treatment recommendations using treatment effect estimation.
Next, I will discuss data-guided statistical methods in cancer association studies between somatic mutations and germline variations. We propose data-adaptive and pathway-based statistical tests for information aggregation at both SNP and gene levels so as to improve the statistical power. A low-rank approximation method is adopted to preselect parameters to improve the efficiency. I will also briefly mention molecular characterization and clinical relevance of metabolic expression subtypes in human cancers.
In addition, data-guided statistical methods are also useful for designing novel computational methods for dimension reduction strategies such as Sliced Inverse Regression and Principal Component Analysis. I will illustrate this in terms of a novel randomized eigenvalue solver.

March 3, 2023

MSCS 310


Dr. Akash Deep
School of Industrial Engineering and Management
Oklahoma State University

Event Data Analytics for Smart and Connected Systems

Event data is ubiquitously present in many industrial and business landscapes, such as failures, warnings, maintenance histories, health logs, repair records, customer interactions, etc. The rapid advances in data acquisition, communication, storage, and processing technologies in recent years have enabled the transformation of conventional industrial equipment into smart and connected systems. The wealth of data extracted presents unprecedented opportunities for applying advanced data analytical methods to enhance industrial operations. In this talk, several new data analytics techniques will be introduced, including modeling and prognosis of critical events for individualized production systems, and a novel scalable maintenance model for multicomponent systems. The advantageous features will be demonstrated through industrial case studies.

Fall 2022  

September 9, 2022

MSCS 310


Dr. Shahina Rahman
Dept. of Statistics
Texas A&M University


September 16, 2022

MSCS 310



Dr. Laura P. Coombs
2022 OSU Department of Statistics Distinguished Alumnus

Challenges to Evaluating and Monitoring Artificial Intelligence in Medical Imaging

Laura P. Coombs, PhD, proudly received her doctoral degree in Statistics from Oklahoma State University. She is now Vice President of Data Science and Informatics at the American College of Radiology, where she is responsible for the informatics portfolio, including the Data Science Institute. In that role, Laura collaborates with academic institutions, industry, and government, including the FDA, to ensure that artificial intelligence algorithms for medical imaging are deployed safely and effectively. She provides product oversight of the development of the AI-LAB, the Data Archive and Research Toolkit, and other informatics products including ACR Assist and ACR Common.

Laura started at the American College of Radiology as Director of Data Registries, a collection of quality registries for medical imaging. Prior to joining the ACR, she was an Assistant Research Professor at George Washington University Biostatistics Center, where she was the biostatistician on several National Institutes of Health-sponsored clinical trials. She also developed models for risk-adjustment for outcomes registries as an Assistant Research Professor at Duke Clinical Research Institute.

Refreshments immediately following the seminar in Room 309 MSCS.


October 7, 2022

MSCS 310


Dr. Ki Cole
Associate Professor
OSU Dept. Research, Evaluation, Measurement, and Statistics

Item Response Theory and Applications in R

Item Response Theory (IRT) is a modern test theory method used to analyze item response data. It is applied in educational testing (e.g., ACT, GRE) and in psychological surveys in many fields, including education, health, counseling, and business. IRT uses item response patterns from test- and survey-takers to model the probability of a specific item response given the examinee’s ability.

During the seminar, unidimensional, dichotomous IRT and its application using R will be presented. Students are encouraged to bring a laptop with R installed to work alongside the presentation (but this is not required). Models for polytomous data will be presented as time allows.
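
As a rough illustration of modeling the probability of an item response given ability, the sketch below assumes a two-parameter logistic (2PL) model, one common dichotomous IRT model. The abstract does not say which model the seminar covers, and the example is in Python rather than R. The function name, parameter names, and values are illustrative assumptions.

```python
import math


def irt_2pl_probability(theta, discrimination, difficulty):
    """Probability of a correct response under an assumed 2PL model.

    theta: examinee ability; discrimination and difficulty: item parameters.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical item with discrimination 1.0 and difficulty 0.0,
# evaluated for examinees of low, average, and high ability.
abilities = [-2.0, 0.0, 2.0]
probabilities = [
    irt_2pl_probability(theta, discrimination=1.0, difficulty=0.0)
    for theta in abilities
]
```

Under this assumed model, an examinee whose ability equals the item's difficulty has a 0.5 probability of a correct response, and the probability rises with ability.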


October 14, 2022

MSCS 310


Dr. Deb Mishra
Associate Professor
OSU Dept. of Civil & Environmental Engineering

Application of Data Analytics in Pavement Engineering: Some Case Studies

ABSTRACT: This presentation will focus on case studies involving the application of data analytics in the field of Pavement Engineering. Examples are taken from recent and ongoing research projects at Oklahoma State University, where transportation infrastructure health monitoring has been complemented by the application of statistical methods. Some of the applications to be discussed include: (1) Selection of appropriate data-filtering techniques; (2) Data reduction protocols to eliminate redundant information; (3) Sensor reliability analysis using data-based methods; and (4) Handling of missing-value problems in pavement health monitoring.


November 11, 2022

MSCS 310


Dr. Lucas M. Stolerman
Assistant Professor
OSU Dept. of Mathematics

Mathematical models and data-driven methods for infectious diseases and protein networks.

In this talk, I will cover some of my recent research projects. First, we will discuss two projects on Dengue Fever. In the first one, we explore the role of human mobility in epidemic outbreaks with the SIR-Network model. A second project is devoted to analyzing local climate conditions and their impact on Dengue outbreaks in Brazil with machine learning techniques. The second part of this talk will contain some of my work in cell biology, where local stability analysis provided insight into protein network dynamics. We will discuss biological problems related to protein clustering in the plasma membrane. In the third part of this talk, I will return to epidemiology and the specific problem of anticipating sharp increases in COVID-19 cases using internet-based data.

December 9, 2022

MSCS 310



Undisturbed Soils in Oklahoma

Amir Hossein Javid

Master's Candidate
Department of Statistics

Reliable estimates of soil water diffusivity are critical for describing and predicting water movement in unsaturated soils. Determining diffusivity (D) as a function of volumetric water content (θ) is important because this hydraulic property is fundamental to characterizing unsaturated water and solute transport in soils. Determining this property is complex and time-consuming, and it requires expensive instruments. For this reason, D has seldom been determined. Several methods have been proposed for determining soil water diffusivity. However, the Oklahoma Mesonet data set has predictors and diffusivity measurements that could be used to predict D with minimal effort.

The study's objective is to establish a data-driven statistical model for estimating diffusivity using Oklahoma Mesonet measurements. Specifically, the study presents a set of equations for estimating diffusivity in Oklahoma soils. The study's findings show that the statistical model produced results comparable to those obtained by the experimental (one-step pressure outflow) procedure. Thus, in settings where limited resources are available to collect test samples from the field, the statistical model could provide a reasonable estimate of diffusivity.