Seminar Schedule


   

 

 
   

9/15/17

2:30pm

MSCS 310

Dr. William Paiva
Center for Health Systems Innovation

Elvena Fong
Spears School of Business

Changing Health Care Through Data

With the recent digitalization of health data, there has been a high demand for data scientists who are able to extract value from health data. Come and learn about the Center for Health Systems Innovation and our efforts to innovate and transform rural and Native American health care through the implementation of innovative care delivery models and IT tools and solutions. We will discuss the driving forces in the market and how we are structured to meet those needs as an innovation center, and we will go over a small sampling of analytics and health care delivery projects. Come and learn more about how to get involved with us!

   

9/22/17

2:30pm

MSCS 310

Dr. Shengwu Shang
OSU Dept. of Statistics

The Effects of Spending on Test Pass Rates Revisited: A Spatial Statistics Approach

Abstract:

In 1994, Michigan initiated a school finance reform called Proposal A, aimed at equalizing school finances among school districts. Many studies have examined its impacts, either on student performance, such as Papke (2005, 2008) and Papke and Wooldridge (2008), or on socioeconomic segregation across school districts, such as Chakrabarti and Roy (2012). However, none of them accounts for the spatial dependence among school districts. In this paper, we adopt a logit model with spatial effects to analyze Michigan's MEAP data from the 2009/2010 school year; a hierarchical Bayesian technique is proposed for parameter estimation. This also gives a solution to the question raised in Papke and Wooldridge (1996).

   

9/29/17

2:30pm 

MSCS 310

Dr. Peter Hoyt
OSU, Bioinformatics 

Sequencing Technologies and a Bad Bug

Summary:

This seminar will discuss the new Illumina sequencing technology in the Genomics Core, including new developments in long-read Nanopore sequencing. The Illumina sequencer can produce 120 billion base pairs of DNA sequence in 28 hours, enough to sequence a human genome 40-fold over. But even powerful short-read sequencers leave gaps in genomes, and long reads are needed to close them. A short presentation on research performed using the genomes of an emerging pathogenic bacterium, Elizabethkingia anophelis, and related species will show genomic clues for how a bacterium can become lethal when DNA exchanges occur.
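The 40-fold figure in the summary above can be checked with a quick calculation. This is a minimal sketch, assuming a round human genome size of about 3 billion base pairs (a value not stated in the summary); the function name `fold_coverage` is illustrative only.

```python
def fold_coverage(total_bases, genome_size):
    """Average number of times each genome position is sequenced."""
    return total_bases / genome_size


# Run output quoted in the summary: 120 billion base pairs.
run_output_bp = 120e9
# Assumption: human genome of roughly 3 billion base pairs.
human_genome_bp = 3.0e9

print(fold_coverage(run_output_bp, human_genome_bp))  # 40.0
```

Coverage here is an average: individual regions can be covered far more or far less than 40 times, which is one reason gaps remain.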

   

10/6/17

2:30pm

MSCS 310

Bayesian Nonparametric Inference for Panel Count Data

Dr. Ye Liang
OSU Department of Statistics

In this paper, panel count data analysis for recurrent events is considered. Such analysis is useful for studying tumor or infection recurrences in both clinical trials and observational studies. A bivariate log-Gaussian Cox process model is proposed to jointly model the observation process and the recurrent event process. Bayesian nonparametric inference is proposed for simultaneously estimating regression parameters, bivariate frailty effects and baseline intensity functions. Inference is done through Markov chain Monte Carlo, using Riemann manifold Hamiltonian Monte Carlo for the high-dimensional sampling. Predictive inference is also discussed under the Bayesian setting. The proposed method is shown to be efficient via simulation studies. Two data examples from clinical trials are analyzed to illustrate the proposed approach.

   

 

 
   

 

 
   

 

11/10/17

2:30pm

MSCS 310

Dr. Saeed Piri
OSU, Dept of Management Science and Information Systems

A dataset is called imbalanced when the number of examples from one class outnumbers the number of examples from another class. Learning from imbalanced datasets is one of the major challenges in machine learning. While a standard classifier can perform very well on a balanced dataset, its performance deteriorates dramatically when it is applied to an imbalanced dataset. This poor performance is especially troublesome in detecting the minority class, which is usually the class of interest. Over-sampling the minority class is one of the most promising remedies for imbalanced data learning. In this study, we propose a new synthetic informative minority over-sampling (SIMO) algorithm embedded in a support vector machine (SVM). In this algorithm, an SVM is first applied to the original imbalanced dataset; then the minority examples close to the SVM decision boundary, which are the informative minority examples, are over-sampled. We also developed another version of SIMO, which we call weighted SIMO (W-SIMO). W-SIMO differs from SIMO in the degree to which the informative minority examples are over-sampled. In W-SIMO, incorrectly classified informative minority examples are over-sampled to a higher degree than correctly classified informative minority examples. In this way, there is more focus on incorrectly classified minority examples. The dataset over-sampled by our algorithms can be used to train any classifier and is not limited to SVM. We applied our algorithms to 15 publicly available benchmark imbalanced datasets and assessed their performance in comparison with existing approaches in the area of imbalanced data learning. The results showed that our algorithms had the best performance on the benchmark datasets compared to the other approaches.

Diabetes is a common chronic disease that may lead to several complications. Diabetic retinopathy (DR), one of the most serious of these complications, is the most common cause of vision loss among diabetic patients. In this study, we analyzed data from more than 1.4 million diabetics and developed a clinical decision support system (CDSS) for detecting DR. While the existing diagnostic approach requires access to ophthalmologists and expensive equipment, our CDSS uses only demographic and lab data to detect patients' susceptibility to retinopathy with high accuracy. In this study, we developed a novel “confidence margin” ensemble technique that outperformed the existing ensemble models. Our CDSS provides several important practical implications, including identifying the DR risk factors, facilitating the early diagnosis of DR, and solving the problem of low compliance with annual retinopathy screenings.
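The over-sampling idea described for SIMO and W-SIMO above can be sketched roughly as follows. This is an illustration under stated assumptions, not the speaker's algorithm: the abstract does not say how synthetic examples are generated, so linear interpolation between informative points is assumed here; the classifier scores are taken as given (for example, decision values from an already trained SVM), the minority class is assumed to be the positive class, and the function and parameter names (`oversample_informative`, `margin`, `misclassified_weight`) are hypothetical.

```python
import random


def oversample_informative(minority, scores, margin=1.0, n_new=4,
                           misclassified_weight=1.0, seed=0):
    """Create synthetic minority points near a decision boundary.

    minority: list of feature vectors (lists of floats) from the minority class.
    scores: signed classifier score for each minority point; a negative score
        means the point is misclassified (minority assumed positive).
    margin: points with |score| <= margin are treated as informative
        (an assumed rule; the abstract only says "close to the boundary").
    misclassified_weight: 1.0 samples all informative points equally
        (SIMO-like); values above 1.0 favor misclassified points (W-SIMO-like).
    """
    rng = random.Random(seed)
    informative = [(x, s) for x, s in zip(minority, scores) if abs(s) <= margin]
    if len(informative) < 2:
        return []
    points = [x for x, _ in informative]
    weights = [misclassified_weight if s < 0 else 1.0 for _, s in informative]
    synthetic = []
    for _ in range(n_new):
        # Pick a base point (weighted) and a partner (uniform), then
        # interpolate between them; base == partner yields a duplicate.
        base = rng.choices(points, weights=weights, k=1)[0]
        partner = rng.choice(points)
        t = rng.random()
        synthetic.append([b + t * (p - b) for b, p in zip(base, partner)])
    return synthetic


# Toy data: the third point is far from the boundary and is never used.
minority = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
scores = [0.5, -0.3, 3.0]
new_points = oversample_informative(minority, scores, n_new=6,
                                    misclassified_weight=3.0)
print(len(new_points))  # 6
```

The synthetic points would be appended to the original training set, which could then be used to train any classifier, as the abstract notes.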

   
 

11/17/17

2:30pm

MSCS 310

Seminars for GTAs and Others: The Effects of Ambiguity

Dr. Adam Molnar, Dept. of Statistics 

Abstract: When deciding how to teach statistics, instructional experts make choices about topics and definitions. In many cases those choices are universal, but not in all. The first part of this talk examines situations where a lack of consensus exists in courses taught at Oklahoma State. The second part of this talk describes lexical ambiguity, in which a term has a different meaning in statistics than in another context, such as mathematics or everyday life. For both lack of consensus and lexical ambiguity, there are actions that can reduce the negative effects of ambiguity; such actions will be described.