Statistics Learning with BioData Club: A Dive into Logistic Regression Modeling

Friday, June 7
4:00 – 5:30
SON 358 (School of Nursing)

In biomedical research, we often wish to classify data into two or more groups (e.g., healthy and diseased) based on a variety of measurement variables. But how do you determine whether the model you’ve selected is good?

In this BioData Club workshop, instructor Crista Moreno will discuss the mathematics of logistic regression for binary classification modeling and show how to guard against overfitting using cross validation in R.

Through hands-on exercises, attendees will learn about the following concepts and data science skills:

  • Logistic regression (logit function, probability, binary classification)
  • Overfitting (adding parameters and high-dimensional spaces)
  • Cross validation
  • Cross validation error
  • R (R Markdown, R packages magrittr, dplyr, ggplot2, tidyr, corrplot, caret, rgl)

Anyone with an interest in building a classification model for biomedical data is encouraged to attend! Prior experience with R and RStudio, along with a basic knowledge of classification modeling and mathematical functions, will be helpful but is not required.

Crista Moreno is a mathematical scientist and avid R user who works as a statistician at OHSU. This workshop is sponsored by the OHSU Library, DMICE, and BERD. BioData Club is supported by the OHSU Library and DMICE.

Please register in advance:  http://bit.ly/logregOHSU

Participants are strongly encouraged to bring laptops with R and RStudio installed.

Everyone is welcome!