CASE STUDY: JMP027

Titanic Passengers

by Marlene Smith, University of Colorado Denver Business School

Key Concepts: Logistic regression, log odds and logit, odds, odds ratios, prediction profiler

Authors

Dr. Marlene Smith

University of Colorado Denver

Objective

Use passenger data from the sinking of the RMS Titanic to explore questions about survival rates. For example, did the survivors share any key characteristics? Were some passenger groups more likely to survive than others? Can we accurately predict survival?

Background

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1,502 of the 2,224 passengers and crew. This sensational tragedy shocked the international community and motivated the adoption of better maritime safety regulations.

One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others. (“Titanic: Machine Learning from Disaster.” From a Kaggle competition. Available at http://bit.ly/1f2crzi, data accessed 08/2014.)

The Task

We use this rich and storied example to explore some questions of interest about Titanic survival rates. For example, were there any key characteristics shared by survivors? Were some passenger groups more likely to survive than others? Can we accurately predict survival?

We will fit a logistic regression model to the available data to explore these questions.
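As a rough illustration of the odds and log odds (logit) that a logistic regression model works with, the sketch below converts the overall figures quoted in the Background (1,502 deaths among 2,224 passengers and crew) into a survival probability, odds, and log odds. It is a minimal sketch in Python rather than JMP, the function names are hypothetical, and the figures include crew, so the results will not match the passenger-only data set used in the case study. Under these figures the survival probability is about 0.32 and the odds of survival are about 0.48.

```python
import math


def odds(p: float) -> float:
    """Odds of an event that has probability p (0 < p < 1)."""
    return p / (1.0 - p)


def logit(p: float) -> float:
    """Log odds: the quantity a logistic regression models as linear in the predictors."""
    return math.log(odds(p))


def inverse_logit(z: float) -> float:
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Figures quoted in the Background section (passengers and crew combined).
total_on_board = 2224
deaths = 1502
survivors = total_on_board - deaths

p_survive = survivors / total_on_board
survival_odds = odds(p_survive)          # equals survivors / deaths
survival_log_odds = logit(p_survive)     # negative because survival was less likely than death

print(f"P(survive) = {p_survive:.4f}")
print(f"odds       = {survival_odds:.4f}")
print(f"log odds   = {survival_log_odds:.4f}")
```

An odds ratio compares two such odds values, for example the odds of survival in one passenger group divided by the odds in another; a ratio above 1 means the first group had the higher odds.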


Use the links below to read the full case study and download the data files.