Course Overview

This course studies methods for deriving information from, and making decisions based on, limited or noisy data. In the information age, large amounts of data are created almost constantly. Unfortunately, the sources of this data are often of low quality compared to the controlled studies for which simple statistical inference techniques are effective. To exploit this abundance, we need methods that incorporate data into our models and decisions while accounting for the uncertainty introduced by this “noisy” data. Many of these tools can be understood in terms of conditional probability.

The course will begin with mathematical preliminaries, including set theory and Boolean algebra, the basics of mathematical reasoning, combinatorics, and discrete and continuous probability spaces. After this, we will use these tools for a careful study of conditional probability, with an emphasis on the use of Bayes’ theorem. Finally, we will explore several applications for modeling random scenarios, including Markov chains, information theory, and decision theory.
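
As a small taste of the conditional-probability reasoning described above, here is a minimal Python sketch of Bayes’ theorem for a yes/no hypothesis. The function name, its parameters, and the numbers are hypothetical and chosen only for illustration; they are not taken from the course materials.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H using Bayes' theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # P(E), by the law of total probability
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Illustrative (made-up) numbers: a rare condition and a fairly accurate test.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # prints 0.161
```

Even with a test that is right 95% of the time, the posterior probability under these assumed numbers is only about 16%, because the hypothesis was unlikely to begin with. Untangling results like this is the kind of reasoning the course aims to develop.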

Who this course is for

Prerequisites

In this course, we expect a basic knowledge of probability and statistics, at the level of ISTA 116, MATH 263, or similar. We also expect Python programming at the level of ISTA 130 or another introductory course. Experience with logical reasoning will be helpful, but we do not require any formal coursework in logic.

If you do not meet these prerequisites, but believe you have sufficient preparation to enroll, please contact me to discuss in person.

Required work

Readings and textbooks

There is not a single textbook that covers the material we need at exactly the right level, so we will be drawing from several sources.

Our primary source is the course notes developed for a somewhat different version of 311 by Colin Reimer Dawson; we will also draw some material from the following sources.

You don’t need to buy any books for this course; all required and recommended readings are available for free or will be excerpted as needed and posted on D2L.

The following are not required readings, but offer a different perspective on the ideas we are engaging with in the course.

  • Allen B. Downey, Think Bayes. This is a book on Bayesian statistics and estimation written from a computational perspective rather than a mathematical one. If you are more comfortable thinking about probability using Python than using mathematics, this book may help you understand the course material better.
  • Daniel Kahneman, Thinking, Fast and Slow. This is not a book on probability or statistics, but rather a book on human cognition, derived from the behavioral research that earned the author the 2002 Nobel Prize in Economics. A recurring theme in the book is the inability of certain cognitive modes to correctly process statistical information, particularly information pertaining to conditional probabilities. Since we all must approach this subject through our own human cognitive abilities, it is enlightening to understand where and how the weaknesses of those abilities may affect us.

Homework Assignments

The most important work in this course consists of homework assignments, which come in two forms. You are asked to complete six handwritten homework assignments, which consist of mathematical calculations and written analysis of reasoning problems. There will also be four programming-based assignments, which are to be completed in Python. These will involve writing software to perform basic analyses and to solve data-related problems arising in practice.