ORF 350 --- Spring 2019

Analysis of Big Data


Basic info

Course description: The amount of data in our world has been exploding, and analyzing large data sets is a central challenge in society. This course introduces the statistical principles and computational tools for analyzing big data. Topics include statistical modeling and inference, model selection and regularization, scalable computational algorithms, and more.

The course has two main learning objectives:
To achieve the latter, we will have numerous assignments requiring the analysis of data sets.
These will utilize Jupyter notebooks, with a focus on Python and R.

Prerequisites: ORF 245 (Statistics) and ORF 309 (Probability and Stochastic Systems).

Instructor: Miklos Z. Racz
Lecture time and location: MW 8:30 - 9:50 am, 006 Friend Center
Office hours: M 10:00 am - 12:00 pm, 204 Sherrerd Hall

Teaching Assistants (AIs):
Precepts:


Grading and course policies

Grading: There will be homework problem sets throughout the semester (approximately weekly), as well as a midterm and a take-home final exam.
Your final score is a combination of your performance in these, with the following breakdown:
Midterm info: Monday, March 11, in class

Final info: Take-home final exam, details posted on Piazza

Homework and collaboration policy:
Please be considerate of the grader and write solutions neatly. Unreadable solutions will not be graded.
Please follow the instructions on the problem set regarding submitting your homework online via Blackboard.
Please write your name, your Princeton email, and the names of any students with whom you discussed the problems on the first page of your HW.
No late homework will be accepted. Your lowest homework score will be dropped.

You should first attempt to solve homework problems on your own.
You are encouraged to discuss any remaining difficulties in study groups of two to four people.
However, you must write up the solutions on your own and you must never read or copy the solutions of other students.
Similarly, you may use books or online resources to help solve homework problems, but you must always credit all such sources in your writeup, and you must never copy material verbatim.

Advice: do the homeworks! The best way to understand the material is to solve many problems and analyze many data sets. In particular, the homeworks are designed to help you learn the material along the way.

Email policy: For questions about the material, please come to office hours.
For general interest questions, please post to the course Piazza page.
This facilitates quick and efficient communication with the class.
Please use email only for emergencies and administrative or personal matters.
Please include "ORF 350" in the subject line of any email about the course.



Resources

Recommended text:
Piazza: The course has a Piazza page.
Think of this as a Q&A wiki for the course; use it for questions and discussions. For more details, see Piazza.



Schedule

Classes begin on Monday, February 4. Note: this plan is subject to change depending on how we progress throughout the semester.


