STAT 453 -- Introduction to Deep Learning and Generative Models (Spring 2020)
Table of Contents
- Table of Contents
- Project Awards
- Course Resources
- Course Logistics
- Course Description
- Resources
- Grading
- Exams
- Class Project
- Other Important Course Information
- Schedule
Project Awards
We had an amazing selection of student project presentations this year! Below are the winners (by voting) from each category: Best Oral Presentation, Best Visualizations, and Most Creative Project.
Course Resources
For the course material, we are going to use a mix of different technologies, each best suited to its task.
- General information and schedule: General information about this course will be provided through this website. In particular, keep an eye on the calendar at the bottom, which will list important dates and provide links to the course material.
- Course material: The course material (PDF files, code files) will be served through a GitHub repository. The reason is that it permits updates with transparent date stamps and the tracking of changes. Also, the machine learning research community relies heavily on GitHub for sharing code and research results, which is why it is beneficial for you to become familiar with it. You can obtain the course material (slides, code examples, etc.) directly from the GitHub repository. However, I will also link the lecture notes, slides, and code examples in the calendar at the bottom of this page.
- Important information and announcements: Important course information and deadlines (as well as updates or changes) will also be shared via the Announcements on Canvas. You should get an automated email each time I upload a new announcement there. Previously, I used ClassList; however, the Announcements feature on Canvas should make it easier to keep track of all current announcements. If this doesn’t work as well as planned, we can still switch to ClassList.
- Submissions: Homework assignments and project submissions are to be submitted via Canvas. I will provide more information and instructions regarding submissions during the course. Your course grade (points) will also be displayed on Canvas.
- Questions: Questions should generally be asked in the Piazza forum I set up for the course (you can access it through a link on Canvas or the direct link here). This is most efficient in case multiple students have the same or similar questions. Students are also encouraged to help other students on Piazza. For personal questions (missed assignments etc.), please contact me or the TA via email directly (please use the prefix “STAT453:” as the email subject header).
Piazza 1-time sign-up link: piazza.com/wisc/spring2020/sp20stat453001
Piazza forum link: https://canvas.wisc.edu/courses/192139/external_tools/65
Course Logistics
When
- Tue 11:00 am - 12:15 pm
- Thu 11:00 am - 12:15 pm
Where
- SMI 331 (SMI = Service Memorial Institute)
Instructors
- Instructor: Sebastian Raschka
- Teaching Assistant: Zhongjie Yu
Office Hours
- Prof. Sebastian Raschka (Instructor) – MSC 1171 (MSC = Medical Sciences Center):
- Thu 1:00 - 2:00 pm (Jan 23rd - Apr 30th)
- Exceptions:
- No office hour on Apr 23rd -> replacement office hour on Tue Apr 21st instead (1:00-2:00 pm)
- Zhongjie Yu (Teaching Assistant) – MSC 1130:
- Thu 2:30 - 3:30 pm
Course Description
Credits: 3
Course Description:
Deep learning is a field that specializes in discovering and extracting intricate structures in large, unstructured datasets for parameterizing artificial neural networks with many layers. Since deep learning has pushed the state-of-the-art in many research and application areas, it’s become indispensable for modern technology.
The focus of this course will be on understanding deep, artificial neural networks by connecting them to related concepts in statistics, such as generalized linear models and maximum likelihood estimation. Beyond covering deep learning models for predictive modeling, the latter portion of this course will focus on deep generative models and models based on stochastic variational inference, which allow for learning directed probabilistic models.
Besides covering and explaining deep learning and generative models on a mathematical and conceptual level, this course emphasizes the practical aspects of deep learning. Open-source libraries from the Python open-source ecosystem for scientific computing will be used to provide students with hands-on experience for implementing deep neural nets, working on supervised learning tasks, and applying generative models for dataset synthesis. Regarding the class project, students will form teams of three and collaboratively work on a project proposal to outline the planned scope of the project and meet with the lecturer for further discussion and feedback. After receiving feedback on the proposal, students will work independently towards the final project report, which will be submitted in the form of a conference paper for peer-review by other students and the lecturer. Finally, the students will give an 8-10-minute talk at the end of the semester to formally present their projects in class.
Learning Outcomes:
- Developing an advanced understanding of deep learning and generative models, which represent state-of-the-art approaches for predictive modeling in today’s data-driven world.
- Identifying scenarios where it makes sense to apply deep learning for real-world problem-solving.
- Building a repertoire of different algorithms and approaches to deep learning and understanding their various strengths and weaknesses.
- Learning how to use the Python programming language and Python’s scientific computing stack for implementing deep learning algorithms to 1) enhance the learning experience, 2) conduct research and be able to develop novel algorithms, and 3) apply deep learning to problem-solving in various fields and application areas.
- Combining both the theoretical and practical concepts taught in this class for creative, real-world problem solving and communicating the outcome professionally in the form of a scientific paper and a formal oral presentation.
Course Prerequisites:
Along with introducing the concepts of deep learning and generative models, the in-class lectures will provide a refresher on relevant concepts from calculus and linear algebra; however, a calculus background (e.g., Math 221) and a linear algebra background (e.g., Math 340) are recommended. While this course will also provide an introduction to the basics of the Python programming language for machine learning, it is highly recommended that students are familiar with basic programming and have completed an introductory programming class.
The official requisites are
- MATH 320, 321, 340, 341, graduate/professional standing, or member of the Statistics Visiting International Scholars program
Course Audience:
Students majoring in math or statistics or those wishing to take additional statistics courses.
Resources
Useful Background Material
For those who haven’t taken the machine learning course, no worries, the concepts in the deep learning course are related, but this course will not require knowledge of the machine learning material I taught there. However, it would be good if you could review certain sections from the previous course if you haven’t taken it:
- L01, ML intro/overview
- L02, intro to Supervised Learning: KNN
- L04, Python’s Scientific Computing Stack
- Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
Links to the lecture notes can be found at the bottom (linked in the calendar) of the course website at http://pages.stat.wisc.edu/~sraschka/teaching/stat479-fs2019/
Deep Learning Books
Deep learning is a new and very fast-moving field, and much of the knowledge is contained in freely available research articles and other articles shared freely on the internet. As of today, there is also no textbook available that would be suitable as the main text for this course.
Thus, I will link free resources, including internet articles and research articles that are relevant for the course. The book suggestions are recommendations but not requirements. I will not use any chapters directly for this course, but you can use them as a personal reference.
Deep Learning – Goodfellow, Bengio, Courville
The “Deep Learning” book by Goodfellow et al. is nice for deeper theoretical coverage of the topic. The book is also officially and freely available as a web version at http://www.deeplearningbook.org
Python Machine Learning, 3rd Edition – Raschka & Mirjalili
The Python Machine Learning book provides a great intro to general machine learning; the deep learning chapters are in TensorFlow though, and we will be using PyTorch in this class. However, the explanations are still useful.
Deep Learning with PyTorch
“Deep Learning with PyTorch” is the most relevant book, but it has not been released yet. There is a free preview version available at https://www.manning.com/books/deep-learning-with-pytorch
A direct link to the PDF draft chapters is available at:
Python Resources
Regarding Python, we will mainly focus on two libraries: NumPy and PyTorch. You can think of NumPy as a linear algebra library that provides utilities similar to MATLAB (if you are familiar with MATLAB). It is used in almost any scientific computing task and by many other Python libraries, and it is generally useful. PyTorch is the main deep learning library we will be using. My deep learning background is in Theano and TensorFlow, but I made the switch to PyTorch about 1 1/2 years ago when it was released, as it offers many advantages over TensorFlow at the same computational performance. In fact, most people now use it for research, and, as I know from colleagues, Stanford and NYU, among many others, have switched to it from TensorFlow for teaching as well.
In any case, you don’t need to be an expert Python programmer to use these libraries (and I will teach you about PyTorch in this course, so no worries about learning it beforehand). However, some basic familiarity with Python will be necessary in order to use these libraries.
Illustrated Guide to Python (recommended)
- “Illustrated Guide to Python 3: A Complete Walkthrough of Beginning Python with Unique Illustrations Showing how Python Really Works. Now covering Python 3.6 (Treading on Python) (Volume 1)” by Matt Harrison, ISBN-13: 978-1977921758.
This book will not be covered in class. However, some readers asked me for good Python resources as preparation for this class, and this is one of the resources I would recommend. There are also many other Python learning resources available online.
For instance, another great book is Allen Downey’s Think Python 2e (free PDF available at https://greenteapress.com/wp/think-python-2e/).
Interactive Python course on Codecademy (highly recommended)
Depending on your preferred learning style, also consider learning Python interactively instead of, or in addition to, reading a Python book. A great interactive resource for learning Python is Codecademy: https://www.codecademy.com. In particular, there is a free, < 10 hr interactive course: https://www.codecademy.com/learn/learn-python.
Python Like You Mean It
A short, free intro for getting started with Python and its main scientific computing libraries: https://www.pythonlikeyoumeanit.com.
Python for Beginners (Video Lectures)
A great video series by educators at Microsoft, which was recently made available for free on YouTube: https://www.youtube.com/playlist?list=PLlrxD0HtieHhS8VzuMCfQD4uJ9yne1mE6.
Grading
The final grade will be computed using the following weighted grading scheme:
- 20% Problem Sets
- 50% Exams:
- 20% Midterm Exam
- 30% Final Exam
- 30% Class Project:
- 5% Project proposal
- 10% Project presentation
- 15% Project report
To make the grading more transparent and provide students with a better intuition of their performance throughout the course, there will be a total of 1000 points in this course. For instance, 200 points can be obtained from homework assignments (20% of the final grade), 500 points from exams (50% of the final grade), and 300 points from the class project (30% of the final grade).
The final letter grade will be based on the total number of points/percent of the total points accumulated in the course:
- A: >= 930 points or >= 93%
- AB: >= 900 points or >= 90%
- B: >= 850 points or >= 85%
- BC: >= 800 points or >= 80%
- C: >= 700 points or >= 70%
- D: >= 500 points or >= 50%
- F: < 500 points or < 50%
The final grades will not be curved and will be determined based on an absolute scale (percentage of the total points as listed above) to avoid competition between students and encourage collaboration when studying for this course. Graduate students will not be graded separately. For those who have concerns: empirically, this concept has worked very well in my past courses, with 20-30% of students performing extremely well and receiving an A.
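The point scheme above can be sketched in a few lines of Python. This is only an illustration, not an official grade calculator: the per-component point budgets are derived from the listed percentages of the 1000-point total, and the names `MAX_POINTS`, `LETTER_CUTOFFS`, `total_points`, and `letter_grade` are hypothetical.

```python
# Point budget per component, derived from the weights listed above
# (e.g., 20% of 1000 points = 200 points for the problem sets).
MAX_POINTS = {
    "problem_sets": 200,
    "midterm_exam": 200,
    "final_exam": 300,
    "project_proposal": 50,
    "project_presentation": 100,
    "project_report": 150,
}

# Letter-grade cutoffs from the list above, highest first.
LETTER_CUTOFFS = [
    (930, "A"),
    (900, "AB"),
    (850, "B"),
    (800, "BC"),
    (700, "C"),
    (500, "D"),
]


def total_points(earned):
    """Sum the earned points after checking them against the budgets."""
    for component, points in earned.items():
        if component not in MAX_POINTS:
            raise KeyError(f"unknown component: {component}")
        if not 0 <= points <= MAX_POINTS[component]:
            raise ValueError(f"{component}: {points} is out of range")
    return sum(earned.values())


def letter_grade(points):
    """Map a point total (out of 1000) to a letter grade, no curve."""
    for cutoff, letter in LETTER_CUTOFFS:
        if points >= cutoff:
            return letter
    return "F"


# Made-up example scores for one student.
example = {
    "problem_sets": 180,
    "midterm_exam": 170,
    "final_exam": 260,
    "project_proposal": 50,
    "project_presentation": 90,
    "project_report": 130,
}
print(total_points(example), letter_grade(total_points(example)))
```

Because the scale is absolute, the letter grade depends only on a student's own total and not on how other students perform.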
Exams
Both the midterm and final exam will be conceptual, which means that you will not be asked to write code in the exam. You should bring a pocket calculator to the exam, but otherwise, no further material will be permitted (except pens).
The final will be cumulative in the sense that some of the earlier topics may be relevant to the final exam; however, the final exam will largely focus on the parts covered after the midterm. In other words, you still should be familiar with all concepts covered in the course, but questions will be centered around the topics after the midterm.
While there will be different types of questions, one question could be as follows:
Q: Does the (computational) time complexity of a k-Nearest Neighbor classifier grow linearly, quadratically, or exponentially with the number of samples in the training dataset? Explain your answer in 1-2 sentences.
A: Linearly. For each new training point there is an additional distance computation.
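To make the reasoning in the sample answer concrete, here is a minimal brute-force sketch in pure Python (not course-provided code; the function name `knn_predict` and the toy data are hypothetical). It counts the distance computations needed to classify one query point, which is one per training example. It uses `heapq.nsmallest` to pick the k closest points, so for a fixed k the selection step does not change the linear growth.

```python
import heapq


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Returns the predicted label and the number of distance computations.
    """
    scored = []
    n_distances = 0
    for x, label in zip(train_X, train_y):
        # Squared Euclidean distance; the square root is not needed for ranking.
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        scored.append((dist, label))
        n_distances += 1

    # Select the k closest points without fully sorting the list.
    nearest = heapq.nsmallest(k, scored, key=lambda pair: pair[0])

    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get), n_distances


# Toy dataset with two well-separated classes.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]

label, n_small = knn_predict(train_X, train_y, (0.2, 0.2))
_, n_large = knn_predict(train_X * 2, train_y * 2, (0.2, 0.2))
print(label, n_small, n_large)
```

Doubling the training set doubles the number of distance computations for a single prediction, which is the linear growth described in the answer.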
Class Project
Overview
The goal of working on a class project is three-fold. First, it will provide you with the opportunity to apply the concepts learned in this class creatively, which helps you understand the material more deeply. Second, designing and working on a unique project in a team is something that you will encounter, if you haven’t already, sooner rather than later in life, and this course project helps you prepare for that. Third, along with the opportunity to practice and the satisfaction of working creatively, students can use this project to enhance their portfolio or resume.
Note about grading
There is no “perfect project.” While you are encouraged to be ambitious, the most important aspect of this project is your learning experience. Hence, you don’t want to pick something that is too easy for you, but similarly, you don’t want to choose a project that you suspect may be out of the scope of this class. (However, note that the more comprehensive and interesting the project is, the easier you’ll find it to write the 6-8-page project report.) The project is not graded by how exciting it is but based on whether you follow the objectives of the project proposal, project presentation, and project report. For instance, if your project ends up being unsuccessful – for example, if you choose to design a classifier and it doesn’t achieve the desired accuracy – it will not negatively affect your grade as long as you are honest, describe the potential issues well, and suggest improvements or further experiments. Again, the objective of this project is to provide you with hands-on practice and an opportunity to learn.
The project consists of 3 parts:
- a project proposal,
- a short project presentation,
- and a project report.
The expectations for each part will be discussed in the following sections.
1) Project Proposal
Please note that you should use the proposal-latex file(s) for writing and submitting your proposal!
The main purpose of the project proposal is to receive feedback from the TAs/the instructor regarding whether your project is feasible and whether it is within the scope of this class. Also, the project proposal offers a chance to receive useful feedback and suggestions on your project.
For this project, you will be working in a team consisting of three students. You are encouraged to form groups by yourself, as discussed in class. If you cannot find group members, the TA and I will randomly assign you to a group. If you have any concerns working with someone in your group, please talk to a TA or the instructor for accommodations.
Proposal Format:
- The project proposal is a 1-3 page document (800-1200 words), excluding references.
- You are encouraged (not required) to use 1-2 figures to illustrate technical concepts.
- The proposal must be formatted and submitted as a PDF document (the submission deadline will be announced later via the calendar & email).
Introduction:
- Describe what you are planning to do.
- Briefly describe related work (if applicable).
Motivation:
- Describe why your project is exciting. E.g., you can describe why your project could have a broader societal impact. Or, you may describe the motivation from a personal learning perspective.
Evaluation:
- What would the successful outcome of your project look like? In other words, under which circumstances would you consider your project to be “successful?”
- How do you measure success, specific to this project, from a technical standpoint?
Resources:
- What resources are you going to use (datasets, computer hardware, computational tools, etc.)?
Contributions:
You are expected to share the workload evenly, and every group member is expected to participate in both the experiments and writing. (As a group, you only need to submit one proposal and one report, though. So you need to work together and coordinate your efforts.)
- Clearly indicate what computational and writing task each member of your group will be participating in.
It is crucial that you talk to each other regularly!!! Schedule regular meetings and/or use online communication tools (e.g., Gitter, Slack, or email) to stay in touch with your group members throughout the semester regarding the progress of your project.
Modifications to the proposal
After you have received feedback from the TAs/the instructor and your project proposal has been graded, you are advised to stick to the project outline in the proposal as closely as possible. However, if there is a concept introduced in a later lecture (for instance, a machine learning algorithm that you think is more appropriate than the one you proposed), you have the option to modify your proposal, but you are not penalized if you don’t. If you wish to update your project outline, talk to a TA first.
Project Proposal Assessment
The proposal will be graded based on the completeness of each of the 5 sections (Introduction, Motivation, Evaluation, Resources, and Contributions) and will not be based on language, style, or how “exciting” or “interesting” the project is. For each section, you can receive a maximum of 10 points, totaling 50 pts for the proposal overall.
Also, it is important to make sure that you acknowledge previous work and use citations properly when referring to other people’s work. Even minor forms of plagiarism (e.g., copying sentences from other texts) will result in a subtraction of at least 10 pts per incident. University guidelines also dictate that severe incidents need to be reported. If you are unsure about what constitutes plagiarism and how to avoid it, please see the helpful guides at https://conduct.students.wisc.edu/plagiarism/.
2) Project Presentation
During the last three lectures, you will be presenting your project to the class. The presentation is “free form” but should cover the following:
- introduce the topic to a general audience (your class);
- summarize the main approach or method;
- highlight the outcomes of your project.
The presentation should be 8-10 minutes long, with an additional 2 minutes reserved for questions. All members of the group should participate in the presentation.
- To encourage attendance, we will use a random number generator in class to determine the order in which the groups will present.
- Please bring your own device for the presentation (we have a VGA and an HDMI cable for this projector). Further, I will provide the following connectors: Displayport-to-HDMI, Displayport-to-VGA, USB-C-to-VGA, USB-C-to-HDMI, Lightning-to-HDMI (for iPad).
- There will be 3 awards:
- Best Oral Presentation
- Most Creative Project
- Best Visualizations
- The awards will be determined by voting: each student will fill out a card in class (I will provide the cards), rating each presentation on a scale from 1-10 in each of the 3 categories (where 10 is best), and I will collect the cards at the end of the lecture.
The voting card should be filled out as follows:
- Title of the Presentation, a/10, b/10, c/10
- Title of the Presentation, a/10, b/10, c/10 …
where
- a are the points for 1. Best Oral Presentation
- b are the points for 2. Most Creative Project
- c are the points for 3. Best Visualizations
The awards will be computed based on the highest number of points for each category. However, one project can only receive one of the prizes. The points for the grade are considered independently from the 3 prize categories. The rubric for the grades is provided in the subsection Project Presentation Assessment below.
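As an illustration of how the voting cards could be tallied, here is a small Python sketch. It is not the procedure actually used in class. In particular, the text above does not say how the "one prize per project" rule is resolved, so the sketch assumes that categories are decided in the order a, b, c, that a project that has already won is skipped in later categories, and that ties go to the first title encountered. The names `tally` and `assign_awards` and the example cards are hypothetical.

```python
# Category order matches the a/b/c positions on the voting card.
CATEGORIES = (
    "Best Oral Presentation",
    "Most Creative Project",
    "Best Visualizations",
)


def tally(cards):
    """Sum the (a, b, c) scores per presentation title over all cards."""
    totals = {}
    for card in cards:
        for title, scores in card.items():
            running = totals.setdefault(title, [0, 0, 0])
            for i, score in enumerate(scores):
                running[i] += score
    return totals


def assign_awards(totals):
    """Pick the top-scoring project per category, one prize per project.

    Assumption: categories are processed in the order of CATEGORIES and
    earlier winners are excluded from later categories.
    """
    winners = {}
    already_awarded = set()
    for i, category in enumerate(CATEGORIES):
        candidates = [t for t in totals if t not in already_awarded]
        if not candidates:
            break
        best = max(candidates, key=lambda title: totals[title][i])
        winners[category] = best
        already_awarded.add(best)
    return winners


# Two made-up voting cards covering three presentations.
cards = [
    {"Project 1": (9, 9, 9), "Project 2": (8, 7, 5), "Project 3": (5, 8, 8)},
    {"Project 1": (10, 9, 9), "Project 2": (7, 6, 6), "Project 3": (6, 8, 9)},
]
awards = assign_awards(tally(cards))
for category, title in awards.items():
    print(f"{category}: {title}")
```

In this example, "Project 1" has the highest total in every category but receives only the first prize, so the remaining prizes go to the next-best projects that have not yet won.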
Project Presentation Assessment
The rubric for assigning the points (out of 100) for the presentation is provided below:
- 10 pts: Is there a motivation for the project given?
- 40 pts: Is the project described well enough that a general audience, familiar with machine learning, can understand the project?
- 20 pts: Are all figures legible and explained well?
- 20 pts: Are the results presented adequately discussed?
- 10 pts: Did all team members contribute to the presentation?
3) Project Report
The project report is expected to be 6-8 pages long (excluding references) and should contain the following sections:
- Introduction
- Related Work
- Proposed Method
- Experiments
- Results and Discussion
- Conclusions
- Contributions
More details are provided in the LaTeX report template at https://github.com/rasbt/stat453-deep-learning-ss20/tree/master/report-template.
Please note that you should use the report-latex file for writing and submitting your report!
Also, you are required to submit all the code, computations, and experiments you developed and conducted for this project. Note that the quality of the code will not have any influence on your grade and will merely serve as a basis to establish that the report contains original and “real” results.
Project Report Assessment
The rubric for grading the project reports is provided below.
Abstract: 15 pts
- Is enough information provided to get a clear idea about the subject matter?
- Is the abstract conveying the findings?
- Are the main points of the report described succinctly?
Introduction: 15 pts
- Does the introduction cover the required background information to understand the work?
- Is the introduction well organized: it starts out general and becomes more specific towards the end?
- Is there a motivation explaining why this project is relevant, important, and/or interesting?
Related Work: 15 pts
- Is the similar and related work discussed adequately?
- Are references cited properly (here, but also throughout the whole paper)?
- Is the discussion or paragraph comparing this project with other people’s work adequate?
Proposed Method: 25 pts
- Are there any missing descriptions of symbols used in mathematical notations (if applicable)?
- Are the main algorithms described well enough so that they can be implemented by a knowledgeable reader?
Experiments: 25 pts
- Is the experimental setup and methodology described well enough so that it can be repeated?
- If datasets are used, are they referenced appropriately?
Results and Discussion: 30 pts
- Are the results described clearly?
- Is the data analyzed well, and are the results logical?
- Are the figures clear, with no missing labels?
- Do the figure captions have sufficient information to understand the figure?
- Is each figure referenced in the text?
- Is the discussion critical/honest, and are potential weaknesses/shortcomings discussed as well?
Conclusions: 15 pts
- Do the authors describe whether the initial motivation/task was accomplished or not based on the results?
- Is it discussed adequately how the results relate to previous work?
- If applicable, are potential future directions given?
Contributions: 10 pts
- Are all contributions listed clearly?
- Did each member contribute approximately equally to the project?
Optional: Sharing your Project
You are encouraged to share your project/final project report online after you have completed the course – for example, via GitHub or on a personal website.
If there are enough students willing to share their report online, I’d be happy to write a short article summarizing your projects as I’ve done for the deep learning course last year.
Other Important Course Information
Rules, Rights & Responsibilities
See the Guide’s Rules, Rights and Responsibilities
Academic Integrity
By enrolling in this course, each student assumes the responsibilities of an active participant in UW-Madison’s community of scholars in which everyone’s academic work and behavior are held to the highest academic integrity standards. Academic misconduct compromises the integrity of the university. Cheating, fabrication, plagiarism, unauthorized collaboration, and helping others commit these acts are examples of academic misconduct, which can result in disciplinary action. This includes but is not limited to failure on the assignment/course, disciplinary probation, or suspension. Substantial or repeated cases of misconduct will be forwarded to the Office of Student Conduct & Community Standards for additional review. For more information, refer to studentconduct.wiscweb.wisc.edu/academic-integrity/.
Accommodations for Students with Disabilities
McBurney Disability Resource Center syllabus statement: “The University of Wisconsin-Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute (36.12), and UW-Madison policy (Faculty Document 1071) require that students with disabilities be reasonably accommodated in instruction and campus life. Reasonable accommodations for students with disabilities is a shared faculty and student responsibility. Students are expected to inform faculty [me] of their need for instructional accommodations by the end of the third week of the semester, or as soon as possible after a disability has been incurred or recognized. Faculty [I], will work either directly with the student [you] or in coordination with the McBurney Center to identify and provide reasonable instructional accommodations. Disability information, including instructional accommodations as part of a student’s educational record, is confidential and protected under FERPA.” http://mcburney.wisc.edu/facstaffother/faculty/syllabus.php
Diversity and Inclusion
Institutional statement on diversity: “Diversity is a source of strength, creativity, and innovation for UW-Madison. We value the contributions of each person and respect the profound ways their identity, culture, background, experience, status, abilities, and opinion enrich the university community. We commit ourselves to the pursuit of excellence in teaching, research, outreach, and diversity as inextricably linked goals.
The University of Wisconsin-Madison fulfills its public mission by creating a welcoming and inclusive community for people from every background – people who as students, faculty, and staff serve Wisconsin and the world.” https://diversity.wisc.edu/
Schedule
Note that this is a tentative schedule subject to changes.
Below is a list of topics we aim to cover. However, we will take our time, and it is more important to build a good understanding of the core concepts and the field in general rather than covering one more algorithm. Keep in mind that a good foundation will enable you to study and understand additional algorithms if the need arises.
Topics Summary (Planned)
Below is a list of the topics I am planning to cover. Note that while these topics are enumerated by lecture, some lectures are longer or shorter than others. Also, we may skip over certain topics in favor of others if time is a concern. While this section provides an overview of potential topics to be covered, the actual topics will be listed in the course calendar at the bottom of this page.
Part 1: Introduction
- Course overview, introduction to deep learning
- The brief history of deep learning
- Single-layer neural networks: The perceptron algorithm
Part 2: Mathematical and computational foundations
- Linear algebra and calculus for deep learning
- Parameter optimization with gradient descent
- Automatic differentiation
- Cluster and cloud computing resources
Part 3: Introduction to neural networks
- Multinomial logistic regression
- Multilayer perceptrons
- Regularization
- Input normalization and weight initialization
- Learning rates and advanced optimization algorithms
- Project proposal (online submission)
Part 4: Deep learning for computer vision and language modeling
- Introduction to convolutional neural networks 1
- Introduction to convolutional neural networks 2
- Introduction to recurrent neural networks 1
- Introduction to recurrent neural networks 2
- Midterm exam
Part 5: Deep generative models
- Autoencoders
- Autoregressive models
- Variational autoencoders
- Normalizing Flow Models
- Generative adversarial networks 1
- Generative adversarial networks 2
- Evaluating generative models
Part 6: Class projects and final exam
- Course summary
- Student project presentations 1
- Student project presentations 2
- Student project presentations 3
- Final exam
- Final report (online submission)
Calendar
Jan 23
book draft chapters
Jan 30
Explaining how to install Jupyter Nb
Help w. getting started w. HW1
Feb 06
HW1 due at 11:59 pm (upl. on Canvas)
Feb 11
Feb 13
Feb 25
Feb 27
Mar 03
Mar 05
Mar 10
Mar 17
Mar 24
Apr 21
Apr 30