Deep Learning

Generative Deep Learning - 2nd Edition - David Foster

Book

1. Introduction

1.1. From neural networks to deep learning (handout)

1.2. Current applications and success (handout)

1.3. What is really happening? (handout)

1.4. Tensor basics and linear regression (handout)

1.5. High dimension tensors (handout)

1.6. Tensor internals (handout)

2. Machine Learning Fundamentals

2.1. Loss and risk (handout)

2.2. Over and under fitting (handout)

2.3. Bias-variance dilemma (handout)

2.4. Proper evaluation protocols (handout)

2.5. Basic clusterings and embeddings (handout)

3. Multi-layer Perceptron and Backpropagation

3.1. The perceptron (handout)

3.2. Probabilistic view of a linear classifier (handout)

3.3. Linear separability and feature design (handout)

3.4. Multi-Layer Perceptrons (handout)

3.5. Gradient descent (handout)

3.6. Back-propagation (handout)

4. Graphs of operators, autograd, and convolutional layers

4.1. DAG networks (handout)

4.2. Autograd (handout)

4.3. PyTorch modules and batch processing (handout)

4.4. Convolutions (handout)

4.5. Pooling (handout)

4.6. Writing a PyTorch module (handout)

5. Initialization and optimization

5.1. Cross-entropy loss (handout)

5.2. Stochastic gradient descent (handout)

5.3. PyTorch optimizers (handout)

5.4. L2 and L1 penalties (handout)

5.5. Parameter initialization (handout)

5.6. Architecture choice and training protocol (handout)

5.7. Writing an autograd function (handout)

6. Going deeper

6.1. Benefits of depth (handout)

6.2. Rectifiers (handout)

6.3. Dropout (handout)

6.4. Batch normalization (handout)

6.5. Residual networks (handout)

6.6. Using GPUs (handout)

7. Autoencoders

7.1. Transposed convolutions (handout)

7.2. Deep Autoencoders (handout)

7.3. Denoising autoencoders (handout)

7.4. Variational autoencoders (handout)

8. Computer vision

8.1. Computer vision tasks (handout)

8.2. Networks for image classification (handout)

8.3. Networks for object detection (handout)

8.4. Networks for semantic segmentation (handout)

8.5. DataLoader and neuro-surgery (handout)

9. Under the hood

9.1. Looking at parameters (handout)

9.2. Looking at activations (handout)

9.3. Visualizing the processing in the input (handout)

9.4. Optimizing inputs (handout)

10. Autoregression and Normalizing Flows

10.1. Auto-regression (handout)

10.2. Causal convolutions (handout)

10.3. Non-volume preserving networks (handout)

11. Generative Adversarial Networks

11.1. Generative Adversarial Networks (handout)

11.2. Wasserstein GAN (handout)

11.3. Conditional GAN and image translation (handout)

11.4. Model persistence and checkpoints (handout)

12. Recurrent models and NLP

12.1. Recurrent Neural Networks (handout)

12.2. LSTM and GRU (handout)

12.3. Word embeddings and translation (handout)

13. Attention models

13.1. Attention for Memory and Sequence Translation (handout)

13.2. Attention Mechanisms (handout)

13.3. Transformer Networks (handout)

Practice Questions

Practice 1 (Solution)

Practice 2 (Solution)

Practice 3 (Solution)

Practice 4 (Solution)

Practice 5 (Solution)

Practice 6 (Solution)

Book PDF

Notebooks

Chapter 1 - Introduction (pptx)

Chapter 2 - Supervised Learning (pptx)

Chapter 3 - Shallow Neural Networks (pptx)

Chapter 4 - Deep Neural Networks (pptx)

Chapter 5 - Loss Functions (pptx)

Chapter 6 - Training Models (pptx)

Chapter 7 - Gradients and Initializations (pptx)

Chapter 8 - Measuring Performance (pptx)

Chapter 9 - Regularization (pptx)

Chapter 10 - Convolutional Networks (pptx)

Chapter 11 - Residual Networks (pptx)

Chapter 12 - Transformers (pptx)

Chapter 13 - Graph Neural Networks (pptx)

Chapter 14 - Unsupervised Learning (pptx)

Chapter 15 - Generative Adversarial Networks (pptx)

Chapter 16 - Normalizing Flows (pptx)

Chapter 17 - Variational Autoencoders (pptx)

Chapter 18 - Diffusion Models (pptx)

Chapter 19 - Deep Reinforcement Learning (pptx)

Chapter 20 - Why does deep learning work? (pptx)

Chapter 21 - Deep Learning and Ethics (pptx)

Appendices (pptx)

Transformer Models

Using HF Transformers

Fine-tuning Pretrained Models

Sharing Models and Tokenizers

HF Datasets Library

HF Tokenizers Library

Main NLP Tasks

Building and Sharing Demos

Transformers by Brandon Rohrer

Transformers by Michael Phi

Transformers: Python Implementation

Transformers: Python Implementation (PDF)

Transformers Catalogue

Chapter 1

1 - An Introduction to Deep Learning System Design

2 - Dataset Management Service

3 - Model Training Service

4 - Distributed Training

5 - Hyperparameter Optimization (HPO) Service

6 - Model Serving Design

7 - Model Serving in Practice

8 - Metadata and Artifact Store

9 - Workflow Orchestration

10 - Path to Production

Appendix A: A Hello World Deep Learning System

Appendix B: Survey of Existing Solutions

Appendix C: Creating an HPO Service With Kubeflow Katib

Chapter 1

Part 1 - Model Prompting

Part 2 - Use Case Ideation

Part 3 - The Generate Endpoint

Part 4 - Creating Custom Models

Part 5 - Chaining Prompts

1 - Intro to LLMs and the generative AI project lifecycle

Lab 1

2 - Fine-tuning and evaluating large language models

Lab 2

3 - Reinforcement learning and LLM-powered applications

Lab 3

Vinija's AI Notes: LLM Models

Vinija's AI Notes: Prompt Engineering

7 Frameworks for Serving LLMs

ToolLLM

COURSE: Building LLM-Powered Apps by Weights and Biases

Patterns for Building LLM-based Systems & Products

LLM-Rec: Personalized Recommendation via Prompting Large Language Models

GitHub Repo for LLM-Human Aligning

Llama 2 - Guidance and Best Practices for Building LLM Applications

Challenges and Applications of Large Language Models

What are GPT Agents? A deep dive into the AI interface of the future

Open Problems and Fundamental Limitations of RLHF

Llama 2: Open Foundation and Fine-Tuned Chat Models

MetaGPT

COURSE: Langchain & Vector Databases in Production by Activeloop

LlamaIndex

MLflow AI Gateway by Databricks

Understanding LLaMA-2 Architecture & its Ginormous Impact on GenAI

LLM Powered Autonomous Agents

AWS Guides on Generative AI

HELM (Holistic Evaluation of Language Models), HELM Docs

Comparison of LLMs

LangChain Academy - Introduction to LangGraph

DEEPLEARNING.AI - Automated Testing for LLMOps (AVAILABLE SOON)

DEEPLEARNING.AI - Prompt Engineering for Vision Models (AVAILABLE SOON)

DEEPLEARNING.AI - Quantization in Depth (AVAILABLE SOON)

DEEPLEARNING.AI - Quantization Fundamentals with HuggingFace (AVAILABLE SOON)

DEEPLEARNING.AI - Getting Started with Mistral (AVAILABLE SOON)

DEEPLEARNING.AI - Preprocessing Unstructured Data for LLM Applications (AVAILABLE SOON)

DEEPLEARNING.AI - Building and Evaluating Advanced RAG Applications (AVAILABLE SOON)

DEEPLEARNING.AI - Vector Databases: from Embeddings to Applications (AVAILABLE SOON)

DEEPLEARNING.AI - How Diffusion Models Work? (AVAILABLE SOON)

DEEPLEARNING.AI - Efficiently Serving LLMs (AVAILABLE SOON)

DEEPLEARNING.AI - Knowledge Graphs for RAG (AVAILABLE SOON)

DEEPLEARNING.AI - Advanced Retrieval for AI with Chroma (AVAILABLE SOON)

DEEPLEARNING.AI - Building Agentic RAG with LlamaIndex

DEEPLEARNING.AI - LLMs with Semantic Search (Cohere)

DEEPLEARNING.AI - LangChain

DEEPLEARNING.AI - Prompt Engineering

DEEPLEARNING.AI - Building Systems with the ChatGPT API

DEEPLEARNING.AI - LangChain: Chat With Your Data

DEEPLEARNING.AI - OpenAI Functions, Tools, and Agents in LangChain

DEEPLEARNING.AI - Multimodal Llama 3.2

Building LLM Applications for Production by Chip Huyen

Open challenges in LLM research by Chip Huyen

LLMOps by Vinija

Instruction Tuning for Large Language Models: A Survey

RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

A complete Hugging Face tutorial: how to build and train a vision transformer

How the Vision Transformer (ViT) works in 10 minutes: an image is worth 16x16 words

Transformers in computer vision: ViT architectures, tips, tricks and improvements

The theory behind Latent Variable Models: formulating a Variational Autoencoder

How diffusion models work: the math from scratch

GANs in computer vision - Introduction to generative learning

GANs in computer vision - Improved training with Wasserstein distance, game theory control and progressively growing schemes

How Graph Neural Networks (GNN) work: introduction to graph convolutions from scratch

Best Graph Neural Network architectures: GCN, GAT, MPNN and more

Explainable AI (XAI): A survey of recent methods, applications and frameworks

In-layer normalization techniques for training very deep neural networks

Why multi-head self attention works: math, intuitions and 10+1 hidden insights

How Positional Embeddings work in Self-Attention (code in Pytorch)

Grokking self-supervised (representation) learning: how it works in computer vision and why

How Attention works in Deep Learning: understanding the attention mechanism in sequence models

How Transformers work in deep learning and NLP: an intuitive introduction

Learn Pytorch: Training your first deep learning models step by step

A complete Apache Airflow tutorial: building data pipelines with Python

BYOL tutorial: self-supervised learning on CIFAR images with code in Pytorch

How Neural Radiance Fields (NeRF) and Instant Neural Graphics Primitives work

How distributed training works in Pytorch: distributed data-parallel and mixed-precision training

Vision Language models: towards multi-modal deep learning

Understanding Maximum Likelihood Estimation in Supervised Learning

Speech Recognition: a review of the different deep learning approaches

A complete Weights and Biases tutorial

Regularization techniques for training deep neural networks

Top Resources to start with Computer Vision and Deep Learning

Tensorflow Extended (TFX) in action: build a production ready deep learning pipeline

An introduction to Recommendation Systems: an overview of machine and deep learning architectures

An overview of Unet architectures for semantic segmentation and biomedical image segmentation

Best Resources to Learn Deep Learning Theory

Build a Transformer in JAX from scratch: how to write and train your own models

JAX vs Tensorflow vs Pytorch: Building a Variational Autoencoder (VAE)

JAX for Machine Learning: how it works and why learn it

Understanding einsum for Deep learning: implement a transformer with multi-head self-attention from scratch

Introduction to medical image processing with Python: CT lung and vessel segmentation without labels

Best deep CNN architectures and their principles: from AlexNet to EfficientNet

A journey into Optimization algorithms for Deep Neural Networks

Introduction to Kubernetes with Google Cloud: Deploy your Deep Learning model effortlessly

How to use Docker containers and Docker Compose for Deep Learning applications

Transfer learning in medical imaging: classification and segmentation

Scalability in Machine Learning: Grow your model to serve millions of users

How to use uWSGI and Nginx to serve a Deep Learning model

Deploy a Deep Learning model as a web application using Flask and Tensorflow

Distributed Deep Learning training: Model and Data Parallelism in Tensorflow

How to train a deep learning model in the cloud

How to build a custom production-ready Deep Learning Training loop in Tensorflow from scratch

Data preprocessing for deep learning: Tips and tricks to optimize your data pipeline using Tensorflow

Understanding coordinate systems and DICOM for deep learning medical image analysis

How to Unit Test Deep Learning: Tests in TensorFlow, mocking and test coverage

Understanding the receptive field of deep convolutional networks

Logging and Debugging in Machine Learning - How to use Python debugger and the logging module to find errors in your AI application

Best practices to write Deep Learning code: Project structure, OOP, Type checking and documentation

Intuitive Explanation of Skip Connections in Deep Learning

Deep learning in medical imaging - 3D medical image segmentation with PyTorch

Deep Learning Algorithms - The Complete Guide

Human Pose Estimation

Graph Neural Networks - An overview

Localization and Object Detection with Deep Learning

Trust Region and Proximal policy optimization (TRPO and PPO)

Semantic Segmentation in the era of Neural Networks

YOLO - You only look once (Single shot detectors)

Unravel Policy Gradients and REINFORCE

The idea behind Actor-Critics and how A2C and A3C improve them

Deep Q Learning and Deep Q Networks

The secrets behind Reinforcement Learning

Decrypt Generative Adversarial Networks (GAN)

Self-driving cars using Deep Learning

Predict Bitcoin price with Long Short-Term Memory Networks (LSTM)

How to Generate Images using Autoencoders

Neural Network from scratch-part 1

Neural Network from scratch-part 2

Document clustering

Research Papers Highlights and Notes - Comprehensive List

...