13A: LLMs Introduction

Materials:

Date: Thursday, 07-Nov-2024, 11:30 am-1:00 pm IST.

Pre-work:

  1. LING574@UW Deep Learning For NLP, by Prof. Shane, Spring’24. Introduction, Word Vectors, Language Modeling
  2. CIS7000@UPenn LLMs, by Prof. Mayur Naik, Fall’24. Background, Language Modeling
  3. ELL881/AIL821@IIT-Delhi LLMs: Introduction and Recent Advances, Fall’24.
  4. Transformers

In-Class

We will organize the content using a “follow the data” approach.

  1. Quick review of NLP and Deep Learning for NLP, pre- and post-GPT world.
  2. LLM Flow: (Quality) Datasets, Model Training (Pre-training, Alignment, Fine-tuning), Prompt Optimization, Constrained Language Generation, Evaluation.
  3. Datasets and Tasks (to train LLMs)
  4. Model Training
  5. Prompt Optimization
  6. Constrained Language Generation
  7. Evaluation
  8. Applications and Design Patterns
  9. LLMs cannot reason & plan

Post-class

  1. Datasets and Tasks (to train LLMs)
  2. Model Training
  3. Prompt Optimization
  4. Constrained Language Generation
  5. Evaluation
  6. Applications and Design Patterns
  7. LLMs cannot reason & plan

LLMs and Influence Functions

  1. Studying Large Language Model Generalization with Influence Functions
  2. Do Influence Functions Work on Large Language Models?
  3. TextGrad: Automatic “Differentiation” via Text (paper)

Full Courses

  1. CIS7000 LLM Course @ UPenn by Prof. Mayur Naik. Covers many advanced topics.
  2. AIL821 LLMs Course @ IIT-D
  3. LING 574 Deep Learning For NLP @ UW by Prof. Shane, Spring’24.
  4. Walk through the book Building LLMs from Scratch