# Module 1: Foundations of AI & ML

## Machine Learning Fundamentals
Machine learning is the study of algorithms that improve through experience. Rather than programming explicit rules, we give models data and let them learn patterns.
### The Three Paradigms
#### Supervised Learning
The model learns a mapping from inputs X to outputs Y using labeled examples.
- Regression: Predicts continuous values (e.g., house prices)
- Classification: Predicts discrete categories (e.g., spam detection)
The learning objective is to minimize a loss function L(ŷ, y) that measures prediction error.
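As a concrete illustration, two standard loss functions can be written in a few lines of plain Python (the function names here are illustrative):

```python
import math

def mse(y_pred, y_true):
    # Mean squared error: the typical regression loss
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)

def binary_cross_entropy(y_prob, y_true, eps=1e-12):
    # Log loss for binary classification; y_prob are predicted probabilities.
    # eps guards against log(0) for overconfident predictions.
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(y_prob, y_true)) / len(y_true)
```

Training searches for the parameters that drive such a loss as low as possible on the training data.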
#### Unsupervised Learning
No labels — the model discovers structure in data.
- Clustering: Group similar data points (k-means, DBSCAN)
- Dimensionality Reduction: Compress while preserving structure (PCA, t-SNE, UMAP)
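The k-means idea, alternating between assigning points to the nearest centroid and recomputing centroids, can be sketched in one dimension (this is a toy sketch, not a production implementation):

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    # Minimal 1-D k-means: alternate assignment and centroid update
    random.seed(seed)
    centroids = random.sample(points, k)  # initialize from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Recompute each centroid as its cluster mean; keep old value if empty
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)
```

On two well-separated groups of points, the centroids converge to the group means after a few iterations.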
#### Reinforcement Learning
An agent learns by taking actions in an environment to maximize cumulative reward. Reinforcement learning is also the foundation of RLHF, the technique used to align LLMs.
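One classic instance is the tabular Q-learning update, which nudges the value of a state-action pair toward the observed reward plus the discounted best value of the next state (the state/action keys and hyperparameters below are illustrative):

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # Q-learning: Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    best_next = max(Q[s_next].values()) if Q.get(s_next) else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

# Toy example: one update from state "s0" after receiving reward 1.0
Q = {"s0": {"a": 0.0}, "s1": {"a": 1.0}}
q_update(Q, "s0", "a", 1.0, "s1")
```

Repeated over many interactions, these updates propagate reward information backward through the state space.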
### The Bias-Variance Tradeoff
Every model balances two error sources:
| Source | Description | Symptom |
|---|---|---|
| Bias | Too simple, misses patterns | Underfitting |
| Variance | Too complex, memorizes noise | Overfitting |
Key insight: A model that fits its training data perfectly has likely memorized noise and will generalize poorly. We need held-out validation data to detect overfitting.
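Creating that held-out set is just a shuffled split of the data; a minimal sketch (the function name and fraction are illustrative):

```python
import random

def train_val_split(data, val_frac=0.2, seed=0):
    # Shuffle indices, then hold out the last val_frac of them for validation
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    cut = int(len(data) * (1 - val_frac))
    train = [data[i] for i in idx[:cut]]
    val = [data[i] for i in idx[cut:]]
    return train, val
```

Overfitting shows up as training loss continuing to fall while validation loss stalls or rises.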
### Gradient Descent
The core optimization algorithm behind deep learning. Given loss L(θ) as a function of parameters θ, each step updates

θ ← θ − α ∇L(θ)

where α is the learning rate, the step size per update.
Variants:
- SGD: Update on a single sample (noisy but fast per step)
- Mini-batch GD: Update on a batch of 32–512 samples (standard)
- Adam: Adaptive learning rates per parameter (default for deep learning)
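The basic update rule can be sketched on a 1-D quadratic loss, where the minimum is known in closed form (everything here is illustrative):

```python
def gradient_descent(grad, theta0, alpha=0.1, steps=100):
    # Repeatedly step against the gradient: theta <- theta - alpha * grad(theta)
    theta = theta0
    for _ in range(steps):
        theta -= alpha * grad(theta)
    return theta

# L(theta) = (theta - 3)^2 has gradient 2 * (theta - 3) and its minimum at theta = 3
minimum = gradient_descent(lambda t: 2 * (t - 3), theta0=0.0)
```

The variants above differ only in how the gradient is estimated (one sample vs. a batch) and how the step size adapts per parameter.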
### Key Evaluation Metrics
- Accuracy: Fraction correct (misleading for imbalanced classes)
- Precision / Recall / F1: More informative than accuracy for imbalanced classification
- RMSE / MAE: For regression tasks
- Perplexity: For language models — lower is better
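Precision, recall, and F1 follow directly from the confusion-matrix counts; a minimal sketch for the binary case (the function name is illustrative):

```python
def precision_recall_f1(y_pred, y_true):
    # Count true positives, false positives, and false negatives
    tp = sum(p == 1 and t == 1 for p, t in zip(y_pred, y_true))
    fp = sum(p == 1 and t == 0 for p, t in zip(y_pred, y_true))
    fn = sum(p == 0 and t == 1 for p, t in zip(y_pred, y_true))
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many were right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, so a model must do reasonably well on both to score high.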