Machine Learning

“Any computer program that improves its performance at some task through experience.”

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” (Mitchell, 1997)

Supervised Learning

In this category, the model learns from labeled training data. The goal is to predict a specific output, either a category or a numerical value.

Regression Algorithms

  • Linear Regression: Used to predict a continuous value (e.g., house prices) by modeling a linear relationship between input and output.
  • Logistic Regression: Used for binary classification tasks, predicting the probability of an input belonging to a specific class (e.g., spam or not spam).
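As a minimal sketch of linear regression, the one-feature case has a closed-form ordinary-least-squares solution; the toy house-size/price numbers below are made up for illustration.

```python
# Hypothetical sketch: simple linear regression y = w*x + b fit by
# ordinary least squares on a tiny toy dataset (pure Python).

def fit_linear(xs, ys):
    """Return slope w and intercept b minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form OLS solution for a single feature.
    w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - w * mean_x
    return w, b

# Toy data: house size (scaled) vs. price (arbitrary units).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.0, 6.2, 7.9]
w, b = fit_linear(xs, ys)
print(round(w, 2), round(b, 2))  # → 1.96 0.15
```

With more than one feature the same idea generalizes to solving the normal equations, which is what library implementations do under the hood.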

Classification Algorithms

  • Support Vector Machines (SVMs): Find the optimal hyperplane that separates data points into different classes with the largest margin.
  • Decision Trees: A tree-like model that makes decisions by splitting data based on a series of feature rules.
  • Random Forest: An ensemble method that combines multiple decision trees to improve accuracy and reduce overfitting.
  • Naive Bayes: A probabilistic classifier based on Bayes’ theorem, often used in text classification.
  • K-Nearest Neighbors (KNN): Classifies a data point based on the majority class of its nearest neighbors.
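Of the classifiers above, KNN is simple enough to sketch in a few lines: compute distances to every training point, take the k closest, and vote. The 2-D points and labels below are made up for illustration.

```python
# Hypothetical sketch of K-Nearest Neighbors (k=3) on 2-D points,
# using Euclidean distance and a majority vote (pure Python).
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of ((x, y), label); returns the majority label of the k nearest points."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (1, 1)))   # near the "a" cluster → a
print(knn_predict(train, (5, 5)))   # near the "b" cluster → b
```

Note that KNN has no training phase at all; all the work happens at prediction time, which is why it is often called a lazy learner.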

Unsupervised Learning

These techniques are used when there is no labeled data. The goal is to discover hidden patterns or structures in the data.

Clustering Algorithms

Clustering is an unsupervised learning technique that divides a dataset into non-overlapping, collectively exhaustive subsets that are homogeneous within each group and heterogeneous between groups.

  • K-Means: Partitions a dataset into a specified number of clusters based on the similarity of the data points.
  • Hierarchical Clustering: Organizes the data into a tree-like structure in which the root node corresponds to the complete dataset; branches from the root repeatedly split the data into smaller, more homogeneous subsets (clusters).
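K-Means can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The 1-D data and starting centroids below are made up for illustration.

```python
# Hypothetical K-Means sketch (k=2) on 1-D data: alternate between an
# assignment step and a centroid-update step (pure Python).

def kmeans_1d(points, centroids, iters=10):
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 10.0])
print(sorted(round(c, 1) for c in centroids))  # → [1.0, 8.0]
```

In practice the number of clusters k must be chosen up front, and results depend on the initial centroids, which is why library implementations run several random restarts.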

Dimensionality Reduction

  • Principal Component Analysis (PCA): Reduces the number of features in a dataset while retaining most of the important information.
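A minimal PCA sketch, assuming NumPy is available: center the data, take the singular value decomposition, and project onto the top component. The six 2-D sample points are made up for illustration.

```python
# Hypothetical PCA sketch with NumPy: center the data, compute the SVD,
# and keep the top principal component to reduce 2 features to 1.
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])
Xc = X - X.mean(axis=0)              # center each feature at zero
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_reduced = Xc @ Vt[:1].T            # project onto the first component
print(X_reduced.shape)               # → (6, 1)
```

The rows of Vt are the principal directions, ordered by how much variance they explain, so keeping the first few rows retains most of the important information.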

Semi-Supervised Learning

  • Combines aspects of both supervised and unsupervised learning.
  • Algorithms learn from a dataset containing both labeled and unlabeled data, typically with a small amount of labeled data and a large amount of unlabeled data.
  • Useful when obtaining labeled data is expensive or time-consuming.
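One common semi-supervised strategy is self-training: fit a model on the small labeled set, pseudo-label the unlabeled points the model is confident about, and refit. The sketch below uses a hypothetical 1-D threshold classifier and a made-up confidence margin.

```python
# Hypothetical self-training sketch: fit a 1-D threshold classifier on a
# few labeled points, pseudo-label confident unlabeled points, then refit.

def fit_threshold(labeled):
    """labeled: list of (x, label) with labels 0/1; threshold = midpoint of class means."""
    m0 = sum(x for x, y in labeled if y == 0) / sum(1 for _, y in labeled if y == 0)
    m1 = sum(x for x, y in labeled if y == 1) / sum(1 for _, y in labeled if y == 1)
    return (m0 + m1) / 2

labeled = [(1.0, 0), (9.0, 1)]               # small labeled set
unlabeled = [0.5, 1.5, 2.0, 8.0, 8.5, 9.5]   # large unlabeled set

t = fit_threshold(labeled)
# Pseudo-label only points far from the boundary (margin of 2.0 is an assumption).
pseudo = [(x, int(x > t)) for x in unlabeled if abs(x - t) > 2.0]
t = fit_threshold(labeled + pseudo)
print(round(t, 2))  # → 5.0
```

The refit model now rests on eight points instead of two, which is the whole appeal when labels are expensive to obtain.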

Reinforcement Learning

These algorithms learn through trial and error by interacting with an environment.

An agent takes actions in an environment, receives rewards or penalties, and learns a policy to maximize cumulative rewards over time.

Examples: Robotics, game playing (e.g., AlphaGo), autonomous driving.
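The agent/environment loop above can be sketched with a toy two-armed bandit: the agent tries actions, receives rewards, and nudges its value estimates toward them. The actions, rewards, and hyperparameters below are all made up for illustration.

```python
# Hypothetical sketch of the reinforcement-learning loop: Q-value updates
# with an epsilon-greedy policy on a deterministic 2-armed bandit.
import random

random.seed(0)
q = {"left": 0.0, "right": 0.0}    # value estimate per action
rewards = {"left": 1.0, "right": 5.0}
alpha, epsilon = 0.1, 0.2          # learning rate and exploration rate

for _ in range(500):
    # Epsilon-greedy policy: mostly exploit the best estimate, sometimes explore.
    if random.random() < epsilon:
        action = random.choice(list(q))
    else:
        action = max(q, key=q.get)
    reward = rewards[action]
    q[action] += alpha * (reward - q[action])  # move estimate toward observed reward

print(max(q, key=q.get))  # the learned policy prefers the better arm → right
```

Full Q-learning adds states and a discounted estimate of future reward, but the trial-and-error update shown here is the core idea.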

Deep Learning

Neural Networks: A specific class of machine learning models built from layers of interconnected neurons.

Deep Learning is a subfield of machine learning. A Deep Learning model uses multi-layered neural networks to learn from vast amounts of data.

Neural networks are a model architecture, not a learning paradigm: they can be used for supervised learning, unsupervised learning, or reinforcement learning.

Some common types of neural networks are:

  • Multilayer Perceptrons (MLPs): The simplest type of deep neural network, consisting of multiple layers of neurons.
  • Convolutional Neural Networks (CNNs): Primarily used for image and spatial data. They use convolutional layers to automatically detect features in the input.
  • Recurrent Neural Networks (RNNs): Designed for sequential data (like text or time series). They have a memory that allows them to process sequences of varying lengths.
  • Long Short-Term Memory networks (LSTMs): A specialized type of RNN that mitigates the vanishing gradient problem, allowing it to learn long-term dependencies.
  • Transformer Architecture: The Transformer architecture is a type of neural network, specifically designed for processing sequential data (like text, audio, or time series) without relying on the recurrence used in RNNs or the fixed-size context of CNNs. It is the foundational model for modern large language models, known for its self-attention mechanism that allows for parallel processing.
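The forward pass shared by all of these architectures can be sketched with the smallest interesting MLP: two inputs, a hidden layer of two neurons, and one output. The hand-picked weights below are illustrative, not learned; they compute XOR, which no single-layer network can.

```python
# Hypothetical MLP forward-pass sketch: hand-set weights computing XOR
# with one hidden layer and step activations (pure Python).

def step(x):
    return 1 if x > 0 else 0

def mlp_forward(x1, x2):
    # Hidden layer: h1 fires on OR, h2 fires on AND.
    h1 = step(x1 + x2 - 0.5)
    h2 = step(x1 + x2 - 1.5)
    # Output layer: OR minus AND gives XOR.
    return step(h1 - h2 - 0.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, mlp_forward(a, b))  # → 0, 1, 1, 0
```

Training replaces the hand-set weights with ones learned by backpropagation, and deep learning stacks many such layers; the layer-by-layer forward computation is the same.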

Generative Models

These models are designed to create new, original content. They are built from machine learning models in a variety of architectures; refer to Generative Models for details.

References
  1. Mitchell, T. (1997). Machine Learning. McGraw-Hill.