Related Course Projects

Image and Video Processing

Project	Summary
Neural Image Colorization	Built a ResNet-18 grayscale→RGB translator using perceptual + pixel losses for high-fidelity colorization.
Adaptive Image Segmentation	Implemented histogram clustering and region growing for adaptive, data-driven segmentation.
Mean Shift vs Graph Cut	Benchmarked Mean Shift vs Graph Cut across varied images; analyzed accuracy vs runtime.
Template Matching	Classical template matching for precise region localization.
Image Thresholding	Robust binary segmentation with Otsu and Niblack methods.
Histogram Equalization	Boosted contrast using global and adaptive histogram equalization.
Harris Corner Detection & SIFT	Built feature-alignment pipelines combining Harris corners with SIFT descriptors.
Panorama Stitching	Produced panoramas via RANSAC-based feature matching and blending.
Image Denoising	Implemented Gaussian/average filters; benchmarked PSNR under varied noise.
Gaussian & Laplacian Pyramids	Multi-scale decomposition/reconstruction with image pyramids.
Color Channel Analysis	Explored RGB/HSV channels to study spatial/spectral characteristics.
2D Convolutions	Applied 2D filters in spatial and frequency domains; compared effects.
Hybrid Video Coding	Built a block-based hybrid coder (intra/inter) using the exhaustive block-matching algorithm (EBMA) for motion estimation in P-frame compression.
Edge & Line Detection	Canny + Hough transforms for edges/lines/circles on aerial images.
Refinement of ResNet	Parameter-constrained ResNet improvements on CIFAR-10.
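
As an illustration of the Image Thresholding entry, here is a minimal sketch of Otsu's method in plain Python. It is not the project's code: the function name `otsu_threshold`, the flat list of 8-bit pixel values, and the toy input are all assumptions made for the example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form class 0, the rest form class 1.
    `pixels` is assumed to be a flat list of ints in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break  # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Toy bimodal "image": half dark pixels, half bright pixels.
toy_pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(toy_pixels)
binary_mask = [p > threshold for p in toy_pixels]
```

On the toy input, any threshold between the two modes separates them, so the mask marks exactly the bright half. A local method such as Niblack would instead compute a threshold per neighborhood.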

Audio & Speech Processing

Project	Summary
Sound Event Classification	ESC-50 pipeline using log-Mel spectrograms; compared SVM/RF vs MLP/Conv1D with temporal pooling.
Voice Activity Detection	Neural VAD on log-Mel features to detect speech in noisy audio.
Audio Captioning (LLM)	Mel + wavelet features with a pre-trained Vicuna LLM to generate descriptive captions.
Audio Feature Exploration	Implemented envelope, energy, spectral centroid, pitch, and STFT with interactive visualizations.
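
To make one of the features in the Audio Feature Exploration entry concrete, the sketch below computes the spectral centroid of a single frame with a naive DFT. The helper names, the 8 kHz sample rate, and the test tone are assumptions for illustration; a real pipeline would more likely use an FFT library over windowed STFT frames.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = dft_magnitudes(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    n = len(frame)
    return sum((k * sample_rate / n) * m for k, m in enumerate(mags)) / total


# Hypothetical test signal: a 1 kHz sine that falls exactly on a DFT bin.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000.0 * i / sample_rate) for i in range(64)]
centroid_hz = spectral_centroid(frame, sample_rate)
```

For a pure tone that lands exactly on a bin, the centroid coincides with the tone frequency; for broadband sounds it moves toward wherever most of the spectral magnitude lies.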

Deep Learning

Project	Summary
Transformer Models	BERT for IMDB sentiment; ViT for FashionMNIST image classification.
Emotion-Driven Music Generation	EfficientNet for emotion recognition → melody generation with MIDINet.
DCGAN Image Generation	Trained DCGAN to synthesize realistic clothing images (FashionMNIST).
Binary Segmentation (U-Net)	PyTorch U-Net for pedestrian mask prediction.
YOLOv3 on Video	Object detection/recognition on video streams using YOLOv3.
EfficientNet (Transfer Learning)	Fine-tuned EfficientNet for image classification tasks.
CIFAR-10 CNN vs MLP	Implemented and compared CNN and MLP classifiers on CIFAR-10.
Neural Style Transfer	Used pre-trained VGG19 to blend content and style.
Word Embeddings	Modified GloVe/Word2Vec and ran analogy tasks.
DNN for FashionMNIST	Baseline deep network for FashionMNIST classification.
Neural Machine Translation	Implemented NMT with attention in TensorFlow/Keras for sequence-to-sequence translation.
Trigger Word Detection	Built GRU/LSTM-based model to detect trigger words in audio streams.
Transformer (TensorFlow)	Trained a Transformer with attention layers in TensorFlow for NLP tasks.
Refinement of ResNet	Optimized ResNet (≤5M params) for CIFAR-10.

Machine Learning

Project	Summary
Speech Emotion Recognition	Supervised (SVM, KNN, MLP) and unsupervised (DBSCAN, K-Means, GMM) baselines on speech features.
EEG Signal Processing	Extracted EEG features and trained supervised models to detect activation windows.
SVM	Implemented support vector machines with common kernels and evaluation.
KNN / Parzen Window	Non-parametric classification via KNN and Parzen density estimation.
Decision Tree	Tree induction, pruning, and evaluation.
Random Forest	Ensemble trees with out-of-bag evaluation.
MLP	Feed-forward neural network baselines.
Logistic Regression	Regularized logistic models for classification.
Polynomial Regression	Polynomial feature expansion with bias-variance analysis.
Ensemble Learning	Bagging/boosting experiments and comparisons.
Optimal & Naive Bayes	Implemented optimal Bayes classifier and Naive Bayes variants.
Gaussian Mixture Models	EM for GMMs; clustering and density estimation.
SFS / SBE	Feature selection via sequential forward/backward methods.
JTA	Implemented the junction tree algorithm (JTA) for binary Markov chains, computing pairwise marginals via message passing.
PCA	Dimensionality reduction and reconstruction error analysis.
Metric Learning (LMNN/LFDA)	Studied how learned metrics affect k-NN performance.
Genetic Algorithms	Applied GA for local-minima search and optimization.
RGB Classification	Image classification using raw RGB features.
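
For the KNN entry, a minimal k-nearest-neighbors classifier might look like the sketch below. The function name `knn_predict`, the Euclidean metric, and the toy 2D data are assumptions for illustration, not the project's actual setup.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training point's Euclidean distance to the query with its label.
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(points, labels, query=(0.5, 0.5), k=3)
```

Swapping `math.dist` for a learned distance is the kind of change the Metric Learning entry studies, since the neighbor ranking depends entirely on the metric.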

Optimization & Reinforcement Learning

Project	Summary
Model-Free RL: Q-Learning	Implemented Q-Learning for optimal decision-making in a taxi game environment.
Model-Based RL: Value Iteration	Developed a value iteration approach to compute optimal policies in a betting scenario.
Linear Programming	Formulated and solved LP problems in Python using Pyomo.
Non-Linear Programming	Implemented constrained optimization with Pyomo + IPOPT.
Dynamic Programming	Applied DP techniques in Python, including string similarity and sequence matching.
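
The string-similarity part of the Dynamic Programming entry can be illustrated with Levenshtein edit distance. This is a generic textbook sketch, assuming unit costs for insertion, deletion, and substitution; the project may have used a different similarity measure.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b using a rolling DP row."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


distance = edit_distance("kitten", "sitting")
```

Keeping only two rows reduces memory from O(len(a) * len(b)) to O(len(b)), at the cost of not being able to backtrack the alignment itself.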

Data Structures & Data Analysis

Project	Summary
DFS	Implemented depth-first search for graph traversal problems.
BFS	Implemented breadth-first search for pathfinding and graph exploration.
Stacks, Queues, Linked Lists	Recursive algorithms and fundamental data structures in Python.
Tree Problems	Implemented binary tree and recursive traversal algorithms.
Heap	Built heap-based data structures and priority queue operations.
Descriptive Data Analysis	Performed exploratory and qualitative analysis on a Kaggle automobile dataset in R.
SEIRS Model	Simulated infectious disease dynamics with the SEIRS compartmental model.
Probability & Statistics Models	Investigated probability/statistics concepts, including Monte Carlo methods.
Central Limit Theorem	Demonstrated CLT through statistical sampling and visualization.
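
The Central Limit Theorem entry can be sketched with a small simulation: means of repeated uniform samples should concentrate around 0.5 with standard deviation close to sqrt(1/12) / sqrt(n). The sample sizes, the seed, and the function name below are arbitrary choices for the example.

```python
import random
import statistics


def sample_means(n_per_sample, n_samples, seed=0):
    """Draw n_samples samples of size n_per_sample from Uniform(0, 1); return their means."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [
        statistics.fmean(rng.random() for _ in range(n_per_sample))
        for _ in range(n_samples)
    ]


n = 30
means = sample_means(n_per_sample=n, n_samples=5000)
observed_mean = statistics.fmean(means)
observed_std = statistics.stdev(means)
# Uniform(0, 1) has variance 1/12, so the sample mean has std sqrt(1/12) / sqrt(n).
expected_std = (1 / 12) ** 0.5 / n ** 0.5
```

Plotting a histogram of `means` for increasing `n` would show the distribution tightening and approaching a bell shape, which is the visualization the entry refers to.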