Huzefa

// ML Engineer  ·  GenAI  ·  Multimodal AI  ·  Computer Vision

Hey, I'm Huzefa

Rebuilt the same ML system 6 times to get it right.

0.1341 BLEU-4
97.62% Accuracy
1.00 Precision
0.916 R² Score
8 Projects
★ Featured Project LensToWords
LensToWords — Image Captioning through 7 Architecture Iterations
7 rewrites: CNN-LSTM → InceptionV3 + Attention → ConvNeXt + Perceiver + Cross-Attention + GPT-2. Trained on Flickr30k + MSCOCO. BLEU-4: 0.1341. Each rewrite was a deliberate architectural response to a failure in the previous version.
PyTorch ConvNeXt Perceiver Cross-Attention GPT-2 MSCOCO
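
For readers unfamiliar with the BLEU-4 figure above, here is a minimal sketch of how a sentence-level BLEU-4 score can be computed: clipped n-gram precision for n = 1 to 4, combined by geometric mean and scaled by a brevity penalty. The function names `ngrams` and `bleu4` are hypothetical, the example captions are made up, and this is not necessarily how the 0.1341 figure was produced (reported scores are usually corpus-level and often smoothed).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, references):
    """Unsmoothed sentence-level BLEU-4 (illustrative helper, not the project's code)."""
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate too short to contain any n-gram of this order
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / 4)


# Made-up captions, for illustration only.
reference = "a dog runs across the grass".split()
candidate = "a dog runs across the green grass".split()
score = bleu4(candidate, [reference])
```

Because the score is a geometric mean, a caption that shares no 4-gram with any reference scores zero in this unsmoothed form, which is one reason absolute BLEU-4 values on captioning tasks tend to look low.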

Projects

01
Custom CNN + FastAPI + Streamlit pipeline for potato leaf disease detection. 3 classes.
97.62% Acc
02
Analytics dashboard for WhatsApp exports. Hinglish NLP, emoji analytics, temporal behaviour modelling.
Hinglish NLP
03
Visual fashion recommendation using ResNet50 features + cosine similarity.
ResNet50
04
13 classifiers benchmarked. VotingClassifier ensemble. Zero false positives.
Precision=1.0
05
VotingRegressor ensemble (RF + GBM + ExtraTree) on hardware specification data.
R²=0.916
06
Face recognition using DeepFace + VGG-Face + MTCNN across 100 celebrities.
100 celebs
07
Content-based recommendation on TMDB 5000. CountVectorizer + cosine similarity. Deployed on Streamlit.
Deployed ↗
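
Projects 03 and 07 both rank items by cosine similarity over feature vectors. The sketch below shows that idea with plain word-count vectors in standard-library Python. It is a stand-in under stated assumptions: the actual projects use ResNet50 features and scikit-learn's CountVectorizer, and the helper names (`count_vector`, `cosine`, `recommend`) and the three-title catalog are hypothetical toy data.

```python
import math
from collections import Counter


def count_vector(text):
    """Bag-of-words counts; a simplified stand-in for CountVectorizer."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, catalog, k=2):
    """Return the k catalog titles whose tags are most similar to `title`."""
    query = count_vector(catalog[title])
    scored = [
        (cosine(query, count_vector(tags)), other)
        for other, tags in catalog.items()
        if other != title
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical toy catalog: title -> space-separated tags.
catalog = {
    "Space Saga": "space adventure alien war",
    "Star Voyage": "space adventure exploration",
    "Love Letters": "romance drama letters",
}
top = recommend("Space Saga", catalog, k=1)
```

The same ranking step applies unchanged to dense image embeddings: only the vectorizer differs, which is why one similarity function can serve both a fashion recommender and a movie recommender.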
// 005  ·  Identity Protocol
ABOUT
Identity  ·  Architecture  ·  Process
Huzefa Merchant — ML Engineer

I build ML systems that actually work.

Deep learning systems that go all the way — from architecture decisions to production output, not just notebooks that run on clean data.
My work spans Generative AI, Multimodal AI, Computer Vision, and Recommendation Systems. I go deep on one problem until the output is right, not until the metrics look good.
I spent the last year rebuilding the same image captioning system seven times, each time because I hit a ceiling and needed to understand why. That's how I approach everything.

Open to roles, collaborations, and interesting problems — anywhere in the world.

0.1341 BLEU-4
7 Iterations
8 Projects
GenAI Multimodal AI Computer Vision

Skills

Deep Learning
PyTorch TensorFlow Keras Transformers CNNs
Computer Vision
ConvNeXt ResNet50 DeepFace MTCNN timm OpenCV
NLP & GenAI
Multimodal AI Hugging Face GPT-2 Perceiver LLMs
Stack & Tools
Python FastAPI Streamlit Scikit-learn Pandas Jupyter Git Linux Kaggle

Let's build something.

Open to ML Engineering, Applied AI, and Research roles — anywhere in the world.

huzefamerchant10@gmail.com
Made by Huzefa  ·  © 2026 ML Engineer