Huzefa

github.com/huzefa10 linkedin.com/in/huzefa-merchant kaggle.com/huzefamerchant huzefamerchant10@gmail.com

// ML Engineer · GenAI · Multimodal AI · Computer Vision

Hey, I'm Huzefa

Rebuilt the same ML system 7 times to get it right.

0.1341 BLEU-4

97.62% Accuracy

1.00 Precision

0.916 R² Score

8 Projects

★ Featured Project LensToWords

LensToWords — Image Captioning through 7 Architecture Iterations

7 rewrites: CNN-LSTM → InceptionV3 + Attention → ConvNeXt + Perceiver + Cross-Attention + GPT-2. Trained on Flickr30k + MSCOCO. BLEU-4: 0.1341. Each rewrite was a deliberate architectural decision made in response to a failure.

PyTorch ConvNeXt Perceiver Cross-Attention GPT-2 MSCOCO

GitHub Kaggle
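
For readers unfamiliar with the metric, BLEU-4 combines clipped 1- to 4-gram precisions with a brevity penalty. The block below is a minimal sentence-level sketch against a single reference, without smoothing. It is not the evaluation code behind the 0.1341 figure; the function names `ngrams` and `bleu4` and the example captions are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Sentence-level BLEU-4 against one reference, with no smoothing."""
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # one empty n-gram precision zeroes the geometric mean
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / 4)


# Hypothetical captions, tokenised by whitespace.
reference = "a dog runs across the grass".split()
close_caption = "a dog runs across the green grass".split()
weak_caption = "a dog runs on the grass".split()

close_score = bleu4(close_caption, reference)
weak_score = bleu4(weak_caption, reference)
```

Here `weak_caption` shares no 4-gram with the reference, so its unsmoothed score is zero even though most of its words match. That strictness is one reason captioning BLEU-4 values tend to look low in absolute terms.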

Projects

Potato Disease Detector

Custom CNN + FastAPI + Streamlit pipeline for potato leaf disease detection. 3 classes.

97.62% Acc

WhatsApp Chat Analyser

Analytics dashboard for WhatsApp exports. Hinglish NLP, emoji analytics, temporal behaviour modelling.

Hinglish NLP

Fashion Recommender

Visual fashion recommendation using ResNet50 features + cosine similarity.

ResNet50
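
The retrieval step of a feature-based recommender can be sketched as ranking catalogue items by cosine similarity to a query embedding. This is a minimal sketch under assumptions: the tiny 4-dimensional vectors stand in for real ResNet50 features, and the item names and the `recommend` helper are hypothetical, not taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query, catalogue, k=2):
    """Return the k catalogue item names most similar to the query vector."""
    ranked = sorted(
        catalogue,
        key=lambda name: cosine_similarity(query, catalogue[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings; real CNN features would be far higher-dimensional.
catalogue = {
    "red_dress": [0.9, 0.1, 0.0, 0.2],
    "blue_jeans": [0.0, 0.8, 0.6, 0.1],
    "crimson_gown": [0.8, 0.2, 0.1, 0.3],
}
query = [1.0, 0.1, 0.0, 0.2]

top_items = recommend(query, catalogue, k=2)
```

Cosine similarity ignores vector magnitude, so two images with similar feature directions rank close together regardless of overall activation strength.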

SMS Spam Classifier

13 classifiers benchmarked. VotingClassifier ensemble. Zero false positives.

Precision=1.0
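
To make the link between "zero false positives" and precision = 1.0 concrete, here is a small pure-Python sketch of hard (majority) voting followed by a precision calculation. It is a stand-in for the idea behind scikit-learn's VotingClassifier, not the project's code; the three model outputs and the labels are made-up illustration data.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per message by majority."""
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def precision(y_true, y_pred, positive="spam"):
    """Precision = TP / (TP + FP) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical ground truth and predictions from three classifiers.
y_true = ["spam", "ham", "ham", "spam", "ham"]
model_a = ["spam", "ham", "spam", "spam", "ham"]  # one false positive
model_b = ["spam", "ham", "ham", "ham", "ham"]    # one missed spam
model_c = ["spam", "ham", "ham", "spam", "ham"]

ensemble = majority_vote([model_a, model_b, model_c])
ensemble_precision = precision(y_true, ensemble)
single_precision = precision(y_true, model_a)
```

In this toy data the vote outvotes the one model that flags a legitimate message, so the ensemble produces no false positives and its precision is 1.0, while `model_a` alone reaches only 2/3.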

Laptop Price Predictor

VotingRegressor ensemble (RF + GBM + ExtraTree) on hardware specification data.

R²=0.916
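
A voting regressor averages the predictions of its base models, and R² measures how much of the target variance the result explains. The sketch below shows both in plain Python. The price values and the three prediction lists (labelled RF, GBM, ExtraTree only for illustration) are invented numbers, not the project's data.

```python
def average_predictions(predictions_per_model):
    """Unweighted mean of the base models' predictions, per sample."""
    return [sum(values) / len(values) for values in zip(*predictions_per_model)]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical prices and per-model predictions (arbitrary units).
y_true = [50.0, 70.0, 90.0, 110.0]
rf_pred = [52.0, 68.0, 93.0, 108.0]
gbm_pred = [49.0, 73.0, 88.0, 113.0]
et_pred = [46.0, 69.0, 92.0, 109.0]

ensemble_pred = average_predictions([rf_pred, gbm_pred, et_pred])
ensemble_r2 = r2_score(y_true, ensemble_pred)
individual_r2 = [r2_score(y_true, p) for p in (rf_pred, gbm_pred, et_pred)]
```

In this toy case the individual errors partly cancel, so the averaged prediction scores a higher R² than any single model. Real ensembles do not guarantee that, but it is the usual motivation for averaging.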

Bollywood Celebrity Predictor

Face recognition using DeepFace + VGG-Face + MTCNN across 100 celebrities.

100 celebs

Movie Recommender System

Content-based recommendation on TMDB 5000. CountVectorizer + cosine similarity. Deployed on Streamlit.

Deployed ↗
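
Content-based recommendation over text boils down to turning each item's tags into word counts and comparing the count vectors. The sketch below uses `collections.Counter` as a simple stand-in for a count vectoriser. The movie titles, tag strings, and helper names are hypothetical and are not drawn from TMDB 5000 or from the project's code.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case, whitespace-tokenise, and count each token."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(title, tags):
    """Return the other title whose tag counts are closest to the given one."""
    target = bag_of_words(tags[title])
    others = [t for t in tags if t != title]
    return max(others, key=lambda t: cosine(target, bag_of_words(tags[t])))


# Hypothetical titles with hand-written tag strings.
tags = {
    "Space Rescue": "space astronaut survival science fiction",
    "Lunar Drift": "space astronaut moon science fiction",
    "Wedding Season": "romance comedy family wedding",
}

suggestion = most_similar("Space Rescue", tags)
```

A production version would normally add stop-word removal and stemming before counting, which this sketch leaves out.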

// 005 · Identity Protocol

ABOUT

Identity · Architecture · Process

I build ML systems that actually work.

Deep learning systems that go all the way — from architecture decisions to production output, not just notebooks that run on clean data.
My work spans Generative AI, Multimodal AI, Computer Vision, and Recommendation Systems. I go deep on one problem until the output is right, not until the metrics look good.
I spent the last year rebuilding the same image captioning system seven times, each time because I hit a ceiling and needed to understand why. That's how I approach everything.

Open to roles, collaborations, and interesting problems — anywhere in the world.

0.1341 BLEU-4

7× Iterations

8 Projects

GenAI Multimodal AI Computer Vision

GitHub LinkedIn Kaggle

Skills

Deep Learning

PyTorch TensorFlow Keras Transformers CNNs

Computer Vision

ConvNeXt ResNet50 DeepFace MTCNN timm OpenCV

NLP & GenAI

Multimodal AI Hugging Face GPT-2 Perceiver LLMs

Stack & Tools

Python FastAPI Streamlit Scikit-learn Pandas Jupyter Git Linux Kaggle

Let's build something.

Open to ML Engineering, Applied AI, and Research roles — anywhere in the world.

huzefamerchant10@gmail.com →

GitHub LinkedIn Kaggle