Hi, I'm Kabir J.
Data Science student and Research Intern working on Knowledge Editing & Interpretability.
My broader interests lie in deep learning, interpretability, and alignment.
My current work centers on deep learning: I'm curious about how models think, reason, fail, and can be improved.
Right now, I'm:
- Building Tiny-Mixtral (a 175M-parameter MoE) from scratch.
- Research Interning on Knowledge Editing & Interpretability.
- Contributing to Interp & Alignment @AI-Plans
- Participating in @AI Safety Collab
- Writing in-depth ML blog posts & paper summaries
- Studying Data Science @ IIT Madras.
Reach out to work on projects or collaborate on ML research: email me
Projects
---
Tiny-Mixtral 175M MoE
A simplified re-implementation of the **Mixtral** Mixture-of-Experts (MoE) architecture.
Core Concepts:
MoE, GQA, KV Caching, Sliding Window Attention, Rolling Buffer KV Cache (see the routing sketch below).
Training:
Dataset: TinyStories
Hardware: NVIDIA Tesla P100, Kaggle Notebooks
Paper:
Mixtral of Experts
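
For flavor, here's a minimal sketch of the top-2 expert routing at the heart of a Mixtral-style MoE layer; the class name, sizes, and expert shape are illustrative, not lifted from the repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Sparse MoE layer: a linear router picks the top-k experts per
    token and mixes their outputs with softmax-renormalised weights."""

    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim) -- flatten batch/sequence dims before calling.
        logits = self.router(x)                     # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # keep only the k best experts
        weights = F.softmax(weights, dim=-1)        # renormalise over the chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            hit = idx == e                          # (tokens, k): where expert e was picked
            if hit.any():
                rows = hit.any(dim=-1)
                w = (weights * hit).sum(dim=-1, keepdim=True)[rows]
                out[rows] += w * expert(x[rows])    # only routed tokens touch expert e
        return out
```

Mixtral additionally trains with a load-balancing auxiliary loss on the router, which this sketch omits.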
---
Deep-Learning-History
A personal deep dive into 18+ foundational and cutting-edge deep learning papers, implemented from scratch in NumPy and PyTorch. The goal: build intuition by understanding the "why" behind each method—not just the code.
Key Implementations:
Foundations: LeNet, AlexNet
Sequence Models: RNNs, LSTMs, word2vec, GloVe, Encoder-Decoder
Transformers: BERT, GPT-1/2, LLaMA-2, Mixtral
Generative Models: GANs, DCGANs, WGANs
Vision: Vision Transformers
Focus Areas:
Fine-tuning (LoRA, GPT-2; see the sketch after this list)
Multilingual NMT
Evaluation Metrics (BLEU, Classification)
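
The LoRA item above boils down to a few lines. A minimal sketch, assuming a generic `nn.Linear` to wrap (the rank and scaling values are illustrative):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Freeze a pretrained Linear and learn a low-rank update:
    y = W x + (alpha / r) * B(A(x)), with only A and B trainable."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                 # freeze W (and bias)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Swapped into a model's attention projections, this trains only a tiny fraction of the parameters while the pretrained weights stay fixed.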
---
Bank Telemarketing Success Prediction
Ranked Top 25 out of 1,200+ participants in a Kaggle competition to predict client subscription to term deposits based on telemarketing campaign data.
Key Contributions:
EDA: Uncovered trends, patterns, and relationships within the data
Preprocessing: Handled missing values, outliers, and applied feature scaling
Modeling: Logistic Regression, SVM, Random Forest, XGBoost
Optimization: Cross-validation and hyperparameter tuning to boost performance
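
A minimal sketch of that optimization step, assuming scikit-learn; the pipeline and grid are illustrative, not the competition entry's exact setup:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Scaling + model in one pipeline so CV folds never leak scaling statistics.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}        # illustrative regularisation grid

search = GridSearchCV(
    pipe,
    grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",                            # robust to the class imbalance typical here
    n_jobs=-1,
)
# search.fit(X_train, y_train)                    # X_train, y_train: preprocessed campaign data
# print(search.best_params_, search.best_score_)
```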
---
funniest-joke-with-LLMs
Can LLMs be funny on purpose? This project explores controlled joke generation using Plan-Search and ranks the results via LLM judges with diverse humor styles.
Key Modules:
Joke Generation via Plan & Search
LLM-as-Judge with 7+ humor personas
Novelty Detection using FAISS & POS patterns (see the FAISS sketch below)
Focus Areas:
Incongruity modeling
Subjective humor evaluation
Creative alignment in LLMs
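
On the FAISS side, novelty detection reduces to "how close is this joke to anything we've seen?". A minimal sketch, with random stand-in embeddings in place of a real sentence-embedding model (the POS-pattern check is not shown):

```python
import faiss
import numpy as np

def novelty(candidate: np.ndarray, index: faiss.Index) -> float:
    """1 - max cosine similarity to the known-joke corpus (higher = more novel)."""
    q = candidate.reshape(1, -1).astype("float32")
    faiss.normalize_L2(q)                    # after L2-normalising, inner product == cosine
    sims, _ = index.search(q, 1)             # similarity of the single nearest neighbour
    return 1.0 - float(sims[0, 0])

# Stand-in embeddings; a real pipeline would encode jokes with a sentence model.
corpus = np.random.rand(1000, 384).astype("float32")
faiss.normalize_L2(corpus)
index = faiss.IndexFlatIP(corpus.shape[1])   # exact inner-product index
index.add(corpus)

print(novelty(np.random.rand(384), index))
```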
---
Mangakensaku
A FastAPI-based app that retrieves the most relevant manga panels using natural language queries. Built to explore how FAISS performs similarity search on CLIP-based image embeddings.
Key Features:
CLIP-based embeddings
FAISS similarity search (IndexFlatIP; see the sketch after this list)
Semantic image search
Multimodal retrieval (text-image)
Efficient nearest-neighbour search using FAISS
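
A rough sketch of how such a retrieval path can be wired together; `embed_text` is a hypothetical stand-in for CLIP's text encoder, and the endpoint shape is an assumption rather than the app's actual API:

```python
import faiss
import numpy as np
from fastapi import FastAPI

DIM = 512                                    # CLIP ViT-B/32 embedding size
index = faiss.IndexFlatIP(DIM)               # exact inner-product search
panel_ids: list[str] = []                    # FAISS row -> panel image path

def add_panels(ids: list[str], vecs: np.ndarray) -> None:
    """Index L2-normalised image embeddings so inner product == cosine."""
    vecs = vecs.astype("float32")
    faiss.normalize_L2(vecs)
    index.add(vecs)
    panel_ids.extend(ids)

def embed_text(query: str) -> np.ndarray:
    """Hypothetical stand-in for CLIP's text encoder."""
    return np.random.rand(1, DIM).astype("float32")

app = FastAPI()

@app.get("/search")
def search(q: str, k: int = 5):
    vec = embed_text(q)
    faiss.normalize_L2(vec)
    sims, rows = index.search(vec, k)        # top-k panels by cosine similarity
    return [{"panel": panel_ids[r], "score": float(s)}
            for r, s in zip(rows[0], sims[0]) if r != -1]
```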