Hi, I'm Kabir J.
Data Science student and Research Intern working on Knowledge Editing & Interpretability.
My broader interests lie in deep learning, interpretability, and alignment.
My current work centers on deep learning: I'm curious about how models think, reason, fail, and can be improved.
Right now, I'm:
- Building Tiny-Mixtral (a 175M-parameter MoE) from scratch.
- Research Interning on Knowledge Editing & Interpretability.
- Contributing to Interp & Alignment @AI-Plans
- Participating in @AI Safety Collab
- Writing in-depth ML blog posts & paper summaries
- Studying Data Science @ IIT Madras.
Reach out to work on projects or collaborate on ML research: email me
Projects
---
Tiny-Mixtral 175M MoE
A simplified re-implementation of the **Mixtral** Mixture-of-Experts (MoE) architecture.
Core Concepts:
MoE, GQA, KV Caching, Sliding Window Attention, Rolling Buffer KV Cache (see the routing sketch below).
Training:
Dataset: TinyStories
Hardware: NVIDIA Tesla P100, Kaggle Notebooks
Paper:
Mixtral of Experts
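
For flavor, here's a minimal sketch of the top-2 expert routing at the heart of a Mixtral-style MoE layer; the class name, sizes, and expert shape are illustrative, not lifted from the repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Sparse MoE layer: a linear router picks the top-k experts per
    token and mixes their outputs with softmax-renormalised weights."""

    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim) -- flatten batch/sequence dims before calling.
        logits = self.router(x)                     # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # keep only the k best experts
        weights = F.softmax(weights, dim=-1)        # renormalise over the chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            hit = idx == e                          # (tokens, k): where expert e was picked
            if hit.any():
                rows = hit.any(dim=-1)
                w = (weights * hit).sum(dim=-1, keepdim=True)[rows]
                out[rows] += w * expert(x[rows])    # only routed tokens touch expert e
        return out
```

Mixtral additionally trains with a load-balancing auxiliary loss on the router, which this sketch omits.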
---
Deep-Learning-History
A personal deep dive into 18+ foundational and cutting-edge deep learning papers, implemented from scratch in NumPy and PyTorch. The goal: build intuition by understanding the "why" behind each method—not just the code.
Key Implementations:
Foundations: LeNet, AlexNet
Sequence Models: RNNs, LSTMs, word2vec, GloVe, Encoder-Decoder
Transformers: BERT, GPT-1/2, LLaMA-2, Mixtral
Generative Models: GANs, DCGANs, WGANs
Vision: Vision Transformers
Focus Areas:
Fine-tuning (LoRA, GPT-2; see the sketch after this list)
Multilingual NMT
Evaluation Metrics (BLEU, Classification)
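
The LoRA item above boils down to a few lines. A minimal sketch, assuming a generic `nn.Linear` to wrap (the rank and scaling values are illustrative):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Freeze a pretrained Linear and learn a low-rank update:
    y = W x + (alpha / r) * B(A(x)), with only A and B trainable."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                 # freeze W (and bias)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Swapped into a model's attention projections, this trains only a tiny fraction of the parameters while the pretrained weights stay fixed.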
---
Bank Telemarketing Success Prediction
Ranked Top 25 out of 1,200+ participants in a Kaggle competition to predict client subscription to term deposits based on telemarketing campaign data.
Key Contributions:
EDA: Uncovered trends, patterns, and relationships within the data
Preprocessing: Handled missing values, outliers, and applied feature scaling
Modeling: Logistic Regression, SVM, Random Forest, XGBoost
Optimization: Cross-validation and hyperparameter tuning to boost performance
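
A minimal sketch of that optimization step, assuming scikit-learn; the pipeline and grid are illustrative, not the competition entry's exact setup:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Scaling + model in one pipeline so CV folds never leak scaling statistics.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}        # illustrative regularisation grid

search = GridSearchCV(
    pipe,
    grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",                            # robust to the class imbalance typical here
    n_jobs=-1,
)
# search.fit(X_train, y_train)                    # X_train, y_train: preprocessed campaign data
# print(search.best_params_, search.best_score_)
```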
---
funniest-joke-with-LLMs
Can LLMs be funny on purpose? This project explores controlled joke generation using Plan-Search and ranks the results via LLM judges with diverse humor styles.
Key Modules:
Joke Generation via Plan & Search
LLM-as-Judge with 7+ humor personas
Novelty Detection using FAISS & POS patterns (see the FAISS sketch below)
Focus Areas:
Incongruity modeling
Subjective humor evaluation
Creative alignment in LLMs
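
On the FAISS side, novelty detection reduces to "how close is this joke to anything we've seen?". A minimal sketch, with random stand-in embeddings in place of a real sentence-embedding model (the POS-pattern check is not shown):

```python
import faiss
import numpy as np

def novelty(candidate: np.ndarray, index: faiss.Index) -> float:
    """1 - max cosine similarity to the known-joke corpus (higher = more novel)."""
    q = candidate.reshape(1, -1).astype("float32")
    faiss.normalize_L2(q)                    # after L2-normalising, inner product == cosine
    sims, _ = index.search(q, 1)             # similarity of the single nearest neighbour
    return 1.0 - float(sims[0, 0])

# Stand-in embeddings; a real pipeline would encode jokes with a sentence model.
corpus = np.random.rand(1000, 384).astype("float32")
faiss.normalize_L2(corpus)
index = faiss.IndexFlatIP(corpus.shape[1])   # exact inner-product index
index.add(corpus)

print(novelty(np.random.rand(384), index))
```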
---
Mangakensaku
A FastAPI-based app that retrieves the most relevant manga panels using natural language queries. Built to explore how FAISS performs similarity search on CLIP-based image embeddings.
Key Features:
CLIP-based embeddings
FAISS similarity search (IndexFlatIP; see the sketch after this list)
Semantic image search
Multimodal retrieval (text-image)
Efficient nearest-neighbour search using FAISS
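
A rough sketch of how such a retrieval path can be wired together; `embed_text` is a hypothetical stand-in for CLIP's text encoder, and the endpoint shape is an assumption rather than the app's actual API:

```python
import faiss
import numpy as np
from fastapi import FastAPI

DIM = 512                                    # CLIP ViT-B/32 embedding size
index = faiss.IndexFlatIP(DIM)               # exact inner-product search
panel_ids: list[str] = []                    # FAISS row -> panel image path

def add_panels(ids: list[str], vecs: np.ndarray) -> None:
    """Index L2-normalised image embeddings so inner product == cosine."""
    vecs = vecs.astype("float32")
    faiss.normalize_L2(vecs)
    index.add(vecs)
    panel_ids.extend(ids)

def embed_text(query: str) -> np.ndarray:
    """Hypothetical stand-in for CLIP's text encoder."""
    return np.random.rand(1, DIM).astype("float32")

app = FastAPI()

@app.get("/search")
def search(q: str, k: int = 5):
    vec = embed_text(q)
    faiss.normalize_L2(vec)
    sims, rows = index.search(vec, k)        # top-k panels by cosine similarity
    return [{"panel": panel_ids[r], "score": float(s)}
            for r, s in zip(rows[0], sims[0]) if r != -1]
```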