💼 Research Engineer
Google DeepMind
🗓️ Aug 2024 to Present
📍 Cambridge MA
- Core contributor to Gemini Pretraining in both model architecture and implementation
- Wrote performant low-level kernels for experimental attention variants using Pallas
- Investigated numerical instability in the MoE implementation and its effects on pretraining and RL
💼 Research Engineer
Character.ai
🗓️ Dec 2023 to Aug 2024
📍 New York NY
- Contributed to all aspects of LLM pretraining, from fundamental research to performant implementation
- Discovered the relationship between attention and intelligence in LLMs and derived its corresponding scaling laws
- Designed scaling laws that predicted validation loss across a five-order-of-magnitude extrapolation with tight error bounds
- Implemented multiple MoE variants for the flagship model, trained across the company's entire GPU cluster
- Investigated the non-causality of Expert Choice MoEs and when this affects model performance
🧐 Using RL and Synthetic Data to Teach Chatbots to Avoid Certain Topics
Suppressing Pink Elephants with Direct Principle Feedback
- Investigated ILQL-based Reinforcement Learning for finetuning LLMs
- Implemented an automated GPT-4-based evaluation pipeline for judging model coherence
💼 Senior Machine Learning Engineer
Square
🗓️ Sep 2022 to Dec 2023
📍 Boston MA
- Finetuned open-source LLMs on merchant-buyer conversations to suggest replies to incoming messages
- Conducted an online A/B test and demonstrated a 5% increase in suggestion acceptance rate
- Designed and implemented a multi-task training system to incorporate classification tasks into an LLM
- Instruction-finetuned FLAN-T5 on internal data and evaluated its performance against individual classifiers
🧐 Investigating Reasoning Capabilities of Large Language Models
OPT-R: Enhancing Reasoning Capabilities of Large Language Models
🗓️ May 2023
😃 Contributor
🌐 ACL Natural Language Reasoning and Structured Explanations Workshop
🔗 arXiv
- Curated a dataset comprising various reasoning tasks grouped by reasoning skill, such as mathematical and commonsense reasoning
- Architected a Makefile-based data pipeline to streamline the downloading and preprocessing of data from multiple sources
- Finetuned multiple sizes of OPT, up to 13B parameters
- Analyzed reasoning performance with respect to model size and reasoning skill
🧐 Empirical Investigation of Masking Strategies and Rates in Vision-Language Pretraining
Uniform Masking Prevails in Vision-Language Pretraining
- Conducted a large-scale experimental analysis of a 335M-parameter Vision-Language model
- Designed controlled experiments to analyze the effects of masking strategy and masking rate on downstream performance
- Evaluated the model on multiple downstream tasks like VQA and NLVR and analyzed the results
- Published findings on arXiv in a short-form paper revealing that masking strategy has a negligible effect on performance
🧐 Investigating Reasoning Capabilities of Large Language Models
ALERT: Adapting Language Models to Reasoning Tasks
- Curated a dataset comprising various reasoning tasks grouped by reasoning skill, such as mathematical and commonsense reasoning
- Architected a Makefile-based data pipeline to streamline the downloading and preprocessing of data from multiple sources
- Finetuned multiple sizes of OPT, up to 13B parameters
- Analyzed reasoning performance with respect to model size and reasoning skill
💼 AI Resident
Meta (Facebook)
🗓️ Aug 2021 to Sep 2022
📍 Seattle WA
- Wrote code to process 1TB of multimodal data using Rust and Parquet, achieving a 20x speedup over Python
- Automated the training of LLMs with up to 13B parameters on large multi-node clusters with up to 64 GPUs
- Evaluated whether training on explanations improves the reasoning capabilities of LLMs, finding that explanations mostly benefit mathematical reasoning
- Analyzed the effect of masking rates and strategies in multimodal learning, showing that increasing the masking rate nullifies the effects of different masking strategies
🧐 Reinforcement Learning based Chatbots using Large Language Models
CHAI: A Chatbot AI for Task-oriented Dialog with Offline Reinforcement Learning
- Trained a model to negotiate a price for a product using data from Craigslist
- Architected an algorithm to fuse Reinforcement Learning with Language Models
- Implemented various Offline RL algorithms like CQL and EMaQ
💼 Machine Learning Intern
Apple
🗓️ Jun 2021 to Aug 2021
📍 Seattle WA
- Implemented the Transformer architecture from primitive operations for an in-house deep learning framework
- Demonstrated correctness by replicating English-German translation results from 'Attention Is All You Need'
- Optimized self-attention for Apple Neural Engine by rewriting computation with supported operations
💼 Undergraduate Researcher at the Robotic AI and Learning Lab
Berkeley Artificial Intelligence Research Lab
🗓️ Jan 2019 to May 2021
📍 Berkeley CA
- Worked with Prof. Sergey Levine and Prof. Chelsea Finn on RL and NLP in the domains of robotics and chatbots
- Designed and implemented a multi-agent RL algorithm to learn composable locomotion skills without manual environment resets, subsequently using them to solve a maze; published at NeurIPS
- Used Offline RL to finetune LLMs to bargain for Craigslist items, beating supervised learning in human evals across all metrics; accepted as an oral presentation at NAACL
💼 Teaching Assistant, Deep Learning and Neural Networks
UC Berkeley EECS
🗓️ Jan 2021 to May 2021
📍 Berkeley CA
- Served as a TA for a deep learning fundamentals class covering topics like logistic regression, convolutional neural networks, and transformers
- Led a 15-person discussion session to review material taught in lecture and held office hours for homework help
- Designed exam problems for the final and beta-tested new homework assignments before releasing them to students
🎓 UC Berkeley
BA Computer Science & Music
🗓️ Aug 2017 to May 2021
📜 3.965
🧐 Reset-Free Robotic Skill Learning via Adversarial RL
Continual Learning of Control Primitives: Skill Discovery via Reset-Games
- Designed an RL algorithm to learn skills without manual interventions to reset the environment
- Implemented a Python RL framework using PyTorch and open-sourced it on GitHub
- Trained a four-legged robot to walk and subsequently solve a maze using learned skills
🎓 The International School Bangalore
International Baccalaureate Diploma
🗓️ Aug 2015 to May 2017
📜 48.0