ML Practitioner and Musician
Hi! I'm Siddharth Verma, a Research Engineer at Google DeepMind working on Gemini. Previously, I was a Research Engineer at Character.AI working on LLM pretraining, and before that an AI Resident at Facebook AI Research, where I worked on scaling multimodal learning and improving the reasoning capabilities of LLMs. I completed my undergraduate degree in Computer Science & Music at UC Berkeley, where I worked with Prof. Sergey Levine on Reinforcement Learning and its applications to Natural Language Processing.
My expertise lies in Natural Language Processing and Reinforcement Learning. I have trained SoTA LLMs and deployed them to production serving millions of users. I have also conducted extensive ML research in both academic and industrial settings, resulting in multiple published papers in venues such as NeurIPS and ACL.
When I'm not training models or performing research, you can catch me practicing the piano, playing table tennis or tweaking my Emacs config.
💼 GDM Research Engineer
Google DeepMind
🗓️ Aug 2024 to Present
📍 Cambridge MA
💼 Research Engineer
Character.ai
🗓️ Dec 2023 to Aug 2024
📍 New York NY
- Contributed to all aspects of LLM pretraining from fundamental research to performant implementations
- Implemented multiple MoE variants for our flagship model trained across our entire cluster of GPUs
- Investigated the non-causality of Expert Choice MoEs and when this affects model performance
- Discovered the relation between attention and intelligence in LLMs and its corresponding scaling laws
- Designed scaling laws that predicted validation loss across a five order-of-magnitude extrapolation with tight error bounds
🧐 Using RL and Synthetic Data to Teach Chatbots to Avoid Certain Topics
Suppressing Pink Elephants with Direct Principle Feedback
🗓️ Feb 2024
😃 Contributor
🌐 ACL
- Investigated ILQL-based Reinforcement Learning for finetuning LLMs
- Implemented an automated GPT-4-based evaluation pipeline for judging model coherence
🥂 Reviewer for EMNLP 2023
Peer review paper submissions
💼 Senior Machine Learning Engineer
Square
🗓️ Sep 2022 to Dec 2023
📍 Boston MA
- Finetuned open-source LLMs on merchant-buyer conversations to suggest replies to incoming messages
- Conducted an online A/B test and demonstrated a 5% increase in suggestion acceptance rate
- Designed and implemented a multi-task training system to incorporate classification tasks into an LLM
- Instruction finetuned FLAN-T5 on internal data and evaluated performance against individual classifiers
🥂 Reviewer for ACL 2023
Peer review paper submissions
🧐 Investigating Reasoning Capabilities of Large Language Models
OPT-R: Enhancing Reasoning Capabilities of Large Language Models
🗓️ May 2023
😃 Contributor
🌐 ACL Natural Language Reasoning and Structured Explanations workshop
- Curated a dataset of reasoning tasks grouped by reasoning skill, such as mathematical and commonsense reasoning
- Architected a Makefile-based data pipeline to streamline downloading and preprocessing data from multiple sources
- Finetuned OPT models at sizes up to 13B parameters
- Analyzed reasoning performance with respect to model size and reasoning skill
🧐 Empirical investigation of masking strategies and rates in Vision-Language Pretraining
Uniform Masking Prevails in Vision-Language Pretraining
🗓️ Dec 2022
😃 First Author
- Conducted a large-scale experimental analysis of a 335M-parameter Vision-Language model
- Designed controlled experiments to analyze the effects of masking strategy and masking rate on downstream performance
- Evaluated the model on multiple downstream tasks, such as VQA and NLVR, and performed data analysis on the results
- Published findings on arXiv in a short-form paper revealing that masking strategy has a negligible effect on performance
🥂 Reviewer for EMNLP 2022
Peer review paper submissions
🧐 Investigating Reasoning Capabilities of Large Language Models
ALERT: Adapting Language Models to Reasoning Tasks
🗓️ Oct 2022
😃 Contributor
🌐 ACL
- Curated a dataset of reasoning tasks grouped by reasoning skill, such as mathematical and commonsense reasoning
- Architected a Makefile-based data pipeline to streamline downloading and preprocessing data from multiple sources
- Finetuned OPT models at sizes up to 13B parameters
- Analyzed reasoning performance with respect to model size and reasoning skill
💼 AI Resident
Meta (Facebook)
🗓️ Aug 2021 to Sep 2022
📍 Seattle WA
- Wrote code to process 1TB of multimodal data using Rust and Parquet for a 20x speedup over Python
- Automated the training of LLMs of up to 13B parameters on large multi-node clusters with up to 64 GPUs
- Evaluated whether training on explanations improves the reasoning capabilities of LLMs, finding that explanations mostly benefit mathematical reasoning
- Analyzed the effect of masking rates and masking strategies in multimodal learning, showing that increasing the masking rate nullifies the effects of different masking strategies
🥂 Reviewer for SIGIR 2022
Peer review paper submissions
🧐 Reinforcement Learning based Chatbots using Large Language Models
CHAI: A Chatbot AI for Task-oriented Dialog with Offline Reinforcement Learning
🗓️ Apr 2022
😃 First Author
🌐 NAACL
- Trained a model to negotiate a price for a product using data from Craigslist.
- Architected an algorithm to fuse Reinforcement Learning with Language Models.
- Implemented various Offline RL algorithms like CQL and EMaQ.
💼 Machine Learning Intern
Apple
🗓️ Jun 2021 to Aug 2021
📍 Seattle WA
- Implemented Transformer architecture from primitive operations for an in-house deep learning framework
- Demonstrated correctness by replicating English-German translation results from 'Attention Is All You Need'
- Optimized self-attention for Apple Neural Engine by rewriting computation with supported operations
🧐 Reinforcement Learning based Chatbots using Large Language Models
CHAI: A Chatbot AI for Task-oriented Dialog with Offline Reinforcement Learning
🗓️ Jul 2021
😃 First Author
🌐 ICLR NeuCAIR workshop
- Trained a model to negotiate a price for a product using data from Craigslist.
- Architected an algorithm to fuse Reinforcement Learning with Language Models.
- Implemented various Offline RL algorithms like CQL and EMaQ.
💼 Undergraduate Researcher at Robotic AI and Learning Lab
Berkeley Artificial Intelligence Research Lab
🗓️ Jan 2019 to May 2021
📍 Berkeley CA
- Worked with Prof. Sergey Levine and Prof. Chelsea Finn on RL and NLP in domains of robotics and chatbots
- Designed and implemented a multi-agent RL algorithm to learn composable locomotion skills without manual environment resets, subsequently using them to solve a maze. Published at NeurIPS
- Used Offline RL to finetune LLMs to bargain over Craigslist items, beating supervised learning in human evaluations across all metrics. Accepted as an oral presentation at NAACL
💼 Teaching Assistant, Deep Learning and Neural Networks
UC Berkeley EECS
🗓️ Jan 2021 to May 2021
📍 Berkeley CA
- Served as a TA for a deep learning fundamentals class teaching topics like logistic regression, convolutional neural networks, and transformers.
- Led a 15-person discussion session to review material taught in lecture and held office hours for homework help
- Designed exam problems for the final and beta-tested new homework assignments before releasing them to students
🎓 UC Berkeley
BA Computer Science & Music
🗓️ Aug 2017 to May 2021
📜 3.965
🥂 High Distinction
Graduated with High Distinction. Equivalent to magna cum laude.
🥂 Phi Beta Kappa
Honor society for top graduates in the College of Letters & Science.
🧐 Reset-free robotic skill learning via Adversarial RL
Continual Learning of Control Primitives: Skill Discovery via Reset-Games
🗓️ Nov 2020
😃 CoFirst Author
🌐 NeurIPS
- Designed an RL algorithm to learn skills without manual interventions to reset the environment
- Implemented a Python RL framework using PyTorch and open-sourced it on GitHub
- Trained a four-legged robot to walk and subsequently solve a maze using learned skills
🥂 EECS Honors
Awarded to the top students in EECS/CS who perform research.
🥂 Dean's List
Awarded semesterly to the top 10% of undergraduates.
🥂 Upsilon Pi Epsilon
Computer Science honor society; served on the board of directors.
🎓 The International School Bangalore
International Baccalaureate Diploma
🗓️ Aug 2015 to May 2017
📜 48.0