ML Practitioner and Musician
Hi! I'm Siddharth Verma, a Research Engineer at Google DeepMind working on Gemini. Previously, I was a Research Engineer at Character.AI working on LLM pretraining, and before that an AI Resident at Facebook AI Research, where I worked on scaling multimodal learning and improving the reasoning capabilities of LLMs. I completed my undergraduate degree in Computer Science & Music at UC Berkeley, where I worked with Prof. Sergey Levine on Reinforcement Learning and its applications to Natural Language Processing.
My expertise lies in Natural Language Processing and Reinforcement Learning. I have trained SoTA LLMs and deployed them to production serving millions of users. I have also conducted extensive ML research in both academic and industrial settings, resulting in multiple published papers in venues such as NeurIPS and ACL.
When I'm not training models or performing research, you can catch me practicing the piano, playing table tennis or tweaking my Emacs config.
💼 GDM Research Engineer
Google DeepMind
🗓️ Aug 2024 to Present
📍 Cambridge MA
💼 Research Engineer
Character.ai
🗓️ Dec 2023 to Aug 2024
📍 New York NY
- Contributed to all aspects of LLM pretraining from fundamental research to performant implementations
- Implemented multiple MoE variants for our flagship model trained across our entire cluster of GPUs
- Investigated the non-causality of Expert Choice MoEs and when this affects model performance
- Discovered the relation between attention and intelligence in LLMs and its corresponding scaling laws
- Designed scaling laws that predicted validation loss across a five order-of-magnitude extrapolation with tight error bounds
🧐 Using RL and Synthetic Data to Teach Chatbots to Avoid Certain Topics
Suppressing Pink Elephants with Direct Principle Feedback
🗓️ Feb 2024
😃 Contributor
🌐 ACL
- Investigated ILQL-based Reinforcement Learning for finetuning LLMs
- Implemented an automated GPT-4-based evaluation pipeline for judging model coherence
🥂 Reviewer for EMNLP 2023
Peer review paper submissions
💼 Senior Machine Learning Engineer
Square
🗓️ Sep 2022 to Dec 2023
📍 Boston MA
- Finetuned open-source LLMs on merchant-buyer conversations to suggest replies to incoming messages
- Conducted an online A/B test and demonstrated a 5% increase in suggestion acceptance rate
- Designed and implemented a multi-task training system to incorporate classification tasks into an LLM
- Instruction finetuned FLAN-T5 on internal data and evaluated performance against individual classifiers
🥂 Reviewer for ACL 2023
Peer review paper submissions
🧐 Investigating Reasoning Capabilities of Large Language Models
OPT-R: Enhancing Reasoning Capabilities of Large Language Models
🗓️ May 2023
😃 Contributor
🌐 ACL Natural Language Reasoning and Structured Explanations workshop
- Curated a dataset of reasoning tasks grouped by reasoning skill, such as mathematical and commonsense reasoning
- Architected a Makefile-based data pipeline to streamline downloading and preprocessing data from multiple sources
- Finetuned OPT models at sizes up to 13B parameters
- Analyzed reasoning performance with respect to model size and reasoning skill
🧐 Empirical investigation of masking strategies and rates in Vision-Language Pretraining
Uniform Masking Prevails in Vision-Language Pretraining
🗓️ Dec 2022
😃 First Author
- Conducted a large-scale experimental analysis of a 335M-parameter Vision-Language model
- Designed controlled experiments to analyze the effects of masking strategy and masking rate on downstream performance
- Evaluated the model on multiple downstream tasks, such as VQA and NLVR, and performed data analysis on the results
- Published findings on arXiv in a short-form paper revealing that masking strategy has a negligible effect on performance
🥂 Reviewer for EMNLP 2022
Peer review paper submissions
🧐 Investigating Reasoning Capabilities of Large Language Models
ALERT: Adapting Language Models to Reasoning Tasks
🗓️ Oct 2022
😃 Contributor
🌐 ACL
- Curated a dataset of reasoning tasks grouped by reasoning skill, such as mathematical and commonsense reasoning
- Architected a Makefile-based data pipeline to streamline downloading and preprocessing data from multiple sources
- Finetuned OPT models at sizes up to 13B parameters
- Analyzed reasoning performance with respect to model size and reasoning skill
💼 AI Resident
Meta (Facebook)
🗓️ Aug 2021 to Sep 2022
📍 Seattle WA
- Wrote code to process 1TB of multimodal data using Rust and Parquet for a 20x speedup over Python
- Automated the training of LLMs of up to 13B parameters on large multi-node clusters with up to 64 GPUs
- Evaluated whether training on explanations improves the reasoning capabilities of LLMs, finding that explanations mostly benefit mathematical reasoning
- Analyzed the effect of masking rates and masking strategies in multimodal learning, showing that increasing the masking rate nullifies the effects of different masking strategies
🥂 Reviewer for SIGIR 2022
Peer review paper submissions
🧐 Reinforcement Learning based Chatbots using Large Language Models
CHAI: A Chatbot AI for Task-oriented Dialog with Offline Reinforcement Learning
🗓️ Apr 2022
😃 First Author
🌐 NAACL
- Trained a model to negotiate a price for a product using data from Craigslist.
- Architected an algorithm to fuse Reinforcement Learning with Language Models.
- Implemented various Offline RL algorithms like CQL and EMaQ.
💼 Machine Learning Intern
Apple
🗓️ Jun 2021 to Aug 2021
📍 Seattle WA
- Implemented Transformer architecture from primitive operations for an in-house deep learning framework
- Demonstrated correctness by replicating English-German translation results from 'Attention Is All You Need'
- Optimized self-attention for Apple Neural Engine by rewriting computation with supported operations
🧐 Reinforcement Learning based Chatbots using Large Language Models
CHAI: A Chatbot AI for Task-oriented Dialog with Offline Reinforcement Learning
🗓️ Jul 2021
😃 First Author
🌐 ICLR NeuCAIR workshop
- Trained a model to negotiate a price for a product using data from Craigslist.
- Architected an algorithm to fuse Reinforcement Learning with Language Models.
- Implemented various Offline RL algorithms like CQL and EMaQ.
💼 Undergraduate Researcher at Robotic AI and Learning Lab
Berkeley Artificial Intelligence Research Lab
🗓️ Jan 2019 to May 2021
📍 Berkeley CA
- Worked with Prof. Sergey Levine and Prof. Chelsea Finn on RL and NLP in domains of robotics and chatbots
- Designed and implemented a multi-agent RL algorithm to learn composable locomotion skills without manual environment resets, subsequently using them to solve a maze. Published at NeurIPS
- Used Offline RL to finetune LLMs to bargain over Craigslist items, beating supervised learning in human evaluations across all metrics. Accepted as an oral presentation at NAACL
💼 Teaching Assistant, Deep Learning and Neural Networks
UC Berkeley EECS
🗓️ Jan 2021 to May 2021
📍 Berkeley CA
- Served as a TA for a deep learning fundamentals class teaching topics like logistic regression, convolutional neural networks, and transformers.
- Led a 15-person discussion session to review material taught in lecture and held office hours for homework help
- Designed exam problems for the final and beta-tested new homework assignments before releasing them to students
🎓 UC Berkeley
BA Computer Science & Music
🗓️ Aug 2017 to May 2021
📜 3.965
🥂 High Distinction
Graduated with High Distinction. Equivalent to magna cum laude.
🥂 Phi Beta Kappa
Honor society for top graduates in the College of Letters & Science.
🧐 Reset-free robotic skill learning via Adversarial RL
Continual Learning of Control Primitives: Skill Discovery via Reset-Games
🗓️ Nov 2020
😃 CoFirst Author
🌐 NeurIPS
- Designed an RL algorithm to learn skills without manual interventions to reset the environment
- Implemented a Python RL framework using PyTorch and open-sourced it on GitHub
- Trained a four-legged robot to walk and subsequently solve a maze using learned skills
🥂 EECS Honors
Awarded to the top students in EECS/CS who perform research.
🥂 Dean's List
Awarded semesterly to the top 10% of undergraduates.
🥂 Upsilon Pi Epsilon
Computer Science honor society; served on the board of directors.
🎓 The International School Bangalore
International Baccalaureate Diploma
🗓️ Aug 2015 to May 2017
📜 48.0