Shubham Gaur

AI Researcher at UC Santa Cruz

About Me

I am a machine learning engineer and researcher with 5+ years of experience across Nokia, Adani Group, and BlackRock. My research interests center on AI Safety and Agentic AI, with an emphasis on post-training behavior, evaluation, and alignment—understanding where capable models fail and how to make their behavior more predictable and robust in practice.

Experience

Research Experience

Graduate Research Assistant Dr. Chenguang Wang’s Lab
Oct 2025 - Present
Santa Clara, USA
  • Contributing to MassGen, an open-source multi-agent system that coordinates multiple language models to solve complex tasks.
  • Exploring research directions for increasing diversity in agents’ responses and for voting among multiple agents to solve open-ended tasks.
  • Developing RCABench, a benchmark for evaluating agent-based root-cause localization via multi-hop causal reasoning over codebases.
Research Apprentice Samsung Research America (advised by Dr. Beth Ann Hockey)
Apr 2025 - Present
Santa Clara, USA
  • Developing a novel data collection pipeline for capturing high-quality human interaction trajectories on ServiceNow using the WorkArena benchmark.
  • Designed end-to-end preprocessing and post-processing that converts JavaScript interaction events into BrowserGym-compatible actions.
  • Collected 50+ human trajectories on WorkArena L1 tasks, showing a 30s reduction in agent (GPT-5) execution time via in-context learning.
Graduate Research Assistant (NLP 244 Coursework) Dr. Jeffrey Flanigan
Apr 2025 - Jun 2025
Santa Clara, USA
  • Benchmarked diffusion language models (LLaDA) against autoregressive LLMs (LLaMA-3, SmolLM) for faithfulness on summarization datasets.
  • Analyzed robustness under zero-shot, chain-of-thought, and adversarial prompting settings.
Graduate Research Assistant (NLP 243 Coursework) Dr. Amita Misra
Sept 2024 - Dec 2025
Santa Clara, USA
  • Contributed to an ACL 2025 Workshop paper from SemEval-2025 Task 1 (AdMIRe), studying idiomatic sentence–image ranking where pretrained VLMs exhibit strong literal bias.
  • Developed a prompting-based approach using GPT-4 to generate idiomatic definitions prior to ranking, improving Top-1 accuracy from 43% to 87%.

Industry Experience

Applied Research Intern Nokia Bell Labs
Jun 2025 - Aug 2025
Naperville, USA
  • Built Graph Neural Network-based anomaly detection models, applied PageRank algorithms, and developed a transformer-based forecasting pipeline for software releases.
  • Developed an agentic multimodal RAG pipeline for enterprise knowledge retrieval.
Founding Lead ML Engineer Adani Group
Nov 2021 - Aug 2024
Gurgaon, India
  • Developed and deployed LLM-based systems (GPT, BERT) for large-scale text understanding tasks, including feedback analysis and churn modeling.
  • Built and evaluated an LLM-powered, multi-tenant conversational agent with vector search over enterprise data using Delta Lake–backed retrieval.
  • Fine-tuned GPT-3.5 models for controlled text generation (hotel descriptions) across 33K listings.
  • Received MongoDB APAC Innovation Award 2024 - Batch to Real-Time Category for building an AI-powered product catalog.
ML Engineer BlackRock
Jul 2019 - Oct 2021
Mumbai, India
  • Developed a neural network-based solution for portfolio and index tree generation, scaling to 100 portfolios.
  • Built and deployed ETL tools for data validation and securities loading, saving time equivalent to 7 FTEs.
  • Designed XGBoost and Random Forest–based models for risk scoring and ranking investment instruments.

Publications

UCSC NLP T6 at SemEval-2025 Task 1: Leveraging LLMs and VLMs for Idiomatic Understanding Judith Clymo, Adam Zernik, Shubham Gaur
In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2103–2115, Vienna, Austria. Association for Computational Linguistics.
Advancing Web-Based Visual Question Answering with Efficient Image-Text Alignment Saketh Kilaru, Shubham Gaur, Spandan Rout
Proceedings of the International Conference on Recent Advancements in Artificial Intelligence (ICRAAI 2024).
Extraction of Cumulative Blobs Using Dynamic Gestures Rishabh Naulakha, Shubham Gaur, Dhairya Lodha, Mehek Tulsyan, Utsav Kotecha
International Journal of Scientific Research (IJSR), 2021.

Poster Presentations

What Web Agents Need Isn't Data - It's High Quality Data
Capstone Presentation, UC Santa Cruz (2025)
Advancing Web-Based Visual Question Answering with Efficient Image-Text Alignment
SciML Symposium, Georgia Tech (2024)

Technical Skills

AI Safety & Alignment Faithfulness, Hallucination Eval, Robustness Testing, Adversarial Prompting, RLHF, PPO.
Models & Analysis PyTorch, Transformers, Diffusion Models, Causal Reasoning, Interpretability.
Agentic Systems Multi-Agent Systems, Agent Routing, Tool-Calling LLMs, Human-in-the-Loop.
Engineering & Infra Docker, Kubernetes, FastAPI, MLflow, Airflow, Vector Databases (FAISS, Qdrant).
Languages Python (NumPy, SciPy, pandas), SQL, C/C++, Java, Bash.

Education

University of California, Santa Cruz
Master's in Natural Language Processing (4.0/4.0)
Sept 2024 - March 2026
Santa Clara, USA
SRM Institute of Science and Technology
Bachelor's in Information Technology (3.8/4.0)
July 2015 - May 2019
Chennai, India

Teaching Experience (TA)

ARTG 91: Intro to Game Art Production
Present (Winter 2026)
THEA 80C: Monsters
Fall 2025
TIM 172B: Introduction to Technology Management II
Winter 2025
TIM 172A: Introduction to Technology Management I
Fall 2024