Shubham Gaur
AI/ML Research Engineer • MS NLP @ UC Santa Cruz
🌍 San Francisco Bay Area, CA, USA
📧 sgaur2@ucsc.edu
📱 (408) 640-5717
🔗 LinkedIn | GitHub | Website | Google Scholar
🎯 About Me
Graduate student in Natural Language Processing at UC Santa Cruz (Silicon Valley Campus) with 5+ years of industry experience at Nokia, Adani Group, and BlackRock. Currently based in Santa Clara, CA, working on advancing generalist AI through multimodal systems and agentic AI.
Research Interests:
- 🎯 Alignment & Reasoning in AI Systems
- 🔧 Efficiency in Large Language Models
- 🤖 Multimodal Systems (text, image, video)
- 🎯 Agentic Systems (Conversational AI, GUI Agents)
🎓 Education
University of California, Santa Cruz (UCSC) | Santa Clara, CA
M.S. in Natural Language Processing (Major: AI) | Sep 2024 - Dec 2025
SRM Institute of Science & Technology (SRMIST) | Chennai, India
B.S. in Information Technology (Major: ML) | Jul 2015 - May 2019
💼 Work Experience
🔵 Nokia | Machine Learning Intern | Jun 2025 - Present
Naperville, USA
- Built Graph Neural Network-based anomaly detection models and applied PageRank algorithms
- Developed a transformer-based forecasting pipeline for network optimization
- Created an agentic multimodal RAG pipeline (LangChain + Qdrant + Azure ML) for enterprise knowledge retrieval
🟢 Adani Group | Lead Machine Learning Engineer | Nov 2021 - Aug 2024
Gurgaon, India
- Built a real-time Customer Data Platform serving 100M+ users with a 10-member team, powering personalized AI/ML applications across travel, retail, and finance
- Developed AdaniOne Chat, an LLM-powered multi-tenant chatbot with Delta Lake-backed vector search
- Deployed supervised fine-tuned (SFT) GPT-3.5 models for automated hotel description generation across 33K listings (saved $130K)
- Applied transformer-based models (BERT, GPT) to feedback analysis and churn prediction, improving customer retention by 48% and customer lifetime value (CLV) by 15%
- Developed a recommendation system with advanced ranking techniques, increasing duty-free engagement by 87% and revenue by 20%
🔴 BlackRock | Machine Learning Engineer | Jul 2019 - Oct 2021
Mumbai, India
- Developed a neural network-based solution for portfolio and index tree generation, scaling to 100 portfolios
- Built and deployed ETL tools for data validation and securities loading, saving time equivalent to 7 FTEs
- Designed XGBoost and Random Forest-based models for risk scoring and ranking investment instruments
🔬 Research Projects
🤖 GUI Agents (in collaboration with Samsung Research America)
May 2025 - Present
- Building web action agents combining DOM, accessibility trees, and reinforcement learning (PPO) with visual cues
- Developing reliable multimodal UI automation systems
- Creating benchmarks for web-based task generalization
🏠 HomeHelper - Multimodal Agent for Appliance Troubleshooting
Apr - Jun 2025
- Multi-agent system integrating images, text, and video for household appliance support
- LangChain orchestrator-based system with Self-Improving RAG loop (SIM-RAG)
- Models: BLIP, GPT, Mistral, and LLaMA, with conversational memory
- Dataset: 3,000+ synthetic multi-turn conversations
📊 Evaluation of Faithfulness over ARMs and Diffusion Language Models | Code
Apr - Jun 2025
- Benchmarked a diffusion-based LLM (LLaDA) against autoregressive models (ARMs: LLaMA-3, SmolLM) across summarization datasets
- Demonstrated LLaDA’s robustness through zero-shot, chain-of-thought, and adversarial prompting
- Used BERTScore (semantic) + AlignScore (faithfulness) for comprehensive evaluation
🎨 Leveraging LLMs and VLMs for Idiomatic Understanding | Code
SemEval’25 Research
- Used LLMs & VLMs (OpenCLIP, BLIP, ALIGN) to interpret idiomatic sentences
- Developed image-text alignment strategies for contextual meaning interpretation
- Publication: Submitted to SemEval 2025 (ACL Workshop)
📝 Publications
- “Leveraging LLMs and VLMs for idiomatic understanding” | ACL’25 Workshop (SemEval) Paper
  Judith Clymo, Adam Zernik, Shubham Gaur
- “Advancing Web-Based Visual Question Answering with Efficient Image-Text Alignment” | ICRAAI’24 Paper
  Saketh Kilaru, Shubham Gaur, Spandan Rout
- “Extraction of Cumulative Blobs Using Dynamic Gestures” | IJSR’21 Paper
  Rishabh Naulakha, Shubham Gaur, Dhairya Lodha, Mehek Tulsyan, Utsav Kotecha
🛠️ Technical Skills
Programming & Data Science

Libraries: pandas, NumPy, SciPy, scikit-learn, Matplotlib, Jupyter
Machine Learning & AI

Specializations: Computer Vision (CNN, OpenCV), Reinforcement Learning, LoRA/PEFT, Diffusion Models
Generative AI & LLMs
Models: BERT, GPT-3/4, LLaMA, CLIP, ALIGN
Frameworks: LangChain/LangGraph, AutoGen, Retrieval-Augmented Generation (RAG)
Vector Databases: Pinecone, Qdrant
Techniques: Supervised Fine-Tuning (SFT), RLHF, PPO, Instruction-Tuned LLMs
Big Data & Cloud

Technologies: Apache Spark, Hadoop, Kafka, Databricks, BigQuery, Synapse, Data Factory
Deployment & MLOps

Tools: FastAPI, MLflow, Airflow, CI/CD, Redis, PostgreSQL, MySQL, MongoDB
🏆 Key Achievements
- 🚀 Built 100M+ user platform at Adani Group
- 💰 Saved $130K through automated content generation
- 📈 Improved customer retention by 48% using AI/ML models
- 💼 Increased revenue by 20% through recommendation systems
- ⚡ Saved equivalent of 7 FTEs through automation at BlackRock
- 🎓 Maintained 4.0 GPA in graduate studies
- 📚 3 research publications across AI/ML conferences
📞 Let’s Connect!
I’m always interested in discussing AI research, collaboration opportunities, or innovative projects. Feel free to reach out!
“Advancing AI systems that understand, reason, and act across multiple modalities with human-level reliability.”