Hello! I’m Shubham Gaur, a Machine Learning Engineer with 5+ years of experience in Natural Language Processing (NLP), Computer Vision, and MLOps. Currently pursuing my Master’s in NLP at UC Santa Cruz, I’m passionate about developing AI solutions that deliver real-world impact.
💼 My journey in AI has taken me through innovative projects at Adani Group and BlackRock.
🔍 My research interests include NLP, multimodal systems, RAG, and knowledge representation.
🤝 I’m always eager to collaborate on AI projects and explore new ideas. Let’s connect and build solutions that matter!
Lead Machine Learning Engineer @ Adani Group (Nov 2021 - Aug 2024)
AI Solutions for Business Optimization
Scalable Infrastructure and Data Pipelines
Machine Learning Engineer @ BlackRock (Jul 2019 - Oct 2021)
This paper contributes to WebQA, a recent Microsoft benchmark that requires reasoning over both images and text to answer questions. Exploring alternatives to WebQA's vision-language pre-training (VLP) baselines, I aimed to improve accuracy through (a) a lighter model, (b) a detector-free visual encoder, and (c) knowledge distillation approaches built on RoBERTa. Comparing VLP and RoBERTa on single-source and multi-source questions showed RoBERTa performing better, at 54.91% and 30.11% respectively, versus VLP's 51.49% and 28.73%. This research is a step toward improving multimodal web search systems such as Bing.
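To illustrate the text-only side of this comparison, here is a minimal sketch of how a RoBERTa relevance scorer for WebQA-style source retrieval could look. This assumes a binary relevance formulation; the checkpoint, labels, and candidate texts are placeholders, not the paper's exact training setup:

```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Placeholder checkpoint; the paper's model would be fine-tuned on WebQA pairs.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
model.eval()

def score_source(question: str, source_text: str) -> float:
    """Return the probability that `source_text` is relevant to `question`."""
    inputs = tokenizer(question, source_text, truncation=True,
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

# Rank candidate sources (text snippets or image captions) by relevance.
candidates = ["A caption describing image A.", "An unrelated text snippet."]
ranked = sorted(candidates,
                key=lambda s: score_source("Which bridge is longer?", s),
                reverse=True)
```

Ranked sources would then feed a downstream answer generator, mirroring WebQA's retrieve-then-answer pipeline.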
This project explores gesture recognition, which lets users control a computer without a mouse or keyboard. The catch: camera-based recognition struggles in low-light conditions. To tackle this, we added a night-vision camera, effectively giving the system night-vision goggles so it works where ordinary cameras fail. A Raspberry Pi running OpenCV detects and tracks dynamic gestures, and a machine learning classifier maps the recognized patterns to the Pi's GPIO pins to drive various tasks. The system reaches 99.62% accuracy, showing that gesture recognition can work reliably even in the dark. A minimal sketch of the sensing-and-control loop follows below.
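This sketch makes a few assumptions: the night-vision camera is exposed as a standard video device, gestures appear as bright regions under IR illumination, and the pin number and area threshold are illustrative placeholders rather than the project's actual values:

```python
import cv2
import RPi.GPIO as GPIO

LED_PIN = 18           # hypothetical pin; the actual project wiring may differ
AREA_THRESHOLD = 5000  # placeholder size for a "hand present" region

GPIO.setmode(GPIO.BCM)
GPIO.setup(LED_PIN, GPIO.OUT)

cap = cv2.VideoCapture(0)  # night-vision camera as a standard video device
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        blur = cv2.GaussianBlur(gray, (5, 5), 0)
        # Under IR illumination the hand appears as a bright blob on a dark scene.
        _, mask = cv2.threshold(blur, 120, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        hand_present = bool(contours) and \
            cv2.contourArea(max(contours, key=cv2.contourArea)) > AREA_THRESHOLD
        # Drive a GPIO pin from the detection result; the real system maps
        # classified gestures to several GPIO-controlled tasks.
        GPIO.output(LED_PIN, hand_present)
finally:
    cap.release()
    GPIO.cleanup()
```

The actual project adds a learned classifier on top of the tracked gesture trajectories; this sketch shows only the capture-detect-actuate skeleton.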
ADMIRE: Advancing Multimodal Idiomatic Understanding in NLP (Code)
SemEval 2025
Question Answering using Retrieval Augmented Generation (RAG) (Code)
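As a rough sketch of the retrieve-then-generate pattern this project is built around: the corpus, embedding model, and prompt format below are illustrative placeholders, not the project's actual configuration:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy corpus; a real system would index a full document collection.
docs = [
    "RAG retrieves supporting passages before generating an answer.",
    "LoRA adapts large models with low-rank weight updates.",
]
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k passages most similar to the query (cosine similarity)."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    return [docs[i] for i in np.argsort(-scores)[:k]]

context = "\n".join(retrieve("What does RAG do?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: What does RAG do?"
# `prompt` would then be passed to a generator model of choice.
```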
University of California, Santa Cruz (UCSC), Santa Clara, CA
SRM Institute of Science and Technology (SRMIST), Chennai, India
Programming Languages: Python, C, C++, Java, Perl, SQL, HTML, JavaScript (Node.js)
Machine Learning & Deep Learning: PyTorch, TensorFlow, Keras, HuggingFace Transformers, JAX, LoRA Fine-Tuning, CNN
Data Analysis & Scientific Computing: pandas, NumPy, SciPy, scikit-learn, Matplotlib, OpenCV
NLP & LLM Tools: LangChain, HuggingFace Transformers, AutoGen
Development & Deployment: Jupyter, Git, Linux, CUDA, Docker, Kubernetes, FastAPI, REST APIs, Streamlit
MLOps & DevOps: MLflow, Airflow, Pinecone, CI/CD, Azure Data Factory
Big Data & Distributed Systems: Apache Spark, Kafka, Hadoop, Databricks
Databases & Data Warehousing: PostgreSQL, MySQL, MongoDB, SQL Server, Redis, BigQuery, Synapse
Cloud Platforms: Azure, AWS, GCP
Visualization & Reporting: Tableau, Power BI
GitHub: github.com/shubham2345
LinkedIn: linkedin.com/in/shubhamggaur
Email: sgaur2@ucsc.edu