Shubham Gaur


View My GitHub Profile

👋 About Me

Hello! I’m Shubham Gaur, a Machine Learning Engineer with 5+ years of experience in Natural Language Processing (NLP), Computer Vision, and MLOps. Currently pursuing my Master’s in NLP at UC Santa Cruz, I’m passionate about developing AI solutions that deliver real-world impact.

💼 My journey in AI has taken me through innovative projects at Adani Group and BlackRock.

🔍 My research interests include NLP, multimodal systems, RAG, and knowledge representation.

🤝 I’m always eager to collaborate on AI projects and explore new ideas. Let’s connect and build solutions that matter!

Work Experience

Lead Machine Learning Engineer @ Adani Group (Nov 2021 - Aug 2024)

AI Solutions for Business Optimization

Scalable Infrastructure and Data Pipelines

Machine Learning Engineer @ BlackRock (Jul 2019 - Oct 2021)

Publications

Advancing Web-Based Visual Question Answering with Efficient Text Alignment

Publication

This paper contributes to WebQA, a recent Microsoft benchmark that requires combined visual and textual reasoning to answer questions. Exploring alternatives to WebQA’s baseline vision-language pre-training (VLP) models, I aimed to improve accuracy through (a) a lighter model, (b) a detector-free visual encoder, and (c) knowledge distillation into a RoBERTa-based student. Comparing VLP and RoBERTa on single-source and multi-source questions showed RoBERTa performing better at 54.91% and 30.11%, respectively, versus VLP’s 51.49% and 28.73%. This research serves as a step toward improving Bing Search.
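At its core, this kind of distillation trains the lighter student on the teacher’s temperature-softened output distribution. A minimal sketch of that standard objective (the Hinton-style KL loss; the function names and temperature value below are illustrative, not taken from the paper):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients keep their magnitude as T grows."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)
```

In practice this term is combined with the ordinary cross-entropy on ground-truth labels; the loss is zero exactly when the student matches the teacher’s soft targets.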

Extraction of Cumulative Blobs from Dynamic Gestures

Publication

Gesture recognition lets users control a computer without a mouse or keyboard, but camera-based systems struggle in low-light conditions. To address this, we introduced a night-vision camera, letting the system operate where ordinary cameras fail. We set up a Raspberry Pi with OpenCV to detect and track dynamic gestures; a machine learning algorithm then recognizes the resulting motion patterns and drives the Raspberry Pi’s GPIOs for various control tasks. The system achieves 99.62% accuracy, even in the dark.
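The cumulative-blob idea can be sketched without the camera or GPIO hardware: accumulate inter-frame differences across a gesture so the motion traces out a single blob, then summarize that blob (here by its centroid) for a downstream classifier. This is a minimal NumPy-only illustration of the concept; the actual system uses OpenCV on a Raspberry Pi, and all names and thresholds here are assumptions for illustration:

```python
import numpy as np

def cumulative_blob(frames, threshold=25):
    """Accumulate inter-frame differences into one motion mask.

    frames: sequence of 2-D uint8 grayscale arrays of equal shape.
    Returns a boolean mask marking every pixel that changed between
    any pair of consecutive frames, i.e. the cumulative blob the
    gesture traced out.
    """
    acc = np.zeros(frames[0].shape, dtype=bool)
    for prev, curr in zip(frames, frames[1:]):
        diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
        acc |= diff > threshold
    return acc

def blob_centroid(mask):
    """Centroid (row, col) of the active pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return float(ys.mean()), float(xs.mean())

# Tiny synthetic gesture: a bright 2x2 patch sweeping left to right.
frames = []
for x in (0, 3, 6):
    f = np.zeros((8, 10), dtype=np.uint8)
    f[3:5, x:x + 2] = 255
    frames.append(f)
print(blob_centroid(cumulative_blob(frames)))  # → (3.5, 3.5)
```

On the real device, features like this centroid trajectory would feed the classifier, whose predicted gesture class then toggles GPIO pins.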

Research Projects

ADMIRE: Advancing Multimodal Idiomatic Understanding in NLP (Code)
SemEval 2025

Question Answering using Retrieval Augmented Generation (RAG) (Code)
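As a rough illustration of the RAG pattern behind this project: embed the corpus, retrieve the passages most similar to the question, and assemble them into a prompt for a generator model. The sketch below swaps the neural encoder for a toy bag-of-words embedding; every name and format here is an assumption for illustration, not the project’s actual code:

```python
import numpy as np

def build_vocab(texts):
    """Map each distinct token to one dimension of the embedding space."""
    vocab = sorted({tok for t in texts for tok in t.lower().split()})
    return {tok: i for i, tok in enumerate(vocab)}

def embed(text, vocab):
    """L2-normalised bag-of-words vector (stand-in for a neural encoder)."""
    v = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            v[vocab[tok]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def retrieve(query, corpus, vocab, k=2):
    """Rank passages by cosine similarity to the query; return the top k."""
    q = embed(query, vocab)
    scores = np.array([embed(p, vocab) @ q for p in corpus])
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(query, passages):
    """Stuff the retrieved passages into the generator's prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

A production pipeline would replace `embed` with a learned sentence encoder and an approximate-nearest-neighbor index, but the retrieve-then-generate flow is the same.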

Education

University of California, Santa Cruz (UCSC), Santa Clara, CA

SRM Institute of Science and Technology (SRMIST), Chennai, India

Technical Skills

Contact

Extracurricular Activities