Hello, I'm JATIN MEHRA

> Data Scientist & AI Engineer

Building the future with AI-powered solutions that enhance business efficiency. Specialized in machine learning, NLP, and automation.

<ABOUT_ME/>

Results-driven Data Scientist with expertise in machine learning, natural language processing (NLP), and automation. Currently pursuing a Master of Science in Data Science.

I specialize in building AI-powered solutions that enhance business efficiency and automate workflows. My portfolio showcases projects in predictive analytics, AI-driven automation, chatbots, and financial forecasting.

Machine Learning NLP AI Solutions RAG Deep Learning

> PERSONAL_INFO

jatinmehra119@gmail.com
(+91) 9910364780
Delhi, India

<TECH_STACK/>

LANGUAGES

Python C++ Java SQL

ML_FRAMEWORKS

Pandas Scikit-learn NumPy PyTorch Hugging Face LangChain

WEB_&_API

Streamlit FastAPI Flask

VISUALIZATION

Matplotlib Seaborn Tableau Power BI

DEVOPS_&_CLOUD

Git Docker Google Cloud Platform GitHub Actions

DEV_TOOLS

Jupyter Notebook Visual Studio Anaconda

<DATA_SCIENCE_JOURNEY/>

A structured progression from statistical foundations to cutting-edge AI applications, showcasing continuous learning and practical implementation across multiple domains.

01

FOUNDATIONS_IN_PROBABILITY_&_STATISTICS

Laid the theoretical groundwork with core statistical concepts—probability distributions, inferential statistics, and hypothesis testing—which are essential for data analysis and ML.

[BCA - 2nd Semester]

02

DATA_MANIPULATION_MASTERY

Data Analysis & Manipulation (BCA - 4th Semester + Self-Learning)

Mastered NumPy and Pandas for data cleaning, EDA, and transformation

Already proficient in Python (learned in 11th–12th CS & further in BCA)

Practiced on DataWars platform and Kaggle notebooks

Learned DSA using C#, Java, and Python

03

ML_ALGORITHM_FOUNDATION

Introduction to Machine Learning (Self-Learning + BCA - 6th Semester)

Completed "Hands-On ML with Scikit-Learn, TensorFlow, and Keras" by Aurélien Géron

Explored supervised/unsupervised learning through real-world datasets

Built initial ML projects and deepened practical knowledge via Kaggle

04

CORE_ML_ALGORITHMS

Core ML Algorithms (YouTube + BCA Curriculum + Projects)

Studied and implemented core ML algorithms like SVM, Decision Trees, KNN, Naive Bayes, etc.

Built multiple ML projects applying these concepts

Started participating in Kaggle competitions, achieving higher rankings over time

05

DEEP_LEARNING_MASTERY

Deep Learning (YT + Books + Projects)

Gained intuition for neural networks, SGD, backpropagation, attention, and transformers through resources like 3Blue1Brown, Krish Naik, Andrej Karpathy, and Aurélien Géron's book

Worked hands-on with TensorFlow, Keras, and PyTorch to build DL projects

06

TRANSFORMERS_&_COMPUTER_VISION

Transformers, LLMs & Computer Vision

Studied transformers, self-attention, and GPT architecture in depth

Built applications using Hugging Face Transformers, fine-tuned LLMs, and trained models like CNNs, YOLO, ViTs, and Masked Autoencoders (MAE)

Participated in Kaggle research competitions, achieving top 1–10% ranks

07

AI_WORKFLOWS_&_AGENTIC_SYSTEMS

AI Workflows & Agentic Systems

Currently working with LangChain, LangGraph, and exploring Smol Agents

Built real-world AI apps integrating LLMs, RAG pipelines, and vector databases

08

CLOUD_&_MODEL_DEPLOYMENT

Cloud & Model Deployment (GCP + Docker + FastAPI + Flask)

Gained hands-on experience with Google Cloud Platform (GCP) services for scalable model hosting

Deployed multiple AI apps using Docker, FastAPI, and Flask for model inference and API serving

Emphasis on creating production-ready AI systems with CI/CD pipelines and cloud-native deployment strategies

> ACADEMIC_NOTE: ONGOING_M.SC._IN_DATA_SCIENCE

I'm currently pursuing my Master of Science in Data Science, where I'm intentionally revisiting and strengthening foundational topics like:

Probability & Statistics Advanced Databases SQL Python Programming

Though I've previously worked with these technologies extensively during my BCA and projects, revisiting them with a deeper academic lens is sharpening my theoretical understanding and filling any knowledge gaps. I firmly believe that a strong foundation amplifies the impact of advanced AI systems.

COMPETITION_SUCCESS

Top 1–10% rankings in multiple Kaggle research competitions covering transformers and computer vision

ADVANCED_AI_SYSTEMS

LLM fine-tuning, RAG systems, and multi-agent AI applications with production deployments

CONTINUOUS_LEARNING

Self-taught progression from statistical foundations to cutting-edge AI through books, courses, and hands-on projects

<FEATURED_PROJECTS/>

CRAWL_GPT

A powerful web content crawler with LLM-powered RAG (Retrieval Augmented Generation) capabilities. CrawlGPT extracts content from URLs, processes it through intelligent summarization, and enables natural language interactions using modern LLM technology.

Generative AI Web Crawling RAG Vector DB CI/CD Docker
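
The retrieval step behind a RAG pipeline like the one described above can be sketched as follows. This is a minimal illustration, not CrawlGPT's actual code: the `embed`, `cosine`, `retrieve`, and `build_prompt` helpers are hypothetical names, and the word-count "embedding" is a stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, chunks, top_k=2):
    """Assemble an LLM prompt from the retrieved context (hypothetical prompt wording)."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical crawled chunks, for illustration only.
chunks = [
    "FastAPI serves the model behind an HTTP endpoint.",
    "FAISS stores vectors for similarity search.",
    "Docker packages the app into a container.",
]
best = retrieve("How are vectors stored for search?", chunks, top_k=1)
```

In a real deployment the prompt returned by `build_prompt` would be sent to an LLM; that call is left out here because the model and client library are not specified.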

AI-POWERED_YOUTUBE_VIDEO_SUMMARIZER_&_FACT-CHECKER

This web app extracts captions from YouTube videos, generates summaries and text embeddings, and lets users search within the transcripts. It also refines the context and fact-checks claims using AI models and web crawlers.

Generative AI NLP Web Scraping FastAPI Pandas

PDF_INSIGHT_PRO:_RAG_APP

Agentic RAG app built with FastAPI, FAISS, LangChain, and Groq, with real-time web validation via Tavily, that answers queries about your PDFs. Achieved a mean semantic similarity of 0.852 with ~86% evaluation accuracy (semantic similarity threshold: 75). Includes an Android app written in Java.

Generative AI LangChain Agentic RAG FAISS 0.852 Similarity 86% Accuracy Android App FastAPI Docker
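
One way the two evaluation figures above could relate is sketched below. It assumes that the threshold of 75 corresponds to 0.75 on a 0 to 1 similarity scale and that an answer counts as correct when its score meets the threshold; both are assumptions, and `summarize_similarity` is a hypothetical helper, not the project's evaluation code.

```python
from statistics import mean


def summarize_similarity(scores, threshold=0.75):
    """Mean similarity plus the share of answers at or above the threshold.

    Assumption: scores are semantic similarities in [0, 1], one per evaluated answer.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    passed = sum(1 for s in scores if s >= threshold)
    return {
        "mean_similarity": mean(scores),
        "accuracy": passed / len(scores),
    }


# Made-up scores for illustration; these are not the project's results.
summary = summarize_similarity([0.9, 0.8, 0.7, 0.95])
```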

PLAGIARISM_DETECTOR_USING_FINE-TUNED_SMOLLM2_135M

The SmolLM2 135M Instruct model was fine-tuned on the MIT Plagiarism Detection Dataset to better identify textual similarities. Achieved an F1 score, recall, and precision of 0.96 each. The model produces binary classification outputs and has 1000+ downloads per month.

Generative AI NLP LLM Fine Tuning PyTorch HuggingFace 0.96 F1 Score
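
For readers unfamiliar with the metrics quoted above, here is a minimal sketch of how precision, recall, and F1 are computed for a binary classifier. The function name and the sample labels are illustrative assumptions; label 1 is taken to mean "plagiarized".

```python
def binary_classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for binary labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration; these are not the model's predictions.
metrics = binary_classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```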

AUTOMATED_ESSAY_SCORING_SYSTEM

Developed an AI model for automated essay evaluation as part of the Kaggle AES competition. Achieved a Quadratic Weighted Kappa (QWK) score of 0.79 and ranked in the top 10%. The project aimed to reduce manual grading effort and improve the feedback process for students and educators.

PyTorch HuggingFace LLM Fine Tuning 0.79 QWK Top 10%
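
Quadratic Weighted Kappa measures agreement between two sets of ordinal ratings, penalizing disagreements by the squared distance between the ratings. The sketch below is a plain-Python version of the standard formula; the function name, the rating range, and the sample scores are assumptions for illustration, not the competition code.

```python
def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Quadratic weighted kappa between two equal-length integer rating lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n_ratings = max_rating - min_rating + 1
    if n_ratings < 2:
        raise ValueError("need at least two distinct rating levels")
    n = len(y_true)

    # observed[i][j]: number of items rated i by the reference and j by the model
    observed = [[0] * n_ratings for _ in range(n_ratings)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_ratings)) for j in range(n_ratings)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n_ratings):
        for j in range(n_ratings):
            weight = (i - j) ** 2 / (n_ratings - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / n  # counts expected by chance
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0:
        # Both raters gave every item the same single rating; treated here as full agreement.
        return 1.0
    return 1.0 - numerator / denominator


# Made-up ratings on a 1 to 3 scale, for illustration only.
perfect = quadratic_weighted_kappa([1, 2, 3, 3], [1, 2, 3, 3], 1, 3)
partial = quadratic_weighted_kappa([1, 2, 3], [1, 2, 2], 1, 3)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which gives some context for the 0.79 score reported above.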

AI-POWERED_PODCAST_TO_BLOG_GENERATOR

Converts podcasts into engaging blog posts using AI and generates FAQs, social media posts, newsletters, and SEO elements. The app uses the LangChain library, a Llama 4 model, OpenAI whisper-large-v3-turbo, Pydantic, and FastAPI.

Generative AI NLP Tavily Search OpenAI Whisper FastAPI Docker

NEXAR_DASHCAM_CHALLENGE:_DASHCAM_COLLISION_PREDICTION_USING_VIDEOMAE-2

Computer vision project for dashcam collision prediction using the VideoMAE-2 architecture. Achieved 11th rank (top 1%) in the Nexar Dashcam Challenge on Kaggle. Implemented transformer-based video understanding models that predict potential collisions from dashcam footage, targeting real-time safety applications in autonomous driving systems.

Computer Vision VideoMAE-2 Transformers PyTorch Video Analysis 11th Rank Top 1%

FAKE_SCENE_DETECTOR

Computer vision project to detect and classify fake or manipulated scenes in images. Achieved 0.93 AUC in a Kaggle competition with a top 10% solution. Uses deep neural networks and image processing techniques to identify digitally altered content for media authenticity verification.

Computer Vision Deep Learning Image Processing 0.93 AUC Top 10%

BRIST1D_BLOOD_GLUCOSE_PREDICTION

Machine learning project for predicting blood glucose levels in Type 1 Diabetes patients using the BrisT1D dataset. Achieved 86th rank (top 10%) in the Kaggle competition. Implemented time series forecasting models to help patients and healthcare providers better manage diabetes through predictive analytics.

Machine Learning Time Series Healthcare AI Predictive Analytics 86th Rank Top 10%

SMART_RESUME_GENERATOR_USING_AI

AI-powered application that automatically generates professional resumes tailored to specific job descriptions. Uses natural language processing to analyze job requirements and optimize resume content for better job matching and ATS compatibility.

Generative AI NLP LangChain Document Generation

AI-AGENT-BASED_DEEP_RESEARCH_SYSTEM

Advanced multi-agent AI system for conducting comprehensive research on complex topics. Uses collaborative AI agents with specialized roles to gather, analyze, and synthesize information from multiple sources for in-depth research reports and insights.

Multi-Agent AI Research Automation LangChain LangGraph Knowledge Synthesis

GAS_TURBINE_ELECTRICITY_PREDICTION_WITH_LSTM_NEURAL_NETWORKS

Developed a deep learning solution for predicting gas turbine electricity output using LSTM neural networks. The model processes time-series data to forecast power generation with high accuracy (RMSE < 370), outperforming traditional prediction methods and reducing the need for manual monitoring.

TensorFlow LSTM Time Series Deep Learning RMSE < 370 High Accuracy

<EDUCATION_&_EXPERIENCE/>

M.SC_DATA_SCIENCE

Chandigarh University, Punjab, India

Current - Pursuing advanced studies in Data Science and AI

2025 - Present

BCA_COMPUTER_APPLICATIONS

Chandigarh University, Punjab, India

CGPA: 7.89/10

2022 - 2025

<PROFESSIONAL_EXPERIENCE/>

SALES_OPERATIONS_ANALYST

NB ENTERPRISER (India Today Group)

Delhi, India

Dec 2022 - Present

> KEY_RESPONSIBILITIES_&_ACHIEVEMENTS:

  • Designed daily inventory analytical reports using Excel, improving decision-making processes across the organization
  • Built comprehensive KPI dashboards in Tableau, enhancing strategic insights for leadership team
  • Automated bank statement reconciliation for 1000+ transactions/month, reducing manual work by 80% and achieving 5x faster processing using pdfplumber, OpenPyXL, and Pandas
  • Developed the WMPL SAP Report Generator, a Python & Streamlit tool for automated SAP-compatible Excel reports

Excel Analytics Tableau Python Pandas Streamlit Process Automation

<CERTIFICATIONS/>

BUILDING_RAG_AGENTS_WITH_LLMs

NVIDIA Deep Learning Institute

Advanced training in developing Retrieval-Augmented Generation (RAG) agents using Large Language Models for enhanced AI applications.

PRESENTATION_SKILLS

Harvard Business School Online

Mastered advanced presentation techniques, storytelling, and executive communication strategies for impactful business presentations.

DIGITAL_INTELLIGENCE

Harvard Business School Online

Comprehensive understanding of digital transformation, data-driven decision making, and technology strategy in modern business.

BUSINESS_PLAN_DEVELOPMENT

Harvard Business School Online

Strategic business planning, market analysis, financial modeling, and venture development for successful business initiatives.

DECISION_MAKING

Harvard Business School Online

Advanced decision-making frameworks, analytical thinking, and strategic problem-solving for complex business challenges.

AGENTS_COURSE_CERTIFICATION

Hugging Face

Specialized training in building intelligent AI agents using Hugging Face transformers and modern NLP techniques for autonomous systems.

4 HARVARD_CERTS
1 NVIDIA_CERT
1 HUGGINGFACE_CERT
6 TOTAL_CERTS

<GET_IN_TOUCH/>

LOCATION

Delhi, India

Available for remote work

PHONE

(+91) 9910364780

Available 9 AM - 8 PM IST

> SEND_MESSAGE

Drop me a message and let's discuss your next AI project!

> CONNECT_WITH_ME

Let's collaborate on exciting AI and Data Science projects!

15+ AI_PROJECTS
TOP_10% KAGGLE_RANKS
2+ YEARS_EXP
24/7 LEARNING