Hey, I'm Ashutosh Pandey

I am an AI and Machine Learning professional passionate about developing AI Agents and agentic systems that automate reasoning, decision-making, and analytics at scale. Evolving from mobile app development, I now focus on building multimodal AI, Generative AI applications, and cloud-based intelligent platforms, creating impactful solutions that solve real-world challenges.


About Me

I am an Associate Software Engineer (Data Science) at HARMAN, where I focus on developing AI agents, multimodal systems, and advanced analytics solutions. Previously, as an AI Intern at HARMAN, I worked on Generative AI and intelligent automation, and as an SDE Intern at Sense Original Technologies, I contributed to backend development and Android applications. My expertise spans AI/ML, Generative AI, cloud platforms, and full-stack development, with a strong passion for creating scalable and impactful software solutions.
  • Associate AI Engineer at HARMAN
  • AI and Data Intern at HARMAN
  • Android Developer Intern at Sense Original Technologies

My Projects

Industrial Agentic System
Industry
AI, Multi-Agent, Backend, LLMs, Industrial

Designed and developed an intelligent multi-agent framework using patterns like Supervisor-Orchestrator, Plan-and-Execute, Swarm, Tool-augmented ReAct, and multi-hierarchy PreAct with reflection and adaptation. Deployed on real-time industrial sensor data to automate anomaly detection, predictive maintenance, fault explanation, and root cause analysis, reducing manual intervention by 45%. Implemented evaluation pipelines for latency, task completion, LLM error rate, and system reliability.
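
As a rough illustration of what an evaluation pipeline for latency, task completion, and LLM error rate might aggregate, here is a minimal sketch. The `AgentRun` schema, the `summarize` helper, and the sample values are assumptions made up for this example, not the project's actual pipeline or data.

```python
from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass
class AgentRun:
    """One recorded agent run (hypothetical schema, for illustration only)."""
    latency_s: float   # wall-clock time for the run, in seconds
    completed: bool    # whether the agent finished its task
    llm_errors: int    # failed or malformed LLM calls in this run
    llm_calls: int     # total LLM calls in this run


def summarize(runs: list[AgentRun]) -> dict[str, float]:
    """Aggregate latency, task completion, and LLM error rate over runs."""
    latencies = [r.latency_s for r in runs]
    total_calls = sum(r.llm_calls for r in runs)
    total_errors = sum(r.llm_errors for r in runs)
    return {
        "mean_latency_s": mean(latencies),
        # n=20 yields 19 cut points; the last one is the 95th percentile.
        # method="inclusive" keeps the result inside the observed range.
        "p95_latency_s": quantiles(latencies, n=20, method="inclusive")[-1],
        "task_completion_rate": sum(r.completed for r in runs) / len(runs),
        "llm_error_rate": total_errors / total_calls if total_calls else 0.0,
    }


# Made-up sample runs, not real measurements.
runs = [
    AgentRun(1.0, True, 0, 4),
    AgentRun(2.0, True, 1, 6),
    AgentRun(3.0, False, 2, 5),
    AgentRun(2.0, True, 0, 5),
]
report = summarize(runs)
print(report)
```

Tracking a tail percentile alongside the mean is a common choice for real-time systems, since a few slow runs can hide behind a healthy average.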

EDA AI Agents
Industry
AI, Multimodal, ML, Agents, EDA, Video

Built multimodal AI agents to automate complex EDA tasks by fine-tuning pretrained transformers using LoRA. Achieved 95% accuracy in converting circuit designs to netlists and vice versa. Developed agents for netlist-to-circuit image conversion, Verilog code generation and optimization, intelligent design validation queries, and graph-based netlist representations.

Referred VOS
Industry
AI, Computer Vision, Multimodal, Video

Developed a multimodal transformer-based solution for real-time video-text object segmentation. Implemented query-based masking and integrated frame-wise segmentation with audio context to improve temporal consistency, achieving 92% segmentation accuracy for real-time object tracking and video querying.

Code Change Impact Analysis
Industry
AI, Backend, Knowledge Graphs, Static Analysis

Built a system to analyze large codebases by converting them into Program Dependency Knowledge Graphs using a custom parser. Leveraged AI agents and LLM-as-a-judge to predict and rank the impact of code changes. Evaluated on PL/I legacy systems to uncover hidden dependencies and reduce undetected breakages.
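
The core idea of change impact analysis on a dependency graph can be sketched as a reverse reachability search. The example below is a simplified stand-in: it uses a plain edge list instead of a Program Dependency Knowledge Graph, ranks by hop distance instead of an LLM-as-a-judge, and all module names are hypothetical.

```python
from collections import defaultdict, deque


def impacted_by(edges: list[tuple[str, str]], changed: str) -> dict[str, int]:
    """Return every unit that transitively depends on `changed`,
    mapped to its distance (in dependency hops) from the change.

    `edges` holds (dependent, dependency) pairs, e.g. ("billing", "tax")
    means `billing` uses `tax`.
    """
    # Reverse index: dependency -> units that depend on it directly.
    dependents = defaultdict(list)
    for dependent, dependency in edges:
        dependents[dependency].append(dependent)

    distance = {changed: 0}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for parent in dependents[node]:
            if parent not in distance:  # first visit in BFS is the shortest path
                distance[parent] = distance[node] + 1
                queue.append(parent)

    del distance[changed]  # report only affected units, not the change itself
    return distance


# Hypothetical module names, not taken from any real codebase.
edges = [
    ("billing", "tax"),
    ("invoice", "billing"),
    ("report", "invoice"),
    ("report", "tax"),
    ("audit", "ledger"),
]
impact = impacted_by(edges, "tax")
# Closest dependents first; ties broken alphabetically.
ranked = sorted(impact, key=lambda unit: (impact[unit], unit))
print(ranked)
```

Hop distance is only a crude proxy for risk. It does show how indirect dependents such as `invoice` surface even though they never reference the changed unit directly.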

CampusPass
Personal
Android, Mobile, NFC, Full Stack

Engineered an NFC-based mobile application to validate student ID cards, improving verification speed by 60% and authentication speed by 250%. Enabled attendance tracking, delegation, and suspension management for professors and guards, serving over 5,000 students with a seamless mobile experience.

File Storage Drive
Personal
Full Stack, Android, Cloud, NestJS, PostgreSQL

A cloud storage app similar to Google Drive, featuring file upload, download, and local storage integration. It utilizes NestJS and Node.js for backend services, Kotlin Jetpack Compose for the Android interface, PostgreSQL for database management, and AWS S3 for secure file storage, all hosted on AWS EC2 for optimal performance and scalability.

CampusPass
Personal
Backend, Android, NFC, IoT, Django, PostgreSQL

CampusPass is an application designed for Professors and Guards in a college environment to authenticate student-issued cards. Leveraging NFC technology, it enables seamless card validation, improving authentication speed by 250%.

Ignitia 2k24
Personal
Full Stack, Android, Backend, Ecommerce, Wallet, Django, PostgreSQL

Ignitia 2k24 is an Android app for a fest that supports event registration and in-app purchases. Built with Kotlin Jetpack Compose and Django, it is hosted securely on AWS and uses OTP verification to protect sign-ins.

Coders
Personal
Mobile, Programming, Android, NodeJS + ExpressJS, MongoDB

Coders is an Android app that significantly improved task completion rates through competitive task assignments and real-time notifications. It incorporates gamification and social features, allowing users to compete with friends based on task completion using web-scraped data. The backend, built with Express.js, integrates task tracking and notifications, enhancing user retention with an interactive and seamless experience.

Behance
Personal
Web, React, Tailwind CSS, NodeJS + ExpressJS, MongoDB

A Behance clone website featuring a modern and responsive design, built with React for the frontend and styled using Tailwind CSS. The backend is powered by Node.js and Express.js, providing a robust API for user authentication, project uploads, and interaction functionalities, creating an engaging platform for creatives to showcase their work.

Stocks Application
Personal
Android, Stocks API, Kotlin, Jetpack Compose

A stocks tracker app built with Jetpack Compose and Kotlin, using the Stock Free API to provide real-time stock market updates. The app lets users track stock prices, view detailed stock information, and monitor market trends through a clean, modern UI. It offers smooth navigation, real-time data syncing, and offline access to favorite stocks, making it a useful tool for both casual users and active investors.

My Research Papers

MLLM4EDA: Multimodal Adaptation of Large Language Models for Electronic Design Automation

Novelty

Introduces a multimodal LLM framework for EDA that jointly adapts visual and textual representations using modality-specific LoRA and bidirectional cross-attention for accurate circuit reasoning.

Authors

Ashutosh Pandey, Abhinav Upadhyay, Ripunjoy Goswami

Technical Blogs

Efficiently Fine-Tuning LLaVA 7B Model Using LoRA on Image-Text Pairs

Vision-language models like LLaVA 1.5 7B are powerful tools for building AI assistants that understand both images and text. But fine-tuning these models can be resource-heavy — requiring expensive GPUs and tons of VRAM.
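
To make the resource argument concrete, here is a small back-of-the-envelope sketch comparing the trainable parameters of one dense weight matrix with those of its LoRA factors. The matrix size and rank are illustrative assumptions, not the configuration used in the blog post.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the dense weight matrix W (d_out x d_in)."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only: a 4096 x 4096 projection adapted with rank 8.
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumptions the LoRA factors hold well under 1% of the matrix's parameters, which is why gradients and optimizer state for the adapted layers take far less memory.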

Full Fine-Tuning of Large Language Models (LLMs): A Deep Dive

In the rapidly evolving field of Large Language Models (LLMs), fine-tuning plays a critical role in adapting powerful base models to specific downstream tasks. While parameter-efficient methods such as LoRA or prompt-tuning have gained popularity, full fine-tuning remains the gold standard for maximum model adaptability and performance — particularly when the dataset is substantial and task-specific.

Full Fine-Tuning of LLMs: Challenges, GPU Memory

Full fine-tuning refers to updating all the parameters of a pre-trained model using a new dataset. Unlike transfer learning or parameter-efficient methods, full fine-tuning modifies the entire model’s weights, allowing it to deeply adapt to new tasks or domains.
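
Because every weight is updated, each parameter also needs a gradient and optimizer state. The sketch below applies a commonly cited rule of thumb for mixed-precision training with Adam (about 16 bytes per parameter). The byte breakdown and the 7B model size are assumptions for illustration, and activations, buffers, and framework overhead are not counted.

```python
# Approximate bytes held per parameter in mixed-precision Adam training.
# This is a rule of thumb, not a measurement of any specific setup.
BYTES_PER_PARAM = {
    "fp16 weights": 2,
    "fp16 gradients": 2,
    "fp32 master weights": 4,
    "fp32 Adam momentum": 4,
    "fp32 Adam variance": 4,
}


def full_finetune_memory_gb(num_params: float) -> float:
    """Estimate weight, gradient, and optimizer memory in GB (1 GB = 1e9 bytes)."""
    return num_params * sum(BYTES_PER_PARAM.values()) / 1e9


estimate = full_finetune_memory_gb(7e9)  # a hypothetical 7B-parameter model
print(f"~{estimate:.0f} GB before activations")
```

An estimate of this size exceeds the memory of a single common GPU, which is one reason full fine-tuning usually relies on multiple devices or on sharding and offloading techniques.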

Enhance Your Android App by Integrating Google reCAPTCHA Enterprise

In today’s digital landscape, security is paramount for mobile applications. As the threat of bots and spam continues to loom, integrating robust security measures becomes essential. Google reCAPTCHA offers a powerful solution to this challenge, helping developers protect their apps from malicious activities.

Developed with ❤️ by Ashutosh Pandey