Hi, Namaskāram 👋

About me 📝

I am Bal Narendra Sapa, an engineer who likes solving problems in AI and technology. I have been good at mathematics since childhood and developed a keen interest in computer science during my school years because it relies heavily on mathematics. I chose to pursue a Bachelor’s degree in Computer Science Engineering at Rajiv Gandhi University of Knowledge Technologies (IIIT Basar), where I was exposed to programming, technologies, and AI. After my bachelor’s, I pursued a Master’s degree in Data Science at the University of New Haven, Connecticut, where I studied Artificial Intelligence, Machine Learning, and Data Science extensively. I have worked on multiple AI projects and contributed to open-source repositories on GitHub (refer to Projects). I have also gained industry experience working on AI-related problems across different areas (refer to Work Experience).

Work Experience 💼

AI Engineer

vedha.ai - (Jan 2024 - Sept 2025) - Connecticut, USA

Added features to the chat application, including caching, sparse search, and bulk ingestion.
Worked on building an AI chat application. Created a strategy for generating match scores using vector search techniques with Langchain.
Optimized the LLM for retrieval using retrieval-augmented generation (RAG) strategies and used the full capacity of Llama models to generate responses for inputs of almost 30k tokens on a 16 GB GPU.
Reduced the response time of the LLM.
Worked on a semantic caching mechanism that caches responses using Langchain, Redis, and Redis Search.
Implemented bulk ingestion API routes that process large numbers of files efficiently using Celery with 5 workers.
Added Optical Character Recognition (OCR) to the application to read text from documents.
Implemented sparse search using the BM25 algorithm in addition to dense search.
Used prompt engineering to improve the accuracy of the responses.
Used Docker and Docker Compose to manage the application, the LLM, and the Celery workers, and Azure Virtual Machines to deploy the application.
Optimized the application's GPU memory usage on a virtual machine through memory management in Python.
Used Milvus for efficient vector store management.

AI Engineer Intern

vedha.ai - (Jun 2023 - Aug 2023) - Connecticut, USA

Worked on creating a chat application that answers questions from user-provided documents using open-source LLMs such as Llama (Llama-2-13b-chat, Llama-2-7b-chat) and falcon-7b-instruct, built with Django (Python) and Hugging Face's Text Generation Inference (TGI) toolkit.
Created a strategy to manage per-user vector stores by storing them in Azure Blob Storage.
Wrote a REST API for the application covering user management, file management, and LLM interactions.
Used Django REST Framework for the backend of the application.
Used a Runpod serverless worker for response generation with Hugging Face’s TGI toolkit.
Used Docker, Azure DevOps, Git, and the Agile framework to manage and maintain the project.
Worked on strategies for processing and chunking documents to improve vector search and the model's response generation.
Wrote custom code for managing and integrating Hugging Face’s TGI API with Langchain.

Certifications 📜

Azure AI Engineer Associate - AI-102 - Credential Link
Azure AI Fundamentals - AI-900 - Credential Link

Projects 💡

1. Open-Source Contribution to Langchain

links: Langchain PR

While working with Langchain, I ran into an issue: vector stores could not be serialized and stored in a variable. I wrote custom code to serialize vector stores into variables efficiently so that they can be stored anywhere. I raised a PR in the official repository, and it was accepted.

2. Drivable Area & Lane Segmentation

links: GitHub Repository (8 stars on GitHub), Annotated Dataset (5 likes on Hugging Face), Deployment

This project fine-tunes a model to detect the drivable area and segment lanes in an image. Our team collected and annotated the images, then trained the model on them.

3. E-Commerce FAQ Chatbot with PEFT & LoRA

links: Kaggle Notebook (51 upvotes on Kaggle), GitHub Repository (3 stars), Model

Developed by Bal Narendra Sapa and Ajay Kumar Jagu from the University of New Haven, this project introduces an E-Commerce FAQ Chatbot built with Parameter-Efficient Fine-Tuning (PEFT) on the Falcon-7B model. We fine-tuned the base model on a custom e-commerce FAQ dataset using the LoRA (Low-Rank Adaptation) technique. The deployment includes a Streamlit application. The code and fine-tuned model are available on Hugging Face.

4. Mask Detection with Transfer Learning

links: Github Repository, Colab link, Deployment

Built with PyTorch in a Jupyter notebook, this project focuses on mask detection (detecting whether a person is wearing a mask) using transfer learning with a pretrained VGG19 model. The dataset, sourced from Kaggle, comprises around 12k images. The model adds layers on top of the VGG19 base and is trained on Google Colab's T4 GPU. The trained model detects masks successfully.

5. Cybersecurity Named Entity Recognition

links: GitHub Repository (4 GitHub stars), Dataset (2 likes on Hugging Face), Deployment

This NLP project focuses on Named Entity Recognition (NER) tailored to the cybersecurity domain. The distilbert-base-uncased pretrained model is fine-tuned on the MITRE dataset. Metrics showcase the model's efficacy, and the fine-tuned model is hosted on Hugging Face for easy integration. For example, the model identifies tokens that refer to malware based on their context.

Education 🎓

1. University of New Haven

Master of Science in Data Science - (Aug 2022 - Dec 2023) - Connecticut, USA - Grade: 3.81 / 4

2. Rajiv Gandhi University of Knowledge Technologies - (IIIT Basar)

Bachelor’s Degree in Computer Science - (2018 - 2022) - Basar, Telangana - Grade: 9.01 / 10

Hobbies 🧶

Solving LeetCode problems
Exploring new technologies and understanding their real-world use cases

Languages I know 🧶

English - Fluent
Telugu - Fluent (Native)
Hindi - Intermediate

Contact 📬

Email: bnsapa2000@gmail.com
LinkedIn: Bal Narendra Sapa
GitHub: balnarendrasapa
HuggingFace: bnsapa
Kaggle: balnarendrasapa
