Bal Narendra Sapa

AI Engineer | Software Engineer | Retrieval-Augmented Generation (RAG) | LLM Systems


About

I am an AI Engineer and Software Engineer specializing in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), vector search systems, and scalable AI infrastructure.

I hold a Master of Science in Data Science from the University of New Haven and a Bachelor of Technology in Computer Science from Rajiv Gandhi University of Knowledge Technologies (IIIT Basar). My work focuses on designing and deploying production-grade AI systems that solve real-world business problems using modern machine learning and cloud technologies.

My interests include:

  • Large Language Models (LLMs)
  • Retrieval-Augmented Generation (RAG)
  • AI Infrastructure and MLOps
  • Distributed Systems
  • Information Retrieval
  • Open Source Software

Work Experience

Software Engineer

PPK VirtueServices Pvt. Ltd. | Hyderabad, India

Oct 2025 – Apr 2026

Worked on enterprise AI applications focused on document intelligence, retrieval systems, and scalable LLM deployments.

Key Contributions

  • Designed and implemented semantic caching using LangChain, Redis, and Redis Search to reduce response latency.
  • Built bulk document ingestion pipelines using Celery workers for large-scale document processing.
  • Implemented hybrid search capabilities by combining dense vector retrieval with BM25 sparse retrieval.
  • Added OCR support for extracting text from scanned documents.
  • Improved LLM response quality through prompt engineering and retrieval optimization.
  • Optimized GPU memory utilization for large-context inference workloads.
  • Managed deployments using Docker, Docker Compose, Azure Virtual Machines, and Milvus vector databases.
  • Enabled processing of document contexts approaching 30,000 tokens on 16 GB GPU environments.
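
As an illustration of the hybrid search item above, the following is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a dense-vector ranking with a BM25 ranking. The fusion method actually used in this work is not stated here, so RRF, the function name `reciprocal_rank_fusion`, the constant `k=60`, and the sample document IDs are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a BM25 retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF works on ranks rather than raw scores, which avoids having to normalize cosine similarities against BM25 scores that live on different scales.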

Technologies

Python, LangChain, Redis, Redis Search, Celery, Docker, Azure, Milvus, OCR, BM25, Llama Models


AI Engineer

Sunrise Software Solutions Corporation | Connecticut, USA

Jan 2024 – Sep 2025

Worked on document-based conversational AI systems powered by open-source Large Language Models.

Key Contributions

  • Developed a production-grade RAG chatbot platform using Llama and Falcon models.
  • Built REST APIs for user management, document management, and AI interactions using Django REST Framework.
  • Designed strategies for document chunking, retrieval, and vector storage management.
  • Integrated Hugging Face Text Generation Inference (TGI) for scalable model serving.
  • Implemented per-user vector store management using Azure Blob Storage.
  • Deployed AI workloads using Docker, Azure DevOps, and Runpod Serverless infrastructure.
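
As an illustration of the document chunking item above, here is a minimal sketch of fixed-size chunking with overlap, a common baseline for RAG pipelines. The chunking strategy actually used on this platform is not described here; the function name `chunk_words`, the word-based splitting, and the default sizes are assumptions made for the example.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears intact in at least one chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Larger overlaps improve recall at chunk boundaries at the cost of storing and embedding more redundant text.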

Technologies

Python, Django, DRF, LangChain, Azure Blob Storage, Docker, Azure DevOps, Hugging Face TGI, Runpod


Open Source Contributions

LangChain Contributor

Pull Request: https://github.com/langchain-ai/langchain/pull/7914

Contributed vector store serialization functionality to LangChain.

The contribution enabled vector stores to be serialized and stored programmatically, making them easier to persist, transport, and manage across environments.
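
To illustrate the general idea of serializing a vector store, the sketch below round-trips a toy in-memory store through bytes. It is not the LangChain API or the code from the pull request; the class `TinyVectorStore`, its methods, and the JSON encoding are hypothetical and chosen only to keep the example self-contained.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TinyVectorStore:
    """Toy in-memory store holding parallel lists of vectors and texts."""

    vectors: list = field(default_factory=list)
    texts: list = field(default_factory=list)

    def add(self, vector, text):
        self.vectors.append(list(vector))
        self.texts.append(text)

    def to_bytes(self) -> bytes:
        # Encode the whole store as UTF-8 JSON so it can be written to a
        # file, a database column, or an object store.
        payload = {"vectors": self.vectors, "texts": self.texts}
        return json.dumps(payload).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload: bytes) -> "TinyVectorStore":
        data = json.loads(payload.decode("utf-8"))
        return cls(vectors=data["vectors"], texts=data["texts"])


store = TinyVectorStore()
store.add([0.1, 0.2, 0.3], "refund policy")
store.add([0.4, 0.5, 0.6], "shipping times")

blob = store.to_bytes()                      # bytes, ready to persist
restored = TinyVectorStore.from_bytes(blob)  # rebuilt in another process
print(restored == store)  # True
```

Once a store can be expressed as bytes, persisting it becomes a plain upload and download, which is what makes per-user storage in a blob service straightforward.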


Selected Projects

Drivable Area and Lane Segmentation

Built a computer vision system for drivable-area detection and lane segmentation using a custom dataset collected and annotated by the project team.

Highlights

  • Custom dataset creation and annotation
  • Semantic segmentation
  • Hugging Face deployment
  • Open-source release

E-Commerce FAQ Chatbot using PEFT and LoRA

Fine-tuned Falcon-7B using Parameter-Efficient Fine-Tuning (PEFT) and LoRA techniques to build an e-commerce FAQ assistant.

Highlights

  • Falcon-7B fine-tuning
  • LoRA adaptation
  • Streamlit deployment
  • Public Hugging Face model release

Cybersecurity Named Entity Recognition

Developed a domain-specific NER model for cybersecurity using DistilBERT and MITRE-based datasets.

Highlights

  • Domain-specific NLP
  • DistilBERT fine-tuning
  • Custom dataset publication
  • Hugging Face deployment

Face Mask Detection

Built a mask detection system by applying transfer learning to a pretrained VGG19 model, trained on a dataset of approximately 12,000 images.


Certifications

  • Microsoft Azure AI Engineer Associate (AI-102)
  • Microsoft Azure AI Fundamentals (AI-900)

Education

University of New Haven

Master of Science in Data Science | Aug 2022 – Dec 2023

GPA: 3.81 / 4.00


Rajiv Gandhi University of Knowledge Technologies (IIIT Basar)

Bachelor of Technology in Computer Science | 2018 – 2022

GPA: 9.01 / 10.00


Technical Skills

Programming Languages

  • Python
  • JavaScript
  • SQL

AI & Machine Learning

  • Large Language Models (LLMs)
  • Retrieval-Augmented Generation (RAG)
  • LangChain
  • Hugging Face
  • Fine-Tuning (LoRA, PEFT)
  • Vector Databases
  • Information Retrieval
  • Computer Vision
  • NLP

Backend & Infrastructure

  • Django
  • FastAPI
  • REST APIs
  • Docker
  • Celery
  • Redis
  • Azure
  • Milvus
  • Azure Blob Storage

Data & Analytics

  • Pandas
  • NumPy
  • Machine Learning
  • Data Science

Contact