V-EditR (Python)

A reasoning-first image editor powered by Vision–Language Models

An AI image editing pipeline that interprets complex natural-language instructions and applies context-aware edits. Before modifying anything, V-EditR reasons about scene context, spatial relationships, and object identities, which lets it handle requests like "remove the chair behind the table" or "make the person holding the phone wear a black jacket".

Tech Stack

Python, GroundingDINO, SAM, InstructPix2Pix, ControlNet, Stable Diffusion, LLM Parser

Key Features

  • Multi-stage pipeline: text → plan generation → object grounding → edit → validation
  • Spatial and relational reasoning ("next to", "behind", "holding")
  • Object grounding with GroundingDINO + SAM segmentation masks
  • Modular architecture with separate planners, validators, and verifiers
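
The relational step ("next to", "behind", "holding") can be illustrated with a small sketch. It assumes the grounding stage has already produced labeled boxes (in the real pipeline these would come from GroundingDINO, with masks from SAM); here they are hard-coded. The `Detection` class, the `resolve_next_to` helper, and the center-distance heuristic are all hypothetical and only cover "next to"; a relation like "behind" would need a depth cue that is not modeled here.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Detection:
    """A grounded object: label plus an (x1, y1, x2, y2) pixel box."""
    label: str
    box: tuple[float, float, float, float]

    @property
    def center(self) -> tuple[float, float]:
        x1, y1, x2, y2 = self.box
        return ((x1 + x2) / 2, (y1 + y2) / 2)


def resolve_next_to(detections, target_label, anchor_label):
    """Pick the `target_label` detection closest to any `anchor_label` detection.

    Returns None when either label is missing, so a validator can reject the plan.
    """
    targets = [d for d in detections if d.label == target_label]
    anchors = [d for d in detections if d.label == anchor_label]
    if not targets or not anchors:
        return None

    def distance_to_nearest_anchor(det):
        cx, cy = det.center
        return min(hypot(cx - a.center[0], cy - a.center[1]) for a in anchors)

    return min(targets, key=distance_to_nearest_anchor)


# Hard-coded stand-ins for detector output (assumption, not real model output).
detections = [
    Detection("chair", (10, 200, 110, 400)),    # far from the table
    Detection("chair", (420, 210, 520, 400)),   # beside the table
    Detection("table", (530, 180, 830, 400)),
]
chosen = resolve_next_to(detections, "chair", "table")
```

Returning `None` instead of guessing gives a downstream validator a clear signal that the instruction could not be grounded.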

TrustRAG (Python)

Trustworthy Retrieval-Augmented Generation for the medical domain

A medical-domain QA system built on a hybrid dense + sparse retrieval pipeline with hallucination prevention. It retrieves from ~230K medical passages drawn from Wikipedia and refuses to return answers it cannot verify: unverified content is replaced with an explicit refusal message.
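
The refusal behavior can be sketched as a gate over answer sentences. This is a minimal illustration, not the project's code: `gate_answer`, the `REFUSAL` text, and the 0.5 threshold are assumptions, and `toy_entailment` is a word-overlap stand-in for a real NLI model such as nli-deberta-v3-base.

```python
from typing import Callable, Sequence

# Hypothetical refusal text; the project's actual wording is not shown in the source.
REFUSAL = "I cannot verify an answer to this question from the retrieved sources."


def gate_answer(
    answer_sentences: Sequence[str],
    passages: Sequence[str],
    entailment_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> str:
    """Keep only sentences that at least one passage entails; refuse if none survive."""
    supported = [
        sentence
        for sentence in answer_sentences
        if any(entailment_score(p, sentence) >= threshold for p in passages)
    ]
    return " ".join(supported) if supported else REFUSAL


def toy_entailment(premise: str, hypothesis: str) -> float:
    """Stand-in scorer: fraction of hypothesis words found in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().split()
    if not hypothesis_words:
        return 0.0
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)


passages = ["insulin lowers blood glucose levels"]
grounded = gate_answer(["insulin lowers blood glucose"], passages, toy_entailment)
refused = gate_answer(["insulin cures asthma"], passages, toy_entailment)
```

Passing the scorer in as a callable keeps the gate independent of whichever NLI model is plugged in.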

Tech Stack

Python, FAISS, BM25, e5-small-v2, MiniLM cross-encoder, gemma3:4b (Ollama), nli-deberta-v3-base, RAGAS

Key Features

  • Hybrid retrieval: dense (FAISS) + sparse (BM25) with score fusion
  • Cross-encoder reranking for improved passage relevance
  • NLI-based faithfulness verification: sentences not supported by the retrieved passages are blocked
  • Evaluated with RAGAS (Faithfulness, Answer Relevancy, Context Precision)
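
One common way to fuse dense and sparse scores is a weighted sum after min-max normalization. The sketch below assumes that scheme; the source only says "score fusion", so the normalization, the `alpha` weight, and the function names are illustrative choices, and the score dictionaries are made-up values rather than real FAISS or BM25 output.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    span = high - low
    return {doc: (s - low) / span if span else 0.0 for doc, s in scores.items()}


def fuse(dense: dict[str, float], sparse: dict[str, float], alpha: float = 0.5):
    """Weighted sum of normalized scores; a doc missing from one list scores 0 there."""
    dense_n, sparse_n = min_max(dense), min_max(sparse)
    docs = dense_n.keys() | sparse_n.keys()
    fused = {
        d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * sparse_n.get(d, 0.0)
        for d in docs
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


dense_scores = {"p1": 0.82, "p2": 0.75, "p3": 0.40}   # made-up similarity scores
sparse_scores = {"p2": 12.0, "p3": 9.0, "p4": 3.0}    # made-up BM25 scores
ranking = fuse(dense_scores, sparse_scores)
```

In this example the passage that both retrievers rate highly ("p2") moves ahead of the one that only the dense retriever found ("p1"). Normalizing first matters because raw BM25 scores and vector similarities live on different scales.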

FocusFlow (Python)

Localized Image Editing via Masked Velocity Blending

An image editing method built on Stable Diffusion 3 that makes localized edits from text prompts alone, with no manual masks. It extends the FlowEdit technique by identifying which regions to modify through velocity field analysis, then applying masked velocity blending to confine edits to those regions.

Tech Stack

Python 3.10+, PyTorch 2.x + CUDA, Stable Diffusion 3, diffusers, CLIP, LPIPS

Key Features

  • Automatic mask generation via velocity field differencing between source/target prompts
  • Masked velocity blending: V_blend = M · V_target + (1 − M) · V_source
  • Evaluated on 40 test cases: pose changes, background replacement, material/style edits
  • Highest CLIP-T score (0.296) among the compared methods, ahead of the FlowEdit and SDEdit baselines
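
The blending formula above can be tried on toy arrays. This is a sketch under stated assumptions: NumPy stands in for PyTorch, the velocity fields are synthetic rather than model outputs, and the mask rule (per-pixel difference magnitude, normalized and thresholded at 0.5) is a hypothetical choice, since the source does not specify how the difference is turned into a mask.

```python
import numpy as np


def velocity_difference_mask(v_source, v_target, threshold=0.5):
    """Binary mask: 1 where the two velocity fields disagree strongly."""
    diff = np.linalg.norm(v_target - v_source, axis=0)  # per-pixel magnitude over channels
    peak = diff.max()
    if peak == 0:
        return np.zeros_like(diff)
    return (diff / peak >= threshold).astype(v_source.dtype)


def blend_velocities(v_source, v_target, mask):
    """V_blend = M * V_target + (1 - M) * V_source, with the mask broadcast over channels."""
    return mask * v_target + (1.0 - mask) * v_source


# Toy fields shaped (channels, height, width). The target differs slightly
# everywhere and strongly in the top-left 3x3 corner.
v_src = np.zeros((4, 8, 8), dtype=np.float32)
v_tgt = np.full((4, 8, 8), 0.1, dtype=np.float32)
v_tgt[:, :3, :3] = 1.0

mask = velocity_difference_mask(v_src, v_tgt)
v_blend = blend_velocities(v_src, v_tgt, mask)
```

Outside the masked corner the blended field equals the source velocity, so the small background differences do not leak into the edit.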

Direct Preference Optimization (Notebook)

DPO paper implementation for LLM alignment

A clean implementation of the Direct Preference Optimization algorithm for aligning language models with human preferences. Fine-tunes TinyLlama-1.1B on a sentiment classification task using preference pairs (chosen vs. rejected outputs) generated by gemma3:4b via Ollama.

Tech Stack

Python, TinyLlama-1.1B, Ollama + gemma3:4b, DPO loss (β=0.1), PyTorch

Key Features

  • Full DPO training pipeline: data prep → SFT → DPO fine-tuning → evaluation
  • Preference pair generation using a larger LLM as a judge
  • Faithful reproduction of the original DPO paper methodology
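
For reference, the per-pair DPO objective can be written in a few lines. This is a minimal scalar sketch in plain Python rather than the notebook's PyTorch code; the function and argument names are assumptions, and the log-probabilities in the example are made-up values. It uses β = 0.1 as listed in the tech stack.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # softplus(-margin) equals -log(sigmoid(margin)), computed in a numerically stable way
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Policy identical to the reference: the margin is 0 and the loss is log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Policy raised the chosen response and lowered the rejected one: the loss drops.
improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

The inputs are sequence-level log-probabilities (summed over response tokens) under the trained policy and the frozen reference model.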

FlowEdit (Python)

FlowEdit paper implementation — text-guided image editing with flow-matching diffusion

A faithful implementation of the FlowEdit paper, enabling precise text-guided image transformations using Stable Diffusion 3 via source and target text prompts — without re-training or fine-tuning any model. Used as the baseline in the FocusFlow project above.

Tech Stack

Python, PyTorch, Stable Diffusion 3, diffusers, transformers, NumPy, PyYAML

Key Features

  • Reproduces the FlowEdit paper's delta velocity blending approach
  • Configurable editing parameters (timesteps, guidance scales, averaging steps)
  • Batch editing via YAML config and paper figure reproductions
  • Served as the baseline for the FocusFlow quantitative evaluation
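
The delta-velocity idea can be sketched on a one-dimensional toy problem. This is my reading of the update rule, not the repository's code: the function name, the scalar state, and the two velocity fields (each pulls samples toward a single point) are assumptions made so the example runs without a diffusion model. `n_avg` plays the role of the averaging steps mentioned above.

```python
import random


def flowedit_toy(x_src, v_source, v_target, timesteps, n_avg=1, seed=0):
    """Integrate the edit path with the averaged velocity difference (scalar toy)."""
    rng = random.Random(seed)
    z_edit = x_src                                  # the edit path starts at the source
    for t, t_next in zip(timesteps, timesteps[1:]):
        delta = 0.0
        for _ in range(n_avg):
            noise = rng.gauss(0.0, 1.0)
            z_src = (1 - t) * x_src + t * noise     # noisy version of the source
            z_tar = z_edit + z_src - x_src          # same noise applied to the current edit
            delta += v_target(z_tar, t) - v_source(z_src, t)
        z_edit += (t_next - t) * delta / n_avg      # Euler step from t down to t_next
    return z_edit


# Assumed toy velocity fields: each "prompt" describes a single point.
SRC_POINT, TGT_POINT = 2.0, -1.0


def v_source(z, t):
    return (z - SRC_POINT) / t


def v_target(z, t):
    return (z - TGT_POINT) / t


steps = [1.0 - i / 10 for i in range(11)]           # 1.0, 0.9, ..., 0.0
edited = flowedit_toy(SRC_POINT, v_source, v_target, steps, n_avg=4)
```

In this toy setting the shared noise cancels in the velocity difference, and the edit path moves from the source point to the target point without any inversion step.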

RAG Assistant Agent (Python)

Multi-agent RAG system powered by LLMs

A multi-agent document assistant that answers questions over internal documents (PDFs, HTML, emails), cites its sources in every answer, and applies built-in PII/secrets safeguards. It exposes a FastAPI REST interface for ingestion and querying.

Tech Stack

Python, FastAPI, sentence-transformers, ChromaDB, BM25, Tesseract OCR, OpenAI / Ollama

Key Features

  • Multi-agent pipeline: Retrieval → Reranker → QA → Citation/Verifier → Safety/PII → Composer
  • Supports PDF, HTML, TXT, and .eml (email) file ingestion
  • Hybrid retrieval: vector similarity + BM25 keyword search
  • Automatic PII/secret detection during ingestion and response generation
  • REST API (POST /ingest, POST /ask) for easy integration
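
A regex pass is one simple way to approximate the PII/secret safeguard. The sketch below is an assumption about how such a step could look, not the project's detector: the `PATTERNS` table and the `redact` helper are hypothetical, and the three patterns are illustrative only.

```python
import re

# Illustrative patterns only; a real detector needs much broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace each match with a [LABEL] placeholder and report which labels fired."""
    found = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


clean, labels = redact("Contact jane.doe@example.com, key AKIAABCDEFGHIJKLMNOP.")
```

The same function can run twice, once on documents at ingestion and once on generated responses, which matches the two checkpoints named in the feature list.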