LLM agents and evaluation
RAG pipelines, multi-agent flows, and guardrails that hold up under real usage.
- Retrieval, grounding, and safety layers with reproducible evaluation.
- Tracing and observability to surface drift, hallucinations, and latency.
- Prompt and tool design tuned to business outcomes, not demos.
LLMs
RAG
Evaluation
LangGraph
Observability
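The reproducible-evaluation idea above can be sketched as a minimal harness: a fixed eval set, a deterministic scorer, and one aggregate metric, so a rerun always yields the same number. The dataset, the `answer_question` stub, and the exact-match scorer are illustrative placeholders, not a real RAG pipeline or judge.

```python
# Minimal reproducible evaluation harness (illustrative sketch).
# DATASET, answer_question, and exact_match are placeholders standing
# in for a real retrieval pipeline and grading method.

DATASET = [  # fixed eval set: (question, reference answer)
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]

def answer_question(question: str) -> str:
    """Stub model: in practice this would call the RAG pipeline."""
    canned = {
        "What is the capital of France?": "Paris",
        "What is 2 + 2?": "5",  # deliberately wrong, to show a miss
    }
    return canned[question]

def exact_match(prediction: str, reference: str) -> bool:
    """Deterministic scorer: normalized string equality."""
    return prediction.strip().lower() == reference.strip().lower()

def evaluate() -> float:
    """Run the fixed set through the model and report accuracy."""
    hits = sum(exact_match(answer_question(q), ref) for q, ref in DATASET)
    return hits / len(DATASET)

if __name__ == "__main__":
    # Fixed inputs and a deterministic scorer make this rerunnable.
    print(f"exact-match accuracy: {evaluate():.2f}")
```

Swapping the stub for a live pipeline keeps the harness unchanged; only `answer_question` moves, which is what makes regressions visible across runs.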
Data, infra, and automation
Reliable plumbing for models and products: ETL, orchestration, and deployments.
- APIs and services in Python/FastAPI with structured logging and metrics.
- Feature pipelines, data validation, and backfills for trustworthy inputs.
- CI/CD, environment parity, and playbooks that keep releases calm.
Python
FastAPI
Pipelines
CI/CD
Automation
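The structured-logging bullet can be illustrated with a stdlib-only sketch, independent of FastAPI: a JSON formatter on the standard `logging` module so every record is a machine-parseable line. The field names here are an assumption, not a fixed schema.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line (field names are illustrative)."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge structured context passed via logging's `extra=` mechanism.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)

def make_logger(name: str = "service") -> logging.Logger:
    """Build a logger whose handler emits JSON lines to stderr."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger

if __name__ == "__main__":
    log = make_logger()
    # `extra` attaches a context dict that the formatter flattens into the line.
    log.info("request handled", extra={"context": {"path": "/health", "status": 200}})
```

In a web service the same formatter would sit behind the framework's logger; the JSON shape is what lets metrics and tracing tools index fields like `path` and `status` directly.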
Product craft
Translating vague requests into scoped features, prototypes, and shipped outcomes.
- Discovery with stakeholders; define success metrics and experiment plans.
- Design lightweight UIs and flows that make ML output usable and trustworthy.
- Documentation and handoffs that keep teams aligned after delivery.
Product
UX
Delivery
Experimentation