vishwakarma
An autonomous SRE agent. When you receive a page, it starts investigating, fanning out across Prometheus, Kubernetes, Elasticsearch, and the databases, and posts a full RCA PDF into Slack. Investigation time down 70%.
github ↗ · four years in production
Vol. 01 · 2022 — 2026
this notebook belongs to
bengaluru · karnataka · in
mail · vijayrauniyar1818@gmail.com
web · vijaygupta18.github.io
git · github.com/vijaygupta18
If this notebook has reached you by accident, please drop me a note. There is some reasonably personal reading inside.
— VG
I build backends — the kind that can wake you up at 3 AM. I have been doing this for four years, and I am currently SDE 2 at Namma Yatri, part of a small team that grew the platform's daily load sixty times over in two years while keeping the pager mostly quiet. My title says SDE 2, but in practice the work is part feature engineering, part distributed systems, part infrastructure, and a lot of staring at Grafana graphs at 2 in the morning.
I grew up writing Python, discovered Haskell at Juspay in 2023, and today most of Namma Yatri is written in it. I prefer languages that make it hard to do the wrong thing — Haskell, Rust, and the good parts of TypeScript — and I enjoy the parts of the stack that most people call "boring": Redis, Kafka, Postgres, and the zone-aware plumbing underneath them.
I care most about three things, in this order: cost, latency, and the on-call page not firing. The rest of this notebook is essentially a record of that work — features I have shipped, systems I have helped build, and infrastructure I have helped tame. There are also a few side projects that kept me sane between incidents.
— V.
A short list, in no particular order, and with honest caveats. Every number here is measured off real production graphs, not back-of-envelope estimates. Where a number is approximate, I have said so. An asterisk means this was a team win in which I played a meaningful part — not "I did this alone".
March 2024. Namma Yatri spun out of Juspay and I stayed on with the team. On day one we were doing roughly 5,000 rides a day across Bengaluru, and the backend was a small, stubborn service written in Haskell. Today, as SDE 2, I help run a platform that serves 300,000+ rides a day across multiple cities on the same Haskell core — but almost nothing else about it is the same.
In plain English, my job has been to keep the platform from falling over as it grows. The work falls into three parts.
I designed the multimodal journey planner — the feature that lets a user type "Majestic to Whitefield" and receive a stitched itinerary across autos, metro, and bus with honest ETAs and traffic-aware routing. It now ships in three cities. I also built large parts of the real-time dynamic pricing engine, which decides what an auto ride should cost at any given moment, based on demand, weather, and the live conditions on a corridor.
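The pricing model itself is not something I can publish, but the shape of the problem is easy to sketch. Everything below is invented for illustration — the function name, the inputs, and the cap are my assumptions, not the production logic:

```python
def dynamic_price(base_fare: float,
                  demand_ratio: float,
                  weather_factor: float = 1.0,
                  cap: float = 2.0) -> float:
    """Illustrative surge-style pricing. All parameters are assumptions.

    demand_ratio   - e.g. active searches / available drivers on a corridor
    weather_factor - above 1.0 when rain or heat suppresses driver supply
    cap            - hard ceiling so a rider is never quoted a runaway fare
    """
    multiplier = min(max(demand_ratio, 1.0) * weather_factor, cap)
    return round(base_fare * multiplier, 2)

# A quiet corridor stays at base fare; a busy, rainy one hits the cap.
quiet = dynamic_price(100.0, demand_ratio=0.8)                       # 100.0
surge = dynamic_price(100.0, demand_ratio=1.8, weather_factor=1.3)   # 200.0
```

The interesting part in production is not the arithmetic — it is keeping demand_ratio fresh per corridor at scale, which is exactly what the KV framework below exists for.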
The Redis / Valkey KV framework is the piece I am proudest of. It takes our real-time state — drivers, rides, prices — and stores it in a Redis-shaped structure that mirrors our Postgres schema. We run it at 5 million+ transactions a day and 5,000+ events per second, using table-level sharding, asynchronous operations, pipelining, and autoscaling that does not make you nervous. On top of that, I wrote an in-memory GTFS service with GraphQL preloading that reduced hot-path read latency by 60% at approximately 5,000 requests per second.
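To make the table-level sharding idea concrete, here is a toy sketch of the key scheme. The real implementation is Haskell and considerably more involved; the shard count, key layout, and encoding below are all my assumptions. The braces are Redis Cluster hash tags, which pin a shard's keys to a single slot:

```python
import hashlib
import json

NUM_SHARDS = 128  # assumed shard count, purely illustrative

def shard_for(table: str, pk: str) -> int:
    # Stable hash so the same row always maps to the same shard.
    digest = hashlib.md5(f"{table}:{pk}".encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def kv_key(table: str, pk: str) -> str:
    # Key layout mirrors the Postgres table; the {...} hash tag keeps
    # all keys of one shard on a single Redis Cluster slot.
    return f"{table}:{{shard-{shard_for(table, pk)}}}:{pk}"

def encode_row(row: dict) -> str:
    # The value is the row itself, so hot-path reads never touch Postgres.
    return json.dumps(row, sort_keys=True)

key = kv_key("driver_location", "drv-42")
val = encode_row({"driver_id": "drv-42", "lat": 12.97, "lon": 77.59})
```

Because the shard index is deterministic, pipelined reads and writes for one row always land on one node, which is what lets the batching and autoscaling stay boring.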
The platform now runs active-active on AWS and GCP, with cross-cloud Redis and Kafka routing, client-aware dispatch, and zero-downtime cross-cloud deploys. I led the Redis → Valkey migration (memory down 50%, cost down 40%), the Kyverno pod-zone injection that finally fixed our cross-AZ bill, and the cost work that quietly saved $126K a year on AWS. I own on-call for the core, and we have held 99.9% availability while the platform grew sixty times over.
Vishwakarma (p. 70) is an autonomous SRE agent. It runs sixteen parallel investigations for every alert and delivers an RCA PDF into Slack within a few minutes — cutting investigation time by 70%. ART (Automated Regression Testing) replays real production traffic against candidate builds, so we catch API regressions before a deploy, not after.
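The replay idea behind ART is simple enough to sketch. This is my guess at the minimal core, not ART's actual code — feed recorded request/response pairs to the candidate build and diff the answers:

```python
def find_regressions(recorded, candidate):
    """recorded: (request, baseline_response) pairs captured in production.
    candidate: a callable standing in for the new build's API handler."""
    regressions = []
    for request, baseline in recorded:
        got = candidate(request)
        if got != baseline:
            regressions.append({"request": request,
                                "expected": baseline,
                                "got": got})
    return regressions

# Toy example: the candidate build changed one of the two responses.
traffic = [({"path": "/ride/1"}, {"status": "ongoing"}),
           ({"path": "/ride/2"}, {"status": "done"})]
new_build = lambda req: {"status": "done"}
diffs = find_regressions(traffic, new_build)  # flags /ride/1 only
```

The hard parts in a real system are scrubbing PII from the recorded traffic and deciding which response fields are allowed to differ (timestamps, request IDs) — the diff loop itself is the easy bit.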
April 2023. I joined Juspay as an SDE on what would later become Namma Yatri. The stack was Haskell and a lot of Redis. This was where I got serious about types, and about treating the database as a system to be designed rather than a service to be called.
Two headline projects from that year:
We were paying too much for Postgres reads on data that did not really need to be strongly consistent. KV is a framework that keeps real-time rows in Redis, in a structure that mirrors the underlying table, and keeps them in sync through a drainer service. It reduced our database spend by 40% while making reads noticeably faster. It later scaled, at Namma Yatri, to 5 million+ daily transactions.
Originally, the drainer lived inside the application — which meant that a deploy of the app was also a deploy of the drainer, and a mistake in one could affect the other. I split the drainer out into its own service. Data now flows cleanly from Redis Streams into Postgres, and directly from Kafka into ClickHouse for analytics. With data being shaped at the edge rather than inside the app, deploys became far less risky.
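A toy version of the drainer's core step — my sketch, with invented table and column names — turns each stream entry into an idempotent upsert, so replaying a batch after a crash is safe:

```python
def build_upsert(table: str, row: dict) -> str:
    # ON CONFLICT makes the write idempotent: re-delivering the same
    # stream entry after a crash is harmless.
    cols = sorted(row)
    col_list = ", ".join(cols)
    placeholders = ", ".join(f"%({c})s" for c in cols)
    updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in cols if c != "id")
    return (f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
            f"ON CONFLICT (id) DO UPDATE SET {updates}")

def drain_batch(entries):
    """entries: (stream_id, payload) pairs, as an XREADGROUP call might
    yield. Returns (sql, params) pairs to run in one transaction, after
    which the stream ids can be XACKed."""
    return [(build_upsert(p["table"], p["row"]), p["row"])
            for _, p in entries]

batch = [("1694-0", {"table": "rides", "row": {"id": 7, "status": "done"}})]
stmts = drain_batch(batch)
```

Acknowledging the stream entry only after the transaction commits is what gives the pipeline at-least-once delivery; the idempotent upsert turns that into effectively-exactly-once at the Postgres side.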
June 2022. My first full-time role. My slice of the product was the backend for an AI-driven WhatsApp bot that matched blue-collar workers to jobs at Zomato, Swiggy, Zepto, and Uber. It was the kind of workload where every ten milliseconds you shave matters, because there is a real person on the other side waiting for a reply.
What I shipped:
It was a good year. I learnt how a real product behaves under real load, and how different that is from a classroom assignment, where no one is waiting for your response.
Work is one thing. The best way I know to actually learn something is to build a small, slightly embarrassing version of it over a weekend. Pinned below are the projects I am happy to show someone.
An autonomous SRE agent. When you receive a page, it starts investigating, fanning out across Prometheus, Kubernetes, Elasticsearch, and the databases, and posts a full RCA PDF into Slack. Investigation time down 70%.
github ↗

A drag-and-drop system-design practice environment. 30 components with real benchmarks (load balancer: 1M QPS, cache: 100K QPS), 35 design problems, and a traffic simulator with bottleneck detection.
github ↗

A voice and video interview flow using Deepgram for STT and TTS, with real-time proctoring and automated scoring. Multi-tenant, and works with any OpenAI-compatible API.
github ↗

A single console that queries Postgres and Redis across AWS and GCP at the same time. Parallel execution, side-by-side diff, three-tier RBAC, a Monaco editor, and password-gated destructive operations.
github ↗

A Slack-native incident platform. Mention @argus and it triages the issue, assigns on-call by availability, and streams an LLM-powered RCA into a dashboard. Includes RBAC, audit trail, and reminders for stale tickets.
github ↗

A CLI that lets you query any codebase using local LLMs (Ollama) with RAG. Ask a repository anything — nothing leaves your machine. Especially useful when you are new to a monorepo and would rather not interrupt a teammate.
github ↗

A real-time Kubernetes cluster monitor covering pod health, resource usage, and an events viewer. Built with FastAPI and React, and deploys on a single pod.
github ↗

An open-source project under @nammayatri. It crowd-sources bus-stop and route data using real-time GPS, and provides an admin API for corrections.
github ↗Sketched first as a mind-map, then listed in full below, because the structured list is more honest. ★ means deepest experience. · means I have shipped with it and would happily pick it up again. ~ means I am familiar with it around the edges.
Also on the bench: VS Code, GitHub Actions, Slack bots, Mermaid diagrams, and more "I will read the docs this weekend" than I care to admit.
These are recommendations from former managers and teammates, originally posted on LinkedIn. I have pasted them here as they were written. The handwriting is mine; the words are not.
Vijay was a fantastic person to work with — multi-skilled, insightful, and with very strong problem-solving skills. His focus keeps everything moving smoothly, deadlines are met, and whatever project he is working on meets the highest standards. An asset to any company.
Vijay impressed me with his positive attitude and strong work ethic. Always eager to learn, with great initiative on challenging projects. His technical skills are equally impressive — it was clear from the very beginning that he had a great deal of potential.
Vijay's technical knowledge, attention to detail, and problem-solving skills are unmatched. He was always a key contributor to our team's projects. I highly recommend Vijay for any opportunity he pursues.
I am actively interviewing for senior backend, full-stack, AI, SRE, and platform roles — remote, relocation, or Bengaluru. If you have an interesting distributed-systems problem, a team that moves quickly without cutting corners on quality, or you would simply like to discuss Haskell or Redis over a call, any of the addresses below will reach me.
fin. — thanks for reading the whole notebook.