The cheapest good enough model is the one that ships. K cha khabar now runs on Gemini 3

News Intelligence · Nepal · Build in Public

A few weekends ago I shipped kchakhabar.com. Free, no login, no ads, no email capture. The shape of it is what Ground News does for US/UK media, but built for the Nepali newsstand. The name is the first thing a Nepali asks when they pick up the phone: “के छ खबर?”, roughly “what’s up?” or “what’s the news?” The whole product is a daily answer to that question.

The problem nobody is solving

Nepal has 45+ digital news publishers. Every story gets retold by every one of them within hours — different numbers, different framings, different blame. There is no neutral, cross-publisher view of what actually happened today. So the reader does the synthesis themselves, badly, in their head.

The pain is sharpest for the diaspora. More than 1.5 million Nepalis live outside Nepal, and they are simultaneously the most engaged and the most poorly served audience for Nepal news. I am one of them, Melbourne-based, Chitwan-raised. Built it for that reader first.

What’s under the hood

[Figure: K cha khabar ingestion pipeline]

The pipeline is unremarkable on its own: each piece is doing a job that’s been done before. The interesting part is stitching them together for a low-resource language and keeping it cheap enough for one person to run.

  • Ingestion. RSS every fifteen minutes from 30+ publishers (OnlineKhabar, Kantipur, Setopati, Gorkhapatra, BBC Nepali, Nagarik, Himalayan Times, Nepali Times, MyRepublica, RSS Nepal, …). Headlines and short excerpts only. We honour robots.txt. Every story links back.
  • Embeddings. Every article gets vectorised with BAAI BGE-M3 (1024-dim, multilingual). This is the unsung hero: it is what makes cross-lingual clustering possible.
  • Clustering. Postgres + pgvector/HNSW. For each new article, find the nearest neighbour within ±3 days; if cosine similarity ≥ 0.78, join the cluster, else open a new one. A Nepali piece and a BBC English piece about the same arrest end up in the same cluster.
  • Bilingual summary. For every multi-outlet cluster, an LLM writes parallel English + Nepali briefs, grounded only in the articles in that cluster. No facts the sources didn’t say.
  • Render. Plain Next.js 15 SSR over Postgres. The browser never talks to an LLM.
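The clustering rule above can be sketched in a few lines. This is a minimal in-memory stand-in, not the production code: it assumes a brute-force scan instead of the pgvector/HNSW query, toy 3-dimensional vectors instead of 1024-dim BGE-M3 embeddings, and hypothetical names (`Article`, `assign_cluster`). Only the 0.78 threshold and the ±3 day window come from the pipeline description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import sqrt
from typing import List, Optional

SIM_THRESHOLD = 0.78        # cosine similarity cut-off, from the post
WINDOW = timedelta(days=3)  # +/- 3 day search window, from the post


@dataclass
class Article:  # hypothetical shape, for illustration only
    id: str
    published: datetime
    embedding: List[float]
    cluster_id: Optional[int] = None


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def assign_cluster(new: Article, seen: List[Article], next_id: int) -> int:
    """Set new.cluster_id and return the next unused cluster id."""
    # Only articles published within the window are candidates.
    candidates = [a for a in seen if abs(a.published - new.published) <= WINDOW]
    best = max(
        candidates,
        key=lambda a: cosine(a.embedding, new.embedding),
        default=None,
    )
    if best is not None and cosine(best.embedding, new.embedding) >= SIM_THRESHOLD:
        new.cluster_id = best.cluster_id  # join the nearest neighbour's cluster
    else:
        new.cluster_id = next_id          # open a new cluster
        next_id += 1
    seen.append(new)
    return next_id


if __name__ == "__main__":
    seen: List[Article] = []
    next_id = 0
    for art in [
        Article("ne-1", datetime(2025, 1, 1), [1.0, 0.0, 0.0]),
        Article("en-1", datetime(2025, 1, 2), [0.9, 0.1, 0.0]),   # near ne-1
        Article("ne-2", datetime(2025, 1, 2), [0.0, 1.0, 0.0]),   # different story
        Article("en-2", datetime(2025, 1, 10), [1.0, 0.0, 0.0]),  # outside window
    ]:
        next_id = assign_cluster(art, seen, next_id)
        print(art.id, "->", art.cluster_id)
```

Because the rule is greedy and looks only at the single nearest neighbour, the result depends on arrival order. That is a reasonable trade-off for a feed that is processed every fifteen minutes, but it is worth knowing when debugging a surprising cluster.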

Picking the model: I didn’t start where I ended up

V1 ran on Claude Sonnet 4.6. It was fine. It was also expensive enough that I was not comfortable using it as my default, especially for a hobby project with a total of 1 user.

So I built a benchmark. NepNewsCluster. Fifteen frontier and open-weight LLMs, 107 clusters, 1,310 article snippets, 23 publishers. Each model had to produce an English headline, a Devanagari headline, parallel 3–4 sentence summaries in both languages, and a typed entity list. Outputs were graded blind by Claude Opus 4.7 with extended thinking on three axes (Nepali prose, English prose, and topic coverage), then scaled to a /100 axis-quality score.
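To make the required output concrete, here is one way the per-cluster result could be typed and sanity-checked before grading. It is a sketch under assumptions: the field names, the `validate` helper, and the entity types (`PERSON`, `ORG`, `PLACE`) are hypothetical, and the sentence counter is a crude punctuation heuristic rather than anything the benchmark is known to use.

```python
import re
from dataclasses import dataclass, field
from typing import List

DEVANAGARI = re.compile(r"[\u0900-\u097F]")             # Devanagari Unicode block
SENTENCE_END = re.compile(r"[.!?\u0964]+(?:\s+|$)")     # includes the danda
ENTITY_TYPES = {"PERSON", "ORG", "PLACE"}               # hypothetical type set


@dataclass
class Entity:
    name: str
    type: str


@dataclass
class ClusterBrief:  # hypothetical schema for one model output
    headline_en: str
    headline_ne: str
    summary_en: str
    summary_ne: str
    entities: List[Entity] = field(default_factory=list)


def count_sentences(text: str) -> int:
    """Rough count: runs of terminal punctuation followed by space or end."""
    return len(SENTENCE_END.findall(text.strip()))


def validate(brief: ClusterBrief) -> List[str]:
    """Return a list of problems; an empty list means the brief looks well-formed."""
    problems = []
    if not DEVANAGARI.search(brief.headline_ne):
        problems.append("headline_ne has no Devanagari characters")
    for label, text in (("summary_en", brief.summary_en),
                        ("summary_ne", brief.summary_ne)):
        n = count_sentences(text)
        if not 3 <= n <= 4:
            problems.append(f"{label} has {n} sentences, expected 3-4")
    for e in brief.entities:
        if e.type not in ENTITY_TYPES:
            problems.append(f"entity {e.name!r} has unknown type {e.type!r}")
    return problems


if __name__ == "__main__":
    # Made-up sample text, purely illustrative.
    brief = ClusterBrief(
        headline_en="Police arrest suspect in Kathmandu",
        headline_ne="काठमाडौंमा एक जना पक्राउ",
        summary_en="Police made an arrest. A hearing is set. The inquiry continues.",
        summary_ne="प्रहरीले एक जनालाई पक्राउ गरेको छ। अदालतले सुनुवाइ तोकेको छ। अनुसन्धान जारी छ।",
        entities=[Entity("Kathmandu", "PLACE")],
    )
    print(validate(brief))
```

A structural check like this only catches malformed outputs (wrong script, wrong length, unknown entity type). It says nothing about prose quality or coverage, which is what the blind grading is for.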

The headline result:

Rank  Model              Axis quality  #1 finishes  Cost (107 clusters)
1     Claude Sonnet 4.6  81 / 100      34 / 107     $2.5
2     Gemini 3 Flash     79 / 100      16 / 107     $0.29

Sonnet wins on quality by two points, but the measured per-cluster noise floor is ±2 points, so the gap sits exactly on the edge of meaningful. Gemini 3 Flash gives me Sonnet-class output at roughly one-ninth of the cost ($0.29 versus $2.5 across the 107 clusters), with a 4-second mean latency. For a one-person operation that has to keep running every fifteen minutes, the answer was obvious.
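The arithmetic behind that call is small enough to write down. The sketch below uses only the numbers from the table and the ±2 point noise floor; the names (`BenchmarkRow`, `gap_is_clear`) are hypothetical, and treating a gap equal to the noise floor as "not clear" is my own assumption about where to draw the line.

```python
from dataclasses import dataclass

N_CLUSTERS = 107    # benchmark size, from the post
NOISE_FLOOR = 2.0   # measured per-cluster noise, in quality points


@dataclass(frozen=True)
class BenchmarkRow:  # hypothetical container for one table row
    model: str
    quality: float
    total_cost_usd: float


def cost_per_cluster(row: BenchmarkRow) -> float:
    return row.total_cost_usd / N_CLUSTERS


def cost_ratio(expensive: BenchmarkRow, cheap: BenchmarkRow) -> float:
    return expensive.total_cost_usd / cheap.total_cost_usd


def gap_is_clear(a: BenchmarkRow, b: BenchmarkRow, noise: float = NOISE_FLOOR) -> bool:
    """True only if the quality gap is strictly larger than the noise floor."""
    return abs(a.quality - b.quality) > noise


sonnet = BenchmarkRow("Claude Sonnet 4.6", 81, 2.5)
flash = BenchmarkRow("Gemini 3 Flash", 79, 0.29)

if __name__ == "__main__":
    print(round(cost_ratio(sonnet, flash), 1))   # 8.6
    print(round(cost_per_cluster(sonnet), 4))    # 0.0234
    print(round(cost_per_cluster(flash), 4))     # 0.0027
    print(gap_is_clear(sonnet, flash))           # False
```

A two-point gap that does not clear the noise floor, set against a cost ratio of about 8.6, is the whole decision in two numbers.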

Production switched to Gemini 3 Flash. The whitepaper is the receipt.

[Figure: TL;DR rankings from the whitepaper]

A quieter finding worth flagging: Gemma 4 31B, an open-weights model you can run on your own GPU, beat both GPT-5.4 mini and Claude Haiku 4.5 at less than 1/40th the per-call cost. The local AI story keeps getting better.

What’s next

Cross-lingual clustering and bilingual summaries are live. AI-narrated short video summaries, both languages, refreshed hourly, just shipped. The whole thing runs in support of a Nepali current-affairs podcast I co-host on Fridays, which gives it a built-in editorial dogfood loop.

To any Nepali publisher reading this: thank you for your dedicated journalism and for sharing your stories with the public. If you want a listing added, or your content corrected or delisted, email contact@kchakhabar.com. K cha khabar ingests headlines and excerpts only, never full bodies. Every story links back to you.

The shortest possible version of what I learned: the cheapest model that’s good enough is the one that ships. The benchmark is the part that turns “good enough” from a feeling into a number.
