karpathy · 83.4/100 — Rate My GitHub

01 · Roasts

96% Solo Artist

soloPct = 96%. With 170k followers watching your every commit, you've still never once needed a pull request from anyone else. Turns out the BDFL lifestyle means being the only FL too.

HTML Titan of ML

langPcts say you're 85% HTML. The man who trained GPT-2 for $48 is, statistically, a web developer. Your blog template is doing more bytes than all your CUDA kernels combined.

Bursty to a Fault

The heatmap tells the real story: weeks 1–12 are nearly empty, then a heroic 15-week sprint, then silence again. 344 commits in a year sounds fine until you notice 12 of 52 weeks contain ~85% of them.
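The concentration figure in this roast can be checked with a short calculation. This is a minimal sketch, assuming the heatmap is available as a list of weekly commit totals; the function name `top_weeks_share` and the sample data are hypothetical and not taken from the site.

```python
def top_weeks_share(weekly_commits, k):
    """Return the fraction of all commits that fall in the k busiest weeks."""
    total = sum(weekly_commits)
    if total == 0:
        return 0.0
    busiest = sorted(weekly_commits, reverse=True)[:k]
    return sum(busiest) / total


# Hypothetical year: 12 busy weeks of 25 commits, 40 quiet weeks of 1 commit.
sample = [25] * 12 + [1] * 40
share = top_weeks_share(sample, 12)
print(f"{share:.0%} of commits land in the 12 busiest weeks")
```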

Ship It and Ghost It

staleRepoRatio = 0.63. Nearly two-thirds of your repos haven't been touched in 2+ years. The graveyard grows every time you spawn a new 74k-star project and lose interest in 20 days.
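One plausible way to compute a ratio like staleRepoRatio is sketched below. It assumes a repo counts as stale when its last push is more than two years old, matching the "2+ years" wording; the function name, the 730-day cutoff, and the sample dates are assumptions, not the site's actual code.

```python
from datetime import datetime, timedelta, timezone


def stale_repo_ratio(pushed_at_dates, now, stale_after_days=730):
    """Fraction of repos whose last push is older than the cutoff."""
    if not pushed_at_dates:
        return 0.0
    cutoff = now - timedelta(days=stale_after_days)
    stale = sum(1 for pushed in pushed_at_dates if pushed < cutoff)
    return stale / len(pushed_at_dates)


# Hypothetical data: 5 of 8 repos were last pushed well over two years ago.
now = datetime(2026, 7, 1, tzinfo=timezone.utc)
pushes = [datetime(2020, 1, 1, tzinfo=timezone.utc)] * 5 + [
    datetime(2026, 4, 14, tzinfo=timezone.utc)
] * 3
print(stale_repo_ratio(pushes, now))
```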

5 PRs, 170k Fans

totalPRsYear = 5. You have more followers than the population of Reykjavik and contributed 5 pull requests to other people's code this year. The mountain does not go to Muhammad.

02 · Category breakdown

Category     Weight  Score  Tier
Impact       25%     96     S
Consistency  20%     60     C
Quality      20%     77     B
Depth        15%     75     B
Breadth      10%     65     C
Community    10%     90     S

03 · Stats

365-day commit heatmap

173 active days

Language distribution

7 langs

HTML              85%
Jupyter Notebook  7%
Python            3%
Cuda              2%
JavaScript        1%
C                 1%
Other             1%
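Assuming each percentage is that language's share of total bytes, as the caveats section describes, the calculation can be sketched as follows. The byte counts and the function name `language_percentages` are hypothetical; in practice the inputs would be per-repo language byte maps like those GitHub's REST languages endpoint returns.

```python
from collections import Counter


def language_percentages(per_repo_bytes):
    """Sum byte counts per language across repos and convert to percent shares."""
    totals = Counter()
    for repo_bytes in per_repo_bytes:
        totals.update(repo_bytes)  # adds counts language by language
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        lang: 100 * count / grand_total
        for lang, count in totals.most_common()  # largest share first
    }


# Hypothetical byte counts for two repos.
repos = [
    {"HTML": 8000, "Python": 500},
    {"HTML": 500, "Jupyter Notebook": 1000},
]
for lang, pct in language_percentages(repos).items():
    print(f"{lang}: {pct:.0f}%")
```

A single large generated HTML file can dominate a byte-based share, which is consistent with the blog template outweighing the CUDA code in the roast above.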

04 · Numbers

Owned repos

non-fork

Commits

last 12 months

344

Followers

170,453

Joined GitHub

Apr 2010

05 · Top repos

karpathy / nanochat

80/100

nanochat is a well-engineered, production-grade LLM training framework that democratizes GPT-2-scale model training through novel efficiency improvements (FP8, Muon optimizer, sliding windows). 52k stars, comprehensive test suite, typed Python, and sustained development signal strong community adoption and craftsmanshi

Impact 85 · Quality 80 · Depth 75

README · Tests

Python · ★ 52,097 · 3mo ago

karpathy / karpathy.github.io

70/100

Well-maintained Jekyll blog with 1.2k stars by prominent ML researcher; 11+ years of substantive technical content spanning neural networks, deep learning, and practical ML guidance; production static site generator setup.

Impact 65 · Quality 70 · Depth 75

README

CSS · ★ 1,284 · 3mo ago

karpathy / jobs

70/100

A focused, well-documented BLS job market visualization tool with LLM-powered AI exposure scoring. Demonstrates solid engineering discipline (typed Python, structured pipelines, clear docs) but limited external adoption signals.

Impact 65 · Quality 75 · Depth 50

README

HTML · ★ 1,490 · 4mo ago

karpathy / autoresearch

62/100

Autonomous AI research framework enabling LLMs to modify and experiment on a 768-dim 12-layer GPT model within a 5-minute training budget. Real application with 74k stars, clear architecture (train.py/prepare.py/program.md), documented baseline but unfinished implementation (train.py truncated).

Impact 75 · Quality 60 · Depth 50

README

Python · ★ 74,041 · 3mo ago

06 · Timeline

Apr 10, 2010
Joined GitHub
Jul 3, 2014
Created karpathy.github.io — my blog
Oct 13, 2025
Created nanochat — The best ChatGPT that $100 can buy.
Mar 6, 2026
Created autoresearch — AI agents running research on single-GPU nanochat training automatically
Mar 14, 2026
Created jobs — A research tool for visually exploring Bureau of Labor Statistics Occupational Outlook Handbook data. This is not a report, a paper, or a serious economic publication — it is a dev
Apr 14, 2026
Most recent push to nanochat

08 · Rubric

How this score was produced

Overall = Σ (category × weight) + gentle top-end curve

Category  Weight  Score  Contrib.

Raw total      78.2
Top-end curve  +5.3
Final overall  83.4
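The raw total can be reproduced from the category breakdown with a plain weighted sum. The sketch below uses the scores and weights shown in section 02; the names `CATEGORIES` and `raw_total` are illustrative. The top-end curve is not modeled, since the page does not give its formula.

```python
# (score, weight in percent) pairs copied from the category breakdown.
CATEGORIES = {
    "Impact": (96, 25),
    "Consistency": (60, 20),
    "Quality": (77, 20),
    "Depth": (75, 15),
    "Breadth": (65, 10),
    "Community": (90, 10),
}


def raw_total(categories):
    """Weighted sum of category scores; weights are percentages summing to 100."""
    weights = [weight for _, weight in categories.values()]
    if sum(weights) != 100:
        raise ValueError("weights must sum to 100")
    return sum(score * weight for score, weight in categories.values()) / 100


raw = raw_total(CATEGORIES)
print(raw)  # 78.15, shown on the page as 78.2
```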

Tier thresholds

S  90–100  Mass-producing humans
A  80–89   Ship machine
B  70–79   Solid engineer
C  60–69   Getting there
D  40–59   README enthusiast
F  0–39    GitHub tourist
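The tier bands translate into a simple lookup. This is a sketch only: it assumes each band is defined by its lower bound, so a fractional score such as 89.5 falls in tier A; the page does not say how such values are handled, and `tier_for` is a hypothetical name.

```python
# (lower bound, tier) pairs taken from the tier thresholds, highest first.
TIERS = [(90, "S"), (80, "A"), (70, "B"), (60, "C"), (40, "D"), (0, "F")]


def tier_for(score):
    """Return the letter tier for a score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, tier in TIERS:
        if score >= lower_bound:
            return tier
    return "F"  # not reached for valid scores; kept as a safe default


for name, score in [("Impact", 96), ("Consistency", 60), ("Overall", 83.4)]:
    print(name, score, tier_for(score))
```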

How the pipeline works

01 Scrape. Pull every non-fork repo pushed in the last 90 days, plus your contribution calendar, followers, and language byte counts, straight from GitHub's REST & GraphQL APIs.
02 Triage. A small model reads every repo's file tree + README and picks the 20 files per repo that actually reveal how you code.
03 Grade each repo. All repos run in parallel through a fast scoring model that reads the picked files and rates each one independently on Impact, Quality, and Depth, with evidence citations.
04 Aggregate. A larger reasoning model combines the per-repo scores with server-computed stats (heatmap, commit cadence, language entropy, follower count) to produce the 6-dimension profile score + roasts.
05 Correct. Deterministic server-side checks enforce anchor-scale floors (e.g. a profile with 2,000+ public commits can't score 30 Consistency) and recompute the final verdict.
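The correction step can be pictured as a set of signal-conditional floors applied after aggregation. The sketch below is an assumption about the general shape, not the site's implementation: the rule format, the `public_commits` signal name, and the floor value of 50 are all hypothetical (the page only says such a profile "can't score 30").

```python
def apply_floors(scores, signals, rules):
    """Raise any category score that sits below a floor whose condition holds."""
    corrected = dict(scores)  # leave the input untouched
    for category, condition, floor in rules:
        if category in corrected and condition(signals):
            corrected[category] = max(corrected[category], floor)
    return corrected


# Hypothetical rule: 2,000+ public commits implies Consistency of at least 50.
RULES = [
    ("Consistency", lambda s: s.get("public_commits", 0) >= 2000, 50),
]

before = {"Consistency": 30, "Impact": 80}
after = apply_floors(before, {"public_commits": 2400}, RULES)
print(after)
```

Because the rules depend only on measured signals, the same floor applies to any account that meets the condition.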

~90 seconds per profile, ~$0.25 in compute. Total of ~240 files read across your top-12 repos. One rating per GitHub account per day.

Data sources & caveats

Heatmap & commit totals: GitHub GraphQL contributionsCollection, covering the last 365 days. Private contributions are included only when the user has enabled them on their profile.
Language %: byte totals across the top 30 owned non-fork repos.
Curve: a small upward nudge centered on raw score ≈ 70, capping at 100. Prevents specialists from being unfairly penalised for narrow breadth.
Anchor corrections: when server-measured signals (e.g. privateWorkLikely, multiRepoVolume, follower count) mandate a minimum category score, the aggregation step enforces it. These are signal-conditional, not identity-based floors.
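For reference, the heatmap data mentioned under "Heatmap & commit totals" can be requested with a GraphQL query along these lines. This is a sketch, not the site's code: the field names follow GitHub's public GraphQL schema, while `active_days` and the sample response are hypothetical. Sending the query requires an authenticated POST to GitHub's GraphQL endpoint, which is omitted here.

```python
# Query for a user's contribution calendar (defaults to roughly the past year).
CALENDAR_QUERY = """
query($login: String!) {
  user(login: $login) {
    contributionsCollection {
      contributionCalendar {
        totalContributions
        weeks {
          contributionDays {
            date
            contributionCount
          }
        }
      }
    }
  }
}
"""


def active_days(response):
    """Count days with at least one contribution in a decoded query response."""
    collection = response["data"]["user"]["contributionsCollection"]
    weeks = collection["contributionCalendar"]["weeks"]
    return sum(
        1
        for week in weeks
        for day in week["contributionDays"]
        if day["contributionCount"] > 0
    )


# Hypothetical, heavily shortened response with three days in one week.
sample_response = {"data": {"user": {"contributionsCollection": {
    "contributionCalendar": {"totalContributions": 7, "weeks": [
        {"contributionDays": [
            {"date": "2026-04-12", "contributionCount": 0},
            {"date": "2026-04-13", "contributionCount": 3},
            {"date": "2026-04-14", "contributionCount": 4},
        ]},
    ]},
}}}}
print(active_days(sample_response))
```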