shreyashkar-ml · 44.8/100

01 · Roasts

80% Notebooks, 0% Tests

Jupyter Notebook is 80% of your codebase and not a single repo has tests. Your 'engineering' is really just .ipynb files held together by vibes and markdown cells.

One-Session Wonders

GPUEngineering: created and last pushed Feb 14 in the same afternoon. my-coding-agent-rules: 3 commits in 2 hours. You commit in bursts like you're cramming the night before a demo.

The Incomplete Kernel

optimized_softmax.cu ends mid-line with 'smem[]' and no content. You uploaded a CUDA file that literally doesn't finish its own sentence. Even your GPU experiments ghost you.

10 Stars Across 19 Repos

19 public repos, 10 total stars, 5 followers. The math is brutal: 0.53 stars per repo, and you're following 12 people who apparently aren't following back.

AI Engineering on the Tin, Notebooks in the Can

Bio says 'AI Engineering | Performance Optimization Enthusiast' — the repo record shows one broken CUDA kernel, one challenge submission, and a blog post about someone else's optimizer.

02 · Category breakdown

Impact · 25% weight · 30 (F)
Consistency · 20% weight · 55 (D)
Quality · 20% weight · 52 (D)
Depth · 15% weight · 50 (D)
Breadth · 10% weight · 45 (D)
Community · 10% weight · 25 (F)

03 · Stats

365-day commit heatmap

73 active days

Language distribution

7 langs

Jupyter Notebook · 80%
Python · 12%
HTML · 7%
CSS · 0%
Cuda · 0%
JavaScript · 0%
Other · 1%
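The 0% rows are most likely a rounding effect rather than empty languages. A minimal sketch, assuming shares are computed from per-language byte totals and rounded to whole percents; `language_shares` and the byte counts are hypothetical and are not this profile's real numbers:

```python
def language_shares(byte_totals: dict[str, int]) -> dict[str, int]:
    """Return each language's share of the total bytes as a whole percent."""
    total = sum(byte_totals.values())
    if total == 0:
        return {lang: 0 for lang in byte_totals}
    return {lang: round(100 * n / total) for lang, n in byte_totals.items()}


# Hypothetical byte counts, chosen only to illustrate the rounding.
sample = {
    "Jupyter Notebook": 8_000_000,
    "Python": 1_200_000,
    "HTML": 700_000,
    "Cuda": 30_000,
}
shares = language_shares(sample)
print(shares)
```

A language with a few kilobytes of source still exists in the repos; it just falls below half a percent once notebooks dominate the byte count.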

04 · Numbers

Owned repos

non-fork

Commits

last 12 months

Followers

Joined GitHub

Jul 2022

05 · Top repos

shreyashkar-ml / shreyashkar-ml.github.io · 43/100

Personal portfolio and blog site (Hugo + PaperMod theme) with 3 technical deep-dives on ML/DL topics (RNN, RoPE, Muon optimizer). Styled, published live, and auto-deployed via GitHub Actions.

Impact 25 · Quality 55 · Depth 50
README · CI
HTML · ★ 0 · 3mo ago

shreyashkar-ml / autoeval · 37/100

Personal harness framework for coding-agent orchestration with typed architecture, structured artifact layout, and CLI tool surface. Experimental stage with minimal adoption signals but functional breadth.

Impact 25 · Quality 50 · Depth 35
README
Python · ★ 1 · 4mo ago

shreyashkar-ml / anthropic_performance_optimization_challenge · 25/100

A one-off challenge submission for Anthropic's hiring performance optimization test. The solver explores VLIW/SIMD kernel scheduling and vectorization with a detailed optimization log, but offers minimal reusability or ecosystem contribution.

Impact 15 · Quality 40 · Depth 20
README · Tests
Python · ★ 0 · 5mo ago

shreyashkar-ml / my-coding-agent-rules · 20/100

Personal coding guidelines document (CLAUDE.md) for LLM agent prompt engineering; 3 commits over ~2 hours, 3KB total, no tests or CI. Useful internal reference but minimal adoption or sustained work.

Impact 15 · Quality 40 · Depth 5
README
Unknown · ★ 0 · 5mo ago

shreyashkar-ml / GPUEngineering · 18/100

Early-stage CUDA learning experiments (softmax kernels) with minimal commits, no tests/CI, incomplete code, and minimal documentation. Created Feb 14, 2026 with 1 recent commit.

Impact 15 · Quality 35 · Depth 5
README
Cuda · ★ 0 · 5mo ago

06 · Timeline

Jul 20, 2022 · Joined GitHub
Oct 5, 2024 · Created shreyashkar-ml.github.io
Jan 29, 2026 · Created my-coding-agent-rules: My rule for coding agents.
Feb 1, 2026 · Created anthropic_performance_optimization_challenge: Trying out my solutions for anthropic performance optimization challenge
Feb 14, 2026 · Created GPUEngineering: Experiments with CUDA, cutlass, and other python DSL
Feb 19, 2026 · Created autoeval: Multi-agent and Harness Engineering framework
Apr 9, 2026 · Most recent push to shreyashkar-ml.github.io

08 · Rubric

How this score was produced

Overall = Σ (category × weight) + gentle top-end curve

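Category · Weight · Score · Contrib.
Impact · 25% · 30 · 7.5
Consistency · 20% · 55 · 11.0
Quality · 20% · 52 · 10.4
Depth · 15% · 50 · 7.5
Breadth · 10% · 45 · 4.5
Community · 10% · 25 · 2.5

Raw total · 43.4

Top-end curve · +1.4

Final overall · 44.8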
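The raw total can be checked by hand from the category breakdown. A minimal sketch of the weighted sum, using the scores and weights shown on this page; the curve's actual formula is not given here, so the +1.4 is simply taken as the published value, and `raw_total` is a hypothetical helper name:

```python
# (score, weight in percent) per category, as listed in the category breakdown.
CATEGORIES = {
    "Impact": (30, 25),
    "Consistency": (55, 20),
    "Quality": (52, 20),
    "Depth": (50, 15),
    "Breadth": (45, 10),
    "Community": (25, 10),
}

# Published on the page; the curve's formula itself is an unknown here.
CURVE_BONUS = 1.4


def raw_total(categories: dict[str, tuple[int, int]]) -> float:
    """Weighted sum of category scores, with weights given in percent."""
    return sum(score * weight for score, weight in categories.values()) / 100


raw = raw_total(CATEGORIES)
final = round(raw + CURVE_BONUS, 1)
print(raw, final)
```

Weights are kept as integer percents so the sum stays exact until the final division.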

Tier thresholds

S · 90–100 · Mass-producing humans
A · 80–89 · Ship machine
B · 70–79 · Solid engineer
C · 60–69 · Getting there
D · 40–59 · README enthusiast
F · 0–39 · GitHub tourist
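The thresholds amount to a simple lookup. A minimal sketch, assuming each tier's lower bound is inclusive and that fractional scores (for example 89.5) fall into the tier whose lower bound they clear; the page does not say how it handles those, and `tier_for` is a hypothetical name:

```python
# (inclusive lower bound, tier letter), highest tier first.
TIERS = [(90, "S"), (80, "A"), (70, "B"), (60, "C"), (40, "D")]


def tier_for(score: float) -> str:
    """Map a 0-100 score to its tier letter; anything under 40 is F."""
    for lower_bound, tier in TIERS:
        if score >= lower_bound:
            return tier
    return "F"


print(tier_for(44.8))
```

The same mapping appears to apply to the per-category grades, where 55 reads as D and 30 as F.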

▸ How the pipeline works

01 · Scrape. Pull every non-fork repo pushed in the last 90 days, plus your contribution calendar, followers, and language byte counts, straight from GitHub's REST & GraphQL APIs.
02 · Triage. A small model reads every repo's file tree + README and picks the 20 files per repo that actually reveal how you code.
03 · Grade each repo. All repos run in parallel through a fast scoring model that reads the picked files and rates each one independently on Impact, Quality, and Depth, with evidence citations.
04 · Aggregate. A larger reasoning model combines the per-repo scores with server-computed stats (heatmap, commit cadence, language entropy, follower count) to produce the 6-dimension profile score + roasts.
05 · Correct. Deterministic server-side checks enforce anchor-scale floors (e.g. a profile with 2,000+ public commits can't score 30 Consistency) and recompute the final verdict.
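The correction step can be pictured as a per-category minimum applied after the model has scored. A minimal sketch, assuming floors are plain lower bounds; `apply_floors` and the floor value of 50 are hypothetical, since the page gives the trigger (2,000+ public commits) but not the number it enforces:

```python
def apply_floors(scores: dict[str, float], floors: dict[str, float]) -> dict[str, float]:
    """Raise any category score that sits below its mandated floor."""
    return {cat: max(value, floors.get(cat, 0)) for cat, value in scores.items()}


# Hypothetical inputs: a model-assigned Consistency of 30 on a high-volume profile.
model_scores = {"Consistency": 30, "Impact": 30}
floors = {"Consistency": 50}  # illustrative floor value, not taken from the page

corrected = apply_floors(model_scores, floors)
print(corrected)
```

Categories without a floor pass through unchanged, so the correction can only raise a score, never lower it.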

~90 seconds per profile, ~$0.25 in compute. Total of ~240 files read across your top-12 repos. One rating per GitHub account per day.

▸ Data sources & caveats

Heatmap & commit totals: GitHub GraphQL contributionsCollection — covers the last 365 days, includes private repos when the user has opted in (default).
Language %: byte totals across the top 30 owned non-fork repos.
Curve: a small upward nudge centered on raw score ≈ 70, capping at 100. Prevents specialists from being unfairly penalised for narrow breadth.
Anchor corrections: when server-measured signals (e.g. privateWorkLikely, multiRepoVolume, follower count) mandate a minimum category score, the aggregation step enforces it. These are signal-conditional, not identity-based floors.
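For the heatmap, the active-day count follows directly from the contribution calendar. A minimal sketch, assuming a response shaped like GitHub's GraphQL `contributionsCollection.contributionCalendar`; the query is one plausible way to request it, not necessarily the one this tool sends, and `count_active_days` and the sample data are hypothetical:

```python
# One plausible query for the calendar; send it to GitHub's GraphQL endpoint
# with an authenticated POST. The tool's real query is not shown on the page.
CONTRIBUTIONS_QUERY = """
query($login: String!) {
  user(login: $login) {
    contributionsCollection {
      contributionCalendar {
        totalContributions
        weeks { contributionDays { date contributionCount } }
      }
    }
  }
}
"""


def count_active_days(calendar: dict) -> int:
    """Count days with at least one contribution in a contributionCalendar."""
    return sum(
        1
        for week in calendar["weeks"]
        for day in week["contributionDays"]
        if day["contributionCount"] > 0
    )


# Made-up calendar fragment, only to exercise the function.
sample_calendar = {
    "totalContributions": 5,
    "weeks": [
        {"contributionDays": [
            {"date": "2026-02-14", "contributionCount": 3},
            {"date": "2026-02-15", "contributionCount": 0},
        ]},
        {"contributionDays": [
            {"date": "2026-02-19", "contributionCount": 2},
        ]},
    ],
}
active = count_active_days(sample_calendar)
print(active)
```

Counting days rather than commits is what makes bursty one-afternoon repos show up as few active days.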