#469 — Top 60.8%

ncduy0303

Nguyen Cao Duy

D

README enthusiast

Overall

51.6

/ 100

01 · Roasts

Notebook Hoarder

85% of your codebase is Jupyter Notebooks. That's not a portfolio, that's a homework pile with a GitHub account attached. At least rename the cells.

One-and-Done Committer

TokenizerStats: one commit, lifetime under 60 seconds. datacamp: created and abandoned in 29 minutes. You're speedrunning the 'push and disappear' achievement.

Star-Starved Researcher

499 total stars across 50 repos sounds decent until you realize it averages to ~10 per repo and your FYP — your biggest project — has exactly zero. The work is there; the audience isn't.

Half-Life Problem

47% of your repos haven't been touched in over 2 years. Your GitHub is part active lab, part archaeological dig site.

Solo Artist, No Label

soloPct = 100%. Every single repo, just you, alone, in the dark. 35 PRs a year to other projects but nobody PRs back — you're contributing to the world but not inviting the world in.

02 · Category breakdown

  • Impact (25% weight): 36 · F
  • Consistency (20% weight): 60 · C
  • Quality (20% weight): 42 · D
  • Depth (15% weight): 55 · D
  • Breadth (10% weight): 65 · C
  • Community (10% weight): 50 · D

03 · Stats

365-day commit heatmap

167 active days

Language distribution

7 langs
  • Jupyter Notebook: 85%
  • C++: 4%
  • Python: 4%
  • HTML: 3%
  • Julia: 1%
  • Go: 1%
  • Other: 2%
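
According to the caveats on this page, the language percentages are byte totals summed across owned non-fork repos. A minimal sketch of that aggregation follows, assuming each repo's data arrives as a language-to-bytes mapping (the shape returned by GitHub's REST "list repository languages" endpoint). The function name `language_shares` and the sample byte counts are made up for illustration and are not this profile's data.

```python
from collections import Counter


def language_shares(per_repo_bytes):
    """Sum byte counts per language across repos and convert to percentages."""
    totals = Counter()
    for repo_languages in per_repo_bytes:
        totals.update(repo_languages)  # adds byte counts language by language
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # most_common() orders languages from largest to smallest byte total
    return {lang: 100 * count / grand_total for lang, count in totals.most_common()}


# Hypothetical input: two repos with made-up byte counts.
sample_repos = [
    {"Jupyter Notebook": 850, "Python": 50},
    {"Python": 50, "C++": 50},
]
shares = language_shares(sample_repos)
for lang, pct in shares.items():
    print(f"{lang}: {pct:.0f}%")
```

Because notebooks store cell outputs alongside code, their byte counts tend to be large, which is one plausible reason a byte-based measure is dominated by Jupyter Notebook.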

04 · Numbers

Owned repos

non-fork

30

Commits

last 12 months

253

Followers

109

Joined GitHub

Jun 2017

05 · Top repos

ncduy0303 /

molecule-tokenization

40/100

FYP research project benchmarking multiple molecule tokenization methods (SMIRK, BPE, APE, fragSMILES, t-SMILES) for MLM pretraining via HuggingFace Trainer, with downstream classification finetuning on MoleculeNet datasets. Typed, documented, and architecturally sound for scope (537 KB, ~11k LOC), but experimental in

Impact 25 · Quality 50 · Depth 45
README
Python · 0 stars · 2 mo ago

ncduy0303 /

ncduy0303.github.io

38/100

Personal portfolio and blog site built with Hugo and PaperMod theme. Includes resume CV, experience, achievements, and about pages. CI/CD via GitHub Actions. Minimal impact but well-documented and structured.

Impact 15 · Quality 50 · Depth 45
README · CI
HTML · 0 stars · 3 mo ago

ncduy0303 /

TokenizerStats

30/100

One-shot research code for molecular tokenizer analysis, supporting the paper "Smirk: An Atomically Complete Tokenizer for Molecular Foundation Models." Julia + Python hybrid project with ~508 KB codebase, minimal tests, no CI, and single-day lifetime.

Impact 15 · Quality 50 · Depth 20
README · Tests
Julia · 0 stars · 3 mo ago

ncduy0303 /

ncduy0303

12/100

Personal profile repository with minimal substance — a README-only project serving as a GitHub landing page with links and statistics, no code artifacts or meaningful project content.

Impact 5 · Quality 10 · Depth 20
README · CI
Unknown · 0 stars · 1 mo ago

ncduy0303 /

datacamp

12/100

Single Jupyter notebook Datacamp coursework project with no README, tests, CI, license, or documentation. Created and last pushed same day (2026-01-26). Minimal codebase (~1.2 MB) implementing a basic PyTorch neural network for cybersecurity threat detection.

Impact 5 · Quality 25 · Depth 5
Jupyter Notebook · 0 stars · 4 mo ago

06 · Timeline

  1. Jun 26, 2017
    Joined GitHub
  2. Nov 3, 2020
    Created ncduy0303
  3. Mar 31, 2023
    Created ncduy0303.github.io — My personal Github Page
  4. Jan 26, 2026
    Created datacamp — A repository to store my work of different Datacamp projects
  5. Feb 6, 2026
    Created molecule-tokenization — FYP Project: Advanced Tokenization Methods for Molecular Foundation Models
  6. Feb 10, 2026
    Created TokenizerStats — Taken from https://pubs.acs.org/doi/10.1021/acs.jcim.5c01856
  7. Apr 21, 2026
    Most recent push to ncduy0303

08 · Rubric

How this score was produced

Overall = Σ (category × weight) + gentle top-end curve

Category · Weight · Score · Contrib.
Raw total: 49.1
Top-end curve: +2.5
Final overall: 51.6
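
The arithmetic behind these totals can be checked with a short sketch. The weights and scores are the ones listed in the category breakdown. The +2.5 curve is taken as reported rather than derived, because the page does not publish the curve's formula. The names `CATEGORIES`, `REPORTED_CURVE`, and `raw_total` are illustrative.

```python
# (weight, score) pairs as listed in the category breakdown.
CATEGORIES = {
    "Impact": (0.25, 36),
    "Consistency": (0.20, 60),
    "Quality": (0.20, 42),
    "Depth": (0.15, 55),
    "Breadth": (0.10, 65),
    "Community": (0.10, 50),
}

# Taken from the page as-is; the curve's actual formula is not published.
REPORTED_CURVE = 2.5


def raw_total(categories):
    """Weighted sum of the category scores."""
    return sum(weight * score for weight, score in categories.values())


raw = raw_total(CATEGORIES)
overall = raw + REPORTED_CURVE
print(f"raw = {raw:.2f}, overall = {overall:.2f}")
```

The weighted sum comes to 49.15 and the curved total to 51.65. These match the reported 49.1 and 51.6 if the page truncates to one decimal place, which is an assumption about its display logic.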

Tier thresholds

  • S (90–100): Mass-producing humans
  • A (80–89): Ship machine
  • B (70–79): Solid engineer
  • C (60–69): Getting there
  • D (40–59): README enthusiast
  • F (0–39): GitHub tourist
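
The tier thresholds amount to a lookup on the lower bound of each band. A minimal sketch follows, assuming fractional scores fall into the band whose lower bound they meet or exceed. The page only lists integer ranges, so that rule, along with the names `TIERS` and `tier_for`, is an assumption.

```python
# (lower bound, letter, label), ordered from highest band to lowest.
TIERS = [
    (90, "S", "Mass-producing humans"),
    (80, "A", "Ship machine"),
    (70, "B", "Solid engineer"),
    (60, "C", "Getting there"),
    (40, "D", "README enthusiast"),
    (0, "F", "GitHub tourist"),
]


def tier_for(score):
    """Return (letter, label) for a score on the 0-100 scale."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, letter, label in TIERS:
        if score >= lower_bound:
            return letter, label


print(tier_for(51.6))
```

Applied to this profile, the overall 51.6 lands in D, and the category scores 36, 60, and 65 land in F, C, and C, matching the letters in the category breakdown.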
How the pipeline works
  1. Scrape. Pull every non-fork repo pushed in the last 90 days, plus your contribution calendar, followers, and language byte counts, straight from GitHub's REST & GraphQL APIs.
  2. Triage. A small model reads every repo's file tree + README and picks the 20 files per repo that actually reveal how you code.
  3. Grade each repo. All repos run in parallel through a fast scoring model that reads the picked files and rates each one independently on Impact, Quality, and Depth, with evidence citations.
  4. Aggregate. A larger reasoning model combines the per-repo scores with server-computed stats (heatmap, commit cadence, language entropy, follower count) to produce the 6-dimension profile score + roasts.
  5. Correct. Deterministic server-side checks enforce anchor-scale floors (e.g. a profile with 2,000+ public commits can't score 30 Consistency) and recompute the final verdict.
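
The correction step can be pictured as a set of signal-conditional minimums applied after aggregation. The sketch below is one possible shape, not the tool's actual code. The `FloorRule` type, the `public_commits` signal name, and the floor value of 50 are all hypothetical, since the page only says that such a profile cannot score 30.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FloorRule:
    category: str                                  # category the floor applies to
    minimum: float                                 # lowest score allowed when the rule fires
    applies: Callable[[Dict[str, float]], bool]    # predicate over server-measured signals


# Hypothetical rule: the page gives the 2,000-commit trigger but not the floor value.
RULES: List[FloorRule] = [
    FloorRule("Consistency", 50, lambda s: s.get("public_commits", 0) >= 2000),
]


def apply_floors(scores, signals, rules):
    """Return a copy of scores with every triggered floor enforced."""
    corrected = dict(scores)
    for rule in rules:
        if rule.applies(signals):
            corrected[rule.category] = max(corrected.get(rule.category, 0), rule.minimum)
    return corrected


model_scores = {"Consistency": 30, "Impact": 36}
print(apply_floors(model_scores, {"public_commits": 2500}, RULES))
```

Keeping the rules deterministic and separate from the model's output means a floor can only raise a score, never lower it, and the final verdict can be recomputed from the corrected values.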

~90 seconds per profile, ~$0.25 in compute. Total of ~240 files read across your top-12 repos. One rating per GitHub account per day.

Data sources & caveats
  • Heatmap & commit totals: GitHub GraphQL contributionsCollection — covers the last 365 days, includes private repos when the user has opted in (default).
  • Language %: byte totals across the top 30 owned non-fork repos.
  • Curve: a small upward nudge centered on raw score ≈ 70, capping at 100. Prevents specialists from being unfairly penalised for narrow breadth.
  • Anchor corrections: when server-measured signals (e.g. privateWorkLikely, multiRepoVolume, follower count) mandate a minimum category score, the aggregation step enforces it. These are signal-conditional, not identity-based floors.
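
For the heatmap source named above, a sketch of the GraphQL query and of counting active days from its response may help. The query uses fields from GitHub's public GraphQL schema. The helper `summarize_calendar` and the sample response are illustrative, and the authenticated POST to `https://api.github.com/graphql` that would actually send the query is left out.

```python
# Query text only; sending it needs an authenticated POST, omitted here.
CONTRIBUTIONS_QUERY = """
query($login: String!) {
  user(login: $login) {
    contributionsCollection {
      contributionCalendar {
        totalContributions
        weeks {
          contributionDays {
            date
            contributionCount
          }
        }
      }
    }
  }
}
"""


def summarize_calendar(response):
    """Count total contributions and active days in a parsed GraphQL response."""
    calendar = response["data"]["user"]["contributionsCollection"]["contributionCalendar"]
    days = [day for week in calendar["weeks"] for day in week["contributionDays"]]
    active_days = sum(1 for day in days if day["contributionCount"] > 0)
    return {
        "total": calendar["totalContributions"],
        "active_days": active_days,
        "days": len(days),
    }


# Made-up response in the shape the query returns; not this profile's data.
SAMPLE_RESPONSE = {"data": {"user": {"contributionsCollection": {"contributionCalendar": {
    "totalContributions": 5,
    "weeks": [{"contributionDays": [
        {"date": "2026-01-01", "contributionCount": 3},
        {"date": "2026-01-02", "contributionCount": 0},
        {"date": "2026-01-03", "contributionCount": 2},
    ]}],
}}}}}

print(summarize_calendar(SAMPLE_RESPONSE))
```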
ncduy0303 · 51.6/100 — Rate My GitHub