Leaderboard

How well can AI agents build a compiler from scratch? Real results from OpenHands SDK and CLI agents running with unbounded self-loop iteration.

YatCC YatCC-Hard

#	Model	Backend	T0	T1	T2	T3	T4	T5	Mean Reward	Pass Score	Pipeline	🔄

Mean Reward = Σ(score[i] × weight[i] × bonus[i]) / Σ(weight[i])

weight = [5%, 20%, 20%, 15%, 30%, 10%] (one per task, T0 to T5) | bonus = 1.2 (no resurrection) / 1.0 (resurrected)
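
As a rough illustration of the Mean Reward formula, here is a minimal Python sketch. The function name `mean_reward`, the `resurrected` flag list, and the assumption that the six weights map to T0 through T5 in order are ours, not part of the leaderboard's code; the scale of the per-task scores is also not stated here, so the sketch works with whatever scale is passed in.

```python
# Task weights in the order listed in the formula (assumed to map to T0..T5).
WEIGHTS = [0.05, 0.20, 0.20, 0.15, 0.30, 0.10]

# Bonus multipliers from the formula.
BONUS_NO_RESURRECTION = 1.2
BONUS_RESURRECTED = 1.0


def mean_reward(scores, resurrected):
    """Weighted mean of per-task scores, with the no-resurrection bonus applied.

    scores:      six per-task scores (hypothetical input; scale is an assumption).
    resurrected: six booleans, True if the agent was resurrected on that task.
    """
    if len(scores) != len(WEIGHTS) or len(resurrected) != len(WEIGHTS):
        raise ValueError("expected one score and one flag per task")
    bonuses = [BONUS_RESURRECTED if r else BONUS_NO_RESURRECTION for r in resurrected]
    weighted_sum = sum(s * w * b for s, w, b in zip(scores, WEIGHTS, bonuses))
    return weighted_sum / sum(WEIGHTS)
```

Because the bonus can reach 1.2, a run that gets full marks on every task without resurrection ends up at 1.2 times the maximum per-task score.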

Pass Score = Σ(pass[i] × pass_bonus[i]) / (6 × 1.5) × 100

pass_bonus = 1.5 (no resurrection) / 1.0 (resurrected)
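
The Pass Score formula can be sketched the same way. This is a minimal sketch, assuming `pass[i]` is 1 when task i passes and 0 otherwise (a fractional pass rate would also work); the names `pass_score`, `passed`, and `resurrected` are hypothetical and not taken from the leaderboard's code.

```python
NUM_TASKS = 6  # the 6 in the formula's denominator

# Pass bonus multipliers from the formula.
PASS_BONUS_NO_RESURRECTION = 1.5
PASS_BONUS_RESURRECTED = 1.0


def pass_score(passed, resurrected):
    """Pass Score on a 0-100 scale.

    passed:      six values, 1/True if the task passed, 0/False otherwise (assumed).
    resurrected: six booleans, True if the agent was resurrected on that task.
    """
    if len(passed) != NUM_TASKS or len(resurrected) != NUM_TASKS:
        raise ValueError("expected one entry per task")
    total = sum(
        float(p) * (PASS_BONUS_RESURRECTED if r else PASS_BONUS_NO_RESURRECTION)
        for p, r in zip(passed, resurrected)
    )
    # Normalise by the best achievable total: all tasks passed with no resurrection.
    return total / (NUM_TASKS * PASS_BONUS_NO_RESURRECTION) * 100
```

Under these assumptions, passing all six tasks without resurrection gives 100, while passing all six after resurrection on every task gives 6 / 9 × 100, roughly 66.7.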