Open SLM Leaderboard

A leaderboard for sub-150M parameter language models, evaluated with the LM-eval harness or with Arithmark-2.0, a custom benchmark script.

Leaderboard

Zero-shot evaluation. Higher is better for all columns. Click any header to sort.

# Model Params Avg HellaSwag ARC-Easy ARC-Challenge PIQA ArithMark-2

Efficiency

Average score vs. parameter count (log scale). The shaded zone marks the region above the regression line, where models score higher than the trend predicts for their size.
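The comparison behind this plot can be sketched in a few lines. This is a minimal illustration, assuming an ordinary least-squares fit of average score against log10 of the parameter count; the leaderboard does not state which regression it uses. The function names and the sample models below are hypothetical, not leaderboard data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def above_regression_line(models):
    """Return names of models whose average score exceeds the fitted trend.

    `models` is a list of (name, parameter_count, average_score) tuples.
    Assumption: the trend is fitted on log10(parameter_count).
    """
    xs = [math.log10(params) for _, params, _ in models]
    ys = [score for _, _, score in models]
    slope, intercept = fit_line(xs, ys)
    return [
        name
        for (name, _, score), x in zip(models, xs)
        if score > slope * x + intercept
    ]


# Hypothetical entries for illustration only.
EXAMPLE_MODELS = [
    ("model-a", 10e6, 30.0),
    ("model-b", 50e6, 38.0),
    ("model-c", 100e6, 36.0),
    ("model-d", 150e6, 41.0),
]

if __name__ == "__main__":
    print(above_regression_line(EXAMPLE_MODELS))
```

A model listed by `above_regression_line` would fall in the shaded zone under these assumptions: it does better than models of its size typically do.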

Avg Score vs Log Parameters

Add your model

Open a PR on this Space with your model's results for the benchmarks listed above. Our team will independently verify the results before merging your PR. Your model must have open weights to qualify.
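To produce results for a PR, one option is to run the LM-eval harness from its command line. The sketch below only builds the command and prints it. It assumes the `lm_eval` CLI from lm-evaluation-harness, its `hf` model backend, and the task names `hellaswag`, `arc_easy`, `arc_challenge`, and `piqa`. The model ID, batch size, and output path are placeholders, and the leaderboard's exact settings may differ. ArithMark-2 is run with its own script and is not covered here.

```python
import shlex

# Harness task names assumed to correspond to the leaderboard columns.
TASKS = ["hellaswag", "arc_easy", "arc_challenge", "piqa"]


def build_lm_eval_command(model_id, output_path="results"):
    """Build a zero-shot lm_eval command line for a Hugging Face model.

    `model_id` and `output_path` are placeholders chosen for illustration.
    """
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(TASKS),
        "--num_fewshot", "0",  # zero-shot, matching the leaderboard description
        "--batch_size", "8",   # arbitrary choice; adjust to your hardware
        "--output_path", output_path,
    ]


if __name__ == "__main__":
    # "your-org/your-model" is a hypothetical model ID.
    print(shlex.join(build_lm_eval_command("your-org/your-model")))
```

Printing the command rather than executing it keeps the sketch free of side effects. The returned list can be passed to `subprocess.run` once the harness is installed.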