Performance Benchmarks

Explore momo-kibidango's performance across different Apple Silicon chips and configurations.

Performance Comparison

Baseline (Sonnet 3.5 only): 12.5 tok/s
2-Model Speculative: 18.7 tok/s
Pyramid (momo-kibidango): 24.6 tok/s

vs Baseline: +97%
vs 2-Model: +32%
Speedup (vs Baseline): 1.97x
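The relative gains follow directly from the throughput figures in the comparison. The short Python sketch below reproduces that arithmetic; the dictionary keys and function names are illustrative and not part of momo-kibidango itself.

```python
# Throughput figures (tokens per second) as reported in the comparison above.
# The keys are hypothetical labels chosen for this sketch.
THROUGHPUT_TOK_S = {
    "baseline": 12.5,
    "two_model": 18.7,
    "pyramid": 24.6,
}


def speedup(new_tok_s: float, old_tok_s: float) -> float:
    """Return the throughput ratio new/old (2.0 means twice as fast)."""
    return new_tok_s / old_tok_s


def percent_gain(new_tok_s: float, old_tok_s: float) -> float:
    """Return the relative throughput gain in percent."""
    return (speedup(new_tok_s, old_tok_s) - 1.0) * 100.0


if __name__ == "__main__":
    pyramid = THROUGHPUT_TOK_S["pyramid"]
    baseline = THROUGHPUT_TOK_S["baseline"]
    two_model = THROUGHPUT_TOK_S["two_model"]
    print(f"Speedup vs baseline: {speedup(pyramid, baseline):.2f}x")
    print(f"Gain vs baseline:    +{percent_gain(pyramid, baseline):.0f}%")
    print(f"Gain vs 2-model:     +{percent_gain(pyramid, two_model):.0f}%")
```

A ratio (1.97x) and a percentage gain (+97%) describe the same improvement in two different units, which is why the two figures should not share a suffix.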

Acceptance Rates

Tier 1 → Tier 2 (Haiku 2 → Haiku 3): 89%
Tier 2 → Tier 3 (Haiku 3 → Sonnet 3.5): 76%
Overall (end-to-end acceptance): 68%

A higher acceptance rate means fewer draft tokens are rejected by the next tier, so less work is discarded and overall inference is faster.
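The page does not say how the overall figure is derived. One reading that matches the reported numbers is that a draft token must be accepted at each tier in turn, so the per-tier rates multiply. The sketch below works through that reading; treating the tiers as independent is an assumption, and the function name is illustrative.

```python
from math import prod

# Per-tier acceptance rates as reported above.
# The keys are hypothetical labels chosen for this sketch.
STAGE_ACCEPTANCE = {
    "tier1_to_tier2": 0.89,
    "tier2_to_tier3": 0.76,
}


def end_to_end_acceptance(stage_rates) -> float:
    """Chance that a draft token survives every tier.

    Assumes acceptance at each tier is independent of the others,
    which is a simplification made for this example.
    """
    return prod(stage_rates)


if __name__ == "__main__":
    overall = end_to_end_acceptance(STAGE_ACCEPTANCE.values())
    print(f"End-to-end acceptance: {overall:.0%}")
```

Under this reading, raising either per-tier rate raises the end-to-end rate proportionally, so the weaker tier boundary (Tier 2 → Tier 3) is the one with more room to improve.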

Memory Usage

Model Sizes

Haiku 2 (Tier 1): 2.3 GB
Haiku 3 (Tier 2): 3.8 GB
Sonnet 3.5 (Tier 3): 4.9 GB
Total: 11.6 GB

Runtime Requirements

Minimum RAM: 16 GB
Recommended RAM: 32 GB
Peak Usage: 13.2 GB
Disk Space: 50 GB
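To see what the runtime figures imply for a given machine, the sketch below computes how much memory is left once peak usage is reached. It uses only the numbers listed above. The function name is illustrative, and the result is an upper bound, since the operating system and other applications also draw on the same memory.

```python
# Runtime figures as reported above, in gigabytes.
PEAK_USAGE_GB = 13.2
MINIMUM_RAM_GB = 16
RECOMMENDED_RAM_GB = 32


def headroom_gb(total_ram_gb: float, peak_gb: float = PEAK_USAGE_GB) -> float:
    """Return memory left over at peak usage; negative means it does not fit."""
    return total_ram_gb - peak_gb


if __name__ == "__main__":
    for label, ram in (("minimum", MINIMUM_RAM_GB),
                       ("recommended", RECOMMENDED_RAM_GB)):
        print(f"{label} ({ram} GB): {headroom_gb(ram):.1f} GB headroom at peak")
```

On the minimum configuration this leaves under 3 GB of headroom at peak, which is consistent with 32 GB being the recommended size.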

Export Benchmark Data

Download the complete benchmark dataset for your own analysis.