EigenBench Pipeline
Upload evaluations + spec, run BTD training + bootstrap, publish to ValueArena.
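As a rough illustration of the "training + bootstrap" step, the sketch below fits Elo-style scores from pairwise outcomes and resamples the comparisons to get percentile confidence intervals. It is not the pipeline's code: it uses a plain Bradley-Terry fit without ties as a stand-in for the BTD model, and the names `fit_elo` and `bootstrap_ci`, the `(winner, loser)` input format, and the 1500/400 Elo scaling are all assumptions made for this example.

```python
import math
import random


def fit_elo(comparisons, n_models, iters=200):
    """Fit plain Bradley-Terry strengths with the MM update (hypothetical helper).

    comparisons: list of (winner, loser) model-index pairs.
    Returns Elo-style scores centered on 1500 (assumed scaling).
    """
    wins = [0.0] * n_models
    games = {}  # unordered pair -> number of comparisons
    for winner, loser in comparisons:
        wins[winner] += 1.0
        pair = (min(winner, loser), max(winner, loser))
        games[pair] = games.get(pair, 0) + 1

    p = [1.0] * n_models
    for _ in range(iters):
        denom = [0.0] * n_models
        for (i, j), n in games.items():
            share = n / (p[i] + p[j])
            denom[i] += share
            denom[j] += share
        # Clamp so a model with zero wins in a resample keeps a finite score;
        # a model absent from the resample keeps its previous strength.
        p = [
            max(wins[i] / denom[i], 1e-9) if denom[i] > 0 else p[i]
            for i in range(n_models)
        ]

    logs = [math.log10(x) for x in p]
    mean_log = sum(logs) / n_models
    return [1500.0 + 400.0 * (value - mean_log) for value in logs]


def bootstrap_ci(comparisons, n_models, n_boot=200, seed=0):
    """Percentile 95% intervals from resampling comparisons with replacement."""
    rng = random.Random(seed)
    samples = [[] for _ in range(n_models)]
    for _ in range(n_boot):
        resample = rng.choices(comparisons, k=len(comparisons))
        for i, score in enumerate(fit_elo(resample, n_models)):
            samples[i].append(score)

    intervals = []
    for scores in samples:
        scores.sort()
        low = scores[int(0.025 * (n_boot - 1))]
        high = scores[int(0.975 * (n_boot - 1))]
        intervals.append((low, high))
    return intervals
```

Calling `fit_elo(data, 3)` on a list of `(winner, loser)` pairs gives one score per model, and `bootstrap_ci(data, 3)` gives a `(low, high)` pair per model. Fixing the seed keeps the intervals reproducible across runs.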
Secret
evaluations.jsonl
spec.py
Run Name
Group (optional)
Note (optional)
Git Commit (optional)
Run Pipeline & Upload
Status
Results
EigenBench Elo
Bootstrap CI
UV Embeddings PCA
Training Loss
meta.json
Summary (Elo rankings): summary.json