EigenBench Pipeline

Upload evaluations and a spec, run BTD training and bootstrap, then publish to ValueArena.
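As a rough illustration of the training and bootstrap step, the sketch below fits a plain Bradley-Terry model to pairwise outcomes and then resamples the comparisons to get percentile intervals. It assumes that BTD refers to a Bradley-Terry-Davidson model and, for brevity, leaves out the Davidson tie term. The function names, the toy data, and the pseudo-count are hypothetical and are not taken from the EigenBench code.

```python
import random

def fit_bt(comparisons, n_models, iters=200):
    """Fit plain Bradley-Terry strengths with MM-style updates (no tie term)."""
    wins = [0.0] * n_models
    for winner, _ in comparisons:
        wins[winner] += 1.0
    p = [1.0] * n_models
    for _ in range(iters):
        denom = [0.0] * n_models
        for w, l in comparisons:
            s = 1.0 / (p[w] + p[l])
            denom[w] += s
            denom[l] += s
        # Hypothetical pseudo-count: keeps strengths positive for models with zero wins.
        p = [(wins[i] + 1e-3) / denom[i] if denom[i] > 0 else p[i]
             for i in range(n_models)]
        # Rescale so the strengths average to 1 (the model is scale-invariant).
        total = sum(p)
        p = [x * n_models / total for x in p]
    return p

def bootstrap_bt(comparisons, n_models, n_boot=200, seed=0):
    """Resample comparisons with replacement; return 95% percentile intervals."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_boot):
        resample = [rng.choice(comparisons) for _ in comparisons]
        samples.append(fit_bt(resample, n_models))
    intervals = []
    for i in range(n_models):
        vals = sorted(s[i] for s in samples)
        lo = vals[int(0.025 * (n_boot - 1))]
        hi = vals[int(0.975 * (n_boot - 1))]
        intervals.append((lo, hi))
    return intervals

# Toy data: each pair is (winner_index, loser_index) among three models.
comparisons = (
    [(0, 1)] * 8 + [(1, 0)] * 2
    + [(1, 2)] * 7 + [(2, 1)] * 3
    + [(0, 2)] * 9 + [(2, 0)] * 1
)
strengths = fit_bt(comparisons, n_models=3)
intervals = bootstrap_bt(comparisons, n_models=3)
```

Here `strengths` holds one positive score per model and `intervals` holds a matching list of (low, high) bounds. A real run would read the uploaded evaluations instead of the toy list.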

Results