rajveerb an hour ago
One thing that wasn't addressed but would be interesting to discuss is benchmarks/evals that conflict with one another.
Are there desirable emergent behaviors that might not get optimized for because the evals penalize them?