Scaling Laws, Honestly

Posted by CompleteSkeptic |3 hours ago |1 comments

wardc an hour ago

Pretty interesting article. It seems reasonable and vibes with the kinds of models people are releasing in the open-source world.

For Chinchilla / scaling laws, doesn't it seem a bit weird that they aren't using wall-clock time? Like, the FA4 backward pass is bandwidth-bound, not FLOPs-bound. It seems like you'd care about dollars or time in relation to loss, not just clean-room FLOPs. MFUs are likely not equivalent across different model sizes / shapes.
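
A rough sketch of the point, not anything from the article: two runs with the same FLOP budget can differ a lot in wall-clock time and cost once MFU differs. The 6*N*D compute estimate is the usual rule of thumb; the peak throughput, MFU values, GPU count, and price below are all made-up numbers for illustration, and the function names are hypothetical.

```python
import math


def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute with the common C ~ 6 * N * D rule of thumb."""
    return 6.0 * n_params * n_tokens


def wall_clock_hours(flops: float, peak_flops_per_gpu: float,
                     mfu: float, n_gpus: int) -> float:
    """Hours needed if each GPU sustains mfu * peak_flops_per_gpu FLOP/s."""
    achieved_flops_per_sec = peak_flops_per_gpu * mfu * n_gpus
    return flops / achieved_flops_per_sec / 3600.0


def cost_dollars(hours: float, n_gpus: int, price_per_gpu_hour: float) -> float:
    """Total rental cost for the run."""
    return hours * n_gpus * price_per_gpu_hour


# Hypothetical hardware and pricing (assumptions, not measurements).
PEAK_FLOPS_PER_GPU = 1e15   # FLOP/s
N_GPUS = 8
PRICE_PER_GPU_HOUR = 2.0    # dollars

# Two model shapes with the same FLOP budget but different assumed MFU.
flops_small = train_flops(n_params=1e9, n_tokens=20e9)
flops_large = train_flops(n_params=2e9, n_tokens=10e9)

hours_small = wall_clock_hours(flops_small, PEAK_FLOPS_PER_GPU, mfu=0.30, n_gpus=N_GPUS)
hours_large = wall_clock_hours(flops_large, PEAK_FLOPS_PER_GPU, mfu=0.45, n_gpus=N_GPUS)

cost_small = cost_dollars(hours_small, N_GPUS, PRICE_PER_GPU_HOUR)
cost_large = cost_dollars(hours_large, N_GPUS, PRICE_PER_GPU_HOUR)

print(f"same FLOPs: {math.isclose(flops_small, flops_large)}")
print(f"small model: {hours_small:.2f} h, ${cost_small:.2f}")
print(f"large model: {hours_large:.2f} h, ${cost_large:.2f}")
```

With these assumed MFUs the two runs sit at the same point on a FLOPs axis but 1.5x apart on a time or dollar axis, which is the gap a FLOPs-only scaling fit would hide.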

adamzwasserman 3 hours ago

[flagged]