Constant 14ms attention: 512→524K tokens (24.5x faster than FlashAttention)

(github.com)

1 point | by luxiedge 10 hours ago

1 comment