Flash Attention: The Mathematical Tricks That Broke the Memory Wall

Flash Attention is a memory-efficient attention mechanism for transformers.
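The core trick is tiling combined with an online softmax: keys and values are processed in blocks while running max and sum statistics are maintained per query row, so the full seq_len x seq_len score matrix is never materialized. Below is a minimal NumPy sketch of that idea; the function name, `block_size` parameter, and single-head layout are illustrative assumptions, not the paper's actual fused GPU kernel.

```python
import numpy as np

def flash_attention_sketch(Q, K, V, block_size=64):
    """Tiled attention with an online softmax (illustrative sketch).

    Processes K/V in blocks, keeping a running row-wise max (m) and
    softmax denominator (l) so scores are only ever computed for one
    block at a time.
    """
    seq_len, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros_like(Q, dtype=np.float64)   # unnormalized output accumulator
    m = np.full(seq_len, -np.inf)            # running max of scores per row
    l = np.zeros(seq_len)                    # running softmax denominator

    for start in range(0, seq_len, block_size):
        Kb = K[start:start + block_size]
        Vb = V[start:start + block_size]
        S = (Q @ Kb.T) * scale               # scores for this block only
        m_new = np.maximum(m, S.max(axis=1)) # updated row-wise max
        P = np.exp(S - m_new[:, None])       # block's softmax numerators
        corr = np.exp(m - m_new)             # rescale previously accumulated stats
        l = l * corr + P.sum(axis=1)
        O = O * corr[:, None] + P @ Vb
        m = m_new

    return O / l[:, None]                    # normalize at the end

# Sanity check against naive attention on random inputs:
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
S = (Q @ K.T) / np.sqrt(32)
naive = np.exp(S - S.max(axis=1, keepdims=True))
naive = (naive / naive.sum(axis=1, keepdims=True)) @ V
assert np.allclose(flash_attention_sketch(Q, K, V), naive, atol=1e-6)
```

The rescaling factor `corr = exp(m - m_new)` is what makes the streaming computation exact rather than approximate: earlier blocks were exponentiated against a stale maximum, and this factor re-expresses them against the updated one.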