Implementing FlashAttention and NSA in Triton
References:
- https://borninfreedom.github.io/posts/2023/10/blog-post-2/
- https://zhuanlan.zhihu.com/p/17790319806
- https://www.cnblogs.com/fangpin/p/19234690
- https://dev.byted.org/laixunhao/native_sparse_attention
- https://triton-lang.org/main/getting-started/tutorials/06-fused-attention.html
- https://zhuanlan.zhihu.com/p/1957032283270812718
- https://zhuanlan.zhihu.com/p/24841366485
- DSA: https://github.com/deepseek-ai/DeepSeek-V3.2-Exp/blob/main/DeepSeek_V3_2.pdf