Mathematical Foundations

Transformer Model Architecture

Embedding

Attention

MLP

MoE

VLM

Huggingface

Training

N-D Parallelism

Comm-Compute Overlap

Low-Precision Training

Load Balancing

Long Context

Data Loading & Checkpointing

Inference

Decode

KV-Cache

PD Disaggregation/Colocation

Quant

Kernel

RL (Reinforcement Learning)

RL Algo

RL System

Post-Training Recipes

MaaS

Hardware

Communication

Misc

Kernels