MoE
Mixture of Experts (MoE) is a neural network architecture that combines multiple specialized sub-networks (experts) with a gating mechanism that routes each input to the most relevant experts. Because only a small subset of experts is activated per input, model capacity can scale without a proportional increase in per-token compute.
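To make the routing idea concrete, here is a minimal sketch of a top-k gated MoE layer in PyTorch. The class name `MoELayer`, the expert count, hidden sizes, and `top_k` value are all illustrative assumptions, not a reference implementation.

```python
# Minimal top-k gated MoE sketch (illustrative sizes and names).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, num_experts=4, top_k=2):
        super().__init__()
        # Each expert is an independent feed-forward sub-network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])
        # The gate scores every expert for every token.
        self.gate = nn.Linear(d_model, num_experts)
        self.top_k = top_k

    def forward(self, x):  # x: (batch, d_model)
        scores = self.gate(x)                              # (batch, num_experts)
        top_w, top_idx = scores.topk(self.top_k, dim=-1)   # keep the top_k experts per token
        top_w = F.softmax(top_w, dim=-1)                   # renormalize their gating weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = top_idx[:, slot]                         # chosen expert index per token
            w = top_w[:, slot].unsqueeze(-1)               # its gating weight
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():
                    # Only the selected experts run, weighted by the gate.
                    out[mask] += w[mask] * expert(x[mask])
        return out

x = torch.randn(8, 64)
print(MoELayer()(x).shape)  # torch.Size([8, 64])
```

The per-expert loop keeps the sketch readable; production implementations typically batch tokens per expert and add a load-balancing loss so routing does not collapse onto a few experts.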