I am Mo Zhu, a PhD student studying CS & AI at Zhejiang University.
- Zhejiang University
Pinned
- massive-activations-deep (Public, forked from locuslab/massive-activations)
Code accompanying the paper "Massive Activations in Large Language Models"
Python
- llm_distillation (Public, forked from Nicolas-BZRD/llm-distillation)
I don't know why it doesn't do well.
Python
- 2018cx/Multi-Level-OT (Public)
PyTorch implementation of "Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models", AAAI 2025