pytorch/benchmarks/gpt_fast
Name                     | Last commit message                                                                        | Last commit date
benchmark.py             | GPT-fast benchmark: adding memory bandwidth and use A100-40GB as target (#125881)         | 2024-05-11 10:46:54 +00:00
mixtral_moe_model.py     | Reduce the number of layers for mixtral moe model to adapt CI memory limitation (#125608) | 2024-05-06 21:52:25 +00:00
mixtral_moe_quantize.py  |                                                                                            |
model.py                 |                                                                                            |
quantize.py              |                                                                                            |
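
The benchmark.py entry above mentions adding a memory-bandwidth metric with an A100-40GB target. As a rough illustration (not this repo's actual code), achieved bandwidth during autoregressive decoding is commonly estimated as model size times tokens per second, since each generated token reads every weight once; the helper names below are hypothetical.

```python
import torch
import torch.nn as nn


def model_size_bytes(model: nn.Module) -> int:
    """Total bytes occupied by the model's parameters and buffers."""
    param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    buffer_bytes = sum(b.numel() * b.element_size() for b in model.buffers())
    return param_bytes + buffer_bytes


def estimated_bandwidth_gb_s(model: nn.Module, tokens_per_second: float) -> float:
    """Estimate achieved memory bandwidth in GB/s, assuming every weight
    is read once per generated token (a standard decoding approximation)."""
    return model_size_bytes(model) * tokens_per_second / 1e9


if __name__ == "__main__":
    # Toy module standing in for a real transformer.
    toy = nn.Linear(4096, 4096, dtype=torch.bfloat16)
    print(f"{estimated_bandwidth_gb_s(toy, tokens_per_second=100.0):.3f} GB/s")
```

Under this approximation, a 7B-parameter model held in bf16 (~14 GB of weights) decoding at 100 tokens/sec would be moving roughly 1.4 TB/s, which is why an A100-40GB (peak HBM bandwidth around 1.5 TB/s) is a natural reference target for the metric.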