Efficient and Flexible FP-INTx Accelerator for Weight-only Quantized LLM Inference

Published in Preprint, 2025
