Publications

You can also find my articles on my Google Scholar profile.

Conference Papers

AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

Guanxi Lu, Hao (Mark) Chen, Yuto Karashima, Zhican Wang, Daichi Fujiki, Hongxiang Fan

FastTTS: Accelerating Test-Time Scaling for Edge LLM Reasoning

Hao (Mark) Chen, Zhiwen Mo, Guanxi Lu, Shuang Liang, Lingxiao Ma, Wayne Luk, Hongxiang Fan

(Invited Paper) Enhancing Trustworthiness with Mixed Precision: Benchmarks, Opportunities, and Challenges

Guanxi Lu, Hao (Mark) Chen, Zhiqiang Que, Wayne Luk, Hongxiang Fan

Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling

Hao (Mark) Chen, Guanxi Lu, Yasuyuki Okoshi, Zhiwen Mo, Masato Motomura, Hongxiang Fan

Preprints

Efficient and Flexible FP-INTx Accelerator for Weight-only Quantized LLM Inference

Zhican Wang, Hongxiang Fan, Guanxi Lu, Chen Zhang, Haroon Waris, Hao (Mark) Chen, Guanghui He