Hansong Zhou
5th-year Ph.D. Candidate
Department of Computer Science, Florida State University

Hansong Zhou is a fifth-year Ph.D. candidate in the Department of Computer Science at Florida State University, advised by Prof. Xiaonan Zhang. He received his master's degree in 2021 from the Department of Electrical & Computer Engineering at the University of Florida, and his bachelor's degree in 2019 from the Department of Electrical and Information Engineering at Xi'an Jiaotong University. His work contributes to efficient and reliable edge intelligence, particularly rapid deployment in large-scale scenarios and Large Language Model (LLM) inference acceleration.

Hansong's research focuses on distributed LLM training and inference at the edge, LLM fine-tuning for domain-specific expert agents, and large-scale collaborative edge systems. He currently owns the end-to-end delivery of an LLM-based AI Health Coach for rural health, spanning data pipeline design, model fine-tuning, prompt engineering, and final application deployment.


Education
  • Florida State University
    Department of Computer Science
    Ph.D. Candidate
    Aug. 2021 - present
  • University of Florida
    Department of Electrical & Computer Engineering
    M.S. in Computer Engineering
    Sep. 2019 - Jul. 2021
  • Xi'an Jiaotong University
    Department of Electrical and Information Engineering
    B.S. in Information Engineering
    Sep. 2015 - Jul. 2019
Experience
  • University of Electronic Science and Technology of China
    School of Information and Communication Engineering
    Research Intern
Jul. 2018 - Dec. 2018
Honors & Awards
  • Dean's Award for Doctoral Excellence (DADE), Florida State University
    2023 - 2025
Selected Publications
Fairness-Oracular MARL with Competitor-Aware Signals for Collaborative Inference

Hansong Zhou, Xiaonan Zhang

NeurIPS '25 - AI4NextG: The Thirty-Ninth Annual Conference on Neural Information Processing Systems

Collaborative inference (CI) in NextG networks enables battery-powered devices to collaborate with nearby edges on deep learning inference. The fairness issue in a multi-device multi-edge (M2M) CI system remains underexplored. Mean-field multi-agent reinforcement learning (MFRL) is a promising solution for its low complexity and adaptability to system dynamics. However, device mobility in M2M CI systems hinders its effectiveness, as it breaks the premise of stable mean-field statistics. We propose FOCI (Fairness-Oriented Collaborative Inference), an RL-based method with two components: (i) an oracle-shaping reward for approaching max-min fairness, and (ii) a competitor-aware observation augmentation for stabilizing device behaviors. We provide a convergence guarantee with bounded estimation errors. In experiments driven by real-world device mobility traces, FOCI shows the best performance on multiple metrics and tightens the tails. It reduces worst-case latency by up to 56% and worst-case energy by 46% versus baselines, while halving the switch cost and preserving competitive QoS.
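To make the two components concrete, below is a minimal, hypothetical Python sketch of an oracle-shaped reward targeting max-min fairness and a competitor-aware observation augmentation; the function names, the cost weight alpha, and the choice of competitor statistics are illustrative assumptions, not the paper's exact formulation.

# Illustrative sketch only; names and formulas are assumptions, not FOCI's exact design.
import numpy as np

def oracle_shaped_reward(latencies, energies, agent_id, alpha=0.5):
    """Shape an agent's reward toward max-min fairness: penalize the agent's
    own cost plus the system-wide worst-case cost seen by an 'oracle'."""
    costs = alpha * latencies + (1 - alpha) * energies
    return -(costs[agent_id] + costs.max())

def competitor_aware_observation(own_obs, competitor_loads):
    """Append summary statistics of the competitors contending for the same
    edge, which stabilizes learning when mobility makes mean-field
    statistics unreliable."""
    stats = np.array([competitor_loads.mean(),
                      competitor_loads.max(),
                      float(len(competitor_loads))], dtype=np.float32)
    return np.concatenate([own_obs.astype(np.float32), stats])

# Toy usage
lat = np.array([0.8, 1.5, 0.6]); eng = np.array([0.4, 0.9, 0.3])
r0 = oracle_shaped_reward(lat, eng, agent_id=0)
obs = competitor_aware_observation(np.zeros(4), np.array([0.2, 0.7]))
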
Similarity-Guided Rapid Deployment of Federated Intelligence Over Heterogeneous Edge Computing

Hansong Zhou, Jingjing Fu, Yukun Yuan, Linke Guo, Xiaonan Zhang

INFOCOM '25: IEEE Conference on Computer Communications

Edge computing is envisioned to enable rapid federated intelligence on edge devices to satisfy their dynamically changing AI service demands. Semi-Asynchronous FL (Semi-Async FL) enables distributed learning in an asynchronous manner, where the server does not have to wait for all local models before improving the global model. Hence, it takes less time to train a well-performing global model. However, system heterogeneity in edge computing results in the staleness issue, which deteriorates training accuracy. In this paper, we propose to accelerate Semi-Async FL while ensuring training accuracy by designing a Similarity-Aware Aggregation (SAA) strategy. SAA enhances aggregation quality and thus decreases the wall-clock time, i.e., the training time required to reach a target accuracy. In particular, we leverage similarity to the global model to describe each local model's influence and let those with higher influence contribute more to global aggregation. We further measure the similarity between global model update deviations as directional similarity, which is then used to determine aggregation timing. We theoretically provide a convergence analysis for SAA. Our extensive experimental results show that the proposed SAA strategy reduces wall-clock time by up to 53.7% and wall-clock rounds by up to 59.4% for Semi-Async FL compared with several benchmark schemes.
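As a rough illustration of the idea, the sketch below weights arrived local models by their cosine similarity to the current global model and uses a directional-similarity test as an aggregation trigger; the weighting rule, threshold, and function names are assumptions for exposition, not the SAA algorithm as published.

# Minimal sketch, assuming flattened parameter vectors; not the exact SAA rule.
import numpy as np

def cosine(u, v, eps=1e-12):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

def similarity_weighted_aggregate(global_model, local_models):
    """Weight each arrived local model by its (non-negative) cosine similarity
    to the current global model, so more-aligned updates contribute more."""
    sims = np.array([max(cosine(m, global_model), 0.0) for m in local_models])
    if sims.sum() == 0.0:
        sims = np.ones(len(local_models))
    weights = sims / sims.sum()
    return sum(w * m for w, m in zip(weights, local_models))

def should_aggregate(prev_global_update, pending_update, threshold=0.9):
    """Directional-similarity trigger: aggregate once the pending update's
    direction deviates enough from the previous global update."""
    return cosine(prev_global_update, pending_update) < threshold

# Toy usage with flattened parameters
rng = np.random.default_rng(0)
g = np.ones(10)
locals_ = [g + 0.1 * rng.standard_normal(10) for _ in range(3)]
new_g = similarity_weighted_aggregate(g, locals_)
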
Waste not, want not: service migration-assisted federated intelligence for multi-modality mobile edge computing

Hansong Zhou, Shaoying Wang, Chutian Jiang, Linke Guo, Yukun Yuan, Xiaonan Zhang

MobiHoc '23: Proceedings of the Twenty-fourth International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing

Future mobile edge computing (MEC) is envisioned to provide federated intelligence to delay-sensitive learning tasks with multi-modal data. Conventional horizontal federated learning (FL) suffers from high resource demand in response to complicated multi-modal models. Multi-modal FL (MFL), on the other hand, offers a more efficient approach for learning from multi-modal data. In MFL, the entire multi-modal model is split into several sub-models, each tailored to a specific data modality and trained on a designated edge. As sub-models are considerably smaller than the multi-modal model, MFL requires fewer computation resources and reduces communication time. Nevertheless, deploying MFL over MEC faces the challenges of device mobility and edge heterogeneity, which, if not addressed, could negatively impact MFL performance. In this paper, we investigate a Service Migration-assisted Mobile Multi-modal Federated Learning (SM3FL) framework, where service migration of sub-models between edges is enabled. To effectively utilize both communication and computation resources without extravagance in SM3FL, we develop optimal strategies for service migration and data sample collection to minimize the wall-clock time, defined as the required training time to reach the learning target. Our experiment results show that the proposed SM3FL framework demonstrates remarkable performance, surpassing other state-of-the-art FL frameworks by substantially reducing the computing demand by 17.5% and the wall-clock time by 25.3%.
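For intuition, the toy sketch below makes a greedy per-round migration decision for a single sub-model, trading compute speed against migration delay; the Edge fields, the cost model, and the greedy rule are hypothetical simplifications, whereas SM3FL jointly optimizes migration and data sample collection.

# Hypothetical toy model; SM3FL's joint optimization is not reproduced here.
from dataclasses import dataclass

@dataclass
class Edge:
    name: str
    compute_rate: float      # training samples processed per second
    migration_delay: float   # seconds to migrate the sub-model to this edge

def pick_edge(edges, current_edge, samples_per_round):
    """Greedy rule: place the sub-model on the edge minimizing the estimated
    per-round time, i.e., compute time plus migration delay if we move."""
    def round_time(e):
        migrate = 0.0 if e.name == current_edge else e.migration_delay
        return samples_per_round / e.compute_rate + migrate
    return min(edges, key=round_time)

# Toy usage: a faster edge is worth migrating to only if the delay pays off
edges = [Edge("edge-A", 200.0, 0.0), Edge("edge-B", 500.0, 3.5)]
best = pick_edge(edges, current_edge="edge-A", samples_per_round=1000)
print(best.name)  # "edge-A": 5.0 s locally vs. 2.0 s + 3.5 s after migration
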
DQN-based QoE Enhancement for Data Collection in Heterogeneous IoT Network

Hansong Zhou, Sihan Yu, Linke Guo, Beatriz Lorenzo, Xiaonan Zhang

MASS '22: IEEE 19th International Conference on Mobile Ad Hoc and Smart Systems

Sensing data collection from Internet of Things (IoT) devices lays the foundation for massive IoT applications, such as patient monitoring in smart health and intelligent control in smart manufacturing. Unfortunately, the heterogeneity of IoT devices and dynamic environments results not only in long life-cycle latency but also in data collection failures, affecting the quality of experience (QoE) for all users. In this paper, we propose a recovery mechanism with a dynamic data contamination method to handle such failures. To further enhance the long-term overall QoE, we allocate spectrum resources and make contamination decisions for each device using a deep reinforcement learning method. In particular, a lightweight decentralized State-sharing Deep-Recurrent Q-Network (SDRQN) is proposed to find the optimal collection policies. Our simulation results indicate that the recurrent unit in SDRQN yields 10% lower waiting time and a 60% lower task drop rate than the fully connected design. Compared to a centralized DQN scheme, SDRQN achieves a similarly ultra-low drop rate of 0.29% but requires only 1% of the GPU memory, demonstrating the effectiveness of SDRQN in large-scale heterogeneous IoT networks.
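The sketch below shows the general shape of a recurrent Q-network of the kind the abstract describes, implemented in PyTorch with a GRU over a history of observations; the layer sizes, the GRU choice, and the way shared device states are concatenated into the observation are assumptions, not the SDRQN architecture itself.

# Illustrative only; dimensions and state-sharing details are assumptions.
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    """A DQN head on top of a GRU, so each device's policy can summarize a
    history of (shared) observations instead of a single frame."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim); shared neighbor states can simply
        # be concatenated into obs_dim before calling the network.
        x = torch.relu(self.encoder(obs_seq))
        out, h = self.gru(x, h0)
        return self.q_head(out[:, -1]), h  # Q-values for the last time step

# Toy usage: a 12-dim observation (own state plus shared stats), 5 actions
net = RecurrentQNet(obs_dim=12, n_actions=5)
q_values, hidden_state = net(torch.randn(2, 8, 12))
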
All publications