Hansong Zhou
5th-year Ph.D. Candidate
Department of Computer Science, Florida State University

Hansong Zhou is a fifth-year Ph.D. candidate in the Department of Computer Science at Florida State University, advised by Prof. Xiaonan Zhang. He received his master's degree in 2021 from the Department of Electrical & Computer Engineering at the University of Florida, and his bachelor's degree in 2019 from the Department of Electrical and Information Engineering at Xi'an Jiaotong University. His work contributes to efficient and reliable edge intelligence, particularly rapid deployment in large-scale scenarios and Large Language Model (LLM) inference acceleration.

Hansong's research focuses on distributed LLM training and inference at the edge, LLM fine-tuning for domain-specific expert agents, and large-scale collaborative edge systems. He currently owns the end-to-end delivery of an LLM-based AI Health Coach for rural health, spanning data pipeline design, model fine-tuning, prompt engineering, and final application deployment.


Education
  • Florida State University
    Department of Computer Science
    Ph.D. Candidate
    Aug. 2021 - present
  • University of Florida
    Department of Electrical & Computer Engineering
    M.S. in Computer Engineering
    Sep. 2019 - Jul. 2021
  • Xi'an Jiaotong University
    Department of Electrical and Information Engineering
    B.S. in Information Engineering
    Sep. 2015 - Jul. 2019
Experience
  • University of Electronic Science and Technology of China
    School of Information and Communication Engineering
    Research Intern
Jul. 2018 - Dec. 2018
Honors & Awards
  • Dean's Award for Doctoral Excellence (DADE), Florida State University
    2023 - 2025
Selected Publications
Fairness-Oracular MARL with Competitor-Aware Signals for Collaborative Inference

Hansong Zhou, Xiaonan Zhang

NeurIPS '25 - AI4NextG: The Thirty-Ninth Annual Conference on Neural Information Processing Systems

Collaborative inference (CI) in NextG networks enables battery-powered devices to collaborate with nearby edges on deep learning inference. The fairness issue in a multi-device multi-edge (M2M) CI system remains underexplored. Mean-field multi-agent reinforcement learning (MFRL) is a promising solution for its low complexity and adaptability to system dynamics. However, device mobility in M2M CI systems hinders its effectiveness, as it breaks the premise of stable mean-field statistics. We propose FOCI (Fairness-Oriented Collaborative Inference), an RL-based method with two components: (i) an oracle-shaping reward for approaching max-min fairness, and (ii) a competitor-aware observation augmentation for stabilizing device behaviors. We provide a convergence guarantee with bounded estimation errors. In experiments driven by real-world device mobility traces, FOCI shows the best performance on multiple metrics and tightens the tails. It reduces worst-case latency by up to 56% and worst-case energy by 46% versus baselines, while halving the switch cost and preserving competitive QoS.
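To make the two components concrete, below is a minimal, hypothetical Python sketch of an oracle-shaped reward targeting max-min fairness and a competitor-aware observation augmentation; the function names, the cost weight alpha, and the choice of competitor statistics are illustrative assumptions, not the paper's exact formulation.

# Illustrative sketch only; names and formulas are assumptions, not FOCI's exact design.
import numpy as np

def oracle_shaped_reward(latencies, energies, agent_id, alpha=0.5):
    """Shape an agent's reward toward max-min fairness: penalize the agent's
    own cost plus the system-wide worst-case cost seen by an 'oracle'."""
    costs = alpha * latencies + (1 - alpha) * energies
    return -(costs[agent_id] + costs.max())

def competitor_aware_observation(own_obs, competitor_loads):
    """Append summary statistics of the competitors contending for the same
    edge, which stabilizes learning when mobility makes mean-field
    statistics unreliable."""
    stats = np.array([competitor_loads.mean(),
                      competitor_loads.max(),
                      float(len(competitor_loads))], dtype=np.float32)
    return np.concatenate([own_obs.astype(np.float32), stats])

# Toy usage
lat = np.array([0.8, 1.5, 0.6]); eng = np.array([0.4, 0.9, 0.3])
r0 = oracle_shaped_reward(lat, eng, agent_id=0)
obs = competitor_aware_observation(np.zeros(4), np.array([0.2, 0.7]))
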
Similarity-Guided Rapid Deployment of Federated Intelligence Over Heterogeneous Edge Computing

Hansong Zhou, Jingjing Fu, Yukun Yuan, Linke Guo, Xiaonan Zhang

INFOCOM '25: IEEE Conference on Computer Communications

Edge computing is envisioned to enable rapid federated intelligence on edge devices to satisfy their dynamically changing AI service demands. Semi-Asynchronous FL (Semi-Async FL) enables distributed learning in an asynchronous manner, where the server does not have to wait for all local models before improving the global model. Hence, it takes less time to train a well-performing global model. However, system heterogeneity in edge computing results in the staleness issue, which deteriorates training accuracy. In this paper, we propose to accelerate Semi-Async FL while ensuring training accuracy by designing a Similarity-Aware Aggregation (SAA) strategy. SAA enhances aggregation quality and thus decreases the wall-clock time, i.e., the training time required to reach a target accuracy. In particular, we leverage similarity to the global model to describe each local model's influence and let those with higher influence contribute more to global aggregation. We further measure the similarity between global model update deviations as directional similarity, which is then used to determine aggregation timing. We theoretically provide a convergence analysis for SAA. Our extensive experimental results show that the proposed SAA strategy reduces wall-clock time by up to 53.7% and wall-clock rounds by up to 59.4% for Semi-Async FL compared with several benchmark schemes.
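As a rough illustration of the idea, the sketch below weights arrived local models by their cosine similarity to the current global model and uses a directional-similarity test as an aggregation trigger; the weighting rule, threshold, and function names are assumptions for exposition, not the SAA algorithm as published.

# Minimal sketch, assuming flattened parameter vectors; not the exact SAA rule.
import numpy as np

def cosine(u, v, eps=1e-12):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

def similarity_weighted_aggregate(global_model, local_models):
    """Weight each arrived local model by its (non-negative) cosine similarity
    to the current global model, so more-aligned updates contribute more."""
    sims = np.array([max(cosine(m, global_model), 0.0) for m in local_models])
    if sims.sum() == 0.0:
        sims = np.ones(len(local_models))
    weights = sims / sims.sum()
    return sum(w * m for w, m in zip(weights, local_models))

def should_aggregate(prev_global_update, pending_update, threshold=0.9):
    """Directional-similarity trigger: aggregate once the pending update's
    direction deviates enough from the previous global update."""
    return cosine(prev_global_update, pending_update) < threshold

# Toy usage with flattened parameters
rng = np.random.default_rng(0)
g = np.ones(10)
locals_ = [g + 0.1 * rng.standard_normal(10) for _ in range(3)]
new_g = similarity_weighted_aggregate(g, locals_)
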
Waste not, want not: service migration-assisted federated intelligence for multi-modality mobile edge computing

Hansong Zhou, Shaoying Wang, Chutian Jiang, Linke Guo, Yukun Yuan, Xiaonan Zhang

MobiHoc '23: Proceedings of the Twenty-fourth International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing

Future mobile edge computing (MEC) is envisioned to provide federated intelligence to delay-sensitive learning tasks with multi-modal data. Conventional horizontal federated learning (FL) suffers from high resource demand in response to complicated multi-modal models. Multi-modal FL (MFL), on the other hand, offers a more efficient approach for learning from multi-modal data. In MFL, the entire multi-modal model is split into several sub-models, each tailored to a specific data modality and trained on a designated edge. As sub-models are considerably smaller than the multi-modal model, MFL requires fewer computation resources and reduces communication time. Nevertheless, deploying MFL over MEC faces the challenges of device mobility and edge heterogeneity, which, if not addressed, could negatively impact MFL performance. In this paper, we investigate a Service Migration-assisted Mobile Multi-modal Federated Learning (SM3FL) framework, where service migration of sub-models between edges is enabled. To effectively utilize both communication and computation resources without extravagance in SM3FL, we develop optimal strategies for service migration and data sample collection to minimize the wall-clock time, defined as the required training time to reach the learning target. Our experiment results show that the proposed SM3FL framework demonstrates remarkable performance, surpassing other state-of-the-art FL frameworks by substantially reducing the computing demand by 17.5% and the wall-clock time by 25.3%.
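For intuition, the toy sketch below makes a greedy per-round migration decision for a single sub-model, trading compute speed against migration delay; the Edge fields, the cost model, and the greedy rule are hypothetical simplifications, whereas SM3FL jointly optimizes migration and data sample collection.

# Hypothetical toy model; SM3FL's joint optimization is not reproduced here.
from dataclasses import dataclass

@dataclass
class Edge:
    name: str
    compute_rate: float      # training samples processed per second
    migration_delay: float   # seconds to migrate the sub-model to this edge

def pick_edge(edges, current_edge, samples_per_round):
    """Greedy rule: place the sub-model on the edge minimizing the estimated
    per-round time, i.e., compute time plus migration delay if we move."""
    def round_time(e):
        migrate = 0.0 if e.name == current_edge else e.migration_delay
        return samples_per_round / e.compute_rate + migrate
    return min(edges, key=round_time)

# Toy usage: a faster edge is worth migrating to only if the delay pays off
edges = [Edge("edge-A", 200.0, 0.0), Edge("edge-B", 500.0, 3.5)]
best = pick_edge(edges, current_edge="edge-A", samples_per_round=1000)
print(best.name)  # "edge-A": 5.0 s locally vs. 2.0 s + 3.5 s after migration
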
DQN-based QoE Enhancement for Data Collection in Heterogeneous IoT Network

Hansong Zhou, Sihan Yu, Linke Guo, Beatriz Lorenzo, Xiaonan Zhang

MASS '22: IEEE 19th International Conference on Mobile Ad Hoc and Smart Systems

Sensing data collection from Internet of Things (IoT) devices lays the foundation for massive IoT applications, such as patient monitoring in smart health and intelligent control in smart manufacturing. Unfortunately, the heterogeneity of IoT devices and dynamic environments results not only in long life-cycle latency but also in data collection failures, affecting the quality of experience (QoE) for all users. In this paper, we propose a recovery mechanism with a dynamic data contamination method to handle such failures. To further enhance the long-term overall QoE, we allocate spectrum resources and make contamination decisions for each device using a deep reinforcement learning method. In particular, a lightweight decentralized State-sharing Deep-Recurrent Q-Network (SDRQN) is proposed to find the optimal collection policies. Our simulation results indicate that the recurrent unit in SDRQN yields 10% lower waiting time and a 60% lower task drop rate than the fully connected design. Compared to a centralized DQN scheme, SDRQN achieves a similarly ultra-low drop rate of 0.29% but requires only 1% of the GPU memory, demonstrating the effectiveness of SDRQN in large-scale heterogeneous IoT networks.
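The sketch below shows the general shape of a recurrent Q-network of the kind the abstract describes, implemented in PyTorch with a GRU over a history of observations; the layer sizes, the GRU choice, and the way shared device states are concatenated into the observation are assumptions, not the SDRQN architecture itself.

# Illustrative only; dimensions and state-sharing details are assumptions.
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    """A DQN head on top of a GRU, so each device's policy can summarize a
    history of (shared) observations instead of a single frame."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim); shared neighbor states can simply
        # be concatenated into obs_dim before calling the network.
        x = torch.relu(self.encoder(obs_seq))
        out, h = self.gru(x, h0)
        return self.q_head(out[:, -1]), h  # Q-values for the last time step

# Toy usage: a 12-dim observation (own state plus shared stats), 5 actions
net = RecurrentQNet(obs_dim=12, n_actions=5)
q_values, hidden_state = net(torch.randn(2, 8, 12))
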
All publications