Edge intelligence networks: from independent computation offloading to multi-agentic AI collaboration
This Distinguished Lecturer presentation focuses on multi-agentic AI collaboration. With the development of artificial intelligence (AI), deep neural network (DNN) inference has become a crucial computational task in edge intelligence networks. However, owing to the limited computing capacity and energy supply of IoT devices, as well as the task-offloading latency between IoT devices and edge servers, neither purely local inference nor centralized inference can meet the requirements of low latency and high energy efficiency. Distributed collaborative inference, built on multi-agentic AI collaboration among IoT devices and edge servers, therefore offers a promising solution. Such collaboration, however, faces key challenges, including unbalanced resource scheduling, redundant data exchange, and heterogeneous accuracy demands across different computational domains. This talk thus focuses on how to realize end-to-end collaboration among IoT devices and end-edge collaboration between IoT devices and edge servers.

For end-to-end collaborative inference, a padding-aware IoT device collaboration framework is proposed to achieve efficient data interaction and synchronized computation among devices. By jointly optimizing the model partitioning and padding-data transmission strategies, a latency minimization model is established and transformed into a solvable linear programming form.

For end-edge collaborative inference, an accuracy-aware multi-branch collaborative inference model is proposed to cope with diverse accuracy requirements and heterogeneous computational capacities. A mixed-integer nonlinear optimization model is formulated to jointly optimize DNN branch selection, task partitioning, and the allocation of computation and communication resources. To reduce the computational complexity, an efficient algorithm based on hierarchical decomposition and proportional-integral-derivative (PID) search is developed, achieving a dynamic trade-off between energy consumption and inference accuracy. Finally, a prototype system is developed on the NVIDIA Jetson platform to validate the effectiveness and practicality of the proposed schemes in collaborative inference scenarios.
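To give a flavor of the padding-aware partitioning idea, consider the simplest two-device case: each device computes a horizontal slice of a convolutional layer's input and must first receive a fixed band of "padding" (halo) rows from its neighbor. Minimizing the completion time is then a tiny linear program. This is only an illustrative sketch, not the formulation from the talk — the per-frame compute times, the halo-transfer time, and the variable layout are all assumed numbers:

```python
# Hypothetical two-device split of one conv layer's input rows.
# Variables z = [x1, x2, t]: the row fraction assigned to each device
# and the makespan t to be minimized.
from scipy.optimize import linprog

c = [2.0, 4.0]   # assumed per-full-frame compute time of each device (s)
p = 0.1          # assumed time to exchange the halo/padding rows (s)

res = linprog(
    c=[0.0, 0.0, 1.0],                     # objective: minimize t
    A_ub=[[c[0], 0.0, -1.0],               # c1*x1 + p <= t
          [0.0, c[1], -1.0]],              # c2*x2 + p <= t
    b_ub=[-p, -p],
    A_eq=[[1.0, 1.0, 0.0]], b_eq=[1.0],    # the two slices cover the frame
    bounds=[(0, 1), (0, 1), (0, None)],
)
x1, x2, t = res.x   # the faster device takes the larger slice
```

The optimum balances the two devices' compute times, so the device that is twice as fast receives two thirds of the rows; the halo-transfer time p shifts the makespan but not the split.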
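The PID-search step can likewise be illustrated in miniature: a PID controller adjusts a single compute-allocation knob until a toy accuracy model reaches the target, mimicking the dynamic trade-off between spending more compute (energy) and meeting an accuracy demand. The linear accuracy model and the gains below are assumptions for illustration only, not the model from the talk:

```python
def accuracy(u):
    # Assumed toy response: accuracy of the selected DNN branch grows
    # linearly with the compute allocation u (illustration only).
    return 0.5 + 0.05 * u

def pid_search(target, kp=4.0, ki=1.0, kd=0.1, steps=200):
    # PID search: drive the measured accuracy toward the target
    # accuracy by adjusting the compute allocation u.
    u, integral, prev_err = 0.0, 0.0, 0.0
    for _ in range(steps):
        err = target - accuracy(u)       # accuracy shortfall
        integral += err                   # accumulated shortfall (I term)
        u += kp * err + ki * integral + kd * (err - prev_err)
        prev_err = err
        u = max(u, 0.0)                   # allocation cannot be negative
    return u

u = pid_search(0.9)   # settles near u = 8, where accuracy(u) = 0.9
```

The controller overshoots and then damps out, so the allocation converges to the cheapest u that meets the accuracy target; in the talk's setting the same idea searches the accuracy-energy trade-off inside the hierarchical decomposition.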
Location
- Academic Lecture Hall, Engineering Building No. 2, School of Automation, Guangdong University of Technology
- Guangzhou, Guangdong
- China
Speakers
Prof. Li Ping Qian
Biography:
Li Ping Qian received the Ph.D. degree in Information Engineering from the Chinese University of Hong Kong in 2010. From 2010 to 2011, she worked as a postdoctoral research associate at the Chinese University of Hong Kong. Since 2011, she has been with the College of Information Engineering, Zhejiang University of Technology, Hangzhou, China, where she is currently a Full Professor. From 2016 to 2017, she was a visiting scholar with the Broadband Communications Research Group, ECE Department, University of Waterloo. Her research interests include wireless communication and networking, resource management in wireless networks, massive IoT, mobile edge computing, emerging multiple access techniques, and machine learning oriented towards wireless communications. She was a co-recipient of the IEEE Marconi Prize Paper Award in Wireless Communications in 2011, the Best Paper Award from IEEE ICC 2016, the Best Paper Award from the IEEE Communications Society TCGCC in 2017, the Best Paper Award from Digital Communications and Networks in 2021, and the Best Paper Award from IEEE WCNC 2023. She is a Distinguished Lecturer of the IEEE Vehicular Technology Society (2024-2026). She was an Associate Editor of IET Communications from 2016 to 2022 and is currently on the Editorial Boards of IEEE Wireless Communications Magazine, IEEE Transactions on Cognitive Communications and Networking, and the IEEE Internet of Things Journal.