Towards Sustainable AI Infrastructure
The way we obtain knowledge is changing dramatically with the evolution of AI techniques (e.g., large language models). AI-driven services are quickly becoming the backbone of today's cloud platforms. As more applications are driven by AI, data centers must evolve accordingly, and this shift translates directly into increasing energy demand, operating costs, and carbon emissions. Ideally, we seek pervasive data centers that serve new and emerging applications and scale without physical limitations to tackle grand scientific challenges, all while achieving carbon neutrality. I am building the next-generation intelligent system infrastructure based on modular data centers powered by renewable energy. This new infrastructure, named SkyMachine, aims to offer computing services for scientific challenges that existing infrastructures cannot solve. If successful, the project will fundamentally advance approaches for operating large-scale system infrastructure and accelerate the development of renewable energy-based computing systems.
I am working with my research group to conduct four major research activities: (1) build scalable modular data centers with various renewable energy sources, based on the insights that renewable energy production can be predicted with detailed weather models and that some sources tend to have complementary production patterns; (2) develop a software-defined, accelerator-centric hardware platform (V10-ISCA’23, NeuCloud-HotOS’23), in which we will support accelerator virtualization, trusted execution environments, direct communication between smart devices, and energy-aware operations; (3) present a self-driving system infrastructure to manage hardware resources and energy, in which we employ learning techniques to automate resource allocation, application placement and migration, and job scheduling; and (4) enable new and emerging applications, such as AI-driven climate prediction and materials discovery, on SkyMachine. To facilitate this research, we organized the First Workshop on Hot Topics in System Infrastructure (HotInfra’23) with our industry collaborators. We will continue HotInfra in the coming years and develop it into a flagship workshop in our community.
Here is a list of our recent publications:
- Hardware-Assisted Virtualization of Neural Processing Units for Cloud Platforms
Yuqi Xue, Yiqi Liu, Lifeng Nai, Jian Huang
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture (MICRO'24)
- Exploring the Efficiency of Renewable Energy-based Modular Data Centers at Scale
Jinghan Sun, Zibo Gong, Anup Agarwal, Shadi Noghabi, Ranveer Chandra, Marc Snir, Jian Huang
Preprint at arXiv, 2024
- Towards Building Sustainable AI Infrastructures with Modular Data Centers
Jian Huang, Deming Chen, Klara Nahrstedt, Ravishankar K. Iyer, Philip T. Krein
NSF Workshop on Sustainable Computing for Sustainability (NSF-WSCS'24)
- Sustainable AI Workload Scheduling and Operation Optimization in Hybrid Clouds
Klara Nahrstedt, Deming Chen, Jian Huang, Eun K. Lee, Asser Tantawi, Alaa S. Youssef, Tamar Eilam, Olivier Tardieu
NSF Workshop on Sustainable Computing for Sustainability (NSF-WSCS'24)
- G10: Enabling An Efficient Unified GPU Memory and Storage Architecture with Smart Tensor Migrations
Haoyang Zhang*, Eric Zhou*, Yuqi Xue, Yiqi Liu, Jian Huang
Proceedings of the 56th IEEE/ACM International Symposium on Microarchitecture (MICRO'23)
*co-primary authors
All four artifact badges (available, functional, reproduced, reusable) received
- System Virtualization for Neural Processing Units
Yuqi Xue, Yiqi Liu, Jian Huang
Proceedings of the 19th Workshop on Hot Topics in Operating Systems (HotOS'23)
- Hardware-Assisted System Virtualization for Neural Processing Units
Yuqi Xue, Yiqi Liu, Lifeng Nai, Jian Huang
The 1st Workshop on Hot Topics in System Infrastructure (HotInfra'23)
- V10: Hardware-Assisted NPU Multi-tenancy for Improved Resource Utilization and Fairness
Yuqi Xue, Yiqi Liu, Lifeng Nai, Jian Huang
Proceedings of the 50th International Symposium on Computer Architecture (ISCA'23)