Towards Sustainable AI Infrastructure

The way we obtain knowledge is changing dramatically with the evolution of AI techniques (e.g., large language models). AI-driven services are quickly becoming the backbone of today's cloud platforms. As more applications are driven by AI, data centers must evolve accordingly, which translates directly into increasing energy demand, operating costs, and carbon emissions. Ideally, we want pervasive data centers that serve new and emerging applications and scale without physical limitations to tackle grand scientific challenges, all while achieving carbon neutrality. I am building the next-generation intelligent system infrastructure based on modular data centers powered by renewable energy. This new infrastructure, named SkyMachine, aims to offer computing services for scientific challenges that existing infrastructures cannot solve. If successful, the project will fundamentally advance approaches to operating large-scale system infrastructure and accelerate the development of renewable energy-based computing systems.

I am working with my research group on four major research activities: (1) building scalable modular data centers with various renewable energy sources, based on the insights that renewable energy production can be predicted with detailed weather models and that some sources tend to have complementary production patterns; (2) developing a software-defined, accelerator-centric hardware platform (V10-ISCA’23, NeuCloud-HotOS’23) that will support accelerator virtualization, trusted execution environments, direct communication between smart devices, and energy-aware operations; (3) presenting a self-driving system infrastructure to manage hardware resources and energy, in which we employ learning techniques to automate resource allocation, application placement and migration, and job scheduling; and (4) enabling new and emerging applications, such as AI-driven climate prediction and materials discovery, on SkyMachine. To facilitate this research, we organized the First Workshop on Hot Topics in System Infrastructure (HotInfra’23) with our industry collaborators. We will continue HotInfra in the coming years and develop it into a flagship workshop in our community.

Here is a list of our recent publications: