Jian's webpage

MICRO • 07/24/2023

G10: Enabling An Efficient Unified GPU Memory and Storage Architecture with Smart Tensor Migrations

G10 integrates the host memory, GPU memory, and flash memory into a unified memory space, to scale the GPU memory capacity while enabling transparent data migrations. Based on this unified GPU memory and storage architecture, G10 utilizes compiler techniques to characterize the tensor behaviors in deep learning workloads and schedule data migrations in advance.
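As a rough illustration of scheduling migrations in advance (a minimal sketch, not G10's actual algorithm), suppose the compiler has already produced, for each tensor, the ordered kernel steps at which it is used. The function name `plan_migrations`, the `uses` dictionary, and the `min_gap` and `prefetch_lead` parameters are all hypothetical and chosen only for this example.

```python
def plan_migrations(uses, min_gap=3, prefetch_lead=1):
    """Plan evictions and prefetches from a static tensor-usage trace.

    uses: dict mapping a tensor name to the kernel steps that touch it
          (hypothetical input; assumed to come from compiler analysis).
    min_gap: smallest idle period, in kernel steps, worth migrating for.
    prefetch_lead: how many steps before the next use to start the prefetch.
    Assumes min_gap > prefetch_lead so a prefetch never precedes its eviction.
    """
    plan = []
    for name, steps in uses.items():
        steps = sorted(steps)
        # Walk consecutive uses and look for long inactive periods.
        for prev, nxt in zip(steps, steps[1:]):
            if nxt - prev >= min_gap:
                # Evict right after the last use, bring back just before the next.
                plan.append((prev + 1, "evict", name))
                plan.append((nxt - prefetch_lead, "prefetch", name))
    return sorted(plan)


if __name__ == "__main__":
    trace = {"act0": [0, 9], "weight": [1, 2, 3]}
    for step, action, tensor in plan_migrations(trace):
        print(step, action, tensor)
```

Here `act0` is idle between steps 0 and 9, so the sketch evicts it at step 1 and prefetches it at step 8, while `weight` is used too frequently to be worth moving. A real system would also weigh tensor size and the bandwidth of each memory tier, which this sketch ignores.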

MICRO • 07/24/2023

Learning to Drive Software-Defined SSDs

We present an automated learning-based SSD hardware configuration framework, named AutoBlox, that utilizes both supervised and unsupervised machine learning (ML) techniques to drive the tuning of hardware configurations for SSDs. AutoBlox automatically extracts the unique access patterns of a new workload using its block I/O traces, maps the workload to previous workloads for utilizing the learned experiences, and recommends an optimized SSD configuration based on the validated storage performance.

SOSP • 07/16/2023

RackBlox: A Software-Defined Rack-Scale Storage System with Network/Storage Co-design

We co-design the SDN and SDF stack by re-defining the functions of their control plane and data plane, and splitting them up within a new architecture named RackBlox. RackBlox decouples the storage management functions of flash-based solid-state drives (SSDs), and allows the SDN to track and manage the states of SSDs in a rack. Therefore, we can enable state sharing between SDN and SDF, and facilitate global storage resource management.

Research • 06/18/2023

The First Workshop on Hot Topics in System Infrastructure

The Workshop on Hot Topics in System Infrastructure (HotInfra'23) provides a unique forum for cutting-edge research on system infrastructure and platforms. Researchers and engineers can share their recent research results and experiences and discuss new challenges and opportunities in building next-generation system infrastructures, such as AI infrastructure, software-defined data centers, and edge/cloud computing infrastructure. The topics span the full system stack with a focus on the design and implementation of system infrastructures. Relevant topics include hardware architecture, operating systems, runtime systems, and emerging applications.

HotOS • 04/21/2023

System Virtualization for Neural Processing Units

Modern cloud platforms have been employing hardware accelerators such as neural processing units (NPUs) to meet the increasing demand for computing resources for AI-based application services. However, due to the lack of system virtualization support, the current way of using NPUs in cloud platforms suffers from either low resource utilization or poor isolation between multi-tenant application services. In this paper, we investigate the system virtualization techniques for NPUs across the entire software and hardware stack, and present our NPU virtualization solution. We propose a flexible NPU abstraction named vNPU that allows fine-grained NPU virtualization and resource management.

ISCA • 03/09/2023

V10: Hardware-Assisted NPU Multi-tenancy for Improved Resource Utilization and Fairness

We present V10, a hardware-assisted NPU multi-tenancy framework for improving resource utilization, while ensuring fairness for different ML services. We rethink the NPU architecture for supporting multi-tenancy. V10 employs an operator scheduler for enabling concurrent operator executions on the systolic array and the vector unit, and offers flexibility for enforcing different priority-based resource-sharing mechanisms. V10 also enables fine-grained operator preemption and lightweight context switch.

ASPLOS • 09/22/2022

LeaFTL: A Learning-based Flash Translation Layer for Solid-State Drives

We present a learning-based flash translation layer (FTL), named LeaFTL, which learns the address mapping to tolerate dynamic data access patterns via linear regression at runtime. By grouping a large set of mapping entries into a learned segment, it significantly reduces the memory footprint of the address mapping table, which further benefits the data caching in SSD controllers.
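To make the idea of a learned segment concrete, here is a minimal sketch, not the LeaFTL implementation. It assumes the mapping is a list of (logical page, physical page) pairs sorted by logical page, and it fits each segment with a simple line through the segment's first and last points rather than a full least-squares regression. The names `Segment`, `build_segments`, `lookup`, and the `max_error` bound are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    start_lpa: int   # first logical page address covered
    end_lpa: int     # last logical page address covered
    slope: float     # learned slope of the LPA -> PPA line
    base_ppa: int    # physical page address of start_lpa

    def predict(self, lpa):
        return self.base_ppa + round(self.slope * (lpa - self.start_lpa))


def build_segments(mappings, max_error=0):
    """Greedily group sorted (lpa, ppa) pairs into linear segments.

    A segment is extended while every covered pair is predicted
    within max_error pages of its true physical address.
    """
    segments, i, n = [], 0, len(mappings)
    while i < n:
        start_lpa, base_ppa = mappings[i]
        best_end, best_slope = i, 0.0
        for j in range(i + 1, n):
            lpa_j, ppa_j = mappings[j]
            slope = (ppa_j - base_ppa) / (lpa_j - start_lpa)
            fits = all(
                abs(base_ppa + round(slope * (l - start_lpa)) - p) <= max_error
                for l, p in mappings[i:j + 1]
            )
            if not fits:
                break
            best_end, best_slope = j, slope
        segments.append(
            Segment(start_lpa, mappings[best_end][0], best_slope, base_ppa)
        )
        i = best_end + 1
    return segments


def lookup(segments, lpa):
    """Return the predicted PPA, or None if no segment covers lpa."""
    starts = [s.start_lpa for s in segments]
    idx = bisect_right(starts, lpa) - 1
    if idx < 0 or lpa > segments[idx].end_lpa:
        return None
    return segments[idx].predict(lpa)


if __name__ == "__main__":
    table = [(0, 100), (1, 101), (2, 102), (3, 103),
             (10, 500), (11, 501), (12, 502)]
    segs = build_segments(table)
    print(len(segs), lookup(segs, 2), lookup(segs, 11), lookup(segs, 5))
```

In this example seven mapping entries collapse into two segments, which is where the memory saving comes from. Note that the sketch does not track which addresses inside a segment's range were actually written, so a real design needs extra metadata to handle gaps and mispredictions.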

Workshop • 04/22/2024

The 2nd HotInfra workshop will be co-located with SOSP'24 in Austin!

Award • 04/10/2024

I am honored to receive the Dean's Award for Early Innovation!

Grant • 03/06/2024

We received a grant from DoE CESER to work on systems/firmware vulnerability discovery and mitigation!

Award • 10/31/2023

I am deeply honored to receive the inaugural ACM SIGMICRO Early Career Award!

Grant • 10/18/2023

Our storage research for data centers receives a gift fund award from Google!

Publication • 07/24/2023

Jian Huang

Our research on enabling an efficient unified GPU memory/storage architecture will appear at MICRO'23!

Our research on "Learning to Drive Software-Defined SSDs" will appear at MICRO'23!

Our software-defined rack-scale storage system will appear at SOSP'23!

We successfully launched the first HotInfra workshop!

I'm honored to be named the Y. T. Lo Faculty Fellow in Electrical and Computer Engineering!

Our preliminary study on NPU virtualization was accepted to HotOS'23!

Our research on hardware-assisted NPU multi-tenancy was accepted to ISCA'23!
