Projects

Research projects in the ACES Lab

WHOLEPRO: An Online, Holistic Job Scheduling and Resource Provisioning Framework for Datacenter Architectures and Applications

The goal of this project is to develop datacenter solutions that provide user-centric services meeting the diverse Quality-of-Experience (QoE) requirements of individual customers, while allowing user-utility-aware fair resource allocation.

The project is funded by NSF under award CCF SHF-1704504.


PACCP: Price-based, Quality-of-Service (QoS) Aware Transport Protocols for Datacenters

The goal of this project is to develop a highly scalable, readily implementable, price-aware, host-based flow control solution, possibly with minimal assistance from top-of-rack (ToR) switches for performance enhancement.

The project is funded by Alibaba Group.


Data De-duplication in Distributed Object Storage Systems Used for Cold Storage

The goal of this project is to study new technologies that improve storage efficiency in modern storage systems. We have been investigating several methods toward this goal; data de-duplication and erasure coding are the current focus areas.
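
As a rough illustration of the general idea behind data de-duplication (not the project's actual design), the sketch below stores each fixed-size chunk once, keyed by its SHA-256 digest. The `DedupStore` class, its method names, and the 4 KiB chunk size are all hypothetical choices made for this example.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes


class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}   # SHA-256 hex digest -> chunk bytes
        self.objects = {}  # object name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into fixed-size chunks and store only unseen chunks."""
        digests = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.objects[name] = digests

    def get(self, name):
        """Reassemble an object from its chunk digests."""
        return b"".join(self.chunks[d] for d in self.objects[name])

    def stored_bytes(self):
        """Physical bytes held after de-duplication."""
        return sum(len(c) for c in self.chunks.values())


# Two objects that share a chunk: five logical chunks, two stored chunks.
store = DedupStore()
block = b"A" * CHUNK_SIZE
store.put("obj1", block * 3)
store.put("obj2", block + b"B" * CHUNK_SIZE)
```

Real systems typically add content-defined chunking, reference counting for deletion, and a persistent index; this sketch omits all of those for brevity.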

The project is supported by NetApp.

APP/User-Defined Dual-Level I/O Scheduling on SmartNIC with Fairness and Throughput SLO Guarantee

This project focuses on application/user-level (dual-level) fair I/O sharing under SLOs and fairness policies. To optimize user-experienced I/O fairness and throughput, we adopt an application/user-defined dual-level I/O scheduling mechanism that enables run-time control of I/O scheduling through programmable interfaces exported to an application/user-defined optimization module. The optimization module performs user parallelism optimization and bursty I/O regulation under predefined dual-level SLOs and fairness policies, and dynamically generates optimized settings to guide the dual-level I/O scheduling. To make the solution cost-effective and non-intrusive, we offload the dual-level I/O scheduling task to SmartNICs, so that server applications and OS kernels need not be modified. In addition, the proposed on-SmartNIC building blocks on different nodes can work collaboratively to serve distributed server applications running concurrently across nodes.

Online Proactive Multi-Percentile Tail Latency SLO Enforcement for End Users

This project aims to implement an application-pinned resource planning and scheduling system, MPSLO, that provides multi-percentile (MP) tail latency QoS for end users. MPSLO decomposes tail latency SLO enforcement into three complementary mechanisms: instantaneous tail latency SLO conversion, online proactive resource planning, and dual-group cooperative I/O scheduling. Together, they translate MP tail latency SLOs into mean latency budgets and then adaptively enforce these SLOs under highly variable user I/O patterns and service capacity, with minimal resource over-provisioning. An extensive evaluation on a real key-value store, MongoDB, shows that MPSLO can effectively plan resource allocation for latency-sensitive users under the SLO constraints and take full advantage of instantaneous resource slack to improve other users’ experience.
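
The conversion MPSLO actually uses is not described here. Purely as an illustration of how several percentile targets can collapse into a single mean latency budget, the sketch below assumes exponentially distributed latency, an assumption made only for this example. Under that assumption, a mean of m gives a p-th percentile of -m ln(1 - p), so each SLO bounds the mean, and the tightest bound wins. The function name `mean_latency_budget` and the SLO values are hypothetical.

```python
import math


def mean_latency_budget(slos):
    """Return the largest mean latency that satisfies every percentile SLO.

    slos maps a percentile p (0 < p < 1) to its latency bound.
    Assumes exponentially distributed latency (illustrative only), where
    the p-th percentile of a distribution with mean m is -m * ln(1 - p).
    """
    return min(bound / -math.log(1.0 - p) for p, bound in slos.items())


# Hypothetical SLOs: 95th percentile <= 30 ms, 99th percentile <= 40 ms.
slos = {0.95: 30.0, 0.99: 40.0}
budget = mean_latency_budget(slos)
```

With these example values, the 99th-percentile target is the binding constraint, so it alone determines the budget. A scheduler could then track the observed mean against this budget, which is far cheaper to estimate online than a tail percentile.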