Abstract:
Modern software stacks run concurrent, heterogeneous workloads, each with bespoke constraints. This is especially true of emerging edge applications such as AR/VR, robotics, and autonomous vehicles. In response to these demands, hardware has shifted towards pervasive specialization, making development and cross-stack integration increasingly challenging. This shift raises pressing questions at the core of modern computing: How do we enable scalable specialization in modern SoCs? How do we design and integrate heterogeneous accelerators while ensuring performance scalability through efficient resource management and adaptability across system layers?
In this talk, I will present my research addressing these interconnected challenges of scalable specialization through full-stack, system-level approaches. (1) First, I will introduce Gemmini, an award-winning, widely used DNN accelerator generator that enables agile, full-stack accelerator evaluation. Gemmini allows researchers to explore the specialized accelerator design space in the context of a full SoC. (2) Next, I will present AuRORA, an award-winning virtualized accelerator integration approach with dynamic resource allocation that lays the foundation for accelerator-rich SoCs. AuRORA introduces a redesigned CPU-accelerator interface that enables fast and flexible resource repartitioning, along with a runtime system that abstracts physical accelerators into a unified virtualized resource pool. (3) Then, I will introduce SuperNoVA, an algorithm-hardware co-design for real-time, dynamic workloads on resource-constrained platforms, using SLAM as a target workload. SuperNoVA tackles the challenge of balancing accuracy and real-time execution with an adaptive algorithm for large-scale SLAM. (4) Finally, I will showcase a silicon test chip I taped out that integrates these innovations. Silicon validation with real workloads demonstrates the feasibility of scalable specialization.
By bridging hardware design, system software, application algorithms, and silicon validation, my research enables adaptive, accelerator-rich computing platforms for modern edge applications. I aim to transform edge SoC design by combining design-time hardware-software co-optimization with runtime adaptive resource management, achieving the best of both static specialization and dynamic flexibility to address the evolving demands of future edge platforms.
Bio:
