From CPUs and GPUs to AI Accelerators: How Chiplets and UCIe are Reshaping Semiconductor Design

Amiee
Apr 29

The Crossroads of Semiconductor Design: Why Chiplets?

For decades, the semiconductor industry marched to the beat of Moore's Law, integrating ever-increasing functionality and performance onto single chips by shrinking transistors. However, as physical limits approach, enhancing chip capabilities solely through process scaling has become increasingly difficult and expensive. Large, complex monolithic Systems-on-Chip (SoCs) face significant manufacturing hurdles, particularly concerning yield. The larger the chip area, the higher the risk that a single defect renders the entire die useless, diminishing the cost-effectiveness of advanced process nodes.
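The link between die area and yield can be made concrete with a simple Poisson defect model, a common first-order approximation rather than anything a specific foundry publishes. In the sketch below, the function name `poisson_yield` and the defect density `D0` are illustrative assumptions, not figures from any real process.

```python
import math


def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model.

    Y = exp(-A * D0), with A in cm^2 and D0 in defects per cm^2.
    """
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


if __name__ == "__main__":
    D0 = 0.1  # assumed defect density (defects/cm^2), for illustration only
    for area in (100, 200, 400, 800):
        print(f"{area:4d} mm^2 -> yield {poisson_yield(area, D0):.1%}")
```

Under this model, yield falls exponentially as area grows, which is why very large monolithic dies are disproportionately expensive.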

Furthermore, different functional blocks (like CPU cores, GPU cores, I/O units, memory controllers, etc.) have varying optimal process technology requirements. For instance, CPU cores might demand the highest clock speeds, necessitating the most advanced logic processes, while I/O units might prioritize stability and cost, making mature processes suitable. Forcing all functions onto the single most advanced, and thus most expensive, process is not only costly but potentially suboptimal technologically. These bottlenecks collectively drove the urgent need for new chip design methodologies, giving rise to the concept of chiplets.

The Core Concept of Chiplets: Building Blocks for Chips

The core idea behind chiplets is quite intuitive: instead of striving to integrate all functions onto one massive monolithic die, the large chip is broken down into several functionally independent, smaller dies. These individual dies are the chiplets. Each chiplet can specialize in a specific function, such as CPU computation, graphics processing, high-speed I/O, or cache memory. Crucially, different chiplets can be manufactured using the process technology best suited for their function. For example, performance-critical CPU chiplets can utilize cutting-edge 5nm or 3nm processes, while cost- and power-sensitive I/O chiplets can opt for relatively mature and lower-cost 12nm or 22nm processes.

Once manufactured, these chiplets, originating from different processes and serving distinct functions, are precisely assembled and interconnected via advanced packaging techniques on an interposer or substrate, forming a complete System-in-Package (SiP). This approach resembles constructing a complex system using standardized "chip Legos," offering unprecedented design flexibility and cost benefits. Chiplets allow designers to mix and match different functional units more flexibly, accelerating product development cycles. Moreover, the smaller area of individual chiplets significantly improves manufacturing yields, lowering overall costs.
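To see how smaller dies and mixed process nodes can lower cost, here is a rough sketch that compares the silicon cost of one large die against several compute chiplets plus an I/O chiplet. Every number in it (cost per mm², defect densities, packaging overhead) is a hypothetical placeholder, and the Poisson yield model is a simplification; it also ignores that a block usually takes more area on a mature node.

```python
import math


def die_yield(area_mm2: float, d0_per_cm2: float) -> float:
    """Poisson yield model: probability that a die has no defects."""
    return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)


def cost_per_good_die(area_mm2: float, cost_per_mm2: float, d0_per_cm2: float) -> float:
    """Silicon cost of a die divided by its yield (bad dies are paid for too)."""
    return area_mm2 * cost_per_mm2 / die_yield(area_mm2, d0_per_cm2)


# Hypothetical process parameters, for illustration only.
ADVANCED = {"cost_per_mm2": 0.30, "d0_per_cm2": 0.15}
MATURE = {"cost_per_mm2": 0.08, "d0_per_cm2": 0.05}


def monolithic_cost(total_area_mm2: float) -> float:
    """Everything on one die, built on the advanced node."""
    return cost_per_good_die(total_area_mm2, **ADVANCED)


def chiplet_cost(compute_area_mm2: float, n_compute: int,
                 io_area_mm2: float, packaging_overhead: float) -> float:
    """Compute chiplets on the advanced node, one I/O chiplet on the mature node."""
    compute = n_compute * cost_per_good_die(compute_area_mm2, **ADVANCED)
    io = cost_per_good_die(io_area_mm2, **MATURE)
    return compute + io + packaging_overhead


if __name__ == "__main__":
    print(f"monolithic 600 mm^2: {monolithic_cost(600):.1f}")
    print(f"4 x 100 mm^2 + I/O:  {chiplet_cost(100, 4, 200, 40.0):.1f}")
```

With these assumed inputs the chiplet version is cheaper for a large design, but the fixed packaging overhead can make it more expensive for a small one, so the benefit depends on die size.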

The Critical Link: From Proprietary Tech to the UCIe Standardization Path

While the chiplet concept is appealing, chiplets from different vendors and processes can only work together if they share a common "language": a standardized die-to-die (D2D) interconnect interface. Early on, each camp built its own. Intel developed AIB (Advanced Interface Bus), AMD developed Infinity Fabric, and the Open Compute Project's ODSA group proposed BoW (Bunch of Wires). Each worked well within its own ecosystem, but they did not interoperate with one another, creating technological silos that limited the growth of the chiplet ecosystem.

To break down these barriers and foster an open, interoperable chiplet ecosystem, the industry urgently needed a unified standard. This led to the birth of the Universal Chiplet Interconnect Express (UCIe) standard. Initiated and promoted by numerous industry giants including Intel, AMD, Arm, Qualcomm, TSMC, Samsung, ASE, and others, the UCIe consortium aims to define an open, standardized D2D interconnect specification, enabling seamless integration of chiplets designed and manufactured by different companies. The advent of UCIe is considered a critical milestone in chiplet development, laying the foundation for their large-scale adoption and the flourishing of the ecosystem.

UCIe Technology Unveiled: The Universal Chiplet Interconnect Expressway

The UCIe standard defines a complete, layered D2D interconnect protocol stack designed for high-bandwidth, low-latency, and low-power communication between chiplets. Its main components include:

Physical Layer (PHY): Handles raw electrical signaling between dies. UCIe defines different PHY options to support various packaging technologies and use cases, including standard packages (e.g., organic substrates) and advanced packages (e.g., 2.5D/3D packaging using silicon interposers, silicon bridges, or RDL fan-out layers, such as TSMC's CoWoS or Intel's EMIB/Foveros). This lets a design trade cost against performance.
Die-to-Die Adapter: Sits between the protocol layer and the PHY. It manages link state and parameter negotiation and provides error detection (CRC) with retry where needed, ensuring the reliability and stability of the D2D connection.
Protocol Layer: Defines the rules for how data is transferred between chiplets. A key feature of UCIe is its protocol layer flexibility; it natively supports the widely used PCI Express (PCIe) and Compute Express Link (CXL) protocols. PCIe is primarily used for I/O connectivity, while CXL enables more efficient memory sharing and coherency, which is particularly important for CPUs, GPUs, and AI accelerators handling large datasets. By supporting these standard protocols, UCIe allows chiplet communication patterns to integrate seamlessly with existing system architectures.

UCIe aims to provide a "plug-and-play" chiplet interconnect solution, significantly reducing the complexity and cost of integrating chiplets from different sources and accelerating time-to-market for innovative products.
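The layering described above can be illustrated with a toy model. This is not the UCIe flit format or any real implementation: the one-byte protocol header, the CRC-32 choice, the byte-wise lane striping, and all function names are assumptions made only to show how responsibilities are split between the layers.

```python
import zlib
from typing import List

PROTOCOL_IDS = {"PCIe": 0, "CXL": 1}  # toy identifiers, not from the spec


def protocol_layer_pack(payload: bytes, protocol: str) -> bytes:
    """Protocol layer: tag the payload with the protocol it belongs to."""
    return bytes([PROTOCOL_IDS[protocol]]) + payload


def adapter_add_crc(flit: bytes) -> bytes:
    """Adapter (sender side): append a CRC so the receiver can detect errors."""
    return flit + zlib.crc32(flit).to_bytes(4, "big")


def adapter_check_crc(frame: bytes) -> bytes:
    """Adapter (receiver side): verify the CRC and strip it."""
    flit, crc = frame[:-4], frame[-4:]
    if zlib.crc32(flit).to_bytes(4, "big") != crc:
        raise ValueError("CRC mismatch: a real adapter would request a retry")
    return flit


def phy_stripe(frame: bytes, lanes: int) -> List[bytes]:
    """PHY (sender side): distribute bytes round-robin across parallel lanes."""
    return [frame[i::lanes] for i in range(lanes)]


def phy_unstripe(lane_data: List[bytes]) -> bytes:
    """PHY (receiver side): interleave the lanes back into one byte stream."""
    lanes = len(lane_data)
    out = bytearray(sum(len(d) for d in lane_data))
    for i, data in enumerate(lane_data):
        out[i::lanes] = data
    return bytes(out)


if __name__ == "__main__":
    frame = adapter_add_crc(protocol_layer_pack(b"cache line data", "CXL"))
    received = phy_unstripe(phy_stripe(frame, lanes=16))
    print(adapter_check_crc(received))
```

Each layer only needs to understand the one directly below it, which is what lets the same PHY and adapter carry either PCIe or CXL traffic.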

The Trade-offs of Chiplet Design: Advantages and Challenges

Feature	Monolithic SoC	Chiplet Design
Design Flexibility	Low; all functions tied to a single process	High; mix-and-match chiplets from different processes and for different functions
Manufacturing Cost	High; yield is critical, especially for large advanced nodes	Potentially lower; smaller die size leads to higher yield; optimal process per function
Yield	Lower; larger area increases defect impact	Higher; smaller dies yield better, and faulty chiplets can be screened out before assembly
Time-to-Market	Long; complex design and verification cycles	Potentially shorter; reuse of validated chiplet IP accelerates design/validation
Performance (Potential)	Potentially highest (lowest on-die latency)	Must overcome D2D interconnect latency; higher aggregate performance possible via more specialized cores/memory
Technological Complexity	High design and verification complexity	New complexities in packaging, D2D interconnects, thermal management, testing (KGD)
Ecosystem	Relatively closed (single vendor dominance)	Trending towards open; UCIe standard promotes interoperability
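The performance row in the table above comes down to how often data has to cross a die boundary. A minimal sketch of that trade-off follows, assuming a fixed extra delay per D2D crossing; the function names and all latency values are hypothetical and chosen only to illustrate the arithmetic.

```python
def average_latency_ns(on_die_ns: float, d2d_penalty_ns: float,
                       remote_fraction: float) -> float:
    """Average access latency when a fraction of accesses cross a D2D link."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return on_die_ns + remote_fraction * d2d_penalty_ns


def max_remote_fraction(on_die_ns: float, d2d_penalty_ns: float,
                        budget_ns: float) -> float:
    """Largest share of cross-chiplet accesses that still meets a latency budget."""
    if d2d_penalty_ns <= 0:
        raise ValueError("d2d_penalty_ns must be positive")
    if budget_ns < on_die_ns:
        return 0.0
    return min(1.0, (budget_ns - on_die_ns) / d2d_penalty_ns)


if __name__ == "__main__":
    # Assumed numbers: 10 ns on-die access, 5 ns extra per D2D crossing.
    for fraction in (0.0, 0.2, 0.5, 1.0):
        print(fraction, average_latency_ns(10.0, 5.0, fraction))
```

The model suggests why data placement matters in chiplet architectures: keeping most accesses local to a chiplet keeps the average close to monolithic latency.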

Challenges in Practice: Key Considerations for Chiplet Design

Despite the immense potential offered by chiplets and UCIe, several key challenges need addressing for practical implementation:

Advanced Packaging Technologies: Precisely packaging multiple chiplets together requires sophisticated and costly advanced packaging techniques, such as 2.5D (using silicon interposers) or 3D packaging. The cost, capacity, and yield of these technologies remain areas for continuous optimization.
Interconnect Performance: While UCIe strives to minimize D2D latency and power consumption, the communication speed and delay between chiplets can rarely match the performance of connections within a monolithic die. Architecturally mitigating or hiding this latency is a crucial consideration for high-performance computing applications.
Thermal Management: Tightly stacking or arranging multiple high-performance chiplets generates significant heat density. Efficiently dissipating this heat to prevent thermal throttling or damage poses a major challenge for packaging and system design.
Testing and Validation: Ensuring each chiplet is a Known Good Die (KGD) before packaging is essential. Developing efficient and reliable KGD test methods, along with system-level testing and validation of the entire SiP after packaging, adds extra complexity.
Standardization and Ecosystem Maturity: Although UCIe provides a foundational standard, the complete chiplet ecosystem still needs time to mature, including the diversity of chiplet IP, design tool support, and supply chain integration.

Overcoming these challenges requires concerted effort and continuous innovation across the entire semiconductor supply chain.
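The importance of Known Good Die testing can be shown with a small probability sketch. It assumes chiplet defects are independent, that a test catches a defective die with probability `test_coverage`, and that good dies always pass; these assumptions, the function names, and the example values are all illustrative.

```python
from typing import Iterable, Tuple


def prob_good_given_pass(die_yield: float, test_coverage: float) -> float:
    """Probability that a die which passed pre-assembly test is really good.

    Bad dies slip through with probability (1 - test_coverage).
    """
    escapes = (1.0 - die_yield) * (1.0 - test_coverage)
    return die_yield / (die_yield + escapes)


def package_yield(chiplets: Iterable[Tuple[float, float]],
                  assembly_yield: float) -> float:
    """Yield of the assembled package: every chiplet must be good and
    the assembly step itself must succeed.

    chiplets: (die_yield, test_coverage) pairs, one per chiplet placed.
    """
    result = assembly_yield
    for die_yield, coverage in chiplets:
        result *= prob_good_given_pass(die_yield, coverage)
    return result


if __name__ == "__main__":
    untested = [(0.90, 0.0)] * 4   # no screening before assembly
    screened = [(0.90, 0.99)] * 4  # assumed 99% test coverage
    print(f"untested: {package_yield(untested, 0.98):.1%}")
    print(f"screened: {package_yield(screened, 0.98):.1%}")
```

Because one bad chiplet scraps the whole package, losses compound with chiplet count, and pre-assembly screening recovers most of that loss in this model.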

Reshaping Future Chips: The Application Landscape of Chiplets in CPUs, GPUs, and AI

The flexibility of the chiplet architecture makes it highly promising for various processor designs:

CPUs (Central Processing Units): AMD's Ryzen and EPYC processors exemplify the successful application of chiplet design. They combine high-performance CPU core chiplets (using advanced processes) with a separate I/O chiplet (using a more mature process), achieving excellent performance, scalability, and cost-effectiveness. Intel is also actively embracing chiplets (which they call "tiles"), as seen in Meteor Lake and future processor architectures, integrating different functional tiles using advanced packaging like Foveros.
GPUs (Graphics Processing Units): As GPUs grow larger, monolithic designs face similar yield and cost pressures. Chiplet design allows GPU manufacturers to break down massive graphics compute cores into smaller chiplets, making it easier to achieve higher core counts and performance scaling while maintaining reasonable costs. AMD's Instinct MI200 and MI300 accelerators and its RDNA 3 Radeon GPUs already use multi-die designs, and more flagship GPUs are widely expected to follow.
AI Accelerators: Artificial intelligence and machine learning applications require processing vast amounts of data and performing intensive matrix operations. The chiplet architecture is ideally suited for AI accelerator design, enabling flexible combinations of numerous compute units, high-speed memory (like HBM), and I/O interfaces in chiplet form. For instance, specialized AI compute chiplets can be paired with high-bandwidth memory chiplets and high-speed interconnect chiplets to create accelerators highly optimized for specific AI workloads. The CXL protocol support within UCIe further provides efficient memory sharing and expansion capabilities crucial for AI accelerators.

Chiplets are not just changing how individual chips are designed; they could spawn entirely new, highly customized chip types tailored to the specific needs of different market segments.

Looking Ahead: The Next Steps for Chiplets and UCIe

The story of chiplets and UCIe is just beginning, and the future holds exciting possibilities:

More Advanced Packaging: 3D stacking technologies will continue to evolve, allowing vertical stacking of multiple chiplets for higher integration density and shorter interconnect distances. Next-generation interconnect technologies like hybrid bonding promise even higher bandwidth and lower power.
Optical I/O: As bandwidth demands continue to soar, traditional electrical interconnects may face bottlenecks. Integrating optical I/O into chiplets or packages, using photons for high-speed data transfer, is considered a potential future solution for overcoming D2D bandwidth limitations.
An Open Chiplet Marketplace: With the proliferation of the UCIe standard and the maturation of the ecosystem, an open chiplet market, similar to the IP market, could emerge. Design houses could purchase standardized chiplets from various suppliers, quickly assembling custom chip systems much like building a PC.
Deeper Heterogeneous Integration: Chiplets won't be limited to just CPUs, GPUs, and AI. Future applications could integrate more diverse functions, such as RF units, sensors, memory, etc., enabling even more highly integrated and feature-rich Systems-in-Package.

The combination of chiplets and the UCIe standard is propelling the semiconductor industry into a new era of "Heterogeneous Integration," shifting the design focus from extreme scaling of single chips to system-level architectural innovation and modular integration.
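As a thought experiment for the open marketplace idea, the sketch below checks whether two catalog chiplets could be linked. The `ChipletSpec` fields, the rule that package types must match, the protocol preference order, and the `max_rate_gtps` values are all hypothetical; an actual UCIe link negotiates many more parameters than this.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ChipletSpec:
    """Hypothetical catalog entry for a chiplet's D2D interface."""
    name: str
    package_type: str            # "standard" or "advanced"
    protocols: FrozenSet[str]    # e.g. {"PCIe", "CXL"}
    max_rate_gtps: float         # assumed per-lane data rate


@dataclass(frozen=True)
class LinkConfig:
    protocol: str
    rate_gtps: float


PREFERENCE = ("CXL", "PCIe")  # assumed order: prefer the coherent protocol


def negotiate(a: ChipletSpec, b: ChipletSpec) -> Optional[LinkConfig]:
    """Return a link both chiplets support, or None if they cannot be paired."""
    if a.package_type != b.package_type:
        return None  # assumption: PHY variants must match
    common = a.protocols & b.protocols
    for protocol in PREFERENCE:
        if protocol in common:
            # Both sides must run at a rate the slower one supports.
            return LinkConfig(protocol, min(a.max_rate_gtps, b.max_rate_gtps))
    return None


if __name__ == "__main__":
    cpu = ChipletSpec("cpu", "advanced", frozenset({"CXL", "PCIe"}), 32.0)
    accel = ChipletSpec("ai-accel", "advanced", frozenset({"CXL"}), 16.0)
    print(negotiate(cpu, accel))
```

A marketplace would need this kind of machine-checkable interface description so that a design house can filter compatible parts before committing to a package design.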

Embracing the New Era of Chiplet-Driven Heterogeneous Integration

The shift from monolithic chips to chiplet-based designs is a critical strategy for the semiconductor industry to address the slowdown of Moore's Law, rising manufacturing costs, and diverse application demands. Chiplets offer clear advantages in design flexibility, cost-effectiveness, and time-to-market. The emergence of the UCIe standard is clearing the path for a vibrant chiplet ecosystem by lowering interoperability barriers and accelerating the pace of innovation.

While technical challenges remain, chiplets and UCIe are reshaping the future landscape of CPUs, GPUs, AI accelerators, and semiconductor design as a whole. A new era built upon interchangeable, composable "chip Legos" is taking shape, poised to drive the next leap in computing power and deliver more powerful, efficient, and customized chip solutions for applications ranging from personal computers and data centers to edge computing.