Intel ‘Alder Lake’ 12th Gen Core, Thread Director, ‘Alchemist’ Discrete GPU Architecture Details Announced

Intel held a virtual Architecture Day presentation, revealing details of the engineering behind various products that will be launched in the consumer and data center spaces. While the exact specs for CPUs and GPUs will have to wait until they’re actually released, we now have a better idea of the building blocks Intel is using to put them together. Raja Koduri, Intel’s Senior Vice President and General Manager of the Accelerated Computing Systems and Graphics Group, led the presentation along with several senior Intel engineers.

The 12th Gen Core CPU line, codenamed ‘Alder Lake’, is expected to be released in the coming months, starting with desktop models. These will be the first mainstream Intel processors to feature a mix of high-performance and low-power cores, an approach that is common in mobile SoCs today. This follows the experimental ‘Lakefield’ CPU, which has only had a limited release so far. Alder Lake will use a more modular approach than before, with different combinations of logic blocks for different product segments.

Intel will use the terms Performance core and Efficient core, generally abbreviated to P core and E core. For Alder Lake, the E cores are based on the ‘Gracemont’ architecture while the P cores use the ‘Golden Cove’ design. With Gracemont, Intel optimised for small physical silicon size and throughput efficiency, so that multithreaded performance can scale across a large number of individual cores. These cores run at low voltage and will mainly handle simpler processes.

Golden Cove-based P cores are designed for speed and low latency. Intel calls this the highest performing core it has ever built. New to this generation is support for Advanced Matrix Extensions to accelerate deep learning training and inference.

Three different Alder Lake dies will serve different product segments

By combining P and E cores, the Alder Lake architecture will be highly scalable, from 9W to 125W, covering most of today’s mobile and desktop categories. It will be manufactured using the newly announced Intel 7 process, which is a rebranding of the 10nm ‘Enhanced SuperFin’ process. Different implementations will integrate different combinations of DDR5, PCIe Gen5, Thunderbolt 4 and Wi-Fi 6E.

The desktop implementation will use a new LGA1700 socket with up to eight performance cores (two threads each), eight efficient cores (single-threaded) and 30MB of last-level cache memory. The integrated GPU will have up to 32 execution units for basic video output and graphics capabilities. It will not have integrated Thunderbolt or image processing blocks, but will support 16 lanes of PCIe Gen5 plus another four lanes of PCIe Gen4. The corresponding platform controllers for motherboards will have up to 12 more PCIe Gen4 lanes and 16 PCIe Gen3 lanes.
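The figures above imply a thread count and lane budget that are easy to tally. The following is a minimal sketch that uses only the maximums quoted in this article; the variable and function names are made up for illustration and do not come from Intel.

```python
# Maximum desktop Alder Lake configuration as quoted in the article.
P_CORES = 8
THREADS_PER_P_CORE = 2   # P cores run two threads each
E_CORES = 8
THREADS_PER_E_CORE = 1   # E cores are single-threaded

# PCIe lanes by generation, as quoted in the article.
CPU_PCIE_LANES = {"Gen5": 16, "Gen4": 4}
CHIPSET_PCIE_LANES = {"Gen4": 12, "Gen3": 16}


def total_cores() -> int:
    """Physical cores in the top desktop configuration."""
    return P_CORES + E_CORES


def total_threads() -> int:
    """Hardware threads exposed to the operating system."""
    return P_CORES * THREADS_PER_P_CORE + E_CORES * THREADS_PER_E_CORE


def total_lanes(lanes: dict) -> int:
    """Sum lanes across all PCIe generations for one component."""
    return sum(lanes.values())


if __name__ == "__main__":
    print(f"cores: {total_cores()}, threads: {total_threads()}")
    print(f"CPU PCIe lanes: {total_lanes(CPU_PCIE_LANES)}")
    print(f"chipset PCIe lanes: {total_lanes(CHIPSET_PCIE_LANES)}")
```

So the top desktop part would present 16 cores and 24 threads, with 20 PCIe lanes from the CPU and up to 28 more from the platform controller.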

Two mobile versions of Alder Lake were also discussed: a more conventional die with six P cores and eight E cores, and an ultra-compact die with two P cores and eight E cores. Both will have GPUs with 96 execution units in addition to image processing units and Thunderbolt controllers, and will be targeted at devices that will not have discrete GPUs.

All Alder Lake CPUs are made up of modular logic blocks – the CPU cores, GPU, memory controller, IO and more. They will support up to DDR5-4800, LPDDR5-5200, DDR4-3200 and LPDDR4X-4266 RAM, and it will be up to motherboard and laptop OEMs to decide which to implement. The modular blocks of each CPU will be connected through three fabrics – Compute, Memory and IO. Intel describes 100 GBps of compute fabric bandwidth per P-core or per quad-core E cluster, for a total of 1000 GBps across 10 of these units. The last level cache can be dynamically adjusted between inclusive and exclusive depending on load.
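The 1000 GBps total follows from counting each P core and each four-core E cluster as one unit on the compute fabric. Here is a rough sketch of that arithmetic; the function names are hypothetical, and applying the same per-unit figure to the mobile dies is an assumption rather than something Intel stated.

```python
E_CORES_PER_CLUSTER = 4     # E cores attach to the fabric in clusters of four
GBPS_PER_FABRIC_UNIT = 100  # per P core or per E cluster, as quoted


def fabric_units(p_cores: int, e_cores: int) -> int:
    """Count compute fabric units: one per P core plus one per E cluster."""
    if p_cores < 0 or e_cores < 0:
        raise ValueError("core counts must be non-negative")
    if e_cores % E_CORES_PER_CLUSTER != 0:
        raise ValueError("E cores are expected in whole clusters of four")
    return p_cores + e_cores // E_CORES_PER_CLUSTER


def compute_fabric_gbps(p_cores: int, e_cores: int) -> int:
    """Aggregate compute fabric bandwidth in GBps for a given die."""
    return fabric_units(p_cores, e_cores) * GBPS_PER_FABRIC_UNIT


if __name__ == "__main__":
    # Desktop die: 8 P cores + 8 E cores -> 10 units, matching the quoted total.
    print(compute_fabric_gbps(8, 8))
    # Mobile dies (assumption: same per-unit figure applies).
    print(compute_fabric_gbps(6, 8))
    print(compute_fabric_gbps(2, 8))
```

For the desktop die this gives eight P cores plus two E clusters, or ten units, which matches the quoted 1000 GBps.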

Thread Director requires Windows 11 for optimal use of all cores

We now have some information on how workloads will be balanced between the P and E cores. Intel is announcing a new hardware scheduler called Thread Director, which will be completely transparent to software and will work with the OS scheduler to assign threads to different cores based on urgency and real-time conditions. Designed to scale across mobile and desktop CPUs, Thread Director will be able to adapt to thermal and power conditions and migrate threads from one type of core to another, as well as manage multi-threading on P cores with “nanosecond accuracy”.

Thread Director requires Windows 11, and therefore Alder Lake will perform optimally on this upcoming operating system, although Windows 10, Linux and other operating systems will also work. With Thread Director, the OS scheduler now understands which types of threads require which types of resources and can prioritize latency, energy savings or other parameters depending on operating conditions.

Intel has been promoting its first high-end gaming GPU for some time and is building excitement with the recent announcement of a new Intel Arc brand for GPU hardware, software and services. The first generation product is codenamed ‘Alchemist’ and will be released in early 2022. It belongs to the tier of the Xe architecture product stack known as Xe-HPG, or High Performance Gaming. Alchemist will be manufactured by TSMC on its N6 node. It will support hardware ray tracing as well as DirectX 12 Ultimate features such as mesh shading and variable rate shading.

XeSS will use AI to upscale frames and improve performance, much like DLSS

Each first-generation Xe-HPG core will have 16 vector engines and 16 matrix engines plus caches, enabling common GPU workloads as well as AI acceleration. Four of these cores, plus four ray tracing units and other rendering hardware, form a “slice”. Each Alchemist GPU can have up to eight of these slices.
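Multiplying these building blocks out gives the size of the largest possible Alchemist configuration. This is a small sketch based only on the per-core and per-slice counts above; the function name and dictionary keys are illustrative, and Intel has not confirmed the specs of any shipping product.

```python
# Xe-HPG building blocks as described in the article.
VECTOR_ENGINES_PER_CORE = 16
MATRIX_ENGINES_PER_CORE = 16
CORES_PER_SLICE = 4
RT_UNITS_PER_SLICE = 4
MAX_SLICES = 8


def alchemist_totals(slices: int = MAX_SLICES) -> dict:
    """Return unit counts for a GPU built from the given number of slices."""
    if not 1 <= slices <= MAX_SLICES:
        raise ValueError(f"slices must be between 1 and {MAX_SLICES}")
    cores = slices * CORES_PER_SLICE
    return {
        "xe_cores": cores,
        "vector_engines": cores * VECTOR_ENGINES_PER_CORE,
        "matrix_engines": cores * MATRIX_ENGINES_PER_CORE,
        "ray_tracing_units": slices * RT_UNITS_PER_SLICE,
    }


if __name__ == "__main__":
    for name, count in alchemist_totals().items():
        print(f"{name}: {count}")
```

A full eight-slice GPU would therefore have 32 Xe-cores, 512 vector engines, 512 matrix engines and 32 ray tracing units.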

We also now know that Intel will be releasing its own AI upscaling technology, called XeSS (Xe Super Sampling), to take on Nvidia’s DLSS and AMD’s FSR. XeSS is an AI-based upscaling method that combines information from previous frames. Intel is claiming up to 2X better performance when rendering at lower resolutions and then scaling up to the desired resolution. XeSS will even run on Xe-LP integrated GPUs, and several game developers are on board to support it.

Although we don’t have any GPU specs yet, Intel said it has worked to deliver “leading” performance per watt. We’re sure we’ll find out more as the release approaches.

Intel also made several announcements related to its server and data center business during Architecture Day, including a demonstration of the upcoming Ponte Vecchio architecture for big data, which will be the foundation of the Aurora exascale supercomputer. Other highlights were the modular, scalable Xeon ‘Sapphire Rapids’ platform, the oneAPI software stack and an emerging product category, Infrastructure Processing Units (IPUs), designed to separate infrastructure overheads from customers’ data processing requirements in cloud-centric data centers.

