Intel® Xeon® Gold 6138P (Socket P, LGA 3647) includes the Intel® Arria® 10 GX FPGA
The Intel® Xeon® Scalable
processor with integrated Intel® Arria® 10 field programmable gate array
(FPGA) is now available to select customers. This marks the first
production release of an Intel® Xeon® processor with a coherently
interfaced FPGA—an important result of Intel’s acquisition of Altera.
The combination of these industry-leading FPGA solutions with Intel’s
world-class processors enables customers to create the next generation
of data center systems with flexible workload-optimized performance and
power efficiency.
1. The Intel® Xeon® Gold 6138P processor with integrated Intel® Arria® 10 GX 1150 FPGA delivers up to 3.2X the throughput at half the latency, with 2X more VMs, when compared to an Intel® Xeon® Gold 6138P processor running software OVS (Open vSwitch) DPDK forwarding in a CPU user-space application.
Configuration: 2x Intel® Xeon® Gold 6138P processor with Integrated Intel® Arria® 10 GX 1150 FPGA on Blue Mountain Pass (BMP) platform,
12 x 16GB Micron 2Rx8 DDR4 2666MHz (192GB total),
240GB Kingston SSD,
1xPCI-E 3.0 x8 slot and 1xPCI-E 3.0 x10 slot,
Network NICs: 1x 100G Alaska NIC and 2 x Intel® Ethernet Network Adapter XXV710-DA2 (25GbE NIC) (fw 5.50.47059 api 1.5 nvm 5.51 0x80002bf8 1.1568.0),
Operating System: Ubuntu-16.04.3,
OS Kernel: 4.4.0-116-generic,
BIOS: SE5C620.86B.01.00.0813.041020180320 (Release Date: 04/10/2018),
uCode: mb750654_02000043,
FPGA BBS v6.4.0_Production (GBS 6.4.0, OPAE-0.12.1, Lib switch OPAE ver 1.1),
VM operating system: Ubuntu 17.10,
OS Kernel: 4.13.0-31-generic...
Compared to 2x Intel® Xeon® Gold 6138P processor on Blue Mountain Pass (BMP) platform, 12 x 16GB Micron 2Rx8 DDR4 2666MHz (192GB total), 240GB Kingston SSD, 1xPCI-E 3.0 x8 slot and 1xPCI-E 3.0 x10 slot, Network NICs: 1x 100G Alaska NIC and 2 x Intel® Ethernet Network Adapter XXV710-DA2 (25GbE NIC) (fw 5.50.47059 api 1.5 nvm 5.51 0x80002bf8 1.1568.0), Operating System: Ubuntu-16.04.3, OS Kernel: 4.4.0-116-generic, BIOS: SE5C620.86B.01.00.0813.041020180320 (Release Date: 04/10/2018), uCode: mb750654_02000043, VM operating system: Ubuntu 17.10, OS Kernel: 4.13.0-31-generic, Benchmark: Open vSwitch 2.9.0.
The benchmark results may need to be revised as additional testing is conducted. The results depend on the specific platform configurations and workloads utilized in the testing, and may not be applicable to any particular user's components, computer system or workloads. The results are not necessarily representative of other benchmarks and other benchmark results may show greater or lesser impact from mitigations. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.
Introducing the Intel® Xeon® Scalable processor with integrated Intel® Arria® 10 FPGA
The Intel® Xeon® Scalable Processor 6138P includes the Intel® Arria® 10 GX 1150, which provides up to 160 Gbps of I/O bandwidth per socket and a cache-coherent interface for tightly coupled acceleration. The Intel® Arria® 10 GX 1150 has its own cache and shares memory with the processor via low-latency, cache-coherent access over the Intel® Ultra Path Interconnect (Intel® UPI) bus. Unlike other system interface bus standards, Intel® UPI allows seamless access to data regardless of where the data resides (core cache, FPGA cache, or memory) without the need for redundant data storage and direct memory access (DMA) transfers. Data coherency also reduces application programming complexity and saves CPU cycles that would otherwise be spent determining which copy of the data is most up to date.
A great example of this system capability is Intel’s new virtual switching reference design for the Intel® Xeon® Scalable processor with integrated FPGA. This reference design uses the FPGA for infrastructure dataplane switching, while the processor does application processing or processes virtual machines. This helps simplify network complexity and improve the productivity of the processor.
This solution is also compatible with the Open vSwitch (OVS) framework and delivers a dramatic 3.2X throughput improvement at half the latency, with 2X more VMs, compared to OVS running on an equivalent processor without FPGA acceleration.¹ Additionally, code compatibility with Intel’s OVS-DPDK software makes data center retrofits simple and scalable to optimize operational agility.
Fujitsu, a lead partner, plans to deliver systems based on the Intel® Xeon® processor with integrated FPGA and Intel’s OVS reference design. They are making the Intel® virtual switching reference design even more robust for the networking environment through their reliability, availability, and serviceability (RAS) with performance monitoring and debug assisting functions. This solution is being demonstrated this week at the Fujitsu Forum in Tokyo.
FPGAs continue to be an important part of Intel’s portfolio of workload-optimized solutions for the data center. Going forward, we will continue to improve the ease of use of Intel® FPGAs and other accelerators in the data center. To provide our customers with greater deployment flexibility, Intel’s future roadmap will introduce a discrete FPGA solution with a faster coherent interconnect and increased bandwidth, enabled by the Acceleration Stack for Intel® Xeon® CPU with FPGAs. It will support code migration from the Intel® Xeon® Scalable processor with Integrated FPGA and the Intel® Programmable Acceleration Card (Intel® PAC) solutions, and will continue to be optimized for enhanced bandwidth and low latency.
To learn more about the new Intel® Xeon® Scalable processor with integrated FPGA, visit www.intel.com/accelerators.
Intel has finally announced that it is shipping its Xeon Gold 6138P with an integrated FPGA accelerator to select customers.
The Intel Xeon Gold 6138P includes one Arria 10 GX 1150 FPGA, with up to 160 Gbps of I/O bandwidth and a cache-coherent interface for tightly coupled acceleration. The Arria FPGA has its own cache and connects to the Xeon processor via Intel’s ultra-fast UPI (Ultra Path Interconnect). Data sharing between the processor and the FPGA does not require DMA transfers, which reduces programming complexity.
According to AnandTech, “The Xeon Scalable Gold 6138 is already an existing CPU, and the x86 silicon on the 6138P looks to be identical between the two parts: A 20C/40T CPU, with a 2GHz base clock, 3.7GHz boost, with 6 channels of DDR4 support. The PCIe lane count is different — 48 lanes on the base 6138 compared with 32 lanes on the 6138P — but this almost certainly means that 16 of those PCIe 3.0 lanes have been diverted for bandwidth for the FPGA.”
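The lane arithmetic in the AnandTech quote can be checked with a short Python sketch. The lane counts are the ones quoted above. The 8 GT/s signalling rate and 128b/130b encoding are the standard PCIe 3.0 figures. The function names are hypothetical, and the resulting bandwidth is only an upper bound on raw line rate for 16 such lanes: it is not a measured value, and it assumes AnandTech's inference about the diverted lanes is correct.

```python
# Lane counts as quoted from AnandTech in the text above.
LANES_XEON_6138 = 48
LANES_XEON_6138P = 32

# Standard PCIe 3.0 figures: 8 GT/s per lane with 128b/130b line encoding.
PCIE3_GT_PER_S = 8.0
PCIE3_ENCODING_EFFICIENCY = 128 / 130


def diverted_lanes(base_lanes: int, fpga_part_lanes: int) -> int:
    """Lanes present on the base part but not exposed on the FPGA part."""
    return base_lanes - fpga_part_lanes


def pcie3_line_rate_gbps(lanes: int) -> float:
    """Raw line rate in Gbit/s per direction, before protocol overhead."""
    return lanes * PCIE3_GT_PER_S * PCIE3_ENCODING_EFFICIENCY


lanes = diverted_lanes(LANES_XEON_6138, LANES_XEON_6138P)
print(f"lanes diverted: {lanes}")
print(f"x{lanes} PCIe 3.0 line rate: {pcie3_line_rate_gbps(lanes):.1f} Gbit/s per direction")
```

Sixteen lanes work out to roughly 126 Gbit/s in each direction. This figure is not directly comparable to the 160 Gbps of I/O bandwidth Intel quotes, since the source does not say how that number is composed.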
According to Intel, the Xeon with integrated FPGA delivers a 3.2x throughput improvement at half the latency in its Open vSwitch reference design, compared to the same Xeon running OVS without FPGA acceleration.
In the announcement, Intel has stated that “Fujitsu, a lead partner, plans to deliver systems based on the Intel® Xeon® processor with integrated FPGA and Intel’s OVS reference design. They are making the Intel® virtual switching reference design even more robust for the networking environment through their reliability, availability, and serviceability (RAS) with performance monitoring and debug assisting functions. This solution is being demonstrated this week at the Fujitsu Forum in Tokyo.”