Optimize 3rd Gen Intel Xeon Scalable RAM Performance

From Thomas-Krenn-Wiki


Processors of the 3rd Generation Intel Xeon Scalable series have four built-in memory controllers, each of which drives up to two memory channels. Each channel supports up to two DIMMs. To achieve the best possible overall performance, the DIMMs should be distributed evenly across the channels (a balanced configuration). This article shows frequently used DIMM configurations.

Information for 1st and 2nd gen processors can be found in the article Optimize memory performance of Intel Xeon Scalable systems.

Basics

Processors of the 3rd Gen Intel Xeon Scalable series have the following memory characteristics:

  • 4 memory controllers per CPU
  • 2 memory channels per memory controller (thus 8 memory channels per CPU = 16 memory channels in a dual-CPU system)
  • 2 DIMMs per channel (thus a maximum of 16 DIMMs per CPU = 32 DIMMs in a dual-CPU system)
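
The arithmetic behind these figures can be written out as a short Python sketch. The constant and function names are illustrative assumptions and do not come from any Intel or vendor tool:

```python
# Memory topology of one 3rd Gen Intel Xeon Scalable CPU, as listed above.
CONTROLLERS_PER_CPU = 4
CHANNELS_PER_CONTROLLER = 2
DIMMS_PER_CHANNEL = 2


def memory_topology(cpus: int) -> dict:
    """Return memory channel and DIMM slot totals for a system with `cpus` sockets."""
    channels = cpus * CONTROLLERS_PER_CPU * CHANNELS_PER_CONTROLLER
    return {
        "channels": channels,
        "max_dimms": channels * DIMMS_PER_CHANNEL,
    }


if __name__ == "__main__":
    for cpus in (1, 2):
        print(cpus, memory_topology(cpus))
```

Note that these are the limits of the CPU itself. A given mainboard may provide fewer DIMM slots than the CPU supports.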

Configurations

The following tables show the recommended DIMM configurations for systems with different numbers of RAM slots.

Important note on maximum memory bandwidth:

  • The tables show measurements from a bandwidth test with STREAM Triad.[1] The highest possible memory bandwidth is particularly relevant in the HPC environment.
  • In other application areas the influence of memory bandwidth on overall performance is lower and depends on the respective application.
  • Tests with SPECint_rate_base2006 show that even with only 35% of the maximum memory bandwidth, the benchmark still achieves up to 90% of its full performance.[2]
  • RAM configurations with fewer modules reduce the maximum achievable memory bandwidth, but they also consume less energy and leave free slots for future expansion.
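
For readers unfamiliar with STREAM Triad, the following Python sketch shows what the kernel computes and how the benchmark accounts for the data it moves (two reads and one write of an 8-byte double per element). It is only an illustration of the principle: the function names are assumptions, and pure Python is dominated by interpreter overhead, so the printed figure says nothing about real memory bandwidth. Actual measurements use the compiled STREAM benchmark.

```python
import time
from array import array


def triad(b, c, scalar):
    """Triad kernel: a[i] = b[i] + scalar * c[i]."""
    return array("d", (bi + scalar * ci for bi, ci in zip(b, c)))


def triad_bandwidth_gb_s(n, seconds):
    """Bandwidth as STREAM counts it for Triad: 3 doubles (24 bytes) per element."""
    bytes_moved = 3 * 8 * n
    return bytes_moved / seconds / 1e9


if __name__ == "__main__":
    n = 1_000_000
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    start = time.perf_counter()
    a = triad(b, c, 3.0)
    elapsed = time.perf_counter() - start
    # Interpreter overhead dominates here; this is NOT a real bandwidth measurement.
    print(f"{triad_bandwidth_gb_s(n, elapsed):.3f} GB/s (illustrative only)")
```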

Single-CPU systems with 8 DIMM slots

The following table shows the possible DIMM configurations of a single-CPU system with a Supermicro X12SPI-TF mainboard. The maximum possible memory bandwidth is achieved with 8 DIMMs:

Module size             2 DIMMs         4 DIMMs         6 DIMMs         8 DIMMs
8 GB                    16 GB           32 GB           48 GB           64 GB
16 GB                   32 GB           64 GB           96 GB           128 GB
32 GB                   64 GB           128 GB          192 GB          256 GB
64 GB                   128 GB          256 GB          384 GB          512 GB
128 GB                  Not profitable due to high module prices (*)    1 TB
256 GB                  Not profitable due to high module prices (*)    2 TB
Max. memory bandwidth   ~25%            ~50%            ~75%            ~100%
*) 128 GB and 256 GB DDR4-3200 RDIMMs are manufactured using 3DS technology. The price per GB of 128 GB modules is around 4 to 5 times that of 64 GB modules, and 256 GB modules cost even more per GB. These module sizes therefore only make sense for maximum configurations, when as much RAM as possible must be provided on a single mainboard.
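
The figures in the table follow a simple pattern that can be sketched in Python. The sketch assumes one DIMM per memory channel and treats the bandwidth share as the fraction of the 8 channels that are populated. This matches the approximate percentages in the table but is a simplification, not a measurement. The function name is hypothetical.

```python
CHANNELS_PER_CPU = 8  # 4 memory controllers x 2 channels


def single_cpu_config(module_gb: int, dimms: int) -> tuple[int, float]:
    """Return (total capacity in GB, approximate share of maximum bandwidth)."""
    if dimms not in (2, 4, 6, 8):
        raise ValueError("the table covers 2, 4, 6 or 8 DIMMs")
    capacity_gb = module_gb * dimms
    # Assumption: one DIMM per channel, bandwidth scales with populated channels.
    bandwidth_share = dimms / CHANNELS_PER_CPU
    return capacity_gb, bandwidth_share


if __name__ == "__main__":
    print(single_cpu_config(32, 6))
```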

Dual-CPU systems with 8 to 32 DIMM slots

The following table shows the possible DIMM configurations of a dual-CPU system with 8 to 32 DIMM slots. The maximum possible memory bandwidth is achieved with 16 DIMMs; a configuration with 32 DIMMs achieves almost the same memory bandwidth:

Module size             4 DIMMs       8 DIMMs       12 DIMMs      16 DIMMs      32 DIMMs
                        (2 per CPU)   (4 per CPU)   (6 per CPU)   (8 per CPU)   (16 per CPU)
8 GB                    32 GB         64 GB         96 GB         128 GB        256 GB
16 GB                   64 GB         128 GB        192 GB        256 GB        512 GB
32 GB                   128 GB        256 GB        384 GB        512 GB        1 TB
64 GB                   256 GB        512 GB        768 GB        1 TB          2 TB
128 GB                  Not profitable due to high module prices (*)            4 TB
256 GB                  Not profitable due to high module prices (*)            8 TB
Max. memory bandwidth   ~25%          ~50%          ~75%          ~100%         ~98%
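
When planning a dual-CPU system for a given capacity target, the table can be searched mechanically. The following Python sketch encodes the DIMM counts, module sizes and approximate bandwidth shares from the table and lists the configurations that reach a minimum capacity. The function name and the sort order (highest bandwidth first, then smallest capacity) are assumptions made for illustration, and module prices are not considered.

```python
# Approximate share of maximum bandwidth per DIMM count, taken from the table above.
BANDWIDTH_SHARE = {4: 0.25, 8: 0.50, 12: 0.75, 16: 1.00, 32: 0.98}
MODULE_SIZES_GB = (8, 16, 32, 64, 128, 256)


def dual_cpu_options(min_capacity_gb: int) -> list[tuple[int, int, int, float]]:
    """List (module GB, DIMMs, total GB, bandwidth share) meeting the capacity target."""
    options = []
    for module_gb in MODULE_SIZES_GB:
        for dimms, share in BANDWIDTH_SHARE.items():
            # The table only lists 128 GB and 256 GB modules for full population.
            if module_gb >= 128 and dimms != 32:
                continue
            total_gb = module_gb * dimms
            if total_gb >= min_capacity_gb:
                options.append((module_gb, dimms, total_gb, share))
    # Prefer the highest bandwidth, then the smallest capacity that still fits.
    return sorted(options, key=lambda o: (-o[3], o[2]))


if __name__ == "__main__":
    for option in dual_cpu_options(512):
        print(option)
```

For a target of 512 GB, the first entry returned is 16 modules of 32 GB, which corresponds to the 512 GB cell in the 16-DIMM column of the table.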

Additional Information

References



Author: Werner Fischer

Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.

