Optimization of AMD EPYC 7002 Rome and 7003 Milan working memory performance

From Thomas-Krenn-Wiki

Processors of the AMD EPYC 7002 Rome and AMD EPYC 7003 Milan series provide eight memory controllers, each of which drives one memory channel. Up to two DIMMs are possible per channel. A balanced DIMM population across the channels helps achieve the best possible overall performance. This article presents frequently used DIMM configurations.

Basics

Processors of the AMD EPYC 7002/7003 series are characterized by the following properties in terms of memory:

  • 8 memory controllers per CPU
  • 1 memory channel per memory controller
  • 2 DIMMs per channel (maximum 16 DIMMs per CPU = 32 DIMMs in a Dual-CPU system)
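
The properties above can be turned into a rough upper bound for memory bandwidth. The following is a minimal sketch, not a measurement: it assumes the usual 64-bit (8-byte) DDR4 data bus per channel, which is not stated in this article, and the function names are chosen for illustration only.

```python
# Sketch: slot count and theoretical peak DDR4 bandwidth per CPU.
CHANNELS_PER_CPU = 8       # 8 memory controllers x 1 channel each (from the list above)
DIMMS_PER_CHANNEL = 2      # from the list above
BYTES_PER_TRANSFER = 8     # assumption: 64-bit DDR4 data bus per channel


def max_dimms(cpus: int) -> int:
    """Maximum number of DIMMs for a system with the given number of CPUs."""
    return cpus * CHANNELS_PER_CPU * DIMMS_PER_CHANNEL


def peak_bandwidth_gb_s(transfer_rate_mt_s: int,
                        populated_channels: int = CHANNELS_PER_CPU) -> float:
    """Theoretical peak bandwidth per CPU in GB/s (1 GB = 10^9 bytes)."""
    return transfer_rate_mt_s * BYTES_PER_TRANSFER * populated_channels / 1000


print(max_dimms(1), max_dimms(2))   # 16 32
print(peak_bandwidth_gb_s(3200))    # 204.8
```

Measured values, such as STREAM results, stay below this theoretical figure.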

Configuration

The following tables show the recommended DIMM population for mainboards with different numbers of DIMM slots.

Important note on maximum memory bandwidth: The tables show results of a bandwidth test with STREAM Triad.[1] The highest possible memory bandwidth is particularly relevant in HPC environments. In other areas of application, the influence of memory bandwidth on overall performance is lower and depends on the specific application. Tests with SPECint_rate_base2006 show, for example, that even at 35% of the memory bandwidth, up to 90% of the benchmark performance can still be achieved.[2] RAM configurations with fewer modules lower the maximum bandwidth. However, they consume less energy and leave room for future expansion.
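
To illustrate what the Triad test computes, here is a minimal sketch of the kernel in pure Python. It only shows the operation and the byte accounting (two arrays read, one array written per pass). It is not the STREAM benchmark itself, and the throughput it prints is limited by the interpreter rather than by memory bandwidth. The function names and the array size are illustrative assumptions.

```python
import time


def triad(b, c, q):
    """Triad kernel: a[i] = b[i] + q * c[i]."""
    return [bi + q * ci for bi, ci in zip(b, c)]


def triad_bytes(n, word_size=8):
    """Bytes counted per Triad pass: read b, read c, write a (8-byte doubles)."""
    return 3 * word_size * n


n = 200_000                      # illustrative size, far too small for a real test
b = [1.0] * n
c = [2.0] * n

start = time.perf_counter()
a = triad(b, c, 3.0)
elapsed = time.perf_counter() - start

# Interpreter-bound figure, shown only to demonstrate the calculation.
print(f"{triad_bytes(n) / elapsed / 1e6:.0f} MB/s")
```

For real measurements, use the reference STREAM implementation [1] compiled for the target system.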

Single-CPU systems with 8 DIMM slots

Configuration of DIMM-slots of a Supermicro H11SSL-i mainboard.

The following table shows possible DIMM configurations of a single-CPU system with a mainboard with 8 DIMM slots (e.g. Supermicro H11SSL-i):

Module size                   2 DIMMs      4 DIMMs      8 DIMMs
8 GB                          16 GB        32 GB        64 GB
16 GB                         32 GB        64 GB        128 GB
32 GB                         64 GB        128 GB       256 GB
64 GB                         128 GB       256 GB       512 GB
Population                    [diagram]    [diagram]    [diagram]
Maximum transfer rate         3,200 MT/s   3,200 MT/s   3,200 MT/s
Maximum memory bandwidth [3]  ~28%         ~54%         ~100%
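
The table implies that the same total capacity can often be reached in several ways, with very different memory bandwidth. The following sketch lists the options for a target capacity, using the approximate percentages from the table above. The function name and data layout are illustrative assumptions.

```python
MODULE_SIZES_GB = (8, 16, 32, 64)

# Approximate relative memory bandwidth per DIMM count, taken from the table above.
RELATIVE_BANDWIDTH_PCT = {2: 28, 4: 54, 8: 100}


def options_for(target_gb):
    """Return (dimm_count, module_size_gb, bandwidth_pct) tuples that reach
    the target capacity, sorted by relative bandwidth (highest first)."""
    options = [
        (dimms, size, pct)
        for dimms, pct in RELATIVE_BANDWIDTH_PCT.items()
        for size in MODULE_SIZES_GB
        if dimms * size == target_gb
    ]
    return sorted(options, key=lambda option: option[2], reverse=True)


for dimms, size, pct in options_for(128):
    print(f"{dimms} x {size} GB -> ~{pct}% of maximum memory bandwidth")
```

For 128 GB, this lists 8 x 16 GB first, followed by 4 x 32 GB and 2 x 64 GB.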

Single-CPU systems with 16 DIMM slots

The following table shows possible DIMM-configurations of a Single-CPU system with a mainboard with 16 DIMM slots (for example ASUS KRPA-U16):

Module size                   2 DIMMs      4 DIMMs      8 DIMMs      16 DIMMs
8 GB                          16 GB        32 GB        64 GB        128 GB
16 GB                         32 GB        64 GB        128 GB       256 GB
32 GB                         64 GB        128 GB       256 GB       512 GB
64 GB                         128 GB       256 GB       512 GB       1,024 GB
Population                    [diagram]    [diagram]    [diagram]    [diagram]
Maximum transfer rate         3,200 MT/s   3,200 MT/s   3,200 MT/s   2,933 MT/s (2 DPC)
Maximum memory bandwidth [3]  ~28%         ~54%         ~100%        ~92% (*)
(*) The cited test report [3] lists 100% for this configuration. However, those tests were performed with 2,933 MT/s modules. Compared to 8 x 3,200 MT/s modules, populating the system with 16 modules results in approximately 92% of the maximum transfer rate, because the CPU supports at most 2,933 MT/s with 2 DPC (DIMMs per channel).[4]
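
The 92% figure follows directly from the ratio of the two transfer rates. A minimal sketch of the arithmetic, assuming all eight channels are populated in both cases so that the channel count cancels out (the function name is illustrative):

```python
def relative_transfer_rate(rate_mt_s, reference_mt_s=3200):
    """Transfer rate relative to the 1 DPC reference of 3,200 MT/s."""
    return rate_mt_s / reference_mt_s


# 2 DPC runs at 2,933 MT/s instead of 3,200 MT/s.
print(f"{relative_transfer_rate(2933):.1%}")  # 91.7%
```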

Dual-CPU systems with 16 DIMM slots

The following table shows possible DIMM configurations of a dual-CPU system with a Supermicro H11DSi-NT mainboard:

Module size                   4 DIMMs      8 DIMMs      16 DIMMs
                              (2 per CPU)  (4 per CPU)  (8 per CPU)
8 GB                          32 GB        64 GB        128 GB
16 GB                         64 GB        128 GB       256 GB
32 GB                         128 GB       256 GB       512 GB
64 GB                         256 GB       512 GB       1,024 GB
Population                    [diagram]    [diagram]    [diagram]
Maximum transfer rate         3,200 MT/s   3,200 MT/s   3,200 MT/s
Maximum memory bandwidth [3]  ~28%         ~54%         ~100%

Dual-CPU systems with 32 DIMM slots

The following table shows possible DIMM configurations of a dual-CPU system with an ASUS KMPP-D32 mainboard:

Module size                   4 DIMMs      8 DIMMs      16 DIMMs     32 DIMMs
                              (2 per CPU)  (4 per CPU)  (8 per CPU)  (16 per CPU)
8 GB                          32 GB        64 GB        128 GB       256 GB
16 GB                         64 GB        128 GB       256 GB       512 GB
32 GB                         128 GB       256 GB       512 GB       1,024 GB
64 GB                         256 GB       512 GB       1,024 GB     2,048 GB
Population                    [diagram]    [diagram]    [diagram]    [diagram]
Maximum transfer rate         3,200 MT/s   3,200 MT/s   3,200 MT/s   2,933 MT/s
Maximum memory bandwidth [3]  ~28%         ~54%         ~100%        ~92% (*)
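
In dual-CPU systems, the DIMMs are split evenly between both CPUs. The following sketch checks a planned DIMM count against the per-CPU populations listed in the tables of this article and returns the supported transfer rate. It is a simplified illustration with assumed function names. The rule that more than 8 DIMMs per CPU means 2 DPC follows from the 8 channels per CPU, and the rates are those given in reference [4].

```python
# Per-CPU DIMM counts that appear in the tables of this article.
LISTED_DIMMS_PER_CPU = (2, 4, 8, 16)
CHANNELS_PER_CPU = 8


def dimms_per_cpu(total_dimms, cpus=2):
    """Split the DIMMs evenly across CPUs and check against the listed populations."""
    if total_dimms % cpus != 0:
        raise ValueError("DIMMs cannot be split evenly across the CPUs")
    per_cpu = total_dimms // cpus
    if per_cpu not in LISTED_DIMMS_PER_CPU:
        raise ValueError(f"{per_cpu} DIMMs per CPU is not a listed configuration")
    return per_cpu


def max_transfer_rate_mt_s(per_cpu):
    """3,200 MT/s with 1 DPC, 2,933 MT/s once a channel holds 2 DIMMs [4]."""
    return 2933 if per_cpu > CHANNELS_PER_CPU else 3200


for total in (4, 8, 16, 32):
    per_cpu = dimms_per_cpu(total)
    print(f"{total} DIMMs: {per_cpu} per CPU, {max_transfer_rate_mt_s(per_cpu)} MT/s")
```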

More information

References

  1. STREAM: Sustainable Memory Bandwidth in High Performance Computers (www.cs.virginia.edu)
  2. Memory Performance of Xeon scalable processor (Skylake-SP) based Systems (sp.ts.fujitsu.com)
  3. Balanced Memory Configurations with Second-Generation AMD EPYC Processors (lenovopress.com)
  4. AMD EPYC Rome SKU List and Block Diagram Posted (tomshardware.com, 7 August 2019) [...] The official memory speed supported is DDR4-3200 in a 1 DPC (DIMM per channel) configuration and DDR4-2933 in a 2 DPC configuration. [...]


Author: Werner Fischer

Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.


Translator: Alina Ranzinger

Alina has been working at Thomas-Krenn.AG since 2024. After training as a multilingual business assistant, she joined Product Management as an assistant, where she is responsible for translating texts and organising the department.


Related articles

AMD EPYC 7003 Milan Workload Profile NIC Throughput Intensive
AMD-V virtualization function
Optimization of AMD EPYC 9004 Genoa and Bergamo working memory performance