<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/ras, branch v6.13.1</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.13.1</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.13.1'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-10-22T16:55:57Z</updated>
<entry>
<title>RAS/AMD/ATL: Add debug prints for DF register reads</title>
<updated>2024-10-22T16:55:57Z</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2024-10-21T15:21:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=233679b58c0bfe7bafeb6e048b90af08bb9e7fc1'/>
<id>urn:sha1:233679b58c0bfe7bafeb6e048b90af08bb9e7fc1</id>
<content type='text'>
The ATL fails early if a DF register access fails due to missing PCI IDs
in the amd_nb code. Nothing currently indicates why the ATL fails to
load in this case.

Add a couple of debug print statements to highlight reasons for failure.

A common scenario is missing support for new hardware. If the ATL fails
to load on a system, and there is interest in supporting it, then dynamic
debugging can be enabled to help find the cause of the failure. If there
is no interest in supporting the ATL on a new system, then these failures
will be silent.
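
As an illustration only, enabling the debug prints could look like the
sketch below. It assumes the module is named amd_atl, that the kernel is
built with CONFIG_DYNAMIC_DEBUG, that debugfs is mounted at
/sys/kernel/debug, and that the caller is root. The helper names are
hypothetical and not part of this patch.

```python
from pathlib import Path

# Assumed location of the dynamic debug control file (debugfs mounted
# at /sys/kernel/debug).
CONTROL = Path("/sys/kernel/debug/dynamic_debug/control")


def dyndbg_query(module, enable=True):
    """Build a dynamic debug query toggling all pr_debug() sites in a module."""
    flag = "+p" if enable else "-p"
    return f"module {module} {flag}"


def apply_query(query, control=CONTROL):
    """Write the query to the control file. Needs root on a real system."""
    control.write_text(query + "\n")
```

Calling apply_query(dyndbg_query("amd_atl")) before loading the module
should make the new prints show up in the kernel log, under the
assumptions above.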

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/r/20241021152158.2525669-1-yazen.ghannam@amd.com
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Translate normalized to system physical addresses using PRM</title>
<updated>2024-08-01T12:36:29Z</updated>
<author>
<name>John Allen</name>
<email>john.allen@amd.com</email>
</author>
<published>2024-07-30T15:17:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=26e43c9a894176e7b7f7eaff61aaf2748d4aa520'/>
<id>urn:sha1:26e43c9a894176e7b7f7eaff61aaf2748d4aa520</id>
<content type='text'>
AMD Zen-based systems report memory error addresses through machine
check banks representing Unified Memory Controllers (UMCs) in the form
of UMC relative "normalized" addresses. A normalized address must be
converted to a system physical address to be usable by the OS.

Future AMD platforms will provide a UEFI PRM module that implements a
number of address translation PRM handlers. This will provide an
interface for the OS to call platform specific code without requiring
the use of SMM or other heavy firmware operations.

Add support for the normalized to system physical address translation
PRM handler in the AMD Address Translation Library and prefer it over
native code if available. The GUID and parameter buffer structure are
specific to the normalized to system physical address handler provided
by the address translation PRM module included in future AMD systems.

The address translation PRM module is documented in chapter 22 of the
publicly available "AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
ACPI v6.5 Porting Guide".

  [ bp: Massage commit message. ]

Signed-off-by: John Allen &lt;john.allen@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/r/20240730151731.15363-3-john.allen@amd.com
</content>
</entry>
<entry>
<title>Merge tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras</title>
<updated>2024-07-16T01:20:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-07-16T01:20:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8028e290b6354ddb404e88f17fe5d37945cb122f'/>
<id>urn:sha1:8028e290b6354ddb404e88f17fe5d37945cb122f</id>
<content type='text'>
Pull EDAC updates from Borislav Petkov:

 - With AMD Data Fabric version 4.5 non-power-of-2 denormalization,
   certain bits of the system physical address cannot be reconstructed
   from the normalized address reported by the RAS hardware. Add support
   for handling such addresses

 - Switch the EDAC drivers to the new Intel CPU model defines

 - The usual fixes and cleanups all over the place

* tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC: Add missing MODULE_DESCRIPTION() macros
  EDAC/dmc520: Use devm_platform_ioremap_resource()
  EDAC/igen6: Add Intel Arrow Lake-U/H SoCs support
  RAS/AMD/FMPM: Use atl internal.h for INVALID_SPA
  RAS/AMD/ATL: Implement DF 4.5 NP2 denormalization
  RAS/AMD/ATL: Validate address map when information is gathered
  RAS/AMD/ATL: Expand helpers for adding and removing base and hole
  RAS/AMD/ATL: Read DRAM hole base early
  RAS/AMD/ATL: Add amd_atl pr_fmt() prefix
  RAS/AMD/ATL: Add a missing module description
  EDAC, i10nm: make skx_common.o a separate module
  EDAC/skx: Switch to new Intel CPU model defines
  EDAC/sb_edac: Switch to new Intel CPU model defines
  EDAC, pnd2: Switch to new Intel CPU model defines
  EDAC/i10nm: Switch to new Intel CPU model defines
  EDAC/ghes: Add missing newline to pr_info() statement
  RAS/AMD/ATL: Add missing newline to pr_info() statement
  EDAC/thunderx: Remove unused struct error_syndrome
</content>
</entry>
<entry>
<title>Merge remote-tracking branches 'ras/edac-amd-atl' and 'ras/edac-misc' into edac-updates</title>
<updated>2024-07-15T09:59:10Z</updated>
<author>
<name>Borislav Petkov (AMD)</name>
<email>bp@alien8.de</email>
</author>
<published>2024-07-15T09:59:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=03a9b67087ba071f69b12d730b36aa7c2d3dbf21'/>
<id>urn:sha1:03a9b67087ba071f69b12d730b36aa7c2d3dbf21</id>
<content type='text'>
* ras/edac-amd-atl:
  RAS/AMD/FMPM: Use atl internal.h for INVALID_SPA
  RAS/AMD/ATL: Implement DF 4.5 NP2 denormalization
  RAS/AMD/ATL: Validate address map when information is gathered
  RAS/AMD/ATL: Expand helpers for adding and removing base and hole
  RAS/AMD/ATL: Read DRAM hole base early
  RAS/AMD/ATL: Add amd_atl pr_fmt() prefix
  RAS/AMD/ATL: Add a missing module description

* ras/edac-misc:
  EDAC: Add missing MODULE_DESCRIPTION() macros
  EDAC/dmc520: Use devm_platform_ioremap_resource()
  EDAC/igen6: Add Intel Arrow Lake-U/H SoCs support
  EDAC, i10nm: make skx_common.o a separate module
  EDAC/skx: Switch to new Intel CPU model defines
  EDAC/sb_edac: Switch to new Intel CPU model defines
  EDAC, pnd2: Switch to new Intel CPU model defines
  EDAC/i10nm: Switch to new Intel CPU model defines
  EDAC/ghes: Add missing newline to pr_info() statement
  RAS/AMD/ATL: Add missing newline to pr_info() statement
  EDAC/thunderx: Remove unused struct error_syndrome

Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Use system settings for MI300 DRAM to normalized address translation</title>
<updated>2024-06-16T09:22:57Z</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2024-06-07T21:33:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ba437905b4fbf0ee1686c175069239a1cc292558'/>
<id>urn:sha1:ba437905b4fbf0ee1686c175069239a1cc292558</id>
<content type='text'>
The currently used normalized address format is not applicable to all
MI300 systems. This leads to incorrect results during address
translation.

Drop the fixed layout and construct the normalized address from system
settings.

Fixes: 87a612375307 ("RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: &lt;stable@kernel.org&gt;
Link: https://lore.kernel.org/r/20240607-mi300-dram-xl-fix-v1-2-2f11547a178c@amd.com
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Fix MI300 bank hash</title>
<updated>2024-06-10T05:56:33Z</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2024-06-07T21:32:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fe8a08973a0dea9757394c5adbdc3c0a03b0b432'/>
<id>urn:sha1:fe8a08973a0dea9757394c5adbdc3c0a03b0b432</id>
<content type='text'>
Apply the SID bits to the correct offset in the Bank value. Do this in
the temporary value so they don't need to be masked off later.

Fixes: 87a612375307 ("RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: &lt;stable@kernel.org&gt;
Link: https://lore.kernel.org/r/20240607-mi300-dram-xl-fix-v1-1-2f11547a178c@amd.com
</content>
</entry>
<entry>
<title>RAS/AMD/FMPM: Use atl internal.h for INVALID_SPA</title>
<updated>2024-06-09T21:44:05Z</updated>
<author>
<name>John Allen</name>
<email>john.allen@amd.com</email>
</author>
<published>2024-06-06T20:33:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f4c0cd1870afd57181e8087c6cf8da3d7fa2cebe'/>
<id>urn:sha1:f4c0cd1870afd57181e8087c6cf8da3d7fa2cebe</id>
<content type='text'>
Both the AMD ATL and the FMPM driver define INVALID_SPA. Include the
definition from the ATL internal.h header in the FMPM driver.

Signed-off-by: John Allen &lt;john.allen@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/r/20240606203313.51197-7-john.allen@amd.com
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Implement DF 4.5 NP2 denormalization</title>
<updated>2024-06-09T21:43:58Z</updated>
<author>
<name>John Allen</name>
<email>john.allen@amd.com</email>
</author>
<published>2024-06-06T20:33:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e0372d6969bca2bc57e1a24129473694ff65641c'/>
<id>urn:sha1:e0372d6969bca2bc57e1a24129473694ff65641c</id>
<content type='text'>
Unlike with previous Data Fabric versions, with Data Fabric 4.5
non-power-of-2 denormalization, there are bits of the system physical
address that can't be fully reconstructed from the normalized address.

To determine the proper combination of missing system physical address
bits, iterate through each possible combination of these bits, normalize
the resulting system physical address, and compare to the original
address that is being translated. If the addresses match, then the
correct permutation of bits has been found.
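
The search described above can be sketched as follows. This is a toy
model, not the DF 4.5 math: toy_normalize() is a made-up stand-in that
drops SPA bits 8 and 9 and reports a channel number derived from them,
and the comparison is done on the (channel, normalized address) pair so
that the toy has a unique answer. All names are hypothetical.

```python
from itertools import product


def toy_normalize(spa):
    """Toy stand-in for normalization (assumption, not the real hash)."""
    low = spa % 256          # bits 0-7 are kept
    mid = (spa >> 8) % 4     # bits 8-9 are dropped from the result
    high = spa >> 10         # bits 10 and up are kept
    channel = (mid + high) % 4
    return channel, high * 256 + low


def set_bits(value, positions, bits):
    """Set the given bit positions of value to the given 0/1 values."""
    for pos, bit in zip(positions, bits):
        value |= bit * 2 ** pos
    return value


def denormalize_by_search(partial_spa, missing_positions, target, normalize):
    """Try every combination of the missing bits and keep the one that
    normalizes back to the address being translated."""
    for bits in product((0, 1), repeat=len(missing_positions)):
        candidate = set_bits(partial_spa, missing_positions, bits)
        if normalize(candidate) == target:
            return candidate
    return None  # no combination matched


# Round trip: normalize a known SPA, then recover it by search.
spa = 0x12345
target = toy_normalize(spa)
norm = target[1]
partial = (norm // 256) * 1024 + norm % 256   # bits 8-9 left as zero
found = denormalize_by_search(partial, (8, 9), target, toy_normalize)
```

The number of candidates doubles with each missing bit, so this only
stays cheap while the set of unrecoverable bits is small.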

Signed-off-by: John Allen &lt;john.allen@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Link: https://lore.kernel.org/r/20240606203313.51197-6-john.allen@amd.com
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Validate address map when information is gathered</title>
<updated>2024-06-09T21:43:51Z</updated>
<author>
<name>John Allen</name>
<email>john.allen@amd.com</email>
</author>
<published>2024-06-06T20:33:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d5811a165caf63a69cd8ae11156b8587cc57d1d1'/>
<id>urn:sha1:d5811a165caf63a69cd8ae11156b8587cc57d1d1</id>
<content type='text'>
Validate address maps at the time the information is gathered as the
address map will not change during translation.

Signed-off-by: John Allen &lt;john.allen@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Link: https://lore.kernel.org/r/20240606203313.51197-5-john.allen@amd.com
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Expand helpers for adding and removing base and hole</title>
<updated>2024-06-09T21:43:36Z</updated>
<author>
<name>John Allen</name>
<email>john.allen@amd.com</email>
</author>
<published>2024-06-06T20:33:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6cce048cb31f272ca2c9b772cf541715b9ff6ca1'/>
<id>urn:sha1:6cce048cb31f272ca2c9b772cf541715b9ff6ca1</id>
<content type='text'>
The ret_addr field in struct addr_ctx contains the intermediate value of
the returned address as it passes through multiple steps in the
translation process. Currently, adding the DRAM base and legacy hole
is only done once, so it operates directly on the intermediate value.

However, for DF 4.5 non-power-of-2 denormalization, adding and removing
the DRAM base and legacy hole needs to be done for multiple temporary
address values. During this process, the intermediate value should not be
lost so the ret_addr value can't be reused.

Update the existing 'add' helper to operate on an arbitrary address
and introduce a new 'remove' helper to do the inverse operations.
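
A minimal sketch of the idea, assuming a legacy hole that ends at 4 GiB
and simplified, hypothetical signatures (the kernel helpers take the
address context instead of separate base and hole arguments):

```python
FOUR_GIB = 2 ** 32


def add_base_and_hole(addr, dram_base, hole_base=None):
    """Return addr with the DRAM base and legacy hole added.

    hole_base is None when the legacy hole is disabled.
    """
    addr += dram_base
    if hole_base is not None and addr >= hole_base:
        addr += FOUR_GIB - hole_base
    return addr


def remove_base_and_hole(addr, dram_base, hole_base=None):
    """Inverse of add_base_and_hole() for addresses it can produce."""
    if hole_base is not None and addr >= FOUR_GIB:
        addr -= FOUR_GIB - hole_base
    return addr - dram_base


# The intermediate value is passed in and never modified, so temporary
# addresses can be derived from it any number of times.
ret_addr = 0xD0000000
tmp = add_base_and_hole(ret_addr, dram_base=0, hole_base=0xC0000000)
back = remove_base_and_hole(tmp, dram_base=0, hole_base=0xC0000000)
```

Because both helpers return a new value instead of updating a field in
place, the caller decides when, if ever, ret_addr is overwritten.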

Signed-off-by: John Allen &lt;john.allen@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Link: https://lore.kernel.org/r/20240606203313.51197-4-john.allen@amd.com
</content>
</entry>
</feed>
