<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/edac, branch v5.4.3</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.3</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.3'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-12-13T07:43:30Z</updated>
<entry>
<title>EDAC/ghes: Fix locking and memory barrier issues</title>
<updated>2019-12-13T07:43:30Z</updated>
<author>
<name>Robert Richter</name>
<email>rrichter@marvell.com</email>
</author>
<published>2019-11-05T20:07:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d2136afb5193440c2fb62ae20e98f44b431e45ef'/>
<id>urn:sha1:d2136afb5193440c2fb62ae20e98f44b431e45ef</id>
<content type='text'>
[ Upstream commit 23f61b9fc5cc10d87f66e50518707eec2a0fbda1 ]

The ghes registration and refcount is broken in several ways:

 * ghes_edac_register() returns with success for a 2nd instance
   even if a first instance's registration is still running. This is
   not correct as the first instance may fail later. A subsequent
   registration may not finish before the first. Parallel registrations
   must be avoided.

 * The refcount was increased even if a registration failed. This
   leads to stale counters preventing the device from being released.

 * The ghes refcount may not be decremented properly on unregistration.
   Always decrement the refcount once ghes_edac_unregister() is called to
   keep the refcount sane.

 * The ghes_pvt pointer is handed to the irq handler before registration
   finished.

 * The mci structure could be freed while the irq handler is running.

Fix this by adding a mutex to ghes_edac_register(). This mutex
serializes instances to register and unregister. The refcount is only
increased if the registration succeeded. This makes sure the refcount is
in a consistent state after registering or unregistering a device.

Note: A spinlock cannot be used here as the code section may sleep.

The ghes_pvt is protected by ghes_lock now. This ensures the pointer is
not updated before registration was finished or while the irq handler is
running. It is unset before unregistering the device including necessary
(implicit) memory barriers making the changes visible to other CPUs.
Thus, the device can not be used anymore by an interrupt.

Also, rename ghes_init to ghes_refcount for better readability and
switch to refcount API.

A refcount is needed because there can be multiple GHES structures being
defined (see ACPI 6.3 specification, 18.3.2.7 Generic Hardware Error
Source, "Some platforms may describe multiple Generic Hardware Error
Source structures with different notification types, ...").

Another approach to use the mci's device refcount (get_device()) and
have a release function does not work here. A release function will be
called only for device_release() with the last put_device() call. The
device must be deleted *before* that with device_del(). This is only
possible by maintaining an own refcount.

 [ bp: touchups. ]

Fixes: 0fe5f281f749 ("EDAC, ghes: Model a single, logical memory controller")
Fixes: 1e72e673b9d1 ("EDAC/ghes: Fix Use after free in ghes_edac remove path")
Co-developed-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Co-developed-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Robert Richter &lt;rrichter@marvell.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "linux-edac@vger.kernel.org" &lt;linux-edac@vger.kernel.org&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20191105200732.3053-1-rrichter@marvell.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC/ghes: Fix Use after free in ghes_edac remove path</title>
<updated>2019-10-17T09:27:05Z</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2019-10-14T17:19:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1e72e673b9d102ff2e8333e74b3308d012ddf75b'/>
<id>urn:sha1:1e72e673b9d102ff2e8333e74b3308d012ddf75b</id>
<content type='text'>
ghes_edac models a single logical memory controller, and uses a global
ghes_init variable to ensure only the first ghes_edac_register() will
do anything.

ghes_edac is registered the first time a GHES entry in the HEST is
probed. There may be multiple entries, so subsequent attempts to
register ghes_edac are silently ignored as the work has already been
done.

When a GHES entry is unregistered, it calls ghes_edac_unregister(),
which free()s the memory behind the global variables in ghes_edac.

But there may be multiple GHES entries, the next call to
ghes_edac_unregister() will dereference the free()d memory, and attempt
to free it a second time.

This may also be triggered on a platform with one GHES entry, if the
driver is unbound/re-bound and unbound. The re-bind step will do
nothing because of ghes_init, the second unbind will then do the same
work as the first.

Doing the unregister work on the first call is unsafe, as another
CPU may be processing a notification in ghes_edac_report_mem_error(),
using the memory we are about to free.

ghes_init is already half of the reference counting. We only need
to do the register work for the first call, and the unregister work
for the last. Add the unregister check.

This means we no longer free ghes_edac's memory while there are
GHES entries that may receive a notification.

This was detected by KASAN and DEBUG_TEST_DRIVER_REMOVE.

 [ bp: merge into a single patch. ]

Fixes: 0fe5f281f749 ("EDAC, ghes: Model a single, logical memory controller")
Reported-by: John Garry &lt;john.garry@huawei.com&gt;
Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Robert Richter &lt;rrichter@marvell.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20191014171919.85044-2-james.morse@arm.com
Link: https://lkml.kernel.org/r/304df85b-8b56-b77e-1a11-aa23769f2e7c@huawei.com
</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm</title>
<updated>2019-09-22T16:39:09Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-09-22T16:39:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8808cf8cbc4da1ceef9307fba7e499563908c211'/>
<id>urn:sha1:8808cf8cbc4da1ceef9307fba7e499563908c211</id>
<content type='text'>
Pull ARM updates from Russell King:

 - fix various clang build and cppcheck issues

 - switch ARM to use new common outgoing-CPU-notification code

 - add some additional explanation about the boot code

 - kbuild "make clean" fixes

 - get rid of another "(____ptrval____)", this time for the VDSO code

 - avoid treating cache maintenance faults as a write

 - add a frame pointer unwinder implementation for clang

 - add EDAC support for Aurora L2 cache

 - improve robustness of adjust_lowmem_bounds() finding the bounds of
   lowmem.

 - add reset control for AMBA primecell devices

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (24 commits)
  ARM: 8906/1: drivers/amba: add reset control to amba bus probe
  ARM: 8905/1: Emit __gnu_mcount_nc when using Clang 10.0.0 or newer
  ARM: 8904/1: skip nomap memblocks while finding the lowmem/highmem boundary
  ARM: 8903/1: ensure that usable memory in bank 0 starts from a PMD-aligned address
  ARM: 8891/1: EDAC: armada_xp: Add support for more SoCs
  ARM: 8888/1: EDAC: Add driver for the Marvell Armada XP SDRAM and L2 cache ECC
  ARM: 8892/1: EDAC: Add missing debugfs_create_x32 wrapper
  ARM: 8890/1: l2x0: add marvell,ecc-enable property for aurora
  ARM: 8889/1: dt-bindings: document marvell,ecc-enable binding
  ARM: 8886/1: l2x0: support parity-enable/disable on aurora
  ARM: 8885/1: aurora-l2: add defines for parity and ECC registers
  ARM: 8887/1: aurora-l2: add prefix to MAX_RANGE_SIZE
  ARM: 8902/1: l2c: move cache-aurora-l2.h to asm/hardware
  ARM: 8900/1: UNWINDER_FRAME_POINTER implementation for Clang
  ARM: 8898/1: mm: Don't treat faults reported from cache maintenance as writes
  ARM: 8896/1: VDSO: Don't leak kernel addresses
  ARM: 8895/1: visit mach-* and plat-* directories when cleaning
  ARM: 8894/1: boot: Replace open-coded nop with macro
  ARM: 8893/1: boot: Explain the 8 nops
  ARM: 8876/1: fix O= building with CONFIG_FPE_FASTFPE
  ...
</content>
</entry>
<entry>
<title>Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2019-09-17T01:47:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-09-17T01:47:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=22331f895298bd23ca9f99f6a237aae883c9e1c7'/>
<id>urn:sha1:22331f895298bd23ca9f99f6a237aae883c9e1c7</id>
<content type='text'>
Pull x86 cpu-feature updates from Ingo Molnar:

 - Rework the Intel model names symbols/macros, which were decades of
   ad-hoc extensions and added random noise. It's now a coherent, easy
   to follow nomenclature.

 - Add new Intel CPU model IDs:
    - "Tiger Lake" desktop and mobile models
    - "Elkhart Lake" model ID
    - and the "Lightning Mountain" variant of Airmont, plus support code

 - Add the new AVX512_VP2INTERSECT instruction to cpufeatures

 - Remove Intel MPX user-visible APIs and the self-tests, because the
   toolchain (gcc) is not supporting it going forward. This is the
   first, lowest-risk phase of MPX removal.

 - Remove X86_FEATURE_MFENCE_RDTSC

 - Various smaller cleanups and fixes

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  x86/cpu: Update init data for new Airmont CPU model
  x86/cpu: Add new Airmont variant to Intel family
  x86/cpu: Add Elkhart Lake to Intel family
  x86/cpu: Add Tiger Lake to Intel family
  x86: Correct misc typos
  x86/intel: Add common OPTDIFFs
  x86/intel: Aggregate microserver naming
  x86/intel: Aggregate big core graphics naming
  x86/intel: Aggregate big core mobile naming
  x86/intel: Aggregate big core client naming
  x86/cpufeature: Explain the macro duplication
  x86/ftrace: Remove mcount() declaration
  x86/PCI: Remove superfluous returns from void functions
  x86/msr-index: Move AMD MSRs where they belong
  x86/cpu: Use constant definitions for CPU models
  lib: Remove redundant ftrace flag removal
  x86/crash: Remove unnecessary comparison
  x86/bitops: Use __builtin_constant_p() directly instead of IS_IMMEDIATE()
  x86: Remove X86_FEATURE_MFENCE_RDTSC
  x86/mpx: Remove MPX APIs
  ...
</content>
</entry>
<entry>
<title>EDAC/amd64: Add PCI device IDs for family 17h, model 70h</title>
<updated>2019-09-07T05:29:27Z</updated>
<author>
<name>Isaac Vaughn</name>
<email>isaac.vaughn@Knights.ucf.edu</email>
</author>
<published>2019-09-06T23:21:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3e443eb353eda6f4b4796e07f2599683fa752f1d'/>
<id>urn:sha1:3e443eb353eda6f4b4796e07f2599683fa752f1d</id>
<content type='text'>
Add the new Family 17h Model 70h PCI IDs (device 18h functions 0 and 6)
to the AMD64 EDAC module.

 [ bp: s/f17_base_addr_to_cs_size/f17_addr_mask_to_cs_size/g ]

Signed-off-by: Isaac Vaughn &lt;isaac.vaughn@knights.ucf.edu&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Cc: linux-edac@vger.kernel.org
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Robert Richter &lt;rrichter@marvell.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20190906192131.8ced0ca112146f32d82b6cae@knights.ucf.edu
</content>
</entry>
<entry>
<title>EDAC/mc_sysfs: Make debug messages consistent</title>
<updated>2019-09-04T09:39:19Z</updated>
<author>
<name>Robert Richter</name>
<email>rrichter@marvell.com</email>
</author>
<published>2019-09-02T12:33:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e701f412030ec3783f1c30c7741492693d6213e3'/>
<id>urn:sha1:e701f412030ec3783f1c30c7741492693d6213e3</id>
<content type='text'>
Debug messages are inconsistently used in the error handlers. Some lack
an error message, some are called regardless of the return status,
messages for the same error are at different locations in the code
depending on the error code. This happens esp. near put_device() calls.

Make those debug messages more consistent. Additionally, unify the error
messages to have the same terms for the same operations of the device.

Signed-off-by: Robert Richter &lt;rrichter@marvell.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: "linux-edac@vger.kernel.org" &lt;linux-edac@vger.kernel.org&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20190902123216.9809-5-rrichter@marvell.com
</content>
</entry>
<entry>
<title>EDAC/mc_sysfs: Remove pointless gotos</title>
<updated>2019-09-03T17:24:10Z</updated>
<author>
<name>Robert Richter</name>
<email>rrichter@marvell.com</email>
</author>
<published>2019-09-02T12:33:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=644110e17d26e51f9f33830753cc899500d5d384'/>
<id>urn:sha1:644110e17d26e51f9f33830753cc899500d5d384</id>
<content type='text'>
Use direct returns instead of gotos. Error handling code becomes
smaller and better readable.

Signed-off-by: Robert Richter &lt;rrichter@marvell.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: "linux-edac@vger.kernel.org" &lt;linux-edac@vger.kernel.org&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20190902123216.9809-4-rrichter@marvell.com
</content>
</entry>
<entry>
<title>EDAC: Prefer 'unsigned int' to bare use of 'unsigned'</title>
<updated>2019-09-03T17:21:19Z</updated>
<author>
<name>Robert Richter</name>
<email>rrichter@marvell.com</email>
</author>
<published>2019-09-02T12:33:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d55c79ac86f78fce3c224bda2b383edf96bb6438'/>
<id>urn:sha1:d55c79ac86f78fce3c224bda2b383edf96bb6438</id>
<content type='text'>
Use of 'unsigned int' instead of bare use of 'unsigned'. Fix this for
edac_mc*, ghes and the i5100 driver as reported by checkpatch.pl.

While at it, struct member dev_ch_attribute-&gt;channel is always used as
unsigned int. Change type to unsigned int to avoid type casts.

 [ bp: Massage. ]

Signed-off-by: Robert Richter &lt;rrichter@marvell.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: "linux-edac@vger.kernel.org" &lt;linux-edac@vger.kernel.org&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20190902123216.9809-2-rrichter@marvell.com
</content>
</entry>
<entry>
<title>ARM: 8891/1: EDAC: armada_xp: Add support for more SoCs</title>
<updated>2019-08-29T06:58:01Z</updated>
<author>
<name>Chris Packham</name>
<email>chris.packham@alliedtelesis.co.nz</email>
</author>
<published>2019-07-12T04:46:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=23d103ae3e061b466dc68dd10a035dcd3b2de99a'/>
<id>urn:sha1:23d103ae3e061b466dc68dd10a035dcd3b2de99a</id>
<content type='text'>
The Armada 38x and other integrated SoCs use a reduced pin count so the
width of the SDRAM interface is smaller than the Armada XP SoCs. This
means that the definition of "full" and "half" width is reduced from
64/32 to 32/16.

Signed-off-by: Chris Packham &lt;chris.packham@alliedtelesis.co.nz&gt;
Signed-off-by: Russell King &lt;rmk+kernel@armlinux.org.uk&gt;
</content>
</entry>
<entry>
<title>ARM: 8888/1: EDAC: Add driver for the Marvell Armada XP SDRAM and L2 cache ECC</title>
<updated>2019-08-29T06:58:01Z</updated>
<author>
<name>Jan Luebbe</name>
<email>jlu@pengutronix.de</email>
</author>
<published>2019-07-12T04:46:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7f6998a41257a8930ee5b6866ba56a25230841ed'/>
<id>urn:sha1:7f6998a41257a8930ee5b6866ba56a25230841ed</id>
<content type='text'>
Add support for the ECC functionality as found in the DDR RAM and L2
cache controllers on the MV78230/MV78x60 SoCs. This driver has been
tested on the MV78460 (on a custom board with a DDR3 ECC DIMM).

[cp use SPDX license]

Signed-off-by: Jan Luebbe &lt;jlu@pengutronix.de&gt;
Signed-off-by: Chris Packham &lt;chris.packham@alliedtelesis.co.nz&gt;
Reviewed-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Russell King &lt;rmk+kernel@armlinux.org.uk&gt;
</content>
</entry>
</feed>
