user/sven/linux.git/drivers/base, branch v4.9.16

base/memory, hotplug: fix a kernel oops in show_valid_zones()

2017-02-09T07:08:28Z

commit a96dfddbcc04336bbed50dc2b24823e45e09e80c upstream. Reading a sysfs "memoryN/valid_zones" file leads to the following oops when the first page of a range is not backed by struct page. show_valid_zones() assumes that 'start_pfn' is always valid for page_zone(). BUG: unable to handle kernel paging request at ffffea017a000000 IP: show_valid_zones+0x6f/0x160 This issue may happen on x86-64 systems with 64GiB or more memory since their memory block size is bumped up to 2GiB. [1] An example of such systems is desribed below. 0x3240000000 is only aligned by 1GiB and this memory block starts from 0x3200000000, which is not backed by struct page. BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable Since test_pages_in_a_zone() already checks holes, fix this issue by extending this function to return 'valid_start' and 'valid_end' for a given range. show_valid_zones() then proceeds with the valid range. [1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64 systems")' Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com Signed-off-by: Toshi Kani Cc: Greg Kroah-Hartman Cc: Zhang Zhen Cc: Reza Arbab Cc: David Rientjes Cc: Dan Williams Signed-off-by: Greg Kroah-Hartman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

memory_hotplug: make zone_can_shift() return a boolean value

2017-02-01T07:33:13Z

commit 8a1f780e7f28c7c1d640118242cf68d528c456cd upstream. online_{kernel|movable} is used to change the memory zone to ZONE_{NORMAL|MOVABLE} and online the memory. To check that memory zone can be changed, zone_can_shift() is used. Currently the function returns minus integer value, plus integer value and 0. When the function returns minus or plus integer value, it means that the memory zone can be changed to ZONE_{NORNAL|MOVABLE}. But when the function returns 0, there are two meanings. One of the meanings is that the memory zone does not need to be changed. For example, when memory is in ZONE_NORMAL and onlined by online_kernel the memory zone does not need to be changed. Another meaning is that the memory zone cannot be changed. When memory is in ZONE_NORMAL and onlined by online_movable, the memory zone may not be changed to ZONE_MOVALBE due to memory online limitation(see Documentation/memory-hotplug.txt). In this case, memory must not be onlined. The patch changes the return type of zone_can_shift() so that memory online operation fails when memory zone cannot be changed as follows: Before applying patch: # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal node_scanned 0 spanned 8388608 present 7864320 managed 7864320 # echo online_movable > memory4097/state # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal node_scanned 0 spanned 8388608 present 8388608 managed 8388608 online_movable operation succeeded. But memory is onlined as ZONE_NORMAL, not ZONE_MOVABLE. After applying patch: # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal node_scanned 0 spanned 8388608 present 7864320 managed 7864320 # echo online_movable > memory4097/state bash: echo: write error: Invalid argument # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal node_scanned 0 spanned 8388608 present 7864320 managed 7864320 online_movable operation failed because of failure of changing the memory zone from ZONE_NORMAL to ZONE_MOVABLE Fixes: df429ac03936 ("memory-hotplug: more general validation of zone during online") Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com Signed-off-by: Yasuaki Ishimatsu Reviewed-by: Reza Arbab Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman

PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend

2017-01-12T10:39:31Z

commit bed570307ed78f21b77cb04a1df781dee4a8f05a upstream. I noticed some wakeirq flakeyness with consumer drivers not using autosuspend. For drivers not using autosuspend, the wakeirq may never get unmasked in rpm_suspend() because of irq desc->depth. We are configuring dedicated wakeirqs to start with IRQ_NOAUTOEN as we naturally don't want them running until rpm_suspend() is called. However, when a consumer driver initially calls pm_runtime_get(), we now wrongly start with disable_irq_nosync() call on the dedicated wakeirq that is disabled to start with. This causes desc->depth to toggle between 1 and 2 instead of the usual 0 and 1. This can prevent enable_irq() from unmasking the wakeirq as that only happens at desc->depth 1. This does not necessarily show up with drivers using autosuspend as there is time for disable_irq_nosync() before rpm_suspend() gets called after the autosuspend timeout. Let's fix the issue by adding wirq->status that lazily gets set on the first rpm_suspend(). We also need PM runtime core private functions for dev_pm_enable_wake_irq_check() and dev_pm_disable_wake_irq_check() so we can enable the dedicated wakeirq on the first rpm_suspend(). While at it, let's also fix the comments for dev_pm_enable_wake_irq() and dev_pm_disable_wake_irq(). Those can still be used by the consumer drivers as needed because the IRQ core manages the interrupt usecount for us. Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling) Signed-off-by: Tony Lindgren Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman

firmware: fix usermode helper fallback loading

2017-01-09T07:32:22Z

commit 2e700f8d85975f516ccaad821278c1fe66b2cc98 upstream. When you use the firmware usermode helper fallback with a timeout value set to a value greater than INT_MAX (2147483647) a cast overflow issue causes the timeout value to go negative and breaks all usermode helper loading. This regression was introduced through commit 68ff2a00dbf5 ("firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()") on kernel v4.0. The firmware_class drivers relies on the firmware usermode helper fallback as a mechanism to look for firmware if the direct filesystem search failed only if: a) You've enabled CONFIG_FW_LOADER_USER_HELPER_FALLBACK (not many distros): Then all of these callers will rely on the fallback mechanism in case the firmware is not found through an initial direct filesystem lookup: o request_firmware() o request_firmware_into_buf() o request_firmware_nowait() b) If you've only enabled CONFIG_FW_LOADER_USER_HELPER (most distros): Then only callers using request_firmware_nowait() with the second argument set to false, this explicitly is requesting the UMH firmware fallback to be relied on in case the first filesystem lookup fails. Using Coccinelle SmPL grammar we have identified only two drivers explicitly requesting the UMH firmware fallback mechanism: - drivers/firmware/dell_rbu.c - drivers/leds/leds-lp55xx-common.c Since most distributions only enable CONFIG_FW_LOADER_USER_HELPER the biggest impact of this regression are users of the dell_rbu and leds-lp55xx-common device driver which required the UMH to find their respective needed firmwares. The default timeout for the UMH is set to 60 seconds always, as of commit 68ff2a00dbf5 ("firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()") the timeout was bumped to MAX_JIFFY_OFFSET ((LONG_MAX >> 1)-1). Additionally the MAX_JIFFY_OFFSET value was also used if the timeout was configured by a user to 0. The following works: echo 2147483647 > /sys/class/firmware/timeout But both of the following set the timeout to MAX_JIFFY_OFFSET even if we display 0 back to userspace: echo 2147483648 > /sys/class/firmware/timeout cat /sys/class/firmware/timeout 0 echo 0> /sys/class/firmware/timeout cat /sys/class/firmware/timeout 0 A max value of INT_MAX (2147483647) seconds is therefore implicit due to the another cast with simple_strtol(). This fixes the secondary cast (the first one is simple_strtol() but its an issue only by forcing an implicit limit) by re-using the timeout variable and only setting retval in appropriate cases. Lastly worth noting systemd had ripped out the UMH firmware fallback mechanism from udev since udev 2014 via commit be2ea723b1d023b3d ("udev: remove userspace firmware loading support"), so as of systemd v217. Signed-off-by: Yves-Alexis Perez Fixes: 68ff2a00dbf5 "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()" Cc: Luis R. Rodriguez Cc: Ming Lei Cc: Bjorn Andersson Cc: Greg Kroah-Hartman Acked-by: Luis R. Rodriguez Reviewed-by: Bjorn Andersson [mcgrof@kernel.org: gave commit log a whole lot of love] Signed-off-by: Luis R. Rodriguez Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman

PM / OPP: Don't use OPP structure outside of rcu protected section

2017-01-06T09:40:15Z

commit dc39d06fcd7a4a82d72eae7b71e94e888b96d29e upstream. The OPP structure must not be used out of the rcu protected section. Cache the values to be used in separate variables instead. Signed-off-by: Viresh Kumar Reviewed-by: Stephen Boyd Tested-by: Dave Gerlach Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman

PM / OPP: Pass opp_table to dev_pm_opp_put_regulator()

2017-01-06T09:40:15Z

commit 91291d9ad92faa65a56a9a19d658d8049b78d3d4 upstream. Joonyoung Shim reported an interesting problem on his ARM octa-core Odoroid-XU3 platform. During system suspend, dev_pm_opp_put_regulator() was failing for a struct device for which dev_pm_opp_set_regulator() is called earlier. This happened because an earlier call to dev_pm_opp_of_cpumask_remove_table() function (from cpufreq-dt.c file) removed all the entries from opp_table->dev_list apart from the last CPU device in the cpumask of CPUs sharing the OPP. But both dev_pm_opp_set_regulator() and dev_pm_opp_put_regulator() routines get CPU device for the first CPU in the cpumask. And so the OPP core failed to find the OPP table for the struct device. This patch attempts to fix this problem by returning a pointer to the opp_table from dev_pm_opp_set_regulator() and using that as the parameter to dev_pm_opp_put_regulator(). This ensures that the dev_pm_opp_put_regulator() doesn't fail to find the opp table. Note that similar design problem also exists with other dev_pm_opp_put_*() APIs, but those aren't used currently by anyone and so we don't need to update them for now. Reported-by: Joonyoung Shim Signed-off-by: Stephen Boyd Signed-off-by: Viresh Kumar [ Viresh: Wrote commit log and tested on exynos 5250 ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman

Merge tag 'driver-core-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

2016-11-13T18:22:07Z

Pull driver core fixes from Greg KH: "Here are two driver core fixes for 4.9-rc5. The first resolves an issue with some drivers not liking to be unbound and bound again (if CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled), which solves some reported problems with graphics and storage drivers. The other resolves a smatch error with the 4.9-rc1 driver core changes around this feature. Both have been in linux-next with no reported issues" * tag 'driver-core-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver core: fix smatch warning on dev->bus check driver core: skip removal test for non-removable drivers

PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails

2016-11-11T00:29:09Z

Consider two devices, A and B, where B is a child of A, and B utilizes asynchronous suspend (it does not matter whether A is sync or async). If B fails to suspend_noirq() or suspend_late(), or is interrupted by a wakeup (pm_wakeup_pending()), then it aborts and sets the async_error variable. However, device A does not (immediately) check the async_error variable; it may continue to run its own suspend_noirq()/suspend_late() callback. This is bad. We can resolve this problem by doing our error and wakeup checking (particularly, for the async_error flag) after waiting for children to suspend, instead of before. This also helps align the logic for the noirq and late suspend cases with the logic in __device_suspend(). It's easy to observe this erroneous behavior by, for example, forcing a device to sleep a bit in its suspend_noirq() (to ensure the parent is waiting for the child to complete), then return an error, and watch the parent suspend_noirq() still get called. (Or similarly, fake a wakeup event at the right (or is it wrong?) time.) Fixes: de377b397272 (PM / sleep: Asynchronous threads for suspend_late) Fixes: 28b6fd6e3779 (PM / sleep: Asynchronous threads for suspend_noirq) Reported-by: Jeffy Chen Signed-off-by: Brian Norris Signed-off-by: Rafael J. Wysocki

driver core: fix smatch warning on dev->bus check

2016-10-31T15:15:22Z

Commit d42a09802174 (driver core: skip removal test for non-removable drivers) introduced a smatch warning: drivers/base/dd.c:386 really_probe() warn: variable dereferenced before check 'dev->bus' (see line 373) Fix the warning by removing the dev->bus NULL check. dev->bus will never be NULL, so the check was unnecessary. Reported-by: Dan Carpenter Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman

driver core: skip removal test for non-removable drivers

2016-10-31T15:15:22Z

Some drivers do not support removal/unbinding. These drivers should have drv->suppress_bind_attrs set to true, so use that to skip the removal test. This doesn't fix anything reported so far, but should prevent some other cases. Some drivers will need fixes to set suppress_bind_attrs to avoid this test. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=177021 Fixes: bea5b158ff0d ("driver core: add test of driver remove calls during probe") Reported-by: Laszlo Ersek Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman