user/sven/linux.git/drivers/s390, branch v4.14.151

scsi: zfcp: fix reaction on bit error threshold notification

2019-10-29T08:17:37Z

[ Upstream commit 2190168aaea42c31bff7b9a967e7b045f07df095 ] On excessive bit errors for the FCP channel ingress fibre path, the channel notifies us. Previously, we only emitted a kernel message and a trace record. Since performance can become suboptimal with I/O timeouts due to bit errors, we now stop using an FCP device by default on channel notification so multipath on top can timely failover to other paths. A new module parameter zfcp.ber_stop can be used to get zfcp old behavior. User explanation of new kernel message: * Description: * The FCP channel reported that its bit error threshold has been exceeded. * These errors might result from a problem with the physical components * of the local fibre link into the FCP channel. * The problem might be damage or malfunction of the cable or * cable connection between the FCP channel and * the adjacent fabric switch port or the point-to-point peer. * Find details about the errors in the HBA trace for the FCP device. * The zfcp device driver closed down the FCP device * to limit the performance impact from possible I/O command timeouts. * User action: * Check for problems on the local fibre link, ensure that fibre optics are * clean and functional, and all cables are properly plugged. * After the repair action, you can manually recover the FCP device by * writing "0" into its "failed" sysfs attribute. * If recovery through sysfs is not possible, set the CHPID of the device * offline and back online on the service element. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: #2.6.30+ Link: https://lore.kernel.org/r/20191001104949.42810-1-maier@linux.ibm.com Reviewed-by: Jens Remus Reviewed-by: Benjamin Block Signed-off-by: Steffen Maier Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin

s390/cio: exclude subchannels with no parent from pseudo check

2019-10-11T16:18:28Z

commit ab5758848039de9a4b249d46e4ab591197eebaf2 upstream. ccw console is created early in start_kernel and used before css is initialized or ccw console subchannel is registered. Until then console subchannel does not have a parent. For that reason assume subchannels with no parent are not pseudo subchannels. This fixes the following kasan finding: BUG: KASAN: global-out-of-bounds in sch_is_pseudo_sch+0x8e/0x98 Read of size 8 at addr 00000000000005e8 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc8-07370-g6ac43dd12538 #2 Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) Call Trace: ([<000000000012cd76>] show_stack+0x14e/0x1e0) [<0000000001f7fb44>] dump_stack+0x1a4/0x1f8 [<00000000007d7afc>] print_address_description+0x64/0x3c8 [<00000000007d75f6>] __kasan_report+0x14e/0x180 [<00000000018a2986>] sch_is_pseudo_sch+0x8e/0x98 [<000000000189b950>] cio_enable_subchannel+0x1d0/0x510 [<00000000018cac7c>] ccw_device_recognition+0x12c/0x188 [<0000000002ceb1a8>] ccw_device_enable_console+0x138/0x340 [<0000000002cf1cbe>] con3215_init+0x25e/0x300 [<0000000002c8770a>] console_init+0x68a/0x9b8 [<0000000002c6a3d6>] start_kernel+0x4fe/0x728 [<0000000000100070>] startup_continue+0x70/0xd0 Cc: stable@vger.kernel.org Reviewed-by: Sebastian Ott Signed-off-by: Vasily Gorbik Signed-off-by: Greg Kroah-Hartman

s390/cio: avoid calling strlen on null pointer

2019-10-11T16:18:28Z

commit ea298e6ee8b34b3ed4366be7eb799d0650ebe555 upstream. Fix the following kasan finding: BUG: KASAN: global-out-of-bounds in ccwgroup_create_dev+0x850/0x1140 Read of size 1 at addr 0000000000000000 by task systemd-udevd.r/561 CPU: 30 PID: 561 Comm: systemd-udevd.r Tainted: G B Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: ([<0000000231b3db7e>] show_stack+0x14e/0x1a8) [<0000000233826410>] dump_stack+0x1d0/0x218 [<000000023216fac4>] print_address_description+0x64/0x380 [<000000023216f5a8>] __kasan_report+0x138/0x168 [<00000002331b8378>] ccwgroup_create_dev+0x850/0x1140 [<00000002332b618a>] group_store+0x3a/0x50 [<00000002323ac706>] kernfs_fop_write+0x246/0x3b8 [<00000002321d409a>] vfs_write+0x132/0x450 [<00000002321d47da>] ksys_write+0x122/0x208 [<0000000233877102>] system_call+0x2a6/0x2c8 Triggered by: openat(AT_FDCWD, "/sys/bus/ccwgroup/drivers/qeth/group", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 16 write(16, "0.0.bd00,0.0.bd01,0.0.bd02", 26) = 26 The problem is that __get_next_id in ccwgroup_create_dev might set "buf" buffer pointer to NULL and explicit check for that is required. Cc: stable@vger.kernel.org Reviewed-by: Sebastian Ott Signed-off-by: Vasily Gorbik Signed-off-by: Greg Kroah-Hartman

s390/qdio: add sanity checks to the fast-requeue path

2019-08-16T08:13:52Z

[ Upstream commit a6ec414a4dd529eeac5c3ea51c661daba3397108 ] If the device driver were to send out a full queue's worth of SBALs, current code would end up discovering the last of those SBALs as PRIMED and erroneously skip the SIGA-w. This immediately stalls the queue. Add a check to not attempt fast-requeue in this case. While at it also make sure that the state of the previous SBAL was successfully extracted before inspecting it. Signed-off-by: Julian Wiedmann Reviewed-by: Jens Remus Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin

vfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn

2019-08-16T08:13:50Z

[ Upstream commit c1ab69268d124ebdbb3864580808188ccd3ea355 ] So we don't call try to call vfio_unpin_pages() incorrectly. Fixes: 0a19e61e6d4c ("vfio: ccw: introduce channel program interfaces") Signed-off-by: Farhan Ali Reviewed-by: Eric Farman Reviewed-by: Cornelia Huck Message-Id: <33a89467ad6369196ae6edf820cbcb1e2d8d050c.1562854091.git.alifm@linux.ibm.com> Signed-off-by: Cornelia Huck Signed-off-by: Sasha Levin

s390/dasd: fix endless loop after read unit address configuration

2019-08-06T17:05:26Z

commit 41995342b40c418a47603e1321256d2c4a2ed0fb upstream. After getting a storage server event that causes the DASD device driver to update its unit address configuration during a device shutdown there is the possibility of an endless loop in the device driver. In the system log there will be ongoing DASD error messages with RC: -19. The reason is that the loop starting the ruac request only terminates when the retry counter is decreased to 0. But in the sleep_on function there are early exit paths that do not decrease the retry counter. Prevent an endless loop by handling those cases separately. Remove the unnecessary do..while loop since the sleep_on function takes care of retries by itself. Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1") Cc: stable@vger.kernel.org # 2.6.25+ Signed-off-by: Stefan Haberland Reviewed-by: Jan Hoeppner Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman

scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized

2019-08-06T17:05:22Z

[ Upstream commit 484647088826f2f651acbda6bcf9536b8a466703 ] GCC v9 emits this warning: CC drivers/s390/scsi/zfcp_erp.o drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_action_enqueue': drivers/s390/scsi/zfcp_erp.c:217:26: warning: 'erp_action' may be used uninitialized in this function [-Wmaybe-uninitialized] 217 | struct zfcp_erp_action *erp_action; | ^~~~~~~~~~ This is a possible false positive case, as also documented in the GCC documentations: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uninitialized The actual code-sequence is like this: Various callers can invoke the function below with the argument "want" being one of: ZFCP_ERP_ACTION_REOPEN_ADAPTER, ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, ZFCP_ERP_ACTION_REOPEN_PORT, or ZFCP_ERP_ACTION_REOPEN_LUN. zfcp_erp_action_enqueue(want, ...) ... need = zfcp_erp_required_act(want, ...) need = want ... maybe: need = ZFCP_ERP_ACTION_REOPEN_PORT maybe: need = ZFCP_ERP_ACTION_REOPEN_ADAPTER ... return need ... zfcp_erp_setup_act(need, ...) struct zfcp_erp_action *erp_action; // <== line 217 ... switch(need) { case ZFCP_ERP_ACTION_REOPEN_LUN: ... erp_action = &zfcp_sdev->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_PORT: case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: ... erp_action = &port->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_ADAPTER: ... erp_action = &adapter->erp_action; WARN_ON_ONCE(erp_action->port != NULL); // <== access ... break; } ... WARN_ON_ONCE(erp_action->adapter != adapter); // <== access When zfcp_erp_setup_act() is called, 'need' will never be anything else than one of the 4 possible enumeration-names that are used in the switch-case, and 'erp_action' is initialized for every one of them, before it is used. Thus the warning is a false positive, as documented. We introduce the extra if{} in the beginning to create an extra code-flow, so the compiler can be convinced that the switch-case will never see any other value. BUG_ON()/BUG() is intentionally not used to not crash anything, should this ever happen anyway - right now it's impossible, as argued above; and it doesn't introduce a 'default:' switch-case to retain warnings should 'enum zfcp_erp_act_type' ever be extended and no explicit case be introduced. See also v5.0 commit 399b6c8bc9f7 ("scsi: zfcp: drop old default switch case which might paper over missing case"). Signed-off-by: Benjamin Block Reviewed-by: Jens Remus Reviewed-by: Steffen Maier Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin

s390/qdio: handle PENDING state for QEBSM devices

2019-07-31T05:28:23Z

[ Upstream commit 04310324c6f482921c071444833e70fe861b73d9 ] When a CQ-enabled device uses QEBSM for SBAL state inspection, get_buf_states() can return the PENDING state for an Output Queue. get_outbound_buffer_frontier() isn't prepared for this, and any PENDING buffer will permanently stall all further completion processing on this Queue. This isn't a concern for non-QEBSM devices, as get_buf_states() for such devices will manually turn PENDING buffers into EMPTY ones. Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin

s390/qdio: don't touch the dsci in tiqdio_add_input_queues()

2019-07-21T07:04:42Z

commit ac6639cd3db607d386616487902b4cc1850a7be5 upstream. Current code sets the dsci to 0x00000080. Which doesn't make any sense, as the indicator area is located in the _left-most_ byte. Worse: if the dsci is the _shared_ indicator, this potentially clears the indication of activity for a _different_ device. tiqdio_thinint_handler() will then have no reason to call that device's IRQ handler, and the device ends up stalling. Fixes: d0c9d4a89fff ("[S390] qdio: set correct bit in dsci") Cc: Signed-off-by: Julian Wiedmann Signed-off-by: Vasily Gorbik Signed-off-by: Greg Kroah-Hartman

s390/qdio: (re-)initialize tiqdio list entries

2019-07-21T07:04:42Z

commit e54e4785cb5cb4896cf4285964aeef2125612fb2 upstream. When tiqdio_remove_input_queues() removes a queue from the tiq_list as part of qdio_shutdown(), it doesn't re-initialize the queue's list entry and the prev/next pointers go stale. If a subsequent qdio_establish() fails while sending the ESTABLISH cmd, it calls qdio_shutdown() again in QDIO_IRQ_STATE_ERR state and tiqdio_remove_input_queues() will attempt to remove the queue entry a second time. This dereferences the stale pointers, and bad things ensue. Fix this by re-initializing the list entry after removing it from the list. For good practice also initialize the list entry when the queue is first allocated, and remove the quirky checks that papered over this omission. Note that prior to commit e521813468f7 ("s390/qdio: fix access to uninitialized qdio_q fields"), these checks were bogus anyway. setup_queues_misc() clears the whole queue struct, and thus needs to re-init the prev/next pointers as well. Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.") Cc: Signed-off-by: Julian Wiedmann Signed-off-by: Vasily Gorbik Signed-off-by: Greg Kroah-Hartman