summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-08-04Linux 5.2.6v5.2.6Greg Kroah-Hartman
2019-08-04ceph: hold i_ceph_lock when removing caps for freeing inodeYan, Zheng
commit d6e47819721ae2d9d090058ad5570a66f3c42e39 upstream. ceph_d_revalidate(, LOOKUP_RCU) may call __ceph_caps_issued_mask() on a freeing inode. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04Fix allyesconfig output.Yoshinori Sato
commit 1b496469d0c020e09124e03e66a81421c21272a7 upstream. Conflict JCore-SoC and SolutionEngine 7619. Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctlMiroslav Lichvar
commit 5515e9a6273b8c02034466bcbd717ac9f53dab99 upstream. The PPS assert/clear offset corrections are set by the PPS_SETPARAMS ioctl in the pps_ktime structs, which also contain flags. The flags are not initialized by applications (using the timepps.h header) and they are not used by the kernel for anything except returning them back in the PPS_GETPARAMS ioctl. Set the flags to zero to make it clear they are unused and avoid leaking uninitialized data of the PPS_SETPARAMS caller to other applications that have a read access to the PPS device. Link: http://lkml.kernel.org/r/20190702092251.24303-1-mlichvar@redhat.com Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Cc: Greg KH <greg@kroah.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04/proc/<pid>/cmdline: add back the setproctitle() special caseLinus Torvalds
commit d26d0cd97c88eb1a5704b42e41ab443406807810 upstream. This makes the setproctitle() special case very explicit indeed, and handles it with a separate helper function entirely. In the process, it re-instates the original semantics of simply stopping at the first NUL character when the original last NUL character is no longer there. [ The original semantics can still be seen in mm/util.c: get_cmdline() that is limited to a fixed-size buffer ] This makes the logic about when we use the string lengths etc much more obvious, and makes it easier to see what we do and what the two very different cases are. Note that even when we allow walking past the end of the argument array (because the setproctitle() might have overwritten and overflowed the original argv[] strings), we only allow it when it overflows into the environment region if it is immediately adjacent. [ Fixed for missing 'count' checks noted by Alexey Izbyshev ] Link: https://lore.kernel.org/lkml/alpine.LNX.2.21.1904052326230.3249@kich.toxcorp.com/ Fixes: 5ab827189965 ("fs/proc: simplify and clarify get_mm_cmdline() function") Cc: Jakub Jankowski <shasta@toxcorp.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04/proc/<pid>/cmdline: remove all the special casesLinus Torvalds
commit 3d712546d8ba9f25cdf080d79f90482aa4231ed4 upstream. Start off with a clean slate that only reads exactly from arg_start to arg_end, without any oddities. This simplifies the code and in the process removes the case that caused us to potentially leak an uninitialized byte from the temporary kernel buffer. Note that in order to start from scratch with an understandable base, this simplifies things _too_ much, and removes all the legacy logic to handle setproctitle() having changed the argument strings. We'll add back those special cases very differently in the next commit. Link: https://lore.kernel.org/lkml/20190712160913.17727-1-izbyshev@ispras.ru/ Fixes: f5b65348fd77 ("proc: fix missing final NUL in get_mm_cmdline() rewrite") Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04sched/fair: Use RCU accessors consistently for ->numa_groupJann Horn
commit cb361d8cdef69990f6b4504dc1fd9a594d983c97 upstream. The old code used RCU annotations and accessors inconsistently for ->numa_group, which can lead to use-after-frees and NULL dereferences. Let all accesses to ->numa_group use proper RCU helpers to prevent such issues. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Fixes: 8c8a743c5087 ("sched/numa: Use {cpu, pid} to create task groups for shared faults") Link: https://lkml.kernel.org/r/20190716152047.14424-3-jannh@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04sched/fair: Don't free p->numa_faults with concurrent readersJann Horn
commit 16d51a590a8ce3befb1308e0e7ab77f3b661af33 upstream. When going through execve(), zero out the NUMA fault statistics instead of freeing them. During execve, the task is reachable through procfs and the scheduler. A concurrent /proc/*/sched reader can read data from a freed ->numa_faults allocation (confirmed by KASAN) and write it back to userspace. I believe that it would also be possible for a use-after-free read to occur through a race between a NUMA fault and execve(): task_numa_fault() can lead to task_numa_compare(), which invokes task_weight() on the currently running task of a different CPU. Another way to fix this would be to make ->numa_faults RCU-managed or add extra locking, but it seems easier to wipe the NUMA fault statistics on execve. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Fixes: 82727018b0d3 ("sched/numa: Call task_numa_free() from do_execve()") Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04Bluetooth: hci_uart: check for missing tty operationsVladis Dronov
commit b36a1552d7319bbfd5cf7f08726c23c5c66d4f73 upstream. Certain ttys operations (pty_unix98_ops) lack tiocmget() and tiocmset() functions which are called by the certain HCI UART protocols (hci_ath, hci_bcm, hci_intel, hci_mrvl, hci_qca) via hci_uart_set_flow_control() or directly. This leads to an execution at NULL and can be triggered by an unprivileged user. Fix this by adding a helper function and a check for the missing tty operations in the protocols code. This fixes CVE-2019-10207. The Fixes: lines list commits where calls to tiocm[gs]et() or hci_uart_set_flow_control() were added to the HCI UART protocols. Link: https://syzkaller.appspot.com/bug?id=1b42faa2848963564a5b1b7f8c837ea7b55ffa50 Reported-by: syzbot+79337b501d6aa974d0f6@syzkaller.appspotmail.com Cc: stable@vger.kernel.org # v2.6.36+ Fixes: b3190df62861 ("Bluetooth: Support for Atheros AR300x serial chip") Fixes: 118612fb9165 ("Bluetooth: hci_bcm: Add suspend/resume PM functions") Fixes: ff2895592f0f ("Bluetooth: hci_intel: Add Intel baudrate configuration support") Fixes: 162f812f23ba ("Bluetooth: hci_uart: Add Marvell support") Fixes: fa9ad876b8e0 ("Bluetooth: hci_qca: Add support for Qualcomm Bluetooth chip wcn3990") Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Reviewed-by: Yu-Chen, Cho <acho@suse.com> Tested-by: Yu-Chen, Cho <acho@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04nvme: fix multipath crash when ANA is deactivatedMarta Rybczynska
commit 66b20ac0a1a10769d059d6903202f53494e3d902 upstream. Fix a crash with multipath activated. It happends when ANA log page is larger than MDTS and because of that ANA is disabled. The driver then tries to access unallocated buffer when connecting to a nvme target. The signature is as follows: [ 300.433586] nvme nvme0: ANA log page size (8208) larger than MDTS (8192). [ 300.435387] nvme nvme0: disabling ANA support. [ 300.437835] nvme nvme0: creating 4 I/O queues. [ 300.459132] nvme nvme0: new ctrl: NQN "nqn.0.0.0", addr 10.91.0.1:8009 [ 300.464609] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 300.466342] #PF error: [normal kernel read fault] [ 300.467385] PGD 0 P4D 0 [ 300.467987] Oops: 0000 [#1] SMP PTI [ 300.468787] CPU: 3 PID: 50 Comm: kworker/u8:1 Not tainted 5.0.20kalray+ #4 [ 300.470264] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 300.471532] Workqueue: nvme-wq nvme_scan_work [nvme_core] [ 300.472724] RIP: 0010:nvme_parse_ana_log+0x21/0x140 [nvme_core] [ 300.474038] Code: 45 01 d2 d8 48 98 c3 66 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 08 48 8b af 20 0a 00 00 48 89 34 24 <66> 83 7d 08 00 0f 84 c6 00 00 00 44 8b 7d 14 49 89 d5 8b 55 10 48 [ 300.477374] RSP: 0018:ffffa50e80fd7cb8 EFLAGS: 00010296 [ 300.478334] RAX: 0000000000000001 RBX: ffff9130f1872258 RCX: 0000000000000000 [ 300.479784] RDX: ffffffffc06c4c30 RSI: ffff9130edad4280 RDI: ffff9130f1872258 [ 300.481488] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044 [ 300.483203] R10: 0000000000000220 R11: 0000000000000040 R12: ffff9130f18722c0 [ 300.484928] R13: ffff9130f18722d0 R14: ffff9130edad4280 R15: ffff9130f18722c0 [ 300.486626] FS: 0000000000000000(0000) GS:ffff9130f7b80000(0000) knlGS:0000000000000000 [ 300.488538] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 300.489907] CR2: 0000000000000008 CR3: 00000002365e6000 CR4: 00000000000006e0 [ 300.491612] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 300.493303] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 300.494991] Call Trace: [ 300.495645] nvme_mpath_add_disk+0x5c/0xb0 [nvme_core] [ 300.496880] nvme_validate_ns+0x2ef/0x550 [nvme_core] [ 300.498105] ? nvme_identify_ctrl.isra.45+0x6a/0xb0 [nvme_core] [ 300.499539] nvme_scan_work+0x2b4/0x370 [nvme_core] [ 300.500717] ? __switch_to_asm+0x35/0x70 [ 300.501663] process_one_work+0x171/0x380 [ 300.502340] worker_thread+0x49/0x3f0 [ 300.503079] kthread+0xf8/0x130 [ 300.503795] ? max_active_store+0x80/0x80 [ 300.504690] ? kthread_bind+0x10/0x10 [ 300.505502] ret_from_fork+0x35/0x40 [ 300.506280] Modules linked in: nvme_tcp nvme_rdma rdma_cm iw_cm ib_cm ib_core nvme_fabrics nvme_core xt_physdev ip6table_raw ip6table_mangle ip6table_filter ip6_tables xt_comment iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_CHECKSUM iptable_mangle iptable_filter veth ebtable_filter ebtable_nat ebtables iptable_raw vxlan ip6_udp_tunnel udp_tunnel sunrpc joydev pcspkr virtio_balloon br_netfilter bridge stp llc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net virtio_console net_failover virtio_blk failover ata_piix serio_raw libata virtio_pci virtio_ring virtio [ 300.514984] CR2: 0000000000000008 [ 300.515569] ---[ end trace faa2eefad7e7f218 ]--- [ 300.516354] RIP: 0010:nvme_parse_ana_log+0x21/0x140 [nvme_core] [ 300.517330] Code: 45 01 d2 d8 48 98 c3 66 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 08 48 8b af 20 0a 00 00 48 89 34 24 <66> 83 7d 08 00 0f 84 c6 00 00 00 44 8b 7d 14 49 89 d5 8b 55 10 48 [ 300.520353] RSP: 0018:ffffa50e80fd7cb8 EFLAGS: 00010296 [ 300.521229] RAX: 0000000000000001 RBX: ffff9130f1872258 RCX: 0000000000000000 [ 300.522399] RDX: ffffffffc06c4c30 RSI: ffff9130edad4280 RDI: ffff9130f1872258 [ 300.523560] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044 [ 300.524734] R10: 0000000000000220 R11: 0000000000000040 R12: ffff9130f18722c0 [ 300.525915] R13: ffff9130f18722d0 R14: ffff9130edad4280 R15: ffff9130f18722c0 [ 300.527084] FS: 0000000000000000(0000) GS:ffff9130f7b80000(0000) knlGS:0000000000000000 [ 300.528396] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 300.529440] CR2: 0000000000000008 CR3: 00000002365e6000 CR4: 00000000000006e0 [ 300.530739] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 300.531989] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 300.533264] Kernel panic - not syncing: Fatal exception [ 300.534338] Kernel Offset: 0x17c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 300.536227] ---[ end Kernel panic - not syncing: Fatal exception ]--- Condition check refactoring from Christoph Hellwig. Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu> Tested-by: Jean-Baptiste Riaux <jbriaux@kalray.eu> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04xfrm: policy: fix bydst hlist corruption on hash rebuildFlorian Westphal
commit fd709721352dd5239056eacaded00f2244e6ef58 upstream. syzbot reported following spat: BUG: KASAN: use-after-free in __write_once_size include/linux/compiler.h:221 BUG: KASAN: use-after-free in hlist_del_rcu include/linux/rculist.h:455 BUG: KASAN: use-after-free in xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318 Write of size 8 at addr ffff888095e79c00 by task kworker/1:3/8066 Workqueue: events xfrm_hash_rebuild Call Trace: __write_once_size include/linux/compiler.h:221 [inline] hlist_del_rcu include/linux/rculist.h:455 [inline] xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318 process_one_work+0x814/0x1130 kernel/workqueue.c:2269 Allocated by task 8064: __kmalloc+0x23c/0x310 mm/slab.c:3669 kzalloc include/linux/slab.h:742 [inline] xfrm_hash_alloc+0x38/0xe0 net/xfrm/xfrm_hash.c:21 xfrm_policy_init net/xfrm/xfrm_policy.c:4036 [inline] xfrm_net_init+0x269/0xd60 net/xfrm/xfrm_policy.c:4120 ops_init+0x336/0x420 net/core/net_namespace.c:130 setup_net+0x212/0x690 net/core/net_namespace.c:316 The faulting address is the address of the old chain head, free'd by xfrm_hash_resize(). In xfrm_hash_rehash(), chain heads get re-initialized without any hlist_del_rcu: for (i = hmask; i >= 0; i--) INIT_HLIST_HEAD(odst + i); Then, hlist_del_rcu() gets called on the about to-be-reinserted policy when iterating the per-net list of policies. hlist_del_rcu() will then make chain->first be nonzero again: static inline void __hlist_del(struct hlist_node *n) { struct hlist_node *next = n->next; // address of next element in list struct hlist_node **pprev = n->pprev;// location of previous elem, this // can point at chain->first WRITE_ONCE(*pprev, next); // chain->first points to next elem if (next) next->pprev = pprev; Then, when we walk chainlist to find insertion point, we may find a non-empty list even though we're supposedly reinserting the first policy to an empty chain. To fix this first unlink all exact and inexact policies instead of zeroing the list heads. Add the commands equivalent to the syzbot reproducer to xfrm_policy.sh, without fix KASAN catches the corruption as it happens, SLUB poisoning detects it a bit later. Reported-by: syzbot+0165480d4ef07360eeda@syzkaller.appspotmail.com Fixes: 1548bc4e0512 ("xfrm: policy: delete inexact policies from inexact list on hash rebuild") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04media: radio-raremono: change devm_k*alloc to k*allocLuke Nowakowski-Krijger
commit c666355e60ddb4748ead3bdd983e3f7f2224aaf0 upstream. Change devm_k*alloc to k*alloc to manually allocate memory The manual allocation and freeing of memory is necessary because when the USB radio is disconnected, the memory associated with devm_k*alloc is freed. Meaning if we still have unresolved references to the radio device, then we get use-after-free errors. This patch fixes this by manually allocating memory, and freeing it in the v4l2.release callback that gets called when the last radio device exits. Reported-and-tested-by: syzbot+a4387f5b6b799f6becbf@syzkaller.appspotmail.com Signed-off-by: Luke Nowakowski-Krijger <lnowakow@eng.ucsd.edu> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> [hverkuil-cisco@xs4all.nl: cleaned up two small checkpatch.pl warnings] [hverkuil-cisco@xs4all.nl: prefix subject with driver name] Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04NFS: Cleanup if nfs_match_client is interruptedBenjamin Coddington
commit 9f7761cf0409465075dadb875d5d4b8ef2f890c8 upstream. Don't bail out before cleaning up a new allocation if the wait for searching for a matching nfs client is interrupted. Memory leaks. Reported-by: syzbot+7fe11b49c1cc30e3fce2@syzkaller.appspotmail.com Fixes: 950a578c6128 ("NFS: make nfs_match_client killable") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04media: pvrusb2: use a different format for warningsAndrey Konovalov
commit 1753c7c4367aa1201e1e5d0a601897ab33444af1 upstream. When the pvrusb2 driver detects that there's something wrong with the device, it prints a warning message. Right now those message are printed in two different formats: 1. ***WARNING*** message here 2. WARNING: message here There's an issue with the second format. Syzkaller recognizes it as a message produced by a WARN_ON(), which is used to indicate a bug in the kernel. However pvrusb2 prints those warnings to indicate an issue with the device, not the bug in the kernel. This patch changes the pvrusb2 driver to consistently use the first warning message format. This will unblock syzkaller testing of this driver. Reported-by: syzbot+af8f8d2ac0d39b0ed3a0@syzkaller.appspotmail.com Reported-by: syzbot+170a86bf206dd2c6217e@syzkaller.appspotmail.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04media: cpia2_usb: first wake up, then free in disconnectOliver Neukum
commit eff73de2b1600ad8230692f00bc0ab49b166512a upstream. Kasan reported a use after free in cpia2_usb_disconnect() It first freed everything and then woke up those waiting. The reverse order is correct. Fixes: 6c493f8b28c67 ("[media] cpia2: major overhaul to get it in a working state again") Signed-off-by: Oliver Neukum <oneukum@suse.com> Reported-by: syzbot+0c90fc937c84f97d0aa6@syzkaller.appspotmail.com Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04ath10k: Change the warning message stringFabio Estevam
commit 265df32eae5845212ad9f55f5ae6b6dcb68b187b upstream. The "WARNING" string confuses syzbot, which thinks it found a crash [1]. Change the string to avoid such problem. [1] https://lkml.org/lkml/2019/5/9/243 Reported-by: syzbot+c1b25598aa60dcd47e78@syzkaller.appspotmail.com Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04media: au0828: fix null dereference in error pathSean Young
commit 6d0d1ff9ff21fbb06b867c13a1d41ce8ddcd8230 upstream. au0828_usb_disconnect() gets the au0828_dev struct via usb_get_intfdata, so it needs to set up for the error paths. Reported-by: syzbot+357d86bcb4cca1a2f572@syzkaller.appspotmail.com Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04bpf: fix NULL deref in btf_type_is_resolve_source_onlyStanislav Fomichev
commit e4f07120210a1794c1f1ae64d209a2fbc7bd2682 upstream. Commit 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") added invocations of btf_type_is_resolve_source_only before btf_type_nosize_or_null which checks for the NULL pointer. Swap the order of btf_type_nosize_or_null and btf_type_is_resolve_source_only to make sure the do the NULL pointer check first. Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04ISDN: hfcsusb: checking idx of ep configurationPhong Tran
commit f384e62a82ba5d85408405fdd6aeff89354deaa9 upstream. The syzbot test with random endpoint address which made the idx is overflow in the table of endpoint configuations. this adds the checking for fixing the error report from syzbot KASAN: stack-out-of-bounds Read in hfcsusb_probe [1] The patch tested by syzbot [2] Reported-by: syzbot+8750abbc3a46ef47d509@syzkaller.appspotmail.com [1]: https://syzkaller.appspot.com/bug?id=30a04378dac680c5d521304a00a86156bb913522 [2]: https://groups.google.com/d/msg/syzkaller-bugs/_6HBdge8F3E/OJn7wVNpBAAJ Signed-off-by: Phong Tran <tranmanphong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04vsock: correct removal of socket from the listSunil Muthuswamy
commit d5afa82c977ea06f7119058fa0eb8519ea501031 upstream. The current vsock code for removal of socket from the list is both subject to race and inefficient. It takes the lock, checks whether the socket is in the list, drops the lock and if the socket was on the list, deletes it from the list. This is subject to race because as soon as the lock is dropped once it is checked for presence, that condition cannot be relied upon for any decision. It is also inefficient because if the socket is present in the list, it takes the lock twice. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31Linux 5.2.5v5.2.5Greg Kroah-Hartman
2019-07-31io_uring: don't use iov_iter_advance() for fixed buffersJens Axboe
commit bd11b3a391e3df6fa958facbe4b3f9f4cca9bd49 upstream. Hrvoje reports that when a large fixed buffer is registered and IO is being done to the latter pages of said buffer, the IO submission time is much worse: reading to the start of the buffer: 11238 ns reading to the end of the buffer: 1039879 ns In fact, it's worse by two orders of magnitude. The reason for that is how io_uring figures out how to setup the iov_iter. We point the iter at the first bvec, and then use iov_iter_advance() to fast-forward to the offset within that buffer we need. However, that is abysmally slow, as it entails iterating the bvecs that we setup as part of buffer registration. There's really no need to use this generic helper, as we know it's a BVEC type iterator, and we also know that each bvec is PAGE_SIZE in size, apart from possibly the first and last. Hence we can just use a shift on the offset to find the right index, and then adjust the iov_iter appropriately. After this fix, the timings are: reading to the start of the buffer: 10135 ns reading to the end of the buffer: 1377 ns Or about an 755x improvement for the tail page. Reported-by: Hrvoje Zeba <zeba.hrvoje@gmail.com> Tested-by: Hrvoje Zeba <zeba.hrvoje@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31io_uring: fix counter inc/dec mismatch in async_listZhengyuan Liu
commit f7b76ac9d17e16e44feebb6d2749fec92bfd6dd4 upstream. We could queue a work for each req in defer and link list without increasing async_list->cnt, so we shouldn't decrease it while exiting from workqueue as well if we didn't process the req in async list. Thanks to Jens Axboe <axboe@kernel.dk> for his guidance. Fixes: 31b515106428 ("io_uring: allow workqueue item to handle multiple buffered requests") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31io_uring: ensure ->list is initialized for poll commandsJens Axboe
commit 36703247d5f52a679df9da51192b6950fe81689f upstream. Daniel reports that when testing an http server that uses io_uring to poll for incoming connections, sometimes it hard crashes. This is due to an uninitialized list member for the io_uring request. Normally this doesn't trigger and none of the test cases caught it. Reported-by: Daniel Kozak <kozzi11@gmail.com> Tested-by: Daniel Kozak <kozzi11@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31io_uring: add a memory barrier before atomic_readZhengyuan Liu
commit c0e48f9dea9129aa11bec3ed13803bcc26e96e49 upstream. There is a hang issue while using fio to do some basic test. The issue can be easily reproduced using the below script: while true do fio --ioengine=io_uring -rw=write -bs=4k -numjobs=1 \ -size=1G -iodepth=64 -name=uring --filename=/dev/zero done After several minutes (or more), fio would block at io_uring_enter->io_cqring_wait in order to waiting for previously committed sqes to be completed and can't return to user anymore until we send a SIGTERM to fio. After receiving SIGTERM, fio hangs at io_ring_ctx_wait_and_kill with a backtrace like this: [54133.243816] Call Trace: [54133.243842] __schedule+0x3a0/0x790 [54133.243868] schedule+0x38/0xa0 [54133.243880] schedule_timeout+0x218/0x3b0 [54133.243891] ? sched_clock+0x9/0x10 [54133.243903] ? wait_for_completion+0xa3/0x130 [54133.243916] ? _raw_spin_unlock_irq+0x2c/0x40 [54133.243930] ? trace_hardirqs_on+0x3f/0xe0 [54133.243951] wait_for_completion+0xab/0x130 [54133.243962] ? wake_up_q+0x70/0x70 [54133.243984] io_ring_ctx_wait_and_kill+0xa0/0x1d0 [54133.243998] io_uring_release+0x20/0x30 [54133.244008] __fput+0xcf/0x270 [54133.244029] ____fput+0xe/0x10 [54133.244040] task_work_run+0x7f/0xa0 [54133.244056] do_exit+0x305/0xc40 [54133.244067] ? get_signal+0x13b/0xbd0 [54133.244088] do_group_exit+0x50/0xd0 [54133.244103] get_signal+0x18d/0xbd0 [54133.244112] ? _raw_spin_unlock_irqrestore+0x36/0x60 [54133.244142] do_signal+0x34/0x720 [54133.244171] ? exit_to_usermode_loop+0x7e/0x130 [54133.244190] exit_to_usermode_loop+0xc0/0x130 [54133.244209] do_syscall_64+0x16b/0x1d0 [54133.244221] entry_SYSCALL_64_after_hwframe+0x49/0xbe The reason is that we had added a req to ctx->pending_async at the very end, but it didn't get a chance to be processed. How could this happen? fio#cpu0 wq#cpu1 io_add_to_prev_work io_sq_wq_submit_work atomic_read() <<< 1 atomic_dec_return() << 1->0 list_empty(); <<< true; list_add_tail() atomic_read() << 0 or 1? As atomic_ops.rst states, atomic_read does not guarantee that the runtime modification by any other thread is visible yet, so we must take care of that with a proper implicit or explicit memory barrier. This issue was detected with the help of Jackie's <liuyun01@kylinos.cn> Fixes: 31b515106428 ("io_uring: allow workqueue item to handle multiple buffered requests") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31access: avoid the RCU grace period for the temporary subjective credentialsLinus Torvalds
commit d7852fbd0f0423937fa287a598bfde188bb68c22 upstream. It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU work because it installs a temporary credential that gets allocated and freed for each system call. The allocation and freeing overhead is mostly benign, but because credentials can be accessed under the RCU read lock, the freeing involves a RCU grace period. Which is not a huge deal normally, but if you have a lot of access() calls, this causes a fair amount of seconday damage: instead of having a nice alloc/free patterns that hits in hot per-CPU slab caches, you have all those delayed free's, and on big machines with hundreds of cores, the RCU overhead can end up being enormous. But it turns out that all of this is entirely unnecessary. Exactly because access() only installs the credential as the thread-local subjective credential, the temporary cred pointer doesn't actually need to be RCU free'd at all. Once we're done using it, we can just free it synchronously and avoid all the RCU overhead. So add a 'non_rcu' flag to 'struct cred', which can be set by users that know they only use it in non-RCU context (there are other potential users for this). We can make it a union with the rcu freeing list head that we need for the RCU case, so this doesn't need any extra storage. Note that this also makes 'get_current_cred()' clear the new non_rcu flag, in case we have filesystems that take a long-term reference to the cred and then expect the RCU delayed freeing afterwards. It's not entirely clear that this is required, but it makes for clear semantics: the subjective cred remains non-RCU as long as you only access it synchronously using the thread-local accessors, but you _can_ use it as a generic cred if you want to. It is possible that we should just remove the whole RCU markings for ->cred entirely. Only ->real_cred is really supposed to be accessed through RCU, and the long-term cred copies that nfs uses might want to explicitly re-enable RCU freeing if required, rather than have get_current_cred() do it implicitly. But this is a "minimal semantic changes" change for the immediate problem. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jan Glauber <jglauber@marvell.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com> Cc: Greg KH <greg@kroah.com> Cc: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31drm/i915: Make the semaphore saturation mask globalChris Wilson
commit 44d89409a12eb8333735958509d7d591b461d13d upstream. The idea behind keeping the saturation mask local to a context backfired spectacularly. The premise with the local mask was that we would be more proactive in attempting to use semaphores after each time the context idled, and that all new contexts would attempt to use semaphores ignoring the current state of the system. This turns out to be horribly optimistic. If the system state is still oversaturated and the existing workloads have all stopped using semaphores, the new workloads would attempt to use semaphores and be deprioritised behind real work. The new contexts would not switch off using semaphores until their initial batch of low priority work had completed. Given sufficient backload load of equal user priority, this would completely starve the new work of any GPU time. To compensate, remove the local tracking in favour of keeping it as global state on the engine -- once the system is saturated and semaphores are disabled, everyone stops attempting to use semaphores until the system is idle again. One of the reason for preferring local context tracking was that it worked with virtual engines, so for switching to global state we could either do a complete check of all the virtual siblings or simply disable semaphores for those requests. This takes the simpler approach of disabling semaphores on virtual engines. The downside is that the decision that the engine is saturated is a local measure -- we are only checking whether or not this context was scheduled in a timely fashion, it may be legitimately delayed due to user priorities. We still have the same dilemma though, that we do not want to employ the semaphore poll unless it will be used. v2: Explain why we need to assume the worst wrt virtual engines. Fixes: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-8-chris@chris-wilson.co.uk Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31structleak: disable STRUCTLEAK_BYREF in combination with KASAN_STACKArnd Bergmann
commit 173e6ee21e2b3f477f07548a79c43b8d9cfbb37d upstream. The combination of KASAN_STACK and GCC_PLUGIN_STRUCTLEAK_BYREF leads to much larger kernel stack usage, as seen from the warnings about functions that now exceed the 2048 byte limit: drivers/media/i2c/tvp5150.c:253:1: error: the frame size of 3936 bytes is larger than 2048 bytes drivers/media/tuners/r820t.c:1327:1: error: the frame size of 2816 bytes is larger than 2048 bytes drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:16552:1: error: the frame size of 3144 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] fs/ocfs2/aops.c:1892:1: error: the frame size of 2088 bytes is larger than 2048 bytes fs/ocfs2/dlm/dlmrecovery.c:737:1: error: the frame size of 2088 bytes is larger than 2048 bytes fs/ocfs2/namei.c:1677:1: error: the frame size of 2584 bytes is larger than 2048 bytes fs/ocfs2/super.c:1186:1: error: the frame size of 2640 bytes is larger than 2048 bytes fs/ocfs2/xattr.c:3678:1: error: the frame size of 2176 bytes is larger than 2048 bytes net/bluetooth/l2cap_core.c:7056:1: error: the frame size of 2144 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] net/bluetooth/l2cap_core.c: In function 'l2cap_recv_frame': net/bridge/br_netlink.c:1505:1: error: the frame size of 2448 bytes is larger than 2048 bytes net/ieee802154/nl802154.c:548:1: error: the frame size of 2232 bytes is larger than 2048 bytes net/wireless/nl80211.c:1726:1: error: the frame size of 2224 bytes is larger than 2048 bytes net/wireless/nl80211.c:2357:1: error: the frame size of 4584 bytes is larger than 2048 bytes net/wireless/nl80211.c:5108:1: error: the frame size of 2760 bytes is larger than 2048 bytes net/wireless/nl80211.c:6472:1: error: the frame size of 2112 bytes is larger than 2048 bytes The structleak plugin was previously disabled for CONFIG_COMPILE_TEST, but meant we missed some bugs, so this time we should address them. The frame size warnings are distracting, and risking a kernel stack overflow is generally not beneficial to performance, so it may be best to disallow that particular combination. This can be done by turning off either one. I picked the dependency in GCC_PLUGIN_STRUCTLEAK_BYREF and GCC_PLUGIN_STRUCTLEAK_BYREF_ALL, as this option is designed to make uninitialized stack usage less harmful when enabled on its own, but it also prevents KASAN from detecting those cases in which it was in fact needed. KASAN_STACK is currently implied by KASAN on gcc, but could be made a user selectable option if we want to allow combining (non-stack) KASAN with GCC_PLUGIN_STRUCTLEAK_BYREF. Note that it would be possible to specifically address the files that print the warning, but presumably the overall stack usage is still significantly higher than in other configurations, so this would not address the full problem. I could not test this with CONFIG_INIT_STACK_ALL, which may or may not suffer from a similar problem. Fixes: 81a56f6dcd20 ("gcc-plugins: structleak: Generalize to all variable types") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20190722114134.3123901-1-arnd@arndb.de Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()Dan Williams
commit b70d31d054ee3a6fc1034b9d7fc0ae1e481aa018 upstream. In preparation for fixing a deadlock between wait_for_bus_probe_idle() and the nvdimm_bus_list_mutex arrange for __nd_ioctl() without nvdimm_bus_list_mutex held. This also unifies the 'dimm' and 'bus' level ioctls into a common nd_ioctl() preamble implementation. Marked for -stable as it is a pre-requisite for a follow-on fix. Cc: <stable@vger.kernel.org> Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation") Cc: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Jane Chu <jane.chu@oracle.com> Link: https://lore.kernel.org/r/156341209518.292348.7183897251740665198.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31libnvdimm/region: Register badblocks before namespacesDan Williams
commit 700cd033a82d466ad8f9615f9985525e45f8960a upstream. Namespace activation expects to be able to reference region badblocks. The following warning sometimes triggers when asynchronous namespace activation races in front of the completion of namespace probing. Move all possible namespace probing after region badblocks initialization. Otherwise, lockdep sometimes catches the uninitialized state of the badblocks seqlock with stack trace signatures like: INFO: trying to register non-static key. pmem2: detected capacity change from 0 to 136365211648 the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 9 PID: 358 Comm: kworker/u80:5 Tainted: G OE 5.2.0-rc4+ #3382 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 Workqueue: events_unbound async_run_entry_fn Call Trace: dump_stack+0x85/0xc0 pmem1.12: detected capacity change from 0 to 8589934592 register_lock_class+0x56a/0x570 ? check_object+0x140/0x270 __lock_acquire+0x80/0x1710 ? __mutex_lock+0x39d/0x910 lock_acquire+0x9e/0x180 ? nd_pfn_validate+0x28f/0x440 [libnvdimm] badblocks_check+0x93/0x1f0 ? nd_pfn_validate+0x28f/0x440 [libnvdimm] nd_pfn_validate+0x28f/0x440 [libnvdimm] ? lockdep_hardirqs_on+0xf0/0x180 nd_dax_probe+0x9a/0x120 [libnvdimm] nd_pmem_probe+0x6d/0x180 [nd_pmem] nvdimm_bus_probe+0x90/0x2c0 [libnvdimm] Fixes: 48af2f7e52f4 ("libnvdimm, pfn: during init, clear errors...") Cc: <stable@vger.kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/156341208365.292348.1547528796026249120.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31libnvdimm/bus: Prevent duplicate device_unregister() callsDan Williams
commit 8aac0e2338916e273ccbd438a2b7a1e8c61749f5 upstream. A multithreaded namespace creation/destruction stress test currently fails with signatures like the following: sysfs group 'power' not found for kobject 'dax1.1' RIP: 0010:sysfs_remove_group+0x76/0x80 Call Trace: device_del+0x73/0x370 device_unregister+0x16/0x50 nd_async_device_unregister+0x1e/0x30 [libnvdimm] async_run_entry_fn+0x39/0x160 process_one_work+0x23c/0x5e0 worker_thread+0x3c/0x390 BUG: kernel NULL pointer dereference, address: 0000000000000020 RIP: 0010:klist_put+0x1b/0x6c Call Trace: klist_del+0xe/0x10 device_del+0x8a/0x2c9 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 device_unregister+0x44/0x4f nd_async_device_unregister+0x22/0x2d [libnvdimm] async_run_entry_fn+0x47/0x15a process_one_work+0x1a2/0x2eb worker_thread+0x1b8/0x26e Use the kill_device() helper to atomically resolve the race of multiple threads issuing kill, device_unregister(), requests. Reported-by: Jane Chu <jane.chu@oracle.com> Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com> Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver...") Cc: <stable@vger.kernel.org> Link: https://github.com/pmem/ndctl/issues/96 Tested-by: Tested-by: Jane Chu <jane.chu@oracle.com> Link: https://lore.kernel.org/r/156341207846.292348.10435719262819764054.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31drivers/base: Introduce kill_device()Dan Williams
commit 00289cd87676e14913d2d8492d1ce05c4baafdae upstream. The libnvdimm subsystem arranges for devices to be destroyed as a result of a sysfs operation. Since device_unregister() cannot be called from an actively running sysfs attribute of the same device libnvdimm arranges for device_unregister() to be performed in an out-of-line async context. The driver core maintains a 'dead' state for coordinating its own racing async registration / de-registration requests. Rather than add local 'dead' state tracking infrastructure to libnvdimm device objects, export the existing state tracking via a new kill_device() helper. The kill_device() helper simply marks the device as dead, i.e. that it is on its way to device_del(), or returns that the device was already dead. This can be used in advance of calling device_unregister() for subsystems like libnvdimm that might need to handle multiple user threads racing to delete a device. This refactoring does not change any behavior, but it is a pre-requisite for follow-on fixes and therefore marked for -stable. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver...") Cc: <stable@vger.kernel.org> Tested-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/156341207332.292348.14959761496009347574.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVAJoerg Roedel
commit 201c1db90cd643282185a00770f12f95da330eca upstream. The stub function for !CONFIG_IOMMU_IOVA needs to be 'static inline'. Fixes: effa467870c76 ('iommu/vt-d: Don't queue_iova() if there is no flush queue') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31iommu/iova: Remove stale cached32_nodeChris Wilson
commit 9eed17d37c77171cf5ffb95c4257f87df3cd4c8f upstream. Since the cached32_node is allowed to be advanced above dma_32bit_pfn (to provide a shortcut into the limited range), we need to be careful to remove the to be freed node if it is the cached32_node. [ 48.477773] BUG: KASAN: use-after-free in __cached_rbnode_delete_update+0x68/0x110 [ 48.477812] Read of size 8 at addr ffff88870fc19020 by task kworker/u8:1/37 [ 48.477843] [ 48.477879] CPU: 1 PID: 37 Comm: kworker/u8:1 Tainted: G U 5.2.0+ #735 [ 48.477915] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017 [ 48.478047] Workqueue: i915 __i915_gem_free_work [i915] [ 48.478075] Call Trace: [ 48.478111] dump_stack+0x5b/0x90 [ 48.478137] print_address_description+0x67/0x237 [ 48.478178] ? __cached_rbnode_delete_update+0x68/0x110 [ 48.478212] __kasan_report.cold.3+0x1c/0x38 [ 48.478240] ? __cached_rbnode_delete_update+0x68/0x110 [ 48.478280] ? __cached_rbnode_delete_update+0x68/0x110 [ 48.478308] __cached_rbnode_delete_update+0x68/0x110 [ 48.478344] private_free_iova+0x2b/0x60 [ 48.478378] iova_magazine_free_pfns+0x46/0xa0 [ 48.478403] free_iova_fast+0x277/0x340 [ 48.478443] fq_ring_free+0x15a/0x1a0 [ 48.478473] queue_iova+0x19c/0x1f0 [ 48.478597] cleanup_page_dma.isra.64+0x62/0xb0 [i915] [ 48.478712] __gen8_ppgtt_cleanup+0x63/0x80 [i915] [ 48.478826] __gen8_ppgtt_cleanup+0x42/0x80 [i915] [ 48.478940] __gen8_ppgtt_clear+0x433/0x4b0 [i915] [ 48.479053] __gen8_ppgtt_clear+0x462/0x4b0 [i915] [ 48.479081] ? __sg_free_table+0x9e/0xf0 [ 48.479116] ? kfree+0x7f/0x150 [ 48.479234] i915_vma_unbind+0x1e2/0x240 [i915] [ 48.479352] i915_vma_destroy+0x3a/0x280 [i915] [ 48.479465] __i915_gem_free_objects+0xf0/0x2d0 [i915] [ 48.479579] __i915_gem_free_work+0x41/0xa0 [i915] [ 48.479607] process_one_work+0x495/0x710 [ 48.479642] worker_thread+0x4c7/0x6f0 [ 48.479687] ? process_one_work+0x710/0x710 [ 48.479724] kthread+0x1b2/0x1d0 [ 48.479774] ? kthread_create_worker_on_cpu+0xa0/0xa0 [ 48.479820] ret_from_fork+0x1f/0x30 [ 48.479864] [ 48.479907] Allocated by task 631: [ 48.479944] save_stack+0x19/0x80 [ 48.479994] __kasan_kmalloc.constprop.6+0xc1/0xd0 [ 48.480038] kmem_cache_alloc+0x91/0xf0 [ 48.480082] alloc_iova+0x2b/0x1e0 [ 48.480125] alloc_iova_fast+0x58/0x376 [ 48.480166] intel_alloc_iova+0x90/0xc0 [ 48.480214] intel_map_sg+0xde/0x1f0 [ 48.480343] i915_gem_gtt_prepare_pages+0xb8/0x170 [i915] [ 48.480465] huge_get_pages+0x232/0x2b0 [i915] [ 48.480590] ____i915_gem_object_get_pages+0x40/0xb0 [i915] [ 48.480712] __i915_gem_object_get_pages+0x90/0xa0 [i915] [ 48.480834] i915_gem_object_prepare_write+0x2d6/0x330 [i915] [ 48.480955] create_test_object.isra.54+0x1a9/0x3e0 [i915] [ 48.481075] igt_shared_ctx_exec+0x365/0x3c0 [i915] [ 48.481210] __i915_subtests.cold.4+0x30/0x92 [i915] [ 48.481341] __run_selftests.cold.3+0xa9/0x119 [i915] [ 48.481466] i915_live_selftests+0x3c/0x70 [i915] [ 48.481583] i915_pci_probe+0xe7/0x220 [i915] [ 48.481620] pci_device_probe+0xe0/0x180 [ 48.481665] really_probe+0x163/0x4e0 [ 48.481710] device_driver_attach+0x85/0x90 [ 48.481750] __driver_attach+0xa5/0x180 [ 48.481796] bus_for_each_dev+0xda/0x130 [ 48.481831] bus_add_driver+0x205/0x2e0 [ 48.481882] driver_register+0xca/0x140 [ 48.481927] do_one_initcall+0x6c/0x1af [ 48.481970] do_init_module+0x106/0x350 [ 48.482010] load_module+0x3d2c/0x3ea0 [ 48.482058] __do_sys_finit_module+0x110/0x180 [ 48.482102] do_syscall_64+0x62/0x1f0 [ 48.482147] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 48.482190] [ 48.482224] Freed by task 37: [ 48.482273] save_stack+0x19/0x80 [ 48.482318] __kasan_slab_free+0x12e/0x180 [ 48.482363] kmem_cache_free+0x70/0x140 [ 48.482406] __free_iova+0x1d/0x30 [ 48.482445] fq_ring_free+0x15a/0x1a0 [ 48.482490] queue_iova+0x19c/0x1f0 [ 48.482624] cleanup_page_dma.isra.64+0x62/0xb0 [i915] [ 48.482749] __gen8_ppgtt_cleanup+0x63/0x80 [i915] [ 48.482873] __gen8_ppgtt_cleanup+0x42/0x80 [i915] [ 48.482999] __gen8_ppgtt_clear+0x433/0x4b0 [i915] [ 48.483123] __gen8_ppgtt_clear+0x462/0x4b0 [i915] [ 48.483250] i915_vma_unbind+0x1e2/0x240 [i915] [ 48.483378] i915_vma_destroy+0x3a/0x280 [i915] [ 48.483500] __i915_gem_free_objects+0xf0/0x2d0 [i915] [ 48.483622] __i915_gem_free_work+0x41/0xa0 [i915] [ 48.483659] process_one_work+0x495/0x710 [ 48.483704] worker_thread+0x4c7/0x6f0 [ 48.483748] kthread+0x1b2/0x1d0 [ 48.483787] ret_from_fork+0x1f/0x30 [ 48.483831] [ 48.483868] The buggy address belongs to the object at ffff88870fc19000 [ 48.483868] which belongs to the cache iommu_iova of size 40 [ 48.483920] The buggy address is located 32 bytes inside of [ 48.483920] 40-byte region [ffff88870fc19000, ffff88870fc19028) [ 48.483964] The buggy address belongs to the page: [ 48.484006] page:ffffea001c3f0600 refcount:1 mapcount:0 mapping:ffff8888181a91c0 index:0x0 compound_mapcount: 0 [ 48.484045] flags: 0x8000000000010200(slab|head) [ 48.484096] raw: 8000000000010200 ffffea001c421a08 ffffea001c447e88 ffff8888181a91c0 [ 48.484141] raw: 0000000000000000 0000000000120012 00000001ffffffff 0000000000000000 [ 48.484188] page dumped because: kasan: bad access detected [ 48.484230] [ 48.484265] Memory state around the buggy address: [ 48.484314] ffff88870fc18f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 48.484361] ffff88870fc18f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 48.484406] >ffff88870fc19000: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc [ 48.484451] ^ [ 48.484494] ffff88870fc19080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 48.484530] ffff88870fc19100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108602 Fixes: e60aa7b53845 ("iommu/iova: Extend rbtree node caching") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: <stable@vger.kernel.org> # v4.15+ Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31iommu/vt-d: Don't queue_iova() if there is no flush queueDmitry Safonov
commit effa467870c7612012885df4e246bdb8ffd8e44c upstream. Intel VT-d driver was reworked to use common deferred flushing implementation. Previously there was one global per-cpu flush queue, afterwards - one per domain. Before deferring a flush, the queue should be allocated and initialized. Currently only domains with IOMMU_DOMAIN_DMA type initialize their flush queue. It's probably worth to init it for static or unmanaged domains too, but it may be arguable - I'm leaving it to iommu folks. Prevent queuing an iova flush if the domain doesn't have a queue. The defensive check seems to be worth to keep even if queue would be initialized for all kinds of domains. And is easy backportable. On 4.19.43 stable kernel it has a user-visible effect: previously for devices in si domain there were crashes, on sata devices: BUG: spinlock bad magic on CPU#6, swapper/0/1 lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x45/0x115 intel_unmap+0x107/0x113 intel_unmap_sg+0x6b/0x76 __ata_qc_complete+0x7f/0x103 ata_qc_complete+0x9b/0x26a ata_qc_complete_multiple+0xd0/0xe3 ahci_handle_port_interrupt+0x3ee/0x48a ahci_handle_port_intr+0x73/0xa9 ahci_single_level_irq_intr+0x40/0x60 __handle_irq_event_percpu+0x7f/0x19a handle_irq_event_percpu+0x32/0x72 handle_irq_event+0x38/0x56 handle_edge_irq+0x102/0x121 handle_irq+0x147/0x15c do_IRQ+0x66/0xf2 common_interrupt+0xf/0xf RIP: 0010:__do_softirq+0x8c/0x2df The same for usb devices that use ehci-pci: BUG: spinlock bad magic on CPU#0, swapper/0/1 lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x77/0x145 intel_unmap+0x107/0x113 intel_unmap_page+0xe/0x10 usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d usb_hcd_unmap_urb_for_dma+0x17/0x100 unmap_urb_for_dma+0x22/0x24 __usb_hcd_giveback_urb+0x51/0xc3 usb_giveback_urb_bh+0x97/0xde tasklet_action_common.isra.4+0x5f/0xa1 tasklet_action+0x2d/0x30 __do_softirq+0x138/0x2df irq_exit+0x7d/0x8b smp_apic_timer_interrupt+0x10f/0x151 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39 Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: <stable@vger.kernel.org> # 4.14+ Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31io_uring: fix the sequence comparison in io_sequence_deferZhengyuan Liu
commit dbd0f6d6c2a11eb9c31ca9cd454f95bb5713e92e upstream. sq->cached_sq_head and cq->cached_cq_tail are both unsigned int. If cached_sq_head overflows before cached_cq_tail, then we may miss a barrier req. As cached_cq_tail always follows cached_sq_head, the NQ should be enough. Cc: stable@vger.kernel.org Fixes: de0617e46717 ("io_uring: add support for marking commands as draining") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31powerpc/pmu: Set pmcregs_in_use in paca when running as LPARSuraj Jitindar Singh
commit 28d2a6e6684d9851905f379816d8a4d03587ed94 upstream. The ability to run nested guests under KVM means that a guest can also act as a hypervisor for it's own nested guest. Currently ppc_set_pmu_inuse() assumes that either FW_FEATURE_LPAR is set, indicating a guest environment, and so sets the pmcregs_in_use flag in the lppaca, or that it isn't set, indicating a hypervisor environment, and so sets the pmcregs_in_use flag in the paca. The pmcregs_in_use flag in the lppaca is used to communicate this information to a hypervisor and so must be set in a guest environment. The pmcregs_in_use flag in the paca is used by KVM code to determine whether the host state of the performance monitoring unit (PMU) must be saved and restored when running a guest. Thus when a guest also acts as a hypervisor it must set this bit in both places since it needs to ensure both that the real hypervisor saves it's PMU registers when it runs (requires pmcregs_in_use flag in lppaca), and that it saves it's own PMU registers when running a nested guest (requires pmcregs_in_use flag in paca). Modify ppc_set_pmu_inuse() so that the pmcregs_in_use bit is set in both the lppaca and the paca when a guest (LPAR) is running with the capability of running it's own guests (CONFIG_KVM_BOOK3S_HV_POSSIBLE). Fixes: 95a6432ce903 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190703012022.15644-2-sjitindarsingh@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31powerpc/tm: Fix oops on sigreturn on systems without TMMichael Neuling
commit f16d80b75a096c52354c6e0a574993f3b0dfbdfe upstream. On systems like P9 powernv where we have no TM (or P8 booted with ppc_tm=off), userspace can construct a signal context which still has the MSR TS bits set. The kernel tries to restore this context which results in the following crash: Unexpected TM Bad Thing exception at c0000000000022fc (msr 0x8000000102a03031) tm_scratch=800000020280f033 Oops: Unrecoverable exception, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 0 PID: 1636 Comm: sigfuz Not tainted 5.2.0-11043-g0a8ad0ffa4 #69 NIP: c0000000000022fc LR: 00007fffb2d67e48 CTR: 0000000000000000 REGS: c00000003fffbd70 TRAP: 0700 Not tainted (5.2.0-11045-g7142b497d8) MSR: 8000000102a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[E]> CR: 42004242 XER: 00000000 CFAR: c0000000000022e0 IRQMASK: 0 GPR00: 0000000000000072 00007fffb2b6e560 00007fffb2d87f00 0000000000000669 GPR04: 00007fffb2b6e728 0000000000000000 0000000000000000 00007fffb2b6f2a8 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000000000 00007fffb2b76900 0000000000000000 0000000000000000 GPR16: 00007fffb2370000 00007fffb2d84390 00007fffea3a15ac 000001000a250420 GPR20: 00007fffb2b6f260 0000000010001770 0000000000000000 0000000000000000 GPR24: 00007fffb2d843a0 00007fffea3a14a0 0000000000010000 0000000000800000 GPR28: 00007fffea3a14d8 00000000003d0f00 0000000000000000 00007fffb2b6e728 NIP [c0000000000022fc] rfi_flush_fallback+0x7c/0x80 LR [00007fffb2d67e48] 0x7fffb2d67e48 Call Trace: Instruction dump: e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 7d2903a6 e92d0c00 e94d0c08 e96d0c10 e82d0c18 7db242a6 <4c000024> 7db243a6 7db142a6 f82d0c18 The problem is the signal code assumes TM is enabled when CONFIG_PPC_TRANSACTIONAL_MEM is enabled. This may not be the case as with P9 powernv or if `ppc_tm=off` is used on P8. This means any local user can crash the system. Fix the problem by returning a bad stack frame to the user if they try to set the MSR TS bits with sigreturn() on systems where TM is not supported. Found with sigfuz kernel selftest on P9. This fixes CVE-2019-13648. Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9 Reported-by: Praveen Pandey <Praveen.Pandey@in.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190719050502.405-1-mikey@neuling.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31powerpc/mm: Limit rma_size to 1TB when running without HV modeSuraj Jitindar Singh
commit da0ef93310e67ae6902efded60b6724dab27a5d1 upstream. The virtual real mode addressing (VRMA) mechanism is used when a partition is using HPT (Hash Page Table) translation and performs real mode accesses (MSR[IR|DR] = 0) in non-hypervisor mode. In this mode effective address bits 0:23 are treated as zero (i.e. the access is aliased to 0) and the access is performed using an implicit 1TB SLB entry. The size of the RMA (Real Memory Area) is communicated to the guest as the size of the first memory region in the device tree. And because of the mechanism described above can be expected to not exceed 1TB. In the event that the host erroneously represents the RMA as being larger than 1TB, guest accesses in real mode to memory addresses above 1TB will be aliased down to below 1TB. This means that a memory access performed in real mode may differ to one performed in virtual mode for the same memory address, which would likely have unintended consequences. To avoid this outcome have the guest explicitly limit the size of the RMA to the current maximum, which is 1TB. This means that even if the first memory block is larger than 1TB, only the first 1TB should be accessed in real mode. Fixes: c610d65c0ad0 ("powerpc/pseries: lift RTAS limit for hash") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190710052018.14628-1-sjitindarsingh@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31powerpc/xive: Fix loop exit-condition in xive_find_target_in_mask()Gautham R. Shenoy
commit 4d202c8c8ed3822327285747db1765967110b274 upstream. xive_find_target_in_mask() has the following for(;;) loop which has a bug when @first == cpumask_first(@mask) and condition 1 fails to hold for every CPU in @mask. In this case we loop forever in the for-loop. first = cpu; for (;;) { if (cpu_online(cpu) && xive_try_pick_target(cpu)) // condition 1 return cpu; cpu = cpumask_next(cpu, mask); if (cpu == first) // condition 2 break; if (cpu >= nr_cpu_ids) // condition 3 cpu = cpumask_first(mask); } This is because, when @first == cpumask_first(@mask), we never hit the condition 2 (cpu == first) since prior to this check, we would have executed "cpu = cpumask_next(cpu, mask)" which will set the value of @cpu to a value greater than @first or to nr_cpus_ids. When this is coupled with the fact that condition 1 is not met, we will never exit this loop. This was discovered by the hard-lockup detector while running LTP test concurrently with SMT switch tests. watchdog: CPU 12 detected hard LOCKUP on other CPUs 68 watchdog: CPU 12 TB:85587019220796, last SMP heartbeat TB:85578827223399 (15999ms ago) watchdog: CPU 68 Hard LOCKUP watchdog: CPU 68 TB:85587019361273, last heartbeat TB:85576815065016 (19930ms ago) CPU: 68 PID: 45050 Comm: hxediag Kdump: loaded Not tainted 4.18.0-100.el8.ppc64le #1 NIP: c0000000006f5578 LR: c000000000cba9ec CTR: 0000000000000000 REGS: c000201fff3c7d80 TRAP: 0100 Not tainted (4.18.0-100.el8.ppc64le) MSR: 9000000002883033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE> CR: 24028424 XER: 00000000 CFAR: c0000000006f558c IRQMASK: 1 GPR00: c0000000000afc58 c000201c01c43400 c0000000015ce500 c000201cae26ec18 GPR04: 0000000000000800 0000000000000540 0000000000000800 00000000000000f8 GPR08: 0000000000000020 00000000000000a8 0000000080000000 c00800001a1beed8 GPR12: c0000000000b1410 c000201fff7f4c00 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000540 0000000000000001 GPR20: 0000000000000048 0000000010110000 c00800001a1e3780 c000201cae26ed18 GPR24: 0000000000000000 c000201cae26ed8c 0000000000000001 c000000001116bc0 GPR28: c000000001601ee8 c000000001602494 c000201cae26ec18 000000000000001f NIP [c0000000006f5578] find_next_bit+0x38/0x90 LR [c000000000cba9ec] cpumask_next+0x2c/0x50 Call Trace: [c000201c01c43400] [c000201cae26ec18] 0xc000201cae26ec18 (unreliable) [c000201c01c43420] [c0000000000afc58] xive_find_target_in_mask+0x1b8/0x240 [c000201c01c43470] [c0000000000b0228] xive_pick_irq_target.isra.3+0x168/0x1f0 [c000201c01c435c0] [c0000000000b1470] xive_irq_startup+0x60/0x260 [c000201c01c43640] [c0000000001d8328] __irq_startup+0x58/0xf0 [c000201c01c43670] [c0000000001d844c] irq_startup+0x8c/0x1a0 [c000201c01c436b0] [c0000000001d57b0] __setup_irq+0x9f0/0xa90 [c000201c01c43760] [c0000000001d5aa0] request_threaded_irq+0x140/0x220 [c000201c01c437d0] [c00800001a17b3d4] bnx2x_nic_load+0x188c/0x3040 [bnx2x] [c000201c01c43950] [c00800001a187c44] bnx2x_self_test+0x1fc/0x1f70 [bnx2x] [c000201c01c43a90] [c000000000adc748] dev_ethtool+0x11d8/0x2cb0 [c000201c01c43b60] [c000000000b0b61c] dev_ioctl+0x5ac/0xa50 [c000201c01c43bf0] [c000000000a8d4ec] sock_do_ioctl+0xbc/0x1b0 [c000201c01c43c60] [c000000000a8dfb8] sock_ioctl+0x258/0x4f0 [c000201c01c43d20] [c0000000004c9704] do_vfs_ioctl+0xd4/0xa70 [c000201c01c43de0] [c0000000004ca274] sys_ioctl+0xc4/0x160 [c000201c01c43e30] [c00000000000b388] system_call+0x5c/0x70 Instruction dump: 78aad182 54a806be 3920ffff 78a50664 794a1f24 7d294036 7d43502a 7d295039 4182001c 48000034 78a9d182 79291f24 <7d23482a> 2fa90000 409e0020 38a50040 To fix this, move the check for condition 2 after the check for condition 3, so that we are able to break out of the loop soon after iterating through all the CPUs in the @mask in the problem case. Use do..while() to achieve this. Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Reported-by: Indira P. Joga <indira.priya@in.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1563359724-13931-1-git-send-email-ego@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31powerpc/dma: Fix invalid DMA mmap behaviorShawn Anastasio
commit b4fc36e60f25cf22bf8b7b015a701015740c3743 upstream. The refactor of powerpc DMA functions in commit 6666cc17d780 ("powerpc/dma: remove dma_nommu_mmap_coherent") incorrectly changes the way DMA mappings are handled on powerpc. Since this change, all mapped pages are marked as cache-inhibited through the default implementation of arch_dma_mmap_pgprot. This differs from the previous behavior of only marking pages in noncoherent mappings as cache-inhibited and has resulted in sporadic system crashes in certain hardware configurations and workloads (see Bugzilla). This commit restores the previous correct behavior by providing an implementation of arch_dma_mmap_pgprot that only marks pages in noncoherent mappings as cache-inhibited. As this behavior should be universal for all powerpc platforms a new file, dma-generic.c, was created to store it. Fixes: 6666cc17d780 ("powerpc/dma: remove dma_nommu_mmap_coherent") # NOTE: fixes commit 6666cc17d780 released in v5.1. # Consider a stable tag: # Cc: stable@vger.kernel.org # v5.1+ # NOTE: fixes commit 6666cc17d780 released in v5.1. # Consider a stable tag: # Cc: stable@vger.kernel.org # v5.1+ Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Shawn Anastasio <shawn@anastas.io> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190717235437.12908-1-shawn@anastas.io Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31ALSA: hda - Add a conexant codec entry to let mute led workHui Wang
commit 3f8809499bf02ef7874254c5e23fc764a47a21a0 upstream. This conexant codec isn't in the supported codec list yet, the hda generic driver can drive this codec well, but on a Lenovo machine with mute/mic-mute leds, we need to apply CXT_FIXUP_THINKPAD_ACPI to make the leds work. After adding this codec to the list, the driver patch_conexant.c will apply THINKPAD_ACPI to this machine. Cc: stable@vger.kernel.org Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chipsTakashi Iwai
commit 2756d9143aa517b97961e85412882b8ce31371a6 upstream. It turned out that the recent Intel HD-audio controller chips show a significant stall during the system PM resume intermittently. It doesn't happen so often and usually it may read back successfully after one or more seconds, but in some rare worst cases the driver went into fallback mode. After trial-and-error, we found out that the communication stall seems covered by issuing the sync after each verb write, as already done for AMD and other chipsets. So this patch enables the write-sync flag for the recent Intel chips, Skylake and onward, as a workaround. Also, since Broxton and co have the very same driver flags as Skylake, refer to the Skylake driver flags instead of defining the same contents again for simplification. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201901 Reported-and-tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31ALSA: pcm: Fix refcount_inc() on zero usageTakashi Iwai
commit 0e279dcea0ec897af1c979ebee4ec92b461793f5 upstream. The recent rewrite of PCM link lock management introduced the refcount in snd_pcm_group object, managed by the kernel refcount_t API. This caused unexpected kernel warnings when the kernel is built with CONFIG_REFCOUNT_FULL=y. As the warning line indicates, the problem is obviously that we start with refcount=0 and do refcount_inc() for adding each PCM link, while refcount_t API doesn't like refcount_inc() performed on zero. For adapting the proper refcount_t usage, this patch changes the logic slightly: - The initial refcount is 1, assuming the single list entry - The refcount is incremented / decremented at each PCM link addition and deletion - ... which allows us concentrating only on the refcount as a release condition Fixes: f57f3df03a8e ("ALSA: pcm: More fine-grained PCM link locking") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=204221 Reported-and-tested-by: Duncan Overbruck <kernel@duncano.de> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1Kai-Heng Feng
commit 70256b42caaf3e13c2932c2be7903a73fbe8bb8b upstream. Commit 7b9584fa1c0b ("staging: line6: Move altsetting to properties") set a wrong altsetting for LINE6_PODHD500_1 during refactoring. Set the correct altsetting number to fix the issue. BugLink: https://bugs.launchpad.net/bugs/1790595 Fixes: 7b9584fa1c0b ("staging: line6: Move altsetting to properties") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31ALSA: ac97: Fix double free of ac97_codec_deviceDing Xiang
commit 607975b30db41aad6edc846ed567191aa6b7d893 upstream. put_device will call ac97_codec_release to free ac97_codec_device and other resources, so remove the kfree and other redundant code. Fixes: 74426fbff66e ("ALSA: ac97: add an ac97 bus") Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31drm/panel: Add support for Armadeus ST0700 AdaptSébastien Szymanski
commit c479450f61c7f1f248c9a54aedacd2a6ca521ff8 upstream. This patch adds support for the Armadeus ST0700 Adapt. It comes with a Santek ST0700I5Y-RBSLW 7.0" WVGA (800x480) TFT and an adapter board so that it can be connected on the TFT header of Armadeus Dev boards. Cc: stable@vger.kernel.org # v4.19 Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190507152713.27494-1-sebastien.szymanski@armadeus.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31hpet: Fix division by zero in hpet_time_div()Kefeng Wang
commit 0c7d37f4d9b8446956e97b7c5e61173cdb7c8522 upstream. The base value in do_div() called by hpet_time_div() is truncated from unsigned long to uint32_t, resulting in a divide-by-zero exception. UBSAN: Undefined behaviour in ../drivers/char/hpet.c:572:2 division by zero CPU: 1 PID: 23682 Comm: syz-executor.3 Not tainted 4.4.184.x86_64+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 0000000000000000 b573382df1853d00 ffff8800a3287b98 ffffffff81ad7561 ffff8800a3287c00 ffffffff838b35b0 ffffffff838b3860 ffff8800a3287c20 0000000000000000 ffff8800a3287bb0 ffffffff81b8f25e ffffffff838b35a0 Call Trace: [<ffffffff81ad7561>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffffff81ad7561>] dump_stack+0xc1/0x120 lib/dump_stack.c:51 [<ffffffff81b8f25e>] ubsan_epilogue+0x12/0x8d lib/ubsan.c:166 [<ffffffff81b900cb>] __ubsan_handle_divrem_overflow+0x282/0x2c8 lib/ubsan.c:262 [<ffffffff823560dd>] hpet_time_div drivers/char/hpet.c:572 [inline] [<ffffffff823560dd>] hpet_ioctl_common drivers/char/hpet.c:663 [inline] [<ffffffff823560dd>] hpet_ioctl_common.cold+0xa8/0xad drivers/char/hpet.c:577 [<ffffffff81e63d56>] hpet_ioctl+0xc6/0x180 drivers/char/hpet.c:676 [<ffffffff81711590>] vfs_ioctl fs/ioctl.c:43 [inline] [<ffffffff81711590>] file_ioctl fs/ioctl.c:470 [inline] [<ffffffff81711590>] do_vfs_ioctl+0x6e0/0xf70 fs/ioctl.c:605 [<ffffffff81711eb4>] SYSC_ioctl fs/ioctl.c:622 [inline] [<ffffffff81711eb4>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:613 [<ffffffff82846003>] tracesys_phase2+0x90/0x95 The main C reproducer autogenerated by syzkaller, syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0); memcpy((void*)0x20000100, "/dev/hpet\000", 10); syscall(__NR_openat, 0xffffffffffffff9c, 0x20000100, 0, 0); syscall(__NR_ioctl, r[0], 0x40086806, 0x40000000000000); Fix it by using div64_ul(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Zhang HongJun <zhanghongjun2@huawei.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20190711132757.130092-1-wangkefeng.wang@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31eeprom: make older eeprom drivers select NVMEM_SYSFSArseny Solokha
commit 1b5621832f9bd9899370ea6928462cd02ebe7dc0 upstream. misc/eeprom/{at24,at25,eeprom_93xx46} drivers all register their corresponding devices in the nvmem framework in compat mode which requires nvmem sysfs interface to be present. The latter, however, has been split out from nvmem under a separate Kconfig in commit ae0c2d725512 ("nvmem: core: add NVMEM_SYSFS Kconfig"). As a result, probing certain I2C-attached EEPROMs now fails with at24: probe of 0-0050 failed with error -38 because of a stub implementation of nvmem_sysfs_setup_compat() in drivers/nvmem/nvmem.h. Update the nvmem dependency for these drivers so they could load again: at24 0-0050: 32768 byte 24c256 EEPROM, writable, 64 bytes/write Cc: Adrian Bunk <bunk@kernel.org> Cc: Bartosz Golaszewski <brgl@bgdev.pl> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru> Link: https://lore.kernel.org/r/20190716111236.27803-1-asolokha@kb.kras.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31mei: me: add mule creek canyon (EHL) device idsAlexander Usyskin
commit 1be8624a0cbef720e8da39a15971e01abffc865b upstream. Add Mule Creek Canyon (PCH) MEI device ids for Elkhart Lake (EHL) Platform. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20190712095814.20746-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>