summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2020-09-26net: add __must_check to skb_put_padto()Eric Dumazet
[ Upstream commit 4a009cb04aeca0de60b73f37b102573354214b52 ] skb_put_padto() and __skb_put_padto() callers must check return values or risk use-after-free. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAXJan Kara
commit 88b67edd7247466bc47f01e1dc539b0d0d4b931e upstream. dax_supported() is defined whenever CONFIG_DAX is enabled. So dummy implementation should be defined only in !CONFIG_DAX case, not in !CONFIG_FS_DAX case. Fixes: e2ec51282545 ("dm: Call proper helper to determine dax support") Cc: <stable@vger.kernel.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23dm: Call proper helper to determine dax supportJan Kara
commit e2ec5128254518cae320d5dc631b71b94160f663 upstream. DM was calling generic_fsdax_supported() to determine whether a device referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that they don't have to duplicate common generic checks. High level code should call dax_supported() helper which that calls into appropriate helper for the particular device. This problem manifested itself as kernel messages: dm-3: error: dax access failed (-95) when lvm2-testsuite run in cases where a DM device was stacked on top of another DM device. Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices") Cc: <stable@vger.kernel.org> Tested-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Mike Snitzer <snitzer@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/160061715195.13131.5503173247632041975.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23arm64: paravirt: Initialize steal time when cpu is onlineAndrew Jones
commit 75df529bec9110dad43ab30e2d9490242529e8b8 upstream. Steal time initialization requires mapping a memory region which invokes a memory allocation. Doing this at CPU starting time results in the following trace when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at mm/slab.h:498 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #1 Call trace: dump_backtrace+0x0/0x208 show_stack+0x1c/0x28 dump_stack+0xc4/0x11c ___might_sleep+0xf8/0x130 __might_sleep+0x58/0x90 slab_pre_alloc_hook.constprop.101+0xd0/0x118 kmem_cache_alloc_node_trace+0x84/0x270 __get_vm_area_node+0x88/0x210 get_vm_area_caller+0x38/0x40 __ioremap_caller+0x70/0xf8 ioremap_cache+0x78/0xb0 memremap+0x9c/0x1a8 init_stolen_time_cpu+0x54/0xf0 cpuhp_invoke_callback+0xa8/0x720 notify_cpu_starting+0xc8/0xd8 secondary_start_kernel+0x114/0x180 CPU1: Booted secondary processor 0x0000000001 [0x431f0a11] However we don't need to initialize steal time at CPU starting time. We can simply wait until CPU online time, just sacrificing a bit of accuracy by returning zero for steal time until we know better. While at it, add __init to the functions that are only called by pv_time_init() which is __init. Signed-off-by: Andrew Jones <drjones@redhat.com> Fixes: e0685fa228fd ("arm64: Retrieve stolen time as paravirtualized guest") Cc: stable@vger.kernel.org Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200916154530.40809-1-drjones@redhat.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23serial: core: fix console port-lock regressionJohan Hovold
commit e0830dbf71f191851ed3772d2760f007b7c5bc3a upstream. Fix the port-lock initialisation regression introduced by commit a3cb39d258ef ("serial: core: Allow detach and attach serial device for console") by making sure that the lock is again initialised during console setup. The console may be registered before the serial controller has been probed in which case the port lock needs to be initialised during console setup by a call to uart_set_options(). The console-detach changes introduced a regression in several drivers by effectively removing that initialisation by not initialising the lock when the port is used as a console (which is always the case during console setup). Add back the early lock initialisation and instead use a new console-reinit flag to handle the case where a console is being re-attached through sysfs. The question whether the console-detach interface should have been added in the first place is left for another discussion. Note that the console-enabled check in uart_set_options() is not redundant because of kgdboc, which can end up reinitialising an already enabled console (see commit 42b6a1baa3ec ("serial_core: Don't re-initialize a previously initialized spinlock.")). Fixes: a3cb39d258ef ("serial: core: Allow detach and attach serial device for console") Cc: stable <stable@vger.kernel.org> # 5.7 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200909143101.15389-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_countHou Tao
[ Upstream commit e6b1a44eccfcab5e5e280be376f65478c3b2c7a2 ] The __this_cpu*() accessors are (in general) IRQ-unsafe which, given that percpu-rwsem is a blocking primitive, should be just fine. However, file_end_write() is used from IRQ context and will cause load-store issues on architectures where the per-cpu accessors are not natively irq-safe. Fix it by using the IRQ-safe this_cpu_*() for operations on read_count. This will generate more expensive code on a number of platforms, which might cause a performance regression for some of the other percpu-rwsem users. If any such is reported, we can consider alternative solutions. Fixes: 70fe2f48152e ("aio: fix freeze protection of aio writes") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20200915140750.137881-1-houtao1@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-23i2c: algo: pca: Reapply i2c bus settings after resetEvan Nimmo
[ Upstream commit 0a355aeb24081e4538d4d424cd189f16c0bbd983 ] If something goes wrong (such as the SCL being stuck low) then we need to reset the PCA chip. The issue with this is that on reset we lose all config settings and the chip ends up in a disabled state which results in a lock up/high CPU usage. We need to re-apply any configuration that had previously been set and re-enable the chip. Signed-off-by: Evan Nimmo <evan.nimmo@alliedtelesis.co.nz> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-17test_firmware: Test platform fw loading on non-EFI systemsKees Cook
commit baaabecfc80fad255f866563b53b8c7a3eec176e upstream. On non-EFI systems, it wasn't possible to test the platform firmware loader because it will have never set "checked_fw" during __init. Instead, allow the test code to override this check. Additionally split the declarations into a private symbol namespace so there is greater enforcement of the symbol visibility. Fixes: 548193cba2a7 ("test_firmware: add support for firmware_request_platform") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200909225354.3118328-1-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-17netfilter: conntrack: allow sctp hearbeat after connection re-useFlorian Westphal
[ Upstream commit cc5453a5b7e90c39f713091a7ebc53c1f87d1700 ] If an sctp connection gets re-used, heartbeats are flagged as invalid because their vtag doesn't match. Handle this in a similar way as TCP conntrack when it suspects that the endpoints and conntrack are out-of-sync. When a HEARTBEAT request fails its vtag validation, flag this in the conntrack state and accept the packet. When a HEARTBEAT_ACK is received with an invalid vtag in the reverse direction after we allowed such a HEARTBEAT through, assume we are out-of-sync and re-set the vtag info. v2: remove left-over snippet from an older incarnation that moved new_state/old_state assignments, thats not needed so keep that as-is. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to SandisksTejun Heo
commit 3b5455636fe26ea21b4189d135a424a6da016418 upstream. All three generations of Sandisk SSDs lock up hard intermittently. Experiments showed that disabling NCQ lowered the failure rate significantly and the kernel has been disabling NCQ for some models of SD7's and 8's, which is obviously undesirable. Karthik worked with Sandisk to root cause the hard lockups to trim commands larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which limits max trim size to 128M and applies it to all three generations of Sandisk SSDs. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Karthik Shivaram <karthikgs@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-09block: allow for_each_bvec to support zero len bvecMing Lei
commit 7e24969022cbd61ddc586f14824fc205661bb124 upstream. Block layer usually doesn't support or allow zero-length bvec. Since commit 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()"), iterate_bvec() switches to bvec iterator. However, Al mentioned that 'Zero-length segments are not disallowed' in iov_iter. Fixes for_each_bvec() so that it can move on after seeing one zero length bvec. Fixes: 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()") Reported-by: syzbot <syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2262077.html Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-09include/linux/log2.h: add missing () around n in roundup_pow_of_two()Jason Gunthorpe
[ Upstream commit 428fc0aff4e59399ec719ffcc1f7a5d29a4ee476 ] Otherwise gcc generates warnings if the expression is complicated. Fixes: 312a0c170945 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09netfilter: nfnetlink: nfnetlink_unicast() reports EAGAIN instead of ENOBUFSPablo Neira Ayuso
[ Upstream commit ee921183557af39c1a0475f982d43b0fcac25e2e ] Frontend callback reports EAGAIN to nfnetlink to retry a command, this is used to signal that module autoloading is required. Unfortunately, nlmsg_unicast() reports EAGAIN in case the receiver socket buffer gets full, so it enters a busy-loop. This patch updates nfnetlink_unicast() to turn EAGAIN into ENOBUFS and to use nlmsg_unicast(). Remove the flags field in nfnetlink_unicast() since this is always MSG_DONTWAIT in the existing code which is exactly what nlmsg_unicast() passes to netlink_unicast() as parameter. Fixes: 96518518cc41 ("netfilter: add nftables") Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-05HID: core: Sanitize event code and type when mapping inputMarc Zyngier
commit 35556bed836f8dc07ac55f69c8d17dce3e7f0e25 upstream. When calling into hid_map_usage(), the passed event code is blindly stored as is, even if it doesn't fit in the associated bitmap. This event code can come from a variety of sources, including devices masquerading as input devices, only a bit more "programmable". Instead of taking the event code at face value, check that it actually fits the corresponding bitmap, and if it doesn't: - spit out a warning so that we know which device is acting up - NULLify the bitmap pointer so that we catch unexpected uses Code paths that can make use of untrusted inputs can now check that the mapping was indeed correct and bail out if not. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-03fbmem: pull fbcon_update_vcs() out of fb_set_var()Tetsuo Handa
[ Upstream commit d88ca7e1a27eb2df056bbf37ddef62e1c73d37ea ] syzbot is reporting OOB read bug in vc_do_resize() [1] caused by memcpy() based on outdated old_{rows,row_size} values, for resize_screen() can recurse into vc_do_resize() which changes vc->vc_{cols,rows} that outdates old_{rows,row_size} values which were saved before calling resize_screen(). Daniel Vetter explained that resize_screen() should not recurse into fbcon_update_vcs() path due to FBINFO_MISC_USEREVENT being still set when calling resize_screen(). Instead of masking FBINFO_MISC_USEREVENT before calling fbcon_update_vcs(), we can remove FBINFO_MISC_USEREVENT by calling fbcon_update_vcs() only if fb_set_var() returned 0. This change assumes that it is harmless to call fbcon_update_vcs() when fb_set_var() returned 0 without reaching fb_notifier_call_chain(). [1] https://syzkaller.appspot.com/bug?id=c70c88cfd16dcf6e1d3c7f0ab8648b3144b5b25e Reported-and-tested-by: syzbot <syzbot+c37a14770d51a085a520@syzkaller.appspotmail.com> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: kernel test robot <lkp@intel.com> for missing #include Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/075b7e37-3278-cd7d-31ab-c5073cfa8e92@i-love.sakura.ne.jp Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03writeback: Avoid skipping inode writebackJan Kara
commit 5afced3bf28100d81fb2fe7e98918632a08feaf5 upstream. Inode's i_io_list list head is used to attach inode to several different lists - wb->{b_dirty, b_dirty_time, b_io, b_more_io}. When flush worker prepares a list of inodes to writeback e.g. for sync(2), it moves inodes to b_io list. Thus it is critical for sync(2) data integrity guarantees that inode is not requeued to any other writeback list when inode is queued for processing by flush worker. That's the reason why writeback_single_inode() does not touch i_io_list (unless the inode is completely clean) and why __mark_inode_dirty() does not touch i_io_list if I_SYNC flag is set. However there are two flaws in the current logic: 1) When inode has only I_DIRTY_TIME set but it is already queued in b_io list due to sync(2), concurrent __mark_inode_dirty(inode, I_DIRTY_SYNC) can still move inode back to b_dirty list resulting in skipping writeback of inode time stamps during sync(2). 2) When inode is on b_dirty_time list and writeback_single_inode() races with __mark_inode_dirty() like: writeback_single_inode() __mark_inode_dirty(inode, I_DIRTY_PAGES) inode->i_state |= I_SYNC __writeback_single_inode() inode->i_state |= I_DIRTY_PAGES; if (inode->i_state & I_SYNC) bail if (!(inode->i_state & I_DIRTY_ALL)) - not true so nothing done We end up with I_DIRTY_PAGES inode on b_dirty_time list and thus standard background writeback will not writeback this inode leading to possible dirty throttling stalls etc. (thanks to Martijn Coenen for this analysis). Fix these problems by tracking whether inode is queued in b_io or b_more_io lists in a new I_SYNC_QUEUED flag. When this flag is set, we know flush worker has queued inode and we should not touch i_io_list. On the other hand we also know that once flush worker is done with the inode it will requeue the inode to appropriate dirty list. When I_SYNC_QUEUED is not set, __mark_inode_dirty() can (and must) move inode to appropriate dirty list. Reported-by: Martijn Coenen <maco@android.com> Reviewed-by: Martijn Coenen <maco@android.com> Tested-by: Martijn Coenen <maco@android.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-03dma-pool: fix coherent pool allocations for IOMMU mappingsChristoph Hellwig
[ Upstream commit 9420139f516d7fbc248ce17f35275cb005ed98ea ] When allocating coherent pool memory for an IOMMU mapping we don't care about the DMA mask. Move the guess for the initial GFP mask into the dma_direct_alloc_pages and pass dma_coherent_ok as a function pointer argument so that it doesn't get applied to the IOMMU case. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03netfilter: avoid ipv6 -> nf_defrag_ipv6 module dependencyFlorian Westphal
[ Upstream commit 2404b73c3f1a5f15726c6ecd226b56f6f992767f ] nf_ct_frag6_gather is part of nf_defrag_ipv6.ko, not ipv6 core. The current use of the netfilter ipv6 stub indirections causes a module dependency between ipv6 and nf_defrag_ipv6. This prevents nf_defrag_ipv6 module from being removed because ipv6 can't be unloaded. Remove the indirection and always use a direct call. This creates a depency from nf_conntrack_bridge to nf_defrag_ipv6 instead: modinfo nf_conntrack depends: nf_conntrack,nf_defrag_ipv6,bridge .. and nf_conntrack already depends on nf_defrag_ipv6 anyway. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03efi: provide empty efi_enter_virtual_mode implementationAndrey Konovalov
[ Upstream commit 2c547f9da0539ad1f7ef7f08c8c82036d61b011a ] When CONFIG_EFI is not enabled, we might get an undefined reference to efi_enter_virtual_mode() error, if this efi_enabled() call isn't inlined into start_kernel(). This happens in particular, if start_kernel() is annodated with __no_sanitize_address. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Elena Petrova <lenaptr@google.com> Cc: Marco Elver <elver@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Walter Wu <walter-zh.wu@mediatek.com> Link: http://lkml.kernel.org/r/6514652d3a32d3ed33d6eb5c91d0af63bf0d1a0c.1596544734.git.andreyknvl@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26arch/ia64: Restore arch-specific pgd_offset_k implementationJessica Clarke
[ Upstream commit bd05220c7be3356046861c317d9c287ca50445ba ] IA-64 is special and treats pgd_offset_k() differently to pgd_offset(), using different formulae to calculate the indices into the kernel and user PGDs. The index into the user PGDs takes into account the region number, but the index into the kernel (init_mm) PGD always assumes a predefined kernel region number. Commit 974b9b2c68f3 ("mm: consolidate pte_index() and pte_offset_*() definitions") made IA-64 use a generic pgd_offset_k() which incorrectly used pgd_index() for kernel page tables. As a result, the index into the kernel PGD was going out of bounds and the kernel hung during early boot. Allow overrides of pgd_offset_k() and override it on IA-64 with the old implementation that will correctly index the kernel PGD. Fixes: 974b9b2c68f3 ("mm: consolidate pte_index() and pte_offset_*() definitions") Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26watch_queue: Limit the number of watches a user can holdDavid Howells
[ Upstream commit 29e44f4535faa71a70827af3639b5e6762d8f02a ] Impose a limit on the number of watches that a user can hold so that they can't use this mechanism to fill up all the available memory. This is done by putting a counter in user_struct that's incremented when a watch is allocated and decreased when it is released. If the number exceeds the RLIMIT_NOFILE limit, the watch is rejected with EAGAIN. This can be tested by the following means: (1) Create a watch queue and attach it to fd 5 in the program given - in this case, bash: keyctl watch_session /tmp/nlog /tmp/gclog 5 bash (2) In the shell, set the maximum number of files to, say, 99: ulimit -n 99 (3) Add 200 keyrings: for ((i=0; i<200; i++)); do keyctl newring a$i @s || break; done (4) Try to watch all of the keyrings: for ((i=0; i<200; i++)); do echo $i; keyctl watch_add 5 %:a$i || break; done This should fail when the number of watches belonging to the user hits 99. (5) Remove all the keyrings and all of those watches should go away: for ((i=0; i<200; i++)); do keyctl unlink %:a$i; done (6) Kill off the watch queue by exiting the shell spawned by watch_session. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21iommu/vt-d: Enforce PASID devTLB field maskLiu Yi L
[ Upstream commit 5f77d6ca5ca74e4b4a5e2e010f7ff50c45dea326 ] Set proper masks to avoid invalid input spillover to reserved bits. Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200724014925.15523-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21libnvdimm: Validate command family indicesDan Williams
commit 92fe2aa859f52ce6aa595ca97fec110dc7100e63 upstream. The ND_CMD_CALL format allows for a general passthrough of passlisted commands targeting a given command set. However there is no validation of the family index relative to what the bus supports. - Update the NFIT bus implementation (the only one that supports ND_CMD_CALL passthrough) to also passlist the valid set of command family indices. - Update the generic __nd_ioctl() path to validate that field on behalf of all implementations. Fixes: 31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism") Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsemMike Kravetz
commit 34ae204f18519f0920bd50a644abd6fefc8dbfcf upstream. Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") requires callers of huge_pte_alloc to hold i_mmap_rwsem in at least read mode. This is because the explicit locking in huge_pmd_share (called by huge_pte_alloc) was removed. When restructuring the code, the call to huge_pte_alloc in the else block at the beginning of hugetlb_fault was missed. Unfortunately, that else clause is exercised when there is no page table entry. This will likely lead to a call to huge_pmd_share. If huge_pmd_share thinks pmd sharing is possible, it will traverse the mapping tree (i_mmap) without holding i_mmap_rwsem. If someone else is modifying the tree, bad things such as addressing exceptions or worse could happen. Simply remove the else clause. It should have been removed previously. The code following the else will call huge_pte_alloc with the appropriate locking. To prevent this type of issue in the future, add routines to assert that i_mmap_rwsem is held, and call these routines in huge pmd sharing routines. Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A.Shutemov" <kirill.shutemov@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/e670f327-5cf9-1959-96e4-6dc7cc30d3d5@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21PCI/ATS: Add pci_pri_supported() to check device or associated PFAshok Raj
commit 3f9a7a13fe4cb6e119e4e4745fbf975d30bfac9b upstream. For SR-IOV, the PF PRI is shared between the PF and any associated VFs, and the PRI Capability is allowed for PFs but not for VFs. Searching for the PRI Capability on a VF always fails, even if its associated PF supports PRI. Add pci_pri_supported() to check whether device or its associated PF supports PRI. [bhelgaas: commit log, avoid "!!"] Fixes: b16d0cb9e2fc ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS") Link: https://lore.kernel.org/r/1595543849-19692-1-git-send-email-ashok.raj@intel.com Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Joerg Roedel <jroedel@suse.de> Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21genirq/affinity: Make affinity setting if activated opt-inThomas Gleixner
commit f0c7baca180046824e07fc5f1326e83a8fd150c7 upstream. John reported that on a RK3288 system the perf per CPU interrupts are all affine to CPU0 and provided the analysis: "It looks like what happens is that because the interrupts are not per-CPU in the hardware, armpmu_request_irq() calls irq_force_affinity() while the interrupt is deactivated and then request_irq() with IRQF_PERCPU | IRQF_NOBALANCING. Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls irq_setup_affinity() which returns early because IRQF_PERCPU and IRQF_NOBALANCING are set, leaving the interrupt on its original CPU." This was broken by the recent commit which blocked interrupt affinity setting in hardware before activation of the interrupt. While this works in general, it does not work for this particular case. As contrary to the initial analysis not all interrupt chip drivers implement an activate callback, the safe cure is to make the deferred interrupt affinity setting at activation time opt-in. Implement the necessary core logic and make the two irqchip implementations for which this is required opt-in. In hindsight this would have been the right thing to do, but ... Fixes: baedb87d1b53 ("genirq/affinity: Handle affinity setting on inactive interrupts correctly") Reported-by: John Keeping <john@metanate.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-19bitfield.h: don't compile-time validate _val in FIELD_FITJakub Kicinski
commit 444da3f52407d74c9aa12187ac6b01f76ee47d62 upstream. When ur_load_imm_any() is inlined into jeq_imm(), it's possible for the compiler to deduce a case where _val can only have the value of -1 at compile time. Specifically, /* struct bpf_insn: _s32 imm */ u64 imm = insn->imm; /* sign extend */ if (imm >> 32) { /* non-zero only if insn->imm is negative */ /* inlined from ur_load_imm_any */ u32 __imm = imm >> 32; /* therefore, always 0xffffffff */ if (__builtin_constant_p(__imm) && __imm > 255) compiletime_assert_XXX() This can result in tripping a BUILD_BUG_ON() in __BF_FIELD_CHECK() that checks that a given value is representable in one byte (interpreted as unsigned). FIELD_FIT() should return true or false at runtime for whether a value can fit for not. Don't break the build over a value that's too large for the mask. We'd prefer to keep the inlining and compiler optimizations though we know this case will always return false. Cc: stable@vger.kernel.org Fixes: 1697599ee301a ("bitfield.h: add FIELD_FIT() helper") Link: https://lore.kernel.org/kernel-hardening/CAK7LNASvb0UDJ0U5wkYYRzTAdnEs64HjXpEUL7d=V0CXiAXcNw@mail.gmail.com/ Reported-by: Masahiro Yamada <masahiroy@kernel.org> Debugged-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-19tpm: Unify the mismatching TPM space buffer sizesJarkko Sakkinen
commit 6c4e79d99e6f42b79040f1a33cd4018f5425030b upstream. The size of the buffers for storing context's and sessions can vary from arch to arch as PAGE_SIZE can be anything between 4 kB and 256 kB (the maximum for PPC64). Define a fixed buffer size set to 16 kB. This should be enough for most use with three handles (that is how many we allow at the moment). Parametrize the buffer size while doing this, so that it is easier to revisit this later on if required. Cc: stable@vger.kernel.org Reported-by: Stefan Berger <stefanb@linux.ibm.com> Fixes: 745b361e989a ("tpm: infrastructure for TPM spaces") Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-19iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommuLu Baolu
commit b1012ca8dc4f9b1a1fe8e2cb1590dd6d43ea3849 upstream. The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before completing the translation enable command and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphic devices fail to do so after some kind of power state transition. As the result, the system might stuck in iommu_disable_translation(), waiting for the completion of TE transition. This provides a quirk list for those devices and skips TE disabling if the qurik hits. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571 Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Koba Ko <koba.ko@canonical.com> Tested-by: Jun Miao <jun.miao@windriver.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200723013437.2268-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-19gpio: don't use same lockdep class for all devm_gpiochip_add_data usersAhmad Fatoum
[ Upstream commit 5f402bb17533113c21d61c2d4bc4ef4a6fa1c9a5 ] Commit 959bc7b22bd2 ("gpio: Automatically add lockdep keys") documents in its commits message its intention to "create a unique class key for each driver". It does so by having gpiochip_add_data add in-place the definition of two static lockdep classes for LOCKDEP use. That way, every caller of the macro adds their gpiochip with unique lockdep classes. There are many indirect callers of gpiochip_add_data, however, via use of devm_gpiochip_add_data. devm_gpiochip_add_data has external linkage and all its users will share the same lockdep classes, which probably is not intended. Fix this by replicating the gpio_chip_add_data statics-in-macro for the devm_ version as well. Fixes: 959bc7b22bd2 ("gpio: Automatically add lockdep keys") Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20200731123835.8003-1-a.fatoum@pengutronix.de Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19gpio: regmap: fix type clashMichael Walle
[ Upstream commit a070bdbbb06d7787ec7844a4f1e059cf8b55205d ] GPIO_REGMAP_ADDR_ZERO() cast to unsigned long but the actual config parameters are unsigned int. We use unsigned int here because that is the type which is used by the underlying regmap. Fixes: ebe363197e52 ("gpio: add a reusable generic gpio_chip using regmap") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Walle <michael@walle.cc> Link: https://lore.kernel.org/r/20200725232337.27581-1-michael@walle.cc Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19tpm: Require that all digests are present in TCG_PCR_EVENT2 structuresTyler Hicks
[ Upstream commit 7f3d176f5f7e3f0477bf82df0f600fcddcdcc4e4 ] Require that the TCG_PCR_EVENT2.digests.count value strictly matches the value of TCG_EfiSpecIdEvent.numberOfAlgorithms in the event field of the TCG_PCClientPCREvent event log header. Also require that TCG_EfiSpecIdEvent.numberOfAlgorithms is non-zero. The TCG PC Client Platform Firmware Profile Specification section 9.1 (Family "2.0", Level 00 Revision 1.04) states: For each Hash algorithm enumerated in the TCG_PCClientPCREvent entry, there SHALL be a corresponding digest in all TCG_PCR_EVENT2 structures. Note: This includes EV_NO_ACTION events which do not extend the PCR. Section 9.4.5.1 provides this description of TCG_EfiSpecIdEvent.numberOfAlgorithms: The number of Hash algorithms in the digestSizes field. This field MUST be set to a value of 0x01 or greater. Enforce these restrictions, as required by the above specification, in order to better identify and ignore invalid sequences of bytes at the end of an otherwise valid TPM2 event log. Firmware doesn't always have the means necessary to inform the kernel of the actual event log size so the kernel's event log parsing code should be stringent when parsing the event log for resiliency against firmware bugs. This is true, for example, when firmware passes the event log to the kernel via a reserved memory region described in device tree. POWER and some ARM systems use the "linux,sml-base" and "linux,sml-size" device tree properties to describe the memory region used to pass the event log from firmware to the kernel. Unfortunately, the "linux,sml-size" property describes the size of the entire reserved memory region rather than the size of the event long within the memory region and the event log format does not include information describing the size of the event log. tpm_read_log_of(), in drivers/char/tpm/eventlog/of.c, is where the "linux,sml-size" property is used. At the end of that function, log->bios_event_log_end is pointing at the end of the reserved memory region. That's typically 0x10000 bytes offset from "linux,sml-base", depending on what's defined in the device tree source. The firmware event log only fills a portion of those 0x10000 bytes and the rest of the memory region should be zeroed out by firmware. Even in the case of a properly zeroed bytes in the remainder of the memory region, the only thing allowing the kernel's event log parser to detect the end of the event log is the following conditional in __calc_tpm2_event_size(): if (event_type == 0 && event_field->event_size == 0) size = 0; If that wasn't there, __calc_tpm2_event_size() would think that a 16 byte sequence of zeroes, following an otherwise valid event log, was a valid event. However, problems can occur if a single bit is set in the offset corresponding to either the TCG_PCR_EVENT2.eventType or TCG_PCR_EVENT2.eventSize fields, after the last valid event log entry. This could confuse the parser into thinking that an additional entry is present in the event log and exposing this invalid entry to userspace in the /sys/kernel/security/tpm0/binary_bios_measurements file. Such problems have been seen if firmware does not fully zero the memory region upon a warm reboot. This patch significantly raises the bar on how difficult it is for stale/invalid memory to confuse the kernel's event log parser but there's still, ultimately, a reliance on firmware to properly initialize the remainder of the memory region reserved for the event log as the parser cannot be expected to detect a stale but otherwise properly formatted firmware event log entry. Fixes: fd5c78694f3f ("tpm: fix handling of the TPM 2.0 event logs") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19tracepoint: Mark __tracepoint_string's __usedNick Desaulniers
commit f3751ad0116fb6881f2c3c957d66a9327f69cefb upstream. __tracepoint_string's have their string data stored in .rodata, and an address to that data stored in the "__tracepoint_str" section. Functions that refer to those strings refer to the symbol of the address. Compiler optimization can replace those address references with references directly to the string data. If the address doesn't appear to have other uses, then it appears dead to the compiler and is removed. This can break the /tracing/printk_formats sysfs node which iterates the addresses stored in the "__tracepoint_str" section. Like other strings stored in custom sections in this header, mark these __used to inform the compiler that there are other non-obvious users of the address, so they should still be emitted. Link: https://lkml.kernel.org/r/20200730224555.2142154-2-ndesaulniers@google.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: stable@vger.kernel.org Fixes: 102c9323c35a8 ("tracing: Add __tracepoint_string() to export string pointers") Reported-by: Tim Murray <timmurray@google.com> Reported-by: Simon MacMullen <simonmacm@google.com> Suggested-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-11random32: move the pseudo-random 32-bit definitions to prandom.hLinus Torvalds
commit c0842fbc1b18c7a044e6ff3e8fa78bfa822c7d1a upstream. The addition of percpu.h to the list of includes in random.h revealed some circular dependencies on arm64 and possibly other platforms. This include was added solely for the pseudo-random definitions, which have nothing to do with the rest of the definitions in this file but are still there for legacy reasons. This patch moves the pseudo-random parts to linux/prandom.h and the percpu.h include with it, which is now guarded by _LINUX_PRANDOM_H and protected against recursive inclusion. A further cleanup step would be to remove this from <linux/random.h> entirely, and make people who use the prandom infrastructure include just the new header file. That's a bit of a churn patch, but grepping for "prandom_" and "next_pseudo_random32" "struct rnd_state" should catch most users. But it turns out that that nice cleanup step is fairly painful, because a _lot_ of code currently seems to depend on the implicit include of <linux/random.h>, which can currently come in a lot of ways, including such fairly core headfers as <linux/net.h>. So the "nice cleanup" part may or may never happen. Fixes: 1c9df907da83 ("random: fix circular include dependency on arm64 after addition of percpu.h") Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-11xattr: break delegations in {set,remove}xattrFrank van der Linden
commit 08b5d5014a27e717826999ad20e394a8811aae92 upstream. set/removexattr on an exported filesystem should break NFS delegations. This is true in general, but also for the upcoming support for RFC 8726 (NFSv4 extended attribute support). Make sure that they do. Additionally, they need to grow a _locked variant, since callers might call this with i_rwsem held (like the NFS server code). Cc: stable@vger.kernel.org # v4.9+ Cc: linux-fsdevel@vger.kernel.org Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Encap offset calculation is incorrect in esp6, from Sabrina Dubroca. 2) Better parameter validation in pfkey_dump(), from Mark Salyzyn. 3) Fix several clang issues on powerpc in selftests, from Tanner Love. 4) cmsghdr_from_user_compat_to_kern() uses the wrong length, from Al Viro. 5) Out of bounds access in mlx5e driver, from Raed Salem. 6) Fix transfer buffer memleak in lan78xx, from Johan Havold. 7) RCU fixups in rhashtable, from Herbert Xu. 8) Fix ipv6 nexthop refcnt leak, from Xiyu Yang. 9) vxlan FDB dump must be done under RCU, from Ido Schimmel. 10) Fix use after free in mlxsw, from Ido Schimmel. 11) Fix map leak in HASH_OF_MAPS bpf code, from Andrii Nakryiko. 12) Fix bug in mac80211 Tx ack status reporting, from Vasanthakumar Thiagarajan. 13) Fix memory leaks in IPV6_ADDRFORM code, from Cong Wang. 14) Fix bpf program reference count leaks in mlx5 during mlx5e_alloc_rq(), from Xin Xiong. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits) vxlan: fix memleak of fdb rds: Prevent kernel-infoleak in rds_notify_queue_get() net/sched: The error lable position is corrected in ct_init_module net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq net/mlx5e: E-Switch, Specify flow_source for rule with no in_port net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring net/mlx5e: CT: Support restore ipv6 tunnel net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe() ionic: unlock queue mutex in error path atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent net: ethernet: mtk_eth_soc: fix MTU warnings net: nixge: fix potential memory leak in nixge_probe() devlink: ignore -EOPNOTSUPP errors on dumpit rxrpc: Fix race between recvmsg and sendmsg on immediate call failure MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer selftests/bpf: fix netdevsim trap_flow_action_cookie read ipv6: fix memory leaks on IPV6_ADDRFORM path net/bpfilter: Initialize pos in __bpfilter_process_sockopt igb: reinit_locked() should be called with rtnl_lock e1000e: continue to init PHY even when failed to disable ULP ...
2020-07-31Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some I2C core improvements to prevent NULL pointer usage and a MAINTAINERS update" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: slave: add sanity check when unregistering i2c: slave: improve sanity check when registering MAINTAINERS: Update GENI I2C maintainers list i2c: also convert placeholder function to return errno
2020-07-30random: fix circular include dependency on arm64 after addition of percpu.hWilly Tarreau
Daniel Díaz and Kees Cook independently reported that commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity") broke arm64 due to a circular dependency on include files since the addition of percpu.h in random.h. The correct fix would definitely be to move all the prandom32 stuff out of random.h but for backporting, a smaller solution is preferred. This one replaces linux/percpu.h with asm/percpu.h, and this fixes the problem on x86_64, arm64, arm, and mips. Note that moving percpu.h around didn't change anything and that removing it entirely broke differently. When backporting, such options might still be considered if this patch fails to help. [ It turns out that an alternate fix seems to be to just remove the troublesome <asm/pointer_auth.h> remove from the arm64 <asm/smp.h> that causes the circular dependency. But we might as well do the whole belt-and-suspenders thing, and minimize inclusion in <linux/random.h> too. Either will fix the problem, and both are good changes. - Linus ] Reported-by: Daniel Díaz <daniel.diaz@linaro.org> Reported-by: Kees Cook <keescook@chromium.org> Tested-by: Marc Zyngier <maz@kernel.org> Fixes: f227e3ec3b5c Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-29random32: remove net_rand_state from the latent entropy gcc pluginLinus Torvalds
It turns out that the plugin right now ends up being really unhappy about the change from 'static' to 'extern' storage that happened in commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity"). This is probably a trivial fix for the latent_entropy plugin, but for now, just remove net_rand_state from the list of things the plugin worries about. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Emese Revfy <re.emese@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-29random32: update the net random state on interrupt and activityWilly Tarreau
This modifies the first 32 bits out of the 128 bits of a random CPU's net_rand_state on interrupt or CPU activity to complicate remote observations that could lead to guessing the network RNG's internal state. Note that depending on some network devices' interrupt rate moderation or binding, this re-seeding might happen on every packet or even almost never. In addition, with NOHZ some CPUs might not even get timer interrupts, leaving their local state rarely updated, while they are running networked processes making use of the random state. For this reason, we also perform this update in update_process_times() in order to at least update the state when there is user or system activity, since it's the only case we care about. Reported-by: Amit Klein <aksecurity@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-28rhashtable: Restore RCU marking on rhash_lock_headHerbert Xu
This patch restores the RCU marking on bucket_table->buckets as it really does need RCU protection. Its removal had led to a fatal bug. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28rhashtable: Fix unprotected RCU dereference in __rht_ptrHerbert Xu
The rcu_dereference call in rht_ptr_rcu is completely bogus because we've already dereferenced the value in __rht_ptr and operated on it. This causes potential double readings which could be fatal. The RCU dereference must occur prior to the comparison in __rht_ptr. This patch changes the order of RCU dereference so that it is done first and the result is then fed to __rht_ptr. The RCU marking changes have been minimised using casts which will be removed in a follow-up patch. Fixes: ba6306e3f648 ("rhashtable: Remove RCU marking from...") Reported-by: "Gong, Sishuai" <sishuai@purdue.edu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net/mlx5e: Modify uplink state on interface up/downRon Diskin
When setting the PF interface up/down, notify the firmware to update uplink state via MODIFY_VPORT_STATE, when E-Switch is enabled. This behavior will prevent sending traffic out on uplink port when PF is down, such as sending traffic from a VF interface which is still up. Currently when calling mlx5e_open/close(), the driver only sends PAOS command to notify the firmware to set the physical port state to up/down, however, it is not sufficient. When VF is in "auto" state, it follows the uplink state, which was not updated on mlx5e_open/close() before this patch. When switchdev mode is enabled and uplink representor is first enabled, set the uplink port state value back to its FW default "AUTO". Fixes: 63bfd399de55 ("net/mlx5e: Send PAOS command on interface up/down") Signed-off-by: Ron Diskin <rondi@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-27i2c: also convert placeholder function to return errnoWolfram Sang
All i2c_new_device-alike functions return ERR_PTR these days, but this fallback function was missed. Fixes: 2dea645ffc21 ("i2c: acpi: Return error pointers from i2c_acpi_new_device()") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [wsa: changed from 'ENOSYS' to 'ENODEV'] Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-25Merge tag 'efi-urgent-2020-07-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull EFI fixes from Ingo Molnar: "Various EFI fixes: - Fix the layering violation in the use of the EFI runtime services availability mask in users of the 'efivars' abstraction - Revert build fix for GCC v4.8 which is no longer supported - Clean up some x86 EFI stub details, some of which are borderline bugs that copy around garbage into padding fields - let's fix these out of caution. - Fix build issues while working on RISC-V support - Avoid --whole-archive when linking the stub on arm64" * tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Revert "efi/x86: Fix build with gcc 4" efi/efivars: Expose RT service availability via efivars abstraction efi/libstub: Move the function prototypes to header file efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds efi/libstub/arm64: link stub lib.a conditionally efi/x86: Only copy upto the end of setup_header efi/x86: Remove unused variables
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into masterLinus Torvalds
Pull networking fixes from David Miller: 1) Fix RCU locaking in iwlwifi, from Johannes Berg. 2) mt76 can access uninitialized NAPI struct, from Felix Fietkau. 3) Fix race in updating pause settings in bnxt_en, from Vasundhara Volam. 4) Propagate error return properly during unbind failures in ax88172a, from George Kennedy. 5) Fix memleak in adf7242_probe, from Liu Jian. 6) smc_drv_probe() can leak, from Wang Hai. 7) Don't muck with the carrier state if register_netdevice() fails in the bonding driver, from Taehee Yoo. 8) Fix memleak in dpaa_eth_probe, from Liu Jian. 9) Need to check skb_put_padto() return value in hsr_fill_tag(), from Murali Karicheri. 10) Don't lose ionic RSS hash settings across FW update, from Shannon Nelson. 11) Fix clobbered SKB control block in act_ct, from Wen Xu. 12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang. 13) IS_UDPLITE cleanup a long time ago, incorrectly handled transformations involving UDPLITE_RECV_CC. From Miaohe Lin. 14) Unbalanced locking in netdevsim, from Taehee Yoo. 15) Suppress false-positive error messages in qed driver, from Alexander Lobakin. 16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye. 17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost. 18) Uninitialized value in geneve_changelink(), from Cong Wang. 19) Fix deadlock in xen-netfront, from Andera Righi. 19) flush_backlog() frees skbs with IRQs disabled, so should use dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov Kasiviswanathan. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits) drivers/net/wan: lapb: Corrected the usage of skb_cow dev: Defer free of skbs in flush_backlog qrtr: orphan socket in qrtr_release() xen-netfront: fix potential deadlock in xennet_remove() flow_offload: Move rhashtable inclusion to the source file geneve: fix an uninitialized value in geneve_changelink() bonding: check return value of register_netdevice() in bond_newlink() tcp: allow at most one TLP probe per flight AX.25: Prevent integer overflows in connect and sendmsg cxgb4: add missing release on skb in uld_send() net: atlantic: fix PTP on AQC10X AX.25: Prevent out-of-bounds read in ax25_sendmsg() sctp: shrink stream outq when fails to do addstream reconf sctp: shrink stream outq only when new outcnt < old outcnt AX.25: Fix out-of-bounds read in ax25_connect() enetc: Remove the mdio bus on PF probe bailout net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload net: ethernet: ave: Fix error returns in ave_init drivers/net/wan/x25_asy: Fix to make it work ipvs: fix the connection sync failed in some cases ...
2020-07-24Merge branch 'akpm' into master (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "Subsystems affected by this patch series: mm/pagemap, mm/shmem, mm/hotfixes, mm/memcg, mm/hugetlb, mailmap, squashfs, scripts, io-mapping, MAINTAINERS, and gdb" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: scripts/gdb: fix lx-symbols 'gdb.error' while loading modules MAINTAINERS: add KCOV section io-mapping: indicate mapping failure scripts/decode_stacktrace: strip basepath from all paths squashfs: fix length field overlap check in metadata reading mailmap: add entry for Mike Rapoport khugepaged: fix null-pointer dereference due to race mm/hugetlb: avoid hardcoding while checking if cma is enabled mm: memcg/slab: fix memory leak at non-root kmem_cache destroy mm/memcg: fix refcount error while moving and swapping mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages() mm: initialize return of vm_insert_pages vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
2020-07-24Merge tag 'for-5.8/dm-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm into master Pull device mapper fix from Mike Snitzer: "A stable fix for DM integrity target's integrity recalculation that gets skipped when resuming a device. This is a fix for a previous stable@ fix" * tag 'for-5.8/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm integrity: fix integrity recalculation that is improperly skipped
2020-07-24io-mapping: indicate mapping failureMichael J. Ruhl
The !ATOMIC_IOMAP version of io_maping_init_wc will always return success, even when the ioremap fails. Since the ATOMIC_IOMAP version returns NULL when the init fails, and callers check for a NULL return on error this is unexpected. During a device probe, where the ioremap failed, a crash can look like this: BUG: unable to handle page fault for address: 0000000000210000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Oops: 0002 [#1] PREEMPT SMP CPU: 0 PID: 177 Comm: RIP: 0010:fill_page_dma [i915] gen8_ppgtt_create [i915] i915_ppgtt_create [i915] intel_gt_init [i915] i915_gem_init [i915] i915_driver_probe [i915] pci_device_probe really_probe driver_probe_device The remap failure occurred much earlier in the probe. If it had been propagated, the driver would have exited with an error. Return NULL on ioremap failure. [akpm@linux-foundation.org: detect ioremap_wc() errors earlier] Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping") Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-24vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right wayChengguang Xu
After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc"), simple xattr entry is allocated with kvmalloc() instead of kmalloc(), so we should release it with kvfree() instead of kfree(). Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc") Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Chris Down <chris@chrisdown.name> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [5.7] Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>