summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-06-03SUNRPC: More fixes for backlog congestionTrond Myklebust
[ Upstream commit e86be3a04bc4aeaf12f93af35f08f8d4385bcd98 ] Ensure that we fix the XPRT_CONGESTED starvation issue for RDMA as well as socket based transports. Ensure we always initialise the request after waking up from the backlog list. Fixes: e877a88d1f06 ("SUNRPC in case of backlog, hand free slots directly to waiting task") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03linux/bits.h: fix compilation error with GENMASKRikard Falkeborn
[ Upstream commit f747e6667ebb2ffb8133486c9cd19800d72b0d98 ] GENMASK() has an input check which uses __builtin_choose_expr() to enable a compile time sanity check of its inputs if they are known at compile time. However, it turns out that __builtin_constant_p() does not always return a compile time constant [0]. It was thought this problem was fixed with gcc 4.9 [1], but apparently this is not the case [2]. Switch to use __is_constexpr() instead which always returns a compile time constant, regardless of its inputs. Link: https://lore.kernel.org/lkml/42b4342b-aefc-a16a-0d43-9f9c0d63ba7a@rasmusvillemoes.dk [0] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19449 [1] Link: https://lore.kernel.org/lkml/1ac7bbc2-45d9-26ed-0b33-bf382b8d858b@I-love.SAKURA.ne.jp [2] Link: https://lkml.kernel.org/r/20210511203716.117010-1-rikard.falkeborn@gmail.com Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03{net, RDMA}/mlx5: Fix override of log_max_qp by other deviceMaor Gottlieb
commit 3410fbcd47dc6479af4309febf760ccaa5efb472 upstream. mlx5_core_dev holds pointer to static profile, hence when the log_max_qp of the profile is override by some device, then it effect all other mlx5 devices that share the same profile. Fix it by having a profile instance for every mlx5 device. Fixes: 883371c453b9 ("net/mlx5: Check FW limitations on log_max_qp before setting it") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03{net,vdpa}/mlx5: Configure interface MAC into mpfs L2 tableEli Cohen
commit 7c9f131f366ab414691907fa0407124ea2b2f3bc upstream. net/mlx5: Expose MPFS configuration API MPFS is the multi physical function switch that bridges traffic between the physical port and any physical functions associated with it. The driver is required to add or remove MAC entries to properly forward incoming traffic to the correct physical function. We export the API to control MPFS so that other drivers, such as mlx5_vdpa are able to add MAC addresses of their network interfaces. The MAC address of the vdpa interface must be configured into the MPFS L2 address. Failing to do so could cause, in some NIC configurations, failure to forward packets to the vdpa network device instance. Fix this by adding calls to update the MPFS table. CC: <mst@redhat.com> CC: <jasowang@redhat.com> CC: <virtualization@lists.linux-foundation.org> Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03drivers: base: Fix device link removalRafael J. Wysocki
commit 80dd33cf72d1ab4f0af303f1fa242c6d6c8d328f upstream. When device_link_free() drops references to the supplier and consumer devices of the device link going away and the reference being dropped turns out to be the last one for any of those device objects, its ->release callback will be invoked and it may sleep which goes against the SRCU callback execution requirements. To address this issue, make the device link removal code carry out the device_link_free() actions preceded by SRCU synchronization from a separate work item (the "long" workqueue is used for that, because it does not matter when the device link memory is released and it may take time to get to that point) instead of using SRCU callbacks. While at it, make the code work analogously when SRCU is not enabled to reduce the differences between the SRCU and non-SRCU cases. Fixes: 843e600b8a2b ("driver core: Fix sleeping in invalid context during device link deletion") Cc: stable <stable@vger.kernel.org> Reported-by: chenxiang (M) <chenxiang66@hisilicon.com> Tested-by: chenxiang (M) <chenxiang66@hisilicon.com> Reviewed-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/5722787.lOV4Wx5bFT@kreacher Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-28context_tracking: Move guest exit vtime accounting to separate helpersWanpeng Li
commit 88d8220bbf06dd8045b2ac4be1046290eaa7773a upstream. Provide separate vtime accounting functions for guest exit instead of open coding the logic within the context tracking code. This will allow KVM x86 to handle vtime accounting slightly differently when using tick-based accounting. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20210505002735.1684165-3-seanjc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-28context_tracking: Move guest exit context tracking to separate helpersWanpeng Li
commit 866a6dadbb027b2955a7ae00bab9705d382def12 upstream. Provide separate context tracking helpers for guest exit, the standalone helpers will be called separately by KVM x86 in later patches to fix tick-based accounting. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210505002735.1684165-2-seanjc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26vt: Fix character height handling with VT_RESIZEXMaciej W. Rozycki
commit 860dafa902595fb5f1d23bbcce1215188c3341e6 upstream. Restore the original intent of the VT_RESIZEX ioctl's `v_clin' parameter which is the number of pixel rows per character (cell) rather than the height of the font used. For framebuffer devices the two values are always the same, because the former is inferred from the latter one. For VGA used as a true text mode device these two parameters are independent from each other: the number of pixel rows per character is set in the CRT controller, while font height is in fact hardwired to 32 pixel rows and fonts of heights below that value are handled by padding their data with blanks when loaded to hardware for use by the character generator. One can change the setting in the CRT controller and it will update the screen contents accordingly regardless of the font loaded. The `v_clin' parameter is used by the `vgacon' driver to set the height of the character cell and then the cursor position within. Make the parameter explicit then, by defining a new `vc_cell_height' struct member of `vc_data', set it instead of `vc_font.height' from `v_clin' in the VT_RESIZEX ioctl, and then use it throughout the `vgacon' driver except where actual font data is accessed which as noted above is independent from the CRTC setting. This way the framebuffer console driver is free to ignore the `v_clin' parameter as irrelevant, as it always should have, avoiding any issues attempts to give the parameter a meaning there could have caused, such as one that has led to commit 988d0763361b ("vt_ioctl: make VT_RESIZEX behave like VT_RESIZE"): "syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than actual font height calculated by con_font_set() from ioctl(PIO_FONT). Since fbcon_set_font() from con_font_set() allocates minimal amount of memory based on actual font height calculated by con_font_set(), use of vt_resizex() can cause UAF/OOB read for font data." The problem first appeared around Linux 2.5.66 which predates our repo history, but the origin could be identified with the old MIPS/Linux repo also at: <git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux.git> as commit 9736a3546de7 ("Merge with Linux 2.5.66."), where VT_RESIZEX code in `vt_ioctl' was updated as follows: if (clin) - video_font_height = clin; + vc->vc_font.height = clin; making the parameter apply to framebuffer devices as well, perhaps due to the use of "font" in the name of the original `video_font_height' variable. Use "cell" in the new struct member then to avoid ambiguity. References: [1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb855245837 [2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48e3 Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org # v2.6.12+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940Tony Lindgren
commit 25de4ce5ed02994aea8bc111d133308f6fd62566 upstream. There is a timer wrap issue on dra7 for the ARM architected timer. In a typical clock configuration the timer fails to wrap after 388 days. To work around the issue, we need to use timer-ti-dm percpu timers instead. Let's configure dmtimer3 and 4 as percpu timers by default, and warn about the issue if the dtb is not configured properly. Let's do this as a single patch so it can be backported to v5.8 and later kernels easily. Note that this patch depends on earlier timer-ti-dm systimer posted mode fixes, and a preparatory clockevent patch "clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue". For more information, please see the errata for "AM572x Sitara Processors Silicon Revisions 1.1, 2.0": https://www.ti.com/lit/er/sprz429m/sprz429m.pdf The concept is based on earlier reference patches done by Tero Kristo and Keerthy. Cc: Keerthy <j-keerthy@ti.com> Cc: Tero Kristo <kristo@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210323074326.28302-3-tony@atomide.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19mm: fix struct page layout on 32-bit systemsMatthew Wilcox (Oracle)
commit 9ddb3c14afba8bc5950ed297f02d4ae05ff35cd1 upstream. 32-bit architectures which expect 8-byte alignment for 8-byte integers and need 64-bit DMA addresses (arm, mips, ppc) had their struct page inadvertently expanded in 2019. When the dma_addr_t was added, it forced the alignment of the union to 8 bytes, which inserted a 4 byte gap between 'flags' and the union. Fix this by storing the dma_addr_t in one or two adjacent unsigned longs. This restores the alignment to that of an unsigned long. We always store the low bits in the first word to prevent the PageTail bit from being inadvertently set on a big endian platform. If that happened, get_user_pages_fast() racing against a page which was freed and reallocated to the page_pool could dereference a bogus compound_head(), which would be hard to trace back to this cause. Link: https://lkml.kernel.org/r/20210510153211.1504886-1-willy@infradead.org Fixes: c25fff7171be ("mm: add dma_addr_t to struct page") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Matteo Croce <mcroce@linux.microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19kyber: fix out of bounds access when preemptedOmar Sandoval
[ Upstream commit efed9a3337e341bd0989161b97453b52567bc59d ] __blk_mq_sched_bio_merge() gets the ctx and hctx for the current CPU and passes the hctx to ->bio_merge(). kyber_bio_merge() then gets the ctx for the current CPU again and uses that to get the corresponding Kyber context in the passed hctx. However, the thread may be preempted between the two calls to blk_mq_get_ctx(), and the ctx returned the second time may no longer correspond to the passed hctx. This "works" accidentally most of the time, but it can cause us to read garbage if the second ctx came from an hctx with more ctx's than the first one (i.e., if ctx->index_hw[hctx->type] > hctx->nr_ctx). This manifested as this UBSAN array index out of bounds error reported by Jakub: UBSAN: array-index-out-of-bounds in ../kernel/locking/qspinlock.c:130:9 index 13106 is out of range for type 'long unsigned int [128]' Call Trace: dump_stack+0xa4/0xe5 ubsan_epilogue+0x5/0x40 __ubsan_handle_out_of_bounds.cold.13+0x2a/0x34 queued_spin_lock_slowpath+0x476/0x480 do_raw_spin_lock+0x1c2/0x1d0 kyber_bio_merge+0x112/0x180 blk_mq_submit_bio+0x1f5/0x1100 submit_bio_noacct+0x7b0/0x870 submit_bio+0xc2/0x3a0 btrfs_map_bio+0x4f0/0x9d0 btrfs_submit_data_bio+0x24e/0x310 submit_one_bio+0x7f/0xb0 submit_extent_page+0xc4/0x440 __extent_writepage_io+0x2b8/0x5e0 __extent_writepage+0x28d/0x6e0 extent_write_cache_pages+0x4d7/0x7a0 extent_writepages+0xa2/0x110 do_writepages+0x8f/0x180 __writeback_single_inode+0x99/0x7f0 writeback_sb_inodes+0x34e/0x790 __writeback_inodes_wb+0x9e/0x120 wb_writeback+0x4d2/0x660 wb_workfn+0x64d/0xa10 process_one_work+0x53a/0xa80 worker_thread+0x69/0x5b0 kthread+0x20b/0x240 ret_from_fork+0x1f/0x30 Only Kyber uses the hctx, so fix it by passing the request_queue to ->bio_merge() instead. BFQ and mq-deadline just use that, and Kyber can map the queues itself to avoid the mismatch. Fixes: a6088845c2bf ("block: kyber: make kyber more friendly with merging") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Link: https://lore.kernel.org/r/c7598605401a48d5cfeadebb678abd10af22b83f.1620691329.git.osandov@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19mm/hugetlb: fix F_SEAL_FUTURE_WRITEPeter Xu
commit 22247efd822e6d263f3c8bd327f3f769aea9b1d9 upstream. Patch series "mm/hugetlb: Fix issues on file sealing and fork", v2. Hugh reported issue with F_SEAL_FUTURE_WRITE not applied correctly to hugetlbfs, which I can easily verify using the memfd_test program, which seems that the program is hardly run with hugetlbfs pages (as by default shmem). Meanwhile I found another probably even more severe issue on that hugetlb fork won't wr-protect child cow pages, so child can potentially write to parent private pages. Patch 2 addresses that. After this series applied, "memfd_test hugetlbfs" should start to pass. This patch (of 2): F_SEAL_FUTURE_WRITE is missing for hugetlb starting from the first day. There is a test program for that and it fails constantly. $ ./memfd_test hugetlbfs memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE mmap() didn't fail as expected Aborted (core dumped) I think it's probably because no one is really running the hugetlbfs test. Fix it by checking FUTURE_WRITE also in hugetlbfs_file_mmap() as what we do in shmem_mmap(). Generalize a helper for that. Link: https://lkml.kernel.org/r/20210503234356.9097-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210503234356.9097-2-peterx@redhat.com Fixes: ab3948f58ff84 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Hugh Dickins <hughd@google.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19NFS: nfs4_bitmask_adjust() must not change the server global bitmasksTrond Myklebust
[ Upstream commit 332d1a0373be32a3a3c152756bca45ff4f4e11b5 ] As currently set, the calls to nfs4_bitmask_adjust() will end up overwriting the contents of the nfs_server cache_consistency_bitmask field. The intention here should be to modify a private copy of that mask in the close/delegreturn/write arguments. Fixes: 76bd5c016ef4 ("NFSv4: make cache consistency bitmask dynamic") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19net: phy: make PHY PM ops a no-op if MAC driver manages PHY PMHeiner Kallweit
[ Upstream commit fba863b816049b03f3fbb07b10ebdcfe5c4141f7 ] Resume callback of the PHY driver is called after the one for the MAC driver. The PHY driver resume callback calls phy_init_hw(), and this is potentially problematic if the MAC driver calls phy_start() in its resume callback. One issue was reported with the fec driver and a KSZ8081 PHY which seems to become unstable if a soft reset is triggered during aneg. The new flag allows MAC drivers to indicate that they take care of suspending/resuming the PHY. Then the MAC PM callbacks can handle any dependency between MAC and PHY PM. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19i2c: Add I2C_AQ_NO_REP_START adapter quirkBence Csókás
[ Upstream commit aca01415e076aa96cca0f801f4420ee5c10c660d ] This quirk signifies that the adapter cannot do a repeated START, it always issues a STOP condition after transfers. Suggested-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Bence Csókás <bence98@sch.bme.hu> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19PM: runtime: Fix unpaired parent child_count for force_resumeTony Lindgren
commit c745253e2a691a40c66790defe85c104a887e14a upstream. As pm_runtime_need_not_resume() relies also on usage_count, it can return a different value in pm_runtime_force_suspend() compared to when called in pm_runtime_force_resume(). Different return values can happen if anything calls PM runtime functions in between, and causes the parent child_count to increase on every resume. So far I've seen the issue only for omapdrm that does complicated things with PM runtime calls during system suspend for legacy reasons: omap_atomic_commit_tail() for omapdrm.0 dispc_runtime_get() wakes up 58000000.dss as it's the dispc parent dispc_runtime_resume() rpm_resume() increases parent child_count dispc_runtime_put() won't idle, PM runtime suspend blocked pm_runtime_force_suspend() for 58000000.dss, !pm_runtime_need_not_resume() __update_runtime_status() system suspended pm_runtime_force_resume() for 58000000.dss, pm_runtime_need_not_resume() pm_runtime_enable() only called because of pm_runtime_need_not_resume() omap_atomic_commit_tail() for omapdrm.0 dispc_runtime_get() wakes up 58000000.dss as it's the dispc parent dispc_runtime_resume() rpm_resume() increases parent child_count dispc_runtime_put() won't idle, PM runtime suspend blocked ... rpm_suspend for 58000000.dss but parent child_count is now unbalanced Let's fix the issue by adding a flag for needs_force_resume and use it in pm_runtime_force_resume() instead of pm_runtime_need_not_resume(). Additionally omapdrm system suspend could be simplified later on to avoid lots of unnecessary PM runtime calls and the complexity it adds. The driver can just use internal functions that are shared between the PM runtime and system suspend related functions. Fixes: 4918e1f87c5f ("PM / runtime: Rework pm_runtime_force_suspend/resume()") Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14smp: Fix smp_call_function_single_async prototypeArnd Bergmann
commit 1139aeb1c521eb4a050920ce6c64c36c4f2a3ab7 upstream. As of commit 966a967116e6 ("smp: Avoid using two cache lines for struct call_single_data"), the smp code prefers 32-byte aligned call_single_data objects for performance reasons, but the block layer includes an instance of this structure in the main 'struct request' that is more senstive to size than to performance here, see 4ccafe032005 ("block: unalign call_single_data in struct request"). The result is a violation of the calling conventions that clang correctly points out: block/blk-mq.c:630:39: warning: passing 8-byte aligned argument to 32-byte aligned parameter 2 of 'smp_call_function_single_async' may result in an unaligned pointer access [-Walign-mismatch] smp_call_function_single_async(cpu, &rq->csd); It does seem that the usage of the call_single_data without cache line alignment should still be allowed by the smp code, so just change the function prototype so it accepts both, but leave the default alignment unchanged for the other users. This seems better to me than adding a local hack to shut up an otherwise correct warning in the caller. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lkml.kernel.org/r/20210505211300.3174456-1-arnd@kernel.org [nc: Fix conflicts] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14iommu/vt-d: Invalidate PASID cache when root/context entry changedLu Baolu
[ Upstream commit c0474a606ecb9326227b4d68059942f9db88a897 ] When the Intel IOMMU is operating in the scalable mode, some information from the root and context table may be used to tag entries in the PASID cache. Software should invalidate the PASID-cache when changing root or context table entries. Suggested-by: Ashok Raj <ashok.raj@intel.com> Fixes: 7373a8cc38197 ("iommu/vt-d: Setup context and enable RID2PASID support") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14iommu: Fix a boundary issue to avoid performance dropXiang Chen
[ Upstream commit 3431c3f660a39f6ced954548a59dba6541ce3eb1 ] After the change of patch ("iommu: Switch gather->end to the inclusive end"), the performace drops from 1600+K IOPS to 1200K in our kunpeng ARM64 platform. We find that the range [start1, end1) actually is joint from the range [end1, end2), but it is considered as disjoint after the change, so it needs more times of TLB sync, and spends more time on it. So fix the boundary issue to avoid performance drop. Fixes: 862c3715de8f ("iommu: Switch gather->end to the inclusive end") Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/1616643504-120688-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14udp: never accept GSO_FRAGLIST packetsPaolo Abeni
[ Upstream commit 78352f73dc5047f3f744764cc45912498c52f3c9 ] Currently the UDP protocol delivers GSO_FRAGLIST packets to the sockets without the expected segmentation. This change addresses the issue introducing and maintaining a couple of new fields to explicitly accept SKB_GSO_UDP_L4 or GSO_FRAGLIST packets. Additionally updates udp_unexpected_gso() accordingly. UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist zeroed. v1 -> v2: - use 2 bits instead of a whole GSO bitmask (Willem) Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14gpio: guard gpiochip_irqchip_add_domain() with GPIOLIB_IRQCHIPÁlvaro Fernández Rojas
[ Upstream commit 9c7d24693d864f90b27aad5d15fbfe226c02898b ] The current code doesn't check if GPIOLIB_IRQCHIP is enabled, which results in a compilation error when trying to build gpio-regmap if CONFIG_GPIOLIB_IRQCHIP isn't enabled. Fixes: 6a45b0e2589f ("gpiolib: Introduce gpiochip_irqchip_add_domain()") Suggested-by: Michael Walle <michael@walle.cc> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Michael Walle <michael@walle.cc> Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20210324081923.20379-2-noltari@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14iommu/dma: Resurrect the "forcedac" optionRobin Murphy
[ Upstream commit 3542dcb15cef66c0b9e6c3b33168eb657e0d9520 ] In converting intel-iommu over to the common IOMMU DMA ops, it quietly lost the functionality of its "forcedac" option. Since this is a handy thing both for testing and for performance optimisation on certain platforms, reimplement it under the common IOMMU parameter namespace. For the sake of fixing the inadvertent breakage of the Intel-specific parameter, remove the dmar_forcedac remnants and hook it up as an alias while documenting the transition to the new common parameter. Fixes: c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/7eece8e0ea7bfbe2cd0e30789e0d46df573af9b0.1614961776.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14RDMA/mlx5: Fix query RoCE portMaor Gottlieb
[ Upstream commit 7852546f524595245382a919e752468f73421451 ] mlx5_is_roce_enabled returns the devlink RoCE init value, therefore it should be used only when driver is loaded. Instead we just need to read the roce_en field. In addition, rename mlx5_is_roce_enabled to mlx5_is_roce_init_enabled. Fixes: 7a58779edd75 ("IB/mlx5: Improve query port for representor port") Link: https://lore.kernel.org/r/20210304124517.1100608-2-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14HID: plantronics: Workaround for double volume key pressesMaxim Mikityanskiy
[ Upstream commit f567d6ef8606fb427636e824c867229ecb5aefab ] Plantronics Blackwire 3220 Series (047f:c056) sends HID reports twice for each volume key press. This patch adds a quirk to hid-plantronics for this product ID, which will ignore the second volume key press if it happens within 5 ms from the last one that was handled. The patch was tested on the mentioned model only, it shouldn't affect other models, however, this quirk might be needed for them too. Auto-repeat (when a key is held pressed) is not affected, because the rate is about 3 times per second, which is far less frequent than once in 5 ms. Fixes: 81bb773faed7 ("HID: plantronics: Update to map volume up/down controls") Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14usb: typec: tcpm: Honour pSnkStdby requirement during negotiationBadhri Jagan Sridharan
[ Upstream commit 123086843372bc93d26f52edfb71dbf951cd2f17 ] >From PD Spec: The Sink Shall transition to Sink Standby before a positive or negative voltage transition of VBUS. During Sink Standby the Sink Shall reduce its power draw to pSnkStdby. This allows the Source to manage the voltage transition as well as supply sufficient operating current to the Sink to maintain PD operation during the transition. The Sink Shall complete this transition to Sink Standby within tSnkStdby after evaluating the Accept Message from the Source. The transition when returning to Sink operation from Sink Standby Shall be completed within tSnkNewPower. The pSnkStdby requirement Shall only apply if the Sink power draw is higher than this level. The above requirement needs to be met to prevent hard resets from port partner. Without the patch: (5V/3A during SNK_DISCOVERY all the way through explicit contract) [ 95.711984] CC1: 0 -> 0, CC2: 0 -> 5 [state TOGGLING, polarity 0, connected] [ 95.712007] state change TOGGLING -> SNK_ATTACH_WAIT [rev3 NONE_AMS] [ 95.712017] pending state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED @ 170 ms [rev3 NONE_AMS] [ 95.837190] VBUS on [ 95.882075] state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED [delayed 170 ms] [ 95.882082] state change SNK_DEBOUNCED -> SNK_ATTACHED [rev3 NONE_AMS] [ 95.882086] polarity 1 [ 95.883151] set_auto_vbus_discharge_threshold mode:0 pps_active:n vbus:5000 ret:0 [ 95.883441] enable vbus discharge ret:0 [ 95.883445] Requesting mux state 1, usb-role 2, orientation 2 [ 95.883776] state change SNK_ATTACHED -> SNK_STARTUP [rev3 NONE_AMS] [ 95.883879] pending state change SNK_STARTUP -> SNK_DISCOVERY @ 500 ms [rev3 NONE_AMS] [ 96.038960] VBUS on [ 96.383939] state change SNK_STARTUP -> SNK_DISCOVERY [delayed 500 ms] [ 96.383946] Setting voltage/current limit 5000 mV 3000 mA [ 96.383961] vbus=0 charge:=1 [ 96.386044] state change SNK_DISCOVERY -> SNK_WAIT_CAPABILITIES [rev3 NONE_AMS] [ 96.386309] pending state change SNK_WAIT_CAPABILITIES -> HARD_RESET_SEND @ 450 ms [rev3 NONE_AMS] [ 96.394404] PD RX, header: 0x2161 [1] [ 96.394408] PDO 0: type 0, 5000 mV, 3000 mA [E] [ 96.394410] PDO 1: type 0, 9000 mV, 2000 mA [] [ 96.394412] state change SNK_WAIT_CAPABILITIES -> SNK_NEGOTIATE_CAPABILITIES [rev2 POWER_NEGOTIATION] [ 96.394416] Setting usb_comm capable false [ 96.395083] cc=0 cc1=0 cc2=5 vbus=0 vconn=sink polarity=1 [ 96.395089] Requesting PDO 1: 9000 mV, 2000 mA [ 96.395093] PD TX, header: 0x1042 [ 96.397404] PD TX complete, status: 0 [ 96.397424] pending state change SNK_NEGOTIATE_CAPABILITIES -> HARD_RESET_SEND @ 60 ms [rev2 POWER_NEGOTIATION] [ 96.400826] PD RX, header: 0x363 [1] [ 96.400829] state change SNK_NEGOTIATE_CAPABILITIES -> SNK_TRANSITION_SINK [rev2 POWER_NEGOTIATION] [ 96.400832] pending state change SNK_TRANSITION_SINK -> HARD_RESET_SEND @ 500 ms [rev2 POWER_NEGOTIATION] [ 96.577315] PD RX, header: 0x566 [1] [ 96.577321] Setting voltage/current limit 9000 mV 2000 mA [ 96.578363] set_auto_vbus_discharge_threshold mode:3 pps_active:n vbus:9000 ret:0 [ 96.578370] state change SNK_TRANSITION_SINK -> SNK_READY [rev2 POWER_NEGOTIATION] With the patch: [ 168.398573] CC1: 0 -> 0, CC2: 0 -> 5 [state TOGGLING, polarity 0, connected] [ 168.398605] state change TOGGLING -> SNK_ATTACH_WAIT [rev3 NONE_AMS] [ 168.398619] pending state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED @ 170 ms [rev3 NONE_AMS] [ 168.522348] VBUS on [ 168.568676] state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED [delayed 170 ms] [ 168.568684] state change SNK_DEBOUNCED -> SNK_ATTACHED [rev3 NONE_AMS] [ 168.568688] polarity 1 [ 168.569867] set_auto_vbus_discharge_threshold mode:0 pps_active:n vbus:5000 ret:0 [ 168.570158] enable vbus discharge ret:0 [ 168.570161] Requesting mux state 1, usb-role 2, orientation 2 [ 168.570504] state change SNK_ATTACHED -> SNK_STARTUP [rev3 NONE_AMS] [ 168.570634] pending state change SNK_STARTUP -> SNK_DISCOVERY @ 500 ms [rev3 NONE_AMS] [ 169.070689] state change SNK_STARTUP -> SNK_DISCOVERY [delayed 500 ms] [ 169.070695] Setting voltage/current limit 5000 mV 3000 mA [ 169.070702] vbus=0 charge:=1 [ 169.072719] state change SNK_DISCOVERY -> SNK_WAIT_CAPABILITIES [rev3 NONE_AMS] [ 169.073145] pending state change SNK_WAIT_CAPABILITIES -> HARD_RESET_SEND @ 450 ms [rev3 NONE_AMS] [ 169.077162] PD RX, header: 0x2161 [1] [ 169.077172] PDO 0: type 0, 5000 mV, 3000 mA [E] [ 169.077178] PDO 1: type 0, 9000 mV, 2000 mA [] [ 169.077183] state change SNK_WAIT_CAPABILITIES -> SNK_NEGOTIATE_CAPABILITIES [rev2 POWER_NEGOTIATION] [ 169.077191] Setting usb_comm capable false [ 169.077753] cc=0 cc1=0 cc2=5 vbus=0 vconn=sink polarity=1 [ 169.077759] Requesting PDO 1: 9000 mV, 2000 mA [ 169.077762] PD TX, header: 0x1042 [ 169.079990] PD TX complete, status: 0 [ 169.080013] pending state change SNK_NEGOTIATE_CAPABILITIES -> HARD_RESET_SEND @ 60 ms [rev2 POWER_NEGOTIATION] [ 169.083183] VBUS on [ 169.084195] PD RX, header: 0x363 [1] [ 169.084200] state change SNK_NEGOTIATE_CAPABILITIES -> SNK_TRANSITION_SINK [rev2 POWER_NEGOTIATION] [ 169.084206] Setting standby current 5000 mV @ 500 mA [ 169.084209] Setting voltage/current limit 5000 mV 500 mA [ 169.084220] pending state change SNK_TRANSITION_SINK -> HARD_RESET_SEND @ 500 ms [rev2 POWER_NEGOTIATION] [ 169.260222] PD RX, header: 0x566 [1] [ 169.260227] Setting voltage/current limit 9000 mV 2000 mA [ 169.261315] set_auto_vbus_discharge_threshold mode:3 pps_active:n vbus:9000 ret:0 [ 169.261321] state change SNK_TRANSITION_SINK -> SNK_READY [rev2 POWER_NEGOTIATION] [ 169.261570] AMS POWER_NEGOTIATION finished Fixes: f0690a25a140b ("staging: typec: USB Type-C Port Manager (tcpm)") Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com> Link: https://lore.kernel.org/r/20210414024000.4175263-1-badhri@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14tty: fix return value for unsupported ioctlsJohan Hovold
[ Upstream commit 1b8b20868a6d64cfe8174a21b25b74367bdf0560 ] Drivers should return -ENOTTY ("Inappropriate I/O control operation") when an ioctl isn't supported, while -EINVAL is used for invalid arguments. Fix up the TIOCMGET, TIOCMSET and TIOCGICOUNT helpers which returned -EINVAL when a tty driver did not implement the corresponding operations. Note that the TIOCMGET and TIOCMSET helpers predate git and do not get a corresponding Fixes tag below. Fixes: d281da7ff6f7 ("tty: Make tiocgicount a handler") Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20210407095208.31838-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14PM: runtime: Replace inline function pm_runtime_callbacks_present()YueHaibing
[ Upstream commit 953c1fd96b1a70bcbbfb10973c2126eba8d891c7 ] Commit 9a7875461fd0 ("PM: runtime: Replace pm_runtime_callbacks_present()") forgot to change the inline version. Fixes: 9a7875461fd0 ("PM: runtime: Replace pm_runtime_callbacks_present()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14spi: Fix use-after-free with devm_spi_alloc_*William A. Kennington III
[ Upstream commit 794aaf01444d4e765e2b067cba01cc69c1c68ed9 ] We can't rely on the contents of the devres list during spi_unregister_controller(), as the list is already torn down at the time we perform devres_find() for devm_spi_release_controller. This causes devices registered with devm_spi_alloc_{master,slave}() to be mistakenly identified as legacy, non-devm managed devices and have their reference counters decremented below 0. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 660 at lib/refcount.c:28 refcount_warn_saturate+0x108/0x174 [<b0396f04>] (refcount_warn_saturate) from [<b03c56a4>] (kobject_put+0x90/0x98) [<b03c5614>] (kobject_put) from [<b0447b4c>] (put_device+0x20/0x24) r4:b6700140 [<b0447b2c>] (put_device) from [<b07515e8>] (devm_spi_release_controller+0x3c/0x40) [<b07515ac>] (devm_spi_release_controller) from [<b045343c>] (release_nodes+0x84/0xc4) r5:b6700180 r4:b6700100 [<b04533b8>] (release_nodes) from [<b0454160>] (devres_release_all+0x5c/0x60) r8:b1638c54 r7:b117ad94 r6:b1638c10 r5:b117ad94 r4:b163dc10 [<b0454104>] (devres_release_all) from [<b044e41c>] (__device_release_driver+0x144/0x1ec) r5:b117ad94 r4:b163dc10 [<b044e2d8>] (__device_release_driver) from [<b044f70c>] (device_driver_detach+0x84/0xa0) r9:00000000 r8:00000000 r7:b117ad94 r6:b163dc54 r5:b1638c10 r4:b163dc10 [<b044f688>] (device_driver_detach) from [<b044d274>] (unbind_store+0xe4/0xf8) Instead, determine the devm allocation state as a flag on the controller which is guaranteed to be stable during cleanup. Fixes: 5e844cc37a5c ("spi: Introduce device-managed SPI controller allocation") Signed-off-by: William A. Kennington III <wak@google.com> Link: https://lore.kernel.org/r/20210407095527.2771582-1-wak@google.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14driver core: platform: Declare early_platform_cleanup() prototypeAndy Shevchenko
[ Upstream commit 1768289b44bae847612751d418fc5c5e680b5e5c ] Compiler is not happy: CC drivers/base/platform.o drivers/base/platform.c:1557:20: warning: no previous prototype for ‘early_platform_cleanup’ [-Wmissing-prototypes] 1557 | void __weak __init early_platform_cleanup(void) { } | ^~~~~~~~~~~~~~~~~~~~~~ Declare early_platform_cleanup() prototype in the header to make everyone happy. Fixes: eecd37e105f0 ("drivers: Fix boot problem on SuperH") Cc: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210331150525.59223-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14firmware: xilinx: Remove zynqmp_pm_get_eemi_ops() in ↵Nobuhiro Iwamatsu
IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE) [ Upstream commit 79bfe480a0a0b259ab9fddcd2fe52c03542b1196 ] zynqmp_pm_get_eemi_ops() was removed in commit 4db8180ffe7c: "Firmware: xilinx: Remove eemi ops for fpga related APIs", but not in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE). Any driver who want to communicate with PMC using EEMI APIs use the functions provided for each function This removed zynqmp_pm_get_eemi_ops() in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE), and also modify the documentation for this driver. Fixes: 4db8180ffe7c ("firmware: xilinx: Remove eemi ops for fpga related APIs") Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Link: https://lore.kernel.org/r/20210215155849.2425846-1-iwamatsu@nigauri.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14KVM: Stop looking for coalesced MMIO zones if the bus is destroyedSean Christopherson
commit 5d3c4c79384af06e3c8e25b7770b6247496b4417 upstream. Abort the walk of coalesced MMIO zones if kvm_io_bus_unregister_dev() fails to allocate memory for the new instance of the bus. If it can't instantiate a new bus, unregister_dev() destroys all devices _except_ the target device. But, it doesn't tell the caller that it obliterated the bus and invoked the destructor for all devices that were on the bus. In the coalesced MMIO case, this can result in a deleted list entry dereference due to attempting to continue iterating on coalesced_zones after future entries (in the walk) have been deleted. Opportunistically add curly braces to the for-loop, which encompasses many lines but sneaks by without braces due to the guts being a single if statement. Fixes: f65886606c2d ("KVM: fix memory leak in kvm_io_bus_unregister_dev()") Cc: stable@vger.kernel.org Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210412222050.876100-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-12perf: Rework perf_event_exit_event()Peter Zijlstra
[ Upstream commit ef54c1a476aef7eef26fe13ea10dc090952c00f8 ] Make perf_event_exit_event() more robust, such that we can use it from other contexts. Specifically the up and coming remove_on_exec. For this to work we need to address a few issues. Remove_on_exec will not destroy the entire context, so we cannot rely on TASK_TOMBSTONE to disable event_function_call() and we thus have to use perf_remove_from_context(). When using perf_remove_from_context(), there's two races to consider. The first is against close(), where we can have concurrent tear-down of the event. The second is against child_list iteration, which should not find a half baked event. To address this, teach perf_remove_from_context() to special case !ctx->is_active and about DETACH_CHILD. [ elver@google.com: fix racing parent/child exit in sync_child_event(). ] Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210408103605.1676875-2-elver@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-12mfd: da9063: Support SMBus and I2C modeHubert Streidl
[ Upstream commit 586478bfc9f7e16504d6f64cf18bcbdf6fd0cbc9 ] By default the PMIC DA9063 2-wire interface is SMBus compliant. This means the PMIC will automatically reset the interface when the clock signal ceases for more than the SMBus timeout of 35 ms. If the I2C driver / device is not capable of creating atomic I2C transactions, a context change can cause a ceasing of the clock signal. This can happen if for example a real-time thread is scheduled. Then the DA9063 in SMBus mode will reset the 2-wire interface. Subsequently a write message could end up in the wrong register. This could cause unpredictable system behavior. The DA9063 PMIC also supports an I2C compliant mode for the 2-wire interface. This mode does not reset the interface when the clock signal ceases. Thus the problem depicted above does not occur. This patch tests for the bus functionality "I2C_FUNC_I2C". It can reasonably be assumed that the bus cannot obey SMBus timings if this functionality is set. SMBus commands most probably are emulated in this case which is prone to the latency issue described above. This patch enables the I2C bus mode if I2C_FUNC_I2C is set or otherwise keeps the default SMBus mode. Signed-off-by: Hubert Streidl <hubert.streidl@de.bosch.com> Signed-off-by: Mark Jonas <mark.jonas@de.bosch.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-12mfd: intel-m10-bmc: Fix the register access rangeXu Yilun
[ Upstream commit d9b326b2c3673f939941806146aee38e5c635fd0 ] This patch fixes the max register address of MAX 10 BMC. The range 0x20000000 ~ 0x200000fc are for control registers of the QSPI flash controller, which are not accessible to host. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Reviewed-by: Tom Rix <trix@redhat.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-12power: supply: bq27xxx: fix power_avg for newer ICsMatthias Schiffer
[ Upstream commit c4d57c22ac65bd503716062a06fad55a01569cac ] On all newer bq27xxx ICs, the AveragePower register contains a signed value; in addition to handling the raw value as unsigned, the driver code also didn't convert it to µW as expected. At least for the BQ28Z610, the reference manual incorrectly states that the value is in units of 1mW and not 10mW. I have no way of knowing whether the manuals of other supported ICs contain the same error, or if there are models that actually use 1mW. At least, the new code shouldn't be *less* correct than the old version for any device. power_avg is removed from the cache structure, se we don't have to extend it to store both a signed value and an error code. Always getting an up-to-date value may be desirable anyways, as it avoids inconsistent current and power readings when switching between charging and discharging. Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-12resource: Prevent irqresource_disabled() from erasing flagsAngela Czubak
[ Upstream commit d08a745729646f407277e904b02991458f20d261 ] Some Chromebooks use hard-coded interrupts in their ACPI tables. This is an excerpt as dumped on Relm: ... Name (_HID, "ELAN0001") // _HID: Hardware ID Name (_DDN, "Elan Touchscreen ") // _DDN: DOS Device Name Name (_UID, 0x05) // _UID: Unique ID Name (ISTP, Zero) Method (_CRS, 0, NotSerialized) // _CRS: Current Resource Settings { Name (BUF0, ResourceTemplate () { I2cSerialBusV2 (0x0010, ControllerInitiated, 0x00061A80, AddressingMode7Bit, "\\_SB.I2C1", 0x00, ResourceConsumer, , Exclusive, ) Interrupt (ResourceConsumer, Edge, ActiveLow, Exclusive, ,, ) { 0x000000B8, } }) Return (BUF0) /* \_SB_.I2C1.ETSA._CRS.BUF0 */ } ... This interrupt is hard-coded to 0xB8 = 184 which is too high to be mapped to IO-APIC, so no triggering information is propagated as acpi_register_gsi() fails and irqresource_disabled() is issued, which leads to erasing triggering and polarity information. Do not overwrite flags as it leads to erasing triggering and polarity information which might be useful in case of hard-coded interrupts. This way the information can be read later on even though mapping to APIC domain failed. Signed-off-by: Angela Czubak <acz@semihalf.com> [ rjw: Changelog rearrangement ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-12mmc: core: Fix hanging on I/O during system suspend for removable cardsUlf Hansson
commit 17a17bf50612e6048a9975450cf1bd30f93815b5 upstream. The mmc core uses a PM notifier to temporarily during system suspend, turn off the card detection mechanism for removal/insertion of (e)MMC/SD/SDIO cards. Additionally, the notifier may be used to remove an SDIO card entirely, if a corresponding SDIO functional driver don't have the system suspend/resume callbacks assigned. This behaviour has been around for a very long time. However, a recent bug report tells us there are problems with this approach. More precisely, when receiving the PM_SUSPEND_PREPARE notification, we may end up hanging on I/O to be completed, thus also preventing the system from getting suspended. In the end what happens, is that the cancel_delayed_work_sync() in mmc_pm_notify() ends up waiting for mmc_rescan() to complete - and since mmc_rescan() wants to claim the host, it needs to wait for the I/O to be completed first. Typically, this problem is triggered in Android, if there is ongoing I/O while the user decides to suspend, resume and then suspend the system again. This due to that after the resume, an mmc_rescan() work gets punted to the workqueue, which job is to verify that the card remains inserted after the system has resumed. To fix this problem, userspace needs to become frozen to suspend the I/O, prior to turning off the card detection mechanism. Therefore, let's drop the PM notifiers for mmc subsystem altogether and rely on the card detection to be turned off/on as a part of the system_freezable_wq, that we are already using. Moreover, to allow and SDIO card to be removed during system suspend, let's manage this from a ->prepare() callback, assigned at the mmc_host_class level. In this way, we can use the parent device (the mmc_host_class device), to remove the card device that is the child, in the device_prepare() phase. Reported-by: Kiwoong Kim <kwmad.kim@samsung.com> Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20210310152900.149380-1-ulf.hansson@linaro.org Reviewed-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-12reset: add missing empty function reset_control_rearm()Jim Quinlan
commit 48582b2e3b87b794a9845d488af2c76ce055502b upstream. All other functions are defined for when CONFIG_RESET_CONTROLLER is not set. Fixes: 557acb3d2cd9 ("reset: make shared pulsed reset controls re-triggerable") Link: https://lore.kernel.org/r/20210430152156.21162-2-jim2101024@gmail.com Signed-off-by: Jim Quinlan <jim2101024@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.11+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-07bpf: Fix leakage of uninitialized bpf stack under speculationDaniel Borkmann
commit 801c6058d14a82179a7ee17a4b532cac6fad067f upstream. The current implemented mechanisms to mitigate data disclosure under speculation mainly address stack and map value oob access from the speculative domain. However, Piotr discovered that uninitialized BPF stack is not protected yet, and thus old data from the kernel stack, potentially including addresses of kernel structures, could still be extracted from that 512 bytes large window. The BPF stack is special compared to map values since it's not zero initialized for every program invocation, whereas map values /are/ zero initialized upon their initial allocation and thus cannot leak any prior data in either domain. In the non-speculative domain, the verifier ensures that every stack slot read must have a prior stack slot write by the BPF program to avoid such data leaking issue. However, this is not enough: for example, when the pointer arithmetic operation moves the stack pointer from the last valid stack offset to the first valid offset, the sanitation logic allows for any intermediate offsets during speculative execution, which could then be used to extract any restricted stack content via side-channel. Given for unprivileged stack pointer arithmetic the use of unknown but bounded scalars is generally forbidden, we can simply turn the register-based arithmetic operation into an immediate-based arithmetic operation without the need for masking. This also gives the benefit of reducing the needed instructions for the operation. Given after the work in 7fedb63a8307 ("bpf: Tighten speculative pointer arithmetic mask"), the aux->alu_limit already holds the final immediate value for the offset register with the known scalar. Thus, a simple mov of the immediate to AX register with using AX as the source for the original instruction is sufficient and possible now in this case. Reported-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Piotr Krysiuk <piotras@gmail.com> Reviewed-by: Piotr Krysiuk <piotras@gmail.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-23Merge tag 'gpio-fixes-for-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: "Save and restore the sysconfig register in gpio-omap to fix a power-management issue" * tag 'gpio-fixes-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: omap: Save and restore sysconfig
2021-04-21gpio: omap: Save and restore sysconfigTony Lindgren
As we are using cpu_pm to save and restore context, we must also save and restore the GPIO sysconfig register. This is needed because we are not calling PM runtime functions at all with cpu_pm. We need to save the sysconfig on idle as it's value can get reconfigured by PM runtime and can be different from the init time value. Device specific flags like "ti,no-idle-on-init" can affect the init value. Fixes: b764a5863fd8 ("gpio: omap: Remove custom PM calls and use cpu_pm instead") Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Adam Ford <aford173@gmail.com> Cc: Andreas Kemnade <andreas@kemnade.info> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-04-20capabilities: require CAP_SETFCAP to map uid 0Serge E. Hallyn
cap_setfcap is required to create file capabilities. Since commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities"), a process running as uid 0 but without cap_setfcap is able to work around this as follows: unshare a new user namespace which maps parent uid 0 into the child namespace. While this task will not have new capabilities against the parent namespace, there is a loophole due to the way namespaced file capabilities are represented as xattrs. File capabilities valid in userns 1 are distinguished from file capabilities valid in userns 2 by the kuid which underlies uid 0. Therefore the restricted root process can unshare a new self-mapping namespace, add a namespaced file capability onto a file, then use that file capability in the parent namespace. To prevent that, do not allow mapping parent uid 0 if the process which opened the uid_map file does not have CAP_SETFCAP, which is the capability for setting file capabilities. As a further wrinkle: a task can unshare its user namespace, then open its uid_map file itself, and map (only) its own uid. In this case we do not have the credential from before unshare, which was potentially more restricted. So, when creating a user namespace, we record whether the creator had CAP_SETFCAP. Then we can use that during map_write(). With this patch: 1. Unprivileged user can still unshare -Ur ubuntu@caps:~$ unshare -Ur root@caps:~# logout 2. Root user can still unshare -Ur ubuntu@caps:~$ sudo bash root@caps:/home/ubuntu# unshare -Ur root@caps:/home/ubuntu# logout 3. Root user without CAP_SETFCAP cannot unshare -Ur: root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap -- root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap unable to set CAP_SETFCAP effective capability: Operation not permitted root@caps:/home/ubuntu# unshare -Ur unshare: write failed /proc/self/uid_map: Operation not permitted Note: an alternative solution would be to allow uid 0 mappings by processes without CAP_SETFCAP, but to prevent such a namespace from writing any file capabilities. This approach can be seen at [1]. Background history: commit 95ebabde382 ("capabilities: Don't allow writing ambiguous v3 file capabilities") tried to fix the issue by preventing v3 fscaps to be written to disk when the root uid would map to the same uid in nested user namespaces. This led to regressions for various workloads. For example, see [2]. Ultimately this is a valid use-case we have to support meaning we had to revert this change in 3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities")"). Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4 [1] Link: https://github.com/containers/buildah/issues/3071 [2] Signed-off-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Andrew G. Morgan <morgan@kernel.org> Tested-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-17Merge tag 'net-5.12-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.12-rc8, including fixes from netfilter, and bpf. BPF verifier changes stand out, otherwise things have slowed down. Current release - regressions: - gro: ensure frag0 meets IP header alignment - Revert "net: stmmac: re-init rx buffers when mac resume back" - ethernet: macb: fix the restore of cmp registers Previous releases - regressions: - ixgbe: Fix NULL pointer dereference in ethtool loopback test - ixgbe: fix unbalanced device enable/disable in suspend/resume - phy: marvell: fix detection of PHY on Topaz switches - make tcp_allowed_congestion_control readonly in non-init netns - xen-netback: Check for hotplug-status existence before watching Previous releases - always broken: - bpf: mitigate a speculative oob read of up to map value size by tightening the masking window - sctp: fix race condition in sctp_destroy_sock - sit, ip6_tunnel: Unregister catch-all devices - netfilter: nftables: clone set element expression template - netfilter: flowtable: fix NAT IPv6 offload mangling - net: geneve: check skb is large enough for IPv4/IPv6 header - netlink: don't call ->netlink_bind with table lock held" * tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) netlink: don't call ->netlink_bind with table lock held MAINTAINERS: update my email bpf: Update selftests to reflect new error states bpf: Tighten speculative pointer arithmetic mask bpf: Move sanitize_val_alu out of op switch bpf: Refactor and streamline bounds check into helper bpf: Improve verifier error messages for users bpf: Rework ptr_limit into alu_limit and add common error path bpf: Ensure off_reg has no mixed signed bounds for all types bpf: Move off_reg into sanitize_ptr_alu bpf: Use correct permission flag for mixed signed bounds arithmetic ch_ktls: do not send snd_una update to TCB in middle ch_ktls: tcb close causes tls connection failure ch_ktls: fix device connection close ch_ktls: Fix kernel panic i40e: fix the panic when running bpf in xdpdrv mode net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta net/mlx5e: Fix setting of RS FEC mode net/mlx5: Fix setting of devlink traps in switchdev mode Revert "net: stmmac: re-init rx buffers when mac resume back" ...
2021-04-17Merge tag 'libnvdimm-fixes-for-5.12-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "The largest change is for a regression that landed during -rc1 for block-device read-only handling. Vaibhav found a new use for the ability (originally introduced by virtio_pmem) to call back to the platform to flush data, but also found an original bug in that implementation. Lastly, Arnd cleans up some compile warnings in dax. This has all appeared in -next with no reported issues. Summary: - Fix a regression of read-only handling in the pmem driver - Fix a compile warning - Fix support for platform cache flush commands on powerpc/papr" * tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC libnvdimm: Notify disk drivers to revalidate region read-only dax: avoid -Wempty-body warnings
2021-04-16kasan: remove redundant config optionWalter Wu
CONFIG_KASAN_STACK and CONFIG_KASAN_STACK_ENABLE both enable KASAN stack instrumentation, but we should only need one config, so that we remove CONFIG_KASAN_STACK_ENABLE and make CONFIG_KASAN_STACK workable. see [1]. When enable KASAN stack instrumentation, then for gcc we could do no prompt and default value y, and for clang prompt and default value n. This patch fixes the following compilation warning: include/linux/kasan.h:333:30: warning: 'CONFIG_KASAN_STACK' is not defined, evaluates to 0 [-Wundef] [akpm@linux-foundation.org: fix merge snafu] Link: https://bugzilla.kernel.org/show_bug.cgi?id=210221 [1] Link: https://lkml.kernel.org/r/20210226012531.29231-1-walter-zh.wu@mediatek.com Fixes: d9b571c885a8 ("kasan: fix KASAN_STACK dependency for HW_TAGS") Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix NAT IPv6 offload in the flowtable. 2) icmpv6 is printed as unknown in /proc/net/nf_conntrack. 3) Use div64_u64() in nft_limit, from Eric Dumazet. 4) Use pre_exit to unregister ebtables and arptables hooks, from Florian Westphal. 5) Fix out-of-bound memset in x_tables compat match/target, also from Florian. 6) Clone set elements expression to ensure proper initialization. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12net: phy: marvell: fix detection of PHY on Topaz switchesPali Rohár
Since commit fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading"), Linux reports the temperature of Topaz hwmon as constant -75°C. This is because switches from the Topaz family (88E6141 / 88E6341) have the address of the temperature sensor register different from Peridot. This address is instead compatible with 88E1510 PHYs, as was used for Topaz before the above mentioned commit. Create a new mapping table between switch family and PHY ID for families which don't have a model number. And define PHY IDs for Topaz and Peridot families. Create a new PHY ID and a new PHY driver for Topaz's internal PHY. The only difference from Peridot's PHY driver is the HWMON probing method. Prior this change Topaz's internal PHY is detected by kernel as: PHY [...] driver [Marvell 88E6390] (irq=63) And afterwards as: PHY [...] driver [Marvell 88E6341 Family] (irq=63) Signed-off-by: Pali Rohár <pali@kernel.org> BugLink: https://github.com/globalscaletechnologies/linux/issues/1 Fixes: fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading") Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-10netfilter: arp_tables: add pre_exit hook for table unregisterFlorian Westphal
Same problem that also existed in iptables/ip(6)tables, when arptable_filter is removed there is no longer a wait period before the table/ruleset is free'd. Unregister the hook in pre_exit, then remove the table in the exit function. This used to work correctly because the old nf_hook_unregister API did unconditional synchronize_net. The per-net hook unregister function uses call_rcu instead. Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-04-10netfilter: bridge: add pre_exit hooks for ebtable unregistrationFlorian Westphal
Just like ip/ip6/arptables, the hooks have to be removed, then synchronize_rcu() has to be called to make sure no more packets are being processed before the ruleset data is released. Place the hook unregistration in the pre_exit hook, then call the new ebtables pre_exit function from there. Years ago, when first netns support got added for netfilter+ebtables, this used an older (now removed) netfilter hook unregister API, that did a unconditional synchronize_rcu(). Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit handlers may free the ebtable ruleset while packets are still in flight. This can only happens on module removal, not during netns exit. The new function expects the table name, not the table struct. This is because upcoming patch set (targeting -next) will remove all net->xt.{nat,filter,broute}_table instances, this makes it necessary to avoid external references to those member variables. The existing APIs will be converted, so follow the upcoming scheme of passing name + hook type instead. Fixes: aee12a0a3727e ("ebtables: remove nf_hook_register usage") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-04-09Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "14 patches. Subsystems affected by this patch series: mm (kasan, gup, pagecache, and kfence), MAINTAINERS, mailmap, nds32, gcov, ocfs2, ia64, and lib" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS kfence, x86: fix preemptible warning on KPTI-enabled systems lib/test_kasan_module.c: suppress unused var warning kasan: fix conflict with page poisoning fs: direct-io: fix missing sdio->boundary ia64: fix user_stack_pointer() for ptrace() ocfs2: fix deadlock between setattr and dio_end_io_write gcov: re-fix clang-11+ support nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff mm/gup: check page posion status for coredump. .mailmap: fix old email addresses mailmap: update email address for Jordan Crouse treewide: change my e-mail address, fix my name MAINTAINERS: update CZ.NIC's Turris information