|
[ Upstream commit fed84c78527009d4f799a3ed9a566502fa026d82 ]
Kmemleak does not play well with KASAN (tested on both HPE Apollo 70 and
Huawei TaiShan 2280 aarch64 servers).
After calling start_kernel()->setup_arch()->kasan_init(), the kmemleak early
log buffer usage went from something like 280 to 260000, which caused
kmemleak to be disabled and the crash dump memory reservation to fail. The
multitude of kmemleak_alloc() calls comes from nested loops that run while
KASAN is setting up full memory mappings, so let early kmemleak allocations
skip the memblock_alloc_internal() calls that come from kasan_init(), given
that those early KASAN memory mappings should not reference other memory.
Hence, no kmemleak false positives.
kasan_init
  kasan_map_populate [1]
    kasan_pgd_populate [2]
      kasan_pud_populate [3]
        kasan_pmd_populate [4]
          kasan_pte_populate [5]
            kasan_alloc_zeroed_page
              memblock_alloc_try_nid
                memblock_alloc_internal
                  kmemleak_alloc
[1] for_each_memblock(memory, reg)
[2] while (pgdp++, addr = next, addr != end)
[3] while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)))
[4] while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)))
[5] while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep)))
Link: http://lkml.kernel.org/r/1543442925-17794-1-git-send-email-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ]
Since a2468cc9bfdf ("swap: choose swap device according to numa node"),
avail_lists field of swap_info_struct is changed to an array with
MAX_NUMNODES elements. This increased the size of swap_info_struct to 40KiB,
so it now needs an order-4 page to hold it.
This is not optimal in that:
1 Most systems have way less than MAX_NUMNODES(1024) nodes so it
is a waste of memory;
2 It could cause swapon failure if the swap device is swapped on
after the system has been running for a while, because no order-4
page is available, as pointed out by Vasily Averin.
Solve the above two issues by using nr_node_ids (the actual number of
possible nodes on the running system) for avail_lists instead of
MAX_NUMNODES.
nr_node_ids is unknown at compile time, so it can't be used directly when
declaring this array. What I did here is to declare avail_lists as a
zero-element array and allocate space for it when allocating space for
swap_info_struct. The reason for keeping an array rather than a pointer is
that plist_for_each_entry needs the field to be part of the struct, so a
pointer will not work.
This patch is on top of Vasily Averin's fix commit. I think the use of
kvzalloc for swap_info_struct is still needed in case nr_node_ids is
really big on some systems.
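
The allocation scheme described above can be sketched in userspace C. This is
only an illustration under stated assumptions: struct list_node and
struct swap_info are hypothetical, trimmed-down stand-ins for the kernel's
plist_node and swap_info_struct, calloc() stands in for kvzalloc(), and a C99
flexible array member is used in place of the zero-element array.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct plist_node. */
struct list_node {
    struct list_node *prev, *next;
};

/* Hypothetical, trimmed-down stand-in for swap_info_struct. */
struct swap_info {
    unsigned long flags;
    int prio;
    /* Must stay the last member: one list node per possible NUMA node. */
    struct list_node avail_lists[];
};

/* Bytes needed for one swap_info with nr_nodes trailing list nodes. */
static size_t swap_info_size(size_t nr_nodes)
{
    return sizeof(struct swap_info) + nr_nodes * sizeof(struct list_node);
}

/* Allocate a zeroed swap_info sized for the runtime node count. */
static struct swap_info *swap_info_alloc(size_t nr_nodes)
{
    return calloc(1, swap_info_size(nr_nodes));
}
```

Because the array lives inside the same allocation as the struct, code that
walks avail_lists[node] still sees a field that is part of the struct, which
a separately allocated pointer would not provide.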
Link: http://lkml.kernel.org/r/20181115083847.GA11129@intel.com
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 46f53a65d2de3e1591636c22b626b09d8684fd71 ]
Currently BPF verifier allows narrow loads for a context field only with
offset zero. E.g. if there is a __u32 field then only the following
loads are permitted:
* off=0, size=1 (narrow);
* off=0, size=2 (narrow);
* off=0, size=4 (full).
On the other hand, LLVM can generate a load with a non-zero offset that
makes sense from the program logic point of view, but the verifier
doesn't accept it.
E.g. tools/testing/selftests/bpf/sendmsg4_prog.c has code:
#define DST_IP4 0xC0A801FEU /* 192.168.1.254 */
...
if ((ctx->user_ip4 >> 24) == (bpf_htonl(DST_IP4) >> 24) &&
where ctx is struct bpf_sock_addr.
Some versions of LLVM can produce the following byte code for it:
8: 71 12 07 00 00 00 00 00 r2 = *(u8 *)(r1 + 7)
9: 67 02 00 00 18 00 00 00 r2 <<= 24
10: 18 03 00 00 00 00 00 fe 00 00 00 00 00 00 00 00 r3 = 4261412864 ll
12: 5d 32 07 00 00 00 00 00 if r2 != r3 goto +7 <LBB0_6>
where `*(u8 *)(r1 + 7)` means narrow load for ctx->user_ip4 with size=1
and offset=3 (7 - sizeof(ctx->user_family) = 3). This load is currently
rejected by verifier.
The verifier code that rejects such loads is in bpf_ctx_narrow_access_ok(),
which means any is_valid_access implementation that uses the function
works this way, e.g. bpf_skb_is_valid_access() for __sk_buff or
sock_addr_is_valid_access() for bpf_sock_addr.
The patch makes such loads supported. The offset can be in [0; size_default)
but has to be a multiple of the load size. E.g. for a __u32 field the
following loads are supported now:
* off=0, size=1 (narrow);
* off=1, size=1 (narrow);
* off=2, size=1 (narrow);
* off=3, size=1 (narrow);
* off=0, size=2 (narrow);
* off=2, size=2 (narrow);
* off=0, size=4 (full).
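
The rule above can be written down as a small standalone check. This is a
sketch, not the verifier's code: narrow_load_ok() and its parameter names are
hypothetical, with "off" being the byte offset of the load inside the field,
"size" the load width and "size_default" the full width of the field.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's implementation: decide whether a
 * load of "size" bytes at byte offset "off" inside a context field of
 * "size_default" bytes follows the rule described in the commit message.
 */
static bool narrow_load_ok(uint32_t off, uint32_t size, uint32_t size_default)
{
    /* The load width must be a power of two no wider than the field. */
    if (size == 0 || size > size_default || (size & (size - 1)) != 0)
        return false;
    /* The offset must lie in [0; size_default) and be a multiple of size. */
    if (off >= size_default || off % size != 0)
        return false;
    /* The load must not run past the end of the field. */
    return off + size <= size_default;
}
```

With size_default = 4 this accepts exactly the seven loads listed above,
including off=3, size=1, which corresponds to the `*(u8 *)(r1 + 7)` load
from the sendmsg4_prog.c example.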
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 347a28b586802d09604a149c1a1f6de5dccbe6fa ]
This happened while running in qemu-system-aarch64 with the AMBA PL011 UART
driver and CONFIG_DEBUG_TEST_DRIVER_REMOVE enabled.
arch_initcall(pl011_init) runs before subsys_initcall(default_bdi_init), so
devtmpfs' handle_remove() crashes because the reference count is reached
through a NULL pointer: wb->bdi hasn't been initialized yet.
Rework wb_put() so that it checks wb->bdi before decrementing wb->refcnt,
and add a WARN_ON_ONCE to get a warning if this happens again in other
drivers.
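
A minimal userspace sketch of such a guard follows. It assumes simplified,
hypothetical types: struct wb and struct bdi stand in for bdi_writeback and
backing_dev_info, a plain int replaces the real reference counter, and a
one-time fprintf() stands in for WARN_ON_ONCE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct bdi {                 /* hypothetical stand-in for backing_dev_info */
    int unused;
};

struct wb {                  /* hypothetical stand-in for bdi_writeback */
    struct bdi *bdi;         /* NULL until the bdi has been initialized */
    int refcnt;              /* plain counter instead of the real refcount */
};

/* Drop one reference; returns true if the counter was decremented. */
static bool wb_put_sketch(struct wb *wb)
{
    static bool warned;

    if (wb->bdi == NULL) {
        /* Warn only once, in the spirit of WARN_ON_ONCE, then bail out. */
        if (!warned) {
            warned = true;
            fprintf(stderr, "wb_put_sketch: wb->bdi is not initialized\n");
        }
        return false;
    }
    wb->refcnt--;
    return true;
}
```

The early return leaves the counter untouched when the put arrives before
initialization, and the one-time warning points at the driver that triggered
it.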
Fixes: 52ebea749aae ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks")
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 23b5f73266e59a598c1e5dd435d87651b5a7626b ]
During HARD_RESET the data link is disconnected.
For self-powered devices, the spec advises against doing that.
From USB_PD_R3_0
7.1.5 Response to Hard Resets
Device operation during and after a Hard Reset is defined as follows:
Self-powered devices Should Not disconnect from USB during a Hard Reset
(see Section 9.1.2).
Bus powered devices will disconnect from USB during a Hard Reset due to the
loss of their power source.
Tackle this by letting TCPM know whether the device is self or bus powered.
This avoids unnecessary port disconnections on hard reset.
It also speeds up the enumeration time when connected to Type-A ports.
Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
---------
Version history:
V3:
Rebase on top of usb-next
V2:
Based on feedback from heikki.krogerus@linux.intel.com
- self_powered added to the struct tcpm_port which is populated from
a. "connector" node of the device tree in tcpm_fw_get_caps()
b. "self_powered" node of the tcpc_config in tcpm_copy_caps
Based on feedback from linux@roeck-us.net
- Code was refactored
- SRC_HARD_RESET_VBUS_OFF sets the link state to false based
on self_powered flag
V1 located here:
https://lkml.org/lkml/2018/9/13/94
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 94a2c3a32b62e868dc1e3d854326745a7f1b8c7a upstream.
We recently got a stack by syzkaller like this:
BUG: sleeping function called from invalid context at mm/slab.h:361
in_atomic(): 1, irqs_disabled(): 0, pid: 6644, name: blkid
INFO: lockdep is turned off.
CPU: 1 PID: 6644 Comm: blkid Not tainted 4.4.163-514.55.6.9.x86_64+ #76
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
0000000000000000 5ba6a6b879e50c00 ffff8801f6b07b10 ffffffff81cb2194
0000000041b58ab3 ffffffff833c7745 ffffffff81cb2080 5ba6a6b879e50c00
0000000000000000 0000000000000001 0000000000000004 0000000000000000
Call Trace:
<IRQ> [<ffffffff81cb2194>] __dump_stack lib/dump_stack.c:15 [inline]
<IRQ> [<ffffffff81cb2194>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
[<ffffffff8129a981>] ___might_sleep+0x291/0x490 kernel/sched/core.c:7675
[<ffffffff8129ac33>] __might_sleep+0xb3/0x270 kernel/sched/core.c:7637
[<ffffffff81794c13>] slab_pre_alloc_hook mm/slab.h:361 [inline]
[<ffffffff81794c13>] slab_alloc_node mm/slub.c:2610 [inline]
[<ffffffff81794c13>] slab_alloc mm/slub.c:2692 [inline]
[<ffffffff81794c13>] kmem_cache_alloc_trace+0x2c3/0x5c0 mm/slub.c:2709
[<ffffffff81cbe9a7>] kmalloc include/linux/slab.h:479 [inline]
[<ffffffff81cbe9a7>] kzalloc include/linux/slab.h:623 [inline]
[<ffffffff81cbe9a7>] kobject_uevent_env+0x2c7/0x1150 lib/kobject_uevent.c:227
[<ffffffff81cbf84f>] kobject_uevent+0x1f/0x30 lib/kobject_uevent.c:374
[<ffffffff81cbb5b9>] kobject_cleanup lib/kobject.c:633 [inline]
[<ffffffff81cbb5b9>] kobject_release+0x229/0x440 lib/kobject.c:675
[<ffffffff81cbb0a2>] kref_sub include/linux/kref.h:73 [inline]
[<ffffffff81cbb0a2>] kref_put include/linux/kref.h:98 [inline]
[<ffffffff81cbb0a2>] kobject_put+0x72/0xd0 lib/kobject.c:692
[<ffffffff8216f095>] put_device+0x25/0x30 drivers/base/core.c:1237
[<ffffffff81c4cc34>] delete_partition_rcu_cb+0x1d4/0x2f0 block/partition-generic.c:232
[<ffffffff813c08bc>] __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
[<ffffffff813c08bc>] rcu_do_batch kernel/rcu/tree.c:2705 [inline]
[<ffffffff813c08bc>] invoke_rcu_callbacks kernel/rcu/tree.c:2973 [inline]
[<ffffffff813c08bc>] __rcu_process_callbacks kernel/rcu/tree.c:2940 [inline]
[<ffffffff813c08bc>] rcu_process_callbacks+0x59c/0x1c70 kernel/rcu/tree.c:2957
[<ffffffff8120f509>] __do_softirq+0x299/0xe20 kernel/softirq.c:273
[<ffffffff81210496>] invoke_softirq kernel/softirq.c:350 [inline]
[<ffffffff81210496>] irq_exit+0x216/0x2c0 kernel/softirq.c:391
[<ffffffff82c2cd7b>] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
[<ffffffff82c2cd7b>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
[<ffffffff82c2bc25>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:746
<EOI> [<ffffffff814cbf40>] ? audit_kill_trees+0x180/0x180
[<ffffffff8187d2f7>] fd_install+0x57/0x80 fs/file.c:626
[<ffffffff8180989e>] do_sys_open+0x45e/0x550 fs/open.c:1043
[<ffffffff818099c2>] SYSC_open fs/open.c:1055 [inline]
[<ffffffff818099c2>] SyS_open+0x32/0x40 fs/open.c:1050
[<ffffffff82c299e1>] entry_SYSCALL_64_fastpath+0x1e/0x9a
In softirq context, we call the RCU callback function
delete_partition_rcu_cb(), which may allocate memory by kzalloc with the
GFP_KERNEL flag. If the allocation cannot be satisfied, it may sleep.
However, that is not allowed in softirq context.
Although we found this problem on linux 4.4, the latest kernel version
seems to have this problem as well. And it is very similar to the
previous one:
https://lkml.org/lkml/2018/7/9/391
Fix it by using an RCU workqueue, which allows sleeping.
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 321c46b91550adc03054125fa7a1639390608e1a upstream.
So far we never had any device registered for the SoC. This resulted in
some small issues that we kept ignoring like:
1) Not working GPIOLIB_IRQCHIP (gpiochip_irqchip_add_key() failing)
2) Lack of proper tree in the /sys/devices/
3) mips_dma_alloc_coherent() silently handling empty coherent_dma_mask
Kernel 4.19 came with a lot of DMA changes and caused a regression on
bcm47xx. Starting with the commit f8c55dc6e828 ("MIPS: use generic dma
noncoherent ops for simple noncoherent platforms") DMA coherent
allocations just fail. Example:
[ 1.114914] bgmac_bcma bcma0:2: Allocation of TX ring 0x200 failed
[ 1.121215] bgmac_bcma bcma0:2: Unable to alloc memory for DMA
[ 1.127626] bgmac_bcma: probe of bcma0:2 failed with error -12
[ 1.133838] bgmac_bcma: Broadcom 47xx GBit MAC driver loaded
The bgmac driver also triggers a WARNING:
[ 0.959486] ------------[ cut here ]------------
[ 0.964387] WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 bgmac_enet_probe+0x1b4/0x5c4
[ 0.973751] Modules linked in:
[ 0.976913] CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.9 #0
[ 0.982750] Stack : 804a0000 804597c4 00000000 00000000 80458fd8 8381bc2c 838282d4 80481a47
[ 0.991367] 8042e3ec 00000001 804d38f0 00000204 83980000 00000065 8381bbe0 6f55b24f
[ 0.999975] 00000000 00000000 80520000 00002018 00000000 00000075 00000007 00000000
[ 1.008583] 00000000 80480000 000ee811 00000000 00000000 00000000 80432c00 80248db8
[ 1.017196] 00000009 00000204 83980000 803ad7b0 00000000 801feeec 00000000 804d0000
[ 1.025804] ...
[ 1.028325] Call Trace:
[ 1.030875] [<8000aef8>] show_stack+0x58/0x100
[ 1.035513] [<8001f8b4>] __warn+0xe4/0x118
[ 1.039708] [<8001f9a4>] warn_slowpath_null+0x48/0x64
[ 1.044935] [<80248db8>] bgmac_enet_probe+0x1b4/0x5c4
[ 1.050101] [<802498e0>] bgmac_probe+0x558/0x590
[ 1.054906] [<80252fd0>] bcma_device_probe+0x38/0x70
[ 1.060017] [<8020e1e8>] really_probe+0x170/0x2e8
[ 1.064891] [<8020e714>] __driver_attach+0xa4/0xec
[ 1.069784] [<8020c1e0>] bus_for_each_dev+0x58/0xb0
[ 1.074833] [<8020d590>] bus_add_driver+0xf8/0x218
[ 1.079731] [<8020ef24>] driver_register+0xcc/0x11c
[ 1.084804] [<804b54cc>] bgmac_init+0x1c/0x44
[ 1.089258] [<8000121c>] do_one_initcall+0x7c/0x1a0
[ 1.094343] [<804a1d34>] kernel_init_freeable+0x150/0x218
[ 1.099886] [<803a082c>] kernel_init+0x10/0x104
[ 1.104583] [<80005878>] ret_from_kernel_thread+0x14/0x1c
[ 1.110107] ---[ end trace f441c0d873d1fb5b ]---
This patch sets up a "struct device" (and passes it to bcma), which
allows fixing all the mentioned problems. It will also require a tiny bcma
patch, which will follow through the wireless tree and its maintainer.
Fixes: f8c55dc6e828 ("MIPS: use generic dma noncoherent ops for simple noncoherent platforms")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-wireless@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9e857a40dc4eba15a739b4194d7db873d82c28a0 ]
The bcm87xx and micrel drivers have PHYs which are missing the .features
value. Add them. The bcm87xx is a 10G FEC only PHY. Add the needed
features definition of this PHY.
Fixes: 719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap")
Reported-by: Scott Wood <oss@buserror.net>
Reported-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d4b09acf924b84bae77cad090a9d108e70b43643 upstream.
If a node has NFSv4.1+ mounts inside several net namespaces,
it can lead to a use-after-free in svc_process_common():
svc_process_common()
        /* Setup reply header */
        rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE
svc_process_common() can use an incorrect rqstp->rq_xprt;
its caller, bc_svc_process(), takes it from serv->sv_bc_xprt.
The problem is that serv is a global structure but sv_bc_xprt
is assigned per net namespace.
According to Trond, the whole "let's set up rqstp->rq_xprt
for the back channel" is nothing but a giant hack in order
to work around the fact that svc_process_common() uses it
to find the xpt_ops, and perform a couple of (meaningless
for the back channel) tests of xpt_flags.
All we really need in svc_process_common() is to be able to run
rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr()
Bruce J Fields points out that this xpo_prep_reply_hdr() call
is an awfully roundabout way just to do "svc_putnl(resv, 0);"
in the tcp case.
This patch does not initialize rqstp->rq_xprt in bc_svc_process();
now it calls svc_process_common() with rqstp->rq_xprt = NULL.
To adjust the reply header, svc_process_common() just checks
rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for the tcp case.
To handle the rqstp->rq_xprt = NULL case in functions called from
svc_process_common(), the patch introduces the net namespace pointer
svc_rqst->rq_bc_net and adjusts the SVC_NET() definition.
Some other functions were also adapted to properly handle the described case.
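
The namespace lookup with a NULL rq_xprt can be sketched as follows. This is
an illustration only: struct net, struct xprt and struct rqst are
hypothetical, cut-down stand-ins for the kernel types, and rqst_net() is a
made-up function that mimics what an adjusted SVC_NET() would need to do; the
field names rq_xprt, xpt_net and rq_bc_net are the only parts taken from the
text.

```c
#include <stddef.h>

struct net {                    /* hypothetical stand-in for a net namespace */
    int id;
};

struct xprt {                   /* hypothetical stand-in for svc_xprt */
    struct net *xpt_net;
};

struct rqst {                   /* hypothetical stand-in for svc_rqst */
    struct xprt *rq_xprt;       /* NULL for back-channel requests */
    struct net *rq_bc_net;      /* namespace to use for the back channel */
};

/* Use the transport's namespace if there is one, else the fallback. */
static struct net *rqst_net(const struct rqst *rqstp)
{
    return rqstp->rq_xprt ? rqstp->rq_xprt->xpt_net : rqstp->rq_bc_net;
}
```

Callers that previously dereferenced rq_xprt unconditionally can go through
one accessor of this kind, so the back-channel path never touches a transport
that belongs to another namespace.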
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: stable@vger.kernel.org
Fixes: 23c20ecd4475 ("NFS: callback up - users counting cleanup")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
v2: added lost extern svc_tcp_prep_reply_hdr()
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e4f358916d528d479c3c12bd2fd03f2d5a576380 upstream.
Commit
4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support")
replaced the RETPOLINE define with CONFIG_RETPOLINE checks. Remove the
remaining pieces.
[ bp: Massage commit message. ]
Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support")
Signed-off-by: WANG Chao <chao.wang@ucloud.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: linux-kbuild@vger.kernel.org
Cc: srinivas.eeda@oracle.com
Cc: stable <stable@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181210163725.95977-1-chao.wang@ucloud.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1690d8bb91e370ab772062b79bd434ce815c4729 upstream.
Since commit 2a4eb7358aba "OPP: Don't remove dynamic OPPs from
_dev_pm_opp_remove_table()", dynamically created OPPs aren't
automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
affects the scpi and scmi cpufreq drivers which no longer free OPPs on
failures or on invocations of the policy->exit() callback.
Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
called from these drivers instead of dev_pm_opp_cpumask_remove_table().
In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
opp_list isn't getting accessed simultaneously from other parts of the
OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
the opp_table->lock while traversing through the OPP list. And to
accomplish that, this patch also creates _opp_kref_release_unlocked()
which can be called from this new helper with the opp_table lock already
held.
Cc: 4.20 <stable@vger.kernel.org> # v4.20
Reported-by: Valentin Schneider <valentin.schneider@arm.com>
Fixes: 2a4eb7358aba "OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()"
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 58ef15b765af0d2cbe6799ec564f1dc485010ab8 upstream.
devm semantics arrange for resources to be torn down when
device-driver-probe fails or when device-driver-release completes.
Similar to devm_memremap_pages() there is no need to support an explicit
remove operation when the users properly adhere to devm semantics.
Note that devm_kzalloc() automatically handles allocating node-local
memory.
Link: http://lkml.kernel.org/r/154275559545.76910.9186690723515469051.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a95c90f1e2c253b280385ecf3d4ebfe476926b28 upstream.
The last step before devm_memremap_pages() returns success is to allocate
a release action, devm_memremap_pages_release(), to tear the entire setup
down. However, the result from devm_add_action() is not checked.
Checking the error from devm_add_action() is not enough. The api
currently relies on the fact that the percpu_ref it is using is killed by
the time the devm_memremap_pages_release() is run. Rather than continue
this awkward situation, offload the responsibility of killing the
percpu_ref to devm_memremap_pages_release() directly. This allows
devm_memremap_pages() to do the right thing relative to init failures and
shutdown.
Without this change we could fail to register the teardown of
devm_memremap_pages(). The likelihood of hitting this failure is tiny as
small memory allocations almost always succeed. However, the impact of
the failure is large given any future reconfiguration, or disable/enable,
of an nvdimm namespace will fail forever as subsequent calls to
devm_memremap_pages() will fail to setup the pgmap_radix since there will
be stale entries for the physical address range.
An argument could be made to require that the ->kill() operation be set in
the @pgmap arg rather than passed in separately. However, it helps code
readability, tracking the lifetime of a given instance, to be able to grep
the kill routine directly at the devm_memremap_pages() call site.
Link: http://lkml.kernel.org/r/154275558526.76910.7535251937849268605.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...")
Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 80cd795630d6526ba729a089a435bf74a57af927 upstream.
44d8047f1d8 ("binder: use standard functions to allocate fds")
exposed a pre-existing issue in the binder driver.
fdget() is used in ksys_ioctl() as a performance optimization.
One of the rules associated with fdget() is that ksys_close() must
not be called between the fdget() and the fdput(). There is a case
where this requirement is not met in the binder driver which results
in the reference count dropping to 0 when the device is still in
use. This can result in use-after-free or other issues.
If userspace has passed a file descriptor for the binder driver using
a BINDER_TYPE_FDA object, then ksys_close() is called on it when
handling a binder_ioctl(BC_FREE_BUFFER) command. This violates
the assumptions for using fdget().
The problem is fixed by deferring the close using task_work_add(). A
new variant of __close_fd() was created that returns a struct file
with a reference. The fput() is deferred instead of using ksys_close().
Fixes: 44d8047f1d87a ("binder: use standard functions to allocate fds")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 81b1e6e6a8590a19257e37a1633bec098d499c57 upstream.
Since the addition of platform MSI support, there were two helpers
supposed to allocate/free IRQs for a device:
platform_msi_domain_alloc_irqs()
platform_msi_domain_free_irqs()
In these helpers, IRQ descriptors are allocated in the "alloc" routine
while they are freed in the "free" one.
Later, two other helpers have been added to handle IRQ domains on top
of MSI domains:
platform_msi_domain_alloc()
platform_msi_domain_free()
Seen from the outside, the logic is pretty close to that of the former
helpers, and people used them with the same logic as before: a
platform_msi_domain_alloc() call should be balanced with a
platform_msi_domain_free() call. While this is probably what was
intended, platform_msi_domain_free() does not remove/free
the IRQ descriptor(s) created/inserted in
platform_msi_domain_alloc().
One effect of this situation is that removing a module that requested
an IRQ leaves one orphaned IRQ descriptor (with an allocated MSI
entry) in the device descriptors list. The next time the module is
inserted, the allocation happens twice in the MSI domain: once for the
remaining descriptor and once for the new one. It also has the side
effect of quickly overshooting the maximum number of allocated MSIs,
which then prevents any module requesting an interrupt in the same
domain from being inserted.
This situation has been met with loops of insertion/removal of the
mvpp2.ko module (requesting 15 MSIs each time).
Fixes: 552c494a7666 ("platform-msi: Allow creation of a MSI-based stacked irq domain")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit aff6db454599d62191aabc208930e891748e4322 ]
__ptr_ring_swap_queue() tries to move pointers from the old
ring to the new one, but it forgets to check if ->producer
is beyond the new size at the end of the operation. This leads
to an out-of-bound access in __ptr_ring_produce() as reported
by syzbot.
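
The missing check can be illustrated with a simplified userspace ring. This
is a sketch under assumptions, not the kernel's ptr_ring: struct ring,
ring_consume() and ring_swap_queue() are hypothetical, a NULL slot marks an
empty entry, and entries that do not fit into the new ring are simply
dropped.

```c
#include <stddef.h>

/* Hypothetical, simplified ring: a NULL slot means "empty". */
struct ring {
    void **queue;
    int size;
    int producer;   /* next slot to fill */
    int consumer;   /* next slot to drain */
};

/* Take one pointer out of the ring, or return NULL if it is empty. */
static void *ring_consume(struct ring *r)
{
    void *ptr = r->queue[r->consumer];

    if (ptr) {
        r->queue[r->consumer] = NULL;
        if (++r->consumer >= r->size)
            r->consumer = 0;
    }
    return ptr;
}

/*
 * Move entries into new_queue (new_size zeroed slots) and return the old
 * array. Entries that do not fit are dropped in this sketch.
 */
static void **ring_swap_queue(struct ring *r, void **new_queue, int new_size)
{
    void **old = r->queue;
    int producer = 0;
    void *ptr;

    while ((ptr = ring_consume(r)))
        if (producer < new_size)
            new_queue[producer++] = ptr;

    /* The new ring is full: wrap so the next produce stays in bounds. */
    if (producer >= new_size)
        producer = 0;

    r->queue = new_queue;
    r->size = new_size;
    r->producer = producer;
    r->consumer = 0;
    return old;
}
```

Without the wrap, shrinking a full ring would leave producer equal to the
new size, and the next produce would index one slot past the end of the
array.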
Reported-by: syzbot+8993c0fa96d57c399735@syzkaller.appspotmail.com
Fixes: 5d49de532002 ("ptr_ring: resize support")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
https://github.com/ojeda/linux
Pull compiler_types.h fix from Miguel Ojeda:
"A cleanup for userspace in compiler_types.h: don't pollute userspace
with macro definitions (Xiaozhou Liu)
This is harmless for the kernel, but v4.19 was released with a few
macros exposed to userspace as the patch explains; which this removes,
so it *could* happen that we break something for someone (although
leaving inline redefined is probably worse)"
* tag 'compiler-attributes-for-linus-v4.20' of https://github.com/ojeda/linux:
include/linux/compiler_types.h: don't pollute userspace with macro definitions
|
|
We really need the writecombine flag in dma_alloc_wc, fix a stupid
oversight.
Fixes: 7ed1d91a9e ("dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"The biggest part is a series of reverts for the macro based GCC
inlining workarounds. It caused regressions in distro build and other
kernel tooling environments, and the GCC project was very receptive to
fixing the underlying inliner weaknesses - so as time ran out we
decided to do a reasonably straightforward revert of the patches. The
plan is to rely on the 'asm inline' GCC 9 feature, which might be
backported to GCC 8 and could thus become reasonably widely available
on modern distros.
Other than those reverts, there's misc fixes from all around the
place.
I wish our final x86 pull request for v4.20 was smaller..."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs"
Revert "x86/objtool: Use asm macros to work around GCC inlining bugs"
Revert "x86/refcount: Work around GCC inlining bug"
Revert "x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs"
Revert "x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs"
Revert "x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops"
Revert "x86/extable: Macrofy inline assembly code to work around GCC inlining bugs"
Revert "x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs"
Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"
x86/mtrr: Don't copy uninitialized gentry fields back to userspace
x86/fsgsbase/64: Fix the base write helper functions
x86/mm/cpa: Fix cpa_flush_array() TLB invalidation
x86/vdso: Pass --eh-frame-hdr to the linker
x86/mm: Fix decoy address handling vs 32-bit builds
x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence
x86/dump_pagetables: Fix LDT remap address marker
x86/mm: Fix guard hole handling
|
|
Pull networking fixes from David Miller:
1) Off by one in netlink parsing of mac802154_hwsim, from Alexander
Aring.
2) nf_tables RCU usage fix from Taehee Yoo.
3) Flow dissector needs nhoff and thoff clamping, from Stanislav
Fomichev.
4) Missing sin6_flowinfo initialization in SCTP, from Xin Long.
5) Spectrev1 in ipmr and ip6mr, from Gustavo A. R. Silva.
6) Fix r8169 crash when DEBUG_SHIRQ is enabled, from Heiner Kallweit.
7) Fix SKB leak in rtlwifi, from Larry Finger.
8) Fix state pruning in bpf verifier, from Jakub Kicinski.
9) Don't handle completely duplicate fragments as overlapping, from
Michal Kubecek.
10) Fix memory corruption with macb and 64-bit DMA, from Anssi Hannula.
11) Fix TCP fallback socket release in smc, from Myungho Jung.
12) gro_cells_destroy needs to napi_disable, from Lorenzo Bianconi.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (130 commits)
rds: Fix warning.
neighbor: NTF_PROXY is a valid ndm_flag for a dump request
net: mvpp2: fix the phylink mode validation
net/sched: cls_flower: Remove old entries from rhashtable
net/tls: allocate tls context using GFP_ATOMIC
iptunnel: make TUNNEL_FLAGS available in uapi
gro_cell: add napi_disable in gro_cells_destroy
lan743x: Remove MAC Reset from initialization
net/mlx5e: Remove the false indication of software timestamping support
net/mlx5: Typo fix in del_sw_hw_rule
net/mlx5e: RX, Fix wrong early return in receive queue poll
ipv6: explicitly initialize udp6_addr in udp_sock_create6()
bnxt_en: Fix ethtool self-test loopback.
net/rds: remove user triggered WARN_ON in rds_sendmsg
net/rds: fix warn in rds_message_alloc_sgs
ath10k: skip sending quiet mode cmd for WCN3990
mac80211: free skb fraglist before freeing the skb
nl80211: fix memory leak if validate_pae_over_nl80211() fails
net/smc: fix TCP fallback socket release
vxge: ensure data0 is initialized in when fetching firmware version information
...
|
|
This reverts commit c06c4d8090513f2974dfdbed2ac98634357ac475.
See this commit for details about the revert:
e769742d3584 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"")
Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Richard Biener <rguenther@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three fixes: The t10-pi one is a regression from the 4.19 release, the
qla2xxx one is a 4.20 merge window regression and the bnx2fc is a very
old bug"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: t10-pi: Return correct ref tag when queue has no integrity profile
scsi: bnx2fc: Fix NULL dereference in error handling
Revert "scsi: qla2xxx: Fix NVMe Target discovery"
|
|
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2018-12-15
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) fix liveness propagation of callee saved registers, from Jakub.
2) fix overflow in bpf_jit_limit knob, from Daniel.
3) bpf_flow_dissector api fix, from Stanislav.
4) bpf_perf_event api fix on powerpc, from Sandipan.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Presently the arches arm64, arm and sh have a function which loops
through each memblock and calls memory_present(). riscv will require a
similar function.
Introduce a common memblocks_present() function that can be used by all
the arches. Subsequent patches will clean up the arches that make use of
this.
Link: http://lkml.kernel.org/r/20181107205433.3875-3-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This define is used by arm64 to calculate the size of the vmemmap
region. It is defined as the log2 of the upper bound on the size of a
struct page.
We move it into mm_types.h so it can be defined properly instead of set
and checked with a build bug. This also allows us to use the same
define for riscv.
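
As a rough illustration of the definition above, this user-space Python sketch computes the ceiling log2 of a structure size and uses it to size a vmemmap-style region. The 64-byte structure size, the 4 KiB page size and the 48-bit address range are assumed example values, not taken from this commit, and the sizing formula is a simplified model rather than the arm64 definition.

```python
def order_base_2(n: int) -> int:
    """Return the smallest k such that 2**k >= n (ceiling log2), for n >= 1."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1).bit_length()


# Assumed example values (hypothetical, for illustration only).
STRUCT_PAGE_SIZE = 64   # bytes; assumed upper bound on sizeof(struct page)
PAGE_SHIFT = 12         # 4 KiB pages
ADDRESS_BITS = 48       # size of the address range being described

STRUCT_PAGE_MAX_SHIFT = order_base_2(STRUCT_PAGE_SIZE)


def vmemmap_bytes(address_bits: int, page_shift: int, max_shift: int) -> int:
    """Bytes needed to hold one (rounded-up) struct page per page frame."""
    nr_pages = 1 << (address_bits - page_shift)
    return nr_pages << max_shift
```

Rounding the structure size up to a power of two keeps the region size a simple shift, at the cost of over-reserving virtual space when the real structure is smaller than the bound.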
Link: http://lkml.kernel.org/r/20181107205433.3875-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Macros 'inline' and '__gnu_inline' used to be defined in compiler-gcc.h,
which was (and is) included entirely in (__KERNEL__ && !__ASSEMBLY__).
Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually
exclusive") unintentionally exposed those macros to userspace.
Then commit a3f8a30f3f00 ("Compiler Attributes: use feature checks
instead of version checks") moved '__gnu_inline' back into
(__KERNEL__ && !__ASSEMBLY__) and 'inline' was left behind. Since 'inline'
depends on '__gnu_inline', the compile error "unknown type name
‘__gnu_inline’" appears if userspace somehow includes
<linux/compiler.h>.
Other macros like __must_check, notrace, etc. are in a similar situation.
So just move all these macros back into (__KERNEL__ && !__ASSEMBLY__).
Note:
1. This patch only affects what userspace sees.
2. __must_check (when !CONFIG_ENABLE_MUST_CHECK) and noinline_for_stack
were once defined in __KERNEL__ only, but we believe that they can
be put into !__ASSEMBLY__ too.
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Xiaozhou Liu <liuxiaozhou@bytedance.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
mlx5-fixes-2018-12-13
Subject: [pull request][net 0/9] Mellanox, mlx5 fixes 2018-12-13
Saeed Mahameed says:
====================
This series introduces some fixes to the mlx5 core and mlx5e netdevice
driver.
=======
Conflict with net-next: When merged with net-next this series will
cause a moderate conflict:
1) in drivers/net/ethernet/mellanox/mlx5/core/en_tc.c (2 hunks)
Take hunks from net only and just replace *attr->mirror_count with *attr->split_count
1.1) there is one more instance of slow_attr->mirror_count to be replaced
with slow_attr->split_count, it doesn't appear in the conflict, it will
cause a compilation error if left out.
2) in mlx5_ifc.h, take hunks only from net.
Example for the merge resolution can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=merge/mlx5-fixes&id=48830adf29804d85d77ed8a251d625db0eb5b8a8
branch merge/mlx5-fixes of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
(I simply merged this pull request tag into net-next and resolved the conflict)
I don't know if it's ok with you, but to save your time, you can just:
git pull git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux merge/mlx5-fixes
Into net-next, before your next net merge, and you will have a clean
merge of net into net-next (at least for mlx5 files).
======
Please pull and let me know if there's any problem.
For -stable v4.18
338d615be484 ('net/mlx5e: Cancel DIM work on close SQ')
91f40f9904ad ('net/mlx5e: RX, Verify MPWQE stride size is in range')
For -stable v4.19
c5c7e1c41bbe ('net/mlx5e: Remove unused UDP GSO remaining counter')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull XArray fixes from Matthew Wilcox:
"Two bugfixes, each with test-suite updates, two improvements to the
test-suite without associated bugs, and one patch adding a missing
API"
* tag 'xarray-4.20-rc7' of git://git.infradead.org/users/willy/linux-dax:
XArray: Fix xa_alloc when id exceeds max
XArray tests: Check iterating over multiorder entries
XArray tests: Handle larger indices more elegantly
XArray: Add xa_cmpxchg_irq and xa_cmpxchg_bh
radix tree: Don't return retry entries from lookup
|
|
The cap bits locations for the fdb caps of multi path to table (used for
local mirroring) and multi encap (used for prio/chains) were wrongly used
in swapped locations. This went unnoticed so far because we tested the offending
patch with CX5 FW that supports both of them. In environments where only one
of the caps is supported, the driver reads the wrong capability bit; fix that.
Fixes: b9aa0ba17af5 ('net/mlx5: Add cap bits for multi fdb encap')
Signed-off-by: Vu Pham <vu@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix suspicious RCU usage warnings when handling base chain
statistics, from Taehee Yoo.
2) Refetch pointer to tcp header from nf_ct_sack_adjust() since
skb_make_writable() may reallocate data area, reported by Google
folks, patch from Florian.
3) Incorrect netlink nest end after previous cancellation from error
path in ipset, from Pan Bian.
4) Use dst_hold_safe() from nf_xfrm_me_harder(), from Florian.
5) Use rb_link_node_rcu() for rcu-protected rbtree node in
nf_conncount, from Taehee Yoo.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Michael and Sandipan report:
Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
and is adjusted again at init time if MODULES_VADDR is defined.
For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
the compile-time default at boot-time, which is 0x9c400000 when
using 64K page size. This overflows the signed 32-bit bpf_jit_limit
value:
root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
-1673527296
and can cause various unexpected failures throughout the network
stack. In one case `strace dhclient eth0` reported:
setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
16) = -1 ENOTSUPP (Unknown error 524)
and similar failures can be seen with tools like tcpdump. This doesn't
always reproduce however, and I'm not sure why. The more consistent
failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
host would time out on systemd/netplan configuring a virtio-net NIC
with no noticeable errors in the logs.
Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().
Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.
Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.
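
The overflow reported above can be reproduced with plain arithmetic. This user-space Python sketch reinterprets the compile-time default as a signed 32-bit integer; only the 64K page size and the 40000 multiplier come from the report, while the helper name as_signed_32() is made up for illustration.

```python
import ctypes

PAGE_SIZE_64K = 64 * 1024   # 64K pages, as in the ppc64 report
DEFAULT_PAGES = 40000       # multiplier quoted in the commit message


def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


limit = PAGE_SIZE_64K * DEFAULT_PAGES
print(hex(limit))            # 0x9c400000
print(as_signed_32(limit))   # -1673527296
```

With 4K pages the same product is 163840000, which fits in a signed 32-bit value, so the problem only shows up with the larger page size.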
Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Pull networking fixes from David Miller:
"A decent batch of fixes here. I'd say about half are for problems that
have existed for a while, and half are for new regressions added in
the 4.20 merge window.
1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach.
2) Revert bogus emac driver change, from Benjamin Herrenschmidt.
3) Handle BPF exported data structure with pointers when building
32-bit userland, from Daniel Borkmann.
4) Memory leak fix in act_police, from Davide Caratti.
5) Check RX checksum offload in RX descriptors properly in aquantia
driver, from Dmitry Bogdanov.
6) SKB unlink fix in various spots, from Edward Cree.
7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from
Eric Dumazet.
8) Fix FID leak in mlxsw driver, from Ido Schimmel.
9) IOTLB locking fix in vhost, from Jean-Philippe Brucker.
10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory
limits otherwise namespace exit can hang. From Jiri Wiesner.
11) Address block parsing length fixes in x25 from Martin Schiller.
12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan.
13) For tun interfaces, only iface delete works with rtnl ops, enforce
this by disallowing add. From Nicolas Dichtel.
14) Use after free in liquidio, from Pan Bian.
15) Fix SKB use after passing to netif_receive_skb(), from Prashant
Bhole.
16) Static key accounting and other fixes in XPS from Sabrina Dubroca.
17) Partially initialized flow key passed to ip6_route_output(), from
Shmulik Ladkani.
18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas
Falcon.
19) Several small TCP fixes (off-by-one on window probe abort, NULL
deref in tail loss probe, SNMP mis-estimations) from Yuchung
Cheng"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits)
net/sched: cls_flower: Reject duplicated rules also under skip_sw
bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips.
bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips.
bnxt_en: Keep track of reserved IRQs.
bnxt_en: Fix CNP CoS queue regression.
net/mlx4_core: Correctly set PFC param if global pause is turned off.
Revert "net/ibm/emac: wrong bit is used for STA control"
neighbour: Avoid writing before skb->head in neigh_hh_output()
ipv6: Check available headroom in ip6_xmit() even without options
tcp: lack of available data can also cause TSO defer
ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output
mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl
mlxsw: spectrum_router: Relax GRE decap matching check
mlxsw: spectrum_switchdev: Avoid leaking FID's reference count
mlxsw: spectrum_nve: Remove easily triggerable warnings
ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes
sctp: frag_point sanity check
tcp: fix NULL ref in tail loss probe
tcp: Do not underestimate rwnd_limited
net: use skb_list_del_init() to remove from RX sublists
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small driver fixes for 4.20-rc6.
There is a hyperv fix that for some reason took forever to get into a
shape that could be applied to the tree properly, but resolves a much
reported issue. The others are some gnss patches, one a bugfix and the
two others updates to the MAINTAINERS file to properly match the gnss
files in the tree.
All have been in linux-next for a while with no reported issues"
* tag 'char-misc-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
MAINTAINERS: exclude gnss from SIRFPRIMA2 regex matching
MAINTAINERS: add gnss scm tree
gnss: sirf: fix activation retry handling
Drivers: hv: vmbus: Offload the handling of channels to two workqueues
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 4.20-rc6
The "largest" here are some xhci fixes for reported issues. Also here
is a USB core fix, some quirk additions, and a usb-serial fix which
required the export of one of the tty layer's functions to prevent
code duplication. The tty maintainer agreed with this change.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: Prevent U1/U2 link pm states if exit latency is too long
xhci: workaround CSS timeout on AMD SNPS 3.0 xHC
USB: check usb_get_extra_descriptor for proper size
USB: serial: console: fix reported terminal settings
usb: quirk: add no-LPM quirk on SanDisk Ultra Flair device
USB: Fix invalid-free bug in port_over_current_notify()
usb: appledisplay: Add 27" Apple Cinema Display
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull dax fixes from Dan Williams:
"The last of the known regression fixes and fallout from the Xarray
conversion of the filesystem-dax implementation.
On the path to debugging why the dax memory-failure injection test
started failing after the Xarray conversion a couple more fixes for
the dax_lock_mapping_entry(), now called dax_lock_page(), surfaced.
Those plus the bug that started the hunt are now addressed. These
patches have appeared in a -next release with no issues reported.
Note the touches to mm/memory-failure.c are just the conversion to the
new function signature for dax_lock_page().
Summary:
- Fix the Xarray conversion of fsdax to properly handle
dax_lock_mapping_entry() in the presence of pmd entries
- Fix inode destruction racing a new lock request"
* tag 'dax-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Fix unlock mismatch with updated API
dax: Don't access a freed inode
dax: Check page->mapping isn't NULL
|
|
alloc_hugepage_direct_gfpmask"
This reverts commit 89c83fb539f95491be80cdd5158e6f0ce329e317.
This should have been done as part of 2f0799a0ffc0 ("mm, thp: restore
node-local hugepage allocations"). The movement of the thp allocation
policy from alloc_pages_vma() to alloc_hugepage_direct_gfpmask() was
intended to only set __GFP_THISNODE for mempolicies that are not
MPOL_BIND whereas the revert could set this regardless of mempolicy.
While the check for MPOL_BIND between alloc_hugepage_direct_gfpmask()
and alloc_pages_vma() was racy, that has been removed since the
revert. What is left is the possibility to use __GFP_THISNODE in
policy_node() when it is unexpected because the special handling for
hugepages in alloc_pages_vma() was removed as part of the consolidation.
Secondly, prior to 89c83fb539f9, alloc_pages_vma() implemented a somewhat
different policy for hugepage allocations, which were allocated through
alloc_hugepage_vma(). For hugepage allocations, if the allocating
process's node is in the set of allowed nodes, allocate with
__GFP_THISNODE for that node (for MPOL_PREFERRED, use that node with
__GFP_THISNODE instead). This was changed for shmem_alloc_hugepage() to
allow fallback to other nodes in 89c83fb539f9 as it did for new_page() in
mm/mempolicy.c which is functionally different behavior and removes the
requirement to only allocate hugepages locally.
So this commit does a full revert of 89c83fb539f9 instead of the partial
revert that was done in 2f0799a0ffc0. The result is the same thp
allocation policy for 4.20 that was in 4.19.
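
The hugepage policy described two paragraphs above can be modelled as a small decision function. This is a user-space Python sketch of that paragraph only, not the kernel's alloc_pages_vma(); the function name, its parameters and the (node, this_node_only) return shape are assumptions made for illustration, with this_node_only standing in for __GFP_THISNODE.

```python
from typing import Optional, Set, Tuple


def hugepage_target(local_node: int,
                    allowed_nodes: Set[int],
                    preferred_node: Optional[int] = None
                    ) -> Tuple[Optional[int], bool]:
    """Return (node, this_node_only) for a hugepage allocation.

    (None, False) means no node-local restriction applies in this model
    and the caller falls back to its normal policy.
    """
    if preferred_node is not None:
        # MPOL_PREFERRED case: stay on the preferred node.
        return preferred_node, True
    if local_node in allowed_nodes:
        # The allocating process's node is allowed: stay local.
        return local_node, True
    return None, False
```

For example, hugepage_target(0, {0, 1}) pins the allocation to node 0, while a process running on a node outside the allowed set gets no node-local restriction.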
Fixes: 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")
Fixes: 2f0799a0ffc0 ("mm, thp: restore node-local hugepage allocations")
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit ddd0bc756983 ("block: move ref_tag calculation func to the block
layer") moved ref tag calculation from SCSI to a library function. However,
this change broke returning the correct ref tag for devices operating in
DIF mode since these do not have an associated block integrity profile.
This in turn caused read/write failures on PI-formatted disks attached to
an mpt3sas controller.
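
One way to picture the fix is a ref tag helper that falls back to the logical block size when no integrity profile is registered. The Python sketch below is a user-space model under that assumption, not a copy of the patch; the function and parameter names are hypothetical, and interval_exp == 0 is used here to mean "no integrity profile".

```python
SECTOR_SHIFT = 9  # 512-byte sectors


def ref_tag(start_sector: int, logical_block_size: int,
            interval_exp: int = 0) -> int:
    """Lower 32 bits of the start sector, counted in protection intervals."""
    if logical_block_size < 512 or logical_block_size & (logical_block_size - 1):
        raise ValueError("logical block size must be a power of two >= 512")
    # Default interval: the logical block size (log2 of a power of two).
    shift = logical_block_size.bit_length() - 1
    if interval_exp:
        # An integrity profile is present and defines its own interval.
        shift = interval_exp
    return (start_sector >> (shift - SECTOR_SHIFT)) & 0xFFFFFFFF
```

In this model a device in DIF mode without a profile still gets a tag derived from its block size, for example ref_tag(8, 4096) is 1.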
Fixes: ddd0bc756983 ("block: move ref_tag calculation func to the block layer")
Cc: stable@vger.kernel.org # 4.19+
Reported-by: John Garry <john.garry@huawei.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull NFS client bugfixes from Trond Myklebust:
"This is mainly fallout from the updates to the SUNRPC code that is
being triggered from less common combinations of NFS mount options.
Highlights include:
Stable fixes:
- Fix a page leak when using RPCSEC_GSS/krb5p to encrypt data.
Bugfixes:
- Fix a regression that causes the RPC receive code to hang
- Fix call_connect_status() so that it handles tasks that got
transmitted while queued waiting for the socket lock.
- Fix a memory leak in call_encode()
- Fix several other connect races.
- Fix receive code error handling.
- Use the discard iterator rather than MSG_TRUNC for compatibility
with AF_UNIX/AF_LOCAL sockets.
- nfs: don't dirty kernel pages read by direct-io
- pnfs/Flexfiles fix to enforce per-mirror stateid only for NFSv4
data servers"
* tag 'nfs-for-4.20-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Don't force a redundant disconnection in xs_read_stream()
SUNRPC: Fix up socket polling
SUNRPC: Use the discard iterator rather than MSG_TRUNC
SUNRPC: Treat EFAULT as a truncated message in xs_read_stream_request()
SUNRPC: Fix up handling of the XDRBUF_SPARSE_PAGES flag
SUNRPC: Fix RPC receive hangs
SUNRPC: Fix a potential race in xprt_connect()
SUNRPC: Fix a memory leak in call_encode()
SUNRPC: Fix leak of krb5p encode pages
SUNRPC: call_connect_status() must handle tasks that got transmitted
nfs: don't dirty kernel pages read by direct-io
flexfiles: enforce per-mirror stateid only for v4 DSes
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial fix for v4.20-rc6
Here's a fix for a reported USB-console regression in 4.18 which
revealed a long-standing bug in the console implementation.
The patch has been in linux-next overnight with no reported issues.
Signed-off-by: Johan Hovold <johan@kernel.org>
* tag 'usb-serial-4.20-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: console: fix reported terminal settings
|
|
These convenience wrappers match the other _irq and _bh wrappers we
already have. It turns out I'd already open-coded xa_cmpxchg_irq()
in the shmem code, so convert that.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2018-12-05
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) fix bpf uapi pointers for 32-bit architectures, from Daniel.
2) improve verifier ability to handle progs with a lot of branches, from Alexei.
3) strict btf checks, from Yonghong.
4) bpf_sk_lookup api cleanup, from Joe.
5) other misc fixes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a full revert of ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for
MADV_HUGEPAGE mappings") and a partial revert of 89c83fb539f9 ("mm, thp:
consolidate THP gfp handling into alloc_hugepage_direct_gfpmask").
By not setting __GFP_THISNODE, applications can allocate remote hugepages
when the local node is fragmented or low on memory when either the thp
defrag setting is "always" or the vma has been madvised with
MADV_HUGEPAGE.
Remote access to hugepages often has much higher latency than local pages
of the native page size. On Haswell, ac5b2c18911f was shown to cause a
13.9% access regression for binaries that remap their
text segment to be backed by transparent hugepages.
The intent of ac5b2c18911f is to address an issue where a local node is
low on memory or fragmented such that a hugepage cannot be allocated. In
every scenario where this was described as a fix, there is abundant and
unfragmented remote memory available to allocate from, even with a greater
access latency.
If remote memory is also low or fragmented, not setting __GFP_THISNODE was
also measured on Haswell to have a 40% regression in allocation latency.
Restore __GFP_THISNODE for thp allocations.
Fixes: ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
Fixes: 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When reading an extra descriptor, we need to properly check the minimum
and maximum size allowed, to guard against invalid data sent by a
device.
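
The kind of check described here can be sketched in user space. The Python function below walks a buffer of length-prefixed descriptors and only returns one whose declared length is at least the caller's minimum and no larger than the remaining data. It is an illustration under assumed names (find_extra_descriptor, min_size), not the kernel helper.

```python
from typing import Optional

HEADER_LEN = 2  # bLength byte followed by bDescriptorType byte


def find_extra_descriptor(buf: bytes, desc_type: int,
                          min_size: int) -> Optional[bytes]:
    """Return the first descriptor of desc_type that is at least min_size
    bytes long, or None if the buffer is malformed or has no match."""
    offset = 0
    while len(buf) - offset >= HEADER_LEN:
        length = buf[offset]
        dtype = buf[offset + 1]
        remaining = len(buf) - offset
        if length < HEADER_LEN or length > remaining:
            # Too short to be a descriptor, or claims more data than exists.
            return None
        if dtype == desc_type and length >= min_size:
            return buf[offset:offset + length]
        offset += length
    return None
```

A caller that expects a fixed-size structure passes that size as min_size, so a device cannot make it read past a truncated descriptor.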
Reported-by: Hui Peng <benquike@gmail.com>
Reported-by: Mathias Payer <mathias.payer@nebelwelt.net>
Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hui Peng <benquike@gmail.com>
Signed-off-by: Mathias Payer <mathias.payer@nebelwelt.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The USB-serial console implementation has never reported the actual
terminal settings used. Despite storing the corresponding cflags in its
struct console, these were never honoured on later tty open() where the
tty termios would be left initialised to the driver defaults.
Unlike the serial console implementation, the USB-serial code calls
subdriver open() already at console setup. While calling set_termios()
and write() before open() looks like it could work for some USB-serial
drivers, others definitely do not expect this, so modelling this after
serial core is going to be intrusive, if at all possible.
Instead, use a (renamed) tty helper to save the termios data used at
console setup so that the tty termios reflects the actual terminal
settings after a subsequent tty open().
Note that the calls to tty_init_termios() (tty_driver_install()) and
tty_save_termios() are serialised using the disconnect mutex.
This specifically fixes a regression that was triggered by a recent
change adding software flow control to the pl2303 driver: a getty trying
to disable flow control while leaving the baud rate unchanged would now
also set the baud rate to the driver default (prior to the flow-control
change this had been a noop).
Fixes: 7041d9c3f01b ("USB: serial: pl2303: add support for tx xon/xoff flow control")
Cc: stable <stable@vger.kernel.org> # 4.18
Cc: Florian Zumbiehl <florz@florz.de>
Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
|
|
Internal to dax_unlock_mapping_entry(), dax_unlock_entry() is used to
store a replacement entry in the Xarray at the given xas-index with the
DAX_LOCKED bit clear. When called, dax_unlock_entry() expects the unlocked
value of the entry relative to the current Xarray state to be specified.
In most contexts dax_unlock_entry() is operating in the same scope as
the matched dax_lock_entry(). However, in the dax_unlock_mapping_entry()
case the implementation needs to recall the original entry. In the case
where the original entry is a 'pmd' entry it is possible that the pfn
used for the lookup is misaligned relative to the value retrieved from the
Xarray.
Change the api to return the unlock cookie from dax_lock_page() and pass
it to dax_unlock_page(). This fixes a bug where dax_unlock_page() was
assuming that the page was PMD-aligned if the entry was a PMD entry with
signatures like:
WARNING: CPU: 38 PID: 1396 at fs/dax.c:340 dax_insert_entry+0x2b2/0x2d0
RIP: 0010:dax_insert_entry+0x2b2/0x2d0
[..]
Call Trace:
dax_iomap_pte_fault.isra.41+0x791/0xde0
ext4_dax_huge_fault+0x16f/0x1f0
? up_read+0x1c/0xa0
__do_fault+0x1f/0x160
__handle_mm_fault+0x1033/0x1490
handle_mm_fault+0x18b/0x3d0
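
The API change can be pictured with a toy model: the lock call hands back the unlocked entry as a cookie, and the unlock call stores that cookie instead of rebuilding the entry from the page. The Python sketch below is a user-space illustration only; ToyMapping, the bit layout and the 512-page PMD span are assumptions, not the fs/dax.c implementation.

```python
PMD_PAGES = 512   # pages covered by one PMD entry (assumed 4 KiB pages)
LOCKED = 0x1      # toy lock bit
PMD = 0x2         # toy "this is a PMD entry" bit


class ToyMapping:
    def __init__(self):
        self._slots = {}  # slot index -> integer entry

    def insert(self, index, pfn, pmd=False):
        if pmd:
            index -= index % PMD_PAGES  # a PMD entry lives at its aligned index
        self._slots[index] = (pfn << 2) | (PMD if pmd else 0)

    def _slot_for(self, index):
        if index in self._slots:
            return index
        base = index - index % PMD_PAGES
        if self._slots.get(base, 0) & PMD:
            return base
        raise KeyError(index)

    def entry_at(self, index):
        return self._slots[self._slot_for(index)]

    def lock_page(self, index):
        """Set the lock bit and return the unlocked entry as a cookie."""
        slot = self._slot_for(index)
        entry = self._slots[slot]
        if entry & LOCKED:
            raise RuntimeError("entry already locked")
        self._slots[slot] = entry | LOCKED
        return entry

    def unlock_page(self, index, cookie):
        """Restore exactly the entry that lock_page() returned."""
        self._slots[self._slot_for(index)] = cookie
```

Because the caller carries the cookie, the unlock path never has to guess whether the page it was given sits at the start of a PMD-sized entry.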
Link: https://lkml.kernel.org/r/20181130154902.GL10377@bombadil.infradead.org
Fixes: 9f32d221301c ("dax: Convert dax_lock_mapping_entry to XArray")
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
basechain->stats is rcu protected data which is updated from
nft_chain_stats_replace(). This function is executed from the commit
phase which holds the pernet nf_tables commit mutex - not the global
nfnetlink subsystem mutex.
Test commands to reproduce the problem are:
%iptables-nft -I INPUT
%iptables-nft -Z
%iptables-nft -Z
This patch uses RCU calls to handle basechain->stats updates to fix a
splat that looks like:
[89279.358755] =============================
[89279.363656] WARNING: suspicious RCU usage
[89279.368458] 4.20.0-rc2+ #44 Tainted: G W L
[89279.374661] -----------------------------
[89279.379542] net/netfilter/nf_tables_api.c:1404 suspicious rcu_dereference_protected() usage!
[...]
[89279.406556] 1 lock held by iptables-nft/5225:
[89279.411728] #0: 00000000bf45a000 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1f/0x70 [nf_tables]
[89279.424022] stack backtrace:
[89279.429236] CPU: 0 PID: 5225 Comm: iptables-nft Tainted: G W L 4.20.0-rc2+ #44
[89279.430135] Call Trace:
[89279.430135] dump_stack+0xc9/0x16b
[89279.430135] ? show_regs_print_info+0x5/0x5
[89279.430135] ? lockdep_rcu_suspicious+0x117/0x160
[89279.430135] nft_chain_commit_update+0x4ea/0x640 [nf_tables]
[89279.430135] ? sched_clock_local+0xd4/0x140
[89279.430135] ? check_flags.part.35+0x440/0x440
[89279.430135] ? __rhashtable_remove_fast.constprop.67+0xec0/0xec0 [nf_tables]
[89279.430135] ? sched_clock_cpu+0x126/0x170
[89279.430135] ? find_held_lock+0x39/0x1c0
[89279.430135] ? hlock_class+0x140/0x140
[89279.430135] ? is_bpf_text_address+0x5/0xf0
[89279.430135] ? check_flags.part.35+0x440/0x440
[89279.430135] ? __lock_is_held+0xb4/0x140
[89279.430135] nf_tables_commit+0x2555/0x39c0 [nf_tables]
Fixes: f102d66b335a4 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
vmbus_process_offer() mustn't call channel->sc_creation_callback()
directly for sub-channels, because sc_creation_callback() ->
vmbus_open() may never get the host's response to the
OPEN_CHANNEL message (the host may rescind a channel at any time,
e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
may not wake up the vmbus_open() as it's blocked due to a non-zero
vmbus_connection.offer_in_progress, and finally we have a deadlock.
The above is also true for primary channels, if the related device
drivers use sync probing mode by default.
And, usually the handling of primary channels and sub-channels can
depend on each other, so we should offload them to different
workqueues to avoid possible deadlock, e.g. in sync-probing mode,
NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
and waits for all the sub-channels to appear, but the latter
can't get the rtnl_lock and this blocks the handling of sub-channels.
The patch can fix the multiple-NIC deadlock described above for
v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
but don't enable async-probing for Hyper-V drivers (yet).
The patch can also fix the hang issue in sub-channel's handling described
above for all versions of kernels, including v4.19 and v4.20-rc4.
So actually the patch should be applied to all the existing kernels,
not only the kernels that have 8195b1396ec8.
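
The starvation pattern behind these deadlocks can be modelled in user space with single-threaded work queues. In the Python sketch below, a primary-channel job waits for a sub-channel job: on one shared queue the sub-channel job cannot start until the primary job gives up, while on two queues it runs immediately. This is a deliberately simplified model with made-up names; a timeout stands in for the real hang so that the example terminates.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def sub_channel_seen(separate_queues: bool, timeout: float = 0.2) -> bool:
    """Return True if the primary job saw the sub-channel before timing out."""
    sub_channel_ready = threading.Event()
    primary_wq = ThreadPoolExecutor(max_workers=1)
    sub_wq = ThreadPoolExecutor(max_workers=1) if separate_queues else primary_wq

    def handle_primary() -> bool:
        # Blocks its queue's only worker until the sub-channel shows up.
        return sub_channel_ready.wait(timeout)

    def handle_sub() -> None:
        sub_channel_ready.set()

    try:
        primary = primary_wq.submit(handle_primary)
        sub_wq.submit(handle_sub)
        return primary.result()
    finally:
        primary_wq.shutdown(wait=True)
        if sub_wq is not primary_wq:
            sub_wq.shutdown(wait=True)
```

Calling sub_channel_seen(False) returns False because the sub-channel job is queued behind the job that waits for it, whereas sub_channel_seen(True) returns True.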
Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Volume is a little higher than usual due to a set of gpio fixes for
Davinci platforms that's been around a while, still seemed appropriate
to not hold off until next merge window.
Besides that it's the usual mix of minor fixes, mostly corrections of
small stuff in device trees.
Major stability-related one is the removal of a regulator from DT on
Rock960, since DVFS caused undervoltage. I expect it'll be restored
once they figure out the underlying issue"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits)
MAINTAINERS: Remove unused Qualcomm SoC mailing list
ARM: davinci: dm644x: set the GPIO base to 0
ARM: davinci: da830: set the GPIO base to 0
ARM: davinci: dm355: set the GPIO base to 0
ARM: davinci: dm646x: set the GPIO base to 0
ARM: davinci: dm365: set the GPIO base to 0
ARM: davinci: da850: set the GPIO base to 0
gpio: davinci: restore a way to manually specify the GPIO base
ARM: davinci: dm644x: define gpio interrupts as separate resources
ARM: davinci: dm355: define gpio interrupts as separate resources
ARM: davinci: dm646x: define gpio interrupts as separate resources
ARM: davinci: dm365: define gpio interrupts as separate resources
ARM: davinci: da8xx: define gpio interrupts as separate resources
ARM: dts: at91: sama5d2: use the divided clock for SMC
ARM: dts: imx51-zii-rdu1: Remove EEPROM node
ARM: dts: rockchip: Remove @0 from the veyron memory node
arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou.
arm64: dts: qcom: msm8998: Reserve gpio ranges on MTP
arm64: dts: sdm845-mtp: Reserve reserved gpios
arm64: dts: ti: k3-am654: Fix wakeup_uart reg address
...
|
|
If we retransmit an RPC request, we currently end up clobbering the
value of req->rq_rcv_buf.bvec that was allocated by the initial call to
xprt_request_prepare(req).
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|