| Age | Commit message (Collapse) | Author |
|
[ Upstream commit f6b5f1a56987de837f8e25cd560847106b8632a8 ]
absolute_pointer() disassociates a pointer from its originating symbol
type and context. Use it to prevent compiler warnings/errors such as
drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
arch/m68k/include/asm/string.h:72:25: error:
'__builtin_memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]
Such warnings may be reported by gcc 11.x for string and memory
operations on fixed addresses.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d4ffd5df9d18031b6a53f934388726775b4452d3 ]
The function __bad_area_nosemaphore() calls kernelmode_fixup_or_oops()
with the parameter @signal being actually @pkey, which will send a
signal numbered with the argument in @pkey.
This bug can be triggered when the kernel fails to access user-given
memory pages that are protected by a pkey, so it can go down the
do_user_addr_fault() path and pass the !user_mode() check in
__bad_area_nosemaphore().
Most cases will simply run the kernel fixup code to make an -EFAULT. But
when another condition current->thread.sig_on_uaccess_err is met, which
is only used to emulate vsyscall, the kernel will generate the wrong
signal.
Add a new parameter @pkey to kernelmode_fixup_or_oops() to fix this.
[ bp: Massage commit message, fix build error as reported by the 0day
bot: https://lkml.kernel.org/r/202109202245.APvuT8BX-lkp@intel.com ]
Fixes: 5042d40a264c ("x86/fault: Bypass no_context() for implicit kernel faults from usermode")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jiashuo Liang <liangjs@pku.edu.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20210730030152.249106-1-liangjs@pku.edu.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
on error
[ Upstream commit fd292c189a979838622d5e03e15fa688c81dd50b ]
Commit 86f8b1c01a0a ("net: dsa: Do not make user port errors fatal")
decided it was fine to ignore errors on certain ports that fail to
probe, and go on with the ports that do probe fine.
Commit fb6ec87f7229 ("net: dsa: Fix type was not set for devlink port")
noticed that devlink_port_type_eth_set(dlp, dp->slave); does not get
called, and devlink notices after a timeout of 3600 seconds and prints a
WARN_ON. So it went ahead to unregister the devlink port. And because
there exists an UNUSED port flavour, we actually re-register the devlink
port as UNUSED.
Commit 08156ba430b4 ("net: dsa: Add devlink port regions support to
DSA") added devlink port regions, which are set up by the driver and not
by DSA.
When we trigger the devlink port deregistration and reregistration as
unused, devlink now prints another WARN_ON, from here:
devlink_port_unregister:
WARN_ON(!list_empty(&devlink_port->region_list));
So the port still has regions, which makes sense, because they were set
up by the driver, and the driver doesn't know we're unregistering the
devlink port.
Somebody needs to tear them down, and optionally (actually it would be
nice, to be consistent) set them up again for the new devlink port.
But DSA's layering stays in our way quite badly here.
The options I've considered are:
1. Introduce a function in devlink to just change a port's type and
flavour. No dice, devlink keeps a lot of state, it really wants the
port to not be registered when you set its parameters, so changing
anything can only be done by destroying what we currently have and
recreating it.
2. Make DSA cache the parameters passed to dsa_devlink_port_region_create,
and the region returned, keep those in a list, then when the devlink
port unregister needs to take place, the existing devlink regions are
destroyed by DSA, and we replay the creation of new regions using the
cached parameters. Problem: mv88e6xxx keeps the region pointers in
chip->ports[port].region, and these will remain stale after DSA frees
them. There are many things DSA can do, but updating mv88e6xxx's
private pointers is not one of them.
3. Just let the driver do it (i.e. introduce a very specific method
called ds->ops->port_reinit_as_unused, which unregisters its devlink
port devlink regions, then the old devlink port, then registers the
new one, then the devlink port regions for it). While it does work,
as opposed to the others, it's pretty horrible from an API
perspective and we can do better.
4. Introduce a new pair of methods, ->port_setup and ->port_teardown,
which in the case of mv88e6xxx must register and unregister the
devlink port regions. Call these 2 methods when the port must be
reinitialized as unused.
Naturally, I went for the 4th approach.
Fixes: 08156ba430b4 ("net: dsa: Add devlink port regions support to DSA")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 93368aab0efc87288cac65e99c9ed2e0ffc9e7d0 upstream.
Fix up a misuse that the filename pointer isn't always valid in
the ring buffer, and we should copy the content instead.
Link: https://lore.kernel.org/r/20210921143531.81356-1-hsiangkao@linux.alibaba.com
Fixes: 13f06f48f7bf ("staging: erofs: support tracepoint")
Cc: stable@vger.kernel.org # 4.19+
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 58877b0824da15698bd85a0a9dbfa8c354e6ecb7 upstream.
It has been observed with certain PCIe USB cards (like Inateck connected
to AM64 EVM or J7200 EVM) that as soon as the primary roothub is
registered, port status change is handled even before xHC is running
leading to cold plug USB devices not detected. For such cases, registering
both the root hubs along with the second HCD is required. Add support for
deferring roothub registration in usb_add_hcd(), so that both primary and
secondary roothubs are registered along with the second HCD.
CC: stable@vger.kernel.org # 5.4+
Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210909064200.16216-2-kishon@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b564171ade70570b7f335fa8ed17adb28409e3ac upstream.
Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.
But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.
1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.
This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.
Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.
As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.
NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.
To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.
If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.
A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.
the same time, confirmed the pending transactions succeeded.
Fixes: 432ff1e91694 ("binder: BINDER_FREEZE ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Li <dualli@google.com>
Test: stress test with apps being frozen and initiating binder calls at
Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 71731090ab17a208a58020e4b342fdfee280458a ]
On init, the disabled state is cleared right before hw_init and that
causes the device to report on "Operational" state before the device
initialization is finished. Although the char device is not yet exposed
to the user at this stage, the sysfs entries are exposed.
This can cause errors in monitoring applications that use the sysfs
entries.
In order to avoid this, a new state "in device creation" is introduced
to ne reported when the device is not disabled but is still in init
flow.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4b92d4add5f6dcf21275185c997d6ecb800054cd ]
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework
to ensure that the cache related functions are called on the upcoming CPU
because the notifier itself could run on any online CPU.
The hotplug state machine guarantees that the callbacks are invoked on the
upcoming CPU. So there is no need to have this SMP function call
obfuscation. That indirection was missed when the hotplug notifiers were
converted.
This also solves the problem of ARM64 init_cache_level() invoking ACPI
functions which take a semaphore in that context. That's invalid as SMP
function calls run with interrupts disabled. Running it just from the
callback in context of the CPU hotplug thread solves this.
Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fb83610762dd5927212aa62a468dd3b756b57a88 ]
There are two pairs of declarations for thermal_cooling_device_register()
and thermal_of_cooling_device_register(), and only one set was changed
in a recent patch, so the other one now causes a compile-time warning:
drivers/net/wireless/mediatek/mt76/mt7915/init.c: In function 'mt7915_thermal_init':
drivers/net/wireless/mediatek/mt76/mt7915/init.c:134:48: error: passing argument 1 of 'thermal_cooling_device_register' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
134 | cdev = thermal_cooling_device_register(wiphy_name(wiphy), phy,
| ^~~~~~~~~~~~~~~~~
In file included from drivers/net/wireless/mediatek/mt76/mt7915/init.c:7:
include/linux/thermal.h:407:39: note: expected 'char *' but argument is of type 'const char *'
407 | thermal_cooling_device_register(char *type, void *devdata,
| ~~~~~~^~~~
Change the dummy helper functions to have the same arguments as the
normal version.
Fixes: f991de53a8ab ("thermal: make device_register's type argument const")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jean-Francois Dagenais <jeff.dagenais@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210722090717.1116748-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c7c5e6ff533fe1f9afef7d2fa46678987a1335a7 ]
syzbot found that forcing a big quantum attribute would crash hosts fast,
essentially using this:
tc qd replace dev eth0 root fq_codel quantum 4294967295
This is because fq_codel_dequeue() would have to loop
~2^31 times in :
if (flow->deficit <= 0) {
flow->deficit += q->quantum;
list_move_tail(&flow->flowchain, &q->old_flows);
goto begin;
}
SFQ max quantum is 2^19 (half a megabyte)
Lets adopt a max quantum of one megabyte for FQ_CODEL.
Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 817f9916a6e96ae43acdd4e75459ef4f92d96eb1 ]
The CONFIG_PCI=y case got a new parameter long time ago. Sync the stub as
well.
[bhelgaas: add parameter names]
Fixes: 725522b5453d ("PCI: add the sysfs driver name to all modules")
Link: https://lore.kernel.org/r/20210813153619.89574-1-andriy.shevchenko@linux.intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f97493657c6372eeefe70faadd214bf31488c44e ]
Joakim Zhang reports that Wake-on-Lan with the stmmac ethernet driver broke
when moving the incorrect handling of mac link state out of mac_config().
This reason this breaks is because the stmmac's WoL is handled by the MAC
rather than the PHY, and phylink doesn't cater for that scenario.
This patch adds the necessary phylink code to handle suspend/resume events
according to whether the MAC still needs a valid link or not. This is the
barest minimum for this support.
Reported-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Tested-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b9edbfe1adecfc48fd11061dce68afb03d6adbdc ]
Commit 3df98d79215ace13 ("lsm,selinux: pass flowi_common instead of flowi
to the LSM hooks") introduced flowi{4,6}_to_flowi_common() functions which
cause UBSAN warning when building with LLVM 11.0.1 on Ubuntu 21.04.
================================================================================
UBSAN: object-size-mismatch in ./include/net/flow.h:197:33
member access within address ffffc9000109fbd8 with insufficient space
for an object of type 'struct flowi'
CPU: 2 PID: 7410 Comm: systemd-resolve Not tainted 5.14.0 #51
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
Call Trace:
dump_stack_lvl+0x103/0x171
ubsan_type_mismatch_common+0x1de/0x390
__ubsan_handle_type_mismatch_v1+0x41/0x50
udp_sendmsg+0xda2/0x1300
? ip_skb_dst_mtu+0x1f0/0x1f0
? sock_rps_record_flow+0xe/0x200
? inet_send_prepare+0x2d/0x90
sock_sendmsg+0x49/0x80
____sys_sendmsg+0x269/0x370
__sys_sendmsg+0x15e/0x1d0
? syscall_enter_from_user_mode+0xf0/0x1b0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f7081a50497
Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
RSP: 002b:00007ffc153870f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f7081a50497
RDX: 0000000000000000 RSI: 00007ffc15387140 RDI: 000000000000000c
RBP: 00007ffc15387140 R08: 0000563f29a5e4fc R09: 000000000000cd28
R10: 0000563f29a68a30 R11: 0000000000000246 R12: 000000000000000c
R13: 0000000000000001 R14: 0000563f29a68a30 R15: 0000563f29a5e50c
================================================================================
I don't think we need to call flowi{4,6}_to_flowi() from these functions
because the first member of "struct flowi4" and "struct flowi6" is
struct flowi_common __fl_common;
while the first member of "struct flowi" is
union {
struct flowi_common __fl_common;
struct flowi4 ip4;
struct flowi6 ip6;
struct flowidn dn;
} u;
which should point to the same address without access to "struct flowi".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d08c8b855140e9f5240b3ffd1b8b9d435675e281 ]
Root Ports in NXP LX2xx0 and LX2xx2, where each Root Port is a Root Complex
with unique segment numbers, do provide isolation features to disable peer
transactions and validate bus numbers in requests, but do not provide an
actual PCIe ACS capability.
Add ACS quirks for NXP LX2xx0 A/C/E/N and LX2xx2 A/C/E/N platforms.
LX2xx0A : without security features + CAN-FD
LX2160A (0x8d81) - 16 cores
LX2120A (0x8da1) - 12 cores
LX2080A (0x8d83) - 8 cores
LX2xx0C : security features + CAN-FD
LX2160C (0x8d80) - 16 cores
LX2120C (0x8da0) - 12 cores
LX2080C (0x8d82) - 8 cores
LX2xx0E : security features + CAN
LX2160E (0x8d90) - 16 cores
LX2120E (0x8db0) - 12 cores
LX2080E (0x8d92) - 8 cores
LX2xx0N : without security features + CAN
LX2160N (0x8d91) - 16 cores
LX2120N (0x8db1) - 12 cores
LX2080N (0x8d93) - 8 cores
LX2xx2A : without security features + CAN-FD
LX2162A (0x8d89) - 16 cores
LX2122A (0x8da9) - 12 cores
LX2082A (0x8d8b) - 8 cores
LX2xx2C : security features + CAN-FD
LX2162C (0x8d88) - 16 cores
LX2122C (0x8da8) - 12 cores
LX2082C (0x8d8a) - 8 cores
LX2xx2E : security features + CAN
LX2162E (0x8d98) - 16 cores
LX2122E (0x8db8) - 12 cores
LX2082E (0x8d9a) - 8 cores
LX2xx2N : without security features + CAN
LX2162N (0x8d99) - 16 cores
LX2122N (0x8db9) - 12 cores
LX2082N (0x8d9b) - 8 cores
[bhelgaas: put PCI_VENDOR_ID_NXP definition next to PCI_VENDOR_ID_FREESCALE
as a clue that they share the same Device ID namespace]
Link: https://lore.kernel.org/r/20210729121747.1823086-1-wasim.khan@oss.nxp.com
Link: https://lore.kernel.org/r/20210803180021.3252886-1-wasim.khan@oss.nxp.com
Signed-off-by: Wasim Khan <wasim.khan@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a57d8c217aadac75530b8e7ffb3a3e1b7bfd0330 ]
Sometimes when unbinding the mv88e6xxx driver on Turris MOX, these error
messages appear:
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 0 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 100 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 0 from fdb: -2
(and similarly for other ports)
What happens is that DSA has a policy "even if there are bugs, let's at
least not leak memory" and dsa_port_teardown() clears the dp->fdbs and
dp->mdbs lists, which are supposed to be empty.
But deleting that cleanup code, the warnings go away.
=> the FDB and MDB lists (used for refcounting on shared ports, aka CPU
and DSA ports) will eventually be empty, but are not empty by the time
we tear down those ports. Aka we are deleting them too soon.
The addresses that DSA complains about are host-trapped addresses: the
local addresses of the ports, and the MAC address of the bridge device.
The problem is that offloading those entries happens from a deferred
work item scheduled by the SWITCHDEV_FDB_DEL_TO_DEVICE handler, and this
races with the teardown of the CPU and DSA ports where the refcounting
is kept.
In fact, not only it races, but fundamentally speaking, if we iterate
through the port list linearly, we might end up tearing down the shared
ports even before we delete a DSA user port which has a bridge upper.
So as it turns out, we need to first tear down the user ports (and the
unused ones, for no better place of doing that), then the shared ports
(the CPU and DSA ports). In between, we need to ensure that all work
items scheduled by our switchdev handlers (which only run for user
ports, hence the reason why we tear them down first) have finished.
Fixes: 161ca59d39e9 ("net: dsa: reference count the MDB entries at the cross-chip notifier level")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210914134726.2305133-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 04f08eb44b5011493d77b602fdec29ff0f5c6cd5 upstream.
syzbot reported another data-race in af_unix [1]
Lets change __skb_insert() to use WRITE_ONCE() when changing
skb head qlen.
Also, change unix_dgram_poll() to use lockless version
of unix_recvq_full()
It is verry possible we can switch all/most unix_recvq_full()
to the lockless version, this will be done in a future kernel version.
[1] HEAD commit: 8596e589b787732c8346f0482919e83cc9362db1
BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll
write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0:
__skb_insert include/linux/skbuff.h:1938 [inline]
__skb_queue_before include/linux/skbuff.h:2043 [inline]
__skb_queue_tail include/linux/skbuff.h:2076 [inline]
skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264
unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850
sock_sendmsg_nosec net/socket.c:703 [inline]
sock_sendmsg net/socket.c:723 [inline]
____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
___sys_sendmsg net/socket.c:2446 [inline]
__sys_sendmmsg+0x315/0x4b0 net/socket.c:2532
__do_sys_sendmmsg net/socket.c:2561 [inline]
__se_sys_sendmmsg net/socket.c:2558 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1:
skb_queue_len include/linux/skbuff.h:1869 [inline]
unix_recvq_full net/unix/af_unix.c:194 [inline]
unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777
sock_poll+0x23e/0x260 net/socket.c:1288
vfs_poll include/linux/poll.h:90 [inline]
ep_item_poll fs/eventpoll.c:846 [inline]
ep_send_events fs/eventpoll.c:1683 [inline]
ep_poll fs/eventpoll.c:1798 [inline]
do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226
__do_sys_epoll_wait fs/eventpoll.c:2238 [inline]
__se_sys_epoll_wait fs/eventpoll.c:2233 [inline]
__x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x0000001b -> 0x00000001
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G W 5.14.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 86b18aaa2b5b ("skbuff: fix a data race in skb_queue_len()")
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 81065b35e2486c024c7aa86caed452e1f01a59d4 upstream.
There are two cases for machine check recovery:
1) The machine check was triggered by ring3 (application) code.
This is the simpler case. The machine check handler simply queues
work to be executed on return to user. That code unmaps the page
from all users and arranges to send a SIGBUS to the task that
triggered the poison.
2) The machine check was triggered in kernel code that is covered by
an exception table entry. In this case the machine check handler
still queues a work entry to unmap the page, etc. but this will
not be called right away because the #MC handler returns to the
fix up code address in the exception table entry.
Problems occur if the kernel triggers another machine check before the
return to user processes the first queued work item.
Specifically, the work is queued using the ->mce_kill_me callback
structure in the task struct for the current thread. Attempting to queue
a second work item using this same callback results in a loop in the
linked list of work functions to call. So when the kernel does return to
user, it enters an infinite loop processing the same entry for ever.
There are some legitimate scenarios where the kernel may take a second
machine check before returning to the user.
1) Some code (e.g. futex) first tries a get_user() with page faults
disabled. If this fails, the code retries with page faults enabled
expecting that this will resolve the page fault.
2) Copy from user code retries a copy in byte-at-time mode to check
whether any additional bytes can be copied.
On the other side of the fence are some bad drivers that do not check
the return value from individual get_user() calls and may access
multiple user addresses without noticing that some/all calls have
failed.
Fix by adding a counter (current->mce_count) to keep track of repeated
machine checks before task_work() is called. First machine check saves
the address information and calls task_work_add(). Subsequent machine
checks before that task_work call back is executed check that the address
is in the same page as the first machine check (since the callback will
offline exactly one page).
Expected worst case is four machine checks before moving on (e.g. one
user access with page faults disabled, then a repeat to the same address
with page faults enabled ... repeat in copy tail bytes). Just in case
there is some code that loops forever enforce a limit of 10.
[ bp: Massage commit message, drop noinstr, fix typo, extend panic
messages. ]
Fixes: 5567d11c21a1 ("x86/mce: Send #MC singal from task work")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 13db8c50477d83ad3e3b9b0ae247e5cd833a7ae4 upstream.
After fork, the child process will get incorrect (2x) hugetlb_usage. If
a process uses 5 2MB hugetlb pages in an anonymous mapping,
HugetlbPages: 10240 kB
and then forks, the child will show,
HugetlbPages: 20480 kB
The reason for double the amount is because hugetlb_usage will be copied
from the parent and then increased when we copy page tables from parent
to child. Child will have 2x actual usage.
Fix this by adding hugetlb_count_init in mm_init.
Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com
Fixes: 5d317b2b6536 ("mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status")
Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream.
Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory"
These are all cleanups and one fix previously sent as part of [1]:
[PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory
groups.
These patches make sense even without the other series, therefore I pulled
them out to make the other series easier to digest.
[1] https://lkml.kernel.org/r/20210607195430.48228-1-david@redhat.com
This patch (of 4):
Checkpatch complained on a follow-up patch that we are using "unsigned"
here, which defaults to "unsigned int" and checkpatch is correct.
As we will search for a fitting zone using the wrong pfn, we might end
up onlining memory to one of the special kernel zones, such as ZONE_DMA,
which can end badly as the onlined memory does not satisfy properties of
these zones.
Use "unsigned long" instead, just as we do in other places when handling
PFNs. This can bite us once we have physical addresses in the range of
multiple TB.
Link: https://lkml.kernel.org/r/20210712124052.26491-2-david@redhat.com
Fixes: e5e689302633 ("mm, memory_hotplug: display allowed zones in the preferred ordering")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: virtualization@lists.linux-foundation.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jia He <justin.he@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pierre Morel <pmorel@linux.ibm.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Cc: Sergei Trofimovich <slyfox@gentoo.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 09a26e832705fdb7a9484495b71a05e0bbc65207 upstream.
Guillaume Morin reported hitting the following WARNING followed by GPF or
NULL pointer deference either in cgroups_destroy or in the kill_css path.:
percpu ref (css_release) <= 0 (-1) after switching to atomic
WARNING: CPU: 23 PID: 130 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x127/0x130
CPU: 23 PID: 130 Comm: ksoftirqd/23 Kdump: loaded Tainted: G O 5.10.60 #1
RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x127/0x130
Call Trace:
rcu_core+0x30f/0x530
rcu_core_si+0xe/0x10
__do_softirq+0x103/0x2a2
run_ksoftirqd+0x2b/0x40
smpboot_thread_fn+0x11a/0x170
kthread+0x10a/0x140
ret_from_fork+0x22/0x30
Upon further examination, it was discovered that the css structure was
associated with hugetlb reservations.
For private hugetlb mappings the vma points to a reserve map that
contains a pointer to the css. At mmap time, reservations are set up
and a reference to the css is taken. This reference is dropped in the
vma close operation; hugetlb_vm_op_close. However, if a vma is split no
additional reference to the css is taken yet hugetlb_vm_op_close will be
called twice for the split vma resulting in an underflow.
Fix by taking another reference in hugetlb_vm_op_open. Note that the
reference is only taken for the owner of the reserve map. In the more
common fork case, the pointer to the reserve map is cleared for
non-owning vmas.
Link: https://lkml.kernel.org/r/20210830215015.155224-1-mike.kravetz@oracle.com
Fixes: e9fe92ae0cd2 ("hugetlb_cgroup: add reservation accounting for private mappings")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Suggested-by: Guillaume Morin <guillaume@morinfr.org>
Tested-by: Guillaume Morin <guillaume@morinfr.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f657f8eef3ff870552c9fd2839e0061046f44618 ]
NFS implements blocking locks by blocking inside its lock method. In
the reexport case, this blocks the nfs server thread, which could lead
to deadlocks since an nfs server thread might be required to unlock the
conflicting lock. It also causes a crash, since the nfs server thread
assumes it can free the lock when its lm_notify lock callback is called.
Ideal would be to make the nfs lock method return without blocking in
this case, but for now it works just not to attempt blocking locks. The
difference is just that the original client will have to poll (as it
does in the v4.0 case) instead of getting a callback when the lock's
available.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 74fc4f828769cca1c3be89ea92cb88feaa27ef52 ]
Currently, when creating an ingress qdisc on an indirect device before
the driver registered for callbacks, the driver will not have a chance
to register its filter configuration callbacks.
To fix that, modify the code such that it keeps track of all the ingress
qdiscs that call flow_indr_dev_setup_offload(). When a driver calls
flow_indr_dev_register(), go through the list of tracked ingress qdiscs
and call the driver callback entry point so as to give it a chance to
register its callback.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b41cda03765580caf7723b8c1b672d191c71013f ]
RT mutexes belong to the LD_WAIT_SLEEP class. Make them so.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211302.031014562@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c45074d68a9b1e893d86520af71fab37693c3d7e ]
Code was checking if random_addr and hdev->rpa match without first
checking if the RPA has not been set (BDADDR_ANY), furthermore it was
clearing HCI_RPA_EXPIRED before the command completes and the RPA is
actually programmed which in case of failure would leave the expired
RPA still set.
Since advertising instance have a similar problem the clearing of
HCI_RPA_EXPIRED has been moved to hci_event.c after checking the random
address is in fact the hdev->rap and then proceed to set the expire
timeout.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3df15d6f37246d2f12f53d915c41d806289d3d46 ]
The console variable is used everywhere in some fun pointer path and
array indexes and for some reason isn't always declared as unsigned.
This plays havoc with some static analysis tools so mark the variable as
unsigned so we "know" we can not wrap the arrays backwards here.
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reported-by: Jordy Zomer <jordy@pwning.systems>
Link: https://lore.kernel.org/r/20210726134322.2274919-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d7aff291d069c4418285f3c8ee27b0ff67ce5998 ]
Oxford Semiconductor 950 serial port devices have a 128-byte FIFO and in
the enhanced (650) mode, which we select in `autoconfig_has_efr' with
the ECB bit set in the EFR register, they support the receive interrupt
trigger level selectable with FCR bits 7:6 from the set of 16, 32, 112,
120. This applies to the original OX16C950 discrete UART[1] as well as
950 cores embedded into more complex devices.
For these devices we set the default to 112, which sets an excessively
high level of 112 or 7/8 of the FIFO capacity, unlike with other port
types where we choose at most 1/2 of their respective FIFO capacities.
Additionally we don't make the trigger level configurable. Consequently
frequent input overruns happen with high bit rates where hardware flow
control cannot be used (e.g. terminal applications) even with otherwise
highly-performant systems.
Lower the default receive interrupt trigger level to 32 then, and make
it configurable. Document the trigger levels along with other port
types, including the set of 16, 32, 64, 112 for the transmit interrupt
as well[2].
References:
[1] "OX16C950 rev B High Performance UART with 128 byte FIFOs", Oxford
Semiconductor, Inc., DS-0031, Sep 05, Table 10: "Receiver Trigger
Levels", p. 22
[2] same, Table 9: "Transmit Interrupt Trigger Levels", p. 22
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2106260608480.37803@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dd98d2895de6485c884a9cb42de69fed02826fa4 ]
The ethtool compat ioctl handling is hidden away in net/socket.c,
which introduces a couple of minor oddities:
- The implementation may end up diverging, as seen in the RXNFC
extension in commit 84a1d9c48200 ("net: ethtool: extend RXNFC
API to support RSS spreading of filter matches") that does not work
in compat mode.
- Most architectures do not need the compat handling at all
because u64 and compat_u64 have the same alignment.
- On x86, the conversion is done for both x32 and i386 user space,
but it's actually wrong to do it for x32 and cannot work there.
- On 32-bit Arm, it never worked for compat oabi user space, since
that needs to do the same conversion but does not.
- It would be nice to get rid of both compat_alloc_user_space()
and copy_in_user() throughout the kernel.
None of these actually seems to be a serious problem that real
users are likely to encounter, but fixing all of them actually
leads to code that is both shorter and more readable.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fed31a4dd3adb5455df7c704de2abb639a1dc1c0 ]
This commit fixes several typos where CONFIG_TASKS_RCU_TRACE should
instead be CONFIG_TASKS_TRACE_RCU. Among other things, these typos
could cause CONFIG_TASKS_TRACE_RCU_READ_MB=y kernels to suffer from
memory-ordering bugs that could result in false-positive quiescent
states and too-short grace periods.
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 56f0729a510f92151682ff6c89f69724d5595d6e ]
drm_file->master pointers should be protected by
drm_device.master_mutex or drm_file.master_lookup_lock when being
dereferenced.
However, in drm_lease.c, there are multiple instances where
drm_file->master is accessed and dereferenced while neither lock is
held. This makes drm_lease.c vulnerable to use-after-free bugs.
We address this issue in 2 ways:
1. Add a new drm_file_get_master() function that calls drm_master_get
on drm_file->master while holding on to
drm_file.master_lookup_lock. Since drm_master_get increments the
reference count of master, this prevents master from being freed until
we unreference it with drm_master_put.
2. In each case where drm_file->master is directly accessed and
eventually dereferenced in drm_lease.c, we wrap the access in a call
to the new drm_file_get_master function, then unreference the master
pointer once we are done using it.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210712043508.11584-6-desmondcheongzx@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0b0860a3cf5eccf183760b1177a1dcdb821b0b66 ]
Currently, drm_file.master pointers should be protected by
drm_device.master_mutex when being dereferenced. This is because
drm_file.master is not invariant for the lifetime of drm_file. If
drm_file is not the creator of master, then drm_file.is_master is
false, and a call to drm_setmaster_ioctl will invoke
drm_new_set_master, which then allocates a new master for drm_file and
puts the old master.
Thus, without holding drm_device.master_mutex, the old value of
drm_file.master could be freed while it is being used by another
concurrent process.
However, it is not always possible to lock drm_device.master_mutex to
dereference drm_file.master. Through the fbdev emulation code, this
might occur in a deep nest of other locks. But drm_device.master_mutex
is also the outermost lock in the nesting hierarchy, so this leads to
potential deadlocks.
To address this, we introduce a new spin lock at the bottom of the
lock hierarchy that only serializes drm_file.master. With this change,
the value of drm_file.master changes only when both
drm_device.master_mutex and drm_file.master_lookup_lock are
held. Hence, any process holding either of those locks can ensure that
the value of drm_file.master will not change concurrently.
Since no lock depends on the new drm_file.master_lookup_lock, when
drm_file.master is dereferenced, but drm_device.master_mutex cannot be
held, we can safely protect the master pointer with
drm_file.master_lookup_lock.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210712043508.11584-5-desmondcheongzx@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4d99efb229e63928c6b03a756a2e38cd4777fbe8 ]
The VT-d spec Revision 3.3 updated the virtual command registers, virtual
command opcode B register, virtual command response register and virtual
command capability register (Section 10.4.43, 10.4.44, 10.4.45, 10.4.46).
This updates the virtual command interface implementation in the Intel
IOMMU driver accordingly.
Fixes: 24f27d32ab6b7 ("iommu/vt-d: Enlightened PASID allocation")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Sanjay Kumar <sanjay.k.kumar@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210713042649.3547403-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c2dc3e5fad13aca5d7bdf4bcb52b1a1d707c8555 ]
We really should not call rpc_wake_up_queued_task_set_status() with
xprt->snd_task as an argument unless we are certain that is actually an
rpc_task.
Fixes: 0445f92c5d53 ("SUNRPC: Fix disconnection races")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit f985911b7bc75d5c98ed24d8aaa8b94c590f7c6a upstream.
Hit kernel warning like this, it can be reproduced by verifying 256
bytes datafile by keyctl command, run script:
RAWDATA=rawdata
SIGDATA=sigdata
modprobe pkcs8_key_parser
rm -rf *.der *.pem *.pfx
rm -rf $RAWDATA
dd if=/dev/random of=$RAWDATA bs=256 count=1
openssl req -nodes -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
-subj "/C=CN/ST=GD/L=SZ/O=vihoo/OU=dev/CN=xx.com/emailAddress=yy@xx.com"
KEY_ID=`openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER | keyctl \
padd asymmetric 123 @s`
keyctl pkey_sign $KEY_ID 0 $RAWDATA enc=pkcs1 hash=sha1 > $SIGDATA
keyctl pkey_verify $KEY_ID 0 $RAWDATA $SIGDATA enc=pkcs1 hash=sha1
Then the kernel reports:
WARNING: CPU: 5 PID: 344556 at crypto/rsa-pkcs1pad.c:540
pkcs1pad_verify+0x160/0x190
...
Call Trace:
public_key_verify_signature+0x282/0x380
? software_key_query+0x12d/0x180
? keyctl_pkey_params_get+0xd6/0x130
asymmetric_key_verify_signature+0x66/0x80
keyctl_pkey_verify+0xa5/0x100
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The reason of this issue, in function 'asymmetric_key_verify_signature':
'.digest_size(u8) = params->in_len(u32)' leads overflow of an u8 value,
so use u32 instead of u8 for digest_size field. And reorder struct
public_key_signature, it saves 8 bytes on a 64-bit machine.
Cc: stable@vger.kernel.org
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 03fe78cc2942c55cc13be5ca42578750f17204a1 upstream.
We have been hitting some early ENOSPC issues in production with more
recent kernels, and I tracked it down to us simply not flushing delalloc
as aggressively as we should be. With tracing I was seeing us failing
all tickets with all of the block rsvs at or around 0, with very little
pinned space, but still around 120MiB of outstanding bytes_may_used.
Upon further investigation I saw that we were flushing around 14 pages
per shrink call for delalloc, despite having around 2GiB of delalloc
outstanding.
Consider the example of a 8 way machine, all CPUs trying to create a
file in parallel, which at the time of this commit requires 5 items to
do. Assuming a 16k leaf size, we have 10MiB of total metadata reclaim
size waiting on reservations. Now assume we have 128MiB of delalloc
outstanding. With our current math we would set items to 20, and then
set to_reclaim to 20 * 256k, or 5MiB.
Assuming that we went through this loop all 3 times, for both
FLUSH_DELALLOC and FLUSH_DELALLOC_WAIT, and then did the full loop
twice, we'd only flush 60MiB of the 128MiB delalloc space. This could
leave a fair bit of delalloc reservations still hanging around by the
time we go to ENOSPC out all the remaining tickets.
Fix this two ways. First, change the calculations to be a fraction of
the total delalloc bytes on the system. Prior to this change we were
calculating based on dirty inodes so our math made more sense, now it's
just completely unrelated to what we're actually doing.
Second add a FLUSH_DELALLOC_FULL state, that we hold off until we've
gone through the flush states at least once. This will empty the system
of all delalloc so we're sure to be truly out of space when we start
failing tickets.
I'm tagging stable 5.10 and forward, because this is where we started
using the page stuff heavily again. This affects earlier kernel
versions as well, but would be a pain to backport to them as the
flushing mechanisms aren't the same.
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 528521f72b8f615e0930ba4b07d508ec73b07837 which is
commit 39ff83f2f6cc5cc1458dfcea9697f96338210beb upstream.
Arnd reports that this needs more review before being merged into all of
the trees.
Link: https://lore.kernel.org/r/CAK8P3a0z5jE=Z3Ps5bFTCFT7CHZR1JQ8VhdntDJAfsUxSPCcEw@mail.gmail.com
Reported-by: Arnd Bergmann <arnd@kernel.org>
Cc: Lukas Hannen <lukas.hannen@opensource.tttech-industrial.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2fc2a7a62eb58650e71b4550cf6fa6cc0a75b2d2 upstream.
It currently takes a long, and while that's normally OK, the io_uring
limit is an int. Internally in io_uring it's an int, but sometimes it's
passed as a long. That can yield confusing results where a completions
seems to generate a huge result:
ou-sqp-1297-1298 [001] ...1 788.056371: io_uring_complete: ring 000000000e98e046, user_data 0x0, result 4294967171, cflags 0
which is due to -ECANCELED being stored in an unsigned, and then passed
in as a long. Using the right int type, the trace looks correct:
iou-sqp-338-339 [002] ...1 15.633098: io_uring_complete: ring 00000000e0ac60cf, user_data 0x0, result -125, cflags 0
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 39ff83f2f6cc5cc1458dfcea9697f96338210beb upstream.
timespec64_ns() prevents multiplication overflows by comparing the seconds
value of the timespec to KTIME_SEC_MAX. If the value is greater or equal it
returns KTIME_MAX.
But that check casts the signed seconds value to unsigned which makes the
comparision true for all negative values and therefore return wrongly
KTIME_MAX.
Negative second values are perfectly valid and required in some places,
e.g. ptp_clock_adjtime().
Remove the cast and add a check for the negative boundary which is required
to prevent undefined behaviour due to multiplication underflow.
Fixes: cb47755725da ("time: Prevent undefined behaviour in timespec64_to_ns()")'
Signed-off-by: Lukas Hannen <lukas.hannen@opensource.tttech-industrial.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/AM6PR01MB541637BD6F336B8FFB72AF80EEC69@AM6PR01MB5416.eurprd01.prod.exchangelabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ca49bfd90a9dde175d2929dc1544b54841e33804 ]
In HTB offload mode, qdiscs of leaf classes are grafted to netdev
queues. sch_htb expects the dev_queue field of these qdiscs to point to
the corresponding queues. However, qdisc creation may fail, and in that
case noop_qdisc is used instead. Its dev_queue doesn't point to the
right queue, so sch_htb can lose track of used netdev queues, which will
cause internal inconsistencies.
This commit fixes this bug by keeping track of the netdev queue inside
struct htb_class. All reads of cl->leaf.q->dev_queue are replaced by the
new field, the two values are synced on writes, and WARNs are added to
assert equality of the two values.
The driver API has changed: when TC_HTB_LEAF_DEL needs to move a queue,
the driver used to pass the old and new queue IDs to sch_htb. Now that
there is a new field (offload_queue) in struct htb_class that needs to
be updated on this operation, the driver will pass the old class ID to
sch_htb instead (it already knows the new class ID).
Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20210826115425.1744053-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5c11720767f70d34357d00a15ba5a0ad052c40fe ]
Some paths through svc_process() leave rqst->rq_procinfo set to
NULL, which triggers a crash if tracing happens to be enabled.
Fixes: 89ff87494c6e ("SUNRPC: Display RPC procedure names instead of proc numbers")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d8bbd97ad0b99a9394f2cd8410b884c48e218cf0 ]
If CONFIG_DEBUG_LOCK_ALLOC=y is enabled then local_lock_t has an 'owner'
member which is checked for consistency, but nothing initialized it to
zero explicitly.
The static initializer does so implicit, and the run time allocated per CPU
storage is usually zero initialized as well, but relying on that is not
really good practice.
Fixes: 91710728d172 ("locking: Introduce local_lock()")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211301.969975279@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7fcc17d0cb12938d2b3507973a6f93fc9ed2c7a1 ]
The Energy Model (EM) provides useful information about device power in
each performance state to other subsystems like: Energy Aware Scheduler
(EAS). The energy calculation in EAS does arithmetic operation based on
the EM em_cpu_energy(). Current implementation of that function uses
em_perf_state::cost as a pre-computed cost coefficient equal to:
cost = power * max_frequency / frequency.
The 'power' is expressed in milli-Watts (or in abstract scale).
There are corner cases when the EAS energy calculation for two Performance
Domains (PDs) return the same value. The EAS compares these values to
choose smaller one. It might happen that this values are equal due to
rounding error. In such scenario, we need better resolution, e.g. 1000
times better. To provide this possibility increase the resolution in the
em_perf_state::cost for 64-bit architectures. The cost of increasing
resolution on 32-bit is pretty high (64-bit division) and is not justified
since there are no new 32bit big.LITTLE EAS systems expected which would
benefit from this higher resolution.
This patch allows to avoid the rounding to milli-Watt errors, which might
occur in EAS energy estimation for each PD. The rounding error is common
for small tasks which have small utilization value.
There are two places in the code where it makes a difference:
1. In the find_energy_efficient_cpu() where we are searching for
best_delta. We might suffer there when two PDs return the same result,
like in the example below.
Scenario:
Low utilized system e.g. ~200 sum_util for PD0 and ~220 for PD1. There
are quite a few small tasks ~10-15 util. These tasks would suffer for
the rounding error. These utilization values are typical when running games
on Android. One of our partners has reported 5..10mA less battery drain
when running with increased resolution.
Some details:
We have two PDs: PD0 (big) and PD1 (little)
Let's compare w/o patch set ('old') and w/ patch set ('new')
We are comparing energy w/ task and w/o task placed in the PDs
a) 'old' w/o patch set, PD0
task_util = 13
cost = 480
sum_util_w/o_task = 215
sum_util_w_task = 228
scale_cpu = 1024
energy_w/o_task = 480 * 215 / 1024 = 100.78 => 100
energy_w_task = 480 * 228 / 1024 = 106.87 => 106
energy_diff = 106 - 100 = 6
(this is equal to 'old' PD1's energy_diff in 'c)')
b) 'new' w/ patch set, PD0
task_util = 13
cost = 480 * 1000 = 480000
sum_util_w/o_task = 215
sum_util_w_task = 228
energy_w/o_task = 480000 * 215 / 1024 = 100781
energy_w_task = 480000 * 228 / 1024 = 106875
energy_diff = 106875 - 100781 = 6094
(this is not equal to 'new' PD1's energy_diff in 'd)')
c) 'old' w/o patch set, PD1
task_util = 13
cost = 160
sum_util_w/o_task = 283
sum_util_w_task = 293
scale_cpu = 355
energy_w/o_task = 160 * 283 / 355 = 127.55 => 127
energy_w_task = 160 * 296 / 355 = 133.41 => 133
energy_diff = 133 - 127 = 6
(this is equal to 'old' PD0's energy_diff in 'a)')
d) 'new' w/ patch set, PD1
task_util = 13
cost = 160 * 1000 = 160000
sum_util_w/o_task = 283
sum_util_w_task = 293
scale_cpu = 355
energy_w/o_task = 160000 * 283 / 355 = 127549
energy_w_task = 160000 * 296 / 355 = 133408
energy_diff = 133408 - 127549 = 5859
(this is not equal to 'new' PD0's energy_diff in 'b)')
2. Difference in the 6% energy margin filter at the end of
find_energy_efficient_cpu(). With this patch the margin comparison also
has better resolution, so it's possible to have better task placement
thanks to that.
Fixes: 27871f7a8a341ef ("PM: Introduce an Energy Model management framework")
Reported-by: CCJ Yeh <CCj.Yeh@mediatek.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
querier
[ Upstream commit c73c57081b3d59aa99093fbedced32ea02620cd3 ]
Commit 08cc83cc7fd8 ("net: dsa: add support for BRIDGE_MROUTER
attribute") added an option for users to turn off multicast flooding
towards the CPU if they turn off the IGMP querier on a bridge which
already has enslaved ports (echo 0 > /sys/class/net/br0/bridge/multicast_router).
And commit a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags")
simply papered over that issue, because it moved the decision to flood
the CPU with multicast (or not) from the DSA core down to individual drivers,
instead of taking a more radical position then.
The truth is that disabling multicast flooding to the CPU is simply
something we are not prepared to do now, if at all. Some reasons:
- ICMP6 neighbor solicitation messages are unregistered multicast
packets as far as the bridge is concerned. So if we stop flooding
multicast, the outside world cannot ping the bridge device's IPv6
link-local address.
- There might be foreign interfaces bridged with our DSA switch ports
(sending a packet towards the host does not necessarily equal
termination, but maybe software forwarding). So if there is no one
interested in that multicast traffic in the local network stack, that
doesn't mean nobody is.
- PTP over L4 (IPv4, IPv6) is multicast, but is unregistered as far as
the bridge is concerned. This should reach the CPU port.
- The switch driver might not do FDB partitioning. And since we don't
even bother to do more fine-grained flood disabling (such as "disable
flooding _from_port_N_ towards the CPU port" as opposed to "disable
flooding _from_any_port_ towards the CPU port"), this breaks standalone
ports, or even multiple bridges where one has an IGMP querier and one
doesn't.
Reverting the logic makes all of the above work.
Fixes: a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags")
Fixes: 08cc83cc7fd8 ("net: dsa: add support for BRIDGE_MROUTER attribute")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 26ab7b384525ccfa678c518577f7f0d841209c8b ]
This commit does a cleanup in LRO configuration.
LRO is a parameter of an RQ, but its state is changed by modifying a TIR
related to the RQ.
The current status: LRO for tunneled packets is not supported in the
driver, inner TIRs may enable LRO on creation, but LRO status of inner
TIRs isn't changed in mlx5e_modify_tirs_lro(). This is inconsistent, but
as long as the firmware doesn't declare support for tunneled LRO, it
works, because the same RQs are shared between the inner and outer TIRs.
This commit does two fixes:
1. If the firmware has the tunneled LRO capability, LRO is blocked
altogether, because it's not possible to block it for inner TIRs only,
when the same RQs are shared between inner and outer TIRs, and the
driver won't be able to handle tunneled LRO traffic.
2. mlx5e_modify_tirs_lro() is patched to modify LRO state for all TIRs,
including inner ones, because all TIRs related to an RQ should agree on
their LRO state.
Fixes: 7b3722fa9ef6 ("net/mlx5e: Support RSS for GRE tunneled packets")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f170acda7ffaf0473d06e1e17c12cd9fd63904f5 ]
Fix s/BPF_MAP_TYPE_REUSEPORT_ARRAY/BPF_MAP_TYPE_REUSEPORT_SOCKARRAY/ typo
in bpf.h.
Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210714124317.67526-1-kuniyu@amazon.co.jp
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 866663b7b52d2da267b28e12eed89ee781b8fed1 ]
When merging one bio to request, if they are discard IO and the queue
supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE
because both block core and related drivers(nvme, virtio-blk) doesn't
handle mixed discard io merge(traditional IO merge together with
discard merge) well.
Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation,
so both blk-mq and drivers just need to handle multi-range discard.
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Fixes: 2705dfb20947 ("block: fix discard request merge")
Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ed0d0a0506025f06061325cedae1bbebd081620a ]
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8c3b5e6ec0fee18bc2ce38d1dfe913413205f908 ]
If high resolution timers are disabled the timerfd notification about a
clock was set event is not happening for all cases which use
clock_was_set_delayed() because that's a NOP for HIGHRES=n, which is wrong.
Make clock_was_set_delayed() unconditially available to fix that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.196661266@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit d0efb16294d145d157432feda83877ae9d7cdf37 upstream.
A common implementation of isatty(3) involves calling a ioctl passing
a dummy struct argument and checking whether the syscall failed --
bionic and glibc use TCGETS (passing a struct termios), and musl uses
TIOCGWINSZ (passing a struct winsize). If the FD is a socket, we will
copy sizeof(struct ifreq) bytes of data from the argument and return
-EFAULT if that fails. The result is that the isatty implementations
may return a non-POSIX-compliant value in errno in the case where part
of the dummy struct argument is inaccessible, as both struct termios
and struct winsize are smaller than struct ifreq (at least on arm64).
Although there is usually enough stack space following the argument
on the stack that this did not present a practical problem up to now,
with MTE stack instrumentation it's more likely for the copy to fail,
as the memory following the struct may have a different tag.
Fix the problem by adding an early check for whether the ioctl is a
valid socket ioctl, and return -ENOTTY if it isn't.
Fixes: 44c02a2c3dc5 ("dev_ioctl(): move copyin/copyout to callers")
Link: https://linux-review.googlesource.com/id/I869da6cf6daabc3e4b7b82ac979683ba05e27d4d
Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: <stable@vger.kernel.org> # 4.19
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d18760560593e5af921f51a8c9b64b6109d634c2 upstream.
Add a helper function fscrypt_symlink_getattr() which will be called
from the various filesystems' ->getattr() methods to read and decrypt
the target of encrypted symlinks in order to report the correct st_size.
Detailed explanation:
As required by POSIX and as documented in various man pages, st_size for
a symlink is supposed to be the length of the symlink target.
Unfortunately, st_size has always been wrong for encrypted symlinks
because st_size is populated from i_size from disk, which intentionally
contains the length of the encrypted symlink target. That's slightly
greater than the length of the decrypted symlink target (which is the
symlink target that userspace usually sees), and usually won't match the
length of the no-key encoded symlink target either.
This hadn't been fixed yet because reporting the correct st_size would
require reading the symlink target from disk and decrypting or encoding
it, which historically has been considered too heavyweight to do in
->getattr(). Also historically, the wrong st_size had only broken a
test (LTP lstat03) and there were no known complaints from real users.
(This is probably because the st_size of symlinks isn't used too often,
and when it is, typically it's for a hint for what buffer size to pass
to readlink() -- which a slightly-too-large size still works for.)
However, a couple things have changed now. First, there have recently
been complaints about the current behavior from real users:
- Breakage in rpmbuild:
https://github.com/rpm-software-management/rpm/issues/1682
https://github.com/google/fscrypt/issues/305
- Breakage in toybox cpio:
https://www.mail-archive.com/toybox@lists.landley.net/msg07193.html
- Breakage in libgit2: https://issuetracker.google.com/issues/189629152
(on Android public issue tracker, requires login)
Second, we now cache decrypted symlink targets in ->i_link. Therefore,
taking the performance hit of reading and decrypting the symlink target
in ->getattr() wouldn't be as big a deal as it used to be, since usually
it will just save having to do the same thing later.
Also note that eCryptfs ended up having to read and decrypt symlink
targets in ->getattr() as well, to fix this same issue; see
commit 3a60a1686f0d ("eCryptfs: Decrypt symlink target for stat size").
So, let's just bite the bullet, and read and decrypt the symlink target
in ->getattr() in order to report the correct st_size. Add a function
fscrypt_symlink_getattr() which the filesystems will call to do this.
(Alternatively, we could store the decrypted size of symlinks on-disk.
But there isn't a great place to do so, and encryption is meant to hide
the original size to some extent; that property would be lost.)
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210702065350.209646-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from can and bpf.
Closing three hw-dependent regressions. Any fixes of note are in the
'old code' category. Nothing blocking release from our perspective.
Current release - regressions:
- stmmac: revert "stmmac: align RX buffers"
- usb: asix: ax88772: move embedded PHY detection as early as
possible
- usb: asix: do not call phy_disconnect() for ax88178
- Revert "net: really fix the build...", from Kalle to fix QCA6390
Current release - new code bugs:
- phy: mediatek: add the missing suspend/resume callbacks
Previous releases - regressions:
- qrtr: fix another OOB Read in qrtr_endpoint_post
- stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings
Previous releases - always broken:
- inet: use siphash in exception handling
- ip_gre: add validation for csum_start
- bpf: fix ringbuf helper function compatibility
- rtnetlink: return correct error on changing device netns
- e1000e: do not try to recover the NVM checksum on Tiger Lake"
* tag 'net-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
Revert "net: really fix the build..."
net: hns3: fix get wrong pfc_en when query PFC configuration
net: hns3: fix GRO configuration error after reset
net: hns3: change the method of getting cmd index in debugfs
net: hns3: fix duplicate node in VLAN list
net: hns3: fix speed unknown issue in bond 4
net: hns3: add waiting time before cmdq memory is released
net: hns3: clear hardware resource when loading driver
net: fix NULL pointer reference in cipso_v4_doi_free
rtnetlink: Return correct error on changing device netns
net: dsa: hellcreek: Adjust schedule look ahead window
net: dsa: hellcreek: Fix incorrect setting of GCL
cxgb4: dont touch blocked freelist bitmap after free
ipv4: use siphash instead of Jenkins in fnhe_hashfun()
ipv6: use siphash in rt6_exception_hash()
can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters
net: usb: asix: ax88772: fix boolconv.cocci warnings
net/sched: ets: fix crash when flipping from 'strict' to 'quantum'
qede: Fix memset corruption
net: stmmac: fix kernel panic due to NULL pointer dereference of buf->xdp
...
|