<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/exit.c, branch next/HEAD</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=next%2FHEAD</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=next%2FHEAD'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2026-04-09T13:09:24Z</updated>
<entry>
<title>Merge branch 'fs-next' of linux-next</title>
<updated>2026-04-09T13:09:24Z</updated>
<author>
<name>Mark Brown</name>
<email>broonie@kernel.org</email>
</author>
<published>2026-04-09T13:09:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=85315ec393d8fe7bd0e9543d18cbc3f4ae61b01d'/>
<id>urn:sha1:85315ec393d8fe7bd0e9543d18cbc3f4ae61b01d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>exit: kill unnecessary thread_group_leader() checks in exit_notify() and do_notify_parent()</title>
<updated>2026-03-28T04:19:34Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2026-02-22T15:23:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=617ab884b893032765fbc0ebf656fa3015016648'/>
<id>urn:sha1:617ab884b893032765fbc0ebf656fa3015016648</id>
<content type='text'>
thread_group_empty(tsk) is only possible if tsk is a group leader, and
thread_group_empty() already does the thread_group_leader() check.

So it makes no sense to check "thread_group_leader() &amp;&amp;
thread_group_empty()"; thread_group_empty() alone is enough.

Link: https://lkml.kernel.org/r/aZsfeegKZPZZszJh@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Mateusz Guzik &lt;mjguzik@gmail.com&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc; Deepanshu Kartikey &lt;kartikey406@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>pid_namespace: avoid optimization of accesses to -&gt;child_reaper</title>
<updated>2026-03-20T13:44:25Z</updated>
<author>
<name>Pavel Tikhomirov</name>
<email>ptikhomirov@virtuozzo.com</email>
</author>
<published>2026-03-18T12:21:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d9c857aee2ebcdf5e9d81718b78b7966b8eee876'/>
<id>urn:sha1:d9c857aee2ebcdf5e9d81718b78b7966b8eee876</id>
<content type='text'>
To avoid potential problems related to cpu/compiler optimizations around
-&gt;child_reaper, let's use WRITE_ONCE (additional to task_list lock)
everywhere we write it and use READ_ONCE where we read it without
explicit lock. Note: It also pairs with existing READ_ONCE with no lock
in nsfs_fh_to_dentry().

Also let's add ASSERT_EXCLUSIVE_WRITER before write to identify to KCSAN
that we don't expect any concurrent -&gt;child_reaper modifications, and
those must be detected.

--

Suggested-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Pavel Tikhomirov &lt;ptikhomirov@virtuozzo.com&gt;
Link: https://patch.msgid.link/20260318122157.280595-2-ptikhomirov@virtuozzo.com
v3: Split from main commit. Add ASSERT_EXCLUSIVE_WRITER.
v5: Add one more READ_ONCE for access without lock in free_pid().
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>kthread: consolidate kthread exit paths to prevent use-after-free</title>
<updated>2026-02-26T09:45:49Z</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2026-02-26T09:43:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=28aaa9c39945b7925a1cc1d513c8f21ed38f5e4f'/>
<id>urn:sha1:28aaa9c39945b7925a1cc1d513c8f21ed38f5e4f</id>
<content type='text'>
Guillaume reported crashes via corrupted RCU callback function pointers
during KUnit testing. The crash was traced back to the pidfs rhashtable
conversion which replaced the 24-byte rb_node with an 8-byte rhash_head
in struct pid, shrinking it from 160 to 144 bytes.

struct kthread (without CONFIG_BLK_CGROUP) is also 144 bytes. With
CONFIG_SLAB_MERGE_DEFAULT and SLAB_HWCACHE_ALIGN both round up to
192 bytes and share the same slab cache. struct pid.rcu.func and
struct kthread.affinity_node both sit at offset 0x78.

When a kthread exits via make_task_dead() it bypasses kthread_exit() and
misses the affinity_node cleanup. free_kthread_struct() frees the memory
while the node is still linked into the global kthread_affinity_list. A
subsequent list_del() by another kthread writes through dangling list
pointers into the freed and reused memory, corrupting the pid's
rcu.func pointer.

Instead of patching free_kthread_struct() to handle the missed cleanup,
consolidate all kthread exit paths. Turn kthread_exit() into a macro
that calls do_exit() and add kthread_do_exit() which is called from
do_exit() for any task with PF_KTHREAD set. This guarantees that
kthread-specific cleanup always happens regardless of the exit path -
make_task_dead(), direct do_exit(), or kthread_exit().

Replace __to_kthread() with a new tsk_is_kthread() accessor in the
public header. Export do_exit() since module code using the
kthread_exit() macro now needs it directly.

Reported-by: Guillaume Tucker &lt;gtucker@gtucker.io&gt;
Tested-by: Guillaume Tucker &lt;gtucker@gtucker.io&gt;
Tested-by: Mark Brown &lt;broonie@kernel.org&gt;
Tested-by: David Gow &lt;davidgow@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lore.kernel.org/all/20260224-mittlerweile-besessen-2738831ae7f6@brauner
Co-developed-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Fixes: 4d13f4304fa4 ("kthread: Implement preferred affinity")
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2025-12-06T22:01:20Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-06T22:01:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=509d3f45847627f4c5cdce004c3ec79262b5239c'/>
<id>urn:sha1:509d3f45847627f4c5cdce004c3ec79262b5239c</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:

 - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko)
   fixes a build issue and does some cleanup in ib/sys_info.c

 - "Implement mul_u64_u64_div_u64_roundup()" (David Laight)
   enhances the 64-bit math code on behalf of a PWM driver and beefs up
   the test module for these library functions

 - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich)
   makes BPF symbol names, sizes, and line numbers available to the GDB
   debugger

 - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang)
   adds a sysctl which can be used to cause additional info dumping when
   the hung-task and lockup detectors fire

 - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu)
   adds a general base64 encoder/decoder to lib/ and migrates several
   users away from their private implementations

 - "rbree: inline rb_first() and rb_last()" (Eric Dumazet)
   makes TCP a little faster

 - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin)
   reworks the KEXEC Handover interfaces in preparation for Live Update
   Orchestrator (LUO), and possibly for other future clients

 - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin)
   increases the flexibility of KEXEC Handover. Also preparation for LUO

 - "Live Update Orchestrator" (Pasha Tatashin)
   is a major new feature targeted at cloud environments. Quoting the
   cover letter:

      This series introduces the Live Update Orchestrator, a kernel
      subsystem designed to facilitate live kernel updates using a
      kexec-based reboot. This capability is critical for cloud
      environments, allowing hypervisors to be updated with minimal
      downtime for running virtual machines. LUO achieves this by
      preserving the state of selected resources, such as memory,
      devices and their dependencies, across the kernel transition.

      As a key feature, this series includes support for preserving
      memfd file descriptors, which allows critical in-memory data, such
      as guest RAM or any other large memory region, to be maintained in
      RAM across the kexec reboot.

   Mike Rappaport merits a mention here, for his extensive review and
   testing work.

 - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain)
   moves the kexec and kdump sysfs entries from /sys/kernel/ to
   /sys/kernel/kexec/ and adds back-compatibility symlinks which can
   hopefully be removed one day

 - "kho: fixes for vmalloc restoration" (Mike Rapoport)
   fixes a BUG which was being hit during KHO restoration of vmalloc()
   regions

* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
  calibrate: update header inclusion
  Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
  vmcoreinfo: track and log recoverable hardware errors
  kho: fix restoring of contiguous ranges of order-0 pages
  kho: kho_restore_vmalloc: fix initialization of pages array
  MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
  init: replace simple_strtoul with kstrtoul to improve lpj_setup
  KHO: fix boot failure due to kmemleak access to non-PRESENT pages
  Documentation/ABI: new kexec and kdump sysfs interface
  Documentation/ABI: mark old kexec sysfs deprecated
  kexec: move sysfs entries to /sys/kernel/kexec
  test_kho: always print restore status
  kho: free chunks using free_page() instead of kfree()
  selftests/liveupdate: add kexec test for multiple and empty sessions
  selftests/liveupdate: add simple kexec-based selftest for LUO
  selftests/liveupdate: add userspace API selftests
  docs: add documentation for memfd preservation via LUO
  mm: memfd_luo: allow preserving memfd
  liveupdate: luo_file: add private argument to store runtime state
  mm: shmem: export some functions to internal.h
  ...
</content>
</entry>
<entry>
<title>Merge tag 'cgroup-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2025-12-03T21:04:07Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-03T21:04:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8449d3252c2603a51ffc7c36cb5bd94874378b7d'/>
<id>urn:sha1:8449d3252c2603a51ffc7c36cb5bd94874378b7d</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:

 - Defer task cgroup unlink until after the dying task's final context
   switch so that controllers see the cgroup properly populated until
   the task is truly gone

 - cpuset cleanups and simplifications.

   Enforce that domain isolated CPUs stay in root or isolated partitions
   and fail if isolated+nohz_full would leave no housekeeping CPU. Fix
   sched/deadline root domain handling during CPU hot-unplug and race
   for tasks in attaching cpusets

 - Misc fixes including memory reclaim protection documentation and
   selftest KTAP conformance

* tag 'cgroup-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
  cpuset: Treat cpusets in attaching as populated
  sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug
  cgroup/cpuset: Introduce cpuset_cpus_allowed_locked()
  docs: cgroup: No special handling of unpopulated memcgs
  docs: cgroup: Note about sibling relative reclaim protection
  docs: cgroup: Explain reclaim protection target
  selftests/cgroup: conform test to KTAP format output
  cpuset: remove need_rebuild_sched_domains
  cpuset: remove global remote_children list
  cpuset: simplify node setting on error
  cgroup: include missing header for struct irq_work
  cgroup: Fix sleeping from invalid context warning on PREEMPT_RT
  cgroup/cpuset: Globally track isolated_cpus update
  cgroup/cpuset: Ensure domain isolated CPUs stay in root or isolated partition
  cgroup/cpuset: Move up prstate_housekeeping_conflict() helper
  cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping
  cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks()
  cgroup: Defer task cgroup unlink until after the task is done switching out
  cgroup: Move dying_tasks cleanup from cgroup_task_release() to cgroup_task_free()
  cgroup: Rename cgroup lifecycle hooks to cgroup_task_*()
  ...
</content>
</entry>
<entry>
<title>Merge tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2025-12-02T16:48:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-02T16:48:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2b09f480f0a1e68111ae36a7be9aa1c93e067255'/>
<id>urn:sha1:2b09f480f0a1e68111ae36a7be9aa1c93e067255</id>
<content type='text'>
Pull rseq updates from Thomas Gleixner:
 "A large overhaul of the restartable sequences and CID management:

  The recent enablement of RSEQ in glibc resulted in regressions which
  are caused by the related overhead. It turned out that the decision to
  invoke the exit to user work was not really a decision. More or less
  each context switch caused that. There is a long list of small issues
  which sums up nicely and results in a 3-4% regression in I/O
  benchmarks.

  The other detail which caused issues due to extra work in context
  switch and task migration is the CID (memory context ID) management.
  It also requires to use a task work to consolidate the CID space,
  which is executed in the context of an arbitrary task and results in
  sporadic uncontrolled exit latencies.

  The rewrite addresses this by:

   - Removing deprecated and long unsupported functionality

   - Moving the related data into dedicated data structures which are
     optimized for fast path processing.

   - Caching values so actual decisions can be made

   - Replacing the current implementation with a optimized inlined
     variant.

   - Separating fast and slow path for architectures which use the
     generic entry code, so that only fault and error handling goes into
     the TIF_NOTIFY_RESUME handler.

   - Rewriting the CID management so that it becomes mostly invisible in
     the context switch path. That moves the work of switching modes
     into the fork/exit path, which is a reasonable tradeoff. That work
     is only required when a process creates more threads than the
     cpuset it is allowed to run on or when enough threads exit after
     that. An artificial thread pool benchmarks which triggers this did
     not degrade, it actually improved significantly.

     The main effect in migration heavy scenarios is that runqueue lock
     held time and therefore contention goes down significantly"

* tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  sched/mmcid: Switch over to the new mechanism
  sched/mmcid: Implement deferred mode change
  irqwork: Move data struct to a types header
  sched/mmcid: Provide CID ownership mode fixup functions
  sched/mmcid: Provide new scheduler CID mechanism
  sched/mmcid: Introduce per task/CPU ownership infrastructure
  sched/mmcid: Serialize sched_mm_cid_fork()/exit() with a mutex
  sched/mmcid: Provide precomputed maximal value
  sched/mmcid: Move initialization out of line
  signal: Move MMCID exit out of sighand lock
  sched/mmcid: Convert mm CID mask to a bitmap
  cpumask: Cache num_possible_cpus()
  sched/mmcid: Use cpumask_weighted_or()
  cpumask: Introduce cpumask_weighted_or()
  sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
  sched/mmcid: Move scheduler code out of global header
  sched: Fixup whitespace damage
  sched/mmcid: Cacheline align MM CID storage
  sched/mmcid: Use proper data structures
  sched/mmcid: Revert the complex CID management
  ...
</content>
</entry>
<entry>
<title>Merge tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2025-12-02T04:42:01Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-02T04:42:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6c26fbe8c9d3e932dce6afe2505b19b4b261cae9'/>
<id>urn:sha1:6c26fbe8c9d3e932dce6afe2505b19b4b261cae9</id>
<content type='text'>
Pull performance events updates from Ingo Molnar:
 "Callchain support:

   - Add support for deferred user-space stack unwinding for perf,
     enabled on x86. (Peter Zijlstra, Steven Rostedt)

   - unwind_user/x86: Enable frame pointer unwinding on x86 (Josh
     Poimboeuf)

  x86 PMU support and infrastructure:

   - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra)

   - x86/insn,uprobes,alternative: Unify insn_is_nop() (Peter Zijlstra)

  Intel PMU driver:

   - Large series to prepare for and implement architectural PEBS
     support for Intel platforms such as Clearwater Forest (CWF) and
     Panther Lake (PTL). (Dapeng Mi, Kan Liang)

   - Check dynamic constraints (Kan Liang)

   - Optimize PEBS extended config (Peter Zijlstra)

   - cstates:
      - Remove PC3 support from LunarLake (Zhang Rui)
      - Add Pantherlake support (Zhang Rui)
      - Clearwater Forest support (Zide Chen)

  AMD PMU driver:

   - x86/amd: Check event before enable to avoid GPF (George Kennedy)

  Fixes and cleanups:

   - task_work: Fix NMI race condition (Peter Zijlstra)

   - perf/x86: Fix NULL event access and potential PEBS record loss
     (Dapeng Mi)

   - Misc other fixes and cleanups (Dapeng Mi, Ingo Molnar, Peter
     Zijlstra)"

* tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use
  perf/x86/intel: Optimize PEBS extended config
  perf/x86/intel: Check PEBS dyn_constraints
  perf/x86/intel: Add a check for dynamic constraints
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Update dyn_constraint base on PEBS event precise level
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel/ds: Factor out PEBS group processing code to functions
  perf/x86/intel/ds: Factor out PEBS record processing code to functions
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel: Correct large PEBS flag check
  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  perf/x86: Fix NULL event access and potential PEBS record loss
  perf/x86: Remove redundant is_x86_event() prototype
  entry,unwind/deferred: Fix unwind_reset_info() placement
  unwind_user/x86: Fix arch=um build
  perf: Support deferred user unwind
  unwind_user/x86: Teach FP unwind about start of function
  ...
</content>
</entry>
<entry>
<title>signal: Move MMCID exit out of sighand lock</title>
<updated>2025-11-25T18:45:40Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2025-11-19T17:27:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2b1642b881088bbf73fcb1147c474a198ec46729'/>
<id>urn:sha1:2b1642b881088bbf73fcb1147c474a198ec46729</id>
<content type='text'>
There is no need anymore to keep this under sighand lock as the current
code and the upcoming replacement are not depending on the exit state of a
task anymore.

That allows to use a mutex in the exit path.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://patch.msgid.link/20251119172549.706439391@linutronix.de
</content>
</entry>
<entry>
<title>release_task: kill unnecessary rcu_read_lock() around dec_rlimit_ucounts()</title>
<updated>2025-11-12T18:00:17Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2025-10-26T14:31:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c25d24d0f4c6c0954f2b0eb1fc69d293e88192bf'/>
<id>urn:sha1:c25d24d0f4c6c0954f2b0eb1fc69d293e88192bf</id>
<content type='text'>
rcu_read_lock() was added to shut RCU-lockdep up when this code used
__task_cred()-&gt;rcu_dereference(), but after the commit 21d1c5e386bc
("Reimplement RLIMIT_NPROC on top of ucounts") it is no longer needed:
task_ucounts()-&gt;task_cred_xxx() takes rcu_read_lock() itself.

NOTE: task_ucounts() returns the pointer to another rcu-protected data,
struct ucounts.  So it should either be used when task-&gt;real_cred and thus
task-&gt;real_cred-&gt;ucounts is stable (release_task, copy_process,
copy_creds), or it should be called under rcu_read_lock().  In both cases
it is pointless to take rcu_read_lock() to read the cred-&gt;ucounts pointer.

Link: https://lkml.kernel.org/r/20251026143140.GA22463@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Mateusz Guzik &lt;mjguzik@gmail.com&gt;
Cc: "Paul E . McKenney" &lt;paulmck@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
