<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v5.18</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.18</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.18'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-05-20T18:44:00Z</updated>
<entry>
<title>perf: Fix sys_perf_event_open() race against self</title>
<updated>2022-05-20T18:44:00Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-05-20T18:38:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3ac6487e584a1eb54071dbe1212e05b884136704'/>
<id>urn:sha1:3ac6487e584a1eb54071dbe1212e05b884136704</id>
<content type='text'>
Norbert reported that it's possible to race sys_perf_event_open() such
that the looser ends up in another context from the group leader,
triggering many WARNs.

The move_group case checks for races against itself, but the
!move_group case doesn't, seemingly relying on the previous
group_leader-&gt;ctx == ctx check. However, that check is racy due to not
holding any locks at that time.

Therefore, re-check the result after acquiring locks and bailing
if they no longer match.

Additionally, clarify the not_move_group case from the
move_group-vs-move_group race.

Fixes: f63a8daa5812 ("perf: Fix event-&gt;ctx locking")
Reported-by: Norbert Slusarek &lt;nslusarek@gmx.net&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts</title>
<updated>2022-05-17T19:03:36Z</updated>
<author>
<name>Julian Orth</name>
<email>ju.orth@gmail.com</email>
</author>
<published>2022-05-17T10:32:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=69e9cd66ae1392437234a63a3a1d60b6655f92ef'/>
<id>urn:sha1:69e9cd66ae1392437234a63a3a1d60b6655f92ef</id>
<content type='text'>
Not calling the function for dummy contexts will cause the context to
not be reset. During the next syscall, this will cause an error in
__audit_syscall_entry:

	WARN_ON(context-&gt;context != AUDIT_CTX_UNUSED);
	WARN_ON(context-&gt;name_count);
	if (context-&gt;context != AUDIT_CTX_UNUSED || context-&gt;name_count) {
		audit_panic("unrecoverable error in audit_syscall_entry()");
		return;
	}

These problematic dummy contexts are created via the following call
chain:

       exit_to_user_mode_prepare
    -&gt; arch_do_signal_or_restart
    -&gt; get_signal
    -&gt; task_work_run
    -&gt; tctx_task_work
    -&gt; io_req_task_submit
    -&gt; io_issue_sqe
    -&gt; audit_uring_entry

Cc: stable@vger.kernel.org
Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
Signed-off-by: Julian Orth &lt;ju.orth@gmail.com&gt;
[PM: subject line tweaks]
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'sched-urgent-2022-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-05-15T13:40:11Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-15T13:40:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=990e798d182a63afba6cc990f5bcd42d18436b55'/>
<id>urn:sha1:990e798d182a63afba6cc990f5bcd42d18436b55</id>
<content type='text'>
Pull scheduler fix from Thomas Gleixner:
 "The recent expansion of the sched switch tracepoint inserted a new
  argument in the middle of the arguments. This reordering broke BPF
  programs which relied on the old argument list.

  While tracepoints are not considered stable ABI, it's not trivial to
  make BPF cope with such a change, but it's being worked on. For now
  restore the original argument order and move the new argument to the
  end of the argument list"

* tag 'sched-urgent-2022-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/tracing: Append prev_state to tp args instead
</content>
</entry>
<entry>
<title>Merge tag 'irq-urgent-2022-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-05-15T13:37:05Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-15T13:37:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb756280f97788525e898181adfc4feb106c79d3'/>
<id>urn:sha1:fb756280f97788525e898181adfc4feb106c79d3</id>
<content type='text'>
Pull irq fix from Thomas Gleixner:
 "A single fix for a recent (introduced in 5.16) regression in the core
  interrupt code.

  The consolidation of the interrupt handler invocation code added an
  unconditional warning when generic_handle_domain_irq() is invoked from
  outside hard interrupt context. That's overbroad as the requirement
  for invoking these handlers in hard interrupt context is only required
  for certain interrupt types. The subsequently called code already
  contains a warning which triggers conditionally for interrupt chips
  which indicate this requirement in their properties.

  Remove the overbroad one"

* tag 'irq-urgent-2022-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Remove WARN_ON_ONCE() in generic_handle_domain_irq()
</content>
</entry>
<entry>
<title>Merge branch 'for-5.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2022-05-12T17:42:56Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-12T17:42:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0ac824f379fba2c2b17b75fd5ada69cd68c66348'/>
<id>urn:sha1:0ac824f379fba2c2b17b75fd5ada69cd68c66348</id>
<content type='text'>
Pull cgroup fix from Tejun Heo:
 "Waiman's fix for a cgroup2 cpuset bug where it could miss nodes which
  were hot-added"

* 'for-5.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Remove cpus_allowed/mems_allowed setup in cpuset_init_smp()
</content>
</entry>
<entry>
<title>sched/tracing: Append prev_state to tp args instead</title>
<updated>2022-05-11T22:37:11Z</updated>
<author>
<name>Delyan Kratunov</name>
<email>delyank@fb.com</email>
</author>
<published>2022-05-11T18:28:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9c2136be0878c88c53dea26943ce40bb03ad8d8d'/>
<id>urn:sha1:9c2136be0878c88c53dea26943ce40bb03ad8d8d</id>
<content type='text'>
Commit fa2c3254d7cf (sched/tracing: Don't re-read p-&gt;state when emitting
sched_switch event, 2022-01-20) added a new prev_state argument to the
sched_switch tracepoint, before the prev task_struct pointer.

This reordering of arguments broke BPF programs that use the raw
tracepoint (e.g. tp_btf programs). The type of the second argument has
changed and existing programs that assume a task_struct* argument
(e.g. for bpf_task_storage access) will now fail to verify.

If we instead append the new argument to the end, all existing programs
would continue to work and can conditionally extract the prev_state
argument on supported kernel versions.

Fixes: fa2c3254d7cf (sched/tracing: Don't re-read p-&gt;state when emitting sched_switch event, 2022-01-20)
Signed-off-by: Delyan Kratunov &lt;delyank@fb.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Link: https://lkml.kernel.org/r/c8a6930dfdd58a4a5755fc01732675472979732b.camel@fb.com
</content>
</entry>
<entry>
<title>genirq: Remove WARN_ON_ONCE() in generic_handle_domain_irq()</title>
<updated>2022-05-11T00:22:52Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2022-05-10T07:56:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=792ea6a074ae7ea5ab6f1b8b31f76bb0297de66c'/>
<id>urn:sha1:792ea6a074ae7ea5ab6f1b8b31f76bb0297de66c</id>
<content type='text'>
Since commit 0953fb263714 ("irq: remove handle_domain_{irq,nmi}()"),
generic_handle_domain_irq() warns if called outside hardirq context, even
though the function calls down to handle_irq_desc(), which warns about the
same, but conditionally on handle_enforce_irqctx().

The newly added warning is a false positive if the interrupt originates
from any other irqchip than x86 APIC or ARM GIC/GICv3.  Those are the only
ones for which handle_enforce_irqctx() returns true.  Per commit
c16816acd086 ("genirq: Add protection against unsafe usage of
generic_handle_irq()"):

 "In general calling generic_handle_irq() with interrupts disabled from non
  interrupt context is harmless. For some interrupt controllers like the
  x86 trainwrecks this is outright dangerous as it might corrupt state if
  an interrupt affinity change is pending."

Examples for interrupt chips where the warning is a false positive are
USB-attached GPIO controllers such as drivers/gpio/gpio-dln2.c:

  USB gadgets are incapable of directly signaling an interrupt because they
  cannot initiate a bus transaction by themselves.  All communication on
  the bus is initiated by the host controller, which polls a gadget's
  Interrupt Endpoint in regular intervals.  If an interrupt is pending,
  that information is passed up the stack in softirq context, from which a
  hardirq is synthesized via generic_handle_domain_irq().

Remove the warning to eliminate such false positives.

Fixes: 0953fb263714 ("irq: remove handle_domain_{irq,nmi}()")
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Marc Zyngier &lt;maz@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
CC: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Cc: Bartosz Golaszewski &lt;brgl@bgdev.pl&gt;
Cc: Octavian Purdila &lt;octavian.purdila@nxp.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220505113207.487861b2@kernel.org
Link: https://lore.kernel.org/r/20220506203242.GA1855@wunner.de
Link: https://lore.kernel.org/r/c3caf60bfa78e5fdbdf483096b7174da65d1813a.1652168866.git.lukas@wunner.de
</content>
</entry>
<entry>
<title>Merge tag 'timers-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-05-08T18:18:11Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-08T18:18:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ea82593bad9a77f6f14c9701c13ff7368b22f027'/>
<id>urn:sha1:ea82593bad9a77f6f14c9701c13ff7368b22f027</id>
<content type='text'>
Pull timer fix from Thomas Gleixner:
 "A fix and an email address update:

   - Mark the NMI safe time accessors notrace to prevent tracer
     recursion when they are selected as trace clocks.

   - John Stultz has a new email address"

* tag 'timers-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Mark NMI safe time accessors as notrace
  MAINTAINERS: Update email address for John Stultz
</content>
</entry>
<entry>
<title>Merge tag 'irq-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-05-08T18:10:17Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-08T18:10:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9692df0581eae29abcc925ca3c12babc6c194741'/>
<id>urn:sha1:9692df0581eae29abcc925ca3c12babc6c194741</id>
<content type='text'>
Pull irq fix from Thomas Gleixner:
 "A fix for the threaded interrupt core.

  A quick sequence of request/free_irq() can result in a hang because
  the interrupt thread did not reach the thread function and got stopped
  in the kthread core already. That leaves a state active counter
  arround which makes a invocation of synchronized_irq() on that
  interrupt hang forever.

  Ensure that the thread reached the thread function in request_irq() to
  prevent that"

* tag 'irq-urgent-2022-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Synchronize interrupt thread startup
</content>
</entry>
<entry>
<title>cgroup/cpuset: Remove cpus_allowed/mems_allowed setup in cpuset_init_smp()</title>
<updated>2022-05-05T18:57:00Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2022-04-27T14:54:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2685027fca387b602ae565bff17895188b803988'/>
<id>urn:sha1:2685027fca387b602ae565bff17895188b803988</id>
<content type='text'>
There are 3 places where the cpu and node masks of the top cpuset can
be initialized in the order they are executed:
 1) start_kernel -&gt; cpuset_init()
 2) start_kernel -&gt; cgroup_init() -&gt; cpuset_bind()
 3) kernel_init_freeable() -&gt; do_basic_setup() -&gt; cpuset_init_smp()

The first cpuset_init() call just sets all the bits in the masks.
The second cpuset_bind() call sets cpus_allowed and mems_allowed to the
default v2 values. The third cpuset_init_smp() call sets them back to
v1 values.

For systems with cgroup v2 setup, cpuset_bind() is called once.  As a
result, cpu and memory node hot add may fail to update the cpu and node
masks of the top cpuset to include the newly added cpu or node in a
cgroup v2 environment.

For systems with cgroup v1 setup, cpuset_bind() is called again by
rebind_subsystem() when the v1 cpuset filesystem is mounted as shown
in the dmesg log below with an instrumented kernel.

  [    2.609781] cpuset_bind() called - v2 = 1
  [    3.079473] cpuset_init_smp() called
  [    7.103710] cpuset_bind() called - v2 = 0

smp_init() is called after the first two init functions.  So we don't
have a complete list of active cpus and memory nodes until later in
cpuset_init_smp() which is the right time to set up effective_cpus
and effective_mems.

To fix this cgroup v2 mask setup problem, the potentially incorrect
cpus_allowed &amp; mems_allowed setting in cpuset_init_smp() are removed.
For cgroup v2 systems, the initial cpuset_bind() call will set the masks
correctly.  For cgroup v1 systems, the second call to cpuset_bind()
will do the right setup.

cc: stable@vger.kernel.org
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Tested-by: Feng Tang &lt;feng.tang@intel.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
