<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/kthread.c, branch v4.20</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.20</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.20'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-08-13T18:25:07Z</updated>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2018-08-13T18:25:07Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-08-13T18:25:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f7951c33f0fed14ee26651a70a46899a59a31e18'/>
<id>urn:sha1:f7951c33f0fed14ee26651a70a46899a59a31e18</id>
<content type='text'>
Pull scheduler updates from Thomas Gleixner:

 - Cleanup and improvement of NUMA balancing

 - Refactoring and improvements to the PELT (Per Entity Load Tracking)
   code

 - Watchdog simplification and related cleanups

 - The usual pile of small incremental fixes and improvements

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  watchdog: Reduce message verbosity
  stop_machine: Reflow cpu_stop_queue_two_works()
  sched/numa: Move task_numa_placement() closer to numa_migrate_preferred()
  sched/numa: Use group_weights to identify if migration degrades locality
  sched/numa: Update the scan period without holding the numa_group lock
  sched/numa: Remove numa_has_capacity()
  sched/numa: Modify migrate_swap() to accept additional parameters
  sched/numa: Remove unused task_capacity from 'struct numa_stats'
  sched/numa: Skip nodes that are at 'hoplimit'
  sched/debug: Reverse the order of printing faults
  sched/numa: Use task faults only if numa_group is not yet set up
  sched/numa: Set preferred_node based on best_cpu
  sched/numa: Simplify load_too_imbalanced()
  sched/numa: Evaluate move once per node
  sched/numa: Remove redundant field
  sched/debug: Show the sum wait time of a task group
  sched/fair: Remove #ifdefs from scale_rt_capacity()
  sched/core: Remove get_cpu() from sched_fork()
  sched/cpufreq: Clarify sugov_get_util()
  sched/sysctl: Remove unused sched_time_avg_ms sysctl
  ...
</content>
</entry>
<entry>
<title>kthread, tracing: Don't expose half-written comm when creating kthreads</title>
<updated>2018-07-26T13:59:33Z</updated>
<author>
<name>Snild Dolkow</name>
<email>snild@sony.com</email>
</author>
<published>2018-07-26T07:15:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3e536e222f2930534c252c1cc7ae799c725c5ff9'/>
<id>urn:sha1:3e536e222f2930534c252c1cc7ae799c725c5ff9</id>
<content type='text'>
There is a window for racing when printing directly to task-&gt;comm,
allowing other threads to see a non-terminated string. The vsnprintf
function fills the buffer, counts the truncated chars, then finally
writes the \0 at the end.

	creator                     other
	vsnprintf:
	  fill (not terminated)
	  count the rest            trace_sched_waking(p):
	  ...                         memcpy(comm, p-&gt;comm, TASK_COMM_LEN)
	  write \0

The consequences depend on how 'other' uses the string. In our case,
it was copied into the tracing system's saved cmdlines, a buffer of
adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):

	crash-arm64&gt; x/1024s savedcmd-&gt;saved_cmdlines | grep 'evenk'
	0xffffffd5b3818640:     "irq/497-pwr_evenkworker/u16:12"

...and a strcpy out of there would cause stack corruption:

	[224761.522292] Kernel panic - not syncing: stack-protector:
	    Kernel stack is corrupted in: ffffff9bf9783c78

	crash-arm64&gt; kbt | grep 'comm\|trace_print_context'
	#6  0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
	      comm (char [16]) =  "irq/497-pwr_even"

	crash-arm64&gt; rd 0xffffffd4d0e17d14 8
	ffffffd4d0e17d14:  2f71726900000000 5f7277702d373934   ....irq/497-pwr_
	ffffffd4d0e17d24:  726f776b6e657665 3a3631752f72656b   evenkworker/u16:
	ffffffd4d0e17d34:  f9780248ff003231 cede60e0ffffff9b   12..H.x......`..
	ffffffd4d0e17d44:  cede60c8ffffffd4 00000fffffffffd4   .....`..........

The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
likely needed because of this same bug.

Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
This way, there won't be a window where comm is not terminated.

Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com

Cc: stable@vger.kernel.org
Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure")
Reviewed-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Snild Dolkow &lt;snild@sony.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>kthread: Simplify kthread_park() completion</title>
<updated>2018-07-03T07:20:44Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-06-07T08:55:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f83ee19be4272564ad592ef90145db7295229490'/>
<id>urn:sha1:f83ee19be4272564ad592ef90145db7295229490</id>
<content type='text'>
Oleg explains the reason we could hit park+park is that
smpboot_update_cpumask_percpu_thread()'s

  for_each_cpu_and(cpu, &amp;tmp, cpu_online_mask)
	smpboot_park_kthread();

turns into:

  for ((cpu) = 0; (cpu) &lt; 1; (cpu)++, (void)mask, (void)and)
	smpboot_park_kthread();

on UP, ignoring the mask. But since we just completely removed that
function, this is no longer relevant.

So revert commit:

  b1f5b378e126 ("kthread: Allow kthread_park() on a parked kthread")

Suggested-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kthread, sched/core: Fix kthread_parkme() (again...)</title>
<updated>2018-07-03T07:17:30Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-06-07T09:45:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1cef1150ef40ec52f507436a14230cbc2623299c'/>
<id>urn:sha1:1cef1150ef40ec52f507436a14230cbc2623299c</id>
<content type='text'>
Gaurav reports that commit:

  85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")

isn't working for him. Because of the following race:

&gt; controller Thread                               CPUHP Thread
&gt; takedown_cpu
&gt; kthread_park
&gt; kthread_parkme
&gt; Set KTHREAD_SHOULD_PARK
&gt;                                                 smpboot_thread_fn
&gt;                                                 set Task interruptible
&gt;
&gt;
&gt; wake_up_process
&gt;  if (!(p-&gt;state &amp; state))
&gt;                 goto out;
&gt;
&gt;                                                 Kthread_parkme
&gt;                                                 SET TASK_PARKED
&gt;                                                 schedule
&gt;                                                 raw_spin_lock(&amp;rq-&gt;lock)
&gt; ttwu_remote
&gt; waiting for __task_rq_lock
&gt;                                                 context_switch
&gt;
&gt;                                                 finish_lock_switch
&gt;
&gt;
&gt;
&gt;                                                 Case TASK_PARKED
&gt;                                                 kthread_park_complete
&gt;
&gt;
&gt; SET Running

Furthermore, Oleg noticed that the whole scheduler TASK_PARKED
handling is buggered because the TASK_DEAD thing is done with
preemption disabled, the current code can still complete early on
preemption :/

So basically revert that earlier fix and go with a variant of the
alternative mentioned in the commit. Promote TASK_PARKED to special
state to avoid the store-store issue on task-&gt;state leading to the
WARN in kthread_unpark() -&gt; __kthread_bind().

But in addition, add wait_task_inactive() to kthread_park() to ensure
the task really is PARKED when we return from kthread_park(). This
avoids the whole kthread still gets migrated nonsense -- although it
would be really good to get this done differently.

Reported-by: Gaurav Kohli &lt;gkohli@codeaurora.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kthread: Allow kthread_park() on a parked kthread</title>
<updated>2018-05-25T06:03:51Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-05-04T09:11:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b1f5b378e126133521df668379249fb8265121f1'/>
<id>urn:sha1:b1f5b378e126133521df668379249fb8265121f1</id>
<content type='text'>
The following commit:

  85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")

added a WARN() in the case where we call kthread_park() on an already
parked thread, because the old code wasn't doing the right thing there
and it wasn't at all clear that would happen.

It turns out, this does in fact happen, so we have to deal with it.

Instead of potentially returning early, also wait for the completion.
This does however mean we have to use complete_all() and re-initialize
the completion on re-use.

Reported-by: LKP &lt;lkp@01.org&gt;
Tested-by: Meelis Roos &lt;mroos@linux.ee&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Cc: wfg@linux.intel.com
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")
Link: http://lkml.kernel.org/r/20180504091142.GI12235@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kthread, sched/wait: Fix kthread_parkme() completion issue</title>
<updated>2018-05-03T05:38:05Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-05-01T16:14:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=85f1abe0019fcb3ea10df7029056cf42702283a8'/>
<id>urn:sha1:85f1abe0019fcb3ea10df7029056cf42702283a8</id>
<content type='text'>
Even with the wait-loop fixed, there is a further issue with
kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
smpboot_park_threads() can return before all those threads are in fact
blocked, due to the placement of the complete() in __kthread_parkme().

When that happens, sched_cpu_dying() -&gt; migrate_tasks() can end up
migrating such a still runnable task onto another CPU.

Normally the task will have hit schedule() and gone to sleep by the
time we do kthread_unpark(), which will then do __kthread_bind() to
re-bind the task to the correct CPU.

However, when we loose the initial TASK_PARKED store to the concurrent
wakeup issue described previously, do the complete(), get migrated, it
is possible to either:

 - observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
   the park and set TASK_RUNNING, or

 - __kthread_bind()'s wait_task_inactive() to observe the competing
   TASK_RUNNING store.

Either way the WARN() in __kthread_bind() will trigger and fail to
correctly set the CPU affinity.

Fix this by only issuing the complete() when the kthread has scheduled
out. This does away with all the icky 'still running' nonsense.

The alternative is to promote TASK_PARKED to a special state, this
guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
and we'll end up doing the right thing, but this preserves the whole
icky business of potentially migating the still runnable thing.

Reported-by: Gaurav Kohli &lt;gkohli@codeaurora.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kthread, sched/wait: Fix kthread_parkme() wait-loop</title>
<updated>2018-05-03T05:38:04Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-04-30T12:50:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=741a76b350897604c48fb12beff1c9b77724dc96'/>
<id>urn:sha1:741a76b350897604c48fb12beff1c9b77724dc96</id>
<content type='text'>
Gaurav reported a problem with __kthread_parkme() where a concurrent
try_to_wake_up() could result in competing stores to -&gt;state which,
when the TASK_PARKED store got lost bad things would happen.

The comment near set_current_state() actually mentions this competing
store, but only mentions the case against TASK_RUNNING. This same
store, with different timing, can happen against a subsequent !RUNNING
store.

This normally is not a problem, because as per that same comment, the
!RUNNING state store is inside a condition based wait-loop:

  for (;;) {
    set_current_state(TASK_UNINTERRUPTIBLE);
    if (!need_sleep)
      break;
    schedule();
  }
  __set_current_state(TASK_RUNNING);

If we loose the (first) TASK_UNINTERRUPTIBLE store to a previous
(concurrent) wakeup, the schedule() will NO-OP and we'll go around the
loop once more.

The problem here is that the TASK_PARKED store is not inside the
KTHREAD_SHOULD_PARK condition wait-loop.

There is a genuine issue with sleeps that do not have a condition;
this is addressed in a subsequent patch.

Reported-by: Gaurav Kohli &lt;gkohli@codeaurora.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts</title>
<updated>2017-11-22T00:35:54Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2017-10-23T07:40:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=841b86f3289dbe858daeceec36423d4ea286fac2'/>
<id>urn:sha1:841b86f3289dbe858daeceec36423d4ea286fac2</id>
<content type='text'>
With all callbacks converted, and the timer callback prototype
switched over, the TIMER_FUNC_TYPE cast is no longer needed,
so remove it. Conversion was done with the following scripts:

    perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
        $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)

    perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
        $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)

The now unused macros are also dropped from include/linux/timer.h.

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block</title>
<updated>2017-11-14T23:32:19Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-11-14T23:32:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e2c5923c349c1738fe8fda980874d93f6fb2e5b6'/>
<id>urn:sha1:e2c5923c349c1738fe8fda980874d93f6fb2e5b6</id>
<content type='text'>
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
</content>
</entry>
<entry>
<title>kthread: zero the kthread data structure</title>
<updated>2017-11-11T02:53:25Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-11-07T19:09:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e10237cc76ef9a4066a84aa2cc710bfd708cc341'/>
<id>urn:sha1:e10237cc76ef9a4066a84aa2cc710bfd708cc341</id>
<content type='text'>
kthread() could bail out early before we initialize blkcg_css (if the
kthread is killed very early. Please see xchg() statement in kthread()),
which confuses free_kthread_struct. Instead of moving the blkcg_css
initialization early, we simply zero the whole 'self' data structure,
which doesn't sound much overhead.

Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Fixes: 05e3db95ebfc ("kthread: add a mechanism to store cgroup info")
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
