<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/uapi/linux/sched.h, branch v5.5.14</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.14</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.14'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-11-26T02:36:49Z</updated>
<entry>
<title>Merge tag 'threads-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux</title>
<updated>2019-11-26T02:36:49Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-11-26T02:36:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0acefef58451a995981e6d641220fecc53bd85a4'/>
<id>urn:sha1:0acefef58451a995981e6d641220fecc53bd85a4</id>
<content type='text'>
Pull thread management updates from Christian Brauner:

 - A pidfd's fdinfo file currently contains the field "Pid:\t&lt;pid&gt;"
   where &lt;pid&gt; is the pid of the process in the pid namespace of the
   procfs instance the fdinfo file for the pidfd was opened in.

   The fdinfo file has now gained a new "NSpid:\t&lt;ns-pid1&gt;[\t&lt;ns-pid2&gt;[...]]"
   field which lists the pids of the process in all child pid namespaces
   provided the pid namespace of the procfs instance it is looked up
   under has an ancestoral relationship with the pid namespace of the
   process. If it does not 0 will be shown and no further pid namespaces
   will be listed. Tests included. (Christian Kellner)

 - If the process the pidfd references has already exited, print -1 for
   the Pid and NSpid fields in the pidfd's fdinfo file. Tests included.
   (me)

 - Add CLONE_CLEAR_SIGHAND. This lets callers clear all signal handler
   that are not SIG_DFL or SIG_IGN at process creation time. This
   originated as a feature request from glibc to improve performance and
   elimate races in their posix_spawn() implementation. Tests included.
   (me)

 - Add support for choosing a specific pid for a process with clone3().
   This is the feature which was part of the thread update for v5.4 but
   after a discussion at LPC in Lisbon we decided to delay it for one
   more cycle in order to make the interface more generic. This has now
   done. It is now possible to choose a specific pid in a whole pid
   namespaces (sub)hierarchy instead of just one pid namespace. In order
   to choose a specific pid the caller must have CAP_SYS_ADMIN in all
   owning user namespaces of the target pid namespaces. Tests included.
   (Adrian Reber)

 - Test improvements and extensions. (Andrei Vagin, me)

* tag 'threads-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests/clone3: skip if clone3() is ENOSYS
  selftests/clone3: check that all pids are released on error paths
  selftests/clone3: report a correct number of fails
  selftests/clone3: flush stdout and stderr before clone3() and _exit()
  selftests: add tests for clone3() with *set_tid
  fork: extend clone3() to support setting a PID
  selftests: add tests for clone3()
  tests: test CLONE_CLEAR_SIGHAND
  clone3: add CLONE_CLEAR_SIGHAND
  pid: use pid_has_task() in pidfd_open()
  exit: use pid_has_task() in do_wait()
  pid: use pid_has_task() in __change_pid()
  test: verify fdinfo for pidfd of reaped process
  pidfd: check pid has attached task in fdinfo
  pidfd: add tests for NSpid info in fdinfo
  pidfd: add NSpid entries to fdinfo
</content>
</entry>
<entry>
<title>fork: extend clone3() to support setting a PID</title>
<updated>2019-11-15T22:49:22Z</updated>
<author>
<name>Adrian Reber</name>
<email>areber@redhat.com</email>
</author>
<published>2019-11-15T12:36:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=49cb2fc42ce4b7a656ee605e30c302efaa39c1a7'/>
<id>urn:sha1:49cb2fc42ce4b7a656ee605e30c302efaa39c1a7</id>
<content type='text'>
The main motivation to add set_tid to clone3() is CRIU.

To restore a process with the same PID/TID CRIU currently uses
/proc/sys/kernel/ns_last_pid. It writes the desired (PID - 1) to
ns_last_pid and then (quickly) does a clone(). This works most of the
time, but it is racy. It is also slow as it requires multiple syscalls.

Extending clone3() to support *set_tid makes it possible restore a
process using CRIU without accessing /proc/sys/kernel/ns_last_pid and
race free (as long as the desired PID/TID is available).

This clone3() extension places the same restrictions (CAP_SYS_ADMIN)
on clone3() with *set_tid as they are currently in place for ns_last_pid.

The original version of this change was using a single value for
set_tid. At the 2019 LPC, after presenting set_tid, it was, however,
decided to change set_tid to an array to enable setting the PID of a
process in multiple PID namespaces at the same time. If a process is
created in a PID namespace it is possible to influence the PID inside
and outside of the PID namespace. Details also in the corresponding
selftest.

To create a process with the following PIDs:

      PID NS level         Requested PID
        0 (host)              31496
        1                        42
        2                         1

For that example the two newly introduced parameters to struct
clone_args (set_tid and set_tid_size) would need to be:

  set_tid[0] = 1;
  set_tid[1] = 42;
  set_tid[2] = 31496;
  set_tid_size = 3;

If only the PIDs of the two innermost nested PID namespaces should be
defined it would look like this:

  set_tid[0] = 1;
  set_tid[1] = 42;
  set_tid_size = 2;

The PID of the newly created process would then be the next available
free PID in the PID namespace level 0 (host) and 42 in the PID namespace
at level 1 and the PID of the process in the innermost PID namespace
would be 1.

The set_tid array is used to specify the PID of a process starting
from the innermost nested PID namespaces up to set_tid_size PID namespaces.

set_tid_size cannot be larger then the current PID namespace level.

Signed-off-by: Adrian Reber &lt;areber@redhat.com&gt;
Reviewed-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Dmitry Safonov &lt;0x7f454c46@gmail.com&gt;
Acked-by: Andrei Vagin &lt;avagin@gmail.com&gt;
Link: https://lore.kernel.org/r/20191115123621.142252-1-areber@redhat.com
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>clone3: validate stack arguments</title>
<updated>2019-11-05T14:50:14Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2019-10-31T11:36:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fa729c4df558936b4a1a7b3e2234011f44ede28b'/>
<id>urn:sha1:fa729c4df558936b4a1a7b3e2234011f44ede28b</id>
<content type='text'>
Validate the stack arguments and setup the stack depening on whether or not
it is growing down or up.

Legacy clone() required userspace to know in which direction the stack is
growing and pass down the stack pointer appropriately. To make things more
confusing microblaze uses a variant of the clone() syscall selected by
CONFIG_CLONE_BACKWARDS3 that takes an additional stack_size argument.
IA64 has a separate clone2() syscall which also takes an additional
stack_size argument. Finally, parisc has a stack that is growing upwards.
Userspace therefore has a lot nasty code like the following:

 #define __STACK_SIZE (8 * 1024 * 1024)
 pid_t sys_clone(int (*fn)(void *), void *arg, int flags, int *pidfd)
 {
         pid_t ret;
         void *stack;

         stack = malloc(__STACK_SIZE);
         if (!stack)
                 return -ENOMEM;

 #ifdef __ia64__
         ret = __clone2(fn, stack, __STACK_SIZE, flags | SIGCHLD, arg, pidfd);
 #elif defined(__parisc__) /* stack grows up */
         ret = clone(fn, stack, flags | SIGCHLD, arg, pidfd);
 #else
         ret = clone(fn, stack + __STACK_SIZE, flags | SIGCHLD, arg, pidfd);
 #endif
         return ret;
 }

or even crazier variants such as [3].

With clone3() we have the ability to validate the stack. We can check that
when stack_size is passed, the stack pointer is valid and the other way
around. We can also check that the memory area userspace gave us is fine to
use via access_ok(). Furthermore, we probably should not require
userspace to know in which direction the stack is growing. It is easy
for us to do this in the kernel and I couldn't find the original
reasoning behind exposing this detail to userspace.

/* Intentional user visible API change */
clone3() was released with 5.3. Currently, it is not documented and very
unclear to userspace how the stack and stack_size argument have to be
passed. After talking to glibc folks we concluded that trying to change
clone3() to setup the stack instead of requiring userspace to do this is
the right course of action.
Note, that this is an explicit change in user visible behavior we introduce
with this patch. If it breaks someone's use-case we will revert! (And then
e.g. place the new behavior under an appropriate flag.)
Breaking someone's use-case is very unlikely though. First, neither glibc
nor musl currently expose a wrapper for clone3(). Second, there is no real
motivation for anyone to use clone3() directly since it does not provide
features that legacy clone doesn't. New features for clone3() will first
happen in v5.5 which is why v5.4 is still a good time to try and make that
change now and backport it to v5.3. Searches on [4] did not reveal any
packages calling clone3().

[1]: https://lore.kernel.org/r/CAG48ez3q=BeNcuVTKBN79kJui4vC6nw0Bfq6xc-i0neheT17TA@mail.gmail.com
[2]: https://lore.kernel.org/r/20191028172143.4vnnjpdljfnexaq5@wittgenstein
[3]: https://github.com/systemd/systemd/blob/5238e9575906297608ff802a27e2ff9effa3b338/src/basic/raw-clone.h#L31
[4]: https://codesearch.debian.net
Fixes: 7f192e3cd316 ("fork: add clone3")
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Florian Weimer &lt;fweimer@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: &lt;stable@vger.kernel.org&gt; # 5.3
Cc: GNU C Library &lt;libc-alpha@sourceware.org&gt;
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
Link: https://lore.kernel.org/r/20191031113608.20713-1-christian.brauner@ubuntu.com
</content>
</entry>
<entry>
<title>clone3: add CLONE_CLEAR_SIGHAND</title>
<updated>2019-10-21T19:46:47Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2019-10-14T10:45:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b612e5df4587c934bd056bf05f4a1deca4de4f75'/>
<id>urn:sha1:b612e5df4587c934bd056bf05f4a1deca4de4f75</id>
<content type='text'>
Reset all signal handlers of the child not set to SIG_IGN to SIG_DFL.
Mutually exclusive with CLONE_SIGHAND to not disturb other thread's
signal handler.

In the spirit of closer cooperation between glibc developers and kernel
developers (cf. [2]) this patchset came out of a discussion on the glibc
mailing list for improving posix_spawn() (cf. [1], [3], [4]). Kernel
support for this feature has been explicitly requested by glibc and I
see no reason not to help them with this.

The child helper process on Linux posix_spawn must ensure that no signal
handlers are enabled, so the signal disposition must be either SIG_DFL
or SIG_IGN. However, it requires a sigprocmask to obtain the current
signal mask and at least _NSIG sigaction calls to reset the signal
handlers for each posix_spawn call or complex state tracking that might
lead to data corruption in glibc. Adding this flags lets glibc avoid
these problems.

[1]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00149.html
[3]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00158.html
[4]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00160.html
[2]: https://lwn.net/Articles/799331/
     '[...] by asking for better cooperation with the C-library projects
     in general. They should be copied on patches containing ABI
     changes, for example. I noted that there are often times where
     C-library developers wish the kernel community had done things
     differently; how could those be avoided in the future? Members of
     the audience suggested that more glibc developers should perhaps
     join the linux-api list. The other suggestion was to "copy Florian
     on everything".'
Cc: Florian Weimer &lt;fweimer@redhat.com&gt;
Cc: libc-alpha@sourceware.org
Cc: linux-api@vger.kernel.org
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Link: https://lore.kernel.org/r/20191014104538.3096-1-christian.brauner@ubuntu.com
</content>
</entry>
<entry>
<title>Merge tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux</title>
<updated>2019-10-04T17:36:31Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-10-04T17:36:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e524d16e7e324039f2a9f82e302f0a39ac7d5812'/>
<id>urn:sha1:e524d16e7e324039f2a9f82e302f0a39ac7d5812</id>
<content type='text'>
Pull copy_struct_from_user() helper from Christian Brauner:
 "This contains the copy_struct_from_user() helper which got split out
  from the openat2() patchset. It is a generic interface designed to
  copy a struct from userspace.

  The helper will be especially useful for structs versioned by size of
  which we have quite a few. This allows for backwards compatibility,
  i.e. an extended struct can be passed to an older kernel, or a legacy
  struct can be passed to a newer kernel. For the first case (extended
  struct, older kernel) the new fields in an extended struct can be set
  to zero and the struct safely passed to an older kernel.

  The most obvious benefit is that this helper lets us get rid of
  duplicate code present in at least sched_setattr(), perf_event_open(),
  and clone3(). More importantly it will also help to ensure that users
  implementing versioning-by-size end up with the same core semantics.

  This point is especially crucial since we have at least one case where
  versioning-by-size is used but with slighly different semantics:
  sched_setattr(), perf_event_open(), and clone3() all do do similar
  checks to copy_struct_from_user() while rt_sigprocmask(2) always
  rejects differently-sized struct arguments.

  With this pull request we also switch over sched_setattr(),
  perf_event_open(), and clone3() to use the new helper"

* tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  usercopy: Add parentheses around assignment in test_copy_struct_from_user
  perf_event_open: switch to copy_struct_from_user()
  sched_setattr: switch to copy_struct_from_user()
  clone3: switch to copy_struct_from_user()
  lib: introduce copy_struct_from_user() helper
</content>
</entry>
<entry>
<title>sched: add kernel-doc for struct clone_args</title>
<updated>2019-10-03T19:19:29Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2019-09-27T15:29:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=78f6face5af344f12f4bd48b32faa6f499a06f36'/>
<id>urn:sha1:78f6face5af344f12f4bd48b32faa6f499a06f36</id>
<content type='text'>
Add kernel-doc for struct clone_args for the clone3() syscall.

Link: https://lore.kernel.org/r/20191001114701.24661-3-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>clone3: switch to copy_struct_from_user()</title>
<updated>2019-10-01T13:45:10Z</updated>
<author>
<name>Aleksa Sarai</name>
<email>cyphar@cyphar.com</email>
</author>
<published>2019-10-01T01:10:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f14c234b4bc5184fd40d9a47830e5b32c3b36d49'/>
<id>urn:sha1:f14c234b4bc5184fd40d9a47830e5b32c3b36d49</id>
<content type='text'>
Switch clone3() syscall from it's own copying struct clone_args from
userspace to the new dedicated copy_struct_from_user() helper.

The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls. Additionally, explicitly
define CLONE_ARGS_SIZE_VER0 to match the other users of the
struct-extension pattern.

Signed-off-by: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
[christian.brauner@ubuntu.com: improve commit message]
Link: https://lore.kernel.org/r/20191001011055.19283-3-cyphar@cyphar.com
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>sched: Add __ASSEMBLY__ guards around struct clone_args</title>
<updated>2019-09-30T20:32:52Z</updated>
<author>
<name>Seth Forshee</name>
<email>seth.forshee@canonical.com</email>
</author>
<published>2019-09-17T07:18:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=61129dd29f7962f278b618a2a3e8fdb986a66dc8'/>
<id>urn:sha1:61129dd29f7962f278b618a2a3e8fdb986a66dc8</id>
<content type='text'>
The addition of struct clone_args to uapi/linux/sched.h is not protected
by __ASSEMBLY__ guards, causing a failure to build from source for glibc
on RISC-V. Add the guards to fix this.

Fixes: 7f192e3cd316 ("fork: add clone3")
Signed-off-by: Seth Forshee &lt;seth.forshee@canonical.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20190917071853.12385-1-seth.forshee@canonical.com
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux</title>
<updated>2019-07-11T17:09:44Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-11T17:09:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8f6ccf6159aed1f04c6d179f61f6fb2691261e84'/>
<id>urn:sha1:8f6ccf6159aed1f04c6d179f61f6fb2691261e84</id>
<content type='text'>
Pull clone3 system call from Christian Brauner:
 "This adds the clone3 syscall which is an extensible successor to clone
  after we snagged the last flag with CLONE_PIDFD during the 5.2 merge
  window for clone(). It cleanly supports all of the flags from clone()
  and thus all legacy workloads.

  There are few user visible differences between clone3 and clone.
  First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse
  this flag. Second, the CSIGNAL flag is deprecated and will cause
  EINVAL to be reported. It is superseeded by a dedicated "exit_signal"
  argument in struct clone_args thus freeing up even more flags. And
  third, clone3 gives CLONE_PIDFD a dedicated return argument in struct
  clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr
  argument.

  The clone3 uapi is designed to be easy to handle on 32- and 64 bit:

    /* uapi */
    struct clone_args {
            __aligned_u64 flags;
            __aligned_u64 pidfd;
            __aligned_u64 child_tid;
            __aligned_u64 parent_tid;
            __aligned_u64 exit_signal;
            __aligned_u64 stack;
            __aligned_u64 stack_size;
            __aligned_u64 tls;
    };

  and a separate kernel struct is used that uses proper kernel typing:

    /* kernel internal */
    struct kernel_clone_args {
            u64 flags;
            int __user *pidfd;
            int __user *child_tid;
            int __user *parent_tid;
            int exit_signal;
            unsigned long stack;
            unsigned long stack_size;
            unsigned long tls;
    };

  The system call comes with a size argument which enables the kernel to
  detect what version of clone_args userspace is passing in. clone3
  validates that any additional bytes a given kernel does not know about
  are set to zero and that the size never exceeds a page.

  A nice feature is that this patchset allowed us to cleanup and
  simplify various core kernel codepaths in kernel/fork.c by making the
  internal _do_fork() function take struct kernel_clone_args even for
  legacy clone().

  This patch also unblocks the time namespace patchset which wants to
  introduce a new CLONE_TIMENS flag.

  Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and
  xtensa. These were the architectures that did not require special
  massaging.

  Other architectures treat fork-like system calls individually and
  after some back and forth neither Arnd nor I felt confident that we
  dared to add clone3 unconditionally to all architectures. We agreed to
  leave this up to individual architecture maintainers. This is why
  there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3
  which any architecture can set once it has implemented support for
  clone3. The patch also adds a cond_syscall(clone3) for architectures
  such as nios2 or h8300 that generate their syscall table by simply
  including asm-generic/unistd.h. The hope is to get rid of
  __ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon"

* tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  arch: handle arches who do not yet define clone3
  arch: wire-up clone3() syscall
  fork: add clone3
</content>
</entry>
<entry>
<title>sched/uclamp: Extend sched_setattr() to support utilization clamping</title>
<updated>2019-06-24T17:23:46Z</updated>
<author>
<name>Patrick Bellasi</name>
<email>patrick.bellasi@arm.com</email>
</author>
<published>2019-06-21T08:42:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a509a7cd79747074a2c018a45bbbc52d1f4aed44'/>
<id>urn:sha1:a509a7cd79747074a2c018a45bbbc52d1f4aed44</id>
<content type='text'>
The SCHED_DEADLINE scheduling class provides an advanced and formal
model to define tasks requirements that can translate into proper
decisions for both task placements and frequencies selections. Other
classes have a more simplified model based on the POSIX concept of
priorities.

Such a simple priority based model however does not allow to exploit
most advanced features of the Linux scheduler like, for example, driving
frequencies selection via the schedutil cpufreq governor. However, also
for non SCHED_DEADLINE tasks, it's still interesting to define tasks
properties to support scheduler decisions.

Utilization clamping exposes to user-space a new set of per-task
attributes the scheduler can use as hints about the expected/required
utilization for a task. This allows to implement a "proactive" per-task
frequency control policy, a more advanced policy than the current one
based just on "passive" measured task utilization. For example, it's
possible to boost interactive tasks (e.g. to get better performance) or
cap background tasks (e.g. to be more energy/thermal efficient).

Introduce a new API to set utilization clamping values for a specified
task by extending sched_setattr(), a syscall which already allows to
define task specific properties for different scheduling classes. A new
pair of attributes allows to specify a minimum and maximum utilization
the scheduler can consider for a task.

Do that by validating the required clamp values before and then applying
the required changes using _the_ same pattern already in use for
__setscheduler(). This ensures that the task is re-enqueued with the new
clamp values.

Signed-off-by: Patrick Bellasi &lt;patrick.bellasi@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alessio Balsini &lt;balsini@android.com&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Joel Fernandes &lt;joelaf@google.com&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Morten Rasmussen &lt;morten.rasmussen@arm.com&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Quentin Perret &lt;quentin.perret@arm.com&gt;
Cc: Rafael J . Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Steve Muckle &lt;smuckle@google.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Todd Kjos &lt;tkjos@google.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Link: https://lkml.kernel.org/r/20190621084217.8167-7-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
