<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/cgroup, branch v5.10.77</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.77</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.77'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-11-02T18:48:21Z</updated>
<entry>
<title>cgroup: Fix memory leak caused by missing cgroup_bpf_offline</title>
<updated>2021-11-02T18:48:21Z</updated>
<author>
<name>Quanyang Wang</name>
<email>quanyang.wang@windriver.com</email>
</author>
<published>2021-10-18T07:56:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=01599bf7cc2b49c3d2be886cb438647dc25446ed'/>
<id>urn:sha1:01599bf7cc2b49c3d2be886cb438647dc25446ed</id>
<content type='text'>
commit 04f8ef5643bcd8bcde25dfdebef998aea480b2ba upstream.

When enabling CONFIG_CGROUP_BPF, kmemleak can be observed by running
the command as below:

    $mount -t cgroup -o none,name=foo cgroup cgroup/
    $umount cgroup/

unreferenced object 0xc3585c40 (size 64):
  comm "mount", pid 425, jiffies 4294959825 (age 31.990s)
  hex dump (first 32 bytes):
    01 00 00 80 84 8c 28 c0 00 00 00 00 00 00 00 00  ......(.........
    00 00 00 00 00 00 00 00 6c 43 a0 c3 00 00 00 00  ........lC......
  backtrace:
    [&lt;e95a2f9e&gt;] cgroup_bpf_inherit+0x44/0x24c
    [&lt;1f03679c&gt;] cgroup_setup_root+0x174/0x37c
    [&lt;ed4b0ac5&gt;] cgroup1_get_tree+0x2c0/0x4a0
    [&lt;f85b12fd&gt;] vfs_get_tree+0x24/0x108
    [&lt;f55aec5c&gt;] path_mount+0x384/0x988
    [&lt;e2d5e9cd&gt;] do_mount+0x64/0x9c
    [&lt;208c9cfe&gt;] sys_mount+0xfc/0x1f4
    [&lt;06dd06e0&gt;] ret_fast_syscall+0x0/0x48
    [&lt;a8308cb3&gt;] 0xbeb4daa8

This is because that since the commit 2b0d3d3e4fcf ("percpu_ref: reduce
memory footprint of percpu_ref in fast path") root_cgrp-&gt;bpf.refcnt.data
is allocated by the function percpu_ref_init in cgroup_bpf_inherit which
is called by cgroup_setup_root when mounting, but not freed along with
root_cgrp when umounting. Adding cgroup_bpf_offline which calls
percpu_ref_kill to cgroup_kill_sb can free root_cgrp-&gt;bpf.refcnt.data in
umount path.

This patch also fixes the commit 4bfc0bb2c60e ("bpf: decouple the lifetime
of cgroup_bpf from cgroup itself"). A cgroup_bpf_offline is needed to do a
cleanup that frees the resources which are allocated by cgroup_bpf_inherit
in cgroup_setup_root.

And inside cgroup_bpf_offline, cgroup_get() is at the beginning and
cgroup_put is at the end of cgroup_bpf_release which is called by
cgroup_bpf_offline. So cgroup_bpf_offline can keep the balance of
cgroup's refcount.

Fixes: 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Signed-off-by: Quanyang Wang &lt;quanyang.wang@windriver.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20211018075623.26884-1-quanyang.wang@windriver.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cgroup/cpuset: Fix violation of cpuset locking rule</title>
<updated>2021-09-15T07:50:38Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2021-07-20T14:18:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=10dfcfda5c6f532726caf3b0e63a6d705592942b'/>
<id>urn:sha1:10dfcfda5c6f532726caf3b0e63a6d705592942b</id>
<content type='text'>
[ Upstream commit 6ba34d3c73674e46d9e126e4f0cee79e5ef2481c ]

The cpuset fields that manage partition root state do not strictly
follow the cpuset locking rule that update to cpuset has to be done
with both the callback_lock and cpuset_mutex held. This is now fixed
by making sure that the locking rule is upheld.

Fixes: 3881b86128d0 ("cpuset: Add an error state to cpuset.sched.partition")
Fixes: 4b842da276a8 ("cpuset: Make CPU hotplug work with partition")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup/cpuset: Miscellaneous code cleanup</title>
<updated>2021-09-15T07:50:38Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2021-07-20T14:18:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cbc97661439d9dd7693ebd6b9d7b8f5c4084e7e8'/>
<id>urn:sha1:cbc97661439d9dd7693ebd6b9d7b8f5c4084e7e8</id>
<content type='text'>
[ Upstream commit 0f3adb8a1e5f36e792598c1d77a2cfac9c90a4f9 ]

Use more descriptive variable names for update_prstate(), remove
unnecessary code and fix some typos. There is no functional change.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup/cpuset: Fix a partition bug with hotplug</title>
<updated>2021-09-15T07:50:35Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2021-07-20T14:18:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e0f3de1573fd00cfcff5252ebc66d70df92ce717'/>
<id>urn:sha1:e0f3de1573fd00cfcff5252ebc66d70df92ce717</id>
<content type='text'>
[ Upstream commit 15d428e6fe77fffc3f4fff923336036f5496ef17 ]

In cpuset_hotplug_workfn(), the detection of whether the cpu list
has been changed is done by comparing the effective cpus of the top
cpuset with the cpu_active_mask. However, in the rare case that just
all the CPUs in the subparts_cpus are offlined, the detection fails
and the partition states are not updated correctly. Fix it by forcing
the cpus_updated flag to true in this particular case.

Fixes: 4b842da276a8 ("cpuset: Make CPU hotplug work with partition")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup1: fix leaked context root causing sporadic NULL deref in LTP</title>
<updated>2021-07-31T06:16:11Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2021-06-16T12:51:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df34f888628e961d8158dbd9712ebf04dfa4ad8c'/>
<id>urn:sha1:df34f888628e961d8158dbd9712ebf04dfa4ad8c</id>
<content type='text'>
commit 1e7107c5ef44431bc1ebbd4c353f1d7c22e5f2ec upstream.

Richard reported sporadic (roughly one in 10 or so) null dereferences and
other strange behaviour for a set of automated LTP tests.  Things like:

   BUG: kernel NULL pointer dereference, address: 0000000000000008
   #PF: supervisor read access in kernel mode
   #PF: error_code(0x0000) - not-present page
   PGD 0 P4D 0
   Oops: 0000 [#1] PREEMPT SMP PTI
   CPU: 0 PID: 1516 Comm: umount Not tainted 5.10.0-yocto-standard #1
   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
   RIP: 0010:kernfs_sop_show_path+0x1b/0x60

...or these others:

   RIP: 0010:do_mkdirat+0x6a/0xf0
   RIP: 0010:d_alloc_parallel+0x98/0x510
   RIP: 0010:do_readlinkat+0x86/0x120

There were other less common instances of some kind of a general scribble
but the common theme was mount and cgroup and a dubious dentry triggering
the NULL dereference.  I was only able to reproduce it under qemu by
replicating Richard's setup as closely as possible - I never did get it
to happen on bare metal, even while keeping everything else the same.

In commit 71d883c37e8d ("cgroup_do_mount(): massage calling conventions")
we see this as a part of the overall change:

   --------------
           struct cgroup_subsys *ss;
   -       struct dentry *dentry;

   [...]

   -       dentry = cgroup_do_mount(&amp;cgroup_fs_type, fc-&gt;sb_flags, root,
   -                                CGROUP_SUPER_MAGIC, ns);

   [...]

   -       if (percpu_ref_is_dying(&amp;root-&gt;cgrp.self.refcnt)) {
   -               struct super_block *sb = dentry-&gt;d_sb;
   -               dput(dentry);
   +       ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
   +       if (!ret &amp;&amp; percpu_ref_is_dying(&amp;root-&gt;cgrp.self.refcnt)) {
   +               struct super_block *sb = fc-&gt;root-&gt;d_sb;
   +               dput(fc-&gt;root);
                   deactivate_locked_super(sb);
                   msleep(10);
                   return restart_syscall();
           }
   --------------

In changing from the local "*dentry" variable to using fc-&gt;root, we now
export/leave that dentry pointer in the file context after doing the dput()
in the unlikely "is_dying" case.   With LTP doing a crazy amount of back to
back mount/unmount [testcases/bin/cgroup_regression_5_1.sh] the unlikely
becomes slightly likely and then bad things happen.

A fix would be to not leave the stale reference in fc-&gt;root as follows:

   --------------
                  dput(fc-&gt;root);
  +               fc-&gt;root = NULL;
                  deactivate_locked_super(sb);
   --------------

...but then we are just open-coding a duplicate of fc_drop_locked() so we
simply use that instead.

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Zefan Li &lt;lizefan.x@bytedance.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: stable@vger.kernel.org      # v5.1+
Reported-by: Richard Purdie &lt;richard.purdie@linuxfoundation.org&gt;
Fixes: 71d883c37e8d ("cgroup_do_mount(): massage calling conventions")
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cgroup: verify that source is a string</title>
<updated>2021-07-20T14:05:36Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2021-07-14T13:47:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=811763e3beb6c922d168e9f509ec593e9240842e'/>
<id>urn:sha1:811763e3beb6c922d168e9f509ec593e9240842e</id>
<content type='text'>
commit 3b0462726e7ef281c35a7a4ae33e93ee2bc9975b upstream.

The following sequence can be used to trigger a UAF:

    int fscontext_fd = fsopen("cgroup");
    int fd_null = open("/dev/null, O_RDONLY);
    int fsconfig(fscontext_fd, FSCONFIG_SET_FD, "source", fd_null);
    close_range(3, ~0U, 0);

The cgroup v1 specific fs parser expects a string for the "source"
parameter.  However, it is perfectly legitimate to e.g.  specify a file
descriptor for the "source" parameter.  The fs parser doesn't know what
a filesystem allows there.  So it's a bug to assume that "source" is
always of type fs_value_is_string when it can reasonably also be
fs_value_is_file.

This assumption in the cgroup code causes a UAF because struct
fs_parameter uses a union for the actual value.  Access to that union is
guarded by the param-&gt;type member.  Since the cgroup paramter parser
didn't check param-&gt;type but unconditionally moved param-&gt;string into
fc-&gt;source a close on the fscontext_fd would trigger a UAF during
put_fs_context() which frees fc-&gt;source thereby freeing the file stashed
in param-&gt;file causing a UAF during a close of the fd_null.

Fix this by verifying that param-&gt;type is actually a string and report
an error if not.

In follow up patches I'll add a new generic helper that can be used here
and by other filesystems instead of this error-prone copy-pasta fix.
But fixing it in here first makes backporting a it to stable a lot
easier.

Fixes: 8d2451f4994f ("cgroup1: switch to option-by-option parsing")
Reported-by: syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: syzkaller-bugs &lt;syzkaller-bugs@googlegroups.com&gt;
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cgroup1: don't allow '\n' in renaming</title>
<updated>2021-06-16T10:01:40Z</updated>
<author>
<name>Alexander Kuznetsov</name>
<email>wwfq@yandex-team.ru</email>
</author>
<published>2021-06-09T07:17:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=74d3b20b1b206a76f2cbccc5e09106adf6b5775c'/>
<id>urn:sha1:74d3b20b1b206a76f2cbccc5e09106adf6b5775c</id>
<content type='text'>
commit b7e24eb1caa5f8da20d405d262dba67943aedc42 upstream.

cgroup_mkdir() have restriction on newline usage in names:
$ mkdir $'/sys/fs/cgroup/cpu/test\ntest2'
mkdir: cannot create directory
'/sys/fs/cgroup/cpu/test\ntest2': Invalid argument

But in cgroup1_rename() such check is missed.
This allows us to make /proc/&lt;pid&gt;/cgroup unparsable:
$ mkdir /sys/fs/cgroup/cpu/test
$ mv /sys/fs/cgroup/cpu/test $'/sys/fs/cgroup/cpu/test\ntest2'
$ echo $$ &gt; $'/sys/fs/cgroup/cpu/test\ntest2'
$ cat /proc/self/cgroup
11:pids:/
10:freezer:/
9:hugetlb:/
8:cpuset:/
7:blkio:/user.slice
6:memory:/user.slice
5:net_cls,net_prio:/
4:perf_event:/
3:devices:/user.slice
2:cpu,cpuacct:/test
test2
1:name=systemd:/
0::/

Signed-off-by: Alexander Kuznetsov &lt;wwfq@yandex-team.ru&gt;
Reported-by: Andrey Krasichkov &lt;buglloc@yandex-team.ru&gt;
Acked-by: Dmitry Yakunin &lt;zeil@yandex-team.ru&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cgroup: disable controllers at parse time</title>
<updated>2021-06-16T10:01:36Z</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeelb@google.com</email>
</author>
<published>2021-05-12T20:19:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5ca472d40e2daebbfd926701902debcab0bbd15c'/>
<id>urn:sha1:5ca472d40e2daebbfd926701902debcab0bbd15c</id>
<content type='text'>
[ Upstream commit 45e1ba40837ac2f6f4d4716bddb8d44bd7e4a251 ]

This patch effectively reverts the commit a3e72739b7a7 ("cgroup: fix
too early usage of static_branch_disable()"). The commit 6041186a3258
("init: initialize jump labels before command line option parsing") has
moved the jump_label_init() before parse_args() which has made the
commit a3e72739b7a7 unnecessary. On the other hand there are
consequences of disabling the controllers later as there are subsystems
doing the controller checks for different decisions. One such incident
is reported [1] regarding the memory controller and its impact on memory
reclaim code.

[1] https://lore.kernel.org/linux-mm/921e53f3-4b13-aab8-4a9e-e83ff15371e4@nec.com

Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reported-by: NOMURA JUNICHI(野村　淳一) &lt;junichi.nomura@nec.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Tested-by: Jun'ichi Nomura &lt;junichi.nomura@nec.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup-v1: add disabled controller check in cgroup1_parse_param()</title>
<updated>2021-02-17T10:02:25Z</updated>
<author>
<name>Chen Zhou</name>
<email>chenzhou10@huawei.com</email>
</author>
<published>2021-01-15T09:37:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3e53d64e9a4d88226fb1ab55900fe4e89e81df1e'/>
<id>urn:sha1:3e53d64e9a4d88226fb1ab55900fe4e89e81df1e</id>
<content type='text'>
[ Upstream commit 61e960b07b637f0295308ad91268501d744c21b5 ]

When mounting a cgroup hierarchy with disabled controller in cgroup v1,
all available controllers will be attached.
For example, boot with cgroup_no_v1=cpu or cgroup_disable=cpu, and then
mount with "mount -t cgroup -ocpu cpu /sys/fs/cgroup/cpu", then all
enabled controllers will be attached except cpu.

Fix this by adding disabled controller check in cgroup1_parse_param().
If the specified controller is disabled, just return error with information
"Disabled controller xx" rather than attaching all the other enabled
controllers.

Fixes: f5dfb5315d34 ("cgroup: take options parsing into -&gt;parse_monolithic()")
Signed-off-by: Chen Zhou &lt;chenzhou10@huawei.com&gt;
Reviewed-by: Zefan Li &lt;lizefan.x@bytedance.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: fix psi monitor for root cgroup</title>
<updated>2021-02-17T10:02:21Z</updated>
<author>
<name>Odin Ugedal</name>
<email>odin@uged.al</email>
</author>
<published>2021-01-16T17:36:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e72a65802a3e053f53ee97b55d6fb4a426cfffa9'/>
<id>urn:sha1:e72a65802a3e053f53ee97b55d6fb4a426cfffa9</id>
<content type='text'>
commit 385aac1519417b89cb91b77c22e4ca21db563cd0 upstream.

Fix NULL pointer dereference when adding new psi monitor to the root
cgroup. PSI files for root cgroup was introduced in df5ba5be742 by using
system wide psi struct when reading, but file write/monitor was not
properly fixed. Since the PSI config for the root cgroup isn't
initialized, the current implementation tries to lock a NULL ptr,
resulting in a crash.

Can be triggered by running this as root:
$ tee /sys/fs/cgroup/cpu.pressure &lt;&lt;&lt; "some 10000 1000000"

Signed-off-by: Odin Ugedal &lt;odin@uged.al&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Acked-by: Dan Schatzberg &lt;dschatzberg@fb.com&gt;
Fixes: df5ba5be7425 ("kernel/sched/psi.c: expose pressure metrics on root cgroup")
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: stable@vger.kernel.org # 5.2+
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
