<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/cgroup, branch v5.5.14</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.14</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.14'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-04-01T08:59:57Z</updated>
<entry>
<title>cgroup1: don't call release_agent when it is ""</title>
<updated>2020-04-01T08:59:57Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2020-02-19T19:01:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e3259b4caf3c905c7172d79db1b7fe574d993b55'/>
<id>urn:sha1:e3259b4caf3c905c7172d79db1b7fe574d993b55</id>
<content type='text'>
[ Upstream commit 2e5383d7904e60529136727e49629a82058a5607 ]

Older (and maybe current) versions of systemd set release_agent to "" when
shutting down, but do not set notify_on_release to 0.

Since 64e90a8acb85 ("Introduce STATIC_USERMODEHELPER to mediate
call_usermodehelper()"), we filter out such calls when the user mode helper
path is "". However, when used in conjunction with an actual (i.e. non "")
STATIC_USERMODEHELPER, the path is never "", so the real usermode helper
will be called with argv[0] == "".

Let's avoid this by not invoking the release_agent when it is "".

Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup-v1: cgroup_pidlist_next should update position index</title>
<updated>2020-04-01T08:59:56Z</updated>
<author>
<name>Vasily Averin</name>
<email>vvs@virtuozzo.com</email>
</author>
<published>2020-01-30T10:34:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f95fc632016e43c9af45d1a38288387559ad80d6'/>
<id>urn:sha1:f95fc632016e43c9af45d1a38288387559ad80d6</id>
<content type='text'>
[ Upstream commit db8dd9697238be70a6b4f9d0284cd89f59c0e070 ]

if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.

 # mount | grep cgroup
 # dd if=/mnt/cgroup.procs bs=1  # normal output
...
1294
1295
1296
1304
1382
584+0 records in
584+0 records out
584 bytes copied

dd: /mnt/cgroup.procs: cannot skip to specified offset
83  &lt;&lt;&lt; generates end of last line
1383  &lt;&lt;&lt; ... and whole last line once again
0+1 records in
0+1 records out
8 bytes copied

dd: /mnt/cgroup.procs: cannot skip to specified offset
1386  &lt;&lt;&lt; generates last line anyway
0+1 records in
0+1 records out
5 bytes copied

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: Iterate tasks that did not finish do_exit()</title>
<updated>2020-03-18T06:19:07Z</updated>
<author>
<name>Michal Koutný</name>
<email>mkoutny@suse.com</email>
</author>
<published>2020-01-24T11:40:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=de9d0ad764628a55a9e1d46a7d6bd30acd1a02a2'/>
<id>urn:sha1:de9d0ad764628a55a9e1d46a7d6bd30acd1a02a2</id>
<content type='text'>
commit 9c974c77246460fa6a92c18554c3311c8c83c160 upstream.

PF_EXITING is set earlier than actual removal from css_set when a task
is exitting. This can confuse cgroup.procs readers who see no PF_EXITING
tasks, however, rmdir is checking against css_set membership so it can
transitionally fail with EBUSY.

Fix this by listing tasks that weren't unlinked from css_set active
lists.
It may happen that other users of the task iterator (without
CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This
is equal to the state before commit c03cd7738a83 ("cgroup: Include dying
leaders with live threads in PROCS iterations") but it may be reviewed
later.

Reported-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations")
Signed-off-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: cgroup_procs_next should increase position index</title>
<updated>2020-03-18T06:19:07Z</updated>
<author>
<name>Vasily Averin</name>
<email>vvs@virtuozzo.com</email>
</author>
<published>2020-01-30T10:34:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3f605480ac77e07429cea5fef223d1fa3bcf6d35'/>
<id>urn:sha1:3f605480ac77e07429cea5fef223d1fa3bcf6d35</id>
<content type='text'>
commit 2d4ecb030dcc90fb725ecbfc82ce5d6c37906e0e upstream.

If seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output:

1) dd bs=1 skip output of each 2nd elements
$ dd if=/sys/fs/cgroup/cgroup.procs bs=8 count=1
2
3
4
5
1+0 records in
1+0 records out
8 bytes copied, 0,000267297 s, 29,9 kB/s
[test@localhost ~]$ dd if=/sys/fs/cgroup/cgroup.procs bs=1 count=8
2
4 &lt;&lt;&lt; NB! 3 was skipped
6 &lt;&lt;&lt;    ... and 5 too
8 &lt;&lt;&lt;    ... and 7
8+0 records in
8+0 records out
8 bytes copied, 5,2123e-05 s, 153 kB/s

 This happen because __cgroup_procs_start() makes an extra
 extra cgroup_procs_next() call

2) read after lseek beyond end of file generates whole last line.
3) read after lseek into middle of last line generates
expected rest of last line and unexpected whole line once again.

Additionally patch removes an extra position index changes in
__cgroup_procs_start()

Cc: stable@vger.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: fix psi_show() crash on 32bit ino archs</title>
<updated>2020-03-18T06:19:06Z</updated>
<author>
<name>Qian Cai</name>
<email>cai@lca.pw</email>
</author>
<published>2020-02-24T03:00:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=13be570b96c53d191ca764cce763a0f62578ba5c'/>
<id>urn:sha1:13be570b96c53d191ca764cce763a0f62578ba5c</id>
<content type='text'>
commit 190ecb190a9cd8c0599d8499b901e3c32e87966a upstream.

Similar to the commit d7495343228f ("cgroup: fix incorrect
WARN_ON_ONCE() in cgroup_setup_root()"), cgroup_id(root_cgrp) does not
equal to 1 on 32bit ino archs which triggers all sorts of issues with
psi_show() on s390x. For example,

 BUG: KASAN: slab-out-of-bounds in collect_percpu_times+0x2d0/
 Read of size 4 at addr 000000001e0ce000 by task read_all/3667
 collect_percpu_times+0x2d0/0x798
 psi_show+0x7c/0x2a8
 seq_read+0x2ac/0x830
 vfs_read+0x92/0x150
 ksys_read+0xe2/0x188
 system_call+0xd8/0x2b4

Fix it by using cgroup_ino().

Fixes: 743210386c03 ("cgroup: use cgrp-&gt;kn-&gt;id as the cgroup ID")
Signed-off-by: Qian Cai &lt;cai@lca.pw&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@vger.kernel.org # v5.5
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: memcg: net: do not associate sock with unrelated cgroup</title>
<updated>2020-03-18T06:18:59Z</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeelb@google.com</email>
</author>
<published>2020-03-10T05:16:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=48f7246793390c0410d74ec5d273dfc9d75338a1'/>
<id>urn:sha1:48f7246793390c0410d74ec5d273dfc9d75338a1</id>
<content type='text'>
[ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ]

We are testing network memory accounting in our setup and noticed
inconsistent network memory usage and often unrelated cgroups network
usage correlates with testing workload. On further inspection, it
seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in
irq context specially for cgroup v1.

mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context
and kind of assumes that this can only happen from sk_clone_lock()
and the source sock object has already associated cgroup. However in
cgroup v1, where network memory accounting is opt-in, the source sock
can be unassociated with any cgroup and the new cloned sock can get
associated with unrelated interrupted cgroup.

Cgroup v2 can also suffer if the source sock object was created by
process in the root cgroup or if sk_alloc() is called in irq context.
The fix is to just do nothing in interrupt.

WARNING: Please note that about half of the TCP sockets are allocated
from the IRQ context, so, memory used by such sockets will not be
accouted by the memcg.

The stack trace of mem_cgroup_sk_alloc() from IRQ-context:

CPU: 70 PID: 12720 Comm: ssh Tainted:  5.6.0-smp-DEV #1
Hardware name: ...
Call Trace:
 &lt;IRQ&gt;
 dump_stack+0x57/0x75
 mem_cgroup_sk_alloc+0xe9/0xf0
 sk_clone_lock+0x2a7/0x420
 inet_csk_clone_lock+0x1b/0x110
 tcp_create_openreq_child+0x23/0x3b0
 tcp_v6_syn_recv_sock+0x88/0x730
 tcp_check_req+0x429/0x560
 tcp_v6_rcv+0x72d/0xa40
 ip6_protocol_deliver_rcu+0xc9/0x400
 ip6_input+0x44/0xd0
 ? ip6_protocol_deliver_rcu+0x400/0x400
 ip6_rcv_finish+0x71/0x80
 ipv6_rcv+0x5b/0xe0
 ? ip6_sublist_rcv+0x2e0/0x2e0
 process_backlog+0x108/0x1e0
 net_rx_action+0x26b/0x460
 __do_softirq+0x104/0x2a6
 do_softirq_own_stack+0x2a/0x40
 &lt;/IRQ&gt;
 do_softirq.part.19+0x40/0x50
 __local_bh_enable_ip+0x51/0x60
 ip6_finish_output2+0x23d/0x520
 ? ip6table_mangle_hook+0x55/0x160
 __ip6_finish_output+0xa1/0x100
 ip6_finish_output+0x30/0xd0
 ip6_output+0x73/0x120
 ? __ip6_finish_output+0x100/0x100
 ip6_xmit+0x2e3/0x600
 ? ipv6_anycast_cleanup+0x50/0x50
 ? inet6_csk_route_socket+0x136/0x1e0
 ? skb_free_head+0x1e/0x30
 inet6_csk_xmit+0x95/0xf0
 __tcp_transmit_skb+0x5b4/0xb20
 __tcp_send_ack.part.60+0xa3/0x110
 tcp_send_ack+0x1d/0x20
 tcp_rcv_state_process+0xe64/0xe80
 ? tcp_v6_connect+0x5d1/0x5f0
 tcp_v6_do_rcv+0x1b1/0x3f0
 ? tcp_v6_do_rcv+0x1b1/0x3f0
 __release_sock+0x7f/0xd0
 release_sock+0x30/0xa0
 __inet_stream_connect+0x1c3/0x3b0
 ? prepare_to_wait+0xb0/0xb0
 inet_stream_connect+0x3b/0x60
 __sys_connect+0x101/0x120
 ? __sys_getsockopt+0x11b/0x140
 __x64_sys_connect+0x1a/0x20
 do_syscall_64+0x51/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking")
Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reviewed-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cgroup: init_tasks shouldn't be linked to the root cgroup</title>
<updated>2020-02-19T18:54:05Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2020-01-30T16:37:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dcf6b17b8a036bb53f65d1a4f0e1636a5df96868'/>
<id>urn:sha1:dcf6b17b8a036bb53f65d1a4f0e1636a5df96868</id>
<content type='text'>
commit 0cd9d33ace336bc424fc30944aa3defd6786e4fe upstream.

5153faac18d2 ("cgroup: remove cgroup_enable_task_cg_lists()
optimization") removed lazy initialization of css_sets so that new
tasks are always lniked to its css_set. In the process, it incorrectly
ended up adding init_tasks to root css_set. They show up as PID 0's in
root's cgroup.procs triggering warnings in systemd and generally
confusing people.

Fix it by skip css_set linking for init_tasks.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: https://github.com/joanbm
Link: https://github.com/systemd/systemd/issues/14682
Fixes: 5153faac18d2 ("cgroup: remove cgroup_enable_task_cg_lists() optimization")
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: Prevent double killing of css when enabling threaded cgroup</title>
<updated>2020-02-04T18:18:03Z</updated>
<author>
<name>Michal Koutný</name>
<email>mkoutny@suse.com</email>
</author>
<published>2020-01-09T15:05:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5c41180d719de814d686536e335651299da5a4b3'/>
<id>urn:sha1:5c41180d719de814d686536e335651299da5a4b3</id>
<content type='text'>
commit 3bc0bb36fa30e95ca829e9cf480e1ef7f7638333 upstream.

The test_cgcore_no_internal_process_constraint_on_threads selftest when
running with subsystem controlling noise triggers two warnings:

&gt; [  597.443115] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3131 cgroup_apply_control_enable+0xe0/0x3f0
&gt; [  597.443413] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3177 cgroup_apply_control_disable+0xa6/0x160

Both stem from a call to cgroup_type_write. The first warning was also
triggered by syzkaller.

When we're switching cgroup to threaded mode shortly after a subsystem
was disabled on it, we can see the respective subsystem css dying there.

The warning in cgroup_apply_control_enable is harmless in this case
since we're not adding new subsys anyway.
The warning in cgroup_apply_control_disable indicates an attempt to kill
css of recently disabled subsystem repeatedly.

The commit prevents these situations by making cgroup_type_write wait
for all dying csses to go away before re-applying subtree controls.
When at it, the locations of WARN_ON_ONCE calls are moved so that
warning is triggered only when we are about to misuse the dying css.

Reported-by: syzbot+5493b2a54d31d6aea629@syzkaller.appspotmail.com
Reported-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2019-11-26T03:23:46Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-11-26T03:23:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1b96a41b420972f8ba563373de8154d59d5b2208'/>
<id>urn:sha1:1b96a41b420972f8ba563373de8154d59d5b2208</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:
 "There are several notable changes here:

   - Single thread migrating itself has been optimized so that it
     doesn't need threadgroup rwsem anymore.

   - Freezer optimization to avoid unnecessary frozen state changes.

   - cgroup ID unification so that cgroup fs ino is the only unique ID
     used for the cgroup and can be used to directly look up live
     cgroups through filehandle interface on 64bit ino archs. On 32bit
     archs, cgroup fs ino is still the only ID in use but it is only
     unique when combined with gen.

   - selftest and other changes"

* 'for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (24 commits)
  writeback: fix -Wformat compilation warnings
  docs: cgroup: mm: Fix spelling of "list"
  cgroup: fix incorrect WARN_ON_ONCE() in cgroup_setup_root()
  cgroup: use cgrp-&gt;kn-&gt;id as the cgroup ID
  kernfs: use 64bit inos if ino_t is 64bit
  kernfs: implement custom exportfs ops and fid type
  kernfs: combine ino/id lookup functions into kernfs_find_and_get_node_by_id()
  kernfs: convert kernfs_node-&gt;id from union kernfs_node_id to u64
  kernfs: kernfs_find_and_get_node_by_ino() should only look up activated nodes
  kernfs: use dumber locking for kernfs_find_and_get_node_by_ino()
  netprio: use css ID instead of cgroup ID
  writeback: use ino_t for inodes in tracepoints
  kernfs: fix ino wrap-around detection
  kselftests: cgroup: Avoid the reuse of fd after it is deallocated
  cgroup: freezer: don't change task and cgroups status unnecessarily
  cgroup: use cgroup-&gt;last_bstat instead of cgroup-&gt;bstat_pending for consistency
  cgroup: remove cgroup_enable_task_cg_lists() optimization
  cgroup: pids: use atomic64_t for pids-&gt;limit
  selftests: cgroup: Run test_core under interfering stress
  selftests: cgroup: Add task migration tests
  ...
</content>
</entry>
<entry>
<title>Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2019-11-15T16:44:08Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-11-15T16:44:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b4c0800e4285f96900b7aa4a13ae724fc1205f65'/>
<id>urn:sha1:b4c0800e4285f96900b7aa4a13ae724fc1205f65</id>
<content type='text'>
Pull misc vfs fixes from Al Viro:
 "Assorted fixes all over the place; some of that is -stable fodder,
  some regressions from the last window"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ecryptfs_lookup_interpose(): lower_dentry-&gt;d_parent is not stable either
  ecryptfs_lookup_interpose(): lower_dentry-&gt;d_inode is not stable
  ecryptfs: fix unlink and rmdir in face of underlying fs modifications
  audit_get_nd(): don't unlock parent too early
  exportfs_decode_fh(): negative pinned may become positive without the parent locked
  cgroup: don't put ERR_PTR() into fc-&gt;root
  autofs: fix a leak in autofs_expire_indirect()
  aio: Fix io_pgetevents() struct __compat_aio_sigset layout
  fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry()
</content>
</entry>
</feed>
