<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/taskstats.c, branch v3.0.67</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.0.67</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.0.67'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2011-12-21T20:57:40Z</updated>
<entry>
<title>Make TASKSTATS require root access</title>
<updated>2011-12-21T20:57:40Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-09-20T00:04:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b2c51348a4e6b08bb493b2b33d06df89d1a9a0bc'/>
<id>urn:sha1:b2c51348a4e6b08bb493b2b33d06df89d1a9a0bc</id>
<content type='text'>
commit 1a51410abe7d0ee4b1d112780f46df87d3621043 upstream.

Ok, this isn't optimal, since it means that 'iotop' needs admin
capabilities, and we may have to work on this some more.  But at the
same time it is very much not acceptable to let anybody just read
anybody elses IO statistics quite at this level.

Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative
to checking the capabilities by hand.

Reported-by: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Cc: Johannes Berg &lt;johannes.berg@intel.com&gt;
Acked-by: Balbir Singh &lt;bsingharora@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Moritz Mühlenhoff &lt;jmm@inutil.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>taskstats: don't allow duplicate entries in listener mode</title>
<updated>2011-06-28T01:00:13Z</updated>
<author>
<name>Vasiliy Kulikov</name>
<email>segoon@openwall.com</email>
</author>
<published>2011-06-27T23:18:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=26c4caea9d697043cc5a458b96411b86d7f6babd'/>
<id>urn:sha1:26c4caea9d697043cc5a458b96411b86d7f6babd</id>
<content type='text'>
Currently a single process may register exit handlers unlimited times.
It may lead to a bloated listeners chain and very slow process
terminations.

Eg after 10KK sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs ~300 Mb of
kernel memory is stolen for the handlers chain and "time id" shows 2-7
seconds instead of normal 0.003.  It makes it possible to exhaust all
kernel memory and to eat much of CPU time by triggerring numerous exits
on a single CPU.

The patch limits the number of times a single process may register
itself on a single CPU to one.

One little issue is kept unfixed - as taskstats_exit() is called before
exit_files() in do_exit(), the orphaned listener entry (if it was not
explicitly deregistered) is kept until the next someone's exit() and
implicit deregistration in send_cpu_listeners().  So, if a process
registered itself as a listener exits and the next spawned process gets
the same pid, it would inherit taskstats attributes.

Signed-off-by: Vasiliy Kulikov &lt;segooon@gmail.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>taskstats: use appropriate printk priority level</title>
<updated>2011-03-24T02:47:14Z</updated>
<author>
<name>Mandeep Singh Baines</name>
<email>msb@chromium.org</email>
</author>
<published>2011-03-23T23:43:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f9b182e24ecb2b3bb33340f053ba31c8c4e1d895'/>
<id>urn:sha1:f9b182e24ecb2b3bb33340f053ba31c8c4e1d895</id>
<content type='text'>
printk()s without a priority level default to KERN_WARNING.  To reduce
noise at KERN_WARNING, this patch set the priority level appriopriately
for unleveled printks()s.  This should be useful to folks that look at
dmesg warnings closely.

Signed-off-by: Mandeep Singh Baines &lt;msb@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>taskstats: use better ifdef for alignment</title>
<updated>2011-01-13T16:03:19Z</updated>
<author>
<name>Jeff Mahoney</name>
<email>jeffm@suse.com</email>
</author>
<published>2011-01-13T01:00:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9ab020cf07e457a8b425bf5af17e704f90f86d8b'/>
<id>urn:sha1:9ab020cf07e457a8b425bf5af17e704f90f86d8b</id>
<content type='text'>
Commit 4be2c95d ("taskstats: pad taskstats netlink response for aligment
issues on ia64") added a null field to align the taskstats structure but
the discussion centered around ia64.  The issue exists on other platforms
with inefficient unaligned access and adding them piecemeal would be an
unmaintainable mess.

This patch uses Dave Miller's suggestion of using a combination of
CONFIG_64BIT &amp;&amp; !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to determine
whether alignment is needed.

Note that this will cause breakage on those platforms with applications
like iotop which had hard-coded offsets into the packet to access the
taskstats structure.

The message seen on systems without the alignment fixes looks like: kernel
unaligned access to 0xe000023879dca9bc, ip=0xa000000100133d10

The addresses may vary but resolve to locations inside __delayacct_add_tsk.

iotop makes what I'd call unreasonable assumptions about the contents of a
netlink genetlink packet containing generic attributes.  They're typed and
have headers that specify value lengths, so the client can (should)
identify and skip the ones the client doesn't understand.

The kernel, as of version 2.6.36, presented a packet like so:
+--------------------------------+
| genlmsghdr - 4 bytes           |
+--------------------------------+
| NLA header - 4 bytes           | /* Aggregate header */
+-+------------------------------+
| | NLA header - 4 bytes         | /* PID header */
| +------------------------------+
| | pid/tgid   - 4 bytes         |
| +------------------------------+
| | NLA header - 4 bytes         | /* stats header */
| + -----------------------------+ &lt;- oops. aligned on 4 byte boundary
| | struct taskstats - 328 bytes |
+-+------------------------------+

The iotop code expects that the kernel will behave as it did then,
assuming that the packet format is set in stone.  The format is set in
stone, but the packet offsets are not.  There's nothing in the packet
format that guarantees that the packet will be sent in exactly the same
way.  The attribute contents are set (or versioned) and the aggregate
contents are set but they can be anywhere in the packet.

The issue here isn't that an unaligned structure gets passed to userspace,
it's that the NLA infrastructure has something of a weakness: The 4 byte
attribute header may force the payload to be unaligned.  The taskstats
structure is created at an unaligned location and then 64-bit values are
operated on inside the kernel, so the unaligned access warnings gets
spewed everywhere.

It's possible to use the unaligned access API to operate on the structure
in the kernel but it seems like a wasted effort to work around userspace
code that isn't following the packet format.  Any new additions would also
need the be worked around.  It's a maintenance nightmare.

The conclusion of the earlier discussion seemed to be "ok fine, if we have
to break it, don't break it on arches that don't have the problem." Dave
pointed out that the unaligned access problem doesn't only exist on ia64,
but also on other 64-bit arches that don't have efficient unaligned access
and it should be fixed there as well.  The committed version of the patch
and this addition keep with the conclusion of that discussion not to break
it unnecessarily, which the pid padding and the packet padding fixes did
do.  x86_64 and powerpc don't suffer this problem so they shouldn't suffer
the solution.  Other 64-bit architectures do and will, though.

Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
Reported-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Dan Carpenter &lt;error27@gmail.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Cc: Florian Mickler &lt;florian@mickler.org&gt;
Cc: Guillaume Chazarain &lt;guichaz@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu</title>
<updated>2011-01-08T01:02:58Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-01-08T01:02:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=72eb6a791459c87a0340318840bb3bd9252b627b'/>
<id>urn:sha1:72eb6a791459c87a0340318840bb3bd9252b627b</id>
<content type='text'>
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
  gameport: use this_cpu_read instead of lookup
  x86: udelay: Use this_cpu_read to avoid address calculation
  x86: Use this_cpu_inc_return for nmi counter
  x86: Replace uses of current_cpu_data with this_cpu ops
  x86: Use this_cpu_ops to optimize code
  vmstat: User per cpu atomics to avoid interrupt disable / enable
  irq_work: Use per cpu atomics instead of regular atomics
  cpuops: Use cmpxchg for xchg to avoid lock semantics
  x86: this_cpu_cmpxchg and this_cpu_xchg operations
  percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
  percpu,x86: relocate this_cpu_add_return() and friends
  connector: Use this_cpu operations
  xen: Use this_cpu_inc_return
  taskstats: Use this_cpu_ops
  random: Use this_cpu_inc_return
  fs: Use this_cpu_inc_return in buffer.c
  highmem: Use this_cpu_xx_return() operations
  vmstat: Use this_cpu_inc_return for vm statistics
  x86: Support for this_cpu_add, sub, dec, inc_return
  percpu: Generic support for this_cpu_add, sub, dec, inc_return
  ...

Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
as per Tejun.
</content>
</entry>
<entry>
<title>taskstats: pad taskstats netlink response for aligment issues on ia64</title>
<updated>2010-12-23T03:43:34Z</updated>
<author>
<name>Jeff Mahoney</name>
<email>jeffm@suse.com</email>
</author>
<published>2010-12-22T01:24:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4be2c95d1f7706ca0e74499f2bd118e1cee19669'/>
<id>urn:sha1:4be2c95d1f7706ca0e74499f2bd118e1cee19669</id>
<content type='text'>
The taskstats structure is internally aligned on 8 byte boundaries but the
layout of the aggregrate reply, with two NLA headers and the pid (each 4
bytes), actually force the entire structure to be unaligned.  This causes
the kernel to issue unaligned access warnings on some architectures like
ia64.  Unfortunately, some software out there doesn't properly unroll the
NLA packet and assumes that the start of the taskstats structure will
always be 20 bytes from the start of the netlink payload.  Aligning the
start of the taskstats structure breaks this software, which we don't
want.  So, for now the alignment only happens on architectures that
require it and those users will have to update to fixed versions of those
packages.  Space is reserved in the packet only when needed.  This ifdef
should be removed in several years e.g.  2012 once we can be confident
that fixed versions are installed on most systems.  We add the padding
before the aggregate since the aggregate is already a defined type.

Commit 85893120 ("delayacct: align to 8 byte boundary on 64-bit systems")
previously addressed the alignment issues by padding out the pid field.
This was supposed to be a compatible change but the circumstances
described above mean that it wasn't.  This patch backs out that change,
since it was a hack, and introduces a new NULL attribute type to provide
the padding.  Padding the response with 4 bytes avoids allocating an
aligned taskstats structure and copying it back.  Since the structure
weighs in at 328 bytes, it's too big to do it on the stack.

Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
Reported-by: Brian Rogers &lt;brian@xyzw.org&gt;
Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Guillaume Chazarain &lt;guichaz@gmail.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>taskstats: Use this_cpu_ops</title>
<updated>2010-12-17T14:18:05Z</updated>
<author>
<name>Christoph Lameter</name>
<email>cl@linux.com</email>
</author>
<published>2010-12-08T16:42:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cd85fc58cd71bf6b89612efafb9a84e655ed7d66'/>
<id>urn:sha1:cd85fc58cd71bf6b89612efafb9a84e655ed7d66</id>
<content type='text'>
Use this_cpu_inc_return in one place and avoid ugly __raw_get_cpu in
another.

V3-&gt;V4:
	- Fix off by one.

V4-V4f:
	- Use &amp;listener_array

Cc: Michael Holzheu &lt;holzheu@linux.vnet.ibm.com&gt;
Acked-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>taskstats: split fill_pid function</title>
<updated>2010-10-28T01:03:17Z</updated>
<author>
<name>Michael Holzheu</name>
<email>holzheu@linux.vnet.ibm.com</email>
</author>
<published>2010-10-27T22:34:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3d9e0cf1fe007b88db55d43dfdb6839e1a029ca5'/>
<id>urn:sha1:3d9e0cf1fe007b88db55d43dfdb6839e1a029ca5</id>
<content type='text'>
Separate the finding of a task_struct by pid or tgid from filling the
taskstats data. This makes the code more readable.

Signed-off-by: Michael Holzheu &lt;holzheu@linux.vnet.ibm.com&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>taskstats: separate taskstats commands</title>
<updated>2010-10-28T01:03:17Z</updated>
<author>
<name>Michael Holzheu</name>
<email>holzheu@linux.vnet.ibm.com</email>
</author>
<published>2010-10-27T22:34:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9323312592cca636d7c2580dc85fa4846efa86a2'/>
<id>urn:sha1:9323312592cca636d7c2580dc85fa4846efa86a2</id>
<content type='text'>
Move each taskstats command into a single function.  This makes the code
more readable and makes it easier to add new commands.

Signed-off-by: Michael Holzheu &lt;holzheu@linux.vnet.ibm.com&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>delayacct: align to 8 byte boundary on 64-bit systems</title>
<updated>2010-10-28T01:03:17Z</updated>
<author>
<name>Jeff Mahoney</name>
<email>jeffm@suse.com</email>
</author>
<published>2010-10-27T22:34:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=85893120699f8bae8caa12a8ee18ab5fceac978e'/>
<id>urn:sha1:85893120699f8bae8caa12a8ee18ab5fceac978e</id>
<content type='text'>
prepare_reply() sets up an skb for the response.  The payload contains:

 +--------------------------------+
 | genlmsghdr - 4 bytes           |
 +--------------------------------+
 | NLA header - 4 bytes           | /* Aggregate header */
 +-+------------------------------+
 | | NLA header - 4 bytes         | /* PID header */
 | +------------------------------+
 | | pid/tgid   - 4 bytes         |
 | +------------------------------+
 | | NLA header - 4 bytes         | /* stats header */
 | + -----------------------------+ &lt;- oops. aligned on 4 byte boundary
 | | struct taskstats - 328 bytes |
 +-+------------------------------+

The start of the taskstats struct must be 8 byte aligned on IA64 (and
other systems with 8 byte alignment rules for 64-bit types) or runtime
alignment warnings will be issued.

This patch pads the pid/tgid field out to sizeof(long), which forces the
alignment of taskstats.  The getdelays userspace code is ok with this
since it assumes 32-bit pid/tgid and then honors that header's length
field.

An array is used to avoid exposing kernel memory contents to userspace in
the response.

Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
