<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/kernel_stat.h, branch v3.14.38</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.14.38</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.14.38'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-01-27T16:18:55Z</updated>
<entry>
<title>genirq: Prevent proc race against freeing of irq descriptors</title>
<updated>2015-01-27T16:18:55Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2014-12-11T22:01:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5666a3de7ab455e889cdcecd2128bc316f842df3'/>
<id>urn:sha1:5666a3de7ab455e889cdcecd2128bc316f842df3</id>
<content type='text'>
commit c291ee622165cb2c8d4e7af63fffd499354a23be upstream.

Since the rework of the sparse interrupt code to actually free the
unused interrupt descriptors there exists a race between the /proc
interfaces to the irq subsystem and the code which frees the interrupt
descriptor.

CPU0				CPU1
				show_interrupts()
				  desc = irq_to_desc(X);
free_desc(desc)
  remove_from_radix_tree();
  kfree(desc);
				  raw_spinlock_irq(&amp;desc-&gt;lock);

/proc/interrupts is the only interface which can actively corrupt
kernel memory via the lock access. /proc/stat can only read from freed
memory. Extremly hard to trigger, but possible.

The interfaces in /proc/irq/N/ are not affected by this because the
removal of the proc file is serialized in procfs against concurrent
readers/writers. The removal happens before the descriptor is freed.

For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
as the descriptor is never freed. It's merely cleared out with the irq
descriptor lock held. So any concurrent proc access will either see
the old correct value or the cleared out ones.

Protect the lookup and access to the irq descriptor in
show_interrupts() with the sparse_irq_lock.

Provide kstat_irqs_usr() which is protecting the lookup and access
with sparse_irq_lock and switch /proc/stat to use it.

Document the existing kstat_irqs interfaces so it's clear that the
caller needs to take care about protection. The users of these
interfaces are either not affected due to SPARSE_IRQ=n or already
protected against removal.

Fixes: 1f5a5b87f78f "genirq: Implement a sane sparse_irq allocator"
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Remove GENERIC_HARDIRQ config option</title>
<updated>2013-09-13T13:09:52Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2013-08-30T07:39:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0244ad004a54e39308d495fee0a2e637f8b5c317'/>
<id>urn:sha1:0244ad004a54e39308d495fee0a2e637f8b5c317</id>
<content type='text'>
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS &amp; GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>cputime: Generic on-demand virtual cputime accounting</title>
<updated>2013-01-27T18:23:27Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-07-25T05:56:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=abf917cd91cbb73952758f9741e2fa65002a48ee'/>
<id>urn:sha1:abf917cd91cbb73952758f9741e2fa65002a48ee</id>
<content type='text'>
If we want to stop the tick further idle, we need to be
able to account the cputime without using the tick.

Virtual based cputime accounting solves that problem by
hooking into kernel/user boundaries.

However implementing CONFIG_VIRT_CPU_ACCOUNTING require
low level hooks and involves more overhead. But we already
have a generic context tracking subsystem that is required
for RCU needs by archs which plan to shut down the tick
outside idle.

This patch implements a generic virtual based cputime
accounting that relies on these generic kernel/user hooks.

There are some upsides of doing this:

- This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
if context tracking is already built (already necessary for RCU in full
tickless mode).

- We can rely on the generic context tracking subsystem to dynamically
(de)activate the hooks, so that we can switch anytime between virtual
and tick based accounting. This way we don't have the overhead
of the virtual accounting when the tick is running periodically.

And one downside:

- There is probably more overhead than a native virtual based cputime
accounting. But this relies on hooks that are already set anyway.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Li Zhong &lt;zhong@linux.vnet.ibm.com&gt;
Cc: Namhyung Kim &lt;namhyung.kim@lge.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>vtime: Explicitly account pending user time on process tick</title>
<updated>2012-11-19T15:41:21Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-11-13T22:51:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bcebdf846522056a84ba0b0cba5f5413868c9394'/>
<id>urn:sha1:bcebdf846522056a84ba0b0cba5f5413868c9394</id>
<content type='text'>
All vtime implementations just flush the user time on process
tick. Consolidate that in generic code by calling a user time
accounting helper. This avoids an indirect call in ia64 and
prepare to also consolidate vtime context switch code.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Reviewed-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>vtime: Gather vtime declarations to their own header file</title>
<updated>2012-10-29T20:31:30Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-10-05T21:07:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dcbf832e5823156e8f155359b47bd108cac8ad68'/>
<id>urn:sha1:dcbf832e5823156e8f155359b47bd108cac8ad68</id>
<content type='text'>
These APIs are scattered around and are going to expand a bit.
Let's create a dedicated header file for sanity.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>vtime: Consolidate system/idle context detection</title>
<updated>2012-09-25T13:42:37Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-09-08T14:14:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a7e1a9e3af71b45ecae2dae35851f238117b317d'/>
<id>urn:sha1:a7e1a9e3af71b45ecae2dae35851f238117b317d</id>
<content type='text'>
Move the code that finds out to which context we account the
cputime into generic layer.

Archs that consider the whole time spent in the idle task as idle
time (ia64, powerpc) can rely on the generic vtime_account()
and implement vtime_account_system() and vtime_account_idle(),
letting the generic code to decide when to call which API.

Archs that have their own meaning of idle time, such as s390
that only considers the time spent in CPU low power mode as idle
time, can just override vtime_account().

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>cputime: Use a proper subsystem naming for vtime related APIs</title>
<updated>2012-09-25T13:31:31Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-09-08T13:23:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf9fae9f5e4ca8dce4708812f9ad6281e61df109'/>
<id>urn:sha1:bf9fae9f5e4ca8dce4708812f9ad6281e61df109</id>
<content type='text'>
Use a naming based on vtime as a prefix for virtual based
cputime accounting APIs:

- account_system_vtime() -&gt; vtime_account()
- account_switch_vtime() -&gt; vtime_task_switch()

It makes it easier to allow for further declension such
as vtime_account_system(), vtime_account_idle(), ... if we
want to find out the context we account to from generic code.

This also make it better to know on which subsystem these APIs
refer to.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>cputime: Consolidate vtime handling on context switch</title>
<updated>2012-08-20T11:05:28Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-06-18T15:54:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=baa36046d09ea6dbc122c795566992318663d9eb'/>
<id>urn:sha1:baa36046d09ea6dbc122c795566992318663d9eb</id>
<content type='text'>
The archs that implement virtual cputime accounting all
flush the cputime of a task when it gets descheduled
and sometimes set up some ground initialization for the
next task to account its cputime.

These archs all put their own hooks in their context
switch callbacks and handle the off-case themselves.

Consolidate this by creating a new account_switch_vtime()
callback called in generic code right after a context switch
and that these archs must implement to flush the prev task
cputime and initialize the next task cputime related state.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Acked-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>sched/accounting: Change cpustat fields to an array</title>
<updated>2011-12-06T08:06:38Z</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@parallels.com</email>
</author>
<published>2011-11-28T16:45:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3292beb340c76884427faa1f5d6085719477d889'/>
<id>urn:sha1:3292beb340c76884427faa1f5d6085719477d889</id>
<content type='text'>
This patch changes fields in cpustat from a structure, to an
u64 array. Math gets easier, and the code is more flexible.

Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Paul Tuner &lt;pjt@google.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1322498719-2255-2-git-send-email-glommer@parallels.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>irq: use per_cpu kstat_irqs</title>
<updated>2011-01-14T01:32:31Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-01-13T23:45:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6c9ae009b298753a3baf71298d676a68b5a10c8f'/>
<id>urn:sha1:6c9ae009b298753a3baf71298d676a68b5a10c8f</id>
<content type='text'>
Use modern per_cpu API to increment {soft|hard}irq counters, and use
per_cpu allocation for (struct irq_desc)-&gt;kstats_irq instead of an array.

This gives better SMP/NUMA locality and saves few instructions per irq.

With small nr_cpuids values (8 for example), kstats_irq was a small array
(less than L1_CACHE_BYTES), potentially source of false sharing.

In the !CONFIG_SPARSE_IRQ case, remove the huge, NUMA/cache unfriendly
kstat_irqs_all[NR_IRQS][NR_CPUS] array.

Note: we still populate kstats_irq for all possible irqs in
early_irq_init().  We probably could use on-demand allocations.  (Code
included in alloc_descs()).  Problem is not all IRQS are used with a prior
alloc_descs() call.

kstat_irqs_this_cpu() is not used anymore, remove it.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
