<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/tick.h, branch v3.9.11</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.9.11</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.9.11'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2013-02-04T23:48:46Z</updated>
<entry>
<title>Merge branch 'nohz/printk-v8' into irq/core</title>
<updated>2013-02-04T23:48:46Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2013-02-04T23:48:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=077931446b85e7858bf9dc0927cd116669b965d2'/>
<id>urn:sha1:077931446b85e7858bf9dc0927cd116669b965d2</id>
<content type='text'>
Conflicts:
	kernel/irq_work.c

Add support for printk in full dynticks CPU.

* Don't stop tick with irq works pending. This
fix is generally useful and concerns archs that
can't raise self IPIs.

* Flush irq works before CPU offlining.

* Introduce "lazy" irq works that can wait for the
next tick to be executed, unless it's stopped.

* Implement klogd wake up using irq work. This
removes the ad-hoc printk_tick()/printk_needs_cpu()
hooks and make it working even in dynticks mode.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
</content>
</entry>
<entry>
<title>nohz: Add API to check tick state</title>
<updated>2012-11-17T18:30:18Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-10-11T15:52:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33a5f6261a61af28f7b4c86f9f958da0f206c914'/>
<id>urn:sha1:33a5f6261a61af28f7b4c86f9f958da0f206c914</id>
<content type='text'>
We need some quick way to check if the CPU has stopped
its tick. This will be useful to implement the printk tick
using the irq work subsystem.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>cpuidle: Quickly notice prediction failure for repeat mode</title>
<updated>2012-11-14T23:34:19Z</updated>
<author>
<name>Youquan Song</name>
<email>youquan.song@intel.com</email>
</author>
<published>2012-10-26T10:26:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=69a37beabf1f0a6705c08e879bdd5d82ff6486c4'/>
<id>urn:sha1:69a37beabf1f0a6705c08e879bdd5d82ff6486c4</id>
<content type='text'>
The prediction for future is difficult and when the cpuidle governor prediction
fails and govenor possibly choose the shallower C-state than it should. How to
quickly notice and find the failure becomes important for power saving.

cpuidle menu governor has a method to predict the repeat pattern if there are 8
C-states residency which are continuous and the same or very close, so it will
predict the next C-states residency will keep same residency time.

There is a real case that turbostat utility (tools/power/x86/turbostat)
at kernel 3.3 or early. turbostat utility will read 10 registers one by one at
Sandybridge, so it will generate 10 IPIs to wake up idle CPUs. So cpuidle menu
 governor will predict it is repeat mode and there is another IPI wake up idle
 CPU soon, so it keeps idle CPU stay at C1 state even though CPU is totally
idle. However, in the turbostat, following 10 registers reading is sleep 5
seconds by default, so the idle CPU will keep at C1 for a long time though it is
 idle until break event occurs.
In a idle Sandybridge system, run "./turbostat -v", we will notice that deep
C-state dangles between "70% ~ 99%". After patched the kernel, we will notice
deep C-state stays at &gt;99.98%.

In the patch, a timer is added when menu governor detects a repeat mode and
choose a shallow C-state. The timer is set to a time out value that greater
than predicted time, and we conclude repeat mode prediction failure if timer is
triggered. When repeat mode happens as expected, the timer is not triggered
and CPU waken up from C-states and it will cancel the timer initiatively.
When repeat mode does not happen, the timer will be time out and menu governor
will quickly notice that the repeat mode prediction fails and then re-evaluates
deeper C-states possibility.

Below is another case which will clearly show the patch much benefit:

#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;signal.h&gt;
#include &lt;sys/time.h&gt;
#include &lt;time.h&gt;
#include &lt;pthread.h&gt;

volatile int * shutdown;
volatile long * count;
int delay = 20;
int loop = 8;

void usage(void)
{
	fprintf(stderr,
		"Usage: idle_predict [options]\n"
		"  --help	-h  Print this help\n"
		"  --thread	-n  Thread number\n"
		"  --loop     	-l  Loop times in shallow Cstate\n"
		"  --delay	-t  Sleep time (uS)in shallow Cstate\n");
}

void *simple_loop() {
	int idle_num = 1;
	while (!(*shutdown)) {
		*count = *count + 1;

		if (idle_num % loop)
			usleep(delay);
		else {
			/* sleep 1 second */
			usleep(1000000);
			idle_num = 0;
		}
		idle_num++;
	}

}

static void sighand(int sig)
{
	*shutdown = 1;
}

int main(int argc, char *argv[])
{
	sigset_t sigset;
	int signum = SIGALRM;
	int i, c, er = 0, thread_num = 8;
	pthread_t pt[1024];

	static char optstr[] = "n:l:t:h:";

	while ((c = getopt(argc, argv, optstr)) != EOF)
		switch (c) {
			case 'n':
				thread_num = atoi(optarg);
				break;
			case 'l':
				loop = atoi(optarg);
				break;
			case 't':
				delay = atoi(optarg);
				break;
			case 'h':
			default:
				usage();
				exit(1);
		}

	printf("thread=%d,loop=%d,delay=%d\n",thread_num,loop,delay);
	count = malloc(sizeof(long));
	shutdown = malloc(sizeof(int));
	*count = 0;
	*shutdown = 0;

	sigemptyset(&amp;sigset);
	sigaddset(&amp;sigset, signum);
	sigprocmask (SIG_BLOCK, &amp;sigset, NULL);
	signal(SIGINT, sighand);
	signal(SIGTERM, sighand);

	for(i = 0; i &lt; thread_num ; i++)
		pthread_create(&amp;pt[i], NULL, simple_loop, NULL);

	for (i = 0; i &lt; thread_num; i++)
		pthread_join(pt[i], NULL);

	exit(0);
}

Get powertop V2 from git://github.com/fenrus75/powertop, build powertop.
After build the above test application, then run it.
Test plaform can be Intel Sandybridge or other recent platforms.
#./idle_predict -l 10 &amp;
#./powertop

We will find that deep C-state will dangle between 40%~100% and much time spent
on C1 state. It is because menu governor wrongly predict that repeat mode
is kept, so it will choose the C1 shallow C-state even though it has chance to
sleep 1 second in deep C-state.

While after patched the kernel, we find that deep C-state will keep &gt;99.6%.

Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Youquan Song &lt;youquan.song@intel.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>nohz: Rename ts-&gt;idle_tick to ts-&gt;last_tick</title>
<updated>2012-06-11T18:07:17Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-07-31T15:44:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f5d411c91ede162240f34e05a233f2759412988e'/>
<id>urn:sha1:f5d411c91ede162240f34e05a233f2759412988e</id>
<content type='text'>
Now that idle and nohz logics are going to be independant each others,
ts-&gt;idle_tick becomes too much a biased name to describe the field that
saves the last scheduled tick on top of which we re-calculate the next
tick to schedule when the timer is restarted.

We want to reuse this even to stop the tick outside idle cases. So let's
rename it to some more generic name: ts-&gt;last_tick.

This changes a bit the timer list stat export so we need to increase its
version.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Alessio Igor Bogani &lt;abogani@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Avi Kivity &lt;avi@redhat.com&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: Geoff Levand &lt;geoff@infradead.org&gt;
Cc: Gilad Ben Yossef &lt;gilad@benyossef.com&gt;
Cc: Hakan Akkan &lt;hakanakkan@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Kevin Hilman &lt;khilman@ti.com&gt;
Cc: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Sven-Thorsten Dietrich &lt;thebigcorporation@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()</title>
<updated>2011-12-11T18:31:57Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-11-17T17:48:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1268fbc746ea1cd279886a740dcbad4ba5232225'/>
<id>urn:sha1:1268fbc746ea1cd279886a740dcbad4ba5232225</id>
<content type='text'>
Those two APIs were provided to optimize the calls of
tick_nohz_idle_enter() and rcu_idle_enter() into a single
irq disabled section. This way no interrupt happening in-between would
needlessly process any RCU job.

Now we are talking about an optimization for which benefits
have yet to be measured. Let's start simple and completely decouple
idle rcu and dyntick idle logics to simplify.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Reviewed-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>nohz: Allow rcu extended quiescent state handling seperately from tick stop</title>
<updated>2011-12-11T18:31:36Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-10-08T14:01:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2bbb6817c0ac1b5f2a68d720f364f98eeb1ac4fd'/>
<id>urn:sha1:2bbb6817c0ac1b5f2a68d720f364f98eeb1ac4fd</id>
<content type='text'>
It is assumed that rcu won't be used once we switch to tickless
mode and until we restart the tick. However this is not always
true, as in x86-64 where we dereference the idle notifiers after
the tick is stopped.

To prepare for fixing this, add two new APIs:
tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().

If no use of RCU is made in the idle loop between
tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch
must instead call the new *_norcu() version such that the arch doesn't
need to call rcu_idle_enter() and rcu_idle_exit().

Otherwise the arch must call tick_nohz_enter_idle() and
tick_nohz_exit_idle() and also call explicitly:

- rcu_idle_enter() after its last use of RCU before the CPU is put
to sleep.
- rcu_idle_exit() before the first use of RCU after the CPU is woken
up.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Mike Frysinger &lt;vapier@gentoo.org&gt;
Cc: Guan Xuetao &lt;gxt@mprc.pku.edu.cn&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Hans-Christian Egtvedt &lt;hans-christian.egtvedt@atmel.com&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>nohz: Separate out irq exit and idle loop dyntick logic</title>
<updated>2011-12-11T18:31:35Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-10-07T16:22:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=280f06774afedf849f0b34248ed6aff57d0f6908'/>
<id>urn:sha1:280f06774afedf849f0b34248ed6aff57d0f6908</id>
<content type='text'>
The tick_nohz_stop_sched_tick() function, which tries to delay
the next timer tick as long as possible, can be called from two
places:

- From the idle loop to start the dytick idle mode
- From interrupt exit if we have interrupted the dyntick
idle mode, so that we reprogram the next tick event in
case the irq changed some internal state that requires this
action.

There are only few minor differences between both that
are handled by that function, driven by the ts-&gt;inidle
cpu variable and the inidle parameter. The whole guarantees
that we only update the dyntick mode on irq exit if we actually
interrupted the dyntick idle mode, and that we enter in RCU extended
quiescent state from idle loop entry only.

Split this function into:

- tick_nohz_idle_enter(), which sets ts-&gt;inidle to 1, enters
dynticks idle mode unconditionally if it can, and enters into RCU
extended quiescent state.

- tick_nohz_irq_exit() which only updates the dynticks idle mode
when ts-&gt;inidle is set (ie: if tick_nohz_idle_enter() has been called).

To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
into tick_nohz_idle_exit().

This simplifies the code and micro-optimize the irq exit path (no need
for local_irq_save there). This also prepares for the split between
dynticks and rcu extended quiescent state logics. We'll need this split to
further fix illegal uses of RCU in extended quiescent states in the idle
loop.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Mike Frysinger &lt;vapier@gentoo.org&gt;
Cc: Guan Xuetao &lt;gxt@mprc.pku.edu.cn&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Hans-Christian Egtvedt &lt;hans-christian.egtvedt@atmel.com&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
</content>
</entry>
<entry>
<title>rcu: Track idleness independent of idle tasks</title>
<updated>2011-12-11T18:31:24Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paul.mckenney@linaro.org</email>
</author>
<published>2011-09-30T19:10:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9b2e4f1880b789be1f24f9684f7a54b90310b5c0'/>
<id>urn:sha1:9b2e4f1880b789be1f24f9684f7a54b90310b5c0</id>
<content type='text'>
Earlier versions of RCU used the scheduling-clock tick to detect idleness
by checking for the idle task, but handled idleness differently for
CONFIG_NO_HZ=y.  But there are now a number of uses of RCU read-side
critical sections in the idle task, for example, for tracing.  A more
fine-grained detection of idleness is therefore required.

This commit presses the old dyntick-idle code into full-time service,
so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
always invoked at the beginning of an idle loop iteration.  Similarly,
rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
at the end of an idle-loop iteration.  This allows the idle task to
use RCU everywhere except between consecutive rcu_idle_enter() and
rcu_idle_exit() calls, in turn allowing architecture maintainers to
specify exactly where in the idle loop that RCU may be used.

Because some of the userspace upcall uses can result in what looks
to RCU like half of an interrupt, it is not possible to expect that
the irq_enter() and irq_exit() hooks will give exact counts.  This
patch therefore expands the -&gt;dynticks_nesting counter to 64 bits
and uses two separate bitfields to count process/idle transitions
and interrupt entry/exit transitions.  It is presumed that userspace
upcalls do not happen in the idle loop or from usermode execution
(though usermode might do a system call that results in an upcall).
The counter is hard-reset on each process/idle transition, which
avoids the interrupt entry/exit error from accumulating.  Overflow
is avoided by the 64-bitness of the -&gt;dyntick_nesting counter.

This commit also adds warnings if a non-idle task asks RCU to enter
idle state (and these checks will need some adjustment before applying
Frederic's OS-jitter patches (http://lkml.org/lkml/2011/10/7/246).
In addition, validation of -&gt;dynticks and -&gt;dynticks_nesting is added.

Signed-off-by: Paul E. McKenney &lt;paul.mckenney@linaro.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
</content>
</entry>
<entry>
<title>sched: Intoduce get_cpu_iowait_time_us()</title>
<updated>2010-05-09T17:35:27Z</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2010-05-09T15:25:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0224cf4c5ee0d7faec83956b8e21f7d89e3df3bd'/>
<id>urn:sha1:0224cf4c5ee0d7faec83956b8e21f7d89e3df3bd</id>
<content type='text'>
For the ondemand cpufreq governor, it is desired that the iowait
time is microaccounted in a similar way as idle time is.

This patch introduces the infrastructure to account and expose
this information via the get_cpu_iowait_time_us() function.

[akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build]
Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: davej@redhat.com
LKML-Reference: &lt;20100509082523.284feab6@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched: Eliminate the ts-&gt;idle_lastupdate field</title>
<updated>2010-05-09T17:35:26Z</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2010-05-09T15:24:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e0e37c200f1357db0dd986edb359c41c57d24f6e'/>
<id>urn:sha1:e0e37c200f1357db0dd986edb359c41c57d24f6e</id>
<content type='text'>
Now that the only user of ts-&gt;idle_lastupdate is
update_ts_time_stats(), the entire field can be eliminated.

In update_ts_time_stats(), idle_lastupdate is first set to
"now", and a few lines later, the only user is an if() statement
that assigns a variable either to "now" or to
ts-&gt;idle_lastupdate, which has the value of "now" at that point.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: davej@redhat.com
LKML-Reference: &lt;20100509082439.2fab0b4f@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
</feed>
