<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/arch/x86/kernel/process_32.c, branch v3.2.70</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.70</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.70'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-02-27T18:25:55Z</updated>
<entry>
<title>i387: re-introduce FPU state preloading at context switch time</title>
<updated>2012-02-27T18:25:55Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-02-18T20:56:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9016ec427136d5b5d025948319cf1114dc7734e4'/>
<id>urn:sha1:9016ec427136d5b5d025948319cf1114dc7734e4</id>
<content type='text'>
commit 34ddc81a230b15c0e345b6b253049db731499f7e upstream.

After all the FPU state cleanups and finally finding the problem that
caused all our FPU save/restore problems, this re-introduces the
preloading of FPU state that was removed in commit b3b0870ef3ff ("i387:
do not preload FPU state at task switch time").

However, instead of simply reverting the removal, this reimplements
preloading with several fixes, most notably

 - properly abstracted as a true FPU state switch, rather than as
   open-coded save and restore with various hacks.

   In particular, implementing it as a proper FPU state switch allows us
   to optimize the CR0.TS flag accesses: there is no reason to set the
   TS bit only to then almost immediately clear it again.  CR0 accesses
   are quite slow and expensive, don't flip the bit back and forth for
   no good reason.

 - Make sure that the same model works for both x86-32 and x86-64, so
   that there are no gratuitous differences between the two due to the
   way they save and restore segment state differently due to
   architectural differences that really don't matter to the FPU state.

 - Avoid exposing the "preload" state to the context switch routines,
   and in particular allow the concept of lazy state restore: if nothing
   else has used the FPU in the meantime, and the process is still on
   the same CPU, we can avoid restoring state from memory entirely, just
   re-expose the state that is still in the FPU unit.

   That optimized lazy restore isn't actually implemented here, but the
   infrastructure is set up for it.  Of course, older CPU's that use
   'fnsave' to save the state cannot take advantage of this, since the
   state saving also trashes the state.

In other words, there is now an actual _design_ to the FPU state saving,
rather than just random historical baggage.  Hopefully it's easier to
follow as a result.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>i387: do not preload FPU state at task switch time</title>
<updated>2012-02-27T18:25:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-02-16T23:45:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ba6aaed5cc8f55b77644daf56e9ae3a75f042908'/>
<id>urn:sha1:ba6aaed5cc8f55b77644daf56e9ae3a75f042908</id>
<content type='text'>
commit b3b0870ef3ffed72b92415423da864f440f57ad6 upstream.

Yes, taking the trap to re-load the FPU/MMX state is expensive, but so
is spending several days looking for a bug in the state save/restore
code.  And the preload code has some rather subtle interactions with
both paravirtualization support and segment state restore, so it's not
nearly as simple as it should be.

Also, now that we no longer necessarily depend on a single bit (ie
TS_USEDFPU) for keeping track of the state of the FPU, we migth be able
to do better.  If we are really switching between two processes that
keep touching the FP state, save/restore is inevitable, but in the case
of having one process that does most of the FPU usage, we may actually
be able to do much better than the preloading.

In particular, we may be able to keep track of which CPU the process ran
on last, and also per CPU keep track of which process' FP state that CPU
has.  For modern CPU's that don't destroy the FPU contents on save time,
that would allow us to do a lazy restore by just re-enabling the
existing FPU state - with no restore cost at all!

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2011-10-26T15:03:38Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-10-26T15:03:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7115e3fcf45514db7525a05365b10454ff7f345e'/>
<id>urn:sha1:7115e3fcf45514db7525a05365b10454ff7f345e</id>
<content type='text'>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits)
  perf symbols: Increase symbol KSYM_NAME_LEN size
  perf hists browser: Refuse 'a' hotkey on non symbolic views
  perf ui browser: Use libslang to read keys
  perf tools: Fix tracing info recording
  perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads
  perf hists: Don't consider filtered entries when calculating column widths
  perf hists: Don't decay total_period for filtered entries
  perf hists browser: Honour symbol_conf.show_{nr_samples,total_period}
  perf hists browser: Do not exit on tab key with single event
  perf annotate browser: Don't change selection line when returning from callq
  perf tools: handle endianness of feature bitmap
  perf tools: Add prelink suggestion to dso update message
  perf script: Fix unknown feature comment
  perf hists browser: Apply the dso and thread filters when merging new batches
  perf hists: Move the dso and thread filters from hist_browser
  perf ui browser: Honour the xterm colors
  perf top tui: Give color hints just on the percentage, like on --stdio
  perf ui browser: Make the colors configurable and change the defaults
  perf tui: Remove unneeded call to newtCls on startup
  perf hists: Don't format the percentage on hist_entry__snprintf
  ...

Fix up conflicts in arch/x86/kernel/kprobes.c manually.

Ingo's tree did the insane "add volatile to const array", which just
doesn't make sense ("volatile const"?).  But we could remove the const
*and* make the array volatile to make doubly sure that gcc doesn't
optimize it away..

Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the
reader_lock has been turned into a raw lock by the core locking merge,
and there was a new user of it introduced in this perf core merge.  Make
sure that new use also uses the raw accessor functions.
</content>
</entry>
<entry>
<title>x86, nmi: Add in logic to handle multiple events and unknown NMIs</title>
<updated>2011-10-10T04:57:01Z</updated>
<author>
<name>Don Zickus</name>
<email>dzickus@redhat.com</email>
</author>
<published>2011-09-30T19:06:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b227e23399dc59977aa42c49bd668bdab7a61812'/>
<id>urn:sha1:b227e23399dc59977aa42c49bd668bdab7a61812</id>
<content type='text'>
Previous patches allow the NMI subsystem to process multipe NMI events
in one NMI.  As previously discussed this can cause issues when an event
triggered another NMI but is processed in the current NMI.  This causes the
next NMI to go unprocessed and become an 'unknown' NMI.

To handle this, we first have to flag whether or not the NMI handler handled
more than one event or not.  If it did, then there exists a chance that
the next NMI might be already processed.  Once the NMI is flagged as a
candidate to be swallowed, we next look for a back-to-back NMI condition.

This is determined by looking at the %rip from pt_regs.  If it is the same
as the previous NMI, it is assumed the cpu did not have a chance to jump
back into a non-NMI context and execute code and instead handled another NMI.

If both of those conditions are true then we will swallow any unknown NMI.

There still exists a chance that we accidentally swallow a real unknown NMI,
but for now things seem better.

An optimization has also been added to the nmi notifier rountine.  Because x86
can latch up to one NMI while currently processing an NMI, we don't have to
worry about executing _all_ the handlers in a standalone NMI.  The idea is
if multiple NMIs come in, the second NMI will represent them.  For those
back-to-back NMI cases, we have the potentail to drop NMIs.  Therefore only
execute all the handlers in the second half of a detected back-to-back NMI.

Signed-off-by: Don Zickus &lt;dzickus@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' into for-next</title>
<updated>2011-09-15T13:08:18Z</updated>
<author>
<name>Jiri Kosina</name>
<email>jkosina@suse.cz</email>
</author>
<published>2011-09-15T13:08:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e060c38434b2caa78efe7cedaff4191040b65a15'/>
<id>urn:sha1:e060c38434b2caa78efe7cedaff4191040b65a15</id>
<content type='text'>
Fast-forward merge with Linus to be able to merge patches
based on more recent version of the tree.
</content>
</entry>
<entry>
<title>sched: x86_32 Fix typo in switch_to() description</title>
<updated>2011-09-15T12:13:48Z</updated>
<author>
<name>Kamalesh Babulal</name>
<email>kamalesh@linux.vnet.ibm.com</email>
</author>
<published>2011-04-28T09:02:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ea70ef3d9db1cbfd439f3e3f124c28ef4fe5835d'/>
<id>urn:sha1:ea70ef3d9db1cbfd439f3e3f124c28ef4fe5835d</id>
<content type='text'>
This patch fixes the typo in parameters passed to
x86_32 switch_to() description.

Signed-off-by: Kamalesh Babulal &lt;kamalesh@linux.vnet.ibm.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
</entry>
<entry>
<title>cpuidle: stop depending on pm_idle</title>
<updated>2011-08-03T23:06:37Z</updated>
<author>
<name>Len Brown</name>
<email>len.brown@intel.com</email>
</author>
<published>2011-04-01T23:34:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a0bfa1373859e9d11dc92561a8667588803e42d8'/>
<id>urn:sha1:a0bfa1373859e9d11dc92561a8667588803e42d8</id>
<content type='text'>
cpuidle users should call cpuidle_call_idle() directly
rather than via (pm_idle)() function pointer.

Architecture may choose to continue using (pm_idle)(),
but cpuidle need not depend on it:

  my_arch_cpu_idle()
	...
	if(cpuidle_call_idle())
		pm_idle();

cc: Kevin Hilman &lt;khilman@deeprootsystems.com&gt;
cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
cc: x86@kernel.org
Acked-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Signed-off-by: Len Brown &lt;len.brown@intel.com&gt;
</content>
</entry>
<entry>
<title>exec: delay address limit change until point of no return</title>
<updated>2011-06-09T19:50:05Z</updated>
<author>
<name>Mathias Krause</name>
<email>minipli@googlemail.com</email>
</author>
<published>2011-06-09T18:05:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dac853ae89043f1b7752875300faf614de43c74b'/>
<id>urn:sha1:dac853ae89043f1b7752875300faf614de43c74b</id>
<content type='text'>
Unconditionally changing the address limit to USER_DS and not restoring
it to its old value in the error path is wrong because it prevents us
using kernel memory on repeated calls to this function.  This, in fact,
breaks the fallback of hard coded paths to the init program from being
ever successful if the first candidate fails to load.

With this patch applied switching to USER_DS is delayed until the point
of no return is reached which makes it possible to have a multi-arch
rootfs with one arch specific init binary for each of the (hard coded)
probed paths.

Since the address limit is already set to USER_DS when start_thread()
will be invoked, this redundancy can be safely removed.

Signed-off-by: Mathias Krause &lt;minipli@googlemail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle events from the cpuidle layer</title>
<updated>2011-01-12T23:05:16Z</updated>
<author>
<name>Thomas Renninger</name>
<email>trenn@suse.de</email>
</author>
<published>2011-01-07T10:29:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f77cfe4ea21760268c0277fa3e4b02dfd2a2c2f4'/>
<id>urn:sha1:f77cfe4ea21760268c0277fa3e4b02dfd2a2c2f4</id>
<content type='text'>
Currently intel_idle and acpi_idle driver show double cpu_idle "exit idle"
events -&gt; this patch fixes it and makes cpu_idle events throwing less complex.

It also introduces cpu_idle events for all architectures which use
the cpuidle subsystem, namely:
  - arch/arm/mach-at91/cpuidle.c
  - arch/arm/mach-davinci/cpuidle.c
  - arch/arm/mach-kirkwood/cpuidle.c
  - arch/arm/mach-omap2/cpuidle34xx.c
  - arch/drivers/acpi/processor_idle.c (for all cases, not only mwait)
  - arch/x86/kernel/process.c (did throw events before, but was a mess)
  - drivers/idle/intel_idle.c (did throw events before)

Convention should be:
Fire cpu_idle events inside the current pm_idle function (not somewhere
down the the callee tree) to keep things easy.

Current possible pm_idle functions in X86:
c1e_idle, poll_idle, cpuidle_idle_call, mwait_idle, default_idle
-&gt; this is really easy is now.

This affects userspace:
The type field of the cpu_idle power event can now direclty get
mapped to:
/sys/devices/system/cpu/cpuX/cpuidle/stateX/{name,desc,usage,time,...}
instead of throwing very CPU/mwait specific values.
This change is not visible for the intel_idle driver.
For the acpi_idle driver it should only be visible if the vendor
misses out C-states in his BIOS.
Another (perf timechart) patch reads out cpuidle info of cpu_idle
events from:
/sys/.../cpuidle/stateX/*, then the cpuidle events are mapped
to the correct C-/cpuidle state again, even if e.g. vendors miss
out C-states in their BIOS and for example only export C1 and C3.
-&gt; everything is fine.

Signed-off-by: Thomas Renninger &lt;trenn@suse.de&gt;
CC: Robert Schoene &lt;robert.schoene@tu-dresden.de&gt;
CC: Jean Pihet &lt;j-pihet@ti.com&gt;
CC: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
CC: Ingo Molnar &lt;mingo@elte.hu&gt;
CC: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
CC: linux-pm@lists.linux-foundation.org
CC: linux-acpi@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-perf-users@vger.kernel.org
CC: linux-omap@vger.kernel.org
Signed-off-by: Len Brown &lt;len.brown@intel.com&gt;
</content>
</entry>
<entry>
<title>perf: Clean up power events by introducing new, more generic ones</title>
<updated>2011-01-04T07:16:54Z</updated>
<author>
<name>Thomas Renninger</name>
<email>trenn@suse.de</email>
</author>
<published>2011-01-03T16:50:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=25e41933b58777f2d020c3b0186b430ea004ec28'/>
<id>urn:sha1:25e41933b58777f2d020c3b0186b430ea004ec28</id>
<content type='text'>
Add these new power trace events:

 power:cpu_idle
 power:cpu_frequency
 power:machine_suspend

The old C-state/idle accounting events:
  power:power_start
  power:power_end

Have now a replacement (but we are still keeping the old
tracepoints for compatibility):

  power:cpu_idle

and
  power:power_frequency

is replaced with:
  power:cpu_frequency

power:machine_suspend is newly introduced.

Jean Pihet has a patch integrated into the generic layer
(kernel/power/suspend.c) which will make use of it.

the type= field got removed from both, it was never
used and the type is differed by the event type itself.

perf timechart userspace tool gets adjusted in a separate patch.

Signed-off-by: Thomas Renninger &lt;trenn@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Acked-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Acked-by: Jean Pihet &lt;jean.pihet@newoldbits.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: rjw@sisk.pl
LKML-Reference: &lt;1294073445-14812-3-git-send-email-trenn@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
LKML-Reference: &lt;1290072314-31155-2-git-send-email-trenn@suse.de&gt;
</content>
</entry>
</feed>
