|
These two files are now built in arch/powerpc/kernel instead of
arch/ppc64/kernel.
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
Add hardware data breakpoint support.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
Make firmware_has_feature() evaluate at compile time for the non-pSeries
case and tidy up code where possible.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
Create the firmware_has_feature() inline and move the firmware feature
stuff into its own header file.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
The firmware_features field of struct cpu_spec should really be a separate
variable as the firmware features do not depend on the chip and the
bitmask is constructed independently. By removing it, we save 112 bytes
from the cpu_specs array and we access the bitmask directly instead of via
the cur_cpu_spec pointer.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
The following is a patch provided by Ananth Mavinakayanahalli that implements
the new PPC64 specific parts of the new function return probe design.
NOTE: Since getting Ananth's patch, I changed trampoline_probe_handler()
to consume each of the outstanding return probe instances (feedback
on my original RFC after Ananth cut a patch), and also added the
arch_init() function (adding arch specific initialization.) I have
cross-compiled but have not tested this on a PPC64 machine.
Changes include:
* Addition of kretprobe_trampoline to act as a dummy function for instrumented
functions to return to, and for the return probe infrastructure to place
a kprobe on, gaining control so that the return probe handler
can be called, and so that the instruction pointer can be moved back
to the original return address.
* Addition of arch_init(), allowing a kprobe to be registered on
kretprobe_trampoline
* Addition of trampoline_probe_handler() which is used as the pre_handler
for the kprobe inserted on kretprobe_trampoline. This is the function
that handles the details for calling the return probe handler function
and returning control back at the original return address
* Addition of arch_prepare_kretprobe() which is setup as the pre_handler
for a kprobe registered at the beginning of the target function by
kernel/kprobes.c so that a return probe instance can be setup when
a caller enters the target function. (A return probe instance contains
all the needed information for trampoline_probe_handler to do its job.)
* Hooks added to the exit path of a task so that we can cleanup any left-over
return probe instances (i.e. if a task dies while inside a targeted function
then the return probe instance was reserved at the beginning of the function
but the function never returns so we need to mark the instance as unused.)
Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Currently ppc64 has two mm_structs for the kernel, init_mm and also
ioremap_mm. The latter really isn't necessary: this patch abolishes it,
instead restricting vmallocs to the lower 1TB of the init_mm's range and
placing io mappings in the upper 1TB. This simplifies the code in a number
of places and eliminates an unnecessary set of pagetables. It also tweaks
the unmap/free path a little, allowing us to remove the unmap_im_area() set
of page table walkers, replacing them with unmap_vm_area().
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
The iseries has a bar graph on the front panel that shows how busy it is.
The operating system sets and clears a bit in the CTRL register to control
it.
Instead of going to the complexity of using a thread info bit, just set and
clear it in the idle loop.
Also create two helper functions, ppc64_runlatch_on and ppc64_runlatch_off.
Finally don't use the short form of the SPR defines.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch just moves as many EXPORT_SYMBOL()s as possible from
arch/ppc64/kernel/ppc_ksyms.c to where the symbols are defined. This has
been compiled on pSeries, iSeries and pmac.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
POWER5 machines have a per-hardware-thread register which counts at a rate
which is proportional to the percentage of cycles on which the cpu
dispatches an instruction for this thread (if the thread gets all the
dispatch cycles it counts at the same rate as the timebase register). This
register is also context-switched by the hypervisor. Thus it gives a
fine-grained measure of the actual cpu usage by the thread over time.
This patch adds code to read this register every timer interrupt and on
every context switch. The total over all virtual processors is available
through the existing /proc/ppc64/lparcfg file, giving a way to measure the
total cpu usage over the whole partition.
Signed-off-by: Manish Ahuja <ahuja@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Abstract most manual mask checks of cpu_features with cpu_has_feature()
Signed-off-by: Olof Johansson <olof@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch cleans up usage of UTS_RELEASE, by replacing many references
with system_utsname.release, and deleting others. This eliminates a
dependency on version.h for these files, so they don't get rebuilt if
EXTRAVERSION or localversion change.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Here are some changes to the oops printout, stuff that would have been
useful when I was chasing various bugs.
- print out instructions around the fail (3/4 before 1/4 after).
- print out CTR and CR registers, make some space by cutting down XER
(it's only 32 bits)
- always print the DAR and DSISR, they're sometimes useful
- print_modules() like x86
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
{set,clear}_child_tid initialized in copy_process() right after return from
copy_thread().
These vars are not used in cleanup path if copy_thread() fails.
grep -r _child_tid arch/ shows only ia64/kernel/asm-offsets.c,
so I blindly patched the non-i386 archs too.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
There is a race between PTRACE_ATTACH and the real parent calling wait.
For a moment, the task is put in PT_PTRACED but with its parent still
pointing to its real_parent. In this circumstance, if the real parent
calls wait without the WUNTRACED flag, he can see a stopped child status,
which wait should never return without WUNTRACED when the caller is not
using ptrace. Here it is not the caller that is using ptrace, but some
third party.
This patch avoids this race condition by adding the PT_ATTACHED flag to
distinguish a real parent from a ptrace_attach parent when PT_PTRACED is
set, and then having wait use this flag to confirm that things are in order
and not consider the child ptraced when its ->ptrace flags are set but its
parent links have not yet been switched. (ptrace_check_attach also uses it
similarly to rule out a possible race with a bogus ptrace call by the real
parent during ptrace_attach.)
While looking into this, I noticed that every arch's sys_execve has:
current->ptrace &= ~PT_DTRACE;
with no locking at all. So, if an exec happens in a race with
PTRACE_ATTACH, you could wind up with ->ptrace not having PT_PTRACED set
because this store clobbered it. That will cause later BUG hits because
the parent links indicate ptracedness but the flag is not set. The patch
corrects all the places I found to use task_lock around diddling ->ptrace
when it's possible to be racing with ptrace_attach. (The ptrace operation
code itself doesn't have this issue because it already excludes anyone else
being in ptrace_attach.)
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
arch/ppc64/kernel/process.c has an #ifdef CONFIG_ALTIVEC within an #ifdef
CONFIG_ALTIVEC. This patch removes the inner one.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Squash a couple of "pointer from integer" warnings recently introduced.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
There have been reports of problems running UP ppc64 kernels where the
kernel would die in the floating point save/restore code.
It turns out kernel threads that call exec (and so become user tasks) do
not have a valid thread.regs. This means init (pid 1) does not; it also
means anything called out of exec_usermodehelper does not. Once that task
has forked (eg init), then the thread.regs in the new task is correctly
set.
On UP we do lazy save/restore of floating point regs. The SLES9 init is doing
floating point (the debian version of init appears not to). The lack of
thread.regs in init combined with the fact that it does floating point
leads to our lazy FP save/restore code blowing up.
There were other places where this problem exhibited itself in weird and
interesting ways. If a task being exec'ed out of a kernel thread used more
than 1MB of stack, it would be terminated due to the checks in
arch/ppc64/mm/fault.c (looking for a valid thread.regs when extending the
stack). We had a test case using the tux webserver that was failing due to
this.
Since we zero all registers in ELF_PLAT_INIT, I removed the extra memset
in start_thread32.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Since the irq handling rework in 2.5 lots of code in the individual
<asm/hardirq.h> files is the same. This patch moves that common code
to <linux/hardirq.h>. The following differences existed:
- alpha, m68k, m68knommu and v850 were missing the ~PREEMPT_ACTIVE
masking in the CONFIG_PREEMPT case of in_atomic(). These
architectures don't support CONFIG_PREEMPT else this would have been
an easily-spottable bug
- S390 didn't provide synchronize_irq as it doesn't fit into their
I/O model. They now get a spurious prototype/macro
- ppc added a new preemptible() macro that is provided for all
architectures now.
Most drivers were using <linux/interrupt.h> as they should, but a few
drivers and lots of architecture code has been updated to use
<linux/hardirq.h> instead of <asm/hardirq.h>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This has been given basic testing on Power4 pSeries and iSeries machines.
At present, the SLB miss handler has to check the SLB slot it is about to
use to ensure that it does not contain the SLBE for the current kernel
stack - throwing out the SLBE for the kernel stack can trigger the
"megabug": we have no SLBE for the stack, but we don't fault immediately
because we have an ERAT entry for it. The ERAT entry is then lost due to a
tlbie on another CPU during the unrecoverable section of the exception
return path.
This patch implements a different approach - with this patch SLB slot 2
always (well, nearly always) contains an SLBE for the stack. This slot is
never cast out by the normal SLB miss path. On context switch, an SLBE for
the new stack is pinned into this slot, unless the new stack is in the
bolted segment.
For iSeries we need a special workaround because there is no way of
ensuring the stack SLBE is preserved across a shared processor switch. So, we
still need to handle taking an SLB miss on the stack, in which case we must
make sure it is loaded into slot 2, rather than using the normal
round-robin.
This approach shaves a few ns off the SLB miss time (on pSeries), but more
importantly makes it easier to experiment with different SLB castout
approaches without worrying about reinstating the megabug.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Every arch now bears the burden of sanitizing CLONE_IDLETASK out of the
clone_flags passed to do_fork() by userspace. This patch hoists the
masking of CLONE_IDLETASK out of the system call entrypoints into
do_fork(), and thereby removes some small overheads from do_fork(), as
do_fork() may now assume that CLONE_IDLETASK has been cleared.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This rewrites the PPC64 exception entry/exit routines to make them
smaller and faster.
In particular we no longer save all of the registers for the common
exceptions - system calls, hardware interrupts and decrementer (timer)
interrupts - only the volatile registers. The other registers are saved
and restored (if used) by the C functions we call. This involved
changing the registers we use in early exception processing from r20-r23
to r9-r12, which ended up changing quite a lot of code in head.S.
Overall this gives us about a 20% reduction in null syscall time.
Some system calls need all the registers (e.g. fork/clone/vfork and
[rt_]sigsuspend). For these the syscall dispatch code calls a stub that
saves the nonvolatile registers before calling the real handler.
This also implements the force_successful_syscall_return() thing for
ppc64.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This implements CONFIG_PREEMPT for ppc64. Aside from the entry.S
changes to check the _TIF_NEED_RESCHED bit when returning from an
exception, there are various changes to make the ppc64-specific code
preempt-safe, mostly adding preempt_enable/disable or get_cpu/put_cpu
calls where needed. I have been using this on my desktop G5 for the
last week without problems.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
|
|
|
|
|
|
Even with a 16kB stack, we have been seeing stack overflows on PPC64
under stress. This patch implements separate per-cpu stacks for
processing interrupts and softirqs, along the lines of the
CONFIG_4KSTACKS stuff on x86. At the moment the stacks are still 16kB
but I hope we can reduce that to 8kB in future. (Gcc is capable of
adding instructions to the function prolog to check the stack pointer
whenever it moves it downwards, and I want to use that when I try
using 8kB stacks so I can be confident that we aren't overflowing the
stack.)
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
This improves the stack traces we get on PPC64 by putting a marker in
those stack frames that are created as a result of an interrupt or
exception. The marker is "regshere" (0x7265677368657265).
With this, stack traces show where exceptions have occurred, which can
be very useful. This also improves the accuracy of the trace because
the relevant return address can be in the link register at the time of
the exception rather than on the stack. We now print the PC and
exception type for each exception frame, and then the link register if
appropriate as the next item in the trace.
|
|
From: Rusty Russell <rusty@rustcorp.com.au>
1) Create an in_sched_functions() function in sched.c and make the
archs use it. (Two archs have wchan #if 0'd out: left them alone).
2) Move __sched from linux/init.h to linux/sched.h and add comment.
3) Rename __scheduling_functions_start_here/end_here to __sched_text_start/end.
Thanks to wli and Sam Ravnborg for clue donation.
|
|
From: Rusty Russell <rusty@rustcorp.com.au>
Clean up initrd handling.
1) Expose initrd_start and initrd_end to prom.c (replacing its local
initrd_start and initrd_len).
2) Don't hand mem (aka klimit) through functions which don't need it.
3) Add more debugging under DEBUG_PROM in case we broke anything.
|
|
It no longer has any callers.
|
|
From: William Lee Irwin III <wli@holomorphy.com>
This addresses the issue with get_wchan() that the various functions acting
as scheduling-related primitives are not, in fact, contiguous in the text
segment. It creates an ELF section for scheduling primitives to be placed
in, and places currently-detected (i.e. skipped during stack decoding)
scheduling primitives and others like io_schedule() and down(), which are
currently missed by get_wchan() code, into this section also.
The net effects are more reliability of get_wchan()'s results and the new
ability, made use of by this code, to arbitrarily place scheduling
primitives in the source code without disturbing get_wchan()'s accuracy.
Suggestions by Arnd Bergmann and Matthew Wilcox regarding reducing the
invasiveness of the patch were incorporated during prior rounds of review.
I've at least tried to sweep all arches in this patch.
|
|
From: Paul Mackerras <paulus@samba.org>
Recently we found a particularly nasty bug in the segment handling in the
ppc64 kernel. It would only happen rarely under heavy load, but when it
did the machine would lock up with the whole of memory filled with
exception stack frames.
The primary cause was that we were losing the translation for the kernel
stack from the SLB, but we still had it in the ERAT for a while longer.
Now, there is a critical region in various exception exit paths where we
have loaded the SRR0 and SRR1 registers from GPRs and we are loading those
GPRs and the stack pointer from the exception frame on the kernel stack.
If we lose the ERAT entry for the kernel stack in that region, we take an
SLB miss on the next access to the kernel stack. Taking the exception
overwrites the values we have put into SRR0 and SRR1, which means we lose
state. In fact we ended up repeating that last section of the exception
exit path, but using the user stack pointer this time. That caused another
exception (or if it didn't, we loaded a new value from the user stack and
then went around and tried to use that). And it spiralled downwards from
there.
The patch below fixes the primary problem by making sure that we really
never cast out the SLB entry for the kernel stack. It also improves
debuggability in case anything like this happens again by:
- In our exception exit paths, we now check whether the RI bit in the
SRR1 value is 0. We already set the RI bit to 0 before starting the
critical region, but we never checked it. Now, if we do ever get an
exception in one of the critical regions, we will detect it before
returning to the critical region, and instead we will print a nasty
message and oops.
- In the exception entry code, we now check that the kernel stack pointer
value we're about to use isn't a userspace address. If it is, we print a
nasty message and oops.
This has been tested on G5 and pSeries (both with and without hypervisor)
and compile-tested on iSeries.
|
|
From: Anton Blanchard <anton@samba.org>
Add kernel version to oops.
|
|
ppc64 tlb flush rework from Paul Mackerras
Instead of doing a double pass of the pagetables, we batch things
up in the pte flush routines and then shoot the batch down in
flush_tlb_pending.
Our page aging was broken, we never flushed entries out of the ppc64
hashtable. We now flush in ptep_test_and_clear_young.
A number of other things were fixed up in the process:
- change ppc64_tlb_batch to per cpu data
- remove some LPAR debug code
- be more careful with ioremap_mm inits
- clean up arch/ppc64/mm/init.c, create tlb.c
|
|
From: Anton Blanchard <anton@samba.org>
__get_SP used to be a function call which meant we allocated a stack
frame before calling it. This meant the SP it returned was one frame
below the current function. Let's call that bogusSP (and the real one
SP).
The new dump_stack was being tail call optimised so it remained one
frame above bogusSP. dump_stack would then store below SP (as the ABI
allows us to) and would stomp over the back link that bogusSP pointed
to (__get_SP had set the back link up so it worked sometimes, just not
all the time).
Fix this by just making __get_SP an inline that returns the current SP.
|
|
From: Anton Blanchard <anton@samba.org>
The might_sleep infrastructure doesn't like our get_user calls in the backtrace
code; we often end up with might_sleep warnings inside might_sleep warnings.
Instead just be careful about pointers before dereferencing them.
Also remove the hack where we only printed the bottom 32bits of the WCHAN
value.
|
|
- Add the thread_info pointer, it's a useful piece of information.
- Do the kallsyms lookup on the link register
- Remove extra newline on one call to die()
|
|
From: Anton Blanchard <anton@samba.org>
The current SLB handling code has a number of problems:
- We loop trying to find an empty SLB entry before deciding to cast one
out. On large working sets this really hurts since the SLB is always full
and we end up looping through all 64 entries unnecessarily.
- During castout we currently invalidate the entry we are replacing. This
is to avoid a nasty race where the entry is in the ERAT but not the SLB and
another cpu does a tlbie that removes the ERAT at a critical point. If
this race is fixed the SLB can be removed.
- The SLB prefault code doesn't work properly
The following patch addresses all the above concerns and adds some more
optimisations:
- feature nop out some segment table only code
- slb invalidate the kernel segment on context switch (avoids us having to
slb invalidate at each cast out)
- optimise flush on context switch, the lazy tlb stuff avoids it being
called when going from userspace to kernel thread, but it gets called when
going to kernel thread to userspace. In many cases we are returning to the
same userspace task, we now check for this and avoid the flush
- use the optimised POWER4 mtcrf where possible
|
|
From: Anton Blanchard <anton@samba.org>
VMX (Altivec) support & signal32 rework, from Ben Herrenschmidt
|
|
|
|
ELF_CORE_SYNC and dump_smp_unlazy_fpu seem to have been introduced
by Ingo around 2.5.43, but as far as I can tell, never used.
|
|
|
|
|
|
|
|
From: David Mosberger <davidm@napali.hpl.hp.com>
This is an attempt at sanitizing the interface for stack trace dumping
somewhat. It's basically the last thing which prevents 2.5.x from working
out-of-the-box for ia64. ia64 apparently cannot reasonably implement the
show_stack interface declared in sched.h.
Here is the rationale: modern calling conventions don't maintain a frame
pointer and it's not possible to get a reliable stack trace with only a stack
pointer as the starting point. You really need more machine state to start
with. For a while, I thought the solution is to pass a task pointer to
show_stack(), but it turns out that this would negatively impact x86 because
it's sometimes useful to show only portions of a stack trace (e.g., starting
from the point at which a trap occurred). Thus, this patch _adds_ the task
pointer instead:
extern void show_stack(struct task_struct *tsk, unsigned long *sp);
The idea here is that show_stack(tsk, sp) will show the backtrace of task
"tsk", starting from the stack frame that "sp" is pointing to. If tsk is
NULL, the trace will be for the current task. If "sp" is NULL, all stack
frames of the task are shown. If both are NULL, you'll get the full trace of
the current task.
I _think_ this should make everyone happy.
The patch also removes the declaration of show_trace() in linux/sched.h (it
never was a generic function; some platforms, in particular x86, may want to
update accordingly).
Finally, the patch replaces the one call to show_trace_task() with the
equivalent call show_stack(task, NULL).
The patch below is for Alpha and i386, since I can (compile-)test those (I'll
provide the ia64 update through my regular updates). The other arches will
break visibly and updating the code should be trivial:
- add a task pointer argument to show_stack() and pass NULL as the first
argument where needed
- remove show_trace_task()
- declare show_trace() in a platform-specific header file if you really
want to keep it around
|
|
|
|
|
|
|
|
This updates ppc64 for the do_fork() semantics change.
|