From 95f238eac82907c4ccbc301cd5788e67db0715ce Mon Sep 17 00:00:00 2001
From: Andrew Morton <akpm@osdl.org>
Date: Sun, 11 Apr 2004 23:18:43 -0700
Subject: [PATCH] ia32: 4Kb stacks (and irqstacks) patch

From: Arjan van de Ven <arjanv@redhat.com>

Below is a patch to enable 4Kb stacks for x86. The goal of this is to

1) Reduce footprint per thread so that systems can run many more threads
   (for the java people)

2) Reduce the pressure on the VM for order > 0 allocations. We see real life
   workloads (granted with 2.4 but the fundamental fragmentation issue isn't
   solved in 2.6 and isn't solvable in theory) where this can be a problem.
   In addition, order > 0 allocations can make the VM "stutter" and add
   latency, because it has to do much more work trying to defragment memory.

The first 2 bits of the patch actually affect compiler options in a generic
way: I propose to disable the -funit-at-a-time feature of gcc.  With this
enabled (and it's the default with -O2), gcc will very aggressively inline
functions, which is nice and all for userspace, but for the kernel this makes
us suffer more from a gcc deficiency: gcc is extremely bad at sharing stack
slots.  For example, in a situation like this:

if (some_condition)
	function_A();
else
	function_B();

With -funit-at-a-time, both function_A() and function_B() might get inlined.
However, the stack usage of the parent function then grows by the stack usage
of both functions COMBINED instead of by the maximum of the two.  Even with
the normal 8Kb stacks this is a danger, since we see some functions grow by
3Kb to 4Kb of stack use this way.  With 4Kb stacks, 4Kb of stack usage growth
obviously is deadly ;-( but even with 8Kb stacks it's a pure lottery.

Disabling -funit-at-a-time also exposes another thing in the -mm tree: the
always_inline attribute is considered harmful by the gcc folks, because when
gcc decides NOT to inline a function marked this way, it throws an error.
Disabling -funit-at-a-time disables some of the aggressive inlining (e.g. of
large functions that come later in the .c file), so this would make your tree
not compile.
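
To make the stack-slot concern above concrete, here is a minimal sketch in
plain C.  The buffer sizes, the function bodies and the two growth_*()
helpers are illustrative assumptions and are not part of this patch; they
only show the arithmetic of "combined" versus "maximum of the two".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scratch sizes, chosen only for illustration. */
enum { A_SCRATCH = 2048, B_SCRATCH = 1536 };

/* Each helper needs its own large on-stack scratch buffer. */
static size_t function_A(const char *s)
{
	char scratch[A_SCRATCH];

	strncpy(scratch, s, sizeof(scratch) - 1);
	scratch[sizeof(scratch) - 1] = '\0';
	return strlen(scratch);
}

static size_t function_B(const char *s)
{
	char scratch[B_SCRATCH];

	strncpy(scratch, s, sizeof(scratch) - 1);
	scratch[sizeof(scratch) - 1] = '\0';
	return strlen(scratch) * 2;
}

/* Only one branch ever runs, so only one buffer is live at a time. */
size_t parent(int some_condition, const char *s)
{
	assert(s != NULL);
	if (some_condition)
		return function_A(s);
	else
		return function_B(s);
}

/* Growth of parent()'s frame if both inlined buffers get separate slots. */
size_t growth_without_sharing(void)
{
	return A_SCRATCH + B_SCRATCH;
}

/* Growth of parent()'s frame if the two buffers shared a single slot. */
size_t growth_with_sharing(void)
{
	return A_SCRATCH > B_SCRATCH ? A_SCRATCH : B_SCRATCH;
}
```

With these assumed sizes, inlining both helpers without slot sharing would
add 3584 bytes to parent()'s frame, where 2048 would be enough.  The frame
size a real compiler produces depends on its version and flags, so treat
the numbers as an upper-bound illustration only.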

The 4k stackness of the kernel is included in modversions, so people don't
load 4k-stack modules into 8k-stack kernels.
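
As a rough userspace sketch of how such a check can work: the stack size
becomes part of a version string, and a module is refused when its string
differs from the kernel's.  The names stack_tag(), build_magic() and
module_may_load(), the "4KSTACKS " tag and the string layout are assumptions
made for illustration; this message does not show the kernel's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tag appended when the build uses 4k stacks. */
static const char *stack_tag(int four_k_stacks)
{
	return four_k_stacks ? "4KSTACKS " : "";
}

/* Build a version string that embeds the stack-size tag. */
int build_magic(char *out, size_t len, const char *release, int four_k_stacks)
{
	assert(out != NULL && release != NULL);
	return snprintf(out, len, "%s %s", release, stack_tag(four_k_stacks));
}

/* A module may load only if its string matches the kernel's exactly. */
int module_may_load(const char *kernel_magic, const char *module_magic)
{
	return strcmp(kernel_magic, module_magic) == 0;
}
```

A module built with 4k stacks then carries a different string than an
8k-stack kernel, so the comparison fails and the load is rejected.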

At present 4k stacks are selectable in config.  When the feature has settled
in we should remove the 8k option.  This will break the nvidia modules.  But
Fedora uses 4k stacks so a new nvidia driver is expected soon.
---
 include/linux/compiler-gcc3.h | 2 +-
 include/linux/irq.h           | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

(limited to 'include/linux')

diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h
index c472cac3029d..265dad4c3cb4 100644
--- a/include/linux/compiler-gcc3.h
+++ b/include/linux/compiler-gcc3.h
@@ -3,7 +3,7 @@
 /* These definitions are for GCC v3.x.  */
 #include <linux/compiler-gcc.h>
 
-#if __GNUC_MINOR__ >= 1
+#if __GNUC_MINOR__ >= 1  && __GNUC_MINOR__ < 4
 # define inline		__inline__ __attribute__((always_inline))
 # define __inline__	__inline__ __attribute__((always_inline))
 # define __inline	__inline__ __attribute__((always_inline))
diff --git a/include/linux/irq.h b/include/linux/irq.h
index fa03b836c29a..5bc740d9bc47 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -71,7 +71,6 @@ extern irq_desc_t irq_desc [NR_IRQS];
 
 #include <asm/hw_irq.h> /* the arch dependent stuff */
 
-extern int handle_IRQ_event(unsigned int, struct pt_regs *, struct irqaction *);
 extern int setup_irq(unsigned int , struct irqaction * );
 
 extern hw_irq_controller no_irq_type;  /* needed in every arch ? */
-- 
cgit v1.2.3