path: root/init
Age | Commit message | Author
2005-12-20 | [SPARC64]: Stop putting -finline-limit=XXX into CFLAGS | David S. Miller
It was a stupid workaround for the "static inline" vs. "extern inline" issues of long ago, and it is what causes schedule() to be inlined like crazy into kernel/sched.c when -Os is specified. MIPS and S390 should probably do the same. Now CC_OPTIMIZE_FOR_SIZE can be safely used on sparc64 once more. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-14 | Move size optimization option outside of EMBEDDED menu, mark it EXPERIMENTAL | Linus Torvalds
Also, disable on sparc64 - a number of people report breakage. Probably a compiler bug, but it's quite possible that it tickles some latent kernel problem too. It still defaults to 'y' everywhere else (when enabled through EXPERIMENTAL), and Dave Jones points out that Fedora (and RHEL4) has been building with size optimizations for a long time on x86, x86-64, ia64, s390, s390x, ppc32 and ppc64. So it is really only moderately experimental, but the sparc64 breakage certainly shows that it can trigger "issues". Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-13 | Expose "Optimize for size" option for everybody | Linus Torvalds
Let's put my money where my mouth is. Smaller code is almost always faster, if only because a single I$ miss ends up leaving a lot of cycles to make up for. And system software - kernels in particular - is known for taking more cache misses than most other kinds. On my random config, this made the kernel about 10% smaller, and lmbench seems to say that it's pretty uniformly faster too. Your mileage may vary. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-13 | [PATCH] allow KOBJECT_UEVENT=n only if EMBEDDED | Adrian Bunk
KOBJECT_UEVENT=n seems to be a common pitfall for udev users in 2.6.14. -mm already contains a bigger patch removing this option that is IMHO too big to be applied to 2.6.15-rc now. This patch simply allows KOBJECT_UEVENT=n only if EMBEDDED. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 | [PATCH] sched: disable preempt in idle tasks | Nick Piggin
Run idle threads with preempt disabled.

Also corrected a bug in arm26's cpu_idle (make it actually call schedule()). How did it ever work before?

Might fix the CPU hotplugging hang which Nigel Cunningham noted. We think the bug hits if the idle thread is preempted after checking need_resched() and before going to sleep, and then the CPU is offlined. After calling stop_machine_run, the CPU eventually returns from preemption and into the idle thread and goes to sleep. The CPU will continue executing previous idle and have no chance to call play_dead.

By disabling preemption until we are ready to explicitly schedule, this bug is fixed and the idle threads generally become more robust.

From: alexs <ashepard@u.washington.edu>
PPC build fix

From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
MIPS build fix

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-04 | [BLOCK] Move all core block layer code to new block/ directory | Jens Axboe
drivers/block/ is right now a mix of core and driver parts. Let's move the core parts to a new top level directory. Al will move the fs/ related block parts to block/ next. Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-31 | Revert "i386: move apic init in init_IRQs" | Linus Torvalds
Commit f2b36db692b7ff6972320ad9839ae656a3b0ee3e causes a bootup hang on at least one machine. Revert for now until we understand why. The old code may be ugly, but it works. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 | [PATCH] clarify help text for INIT_ENV_ARG_LIMIT | Randy Dunlap
Try to make the INIT_ENV_ARG_LIMIT help text more readable and understandable. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 | [PATCH] i386: move apic init in init_IRQs | Eric W. Biederman
All kinds of ugliness exists because we don't initialize the apics during init_IRQs.
- We calibrate jiffies in non apic mode even when we are using apics.
- We have to have special code to initialize the apics when non-smp.
- The legacy i8259 must exist and be setup correctly, even when we won't use it past initialization.
- The kexec on panic code must restore the state of the io_apics.
- init/main.c needs a special case for !smp smp_init on x86

In addition to pure code movement I needed a couple of non-obvious changes:
- Move setup_boot_APIC_clock into APIC_late_time_init for simplicity.
- Use cpu_khz to generate a better approximation of loops_per_jiffies so I can verify the timer interrupt is working.
- Call setup_apic_nmi_watchdog again after cpu_khz is initialized on the boot cpu.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 | [PATCH] free initrd mem adjustment | Jan Beulich
Besides freeing initrd memory, also clear out the now dangling pointers to it, to make sure accidental late use attempts can be detected. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 | [PATCH] Add rdinit parameter to pick early userspace init | Olof Johansson
Since early userspace was added, there's no way to override which init to run from it. Some people tack on an extra cpio archive with a link from /init depending on what they want to run, but that's sometimes impractical. Changing the "init=" to also override the early userspace isn't feasible, since it is still used to indicate what init to run from disk when early userspace has completed doing whatever it's doing (i.e. load filesystem modules and drivers). Instead, introduce "rdinit=" and make it override the default "/init" if specified. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
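The split between the two parameters described above can be sketched in a few lines of Python. This is only an illustration of the selection rule in the commit message, not the kernel's parsing code; the function name `pick_init_paths` and the returned dictionary layout are hypothetical.

```python
def pick_init_paths(cmdline: str) -> dict:
    """Return the init programs selected by a kernel command line.

    'early' is what would run from early userspace; 'disk' is what would
    run from the real root filesystem. Name and return layout are
    illustrative assumptions, not kernel interfaces.
    """
    early = "/init"   # default early-userspace init, per the commit message
    disk = None       # None here means "no init= override was given"
    for token in cmdline.split():
        if token.startswith("rdinit="):
            early = token[len("rdinit="):]
        elif token.startswith("init="):
            disk = token[len("init="):]
    return {"early": early, "disk": disk}


if __name__ == "__main__":
    print(pick_init_paths("root=/dev/sda1 rdinit=/bin/sh init=/sbin/myinit"))
```

The point of keeping two separate keys is the one made in the message: "init=" keeps its meaning for the on-disk init, so overriding early userspace needs its own parameter.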
2005-09-07 | [PATCH] Add warning `init=' to init/main.c | Avery, Brian
I passed init=/mylinuxrc to the kernel on the command line. The kernel silently dropped down to exec /sbin/init. It turned out that /mylinuxrc had improper permissions. Without any warning message from the kernel that something was wrong, it took a while to find the issue. The patch below adds a warning. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 | [PATCH] detect soft lockups | Ingo Molnar
This patch adds a new kernel debug feature: CONFIG_DETECT_SOFTLOCKUP. When enabled, per-CPU watchdog threads are started, which try to run once per second. If they get delayed for more than 10 seconds then a callback from the timer interrupt detects this condition and prints out a warning message and a stack dump (once per lockup incident). The feature is otherwise non-intrusive: it doesn't try to unlock the box in any way, it only gets the debug info out, automatically, and on all CPUs affected by the lockup. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-Off-By: Matthias Urlichs <smurf@smurf.noris.de> Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
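The mechanism described above (a per-CPU timestamp refreshed by a watchdog thread and checked from the timer tick) can be modelled roughly as follows. This is a user-space simulation under stated assumptions: the class name, method names, and warning text are hypothetical, and time is passed in explicitly rather than read from a clock.

```python
class SoftLockupDetector:
    """Toy model of per-CPU soft lockup detection (names are illustrative)."""

    def __init__(self, num_cpus, threshold=10.0):
        self.threshold = threshold             # seconds of delay before warning
        self.last_touch = [0.0] * num_cpus     # last time each watchdog thread ran
        self.reported = [False] * num_cpus     # warn only once per incident

    def touch(self, cpu, now):
        """Called by the per-CPU watchdog thread, roughly once per second."""
        self.last_touch[cpu] = now
        self.reported[cpu] = False             # the incident (if any) is over

    def timer_tick(self, cpu, now):
        """Called from the timer interrupt; returns a warning string or None."""
        stalled = now - self.last_touch[cpu]
        if stalled <= self.threshold or self.reported[cpu]:
            return None
        self.reported[cpu] = True
        return f"soft lockup detected on CPU#{cpu} ({stalled:.0f}s)"


if __name__ == "__main__":
    det = SoftLockupDetector(num_cpus=1)
    det.touch(0, now=0.0)
    print(det.timer_tick(0, now=5.0))    # watchdog ran recently
    print(det.timer_tick(0, now=11.0))   # watchdog starved for more than 10s
```

The `reported` flag is what gives the "once per lockup incident" behaviour: the warning is re-armed only when the watchdog thread manages to run again.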
2005-09-06 | Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild | Linus Torvalds
2005-09-02 | [PATCH] remove driverfs references from init/do_mounts.c | Rolf Eike Beer
This patch is against 2.6.10, but still applies cleanly. It's just s/driverfs/sysfs/ in this file. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29 | [NET]: Fix sparse warnings | Arnaldo Carvalho de Melo
Of this type, mostly:

CHECK net/ipv6/netfilter.c
net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static?
net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static?

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-10 | [PATCH] kbuild: automatically append a short string to the version based upon the git commit | Ryan Anderson
If CONFIG_AUTO_LOCALVERSION is set, the user is using a git-based tree, and the current HEAD is not referred to by any tags in .git/refs/tags/, append -g and the first 8 characters of the commit to the version string. This makes it easier to use git-bisect, and/or to do a daily build, without trampling on your older, working builds, or accidentally setting up conflicting sets of modules. Signed-off-by: Ryan Anderson <ryan@michonline.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
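The naming rule in this commit message is small enough to state as a function. The sketch below is not the kbuild script; `local_version_suffix` and its arguments are hypothetical names, and the set of tagged commits is passed in rather than read from .git/refs/tags/.

```python
def local_version_suffix(head_commit: str, tagged_commits: set) -> str:
    """Return the string to append to the kernel version.

    head_commit: full commit id of HEAD (hex string).
    tagged_commits: commit ids that some tag refers to.
    Both parameters are illustrative; the real logic lives in the build scripts.
    """
    if head_commit in tagged_commits:
        return ""                        # tagged release: version stays as is
    return "-g" + head_commit[:8]        # "-g" plus first 8 characters of the commit


if __name__ == "__main__":
    head = "0123456789abcdef0123456789abcdef01234567"
    print("2.6.13-rc6" + local_version_suffix(head, tagged_commits=set()))
```

Two untagged builds from different commits therefore get different version strings, which is what keeps their module directories from colliding.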
2005-08-10 | kconfig: move initramfs options to General Setup | Sam Ravnborg
Move initramfs options from Device Drivers | Block Drivers to General Setup. This is a more natural place for this option. Furthermore, separate out initramfs options to usr/Kconfig. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-07-28 | [PATCH] x86_64: Some cleanup in setup64.c | Andi Kleen
Minor cleanup. Move things into their include files, remove obsolete includes, fix indentation, remove obsolete special cases etc. I also added the per cpu section to asm-generic/sections.h and fixed init/main.c to use it. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 | [PATCH] kernel/cpuset.c: add kerneldoc, fix typos | Randy Dunlap
Add kerneldoc to kernel/cpuset.c. Fix cpuset typos in init/Kconfig. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 | [PATCH] kallsyms: clarify KALLSYMS_ALL help text | Jesper Juhl
Clarify the KALLSYMS_ALL help text slightly. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
2005-07-14 | kbuild: "PREEMPT" in UTS_VERSION | Sam Ravnborg
From: Matt Mackall <mpm@selenic.com> Add PREEMPT to UTS_VERSION where enabled as is done for SMP to make preempt kernels easily identifiable. Added SMP PREEMPT as comment in compile.h to force it to be updated when they change (sam). Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-07-14 | [PATCH] remove EXPORT_SYMBOL for root_dev | Paolo 'Blaisorblade' Giarrusso
Remove ROOT_DEV after unexporting it in the previous patch, as requested some time ago by Christoph Hellwig. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 | [PATCH] name_to_dev_t warning fix | Andrew Morton
kernel/power/disk.c needs a declaration of name_to_dev_t() in scope. mount.h seems like an appropriate choice. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-10 | [SPARC64]: Add syscall auditing support. | David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-30 | [PATCH] Improper initrd failure message at boot time | Jay Lan
On system boot up, there was a failure reported to boot.msg:

<5>Trying to move old root to /initrd ... failed

According to the initrd(4) man page, step #7 of BOOT-UP OPERATION is described as below:

7. If the normal root file has directory /initrd, device /dev/ram0 is moved from / to /initrd. Otherwise if directory /initrd does not exist device /dev/ram0 is unmounted.

We got service calls from customers concerned about this failure message at boot time. Many systems do not have /initrd, and thus the message can be changed in the case of non-existing /initrd so that it does not sound like a failure of the system.

Signed-off-by: Jay Lan <jlan@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 | [PATCH] Tweak idle thread setup semantics | Ingo Molnar
This patch tweaks idle thread setup semantics a bit: instead of setting NEED_RESCHED in init_idle(), we do an explicit schedule() before calling into cpu_idle(). This patch, while having no negative side-effects, enables wider use of cond_resched()s (which might happen in the stock kernel too, but it's particularly important for voluntary-preempt). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 | [PATCH] init/do_mounts_initrd.c: fix sparse warning | Domen Puncer
Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 | [PATCH] Platform SMIs and their interferance with tsc based delay calibration | Venkatesh Pallipadi
Issue:
Current tsc based delay_calibration can result in significant errors in the loops_per_jiffy count when platform events like SMIs (System Management Interrupts, which are non-maskable) are present. This could lead to a potential kernel panic(). This issue is becoming more visible with the 2.6 kernel (as default HZ is 1000) and on platforms with higher SMI handling latencies. During boot, SMIs are mostly used by the BIOS (for things like legacy keyboard emulation).

Description:
The pseudocode for the current delay calibration with tsc based delay looks like:
(0) Estimate a value for loops_per_jiffy
(1) While (loops_per_jiffy estimate is not yet accurate enough)
(2)   wait for jiffy transition (jiffy1)
(3)   Note down current tsc (tsc1)
(4)   loop until tsc becomes tsc1 + loops_per_jiffy
(5)   check whether jiffy changed since jiffy1 or not and refine loops_per_jiffy estimate

Consider the following cases:

Case 1: If SMIs happen between (2) and (3) above, we can end up with a loops_per_jiffy value that is too low. This results in shortened delays, and the kernel can panic() during boot (mostly at IOAPIC timer initialization, in timer_irq_works(), as we don't have enough timer interrupts in a specified interval).

Case 2: If SMIs happen between (3) and (4) above, then we can end up with a loops_per_jiffy value that is too high. And with current i386 code, a too high lpj value (greater than 17M) can result in an overflow in delay.c:__const_udelay(), again resulting in a shorter delay and panic().

Solution:
The patch below makes the calibration routine aware of asynchronous events like SMIs. We increase the delay calibration time and also identify any significant errors (greater than 12.5%) in the calibration and notify the user. The patch below changes both i386 and x86-64 architectures to use this new and improved calibrate_delay_direct() routine.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
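The "greater than 12.5%" error check in the solution can be illustrated with a small sketch. It assumes, as a simplification that the commit message does not spell out, that each measurement yields a lower and an upper bound on the tsc rate; a sample is kept only if the gap between the bounds is under one eighth of the upper bound. All names here are hypothetical and this is not the body of calibrate_delay_direct().

```python
def accept_sample(rate_min: int, rate_max: int) -> bool:
    """Keep a sample only if its uncertainty is below 12.5% (1/8) of rate_max."""
    return (rate_max - rate_min) < (rate_max >> 3)


def calibrate(samples):
    """Average the midpoints of the accepted (rate_min, rate_max) samples.

    Returns (estimate, rejected_count). estimate is None when every sample
    was rejected, which is the case a caller would report to the user.
    """
    good = [(lo + hi) // 2 for lo, hi in samples if accept_sample(lo, hi)]
    rejected = len(samples) - len(good)
    if not good:
        return None, rejected
    return sum(good) // len(good), rejected


if __name__ == "__main__":
    # The middle sample mimics a measurement disturbed by an SMI:
    # its bounds are far apart, so it is discarded.
    print(calibrate([(1000, 1010), (1000, 2000), (990, 1000)]))
```

Taking several samples and discarding the disturbed ones is one plausible reading of "increase the delay calibration time"; the exact sampling scheme is an assumption of this sketch.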
2005-06-21 | [PATCH] node local per-cpu-pages | Christoph Lameter
This patch modifies the way pagesets in struct zone are managed.

Each zone has a per-cpu array of pagesets. So any particular CPU has some memory in each zone structure which belongs to itself, even if that CPU is not local to that zone.

So the patch relocates the pagesets for each cpu to the node that is nearest to the cpu instead of allocating the pagesets in the (possibly remote) target zone. This means that the operations to manage pages on a remote zone can be done with information available locally.

We play a macro trick so that non-NUMA machines avoid the additional pointer chase on the page allocator fastpath.

AIM7 benchmark on a 32 CPU SGI Altix

w/o patches:
Tasks  jobs/min  jti  jobs/min/task    real      cpu
    1    484.68  100       484.6769   12.01     1.97  Fri Mar 25 11:01:42 2005
  100  27140.46   89       271.4046   21.44   148.71  Fri Mar 25 11:02:04 2005
  200  30792.02   82       153.9601   37.80   296.72  Fri Mar 25 11:02:42 2005
  300  32209.27   81       107.3642   54.21   451.34  Fri Mar 25 11:03:37 2005
  400  34962.83   78        87.4071   66.59   588.97  Fri Mar 25 11:04:44 2005
  500  31676.92   75        63.3538   91.87   742.71  Fri Mar 25 11:06:16 2005
  600  36032.69   73        60.0545   96.91   885.44  Fri Mar 25 11:07:54 2005
  700  35540.43   77        50.7720  114.63  1024.28  Fri Mar 25 11:09:49 2005
  800  33906.70   74        42.3834  137.32  1181.65  Fri Mar 25 11:12:06 2005
  900  34120.67   73        37.9119  153.51  1325.26  Fri Mar 25 11:14:41 2005
 1000  34802.37   74        34.8024  167.23  1465.26  Fri Mar 25 11:17:28 2005

with slab API changes and pageset patch:
Tasks  jobs/min  jti  jobs/min/task    real      cpu
    1    485.00  100       485.0000   12.00     1.96  Fri Mar 25 11:46:18 2005
  100  28000.96   89       280.0096   20.79   150.45  Fri Mar 25 11:46:39 2005
  200  32285.80   79       161.4290   36.05   293.37  Fri Mar 25 11:47:16 2005
  300  40424.15   84       134.7472   43.19   438.42  Fri Mar 25 11:47:59 2005
  400  39155.01   79        97.8875   59.46   590.05  Fri Mar 25 11:48:59 2005
  500  37881.25   82        75.7625   76.82   730.19  Fri Mar 25 11:50:16 2005
  600  39083.14   78        65.1386   89.35   872.79  Fri Mar 25 11:51:46 2005
  700  38627.83   77        55.1826  105.47  1022.46  Fri Mar 25 11:53:32 2005
  800  39631.94   78        49.5399  117.48  1169.94  Fri Mar 25 11:55:30 2005
  900  36903.70   79        41.0041  141.94  1310.78  Fri Mar 25 11:57:53 2005
 1000  36201.23   77        36.2012  160.77  1458.31  Fri Mar 25 12:00:34 2005

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Shai Fultheim <Shai@Scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-02 | Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git | David Woodhouse
2005-05-28 | [PATCH] uml: add modversions support | Paolo 'Blaisorblade' Giarrusso
Actually, the real support was added by some earlier patches. Now we simply re-enable the config. option. I've actually tested it and it works well. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-11 | Audit requires CONFIG_NET | Chris Wright
Audit now actually requires netlink. So make it depend on CONFIG_NET, and remove the inline dependencies on CONFIG_NET. Signed-off-by: Chris Wright <chrisw@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-05-08 | Add CONFIG_AUDITSC and CONFIG_SECCOMP support for ppc32 | David Woodhouse
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-05-03 | Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git | David Woodhouse
2005-05-03 | [AUDIT] Update UML audit-syscall-{entry,exit} calls to new prototypes | Jeff Dike
This patch is for -mm only. It should probably be included in git-audit, and should be forwarded to Linus iff git-audit is. It updates the audit-syscall-{entry,exit} calls to current -mm. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-05-01 | [PATCH] clean up kernel messages | Matt Mackall
Arrange for all kernel printks to be no-ops. Only available if CONFIG_EMBEDDED. This patch saves about 375k on my laptop config and nearly 100k on minimal configs. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 | [PATCH] remove all kernel BUGs | Matt Mackall
This patch eliminates all kernel BUGs, trims about 35k off the typical kernel, and makes the system slightly faster. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-28 | [PATCH] uml: disable more hardware kconfig opt and rename USERMODE to UML | Paolo 'Blaisorblade' Giarrusso
Disable some hardware-only configuration options when configuring for ARCH=um. By the way, we rename CONFIG_USERMODE to CONFIG_UML, as requested some time ago by the UML maintainer Jeff Dike. We also update defconfig as a consequence of all this. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-28 | [PATCH] uml: extend cmd line limits | Paolo 'Blaisorblade' Giarrusso
From: "Catalin(ux aka Dino) BOIE" <util@deuroconsult.ro>, Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>, Jeff Dike <jdike@addtoit.com> Increase UML command line size. And fix a crash from passing an overly-long command line to UML. XXX: check that init can handle 128 params and 128 env. var. The original patch set this limit to 256, but it seems to me too much. Think! Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-13 | [PATCH] smp{,boot}.c cleanups | Adrian Bunk
This patch contains the following cleanups on several architectures:
- make some needlessly global code static
- remove the following write-only (except for printk's) variables:
  - cache_decay_ticks
  - smp_threads_ready
  - cacheflush_time

I've only tried the compilation on i386, but I hope all mistakes I made are on unimportant architectures. ;-)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-13 | [PATCH] Make loglevels in init/main.c a little more sane. | Jesper Juhl
This patch modifies a few of the printk() loglevels used in init/main.c in an attempt to make them a bit more appropriate. The default loglevel is KERN_WARNING, but a few printk's without explicit loglevel are not (in my opinion) warnings, so add proper loglevels. For instance, telling the user how many CPUs were brought up is hardly a warning, so make it KERN_INFO instead. The initial printing of linux_banner is not a warning condition, I'd say it's more of a NOTICE or even INFO condition - I've made it KERN_NOTICE, just like the printing of the kernel command line. A few printk's without explicit loglevel do match the default one, but I've made them explicit (the default could change in the future, and if it does then explicitly setting the proper loglevel is a nice thing). Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-13 | [PATCH] CONFIG_BASE_FULL help clarification | Matt Mackall
Clarify the BASE_FULL help text. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-13 | [PATCH] swsusp: enable resume from initrd | Pavel Machek
From: <mjg59@scrf.ucam.org> When using a fully modularized kernel it is necessary to activate resume manually, as the device node might not be available during kernel init. This patch implements a new sysfs attribute '/sys/power/resume' which allows for manual activation of software resume. When read from, it prints the configured resume device in 'major:minor' format. When written to, it expects a device in 'major:minor' format. This device is then checked for a suspended image, and resume is started if a valid image is found. The original functionality is left in place. It should be used from initramfs, or with care. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
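The 'major:minor' text format used by the attribute can be handled with a pair of helpers like the ones below. These are user-space illustrations only; the function names are hypothetical and nothing here touches /sys/power/resume itself.

```python
def parse_resume_device(text: str):
    """Parse a 'major:minor' string into a (major, minor) tuple of ints."""
    major_str, sep, minor_str = text.strip().partition(":")
    if not sep:
        raise ValueError(f"expected 'major:minor', got {text!r}")
    return int(major_str), int(minor_str)


def format_resume_device(major: int, minor: int) -> str:
    """Format a device number the way the attribute is described as printing it."""
    return f"{major}:{minor}"


if __name__ == "__main__":
    # 8:2 is only an example device number.
    print(parse_resume_device("8:2\n"))
    print(format_resume_device(8, 2))
```

An initramfs script would then write the formatted string to the attribute once the device's driver module has been loaded.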
2005-03-13 | Kconfig: cleanup kernel hacking menu | Sam Ravnborg
o This properly indents the kernel hacking menu.
o Move LOG_BUF_SHIFT into kernel hacking menu (it already depended on DEBUG_KERNEL).
o Add DEBUG_KERNEL dependency to EARLY_PRINTK, DEBUG_PREEMPT and FRAME_POINTER.
o Remove overlong dependency, which included practically every arch.
o Merge the two MAGIC_SYSRQ menu entries.
o Remove unnecessary "default n" options.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-03-09 | [PATCH] cpusets - big numa cpu and memory placement | Paul Jackson
This is my cpuset patch, with the following changes in the last two weeks:

1) Updated to 2.6.8.1-mm1
2) [Simon Derr <Simon.Derr@bull.net>] Fix new cpuset to begin empty, not copied from parent. Needed to avoid breaking exclusive property.
3) [Dinakar Guniguntala <dino@in.ibm.com>] Finish initializing top cpuset from cpu_possible_map after smp_init() called.
4) [Paul Jackson <pj@sgi.com>] Check on each call to __alloc_pages() if the current task's cpuset mems_allowed has changed. Use a cpuset generation number, bumped on any cpuset memory placement change, to make this check efficient. Update the task's mems_allowed from its cpuset, if the cpuset has changed.
5) [Paul Jackson <pj@sgi.com>] If a task is moved to another cpuset, then update its cpus_allowed, using set_cpus_allowed().
6) [Paul Jackson <pj@sgi.com>] Update Documentation/cpusets.txt to reflect above changes (4) and (5).

I continue to recommend the following patch for inclusion in your 2.6.9-*mm series, when that opens. It provides an important facility for high performance computing on large systems. Simon Derr of Bull (France) and myself are the primary authors. Erich Focht has indicated that NEC is also a potential user of this patch on the TX-7 NUMA machines, and that he "would very much welcome the inclusion of cpusets."

I offer this update to lkml, in order to invite continued feedback.

The one prerequisite patch for this cpuset patch was just posted before this one. That was a patch to provide a new bitmap list format, of which cpusets is the first user.

This patch has been built on top of 2.6.8.1-mm1, for the arch's:

i386 x86_64 sparc ia64 powerpc-405 powerpc-750 sparc64

with and without CONFIG_CPUSET. It has been booted and tested on ia64 (sn2_defconfig, SN2 hardware). The 'alpha' arch also built, except for what seems to be an unrelated toolchain problem (crosstool ld sigsegv) in the final link step.

===

Cpusets provide a mechanism for assigning a set of CPUs and Memory Nodes to a set of tasks.

Cpusets constrain the CPU and Memory placement of tasks to only the processor and memory resources within a task's current cpuset. They form a nested hierarchy visible in a virtual file system. These are the essential hooks, beyond what is already present, required to manage dynamic job placement on large systems.

Cpusets require small kernel hooks in init, exit, fork, mempolicy, sched_setaffinity, page_alloc and vmscan. And they require a "struct cpuset" pointer, a cpuset_mems_generation, and a "mems_allowed" nodemask_t (to go along with the "cpus_allowed" cpumask_t that's already there) in each task struct.

These hooks:
1) establish and propagate cpusets,
2) enforce CPU placement in sched_setaffinity,
3) enforce Memory placement in mbind and sys_set_mempolicy,
4) restrict page allocation and scanning to mems_allowed, and
5) restrict migration and set_cpus_allowed to cpus_allowed.

The other required hook, restricting task scheduling to CPUs in a task's cpus_allowed mask, is already present.

Cpusets extend the usefulness of the existing placement support that was added to Linux 2.6 kernels: sched_setaffinity() for CPU placement, and mbind() and set_mempolicy() for memory placement. On smaller or dedicated use systems, the existing calls are often sufficient.

On larger NUMA systems, running more than one performance-critical job, it is necessary to be able to manage jobs in their entirety. This includes providing a job with exclusive CPU and memory that no other job can use, and being able to list all tasks currently in a cpuset.

A given job running within a cpuset would likely use the existing placement calls to manage its CPU and memory placement in more detail.

Cpusets are named, nested sets of CPUs and Memory Nodes. Each cpuset is represented by a directory in the cpuset virtual file system, normally mounted at /dev/cpuset.

Each cpuset directory provides the following files, which can be read and written:

cpus: List of CPUs allowed to tasks in that cpuset.
mems: List of Memory Nodes allowed to tasks in that cpuset.
tasks: List of pid's of tasks in that cpuset.
cpu_exclusive: Flag (0 or 1) - if set, cpuset has exclusive use of its CPUs (no sibling or cousin cpuset may overlap CPUs).
mem_exclusive: Flag (0 or 1) - if set, cpuset has exclusive use of its Memory Nodes (no sibling or cousin may overlap).
notify_on_release: Flag (0 or 1) - if set, then /sbin/cpuset_release_agent will be invoked, with the name (/dev/cpuset relative path) of that cpuset in argv[1], when the last user of it (task or child cpuset) goes away. This supports automatic cleanup of abandoned cpusets.

In addition one new filetype is added to the /proc file system:

/proc/<pid>/cpuset: For each task (pid), list its cpuset path, relative to the root of the cpuset file system. This file is read-only.

New cpusets are created using 'mkdir' (at the shell or in C). Old ones are removed using 'rmdir'. The above files are accessed using read(2) and write(2) system calls, or shell commands such as 'cat' and 'echo'.

The CPUs and Memory Nodes in a given cpuset are always a subset of its parent. The root cpuset has all possible CPUs and Memory Nodes in the system. A cpuset may be exclusive (cpu or memory) only if its parent is similarly exclusive.

See further Documentation/cpusets.txt, at the top of the following patch.

/proc interface:

It is useful, when learning and making new uses of cpusets and placement, to be able to see the current values of a task's cpus_allowed and mems_allowed, which are the actual placement used by the kernel scheduler and memory allocator.

The cpus_allowed and mems_allowed values are needed by user space apps that are micromanaging placement, such as when moving an app to a obtained by that app within its cpuset using sched_setaffinity, mbind and set_mempolicy.

The cpus_allowed value is also available via the sched_getaffinity system call. But since the entire rest of the cpuset API, including the display of mems_allowed added here, is via an ascii style presentation in /proc and /dev/cpuset, it is worth the extra couple lines of code to display cpus_allowed in the same way.

This patch adds the display of these two fields to the 'status' file in the /proc/<pid> directory of each task. The fields are only added if CONFIG_CPUSETS is enabled (which is also needed to define the mems_allowed field of each task). The new output lines look like:

$ tail -2 /proc/1/status
Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff
Mems_allowed: ffffffff,ffffffff

Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
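The Cpus_allowed and Mems_allowed lines shown above are hexadecimal bitmasks printed as comma-separated 32-bit groups, most significant group first. A user-space tool could turn such a mask into a list of CPU or node numbers roughly as follows; the function name is hypothetical and the sketch assumes every group after the first is zero-padded to 8 hex digits, as in the sample output.

```python
def ids_from_mask(mask: str) -> list:
    """Convert a mask such as 'ffffffff,ffffffff' into a list of set bit numbers.

    Bit 0 is the least significant bit of the last (rightmost) group.
    """
    value = int(mask.replace(",", ""), 16)   # groups concatenate into one number
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    print(ids_from_mask("00000000,00000005"))        # bits 0 and 2 are set
    print(len(ids_from_mask("ffffffff,ffffffff")))   # number of allowed nodes
```

For the sample status output above, the Mems_allowed mask would expand to 64 node numbers and the Cpus_allowed mask to 128 CPU numbers.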
2005-03-09 | [PATCH] Properly share process and session keyrings with CLONE_THREAD [try #2] | David Howells
The attached patch causes process and session keyrings to be shared properly when CLONE_THREAD is in force. It does this by moving the keyring pointers into struct signal_struct[*]. [*] I have a patch to rename this to struct thread_group that I'll revisit after the advent of 2.6.11. Furthermore, once this patch is applied, process keyrings will no longer be allocated at fork, but will instead only be allocated when needed. Allocating them at fork was a way of half getting around the sharing across threads problem, but that's no longer necessary. This revision of the patch has the documentation changes patch rolled into it and no longer abstracts the locking for signal_struct into a pair of macros. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-07 | [PATCH] lib/sort: Heapsort implementation of sort() | Matt Mackall
This patch adds a generic array sorting library routine. This is meant to replace qsort, which has two problem areas for kernel use.

The first issue is quadratic worst-case performance. While quicksort worst-case datasets are rarely encountered in normal scenarios, it is in fact quite easy to construct worst cases for almost all quicksort algorithms given source or access to an element comparison callback. This could allow attackers to cause sorts that would otherwise take less than a millisecond to take seconds and sorts that should take less than a second to take weeks or months. Fixing this problem requires randomizing pivot selection with a secure random number generator, which is rather expensive.

The second is that quicksort's recursion tracking requires either nontrivial amounts of stack space or dynamic memory allocation and out of memory error handling.

By comparison, heapsort has both O(n log n) average and worst-case performance and practically no extra storage requirements. This version runs within 70-90% of the average performance of optimized quicksort so it should be an acceptable replacement wherever quicksort would be used in the kernel.

Note that this function has an extra parameter for passing in an optimized swapping function. This is worth 10% or more over the typical byte-by-byte exchange functions.

Benchmarks:

  qsort:     glibc variant           1189 bytes (+ 256/1024 stack)
  qsort_3f:  my simplified variant    459 bytes (+ 256/1024 stack)
  heapsort:  the version below        346 bytes
  shellsort: an optimized shellsort   196 bytes

              P4 1.8GHz             Opteron 1.4GHz (32-bit)
size algorithm   cycles relative       cycles relative
100:
  qsort:          38682  100.00%        27631  100.00%
  qsort_3f:       36277  106.63%        22406  123.32%
  heapsort:       43574   88.77%        30301   91.19%
  shellsort:      39087   98.97%        25139  109.91%
200:
  qsort:          86468  100.00%        61148  100.00%
  qsort_3f:       78918  109.57%        48959  124.90%
  heapsort:       98040   88.20%        68235   89.61%
  shellsort:      95688   90.36%        62279   98.18%
400:
  qsort:         187720  100.00%       131313  100.00%
  qsort_3f:      174905  107.33%       107954  121.64%
  heapsort:      223896   83.84%       154241   85.13%
  shellsort:     223037   84.17%       148990   88.14%
800:
  qsort:         407060  100.00%       287460  100.00%
  qsort_3f:      385106  105.70%       239131  120.21%
  heapsort:      484662   83.99%       340099   84.52%
  shellsort:     537110   75.79%       354755   81.03%
1600:
  qsort:         879596  100.00%       621331  100.00%
  qsort_3f:      861568  102.09%       522013  119.03%
  heapsort:     1079750   81.46%       746677   83.21%
  shellsort:    1234243   71.27%       820782   75.70%
3200:
  qsort:        1903902  100.00%      1342126  100.00%
  qsort_3f:     1908816   99.74%      1131496  118.62%
  heapsort:     2515493   75.69%      1630333   82.32%
  shellsort:    2985339   63.78%      1964794   68.31%
6400:
  qsort:        4046370  100.00%      2909215  100.00%
  qsort_3f:     4164468   97.16%      2468393  117.86%
  heapsort:     5150659   78.56%      3533585   82.33%
  shellsort:    6650225   60.85%      4429849   65.67%
12800:
  qsort:        8729730  100.00%      6185097  100.00%
  qsort_3f:     8776885   99.46%      5288826  116.95%
  heapsort:    11064224   78.90%      7603061   81.35%
  shellsort:   15487905   56.36%     10305163   60.02%
25600:
  qsort:       18357770  100.00%     13172205  100.00%
  qsort_3f:    18687842   98.23%     11337115  116.19%
  heapsort:    24121241   76.11%     16612122   79.29%
  shellsort:   35552814   51.64%     24106987   54.64%
51200:
  qsort:       38658883  100.00%     28008505  100.00%
  qsort_3f:    39498463   97.87%     24339675  115.07%
  heapsort:    50553552   76.47%     37013828   75.67%
  shellsort:   82602416   46.80%     56201889   49.84%
102400:
  qsort:       81197794  100.00%     58918933  100.00%
  qsort_3f:    84257930   96.37%     51986219  113.34%
  heapsort:   110540577   73.46%     81419675   72.36%
  shellsort:  191303132   42.44%    129786472   45.40%

From: Zou Nan hai <nanhai.zou@intel.com>

The new sort routine only works if there are an even number of entries in the ia64 exception fix-up tables. If the number of entries is odd the sort fails, and then random get_user/put_user calls can fail.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
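
To illustrate the technique described in this commit, here is a minimal user-space sketch of an in-place heapsort that takes both a comparison callback and an optional swap callback. It is not the kernel's lib/sort.c code: the names heapsort_sketch, byte_swap, int_swap and cmp_int are hypothetical, and the exact parameter types are assumptions. It uses no recursion and no extra storage, and it works for odd as well as even element counts.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cmp_fn)(const void *, const void *);
typedef void (*swap_fn)(void *, void *, size_t);

/* Generic fallback: exchange two elements one byte at a time. */
void byte_swap(void *a, void *b, size_t size)
{
    unsigned char *x = a, *y = b;

    while (size--) {
        unsigned char t = *x;
        *x++ = *y;
        *y++ = t;
    }
}

/* Optimized swap for int-sized elements; ignores the size argument. */
void int_swap(void *a, void *b, size_t size)
{
    int t = *(int *)a;

    (void)size;
    *(int *)a = *(int *)b;
    *(int *)b = t;
}

/* Three-way comparison of two ints without overflow. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Restore the max-heap property for the subtree rooted at 'root'. */
static void sift_down(unsigned char *base, size_t root, size_t n,
                      size_t size, cmp_fn cmp, swap_fn swap)
{
    for (;;) {
        size_t child = 2 * root + 1;

        if (child >= n)
            break;
        /* Pick the larger of the two children, if a right child exists. */
        if (child + 1 < n &&
            cmp(base + child * size, base + (child + 1) * size) < 0)
            child++;
        if (cmp(base + root * size, base + child * size) >= 0)
            break;
        swap(base + root * size, base + child * size, size);
        root = child;
    }
}

/* Sort n elements of 'size' bytes in ascending order; swap may be NULL. */
void heapsort_sketch(void *base, size_t n, size_t size,
                     cmp_fn cmp, swap_fn swap)
{
    unsigned char *b = base;

    assert(size > 0 && cmp != NULL);
    if (swap == NULL)
        swap = byte_swap;
    if (n < 2)
        return;

    /* Build the heap bottom-up, starting at the last parent node. */
    for (size_t i = n / 2; i-- > 0; )
        sift_down(b, i, n, size, cmp, swap);

    /* Move the current maximum to the end, then shrink the heap. */
    for (size_t end = n - 1; end > 0; end--) {
        swap(b, b + end * size, size);
        sift_down(b, 0, end, size, cmp, swap);
    }
}
```

Passing int_swap instead of NULL replaces the byte-by-byte exchange with a single word exchange, which is the kind of saving the commit message attributes to the extra parameter.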
2005-03-07  [PATCH] base-small: introduce the CONFIG_BASE_SMALL flag  Matt Mackall
This patch series introduces a new pair of CONFIG_EMBEDDED options called CONFIG_BASE_FULL/CONFIG_BASE_SMALL. Disabling CONFIG_BASE_FULL sets the boolean CONFIG_BASE_SMALL to 1 and it is used to shrink a number of core data structures. The space savings for the current batch are around 14k.

This patch:

Add CONFIG_BASE_SMALL for miscellaneous core size reductions that don't warrant their own options. Example users to follow.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
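
A flag like this is typically consumed in a constant expression that picks a smaller size for a core table. The sketch below shows one possible pattern and is not taken from the patch series: EXAMPLE_HASH_BUCKETS, the sizes 16 and 256, and the helper functions are all hypothetical. It also assumes that CONFIG_BASE_SMALL is always defined as 0 or 1; the fallback definition exists only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the build normally defines this as 0 or 1. */
#ifndef CONFIG_BASE_SMALL
#define CONFIG_BASE_SMALL 0
#endif

/* Hypothetical table size: 16 buckets when small, 256 otherwise. */
#define EXAMPLE_HASH_BUCKETS (CONFIG_BASE_SMALL ? 16 : 256)

static unsigned example_hash_table[EXAMPLE_HASH_BUCKETS];

/* Number of buckets actually compiled into the table. */
size_t example_hash_buckets(void)
{
    return sizeof(example_hash_table) / sizeof(example_hash_table[0]);
}

/* Count one occurrence of 'key' and return its bucket index. */
size_t example_hash_add(unsigned key)
{
    size_t bucket = key % EXAMPLE_HASH_BUCKETS;

    assert(bucket < example_hash_buckets());
    example_hash_table[bucket]++;
    return bucket;
}

/* Read back the counter stored in a bucket. */
unsigned example_hash_count(size_t bucket)
{
    assert(bucket < example_hash_buckets());
    return example_hash_table[bucket];
}
```

Because the flag is a plain integer in this sketch, it can be used directly in C expressions rather than in #ifdef blocks, so the same source line serves both configurations.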
2005-03-07  [PATCH] explicitly bind idle tasks  Nathan T. Lynch
With hotplug cpu and preempt, we tend to see smp_processor_id warnings from idle loop code because it's always checking whether its cpu has gone offline. Replacing every use of smp_processor_id with _smp_processor_id in all idle loop code is one solution; another way is explicitly binding idle threads to their cpus (the smp_processor_id warning does not fire if the caller is bound only to the calling cpu). This has the (admittedly slight) advantage of letting us know if an idle thread ever runs on the wrong cpu.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>