summaryrefslogtreecommitdiff
path: root/kernel/sysctl.c
AgeCommit message (Collapse)Author
2005-12-31sysctl: make sure to terminate strings with a NULLinus Torvalds
This is a slightly more complete fix for the previous minimal sysctl string fix. It always terminates the returned string with a NUL, even if the full result wouldn't fit in the user-supplied buffer. The returned length is the full untruncated length, so that you can tell when truncation has occurred. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-30[PATCH] Fix false old value return of sysctlYi Yang
For the sysctl syscall, if the user wants to get the old value of a sysctl entry and set a new value for it in the same syscall, the old value is always overwritten by the new value if the sysctl entry is of string type and if the user sets its strategy to sysctl_string. This issue lies in the strategy being run twice if the strategy is set to sysctl_string, the general strategy sysctl_string always returns 0 if success. Such strategy routines as sysctl_jiffies and sysctl_jiffies_ms return 1 because they do read and write for the sysctl entry. The strategy routine sysctl_string return 0 although it actually read and write the sysctl entry. According to my analysis, if a strategy routine do read and write, it should return 1, if it just does some necessary check but not read and write, it should return 0, for example sysctl_intvec. Signed-off-by: Yi Yang <yang.y.yi@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-30sysctl: don't overflow the user-supplied buffer with '\0'Linus Torvalds
If the string was too long to fit in the user-supplied buffer, the sysctl layer would zero-terminate it by writing past the end of the buffer. Don't do that. Noticed by Yi Yang <yang.y.yi@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-08[PATCH] Fix sysctl unregistration oops (CVE-2005-2709)Al Viro
You could open the /proc/sys/net/ipv4/conf/<if>/<whatever> file, then wait for interface to go away, try to grab as much memory as possible in hope to hit the (kfreed) ctl_table. Then fill it with pointers to your function. Then do read from file you've opened and if you are lucky, you'll get it called as ->proc_handler() in kernel mode. So this is at least an Oops and possibly more. It does depend on an interface going away though, so less of a security risk than it would otherwise be. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07[PATCH] more kernel-doc cleanups, additionsRandy Dunlap
Various core kernel-doc cleanups: - add missing function parameters in ipc, irq/manage, kernel/sys, kernel/sysctl, and mm/slab; - move description to just above function for kernel_restart() Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07[PATCH] aio: remove aio_max_nr accounting raceZach Brown
AIO was adding a new context's max requests to the global total before testing if that resulting total was over the global limit. This let innocent tasks get their new limit tested along with a racing guilty task that was crossing the limit. This serializes the _nr accounting with a spinlock It also switches to using unsigned long for the global totals. Individual contexts are still limited to an unsigned int's worth of requests by the syscall interface. The problem and fix were verified with a simple program that spun creating and destroying a context while holding on to another long lived context. Before the patch a task creating a tiny context could get a spurious EAGAIN if it raced with a task creating a very large context that overran the limit. Signed-off-by: Zach Brown <zach.brown@oracle.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29[NET]: Fix sparse warningsArnaldo Carvalho de Melo
Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27[PATCH] s390: spin lock retryMartin Schwidefsky
Split spin lock and r/w lock implementation into a single try which is done inline and an out of line function that repeatedly tries to get the lock before doing the cpu_relax(). Add a system control to set the number of retries before a cpu is yielded. The reason for the spin lock retry is that the diagnose 0x44 that is used to give up the virtual cpu is quite expensive. For spin locks that are held only for a short period of time the costs of the diagnoses outweights the savings for spin locks that are held for a longer timer. The default retry count is 1000. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13[PATCH] inotify: move sysctlRobert Love
This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from "/proc/sys/fs". Also some related cleanup. Signed-off-by: Robert Love <rml@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12[PATCH] inotifyRobert Love
inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a system call that returns a fd, not SIGIO. You get a single fd, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure), Gamin (a FAM replacement), and other projects. See Documentation/filesystems/inotify.txt. Signed-off-by: Robert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25[PATCH] remove redundant NULL check before before kfree() in kernel/sysctl.cJesper Juhl
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] setuid core dumpAlan Cox
Add a new `suid_dumpable' sysctl: This value can be used to query and set the core dump mode for setuid or otherwise protected/tainted binaries. The modes are 0 - (default) - traditional behaviour. Any process which has changed privilege levels or is execute only will not be dumped 1 - (debug) - all processes dump core when possible. The core dump is owned by the current user and no security is applied. This is intended for system debugging situations only. Ptrace is unchecked. 2 - (suidsafe) - any binary which normally would not be dumped is dumped readable by root only. This allows the end user to remove such a dump but not access it directly. For security reasons core dumps in this mode will not overwrite one another or other files. This mode is appropriate when adminstrators are attempting to debug problems in a normal environment. (akpm: > > +EXPORT_SYMBOL(suid_dumpable); > > EXPORT_SYMBOL_GPL? No problem to me. > > if (current->euid == current->uid && current->egid == current->gid) > > current->mm->dumpable = 1; > > Should this be SUID_DUMP_USER? Actually the feedback I had from last time was that the SUID_ defines should go because its clearer to follow the numbers. They can go everywhere (and there are lots of places where dumpable is tested/used as a bool in untouched code) > Maybe this should be renamed to `dump_policy' or something. Doing that > would help us catch any code which isn't using the #defines, too. Fair comment. The patch was designed to be easy to maintain for Red Hat rather than for merging. Changing that field would create a gigantic diff because it is used all over the place. ) Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01[PATCH] DocBook: fix some descriptionsMartin Waitz
Some KernelDoc descriptions are updated to match the current code. No code changes. Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-11[PATCH] docbook: update function parameter description in block/fs codeMartin Waitz
Update function parameter description in block/fs code Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-04[PATCH] Randomisation: enable by defaultArjan van de Ven
Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-04[PATCH] Randomisation: global sysctlArjan van de Ven
This first patch of the series introduces a sysctl (default off) that enables/disables the randomisation feature globally. Since randomisation may make it harder to debug really tricky situations (reproducability goes down), the sysadmin needs a way to disable it globally. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-02-19add sysctl helper functions to provide milliseconds-based interfaces.Hideaki Yoshifuji
Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
2005-02-01[PATCH] mm: rework lower-zone protection initialisationAndrea Arcangeli
- Rename various fields related to the lower-zone protection code to sync up with 2.4. - Remove the automatic determination of the values of the per-zone protection levels from a single tunable. Replace this with a simple per-zone sysctl. Signed-off-by: Andrea Arcangeli <andrea@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-04[PATCH] panic_timeout: move to kernel.hRandy Dunlap
Move 'panic_timeout' to linux/kernel.h. ipmi_watchdog.c wanted to know why panic_timeout isn't in some header file. However, ipmi_watchdog.c doesn't even use it, so that reference was deleted. Other references now use kernel.h instead of straight extern int. Signed-off-by: Randy Dunlap <rddunlap@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-04[PATCH] VM routine fixesDavid Howells
The attached patch fixes a number of problems in the VM routines: (1) Some inline funcs don't compile if CONFIG_MMU is not set. (2) swapper_pml4 needn't exist if CONFIG_MMU is not set. (3) __free_pages_ok() doesn't counter set_page_refs() different behaviour if CONFIG_MMU is not set. (4) swsusp.c invokes TLB flushing functions without including the header file that declares them. CONFIG_SHMEM semantics: - If MMU: Always enabled if !EMBEDDED - If MMU && EMBEDDED: configurable - If !MMU: disabled Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-03[PATCH] /proc/sys/kernel/bootloader_typeH. Peter Anvin
This patch exports to userspace the boot loader ID which has been exported by (b)zImage boot loaders since boot protocol version 2. It is needed so that update tools that update kernels from vendors know which bootloader file they need to update; eg right now those tools do all kinds of hairy heuristics to find out if it's grub or lilo or .. that installed the kernel. Those heuristics are fragile in the presence of more than one bootloader (which isn't that uncommon in OS upgrade situations). Tested on i386 and x86-64; as far as I know those are the only architectures which use zImage/bzImage format. Signed-Off-By: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-11-01[PATCH] x86_64: add nmi button supportAndi Kleen
Ported from i386 Support a sysctl to raise an oops with an NMI Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-31[PATCH] take me home, hotplug_path[]Kay Sievers
Move hotplug_path[] out of kmod.[ch] to kobject_uevent.[ch] where it belongs now. At some time in the future we should fix the remaining bad hotplug calls (no SEQNUM, no netlink uevent): ./drivers/input/input.c (no DEVPATH on some hotplug events!) ./drivers/pnp/pnpbios/core.c ./drivers/s390/crypto/z90main.c Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
2004-10-28[PATCH] make dnotify a configure-time optionRobert Love
make dnotify configurable, via CONFIG_DNOTIFY. CONFIG_EMBEDDED is required for disabling dnotify. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-20[PATCH] vm thrashing control tuning CONFIG_SWAP=n build fixHideo Aoki
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-19[PATCH] vm thrashing control tuningHideo Aoki
This patch adds "swap_token_timeout" parameter in /proc/sys/vm. The parameter means expired time of token. Unit of the value is HZ, and the default value is the same as current SWAP_TOKEN_TIMEOUT (i.e. HZ * 300). Signed-off-by: Hideo Aoki <aoki@sdl.hitachi.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] pidhashing: enforce PID_MAX_LIMIT in sysctlsWilliam Lee Irwin III
The pid_max sysctl doesn't enforce PID_MAX_LIMIT or sane lower bounds. RESERVED_PIDS + 1 is the minimum pid_max that won't break alloc_pidmap(), and PID_MAX_LIMIT may not be aligned to 8*PAGE_SIZE boundaries for unusual values of PAGE_SIZE, so this also rounds up PID_MAX_LIMIT to it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] add missing linux/syscalls.h includesArnd Bergmann
I found that the prototypes for sys_waitid and sys_fcntl in <linux/syscalls.h> don't match the implementation. In order to keep all prototypes in sync in the future, now include the header from each file implementing any syscall. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-23Merge bk://ppc.bkbits.net/for-linus-ppcLinus Torvalds
into ppc970.osdl.org:/home/torvalds/v2.6/linux
2004-08-23[PATCH] fix permissions on the `tainted' sysctlRusty Russell
From: Arjan van de Ven <arjanv@redhat.com> The patch below sets the tainted sysctl file to read only, otherwise userspace can just overwrite/reset it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-23[PATCH] sysctl tunable for flexmmapArjan van de Ven
Create /proc/sys/vm/legacy_va_layout. If this is non-zero, the kernel will use the old mmap layout for all tasks. it presently defaults to zero (the new layout). From: William Lee Irwin III <wli@holomorphy.com> hugetlb CONFIG_SYSCTL=n fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-23Merge upstream changes by hand.Tom Rini
2004-08-22[PATCH] NMI trigger switch support for debugging(updated)Akiyama Nobuyuki
I made a patch for debugging with the help of NMI trigger switch. When kernel hangs severely, keyboard operation(e.g.Ctrl-Alt-Del) doesn't work properly. This patch enables debugging information to be displayed on console in this case. I think this feature is necessary as standard functionality. Please feel free to use this patch and let me know if you have any comments. Background: When a trouble occurs in kernel, we usually begin to investigate with following information: - panic >> panic message. - oops >> CPU registers and stack trace. - hang >> **NONE** no standard method established. How it works: Most IA32 servers have a NMI switch that fires NMI interrupt up. The NMI interrupt can interrupt even if kernel is serious state, for example deadlock under the interrupt disabled. When the NMI switch is pressed after this feature is activated, CPU registers and stack trace are displayed on console and then panic occurs. This feature is activated or deactivated with sysctl. On IA32 architecture, only the following are defined as reason of NMI interrupt: - memory parity error - I/O check error The reason code of NMI switch is not defined, so this patch assumes that all undefined NMI interrupts are fired by MNI switch. However, oprofile and NMI watchdog also use undefined NMI interrupt. Therefore this feature cannot be used at the same time with oprofile and NMI watchdog. This feature hands NMI interrupt over to oprofile and NMI watchdog. So, when they have been activated, this feature doesn't work even if it is activated. Supported architecture: IA32 Setup: Set up the system control parameter as follows: # sysctl -w kernel.unknown_nmi_panic=1 kernel.unknown_nmi_panic = 1 If the NMI switch is pressed, CPU registers and stack trace will be displayed on console and then panic occurs. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-17ppc32: Move ppc32-specific sysctls to arch/ppcTom Rini
Remove both of the PPC32 && 6xx sysctls from kernel/sysctl.c to arch/ppc/ Signed-off-by: Tom Rini <trini@kernel.crashing.org>
2004-08-12[PATCH] ppc32: Fix warning on CONFIG_PPC32 && CONFIG_6xxTom Rini
In the *ppos cleanups, proc_dol2crvec was updated, but the prototype found at the top of kernel/sysctl.h was not, generating warning. This corrects the prototype to match the code. (I'm gonna take a stab at moving these into arch/ppc shortly) Signed-off-by: Tom Rini <trini@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-07Make sysctl pass the pos pointer around properly.Linus Torvalds
Nobody ever fixed the big FIXME in sysctl - but we really need to pass around the proper "loff_t *" to all the sysctl functions if we want them to be well-behaved wrt the file pointer position. This is all preparation for making direct f_pos accesses go away.
2004-07-30[PATCH] sparse: misc cleanupsAlexander Viro
all sorts of minor stuff - basically, all chunks are independent here, but IMO that one is not worth splitting. Contains: * pmac_cpufreq.c: declaration in the middle of a block. * sys_ia32.c: couple of trivial annotations. * ipmi_si_intf.c: should be using asm/irq.h instead of linux/irq.h * synclink_cs.c: assignment-in-conditional with nobody ever looking at the variable we are assigning to afterwards; variable removed. * sbni.c: s/__volatile/__volatile__ * matroxfb_base.h: got rid of ((u32 *)p)++ * asm-ppc/checksum.h and asm-sparc64/floppy.h: NULL noise removal * amd64 compat.h: missing L in long constant. * mtd-abi.h: annotated ioctl structure * sysctl.c: corrected annotations in extern Signed-off-by: Al Viro <viro@parcelfarce.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-07-28[PATCH] fix for buffer limit for long in sysctl.cStéphane Eranian
Fix a bug in do_proc_doulongvec_minmax() where the the string buffer was too short to parse a 64-bit number expressed in decimal. That was causing problems with entries in /proc/sys using long and allowing large number (such as -1) Signed-off-by: Stephane Eranian <eranian@hpl.hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-07-01[PATCH] Remaining sparse warnings in allnoconfigMika Kukkonen
Attached is a smallish patch for couple trivial sparse warnings in allnoconfig build and more importantly an "excuses" text file explaining why the rest have not been fixed. Basically all of them (with the exception of the one in Andrews tree) need some serious re-engineering. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-23[PATCH] vm: vfs shrinkage tuningAndrew Morton
Some people want the dentry and inode caches shrink harder, others want them shrunk more reluctantly. The patch adds /proc/sys/vm/vfs_cache_pressure, which tunes the vfs cache versus pagecache scanning pressure. - at vfs_cache_pressure=0 we don't shrink dcache and icache at all. - at vfs_cache_pressure=100 there is no change in behaviour. - at vfs_cache_pressure > 100 we reclaim dentries and inodes harder. The number of megabytes of slab left after a slocate.cron on my 256MB test box: vfs_cache_pressure=100000 33480 vfs_cache_pressure=10000 61996 vfs_cache_pressure=1000 104056 vfs_cache_pressure=200 166340 vfs_cache_pressure=100 190200 vfs_cache_pressure=50 206168 Of course, this just left more directory and inode pagecache behind instead of vfs cache. Interestingly, on this machine the entire slocate run fits into pagecache, but not into VFS caches. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-17[PATCH] RLIM: remove unused queued_signals global accountingChris Wright
Remove unused queued_signals global accounting. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-12[PATCH] fix modprobe_path and hotplug_path sizes and sysctlAndrew Morton
From: Andy Whitcroft <apw@shadowen.org> Both modprobe_path and hotplug_path are arbitrarily sized at 256 bytes and that size is also expressed directly in the sysctl code. It seems reasonable to define a standard length and use that for consitancy. This patch introduces the constant KMOD_PATH_LEN and uses that. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-08[PATCH] kernel/sysctl annotations for sparseRandy Dunlap
Add __user annotations to kernel/sysctl.c to satisfy sparse for !CONFIG_SYSCTL, !CONFIG_PROC_FS. Signed-off-by: Randy Dunlap <rddunlap@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-08[PATCH] fix uts sysctl write sizeAndrew Morton
From: Andy Whitcroft <apw@shadowen.org> The sysctl interfaces for updating the uts entries such as hostname and domainname are using the wrong length for these buffers; they are hard coded to 64. Although safe, this artifically limits the size of these fields to one less than the true maximum. This generates an inconsistency between the various methods of update for these fields. # hostname 12345678901234567890123456789012345678901234567890123456789012345 hostname: name too long # hostname 1234567890123456789012345678901234567890123456789012345678901234 # hostname 1234567890123456789012345678901234567890123456789012345678901234 # sysctl -w kernel.hostname=1234567890123456789012345678901234567890123456789012345678901234567890 kernel.hostname = 1234567890123456789012345678901234567890123456789012345678901234567890 # hostname 123456789012345678901234567890123456789012345678901234567890123 # The error originates from the fact the handler for strings (proc_dostring) already allows for the string terminator. This patch corrects the limit, taking the oppotunity to convert to use of sizeof(). Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-05-29[PATCH] sparse: kernel/sysctl.c annotation and cleanupAlexander Viro
2004-05-10[PATCH] Add sysctl to define a hugetlb-capable groupAndrew Morton
From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>, "Seth, Rohit" <rohit.seth@intel.com> This patch addresses the longstanding problem wherein Oracle needs CAP_IPC_LOCK to allocate SHM_HUGETLB shm memory, but people don't want to run Oracle as root, and capabilties are busted. Various ideas with rlimits didn't work out, mainly because these objects live beyond the lifetime of the user processes which establish them. What we do is to create root-writeable /proc/sys/vm/hugetlb_shm_group which specifies a single group ID. Users who belong to that group may allocate hugepages for SHM_HUGETLB shm segments. So the sysadmin will greate a new group, say `hugepageusers', will add the oracle user to that group and will write that group's ID into /proc/sys/vm/hugetlb_shm_group.
2004-04-26[PATCH] s390: no timer interrupts in idle.Andrew Morton
From: Martin Schwidefsky <schwidefsky@de.ibm.com> This patch add a system control that allows to switch off the jiffies timer interrupts while a cpu sleeps in idle. This is useful for a system running with virtual cpus under z/VM.
2004-04-12Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6David S. Miller
into nuts.davemloft.net:/disk1/BK/sparc-2.6
2004-04-11[PATCH] hugetlb consolidationAndrew Morton
From: William Lee Irwin III <wli@holomorphy.com> The following patch consolidates redundant code in various hugetlb implementations. I took the liberty of renaming a few things, since the code was all moved anyway, and it has the benefit of helping to catch missed conversions and/or consolidations.
2004-04-11[PATCH] laptop modeAndrew Morton
From: Bart Samwel <bart@samwel.tk> Adds /proc/sys/vm/laptop-mode: a special knob which says "this is a laptop". In this mode the kernel will attempt to avoid spinning disks up. Algorithm: the idea is to hold dirty data in memory for a long time, but to flush everything which has been accumulated if the disk happens to spin up for other reasons. - Whenever a disk request completes (read or write), schedule a timer a few seconds hence. If the timer was already pending, reset it to a few seconds hence. - When the timer expires, write back the whole world. We use sync_filesystems() for this because it will force ext3 journal commits as well. - In balance_dirty_pages(), kick off background writeback when we hit the high threshold (dirty_ratio), not when we hit the low threshold. This has the effect of causing "lumpy" writeback which is something I spent a year fixing, but in laptop mode, it is desirable. - In try_to_free_pages(), only kick pdflush if the VM is getting into distress: we want to keep scanning for clean pages, deferring writeback. - In page reclaim, avoid writing back the odd random dirty page off the LRU: only start I/O if the scanning is working harder. The effect is to perform a sync() a few seconds after all I/O has ceased. The value which was written into /proc/sys/vm/laptop-mode determines, in seconds, the delay between the final I/O and the flush. Additionally, the patch adds tools which help answer the question "why the heck does my disk spin up all the time?". The user may set /proc/sys/vm/block_dump to a non-zero value and the kernel will print out information which will identify the process which is performing disk reads or which is dirtying pagecache. The user should probably disable syslogd before setting block-dump.