| Age | Commit message | Author |
|
Apparently a lot of scripts use a construct like
cat /proc/net/ip_conntrack | wc -l
which has a negative impact on system performance due to all the locking
required.
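A minimal sketch of the cheaper alternative, assuming the kernel exports the connection count as a one-line counter file. Both paths are parameters here because the counter's location is not stated above; the function name is hypothetical.

```python
from pathlib import Path


def conntrack_count(counter_path, table_path):
    """Return the number of tracked connections.

    Prefer a single-integer counter file (cheap to read). Fall back to
    counting lines in the full table dump, which is the expensive
    construct described above.
    """
    counter = Path(counter_path)
    if counter.exists():
        # One small read, no walk over the whole table.
        return int(counter.read_text().strip())
    with open(table_path) as table:
        # Equivalent of `cat table | wc -l`.
        return sum(1 for _ in table)
```

A script would call this with the counter path first and /proc/net/ip_conntrack as the fallback, so it keeps working on kernels without the counter.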
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Create /proc/sys/vm/legacy_va_layout. If this is non-zero, the kernel
will use the old mmap layout for all tasks. It presently defaults to zero
(the new layout).
From: William Lee Irwin III <wli@holomorphy.com>
hugetlb CONFIG_SYSCTL=n fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into ppc970.osdl.org:/home/torvalds/v2.6/linux
|
|
I made a patch for debugging with the help of an NMI trigger switch.
When the kernel hangs severely, keyboard operations (e.g. Ctrl-Alt-Del)
don't work properly. This patch enables debugging information
to be displayed on the console in this case.
I think this feature is necessary as standard functionality.
Please feel free to use this patch and let me know if you have
any comments.
Background:
When trouble occurs in the kernel, we usually begin investigating
with the following information:
- panic >> panic message.
- oops >> CPU registers and stack trace.
- hang >> **NONE** no standard method established.
How it works:
Most IA32 servers have an NMI switch that raises an NMI.
The NMI is delivered even when the kernel is in a serious state,
for example a deadlock with interrupts disabled.
When the NMI switch is pressed after this feature is activated,
CPU registers and a stack trace are displayed on the console and then
the kernel panics.
This feature is activated or deactivated with sysctl.
On the IA32 architecture, only the following are defined as reasons
for an NMI:
- memory parity error
- I/O check error
The reason code of the NMI switch is not defined, so this patch assumes
that all undefined NMIs are fired by the NMI switch.
However, oprofile and the NMI watchdog also use undefined NMIs.
Therefore this feature cannot be used at the same time as oprofile
or the NMI watchdog. This feature hands the NMI over to oprofile
and the NMI watchdog, so when they have been activated, this feature
doesn't work even if it is activated.
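The decision order described above can be sketched as a small model. This is illustrative Python, not the kernel code: the function and enum names are hypothetical, and the two reason bits are assumed to be the ones conventionally read from the IA32 system control port.

```python
from enum import Enum

# Assumed reason bits (IA32 system control port); illustrative only.
NMI_MEMORY_PARITY = 0x80
NMI_IO_CHECK = 0x40


class NmiAction(Enum):
    MEMORY_PARITY_ERROR = "memory parity error"
    IO_CHECK_ERROR = "I/O check error"
    HANDED_TO_OTHER_USER = "handed to oprofile / NMI watchdog"
    DUMP_AND_PANIC = "dump registers and stack trace, then panic"
    IGNORED = "unknown NMI, feature not activated"


def classify_nmi(reason, other_nmi_user_active, unknown_nmi_panic):
    """Decide what to do with an NMI, in the order described above."""
    # The two defined reasons are handled first.
    if reason & NMI_MEMORY_PARITY:
        return NmiAction.MEMORY_PARITY_ERROR
    if reason & NMI_IO_CHECK:
        return NmiAction.IO_CHECK_ERROR
    # Undefined reason: oprofile and the NMI watchdog take precedence.
    if other_nmi_user_active:
        return NmiAction.HANDED_TO_OTHER_USER
    # Otherwise assume the NMI switch was pressed, if the sysctl is set.
    if unknown_nmi_panic:
        return NmiAction.DUMP_AND_PANIC
    return NmiAction.IGNORED
```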
Supported architecture:
IA32
Setup:
Set up the system control parameter as follows:
# sysctl -w kernel.unknown_nmi_panic=1
kernel.unknown_nmi_panic = 1
If the NMI switch is pressed, CPU registers and stack trace will
be displayed on console and then panic occurs.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into nuts.davemloft.net:/disk1/BK/net-2.6
|
|
Nobody ever fixed the big FIXME in sysctl - but we really need
to pass around the proper "loff_t *" to all the sysctl functions
if we want them to be well-behaved wrt the file pointer position.
This is all preparation for making direct f_pos accesses go
away.
|
|
Incremental to all other patches so far, there is also the new SCTP
conntrack helper by Kiran Kumar. Please apply for 2.6.9 ++, thanks.
Signed-off-by: Kiran Kumar Immidi <immidi_kiran@yahoo.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
Some people want the dentry and inode caches shrunk harder, others want them
shrunk more reluctantly.
The patch adds /proc/sys/vm/vfs_cache_pressure, which tunes the vfs cache
versus pagecache scanning pressure.
- at vfs_cache_pressure=0 we don't shrink dcache and icache at all.
- at vfs_cache_pressure=100 there is no change in behaviour.
- at vfs_cache_pressure > 100 we reclaim dentries and inodes harder.
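One way to read the three cases above is that the tunable acts as a percentage applied to the number of unused objects offered for reclaim. The sketch below models that reading only; the function name is hypothetical and the exact scaling used by the patch is an assumption.

```python
def reclaimable_objects(unused_objects, vfs_cache_pressure):
    """Unused dentries/inodes offered to reclaim, scaled by the tunable.

    Assumed rule: treat vfs_cache_pressure as a percentage, so 100
    leaves the count unchanged, 0 hides the caches from reclaim, and
    values above 100 overstate the count so they are shrunk harder.
    """
    if vfs_cache_pressure < 0:
        raise ValueError("vfs_cache_pressure must be non-negative")
    # Divide first, as integer kernel arithmetic would, to avoid overflow.
    return unused_objects // 100 * vfs_cache_pressure
```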
The number of kilobytes of slab left after a slocate.cron on my 256MB test
box:
vfs_cache_pressure=100000   33480
vfs_cache_pressure=10000    61996
vfs_cache_pressure=1000    104056
vfs_cache_pressure=200     166340
vfs_cache_pressure=100     190200
vfs_cache_pressure=50      206168
Of course, this just left more directory and inode pagecache behind instead of
vfs cache. Interestingly, on this machine the entire slocate run fits into
pagecache, but not into VFS caches.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
1) Add sysctl to control rcvbuf moderation, off for now.
2) Set default winscale to zero.
|
|
|
|
From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>,
"Seth, Rohit" <rohit.seth@intel.com>
This patch addresses the longstanding problem wherein Oracle needs
CAP_IPC_LOCK to allocate SHM_HUGETLB shm memory, but people don't want to run
Oracle as root, and capabilities are busted.
Various ideas with rlimits didn't work out, mainly because these objects live
beyond the lifetime of the user processes which establish them.
What we do is to create root-writeable /proc/sys/vm/hugetlb_shm_group which
specifies a single group ID. Users who belong to that group may allocate
hugepages for SHM_HUGETLB shm segments.
So the sysadmin will create a new group, say `hugepageusers', add the
oracle user to that group, and write that group's ID into
/proc/sys/vm/hugetlb_shm_group.
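The resulting permission rule can be sketched as follows. This is a model of the policy described above, not the kernel check itself; the function and parameter names are hypothetical.

```python
def may_allocate_hugetlb_shm(user_groups, has_cap_ipc_lock, hugetlb_shm_group):
    """Return True if a task may create a SHM_HUGETLB segment.

    user_groups: group IDs the calling user belongs to.
    has_cap_ipc_lock: whether the task holds CAP_IPC_LOCK.
    hugetlb_shm_group: the single group ID written to the sysctl.
    """
    # The old rule still applies: CAP_IPC_LOCK is sufficient.
    if has_cap_ipc_lock:
        return True
    # New rule: membership in the configured group is also sufficient.
    return hugetlb_shm_group in user_groups
```

With the group ID of `hugepageusers' written to the sysctl, the oracle user passes the second test without holding any capability.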
|
|
This is a version of Binary Increase Control (BIC) TCP
developed by NCSU. It is yet another TCP congestion control
algorithm for handling big fat pipes. For normal size congestion
windows it behaves the same as existing TCP Reno, but when the window
is large it uses additive increase to ensure fairness and when the
window is small it uses binary search increase.
For more details see the BIC TCP web page
http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/
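A simplified model of the per-round-trip window growth, to show how binary search and additive increase combine. It is a sketch under stated assumptions, not the code in this patch: the function name and the s_max/s_min constants are illustrative, and the Reno behaviour for normal size windows is left out.

```python
def bic_next_cwnd(cwnd, last_max, s_max=32, s_min=1):
    """One window update in a simplified BIC model (units: segments).

    last_max is the window size at the most recent loss and serves as
    the target of the binary search. s_max and s_min are assumed caps
    on the per-round-trip step.
    """
    if cwnd < last_max:
        # Binary search increase: aim for the midpoint to the target.
        step = (last_max - cwnd) / 2
        # Clamp the jump. Far from the target the step is pinned at
        # s_max, which makes the growth additive (linear).
        step = max(s_min, min(step, s_max))
    else:
        # At or beyond the old maximum: probe slowly for more bandwidth.
        step = s_min
    return cwnd + step
```

Starting well below last_max, repeated calls first grow the window by s_max per round trip and then take ever smaller steps as the window approaches the previous maximum.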
The original code was for web100 (2.4); this version is pretty
much the same but targeted for 2.6 with fewer sysctl parameters
and more constants.
I don't have a real high speed long haul network to test, but
when running over 1G links with delays, the performance is more stable
(i.e. tests are repeatable) and as fast as existing Reno.
|
|
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds a sysctl that allows the jiffies timer interrupts to be
switched off while a CPU sleeps in idle. This is useful for a system running
with virtual CPUs under z/VM.
|
|
A forward port of an old 2.3.x kernel hack done
years ago. I (DaveM) did the first rough port,
Stephen Hemminger actually cleaned it up and
made it usable.
|
|
|
|
into nuts.davemloft.net:/disk1/BK/net-2.6
|
|
|
|
into nuts.davemloft.net:/disk1/BK/sparc-2.6
|
|
From: Bart Samwel <bart@samwel.tk>
Adds /proc/sys/vm/laptop-mode: a special knob which says "this is a laptop".
In this mode the kernel will attempt to avoid spinning disks up.
Algorithm: the idea is to hold dirty data in memory for a long time, but to
flush everything which has been accumulated if the disk happens to spin up
for other reasons.
- Whenever a disk request completes (read or write), schedule a timer a few
seconds hence. If the timer was already pending, reset it to a few seconds
hence.
- When the timer expires, write back the whole world. We use
sync_filesystems() for this because it will force ext3 journal commits as
well.
- In balance_dirty_pages(), kick off background writeback when we hit the
high threshold (dirty_ratio), not when we hit the low threshold. This has
the effect of causing "lumpy" writeback which is something I spent a year
fixing, but in laptop mode, it is desirable.
- In try_to_free_pages(), only kick pdflush if the VM is getting into
distress: we want to keep scanning for clean pages, deferring writeback.
- In page reclaim, avoid writing back the odd random dirty page off the
LRU: only start I/O if the scanning is working harder.
The effect is to perform a sync() a few seconds after all I/O has ceased.
The value which was written into /proc/sys/vm/laptop-mode determines, in
seconds, the delay between the final I/O and the flush.
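The timer behaviour can be illustrated with a small simulation: every completed request re-arms the timer, and a flush runs only once the disk has been quiet for the configured delay. This is a model with hypothetical names, and it ignores the I/O generated by the flush itself.

```python
def flush_times(io_completions, delay):
    """Return the times (in seconds) at which a full writeback runs.

    io_completions: times at which disk requests completed.
    delay: the value written to the laptop-mode sysctl, in seconds.
    """
    flushes = []
    deadline = None
    for t in sorted(io_completions):
        if deadline is not None and t >= deadline:
            # The timer expired before this request: the flush ran.
            flushes.append(deadline)
        # Every completion (re)arms the timer.
        deadline = t + delay
    if deadline is not None:
        # No further I/O: the last armed timer eventually fires.
        flushes.append(deadline)
    return flushes
```

For example, requests completing at 0, 1 and 2 seconds with a delay of 5 produce a single flush at 7 seconds rather than three separate ones.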
Additionally, the patch adds tools which help answer the question "why the
heck does my disk spin up all the time?". The user may set
/proc/sys/vm/block_dump to a non-zero value and the kernel will print out
information which will identify the process which is performing disk reads or
which is dirtying pagecache.
The user should probably disable syslogd before setting block_dump.
|
|
into nuts.davemloft.net:/disk1/BK/sparc-2.6
|
|
|
|
From: David Mosberger <davidm@napali.hpl.hp.com>
Below is a warmed up version of a patch originally done by Werner Almesberger
(see http://tinyurl.com/25zra) to replace the MAX_MAP_COUNT limit with a
sysctl variable. I thought this had gone into the tree a long time ago but
alas it has not and as luck would have it, the hard limit bit someone today
once again with a large app on a large machine.
Here is a small test app:
|
|
|
|
|
|
From: "Randy.Dunlap" <rddunlap@osdl.org>
Add syscalls.h, which contains prototypes for the kernel's system calls.
Replace open-coded declarations all over the place. This patch found a
couple of prior bugs. It appears to be more important with -mregparm=3 as we
discover more asmlinkage mismatches.
Some syscalls have arch-dependent arguments, so their prototypes are in the
arch-specific unistd.h. Maybe it should have been asm/syscalls.h, but there
were already arch-specific syscall prototypes in asm/unistd.h...
Tested on x86, ia64, x86_64, ppc64, s390 and sparc64. May cause
trivial-to-fix build breakage on other architectures.
|
|
From: Tim Hockin <thockin@sun.com>
Attached is a simple patch to expose NGROUPS_MAX via sysctl. Nothing
fancy, just a read-only variable. glibc can use this to sysconf() the
value properly, so apps will stop relying on NGROUPS_MAX as a real
constant.
|
|
From: "H. Peter Anvin" <hpa@transmeta.com>
Remove the limit of 2048 pty's - allocate them on demand up to the 12:20
dev_t limit: a million.
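The "million" follows from the 20 minor bits: 2^20 = 1048576 minors per major. A short sketch of a 12:20 split, assuming the major number occupies the high bits; the helper names are hypothetical.

```python
MAJOR_BITS = 12
MINOR_BITS = 20
MINOR_MASK = (1 << MINOR_BITS) - 1

# Number of distinct minors available to one major.
MAX_MINORS = 1 << MINOR_BITS


def make_dev(major, minor):
    """Pack a major/minor pair into a 32-bit device number."""
    if not 0 <= major < (1 << MAJOR_BITS):
        raise ValueError("major out of range")
    if not 0 <= minor <= MINOR_MASK:
        raise ValueError("minor out of range")
    return (major << MINOR_BITS) | minor


def split_dev(dev):
    """Unpack a device number into (major, minor)."""
    return dev >> MINOR_BITS, dev & MINOR_MASK
```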
|
|
- Update listhelp.h to benefit from prefetching
- More efficient selective_cleanup() impl. in conntrack
- Export number of conntrack buckets via r/o sysctl.
|
|
From: Janet Morgan <janetmor@us.ibm.com>
It looks like aio_nr and aio_max_nr were intended to be sysctl parameters.
|
|
|
|
|
|
|
|
into nuts.davemloft.net:/disk1/BK/net-2.6
|
|
into us.ibm.com:/home/sridhar/BK/lksctp-2.6.1
|
|
Original 2.4.x version by Angelo Dell'Aera (buffer@antifork.org)
Here is the 2.4 version with some cleanups converted to 2.6.
- use tcp_ prefix (dave)
- get rid of rwlock not needed (dave)
- do some hand optimization of the inlines
- don't make init inline
- get rid of extra whitespace
- eliminate accessor for mss_cache
|
|
|
|
|
|
neighbour sysctls.
|
|
into nuts.ninka.net:/disk1/davem/BK/net-2.6
|
|
From: Anton Blanchard <anton@samba.org>
Generate a global printk rate-limiting function, printk_ratelimit().
Also, use it in the page allocator warning code. Also add a dump_stack to
that code.
Later, we need to switch net_ratelimit() over to use printk_ratelimit().
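A rate limiter of this kind is commonly built as a token bucket. The sketch below shows that general technique with hypothetical names and an explicit clock argument; it is not claimed to match the implementation in this patch.

```python
class RateLimiter:
    """Token bucket: one message per `interval`, bursts up to `burst`."""

    def __init__(self, interval, burst):
        self.interval = interval          # credit one message costs
        self.burst = burst                # messages allowed back to back
        self.tokens = interval * burst    # start with a full bucket
        self.last = None                  # time of the previous call
        self.missed = 0                   # suppressed messages so far

    def allow(self, now):
        """Return True if a message may be printed at time `now`."""
        if self.last is not None:
            # Credit accumulates with elapsed time.
            self.tokens += now - self.last
        self.last = now
        # Never store more than one full burst of credit.
        self.tokens = min(self.tokens, self.interval * self.burst)
        if self.tokens >= self.interval:
            self.tokens -= self.interval
            return True
        self.missed += 1
        return False
```

A caller would wrap each noisy warning in a check such as "if limiter.allow(now): print(...)", which is the shape of use the commit describes for the page allocator warning.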
|
|
|
|
into us.ibm.com:/home/sridhar/BK/lksctp-2.6.1
|
|
bridge-nf-call-arptables - pass or don't pass bridged ARP traffic to
arptables' FORWARD chain.
bridge-nf-call-iptables - pass or don't pass bridged IPv4 traffic to
iptables' chains.
bridge-nf-filter-vlan-tagged - pass or don't pass bridged vlan-tagged
ARP/IP traffic to arptables/iptables.
|
|
into us.ibm.com:/home/sridhar/BK/lksctp-2.6.0-test10
|
|
|
|
|
|
|
|
The logging level is now controlled by a
/proc/sys/dev/scsi/logging_level sysctl instead of /proc/scsi/scsi.
The format is the same as the logging_level module parameter.
|
|
From: Bernardo Innocenti <bernie@develer.com>
sysctl.h needs compiler.h
|