summaryrefslogtreecommitdiff
path: root/kernel/timer.c
AgeCommit message (Collapse)Author
2002-12-13Fix nanosleep() behaviour with NULL "remaining" argument.Linus Torvalds
2002-12-05Implement system call restarting for the "nanosleep()" system callLinus Torvalds
using the new system call restart infrastructure. This breaks the compat layer - it really needs to do its own version of restarting, since the restarting depends on the types.
2002-12-03[PATCH] compatibility syscall layerStephen Rothwell
This is the generic part of the start of the compatibility syscall layer. I think I have made it generic enough that each architecture can define what compatibility means. To use this, an architecture must create asm/compat.h and provide typedefs for (currently) 'compat_time_t', 'struct compat_timeval' and 'struct compat_timespec'.
2002-12-01[PATCH] getppid-2.5.50-A3Ingo Molnar
This changes sys_getppid() to be more POSIX-threading conformant. sys_getppid() needs to return the PID of the "process' parent" (ie. the tgid of the parent thread), not the thread parent's PID. The patch has no effect on non-CLONE_THREAD users, for them current->group_leader == current. The effect on CLONE_THREAD threads is that getppid() does not return any PID within the thread group anymore. Plus if a threaded application starts up a (non-thread) child then the child sees the process PID of the parent process, not the thread PID of the parent thread. in theory we could introduce the getttid() variant to get to the TID of the parent thread, but i doubt it would be of any use. (and we can add it if the need arises.) The lockless algorithm is still safe because the ->group_leader pointer never changes asynchronously. (the ->real_parent pointer might still change asynchronously so the SMP checks are still needed.) I've also updated the comments (they referenced the nonexistent p_ooptr field.), plus i've changed the mb() to rmb() - we need to order the reads, we dont do any global writes that need some predictable ordering.
2002-11-25[PATCH] shrink task_struct by removing per_cpu utime and stimeAndrew Morton
Patch from Bill Irwin. It has the potential to break userspace monitoring tools a little bit, and I'm a rater uncertain about how useful the per-process per-cpu accounting is. Bill sent this out as an RFC on July 29: "These statistics severely bloat the task_struct and nothing in userspace can rely on them as they're conditional on CONFIG_SMP. If anyone is using them (or just wants them around), please speak up." And nobody spoke up. If we apply this, the contents of /proc/783/cpu will go from cpu 1 1 cpu0 0 0 cpu1 0 0 cpu2 1 1 cpu3 0 0 to cpu 1 1 And we shall save 256 bytes from the ia32 task_struct. On my SMP build with NR_CPUS=32: Without this patch, sizeof(task_struct) is 1824, slab uses a 1-order allocation and we are getting 2 task_structs per page. With this patch, sizeof(task_struct) is 1568, slab uses a 2-order allocation and we are getting 2.5 task_structs per page. So it seems worthwhile. (Maybe this highlights a shortcoming in slab. For the 1824-byte case it could have used a 0-order allocation)
2002-11-16[PATCH] Run timers as softirqs, not taskletsMatthew Wilcox
The timer code is attempting to replicate the softirq characteristics at the tasklet level, which is a little pointless. This patch converts timers to be a first-class softirq citizen.
2002-11-04[PATCH] fix mod_timer() raceAndrew Morton
If two CPUs run mod_timer against the same not-pending timer then they have no locking relationship. They can both see the timer as not-pending and they both add the timer to their cpu-local list. The CPU which gets there second corrupts the first CPU's lists. This was causing Dave Hansen's 8-way to oops after a couple of minutes of specweb testing. I believe that to fix this we need locking which is associated with the timer itself. The easy fix is hashed spinlocking based on the timer's address. The hard fix is a lock inside the timer itself. It is hard because init_timer() becomes compulsory, to initialise that spinlock. An unknown number of code paths in the kernel just wipe the timer to all-zeroes and start using it. I chose the hard way - it is cleaner and more idiomatic. The patch also adds a "magic number" to the timer so we can detect when a timer was not correctly initialised. A warning and stack backtrace is generated and the timer is fixed up. After 16 such warnings the warning mechanism shuts itself up until a reboot. It took six patches to my kernel to stop the warnings from coming out. The uninitialised timers are extremely easy to find and fix. But it will take some time to weed them all out. Maybe we should go for the hashed locking... Note that the new timer->lock means that we can clean up some awkward "oh we raced, let's try again" code in timer.c. But to do that we'd also need to take timer->lock in the commonly-called del_timer(), so I left it as-is. The lock is not needed in add_timer() because concurrent add_timer()/add_timer() and concurrent add_timer()/mod_timer() are illegal.
2002-10-31[PATCH] make kernel_stat use per-cpu infrastructureAndrew Morton
Patch from Ravikiran G Thirumalai <kiran@in.ibm.com> 1. Break out disk stats from kernel_stat and move disk stat to blkdev.h 2. Group cpu stat in kernel_stat and make them "per_cpu" instead of the NR_CPUS array 3. Remove EXPORT_SYMBOL(kstat) from ksyms.c (as I noticed that no module is using kstat)
2002-10-29[PATCH] percpu: convert timersAndrew Morton
Patch from Dipankar Sarma <dipankar@in.ibm.com> This patch changes the per-CPU data in timer management (tvec_bases) to use per_cpu data area and makes it safe for cpu_possible allocation by using CPU notifiers. End result - saving space. Depends on cpu_possible patch.
2002-10-29[PATCH] slab: add_timer_on: add a timer on a particular CPUAndrew Morton
add_timer_on is like add_timer, except it takes a target CPU on which to add the timer. The slab code needs per-cpu timers for shrinking the per-cpu caches.
2002-10-15[PATCH] oprofile - timer hookJohn Levon
This implements a simple hook into the profiling timer for x86 so that non-perfctr machines can still use oprofile. This has proven useful for laptops and the like. It also reduces header dependencies a bit by centralising readprofile code
2002-10-09[PATCH] timer cleanupsIngo Molnar
This is my latest timer patchset, it makes del_timer_sync() a bit more robust wrt. code that re-adds timers from the timer handler. Other changes in the patch: - clean up cascading a bit. - do not save flags in __run_timer_list - we enter from an irqs-enabled tasklet.
2002-10-08[PATCH] getpid() comment typoRobert Love
Comment above getpid() is wrong. This patch fixes it, and expands the comment to explain why on earth we have getpid() returning ->tgid and not ->pid.
2002-10-04[PATCH] 64-bit timer fixAnton Blanchard
I think I have found it and it only hits on a 64 bit machine. If the timeout is big enough we still need to initialise timer->entry. Otherwise bad things happen we we hit del_timer.
2002-10-02[PATCH] timer-2.5.40-F7Ingo Molnar
This does a number of timer subsystem enhancements: - simplified timer initialization, now it's the cheapest possible thing: static inline void init_timer(struct timer_list * timer) { timer->base = NULL; } since the timer functions already did a !timer->base check this did not have any effect on their fastpath. - the rule from now on is that timer->base is set upon activation of the timer, and cleared upon deactivation. This also made it possible to: - reorganize all the timer handling code to not assume anything about timer->entry.next and timer->entry.prev - this also removed lots of unnecessery cleaning of these fields. Removed lots of unnecessary list operations from the fastpath. - simplified del_timer_sync(): it now uses del_timer() plus some simple synchronization code. Note that this also fixes a bug: if mod_timer (or add_timer) moves a currently executing timer to another CPU's timer vector, then del_timer_sync() does not synchronize with the handler properly. - bugfix: moved run_local_timers() from scheduler_tick() into update_process_times() .. scheduler_tick() might be called from the fork code which will not quite have the intended effect ... - removed the APIC-timer-IRQ shifting done on SMP, Dipankar Sarma's testing shows no negative effects. - cleaned up include/linux/timer.h: - removed the timer_t typedef, and fixes up kernel/workqueue.c to use the 'struct timer_list' name instead. - removed unnecessery includes - renamed the 'list' field to 'entry' (it's an entry not a list head) - exchanged the 'function' and 'data' fields. This, besides being more logical, also unearthed the last few remaining places that initialized timers by assuming some given field ordering, the patch also fixes these places. (fs/xfs/pagebuf/page_buf.c, net/core/profile.c and net/ipv4/inetpeer.c) - removed the defunct sync_timers(), timer_enter() and timer_exit() prototypes. - added docbook-style comments. - other kernel/timer.c changes: - base->running_timer does not have to be volatile ... - added consistent comments to all the important functions. - made the sync-waiting in del_timer_sync preempt- and lowpower- friendly. i've compiled, booted & tested the patched kernel on x86 UP and SMP. I have tried moderately high networking load as well, to make sure the timer changes are correct - they appear to be.
2002-09-28[PATCH] smptimers, old BH removal, tq-cleanupIngo Molnar
This is the smptimers patch plus the removal of old BHs and a rewrite of task-queue handling. Basically with the removal of TIMER_BH i think the time is right to get rid of old BHs forever, and to do a massive cleanup of all related fields. The following five basic 'execution context' abstractions are supported by the kernel: - hardirq - softirq - tasklet - keventd-driven task-queues - process contexts I've done the following cleanups/simplifications to task-queues: - removed the ability to define your own task-queue, what can be done is to schedule_task() a given task to keventd, and to flush all pending tasks. This is actually a quite easy transition, since 90% of all task-queue users in the kernel used BH_IMMEDIATE - which is very similar in functionality to keventd. I believe task-queues should not be removed from the kernel altogether. It's true that they were written as a candidate replacement for BHs originally, but they do make sense in a different way: it's perhaps the easiest interface to do deferred processing from IRQ context, in performance-uncritical code areas. They are easier to use than tasklets. code that cares about performance should convert to tasklets - as the timer code and the serial subsystem has done already. For extreme performance softirqs should be used - the net subsystem does this. and we can do this for 2.6 - there are only a couple of areas left after fixing all the BH_IMMEDIATE places. i have moved all the taskqueue handling code into kernel/context.c, and only kept the basic 'queue a task' definitions in include/linux/tqueue.h. I've converted three of the most commonly used BH_IMMEDIATE users: tty_io.c, floppy.c and random.c. [random.c might need more thought though.] i've also cleaned up kernel/timer.c over that of the stock smptimers patch: privatized the timer-vec definitions (nothing needs it, init_timer() used it mistakenly) and cleaned up the code. Plus i've moved some code around that does not belong into timer.c, and within timer.c i've organized data and functions along functionality and further separated the base timer code from the NTP bits. net_bh_lock: i have removed it, since it would synchronize to nothing. The old protocol handlers should still run on UP, and on SMP the kernel prints a warning upon use. Alexey, is this approach fine with you? scalable timers: i've further improved the patch ported to 2.5 by wli and Dipankar. There is only one pending issue i can see, the question of whether to migrate timers in mod_timer() or not. I'm quite convinced that they should be migrated, but i might be wrong. It's a 10 lines change to switch between migrating and non-migrating timers, we can do performance tests later on. The current, more complex migration code is pretty fast and has been stable under extremely high networking loads in the past 2 years, so we can immediately switch to the simpler variant if someone proves it improves performance. (I'd say if non-migrating timers improve Apache performance on one of the bigger NUMA boxes then the point is proven, no further though will be needed.)
2002-09-25Remove busy-wait for short RT nanosleeps. It's a random special caseLinus Torvalds
and does the wrong thing for higher HZ values anyway.
2002-09-09[PATCH] USER_HZ & NTP problemsRolf Fokkens
I've been playing with different HZ values in the 2.4 kernel for a while now, and apparantly Linus also has decided to introduce a USER_HZ constant (I used CLOCKS_PER_SEC) while raising the HZ value on x86 to 1000. On x86 timekeeping has shown to be relative fragile when raising HZ (OK, I tried HZ=2048 which is quite high) because of the way the interrupt timer is configured to fire HZ times each second. This is done by configuring a divisor in the timer chip (LATCH) which divides a certain clock (1193180) and makes the chip fire interrupts at the resulting frequency. Now comes the catch: NTP requires a clock accuracy of 500 ppm. For some HZ values the clock is not accurate enough to meet this requirement, hence NTP won't work well. An example HZ value is 1020 which exceeds the 500 ppm requirement. In this case the best approximation is 1019.8 Hz. the xtime.tv_usec value is raised with a value of 980 each tick which means that after one second the tv_usec value has increased with 999404 (should be 1000000) which is an accuracy of 596 ppm. Some more examples: HZ Accuracy (ppm) ---- -------------- 100 17 1000 151 1024 632 2000 687 2008 343 2011 18 2048 1249 What I've been doing is replace tv_usec by tv_nsec, meaning xtime is now a timespec instead of a timeval. This allows the accuracy to be improved by a factor of 1000 for any (well ... any?) HZ value. Of course all kinds of calculations had te be improved as well. The ACTHZ constantant is introduced to approximate the actual HZ value, it's used to do some approximations of other related values.
2002-08-04Add KERN_xxx prefixes to printk's in kernel/ subdir.Cory Watson
2002-07-24[PATCH] Ensure xtime_lock and timerlist_lock are on difft cachelinesRavikiran G. Thirumalai
I've noticed that xtime_lock and timerlist_lock ends up on the same cacheline all the time (atleaset on x86). Not a good thing for loads with high xxx_timer and do_gettimeofday counts I guess (networking etc). Here's a trivial fix.
2002-07-23[PATCH] scheduler fixesIngo Molnar
- introduce new type of context-switch locking, this is a must-have for ia64 and sparc64. - load_balance() bug noticed by Scott Rhine and myself: scan the whole list to find imbalance number of tasks, not just the tail of the list. - sched_yield() fix: use current->array not rq->active.
2002-07-01Make in-kernel HZ be 1000 on x86, retaining user-level 100 HZ clock_t.Linus Torvalds
Stop using "struct tms" internally - always use timer ticks (or one of the sane timeval/timespec types) instead. Explicitly convert to clock_t when copying to user space for the old broken interfaces that still use "clock_t". Clean up and unify jiffies<->timeval conversion.
2002-06-19[PATCH] add unlikely() into add_timer()Andrey Panin
This micropatch adds unlikely() macro into add_timer() bug check code. Without this path gcc 3.1 makes bad thing reordering printk() into the middle of function body.
2002-06-17[PATCH] Remove sync_timersMatthew Wilcox
Nobody's using it any more, kill:
2002-06-17[PATCH] remove tqueue.h from sched.hMatthew Wilcox
This is actually part of the work I've been doing to remove BHs, but it stands by itself.
2002-06-17[PATCH] Move jiffies_64 down into architecturesAndi Kleen
x86-64 needs an own special declaration of jiffies_64. prepare for this by moving the jiffies_64 declaration from kernel/timer.c down into each architecture.
2002-06-02[PATCH] sys_sysinfo cleanupRobert Love
Looks like sys_sysinfo has not been touched in years. Among other things, it uses a global cli() for protection; I switched it to an existing rwlock. I also pulled it out of info.c and stuck it in timer.c (I choose timer.c because it shares dependencies there already). The details: - move sys_sysinfo to kernel/timer.c from kernel/info.c: why one small syscall got its own file is beyond me. - delete kernel/info.c - stop the global cli! now grab a read_lock on xtime_lock. this is safe as we moved the write_unlock on xtime_lock down one line to cover the calculating of avenrun. - trivial code cleanup
2002-05-30[PATCH] time-offset patchDavid Mosberger
On ia64 MP machines, we use the cycle counter register of each CPU to obtain fine-grained time-stamps. At boot-time, we synchronize the counters as close as possible (similar to x86, though with a different algorithm). But even with this synchronization, there is still a small (really: tiny) chance that a process bouncing from one CPU to another could observe time going backwards. To guard against this, I maintain a global variable called "last_time_offset" which keeps track of the largest time-interpolation value returned so far. Most of this is in platform-specific code (arch/ia64/kernel/time.c), but there are a handful of places in platform-independent code where this variable needs to be cleared to zero. This is what the patch below does. I didn't put it inside CONFIG_IA64 because I think this can be useful for other platforms, too. I suppose I could put it inside CONFIG_SMP though this would make the code uglier. If you think it's OK, please apply, otherwise, I'd appreciate your feedback.
2002-05-28[PATCH] O(1) count_active_tasksRobert Love
This is William Irwin's algorithmically O(1) version of count_active_tasks (which is currently O(n) for n total tasks on the system). I like it a lot: we become O(1) because now we count uninterruptible tasks, so we can return (nr_uninterruptible + nr_running). It does not introduce any overhead or hurt the case for small n, so I have no complaints. This copy has a small optimization over the original posting, but is otherwise the same thing wli posted earlier. I have tested to make sure this returns accurate results and that the kernel profile improves.
2002-05-10[PATCH] 64-bit jiffies, a better solutionGeorge Anzinger
Ok, here it is. The following arch are not covered: Mips, Mips64 in 32-bit mode, parisc in __LP64__ mode. In addition, x86_64 mentions jiffies in the existing script. This may be a problem.
2002-03-14[PATCH] wait4() WIFSTOPPED starvation fix #1/2David Howells
This patch (#1) just converts the task_struct to use struct list_head rather than direct pointers for maintaining the children list.
2002-02-11merge to the -K3 scheduler.Ingo Molnar
2002-02-05v2.5.2.1 -> v2.5.2.1.1Linus Torvalds
- David Howells: abtract out "current->need_resched" as "need_resched()" - Frank Davis: ide-tape update for bio - various: header file fixups - Jens Axboe: fix up bio/ide/highmem issues - Kai Germaschewski: ISDN update - Tim Waugh: parport update - Patrik Mochel: initcall update - Greg KH: USB and Compaq PCI hotplug updates
2002-02-05v2.5.1.9 -> v2.5.1.10Linus Torvalds
- Kai Germaschewski: ISDN updates - Al Viro: start moving buffer cache indexing to "struct block_device *" - Greg KH: USB update - Russell King: fix up some ARM merge issues - Ingo Molnar: scalable scheduler
2002-02-04v2.5.1.2 -> v2.5.1.3Linus Torvalds
- Christoph Hellwig: scsi_register_module cleanup - Mikael Pettersson: apic.c LVTERR fixes - Russell King: ARM update (including bio update for icside) - Jens Axboe: more bio updates - Al Viro: make ready to switch bread away from kdev_t.. - Davide Libenzi: scheduler cleanups - Anders Gustafsson: LVM fixes for bio - Richard Gooch: devfs update
2002-02-04v2.4.10.5 -> v2.4.10.6Linus Torvalds
- various: fix some module exports uncovered by stricter error checking - Urban Widmark: make smbfs use same error define names as samba and win32 - Greg KH: USB update - Tom Rini: MPC8xx ppc update - Matthew Wilcox: rd.c page cache flushing fix - Richard Gooch: devfs race fix: rwsem for symlinks - Björn Wesen: Cris arch update - Nikita Danilov: reiserfs cleanup - Tim Waugh: parport update - Peter Rival: update alpha SMP bootup to match wait_init_idle fixes - Trond Myklebust: lockd/grace period fix
2002-02-04v2.4.5.2 -> v2.4.5.3Linus Torvalds
- remember to increment the version number - Chris Mason: reiserfs mark_journal_new and bh leak fix - Richard Gooch: devfs update - Alexander Viro: further FS cleanup (superblock list) - David Woodhouse: MTD update - Kai Germaschewski: ISDN update (stanford checker fixes etc) - Rich Baum: gcc-3.0 warning fixes - Jeff Garzik: network driver updates - Geert Uytterhoeven: m68k fbdev logo merge glitch fix - Andrea Arcangeli: fix signal return path - David Miller: Sparc updates - Johannes Erdfelt: USB update - Carsten Otte, Andries Brouwer: don't clear blk_size unconditionally on partition check - Martin Frey: alpha Sable irq fix - Paul Mackerras: PPC softirq update - Patrick Mochel: PCI power management infrastructure - Robert Siemer: miroSOUND driver update - Neil Brown: knfsd updates, including ability to export ReiserFS filesystems - Trond Myklebust: NFS readdir fixup, don't update atime on client - Andrew Morton: truncate_inode_pages speedup - Paul Menage: make inode quota count all inodes..
2002-02-04Import changesetLinus Torvalds