| author | Christoph Lameter <clameter@sgi.com> | 2004-09-07 17:46:02 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2004-09-07 17:46:02 -0700 |
| commit | bd46a4f183290995b9c2fc4de40670e995482769 (patch) | |
| tree | 8903c89ca648af88cb8ebc48af4e602f43bb6b9f /include/asm-generic | |
| parent | c9daeae66c9eea359c65210229b3962e66642a43 (diff) | |
[PATCH] Time interpolator: Scalability enhancements and high resolution time for IA64
This has been in the ia64 (and hence -mm) trees for a couple of months.
Changelog:
* Affects only architectures which define CONFIG_TIME_INTERPOLATION
(currently only IA64)
* Genericize time interpolation, make time interpolators easily usable
and provide instructions on how to use the interpolator for other
architectures.
* Provide nanosecond resolution for clock_gettime and an accuracy
up to the time interpolator time base.
* clock_getres() reports the resolution of the underlying time base, which
is typically <50ns and may be 1ns on some systems.
* Make time interpolator self-tuning to limit time jumps
and to make the interpolators work correctly on systems with
broken time base specifications.
* SMP scalability: Make clock_gettime and gettimeofday scale O(1)
by removing the cmpxchg for most clocks (tested for up to 512 CPUs)
* IA64: provide asm fastcall that doubles the performance
of gettimeofday and clock_gettime on SGI and other IA64 systems
(asm fastcalls scale O(1) together with the scalability fixes).
* IA64: provide nojitter kernel option so that IA64 systems with
correctly synchronized ITC counters may also enjoy the
scalability enhancements.
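The difference between the cmpxchg path and the nojitter path can be sketched in userspace C. This is an illustration only, not the kernel code: the struct, its field names, and both functions are hypothetical, C11 atomics stand in for the kernel's cmpxchg, and counter wraparound is ignored. It assumes the cmpxchg exists to keep time from going backwards when per-CPU counters disagree.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical interpolator state; names are illustrative, not the kernel's. */
struct interp {
    _Atomic uint64_t last_cycle;     /* highest counter value handed out so far */
    uint64_t base_cycle;             /* counter value at the last timer tick */
    uint64_t nsec_per_cycle_shifted; /* ns per cycle, scaled by 2^shift */
    unsigned shift;
    int jitter;                      /* nonzero: counters may disagree across CPUs */
};

/* Return a counter value that never goes backwards across callers. */
static uint64_t interp_get_cycles(struct interp *ti, uint64_t now)
{
    if (!ti->jitter)
        return now;                  /* trusted source: no shared write at all */

    uint64_t last = atomic_load(&ti->last_cycle);
    for (;;) {
        if (last > now)
            return last;             /* another CPU is ahead: reuse its value */
        /* Try to publish our reading; on failure 'last' is reloaded. */
        if (atomic_compare_exchange_weak(&ti->last_cycle, &last, now))
            return now;
    }
}

/* Nanoseconds elapsed since the last tick (wraparound ignored). */
static uint64_t interp_offset_ns(struct interp *ti, uint64_t now)
{
    uint64_t cycles = interp_get_cycles(ti, now);
    assert(cycles >= ti->base_cycle);
    return ((cycles - ti->base_cycle) * ti->nsec_per_cycle_shifted) >> ti->shift;
}
```

With jitter set, every caller contends on the shared last_cycle word, which is the kind of cross-CPU traffic the changelog describes removing. With jitter clear, a read touches no shared writable state.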
Performance measurements for single calls (ITC cycles):
A. 4 way Intel IA64 SMP system (kmart)
ITC offsets:
kmart:/usr/src/noship-tests # dmesg|grep synchr
CPU 1: synchronized ITC with CPU 0 (last diff 1 cycles, maxerr 417 cycles)
CPU 2: synchronized ITC with CPU 0 (last diff 2 cycles, maxerr 417 cycles)
CPU 3: synchronized ITC with CPU 0 (last diff 1 cycles, maxerr 417 cycles)
A.1. Current kernel code
kmart:/usr/src/noship-tests # ./dmt
gettimeofday cycles: 3737 220 215 215 215 215 215 215 215 215
clock_gettime(REAL) cycles: 4058 575 564 576 565 566 558 558 558 558
clock_gettime(MONO) cycles: 1583 621 609 609 609 609 609 609 609 609
clock_gettime(PROCESS) cycles: 71428 298 259 259 259 259 259 259 259 259
clock_gettime(THREAD) cycles: 3982 336 290 298 298 298 298 286 286 286
A.2 New code using cmpxchg
kmart:/usr/src/noship-tests # ./dmt
gettimeofday cycles: 3145 213 216 213 213 213 213 213 213 213
clock_gettime(REAL) cycles: 3185 230 210 210 210 210 210 210 210 210
clock_gettime(MONO) cycles: 284 217 217 216 216 216 216 216 216 216
clock_gettime(PROCESS) cycles: 68857 289 270 259 259 259 259 259 259 259
clock_gettime(THREAD) cycles: 3862 339 298 298 298 298 290 286 286 286
A.3 New code with cmpxchg switched off (nojitter kernel option)
kmart:/usr/src/noship-tests # ./dmt
gettimeofday cycles: 3195 219 219 212 212 212 212 212 212 212
clock_gettime(REAL) cycles: 3003 228 205 205 205 205 205 205 205 205
clock_gettime(MONO) cycles: 279 209 209 209 208 208 208 208 208 208
clock_gettime(PROCESS) cycles: 65849 292 259 259 268 270 270 259 259 259
B. SGI SN2 system running 512 IA64 CPUs.
B.1. Current kernel code
[root@ascender noship-tests]# ./dmt
gettimeofday cycles: 17221 1028 1007 1004 1004 1004 1010 25928 1002 1003
clock_gettime(REAL) cycles: 10388 1099 1055 1044 1064 1063 1051 1056 1061 1056
clock_gettime(MONO) cycles: 2363 96 96 96 96 96 96 96 96 96
clock_gettime(PROCESS) cycles: 46537 804 660 666 666 666 666 666 666 666
clock_gettime(THREAD) cycles: 10945 727 710 684 685 686 685 686 685 686
B.2 New code
ascender:~/noship-tests # ./dmt
gettimeofday cycles: 3874 610 588 588 588 588 588 588 588 588
clock_gettime(REAL) cycles: 3893 612 588 582 588 588 588 588 588 588
clock_gettime(MONO) cycles: 686 595 595 588 588 588 588 588 588 588
clock_gettime(PROCESS) cycles: 290759 322 269 269 259 265 265 265 259 259
clock_gettime(THREAD) cycles: 5153 358 306 298 296 304 290 298 298 298
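The source of the dmt tool is not shown here. A rough userspace approximation of a per-call measurement, assuming a POSIX system, might look like the sketch below. It reports nanoseconds taken from CLOCK_MONOTONIC rather than ITC cycles, and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to a single nanosecond count. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Time n back-to-back clock_gettime(id) calls and store the cost of each
 * one, in nanoseconds, in out[]. Each sample also includes the overhead
 * of one CLOCK_MONOTONIC read. Returns 0 on success, -1 on failure.
 */
static int time_calls(clockid_t id, int n, int64_t *out)
{
    struct timespec before, after, scratch;

    assert(out != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        if (clock_gettime(CLOCK_MONOTONIC, &before) != 0)
            return -1;
        if (clock_gettime(id, &scratch) != 0)
            return -1;
        if (clock_gettime(CLOCK_MONOTONIC, &after) != 0)
            return -1;
        out[i] = ts_to_ns(&after) - ts_to_ns(&before);
    }
    return 0;
}
```

Calling time_calls(CLOCK_REALTIME, 10, cost) and printing the ten samples gives a row comparable in shape to the ones above, where the first call is far more expensive than the following ones.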
Scalability of time functions (in time it takes to do a million calls):
=======================================================================
A. 4 way Intel IA64 SMP system (kmart)
A.1 Current code
kmart:/usr/src/noship-tests # ./todscale -n1000000
CPUS WALL WALL/CPUS
1 0.192 0.192
2 1.125 0.563
4 9.229 2.307
A.2 New code using cmpxchg
kmart:/usr/src/noship-tests # ./todscale
CPUS WALL WALL/CPUS
1 0.188 0.188
2 0.457 0.229
4 0.413 0.103
(for unexplained reasons, the measurement with 4 CPUs may fluctuate up to 15.x)
A.3 New code without cmpxchg (nojitter kernel option)
kmart:/usr/src/noship-tests # ./todscale -n10000000
CPUS WALL WALL/CPUS
1 0.180 0.180
2 0.180 0.090
4 0.252 0.063
B. SGI SN2 system running 512 IA64 CPUs.
The system has a global monotonic clock and therefore has
no need for compensation. Current code uses a cmpxchg. New
code has no cmpxchg.
B.1 Current code
ascender:~/noship-tests # ./todscale
CPUS WALL WALL/CPUS
1 0.850 0.850
2 1.767 0.884
4 6.124 1.531
8 20.777 2.597
16 57.693 3.606
32 164.688 5.146
64 456.647 7.135
128 1093.371 8.542
256 2778.257 10.853
(System crash at 512 CPUs)
B.2 New code
ascender:~/noship-tests # ./todscale -n1000000
CPUS WALL WALL/CPUS
1 0.426 0.426
2 0.429 0.215
4 0.436 0.109
8 0.452 0.057
16 0.454 0.028
32 0.457 0.014
64 0.459 0.007
128 0.466 0.004
256 0.474 0.002
512 0.518 0.001
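The WALL/CPUS column in these tables is consistent with the wall-clock time divided by the CPU count. The small sketch below reproduces that arithmetic and compares the two 256-CPU rows. The helper names are hypothetical, and the comparison assumes both runs issued the same number of calls, which the B.1 command line does not state.

```c
#include <assert.h>

/* WALL/CPUS column: wall-clock seconds divided by the CPU count. */
static double wall_per_cpu(double wall_seconds, int cpus)
{
    assert(cpus > 0);
    return wall_seconds / cpus;
}

/* How many times faster the new run is than the old one. */
static double speedup(double old_wall, double new_wall)
{
    assert(new_wall > 0.0);
    return old_wall / new_wall;
}
```

For the 256-CPU rows, wall_per_cpu(2778.257, 256) is about 10.853 and wall_per_cpu(0.474, 256) is about 0.002, matching the tables. Under the stated assumption, speedup(2778.257, 0.474) is roughly 5861.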
Clock Accuracy
==============
A. 4 CPU SMP system
A.1 Old code
kmart:/usr/src/noship-tests # ./cdisp
Gettimeofday() = 1092124757.270305000
CLOCK_REALTIME= 1092124757.270382000 resolution= 0.000976563
CLOCK_MONOTONIC= 89.696726590 resolution= 0.000976563
CLOCK_PROCESS_CPUTIME_ID= 0.001242507 resolution= 0.000000001
CLOCK_THREAD_CPUTIME_ID= 0.001255310 resolution= 0.000000001
A.2 New code
kmart:/usr/src/noship-tests # ./cdisp
Gettimeofday() = 1092124478.194530000
CLOCK_REALTIME= 1092124478.194603399 resolution= 0.000000001
CLOCK_MONOTONIC= 88.198315204 resolution= 0.000000001
CLOCK_PROCESS_CPUTIME_ID= 0.001241235 resolution= 0.000000001
CLOCK_THREAD_CPUTIME_ID= 0.001254747 resolution= 0.000000001
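The resolution reported for CLOCK_REALTIME and CLOCK_MONOTONIC changes from 0.000976563 s (1/1024 s) with the old code to 0.000000001 s with the new code. The cdisp source is not shown here. A minimal sketch that produces similar lines with the standard POSIX calls clock_gettime() and clock_getres() could look like this, with describe_clock being a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/*
 * Format "<name>= sec.nsec resolution= sec.nsec" for one clock into buf.
 * Returns 0 on success, -1 if the clock cannot be read or buf is too small.
 */
static int describe_clock(clockid_t id, const char *name, char *buf, size_t len)
{
    struct timespec now, res;

    assert(name != NULL && buf != NULL);
    if (clock_gettime(id, &now) != 0 || clock_getres(id, &res) != 0)
        return -1;

    int n = snprintf(buf, len, "%s= %lld.%09ld resolution= %lld.%09ld",
                     name, (long long)now.tv_sec, now.tv_nsec,
                     (long long)res.tv_sec, res.tv_nsec);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling it once each for CLOCK_REALTIME, CLOCK_MONOTONIC, CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID, then printing the buffer with puts(), yields output in the same layout as the listings above.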
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
