user/sven/linux.git/include/linux/timekeeper_internal.h, branch v4.4.2

time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge

2015-06-12T09:15:49Z

Currently, leapsecond adjustments are done at tick time. As a result, the leapsecond was applied at the first timer tick *after* the leapsecond (~1-10ms late depending on HZ), rather than exactly on the second edge.

This was in part historical, from back when we were always tick based, but correcting it has since been avoided because it adds extra conditional checks in the gettime fastpath, which has performance overhead.

However, it was recently pointed out that ABS_TIME CLOCK_REALTIME timers set for right after the leapsecond could fire a second early, since some timers may be expired before we trigger the timekeeping timer, which then applies the leapsecond.

This isn't quite as bad as it sounds, since behaviorally it is similar to what is possible with ntpd-made leapsecond adjustments done without using the kernel discipline, where, due to latencies, timers may fire just prior to the settimeofday call. (Also, one should note that all applications using CLOCK_REALTIME timers should always be careful, since they are prone to quirks from settimeofday() disturbances.)

However, the purpose of having the kernel do the leap adjustment is to avoid such latencies, so I think this is worth fixing.

So in order to properly keep those timers from firing a second early, this patch modifies the ntp and timekeeping logic so that we keep enough state for the update_base_offsets_now accessor, which provides the hrtimer core the current time, to check and apply the leapsecond adjustment on the second edge. This prevents the hrtimer core from expiring timers too early.

This patch does not modify any other time read path, so no additional overhead is incurred. However, this also means that the leap-second continues to be applied at tick time for all other read-paths.

Apologies to Richard Cochran, who pushed for similar changes years ago, which I resisted due to the concerns about the performance overhead.

While I suspect this isn't extremely critical, folks who care about strict leap-second correctness will likely want to watch this. Potentially a -stable candidate eventually.

Originally-suggested-by: Richard Cochran
Reported-by: Daniel Bristot de Oliveira
Reported-by: Prarit Bhargava
Signed-off-by: John Stultz
Cc: Richard Cochran
Cc: Jan Kara
Cc: Jiri Bohac
Cc: Shuah Khan
Cc: Ingo Molnar
Link: http://lkml.kernel.org/r/1434063297-28657-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner
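
The edge check described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: the struct, its field names, and the helper are hypothetical, and it assumes an inserted leap second, so CLOCK_REALTIME steps back by one second at the edge.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL
#define KTIME_MAX    INT64_MAX

/* Hypothetical, trimmed-down timekeeper state; all values are in ns. */
struct tk_sketch {
	int64_t offs_real;      /* CLOCK_MONOTONIC -> CLOCK_REALTIME offset */
	int64_t next_leap_mono; /* monotonic time of the next leap edge,
	                         * KTIME_MAX when no leap second is pending */
};

/*
 * Offset the timer core should use at monotonic time now_mono.
 * Once the leap edge has passed but the tick has not yet updated
 * offs_real, apply the one-second step here so that absolute
 * CLOCK_REALTIME timers are not expired a second early.
 */
static int64_t offs_real_for_timers(const struct tk_sketch *tk,
				    int64_t now_mono)
{
	int64_t offs = tk->offs_real;

	if (now_mono >= tk->next_leap_mono)
		offs -= NSEC_PER_SEC;
	return offs;
}
```

Only this one accessor pays for the extra comparison, which mirrors the commit's point that the other read paths stay untouched.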

time: Rework debugging variables so they aren't global

2015-05-22T16:13:43Z

Ingo suggested that the timekeeping debugging variables recently added should not be global, and should be tied to the timekeeper's read_base. Thus this patch implements that suggestion.

This version is different from the earlier versions as it keeps the variables in the timekeeper structure rather than in the tkr.

Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Prarit Bhargava
Cc: Richard Cochran
Signed-off-by: John Stultz

hrtimer: Make offset update smarter

2015-04-22T15:06:49Z

On every tick/hrtimer interrupt we update the offset variables of the clock bases. That's silly because these offsets change very seldom.

Add a sequence counter to the time keeping code which keeps track of the offset updates (clock_was_set()). Have a sequence cache in the hrtimer cpu bases to evaluate whether the offsets must be updated or not. This allows us later to avoid pointless cacheline pollution.

Signed-off-by: Thomas Gleixner
Reviewed-by: Preeti U Murthy
Acked-by: Peter Zijlstra
Cc: Viresh Kumar
Cc: Marcelo Tosatti
Cc: Frederic Weisbecker
Cc: John Stultz
Link: http://lkml.kernel.org/r/20150414203501.132820245@linutronix.de
Signed-off-by: Thomas Gleixner
Cc: John Stultz
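
The sequence-cache idea can be sketched in plain C as follows. The struct and function names are made up for illustration, and the sketch ignores the locking the kernel needs around these updates.

```c
#include <stdint.h>

/* Hypothetical timekeeper side: offsets plus an update counter. */
struct tk_offsets {
	unsigned int clock_was_set_seq; /* bumped whenever an offset changes */
	int64_t offs_real;
	int64_t offs_boot;
	int64_t offs_tai;
};

/* Hypothetical per-CPU timer base: cached offsets and the seen counter. */
struct cpu_base_sketch {
	unsigned int cached_seq;
	int64_t offs_real;
	int64_t offs_boot;
	int64_t offs_tai;
};

/* Writer side: change an offset and advertise it via the counter. */
static void tk_set_offs_real(struct tk_offsets *tk, int64_t offs_real)
{
	tk->offs_real = offs_real;
	tk->clock_was_set_seq++;
}

/*
 * Reader side, called on every tick or timer interrupt. The offsets
 * are copied only when the counter moved; returns 1 if a copy happened.
 */
static int refresh_offsets(const struct tk_offsets *tk,
			   struct cpu_base_sketch *cb)
{
	if (cb->cached_seq == tk->clock_was_set_seq)
		return 0;

	cb->cached_seq = tk->clock_was_set_seq;
	cb->offs_real = tk->offs_real;
	cb->offs_boot = tk->offs_boot;
	cb->offs_tai = tk->offs_tai;
	return 1;
}
```

In the common case the reader touches a single counter and leaves its cached offsets alone, which is what avoids the needless cacheline traffic.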

time: Add timekeeper::tkr_raw

2015-03-27T08:45:07Z

Introduce tkr_raw and make use of it.

base_raw -> tkr_raw.base
clock->{mult,shift} -> tkr_raw.{mult,shift}

Kill timekeeping_get_ns_raw() in favour of timekeeping_get_ns(&tkr_raw); this removes all mono_raw special casing.

Duplicate the updates to tkr_mono.cycle_last into tkr_raw.cycle_last; both need the same value.

Signed-off-by: Peter Zijlstra (Intel)
Acked-by: John Stultz
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20150319093400.422589590@infradead.org
Signed-off-by: Ingo Molnar
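
To show why a second read base removes the special casing, here is a minimal sketch of one conversion helper serving both clocks. The struct layout and the name get_ns_sketch are assumptions for illustration, loosely modeled on the fields named in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical read base: everything needed to turn cycles into ns. */
struct read_base_sketch {
	uint64_t cycle_last; /* counter value at the last update */
	uint64_t mask;       /* counter width mask */
	uint32_t mult;       /* ns per cycle, scaled by 2^shift */
	uint32_t shift;
	uint64_t xtime_nsec; /* ns accumulated so far, scaled by 2^shift */
};

/* One helper for every read base, so no clock needs its own variant. */
static uint64_t get_ns_sketch(const struct read_base_sketch *tkr,
			      uint64_t cycle_now)
{
	uint64_t delta;

	assert(tkr->shift < 64);
	delta = (cycle_now - tkr->cycle_last) & tkr->mask;
	return (delta * tkr->mult + tkr->xtime_nsec) >> tkr->shift;
}
```

A monotonic base and a raw base would then differ only in their mult value (NTP-adjusted versus unadjusted) while sharing the same cycle_last.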

time: Rename timekeeper::tkr to timekeeper::tkr_mono

2015-03-27T08:45:06Z

In preparation for adding another tkr field, rename this one to tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base, since the structure is not specific to CLOCK_MONOTONIC and the mono name got added to the tk_read_base instance.

Lots of trivial churn.

Signed-off-by: Peter Zijlstra (Intel)
Acked-by: John Stultz
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org
Signed-off-by: Ingo Molnar

timekeeping: Provide fast accessor to the seconds part of CLOCK_MONOTONIC

2014-10-29T14:15:40Z

This is the counterpart to get_seconds() based on CLOCK_MONOTONIC.

The use case for this interface is kernel internal coarse grained timestamps which require neither the nanoseconds fraction of current time nor the CLOCK_REALTIME properties. Such timestamps can currently only be retrieved by calling ktime_get_ts64() and using the tv_sec field of the returned timespec64. That's inefficient as it involves the read of the clocksource, math operations and must be protected by the timekeeper sequence counter.

To avoid the sequence counter protection we restrict the return value to unsigned 32bit on 32bit machines. This covers ~136 years of uptime and therefore an overflow is not expected to hit anytime soon.

To avoid math in the function we calculate the current seconds portion of CLOCK_MONOTONIC when the timekeeper gets updated in tk_update_ktime_data(), similar to the CLOCK_REALTIME counterpart xtime_sec.

[ tglx: Massaged changelog, simplified and commented the update function, added docbook comment ]

Signed-off-by: Heena Sirwani
Reviewed-by: Arnd Bergman
Cc: John Stultz
Cc: opw-kernel@googlegroups.com
Link: http://lkml.kernel.org/r/da0b63f4bdf3478909f92becb35861197da3a905.1414578445.git.heenasirwani@gmail.com
Signed-off-by: Thomas Gleixner
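
A rough sketch of the "compute at update time, read without math" approach follows. The struct and both functions are hypothetical stand-ins, and the nanosecond fields are assumed to be already unscaled, which the real timekeeper does not guarantee.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical slice of timekeeper state. */
struct tk_secs_sketch {
	int64_t xtime_sec;         /* CLOCK_REALTIME seconds */
	int64_t xtime_nsec;        /* CLOCK_REALTIME ns part, 0..NSEC_PER_SEC-1 */
	int64_t wall_to_mono_sec;  /* realtime -> monotonic offset, seconds */
	int64_t wall_to_mono_nsec; /* offset ns part, 0..NSEC_PER_SEC-1 */
	unsigned long ktime_sec;   /* cached CLOCK_MONOTONIC seconds */
};

/* Update path: do the arithmetic once per timekeeper update. */
static void update_ktime_sec(struct tk_secs_sketch *tk)
{
	int64_t seconds = tk->xtime_sec + tk->wall_to_mono_sec;

	/* The two ns parts together can carry over into one extra second. */
	if (tk->xtime_nsec + tk->wall_to_mono_nsec >= NSEC_PER_SEC)
		seconds++;
	tk->ktime_sec = (unsigned long)seconds;
}

/* Read path: a single load, no clocksource read and no math. */
static unsigned long get_mono_seconds(const struct tk_secs_sketch *tk)
{
	return tk->ktime_sec;
}
```

Because the cached value is a single unsigned long, the read is one word-sized load on both 32-bit and 64-bit machines, which is the reason the commit gives for restricting the width.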

timekeeping: Fixup typo in update_vsyscall_old definition

2014-07-30T07:26:25Z

In commit 4a0e637738f0 ("clocksource: Get rid of cycle_last"), currently in the -tip tree, there was a small typo where cycles_t was used instead of cycle_t. This broke ppc64 builds.

Fix this by using the proper cycle_t type for this usage, in both the definition and the ia64 implementation.

Now, having both cycle_t and cycles_t types seems like a very bad idea, just asking for these sorts of issues. But that will be a cleanup for another day.

Reported-by: Stephen Rothwell
Signed-off-by: John Stultz
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1406349439-11785-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner

timekeeping: Use cached ntp_tick_length when accumulating error

2014-07-23T22:01:57Z

By caching the ntp_tick_length() when we correct the frequency error, and then using that cached value to accumulate error, we avoid large initial errors when the tick length is changed.

This makes convergence happen much faster in the simulator, since the initial error doesn't have to be slowly whittled away.

This initially seems like an accounting error, but Miroslav pointed out that ntp_tick_length() can change mid-tick, so when we apply it in the error accumulation, we are applying any recent change to the entire tick.

This approach chooses to apply changes in the ntp_tick_length() only to the next tick, which allows us to calculate the freq correction before using the new tick length, which avoids accumulating error.

Credit to Miroslav for pointing this out and providing the original patch this functionality has been pulled out from, along with the rationale.

Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Reported-by: Miroslav Lichvar
Signed-off-by: John Stultz
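
The effect of the cached tick length can be seen in a deliberately idealized sketch. All names are hypothetical, units are arbitrary, and freq_correct() assumes the frequency correction is perfect, which the real code only approximates.

```c
#include <stdint.h>

/* Hypothetical error-tracking state. */
struct err_sketch {
	int64_t ntp_tick;       /* tick length cached at the last correction */
	int64_t xtime_interval; /* time actually accumulated per tick */
	int64_t ntp_error;      /* accumulated (wanted - actual) */
};

/* Per-tick accounting uses the cached value, not the live tick length. */
static void accumulate_tick(struct err_sketch *tk)
{
	tk->ntp_error += tk->ntp_tick - tk->xtime_interval;
}

/*
 * Frequency correction, run after the accumulation: adopt the current
 * tick length for the NEXT tick and cache it for the next accounting.
 */
static void freq_correct(struct err_sketch *tk, int64_t current_tick_length)
{
	tk->xtime_interval = current_tick_length; /* idealized: exact match */
	tk->ntp_tick = current_tick_length;
}
```

If the tick length moves from 1000 to 1010 in the middle of a tick, accounting that tick against the live value would book 10 units of error that the clock never had a chance to avoid; with the cached value the tick is booked against 1000 and the new length takes effect from the next tick.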

timekeeping: Rework frequency adjustments to work better w/ nohz

2014-07-23T22:01:56Z

The existing timekeeping_adjust logic has always been complicated to understand. Further, since it was developed prior to NOHZ becoming common, it's not surprising it performs poorly when NOHZ is enabled.

Since Miroslav pointed out the problematic nature of the existing code in the NOHZ case, I've tried to refactor the code to perform better.

The problem with the previous approach was that it tried to adjust for the total cumulative error using a scaled dampening factor. This resulted in large errors being corrected slowly, while small errors were corrected quickly. With NOHZ the timekeeping code doesn't know how far out the next tick will be, so this results in bad over-correction of small errors, and insufficient correction of large errors.

Inspired by Miroslav's patch, I've refactored the code to try to address the correction in two steps.

1) Check the future freq error for the next tick, and if the frequency error is large, try to make sure we correct it so it doesn't cause much accumulated error.

2) Then make a small single unit adjustment to correct any cumulative error that has collected over time.

This method performs fairly well in the simulator Miroslav created.

Major credit to Miroslav for pointing out the issue, providing the original patch to resolve this, a simulator for testing, as well as helping debug and resolve issues in my implementation so that it performed closer to his original implementation.

Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Reported-by: Miroslav Lichvar
Signed-off-by: John Stultz
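
The two steps can be sketched roughly as below. This is a toy model, not the kernel algorithm: the struct, the names, and the use of a plain integer division in step 1 are all assumptions made to keep the example short.

```c
#include <stdint.h>

/* Hypothetical adjustment state. */
struct adj_sketch {
	int64_t cycles_per_tick;
	int64_t mult;      /* base multiplier: scaled ns per cycle */
	int     err_mult;  /* 1 while the single-unit correction is active */
	int64_t cum_error; /* accumulated error; > 0 means the clock is behind */
};

/* Multiplier actually used for cycle-to-ns conversion. */
static int64_t effective_mult(const struct adj_sketch *a)
{
	return a->mult + a->err_mult;
}

static void adjust_sketch(struct adj_sketch *a, int64_t tick_length)
{
	/* Step 1: remove the frequency error expected over the next tick. */
	int64_t freq_error = tick_length - a->cycles_per_tick * a->mult;

	a->mult += freq_error / a->cycles_per_tick;

	/*
	 * Step 2: nudge by a single multiplier unit while cumulative
	 * error remains, and drop the nudge once it is gone.
	 */
	a->err_mult = a->cum_error > 0 ? 1 : 0;
}
```

Neither step depends on how far away the next tick is, which is the property the commit is after for NOHZ: the frequency is corrected directly, and the residual error is worked off one unit at a time.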

timekeeping: Create struct tk_read_base and use it in struct timekeeper

2014-07-23T22:01:53Z

The members of the new struct are the ones required for the new NMI safe accessor to clock monotonic. In order to reuse the existing timekeeping code and to make the update of the fast NMI safe timekeepers a simple memcpy, use the struct for the timekeeper as well and convert all users.

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Mathieu Desnoyers
Signed-off-by: John Stultz