* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
@ 2003-10-10 16:42 ` David Mosberger
2003-10-13 2:11 ` Ian Wienand
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2003-10-10 16:42 UTC
To: linux-ia64
Ian,
Just a quick note (got to run): I think you correctly identified the
race causing the now < last_tick problem. Purely from (bad) memory, I
think the problem was introduced when xtime_lock was converted from an
irq-safe spinlock to a seq-lock. In theory, xtime_lock still protects
get_offset(), but the theory only holds as long as the seq-lock body
is "transactional" (no side-effects until read_seqretry() returns 0).
I think the source of the problem is that we consider the value
returned by get_offset() to be valid EVEN when read_seqretry() returns
1. Because of that, we'll end up updating last_nsec_offset with a
potentially bad value.
--david
* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
2003-10-10 16:42 ` David Mosberger
@ 2003-10-13 2:11 ` Ian Wienand
2003-10-13 18:17 ` David Mosberger
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Ian Wienand @ 2003-10-13 2:11 UTC
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 616 bytes --]
On Fri, Oct 10, 2003 at 09:42:35AM -0700, David Mosberger wrote:
> I think the source of the problem is that we consider the value
> returned by get_offset() to be valid EVEN when read_seqretry() returns
> 1. Because of that, we'll end up updating last_nsec_offset with a
> potentially bad value.
Well, to my eyes the use of the xtime_lock in do_gettimeofday() looks
OK, but I guess what you are saying is that the message is moot --
xtime_lock protects everything itc_get_offset() needs, and
do_gettimeofday() has a read lock on xtime_lock and so reads the
offset again if something was updated underneath it.
-i
[-- Attachment #2: time.c.nomsg.diff --]
[-- Type: text/plain, Size: 1166 bytes --]
===== arch/ia64/kernel/time.c 1.35 vs edited =====
--- 1.35/arch/ia64/kernel/time.c Wed Oct 8 12:53:38 2003
+++ edited/arch/ia64/kernel/time.c Mon Oct 13 11:54:05 2003
@@ -65,8 +65,11 @@
 }
 /*
- * Return the number of nano-seconds that elapsed since the last update to jiffy. The
- * xtime_lock must be at least read-locked when calling this routine.
+ * Return the number of nano-seconds that elapsed since the last
+ * update to jiffy. It is quite possible that the timer interrupt
+ * will interrupt this and result in a race for any of jiffies,
+ * wall_jiffies or itm_next. Thus, the xtime_lock must be at least
+ * read-locked when calling this routine.
  */
 unsigned long
 itc_get_offset (void)
@@ -77,11 +80,6 @@
 	last_tick = (cpu_data(TIME_KEEPER_ID)->itm_next
 		     - (lost + 1)*cpu_data(TIME_KEEPER_ID)->itm_delta);
-	if (unlikely((long) (now - last_tick) < 0)) {
-		printk(KERN_ERR "CPU %d: now < last_tick (now=0x%lx,last_tick=0x%lx)!\n",
-		       smp_processor_id(), now, last_tick);
-		return last_nsec_offset;
-	}
 	elapsed_cycles = now - last_tick;
 	return (elapsed_cycles*local_cpu_data->nsec_per_cyc) >> IA64_NSEC_PER_CYC_SHIFT;
 }
* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
2003-10-10 16:42 ` David Mosberger
2003-10-13 2:11 ` Ian Wienand
@ 2003-10-13 18:17 ` David Mosberger
2003-10-13 23:06 ` Ian Wienand
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2003-10-13 18:17 UTC
To: linux-ia64
>>>>> On Mon, 13 Oct 2003 12:11:52 +1000, Ian Wienand <ianw@gelato.unsw.edu.au> said:
Ian> Well, to my eyes the use of the xtime_lock in do_gettimeofday() looks
Ian> OK, but I guess what you are saying is that the message is moot --
Ian> xtime_lock protects everything itc_get_offset() needs, and
Ian> do_gettimeofday() has a read lock on xtime_lock and so reads the
Ian> offset again if something was updated underneath it.
You do realize that xtime_lock is NOT a lock at all? Seqlock is a
scheme for lock-free synchronization. Readers and writers will run
concurrently and the only guarantee that you get is that if a writer
interfered with a reader, the reader will retry its operation (but of
course, this means that the reader occasionally will be seeing
inconsistent data).
--david
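A minimal sketch of that guarantee, assuming a toy sequence counter built on C11 atomics. The toy_* names and the (sec, nsec) values are hypothetical and only loosely modelled on read_seqbegin()/read_seqretry(); this is not the kernel implementation. The reader is "interrupted" by a writer between its two loads, observes a pair that never existed, and the retry check flags it:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy seqlock-protected time value (not the kernel's types). */
struct toy_time {
	atomic_uint seq;	/* odd while a write is in progress */
	atomic_ulong sec;
	atomic_ulong nsec;
};

static void toy_init(struct toy_time *t, unsigned long sec, unsigned long nsec)
{
	atomic_init(&t->seq, 0);
	atomic_init(&t->sec, sec);
	atomic_init(&t->nsec, nsec);
}

/* Reader side: remember the sequence number seen at the start. */
static unsigned toy_read_begin(struct toy_time *t)
{
	return atomic_load(&t->seq);
}

/* Reader side: true means "a writer interfered, throw the data away". */
static bool toy_read_retry(struct toy_time *t, unsigned start)
{
	return (start & 1) || atomic_load(&t->seq) != start;
}

/* Writer side: bump the counter before and after touching the data. */
static void toy_write(struct toy_time *t, unsigned long sec, unsigned long nsec)
{
	atomic_fetch_add(&t->seq, 1);	/* counter is now odd */
	atomic_store(&t->sec, sec);
	atomic_store(&t->nsec, nsec);
	atomic_fetch_add(&t->seq, 1);	/* counter is even again */
}

/* A writer runs between the reader's two loads (simulated in one thread). */
static bool torn_read_is_detected(void)
{
	struct toy_time t;
	toy_init(&t, 10, 999);

	unsigned start = toy_read_begin(&t);
	unsigned long sec = atomic_load(&t.sec);	/* old value */
	toy_write(&t, 11, 0);				/* the "timer interrupt" */
	unsigned long nsec = atomic_load(&t.nsec);	/* new value */

	assert(sec == 10 && nsec == 0);	/* a pair the writer never stored */
	return toy_read_retry(&t, start);
}

/* No writer in between: the snapshot is consistent and accepted. */
static bool clean_read_is_accepted(void)
{
	struct toy_time t;
	toy_init(&t, 10, 999);

	unsigned start = toy_read_begin(&t);
	unsigned long sec = atomic_load(&t.sec);
	unsigned long nsec = atomic_load(&t.nsec);

	return sec == 10 && nsec == 999 && !toy_read_retry(&t, start);
}
```

Nothing stops the reader from seeing the inconsistent pair; the scheme only promises that the retry check will report it afterwards, so the reader must not act on the values before that check has passed.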
* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
` (2 preceding siblings ...)
2003-10-13 18:17 ` David Mosberger
@ 2003-10-13 23:06 ` Ian Wienand
2003-10-14 5:23 ` David Mosberger
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Ian Wienand @ 2003-10-13 23:06 UTC
To: linux-ia64
On Mon, Oct 13, 2003 at 11:17:06AM -0700, David Mosberger wrote:
> You do realize that xtime_lock is NOT a lock at all? Seqlock is a
> scheme for lock-free synchronization.
That was my point, sorry.
do_gettimeofday() does
	while (1) {
		seq = read_seqbegin(&xtime_lock);
		{
			old = last_nsec_offset;
			offset = time_interpolator_get_offset();
			sec = xtime.tv_sec;
			nsec = xtime.tv_nsec;
		}
		if (unlikely(read_seqretry(&xtime_lock, seq)))
			continue;
		... and so on ...
	}
Previously, when that read_seqbegin() was some kind of irq-safe lock,
time_interpolator_get_offset() (which is itc_get_offset()) should
never have been interrupted (especially by the timer interrupt), and
the warning message (now < last_tick) meant something was wrong.
With seqlock synchronisation, it's probable that itc_get_offset() will
be interrupted every now and then, but do_gettimeofday() will keep
retrying it until read_seqretry() informs it that it read the right
values. So the warning message in itc_get_offset() isn't really needed?
-i
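For reference, a small sketch of the arithmetic behind the check under discussion, using made-up cycle counts. The helper names and the numbers are hypothetical; the scenario assumes that `now` and `lost` were sampled before a timer tick and itm_next after it, which is one way the race could show up:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same expression as in itc_get_offset(): start of the current tick. */
static uint64_t last_tick_of(uint64_t itm_next, uint64_t lost, uint64_t itm_delta)
{
	return itm_next - (lost + 1) * itm_delta;
}

/*
 * Wraparound-safe "now < last_tick" test: subtract as unsigned, then
 * look at the sign of the difference (assumes two's complement, as on
 * every platform the kernel supports).
 */
static bool now_before_last_tick(uint64_t now, uint64_t last_tick)
{
	return (int64_t)(now - last_tick) < 0;
}

/* Consistent inputs: tick started at 4000, now is 4990. */
static bool consistent_case(void)
{
	return !now_before_last_tick(4990, last_tick_of(5000, 0, 1000));
}

/*
 * Mixed inputs: now (4990) and lost (0) are from before the tick,
 * itm_next (6000) from after it, so last_tick comes out as 5000.
 */
static bool racy_case(void)
{
	return now_before_last_tick(4990, last_tick_of(6000, 0, 1000));
}
```

With these numbers the racy case trips the check even though nothing is wrong with the counter itself; the inputs simply belong to two different ticks, and the retry in do_gettimeofday() discards the result.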
* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
` (3 preceding siblings ...)
2003-10-13 23:06 ` Ian Wienand
@ 2003-10-14 5:23 ` David Mosberger
2003-10-14 5:53 ` Ian Wienand
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2003-10-14 5:23 UTC
To: linux-ia64
>>>>> On Tue, 14 Oct 2003 09:06:21 +1000, Ian Wienand <ianw@gelato.unsw.edu.au> said:
Ian> That was my point, sorry.
Ian> do_gettimeofday() does
Ian> 	while (1) {
Ian> 		seq = read_seqbegin(&xtime_lock);
Ian> 		{
Ian> 			old = last_nsec_offset;
Ian> 			offset = time_interpolator_get_offset();
Ian> 			sec = xtime.tv_sec;
Ian> 			nsec = xtime.tv_nsec;
Ian> 		}
Ian> 		if (unlikely(read_seqretry(&xtime_lock, seq)))
Ian> 			continue;
Ian> 		... and so on ...
Ian> 	}
Ian> Previously, when that read_seqbegin() was some kind of irq-safe lock,
Ian> time_interpolator_get_offset() (which is itc_get_offset()) should
Ian> never have been interrupted (especially by the timer interrupt), and
Ian> the warning message (now < last_tick) meant something was wrong.
Ian> With seqlock synchronisation, it's probable that itc_get_offset() will
Ian> be interrupted every now and then, but do_gettimeofday() will keep
Ian> retrying it until read_seqretry() informs it that it read the right
Ian> values. So the warning message in itc_get_offset() isn't really needed?
Hmmh, I seem to have misremembered the code. I thought we updated
last_nsec_offset _before_ read_seqretry(), but that's not the case. I
think you may be right that we can simply delete the (bogus)
consistency-check. May want to add a comment about that, though.
--david
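A rough sketch of the ordering that matters here, assuming C11 atomics in place of kernel primitives. The thread only establishes that last_nsec_offset is updated after read_seqretry(); the clamp and the compare-and-swap below are assumptions about what such an update could look like, and last_offset_ns and commit_offset() are hypothetical names:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for last_nsec_offset. */
static atomic_ulong last_offset_ns;

/*
 * Call this only AFTER the retry check has accepted `*offset`.
 * `old` is the value of last_offset_ns sampled inside the read section.
 *
 * Returns true if `*offset` may be used (possibly clamped to `old` so
 * that time never appears to go backwards).  Returns false if another
 * caller changed last_offset_ns in the meantime, in which case the
 * caller should redo the whole read section.
 */
static bool commit_offset(unsigned long old, unsigned long *offset)
{
	if (*offset <= old) {
		*offset = old;	/* nothing new to publish */
		return true;
	}
	/* Publish the validated value only if nobody else got there first. */
	return atomic_compare_exchange_strong(&last_offset_ns, &old, *offset);
}
```

Because the shared variable is written only once the snapshot is known to be good, a bogus offset computed during a racing update never becomes visible to other callers.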
* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
` (4 preceding siblings ...)
2003-10-14 5:23 ` David Mosberger
@ 2003-10-14 5:53 ` Ian Wienand
2003-10-14 16:58 ` David Mosberger
2003-10-14 23:05 ` Ian Wienand
7 siblings, 0 replies; 9+ messages in thread
From: Ian Wienand @ 2003-10-14 5:53 UTC
To: linux-ia64
On Mon, Oct 13, 2003 at 10:23:39PM -0700, David Mosberger wrote:
> consistency-check. May want to add a comment about that, though.
A few messages up I added something like
+ * Return the number of nano-seconds that elapsed since the last
+ * update to jiffy. It is quite possible that the timer interrupt
+ * will interrupt this and result in a race for any of jiffies,
+ * wall_jiffies or itm_next. Thus, the xtime_lock must be at least
+ * read-locked when calling this routine.
although read-locked should probably be 'read synchronised' or
something to avoid confusion.
-i
* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
` (5 preceding siblings ...)
2003-10-14 5:53 ` Ian Wienand
@ 2003-10-14 16:58 ` David Mosberger
2003-10-14 23:05 ` Ian Wienand
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2003-10-14 16:58 UTC
To: linux-ia64
>>>>> On Tue, 14 Oct 2003 15:53:15 +1000, Ian Wienand <ianw@gelato.unsw.edu.au> said:
Ian> On Mon, Oct 13, 2003 at 10:23:39PM -0700, David Mosberger
Ian> wrote:
>> consistency-check. May want to add a comment about that, though.
Ian> A few messages up I added something like
Ian> + * Return the number of nano-seconds that elapsed since the last
Ian> + * update to jiffy. It is quite possible that the timer interrupt
Ian> + * will interrupt this and result in a race for any of jiffies,
Ian> + * wall_jiffies or itm_next. Thus, the xtime_lock must be at least
Ian> + * read-locked when calling this routine.
Ian> although read-locked should probably be 'read synchronised' or
Ian> something to avoid confusion.
Sounds fine to me. Do you want to send me a complete patch so you'll
get the proper credit?
Thanks,
--david
* Re: [PATCH_TAKE_2] now < last_tick problem
2003-10-10 4:13 [PATCH_TAKE_2] now < last_tick problem Ian Wienand
` (6 preceding siblings ...)
2003-10-14 16:58 ` David Mosberger
@ 2003-10-14 23:05 ` Ian Wienand
7 siblings, 0 replies; 9+ messages in thread
From: Ian Wienand @ 2003-10-14 23:05 UTC
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 148 bytes --]
On Tue, Oct 14, 2003 at 09:58:14AM -0700, David Mosberger wrote:
> Sounds fine to me. Do you want to send me a complete patch
attached
thanks,
-i
[-- Attachment #2: time.c.take3.diff --]
[-- Type: text/plain, Size: 1221 bytes --]
===== arch/ia64/kernel/time.c 1.35 vs edited =====
--- 1.35/arch/ia64/kernel/time.c Wed Oct 8 12:53:38 2003
+++ edited/arch/ia64/kernel/time.c Wed Oct 15 08:54:31 2003
@@ -65,8 +65,12 @@
 }
 /*
- * Return the number of nano-seconds that elapsed since the last update to jiffy. The
- * xtime_lock must be at least read-locked when calling this routine.
+ * Return the number of nano-seconds that elapsed since the last
+ * update to jiffy. It is quite possible that the timer interrupt
+ * will interrupt this and result in a race for any of jiffies,
+ * wall_jiffies or itm_next. Thus, the xtime_lock must be at least
+ * read synchronised when calling this routine (see do_gettimeofday()
+ * below for an example).
  */
 unsigned long
 itc_get_offset (void)
@@ -77,11 +81,6 @@
 	last_tick = (cpu_data(TIME_KEEPER_ID)->itm_next
 		     - (lost + 1)*cpu_data(TIME_KEEPER_ID)->itm_delta);
-	if (unlikely((long) (now - last_tick) < 0)) {
-		printk(KERN_ERR "CPU %d: now < last_tick (now=0x%lx,last_tick=0x%lx)!\n",
-		       smp_processor_id(), now, last_tick);
-		return last_nsec_offset;
-	}
 	elapsed_cycles = now - last_tick;
 	return (elapsed_cycles*local_cpu_data->nsec_per_cyc) >> IA64_NSEC_PER_CYC_SHIFT;
 }