Re: [Qemu-devel] [PATCH v4 5/5] hpet 'driftfix': add code in hpet_timer() to compensate delayed callbacks and coalesced interrupts

From: Ulrich Obergfell <uobergfe@redhat.com>
To: Zachary Amsden <zamsden@redhat.com>
Cc: qemu-devel@nongnu.org, aliguori@us.ibm.com, kvm@vger.kernel.org,
	jan kiszka <jan.kiszka@siemens.com>,
	mtosatti@redhat.com, gcosta@redhat.com, avi@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4 5/5] hpet 'driftfix': add code in hpet_timer() to compensate delayed callbacks and coalesced interrupts
Date: Thu, 12 May 2011 05:48:46 -0400 (EDT)
Message-ID: <179318643.452716.1305193726535.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
In-Reply-To: <4DCB3EDF.5020704@redhat.com>

Hi Zachary,

1. re.:

>> +static void hpet_timer_driftfix_reset(HPETTimer *t)
>> +{
>> +    if (t->state->driftfix && timer_is_periodic(t)) {
>> +        t->ticks_not_accounted = t->prev_period = t->period;
>>    
>
> This is rather confusing.  Clearly, ticks_not_accounted isn't actually
> ticks not accounted, it's a different variable entirely which is based
> on the current period.
>
> If it were actually ticks_not_accounted, it should be reset to zero.

hpet_timer_driftfix_reset() is called at two sites before the periodic
timer is actually started via hpet_set_timer(). An interrupt shall be
delivered at the first callback of hpet_timer(), after t->period amount
of time has passed. Those t->period ticks are recorded as 'not accounted'
from the reset onwards, until the interrupt is delivered to the guest,
which is why the counter starts at t->period rather than at zero. The
following message explains why t->prev_period is needed in addition to
t->period and how it is used.

  http://lists.gnu.org/archive/html/qemu-devel/2011-05/msg00275.html

On entry to hpet_timer(), t->ticks_not_accounted is >= t->prev_period,
and it is increased by t->period each time the comparator register is
advanced. This is also the place where we can detect whether the current
callback was delayed. If period_count is > 1, the delay was so large
that we failed to inject at least one interrupt (which needs to be
compensated).

             while (hpet_time_after(cur_tick, t->cmp)) {
                 t->cmp = (uint32_t)(t->cmp + t->period);
+                t->ticks_not_accounted += t->period;
+                period_count++;
             }

Please note that period_count can be zero. This happens during callbacks
that inject additional interrupts 'inside of' a period interval in order
to compensate for missed and coalesced interrupts.
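
The three possible outcomes of the loop (period_count of 0, 1, or > 1)
can be reproduced with a small stand-alone model. This is only a sketch
under simplifying assumptions, not the patch itself: it uses plain 64-bit
arithmetic in place of hpet_time_after() and the 32-bit wrap of t->cmp,
and struct timer_model and advance_comparator() are names made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the HPETTimer fields discussed above. */
struct timer_model {
    uint64_t cmp;                  /* comparator: start of the next period */
    uint64_t period;               /* period length in ticks */
    uint64_t ticks_not_accounted;  /* ticks not yet delivered as interrupts */
};

/*
 * Advance the comparator past cur_tick and return how many period
 * boundaries were crossed. Simplification: a plain '>' replaces the
 * wraparound-safe hpet_time_after() of the real code.
 */
static unsigned advance_comparator(struct timer_model *t, uint64_t cur_tick)
{
    unsigned period_count = 0;

    assert(t->period > 0);  /* a zero period would loop forever */
    while (cur_tick > t->cmp) {
        t->cmp += t->period;
        t->ticks_not_accounted += t->period;
        period_count++;
    }
    return period_count;
}
```

With cmp = 100 and period = 100, a callback at tick 150 yields
period_count = 1 (on time), a second callback at tick 160 yields 0
(a compensation callback inside the interval), and a callback at
tick 450 yields 3 (two interrupts were missed).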

t->ticks_not_accounted is decreased only if an interrupt was delivered
to the guest. If an interrupt could not be delivered, the ticks that are
represented by that interrupt remain recorded as 'not accounted' (this
also triggers compensation of coalesced interrupts).

+            if (irq_delivered) {
+                t->ticks_not_accounted -= t->prev_period;
+                t->prev_period = t->period;
+            } else {
+                if (period_count) {
+                    t->irq_rate++;
+                    t->irq_rate = MIN(t->irq_rate, MAX_IRQ_RATE);
+                }
+            }
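
The bookkeeping in this hunk can be exercised in isolation. The sketch
below is a hedged model rather than the patch code: struct drift_model
and account_callback() are hypothetical names, and the MAX_IRQ_RATE
value is an assumed placeholder because the real definition is not
quoted in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cap for this sketch; the patch's actual value is not shown here. */
#define MAX_IRQ_RATE 10u

/* Hypothetical stand-in for the driftfix fields of HPETTimer. */
struct drift_model {
    uint64_t period;
    uint64_t prev_period;
    uint64_t ticks_not_accounted;
    uint32_t irq_rate;
};

/*
 * Account for one hpet_timer() callback. A delivered interrupt pays off
 * one period's worth of ticks, measured with the period length that was
 * in effect when those ticks were recorded. An undelivered (coalesced)
 * interrupt leaves the ticks pending and, at the start of an interval,
 * raises the injection rate up to the cap.
 */
static void account_callback(struct drift_model *t, bool irq_delivered,
                             unsigned period_count)
{
    if (irq_delivered) {
        assert(t->ticks_not_accounted >= t->prev_period);
        t->ticks_not_accounted -= t->prev_period;
        t->prev_period = t->period;
    } else if (period_count) {
        t->irq_rate++;
        if (t->irq_rate > MAX_IRQ_RATE) {
            t->irq_rate = MAX_IRQ_RATE;
        }
    }
}
```

Note that a failed delivery never touches ticks_not_accounted, so the
backlog stays recorded until a later callback delivers it.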

2. re.:

>> +                if (period_count) {
>> +                    t->divisor = t->irq_rate;
>> +                }
>> +                diff /= t->divisor--;
>>    
>
> Why subtracting from the divisor?  Shouldn't the divisor and irq_rate
> change in lockstep?

t->irq_rate is the rate at which interrupts are delivered to the guest,
relative to the period length. If t->irq_rate is 1, then one interrupt
shall be injected per period interval. If t->irq_rate is > 1, we are in
'compensation mode' (trying to inject additional interrupts 'inside of'
an interval).

- If t->irq_rate is 2, then two interrupts shall be injected during a
  period interval (one regular and one additional).

- If t->irq_rate is 3, then three interrupts shall be injected during a
  period interval (one regular and two additional).

- etc.

A non-zero period_count marks the start of an interval, at which the
divisor is set to t->irq_rate. Let's take a look at an example where
t->divisor = t->irq_rate = 3.

- The current period starts at t(p), the next period starts at t(p+1).
  We are now at t(p). The first additional interrupt shall be injected
  at a(1), the second at a(2). Hence, the next callback is scheduled
  to occur at a(1) = t(p) + diff / 3.

  t(p)                 a(1)                 a(2)                 t(p+1)
  +--------------------+--------------------+--------------------+ time
   <--------------------------- diff --------------------------->
   <---- diff / 3 ---->

- We are now in the callback at a(1). A new diff has been calculated,
  which is equal to the remaining time in the interval from a(1) to
  t(p+1). The second additional interrupt shall be injected at a(2).
  Hence, the next callback is scheduled to occur at a(2) = a(1) + diff / 2.

                       a(1)                 a(2)                 t(p+1)
                       +--------------------+--------------------+ time
                        <------------------ diff --------------->
                        <---- diff / 2 ---->

- We are now in the callback at a(2). A new diff has been calculated,
  which is equal to the remaining time in the interval from a(2) to
  t(p+1). The next callback marks the beginning of period t(p+1).

                                            a(2)                 t(p+1)
                                            +--------------------+ time
                                             <------ diff ------>
                                             <---- diff / 1 ---->

At t(p), the divisor is set to irq_rate (= 3). diff is divided by 3
and the divisor is decremented by one. At a(1), the divisor is 2.
diff is divided by 2 and the divisor is decremented by one. At a(2),
the divisor is 1. diff is divided by 1 and the divisor will be reset
at the beginning of t(p+1).
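
The walk-through above can be checked numerically. The following is a
minimal sketch, assuming that every callback fires exactly when it was
scheduled (real callbacks may be late); next_delay() and
schedule_interval() are hypothetical helpers that do not exist in the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors 'diff /= t->divisor--': divide first, then decrement. */
static uint64_t next_delay(uint64_t diff, uint32_t *divisor)
{
    uint64_t delay;

    assert(*divisor > 0);
    delay = diff / *divisor;
    (*divisor)--;
    return delay;
}

/*
 * Compute the callback times inside one period interval [start, end]
 * for a given irq_rate. Writes at most out_len times to out and
 * returns how many were written.
 */
static unsigned schedule_interval(uint64_t start, uint64_t end,
                                  uint32_t irq_rate,
                                  uint64_t *out, unsigned out_len)
{
    uint32_t divisor = irq_rate;  /* reset at the start of the interval */
    uint64_t now = start;
    unsigned n = 0;

    assert(irq_rate > 0 && end >= start);
    while (divisor > 0 && n < out_len) {
        uint64_t diff = end - now;  /* time remaining until t(p+1) */
        now += next_delay(diff, &divisor);
        out[n++] = now;
    }
    return n;
}
```

For an interval of 300 ticks and irq_rate = 3, the callbacks land at
100, 200 and 300, matching a(1), a(2) and t(p+1) in the diagrams.
Because the last step divides by 1, the final callback lands exactly
on t(p+1) even when the interval length is not a multiple of irq_rate.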

Regards,

Uli