From: Paolo Bonzini
Date: Thu, 1 Aug 2013 16:14:11 +0200
Message-ID: <20130801141411.GA5761@mail.corp.redhat.com>
Subject: Re: [Qemu-devel] [RFC] [PATCHv4 10/13] aio / timers: Convert mainloop to use timeout
To: Alex Bligh
Cc: Kevin Wolf, Anthony Liguori, qemu-devel@nongnu.org, Stefan Hajnoczi, rth@twiddle.net

On Aug 01 2013, Alex Bligh wrote:
> Paolo,
>
> >>@@ -449,6 +460,7 @@ int main_loop_wait(int nonblocking)
> >> {
> >>     int ret;
> >>     uint32_t timeout = UINT32_MAX;
> >>+    int64_t timeout_ns;
> >>
> >>     if (nonblocking) {
> >>         timeout = 0;
> >>@@ -462,7 +474,21 @@ int main_loop_wait(int nonblocking)
> >>     slirp_pollfds_fill(gpollfds);
> >> # endif
> >>     qemu_iohandler_fill(gpollfds);
> >>-    ret = os_host_main_loop_wait(timeout);
> >>+
> >>+    if (timeout == UINT32_MAX) {
> >>+        timeout_ns = -1;
> >>+    } else {
> >>+        timeout_ns = (uint64_t)timeout * (int64_t)(SCALE_MS);
> >>+    }
> >>+
> >>+    timeout_ns = qemu_soonest_timeout(timeout_ns,
> >>+                                      qemu_clock_deadline_ns(rt_clock));
> >>+    timeout_ns = qemu_soonest_timeout(timeout_ns,
> >>+                                      qemu_clock_deadline_ns(vm_clock));
> >
> >This must not be included if use_icount.
>
> Really? qemu_run_all_timers was running all 3 timers irrespective of
> use_icount, and doing a qemu_notify if anything expired, so I'm merely
> mimicking the existing behaviour here.

Maybe I'm misreading the code. If it is a replacement of
qemu_next_alarm_deadline, then it is indeed omitting vm_clock.

> I'm not quite sure what use_icount does. Does vm_clock get disabled
> if it is set? (in which case it won't matter)

It doesn't count nanoseconds anymore. The clock is updated by the
VCPU thread. When the VCPU thread notices that the clock is past the
earliest timer's deadline, it does a qemu_notify_event(). That exits
the g_poll, and qemu_run_all_timers can then process the callbacks.
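Concretely, that means the second qemu_soonest_timeout() call in the
hunk above would have to be guarded, something like this (just a
sketch, using the existing use_icount flag):

    timeout_ns = qemu_soonest_timeout(timeout_ns,
                                      qemu_clock_deadline_ns(rt_clock));
    if (!use_icount) {
        /* With icount, vm_clock is advanced by the VCPU thread, which
         * calls qemu_notify_event() itself when a deadline is reached,
         * so its deadline must not bound the g_poll timeout here. */
        timeout_ns = qemu_soonest_timeout(timeout_ns,
                                          qemu_clock_deadline_ns(vm_clock));
    }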
> The way it's done at the moment is that the QEMUTimerList user can
> create as many QEMUTimerLists as he wants. So AioContext asks for one
> of one type. It could equally ask for three - one of each type.
>
> I think that's probably adequate.
>
> >Once you do this, you get some complications due to more data
> >structures, but other code is simplified noticeably. For example,
> >you lose the concept of a default timerlist (it's just the timerlist
> >of the default AioContext).
>
> Yep - per the above that's really intrusive (I think I touched well
> over a hundred files). The problem is that lots of things refer to
> vm_clock to set a timer (when it's a clock, so they should use a
> timer list) and also to vm_clock to read the time (which is a clock
> function).

Yes, that's fine. You can still keep the shorter invocation, but
instead of using clock->timerlist it would use
qemu_aio_context->clocks[clock->type].

Related to this, a better name for the "full" functions taking a
timerlist could be simply timer_new_ns etc. And I would remove the
allocation step from these functions. It is shameful how little we
use qemu_free_timer, and removing the allocation would "fix" the leak
completely for users of the QEMUTimerList-based functions.

It's already a convention to use qemu_* only for functions that use
some global state, for example qemu_notify_event() vs. aio_notify().

> >And because all timerlists have an AioContext,
>
> Well, old callers, particularly those not using an AioContext, would
> not have an AioContext, would they?

It would be qemu_aio_context.

> I was trying hard to avoid anything having to iterate over all
> timerlists, and to leave the timerlist per-thread where possible.
> This may well fail for the clock warp stuff. I probably need to do
> exactly the same as in qemu_clock_enable() here if use_icount is
> true. WDYT?

Yes. This:

    qemu_mod_timer(icount_warp_timer, vm_clock_warp_start + deadline);

would have to use the earliest deadline of all vm_clock timerlists.
And this:

    if (qemu_clock_expired(vm_clock)) {
        qemu_notify_event();
    }

would also have to walk all timerlists for vm_clock, and notify those
that have expired. But you would not need one warp timer per
timerlist.

Paolo
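P.S. For concreteness, the walk over vm_clock's timerlists could look
something like this. It assumes each QEMUClock keeps a QLIST of its
timerlists; timerlist_deadline_ns, timerlist_expired and
timerlist_notify are made-up helper names, not existing API:

    /* Earliest deadline across every timerlist of a clock (-1 = none). */
    static int64_t qemu_clock_deadline_ns_all(QEMUClock *clock)
    {
        QEMUTimerList *timer_list;
        int64_t deadline = -1;

        QLIST_FOREACH(timer_list, &clock->timerlists, list) {
            deadline = qemu_soonest_timeout(deadline,
                                            timerlist_deadline_ns(timer_list));
        }
        return deadline;
    }

    /* Still one warp timer, armed with the earliest deadline of all lists: */
    deadline = qemu_clock_deadline_ns_all(vm_clock);
    qemu_mod_timer(icount_warp_timer, vm_clock_warp_start + deadline);

    /* And instead of qemu_clock_expired(vm_clock) + qemu_notify_event(),
     * notify each expired timerlist individually: */
    QLIST_FOREACH(timer_list, &vm_clock->timerlists, list) {
        if (timerlist_expired(timer_list)) {
            timerlist_notify(timer_list);
        }
    }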