* [Qemu-devel] Using aio_poll for timer carrier threads
@ 2013-08-13  7:56 Jan Kiszka
  2013-08-13 13:45 ` Stefan Hajnoczi
  2013-08-13 14:45 ` Paolo Bonzini
From: Jan Kiszka @ 2013-08-13  7:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Anthony Liguori, qemu-devel, liu ping fan,
	Alex Bligh, Paolo Bonzini, MORITA Kazutaka, Richard Henderson

Hi Stefan,

in the attempt to use Alex' ppoll-based timer rework for decoupled,
real-time capable timer device models I'm now scratching my head over
the aio_poll interface. I'm looking at dataplane/virtio-blk.c, just finding

static void *data_plane_thread(void *opaque)
{
    VirtIOBlockDataPlane *s = opaque;

    do {
        aio_poll(s->ctx, true);
    } while (!s->stopping || s->num_reqs > 0);
    return NULL;
}

wondering where the locking is. Or doesn't this use need any at all? Are
all data structures that this thread accesses exclusively used by it, or
are they all accessed in a lock-less way?

Our iothread mainloop more or less open-codes aio_poll and is, thus,
able to drop its lock before falling asleep while still holding it
during event dispatching. Obviously, I need the same when processing
timer lists of an AioContext, protecting them against concurrent
modifications over VCPUs or other threads. So I'm thinking of adding a
block notification callback to aio_poll, to be called before/after
qemu_poll_ns so that any locks can be dropped / reacquired as needed. Or
am I missing some magic interface / pattern?
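
Roughly the kind of hook I have in mind (a sketch only - neither this
type nor aio_set_block_notifier() exists today):

typedef struct AioBlockNotifier {
    void (*pre_block)(void *opaque);   /* called before qemu_poll_ns();
                                          drop your locks here */
    void (*post_block)(void *opaque);  /* called after it returns;
                                          reacquire them here */
    void *opaque;
} AioBlockNotifier;

void aio_set_block_notifier(AioContext *ctx, AioBlockNotifier *notifier);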

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13  7:56 [Qemu-devel] Using aio_poll for timer carrier threads Jan Kiszka
@ 2013-08-13 13:45 ` Stefan Hajnoczi
  2013-08-13 14:13   ` Jan Kiszka
  2013-08-13 14:45 ` Paolo Bonzini
From: Stefan Hajnoczi @ 2013-08-13 13:45 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, Paolo Bonzini, MORITA Kazutaka,
	Richard Henderson

On Tue, Aug 13, 2013 at 09:56:17AM +0200, Jan Kiszka wrote:
> in the attempt to use Alex' ppoll-based timer rework for decoupled,
> real-time capable timer device models I'm now scratching my head over
> the aio_poll interface. I'm looking at dataplane/virtio-blk.c, just finding
> 
> static void *data_plane_thread(void *opaque)
> {
>     VirtIOBlockDataPlane *s = opaque;
> 
>     do {
>         aio_poll(s->ctx, true);
>     } while (!s->stopping || s->num_reqs > 0);
>     return NULL;
> }
> 
> wondering where the locking is. Or doesn't this use need any at all? Are
> all data structures that this thread accesses exclusively used by it, or
> are they all accessed in a lock-less way?

Most of the data structures in dataplane upstream are not shared.
Virtio, virtio-blk, and Linux AIO raw file I/O are duplicated for
dataplane and do not rely on QEMU infrastructure.

I've been working on undoing this duplication over the past months but
upstream QEMU still mostly does not share data structures and therefore
does not need much synchronization.  For the crude synchronization that
we do need we simply start/stop the dataplane thread.
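
The stop side is essentially just this (a simplified sketch, close to
but not literally the dataplane code):

void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
{
    s->stopping = true;
    aio_notify(s->ctx);           /* kick aio_poll() out of its blocking wait */
    qemu_thread_join(&s->thread); /* the loop exits once num_reqs drains */
}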

> Our iothread mainloop more or less open-codes aio_poll and is, thus,
> able to drop its lock before falling asleep while still holding it
> during event dispatching. Obviously, I need the same when processing
> timer lists of an AioContext, protecting them against concurrent
> modifications over VCPUs or other threads. So I'm thinking of adding a
> block notification callback to aio_poll, to be called before/after
> qemu_poll_ns so that any locks can be dropped / reacquired as needed. Or
> am I missing some magic interface / pattern?

Upstream dataplane does not use timers, so the code there cannot serve
as an example.

If you combine Alex Bligh, Ping Fan, and my latest timer series, you get
support for QEMUTimer in AioContexts where qemu_timer_mod_ns() and
qemu_timer_del() are thread-safe.  vm_clock (without icount) and
rt_clock are thread-safe clock sources.

This should make timers usable in another thread for clock device
emulation if only your iothread uses the AioContext and its timers
(besides the thread-safe mod/del interfaces).
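
For example, a clock device could end up with something like this (a
sketch; MyClockState and period_ns are illustrative, the mod/del names
are the ones from the posted series):

static void my_clock_cb(void *opaque)
{
    MyClockState *s = opaque;

    /* runs in whichever thread polls the AioContext owning s->timer */
    /* ... update device state, raise the IRQ ... */

    /* thread-safe re-arm; qemu_timer_del() from a VCPU thread is fine too */
    qemu_timer_mod_ns(s->timer, qemu_get_clock_ns(vm_clock) + s->period_ns);
}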

The details depend on your device. Do you have a git repo I can look at
to understand your device model?

Stefan


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13 13:45 ` Stefan Hajnoczi
@ 2013-08-13 14:13   ` Jan Kiszka
  2013-08-14  0:48     ` liu ping fan
  2013-08-14  8:52     ` Stefan Hajnoczi
  2013-08-14 12:32     ` Jan Kiszka
From: Jan Kiszka @ 2013-08-13 14:13 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, Paolo Bonzini, MORITA Kazutaka,
	Richard Henderson

On 2013-08-13 15:45, Stefan Hajnoczi wrote:
> On Tue, Aug 13, 2013 at 09:56:17AM +0200, Jan Kiszka wrote:
>> in the attempt to use Alex' ppoll-based timer rework for decoupled,
>> real-time capable timer device models I'm now scratching my head over
>> the aio_poll interface. I'm looking at dataplane/virtio-blk.c, just finding
>>
>> static void *data_plane_thread(void *opaque)
>> {
>>     VirtIOBlockDataPlane *s = opaque;
>>
>>     do {
>>         aio_poll(s->ctx, true);
>>     } while (!s->stopping || s->num_reqs > 0);
>>     return NULL;
>> }
>>
>> wondering where the locking is. Or doesn't this use need any at all? Are
>> all data structures that this thread accesses exclusively used by it, or
>> are they all accessed in a lock-less way?
> 
> Most of the data structures in dataplane upstream are not shared.
> Virtio, virtio-blk, and Linux AIO raw file I/O are duplicated for
> dataplane and do not rely on QEMU infrastructure.
> 
> I've been working on undoing this duplication over the past months but
> upstream QEMU still mostly does not share data structures and therefore
> does not need much synchronization.  For the crude synchronization that
> we do need we simply start/stop the dataplane thread.
> 
>> Our iothread mainloop more or less open-codes aio_poll and is, thus,
>> able to drop its lock before falling asleep while still holding it
>> during event dispatching. Obviously, I need the same when processing
>> timer lists of an AioContext, protecting them against concurrent
>> modifications over VCPUs or other threads. So I'm thinking of adding a
>> block notification callback to aio_poll, to be called before/after
>> qemu_poll_ns so that any locks can be dropped / reacquired as needed. Or
>> am I missing some magic interface / pattern?
> 
> Upstream dataplane does not use timers, so the code there cannot serve
> as an example.
> 
> If you combine Alex Bligh, Ping Fan, and my latest timer series, you get
> support for QEMUTimer in AioContexts where qemu_timer_mod_ns() and
> qemu_timer_del() are thread-safe.  vm_clock (without icount) and
> rt_clock are thread-safe clock sources.

To which series of yours and Ping Fan are you referring? [1] and [2]?

> 
> This should make timers usable in another thread for clock device
> emulation if only your iothread uses the AioContext and its timers
> (besides the thread-safe mod/del interfaces).

As argued in the other thread, I don't think we need (or want) locking
in the timer subsystem; rather, we should push this to its users. But
I'll look again at your patches to see if they are also usable.

> 
> The details depend on your device. Do you have a git repo I can look at
> to understand your device model?

Pushed my hacks here:

git://git.kiszka.org/qemu.git queues/rt.new3

Jan

[1] http://thread.gmane.org/gmane.comp.emulators.qemu/227590
[2] http://thread.gmane.org/gmane.comp.emulators.qemu/226369

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13  7:56 [Qemu-devel] Using aio_poll for timer carrier threads Jan Kiszka
  2013-08-13 13:45 ` Stefan Hajnoczi
@ 2013-08-13 14:45 ` Paolo Bonzini
  2013-08-13 14:54   ` Jan Kiszka
From: Paolo Bonzini @ 2013-08-13 14:45 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, MORITA Kazutaka, Richard Henderson

On 13/08/2013 09:56, Jan Kiszka wrote:
> Hi Stefan,
> 
> in the attempt to use Alex' ppoll-based timer rework for decoupled,
> real-time capable timer device models I'm now scratching my head over
> the aio_poll interface. I'm looking at dataplane/virtio-blk.c, just finding
> 
> static void *data_plane_thread(void *opaque)
> {
>     VirtIOBlockDataPlane *s = opaque;
> 
>     do {
>         aio_poll(s->ctx, true);
>     } while (!s->stopping || s->num_reqs > 0);
>     return NULL;
> }
> 
> wondering where the locking is. Or doesn't this use need any at all? Are
> all data structures that this thread accesses exclusively used by it, or
> are they all accessed in a lock-less way?

There is some locking in aio_bh_poll.  It is pretty lightweight because
adding elements and deleting them is done under a lock, while the list
is walked without taking it (because deletion is only done by the thread
that walks).  We could do something similar for file descriptors; timers
are more complicated because insertions happen for every mod_timer.
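
The walk is roughly (simplified from aio_bh_poll):

/* additions and unlinking take ctx->bh_lock; the walk itself does not,
 * because only the walking thread removes elements */
for (bh = ctx->first_bh; bh; bh = next) {
    next = bh->next;
    if (!bh->deleted && bh->scheduled) {
        bh->scheduled = 0;
        bh->cb(bh->opaque);
    }
}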

Using an AioContext lock for timers is somewhat complicated for lock
ordering, because context A could try to modify a timer from context B,
at the same time as context B is modifying a timer from context A.
This would cause a deadlock.  So I agree with Stefan's usage of a
finer-grain lock for timer lists.
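
Spelled out (aio_context_acquire() is illustrative, we have no such API
today):

/*
 *   thread 1                        thread 2
 *   aio_context_acquire(ctx_a);     aio_context_acquire(ctx_b);
 *   timer_mod(timer_in_b, ...);     timer_mod(timer_in_a, ...);
 *     -> blocks on ctx_b's lock       -> blocks on ctx_a's lock
 *
 * Neither thread can proceed: the classic ABBA deadlock.
 */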

Paolo


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13 14:45 ` Paolo Bonzini
@ 2013-08-13 14:54   ` Jan Kiszka
  2013-08-19 13:21     ` Paolo Bonzini
From: Jan Kiszka @ 2013-08-13 14:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, MORITA Kazutaka, Richard Henderson

On 2013-08-13 16:45, Paolo Bonzini wrote:
> On 13/08/2013 09:56, Jan Kiszka wrote:
>> Hi Stefan,
>>
>> in the attempt to use Alex' ppoll-based timer rework for decoupled,
>> real-time capable timer device models I'm now scratching my head over
>> the aio_poll interface. I'm looking at dataplane/virtio-blk.c, just finding
>>
>> static void *data_plane_thread(void *opaque)
>> {
>>     VirtIOBlockDataPlane *s = opaque;
>>
>>     do {
>>         aio_poll(s->ctx, true);
>>     } while (!s->stopping || s->num_reqs > 0);
>>     return NULL;
>> }
>>
>> wondering where the locking is. Or doesn't this use need any at all? Are
>> all data structures that this thread accesses exclusively used by it, or
>> are they all accessed in a lock-less way?
> 
> There is some locking in aio_bh_poll.  It is pretty lightweight because
> adding elements and deleting them is done under a lock, while the list
> is walked without taking it (because deletion is only done by the thread
> that walks).  We could do something similar for file descriptors; timers
> are more complicated because insertions happen for every mod_timer.
> 
> Using an AioContext lock for timers is somewhat complicated for lock
> ordering, because context A could try to modify a timer from context B,
> at the same time when context B is modifying a timer from context A.
> This would cause a deadlock.

That's like an MMIO access on device A triggering an MMIO access on B
and vice versa - why would we need this, and why should we support it?
I think
the typical case is that timers (and their lists) and data structures
they access have a fairly close relation, thus can reuse the same lock.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13 14:13   ` Jan Kiszka
@ 2013-08-14  0:48     ` liu ping fan
  2013-08-14  8:52     ` Stefan Hajnoczi
  2013-08-14 12:32     ` Jan Kiszka
From: liu ping fan @ 2013-08-14  0:48 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Stefan Hajnoczi,
	qemu-devel, Alex Bligh, Paolo Bonzini, MORITA Kazutaka,
	Richard Henderson

On Tue, Aug 13, 2013 at 10:13 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2013-08-13 15:45, Stefan Hajnoczi wrote:
>> On Tue, Aug 13, 2013 at 09:56:17AM +0200, Jan Kiszka wrote:
>>> in the attempt to use Alex' ppoll-based timer rework for decoupled,
>>> real-time capable timer device models I'm now scratching my head over
>>> the aio_poll interface. I'm looking at dataplane/virtio-blk.c, just finding
>>>
>>> static void *data_plane_thread(void *opaque)
>>> {
>>>     VirtIOBlockDataPlane *s = opaque;
>>>
>>>     do {
>>>         aio_poll(s->ctx, true);
>>>     } while (!s->stopping || s->num_reqs > 0);
>>>     return NULL;
>>> }
>>>
>>> wondering where the locking is. Or doesn't this use need any at all? Are
>>> all data structures that this thread accesses exclusively used by it, or
>>> are they all accessed in a lock-less way?
>>
>> Most of the data structures in dataplane upstream are not shared.
>> Virtio, virtio-blk, and Linux AIO raw file I/O are duplicated for
>> dataplane and do not rely on QEMU infrastructure.
>>
>> I've been working on undoing this duplication over the past months but
>> upstream QEMU still mostly does not share data structures and therefore
>> does not need much synchronization.  For the crude synchronization that
>> we do need we simply start/stop the dataplane thread.
>>
>>> Our iothread mainloop more or less open-codes aio_poll and is, thus,
>>> able to drop its lock before falling asleep while still holding it
>>> during event dispatching. Obviously, I need the same when processing
>>> timer lists of an AioContext, protecting them against concurrent
>>> modifications over VCPUs or other threads. So I'm thinking of adding a
>>> block notification callback to aio_poll, to be called before/after
>>> qemu_poll_ns so that any locks can be dropped / reacquired as needed. Or
>>> am I missing some magic interface / pattern?
>>
>> Upstream dataplane does not use timers, so the code there cannot serve
>> as an example.
>>
>> If you combine Alex Bligh, Ping Fan, and my latest timer series, you get
>> support for QEMUTimer in AioContexts where qemu_timer_mod_ns() and
>> qemu_timer_del() are thread-safe.  vm_clock (without icount) and
>> rt_clock are thread-safe clock sources.
>
> To which series of yours and Ping Fan are you referring? [1] and [2]?
>
Stefan's [1] has been rebased onto Alex's v10.
My part [2'] (http://thread.gmane.org/gmane.comp.emulators.qemu/227751)
has been rebased onto v10 too.

>>
>> This should make timers usable in another thread for clock device
>> emulation if only your iothread uses the AioContext and its timers
>> (besides the thread-safe mod/del interfaces).
>
> As argued in the other thread, I don't think we need (or want) locking
> in the timer subsystem; rather, we should push this to its users. But
> I'll look again at your patches to see if they are also usable.
>
>>
>> The details depend on your device. Do you have a git repo I can look at
>> to understand your device model?
>
> Pushed my hacks here:
>
> git://git.kiszka.org/qemu.git queues/rt.new3
>
> Jan
>
> [1] http://thread.gmane.org/gmane.comp.emulators.qemu/227590
> [2] http://thread.gmane.org/gmane.comp.emulators.qemu/226369
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE
> Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13 14:13   ` Jan Kiszka
  2013-08-14  0:48     ` liu ping fan
@ 2013-08-14  8:52     ` Stefan Hajnoczi
  2013-08-14  9:05       ` Jan Kiszka
  2013-08-14 12:32     ` Jan Kiszka
From: Stefan Hajnoczi @ 2013-08-14  8:52 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, Paolo Bonzini, MORITA Kazutaka,
	Richard Henderson

On Tue, Aug 13, 2013 at 04:13:03PM +0200, Jan Kiszka wrote:
> On 2013-08-13 15:45, Stefan Hajnoczi wrote:
> > On Tue, Aug 13, 2013 at 09:56:17AM +0200, Jan Kiszka wrote:
> > The details depend on your device. Do you have a git repo I can look at
> > to understand your device model?
> 
> Pushed my hacks here:
> 
> git://git.kiszka.org/qemu.git queues/rt.new3

Excellent, thanks!  Are you calling qemu_raise_irq() outside the global
mutex and how is it protected?

Stefan


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-14  8:52     ` Stefan Hajnoczi
@ 2013-08-14  9:05       ` Jan Kiszka
  2013-08-14 11:35         ` Stefan Hajnoczi
From: Jan Kiszka @ 2013-08-14  9:05 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, Paolo Bonzini, MORITA Kazutaka,
	Richard Henderson

On 2013-08-14 10:52, Stefan Hajnoczi wrote:
> On Tue, Aug 13, 2013 at 04:13:03PM +0200, Jan Kiszka wrote:
>> On 2013-08-13 15:45, Stefan Hajnoczi wrote:
>>> On Tue, Aug 13, 2013 at 09:56:17AM +0200, Jan Kiszka wrote:
>>> The details depend on your device. Do you have a git repo I can look at
>>> to understand your device model?
>>
>> Pushed my hacks here:
>>
>> git://git.kiszka.org/qemu.git queues/rt.new3
> 
> Excellent, thanks!  Are you calling qemu_raise_irq() outside the global
> mutex and how is it protected?

By luck and via many exceptions: specifically, by disabling HPET
support (to avoid it being involved in IRQ routing - or even being used
in legacy mode) and by relying on direct delivery to the kernel in KVM
mode. Yes, IRQ delivery is still a huge construction site for BQL-free
device models.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-14  9:05       ` Jan Kiszka
@ 2013-08-14 11:35         ` Stefan Hajnoczi
From: Stefan Hajnoczi @ 2013-08-14 11:35 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, Paolo Bonzini, MORITA Kazutaka,
	Richard Henderson

On Wed, Aug 14, 2013 at 11:05:29AM +0200, Jan Kiszka wrote:
> On 2013-08-14 10:52, Stefan Hajnoczi wrote:
> > On Tue, Aug 13, 2013 at 04:13:03PM +0200, Jan Kiszka wrote:
> >> On 2013-08-13 15:45, Stefan Hajnoczi wrote:
> >>> On Tue, Aug 13, 2013 at 09:56:17AM +0200, Jan Kiszka wrote:
> >>> The details depend on your device. Do you have a git repo I can look at
> >>> to understand your device model?
> >>
> >> Pushed my hacks here:
> >>
> >> git://git.kiszka.org/qemu.git queues/rt.new3
> > 
> > Excellent, thanks!  Are you calling qemu_raise_irq() outside the global
> > mutex and how is it protected?
> 
> By luck and via many exceptions: specifically, by disabling HPET
> support (to avoid it being involved in IRQ routing - or even being used
> in legacy mode) and by relying on direct delivery to the kernel in KVM
> mode. Yes, IRQ delivery is still a huge construction site for BQL-free
> device models.

Okay.  In dataplane I use the guest notifier for virtio devices (irqfd
for KVM mode with MSI-X or bounced through
virtio_queue_guest_notifier_read()).
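
The notify path then boils down to (simplified):

/* signal the guest notifier instead of raising the IRQ under the BQL;
 * with KVM and MSI-X this is an irqfd that the kernel injects directly,
 * otherwise the main loop reads the notifier and injects the interrupt
 * via virtio_queue_guest_notifier_read() */
event_notifier_set(virtio_queue_get_guest_notifier(s->vq));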

Stefan


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13 14:13   ` Jan Kiszka
  2013-08-14  0:48     ` liu ping fan
  2013-08-14  8:52     ` Stefan Hajnoczi
@ 2013-08-14 12:32     ` Jan Kiszka
  2013-08-19 12:00       ` Paolo Bonzini
From: Jan Kiszka @ 2013-08-14 12:32 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, Paolo Bonzini, MORITA Kazutaka,
	Richard Henderson

On 2013-08-13 16:13, Jan Kiszka wrote:
> On 2013-08-13 15:45, Stefan Hajnoczi wrote:
>> This should make timers usable in another thread for clock device
>> emulation if only your iothread uses the AioContext and its timers
>> (besides the thread-safe mod/del interfaces).
> 
> As argued in the other thread, I don't think we need (or want) locking
> in the timer subsystem; rather, we should push this to its users. But
> I'll look again at your patches to see if they are also usable.

I've checked and applied your two patches adding the active_timers lock.
This model apparently works as well for the RTC use case. And it avoids
having to patch aio_poll, the RTC device lock is now taken inside the
timer handlers.

I still need to check more corner cases as timer dequeuing can now race
with the handler execution, i.e. a dequeued timer can still see one more
handler run after timer_del returned. That's a property one can easily
take into account when writing device models, but it has to be kept in
mind that it's different from current behavior.
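
For now I handle that race in the callbacks themselves, along these
lines (a sketch; the state fields are made up):

static void rtc_periodic_cb(void *opaque)
{
    RTCState *s = opaque;

    qemu_mutex_lock(&s->lock);
    if (!s->periodic_armed) {
        /* stale run: timer_del() won the race, drop this expiry */
        qemu_mutex_unlock(&s->lock);
        return;
    }
    /* ... normal periodic work ... */
    qemu_mutex_unlock(&s->lock);
}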

Updated queue is at git://git.kiszka.org/qemu.git queues/rt.new3 again.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-14 12:32     ` Jan Kiszka
@ 2013-08-19 12:00       ` Paolo Bonzini
From: Paolo Bonzini @ 2013-08-19 12:00 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Alex Bligh, Stefan Hajnoczi,
	qemu-devel, liu ping fan, Stefan Hajnoczi, MORITA Kazutaka,
	Richard Henderson

On 14/08/2013 14:32, Jan Kiszka wrote:
> I still need to check more corner cases as timer dequeuing can now race
> with the handler execution, i.e. a dequeued timer can still see one more
> handler run after timer_del returned. That's a property one can easily
> take into account when writing device models, but it has to be kept in
> mind that it's different from current behavior.
> 
> Updated queue is at git://git.kiszka.org/qemu.git queues/rt.new3 again.

I took a look here; there are several patches that I guess are
basically ready... Two random things I noticed:

(1) rcu_init() must be called in rcutorture too

(2) there are several load/store functions in exec.c that do not go
through address_space_rw.  These would not call
qemu_flush_coalesced_mmio_buffer() after your patch "provide
address_space_rw_unlocked".  I think that in most cases the solution
should be to make these functions go through address_space_rw.

Paolo


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-13 14:54   ` Jan Kiszka
@ 2013-08-19 13:21     ` Paolo Bonzini
  2013-08-19 13:40       ` Jan Kiszka
  2013-08-19 13:58       ` Alex Bligh
From: Paolo Bonzini @ 2013-08-19 13:21 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, MORITA Kazutaka, Richard Henderson

On 13/08/2013 16:54, Jan Kiszka wrote:
>> > Using an AioContext lock for timers is somewhat complicated for lock
>> > ordering, because context A could try to modify a timer from context B,
>> > at the same time as context B is modifying a timer from context A.
>> > This would cause a deadlock.
> That's like an MMIO access on device A triggering an MMIO access on B
> and vice versa - why would we need this, and why should we support it?
> I think
> the typical case is that timers (and their lists) and data structures
> they access have a fairly close relation, thus can reuse the same lock.

Yes, that's true.  Still it would have to be documented, and using
too-coarse locks risks having many BQLs, which multiplies the complexity
(fine-grained locking at least keeps critical sections small and limits
the amount of nested locking).

I like Stefan's patches to make the timer list thread-safe, especially
if we can optimize it (with RCU?) to make the read side lockless.

Paolo


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-19 13:21     ` Paolo Bonzini
@ 2013-08-19 13:40       ` Jan Kiszka
  2013-08-19 14:09         ` Paolo Bonzini
  2013-08-19 13:58       ` Alex Bligh
From: Jan Kiszka @ 2013-08-19 13:40 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, MORITA Kazutaka, Richard Henderson

On 2013-08-19 15:21, Paolo Bonzini wrote:
> On 13/08/2013 16:54, Jan Kiszka wrote:
>>>> Using an AioContext lock for timers is somewhat complicated for lock
>>>> ordering, because context A could try to modify a timer from context B,
>>>> at the same time as context B is modifying a timer from context A.
>>>> This would cause a deadlock.
>> That's like an MMIO access on device A triggering an MMIO access on B
>> and vice versa - why would we need this, and why should we support it?
>> I think
>> the typical case is that timers (and their lists) and data structures
>> they access have a fairly close relation, thus can reuse the same lock.
> 
> Yes, that's true.  Still it would have to be documented, and using
> too-coarse locks risks having many BQLs, which multiplies the complexity
> (fine-grained locking at least keeps critical sections small and limits
> the amount of nested locking).

As this lock does not require taking other locks while holding it, it
should actually be fine.

> 
> I like Stefan's patches to make the timer list thread-safe, especially
> if we can optimize it (with RCU?) to make the read side lockless.

What is a pure read-side in that context? Checks if some timer is
expired? Given that RCU write sides are heavier than plain mutexes and
many typical accesses (mod, del, expire) involve writing, such an
optimization may also be counterproductive.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-19 13:21     ` Paolo Bonzini
  2013-08-19 13:40       ` Jan Kiszka
@ 2013-08-19 13:58       ` Alex Bligh
From: Alex Bligh @ 2013-08-19 13:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Alex Bligh, Jan Kiszka, qemu-devel,
	liu ping fan, Stefan Hajnoczi, MORITA Kazutaka,
	Richard Henderson


On 19 Aug 2013, at 14:21, Paolo Bonzini wrote:

> I like Stefan's patches to make the timer list thread-safe, especially
> if we can optimize it (with RCU?) to make the read side lockless.

We will have to be careful on the write side. I'm trying to
work out just how slow this would be with Paolo's RCU patches.
If call_rcu is used (or synchronize_rcu after the entire list
has been traversed and all timers updated, etc.), how bad is it?

There is a particular concern for repeating timers, where two
writes are done (first to remove the timer from the list, then
again inside the timer routine to add it back). As timer callbacks
are only invoked by a single thread, it might be worth having
some form of timer API which specifies that a timer should be
called every X nano/micro/milli/seconds, or just once in X
nano/micro/milli/seconds' time. That would save the inevitable
call to read the clock value first, and would speed things up
if write locking were slow.
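
Something like this, in other words (a hypothetical signature only):

/* hypothetical: re-arm relative to the previous deadline inside the
 * timer core, saving the clock read and the second list insertion that
 * a manual re-arm in the callback costs */
void timer_mod_periodic_ns(QEMUTimer *ts, int64_t interval_ns);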

-- 
Alex Bligh


* Re: [Qemu-devel] Using aio_poll for timer carrier threads
  2013-08-19 13:40       ` Jan Kiszka
@ 2013-08-19 14:09         ` Paolo Bonzini
From: Paolo Bonzini @ 2013-08-19 14:09 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	liu ping fan, Alex Bligh, MORITA Kazutaka, Richard Henderson

On 19/08/2013 15:40, Jan Kiszka wrote:
> On 2013-08-19 15:21, Paolo Bonzini wrote:
>> On 13/08/2013 16:54, Jan Kiszka wrote:
>>>>> Using an AioContext lock for timers is somewhat complicated for lock
>>>>> ordering, because context A could try to modify a timer from context B,
>>>>> at the same time as context B is modifying a timer from context A.
>>>>> This would cause a deadlock.
>>> That's like an MMIO access on device A triggering an MMIO access on B
>>> and vice versa - why would we need this, and why should we support it?
>>> I think
>>> the typical case is that timers (and their lists) and data structures
>>> they access have a fairly close relation, thus can reuse the same lock.
>>
>> Yes, that's true.  Still it would have to be documented, and using
>> too-coarse locks risks having many BQLs, which multiplies the complexity
>> (fine-grained locking at least keeps critical sections small and limits
>> the amount of nested locking).
> 
> As this lock does not require taking other locks while holding it, it
> should actually be fine.

Using an AioContext lock would be more dangerous, though.  The risk is
mostly that someone is holding _another_ AioContext lock while calling a
timer function.

For MMIO, the rule would be that all address_space_rw/map/unmap must be
done with no lock taken.

>> I like Stefan's patches to make the timer list thread-safe, especially
>> if we can optimize it (with RCU?) to make the read side lockless.
> 
> What is a pure read-side in that context? Checks if some timer is
> expired?

Yes.  It is actually quite common; as Stefan pointed out elsewhere to
Ping Fan, it happens at least once in each iteration of the loop.

The read side doesn't really need to look beyond the head of the list.
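
So the hot check can stay as cheap as this (a sketch; the field names
only approximate the timer series):

static bool timerlist_expired(QEMUTimerList *tl)
{
    QEMUTimer *head = tl->active_timers;    /* peek at the head only */

    /* a racy read is tolerable here: a stale head at worst delays or
     * falsely triggers one pass through the dispatch path */
    return head && head->expire_time <= qemu_get_clock_ns(tl->clock);
}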

> Given that RCU write sides are heavier than plain mutexes and
> many typical accesses (mod, del, expire) involve writing, such an
> optimization may also be counterproductive.

This would have to be a creative use of RCU, with no "copy" in it...
basically you use RCU only as a substitute for reference counting.
call_rcu/synchronize_rcu is not done on every write operation; all we
need to do is ensure that devices free timers only after an RCU
critical section.  We should do the same for memory regions, see my
patches to split exitfn and instance_finalize; I'll go into more depth
in my KVM Forum 2013 talk.
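
Concretely, something like this (a sketch; it assumes QEMUTimer grows
an rcu_head field and that the call_rcu() from my RCU series is in):

/* readers: look at the timer list only inside a read-side section */
rcu_read_lock();
/* ... check expiry, dispatch ... */
rcu_read_unlock();

/* writer, e.g. in instance_finalize: defer the free past all readers */
call_rcu(timer, qemu_free_timer, rcu);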

Lockless algorithms (including RCU) are somewhat appealing because you
do not have to worry about locking, but of course all warnings about
increased complexity and premature optimization apply.

Paolo

