From: Alex Bligh <alex@alex.org.uk>
To: Hans de Goede <hdegoede@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, Alex Bligh <alex@alex.org.uk>
Subject: Re: [Qemu-devel] [PATCH] main-loop: Don't lock starve io-threads when main_loop_tlg has pending events
Date: Tue, 8 Oct 2013 21:01:57 +0100
Message-ID: <6CD4D4EE-341E-48D0-98F4-D55C0D3922D4@alex.org.uk>
In-Reply-To: <52546000.6070308@redhat.com>


On 8 Oct 2013, at 20:41, Hans de Goede wrote:

>>> 
>>> Wasn't it 1 ms until the offending commit (note 250 us does
>>> sound better to me).
>> 
>> I believe you've made it 1 nanosecond not 1 millisecond;
> 
> Correct, the 1 ms I referred to was before your commit which changed
> things from ms to ns.

OK I was looking at the patch as it would apply to master now.

> The purpose of the 1 ns timeout is to cause os_host_main_loop_wait
> to unlock the iothread; as $subject says, the problem I'm seeing seems
> to be lock starvation, not cpu starvation.
>
> Note, as I already indicated, I'm in no way an expert in this; if you
> and/or Paolo suspect cpu starvation may happen too, then bumping
> the timeout to 250 us is fine with me too.
>
> If we go with 250 us, that does pose the question of whether we should
> always keep a minimum timeout of 250 us when not non-blocking, or only
> bump it to 250 us when main_loop_tlg already has expired events and
> thus is causing a timeout of 0.

I am by no means an expert in the iothread bit, so let's pool our
ignorance ... :-) 

Somewhere within that patch series (7b595f35 I think) I fixed up
the spin counter bit, which made it slightly less yucky and made it
work with milliseconds. I hope I didn't break it, but there seems to
be something slightly odd about the use case here.

If you are getting the spin error, this implies something is
pretty much constantly calling os_host_main_loop_wait with a
zero timeout. As you point out, this is going to be main_loop_wait,
and almost certainly main_loop_wait called with nonblocking
set to 1.
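
For reference, this is roughly the shape of the logic I have in mind --
paraphrased from memory, so treat it as a sketch rather than a verbatim
copy of main-loop.c:

    /* Sketch of os_host_main_loop_wait(), from memory; names and
     * constants are approximate, not a verbatim copy. */
    static int os_host_main_loop_wait(int64_t timeout)
    {
        static int spin_counter;
        int ret;

        glib_pollfds_fill(&timeout);

        /* If we have been called with a zero timeout too many times in
         * a row, warn that the I/O thread is spinning and force a real
         * timeout so other threads get a chance to run. */
        if (!timeout && spin_counter > MAX_MAIN_LOOP_SPIN) {
            fprintf(stderr,
                    "main-loop: WARNING: I/O thread spun for %d iterations\n",
                    MAX_MAIN_LOOP_SPIN);
            timeout = SCALE_MS;
        }

        if (timeout) {
            spin_counter = 0;
            /* Any non-zero timeout, even 1 ns, drops the lock here. */
            qemu_mutex_unlock_iothread();
        } else {
            spin_counter++;
        }

        ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout);

        if (timeout) {
            qemu_mutex_lock_iothread();
        }

        glib_pollfds_poll();
        return ret;
    }

So the spin warning and the lock handling hang off the same test: a
zero timeout means "keep the lock and count a spin"; anything else
means "reset the counter and drop the lock around the poll".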

The comment at line 208 suggests that "the I/O thread is very busy
or we are incorrectly busy waiting in the I/O thread". Do we know
which is happening? Perhaps rather than giving up the iothread
mutex on every call (which is in practice what a 1 nanosecond
timeout does), we should only give it up if we have not released
it for X nanoseconds (X=250us, say), or once every Y calls. I think
someone other than me should consider the effect of dropping and
reacquiring a mutex so frequently under heavy I/O load; I'm not
sure it's a great idea.
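
Something along these lines is what I have in mind -- purely
illustrative, so the helper name and the thresholds are made up rather
than proposed values (and I'm assuming the qemu_clock_get_ns() API for
the timestamp):

    /* Illustrative sketch only: rather than dropping the iothread lock
     * on every non-blocking call, drop it at most once per
     * LOCK_DROP_INTERVAL_NS of continuous zero-timeout polling, or
     * once every LOCK_DROP_EVERY_N calls, whichever comes first.
     * Thresholds are invented for the example. */
    #define LOCK_DROP_INTERVAL_NS  (250 * 1000)   /* i.e. 250 us */
    #define LOCK_DROP_EVERY_N      1000

    static int64_t last_drop_ns;
    static int calls_since_drop;

    static bool should_drop_iothread_lock(int64_t timeout)
    {
        int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);

        if (timeout) {
            /* A real (blocking) wait: drop the lock as we do today. */
            return true;
        }
        if (now - last_drop_ns >= LOCK_DROP_INTERVAL_NS ||
            ++calls_since_drop >= LOCK_DROP_EVERY_N) {
            return true;
        }
        return false;
    }

    /* os_host_main_loop_wait() would then do something like:
     *
     *     bool drop = should_drop_iothread_lock(timeout);
     *     if (drop) {
     *         last_drop_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     *         calls_since_drop = 0;
     *         qemu_mutex_unlock_iothread();
     *     }
     *     ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout);
     *     if (drop) {
     *         qemu_mutex_lock_iothread();
     *     }
     */

That way the non-blocking path keeps its zero timeout most of the
time, but the lock can only be held continuously for a bounded
stretch.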

So on reflection you may well be more right with a timeout of
1 nanosecond than one of 250us, but I wonder whether a strategy
of just dropping the lock occasionally (and still using a zero
timeout) might be better.

-- 
Alex Bligh
