From: Glauber Costa <glauber.costa@datadoghq.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <fam@euphon.net>, Zhenyu Ye <yezhenyu2@huawei.com>,
	 Paolo Bonzini <pbonzini@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>,
	 Zhanghailiang <zhang.zhanghailiang@huawei.com>,
	qemu-block@nongnu.org,  qemu-devel@nongnu.org,
	xiexiangyou@huawei.com, armbru@redhat.com,  mreitz@redhat.com,
	Glauber Costa <glauber@datadoghq.com>
Subject: Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Date: Tue, 8 Dec 2020 08:47:42 -0500
Message-ID: <CAMdqtNWGYu-U5pECzffNvu8Dv_hMfwJ9w5RPoLjF-_NX4cfjdw@mail.gmail.com>
In-Reply-To: <20201208131057.GB272246@stefanha-x1.localdomain>

On Tue, Dec 8, 2020 at 8:11 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Thu, Oct 22, 2020 at 05:29:16PM +0100, Fam Zheng wrote:
> > On Tue, 2020-10-20 at 09:34 +0800, Zhenyu Ye wrote:
> > > On 2020/10/19 21:25, Paolo Bonzini wrote:
> > > > On 19/10/20 14:40, Zhenyu Ye wrote:
> > > > > The kernel backtrace for io_submit in GUEST is:
> > > > >
> > > > >         guest# ./offcputime -K -p `pgrep -nx fio`
> > > > >             b'finish_task_switch'
> > > > >             b'__schedule'
> > > > >             b'schedule'
> > > > >             b'io_schedule'
> > > > >             b'blk_mq_get_tag'
> > > > >             b'blk_mq_get_request'
> > > > >             b'blk_mq_make_request'
> > > > >             b'generic_make_request'
> > > > >             b'submit_bio'
> > > > >             b'blkdev_direct_IO'
> > > > >             b'generic_file_read_iter'
> > > > >             b'aio_read'
> > > > >             b'io_submit_one'
> > > > >             b'__x64_sys_io_submit'
> > > > >             b'do_syscall_64'
> > > > >             b'entry_SYSCALL_64_after_hwframe'
> > > > >             -                fio (1464)
> > > > >                 40031912
> > > > >
> > > > > And Linux io_uring can avoid the latency problem.
> >
> > Thanks for the info. What this tells us is basically the inflight
> > requests are high. It's sad that the linux-aio is in practice
> > implemented as a blocking API.

It is.

> >
> > Host side backtrace will be of more help. Can you get that too?
>
> I guess Linux AIO didn't set the BLK_MQ_REQ_NOWAIT flag so the task went
> to sleep when it ran out of blk-mq tags. The easiest solution is to move
> to io_uring. Linux AIO is broken - it's not AIO :).

Agreed!

>
> If we know that no other process is writing to the host block device
> then maybe we can determine the blk-mq tags limit (the queue depth) and
> avoid sending more requests. That way QEMU doesn't block, but I don't
> think this approach works when other processes are submitting I/O to the
> same host block device :(.
>
> Fam's original suggestion of invoking io_submit(2) from a worker thread
> is an option, but I'm afraid it will slow down the uncontended case.
>
> I'm CCing Glauber in case he battled this in the past in ScyllaDB.

We have, and a lot. I don't recall seeing this particular lock, but XFS
would block us all the time if it had to update metadata to submit the
operation, lock inodes, etc.

The work we did at the time was to fix those things in the kernel as
much as we could. But the API is just like that...

>
> Stefan
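
The queue-depth idea quoted above (learn the blk-mq tags limit and stop
submitting once it is reached) could be sketched roughly as below. This
is only an illustration, not QEMU or kernel code: the inflight_limiter
struct and the limiter_* functions are hypothetical names, the depth is
assumed to be known by the caller, and the sketch is single-threaded.
As noted in the quoted text, it does not help when other processes
submit I/O to the same host block device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper for illustration only; not QEMU or kernel code. */
struct inflight_limiter {
    unsigned int depth;    /* assumed blk-mq tags limit (queue depth) */
    unsigned int inflight; /* requests submitted but not yet completed */
};

static void limiter_init(struct inflight_limiter *l, unsigned int depth)
{
    l->depth = depth;
    l->inflight = 0;
}

/*
 * Call before io_submit(2). Returns false when the assumed depth is
 * already in use, so the caller should hold the request back instead
 * of risking a blocking submission.
 */
static bool limiter_try_acquire(struct inflight_limiter *l)
{
    if (l->inflight >= l->depth) {
        return false;
    }
    l->inflight++;
    return true;
}

/* Call once for every completed request. */
static void limiter_release(struct inflight_limiter *l)
{
    assert(l->inflight > 0);
    l->inflight--;
}
```

A caller would run limiter_try_acquire() before each submission, keep
the request on its own pending list when it returns false, and call
limiter_release() from the completion handler before retrying pending
requests.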

