All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Aarushi Mehta <mehta.aaru20@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Max Reitz <mreitz@redhat.com>,
	saket.sinha89@gmail.com, Julia Suvorova <jusual@mail.ru>
Subject: Re: [Qemu-devel] [PATCH v3 4/8] block/io_uring: implements interfaces for io_uring
Date: Mon, 27 May 2019 10:32:21 +0100	[thread overview]
Message-ID: <20190527093221.GB25631@stefanha-x1.localdomain> (raw)
In-Reply-To: <20190527080327.10780-5-mehta.aaru20@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4776 bytes --]

On Mon, May 27, 2019 at 01:33:23PM +0530, Aarushi Mehta wrote:
> +static void qemu_luring_process_completions(LuringState *s)
> +{
> +    struct io_uring_cqe *cqes;
> +    /*
> +     * Request completion callbacks can run the nested event loop.
> +     * Schedule ourselves so the nested event loop will "see" remaining
> +     * completed requests and process them.  Without this, completion
> +     * callbacks that wait for other requests using a nested event loop
> +     * would hang forever.
> +     */
> +    qemu_bh_schedule(s->completion_bh);
> +
> +    while (!io_uring_peek_cqe(&s->ring, &cqes)) {
> +        io_uring_cqe_seen(&s->ring, cqes);

The kernel may overwrite the cqe once we've marked it seen.  Therefore
the cqe must only be marked seen after the last access to it.  This is
analogous to a use-after-free bug: we're not allowed to access fields of
an object after it has been freed.

The place to do so is...

> +
> +        LuringAIOCB *luringcb = io_uring_cqe_get_data(cqes);
> +        luringcb->ret = io_cqe_ret(cqes);

...here:

  io_uring_cqe_seen(&s->ring, cqes);
  cqes = NULL; /* returned to ring, don't access it anymore */

> +        if (luringcb->co) {
> +            /*
> +             * If the coroutine is already entered it must be in ioq_submit()
> +             * and will notice luringcb->ret has been filled in when it
> +             * eventually runs later. Coroutines cannot be entered recursively
> +             * so avoid doing that!
> +             */
> +            if (!qemu_coroutine_entered(luringcb->co)) {
> +                aio_co_wake(luringcb->co);
> +            }
> +        } else {
> +            luringcb->common.cb(luringcb->common.opaque, luringcb->ret);
> +            qemu_aio_unref(luringcb);
> +        }
> +        /* Change counters one-by-one because we can be nested. */
> +        s->io_q.in_flight--;

This counter must be decremented before invoking luringcb's callback.
That way the nested event loop doesn't consider this completed request
in flight anymore.

> +static void ioq_submit(LuringState *s)
> +{
> +    int ret;
> +    LuringAIOCB *luringcb, *luringcb_next;
> +
> +    while(!s->io_q.in_queue) {

Should this be while (s->io_q.in_queue > 0)?

> +        QSIMPLEQ_FOREACH_SAFE(luringcb, &s->io_q.sq_overflow, next,
> +                              luringcb_next) {
> +            struct io_uring_sqe *sqes = io_uring_get_sqe(&s->ring);
> +            if (!sqes) {
> +                break;
> +            }
> +            /* Prep sqe for submission */
> +            *sqes = luringcb->sqeq;
> +            io_uring_sqe_set_data(sqes, luringcb);

This is unnecessary, the data field has already been set in
luring_do_submit() and copied to *sqes in the previous line.

> +BlockAIOCB *luring_submit(BlockDriverState *bs, LuringState *s, int fd,
> +        int64_t sector_num, QEMUIOVector *qiov, BlockCompletionFunc *cb,
> +        void *opaque, int type)
> +{
> +    LuringAIOCB *luringcb;
> +    off_t offset = sector_num * BDRV_SECTOR_SIZE;
> +    int ret;
> +
> +    luringcb = qemu_aio_get(&luring_aiocb_info, bs, cb, opaque);
> +    luringcb->ret = -EINPROGRESS;

luringcb isn't zeroed by qemu_aio_get().  luringcb->co must be
explicitly set to NULL to prevent undefined behavior in
qemu_luring_process_completions() (uninitialized memory access).

  luring->co = NULL;

By the way, this bug originates from linux-aio.c.  I have sent a patch
to fix it there!

> +    ret = luring_do_submit(fd, luringcb, s, offset, qiov, type);
> +    if (ret < 0) {
> +        qemu_aio_unref(luringcb);
> +        return NULL;
> +    }
> +
> +    return &luringcb->common;
> +}
> +
> +void luring_detach_aio_context(LuringState *s, AioContext *old_context)
> +{
> +    aio_set_fd_handler(old_context, s->ring.ring_fd, false, NULL, NULL, NULL,
> +                       &s);
> +    qemu_bh_delete(s->completion_bh);
> +    s->aio_context = NULL;
> +}
> +
> +void luring_attach_aio_context(LuringState *s, AioContext *new_context)
> +{
> +    s->aio_context = new_context;
> +    s->completion_bh = aio_bh_new(new_context, qemu_luring_completion_bh, s);
> +    aio_set_fd_handler(s->aio_context, s->ring.ring_fd, false,
> +                       qemu_luring_completion_cb, NULL, NULL, &s);
> +}
> +
> +LuringState *luring_init(Error **errp)
> +{
> +    int rc;
> +    LuringState *s;
> +    s = g_malloc0(sizeof(*s));
> +    struct io_uring *ring = &s->ring;
> +    rc =  io_uring_queue_init(MAX_EVENTS, ring, 0);
> +    if (rc == -1) {
> +        error_setg_errno(errp, -rc, "failed to init linux io_uring ring");

Why was this changed from error_setg_errno(errp, errno, "failed to init
linux io_uring ring") to -rc in v3?

rc is -1 here, not an errno value, so the error message will be
incorrect.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2019-05-27  9:33 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-27  8:03 [Qemu-devel] [PATCH v3 0/8] Add support for io_uring Aarushi Mehta
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 1/8] configure: permit use of io_uring Aarushi Mehta
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 2/8] qapi/block-core: add option for io_uring Aarushi Mehta
2019-05-27  8:29   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 3/8] block/block: add BDRV flag " Aarushi Mehta
2019-05-27  8:30   ` Stefan Hajnoczi
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 4/8] block/io_uring: implements interfaces " Aarushi Mehta
2019-05-27  9:32   ` Stefan Hajnoczi [this message]
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 5/8] stubs: add stubs for io_uring interface Aarushi Mehta
2019-05-27  8:32   ` Stefan Hajnoczi
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 6/8] util/async: add aio interfaces for io_uring Aarushi Mehta
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 7/8] blockdev: accept io_uring as option Aarushi Mehta
2019-05-27  8:03 ` [Qemu-devel] [PATCH v3 8/8] block/fileposix: extend to use io_uring Aarushi Mehta
2019-05-27  9:42   ` Stefan Hajnoczi
2019-05-27  8:25 ` [Qemu-devel] [PATCH v3 0/8] Add support for io_uring no-reply
2019-05-27  9:01   ` Stefan Hajnoczi
2019-05-27  8:43 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190527093221.GB25631@stefanha-x1.localdomain \
    --to=stefanha@redhat.com \
    --cc=jusual@mail.ru \
    --cc=kwolf@redhat.com \
    --cc=mehta.aaru20@gmail.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=saket.sinha89@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.