From: Stefan Hajnoczi <stefanha@redhat.com>
To: Aarushi Mehta <mehta.aaru20@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>,
	Sergio Lopez <slp@redhat.com>,
	qemu-block@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, Maxim Levitsky <mlevitsk@redhat.com>,
	saket.sinha89@gmail.com, Max Reitz <mreitz@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Stefan Hajnoczi <stefan@redhat.com>,
	Julia Suvorova <jusual@mail.ru>
Subject: Re: [Qemu-devel] [PATCH v8 16/16] block/io_uring: adds fd registration
Date: Wed, 31 Jul 2019 11:26:34 +0100
Message-ID: <20190731102634.GA22809@stefanha-x1.localdomain>
In-Reply-To: <20190730173441.26486-17-mehta.aaru20@gmail.com>

On Tue, Jul 30, 2019 at 11:04:41PM +0530, Aarushi Mehta wrote:

I'm concerned about file descriptor leaks.  fd_array[] keeps file
descriptors basically forever, even after the file is no longer in use
by the rest of QEMU.  There needs to be a call to unregister whenever a
file is closed elsewhere in QEMU.  For benchmarking and experimentation
the current code is okay, but for production usage the leak must be
prevented.
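
To make the unregister-on-close idea more concrete, here is a rough sketch
of the bookkeeping side only.  FdTable, fd_table_add() and
fd_table_remove() are hypothetical names that are not part of the patch,
and the fixed capacity is just to keep the example short.  The caller
would still have to re-register the shrunken array with io_uring and fix
up the lookup table entry for the fd that moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping, loosely modelled on fd_array[] in the patch */
typedef struct {
    int fds[16];    /* fixed capacity keeps the sketch short */
    size_t nr;      /* number of slots in use */
} FdTable;

/* Append fd; returns its index, or -1 if the table is full */
static int fd_table_add(FdTable *t, int fd)
{
    if (t->nr == sizeof(t->fds) / sizeof(t->fds[0])) {
        return -1;
    }
    t->fds[t->nr] = fd;
    return (int)t->nr++;
}

/*
 * Drop fd when the file is closed elsewhere.  The last entry is moved into
 * the freed slot so the array stays dense.  *moved_fd is set to the fd that
 * now lives at the returned index, or to -1 if no entry had to move.
 *
 * Returns the freed index, or -1 if fd was not in the table.
 */
static int fd_table_remove(FdTable *t, int fd, int *moved_fd)
{
    assert(moved_fd != NULL);
    *moved_fd = -1;

    for (size_t i = 0; i < t->nr; i++) {
        if (t->fds[i] != fd) {
            continue;
        }
        t->nr--;
        if (i != t->nr) {
            t->fds[i] = t->fds[t->nr];
            *moved_fd = t->fds[i];
        }
        return (int)i;
    }
    return -1;
}
```

Moving an entry changes the index that submissions must use for that fd,
so this only works if no request that refers to the old index is still
in flight.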

> +/**
> + * luring_fd_register:
> + *
> + * Register and unregisters file descriptors, see luring_fd_lookup
> + */
> +static int luring_fd_register(struct io_uring *ring, LuringFd *fd_reg, int fd)
> +{
> +    int ret, nr;
> +    GHashTable *lookup = fd_reg->fd_lookup;
> +    nr = g_hash_table_size(lookup);
> +
> +    /* Unregister */
> +    if (!fd) {
> +        ret = io_uring_unregister_files(ring);
> +        g_hash_table_remove_all(lookup);

Is it correct to clear the hash table if there was an error?

> +        return ret;
> +    }

Please make unregistering all files a separate function.  It's not
necessary to overload this function since this is a completely separate
operation.

> +
> +    /* If adding new, API requires older registrations to be removed */
> +    if (nr) {
> +        io_uring_unregister_files(ring);
> +    }
> +
> +    fd_reg->fd_array = g_realloc_n(fd_reg->fd_array, nr + 1, sizeof(int));
> +    fd_reg->fd_array[nr] = fd;
> +    fd_reg->fd_index = g_realloc_n(fd_reg->fd_index, nr + 1, sizeof(int));
> +    fd_reg->fd_index[nr] = nr;
> +
> +    g_hash_table_insert(lookup, &fd_reg->fd_array[nr], &fd_reg->fd_index[nr]);

fd_index[] is not necessary.  You can cast nr to a gpointer instead and
store the data directly inside the GHashTable:

  g_hash_table_insert(lookup, &fd_reg->fd_array[nr],
                      GINT_TO_POINTER(nr));

The hash table accesses can be made slightly more efficient by avoiding
the pointer dereference for keys as well:

  g_hash_table_insert(lookup, GINT_TO_POINTER(fd),
                      GINT_TO_POINTER(nr));

In this case fd_array[] is only used for the io_uring_register_files()
call and nothing else.  Remember to switch to g_direct_equal() and
g_direct_hash() in g_hash_table_new_full() if you make the key a direct
gpointer.
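
One caveat when packing nr into the value: index 0 becomes a NULL
pointer, which is also what g_hash_table_lookup() returns for a missing
key.  Either use g_hash_table_lookup_extended() or store nr + 1.  The
sketch below shows the second option without GLib; int_to_ptr() and
ptr_to_int() are approximate stand-ins for GINT_TO_POINTER() and
GPOINTER_TO_INT(), and index_to_value()/value_to_index() are hypothetical
helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Approximate stand-ins for GLib's GINT_TO_POINTER()/GPOINTER_TO_INT() */
static inline void *int_to_ptr(int v)
{
    return (void *)(intptr_t)v;
}

static inline int ptr_to_int(const void *p)
{
    return (int)(intptr_t)p;
}

/* Store index + 1 so that a valid index never encodes to NULL */
static inline void *index_to_value(int index)
{
    assert(index >= 0);
    return int_to_ptr(index + 1);
}

/* Returns the index, or -1 if value is NULL (key not found) */
static inline int value_to_index(const void *value)
{
    return value ? ptr_to_int(value) - 1 : -1;
}
```

With this encoding the lookup result can be passed straight to
value_to_index() and a negative return means the fd still needs to be
registered.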

> +    trace_luring_fd_register(fd, nr);
> +    return io_uring_register_files(ring, fd_reg->fd_array, nr + 1);
> +}
> +
> +/**
> + * luring_fd_lookup:
> + *
> + * Used to lookup fd index in registered array at submission time
> + * If the lookup table has not been created or the fd is not in the table,
> + * the fd is registered.
> + *
> + * If registration errors, the hash is cleared and the fd used directly
> + *
> + * Unregistering is done at luring_detach_aio_context
> + */
> +static int luring_fd_lookup(LuringState *s, int fd)
> +{
> +    int *index, ret;
> +    if (!s->fd_reg.fd_lookup) {
> +        s->fd_reg.fd_lookup = g_hash_table_new_full(g_int_hash, g_int_equal,
> +                                                    g_free, g_free);

fd_array[] and fd_index[] are each allocated as a single allocation for
the entire array, so calling g_free(key) or g_free(value) on individual
elements is undefined behavior and could crash the program.  There
should be no destroy functions for them.

A g_hash_table_unref() call is also missing, so fd_lookup is never freed.

> +        luring_fd_register(&s->ring, &s->fd_reg, fd);
> +    }
> +    index = g_hash_table_lookup(s->fd_reg.fd_lookup, &fd);
> +
> +    if (!index) {
> +        ret = luring_fd_register(&s->ring, &s->fd_reg, fd);
> +        if (ret < 0) {
> +            g_hash_table_remove_all(s->fd_reg.fd_lookup);

Why is the hash table cleared and why are fd_array[]/fd_index[] left
behind?

> +            return ret;
> +        }
> +        index = g_hash_table_lookup(s->fd_reg.fd_lookup, &fd);
> +    }
> +    return *index;
> +}

What are the concerns about in-flight requests and how are they
addressed?  For example, if a request is in-flight and another request
wants to add a new fd then io_uring_unregister_files() and
io_uring_register_files() are called while a request is still in-flight.
How does the io_uring kernel code handle this?
