qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [ANNOUNCE] libblkio v0.1.0 preview release
@ 2021-04-29 14:05 Stefan Hajnoczi
  2021-04-29 14:22 ` Richard W.M. Jones
  2021-04-29 15:51 ` Kevin Wolf
  0 siblings, 2 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-04-29 14:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 3008 bytes --]

Hi,
A preview release of libblkio, a library for high-performance block I/O,
is now available:

  https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0

Applications are increasingly integrating high-performance I/O
interfaces such as Linux io_uring, userspace device drivers, and
vhost-user device support. The effort required to add each of these
low-level interfaces into an application is relatively high. libblkio
provides a single API for efficiently accessing block devices and
eliminates the need to write custom code for each one.

The library is not yet ready to use and currently lacks many features.
In fact, I hope this news doesn't spread too far yet because libblkio is
at a very early stage, but feedback from the QEMU community is necessary
at this time.

The purpose of this preview release is to discuss both the API design
and general direction of the project. API documentation is available
here:

  https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst

Examples are available here:

  https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples

The goal is to eventually include the following drivers:
- Linux io_uring
- NVMe (VFIO and vfio-user)
- virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)

There are a few reasons why libblkio is needed:

1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
   more low-level block I/O code, most of it duplicated. Usually only
   one or two of Linux AIO, io_uring, userspace drivers, vhost-user
   drivers, etc are implemented. This makes it difficult to benefit from
   the latest advances in high-performance block I/O.

2. Coding to a standard API makes it possible to introduce new
   optimizations or hardware interfaces without costly changes to the
   software stack.

3. A client library is needed so applications can take advantage of
   qemu-storage-daemon's vhost-user-blk exports.

4. Implementing block I/O as a library allows QEMU to use Rust for new
   code without messy QEMU internal API bindings. Note that libblkio
   currently does not provide a Rust crate, it only offers a C API.

For QEMU integration the next step is a libblkio BlockDriver. This new
BlockDriver will provide libblkio functionality to QEMU. Eventually
libblkio will provide block/file-posix.c, block/nvme.c, and the upcoming
vhost-vdpa-blk functionality and the QEMU code can be simplified or
removed.

libblkio does not contain an event loop implementation or disk image
format drivers and that functionality will not be part of libblkio.

How to participate:
1. Share your thoughts on the direction and your requirements.
2. Review the API docs and give feedback.
3. Write a libblkio driver for NVMe VFIO/vfio-user, virtio-blk
   VFIO/vfio-user, vhost-user-blk. Or just a null driver for
   benchmarking/testing :).
4. Integrate libblkio into Ceph, GlusterFS, MariaDB, etc.

I am now beginning QEMU and fio integration. Stefano Garzarella is
looking at adding a vhost-vdpa-blk driver to libblkio.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 14:05 [ANNOUNCE] libblkio v0.1.0 preview release Stefan Hajnoczi
@ 2021-04-29 14:22 ` Richard W.M. Jones
  2021-04-29 14:41   ` Stefan Hajnoczi
  2021-04-29 15:51 ` Kevin Wolf
  1 sibling, 1 reply; 22+ messages in thread
From: Richard W.M. Jones @ 2021-04-29 14:22 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> The purpose of this preview release is to discuss both the API design
> and general direction of the project. API documentation is available
> here:
> 
>   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst

libvirt originally, and now libnbd, keep a per-thread error message
(stored in thread-local storage).  It's a lot nicer than having to
pass &errmsg to every function.  You can just write:

 if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
   fprintf (stderr,
            "failed to connect to remote server: %s (errno = %d)\n",
            nbd_get_error (), nbd_get_errno ());
   exit (EXIT_FAILURE);
 }

(https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)

It means you can extend the range of error information available in
future.  Also you can return a 'const char *' and the application
doesn't have to worry about lifetimes, at least in the common case.

> Examples are available here:
> 
>   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples
> 
> The goal is to eventually include the following drivers:
> - Linux io_uring
> - NVMe (VFIO and vfio-user)
> - virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)
>
> There are a few reasons why libblkio is needed:
> 
> 1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
>    more low-level block I/O code, most of it duplicated. Usually only
>    one or two of Linux AIO, io_uring, userspace drivers, vhost-user
>    drivers, etc are implemented. This makes it difficult to benefit from
>    the latest advances in high-performance block I/O.
> 
> 2. Coding to a standard API makes it possible to introduce new
>    optimizations or hardware interfaces without costly changes to the
>    software stack.
> 
> 3. A client library is needed so applications can take advantage of
>    qemu-storage-daemon's vhost-user-blk exports.
> 
> 4. Implementing block I/O as a library allows QEMU to use Rust for new
>    code without messy QEMU internal API bindings. Note that libblkio
>    currently does not provide a Rust crate, it only offers a C API.

This is where I get confused about what this library actually does.
It's not just a nicer wrapper around io_uring, but what is it actually
doing?

Rich.

> For QEMU integration the next step is a libblkio BlockDriver. This new
> BlockDriver will provide libblkio functionality to QEMU. Eventually
> libblkio will provide block/file-posix.c, block/nvme.c, and the upcoming
> vhost-vdpa-blk functionality and the QEMU code can be simplified or
> removed.
> 
> libblkio does not contain an event loop implementation or disk image
> format drivers and that functionality will not be part of libblkio.
> 
> How to participate:
> 1. Share your thoughts on the direction and your requirements.
> 2. Review the API docs and give feedback.
> 3. Write a libblkio driver for NVMe VFIO/vfio-user, virtio-blk
>    VFIO/vfio-user, vhost-user-blk. Or just a null driver for
>    benchmarking/testing :).
> 4. Integrate libblkio into Ceph, GlusterFS, MariaDB, etc.
> 
> I am now beginning QEMU and fio integration. Stefano Garzarella is
> looking at adding a vhost-vdpa-blk driver to libblkio.
> 
> Stefan



-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 14:22 ` Richard W.M. Jones
@ 2021-04-29 14:41   ` Stefan Hajnoczi
  2021-04-29 15:00     ` Richard W.M. Jones
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-04-29 14:41 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 3127 bytes --]

On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > The purpose of this preview release is to discuss both the API design
> > and general direction of the project. API documentation is available
> > here:
> > 
> >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> 
> libvirt originally, and now libnbd, keep a per-thread error message
> (stored in thread-local storage).  It's a lot nicer than having to
> pass &errmsg to every function.  You can just write:
> 
>  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
>    fprintf (stderr,
>             "failed to connect to remote server: %s (errno = %d)\n",
>             nbd_get_error (), nbd_get_errno ());
>    exit (EXIT_FAILURE);
>  }
> 
> (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> 
> It means you can extend the range of error information available in
> future.  Also you can return a 'const char *' and the application
> doesn't have to worry about lifetimes, at least in the common case.

Thanks for sharing the idea, I think it would work well for libblkio
too.

Do you ignore the dlclose(3) memory leak?

> > Examples are available here:
> > 
> >   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples
> > 
> > The goal is to eventually include the following drivers:
> > - Linux io_uring
> > - NVMe (VFIO and vfio-user)
> > - virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)
> >
> > There are a few reasons why libblkio is needed:
> > 
> > 1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
> >    more low-level block I/O code, most of it duplicated. Usually only
> >    one or two of Linux AIO, io_uring, userspace drivers, vhost-user
> >    drivers, etc are implemented. This makes it difficult to benefit from
> >    the latest advances in high-performance block I/O.
> > 
> > 2. Coding to a standard API makes it possible to introduce new
> >    optimizations or hardware interfaces without costly changes to the
> >    software stack.
> > 
> > 3. A client library is needed so applications can take advantage of
> >    qemu-storage-daemon's vhost-user-blk exports.
> > 
> > 4. Implementing block I/O as a library allows QEMU to use Rust for new
> >    code without messy QEMU internal API bindings. Note that libblkio
> >    currently does not provide a Rust crate, it only offers a C API.
> 
> This is where I get confused about what this library actually does.
> It's not just a nicer wrapper around io_uring, but what is it actually
> doing?

It's a library for what QEMU calls protocol drivers (POSIX files,
userspace NVMe driver, etc). In particular, anything offering
multi-queue block I/O fits into libblkio.

It is not intended to replace libnbd or other network storage libraries.
libblkio's properties API is synchronous to keep things simple for
applications. Attaching to network storage needs to be asynchronous,
although the libblkio API could be extended if people want to support
network storage.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 14:41   ` Stefan Hajnoczi
@ 2021-04-29 15:00     ` Richard W.M. Jones
  2021-04-29 15:08       ` Daniel P. Berrangé
  2021-04-30 16:20       ` [ANNOUNCE] libblkio v0.1.0 preview release Stefan Hajnoczi
  0 siblings, 2 replies; 22+ messages in thread
From: Richard W.M. Jones @ 2021-04-29 15:00 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > The purpose of this preview release is to discuss both the API design
> > > and general direction of the project. API documentation is available
> > > here:
> > > 
> > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > 
> > libvirt originally, and now libnbd, keep a per-thread error message
> > (stored in thread-local storage).  It's a lot nicer than having to
> > pass &errmsg to every function.  You can just write:
> > 
> >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> >    fprintf (stderr,
> >             "failed to connect to remote server: %s (errno = %d)\n",
> >             nbd_get_error (), nbd_get_errno ());
> >    exit (EXIT_FAILURE);
> >  }
> > 
> > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > 
> > It means you can extend the range of error information available in
> > future.  Also you can return a 'const char *' and the application
> > doesn't have to worry about lifetimes, at least in the common case.
> 
> Thanks for sharing the idea, I think it would work well for libblkio
> too.
> 
> Do you ignore the dlclose(3) memory leak?

I believe this mechanism in libnbd ensures there is no leak in the
ordinary shared library (not dlopen/dlclose) case:

https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35

However I'm not sure what happens in the dlopen case, so I'm going to
test that out now ...

> > > Examples are available here:
> > > 
> > >   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples
> > > 
> > > The goal is to eventually include the following drivers:
> > > - Linux io_uring
> > > - NVMe (VFIO and vfio-user)
> > > - virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)
> > >
> > > There are a few reasons why libblkio is needed:
> > > 
> > > 1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
> > >    more low-level block I/O code, most of it duplicated. Usually only
> > >    one or two of Linux AIO, io_uring, userspace drivers, vhost-user
> > >    drivers, etc are implemented. This makes it difficult to benefit from
> > >    the latest advances in high-performance block I/O.
> > > 
> > > 2. Coding to a standard API makes it possible to introduce new
> > >    optimizations or hardware interfaces without costly changes to the
> > >    software stack.
> > > 
> > > 3. A client library is needed so applications can take advantage of
> > >    qemu-storage-daemon's vhost-user-blk exports.
> > > 
> > > 4. Implementing block I/O as a library allows QEMU to use Rust for new
> > >    code without messy QEMU internal API bindings. Note that libblkio
> > >    currently does not provide a Rust crate, it only offers a C API.
> > 
> > This is where I get confused about what this library actually does.
> > It's not just a nicer wrapper around io_uring, but what is it actually
> > doing?
> 
> It's a library for what QEMU calls protocol drivers (POSIX files,
> userspace NVMe driver, etc). In particular, anything offering
> multi-queue block I/O fits into libblkio.
> 
> It is not intended to replace libnbd or other network storage libraries.
> libblkio's properties API is synchronous to keep things simple for
> applications. Attaching to network storage needs to be asynchronous,
> although the libblkio API could be extended if people want to support
> network storage.

I think what confuses me is why is NVMe any different from io_uring?
ie would this work?

$ blkio-info --output=json io_uring path=/dev/nvme0

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 15:00     ` Richard W.M. Jones
@ 2021-04-29 15:08       ` Daniel P. Berrangé
  2021-04-29 15:31         ` Richard W.M. Jones
  2021-04-29 17:17         ` libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release) Richard W.M. Jones
  2021-04-30 16:20       ` [ANNOUNCE] libblkio v0.1.0 preview release Stefan Hajnoczi
  1 sibling, 2 replies; 22+ messages in thread
From: Daniel P. Berrangé @ 2021-04-29 15:08 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block,
	Markus Armbruster, qemu-devel, mreitz,
	Vladimir Sementsov-Ogievskiy, Stefan Hajnoczi, Klaus Jensen,
	philmd, sgarzare

On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > > The purpose of this preview release is to discuss both the API design
> > > > and general direction of the project. API documentation is available
> > > > here:
> > > > 
> > > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > > 
> > > libvirt originally, and now libnbd, keep a per-thread error message
> > > (stored in thread-local storage).  It's a lot nicer than having to
> > > pass &errmsg to every function.  You can just write:
> > > 
> > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > >    fprintf (stderr,
> > >             "failed to connect to remote server: %s (errno = %d)\n",
> > >             nbd_get_error (), nbd_get_errno ());
> > >    exit (EXIT_FAILURE);
> > >  }
> > > 
> > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > 
> > > It means you can extend the range of error information available in
> > > future.  Also you can return a 'const char *' and the application
> > > doesn't have to worry about lifetimes, at least in the common case.
> > 
> > Thanks for sharing the idea, I think it would work well for libblkio
> > too.
> > 
> > Do you ignore the dlclose(3) memory leak?
> 
> I believe this mechanism in libnbd ensures there is no leak in the
> ordinary shared library (not dlopen/dlclose) case:
> 
> https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> 
> However I'm not sure what happens in the dlopen case, so I'm going to
> test that out now ...

pthread_key destructors are a disaster waiting to happen with
dlclose, if the dlclose happens while non-main threads are
still running. When those threads exit, they'll run the
destructor which points to a function that is no longer
resident in memory.

IOW if you have destrutors, then you need to make sure your
library uses "-z nodelete" when linking, to turn dlclose()
into a no-op.

  commit 8e44e5593eb9b89fbc0b54fde15f130707a0d81e
  Author: Daniel P. Berrangé <berrange@redhat.com>
  Date:   Thu Sep 1 17:57:06 2011 +0100

    Prevent crash from dlclose() of libvirt.so
    
    When libvirt calls virInitialize it creates a thread local
    for the virErrorPtr storage, and registers a callback to
    cleanup memory when a thread exits. When libvirt is dlclose()d
    or otherwise made non-resident, the callback function is
    removed from memory, but the thread local may still exist
    and if a thread later exists, it will invoke the callback
    and SEGV. There may also be other thread locals with callbacks
    pointing to libvirt code, so it is in general never safe to
    unload libvirt.so from memory once initialized.
    
    To allow dlclose() to succeed, but keep libvirt.so resident
    in memory, link with '-z nodelete'. This issue was first
    found with the libvirt CIM provider, but can potentially
    hit many of the dynamic language bindings which all ultimately
    involve dlopen() in some way, either on libvirt.so itself,
    or on the glue code for the binding which in turns links
    to libvirt



Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 15:08       ` Daniel P. Berrangé
@ 2021-04-29 15:31         ` Richard W.M. Jones
  2021-04-29 15:35           ` Daniel P. Berrangé
  2021-04-29 17:17         ` libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release) Richard W.M. Jones
  1 sibling, 1 reply; 22+ messages in thread
From: Richard W.M. Jones @ 2021-04-29 15:31 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block,
	Markus Armbruster, qemu-devel, mreitz,
	Vladimir Sementsov-Ogievskiy, Stefan Hajnoczi, Klaus Jensen,
	philmd, sgarzare

On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > > > The purpose of this preview release is to discuss both the API design
> > > > > and general direction of the project. API documentation is available
> > > > > here:
> > > > > 
> > > > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > > > 
> > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > pass &errmsg to every function.  You can just write:
> > > > 
> > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > >    fprintf (stderr,
> > > >             "failed to connect to remote server: %s (errno = %d)\n",
> > > >             nbd_get_error (), nbd_get_errno ());
> > > >    exit (EXIT_FAILURE);
> > > >  }
> > > > 
> > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > 
> > > > It means you can extend the range of error information available in
> > > > future.  Also you can return a 'const char *' and the application
> > > > doesn't have to worry about lifetimes, at least in the common case.
> > > 
> > > Thanks for sharing the idea, I think it would work well for libblkio
> > > too.
> > > 
> > > Do you ignore the dlclose(3) memory leak?
> > 
> > I believe this mechanism in libnbd ensures there is no leak in the
> > ordinary shared library (not dlopen/dlclose) case:
> > 
> > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > 
> > However I'm not sure what happens in the dlopen case, so I'm going to
> > test that out now ...
> 
> pthread_key destructors are a disaster waiting to happen with
> dlclose, if the dlclose happens while non-main threads are
> still running. When those threads exit, they'll run the
> destructor which points to a function that is no longer
> resident in memory.
> 
> IOW if you have destrutors, then you need to make sure your
> library uses "-z nodelete" when linking, to turn dlclose()
> into a no-op.

I suspect letting the string leak is a better idea for libnbd.

Still trying to write a nice reproducer ..

Rich.

>   commit 8e44e5593eb9b89fbc0b54fde15f130707a0d81e
>   Author: Daniel P. Berrangé <berrange@redhat.com>
>   Date:   Thu Sep 1 17:57:06 2011 +0100
> 
>     Prevent crash from dlclose() of libvirt.so
>     
>     When libvirt calls virInitialize it creates a thread local
>     for the virErrorPtr storage, and registers a callback to
>     cleanup memory when a thread exits. When libvirt is dlclose()d
>     or otherwise made non-resident, the callback function is
>     removed from memory, but the thread local may still exist
>     and if a thread later exists, it will invoke the callback
>     and SEGV. There may also be other thread locals with callbacks
>     pointing to libvirt code, so it is in general never safe to
>     unload libvirt.so from memory once initialized.
>     
>     To allow dlclose() to succeed, but keep libvirt.so resident
>     in memory, link with '-z nodelete'. This issue was first
>     found with the libvirt CIM provider, but can potentially
>     hit many of the dynamic language bindings which all ultimately
>     involve dlopen() in some way, either on libvirt.so itself,
>     or on the glue code for the binding which in turns links
>     to libvirt
> 
> 
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 15:31         ` Richard W.M. Jones
@ 2021-04-29 15:35           ` Daniel P. Berrangé
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel P. Berrangé @ 2021-04-29 15:35 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block,
	Markus Armbruster, qemu-devel, mreitz,
	Vladimir Sementsov-Ogievskiy, Stefan Hajnoczi, Klaus Jensen,
	philmd, sgarzare

On Thu, Apr 29, 2021 at 04:31:41PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > > > > The purpose of this preview release is to discuss both the API design
> > > > > > and general direction of the project. API documentation is available
> > > > > > here:
> > > > > > 
> > > > > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > > > > 
> > > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > > pass &errmsg to every function.  You can just write:
> > > > > 
> > > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > > >    fprintf (stderr,
> > > > >             "failed to connect to remote server: %s (errno = %d)\n",
> > > > >             nbd_get_error (), nbd_get_errno ());
> > > > >    exit (EXIT_FAILURE);
> > > > >  }
> > > > > 
> > > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > > 
> > > > > It means you can extend the range of error information available in
> > > > > future.  Also you can return a 'const char *' and the application
> > > > > doesn't have to worry about lifetimes, at least in the common case.
> > > > 
> > > > Thanks for sharing the idea, I think it would work well for libblkio
> > > > too.
> > > > 
> > > > Do you ignore the dlclose(3) memory leak?
> > > 
> > > I believe this mechanism in libnbd ensures there is no leak in the
> > > ordinary shared library (not dlopen/dlclose) case:
> > > 
> > > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > > 
> > > However I'm not sure what happens in the dlopen case, so I'm going to
> > > test that out now ...
> > 
> > pthread_key destructors are a disaster waiting to happen with
> > dlclose, if the dlclose happens while non-main threads are
> > still running. When those threads exit, they'll run the
> > destructor which points to a function that is no longer
> > resident in memory.
> > 
> > IOW if you have destrutors, then you need to make sure your
> > library uses "-z nodelete" when linking, to turn dlclose()
> > into a no-op.
> 
> I suspect letting the string leak is a better idea for libnbd.

Is dlclose() really that important ? libnbd is such a small
thing that i doubt anyone will even notice the space being
consumed if you mark it nodelete, and if an app is repeatedly
doing an op that triggers dlopen+dlclose many times, then
dlclose is especially useless.

Personally I think removing memory leaks on thread exit is
more important than honouring dlclose, as some apps can do
pathelogical things creating *many* very short lived
threads.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 14:05 [ANNOUNCE] libblkio v0.1.0 preview release Stefan Hajnoczi
  2021-04-29 14:22 ` Richard W.M. Jones
@ 2021-04-29 15:51 ` Kevin Wolf
  2021-04-30 15:49   ` Stefan Hajnoczi
  1 sibling, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2021-04-29 15:51 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 2804 bytes --]

Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> Hi,
> A preview release of libblkio, a library for high-performance block I/O,
> is now available:
> 
>   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0
> 
> Applications are increasingly integrating high-performance I/O
> interfaces such as Linux io_uring, userspace device drivers, and
> vhost-user device support. The effort required to add each of these
> low-level interfaces into an application is relatively high. libblkio
> provides a single API for efficiently accessing block devices and
> eliminates the need to write custom code for each one.
> 
> The library is not yet ready to use and currently lacks many features.
> In fact, I hope this news doesn't spread too far yet because libblkio is
> at a very early stage, but feedback from the QEMU community is necessary
> at this time.

I'm a bit worried about the configuration interface. This looks an awful
lot like plain QOM properties, which have proven to result in both very
verbose (not to say messy) and rather error prone implementations.

You have to write setters/getters for every property, even if it usually
ends up doing the same thing, storing the value somewhere. Worse, these
getters and setters have to work in very different circumstances, namely
initialisation where you usually have to store the value away so that it
can be checked for consistency with other properties in .realize() or
.complete(), and property updates during runtime. Often enough, we
forget the latter and create bugs. If we don't create bugs, we usually
start with 'if (realized)' and have two completely different code paths.
Another common bug in QOM objects is forgetting to check if mandatory
properties were actually set.

Did you already consider these problems and decided to go this way
anyway, or is this more of an accidental design? And if the former, what
were the reasons that made it appealing?

Alternatives in QEMU are qdev properties (which are internally QOM
properties, but provide default implementations and are at least
automatically read-only after realize, avoiding that whole class of
bugs) and QAPI.
If this was QEMU code, I would of course go for QAPI, but a library is
something different and adding the code generator would probably be a
bit too much anyway. But the idea in the resulting code would be dealing
with native structs instead of a bunch of function calls. This would
probably be the least error prone way for the implementation, but of
course, it would make binary compatibility a bit harder when adding new
properties.

Thinking a bit further, we'll likely get the same problems as with QOM
in other places, too, e.g. how do you introspect which properties exist
in a given build?

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release)
  2021-04-29 15:08       ` Daniel P. Berrangé
  2021-04-29 15:31         ` Richard W.M. Jones
@ 2021-04-29 17:17         ` Richard W.M. Jones
  2021-04-29 17:27           ` Daniel P. Berrangé
  1 sibling, 1 reply; 22+ messages in thread
From: Richard W.M. Jones @ 2021-04-29 17:17 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block,
	Markus Armbruster, qemu-devel, mreitz,
	Vladimir Sementsov-Ogievskiy, Stefan Hajnoczi, Klaus Jensen,
	philmd, sgarzare

On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > pass &errmsg to every function.  You can just write:
> > > > 
> > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > >    fprintf (stderr,
> > > >             "failed to connect to remote server: %s (errno = %d)\n",
> > > >             nbd_get_error (), nbd_get_errno ());
> > > >    exit (EXIT_FAILURE);
> > > >  }
> > > > 
> > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > 
> > > > It means you can extend the range of error information available in
> > > > future.  Also you can return a 'const char *' and the application
> > > > doesn't have to worry about lifetimes, at least in the common case.
> > > 
> > > Thanks for sharing the idea, I think it would work well for libblkio
> > > too.
> > > 
> > > Do you ignore the dlclose(3) memory leak?
> > 
> > I believe this mechanism in libnbd ensures there is no leak in the
> > ordinary shared library (not dlopen/dlclose) case:
> > 
> > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > 
> > However I'm not sure what happens in the dlopen case, so I'm going to
> > test that out now ...
> 
> pthread_key destructors are a disaster waiting to happen with
> dlclose, if the dlclose happens while non-main threads are
> still running. When those threads exit, they'll run the
> destructor which points to a function that is no longer
> resident in memory.

While libnbd had a memory leak there, now fixed by:

https://gitlab.com/nbdkit/libnbd/-/commit/026d281c57dd95485cc9bf829918b5efd9e32ddb

I don't actually think we have the bug you're describing.  We have a
destructor (errors_free()) which runs on dlclose, which deletes the
thread-local storage key, so the destructor will not run after the
library has been unloaded.

I added a test for this which works fine for me:

https://gitlab.com/nbdkit/libnbd/-/commit/831d142787aba4f5b638418e02cf7e0f2a051565

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release)
  2021-04-29 17:17         ` libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release) Richard W.M. Jones
@ 2021-04-29 17:27           ` Daniel P. Berrangé
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel P. Berrangé @ 2021-04-29 17:27 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block,
	Markus Armbruster, qemu-devel, mreitz,
	Vladimir Sementsov-Ogievskiy, Stefan Hajnoczi, Klaus Jensen,
	philmd, sgarzare

On Thu, Apr 29, 2021 at 06:17:32PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > > pass &errmsg to every function.  You can just write:
> > > > > 
> > > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > > >    fprintf (stderr,
> > > > >             "failed to connect to remote server: %s (errno = %d)\n",
> > > > >             nbd_get_error (), nbd_get_errno ());
> > > > >    exit (EXIT_FAILURE);
> > > > >  }
> > > > > 
> > > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > > 
> > > > > It means you can extend the range of error information available in
> > > > > future.  Also you can return a 'const char *' and the application
> > > > > doesn't have to worry about lifetimes, at least in the common case.
> > > > 
> > > > Thanks for sharing the idea, I think it would work well for libblkio
> > > > too.
> > > > 
> > > > Do you ignore the dlclose(3) memory leak?
> > > 
> > > I believe this mechanism in libnbd ensures there is no leak in the
> > > ordinary shared library (not dlopen/dlclose) case:
> > > 
> > > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > > 
> > > However I'm not sure what happens in the dlopen case, so I'm going to
> > > test that out now ...
> > 
> > pthread_key destructors are a disaster waiting to happen with
> > dlclose, if the dlclose happens while non-main threads are
> > still running. When those threads exit, they'll run the
> > destructor which points to a function that is no longer
> > resident in memory.
> 
> While libnbd had a memory leak there, now fixed by:
> 
> https://gitlab.com/nbdkit/libnbd/-/commit/026d281c57dd95485cc9bf829918b5efd9e32ddb

So on dlclose(), the errors_free() method will be run by the linker.

The pthread_key_delete method doesn't run the cleanup callbacks, so
your fix here frees the thread local error string in the thread that
called dlclose explicitly.

If any other thread had an error string associated with the thread
local, that will still be leaked.  I don't see any attractive way
to avoid that leak.

IOW, I think the fix is incomplete, but its better than nothing,
and I don't have suggestion to improve it.

> I don't actually think we have the bug you're describing.  We have a
> destructor (errors_free()) which runs on dlclose, which deletes the
> thread-local storage key, so the destructor will not run after the
> library has been unloaded.
> 
> I added a test for this which works fine for me:
> 
> https://gitlab.com/nbdkit/libnbd/-/commit/831d142787aba4f5b638418e02cf7e0f2a051565

Yes, I think your using of ELF destructor function is enough in this
scenario.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 15:51 ` Kevin Wolf
@ 2021-04-30 15:49   ` Stefan Hajnoczi
  2021-05-04 13:44     ` Kevin Wolf
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-04-30 15:49 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 6345 bytes --]

On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > Hi,
> > A preview release of libblkio, a library for high-performance block I/O,
> > is now available:
> > 
> >   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0
> > 
> > Applications are increasingly integrating high-performance I/O
> > interfaces such as Linux io_uring, userspace device drivers, and
> > vhost-user device support. The effort required to add each of these
> > low-level interfaces into an application is relatively high. libblkio
> > provides a single API for efficiently accessing block devices and
> > eliminates the need to write custom code for each one.
> > 
> > The library is not yet ready to use and currently lacks many features.
> > In fact, I hope this news doesn't spread too far yet because libblkio is
> > at a very early stage, but feedback from the QEMU community is necessary
> > at this time.
> 
> I'm a bit worried about the configuration interface. This looks an awful
> lot like plain QOM properties, which have proven to result in both very
> verbose (not to say messy) and rather error prone implementations.
> 
> You have to write setters/getters for every property, even if it usually
> ends up doing the same thing, storing the value somewhere. Worse, these
> getters and setters have to work in very different circumstances, namely
> initialisation where you usually have to store the value away so that it
> can be checked for consistency with other properties in .realize() or
> .complete(), and property updates during runtime. Often enough, we
> forget the latter and create bugs. If we don't create bugs, we usually
> start with 'if (realized)' and have two completely different code paths.
> Another common bug in QOM objects is forgetting to check if mandatory
> properties were actually set.
>
> Did you already consider these problems and decided to go this way
> anyway, or is this more of an accidental design? And if the former, what
> were the reasons that made it appealing?

That's true. Here is the code to reject accesses when the instance is
not initialized:

  self.must_be_initialized()?;

It's very concise but you still need to remember to add it.

The overall reasons for choosing the properties API were:

1. It keeps the library's API very minimal (just generic getters/setters
   for primitive types). It minimizes ABI compatibility issues because
   there are no configuration structs or functions exported by the
   library.

2. It's trivial to have a string setter/getter that automatically
   converts to the primitive type representation, so application config
   file or command-line values can easily be set.

   This is kind of the inverse of what you're saying. If the library
   offers dedicated interfaces for each configuration value then the
   library doesn't need getters/setters for each one but all
   applications need special-purpose code for each configuration value.

That said, this is exactly why I published the preview release. If
someone has a better way to do this or the feedback is just that this is
bad style, then I'm happy to change it.

> Alternatives in QEMU are qdev properties (which are internally QOM
> properties, but provide default implementations and are at least
> automatically read-only after realize, avoiding that whole class of
> bugs) and QAPI.
> If this was QEMU code, I would of course go for QAPI, but a library is
> something different and adding the code generator would probably be a
> bit too much anyway. But the idea in the resulting code would be dealing
> with native structs instead of a bunch of function calls. This would
> probably be the least error prone way for the implementation, but of
> course, it would make binary compatibility a bit harder when adding new
> properties.

An alternative I considered was the typestate and builder patterns:

  /* Create a new io_uring driver in the uninitialized state */
  struct blkio_iou_uninit *blkio_new_io_uring(void);

  /* Uninitialized state property setters */
  int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
                                const char *path);
  int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
                                  bool o_direct);
  ...

  /* Transition to initialized state. Frees u on success. */
  struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);

  /* Initialized state property setters/getters */
  int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
                                  uint64_t *capacity);
  ...

  /* Transition to started state. Frees i on success. */
  struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);

  ...

  /* Transition back to initialized state. Frees s on success. */
  struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);

On the plus side:

- No state checks are needed because an API won't even exist if it's
  unavailable in a given state (uninitialized/initialized/started).

- State structs come with pre-initialized default values, so the caller
  only needs to set non-default values. For example O_DIRECT is false by
  default and callers happy with that don't need to set the property.

- ABI compatibility is easy since the state structs are opaque (their
  size is not defined) and new properties can be added at any time.

On the minus side:

- Completely static. Hard to introspect and requires a dedicated call
  site for each property (applications cannot simply assign a property
  string given to them on the command-line). This means every single
  property must be explicitly coded into every application :(.

- So many functions! This makes understanding the API harder.

- Very verbose. The function and type names get long and there is a lot
  of repetition in the API.

> Thinking a bit further, we'll likely get the same problems as with QOM
> in other places, too, e.g. how do you introspect which properties exist
> in a given build?

Right, there is no introspection although probing individual properties
is possible (-ENOENT). Adding introspection wouldn't be hard but the
library API just doesn't exist today.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-29 15:00     ` Richard W.M. Jones
  2021-04-29 15:08       ` Daniel P. Berrangé
@ 2021-04-30 16:20       ` Stefan Hajnoczi
  1 sibling, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-04-30 16:20 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 1319 bytes --]

On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > This is where I get confused about what this library actually does.
> > > It's not just a nicer wrapper around io_uring, but what is it actually
> > > doing?
> > 
> > It's a library for what QEMU calls protocol drivers (POSIX files,
> > userspace NVMe driver, etc). In particular, anything offering
> > multi-queue block I/O fits into libblkio.
> > 
> > It is not intended to replace libnbd or other network storage libraries.
> > libblkio's properties API is synchronous to keep things simple for
> > applications. Attaching to network storage needs to be asynchronous,
> > although the libblkio API could be extended if people want to support
> > network storage.
> 
> I think what confuses me is why is NVMe any different from io_uring?
> ie would this work?
> 
> $ blkio-info --output=json io_uring path=/dev/nvme0

The libblkio io_uring driver can handle /dev/nvme* block devices.

The future userspace NVMe driver mentioned above will be a VFIO PCI
driver, like the block/nvme.c driver in QEMU today. It uses a PCI device
directly, not a /dev/nvme* block device.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-04-30 15:49   ` Stefan Hajnoczi
@ 2021-05-04 13:44     ` Kevin Wolf
  2021-05-05 16:19       ` Stefan Hajnoczi
  0 siblings, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2021-05-04 13:44 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 8473 bytes --]

Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > > Hi,
> > > A preview release of libblkio, a library for high-performance block I/O,
> > > is now available:
> > > 
> > >   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0
> > > 
> > > Applications are increasingly integrating high-performance I/O
> > > interfaces such as Linux io_uring, userspace device drivers, and
> > > vhost-user device support. The effort required to add each of these
> > > low-level interfaces into an application is relatively high. libblkio
> > > provides a single API for efficiently accessing block devices and
> > > eliminates the need to write custom code for each one.
> > > 
> > > The library is not yet ready to use and currently lacks many features.
> > > In fact, I hope this news doesn't spread too far yet because libblkio is
> > > at a very early stage, but feedback from the QEMU community is necessary
> > > at this time.
> > 
> > I'm a bit worried about the configuration interface. This looks an awful
> > lot like plain QOM properties, which have proven to result in both very
> > verbose (not to say messy) and rather error prone implementations.
> > 
> > You have to write setters/getters for every property, even if it usually
> > ends up doing the same thing, storing the value somewhere. Worse, these
> > getters and setters have to work in very different circumstances, namely
> > initialisation where you usually have to store the value away so that it
> > can be checked for consistency with other properties in .realize() or
> > .complete(), and property updates during runtime. Often enough, we
> > forget the latter and create bugs. If we don't create bugs, we usually
> > start with 'if (realized)' and have two completely different code paths.
> > Another common bug in QOM objects is forgetting to check if mandatory
> > properties were actually set.
> >
> > Did you already consider these problems and decided to go this way
> > anyway, or is this more of an accidental design? And if the former, what
> > were the reasons that made it appealing?
> 
> That's true. Here is the code to reject accesses when the instance is
> not initialized:
> 
>   self.must_be_initialized()?;
> 
> It's very concise but you still need to remember to add it.

Yes, I think the problem isn't the length (in QOM it's commonly four
lines because we don't inline it like that, but same complexity really),
but remembering to have it there.

> The overall reasons for choosing the properties API were:
> 
> 1. It keeps the library's API very minimal (just generic getters/setters
>    for primitive types). It minimizes ABI compatibility issues because
>    there are no configuration structs or functions exported by the
>    library.
> 
> 2. It's trivial to have a string setter/getter that automatically
>    converts to the primitive type representation, so application config
>    file or command-line values can easily be set.
> 
>    This is kind of the inverse of what you're saying. If the library
>    offers dedicated interfaces for each configuration value then the
>    library doesn't need getters/setters for each one but all
>    applications need special-purpose code for each configuration value.
> 
> That said, this is exactly why I published the preview release. If
> someone has a better way to do this or the feedback is just that this is
> bad style, then I'm happy to change it.

I can see the advantages in this. Maybe the problem that I'm seeing is
more about implementing drivers in the Rust code rather than the
external interface in C.

I've been playing a bit with this and maybe a good part of it can be
abstracted away (maybe a bit like qdev properties abstract the
complexity of QOM properties away). If I get somewhere with this, I can
send patches.

There is one more thing I'm wondering right now: Why do we have separate
states for connecting to the backend (created) and configuring it
(initialized)? The property setters check for the right state, but they
don't really do anything that wouldn't be possible in the other state.
A state machine exposed as two boolean rather than a tristate enum feels
a bit awkward, too, but even more so if two states could possibly be
enough.

The reason why I'm asking is that in order to address my points, it
would be best to have separate property accessors for each state, and
two pairs of accessors would make property declarations more managable
than three pairs.

> > Alternatives in QEMU are qdev properties (which are internally QOM
> > properties, but provide default implementations and are at least
> > automatically read-only after realize, avoiding that whole class of
> > bugs) and QAPI.
> > If this was QEMU code, I would of course go for QAPI, but a library is
> > something different and adding the code generator would probably be a
> > bit too much anyway. But the idea in the resulting code would be dealing
> > with native structs instead of a bunch of function calls. This would
> > probably be the least error prone way for the implementation, but of
> > course, it would make binary compatibility a bit harder when adding new
> > properties.
> 
> An alternative I considered was the typestate and builder patterns:
> 
>   /* Create a new io_uring driver in the uninitialized state */
>   struct blkio_iou_uninit *blkio_new_io_uring(void);
> 
>   /* Uninitialized state property setters */
>   int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
>                                 const char *path);
>   int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
>                                   bool o_direct);
>   ...
> 
>   /* Transition to initialized state. Frees u on success. */
>   struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);
> 
>   /* Initialized state property setters/getters */
>   int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
>                                   uint64_t *capacity);
>   ...
> 
>   /* Transition to started state. Frees i on success. */
>   struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);
> 
>   ...
> 
>   /* Transition back to initialized state. Frees s on success. */
>   struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);
> 
> On the plus side:
> 
> - No state checks are needed because an API won't even exist if it's
>   unavailable in a given state (uninitialized/initialized/started).
> 
> - State structs come with pre-initialized default values, so the caller
>   only needs to set non-default values. For example O_DIRECT is false by
>   default and callers happy with that don't need to set the property.
> 
> - ABI compatibility is easy since the state structs are opaque (their
>   size is not defined) and new properties can be added at any time.
> 
> On the minus side:
> 
> - Completely static. Hard to introspect and requires a dedicated call
>   site for each property (applications cannot simply assign a property
>   string given to them on the command-line). This means every single
>   property must be explicitly coded into every application :(.

How are you going to deal with this for QEMU integration, by the way?
Put all the properties that we know into the QAPI schema and then some
way of passing key/value pairs for the rest?

> - So many functions! This makes understanding the API harder.
> 
> - Very verbose. The function and type names get long and there is a lot
>   of repetition in the API.

I think it wouldn't be too bad if all drivers exposed the same
properties, but you're explicitly expecting driver-specific properties.
If drivers add an external APIs that just fail for other drivers, it
would indeed make understanding the API much harder.

We could consider a mix where you would first create a configuration
object, then use the generic property functions to set options for it
and finally have a separate blkio_initialize() function where you turn
that config into a struct blkio that is needed to actually do I/O (and
also supports generic property functions for runtime option updates).

I'm not sure it provides much except making the state machine more
prominent than just two random bool properties.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-04 13:44     ` Kevin Wolf
@ 2021-05-05 16:19       ` Stefan Hajnoczi
  2021-05-05 16:46         ` Kevin Wolf
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-05-05 16:19 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 5571 bytes --]

On Tue, May 04, 2021 at 03:44:23PM +0200, Kevin Wolf wrote:
> Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> > On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> There is one more thing I'm wondering right now: Why do we have separate
> states for connecting to the backend (created) and configuring it
> (initialized)? The property setters check for the right state, but they
> don't really do anything that wouldn't be possible in the other state.
> A state machine exposed as two boolean rather than a tristate enum feels
> a bit awkward, too, but even more so if two states could possibly be
> enough.
> 
> The reason why I'm asking is that in order to address my points, it
> would be best to have separate property accessors for each state, and
> two pairs of accessors would make property declarations more managable
> than three pairs.

There is no need to have separate boolean properties, it's just how I
implemented it. There could be a single state property. For example, a
string that is "uninitialized", "initialized", and "started".

> > > Alternatives in QEMU are qdev properties (which are internally QOM
> > > properties, but provide default implementations and are at least
> > > automatically read-only after realize, avoiding that whole class of
> > > bugs) and QAPI.
> > > If this was QEMU code, I would of course go for QAPI, but a library is
> > > something different and adding the code generator would probably be a
> > > bit too much anyway. But the idea in the resulting code would be dealing
> > > with native structs instead of a bunch of function calls. This would
> > > probably be the least error prone way for the implementation, but of
> > > course, it would make binary compatibility a bit harder when adding new
> > > properties.
> > 
> > An alternative I considered was the typestate and builder patterns:
> > 
> >   /* Create a new io_uring driver in the uninitialized state */
> >   struct blkio_iou_uninit *blkio_new_io_uring(void);
> > 
> >   /* Uninitialized state property setters */
> >   int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
> >                                 const char *path);
> >   int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
> >                                   bool o_direct);
> >   ...
> > 
> >   /* Transition to initialized state. Frees u on success. */
> >   struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);
> > 
> >   /* Initialized state property setters/getters */
> >   int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
> >                                   uint64_t *capacity);
> >   ...
> > 
> >   /* Transition to started state. Frees i on success. */
> >   struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);
> > 
> >   ...
> > 
> >   /* Transition back to initialized state. Frees s on success. */
> >   struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);
> > 
> > On the plus side:
> > 
> > - No state checks are needed because an API won't even exist if it's
> >   unavailable in a given state (uninitialized/initialized/started).
> > 
> > - State structs come with pre-initialized default values, so the caller
> >   only needs to set non-default values. For example O_DIRECT is false by
> >   default and callers happy with that don't need to set the property.
> > 
> > - ABI compatibility is easy since the state structs are opaque (their
> >   size is not defined) and new properties can be added at any time.
> > 
> > On the minus side:
> > 
> > - Completely static. Hard to introspect and requires a dedicated call
> >   site for each property (applications cannot simply assign a property
> >   string given to them on the command-line). This means every single
> >   property must be explicitly coded into every application :(.
> 
> How are you going to deal with this for QEMU integration, by the way?
> Put all the properties that we know into the QAPI schema and then some
> way of passing key/value pairs for the rest?

In QEMU's case let's define each property explicitly instead of passing
them through. That's due to QAPI's philosophy rather than libblkio.

> > - So many functions! This makes understanding the API harder.
> > 
> > - Very verbose. The function and type names get long and there is a lot
> >   of repetition in the API.
> 
> I think it wouldn't be too bad if all drivers exposed the same
> properties, but you're explicitly expecting driver-specific properties.
> If drivers add an external APIs that just fail for other drivers, it
> would indeed make understanding the API much harder.
> 
> We could consider a mix where you would first create a configuration
> object, then use the generic property functions to set options for it
> and finally have a separate blkio_initialize() function where you turn
> that config into a struct blkio that is needed to actually do I/O (and
> also supports generic property functions for runtime option updates).
> 
> I'm not sure it provides much except making the state machine more
> prominent than just two random bool properties.

I prefer to keep the configuration public API as it is. We can change
the properties.rs implementation however we want though.

Do you think the public API should be a typestate API instead with
struct blkio_init_info, struct blkio_start_info, and struct blkio
expressing the 3 states instead?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-05 16:19       ` Stefan Hajnoczi
@ 2021-05-05 16:46         ` Kevin Wolf
  2021-05-06  8:46           ` Stefan Hajnoczi
  0 siblings, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2021-05-05 16:46 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 7193 bytes --]

Am 05.05.2021 um 18:19 hat Stefan Hajnoczi geschrieben:
> On Tue, May 04, 2021 at 03:44:23PM +0200, Kevin Wolf wrote:
> > Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> > > On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > > > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > There is one more thing I'm wondering right now: Why do we have separate
> > states for connecting to the backend (created) and configuring it
> > (initialized)? The property setters check for the right state, but they
> > don't really do anything that wouldn't be possible in the other state.
> > A state machine exposed as two boolean rather than a tristate enum feels
> > a bit awkward, too, but even more so if two states could possibly be
> > enough.
> > 
> > The reason why I'm asking is that in order to address my points, it
> > would be best to have separate property accessors for each state, and
> > two pairs of accessors would make property declarations more managable
> > than three pairs.
> 
> There is no need to have separate boolean properties, it's just how I
> implemented it. There could be a single state property. For example, a
> string that is "uninitialized", "initialized", and "started".

Right. I think this would make the way the API works a bit more obvious,
though it's really only a small detail.

However, my real question was why we even need those three distinct
states. From what I could see so far in the io_uring implemtation,
"uninitialized" and "started" would be enough. Everything that can be
configured in "initialized", could just as well be configured in
"uninitialized" and the state transition would both open the image file
and apply the configuration in a single step.

Do you have intentions to add features in the future where the three
separate states are actually needed because the user needs to be able to
do something while the image file is already opened, but queue
processing hasn't started yet?

> > > > Alternatives in QEMU are qdev properties (which are internally QOM
> > > > properties, but provide default implementations and are at least
> > > > automatically read-only after realize, avoiding that whole class of
> > > > bugs) and QAPI.
> > > > If this was QEMU code, I would of course go for QAPI, but a library is
> > > > something different and adding the code generator would probably be a
> > > > bit too much anyway. But the idea in the resulting code would be dealing
> > > > with native structs instead of a bunch of function calls. This would
> > > > probably be the least error prone way for the implementation, but of
> > > > course, it would make binary compatibility a bit harder when adding new
> > > > properties.
> > > 
> > > An alternative I considered was the typestate and builder patterns:
> > > 
> > >   /* Create a new io_uring driver in the uninitialized state */
> > >   struct blkio_iou_uninit *blkio_new_io_uring(void);
> > > 
> > >   /* Uninitialized state property setters */
> > >   int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
> > >                                 const char *path);
> > >   int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
> > >                                   bool o_direct);
> > >   ...
> > > 
> > >   /* Transition to initialized state. Frees u on success. */
> > >   struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);
> > > 
> > >   /* Initialized state property setters/getters */
> > >   int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
> > >                                   uint64_t *capacity);
> > >   ...
> > > 
> > >   /* Transition to started state. Frees i on success. */
> > >   struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);
> > > 
> > >   ...
> > > 
> > >   /* Transition back to initialized state. Frees s on success. */
> > >   struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);
> > > 
> > > On the plus side:
> > > 
> > > - No state checks are needed because an API won't even exist if it's
> > >   unavailable in a given state (uninitialized/initialized/started).
> > > 
> > > - State structs come with pre-initialized default values, so the caller
> > >   only needs to set non-default values. For example O_DIRECT is false by
> > >   default and callers happy with that don't need to set the property.
> > > 
> > > - ABI compatibility is easy since the state structs are opaque (their
> > >   size is not defined) and new properties can be added at any time.
> > > 
> > > On the minus side:
> > > 
> > > - Completely static. Hard to introspect and requires a dedicated call
> > >   site for each property (applications cannot simply assign a property
> > >   string given to them on the command-line). This means every single
> > >   property must be explicitly coded into every application :(.
> > 
> > How are you going to deal with this for QEMU integration, by the way?
> > Put all the properties that we know into the QAPI schema and then some
> > way of passing key/value pairs for the rest?
> 
> In QEMU's case let's define each property explicitly instead of passing
> them through. That's due to QAPI's philosophy rather than libblkio.

Okay, so new features in libblkio would simply be unaccessible until we
update the QAPI schema. Given the overlap in developers, this shouldn't
be a problem in the foreseeable future, so this is fine with me.

> > > - So many functions! This makes understanding the API harder.
> > > 
> > > - Very verbose. The function and type names get long and there is a lot
> > >   of repetition in the API.
> > 
> > I think it wouldn't be too bad if all drivers exposed the same
> > properties, but you're explicitly expecting driver-specific properties.
> > If drivers add an external APIs that just fail for other drivers, it
> > would indeed make understanding the API much harder.
> > 
> > We could consider a mix where you would first create a configuration
> > object, then use the generic property functions to set options for it
> > and finally have a separate blkio_initialize() function where you turn
> > that config into a struct blkio that is needed to actually do I/O (and
> > also supports generic property functions for runtime option updates).
> > 
> > I'm not sure it provides much except making the state machine more
> > prominent than just two random bool properties.
> 
> I prefer to keep the configuration public API as it is. We can change
> the properties.rs implementation however we want though.
> 
> Do you think the public API should be a typestate API instead with
> struct blkio_init_info, struct blkio_start_info, and struct blkio
> expressing the 3 states instead?

I just mentioned it as a theoretical option for a middle ground. I'm not
sure if it's a good idea, and probably not worth the effort to change
it.

Maybe I would give the state transitions a separate function instead of
making them look like normal properties (then they could also use enums
instead of strings or two bools). But that's not a hard preference
either.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-05 16:46         ` Kevin Wolf
@ 2021-05-06  8:46           ` Stefan Hajnoczi
  2021-05-06 10:33             ` Kevin Wolf
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-05-06  8:46 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 8963 bytes --]

On Wed, May 05, 2021 at 06:46:36PM +0200, Kevin Wolf wrote:
> Am 05.05.2021 um 18:19 hat Stefan Hajnoczi geschrieben:
> > On Tue, May 04, 2021 at 03:44:23PM +0200, Kevin Wolf wrote:
> > > Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> > > > On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > > > > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > > There is one more thing I'm wondering right now: Why do we have separate
> > > states for connecting to the backend (created) and configuring it
> > > (initialized)? The property setters check for the right state, but they
> > > don't really do anything that wouldn't be possible in the other state.
> > > A state machine exposed as two boolean rather than a tristate enum feels
> > > a bit awkward, too, but even more so if two states could possibly be
> > > enough.
> > > 
> > > The reason why I'm asking is that in order to address my points, it
> > > would be best to have separate property accessors for each state, and
> > > two pairs of accessors would make property declarations more managable
> > > than three pairs.
> > 
> > There is no need to have separate boolean properties, it's just how I
> > implemented it. There could be a single state property. For example, a
> > string that is "uninitialized", "initialized", and "started".
> 
> Right. I think this would make the way the API works a bit more obvious,
> though it's really only a small detail.
> 
> However, my real question was why we even need those three distinct
> states. From what I could see so far in the io_uring implemtation,
> "uninitialized" and "started" would be enough. Everything that can be
> configured in "initialized", could just as well be configured in
> "uninitialized" and the state transition would both open the image file
> and apply the configuration in a single step.
> 
> Do you have intentions to add features in the future where the three
> separate states are actually needed because the user needs to be able to
> do something while the image file is already opened, but queue
> processing hasn't started yet?

Yes, three states will be needed. Think of a virtio-blk driver where a
VFIO PCI bus address or a vhost-user-blk UNIX domain socket path are
needed to connect to the device. After connection completes you then
have access to the block queue limits, the number of queues, etc. Once
those things are configured the device can be started.

> > > > > Alternatives in QEMU are qdev properties (which are internally QOM
> > > > > properties, but provide default implementations and are at least
> > > > > automatically read-only after realize, avoiding that whole class of
> > > > > bugs) and QAPI.
> > > > > If this was QEMU code, I would of course go for QAPI, but a library is
> > > > > something different and adding the code generator would probably be a
> > > > > bit too much anyway. But the idea in the resulting code would be dealing
> > > > > with native structs instead of a bunch of function calls. This would
> > > > > probably be the least error prone way for the implementation, but of
> > > > > course, it would make binary compatibility a bit harder when adding new
> > > > > properties.
> > > > 
> > > > An alternative I considered was the typestate and builder patterns:
> > > > 
> > > >   /* Create a new io_uring driver in the uninitialized state */
> > > >   struct blkio_iou_uninit *blkio_new_io_uring(void);
> > > > 
> > > >   /* Uninitialized state property setters */
> > > >   int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
> > > >                                 const char *path);
> > > >   int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
> > > >                                   bool o_direct);
> > > >   ...
> > > > 
> > > >   /* Transition to initialized state. Frees u on success. */
> > > >   struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);
> > > > 
> > > >   /* Initialized state property setters/getters */
> > > >   int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
> > > >                                   uint64_t *capacity);
> > > >   ...
> > > > 
> > > >   /* Transition to started state. Frees i on success. */
> > > >   struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);
> > > > 
> > > >   ...
> > > > 
> > > >   /* Transition back to initialized state. Frees s on success. */
> > > >   struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);
> > > > 
> > > > On the plus side:
> > > > 
> > > > - No state checks are needed because an API won't even exist if it's
> > > >   unavailable in a given state (uninitialized/initialized/started).
> > > > 
> > > > - State structs come with pre-initialized default values, so the caller
> > > >   only needs to set non-default values. For example O_DIRECT is false by
> > > >   default and callers happy with that don't need to set the property.
> > > > 
> > > > - ABI compatibility is easy since the state structs are opaque (their
> > > >   size is not defined) and new properties can be added at any time.
> > > > 
> > > > On the minus side:
> > > > 
> > > > - Completely static. Hard to introspect and requires a dedicated call
> > > >   site for each property (applications cannot simply assign a property
> > > >   string given to them on the command-line). This means every single
> > > >   property must be explicitly coded into every application :(.
> > > 
> > > How are you going to deal with this for QEMU integration, by the way?
> > > Put all the properties that we know into the QAPI schema and then some
> > > way of passing key/value pairs for the rest?
> > 
> > In QEMU's case let's define each property explicitly instead of passing
> > them through. That's due to QAPI's philosophy rather than libblkio.
> 
> Okay, so new features in libblkio would simply be unaccessible until we
> update the QAPI schema. Given the overlap in developers, this shouldn't
> be a problem in the foreseeable future, so this is fine with me.

Great.

> > > > - So many functions! This makes understanding the API harder.
> > > > 
> > > > - Very verbose. The function and type names get long and there is a lot
> > > >   of repetition in the API.
> > > 
> > > I think it wouldn't be too bad if all drivers exposed the same
> > > properties, but you're explicitly expecting driver-specific properties.
> > > If drivers add an external APIs that just fail for other drivers, it
> > > would indeed make understanding the API much harder.
> > > 
> > > We could consider a mix where you would first create a configuration
> > > object, then use the generic property functions to set options for it
> > > and finally have a separate blkio_initialize() function where you turn
> > > that config into a struct blkio that is needed to actually do I/O (and
> > > also supports generic property functions for runtime option updates).
> > > 
> > > I'm not sure it provides much except making the state machine more
> > > prominent than just two random bool properties.
> > 
> > I prefer to keep the configuration public API as it is. We can change
> > the properties.rs implementation however we want though.
> > 
> > Do you think the public API should be a typestate API instead with
> > struct blkio_init_info, struct blkio_start_info, and struct blkio
> > expressing the 3 states instead?
> 
> I just mentioned it as a theoretical option for a middle ground. I'm not
> sure if it's a good idea, and probably not worth the effort to change
> it.
> 
> Maybe I would give the state transitions a separate function instead of
> making them look like normal properties (then they could also use enums
> instead of strings or two bools). But that's not a hard preference
> either.

What do you think about this:

The blkio instance states are:

  created -> attached -> started -> destroyed

It is not possible to go backwards anymore, which simplifies driver
implementations and it probably won't be needed by applications.

The "initialized" state is renamed to "attached" to make it clearer that
this means the block device has been connected/opened. Also
"initialized" can be confused with "created".

The corresponding APIs are:

int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
int blkio_attach(struct blkio *bp, char **errmsg);
int blkio_start(struct blkio *bp, char **errmsg);
void blkio_destroy(struct blkio **bp);

There is no way to query the state here, but that probably isn't
necessary since an application setting up the blkio instance must
already be aware of the state in order to configure it in the first
place.

One advantage of this approach is that it can support network drivers
where the attach and start operations can take a long time while regular
property accesses do not block.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-06  8:46           ` Stefan Hajnoczi
@ 2021-05-06 10:33             ` Kevin Wolf
  2021-05-06 13:53               ` Stefan Hajnoczi
  2021-05-13  9:47               ` Stefan Hajnoczi
  0 siblings, 2 replies; 22+ messages in thread
From: Kevin Wolf @ 2021-05-06 10:33 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 10604 bytes --]

Am 06.05.2021 um 10:46 hat Stefan Hajnoczi geschrieben:
> On Wed, May 05, 2021 at 06:46:36PM +0200, Kevin Wolf wrote:
> > Am 05.05.2021 um 18:19 hat Stefan Hajnoczi geschrieben:
> > > On Tue, May 04, 2021 at 03:44:23PM +0200, Kevin Wolf wrote:
> > > > Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> > > > > On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > > > > > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > > > There is one more thing I'm wondering right now: Why do we have separate
> > > > states for connecting to the backend (created) and configuring it
> > > > (initialized)? The property setters check for the right state, but they
> > > > don't really do anything that wouldn't be possible in the other state.
> > > > A state machine exposed as two boolean rather than a tristate enum feels
> > > > a bit awkward, too, but even more so if two states could possibly be
> > > > enough.
> > > > 
> > > > The reason why I'm asking is that in order to address my points, it
> > > > would be best to have separate property accessors for each state, and
> > > > two pairs of accessors would make property declarations more managable
> > > > than three pairs.
> > > 
> > > There is no need to have separate boolean properties, it's just how I
> > > implemented it. There could be a single state property. For example, a
> > > string that is "uninitialized", "initialized", and "started".
> > 
> > Right. I think this would make the way the API works a bit more obvious,
> > though it's really only a small detail.
> > 
> > However, my real question was why we even need those three distinct
> > states. From what I could see so far in the io_uring implemtation,
> > "uninitialized" and "started" would be enough. Everything that can be
> > configured in "initialized", could just as well be configured in
> > "uninitialized" and the state transition would both open the image file
> > and apply the configuration in a single step.
> > 
> > Do you have intentions to add features in the future where the three
> > separate states are actually needed because the user needs to be able to
> > do something while the image file is already opened, but queue
> > processing hasn't started yet?
> 
> Yes, three states will be needed. Think of a virtio-blk driver where a
> VFIO PCI bus address or a vhost-user-blk UNIX domain socket path are
> needed to connect to the device. After connection completes you then
> have access to the block queue limits, the number of queues, etc. Once
> those things are configured the device can be started.

You mean, the user of the library would first query some limits before
deciding on the right values to use for some properties?

When the values come directly from the command line, this won't be the
case anyway, but I agree there may be cases where not the user, but the
application decides and then the separation would be helpful.

> > > > > > Alternatives in QEMU are qdev properties (which are internally QOM
> > > > > > properties, but provide default implementations and are at least
> > > > > > automatically read-only after realize, avoiding that whole class of
> > > > > > bugs) and QAPI.
> > > > > > If this was QEMU code, I would of course go for QAPI, but a library is
> > > > > > something different and adding the code generator would probably be a
> > > > > > bit too much anyway. But the idea in the resulting code would be dealing
> > > > > > with native structs instead of a bunch of function calls. This would
> > > > > > probably be the least error prone way for the implementation, but of
> > > > > > course, it would make binary compatibility a bit harder when adding new
> > > > > > properties.
> > > > > 
> > > > > An alternative I considered was the typestate and builder patterns:
> > > > > 
> > > > >   /* Create a new io_uring driver in the uninitialized state */
> > > > >   struct blkio_iou_uninit *blkio_new_io_uring(void);
> > > > > 
> > > > >   /* Uninitialized state property setters */
> > > > >   int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
> > > > >                                 const char *path);
> > > > >   int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
> > > > >                                   bool o_direct);
> > > > >   ...
> > > > > 
> > > > >   /* Transition to initialized state. Frees u on success. */
> > > > >   struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);
> > > > > 
> > > > >   /* Initialized state property setters/getters */
> > > > >   int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
> > > > >                                   uint64_t *capacity);
> > > > >   ...
> > > > > 
> > > > >   /* Transition to started state. Frees i on success. */
> > > > >   struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);
> > > > > 
> > > > >   ...
> > > > > 
> > > > >   /* Transition back to initialized state. Frees s on success. */
> > > > >   struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);
> > > > > 
> > > > > On the plus side:
> > > > > 
> > > > > - No state checks are needed because an API won't even exist if it's
> > > > >   unavailable in a given state (uninitialized/initialized/started).
> > > > > 
> > > > > - State structs come with pre-initialized default values, so the caller
> > > > >   only needs to set non-default values. For example O_DIRECT is false by
> > > > >   default and callers happy with that don't need to set the property.
> > > > > 
> > > > > - ABI compatibility is easy since the state structs are opaque (their
> > > > >   size is not defined) and new properties can be added at any time.
> > > > > 
> > > > > On the minus side:
> > > > > 
> > > > > - Completely static. Hard to introspect and requires a dedicated call
> > > > >   site for each property (applications cannot simply assign a property
> > > > >   string given to them on the command-line). This means every single
> > > > >   property must be explicitly coded into every application :(.
> > > > 
> > > > How are you going to deal with this for QEMU integration, by the way?
> > > > Put all the properties that we know into the QAPI schema and then some
> > > > way of passing key/value pairs for the rest?
> > > 
> > > In QEMU's case let's define each property explicitly instead of passing
> > > them through. That's due to QAPI's philosophy rather than libblkio.
> > 
> > Okay, so new features in libblkio would simply be unaccessible until we
> > update the QAPI schema. Given the overlap in developers, this shouldn't
> > be a problem in the foreseeable future, so this is fine with me.
> 
> Great.
> 
> > > > > - So many functions! This makes understanding the API harder.
> > > > > 
> > > > > - Very verbose. The function and type names get long and there is a lot
> > > > >   of repetition in the API.
> > > > 
> > > > I think it wouldn't be too bad if all drivers exposed the same
> > > > properties, but you're explicitly expecting driver-specific properties.
> > > > If drivers add an external APIs that just fail for other drivers, it
> > > > would indeed make understanding the API much harder.
> > > > 
> > > > We could consider a mix where you would first create a configuration
> > > > object, then use the generic property functions to set options for it
> > > > and finally have a separate blkio_initialize() function where you turn
> > > > that config into a struct blkio that is needed to actually do I/O (and
> > > > also supports generic property functions for runtime option updates).
> > > > 
> > > > I'm not sure it provides much except making the state machine more
> > > > prominent than just two random bool properties.
> > > 
> > > I prefer to keep the configuration public API as it is. We can change
> > > the properties.rs implementation however we want though.
> > > 
> > > Do you think the public API should be a typestate API instead with
> > > struct blkio_init_info, struct blkio_start_info, and struct blkio
> > > expressing the 3 states instead?
> > 
> > I just mentioned it as a theoretical option for a middle ground. I'm not
> > sure if it's a good idea, and probably not worth the effort to change
> > it.
> > 
> > Maybe I would give the state transitions a separate function instead of
> > making them look like normal properties (then they could also use enums
> > instead of strings or two bools). But that's not a hard preference
> > either.
> 
> What do you think about this:
> 
> The blkio instance states are:
> 
>   created -> attached -> started -> destroyed
> 
> It is not possible to go backwards anymore, which simplifies driver
> implementations and it probably won't be needed by applications.
> 
> The "initialized" state is renamed to "attached" to make it clearer that
> this means the block device has been connected/opened. Also
> "initialized" can be confused with "created".
> 
> The corresponding APIs are:
> 
> int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
> int blkio_attach(struct blkio *bp, char **errmsg);
> int blkio_start(struct blkio *bp, char **errmsg);
> void blkio_destroy(struct blkio **bp);
> 
> There is no way to query the state here, but that probably isn't
> necessary since an application setting up the blkio instance must
> already be aware of the state in order to configure it in the first
> place.
> 
> One advantage of this approach is that it can support network drivers
> where the attach and start operations can take a long time while regular
> property accesses do not block.

I like this.

For properties, I think, each property will have a first state in which
it becomes available and then it will be available in all later states,
too.

Currently, apart from properties that are always read-only, we only have
properties that are rw only in their first state and become read-only in
later states. It might be reasonable to assume that properties will
exist that can be rw in all later states, too.

In their first state, most properties only store the value into the
config and it's the next state transition that actually makes use of
them. Similarly, reading usually only reads the value from the config.
So these parts can be automatically covered. Usually you would then only
need a custom implementation for property updates after the fact. I
think this could simplify the driver implementations a lot. I'll play
with this a bit more.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-06 10:33             ` Kevin Wolf
@ 2021-05-06 13:53               ` Stefan Hajnoczi
  2021-05-13  9:47               ` Stefan Hajnoczi
  1 sibling, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-05-06 13:53 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 7414 bytes --]

On Thu, May 06, 2021 at 12:33:24PM +0200, Kevin Wolf wrote:
> Am 06.05.2021 um 10:46 hat Stefan Hajnoczi geschrieben:
> > On Wed, May 05, 2021 at 06:46:36PM +0200, Kevin Wolf wrote:
> > > Am 05.05.2021 um 18:19 hat Stefan Hajnoczi geschrieben:
> > > > On Tue, May 04, 2021 at 03:44:23PM +0200, Kevin Wolf wrote:
> > > > > Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> > > > > > On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > > > > > > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > > > > There is one more thing I'm wondering right now: Why do we have separate
> > > > > states for connecting to the backend (created) and configuring it
> > > > > (initialized)? The property setters check for the right state, but they
> > > > > don't really do anything that wouldn't be possible in the other state.
> > > > > A state machine exposed as two boolean rather than a tristate enum feels
> > > > > a bit awkward, too, but even more so if two states could possibly be
> > > > > enough.
> > > > > 
> > > > > The reason why I'm asking is that in order to address my points, it
> > > > > would be best to have separate property accessors for each state, and
> > > > > two pairs of accessors would make property declarations more managable
> > > > > than three pairs.
> > > > 
> > > > There is no need to have separate boolean properties, it's just how I
> > > > implemented it. There could be a single state property. For example, a
> > > > string that is "uninitialized", "initialized", and "started".
> > > 
> > > Right. I think this would make the way the API works a bit more obvious,
> > > though it's really only a small detail.
> > > 
> > > However, my real question was why we even need those three distinct
> > > states. From what I could see so far in the io_uring implemtation,
> > > "uninitialized" and "started" would be enough. Everything that can be
> > > configured in "initialized", could just as well be configured in
> > > "uninitialized" and the state transition would both open the image file
> > > and apply the configuration in a single step.
> > > 
> > > Do you have intentions to add features in the future where the three
> > > separate states are actually needed because the user needs to be able to
> > > do something while the image file is already opened, but queue
> > > processing hasn't started yet?
> > 
> > Yes, three states will be needed. Think of a virtio-blk driver where a
> > VFIO PCI bus address or a vhost-user-blk UNIX domain socket path are
> > needed to connect to the device. After connection completes you then
> > have access to the block queue limits, the number of queues, etc. Once
> > those things are configured the device can be started.
> 
> You mean, the user of the library would first query some limits before
> deciding on the right values to use for some properties?
> 
> When the values come directly from the command line, this won't be the
> case anyway, but I agree there may be cases where not the user, but the
> application decides and then the separation would be helpful.

Yes, for example the number of queues. After attaching to the device the
application can determine the maximum number of queues. It then has a
chance to lower the number of queues if it wants.

> > > > > > - So many functions! This makes understanding the API harder.
> > > > > > 
> > > > > > - Very verbose. The function and type names get long and there is a lot
> > > > > >   of repetition in the API.
> > > > > 
> > > > > I think it wouldn't be too bad if all drivers exposed the same
> > > > > properties, but you're explicitly expecting driver-specific properties.
> > > > > If drivers add an external APIs that just fail for other drivers, it
> > > > > would indeed make understanding the API much harder.
> > > > > 
> > > > > We could consider a mix where you would first create a configuration
> > > > > object, then use the generic property functions to set options for it
> > > > > and finally have a separate blkio_initialize() function where you turn
> > > > > that config into a struct blkio that is needed to actually do I/O (and
> > > > > also supports generic property functions for runtime option updates).
> > > > > 
> > > > > I'm not sure it provides much except making the state machine more
> > > > > prominent than just two random bool properties.
> > > > 
> > > > I prefer to keep the configuration public API as it is. We can change
> > > > the properties.rs implementation however we want though.
> > > > 
> > > > Do you think the public API should be a typestate API instead with
> > > > struct blkio_init_info, struct blkio_start_info, and struct blkio
> > > > expressing the 3 states instead?
> > > 
> > > I just mentioned it as a theoretical option for a middle ground. I'm not
> > > sure if it's a good idea, and probably not worth the effort to change
> > > it.
> > > 
> > > Maybe I would give the state transitions a separate function instead of
> > > making them look like normal properties (then they could also use enums
> > > instead of strings or two bools). But that's not a hard preference
> > > either.
> > 
> > What do you think about this:
> > 
> > The blkio instance states are:
> > 
> >   created -> attached -> started -> destroyed
> > 
> > It is not possible to go backwards anymore, which simplifies driver
> > implementations and it probably won't be needed by applications.
> > 
> > The "initialized" state is renamed to "attached" to make it clearer that
> > this means the block device has been connected/opened. Also
> > "initialized" can be confused with "created".
> > 
> > The corresponding APIs are:
> > 
> > int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
> > int blkio_attach(struct blkio *bp, char **errmsg);
> > int blkio_start(struct blkio *bp, char **errmsg);
> > void blkio_destroy(struct blkio **bp);
> > 
> > There is no way to query the state here, but that probably isn't
> > necessary since an application setting up the blkio instance must
> > already be aware of the state in order to configure it in the first
> > place.
> > 
> > One advantage of this approach is that it can support network drivers
> > where the attach and start operations can take a long time while regular
> > property accesses do not block.
> 
> I like this.
> 
> For properties, I think, each property will have a first state in which
> it becomes available and then it will be available in all later states,
> too.
> 
> Currently, apart from properties that are always read-only, we only have
> properties that are rw only in their first state and become read-only in
> later states. It might be reasonable to assume that properties will
> exist that can be rw in all later states, too.

Yes, there could be properties that are rw in all later states.

> In their first state, most properties only store the value into the
> config and it's the next state transition that actually makes use of
> them. Similarly, reading usually only reads the value from the config.
> So these parts can be automatically covered. Usually you would then only
> need a custom implementation for property updates after the fact. I
> think this could simplify the driver implementations a lot. I'll play
> with this a bit more.

Cool!

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-06 10:33             ` Kevin Wolf
  2021-05-06 13:53               ` Stefan Hajnoczi
@ 2021-05-13  9:47               ` Stefan Hajnoczi
  2021-05-14 15:55                 ` Kevin Wolf
  1 sibling, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-05-13  9:47 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 11217 bytes --]

On Thu, May 06, 2021 at 12:33:24PM +0200, Kevin Wolf wrote:
> Am 06.05.2021 um 10:46 hat Stefan Hajnoczi geschrieben:
> > On Wed, May 05, 2021 at 06:46:36PM +0200, Kevin Wolf wrote:
> > > Am 05.05.2021 um 18:19 hat Stefan Hajnoczi geschrieben:
> > > > On Tue, May 04, 2021 at 03:44:23PM +0200, Kevin Wolf wrote:
> > > > > Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> > > > > > On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > > > > > > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > > > > There is one more thing I'm wondering right now: Why do we have separate
> > > > > states for connecting to the backend (created) and configuring it
> > > > > (initialized)? The property setters check for the right state, but they
> > > > > don't really do anything that wouldn't be possible in the other state.
> > > > > A state machine exposed as two boolean rather than a tristate enum feels
> > > > > a bit awkward, too, but even more so if two states could possibly be
> > > > > enough.
> > > > > 
> > > > > The reason why I'm asking is that in order to address my points, it
> > > > > would be best to have separate property accessors for each state, and
> > > > > two pairs of accessors would make property declarations more managable
> > > > > than three pairs.
> > > > 
> > > > There is no need to have separate boolean properties, it's just how I
> > > > implemented it. There could be a single state property. For example, a
> > > > string that is "uninitialized", "initialized", and "started".
> > > 
> > > Right. I think this would make the way the API works a bit more obvious,
> > > though it's really only a small detail.
> > > 
> > > However, my real question was why we even need those three distinct
> > > states. From what I could see so far in the io_uring implemtation,
> > > "uninitialized" and "started" would be enough. Everything that can be
> > > configured in "initialized", could just as well be configured in
> > > "uninitialized" and the state transition would both open the image file
> > > and apply the configuration in a single step.
> > > 
> > > Do you have intentions to add features in the future where the three
> > > separate states are actually needed because the user needs to be able to
> > > do something while the image file is already opened, but queue
> > > processing hasn't started yet?
> > 
> > Yes, three states will be needed. Think of a virtio-blk driver where a
> > VFIO PCI bus address or a vhost-user-blk UNIX domain socket path are
> > needed to connect to the device. After connection completes you then
> > have access to the block queue limits, the number of queues, etc. Once
> > those things are configured the device can be started.
> 
> You mean, the user of the library would first query some limits before
> deciding on the right values to use for some properties?
> 
> When the values come directly from the command line, this won't be the
> case anyway, but I agree there may be cases where not the user, but the
> application decides and then the separation would be helpful.
> 
> > > > > > > Alternatives in QEMU are qdev properties (which are internally QOM
> > > > > > > properties, but provide default implementations and are at least
> > > > > > > automatically read-only after realize, avoiding that whole class of
> > > > > > > bugs) and QAPI.
> > > > > > > If this was QEMU code, I would of course go for QAPI, but a library is
> > > > > > > something different and adding the code generator would probably be a
> > > > > > > bit too much anyway. But the idea in the resulting code would be dealing
> > > > > > > with native structs instead of a bunch of function calls. This would
> > > > > > > probably be the least error prone way for the implementation, but of
> > > > > > > course, it would make binary compatibility a bit harder when adding new
> > > > > > > properties.
> > > > > > 
> > > > > > An alternative I considered was the typestate and builder patterns:
> > > > > > 
> > > > > >   /* Create a new io_uring driver in the uninitialized state */
> > > > > >   struct blkio_iou_uninit *blkio_new_io_uring(void);
> > > > > > 
> > > > > >   /* Uninitialized state property setters */
> > > > > >   int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
> > > > > >                                 const char *path);
> > > > > >   int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
> > > > > >                                   bool o_direct);
> > > > > >   ...
> > > > > > 
> > > > > >   /* Transition to initialized state. Frees u on success. */
> > > > > >   struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);
> > > > > > 
> > > > > >   /* Initialized state property setters/getters */
> > > > > >   int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
> > > > > >                                   uint64_t *capacity);
> > > > > >   ...
> > > > > > 
> > > > > >   /* Transition to started state. Frees i on success. */
> > > > > >   struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);
> > > > > > 
> > > > > >   ...
> > > > > > 
> > > > > >   /* Transition back to initialized state. Frees s on success. */
> > > > > >   struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);
> > > > > > 
> > > > > > On the plus side:
> > > > > > 
> > > > > > - No state checks are needed because an API won't even exist if it's
> > > > > >   unavailable in a given state (uninitialized/initialized/started).
> > > > > > 
> > > > > > - State structs come with pre-initialized default values, so the caller
> > > > > >   only needs to set non-default values. For example O_DIRECT is false by
> > > > > >   default and callers happy with that don't need to set the property.
> > > > > > 
> > > > > > - ABI compatibility is easy since the state structs are opaque (their
> > > > > >   size is not defined) and new properties can be added at any time.
> > > > > > 
> > > > > > On the minus side:
> > > > > > 
> > > > > > - Completely static. Hard to introspect and requires a dedicated call
> > > > > >   site for each property (applications cannot simply assign a property
> > > > > >   string given to them on the command-line). This means every single
> > > > > >   property must be explicitly coded into every application :(.
> > > > > 
> > > > > How are you going to deal with this for QEMU integration, by the way?
> > > > > Put all the properties that we know into the QAPI schema and then some
> > > > > way of passing key/value pairs for the rest?
> > > > 
> > > > In QEMU's case let's define each property explicitly instead of passing
> > > > them through. That's due to QAPI's philosophy rather than libblkio.
> > > 
> > > Okay, so new features in libblkio would simply be unaccessible until we
> > > update the QAPI schema. Given the overlap in developers, this shouldn't
> > > be a problem in the foreseeable future, so this is fine with me.
> > 
> > Great.
> > 
> > > > > > - So many functions! This makes understanding the API harder.
> > > > > > 
> > > > > > - Very verbose. The function and type names get long and there is a lot
> > > > > >   of repetition in the API.
> > > > > 
> > > > > I think it wouldn't be too bad if all drivers exposed the same
> > > > > properties, but you're explicitly expecting driver-specific properties.
> > > > > If drivers add an external APIs that just fail for other drivers, it
> > > > > would indeed make understanding the API much harder.
> > > > > 
> > > > > We could consider a mix where you would first create a configuration
> > > > > object, then use the generic property functions to set options for it
> > > > > and finally have a separate blkio_initialize() function where you turn
> > > > > that config into a struct blkio that is needed to actually do I/O (and
> > > > > also supports generic property functions for runtime option updates).
> > > > > 
> > > > > I'm not sure it provides much except making the state machine more
> > > > > prominent than just two random bool properties.
> > > > 
> > > > I prefer to keep the configuration public API as it is. We can change
> > > > the properties.rs implementation however we want though.
> > > > 
> > > > Do you think the public API should be a typestate API instead with
> > > > struct blkio_init_info, struct blkio_start_info, and struct blkio
> > > > expressing the 3 states instead?
> > > 
> > > I just mentioned it as a theoretical option for a middle ground. I'm not
> > > sure if it's a good idea, and probably not worth the effort to change
> > > it.
> > > 
> > > Maybe I would give the state transitions a separate function instead of
> > > making them look like normal properties (then they could also use enums
> > > instead of strings or two bools). But that's not a hard preference
> > > either.
> > 
> > What do you think about this:
> > 
> > The blkio instance states are:
> > 
> >   created -> attached -> started -> destroyed
> > 
> > It is not possible to go backwards anymore, which simplifies driver
> > implementations and it probably won't be needed by applications.
> > 
> > The "initialized" state is renamed to "attached" to make it clearer that
> > this means the block device has been connected/opened. Also
> > "initialized" can be confused with "created".
> > 
> > The corresponding APIs are:
> > 
> > int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
> > int blkio_attach(struct blkio *bp, char **errmsg);
> > int blkio_start(struct blkio *bp, char **errmsg);
> > void blkio_destroy(struct blkio **bp);
> > 
> > There is no way to query the state here, but that probably isn't
> > necessary since an application setting up the blkio instance must
> > already be aware of the state in order to configure it in the first
> > place.
> > 
> > One advantage of this approach is that it can support network drivers
> > where the attach and start operations can take a long time while regular
> > property accesses do not block.
> 
> I like this.
> 
> For properties, I think, each property will have a first state in which
> it becomes available and then it will be available in all later states,
> too.
> 
> Currently, apart from properties that are always read-only, we only have
> properties that are rw only in their first state and become read-only in
> later states. It might be reasonable to assume that properties will
> exist that can be rw in all later states, too.
> 
> In their first state, most properties only store the value into the
> config and it's the next state transition that actually makes use of
> them. Similarly, reading usually only reads the value from the config.
> So these parts can be automatically covered. Usually you would then only
> need a custom implementation for property updates after the fact. I
> think this could simplify the driver implementations a lot. I'll play
> with this a bit more.

Hi Kevin,
I posted a patch that introduces blkio_connect() and blkio_start():
https://gitlab.com/libblkio/libblkio/-/merge_requests/4

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-13  9:47               ` Stefan Hajnoczi
@ 2021-05-14 15:55                 ` Kevin Wolf
  2021-05-17 14:09                   ` Stefan Hajnoczi
  0 siblings, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2021-05-14 15:55 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 3204 bytes --]

Am 13.05.2021 um 11:47 hat Stefan Hajnoczi geschrieben:
> On Thu, May 06, 2021 at 12:33:24PM +0200, Kevin Wolf wrote:
> > Am 06.05.2021 um 10:46 hat Stefan Hajnoczi geschrieben:
> > > What do you think about this:
> > > 
> > > The blkio instance states are:
> > > 
> > >   created -> attached -> started -> destroyed
> > > 
> > > It is not possible to go backwards anymore, which simplifies driver
> > > implementations and it probably won't be needed by applications.
> > > 
> > > The "initialized" state is renamed to "attached" to make it clearer that
> > > this means the block device has been connected/opened. Also
> > > "initialized" can be confused with "created".
> > > 
> > > The corresponding APIs are:
> > > 
> > > int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
> > > int blkio_attach(struct blkio *bp, char **errmsg);
> > > int blkio_start(struct blkio *bp, char **errmsg);
> > > void blkio_destroy(struct blkio **bp);
> > > 
> > > There is no way to query the state here, but that probably isn't
> > > necessary since an application setting up the blkio instance must
> > > already be aware of the state in order to configure it in the first
> > > place.
> > > 
> > > One advantage of this approach is that it can support network drivers
> > > where the attach and start operations can take a long time while regular
> > > property accesses do not block.
> > 
> > I like this.
> > 
> > For properties, I think, each property will have a first state in which
> > it becomes available and then it will be available in all later states,
> > too.
> > 
> > Currently, apart from properties that are always read-only, we only have
> > properties that are rw only in their first state and become read-only in
> > later states. It might be reasonable to assume that properties will
> > exist that can be rw in all later states, too.
> > 
> > In their first state, most properties only store the value into the
> > config and it's the next state transition that actually makes use of
> > them. Similarly, reading usually only reads the value from the config.
> > So these parts can be automatically covered. Usually you would then only
> > need a custom implementation for property updates after the fact. I
> > think this could simplify the driver implementations a lot. I'll play
> > with this a bit more.
> 
> Hi Kevin,
> I posted a patch that introduces blkio_connect() and blkio_start():
> https://gitlab.com/libblkio/libblkio/-/merge_requests/4

Assuming that you want review to happen on Gitlab, I added a few
comments there.

I'm not sure if you saw it, but on Wednesday, I also created a merge
request for some first changes to reduce the properties boilerplate in
the iouring module that would otherwise be duplicated for every new
driver. Not sure if everything is a good idea, but the first patch is
almost certainly one.

(However, I just realised that the test failure is not the same as on
main, so I degraded it to a draft now. It also conflicts with your merge
request. Next thing to learn for me is how to respin a merge request on
Gitlab... You may want to have a look anyway.)

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-14 15:55                 ` Kevin Wolf
@ 2021-05-17 14:09                   ` Stefan Hajnoczi
  2021-05-18  8:02                     ` Kevin Wolf
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2021-05-17 14:09 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 3567 bytes --]

On Fri, May 14, 2021 at 05:55:13PM +0200, Kevin Wolf wrote:
> Am 13.05.2021 um 11:47 hat Stefan Hajnoczi geschrieben:
> > On Thu, May 06, 2021 at 12:33:24PM +0200, Kevin Wolf wrote:
> > > Am 06.05.2021 um 10:46 hat Stefan Hajnoczi geschrieben:
> > > > What do you think about this:
> > > > 
> > > > The blkio instance states are:
> > > > 
> > > >   created -> attached -> started -> destroyed
> > > > 
> > > > It is not possible to go backwards anymore, which simplifies driver
> > > > implementations and it probably won't be needed by applications.
> > > > 
> > > > The "initialized" state is renamed to "attached" to make it clearer that
> > > > this means the block device has been connected/opened. Also
> > > > "initialized" can be confused with "created".
> > > > 
> > > > The corresponding APIs are:
> > > > 
> > > > int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
> > > > int blkio_attach(struct blkio *bp, char **errmsg);
> > > > int blkio_start(struct blkio *bp, char **errmsg);
> > > > void blkio_destroy(struct blkio **bp);
> > > > 
> > > > There is no way to query the state here, but that probably isn't
> > > > necessary since an application setting up the blkio instance must
> > > > already be aware of the state in order to configure it in the first
> > > > place.
> > > > 
> > > > One advantage of this approach is that it can support network drivers
> > > > where the attach and start operations can take a long time while regular
> > > > property accesses do not block.
> > > 
> > > I like this.
> > > 
> > > For properties, I think, each property will have a first state in which
> > > it becomes available and then it will be available in all later states,
> > > too.
> > > 
> > > Currently, apart from properties that are always read-only, we only have
> > > properties that are rw only in their first state and become read-only in
> > > later states. It might be reasonable to assume that properties will
> > > exist that can be rw in all later states, too.
> > > 
> > > In their first state, most properties only store the value into the
> > > config and it's the next state transition that actually makes use of
> > > them. Similarly, reading usually only reads the value from the config.
> > > So these parts can be automatically covered. Usually you would then only
> > > need a custom implementation for property updates after the fact. I
> > > think this could simplify the driver implementations a lot. I'll play
> > > with this a bit more.
> > 
> > Hi Kevin,
> > I posted a patch that introduces blkio_connect() and blkio_start():
> > https://gitlab.com/libblkio/libblkio/-/merge_requests/4
> 
> Assuming that you want review to happen on Gitlab, I added a few
> comments there.
> 
> I'm not sure if you saw it, but on Wednesday, I also created a merge
> request for some first changes to reduce the properties boilerplate in
> the iouring module that would otherwise be duplicated for every new
> driver. Not sure if everything is a good idea, but the first patch is
> almost certainly one.
> 
> (However, I just realised that the test failure is not the same as on
> main, so I degraded it to a draft now. It also conflicts with your merge
> request. Next thing to learn for me is how to respin a merge request on
> Gitlab... You may want to have a look anyway.)

Awesome, I will take a look, thanks. I need to tweak my GitLab
notification options :-).

You can force push to your topic branch to respin the merge request.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] libblkio v0.1.0 preview release
  2021-05-17 14:09                   ` Stefan Hajnoczi
@ 2021-05-18  8:02                     ` Kevin Wolf
  0 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2021-05-18  8:02 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pkrempa, Alberto Garcia, slp, qemu-block, qemu-devel, rjones,
	mreitz, Vladimir Sementsov-Ogievskiy, Klaus Jensen, philmd,
	Markus Armbruster, sgarzare

[-- Attachment #1: Type: text/plain, Size: 4148 bytes --]

Am 17.05.2021 um 16:09 hat Stefan Hajnoczi geschrieben:
> On Fri, May 14, 2021 at 05:55:13PM +0200, Kevin Wolf wrote:
> > Am 13.05.2021 um 11:47 hat Stefan Hajnoczi geschrieben:
> > > On Thu, May 06, 2021 at 12:33:24PM +0200, Kevin Wolf wrote:
> > > > Am 06.05.2021 um 10:46 hat Stefan Hajnoczi geschrieben:
> > > > > What do you think about this:
> > > > > 
> > > > > The blkio instance states are:
> > > > > 
> > > > >   created -> attached -> started -> destroyed
> > > > > 
> > > > > It is not possible to go backwards anymore, which simplifies driver
> > > > > implementations and it probably won't be needed by applications.
> > > > > 
> > > > > The "initialized" state is renamed to "attached" to make it clearer that
> > > > > this means the block device has been connected/opened. Also
> > > > > "initialized" can be confused with "created".
> > > > > 
> > > > > The corresponding APIs are:
> > > > > 
> > > > > int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
> > > > > int blkio_attach(struct blkio *bp, char **errmsg);
> > > > > int blkio_start(struct blkio *bp, char **errmsg);
> > > > > void blkio_destroy(struct blkio **bp);
> > > > > 
> > > > > There is no way to query the state here, but that probably isn't
> > > > > necessary since an application setting up the blkio instance must
> > > > > already be aware of the state in order to configure it in the first
> > > > > place.
> > > > > 
> > > > > One advantage of this approach is that it can support network drivers
> > > > > where the attach and start operations can take a long time while regular
> > > > > property accesses do not block.
> > > > 
> > > > I like this.
> > > > 
> > > > For properties, I think, each property will have a first state in which
> > > > it becomes available and then it will be available in all later states,
> > > > too.
> > > > 
> > > > Currently, apart from properties that are always read-only, we only have
> > > > properties that are rw only in their first state and become read-only in
> > > > later states. It might be reasonable to assume that properties will
> > > > exist that can be rw in all later states, too.
> > > > 
> > > > In their first state, most properties only store the value into the
> > > > config and it's the next state transition that actually makes use of
> > > > them. Similarly, reading usually only reads the value from the config.
> > > > So these parts can be automatically covered. Usually you would then only
> > > > need a custom implementation for property updates after the fact. I
> > > > think this could simplify the driver implementations a lot. I'll play
> > > > with this a bit more.
> > > 
> > > Hi Kevin,
> > > I posted a patch that introduces blkio_connect() and blkio_start():
> > > https://gitlab.com/libblkio/libblkio/-/merge_requests/4
> > 
> > Assuming that you want review to happen on Gitlab, I added a few
> > comments there.
> > 
> > I'm not sure if you saw it, but on Wednesday, I also created a merge
> > request for some first changes to reduce the properties boilerplate in
> > the iouring module that would otherwise be duplicated for every new
> > driver. Not sure if everything is a good idea, but the first patch is
> > almost certainly one.
> > 
> > (However, I just realised that the test failure is not the same as on
> > main, so I degraded it to a draft now. It also conflicts with your merge
> > request. Next thing to learn for me is how to respin a merge request on
> > Gitlab... You may want to have a look anyway.)
> 
> Awesome, I will take a look, thanks. I need to tweak my GitLab
> notification options :-).
> 
> You can force push to your topic branch to respin the merge request.

Ah, that sounds easy enough. On the other hand, it means I can't work in
the branch any more without automatically updating the merge request, so
a branch used for a merge request is burned in a way. I should have
created a separate branch for this.

Looks like I need to familiarise myself more with the Gitlab process
before I can expect it to work well for me. :-)

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-05-18  8:06 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-29 14:05 [ANNOUNCE] libblkio v0.1.0 preview release Stefan Hajnoczi
2021-04-29 14:22 ` Richard W.M. Jones
2021-04-29 14:41   ` Stefan Hajnoczi
2021-04-29 15:00     ` Richard W.M. Jones
2021-04-29 15:08       ` Daniel P. Berrangé
2021-04-29 15:31         ` Richard W.M. Jones
2021-04-29 15:35           ` Daniel P. Berrangé
2021-04-29 17:17         ` libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release) Richard W.M. Jones
2021-04-29 17:27           ` Daniel P. Berrangé
2021-04-30 16:20       ` [ANNOUNCE] libblkio v0.1.0 preview release Stefan Hajnoczi
2021-04-29 15:51 ` Kevin Wolf
2021-04-30 15:49   ` Stefan Hajnoczi
2021-05-04 13:44     ` Kevin Wolf
2021-05-05 16:19       ` Stefan Hajnoczi
2021-05-05 16:46         ` Kevin Wolf
2021-05-06  8:46           ` Stefan Hajnoczi
2021-05-06 10:33             ` Kevin Wolf
2021-05-06 13:53               ` Stefan Hajnoczi
2021-05-13  9:47               ` Stefan Hajnoczi
2021-05-14 15:55                 ` Kevin Wolf
2021-05-17 14:09                   ` Stefan Hajnoczi
2021-05-18  8:02                     ` Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).