All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] ivshmem property size should be a size, not a string
@ 2015-11-20 16:07 Markus Armbruster
  2015-11-20 16:23 ` Marc-André Lureau
  2015-11-23 20:57 ` Bruce Rogers
  0 siblings, 2 replies; 27+ messages in thread
From: Markus Armbruster @ 2015-11-20 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcandre.lureau, Claudio Fontana, Luiz Capitulino

Everybody's favourite device model has "size" property.  It's declared
as *string*

    DEFINE_PROP_STRING("size", IVShmemState, sizearg),

which gets converted to a size manually in the realize method:

    } else if (s->sizearg == NULL) {
        s->ivshmem_size = 4 << 20; /* 4 MB default */
    } else {
        char *end;
        int64_t size = qemu_strtosz(s->sizearg, &end);
        if (size < 0 || *end != '\0' || !is_power_of_2(size)) {
            error_setg(errp, "Invalid size %s", s->sizearg);
            return;
        }
        s->ivshmem_size = size;
    }

This is *wrong*.  Impact, as far as I can tell:

* -device ivshmem,help shows the property as 'str' instead of 'size'.

  Unhelpful, but hardly show-stopper.

* On the command line and in HMP, ivshmem's size is parsed differently
  than other size properties.  In particular, a number without a suffix
  is normally interpreted as bytes, except for ivshmem, where it's
  Mebibytes.

  Ugly inconsistency, but hardly the only one.

* In QMP, the size must be given as JSON string instead of JSON number,
  and size suffixes are accepted.  Example: "size": "512k" instead of
  "size": 524288.

  Right now, this violation of QMP rules is tolerable (barely), because
  device_add breaks some of these rules already.  However, one goal of
  the current work on QAPI is to support a QMP command to plug devices
  that doesn't break QMP rules, and then this violation will stand out.

  Therefore, I want it fixed now, before ivshmem gets used in anger.

  A straight fix of size isn't fully backwards compatible: suffixes no
  longer work in QMP, and you need to use an 'M' suffix to get Mebibytes
  on command line and HMP.

  If that's unacceptable, we'll have to provide a new, fixed property,
  and deprecate size.

  Opinions?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 16:07 [Qemu-devel] ivshmem property size should be a size, not a string Markus Armbruster
@ 2015-11-20 16:23 ` Marc-André Lureau
  2015-11-20 16:46   ` Eric Blake
  2015-11-23 20:57 ` Bruce Rogers
  1 sibling, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-20 16:23 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Hi

----- Original Message -----
> Everybody's favourite device model has "size" property.  It's declared
> as *string*
> 
>     DEFINE_PROP_STRING("size", IVShmemState, sizearg),
> 
> which gets converted to a size manually in the realize method:
> 
>     } else if (s->sizearg == NULL) {
>         s->ivshmem_size = 4 << 20; /* 4 MB default */
>     } else {
>         char *end;
>         int64_t size = qemu_strtosz(s->sizearg, &end);
>         if (size < 0 || *end != '\0' || !is_power_of_2(size)) {
>             error_setg(errp, "Invalid size %s", s->sizearg);
>             return;
>         }
>         s->ivshmem_size = size;
>     }
> 
> This is *wrong*.  Impact, as far as I can tell:
> 
> * -device ivshmem,help shows the property as 'str' instead of 'size'.
> 
>   Unhelpful, but hardly show-stopper.
> 
> * On the command line and in HMP, ivshmem's size is parsed differently
>   than other size properties.  In particular, a number without a suffix
>   is normally interpreted as bytes, except for ivshmem, where it's
>   Mebibytes.
> 
>   Ugly inconsistency, but hardly the only one.
> 
> * In QMP, the size must be given as JSON string instead of JSON number,
>   and size suffixes are accepted.  Example: "size": "512k" instead of
>   "size": 524288.
> 
>   Right now, this violation of QMP rules is tolerable (barely), because
>   device_add breaks some of these rules already.  However, one goal of
>   the current work on QAPI is to support a QMP command to plug devices
>   that doesn't break QMP rules, and then this violation will stand out.
> 
>   Therefore, I want it fixed now, before ivshmem gets used in anger.
> 
>   A straight fix of size isn't fully backwards compatible: suffixes no
>   longer work in QMP, and you need to use an 'M' suffix to get Mebibytes
>   on command line and HMP.

I don't know the rules to break properties in qemu, but I would prefer to avoid it if possible.

>   If that's unacceptable, we'll have to provide a new, fixed property,
>   and deprecate size.

Sounds a better alternative to me.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 16:23 ` Marc-André Lureau
@ 2015-11-20 16:46   ` Eric Blake
  2015-11-20 18:06     ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Blake @ 2015-11-20 16:46 UTC (permalink / raw)
  To: Marc-André Lureau, Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]

On 11/20/2015 09:23 AM, Marc-André Lureau wrote:
> Hi
> 
> ----- Original Message -----
>> Everybody's favourite device model has "size" property.  It's declared
>> as *string*
>>
>>     DEFINE_PROP_STRING("size", IVShmemState, sizearg),
>>
>>
>> * In QMP, the size must be given as JSON string instead of JSON number,
>>   and size suffixes are accepted.  Example: "size": "512k" instead of
>>   "size": 524288.
>>
>>   Right now, this violation of QMP rules is tolerable (barely), because
>>   device_add breaks some of these rules already.  However, one goal of
>>   the current work on QAPI is to support a QMP command to plug devices
>>   that doesn't break QMP rules, and then this violation will stand out.
>>
>>   Therefore, I want it fixed now, before ivshmem gets used in anger.

Was ivshmem even usable in 2.4?  I'd argue that if we are going to
change it, changing it for 2.5 is the ideal time.

>>
>>   A straight fix of size isn't fully backwards compatible: suffixes no
>>   longer work in QMP, and you need to use an 'M' suffix to get Mebibytes
>>   on command line and HMP.
> 
> I don't know the rules to break properties in qemu, but I would prefer to avoid it if possible.

It's possible to use an alternate to accept both a string and an
integer.  But in general, QMP _wants_ to use byte-count integers without
suffix (the convenience of '512k' is only for the command line and HMP),
so letting QMP expose a string of "512k" instead of an integer 524288
feels wrong.

> 
>>   If that's unacceptable, we'll have to provide a new, fixed property,
>>   and deprecate size.
> 
> Sounds a better alternative to me.

I'd rather fix the interface rather than catering to ugliness,
particularly since 2.5 already fixes so much else that was ugly about
ivshmem.

Markus, did you have a patch to propose?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 16:46   ` Eric Blake
@ 2015-11-20 18:06     ` Markus Armbruster
  2015-11-20 18:20       ` Marc-André Lureau
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2015-11-20 18:06 UTC (permalink / raw)
  To: Eric Blake
  Cc: marcandre lureau, Luiz Capitulino, Marc-André Lureau,
	Claudio Fontana, qemu-devel

Eric Blake <eblake@redhat.com> writes:

> On 11/20/2015 09:23 AM, Marc-André Lureau wrote:
>> Hi
>> 
>> ----- Original Message -----
>>> Everybody's favourite device model has "size" property.  It's declared
>>> as *string*
>>>
>>>     DEFINE_PROP_STRING("size", IVShmemState, sizearg),
>>>
>>>
>>> * In QMP, the size must be given as JSON string instead of JSON number,
>>>   and size suffixes are accepted.  Example: "size": "512k" instead of
>>>   "size": 524288.
>>>
>>>   Right now, this violation of QMP rules is tolerable (barely), because
>>>   device_add breaks some of these rules already.  However, one goal of
>>>   the current work on QAPI is to support a QMP command to plug devices
>>>   that doesn't break QMP rules, and then this violation will stand out.
>>>
>>>   Therefore, I want it fixed now, before ivshmem gets used in anger.
>
> Was ivshmem even usable in 2.4?  I'd argue that if we are going to
> change it, changing it for 2.5 is the ideal time.

Technically, we already have a compatibility break:

$ qemu-system-x86_64 -nodefaults -S -display none -chardev socket,path=/work/armbru/images/ivshmem-socket,id=ivshmem-chr -device ivshmem,size=4,chardev=ivshmem-chr,shm=foo

In 2.3.0, this ignores shm with warning "do not specify both 'chardev'
and 'shm' with ivshmem".

In current upstream, it's rejected.

I suspect we'd find more if we cared to look.

Of course, the only thing that *really* matters is *actual* usage in
anger: something of value that breaks.

Hash ivshmem been used in anger?  If yes, how?

>>>   A straight fix of size isn't fully backwards compatible: suffixes no
>>>   longer work in QMP, and you need to use an 'M' suffix to get Mebibytes
>>>   on command line and HMP.
>> 
>> I don't know the rules to break properties in qemu, but I would
>> prefer to avoid it if possible.
>
> It's possible to use an alternate to accept both a string and an
> integer.  But in general, QMP _wants_ to use byte-count integers without
> suffix (the convenience of '512k' is only for the command line and HMP),
> so letting QMP expose a string of "512k" instead of an integer 524288
> feels wrong.

Indeed.

>>>   If that's unacceptable, we'll have to provide a new, fixed property,
>>>   and deprecate size.
>> 
>> Sounds a better alternative to me.
>
> I'd rather fix the interface rather than catering to ugliness,
> particularly since 2.5 already fixes so much else that was ugly about
> ivshmem.
>
> Markus, did you have a patch to propose?

Working on one.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 18:06     ` Markus Armbruster
@ 2015-11-20 18:20       ` Marc-André Lureau
  2015-11-20 19:39         ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-20 18:20 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: marcandre lureau, Luiz Capitulino, Claudio Fontana, qemu-devel



----- Original Message -----
> Eric Blake <eblake@redhat.com> writes:
> 
> > On 11/20/2015 09:23 AM, Marc-André Lureau wrote:
> >> Hi
> >> 
> >> ----- Original Message -----
> >>> Everybody's favourite device model has "size" property.  It's declared
> >>> as *string*
> >>>
> >>>     DEFINE_PROP_STRING("size", IVShmemState, sizearg),
> >>>
> >>>
> >>> * In QMP, the size must be given as JSON string instead of JSON number,
> >>>   and size suffixes are accepted.  Example: "size": "512k" instead of
> >>>   "size": 524288.
> >>>
> >>>   Right now, this violation of QMP rules is tolerable (barely), because
> >>>   device_add breaks some of these rules already.  However, one goal of
> >>>   the current work on QAPI is to support a QMP command to plug devices
> >>>   that doesn't break QMP rules, and then this violation will stand out.
> >>>
> >>>   Therefore, I want it fixed now, before ivshmem gets used in anger.
> >
> > Was ivshmem even usable in 2.4?  I'd argue that if we are going to
> > change it, changing it for 2.5 is the ideal time.
> 
> Technically, we already have a compatibility break:
> 
> $ qemu-system-x86_64 -nodefaults -S -display none -chardev
> socket,path=/work/armbru/images/ivshmem-socket,id=ivshmem-chr -device
> ivshmem,size=4,chardev=ivshmem-chr,shm=foo
> 
> In 2.3.0, this ignores shm with warning "do not specify both 'chardev'
> and 'shm' with ivshmem".
> 
> In current upstream, it's rejected.

That's a regression, either we should:
- keep accepting it
- or keep rejection and remove the "WARNING:" 

Imho in this case, it was an undefined behaviour, warned, so it's fine to reject it now.

> I suspect we'd find more if we cared to look.
> 
> Of course, the only thing that *really* matters is *actual* usage in
> anger: something of value that breaks.
> 
> Hash ivshmem been used in anger?  If yes, how?
> 
> >>>   A straight fix of size isn't fully backwards compatible: suffixes no
> >>>   longer work in QMP, and you need to use an 'M' suffix to get Mebibytes
> >>>   on command line and HMP.
> >> 
> >> I don't know the rules to break properties in qemu, but I would
> >> prefer to avoid it if possible.
> >
> > It's possible to use an alternate to accept both a string and an
> > integer.  But in general, QMP _wants_ to use byte-count integers without
> > suffix (the convenience of '512k' is only for the command line and HMP),
> > so letting QMP expose a string of "512k" instead of an integer 524288
> > feels wrong.
> 
> Indeed.
> 
> >>>   If that's unacceptable, we'll have to provide a new, fixed property,
> >>>   and deprecate size.
> >> 
> >> Sounds a better alternative to me.
> >
> > I'd rather fix the interface rather than catering to ugliness,
> > particularly since 2.5 already fixes so much else that was ugly about
> > ivshmem.
> >
> > Markus, did you have a patch to propose?
> 
> Working on one.
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 18:20       ` Marc-André Lureau
@ 2015-11-20 19:39         ` Markus Armbruster
  2015-11-20 20:18           ` Marc-André Lureau
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2015-11-20 19:39 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Marc-André Lureau <mlureau@redhat.com> writes:

> ----- Original Message -----
>> Eric Blake <eblake@redhat.com> writes:
>> 
>> > On 11/20/2015 09:23 AM, Marc-André Lureau wrote:
>> >> Hi
>> >> 
>> >> ----- Original Message -----
>> >>> Everybody's favourite device model has "size" property.  It's declared
>> >>> as *string*
>> >>>
>> >>>     DEFINE_PROP_STRING("size", IVShmemState, sizearg),
>> >>>
>> >>>
>> >>> * In QMP, the size must be given as JSON string instead of JSON number,
>> >>>   and size suffixes are accepted.  Example: "size": "512k" instead of
>> >>>   "size": 524288.
>> >>>
>> >>>   Right now, this violation of QMP rules is tolerable (barely), because
>> >>>   device_add breaks some of these rules already.  However, one goal of
>> >>>   the current work on QAPI is to support a QMP command to plug devices
>> >>>   that doesn't break QMP rules, and then this violation will stand out.
>> >>>
>> >>>   Therefore, I want it fixed now, before ivshmem gets used in anger.
>> >
>> > Was ivshmem even usable in 2.4?  I'd argue that if we are going to
>> > change it, changing it for 2.5 is the ideal time.
>> 
>> Technically, we already have a compatibility break:
>> 
>> $ qemu-system-x86_64 -nodefaults -S -display none -chardev
>> socket,path=/work/armbru/images/ivshmem-socket,id=ivshmem-chr -device
>> ivshmem,size=4,chardev=ivshmem-chr,shm=foo
>> 
>> In 2.3.0, this ignores shm with warning "do not specify both 'chardev'
>> and 'shm' with ivshmem".
>> 
>> In current upstream, it's rejected.
>
> That's a regression, either we should:
> - keep accepting it
> - or keep rejection and remove the "WARNING:" 
>
> Imho in this case, it was an undefined behaviour, warned, so it's fine
> to reject it now.

It's exactly fine if...

>> I suspect we'd find more if we cared to look.
>> 
>> Of course, the only thing that *really* matters is *actual* usage in
>> anger: something of value that breaks.

... it doesn't break something of value.

>> Hash ivshmem been used in anger?  If yes, how?

Still the question to answer.


I had another look at ivshmem's properties, and I can't say I liked the
sight.

Besides the usual PCI properties, we have:

    ivshmem.chardev=str (ID of a chardev to use as a backend)
    ivshmem.size=str
    ivshmem.vectors=uint32
    ivshmem.ioeventfd=bool (on/off)
    ivshmem.msi=bool (on/off)
    ivshmem.shm=str
    ivshmem.role=str
    ivshmem.memdev=link<memory-backend>
    ivshmem.use64=uint32

Exactly one of chardev, shm, memdev must be given.  qemu-doc.info claims
you can omit all three, but that's a bug.

vectors, ioeventfd and msi make sense only with chardev.  The code
appears to silently ignore them unless chardev is given.  Except it
still rejects ioeventfd=on,msi=off.  I wouldn't bet on nonsensical
combinations to actually work.

qemu-doc documents role only with chardev.  The code doesn't care.

size makes sense only with shm and chardev.  If you specify it with
memdev, it's ignored with a warning.

With shm, we first try to create the shared memory object with the
specified size, and if that fails, we try to open it.  The specified
size must not be larger than the shared memory object then.

With chardev, we receive the shared memory object from the server.
Until we get it, BAR isn't mapped.  If the specified size is larger, the
BAR remains unmapped.

use64 sets PCI_BASE_ADDRESS_SPACE_MEMORY.  Not documented in qemu-doc.

This is a byzantine mess even for QEMU standards.

Questions:

* Is it okay to have an unmapped BAR?

* We actually have two distinct device kinds:

  - The basic device: shm with parameter size, or memdev

  - The doorbell-capable device: chardev with parameters size, vectors,
    ioeventfd=on, msi

  Common parameters: use64, role, although the latter is only documented
  for the doorbell-capable device.

  Why is this a single device model?

  It's not even obvious to me how the guest is supposed to figure out
  which kind it has.

* shm appears to be the same as memdev, just less flexible.  Why does it
  exist?

* Are we *sure* this is ready to become ABI?  I doubt it's ABI yet,
  because before Marc-André's massive rework, it was even worse (*much*
  worse), to a degree that makes me doubt anybody could use it
  seriously.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 19:39         ` Markus Armbruster
@ 2015-11-20 20:18           ` Marc-André Lureau
  2015-11-23 10:19             ` Markus Armbruster
  2015-11-23 23:29             ` Andrew James
  0 siblings, 2 replies; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-20 20:18 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Hi

----- Original Message -----
> >> Hash ivshmem been used in anger?  If yes, how?
> 
> Still the question to answer.

I don't expect users to read this ML everyday (anybody actually). Personally, I have no clue how widespread ivshmem usage is.

> Besides the usual PCI properties, we have:
> 
>     ivshmem.chardev=str (ID of a chardev to use as a backend)
>     ivshmem.size=str
>     ivshmem.vectors=uint32
>     ivshmem.ioeventfd=bool (on/off)
>     ivshmem.msi=bool (on/off)
>     ivshmem.shm=str
>     ivshmem.role=str
>     ivshmem.memdev=link<memory-backend>
>     ivshmem.use64=uint32
> 
> Exactly one of chardev, shm, memdev must be given.  qemu-doc.info claims
> you can omit all three, but that's a bug.

interesting, I guess the doc must be updated

> vectors, ioeventfd and msi make sense only with chardev.  The code
> appears to silently ignore them unless chardev is given.  Except it
> still rejects ioeventfd=on,msi=off.  I wouldn't bet on nonsensical
> combinations to actually work.

I haven't tried all combinations, to me it's not a bug if an argument is silently ignored, but we are missing a warning.

> qemu-doc documents role only with chardev.  The code doesn't care.

yeah, role is only really useful with a server. Another missing warning.

> size makes sense only with shm and chardev.  If you specify it with
> memdev, it's ignored with a warning.

Ah! it's probably fine then?

> With shm, we first try to create the shared memory object with the
> specified size, and if that fails, we try to open it.  The specified
> size must not be larger than the shared memory object then.
> 
> With chardev, we receive the shared memory object from the server.
> Until we get it, BAR isn't mapped.  If the specified size is larger, the
> BAR remains unmapped.

yep

> use64 sets PCI_BASE_ADDRESS_SPACE_MEMORY.  Not documented in qemu-doc.

I don't think all properties are documented, are they? Gerd added this in c08ba66f with pretty good commit message that we could adapt to add in documentation.

> This is a byzantine mess even for QEMU standards.
> 
> Questions:
> 
> * Is it okay to have an unmapped BAR?

I can't say much, though I remember I did some tests and it was ok.

> * We actually have two distinct device kinds:
> 
>   - The basic device: shm with parameter size, or memdev
> 
>   - The doorbell-capable device: chardev with parameters size, vectors,
>     ioeventfd=on, msi
> 
>   Common parameters: use64, role, although the latter is only documented
>   for the doorbell-capable device.
> 
>   Why is this a single device model?

No idea, but I agree it would make sense to have two different devices.

>   It's not even obvious to me how the guest is supposed to figure out
>   which kind it has.

I don't think it can easily: I can imagine sending a doorbell to yourself, but that wouldn't be really reliable either.

> * shm appears to be the same as memdev, just less flexible.  Why does it
>   exist?

It was there before.
 
> * Are we *sure* this is ready to become ABI?  I doubt it's ABI yet,
>   because before Marc-André's massive rework, it was even worse (*much*
>   worse), to a degree that makes me doubt anybody could use it
>   seriously.

It's supposed to be the same ABI as before, with the memdev addition.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 20:18           ` Marc-André Lureau
@ 2015-11-23 10:19             ` Markus Armbruster
  2015-11-23 12:15               ` Marc-André Lureau
  2015-11-23 18:22               ` Markus Armbruster
  2015-11-23 23:29             ` Andrew James
  1 sibling, 2 replies; 27+ messages in thread
From: Markus Armbruster @ 2015-11-23 10:19 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Marc-André Lureau <mlureau@redhat.com> writes:

> Hi
>
> ----- Original Message -----
>> >> Hash ivshmem been used in anger?  If yes, how?
>> 
>> Still the question to answer.
>
> I don't expect users to read this ML everyday (anybody
> actually). Personally, I have no clue how widespread ivshmem usage is.
>
>> Besides the usual PCI properties, we have:
>> 
>>     ivshmem.chardev=str (ID of a chardev to use as a backend)
>>     ivshmem.size=str
>>     ivshmem.vectors=uint32
>>     ivshmem.ioeventfd=bool (on/off)
>>     ivshmem.msi=bool (on/off)
>>     ivshmem.shm=str
>>     ivshmem.role=str
>>     ivshmem.memdev=link<memory-backend>
>>     ivshmem.use64=uint32
>> 
>> Exactly one of chardev, shm, memdev must be given.  qemu-doc.info claims
>> you can omit all three, but that's a bug.
>
> interesting, I guess the doc must be updated

One-liner.

>> vectors, ioeventfd and msi make sense only with chardev.  The code
>> appears to silently ignore them unless chardev is given.  Except it
>> still rejects ioeventfd=on,msi=off.  I wouldn't bet on nonsensical
>> combinations to actually work.
>
> I haven't tried all combinations, to me it's not a bug if an argument
> is silently ignored, but we are missing a warning.

In general, configuration that doesn't make sense should be flat-out
rejected.

A warning is appropriate when we feel rejecting would break backward
compatibility.

Non-sensical option combinations can signify badly designed interfaces.
I believe this is the case here.  More on that below.

>> qemu-doc documents role only with chardev.  The code doesn't care.
>
> yeah, role is only really useful with a server. Another missing warning.

I think it makes sense only when we can migrate the shared memory
contents out-of-band.  Vaguely similar to block devices: either you
migrate them along (block migration), or you have to ensure they have
identical contents on the target in some other way.

Is the latter possible only with a server?

>> size makes sense only with shm and chardev.  If you specify it with
>> memdev, it's ignored with a warning.
>
> Ah! it's probably fine then?

For a definition of "fine"....

>> With shm, we first try to create the shared memory object with the
>> specified size, and if that fails, we try to open it.  The specified
>> size must not be larger than the shared memory object then.
>> 
>> With chardev, we receive the shared memory object from the server.
>> Until we get it, BAR isn't mapped.  If the specified size is larger, the
>> BAR remains unmapped.
>
> yep
>
>> use64 sets PCI_BASE_ADDRESS_SPACE_MEMORY.  Not documented in qemu-doc.
>
> I don't think all properties are documented, are they? Gerd added this
> in c08ba66f with pretty good commit message that we could adapt to add
> in documentation.

Yes.

>> This is a byzantine mess even for QEMU standards.
>> 
>> Questions:
>> 
>> * Is it okay to have an unmapped BAR?
>
> I can't say much, though I remember I did some tests and it was ok.

Did you try chardev=...,size=S, where S is larger than what the server
provides?

A guest that fails there probably works for sufficiently small S only
when it loses the race with the server.

But my question is actually whether it's okay for a PCI device to
advertize a BAR, and then not map anything there.

Should realize wait for the server providing the shared memory?

>> * We actually have two distinct device kinds:
>> 
>>   - The basic device: shm with parameter size, or memdev
>> 
>>   - The doorbell-capable device: chardev with parameters size, vectors,
>>     ioeventfd=on, msi
>> 
>>   Common parameters: use64, role, although the latter is only documented
>>   for the doorbell-capable device.
>> 
>>   Why is this a single device model?
>
> No idea, but I agree it would make sense to have two different devices.
>
>>   It's not even obvious to me how the guest is supposed to figure out
>>   which kind it has.
>
> I don't think it can easily: I can imagine sending a doorbell to
> yourself, but that wouldn't be really reliable either.

Ugh!

>> * shm appears to be the same as memdev, just less flexible.  Why does it
>>   exist?
>
> It was there before.

Not only is memdev more flexible, it also provides the clean split
between frontend and backend we generally want.

>> * Are we *sure* this is ready to become ABI?  I doubt it's ABI yet,
>>   because before Marc-André's massive rework, it was even worse (*much*
>>   worse), to a degree that makes me doubt anybody could use it
>>   seriously.
>
> It's supposed to be the same ABI as before, with the memdev addition.

Well, it's *crap*.  We shouldn't extend and known crap so it can get
used more widely, we should deprecate it in favour of something less
crappy.

Here's my attempt:

1. Split ivshmem into ivshmem-plain and ivshmem-doorbell.

   Each one gets its own PCI device ID, so that guests can trivially see
   whether the device got a doorbell.

   Both have property use64.

   ivshmem-plain additionally has property memdev.

   ivshmem-doorbell additionally has chardev, size, vectors, ioeventfd,
   msi.

   Undecided: property role (see above).

   The only non-sensical property combination is ioeventfd=on,msi=off,
   which we cleanly reject.

   Undecided: behavior before the server provides the shared memory, and
   what to do when it's less than size (see above).  This is the *only*
   part of my proposal that may require code changes beyond
   configuration.  If we can't do this before the release, punt
   ivshmem-doorbell to the next cycle.

2. Drop ivshmem property memdev.

   Use ivshmem-plain instead.

3. Deprecate ivshmem.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 10:19             ` Markus Armbruster
@ 2015-11-23 12:15               ` Marc-André Lureau
  2015-11-23 13:25                 ` Markus Armbruster
  2015-11-23 18:22               ` Markus Armbruster
  1 sibling, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-23 12:15 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Hi

----- Original Message -----
>
> >> qemu-doc documents role only with chardev.  The code doesn't care.
> >
> > yeah, role is only really useful with a server. Another missing warning.
> 
> I think it makes sense only when we can migrate the shared memory
> contents out-of-band.  Vaguely similar to block devices: either you
> migrate them along (block migration), or you have to ensure they have
> identical contents on the target in some other way.
> 
> Is the latter possible only with a server?

The way ivshmem role works is:
- master: will handle migration
- peer: can't migrate

It's not out-of-band, qemu handles the shared memory migration, even if
the memory is owned by the server. In practice, it means you can only
migrate one VM, or you migrate them all but you will migrate the shared
memory N times and will have to synchronize in your guest app. I suppose ivshmem
"role" was designed to only migrate the master. Ability to migrate a pool of
peer would be a significant new feature. I am not aware of such request.

> >> This is a byzantine mess even for QEMU standards.
> >> 
> >> Questions:
> >> 
> >> * Is it okay to have an unmapped BAR?
> >
> > I can't say much, though I remember I did some tests and it was ok.
> 
> Did you try chardev=...,size=S, where S is larger than what the server
> provides?

It will fall in check_shm_size().

> A guest that fails there probably works for sufficiently small S only
> when it loses the race with the server.
> 
> But my question is actually whether it's okay for a PCI device to
> advertize a BAR, and then not map anything there.
> 
> Should realize wait for the server providing the shared memory?
> 
> >> * We actually have two distinct device kinds:
> >> 
> >>   - The basic device: shm with parameter size, or memdev
> >> 
> >>   - The doorbell-capable device: chardev with parameters size, vectors,
> >>     ioeventfd=on, msi
> >> 
> >>   Common parameters: use64, role, although the latter is only documented
> >>   for the doorbell-capable device.
> >> 

> 
> >> * Are we *sure* this is ready to become ABI?  I doubt it's ABI yet,
> >>   because before Marc-André's massive rework, it was even worse (*much*
> >>   worse), to a degree that makes me doubt anybody could use it
> >>   seriously.
> >
> > It's supposed to be the same ABI as before, with the memdev addition.
> 
> Well, it's *crap*.  We shouldn't extend and known crap so it can get
> used more widely, we should deprecate it in favour of something less
> crappy.

I think you overstate this, if it would be so bad I don't think anyone could use
it and it wouldn't probably be in qemu. We are mainly talking misconfiguration
and missing features (that people may not actually need). I think
before we split things, we can address your comments with more
options checking and documentation.

> Here's my attempt:
> 
> 1. Split ivshmem into ivshmem-plain and ivshmem-doorbell.
> 
>    Each one gets its own PCI device ID, so that guests can trivially see
>    whether the device got a doorbell.
> 
>    Both have property use64.
> 
>    ivshmem-plain additionally has property memdev.
> 
>    ivshmem-doorbell additionally has chardev, size, vectors, ioeventfd,
>    msi.
> 
>    Undecided: property role (see above).
> 
>    The only non-sensical property combination is ioeventfd=on,msi=off,
>    which we cleanly reject.
> 
>    Undecided: behavior before the server provides the shared memory, and
>    what to do when it's less than size (see above).  This is the *only*
>    part of my proposal that may require code changes beyond
>    configuration.  If we can't do this before the release, punt
>    ivshmem-doorbell to the next cycle.

> 2. Drop ivshmem property memdev.
> 
>    Use ivshmem-plain instead.
> 
> 3. Deprecate ivshmem.

It sounds like a reasonable thing to do, but I don't think the benefit is so important.

cheers

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 12:15               ` Marc-André Lureau
@ 2015-11-23 13:25                 ` Markus Armbruster
  2015-11-23 13:48                   ` Marc-André Lureau
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2015-11-23 13:25 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Marc-André Lureau <mlureau@redhat.com> writes:

> Hi
>
> ----- Original Message -----
>>
>> >> qemu-doc documents role only with chardev.  The code doesn't care.
>> >
>> > yeah, role is only really useful with a server. Another missing warning.
>> 
>> I think it makes sense only when we can migrate the shared memory
>> contents out-of-band.  Vaguely similar to block devices: either you
>> migrate them along (block migration), or you have to ensure they have
>> identical contents on the target in some other way.
>> 
>> Is the latter possible only with a server?
>
> The way ivshmem role works is:
> - master: will handle migration
> - peer: can't migrate
>
> It's not out-of-band, qemu handles the shared memory migration, even if
> the memory is owned by the server. In practice, it means you can only
> migrate one VM, or you migrate them all but you will migrate the shared
> memory N times and will have to synchronize in your guest app. I suppose ivshmem
> "role" was designed to only migrate the master. Ability to migrate a pool of
> peer would be a significant new feature. I am not aware of such request.

I see.  But how is this supposed to work?

Before migration: one master and N peers connected to the server on host
A, N>=0.

After migration: one master and N' of the N peers connected to the
server on host B, N>=N'>=0, and the remaining N-N' peers still on host A
with their ivshmem device unplugged.

How would I do this even for N'==0?  I can't see how I'm supposted to
connect the migrated shared memory to a server on host B.

>> >> This is a byzantine mess even for QEMU standards.
>> >> 
>> >> Questions:
>> >> 
>> >> * Is it okay to have an unmapped BAR?
>> >
>> > I can't say much, though I remember I did some tests and it was ok.
>> 
>> Did you try chardev=...,size=S, where S is larger than what the server
>> provides?
>
> It will fall in check_shm_size().

Yes.  Called from ivshmem_read().  ivshmem_read() will then complain to
stderr, close the file descriptor we got from the server and leave the
BAR unmapped.  My question is how guests deal with that state.  Could be
anything from "detect the device is broken and fence it" to "kernel
panic".

Whatever it is, it could easily also happen if the guest wins the race
with the server and tries to use the device before it successfully got
its shared memory from the server.

>> A guest that fails there probably works for sufficiently small S only
>> when it loses the race with the server.
>> 
>> But my question is actually whether it's okay for a PCI device to
>> advertize a BAR, and then not map anything there.
>> 
>> Should realize wait for the server providing the shared memory?
>> 
>> >> * We actually have two distinct device kinds:
>> >> 
>> >>   - The basic device: shm with parameter size, or memdev
>> >> 
>> >>   - The doorbell-capable device: chardev with parameters size, vectors,
>> >>     ioeventfd=on, msi
>> >> 
>> >>   Common parameters: use64, role, although the latter is only documented
>> >>   for the doorbell-capable device.
>> >> 
>
>> 
>> >> * Are we *sure* this is ready to become ABI?  I doubt it's ABI yet,
>> >>   because before Marc-André's massive rework, it was even worse (*much*
>> >>   worse), to a degree that makes me doubt anybody could use it
>> >>   seriously.
>> >
>> > It's supposed to be the same ABI as before, with the memdev addition.
>> 
>> Well, it's *crap*.  We shouldn't extend and known crap so it can get
>> used more widely, we should deprecate it in favour of something less
>> crappy.
>
> I think you overstate this, if it would be so bad I don't think anyone could use
> it and it wouldn't probably be in qemu. We are mainly talking misconfiguration
> and missing features (that people may not actually need). I think
> before we split things, we can address your comments with more
> options checking and documentation.

We have at least two problems:

1. Unless the guest can reliably detect the doorbell feature, the
   doorbell feature is *broken*.

   As far as I can tell, a device with a doorbell behaves exactly like
   one without a doorbell until it got its shared memory from the
   server.  If that's correct, then doorbell detection is inherently
   racy.

   The only way to fix this in documentation is "broken, do not use".

   The maximally compatible way to fix this in code is to ensure the
   guest can't read register IVPosition before we got the shared memory
   from the server.  We can make realize wait, or the read.  The latter
   is probably an even worse idea.

   An easier way to fix it in code is splitting up the device, so guests
   can simply check the PCI device ID to figure out whether they have
   one with a doorbell.

   An even easier way is dropping the doorbell feature outright.

2. The UI is crap.

   We can fix this by rejecting nonsensical option combinations.
   However, the result will be more complex than splitting the device in
   two so that nonsensical options combinations are simply impossible.

   If we need to split it anyway to fix the doorbell, we can clean up
   the UI at next to no cost.

>> Here's my attempt:
>> 
>> 1. Split ivshmem into ivshmem-plain and ivshmem-doorbell.
>> 
>>    Each one gets its own PCI device ID, so that guests can trivially see
>>    whether the device got a doorbell.
>> 
>>    Both have property use64.
>> 
>>    ivshmem-plain additionally has property memdev.
>> 
>>    ivshmem-doorbell additionally has chardev, size, vectors, ioeventfd,
>>    msi.
>> 
>>    Undecided: property role (see above).
>> 
>>    The only non-sensical property combination is ioeventfd=on,msi=off,
>>    which we cleanly reject.
>> 
>>    Undecided: behavior before the server provides the shared memory, and
>>    what to do when it's less than size (see above).  This is the *only*
>>    part of my proposal that may require code changes beyond
>>    configuration.  If we can't do this before the release, punt
>>    ivshmem-doorbell to the next cycle.
>
>> 2. Drop ivshmem property memdev.
>> 
>>    Use ivshmem-plain instead.
>> 
>> 3. Deprecate ivshmem.
>
> It sounds like a reasonable thing to do, but I don't think the benefit
> is so important.

If we need to split & deprecate to fix the doorbell anyway, the earlier
we do it, the better.  Before your big rework, the device's general
crappiness should've limited its use.  But from now on, it can only get
harder.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 13:25                 ` Markus Armbruster
@ 2015-11-23 13:48                   ` Marc-André Lureau
  2015-11-23 14:08                     ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-23 13:48 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Hi

----- Original Message -----
> > "role" was designed to only migrate the master. Ability to migrate a pool
> > of
> > peer would be a significant new feature. I am not aware of such request.
> 
> I see.  But how is this supposed to work?
> 
> Before migration: one master and N peers connected to the server on host
> A, N>=0.
> 
> After migration: one master and N' of the N peers connected to the
> server on host B, N>=N'>=0, and the remaining N-N' peers still on host A
> with their ivshmem device unplugged.
> 
> How would I do this even for N'==0?  I can't see how I'm supposted to
> connect the migrated shared memory to a server on host B.

I am not sure I understand you.

You can't migrate the peers.

As I said, "ability to migrate a pool of peer would be a significant new feature".

> >> Did you try chardev=...,size=S, where S is larger than what the server
> >> provides?
> >
> > It will fall in check_shm_size().
> 
> Yes.  Called from ivshmem_read().  ivshmem_read() will then complain to
> stderr, close the file descriptor we got from the server and leave the
> BAR unmapped.  My question is how guests deal with that state.  Could be
> anything from "detect the device is broken and fence it" to "kernel
> panic".
> Whatever it is, it could easily also happen if the guest wins the race
> with the server and tries to use the device before it successfully got
> its shared memory from the server.

It's nothing bad from what I remember on qemu side. On guest side, it
depends how your driver/user is implemented I suppose. To me, it's not
a normal case, and the error should be enough to diagnose it.

> 1. Unless the guest can reliably detect the doorbell feature, the
>    doorbell feature is *broken*.
> 
>    As far as I can tell, a device with a doorbell behaves exactly like
>    one without a doorbell until it got its shared memory from the
>    server.  If that's correct, then doorbell detection is inherently
>    racy.

There are many ways you can do synchronization.
In test_ivshmem_server(), I trivially wait for the membar with the
required signature to be mapped. Verify that peers have different ids,
and then start using the doorbell. That seems good enough.

>    The only way to fix this in documentation is "broken, do not use".

It works fine in the tests. Feel free to point out races or other issues.

>    The maximally compatible way to fix this in code is to ensure the
>    guest can't read register IVPosition before we got the shared memory
>    from the server.  We can make realize wait, or the read.  The latter
>    is probably an even worse idea.
> 
>    An easier way to fix it in code is splitting up the device, so guests
>    can simply check the PCI device ID to figure out whether they have
>    one with a doorbell.
> 
>    An even easier way is dropping the doorbell feature outright.
> 
> 2. The UI is crap.
> 
>    We can fix this by rejecting nonsensical option combinations.

Yes, I think it's the simplest way for now. I dislike having to break stuff when you can overcome it with a few more checks.

>    However, the result will be more complex than splitting the device in
>    two so that nonsensical options combinations are simply impossible.

I disagree, adding more checks will add a few dozen lines with minimal impact. Splitting things will break stuff and require significant effort to share correctly what can be shared etc.

>    If we need to split it anyway to fix the doorbell, we can clean up
>    the UI at next to no cost.

I don't think the doorbell is broken.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 13:48                   ` Marc-André Lureau
@ 2015-11-23 14:08                     ` Markus Armbruster
  2015-11-23 14:16                       ` Marc-André Lureau
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2015-11-23 14:08 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Marc-André Lureau <mlureau@redhat.com> writes:

> Hi
>
> ----- Original Message -----
>> > "role" was designed to only migrate the master. Ability to migrate a pool
>> > of
>> > peer would be a significant new feature. I am not aware of such request.
>> 
>> I see.  But how is this supposed to work?
>> 
>> Before migration: one master and N peers connected to the server on host
>> A, N>=0.
>> 
>> After migration: one master and N' of the N peers connected to the
>> server on host B, N>=N'>=0, and the remaining N-N' peers still on host A
>> with their ivshmem device unplugged.
>> 
>> How would I do this even for N'==0?  I can't see how I'm supposted to
>> connect the migrated shared memory to a server on host B.
>
> I am not sure I understand you.
>
> You can't migrate the peers.

Then explain the case N'=0 to me: how can you migrate the master so that
it's connected to a server afterwards?

> As I said, "ability to migrate a pool of peer would be a significant
> new feature".
>
>> >> Did you try chardev=...,size=S, where S is larger than what the server
>> >> provides?
>> >
>> > It will fall in check_shm_size().
>> 
>> Yes.  Called from ivshmem_read().  ivshmem_read() will then complain to
>> stderr, close the file descriptor we got from the server and leave the
>> BAR unmapped.  My question is how guests deal with that state.  Could be
>> anything from "detect the device is broken and fence it" to "kernel
>> panic".
>> Whatever it is, it could easily also happen if the guest wins the race
>> with the server and tries to use the device before it successfully got
>> its shared memory from the server.
>
> It's nothing bad from what I remember on qemu side. On guest side, it
> depends how your driver/user is implemented I suppose. To me, it's not
> a normal case, and the error should be enough to diagnose it.
>
>> 1. Unless the guest can reliably detect the doorbell feature, the
>>    doorbell feature is *broken*.
>> 
>>    As far as I can tell, a device with a doorbell behaves exactly like
>>    one without a doorbell until it got its shared memory from the
>>    server.  If that's correct, then doorbell detection is inherently
>>    racy.
>
> There are many ways you can do synchronization.
> In test_ivshmem_server(), I trivially wait for the membar with the
> required signature to be mapped. Verify that peers have different ids,
> and then start using the doorbell. That seems good enough.
>
>>    The only way to fix this in documentation is "broken, do not use".
>
> It works fine in the tests. Feel free to point out races or other issues.

I think I did: doorbell detection is inherently racy.

If you think it isn't, please refute my reasoning.

>>    The maximally compatible way to fix this in code is to ensure the
>>    guest can't read register IVPosition before we got the shared memory
>>    from the server.  We can make realize wait, or the read.  The latter
>>    is probably an even worse idea.
>> 
>>    An easier way to fix it in code is splitting up the device, so guests
>>    can simply check the PCI device ID to figure out whether they have
>>    one with a doorbell.
>> 
>>    An even easier way is dropping the doorbell feature outright.
>> 
>> 2. The UI is crap.
>> 
>>    We can fix this by rejecting nonsensical option combinations.
>
> Yes, I think it's the simplest way for now. I dislike having to break
> stuff when you can overcome it with a few more checks.
>
>>    However, the result will be more complex than splitting the device in
>>    two so that nonsensical options combinations are simply impossible.
>
> I disagree, adding more checks will add a few dozen lines with minimal
> impact. Splitting things will break stuff and require significant
> effort to share correctly what can be shared etc.
>
>>    If we need to split it anyway to fix the doorbell, we can clean up
>>    the UI at next to no cost.
>
> I don't think the doorbell is broken.

If it's not broken, please explain to me how the guest should find out
whether its ivshmem device sports a doorbell.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 14:08                     ` Markus Armbruster
@ 2015-11-23 14:16                       ` Marc-André Lureau
  2015-11-23 14:46                         ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-23 14:16 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Hi

----- Original Message -----
> >
> > You can't migrate the peers.
> 
> Then explain the case N'=0 to me: how can you migrate the master so that
> it's connected to a server afterwards?

Dest qemu: -incoming.. -chardev socket,path=dest-server 

That is, start your destination qemu with a destination server. The master
origin qemu will migrate the shared memory, and the dest memory will be sync
when the migration is done.

> > It works fine in the tests. Feel free to point out races or other issues.
> 
> I think I did: doorbell detection is inherently racy.
> 
> If you think it isn't, please refute my reasoning.

I gave you some clues on how I did it in ivshmem-test.c: waiting for a signature on the memory to be mapped (and also checking that peers received ids)

> If it's not broken, please explain to me how the guest should find out
> whether its ivshmem device sports a doorbell.

If you have received ID, you should be good to use the doorbell.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 14:16                       ` Marc-André Lureau
@ 2015-11-23 14:46                         ` Markus Armbruster
  2015-11-23 14:53                           ` Marc-André Lureau
  2015-11-23 19:52                           ` Eric Blake
  0 siblings, 2 replies; 27+ messages in thread
From: Markus Armbruster @ 2015-11-23 14:46 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Marc-André Lureau <mlureau@redhat.com> writes:

> Hi
>
> ----- Original Message -----
>> >
>> > You can't migrate the peers.
>> 
>> Then explain the case N'=0 to me: how can you migrate the master so that
>> it's connected to a server afterwards?
>
> Dest qemu: -incoming.. -chardev socket,path=dest-server 
>
> That is, start your destination qemu with a destination server. The master
> origin qemu will migrate the shared memory, and the dest memory will be sync
> when the migration is done.

Got it!

>> > It works fine in the tests. Feel free to point out races or other issues.
>> 
>> I think I did: doorbell detection is inherently racy.
>> 
>> If you think it isn't, please refute my reasoning.
>
> I gave you some clues on how I did it in ivshmem-test.c: waiting for a
> signature on the memory to be mapped (and also checking that peers
> received ids)
>
>> If it's not broken, please explain to me how the guest should find out
>> whether its ivshmem device sports a doorbell.
>
> If you have received ID, you should be good to use the doorbell.

That's not a complete answer, so let me try a different tack.

What value will a read of register IVPosition yield

* if the device has no doorbell:

  I think the answer is -1.

* if the device has a doorbell, but it isn't ready, yet:

  I think the answer is -1.

* if the device has a doorbell, and it is ready.

  This is the part you answered already: >= 0.

Please correct mistakes.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 14:46                         ` Markus Armbruster
@ 2015-11-23 14:53                           ` Marc-André Lureau
  2015-11-23 15:17                             ` Markus Armbruster
  2015-11-23 19:52                           ` Eric Blake
  1 sibling, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-23 14:53 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino



----- Original Message -----
> Marc-André Lureau <mlureau@redhat.com> writes:
> 
> > Hi
> >
> > ----- Original Message -----
> >> >
> >> > You can't migrate the peers.
> >> 
> >> Then explain the case N'=0 to me: how can you migrate the master so that
> >> it's connected to a server afterwards?
> >
> > Dest qemu: -incoming.. -chardev socket,path=dest-server
> >
> > That is, start your destination qemu with a destination server. The master
> > origin qemu will migrate the shared memory, and the dest memory will be
> > sync
> > when the migration is done.
> 
> Got it!
> 
> >> > It works fine in the tests. Feel free to point out races or other
> >> > issues.
> >> 
> >> I think I did: doorbell detection is inherently racy.
> >> 
> >> If you think it isn't, please refute my reasoning.
> >
> > I gave you some clues on how I did it in ivshmem-test.c: waiting for a
> > signature on the memory to be mapped (and also checking that peers
> > received ids)
> >
> >> If it's not broken, please explain to me how the guest should find out
> >> whether its ivshmem device sports a doorbell.
> >
> > If you have received ID, you should be good to use the doorbell.
> 
> That's not a complete answer, so let me try a different tack.
> 
> What value will a read of register IVPosition yield
> 
> * if the device has no doorbell:
> 
>   I think the answer is -1.
> 
> * if the device has a doorbell, but it isn't ready, yet:
> 
>   I think the answer is -1.
> 
> * if the device has a doorbell, and it is ready.
> 
>   This is the part you answered already: >= 0.
> 
> Please correct mistakes.
> 

Yes, I think it's correct. Arbitrary informations can be then shared via the shared memory (to say whether a doorbell will be present for ex).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 14:53                           ` Marc-André Lureau
@ 2015-11-23 15:17                             ` Markus Armbruster
  0 siblings, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2015-11-23 15:17 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Marc-André Lureau <mlureau@redhat.com> writes:

> ----- Original Message -----
>> Marc-André Lureau <mlureau@redhat.com> writes:
>> 
>> > Hi
>> >
>> > ----- Original Message -----
>> >> >
>> >> > You can't migrate the peers.
>> >> 
>> >> Then explain the case N'=0 to me: how can you migrate the master so that
>> >> it's connected to a server afterwards?
>> >
>> > Dest qemu: -incoming.. -chardev socket,path=dest-server
>> >
>> > That is, start your destination qemu with a destination server. The master
>> > origin qemu will migrate the shared memory, and the dest memory will be
>> > sync
>> > when the migration is done.
>> 
>> Got it!
>> 
>> >> > It works fine in the tests. Feel free to point out races or other
>> >> > issues.
>> >> 
>> >> I think I did: doorbell detection is inherently racy.
>> >> 
>> >> If you think it isn't, please refute my reasoning.
>> >
>> > I gave you some clues on how I did it in ivshmem-test.c: waiting for a
>> > signature on the memory to be mapped (and also checking that peers
>> > received ids)
>> >
>> >> If it's not broken, please explain to me how the guest should find out
>> >> whether its ivshmem device sports a doorbell.
>> >
>> > If you have received ID, you should be good to use the doorbell.
>> 
>> That's not a complete answer, so let me try a different tack.
>> 
>> What value will a read of register IVPosition yield
>> 
>> * if the device has no doorbell:
>> 
>>   I think the answer is -1.
>> 
>> * if the device has a doorbell, but it isn't ready, yet:
>> 
>>   I think the answer is -1.
>> 
>> * if the device has a doorbell, and it is ready.
>> 
>>   This is the part you answered already: >= 0.
>> 
>> Please correct mistakes.
>> 
>
> Yes, I think it's correct. Arbitrary informations can be then shared
> via the shared memory (to say whether a doorbell will be present for
> ex).

Then there is no race-free way for the guest to distinguish between "no
doorbell" and "doorbell not ready" without some higher-level protocol.

Corollary: when IVPosition reads -1, BAR 2 may or may not be mapped, and
the only way to find out is accessing it.

Feels plenty broken to me.  Straight NAK in fact if it wasn't already in
the tree.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 10:19             ` Markus Armbruster
  2015-11-23 12:15               ` Marc-André Lureau
@ 2015-11-23 18:22               ` Markus Armbruster
  1 sibling, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2015-11-23 18:22 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

Markus Armbruster <armbru@redhat.com> writes:

> Marc-André Lureau <mlureau@redhat.com> writes:
[...]
>>> * shm appears to be the same as memdev, just less flexible.  Why does it
>>>   exist?
>>
>> It was there before.
>
> Not only is memdev more flexible, it also provides the clean split
> between frontend and backend we generally want.

In my tests, shm=foo can indeed be replaced by -object
memory-backend=file,mem-path=/dev/shm/foo.

However, /dev/shm is Linux-specific.  For portability, we might need a
memory-backend-shm.

[...]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 14:46                         ` Markus Armbruster
  2015-11-23 14:53                           ` Marc-André Lureau
@ 2015-11-23 19:52                           ` Eric Blake
  2015-11-23 20:19                             ` Marc-André Lureau
  1 sibling, 1 reply; 27+ messages in thread
From: Eric Blake @ 2015-11-23 19:52 UTC (permalink / raw)
  To: Markus Armbruster, Marc-André Lureau
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

[-- Attachment #1: Type: text/plain, Size: 1953 bytes --]

On 11/23/2015 07:46 AM, Markus Armbruster wrote:

>>> If it's not broken, please explain to me how the guest should find out
>>> whether its ivshmem device sports a doorbell.
>>
>> If you have received ID, you should be good to use the doorbell.
> 
> That's not a complete answer, so let me try a different tack.
> 
> What value will a read of register IVPosition yield
> 
> * if the device has no doorbell:
> 
>   I think the answer is -1.
> 
> * if the device has a doorbell, but it isn't ready, yet:
> 
>   I think the answer is -1.
> 
> * if the device has a doorbell, and it is ready.
> 
>   This is the part you answered already: >= 0.
> 
> Please correct mistakes.

For what it's worth, I'm agreeing with Markus here - we have a race in
the guest learning whether doorbell is supported, and it's better to do
a clean break in API for 2.5 along with ALL the other fixes into using a
sane union of types (splitting between ivshmem-plain and
ivshmem-doorbell).  If you absolutely want it, you can still maintain
'ivshmem' as a front-end that maps to either ivshmem-plain,
ivshmem-doorbell, or an error (nonsensical combination of requests), but
having a sane interface as your starting point, rather than an
amalgamation of cruft that exists only due to the early implementation,
seems like the way to go.

I'm still waiting for any evidence that we even care about backwards
compatibility - what apps are out there that are actively using ivshmem
with its horrible pre-2.5 interface, and why can they not be fixed to
use the sane 2.5 interface?  Libvirt is NOT exposing ivshmem yet, in
part because the 2.4 interface was not very robust.  We already have a
clean break now due to all your work for 2.5; let's take advantage of
it, and make 2.5 robust, rather than breaking things again in 2.6.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 19:52                           ` Eric Blake
@ 2015-11-23 20:19                             ` Marc-André Lureau
  2015-11-24  9:56                               ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-23 20:19 UTC (permalink / raw)
  To: Eric Blake
  Cc: marcandre lureau, Luiz Capitulino, Claudio Fontana,
	Markus Armbruster, qemu-devel

Hi

----- Original Message -----
> On 11/23/2015 07:46 AM, Markus Armbruster wrote:
> 
> >>> If it's not broken, please explain to me how the guest should find out
> >>> whether its ivshmem device sports a doorbell.
> >>
> >> If you have received ID, you should be good to use the doorbell.
> > 
> > That's not a complete answer, so let me try a different tack.
> > 
> > What value will a read of register IVPosition yield
> > 
> > * if the device has no doorbell:
> > 
> >   I think the answer is -1.
> > 
> > * if the device has a doorbell, but it isn't ready, yet:
> > 
> >   I think the answer is -1.
> > 
> > * if the device has a doorbell, and it is ready.
> > 
> >   This is the part you answered already: >= 0.
> > 
> > Please correct mistakes.
> 
> For what it's worth, I'm agreeing with Markus here - we have a race in
> the guest learning whether doorbell is supported, and it's better to do

I think there is no race if you communicate this information in the shared memory.

> a clean break in API for 2.5 along with ALL the other fixes into using a
> sane union of types (splitting between ivshmem-plain and
> ivshmem-doorbell).  If you absolutely want it, you can still maintain
> 'ivshmem' as a front-end that maps to either ivshmem-plain,
> ivshmem-doorbell, or an error (nonsensical combination of requests), but

It's not about me. Until now I was not aware of anyone complaining about
the ivshmem interface, but by its implementation. I tried to adress all the
issues and comments I have found, and I tried to make sure not to break stuff,
because I usually receive huge complains whenever I do, and I have to throw
up the work. So if there is a consensus to break things now, I think it's
quite late for this cycle, but I am for it.

> having a sane interface as your starting point, rather than an
> amalgamation of cruft that exists only due to the early implementation,
> seems like the way to go.

It is just the way it was, isn't it?

> I'm still waiting for any evidence that we even care about backwards
> compatibility - what apps are out there that are actively using ivshmem
> with its horrible pre-2.5 interface, and why can they not be fixed to

> use the sane 2.5 interface?  Libvirt is NOT exposing ivshmem yet, in
> part because the 2.4 interface was not very robust.  We already have a

Afaik it's there since 1.2.10

> clean break now due to all your work for 2.5; let's take advantage of
> it, and make 2.5 robust, rather than breaking things again in 2.6.

2.5 should not break the ivshmem interface.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 16:07 [Qemu-devel] ivshmem property size should be a size, not a string Markus Armbruster
  2015-11-20 16:23 ` Marc-André Lureau
@ 2015-11-23 20:57 ` Bruce Rogers
  1 sibling, 0 replies; 27+ messages in thread
From: Bruce Rogers @ 2015-11-23 20:57 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: marcandre.lureau, Claudio Fontana, Luiz Capitulino

>>> On 11/20/2015 at 09:07 AM, Markus Armbruster <armbru@redhat.com> wrote: 
> Everybody's favourite device model has "size" property.  It's declared
> as *string*
> 
>     DEFINE_PROP_STRING("size", IVShmemState, sizearg),
> 
> which gets converted to a size manually in the realize method:
> 
>     } else if (s->sizearg == NULL) {
>         s->ivshmem_size = 4 << 20; /* 4 MB default */
>     } else {
>         char *end;
>         int64_t size = qemu_strtosz(s->sizearg, &end);
>         if (size < 0 || *end != '\0' || !is_power_of_2(size)) {
>             error_setg(errp, "Invalid size %s", s->sizearg);
>             return;
>         }
>         s->ivshmem_size = size;
>     }
> 
> This is *wrong*.  Impact, as far as I can tell:
> 
> * -device ivshmem,help shows the property as 'str' instead of 'size'.
> 
>   Unhelpful, but hardly show-stopper.
> 
> * On the command line and in HMP, ivshmem's size is parsed differently
>   than other size properties.  In particular, a number without a suffix
>   is normally interpreted as bytes, except for ivshmem, where it's
>   Mebibytes.
> 
>   Ugly inconsistency, but hardly the only one.
> 
> * In QMP, the size must be given as JSON string instead of JSON number,
>   and size suffixes are accepted.  Example: "size": "512k" instead of
>   "size": 524288.
> 
>   Right now, this violation of QMP rules is tolerable (barely), because
>   device_add breaks some of these rules already.  However, one goal of
>   the current work on QAPI is to support a QMP command to plug devices
>   that doesn't break QMP rules, and then this violation will stand out.
> 
>   Therefore, I want it fixed now, before ivshmem gets used in anger.
> 
>   A straight fix of size isn't fully backwards compatible: suffixes no
>   longer work in QMP, and you need to use an 'M' suffix to get Mebibytes
>   on command line and HMP.
> 
>   If that's unacceptable, we'll have to provide a new, fixed property,
>   and deprecate size.
> 
>   Opinions?


At the request of a customer, SUSE began basic support of the ivshmem
interface as of our SLE 12 release, which included QEMU v2.0.0. (This is
around the same time frame that it was also wired up in libvirt in v1.2.10.)

As has been pointed out, there are issues with its usability, but lets not
break existing users thoughtlessly. I certainly support efforts to make
improvements, and to even deprecate the problematic interfaces as suitable
substitutes are added, but to do so in an order fashion allowing users time to
migrate to the new interface instead of breaking them all of a sudden.

Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-20 20:18           ` Marc-André Lureau
  2015-11-23 10:19             ` Markus Armbruster
@ 2015-11-23 23:29             ` Andrew James
  2015-11-24  9:52               ` Markus Armbruster
  1 sibling, 1 reply; 27+ messages in thread
From: Andrew James @ 2015-11-23 23:29 UTC (permalink / raw)
  To: Marc-André Lureau, Markus Armbruster
  Cc: marcandre lureau, Claudio Fontana, qemu-devel, Luiz Capitulino

On 11/20/2015 01:18 PM, Marc-André Lureau wrote:
> Hi
> 
> ----- Original Message -----
>>>> Hash ivshmem been used in anger?  If yes, how?
>>
>> Still the question to answer.
> 
> I don't expect users to read this ML everyday (anybody actually). Personally, I have no clue how widespread ivshmem usage is.

We (Hewlett Packard Enterprise) are using ivshmem as a part of a project
that is set to go public in the next few days:

https://github.com/FabricAttachedMemory

Sorry for the premature announcement; I wanted to declare our interest
in ivshmem itself. QEMU 2.1 through 2.4.1 have worked well for our use-case.

>>   Why is this a single device model?
> 
> No idea, but I agree it would make sense to have two different devices.

FWIW, I support splitting the device into memdev and doorbell varieties
as long as a compatibility device is available too.


Thanks,
-- 
Andrew James

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 23:29             ` Andrew James
@ 2015-11-24  9:52               ` Markus Armbruster
  0 siblings, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2015-11-24  9:52 UTC (permalink / raw)
  To: Andrew James
  Cc: marcandre lureau, Luiz Capitulino, Marc-André Lureau,
	Claudio Fontana, qemu-devel

Andrew James <andrew.james@hpe.com> writes:

> On 11/20/2015 01:18 PM, Marc-André Lureau wrote:
>> Hi
>> 
>> ----- Original Message -----
>>>>> Hash ivshmem been used in anger?  If yes, how?
>>>
>>> Still the question to answer.
>> 
>> I don't expect users to read this ML everyday (anybody
>> actually). Personally, I have no clue how widespread ivshmem usage
>> is.
>
> We (Hewlett Packard Enterprise) are using ivshmem as a part of a project
> that is set to go public in the next few days:
>
> https://github.com/FabricAttachedMemory
>
> Sorry for the premature announcement; I wanted to declare our interest
> in ivshmem itself. QEMU 2.1 through 2.4.1 have worked well for our use-case.
>
>>>   Why is this a single device model?
>> 
>> No idea, but I agree it would make sense to have two different devices.
>
> FWIW, I support splitting the device into memdev and doorbell varieties
> as long as a compatibility device is available too.

Thanks for your input, it's appreciated.

We generally don't just drop interfaces when they have users relying on
them.  Instead, we deprecate them in favour of replacements.  Users are
then advised to migrate to the replacement in an orderly fashion.  The
deprecated interface may go away eventually.  We tend to keep it
indefinitely unless it's a maintenance burden.

In case of ivshmem, I'll take yours and Bruce's note as evidence of use.
Orderly replacement is still fine, but outright removal would be rather
unkind to its existing users.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-23 20:19                             ` Marc-André Lureau
@ 2015-11-24  9:56                               ` Markus Armbruster
  2015-11-24 12:23                                 ` Marc-André Lureau
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2015-11-24  9:56 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: qemu-devel, marcandre lureau, Claudio Fontana, Luiz Capitulino

Marc-André Lureau <mlureau@redhat.com> writes:

> Hi
>
> ----- Original Message -----
>> On 11/23/2015 07:46 AM, Markus Armbruster wrote:
>> 
>> >>> If it's not broken, please explain to me how the guest should find out
>> >>> whether its ivshmem device sports a doorbell.
>> >>
>> >> If you have received ID, you should be good to use the doorbell.
>> > 
>> > That's not a complete answer, so let me try a different tack.
>> > 
>> > What value will a read of register IVPosition yield
>> > 
>> > * if the device has no doorbell:
>> > 
>> >   I think the answer is -1.
>> > 
>> > * if the device has a doorbell, but it isn't ready, yet:
>> > 
>> >   I think the answer is -1.
>> > 
>> > * if the device has a doorbell, and it is ready.
>> > 
>> >   This is the part you answered already: >= 0.
>> > 
>> > Please correct mistakes.
>> 
>> For what it's worth, I'm agreeing with Markus here - we have a race in
>> the guest learning whether doorbell is supported, and it's better to do
>
> I think there is no race if you communicate this information in the
> shared memory.

The device makes robust detection of the doorbell unnecessarily hard.
You may be able to work around it at higher levels.  This involves
assumptions on peers.  However, that's no excuse not to fix the device.
Regardless, the mechanism you propose, namely signalling presence of the
doorbell in the shared memory, is useless.

Let me elaborate.

Say you know that ivshmem is used for a specific protocol, and that
protocol requires a doorbell.  You simply assume the doorbell is there,
and read of IVPosition yielding -1 means the device hasn't completed its
setup, yet.

Since the shared memory only becomes mapped when IVPosition goes
non-negative, a guest device driver will probably want to delay
completion of the operation to map the shared memory into user space
until IVPosition becomes non-negative, to shield user space from SIGBUS
(or however else the kernel reports the trap to user space) during the
setup window.  Happy polling.

Say you know that ivshmem is used for a specific protocol, and that
protocol signals its use of a doorbell in the shared memory.  If a read
of IVPosition yields a non-negative number, you know you got a doorbell.
If it yields -1, you know nothing.  You can attempt reading shared
memory to find out more.  This *traps* when the BAR hasn't been mapped,
yet.  If it does, you still don't know.  If it doesn't, and IVPosition
is still -1, you got no doorbell, regardless of what the shared memory
tells you.  If IVPosition has meanwhile become non-negative, you got a
doorbell, again regardless of what the shared memory tells you.  Bottom
line: the protocol signalling its use of the doorbell in the shared
memory is of no help.

I figure a robust guest device driver is possible, but it'll involve
dealing with traps and polling IVPosition.

This device is simply an embarrassment.

>> a clean break in API for 2.5 along with ALL the other fixes into using a
>> sane union of types (splitting between ivshmem-plain and
>> ivshmem-doorbell).  If you absolutely want it, you can still maintain
>> 'ivshmem' as a front-end that maps to either ivshmem-plain,
>> ivshmem-doorbell, or an error (nonsensical combination of requests), but
>
> It's not about me. Until now I was not aware of anyone complaining about
> the ivshmem interface, but by its implementation. I tried to adress all the
> issues and comments I have found, and I tried to make sure not to break stuff,
> because I usually receive huge complains whenever I do, and I have to throw
> up the work. So if there is a consensus to break things now, I think it's
> quite late for this cycle, but I am for it.

The remaining flaws in ivshmem are certainly not your fault.

>> having a sane interface as your starting point, rather than an
>> amalgamation of cruft that exists only due to the early implementation,
>> seems like the way to go.
>
> It is just the way it was, isn't it?
>
>> I'm still waiting for any evidence that we even care about backwards
>> compatibility - what apps are out there that are actively using ivshmem
>> with its horrible pre-2.5 interface, and why can they not be fixed to
>
>> use the sane 2.5 interface?  Libvirt is NOT exposing ivshmem yet, in
>> part because the 2.4 interface was not very robust.  We already have a
>
> Afaik it's there since 1.2.10
>
>> clean break now due to all your work for 2.5; let's take advantage of
>> it, and make 2.5 robust, rather than breaking things again in 2.6.
>
> 2.5 should not break the ivshmem interface.

I don't object to keeping the existing ivshmem interface, flaws, warts
and all, to support existing users.

I do object to extending it.  I agree with Eric that 2.5 is the ideal
time for a clean break.  But if we can't pull it off for 2.5, then we
should punt *all* interface changes to 2.6.  Having to support one
legacy interface version in addition to the current one is clearly
better than having to support two of them.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-24  9:56                               ` Markus Armbruster
@ 2015-11-24 12:23                                 ` Marc-André Lureau
  2015-11-24 13:50                                   ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-24 12:23 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Claudio Fontana, QEMU, Luiz Capitulino

Hi

On Tue, Nov 24, 2015 at 10:56 AM, Markus Armbruster <armbru@redhat.com> wrote:
> Marc-André Lureau <mlureau@redhat.com> writes:
>
>> Hi
>>
>> ----- Original Message -----
>>> On 11/23/2015 07:46 AM, Markus Armbruster wrote:
>>>
>>> >>> If it's not broken, please explain to me how the guest should find out
>>> >>> whether its ivshmem device sports a doorbell.
>>> >>
>>> >> If you have received ID, you should be good to use the doorbell.
>>> >
>>> > That's not a complete answer, so let me try a different tack.
>>> >
>>> > What value will a read of register IVPosition yield
>>> >
>>> > * if the device has no doorbell:
>>> >
>>> >   I think the answer is -1.
>>> >
>>> > * if the device has a doorbell, but it isn't ready, yet:
>>> >
>>> >   I think the answer is -1.
>>> >
>>> > * if the device has a doorbell, and it is ready.
>>> >
>>> >   This is the part you answered already: >= 0.
>>> >
>>> > Please correct mistakes.
>>>
>>> For what it's worth, I'm agreeing with Markus here - we have a race in
>>> the guest learning whether doorbell is supported, and it's better to do
>>
>> I think there is no race if you communicate this information in the
>> shared memory.
>
> The device makes robust detection of the doorbell unnecessarily hard.
> You may be able to work around it at higher levels.  This involves
> assumptions on peers.  However, that's no excuse not to fix the device.
> Regardless, the mechanism you propose, namely signalling presence of the
> doorbell in the shared memory, is useless.
>
> Let me elaborate.
>
> Say you know that ivshmem is used for a specific protocol, and that
> protocol requires a doorbell.  You simply assume the doorbell is there,
> and read of IVPosition yielding -1 means the device hasn't completed its
> setup, yet.
>
> Since the shared memory only becomes mapped when IVPosition goes
> non-negative, a guest device driver will probably want to delay
> completion of the operation to map the shared memory into user space
> until IVPosition becomes non-negative, to shield user space from SIGBUS
> (or however else the kernel reports the trap to user space) during the
> setup window.  Happy polling.
>
> Say you know that ivshmem is used for a specific protocol, and that
> protocol signals its use of a doorbell in the shared memory.  If a read
> of IVPosition yields a non-negative number, you know you got a doorbell.
> If it yields -1, you know nothing.  You can attempt reading shared
> memory to find out more.  This *traps* when the BAR hasn't been mapped,
> yet.  If it does, you still don't know.  If it doesn't, and IVPosition
> is still -1, you got no doorbell, regardless of what the shared memory
> tells you.  If IVPosition has meanwhile become non-negative, you got a
> doorbell, again regardless of what the shared memory tells you.  Bottom
> line: the protocol signalling its use of the doorbell in the shared
> memory is of no help.

I don't follow you. The point is that you can define your protocol in
the shared memory. Tbh, I think in practice users simply have a kind
of manager to start things, eventually waiting for doorbell to start
coming etc. Don't take my simple example for "the solution", I just
though about one, there might be plenty others.

> I figure a robust guest device driver is possible, but it'll involve
> dealing with traps and polling IVPosition.
>
> This device is simply an embarrassment.

Apparently not for the people using it, or they would certainly fix it
or report bugs. Yes, it could be better, but it's really not that
"horrible" imho.

>
>>> a clean break in API for 2.5 along with ALL the other fixes into using a
>>> sane union of types (splitting between ivshmem-plain and
>>> ivshmem-doorbell).  If you absolutely want it, you can still maintain
>>> 'ivshmem' as a front-end that maps to either ivshmem-plain,
>>> ivshmem-doorbell, or an error (nonsensical combination of requests), but
>>
>> It's not about me. Until now I was not aware of anyone complaining about
>> the ivshmem interface, but by its implementation. I tried to adress all the
>> issues and comments I have found, and I tried to make sure not to break stuff,
>> because I usually receive huge complains whenever I do, and I have to throw
>> up the work. So if there is a consensus to break things now, I think it's
>> quite late for this cycle, but I am for it.
>
> The remaining flaws in ivshmem are certainly not your fault.

Thank you :)

>>> having a sane interface as your starting point, rather than an
>>> amalgamation of cruft that exists only due to the early implementation,
>>> seems like the way to go.
>>
>> It is just the way it was, isn't it?
>>
>>> I'm still waiting for any evidence that we even care about backwards
>>> compatibility - what apps are out there that are actively using ivshmem
>>> with its horrible pre-2.5 interface, and why can they not be fixed to
>>
>>> use the sane 2.5 interface?  Libvirt is NOT exposing ivshmem yet, in
>>> part because the 2.4 interface was not very robust.  We already have a
>>
>> Afaik it's there since 1.2.10
>>
>>> clean break now due to all your work for 2.5; let's take advantage of
>>> it, and make 2.5 robust, rather than breaking things again in 2.6.
>>
>> 2.5 should not break the ivshmem interface.
>
> I don't object to keeping the existing ivshmem interface, flaws, warts
> and all, to support existing users.
>
> I do object to extending it.  I agree with Eric that 2.5 is the ideal
> time for a clean break.  But if we can't pull it off for 2.5, then we
> should punt *all* interface changes to 2.6.  Having to support one
> legacy interface version in addition to the current one is clearly
> better than having to support two of them.

What you call interface change is?:
- the error return in case chardev is provided with shm (this was
initially changed in baefb8bf8 before I unified behaviour)
- the new memdev property (similarly to the new use64 in c08ba66)

Imho it's not such a big deal to have a compatibility interface with
2.5 in the future, and I would rather keep the consistant behaviour
for the error return case.

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-24 12:23                                 ` Marc-André Lureau
@ 2015-11-24 13:50                                   ` Markus Armbruster
  2015-11-24 14:23                                     ` Marc-André Lureau
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2015-11-24 13:50 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: Claudio Fontana, QEMU, Luiz Capitulino

Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi
>
> On Tue, Nov 24, 2015 at 10:56 AM, Markus Armbruster <armbru@redhat.com> wrote:
>> Marc-André Lureau <mlureau@redhat.com> writes:
>>
>>> Hi
>>>
>>> ----- Original Message -----
>>>> On 11/23/2015 07:46 AM, Markus Armbruster wrote:
>>>>
>>>> >>> If it's not broken, please explain to me how the guest should find out
>>>> >>> whether its ivshmem device sports a doorbell.
>>>> >>
>>>> >> If you have received ID, you should be good to use the doorbell.
>>>> >
>>>> > That's not a complete answer, so let me try a different tack.
>>>> >
>>>> > What value will a read of register IVPosition yield
>>>> >
>>>> > * if the device has no doorbell:
>>>> >
>>>> >   I think the answer is -1.
>>>> >
>>>> > * if the device has a doorbell, but it isn't ready, yet:
>>>> >
>>>> >   I think the answer is -1.
>>>> >
>>>> > * if the device has a doorbell, and it is ready.
>>>> >
>>>> >   This is the part you answered already: >= 0.
>>>> >
>>>> > Please correct mistakes.
>>>>
>>>> For what it's worth, I'm agreeing with Markus here - we have a race in
>>>> the guest learning whether doorbell is supported, and it's better to do
>>>
>>> I think there is no race if you communicate this information in the
>>> shared memory.
>>
>> The device makes robust detection of the doorbell unnecessarily hard.
>> You may be able to work around it at higher levels.  This involves
>> assumptions on peers.  However, that's no excuse not to fix the device.
>> Regardless, the mechanism you propose, namely signalling presence of the
>> doorbell in the shared memory, is useless.
>>
>> Let me elaborate.
>>
>> Say you know that ivshmem is used for a specific protocol, and that
>> protocol requires a doorbell.  You simply assume the doorbell is there,
>> and read of IVPosition yielding -1 means the device hasn't completed its
>> setup, yet.
>>
>> Since the shared memory only becomes mapped when IVPosition goes
>> non-negative, a guest device driver will probably want to delay
>> completion of the operation to map the shared memory into user space
>> until IVPosition becomes non-negative, to shield user space from SIGBUS
>> (or however else the kernel reports the trap to user space) during the
>> setup window.  Happy polling.
>>
>> Say you know that ivshmem is used for a specific protocol, and that
>> protocol signals its use of a doorbell in the shared memory.  If a read
>> of IVPosition yields a non-negative number, you know you got a doorbell.
>> If it yields -1, you know nothing.  You can attempt reading shared
>> memory to find out more.  This *traps* when the BAR hasn't been mapped,
>> yet.  If it does, you still don't know.  If it doesn't, and IVPosition
>> is still -1, you got no doorbell, regardless of what the shared memory
>> tells you.  If IVPosition has meanwhile become non-negative, you got a
>> doorbell, again regardless of what the shared memory tells you.  Bottom
>> line: the protocol signalling its use of the doorbell in the shared
>> memory is of no help.
>
> I don't follow you. The point is that you can define your protocol in
> the shared memory. Tbh, I think in practice users simply have a kind
> of manager to start things, eventually waiting for doorbell to start
> coming etc. Don't take my simple example for "the solution", I just
> though about one, there might be plenty others.

Facts:

* We have a half-initialized state that is visible to the guest.

* To detect the doorbell, you have to wait for the device to exit this
  state.

* Recognizing this state is cumbersome unless you already know you got a
  doorbell.

* Most guests most of the time should take longer to boot to the point
  where they look at the device than the device takes to exit this
  state.

This is a recipe for rare & obscure guest failure.  I doubt actual guest
software gets it right.  Moreover, I think it shouldn't have to get this
right!  It's a pointless trap for unwary guests.

>> I figure a robust guest device driver is possible, but it'll involve
>> dealing with traps and polling IVPosition.
>>
>> This device is simply an embarrassment.
>
> Apparently not for the people using it, or they would certainly fix it
> or report bugs. Yes, it could be better, but it's really not that
> "horrible" imho.

We fix all kind of bugs, the ones that bite every time, and the ones
that can bite only when you're sufficiently unlucky.  Applies do design
bugs just as much as to coding bugs.

>>>> a clean break in API for 2.5 along with ALL the other fixes into using a
>>>> sane union of types (splitting between ivshmem-plain and
>>>> ivshmem-doorbell).  If you absolutely want it, you can still maintain
>>>> 'ivshmem' as a front-end that maps to either ivshmem-plain,
>>>> ivshmem-doorbell, or an error (nonsensical combination of requests), but
>>>
>>> It's not about me. Until now I was not aware of anyone complaining about
>>> the ivshmem interface, but by its implementation. I tried to adress all the
>>> issues and comments I have found, and I tried to make sure not to
>>> break stuff,
>>> because I usually receive huge complains whenever I do, and I have to throw
>>> up the work. So if there is a consensus to break things now, I think it's
>>> quite late for this cycle, but I am for it.
>>
>> The remaining flaws in ivshmem are certainly not your fault.
>
> Thank you :)

Sizable chunk of work, thank *you*!

f7a199b ivshmem: use little-endian int64_t for the protocol
660c97e ivshmem: use kvm irqfd for msi notifications
0f57350 ivshmem: rename MSI eventfd_table
d160f3f ivshmem: remove EventfdEntry.vector
d9453c9 ivshmem: add hostmem backend
2c04752 ivshmem: use qemu_strtosz()
f689d28 ivshmem: do not keep shm_fd open

 1 file changed, 285 insertions(+), 91 deletions(-)

>>>> having a sane interface as your starting point, rather than an
>>>> amalgamation of cruft that exists only due to the early implementation,
>>>> seems like the way to go.
>>>
>>> It is just the way it was, isn't it?
>>>
>>>> I'm still waiting for any evidence that we even care about backwards
>>>> compatibility - what apps are out there that are actively using ivshmem
>>>> with its horrible pre-2.5 interface, and why can they not be fixed to
>>>
>>>> use the sane 2.5 interface?  Libvirt is NOT exposing ivshmem yet, in
>>>> part because the 2.4 interface was not very robust.  We already have a
>>>
>>> Afaik it's there since 1.2.10
>>>
>>>> clean break now due to all your work for 2.5; let's take advantage of
>>>> it, and make 2.5 robust, rather than breaking things again in 2.6.
>>>
>>> 2.5 should not break the ivshmem interface.
>>
>> I don't object to keeping the existing ivshmem interface, flaws, warts
>> and all, to support existing users.
>>
>> I do object to extending it.  I agree with Eric that 2.5 is the ideal
>> time for a clean break.  But if we can't pull it off for 2.5, then we
>> should punt *all* interface changes to 2.6.  Having to support one
>> legacy interface version in addition to the current one is clearly
>> better than having to support two of them.
>
> What you call interface change is?:
> - the error return in case chardev is provided with shm (this was
> initially changed in baefb8bf8 before I unified behaviour)

As far as I can tell, this is a bug fix.  Before, we first report the
missing chardev, then the fail shm_open(), then exit().  Now, we fail it
right away, and the right way, i.e. not with exit().

> - the new memdev property (similarly to the new use64 in c08ba66)

I would like to take that out, yes.

If we must have it in 2.5, mark it experimental: x-memdev.  Anybody
who'd want that, please speak up.

Post-2.5, deprecate the existing device model for something more sane.
I'm thinking of a pair of device models, one with and one without the
doorbell.  Make sure they have a clean guest ABI, a clean set of
properties, and a reasonable split between frontend and backend.  If we
kept x-memdev, drop it from the deprecated device.

> Imho it's not such a big deal to have a compatibility interface with
> 2.5 in the future, and I would rather keep the consistant behaviour
> for the error return case.

The fewer compatibility interfaces we maintain, and the simpler they
are, the better.  I don't see why we must complicate this one before we
deprecate it when we can it the other way round.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-24 13:50                                   ` Markus Armbruster
@ 2015-11-24 14:23                                     ` Marc-André Lureau
  2015-11-24 15:12                                       ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Marc-André Lureau @ 2015-11-24 14:23 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Claudio Fontana, QEMU, Luiz Capitulino

Hi

On Tue, Nov 24, 2015 at 2:50 PM, Markus Armbruster <armbru@redhat.com> wrote:

> Facts:
>
> * We have a half-initialized state that is visible to the guest.
>
> * To detect the doorbell, you have to wait for the device to exit this
>   state.
>
> * Recognizing this state is cumbersome unless you already know you got a
>   doorbell.
>
> * Most guests most of the time should take longer to boot to the point
>   where they look at the device than the device takes to exit this
>   state.
>
> This is a recipe for rare & obscure guest failure.  I doubt actual guest
> software gets it right.  Moreover, I think it shouldn't have to get this
> right!  It's a pointless trap for unwary guests.

That's just stating that you want an easier way to deal with ivshmem
doorbell, it doesn't mean users are unable to cope or unhappy with it.

>>> I figure a robust guest device driver is possible, but it'll involve
>>> dealing with traps and polling IVPosition.
>>>
>>> This device is simply an embarrassment.
>>
>> Apparently not for the people using it, or they would certainly fix it
>> or report bugs. Yes, it could be better, but it's really not that
>> "horrible" imho.
>
> We fix all kind of bugs, the ones that bite every time, and the ones
> that can bite only when you're sufficiently unlucky.  Applies do design
> bugs just as much as to coding bugs.

Ivshmem is >5y old, not a new kid in town.

> Sizable chunk of work, thank *you*!
>
> f7a199b ivshmem: use little-endian int64_t for the protocol
> 660c97e ivshmem: use kvm irqfd for msi notifications
> 0f57350 ivshmem: rename MSI eventfd_table
> d160f3f ivshmem: remove EventfdEntry.vector
> d9453c9 ivshmem: add hostmem backend
> 2c04752 ivshmem: use qemu_strtosz()
> f689d28 ivshmem: do not keep shm_fd open
>
>  1 file changed, 285 insertions(+), 91 deletions(-)

What is this list of commits telling?

>> - the new memdev property (similarly to the new use64 in c08ba66)
>
> I would like to take that out, yes.
>
> If we must have it in 2.5, mark it experimental: x-memdev.  Anybody
> who'd want that, please speak up.

This has been requested (with patches) for >2y, with several iterations:
https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg01890.html

> Post-2.5, deprecate the existing device model for something more sane.
> I'm thinking of a pair of device models, one with and one without the
> doorbell.  Make sure they have a clean guest ABI, a clean set of
> properties, and a reasonable split between frontend and backend.  If we
> kept x-memdev, drop it from the deprecated device.
>
>> Imho it's not such a big deal to have a compatibility interface with
>> 2.5 in the future, and I would rather keep the consistant behaviour
>> for the error return case.
>
> The fewer compatibility interfaces we maintain, and the simpler they
> are, the better.  I don't see why we must complicate this one before we
> deprecate it when we can it the other way round.

I am just saying I am okay with this extra property.

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] ivshmem property size should be a size, not a string
  2015-11-24 14:23                                     ` Marc-André Lureau
@ 2015-11-24 15:12                                       ` Markus Armbruster
  0 siblings, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2015-11-24 15:12 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: Claudio Fontana, QEMU, Luiz Capitulino

Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi
>
> On Tue, Nov 24, 2015 at 2:50 PM, Markus Armbruster <armbru@redhat.com> wrote:
>
>> Facts:
>>
>> * We have a half-initialized state that is visible to the guest.
>>
>> * To detect the doorbell, you have to wait for the device to exit this
>>   state.
>>
>> * Recognizing this state is cumbersome unless you already know you got a
>>   doorbell.
>>
>> * Most guests most of the time should take longer to boot to the point
>>   where they look at the device than the device takes to exit this
>>   state.
>>
>> This is a recipe for rare & obscure guest failure.  I doubt actual guest
>> software gets it right.  Moreover, I think it shouldn't have to get this
>> right!  It's a pointless trap for unwary guests.
>
> That's just stating that you want an easier way to deal with ivshmem
> doorbell, it doesn't mean users are unable to cope or unhappy with it.

Testing can't prove the absence of bugs.

>>>> I figure a robust guest device driver is possible, but it'll involve
>>>> dealing with traps and polling IVPosition.
>>>>
>>>> This device is simply an embarrassment.
>>>
>>> Apparently not for the people using it, or they would certainly fix it
>>> or report bugs. Yes, it could be better, but it's really not that
>>> "horrible" imho.
>>
>> We fix all kind of bugs, the ones that bite every time, and the ones
>> that can bite only when you're sufficiently unlucky.  Applies do design
>> bugs just as much as to coding bugs.
>
> Ivshmem is >5y old, not a new kid in town.

We also fix bugs regardless of age.

>> Sizable chunk of work, thank *you*!
>>
>> f7a199b ivshmem: use little-endian int64_t for the protocol
>> 660c97e ivshmem: use kvm irqfd for msi notifications
>> 0f57350 ivshmem: rename MSI eventfd_table
>> d160f3f ivshmem: remove EventfdEntry.vector
>> d9453c9 ivshmem: add hostmem backend
>> 2c04752 ivshmem: use qemu_strtosz()
>> f689d28 ivshmem: do not keep shm_fd open
>>
>>  1 file changed, 285 insertions(+), 91 deletions(-)
>
> What is this list of commits telling?

Failed attempt to show my gratitude for your work.

>>> - the new memdev property (similarly to the new use64 in c08ba66)
>>
>> I would like to take that out, yes.
>>
>> If we must have it in 2.5, mark it experimental: x-memdev.  Anybody
>> who'd want that, please speak up.
>
> This has been requested (with patches) for >2y, with several iterations:
> https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg01890.html

Thanks.

>> Post-2.5, deprecate the existing device model for something more sane.
>> I'm thinking of a pair of device models, one with and one without the
>> doorbell.  Make sure they have a clean guest ABI, a clean set of
>> properties, and a reasonable split between frontend and backend.  If we
>> kept x-memdev, drop it from the deprecated device.
>>
>>> Imho it's not such a big deal to have a compatibility interface with
>>> 2.5 in the future, and I would rather keep the consistant behaviour
>>> for the error return case.
>>
>> The fewer compatibility interfaces we maintain, and the simpler they
>> are, the better.  I don't see why we must complicate this one before we
>> deprecate it when we can it the other way round.
>
> I am just saying I am okay with this extra property.

Noted.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2015-11-24 15:12 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-20 16:07 [Qemu-devel] ivshmem property size should be a size, not a string Markus Armbruster
2015-11-20 16:23 ` Marc-André Lureau
2015-11-20 16:46   ` Eric Blake
2015-11-20 18:06     ` Markus Armbruster
2015-11-20 18:20       ` Marc-André Lureau
2015-11-20 19:39         ` Markus Armbruster
2015-11-20 20:18           ` Marc-André Lureau
2015-11-23 10:19             ` Markus Armbruster
2015-11-23 12:15               ` Marc-André Lureau
2015-11-23 13:25                 ` Markus Armbruster
2015-11-23 13:48                   ` Marc-André Lureau
2015-11-23 14:08                     ` Markus Armbruster
2015-11-23 14:16                       ` Marc-André Lureau
2015-11-23 14:46                         ` Markus Armbruster
2015-11-23 14:53                           ` Marc-André Lureau
2015-11-23 15:17                             ` Markus Armbruster
2015-11-23 19:52                           ` Eric Blake
2015-11-23 20:19                             ` Marc-André Lureau
2015-11-24  9:56                               ` Markus Armbruster
2015-11-24 12:23                                 ` Marc-André Lureau
2015-11-24 13:50                                   ` Markus Armbruster
2015-11-24 14:23                                     ` Marc-André Lureau
2015-11-24 15:12                                       ` Markus Armbruster
2015-11-23 18:22               ` Markus Armbruster
2015-11-23 23:29             ` Andrew James
2015-11-24  9:52               ` Markus Armbruster
2015-11-23 20:57 ` Bruce Rogers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.