All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] virtio-serial semantics for binary data and guest agents
@ 2011-02-22 22:40 Michael Roth
  2011-02-22 23:09 ` [Qemu-devel] " Anthony Liguori
  2011-02-23  4:59 ` Amit Shah
  0 siblings, 2 replies; 10+ messages in thread
From: Michael Roth @ 2011-02-22 22:40 UTC (permalink / raw)
  To: qemu-devel@nongnu.org Developers; +Cc: Amit Shah, Anthony Liguori, Adam Litke

Hi everyone,

As some of you are aware we've been working for a few months now towards 
creating a qemu-specific guest agent to implement bi-directional RPCs 
between the host and the guest to support certain operations like 
copy/paste, guest-initiated shutdown, and basic file transfer.

Currently the protocol uses the xmlrpc-c library to marshall functions 
and arguments into a data format we can transport over 
virtio-serial/isa-serial via an HTTP-like protocol. Recently some 
concerns have been raised over pulling an external dependency such as 
xmlrpc-c into qemu, and we've been looking at some alternatives.

Some clear, well-defined approaches such as ASN.1/BER show much promise, 
but we currently lack a way to implement them cleanly due to the 
following drawbacks with virtio-serial which prevent us from being able 
to implement connection/session-oriented protocols:

If something in the guest is attempting to read/write from the 
virtio-serial device, and nothing is connected to virtio-serial's host 
character device (say, a socket)

1. writes will block until something connect()s, at which point the 
write will succeed

2. reads will always return 0 until something connect()s, at which point 
the reads will block until there's data

This makes it difficult (impossible?) to implement the notion of 
connect/disconnect or open/close over virtio-serial without layering 
another protocol on top using hackish things like length-encoded 
payloads or sentinel values to determine the end of one 
RPC/request/response/session and the start of the next.

For instance, if the host side disconnects, then reconnects before we 
read(), we may never get the read()=0, and our FD remains valid. Whereas 
with a tcp/unix socket our FD is no longer valid, and the read()=0 is an 
event we can check for at any point after the other end does a 
close/disconnect.

Or if the host side disconnects/closes before/while we write(), we 
block. If they reconnect, our write() succeeds, and they potentially end 
up with garbage meant for the previous process. With a tcp/unix socket, 
the write() will return an EPIPE indicating the FD is no longer valid.

Since virtio-serial is meant to be a general-purpose transport for raw 
binary data as well as a pv serial console, I wonder if a virtio-serial 
mode with semantics closer to tcp/unix sockets is necessary? Any thoughts?

Thanks,

Mike

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-22 22:40 [Qemu-devel] virtio-serial semantics for binary data and guest agents Michael Roth
@ 2011-02-22 23:09 ` Anthony Liguori
  2011-02-23  4:59 ` Amit Shah
  1 sibling, 0 replies; 10+ messages in thread
From: Anthony Liguori @ 2011-02-22 23:09 UTC (permalink / raw)
  To: Michael Roth; +Cc: Amit Shah, qemu-devel@nongnu.org Developers, Adam Litke

My basic understanding here is--there's no way for a virtio-serial 
client (in-guest) to reliably reset a session such that if the client in 
the guest crashes, the protocol needs to have some recovery mechanism.

This seems like a huge oversight in the design of virtio-serial.  Are we 
missing something here?

Regards,

Anthony Liguori

On 02/22/2011 04:40 PM, Michael Roth wrote:
> Hi everyone,
>
> As some of you are aware we've been working for a few months now 
> towards creating a qemu-specific guest agent to implement 
> bi-directional RPCs between the host and the guest to support certain 
> operations like copy/paste, guest-initiated shutdown, and basic file 
> transfer.
>
> Currently the protocol uses the xmlrpc-c library to marshall functions 
> and arguments into a data format we can transport over 
> virtio-serial/isa-serial via an HTTP-like protocol. Recently some 
> concerns have been raised over pulling an external dependency such as 
> xmlrpc-c into qemu, and we've been looking at some alternatives.
>
> Some clear, well-defined approaches such as ASN.1/BER show much 
> promise, but we currently lack a way to implement them cleanly due to 
> the following drawbacks with virtio-serial which prevent us from being 
> able to implement connection/session-oriented protocols:
>
> If something in the guest is attempting to read/write from the 
> virtio-serial device, and nothing is connected to virtio-serial's host 
> character device (say, a socket)
>
> 1. writes will block until something connect()s, at which point the 
> write will succeed
>
> 2. reads will always return 0 until something connect()s, at which 
> point the reads will block until there's data
>
> This makes it difficult (impossible?) to implement the notion of 
> connect/disconnect or open/close over virtio-serial without layering 
> another protocol on top using hackish things like length-encoded 
> payloads or sentinel values to determine the end of one 
> RPC/request/response/session and the start of the next.
>
> For instance, if the host side disconnects, then reconnects before we 
> read(), we may never get the read()=0, and our FD remains valid. 
> Whereas with a tcp/unix socket our FD is no longer valid, and the 
> read()=0 is an event we can check for at any point after the other end 
> does a close/disconnect.
>
> Or if the host side disconnects/closes before/while we write(), we 
> block. If they reconnect, our write() succeeds, and they potentially 
> end up with garbage meant for the previous process. With a tcp/unix 
> socket, the write() will return an EPIPE indicating the FD is no 
> longer valid.
>
> Since virtio-serial is meant to be a general-purpose transport for raw 
> binary data as well as a pv serial console, I wonder if a 
> virtio-serial mode with semantics closer to tcp/unix sockets is 
> necessary? Any thoughts?
>
> Thanks,
>
> Mike

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-22 22:40 [Qemu-devel] virtio-serial semantics for binary data and guest agents Michael Roth
  2011-02-22 23:09 ` [Qemu-devel] " Anthony Liguori
@ 2011-02-23  4:59 ` Amit Shah
  2011-02-23 14:31   ` Michael Roth
  1 sibling, 1 reply; 10+ messages in thread
From: Amit Shah @ 2011-02-23  4:59 UTC (permalink / raw)
  To: Michael Roth
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Adam Litke

On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote:
> If something in the guest is attempting to read/write from the
> virtio-serial device, and nothing is connected to virtio-serial's
> host character device (say, a socket)
> 
> 1. writes will block until something connect()s, at which point the
> write will succeed
> 
> 2. reads will always return 0 until something connect()s, at which
> point the reads will block until there's data
> 
> This makes it difficult (impossible?) to implement the notion of
> connect/disconnect or open/close over virtio-serial without layering
> another protocol on top using hackish things like length-encoded
> payloads or sentinel values to determine the end of one
> RPC/request/response/session and the start of the next.
> 
> For instance, if the host side disconnects, then reconnects before
> we read(), we may never get the read()=0, and our FD remains valid.
> Whereas with a tcp/unix socket our FD is no longer valid, and the
> read()=0 is an event we can check for at any point after the other
> end does a close/disconnect.

There's SIGIO support, so host connect-disconnect notifications can be
caught via the signal.

Also, nonblocking reads/writes will return -EPIPE if the host-side
connection is not up.

		Amit

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-23  4:59 ` Amit Shah
@ 2011-02-23 14:31   ` Michael Roth
  2011-02-23 14:36     ` Michael Roth
  2011-02-24 12:48     ` Amit Shah
  0 siblings, 2 replies; 10+ messages in thread
From: Michael Roth @ 2011-02-23 14:31 UTC (permalink / raw)
  To: Amit Shah; +Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Adam Litke

On 02/22/2011 10:59 PM, Amit Shah wrote:
> On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote:
>> If something in the guest is attempting to read/write from the
>> virtio-serial device, and nothing is connected to virtio-serial's
>> host character device (say, a socket)
>>
>> 1. writes will block until something connect()s, at which point the
>> write will succeed
>>
>> 2. reads will always return 0 until something connect()s, at which
>> point the reads will block until there's data
>>
>> This makes it difficult (impossible?) to implement the notion of
>> connect/disconnect or open/close over virtio-serial without layering
>> another protocol on top using hackish things like length-encoded
>> payloads or sentinel values to determine the end of one
>> RPC/request/response/session and the start of the next.
>>
>> For instance, if the host side disconnects, then reconnects before
>> we read(), we may never get the read()=0, and our FD remains valid.
>> Whereas with a tcp/unix socket our FD is no longer valid, and the
>> read()=0 is an event we can check for at any point after the other
>> end does a close/disconnect.
>
> There's SIGIO support, so host connect-disconnect notifications can be
> caught via the signal.

I recall looking into this at some point....but don't we get a SIGIO for 
read/write-ability in general? So you still need some way differentiate, 
say, readability from a disconnect/EOF, and the read()=0 that could 
determine this is still racing with host-side reconnects.

>
> Also, nonblocking reads/writes will return -EPIPE if the host-side
> connection is not up.

But we still essentially need to poll() for a host-side disconnected 
state, which is still racy since they may reconnect before we've done a 
read/write that would've generated the -EPIPE. It seems like what we 
really need is for the FD to be invalid from that point forward.

Also, I focused more on the guest-side connect/disconnect detection, but 
as Anthony mentioned I think the host side shares similar limitations as 
well. AFAIK once we connect to the chardev that FD remains valid until 
the connected process closes it, and so races with the guest side on 
detecting connect/disconnect events in a similar manner. For the host 
side it looks like virtio-console has guest_close/guest_open callbacks 
already that we could potentially use...seems like it's just a matter of 
tying them to the chardev... basically having virtio-serial's 
guest_close() result in a close() on the corresponding chardev 
connection's FD.

>
> 		Amit

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-23 14:31   ` Michael Roth
@ 2011-02-23 14:36     ` Michael Roth
  2011-02-24 12:48     ` Amit Shah
  1 sibling, 0 replies; 10+ messages in thread
From: Michael Roth @ 2011-02-23 14:36 UTC (permalink / raw)
  To: Amit Shah; +Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Adam Litke

On 02/23/2011 08:31 AM, Michael Roth wrote:
> On 02/22/2011 10:59 PM, Amit Shah wrote:
>> On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote:
>>> If something in the guest is attempting to read/write from the
>>> virtio-serial device, and nothing is connected to virtio-serial's
>>> host character device (say, a socket)
>>>
>>> 1. writes will block until something connect()s, at which point the
>>> write will succeed
>>>
>>> 2. reads will always return 0 until something connect()s, at which
>>> point the reads will block until there's data
>>>
>>> This makes it difficult (impossible?) to implement the notion of
>>> connect/disconnect or open/close over virtio-serial without layering
>>> another protocol on top using hackish things like length-encoded
>>> payloads or sentinel values to determine the end of one
>>> RPC/request/response/session and the start of the next.
>>>
>>> For instance, if the host side disconnects, then reconnects before
>>> we read(), we may never get the read()=0, and our FD remains valid.
>>> Whereas with a tcp/unix socket our FD is no longer valid, and the
>>> read()=0 is an event we can check for at any point after the other
>>> end does a close/disconnect.
>>
>> There's SIGIO support, so host connect-disconnect notifications can be
>> caught via the signal.
>
> I recall looking into this at some point....but don't we get a SIGIO for
> read/write-ability in general? So you still need some way differentiate,
> say, readability from a disconnect/EOF, and the read()=0 that could
> determine this is still racing with host-side reconnects.
>
>>
>> Also, nonblocking reads/writes will return -EPIPE if the host-side
>> connection is not up.
>
> But we still essentially need to poll() for a host-side disconnected

heh, just "poll" not poll() sorry

> state, which is still racy since they may reconnect before we've done a
> read/write that would've generated the -EPIPE. It seems like what we
> really need is for the FD to be invalid from that point forward.
>
> Also, I focused more on the guest-side connect/disconnect detection, but
> as Anthony mentioned I think the host side shares similar limitations as
> well. AFAIK once we connect to the chardev that FD remains valid until
> the connected process closes it, and so races with the guest side on
> detecting connect/disconnect events in a similar manner. For the host
> side it looks like virtio-console has guest_close/guest_open callbacks
> already that we could potentially use...seems like it's just a matter of
> tying them to the chardev... basically having virtio-serial's
> guest_close() result in a close() on the corresponding chardev
> connection's FD.
>
>>
>> Amit
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-23 14:31   ` Michael Roth
  2011-02-23 14:36     ` Michael Roth
@ 2011-02-24 12:48     ` Amit Shah
  2011-02-24 14:44       ` Anthony Liguori
  2011-02-25 20:25       ` Michael Roth
  1 sibling, 2 replies; 10+ messages in thread
From: Amit Shah @ 2011-02-24 12:48 UTC (permalink / raw)
  To: Michael Roth
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Adam Litke

On (Wed) 23 Feb 2011 [08:31:52], Michael Roth wrote:
> On 02/22/2011 10:59 PM, Amit Shah wrote:
> >On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote:
> >>If something in the guest is attempting to read/write from the
> >>virtio-serial device, and nothing is connected to virtio-serial's
> >>host character device (say, a socket)
> >>
> >>1. writes will block until something connect()s, at which point the
> >>write will succeed
> >>
> >>2. reads will always return 0 until something connect()s, at which
> >>point the reads will block until there's data
> >>
> >>This makes it difficult (impossible?) to implement the notion of
> >>connect/disconnect or open/close over virtio-serial without layering
> >>another protocol on top using hackish things like length-encoded
> >>payloads or sentinel values to determine the end of one
> >>RPC/request/response/session and the start of the next.
> >>
> >>For instance, if the host side disconnects, then reconnects before
> >>we read(), we may never get the read()=0, and our FD remains valid.
> >>Whereas with a tcp/unix socket our FD is no longer valid, and the
> >>read()=0 is an event we can check for at any point after the other
> >>end does a close/disconnect.
> >
> >There's SIGIO support, so host connect-disconnect notifications can be
> >caught via the signal.
> 
> I recall looking into this at some point....but don't we get a SIGIO
> for read/write-ability in general?

I don't get you -- the virtio_console driver emits the SIGIO signal
only when the host side connects or disconnects.  See

http://www.linux-kvm.org/page/Virtio-serial_API

So whenever you receive a SIGIO, poll() in the signal handler for all
fds of interest and whichever has POLLIN set is writable.  Whichever
has POLLHUP set is not.  If you maintain previous state of the fd
(before signal), you can figure out if something happened on the host
side.

> So you still need some way
> differentiate, say, readability from a disconnect/EOF, and the
> read()=0 that could determine this is still racing with host-side
> reconnects.

> >Also, nonblocking reads/writes will return -EPIPE if the host-side
> >connection is not up.
> 
> But we still essentially need to poll() for a host-side disconnected
> state, which is still racy since they may reconnect before we've
> done a read/write that would've generated the -EPIPE. It seems like
> what we really need is for the FD to be invalid from that point
> forward.

This would go against (or abuse) a chardev interface.  It would
effectively treat a host-side port close as a hot-unplug event.

> Also, I focused more on the guest-side connect/disconnect detection,
> but as Anthony mentioned I think the host side shares similar
> limitations as well. AFAIK once we connect to the chardev that FD
> remains valid until the connected process closes it, and so races
> with the guest side on detecting connect/disconnect events in a
> similar manner. For the host side it looks like virtio-console has
> guest_close/guest_open callbacks already that we could potentially
> use...seems like it's just a matter of tying them to the chardev...
> basically having virtio-serial's guest_close() result in a close()
> on the corresponding chardev connection's FD.

Yes, this could be used.

However, the problem with that will be that the chardev can't be
opened again (AFAIR) and a new chardev will have to be used.


So if this is done on both the sides, the race will be eliminated but
the expectation that a chardev port is just a serial port will be
broken and we'll try to bake in some connection layer on top of it.
That wasn't the original idea.  We could extend this, but a better way
to achieve this could be a library on either side to abstract these
details off.

                Amit

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-24 12:48     ` Amit Shah
@ 2011-02-24 14:44       ` Anthony Liguori
  2011-02-28 12:21         ` Amit Shah
  2011-02-25 20:25       ` Michael Roth
  1 sibling, 1 reply; 10+ messages in thread
From: Anthony Liguori @ 2011-02-24 14:44 UTC (permalink / raw)
  To: Amit Shah; +Cc: Adam Litke, Michael Roth, qemu-devel@nongnu.org Developers

On 02/24/2011 06:48 AM, Amit Shah wrote:
> On (Wed) 23 Feb 2011 [08:31:52], Michael Roth wrote:
>    
>> On 02/22/2011 10:59 PM, Amit Shah wrote:
>>      
>>> On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote:
>>>        
>>>> If something in the guest is attempting to read/write from the
>>>> virtio-serial device, and nothing is connected to virtio-serial's
>>>> host character device (say, a socket)
>>>>
>>>> 1. writes will block until something connect()s, at which point the
>>>> write will succeed
>>>>
>>>> 2. reads will always return 0 until something connect()s, at which
>>>> point the reads will block until there's data
>>>>
>>>> This makes it difficult (impossible?) to implement the notion of
>>>> connect/disconnect or open/close over virtio-serial without layering
>>>> another protocol on top using hackish things like length-encoded
>>>> payloads or sentinel values to determine the end of one
>>>> RPC/request/response/session and the start of the next.
>>>>
>>>> For instance, if the host side disconnects, then reconnects before
>>>> we read(), we may never get the read()=0, and our FD remains valid.
>>>> Whereas with a tcp/unix socket our FD is no longer valid, and the
>>>> read()=0 is an event we can check for at any point after the other
>>>> end does a close/disconnect.
>>>>          
>>> There's SIGIO support, so host connect-disconnect notifications can be
>>> caught via the signal.
>>>        
>> I recall looking into this at some point....but don't we get a SIGIO
>> for read/write-ability in general?
>>      
> I don't get you -- the virtio_console driver emits the SIGIO signal
> only when the host side connects or disconnects.  See
>    

Um, that's not the expected semantics of SIGIO.  SIGIO can be delivered 
for any number of reasons (including on a normal file descriptor) so if 
there's no way to poll for the specific event then the mechanism is 
inherently racy.

Regards,

Anthony Liguori

> http://www.linux-kvm.org/page/Virtio-serial_API
>
> So whenever you receive a SIGIO, poll() in the signal handler for all
> fds of interest and whichever has POLLIN set is writable.  Whichever
> has POLLHUP set is not.  If you maintain previous state of the fd
> (before signal), you can figure out if something happened on the host
> side.
>
>    
>> So you still need some way
>> differentiate, say, readability from a disconnect/EOF, and the
>> read()=0 that could determine this is still racing with host-side
>> reconnects.
>>      
>    
>>> Also, nonblocking reads/writes will return -EPIPE if the host-side
>>> connection is not up.
>>>        
>> But we still essentially need to poll() for a host-side disconnected
>> state, which is still racy since they may reconnect before we've
>> done a read/write that would've generated the -EPIPE. It seems like
>> what we really need is for the FD to be invalid from that point
>> forward.
>>      
> This would go against (or abuse) a chardev interface.  It would
> effectively treat a host-side port close as a hot-unplug event.
>
>    
>> Also, I focused more on the guest-side connect/disconnect detection,
>> but as Anthony mentioned I think the host side shares similar
>> limitations as well. AFAIK once we connect to the chardev that FD
>> remains valid until the connected process closes it, and so races
>> with the guest side on detecting connect/disconnect events in a
>> similar manner. For the host side it looks like virtio-console has
>> guest_close/guest_open callbacks already that we could potentially
>> use...seems like it's just a matter of tying them to the chardev...
>> basically having virtio-serial's guest_close() result in a close()
>> on the corresponding chardev connection's FD.
>>      
> Yes, this could be used.
>
> However, the problem with that will be that the chardev can't be
> opened again (AFAIR) and a new chardev will have to be used.
>
>
> So if this is done on both the sides, the race will be eliminated but
> the expectation that a chardev port is just a serial port will be
> broken and we'll try to bake in some connection layer on top of it.
> That wasn't the original idea.  We could extend this, but a better way
> to achieve this could be a library on either side to abstract these
> details off.
>
>                  Amit
>    

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-24 12:48     ` Amit Shah
  2011-02-24 14:44       ` Anthony Liguori
@ 2011-02-25 20:25       ` Michael Roth
  2011-03-01  6:17         ` Amit Shah
  1 sibling, 1 reply; 10+ messages in thread
From: Michael Roth @ 2011-02-25 20:25 UTC (permalink / raw)
  To: Amit Shah; +Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Adam Litke

On 02/24/2011 06:48 AM, Amit Shah wrote:
> On (Wed) 23 Feb 2011 [08:31:52], Michael Roth wrote:
>> On 02/22/2011 10:59 PM, Amit Shah wrote:
>>> On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote:
>>>> If something in the guest is attempting to read/write from the
>>>> virtio-serial device, and nothing is connected to virtio-serial's
>>>> host character device (say, a socket)
>>>>
>>>> 1. writes will block until something connect()s, at which point the
>>>> write will succeed
>>>>
>>>> 2. reads will always return 0 until something connect()s, at which
>>>> point the reads will block until there's data
>>>>
>>>> This makes it difficult (impossible?) to implement the notion of
>>>> connect/disconnect or open/close over virtio-serial without layering
>>>> another protocol on top using hackish things like length-encoded
>>>> payloads or sentinel values to determine the end of one
>>>> RPC/request/response/session and the start of the next.
>>>>
>>>> For instance, if the host side disconnects, then reconnects before
>>>> we read(), we may never get the read()=0, and our FD remains valid.
>>>> Whereas with a tcp/unix socket our FD is no longer valid, and the
>>>> read()=0 is an event we can check for at any point after the other
>>>> end does a close/disconnect.
>>>
>>> There's SIGIO support, so host connect-disconnect notifications can be
>>> caught via the signal.
>>
>> I recall looking into this at some point....but don't we get a SIGIO
>> for read/write-ability in general?
>
> I don't get you -- the virtio_console driver emits the SIGIO signal
> only when the host side connects or disconnects.  See
>
> http://www.linux-kvm.org/page/Virtio-serial_API
>
> So whenever you receive a SIGIO, poll() in the signal handler for all
> fds of interest and whichever has POLLIN set is writable.  Whichever
> has POLLHUP set is not.  If you maintain previous state of the fd
> (before signal), you can figure out if something happened on the host
> side.
>

I tried this on RHEL6+rhn updates but the O_ASYNC flag doesn't seem to 
be supported. Has this been backported?

Either way, it seems we can still lose the disconnect event/poll state 
change if the host reconnects before the signal is delivered. So SIGIO 
in an application would need to be reserved for absolutely 2 things: a 
host connect or disconnect (distinguishing between the 2 may not be so 
important, we could treat either as the previous session having been 
closed). Which limits the application to only having 1 O_ASYNC FD open 
at a time.

But even if we do that, it seems like there might still be a small 
window where the application could read/write data intended for the 
previous connection before the signal handler is invoked. Not too sure 
on that point though. Assuming this isn't the case...it could work. But 
what about windows guests?

>> So you still need some way
>> differentiate, say, readability from a disconnect/EOF, and the
>> read()=0 that could determine this is still racing with host-side
>> reconnects.
>
>>> Also, nonblocking reads/writes will return -EPIPE if the host-side
>>> connection is not up.
>>
>> But we still essentially need to poll() for a host-side disconnected
>> state, which is still racy since they may reconnect before we've
>> done a read/write that would've generated the -EPIPE. It seems like
>> what we really need is for the FD to be invalid from that point
>> forward.
>
> This would go against (or abuse) a chardev interface.  It would
> effectively treat a host-side port close as a hot-unplug event.
>

Well, not a complete hot-unplug. The port would still be there, we'd 
just need to re-open it after a read()=0

Personally I'm not necessarily advocating we change the default 
behavior, but couldn't we support this as a separate mode?

-device virtserialport,inv_fd_on_host_close=1

or something along that line?

>> Also, I focused more on the guest-side connect/disconnect detection,
>> but as Anthony mentioned I think the host side shares similar
>> limitations as well. AFAIK once we connect to the chardev that FD
>> remains valid until the connected process closes it, and so races
>> with the guest side on detecting connect/disconnect events in a
>> similar manner. For the host side it looks like virtio-console has
>> guest_close/guest_open callbacks already that we could potentially
>> use...seems like it's just a matter of tying them to the chardev...
>> basically having virtio-serial's guest_close() result in a close()
>> on the corresponding chardev connection's FD.
>
> Yes, this could be used.
>
> However, the problem with that will be that the chardev can't be
> opened again (AFAIR) and a new chardev will have to be used.
>

Hmm...yeah I was thinking more specifically about the socket chardev, 
where we can leave the listen_fd alone but close anything we've 
accept()'d prior to a guest-side disconnect. But isn't that enough? Just 
add this option for chardevs where this actually makes sense? For instance:

-chardev socket,inv_fd_on_guest_close=1

Although, this wouldn't make sense if we're using the chardev for 
anything other than virtio-serial...so that flag makes more sense as a 
virtio-serial flag....but that virtio-serial flag only makes sense for 
particular chardevs...

I'm not sure what a good way to do this would be honestly...either would 
work it seems...but neither seems very intuitive.

>
> So if this is done on both the sides, the race will be eliminated but
> the expectation that a chardev port is just a serial port will be
> broken and we'll try to bake in some connection layer on top of it.
> That wasn't the original idea.  We could extend this, but a better way
> to achieve this could be a library on either side to abstract these
> details off.

As far as implementing a library...I tried layering something on top 
with previous applications to provide connect/disconnect detection over 
virtio-serial. Basically by proxing data to/from forwarding sockets on 
either side of the virtio-serial channel, and packetizing the 
virtio-serial stream into fixed-sized or length-encoded data packets and 
control packets that induced the kind of invalidate-fd-on-remote-close 
semantics we're discussing.

It seems to work ok in practice, but we run into trouble when the 
daemons managing the stream go down...mainly due to similar problems as 
those we're trying to address with the aforementioned changes. (for 
instance...forwarding daemon in guest sends partial packet, goes down, 
starts back up, sends control packet...but packet stream is out of sync 
now. only way around i've found around this so far is reserving bytes in 
the transport as sentinel values we can use to re-sync the stream..but 
then we have to do stuff like base64-encode binary data, which can be 
expensive both in terms of channel bandwidth as well as cpu)

If that's the kind of approach we'd need to take, I think optional flags 
like the ones mentioned above, or some other change to the transport, 
are still required. Unless there's a better approach to handling this 
outside of the actual transport that I'm missing?

>
>                  Amit

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-24 14:44       ` Anthony Liguori
@ 2011-02-28 12:21         ` Amit Shah
  0 siblings, 0 replies; 10+ messages in thread
From: Amit Shah @ 2011-02-28 12:21 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Adam Litke, Michael Roth, qemu-devel@nongnu.org Developers

On (Thu) 24 Feb 2011 [08:44:07], Anthony Liguori wrote:
> >>>>For instance, if the host side disconnects, then reconnects before
> >>>>we read(), we may never get the read()=0, and our FD remains valid.
> >>>>Whereas with a tcp/unix socket our FD is no longer valid, and the
> >>>>read()=0 is an event we can check for at any point after the other
> >>>>end does a close/disconnect.
> >>>There's SIGIO support, so host connect-disconnect notifications can be
> >>>caught via the signal.
> >>I recall looking into this at some point....but don't we get a SIGIO
> >>for read/write-ability in general?
> >I don't get you -- the virtio_console driver emits the SIGIO signal
> >only when the host side connects or disconnects.  See
> 
> Um, that's not the expected semantics of SIGIO.  SIGIO can be
> delivered for any number of reasons (including on a normal file
> descriptor) so if there's no way to poll for the specific event then
> the mechanism is inherently racy.

Well -- for a custom driver something's going to be non-standard:  if
it's baking in connection info into the open/close semantics or having
SIGIO delivered when needed.

However, I think a generic library needs to be used for guest-host
communications, abstracting away virtio-serial entirely.  This way,
apps should just communicate over tcp, the library should do the
necessary virtio-serial (or even virt-9pfs) calls.  This way, all
guest apps written will be infrastructure-agnostic, and if there's a
better protocol to communicate with, virtio-serial can easily be
dropped w/o modifications to apps.

		Amit

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: virtio-serial semantics for binary data and guest agents
  2011-02-25 20:25       ` Michael Roth
@ 2011-03-01  6:17         ` Amit Shah
  0 siblings, 0 replies; 10+ messages in thread
From: Amit Shah @ 2011-03-01  6:17 UTC (permalink / raw)
  To: Michael Roth
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Adam Litke

On (Fri) 25 Feb 2011 [14:25:20], Michael Roth wrote:
> On 02/24/2011 06:48 AM, Amit Shah wrote:
> >On (Wed) 23 Feb 2011 [08:31:52], Michael Roth wrote:
> >>On 02/22/2011 10:59 PM, Amit Shah wrote:
> >>>On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote:
> >>>>If something in the guest is attempting to read/write from the
> >>>>virtio-serial device, and nothing is connected to virtio-serial's
> >>>>host character device (say, a socket)
> >>>>
> >>>>1. writes will block until something connect()s, at which point the
> >>>>write will succeed
> >>>>
> >>>>2. reads will always return 0 until something connect()s, at which
> >>>>point the reads will block until there's data
> >>>>
> >>>>This makes it difficult (impossible?) to implement the notion of
> >>>>connect/disconnect or open/close over virtio-serial without layering
> >>>>another protocol on top using hackish things like length-encoded
> >>>>payloads or sentinel values to determine the end of one
> >>>>RPC/request/response/session and the start of the next.
> >>>>
> >>>>For instance, if the host side disconnects, then reconnects before
> >>>>we read(), we may never get the read()=0, and our FD remains valid.
> >>>>Whereas with a tcp/unix socket our FD is no longer valid, and the
> >>>>read()=0 is an event we can check for at any point after the other
> >>>>end does a close/disconnect.
> >>>
> >>>There's SIGIO support, so host connect-disconnect notifications can be
> >>>caught via the signal.
> >>
> >>I recall looking into this at some point....but don't we get a SIGIO
> >>for read/write-ability in general?
> >
> >I don't get you -- the virtio_console driver emits the SIGIO signal
> >only when the host side connects or disconnects.  See
> >
> >http://www.linux-kvm.org/page/Virtio-serial_API
> >
> >So whenever you receive a SIGIO, poll() in the signal handler for all
> >fds of interest and whichever has POLLIN set is writable.  Whichever
> >has POLLHUP set is not.  If you maintain previous state of the fd
> >(before signal), you can figure out if something happened on the host
> >side.
> >
> 
> I tried this on RHEL6+rhn updates but the O_ASYNC flag doesn't seem
> to be supported. Has this been backported?

Yes, it has been.

> Either way, it seems we can still lose the disconnect event/poll
> state change if the host reconnects before the signal is delivered.
> So SIGIO in an application would need to be reserved for absolutely
> 2 things: a host connect or disconnect (distinguishing between the 2
> may not be so important, we could treat either as the previous
> session having been closed). Which limits the application to only
> having 1 O_ASYNC FD open at a time.
> 
> But even if we do that, it seems like there might still be a small
> window where the application could read/write data intended for the
> previous connection before the signal handler is invoked. Not too
> sure on that point though. Assuming this isn't the case...it could
> work. But what about windows guests?
> 
> >>So you still need some way
> >>differentiate, say, readability from a disconnect/EOF, and the
> >>read()=0 that could determine this is still racing with host-side
> >>reconnects.
> >
> >>>Also, nonblocking reads/writes will return -EPIPE if the host-side
> >>>connection is not up.
> >>
> >>But we still essentially need to poll() for a host-side disconnected
> >>state, which is still racy since they may reconnect before we've
> >>done a read/write that would've generated the -EPIPE. It seems like
> >>what we really need is for the FD to be invalid from that point
> >>forward.
> >
> >This would go against (or abuse) a chardev interface.  It would
> >effectively treat a host-side port close as a hot-unplug event.
> >
> 
> Well, not a complete hot-unplug. The port would still be there, we'd
> just need to re-open it after a read()=0
> 
> Personally I'm not necessarily advocating we change the default
> behavior, but couldn't we support this as a separate mode?
> 
> -device virtserialport,inv_fd_on_host_close=1
> 
> or something along that line?

Yes, that would be reasonable.  Maybe even a monitor command to switch
the type?

> >>Also, I focused more on the guest-side connect/disconnect detection,
> >>but as Anthony mentioned I think the host side shares similar
> >>limitations as well. AFAIK once we connect to the chardev that FD
> >>remains valid until the connected process closes it, and so races
> >>with the guest side on detecting connect/disconnect events in a
> >>similar manner. For the host side it looks like virtio-console has
> >>guest_close/guest_open callbacks already that we could potentially
> >>use...seems like it's just a matter of tying them to the chardev...
> >>basically having virtio-serial's guest_close() result in a close()
> >>on the corresponding chardev connection's FD.
> >
> >Yes, this could be used.
> >
> >However, the problem with that will be that the chardev can't be
> >opened again (AFAIR) and a new chardev will have to be used.
> >
> 
> Hmm...yeah I was thinking more specifically about the socket
> chardev, where we can leave the listen_fd alone but close anything
> we've accept()'d prior to a guest-side disconnect. But isn't that
> enough?

That would restrict this to sockets started in server mode.

> Just add this option for chardevs where this actually makes
> sense? For instance:
> 
> -chardev socket,inv_fd_on_guest_close=1
> 
> Although, this wouldn't make sense if we're using the chardev for
> anything other than virtio-serial...so that flag makes more sense as
> a virtio-serial flag....but that virtio-serial flag only makes sense
> for particular chardevs...
> 
> I'm not sure what a good way to do this would be honestly...either
> would work it seems...but neither seems very intuitive.

Maybe a new char interface: qemu_chr_close_connected(vcon->chr).
Don't know though.  Will have to think about it.

> >So if this is done on both the sides, the race will be eliminated but
> >the expectation that a chardev port is just a serial port will be
> >broken and we'll try to bake in some connection layer on top of it.
> >That wasn't the original idea.  We could extend this, but a better way
> >to achieve this could be a library on either side to abstract these
> >details off.
> 
> As far as implementing a library...I tried layering something on top
> with previous applications to provide connect/disconnect detection
> over virtio-serial. Basically by proxing data to/from forwarding
> sockets on either side of the virtio-serial channel, and packetizing
> the virtio-serial stream into fixed-sized or length-encoded data
> packets and control packets that induced the kind of
> invalidate-fd-on-remote-close semantics we're discussing.
> 
> It seems to work ok in practice, but we run into trouble when the
> daemons managing the stream go down...mainly due to similar problems
> as those we're trying to address with the aforementioned changes.
> (for instance...forwarding daemon in guest sends partial packet,
> goes down, starts back up, sends control packet...but packet stream
> is out of sync now. only way around i've found around this so far is
> reserving bytes in the transport as sentinel values we can use to
> re-sync the stream..but then we have to do stuff like base64-encode
> binary data, which can be expensive both in terms of channel
> bandwidth as well as cpu)
> 
> If that's the kind of approach we'd need to take, I think optional
> flags like the ones mentioned above, or some other change to the
> transport, are still required. Unless there's a better approach to
> handling this outside of the actual transport that I'm missing?

Let's go with the flags you suggested above.

		Amit

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-03-01  6:17 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-22 22:40 [Qemu-devel] virtio-serial semantics for binary data and guest agents Michael Roth
2011-02-22 23:09 ` [Qemu-devel] " Anthony Liguori
2011-02-23  4:59 ` Amit Shah
2011-02-23 14:31   ` Michael Roth
2011-02-23 14:36     ` Michael Roth
2011-02-24 12:48     ` Amit Shah
2011-02-24 14:44       ` Anthony Liguori
2011-02-28 12:21         ` Amit Shah
2011-02-25 20:25       ` Michael Roth
2011-03-01  6:17         ` Amit Shah

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.