qemu-devel.nongnu.org archive mirror
From: Stefano Brivio <sbrivio@redhat.com>
To: Ralph Schmieder <ralph.schmieder@gmail.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>, qemu-devel@nongnu.org
Subject: Re: socket.c added support for unix domain socket datagram transport
Date: Tue, 27 Apr 2021 23:51:52 +0200
Message-ID: <20210427235152.6e1e080a@elisabeth>
In-Reply-To: <2DC6F891-4F28-4044-A055-0CDAB45A3C24@gmail.com>

On Mon, 26 Apr 2021 13:14:48 +0200
Ralph Schmieder <ralph.schmieder@gmail.com> wrote:

> > On Apr 23, 2021, at 18:39, Stefano Brivio <sbrivio@redhat.com>
> > wrote:
> > 
> > [...]
> >
> > Okay, so it doesn't seem to fit your case, but this specific point
> > is where you actually have a small advantage using a stream-oriented
> > socket. If you receive a packet and have a smaller receive buffer,
> > you can read the length of the packet from the vnet header and then
> > read the rest of the packet at a later time.
> > 
> > With a datagram-oriented socket, you would need to know the maximum
> > packet size in advance, and use a receive buffer that's large
> > enough to contain it, because if you don't, you'll discard data.  
> 
> For me, the maximum packet size is a jumbo frame (e.g. 9x1024) anyway
> -- everything must fit into an atomic write of that size.

Well, the day you want to do some batching... ;) but sure, I see your
point.
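
To make the difference concrete, here is a minimal sketch (not QEMU
code). It assumes a hypothetical 4-byte big-endian length prefix on the
stream side, standing in for whatever header carries the packet length.
The function names are made up for illustration.

```python
import socket
import struct


def dgram_short_read(frame, bufsize):
    """Send `frame` plus a second datagram, read both with a small buffer."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with a, b:
        a.send(frame)
        a.send(b"next")
        # One recv() consumes one whole datagram: bytes that do not fit
        # into `bufsize` are discarded, not kept for a later read.
        first = b.recv(bufsize)
        second = b.recv(bufsize)
        return first, second


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def stream_chunked_read(frame, bufsize):
    """Send a length-prefixed frame, read it back in bufsize-sized pieces."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        # Hypothetical framing: 4-byte network-order length, then payload.
        a.sendall(struct.pack("!I", len(frame)) + frame)
        (length,) = struct.unpack("!I", recv_exact(b, 4))
        parts = []
        remaining = length
        while remaining:
            # The rest of the frame stays queued until we ask for it.
            chunk = b.recv(min(bufsize, remaining))
            if not chunk:
                raise ConnectionError("peer closed the connection")
            parts.append(chunk)
            remaining -= len(chunk)
        return b"".join(parts)


if __name__ == "__main__":
    frame = bytes(2048)
    first, second = dgram_short_read(frame, 512)
    print(len(first), second)                     # truncated datagram
    print(len(stream_chunked_read(frame, 512)))   # full frame recovered
```

With the datagram pair, the tail of the first frame is gone once the
short read returns. With the stream pair, the reader can pick it up
piece by piece.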

> > [...]
> > 
> > On a side note, I wonder why you need two named sockets instead of
> > one -- I mean, they're bidirectional...  
> 
> Hmm... each peer needs to send unsolicited frames/packets to the
> other end... and thus needs to bind to their socket.  Pretty much for
> the same reason as the UDP transport requires you to specify a local
> and a remote 5-tuple.  Even though for AF_INET, the local port does
> not have to be specified, the OS would assign an ephemeral port to
> make it unique. Am I missing something?

I see your point now. Well, I think it's different from the AF_INET case
due to the way AF_UNIX works: UNIX domain sockets don't necessarily
need to make the endpoint known or visible, see a more detailed
explanation at:
	https://comp.unix.admin.narkive.com/AhAOKP1s/lsof-find-both-endpoints-of-a-unix-socket

Even though, nowadays on Linux:

$ nc -luU my_path & (sleep 1; nc.openbsd -uU my_path & lsof +E -aUc nc)
[1] 373285
COMMAND      PID    USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
nc        373285 sbrivio    3u  unix 0x000000004076431a      0t0 3957568 my_path type=DGRAM ->INO=3956394 373288,nc.openbs,4u
nc.openbs 373288 sbrivio    4u  unix 0x00000000f5b2e2e1      0t0 3956394 /tmp/nc.XXXXC0whUu type=DGRAM ->INO=3957568 373285,nc,3u

for datagram sockets, the endpoint is exported, and lsof can report the
peer endpoint of "my_path" here (-luU binds to a UNIX domain datagram
socket, -uU connects to it). With a stream socket, by the way:

$ nc -lU my_path & (sleep 1; nc.openbsd -U my_path & lsof +E -aUc nc)
[1] 375445
COMMAND      PID    USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
nc        375445 sbrivio    3u  unix 0x0000000053abf57c      0t0 3969787 my_path type=STREAM
nc        375445 sbrivio    4u  unix 0x000000001960c1ef      0t0 3969788 my_path type=STREAM ->INO=3970624 375448,nc.openbs,3u
nc.openbs 375448 sbrivio    3u  unix 0x000000000538fa63      0t0 3970624 type=STREAM ->INO=3969788 375445,nc,4u

so I think the second named socket should be optional. Even with
datagram sockets, just like in the example above (I'm not suggesting
that you do this, it's just another possible choice), only one peer
needs to bind to a named socket, and yet they can exchange data.
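
A minimal sketch of that arrangement follows, mirroring what the lsof
output shows for nc.openbsd. Only the listener's path ("my_path") has
to be agreed upon. The connecting side binds to a throwaway name so
that the listener has an address to reply to. The function name and
"client.tmp" are made up for illustration.

```python
import os
import socket
import tempfile


def exchange(workdir):
    """Datagram round trip where only the server path is known in advance."""
    server_path = os.path.join(workdir, "my_path")
    # Throwaway name, comparable to the /tmp/nc.XXXX... path in the
    # lsof output; the server learns it from recvfrom(), not from config.
    client_path = os.path.join(workdir, "client.tmp")

    server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    with server, client:
        server.bind(server_path)
        client.bind(client_path)
        client.connect(server_path)

        client.send(b"ping")
        request, peer = server.recvfrom(64)

        # Reply to whatever address the datagram came from.
        server.sendto(b"pong", peer)
        reply = client.recv(64)
        return request, reply


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(exchange(workdir))
```

On Linux, unix(7) also describes autobind: a bind() call with addrlen
set to sizeof(sa_family_t) gets an abstract address, so no temporary
file is needed. That part is not shown here.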

> Another thing: on Windows, there's an AF_UNIX/SOCK_STREAM
> implementation... So, technically it should be possible to use that
> code path on Windows, too.  Not a Windows guy, though... So, can't
> say whether it would simply work or not:
> 
> https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/

Thanks for the pointer. I can't test this, so I wouldn't remove that
#ifndef, but perhaps I could add a link to this, in case somebody needs
it and stumbles upon this code path.

-- 
Stefano
