From: Leonardo Bras Soares Passos <leobras@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
"Jason Wang" <jasowang@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
"Peter Xu" <peterx@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v4 2/3] QIOChannelSocket: Implement io_writev_zerocopy & io_flush_zerocopy for CONFIG_LINUX
Date: Wed, 3 Nov 2021 17:50:13 -0300
Message-ID: <CAJ6HWG7g+5vEZGPLDCBLGqgSbaDmeS6MAeYwZfnJNXsqmhgA6Q@mail.gmail.com>
In-Reply-To: <87o872k8et.fsf@secure.mitica>
On Tue, Nov 2, 2021 at 10:13 AM Juan Quintela <quintela@redhat.com> wrote:
>
> Leonardo Bras <leobras@redhat.com> wrote:
> > For CONFIG_LINUX, implement the new optional callbacks io_writev_zerocopy and
> > io_flush_zerocopy on QIOChannelSocket, but enable them only when the MSG_ZEROCOPY
> > feature is available in the host kernel, which is checked in
> > qio_channel_socket_connect_sync().
> >
> > The contents of qio_channel_socket_writev() were moved to a helper function,
> > qio_channel_socket_writev_flags(), which accepts an extra argument for flags.
> > (This argument is passed directly to sendmsg().)
> >
> > The above helper function is used to implement qio_channel_socket_writev(),
> > with flags = 0, keeping its behavior unchanged, and
> > qio_channel_socket_writev_zerocopy() with flags = MSG_ZEROCOPY.
> >
> > qio_channel_socket_flush_zerocopy() was implemented by counting how many times
> > sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
> > socket's error queue, in order to find how many of them finished sending.
> > Flush will loop until those counters are the same, or until some error occurs.
> >
> > A new function qio_channel_socket_poll() was also created in order to avoid
> > busy-looping recvmsg() in qio_channel_socket_flush_zerocopy() while waiting for
> > updates in socket's error queue.
> >
> > Notes on using writev_zerocopy():
> > 1: Buffer
> > - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying,
> > some caution is necessary to avoid overwriting any buffer before it's sent.
> > If this happens, a newer version of the buffer may be sent instead.
> > - If this is a problem, it's recommended to call flush_zerocopy() before freeing
> > or re-using the buffer.
> >
> > 2: Locked memory
> > - When using MSG_ZEROCOPY, the buffer memory will be locked after it is queued,
> > and unlocked after it's sent.
> > - Depending on the size of each buffer, and how often it's sent, it may require
> > a larger amount of locked memory than is usually available to a non-root user.
> > - If the required amount of locked memory is not available, writev_zerocopy
> > will return an error, which can abort an operation like migration.
> > - Because of this, when user code wants to add zerocopy as a feature, it
> > requires a mechanism to disable it, so it can still be accessible to less
> > privileged users.
> >
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
>
> I think this patch would be easier to review if you split it into:
> - add the flags parameter left and right
> - add the meat of what you do with the flags.
ok, I will try to split it like this.
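For reference, a minimal sketch of what the first half of that split could
look like: one helper that takes a flags argument and forwards it to
sendmsg(), with the plain and zerocopy write paths as thin wrappers around it.
The sketch_* names are hypothetical stand-ins, not the functions from the
patch, and the code works on a raw fd instead of a QIOChannelSocket. The
MSG_ZEROCOPY fallback value is only an assumption for older libc headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000 /* assumed fallback for older libc headers */
#endif

/* Hypothetical helper: the only place that calls sendmsg(). */
ssize_t sketch_writev_flags(int fd, const struct iovec *iov, size_t niov,
                            int flags)
{
    struct msghdr msg;

    assert(iov != NULL && niov > 0);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = (struct iovec *)iov; /* sendmsg() does not modify it */
    msg.msg_iovlen = niov;

    /* The flags argument is passed straight through to sendmsg(). */
    return sendmsg(fd, &msg, flags);
}

/* Plain write: flags = 0, so behavior is unchanged. */
ssize_t sketch_writev(int fd, const struct iovec *iov, size_t niov)
{
    return sketch_writev_flags(fd, iov, niov, 0);
}

/* Zerocopy write: same helper, flags = MSG_ZEROCOPY. */
ssize_t sketch_writev_zerocopy(int fd, const struct iovec *iov, size_t niov)
{
    return sketch_writev_flags(fd, iov, niov, MSG_ZEROCOPY);
}
```

With this shape, the first patch would only add the flags plumbing (every
caller passing 0), and the second one would add the MSG_ZEROCOPY wrapper and
the flush logic.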
>
> > +++ b/include/io/channel-socket.h
> > @@ -47,6 +47,8 @@ struct QIOChannelSocket {
> > socklen_t localAddrLen;
> > struct sockaddr_storage remoteAddr;
> > socklen_t remoteAddrLen;
> > + ssize_t zerocopy_queued;
> > + ssize_t zerocopy_sent;
>
> I am not sure if this is good/bad to put it inside
>
> #ifdef CONFIG_LINUX
>
> basically everything else uses it.
I think it makes sense that zerocopy_{sent,queued} is inside
CONFIG_LINUX as no one else is using zerocopy.
>
> > +#ifdef CONFIG_LINUX
> > + ret = qemu_setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v));
> > + if (ret < 0) {
> > + /* Zerocopy not available on host */
> > + return 0;
> > + }
> > +
> > + qio_channel_set_feature(QIO_CHANNEL(ioc),
> > + QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY);
>
> As Peter said, you shouldn't fail if the feature is not there.
>
> But on the other hand, in patch 3, you don't check that this feature
> exists when you allow enabling multifd_zerocopy.
This had a major rework in v5, but I will make sure this suggestion is
addressed before releasing it.
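A minimal sketch of the two points above, assuming a hypothetical
sketch_channel struct in place of QIOChannelSocket: the connect-time probe
records whether SO_ZEROCOPY could be set but never fails on its own, and the
code that enables the zerocopy mode is the one that refuses when the feature
is missing. None of these names come from the patch, and a plain setsockopt()
is used here instead of qemu_setsockopt().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the channel object. */
struct sketch_channel {
    int fd;
    bool write_zerocopy; /* plays the role of the channel feature bit */
};

/* Returns true only if the host accepted SO_ZEROCOPY on this socket. */
bool sketch_probe_zerocopy(int fd)
{
#ifdef SO_ZEROCOPY
    int v = 1;
    return setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v)) == 0;
#else
    (void)fd;
    return false; /* headers without SO_ZEROCOPY: treat as unavailable */
#endif
}

/* Connect-time setup: a missing feature is recorded, not reported as error. */
int sketch_channel_setup(struct sketch_channel *c, int fd)
{
    assert(c != NULL);
    c->fd = fd;
    c->write_zerocopy = sketch_probe_zerocopy(fd);
    return 0;
}

/* Caller side: refuse to enable the zerocopy mode without the feature. */
int sketch_enable_zerocopy_mode(const struct sketch_channel *c)
{
    assert(c != NULL);
    if (!c->write_zerocopy) {
        errno = ENOTSUP;
        return -1;
    }
    return 0;
}
```

This keeps a plain connection working on hosts without MSG_ZEROCOPY, while a
user who explicitly asks for zerocopy gets an error up front instead of a
failure in the middle of the operation.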
>
> > +#endif
> > +
> > return 0;
> > }
>
>
> > error_setg_errno(errp, errno,
> > "Unable to write to socket");
>
> Why do you split this in two lines?
>
> Yes, I know that this file is not consistent either in how it does
> this: sometimes one line, sometimes more.
IIUC, these lines have no '+' in them, so they are not my addition.
>
> I don't know how MSG_ZEROCOPY works at kernel level to comment on the rest
> of the patch.
>
> Later, Juan.
Thanks for reviewing, Juan.
Best regards,
Leo