qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Leonardo Bras Soares Passos <leobras@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	"Jason Wang" <jasowang@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v4 2/3] QIOChannelSocket: Implement io_writev_zerocopy & io_flush_zerocopy for CONFIG_LINUX
Date: Wed, 3 Nov 2021 17:50:13 -0300	[thread overview]
Message-ID: <CAJ6HWG7g+5vEZGPLDCBLGqgSbaDmeS6MAeYwZfnJNXsqmhgA6Q@mail.gmail.com> (raw)
In-Reply-To: <87o872k8et.fsf@secure.mitica>

On Tue, Nov 2, 2021 at 10:13 AM Juan Quintela <quintela@redhat.com> wrote:
>
> Leonardo Bras <leobras@redhat.com> wrote:
> > For CONFIG_LINUX, implement the new optional callbacks io_write_zerocopy and
> > io_flush_zerocopy on QIOChannelSocket, but enables it only when MSG_ZEROCOPY
> > feature is available in the host kernel, which is checked on
> > qio_channel_socket_connect_sync()
> >
> > qio_channel_socket_writev() contents were moved to a helper function
> > qio_channel_socket_writev_flags() which accepts an extra argument for flags.
> > (This argument is passed directly to sendmsg().
> >
> > The above helper function is used to implement qio_channel_socket_writev(),
> > with flags = 0, keeping it's behavior unchanged, and
> > qio_channel_socket_writev_zerocopy() with flags = MSG_ZEROCOPY.
> >
> > qio_channel_socket_flush_zerocopy() was implemented by counting how many times
> > sendmsg(...,MSG_ZEROCOPY) was sucessfully called, and then reading the
> > socket's error queue, in order to find how many of them finished sending.
> > Flush will loop until those counters are the same, or until some error ocurs.
> >
> > A new function qio_channel_socket_poll() was also created in order to avoid
> > busy-looping recvmsg() in qio_channel_socket_flush_zerocopy() while waiting for
> > updates in socket's error queue.
> >
> > Notes on using writev_zerocopy():
> > 1: Buffer
> > - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying,
> > some caution is necessary to avoid overwriting any buffer before it's sent.
> > If something like this happen, a newer version of the buffer may be sent instead.
> > - If this is a problem, it's recommended to call flush_zerocopy() before freeing
> > or re-using the buffer.
> >
> > 2: Locked memory
> > - When using MSG_ZERCOCOPY, the buffer memory will be locked after queued, and
> > unlocked after it's sent.
> > - Depending on the size of each buffer, and how often it's sent, it may require
> > a larger amount of locked memory than usually available to non-root user.
> > - If the required amount of locked memory is not available, writev_zerocopy
> > will return an error, which can abort an operation like migration,
> > - Because of this, when an user code wants to add zerocopy as a feature, it
> > requires a mechanism to disable it, so it can still be acessible to less
> > privileged users.
> >
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
>
> I think this patch would be easier to review if you split in:
> - add the flags parameter left and right
> - add the meat of what you do with the flags.

ok, I will try to split it like this.

>
> > +++ b/include/io/channel-socket.h
> > @@ -47,6 +47,8 @@ struct QIOChannelSocket {
> >      socklen_t localAddrLen;
> >      struct sockaddr_storage remoteAddr;
> >      socklen_t remoteAddrLen;
> > +    ssize_t zerocopy_queued;
> > +    ssize_t zerocopy_sent;
>
> I am not sure if this is good/bad to put it inside
>
> #ifdef CONFIG_LINUX
>
> basically everything else uses it.

I think it makes sense that zerocopy_{sent,queued} is inside
CONFIG_LINUX as no one else is using zerocopy.

>
> > +#ifdef CONFIG_LINUX
> > +    ret = qemu_setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v));
> > +    if (ret < 0) {
> > +        /* Zerocopy not available on host */
> > +        return 0;
> > +    }
> > +
> > +    qio_channel_set_feature(QIO_CHANNEL(ioc),
> > +                            QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY);
>
> As Peter said, you shouldn't fail if the feature is not there.
>
> But on the other hand, on patch 3, you don't check that this feature
> exist when you allow to enable multifd_zerocopy.

This had a major rework on v5, but I will make sure this suggestion is
addressed before releasing it.

>
> > +#endif
> > +
> >      return 0;
> >  }
>
>
> >          error_setg_errno(errp, errno,
> >                           "Unable to write to socket");
>
> Why do you split this in two lines?
>
> Yes, I know that this file is not consistent either on how the do with
> this, sometimes one line, otherwise more.

IIUC, this lines have no '+' in them, so they are not my addition.

>
> I don't know how ZEROCPY works at kernel level to comment on rest of the
> patch.
>
> Later, Juan.

Thanks for reviewing Juan.

Best regards,
Leo



  reply	other threads:[~2021-11-03 20:50 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-09  7:56 [PATCH v4 0/3] MSG_ZEROCOPY for multifd Leonardo Bras
2021-10-09  7:56 ` [PATCH v4 1/3] QIOChannel: Add io_writev_zerocopy & io_flush_zerocopy callbacks Leonardo Bras
2021-10-11 19:17   ` Eric Blake
2021-10-11 19:38     ` Leonardo Bras Soares Passos
2021-10-11 20:45       ` Eric Blake
2021-10-11 20:59         ` Leonardo Bras Soares Passos
2021-10-13  6:07   ` Peter Xu
2021-10-13  6:32     ` Peter Xu
2021-10-27  6:07       ` Leonardo Bras Soares Passos
2021-10-27  6:15         ` Peter Xu
2021-10-27  6:31           ` Leonardo Bras Soares Passos
2021-10-09  7:56 ` [PATCH v4 2/3] QIOChannelSocket: Implement io_writev_zerocopy & io_flush_zerocopy for CONFIG_LINUX Leonardo Bras
2021-10-11 19:27   ` Eric Blake
2021-10-11 19:44     ` Leonardo Bras Soares Passos
2021-10-13  6:18   ` Peter Xu
2021-10-27  6:30     ` Leonardo Bras Soares Passos
2021-11-02 13:13   ` Juan Quintela
2021-11-03 20:50     ` Leonardo Bras Soares Passos [this message]
2021-10-09  7:56 ` [PATCH v4 3/3] multifd: Implement zerocopy write in multifd migration (multifd-zerocopy) Leonardo Bras
2021-10-11 19:31   ` Eric Blake
2021-10-11 19:56     ` Leonardo Bras Soares Passos
2021-10-12  5:53       ` Markus Armbruster
2021-10-28  1:56         ` Leonardo Bras Soares Passos
2021-10-28  4:30           ` Markus Armbruster
2021-10-28  4:37             ` Leonardo Bras Soares Passos
2021-10-13  6:23   ` Peter Xu
2021-10-27  6:47     ` Leonardo Bras Soares Passos
2021-10-27  7:06       ` Peter Xu
2021-10-13  6:26   ` Peter Xu
2021-10-27  6:50     ` Leonardo Bras Soares Passos
2021-11-02 12:32   ` Juan Quintela
2021-11-03 21:29     ` Leonardo Bras Soares Passos
2021-11-03 23:24       ` Juan Quintela
2021-11-04  3:43         ` Leonardo Bras Soares Passos
2022-04-14  4:00     ` Leonardo Bras Soares Passos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJ6HWG7g+5vEZGPLDCBLGqgSbaDmeS6MAeYwZfnJNXsqmhgA6Q@mail.gmail.com \
    --to=leobras@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).