From: Christian Schoenebeck <linux_oss@crudebyte.com>
To: Dominique Martinet <asmadeus@codewreck.org>
Cc: Nikolay Kichukov <nikolay@oldum.net>,
	v9fs-developer@lists.sourceforge.net, netdev@vger.kernel.org,
	Eric Van Hensbergen <ericvh@gmail.com>,
	Latchesar Ionkov <lucho@ionkov.net>, Greg Kurz <groug@kaod.org>,
	Vivek Goyal <vgoyal@redhat.com>
Subject: Re: [PATCH v4 00/12] remove msize limit in virtio transport
Date: Mon, 24 Jan 2022 14:55:44 +0100
Message-ID: <5072414.NxdI05fPOR@silver>
In-Reply-To: <Ye6h8U/NJcx3ErHa@codewreck.org>

On Monday, 24 January 2022 13:56:17 CET Dominique Martinet wrote:
> Christian Schoenebeck wrote on Mon, Jan 24, 2022 at 12:57:35PM +0100:
> > > We're just starting a new development cycle for 5.18 while 5.17 is
> > > stabilizing, so this mostly depends on the ability to check if a msize
> > > given in parameter is valid as described in the first "STILL TO DO"
> > > point listed in the cover letter.
> > 
> > I will ping the Redhat guys on the open virtio spec issue this week. If
> > you want I can CC you Dominique on the discussion regarding the virtio
> > spec changes. It's a somewhat dry topic though.
> 
> I don't have much expertise on virtio stuff so don't think I'll bring
> much to the discussion, but always happy to fill my inbox :)
> It's always good to keep an eye on things at least.

OK, I'll CC you on the virtio spec discussion from now on. If it gets too
noisy, just raise your hand.

> > > I personally would be happy considering this series for this cycle with
> > > just a max msize of 4MB-8k and leave that further bump for later if
> > > we're sure qemu will handle it.
> > 
> > I haven't actually checked whether there was any old QEMU version that did
> > not support exceeding the virtio queue size. So it might be possible that
> > a very ancient QEMU version might error out if msize > (128 * 4096 =
> > 512k).
>
> Even if the spec gets implemented we need the default msize to work for
> reasonably older versions of qemu (at least a few years e.g. supported
> versions of debian/rhel can go quite a while back), and ideally have a
> somewhat sensible error if we go above some max...

Once the virtio spec changes are accepted and implemented, that would not be
an issue at all: virtio changes are always made with backward compatibility in
mind. The plan is to negotiate the new virtio feature at the virtio subsystem
level. If either side does not support it (QEMU too old or kernel too old),
msize would automatically be limited to the old virtio size/behaviour (a.k.a.
virtio "queue size"), which with QEMU as 9p server means a max. msize of 500k.
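
To illustrate the fallback, here is a minimal sketch. It is not code from the
patches and not an existing kernel or virtio API: the function name, the two
boolean flags and the exact value of the legacy cap are all made up for this
example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical cap implied by the old virtio "queue size" behaviour.
 * "500k" is the approximate figure for QEMU as 9p server; the exact
 * byte value used here is an assumption for illustration only.
 */
#define SKETCH_LEGACY_MAX_MSIZE ((size_t)500 * 1024)

/*
 * Return the msize that would be usable after feature negotiation.
 * Only if BOTH sides offer the (hypothetical) new feature may the
 * requested msize exceed the legacy cap.
 */
static size_t sketch_effective_msize(size_t requested,
                                     bool driver_has_feature,
                                     bool device_has_feature)
{
    if (driver_has_feature && device_has_feature)
        return requested;

    /* Old QEMU or old kernel: silently fall back to the old limit. */
    return requested < SKETCH_LEGACY_MAX_MSIZE
        ? requested : SKETCH_LEGACY_MAX_MSIZE;
}
```

So a 4M request would only be honoured with both sides updated; any other
combination ends up at the old limit instead of erroring out.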

Therefore I suggest just waiting for the virtio spec changes to be complete 
and implemented. People who care about performance should then just use an 
updated kernel *and* updated QEMU version to achieve msize > 500k. IMO, no 
need to risk breaking some old kernel/QEMU combination if nobody asked for it 
anyway, and if somebody does, then we could still add some kind of
--force-at-your-own-risk switch later on.

> > Besides QEMU, what other 9p server implementations are actually out there,
> > and how would they behave on this? A test on their side would definitely
> > be a good idea.
> 
> 9p virtio would only be qemu as far as I know.
> 
> For tcp/fd there are a few:
>  - https://github.com/chaos/diod (also supports rdma iirc, I don't have
> any hardware for rdma tests anymore though)
>  - https://github.com/nfs-ganesha/nfs-ganesha (also rdma)
>  - I was pointed at https://github.com/lionkov/go9p in a recent bug
> report
>  - http://repo.cat-v.org/libixp/ is also a server implementation I
> haven't tested with the linux client in a while but iirc it used to work
> 
> 
> I normally run some tests with qemu (virtio) and ganesha (tcp) before
> pushing to my linux-next branch, so we hopefully don't make too many
> assumptions that are specific to a server

Good to know, thanks!

> > > We're still seeing a boost for that and the smaller buffers for small
> > > messages will benefit all transport types, so that would get in in
> > > roughly two months for 5.18-rc1, then another two months for 5.18 to
> > > actually be released and start hitting production code.
> > > 
> > > 
> > > I'm not sure when exactly but I'll run some tests with it as well and
> > > redo a proper code review within the next few weeks, so we can get this
> > > in -next for a little while before the merge window.
> > 
> > Especially the buffer size reduction patches need a proper review. Those
> > changes can be tricky. So far I have not encountered any issues with tests
> > at least. OTOH these patches could be pushed through separately already,
> > no matter what the decision regarding the virtio issue will be.
> 
> Yes, I've had a first look and it's quite different from what I'd have
> done, but it didn't look bad and I just wanted to spend a bit more time
> on it.
> On a very high level I'm not fond of the logical duplication brought by
> deciding the size in a different function (duplicates format strings for
> checks and brings in a huge case with all formats) when we already have
> one function per call which could take the size decision directly
> without going through the format varargs, but it's not like the protocol
> has evolved over the past ten years so it's not really a problem -- I
> just need to get down to it and check it all matches up.

Yeah, I know. The advantage though is that this separate function/switch-case
approach merges many message types, so it is actually less code. I also tried
to automate code sanity with various BUG_ON() calls, to prevent the two places
from accidentally drifting apart with future changes.
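
As a simplified sketch of the idea (this is not the code from the patches; the
enum, the function name and the selection of message types are made up, and
assert() merely stands in for BUG_ON()). It assumes 9p2000.L, where an error
reply is an Rlerror carrying a 4-byte error code, so every reply buffer must
have room for at least that:

```c
#include <assert.h>
#include <stddef.h>

enum {
    SK_HDR_SZ     = 4 + 1 + 2,      /* size[4] type[1] tag[2] */
    SK_QID_SZ     = 13,             /* qid[13] */
    SK_MAX_WELEM  = 16,             /* max. path elements per walk */
    SK_RLERROR_SZ = SK_HDR_SZ + 4,  /* header + ecode[4] */
};

/* Hypothetical subset of reply types, for illustration only. */
enum sketch_msg { SK_RFLUSH, SK_RCLUNK, SK_RREMOVE, SK_RWALK, SK_RREAD };

/* Pick a reply buffer size per message type, never above msize. */
static size_t sketch_reply_buf_size(enum sketch_msg type, size_t msize)
{
    size_t sz;

    switch (type) {
    /* Header-only replies share one case, which keeps the switch short. */
    case SK_RFLUSH:
    case SK_RCLUNK:
    case SK_RREMOVE:
        sz = SK_HDR_SZ;
        break;
    case SK_RWALK:
        /* header + nwqid[2] + nwqid * qid[13] */
        sz = SK_HDR_SZ + 2 + SK_MAX_WELEM * SK_QID_SZ;
        break;
    default:
        /* Payload size not known in advance: keep the full msize. */
        sz = msize;
        break;
    }

    /* The server may answer with an error reply instead. */
    if (sz < SK_RLERROR_SZ)
        sz = SK_RLERROR_SZ;

    /* Sanity check, standing in for BUG_ON(). */
    assert(sz >= SK_HDR_SZ);

    return sz < msize ? sz : msize;
}
```

With msize=8192 this would yield 11 bytes for an Rclunk buffer and 217 bytes
for Rwalk, while Rread still gets the full 8192.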

> I also agree it's totally orthogonal to the virtio size extension so if
> you want to wait for the new virtio standard I'll focus on this part
> first.

IMO it would make sense to give these message size reduction patches priority 
for now, as long as the virtio spec changes are incomplete.

One more thing: so far I have just concentrated on behavioural aspects and
testing. What I have completely neglected so far is code style. If you want, I
can send a v5 this week with code style (and only code style) fixed, if that
helps you keep diff noise low for your review.

Best regards,
Christian Schoenebeck


