linux-cifs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: David Howells <dhowells@redhat.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Jeff Layton <jlayton@redhat.com>,
	Steve French <sfrench@samba.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Mike Marshall <hubcap@omnibond.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Shyam Prasad N <nspmangalore@gmail.com>,
	linux-cachefs@redhat.com, linux-afs@lists.infradead.org,
	"open list:NFS, SUNRPC, AND..." <linux-nfs@vger.kernel.org>,
	CIFS <linux-cifs@vger.kernel.org>,
	ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net,
	devel@lists.orangefs.org, Linux-MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Canvassing for network filesystem write size vs page size
Date: Thu, 5 Aug 2021 10:27:05 -0700	[thread overview]
Message-ID: <CAHk-=wjyEk9EuYgE3nBnRCRd_AmRYVOGACEjt0X33QnORd5-ig@mail.gmail.com> (raw)
In-Reply-To: <1219713.1628181333@warthog.procyon.org.uk>

On Thu, Aug 5, 2021 at 9:36 AM David Howells <dhowells@redhat.com> wrote:
>
> Some network filesystems, however, currently keep track of which byte ranges
> are modified within a dirty page (AFS does; NFS seems to also) and only write
> out the modified data.

NFS definitely does. I haven't used NFS in two decades, but I worked
on some of the code (read: I made nfs use the page cache both for
reading and writing) back in my Transmeta days, because NFSv2 was the
default filesystem setup back then.

See fs/nfs/write.c, although I have to admit that I don't recognize
that code any more.

It's fairly important to be able to do streaming writes without having
to read the old contents for some loads. And read-modify-write cycles
are death for performance, so you really want to coalesce writes until
you have the whole page.

That said, I suspect it's also *very* filesystem-specific, to the
point where it might not be worth trying to do in some generic manner.

In particular, NFS had things like interesting credential issues, so
if you have multiple concurrent writers that used different 'struct
file *' to write to the file, you can't just mix the writes. You have
to sync the writes from one writer before you start the writes for the
next one, because one might succeed and the other not.

So you can't just treat it as some random "page cache with dirty byte
extents". You really have to be careful about credentials, timeouts,
etc, and the pending writes have to keep a fair amount of state
around.

At least that was the case two decades ago.

[ goes off and looks. See "nfs_write_begin()" and friends in
fs/nfs/file.c for some of the examples of these things, althjough it
looks like the code is less aggressive about avoding the
read-modify-write case than I thought I remembered, and only does it
for write-only opens ]

               Linus

            Linus

  reply	other threads:[~2021-08-05 17:27 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <YQv+iwmhhZJ+/ndc@casper.infradead.org>
     [not found] ` <YQvpDP/tdkG4MMGs@casper.infradead.org>
     [not found]   ` <YQvbiCubotHz6cN7@casper.infradead.org>
     [not found]     ` <1017390.1628158757@warthog.procyon.org.uk>
     [not found]       ` <1170464.1628168823@warthog.procyon.org.uk>
     [not found]         ` <1186271.1628174281@warthog.procyon.org.uk>
2021-08-05 16:35           ` Canvassing for network filesystem write size vs page size David Howells
2021-08-05 17:27             ` Linus Torvalds [this message]
2021-08-05 17:43               ` Trond Myklebust
2021-08-05 22:11               ` Matthew Wilcox
2021-08-06 13:42               ` David Howells
2021-08-06 14:17                 ` Matthew Wilcox
2021-08-06 15:04                 ` David Howells
2021-08-05 17:52             ` Adam Borowski
2021-08-05 18:50             ` Jeff Layton
2021-08-05 23:47             ` Matthew Wilcox
2021-08-06 13:44             ` David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wjyEk9EuYgE3nBnRCRd_AmRYVOGACEjt0X33QnORd5-ig@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=anna.schumaker@netapp.com \
    --cc=asmadeus@codewreck.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=devel@lists.orangefs.org \
    --cc=dhowells@redhat.com \
    --cc=hubcap@omnibond.com \
    --cc=jlayton@redhat.com \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-cachefs@redhat.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=nspmangalore@gmail.com \
    --cc=sfrench@samba.org \
    --cc=trond.myklebust@hammerspace.com \
    --cc=v9fs-developer@lists.sourceforge.net \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).