linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: David Howells <dhowells@redhat.com>
Cc: linux-afs@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [GIT PULL] afs: Fixes
Date: Sat, 25 Nov 2017 12:55:18 -1000	[thread overview]
Message-ID: <CA+55aFzHMTdqefmtxcKjiGDC=HsKiimfuDQxZZ-hbu-6f1nL3g@mail.gmail.com> (raw)
In-Reply-To: <23134.1511649343@warthog.procyon.org.uk>

On Sat, Nov 25, 2017 at 12:35 PM, David Howells <dhowells@redhat.com> wrote:
>
> Doesn't clear_page_dirty_for_io() write-protect the PTE for the page to be
> written out, in which case ->page_mkwrite() will get called again before the
> page is redirtied?

No, it literally just sets the dirty bit (and does accounting).

But I think you may be right that we always write-protect he page when
we move the dirty bit from the page tables to the 'struct page'
(page_mkclean_one()).

However, even when you do that, the page can be writable in other
mappings. At least fork(), for example, only clears the dirty bit,
doesn't mark it write-protected.

So there is some rate-limiting of dirty pages, but I do not believe
that we've ever really *serialized* writes.

>>  (b) can cause some really nasty latency issues
>
> True, but I think the most common case is a file being opened, written start
> to finish and then closed.  Actually, the worst-handled thing I've seen is a
> shell script appending a bunch of things to a file because ->flush() syncs the
> file each time it is closed:-/
>
> What would you recommend instead?  I'm currently trying and keep track of what
> needs to be written so that I only write what's changed to the server, rather
> than writing only whole pages.

I don't think that what you are doing is necessarily wrong, I'm just
pointing out that you may still see mmap'ed pages being modified
concurrently with the actual IO, and that this will potentially mean
(for example) that things like checksums won't be reliably unless you
do the checksum as you copy the data to a network packet or something.

Of course, if that kind of inconsistency happens, a later write-back
will also happen, and eventually fix it. So the server may see
temporarily 'wrong' data, but it won't be final.

I just hope that the inconsistency isn't fatal to the afs client or
server code. For example, if you retry writes forever when a checksum
were to not match the data, that would be bad.

And yes, this can be

 (a) really hard  to trigger in practice

 (b) very hard to debug due to a combination of very specific timing
and behavior.

so I just wanted to bring this up as a potential issue, not
necessarily as a big problem.

             Linus

  parent reply	other threads:[~2017-11-25 22:55 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-24 14:22 [GIT PULL] afs: Fixes David Howells
2017-11-25 18:05 ` Linus Torvalds
2017-11-25 22:35 ` David Howells
2017-11-25 22:48   ` Dave Chinner
2017-11-25 22:55   ` Linus Torvalds [this message]
2017-11-25 23:19   ` David Howells
  -- strict thread matches above, loose matches on Subject: below --
2020-10-29 14:07 [GIT PULL] afs fixes David Howells
2020-10-29 17:31 ` pr-tracker-bot
2020-04-13 14:50 [GIT PULL] afs: Fixes David Howells
2020-04-13 15:13 ` David Howells
2020-04-14 19:05 ` pr-tracker-bot
2019-08-14 14:18 David Howells
2019-08-14 22:35 ` pr-tracker-bot
2019-08-14 14:15 David Howells
2019-06-26 13:39 [GIT PULL] AFS fixes David Howells
2019-06-28  0:45 ` pr-tracker-bot
2019-06-26  8:50 David Howells
2019-06-26 13:38 ` David Howells
2018-05-14 21:25 [GIT PULL] afs: Fixes David Howells
2017-03-17 15:29 [GIT PULL] AFS fixes David Howells
2017-03-17 19:23 ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+55aFzHMTdqefmtxcKjiGDC=HsKiimfuDQxZZ-hbu-6f1nL3g@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=linux-afs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).