linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Amir Goldstein <amir73il@gmail.com>,
	Sargun Dhillon <sargun@sargun.me>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Vivek Goyal <vgoyal@redhat.com>,
	overlayfs <linux-unionfs@vger.kernel.org>,
	Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>,
	NeilBrown <neilb@suse.com>, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH v3] errseq: split the ERRSEQ_SEEN flag into two new flags
Date: Sat, 19 Dec 2020 08:25:19 -0500	[thread overview]
Message-ID: <f622563d89515a834c299a07012f500e4c026e98.camel@kernel.org> (raw)
In-Reply-To: <f84f3259d838f132029576b531d81525abd4e1b8.camel@kernel.org>

On Sat, 2020-12-19 at 07:53 -0500, Jeff Layton wrote:
> On Sat, 2020-12-19 at 06:13 +0000, Matthew Wilcox wrote:
> > On Thu, Dec 17, 2020 at 10:00:37AM -0500, Jeff Layton wrote:
> > > Overlayfs's volatile mounts want to be able to sample an error for their
> > > own purposes, without preventing a later opener from potentially seeing
> > > the error.
> > 
> > umm ... can't they just copy the errseq_t they're interested in, followed
> > by calling errseq_check() later?
> > 
> 
> They don't want the sampling for the volatile mount to prevent later
> openers from seeing an error that hasn't yet been reported.
> 
> If they copy the errseq_t (or just do an errseq_sample), and then follow
> it with a errseq_check_and_advance then the SEEN bit will end up being
> set and a later opener wouldn't see the error.
> 
> Aside from that though, I think this patch clarifies things a bit since
> the SEEN flag currently means two different things:
> 
> 1/ do I need to increment the counter when recording another error?
> 
> 2/ do I need to report this error to new samplers (at open time)
> 
> That was ok before, since we those conditions were always changed
> together, but with the overlayfs volatile mount use-case, it no longer
> does.
> 
> > actually, isn't errseq_check() buggy in the face of multiple
> > watchers?  consider this:
> > 
> > worker.es starts at 0
> > t2.es = errseq_sample(&worker.es)
> > errseq_set(&worker.es, -EIO)
> > t1.es = errseq_sample(&worker.es)
> > t2.err = errseq_check_and_advance(&es, t2.es)
> > 	** this sets ERRSEQ_SEEN **
> > t1.err = errseq_check(&worker.es, t1.es)
> > 	** reports an error, even though the only change is that
> > 	   ERRSEQ_SEEN moved **.
> > 
> > i think errseq_check() should be:
> > 
> > 	if (likely(cur | ERRSEQ_SEEN) == (since | ERRSEQ_SEEN))
> > 		return 0;
> > 
> > i'm not yet convinced other changes are needed to errseq.  but i am
> > having great trouble understanding exactly what overlayfs is trying to do.
> 
> I think you're right on errseq_check. I'll plan to do a patch to fix
> that up as well.
> 

I take it back. This isn't a problem for a couple of (non-obvious)
reasons:

1/ any "real" sample will either have the SEEN bit set or be 0. In the
first case, errseq_check will return true (which is correct). In the
second any change from what you sampled will have to have been a
recorded error.

2/ ...but even if that weren't the case and you got a false positive
from errseq_check, all of the existing callers call
errseq_check_and_advance just afterward, which handles this correctly.

Either way, splitting the flag in two means that we do need to take the
flag bits into account, so errseq_same masks them off before comparing.

> I too am having a bit of trouble understanding all of the nuances here.
> My current understanding is that it depends on the "volatility" of the
> mount:
> 
> normal (non-volatile): they basically want to be able to track errors as
> if the files were being opened on the upper layer. For this case I think
> they should aim to just do all of the error checking against the upper
> sb and ignore the overlayfs s_wb_err field. This does mean pushing the
> errseq_check_and_advance down into the individual filesystems in some
> fashion though.
> 
> volatile: they want to sample at mount time and always return an error
> to syncfs if there has been another error since the original sample
> point. This sampling should also not affect later openers on the upper
> layer (or on other overlayfs mounts).
> 
> I'm not 100% clear on whether I understand both use-cases correctly
> though.

-- 
Jeff Layton <jlayton@kernel.org>


  reply	other threads:[~2020-12-19 13:26 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-17 15:00 [PATCH v3] errseq: split the ERRSEQ_SEEN flag into two new flags Jeff Layton
2020-12-17 20:35 ` Sargun Dhillon
2020-12-17 21:18   ` Jeff Layton
2020-12-18 23:44     ` Sargun Dhillon
2020-12-19  1:03       ` Jeff Layton
2020-12-19  9:09         ` Amir Goldstein
2020-12-19  6:13 ` Matthew Wilcox
2020-12-19 12:53   ` Jeff Layton
2020-12-19 13:25     ` Jeff Layton [this message]
2020-12-19 15:33     ` Matthew Wilcox
2020-12-19 15:49       ` Jeff Layton
2020-12-19 16:53         ` Matthew Wilcox
2020-12-21 15:14     ` Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f622563d89515a834c299a07012f500e4c026e98.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=neilb@suse.com \
    --cc=sargun@sargun.me \
    --cc=vgoyal@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).