From: Jeff Layton <jlayton@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Sargun Dhillon <sargun@sargun.me>,
Amir Goldstein <amir73il@gmail.com>,
linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org,
Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: [PATCH] overlay: Implement volatile-specific fsync error behaviour
Date: Wed, 02 Dec 2020 16:52:33 -0500
Message-ID: <2e08895bf0650513d7d12e66965eec611f361be3.camel@redhat.com>
In-Reply-To: <20201202213434.GA4070@redhat.com>
On Wed, 2020-12-02 at 16:34 -0500, Vivek Goyal wrote:
> On Wed, Dec 02, 2020 at 02:26:23PM -0500, Jeff Layton wrote:
> [..]
> > > > > > > > + upper_mnt_sb = ovl_upper_mnt(ofs)->mnt_sb;
> > > > > > > > + sb->s_stack_depth = upper_mnt_sb->s_stack_depth;
> > > > > > > > + sb->s_time_gran = upper_mnt_sb->s_time_gran;
> > > > > > > > + ofs->upper_errseq = errseq_sample(&upper_mnt_sb->s_wb_err);
> > > > > > >
> > > > > > > I asked this question in last email as well. errseq_sample() will return
> > > > > > > 0 if current error has not been seen yet. That means next time a sync
> > > > > > > call comes for volatile mount, it will return an error. But that's
> > > > > > > not what we want. When we mounted a volatile overlay, if there is an
> > > > > > > existing error (seen/unseen), we don't care. We only care if there
> > > > > > > is a new error after the volatile mount, right?
> > > > > > >
> > > > > > > > I guess we will need another helper similar to errseq_sample() which
> > > > > > > just returns existing value of errseq. And then we will have to
> > > > > > > do something about errseq_check() to not return an error if "since"
> > > > > > > and "eseq" differ only by "seen" bit.
> > > > > > >
> > > > > > > Otherwise in current form, volatile mount will always return error
> > > > > > > if upperdir has error and it has not been seen by anybody.
> > > > > > >
> > > > > > > > How did you finally end up testing the error case? I want to simulate
> > > > > > > > an error artificially and test it.
> > > > > > >
> > > > > >
> > > > > > If you don't want to see errors that occurred before you did the mount,
> > > > > > then you probably can just resurrect and rename the original version of
> > > > > > errseq_sample. Something like this, but with a different name:
> > > > > >
> > > > > > +errseq_t errseq_sample(errseq_t *eseq)
> > > > > > +{
> > > > > > + errseq_t old = READ_ONCE(*eseq);
> > > > > > + errseq_t new = old;
> > > > > > +
> > > > > > + /*
> > > > > > + * For the common case of no errors ever having been set, we can skip
> > > > > > + * marking the SEEN bit. Once an error has been set, the value will
> > > > > > + * never go back to zero.
> > > > > > + */
> > > > > > + if (old != 0) {
> > > > > > + new |= ERRSEQ_SEEN;
> > > > > > + if (old != new)
> > > > > > + cmpxchg(eseq, old, new);
> > > > > > + }
> > > > > > + return new;
> > > > > > +}
> > > > >
> > > > > Yes, a helper like this should solve the issue at hand. We are not
> > > > > interested in previous errors. This also sets the ERRSEQ_SEEN on
> > > > > sample and it will also solve the other issue when after sampling
> > > > > if error gets seen, we don't want errseq_check() to return error.
> > > > >
> > > > > Thinking of some possible names for new function.
> > > > >
> > > > > errseq_sample_seen()
> > > > > errseq_sample_set_seen()
> > > > > errseq_sample_consume_unseen()
> > > > > errseq_sample_current()
> > > > >
> > > >
> > > > errseq_sample_consume_unseen() sounds good, though maybe it should be
> > > > "ignore_unseen"? IDK, naming this stuff is the hardest part.
> > > >
> > > > If you don't want to add a new helper, I think you'd probably also be
> > > > able to do something like this in fill_super:
> > > >
> > > > Â Â Â Â errseq_sample()
> > > > Â Â Â Â errseq_check_and_advance()
> > > >
> > > >
> > > > ...and just ignore the error returned by the check and advance. At that
> > > > point, the cursor should be caught up and any subsequent syncfs call
> > > > should return 0 until you record another error. It's a little less
> > > > efficient, but only slightly so.
> > >
> > > This seems even better.
> > >
> > > Thinking little bit more. I am now concerned about setting ERRSEQ_SEEN on
> > > sample. In our case, that would mean that we consumed an unseen error but
> > > never reported it back to user space. And then somebody might complain.
> > >
> > > > This kind of reminds me of PostgreSQL's fsync issues, where they did
> > > writes using one fd and another thread opened another fd and
> > > did sync and they expected any errors to be reported.
> > >
> >
> > > > Similarly, what if an unseen error is present on the upper superblock
> > > > and we mount a volatile overlay and mark the error SEEN? Then
> > > > if another process opens a file on upper and does a syncfs(), it will
> > > > complain that the existing error was not reported to it.
> > >
> > > Overlay use case seems to be that we just want to check if an error
> > > has happened on upper superblock since we sampled it and don't
> > > want to consume that error as such. Will it make sense to introduce
> > > two helpers for error sampling and error checking which mask the
> > > SEEN bit and don't do anything with it. For example, following compile
> > > tested only patch.
> > >
> > > Now we will not touch SEEN bit at all. And even if SEEN gets set
> > > since we sampled, errseq_check_mask_seen() will not flag it as
> > > error.
> > >
> > > Thanks
> > > Vivek
> > >
> >
> > Again, you're not really hiding this from anyone doing something _sane_.
> > You're only hiding an error from someone who opens the file after an
> > error occurs and expects to see an error.
> >
> > That was the behavior for fsync before we switched to errseq_t, and we
> > had to change errseq_sample for applications that relied on that. syncfs
> > reporting these errors is pretty new however. I don't think we
> > necessarily need to make the same guarantees there.
> >
> > The solution to all of these problems is to ensure that you open the
> > files you're issuing syncfs on early, and keep them open. Then you'll
> > always see any subsequent errors.
>
> Ok. I guess we will have to set the SEEN bit during errseq_sample();
> otherwise, we miss errors. I had missed this point.
>
> So mounting a volatile overlay instance will become roughly
> equivalent to somebody doing a syncfs on upper, consuming the
> error, and not doing anything about it.
>
> If a user cares about not losing such errors, they need to keep an
> fd open on upper.
>
> /me hopes that this does not become an issue for somebody. Even
> if it does, one workaround can be don't do volatile overlay or
> don't share overlay upper with other conflicting workload.
>
Yeah, there are limits to what we can do with 32 bits.
It's not pretty, but I guess you could pr_warn at mount time if you find
an unseen error. That would at least not completely drop it on the
floor.
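
To make that concrete, here is a rough userspace toy model of the idea,
not kernel code. It assumes the usual errseq_t layout (errno in the low
12 bits, then the SEEN flag, then a counter) and is single-threaded, so
there is no cmpxchg. All of the model_* names are made up for this
sketch, and fprintf() stands in for pr_warn(). It does the
sample + check_and_advance sequence at "mount" time and warns when that
sequence ends up consuming a previously unseen error:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t errseq_t;

/* Assumed layout: errno in bits 0-11, SEEN flag in bit 12, counter above. */
#define MODEL_MAX_ERRNO 4095u
#define MODEL_SEEN      (1u << 12)
#define MODEL_CTR_INC   (1u << 13)

/* Record a negative errno; bump the counter only if the old value was seen. */
static void model_errseq_set(errseq_t *eseq, int err)
{
	errseq_t old = *eseq;
	errseq_t new;

	assert(err < 0 && (unsigned int)-err <= MODEL_MAX_ERRNO);
	new = (old & ~(MODEL_MAX_ERRNO | MODEL_SEEN)) | (errseq_t)-err;
	if (old & MODEL_SEEN)
		new += MODEL_CTR_INC;
	*eseq = new;
}

/* Sample the cursor; an unseen error samples as 0 so it gets reported later. */
static errseq_t model_errseq_sample(const errseq_t *eseq)
{
	errseq_t old = *eseq;

	return (old & MODEL_SEEN) ? old : 0;
}

/* Report an error if the value changed since *since, then catch up. */
static int model_errseq_check_and_advance(errseq_t *eseq, errseq_t *since)
{
	errseq_t cur = *eseq;

	if (cur == *since)
		return 0;
	cur |= MODEL_SEEN;
	*eseq = cur;
	*since = cur;
	return -(int)(cur & MODEL_MAX_ERRNO);
}

/*
 * Hypothetical mount-time step: catch the cursor up to the current state,
 * and log instead of silently dropping an error nobody had seen yet.
 */
static errseq_t model_volatile_mount_sample(errseq_t *upper_wb_err, int *warned)
{
	errseq_t since = model_errseq_sample(upper_wb_err);
	int err = model_errseq_check_and_advance(upper_wb_err, &since);

	*warned = 0;
	if (err) {
		fprintf(stderr, "overlay model: consuming unseen upper error %d\n", err);
		*warned = 1;
	}
	return since;
}
```

In this model, a later check_and_advance against the returned cursor
gives 0 until another error is recorded, which is the behaviour we want
for syncfs on the volatile mount.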
--
Jeff Layton <jlayton@redhat.com>