linux-fsdevel.vger.kernel.org archive mirror
From: Amir Goldstein <amir73il@gmail.com>
To: Josh England <jjengla@gmail.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-unionfs@vger.kernel.org,
	Miklos Szeredi <mszeredi@redhat.com>
Subject: Re: overlayfs: allowing for changes to lowerdir
Date: Mon, 27 Feb 2017 12:40:48 +0200
Message-ID: <CAOQ4uxh5OVadTPLM-jw5J_6sRKBFv0WB_CTMdy68fe3dxtYsKA@mail.gmail.com>
In-Reply-To: <CA+ZH+jH9yoB5_wxyff088Fsm2-g5=oO1ECimHVo7u2SKVK4AAw@mail.gmail.com>

On Wed, Feb 22, 2017 at 1:08 AM, Josh England <jjengla@gmail.com> wrote:
> Amir,
>
> After playing with it some, this patch seems to provide precisely the
> behavior I need for my use case.  Do you think it makes sense to turn
> this behavior into a module parameter (eg: allow_revalidate)?
>

I don't know, because I don't know why Miklos chose to return an error
on revalidate of a remote lower fs.
But it would be strange to introduce a feature that replaces one undefined
behavior (maybe ESTALE) with another undefined behavior.

It may be easier if you can argue for a use case that does have
defined behavior. For example, the lower fs has some directory subtrees
that are never modified via overlayfs and are only modified directly
on the lower fs.
I think this *may* result in defined behavior over a remote lower fs,
but I can't tell for sure.
Anyway, you will have to argue why such a setup is useful.


> -JE
>
>
> On Tue, Feb 14, 2017 at 6:01 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Mon, Feb 13, 2017 at 11:41 PM, Josh England <jjengla@gmail.com> wrote:
>>> So here's the use case:  lowerdir is an NFS mounted root filesystem
>>> (shared by a bunch of nodes).  upperdir is a tmpfs RAM disk to allow
>>> for writes to happen.  This works great with the caveat being I cannot
>>> make 'live' changes to the root filesystem, which poses the problem.
>>> Any access to a changed file causes a 'Stale file handle' error.
>>>
>>> With some experimenting, I've discovered that remounting the overlay
>>> filesystem (mount -o remount / /)  registers any changes that have
>>> been made to the lower NFS filesystem.  In addition, dumping cache
>>> (via /proc/sys/vm/drop_caches) also makes the stale file handle errors
>>> go away and reads pass through to the lower dir and correctly show
>>> changes.
>>>
>>> I'd like to make this use case feasible by allowing changes to the NFS
>>> lowerdir to work more or less transparently.  It seems like if the
>>> overlay did not do any caching at all, all reads would fall through to
>>> either the upperdir ram disk or the NFS lower, which is precisely what
>>> I want.
>>>
>>> So, let me pose this somewhat naive question:  Would it be possible to
>>> simply disable any caching performed by the overlay to force all
>>> reads to go to either the tmpfs upper or the (VFS-cached) NFS lower?
>>> Would this be enough to accomplish my goal of being able to change the
>>> lowerdir of an active overlayfs?
>>>
>>
>> There is no need to disable caching. There is already a mechanism
>> in place in VFS to revalidate dentry cache entries.
>> NFS implements d_revalidate() and overlayfs implements d_revalidate()
>> by calling into the lower fs d_revalidate().
>>
>> However, overlayfs intentionally returns an error when a lower entry has been modified.
>> (see: 7c03b5d ovl: allow distributed fs as lower layer)
>>
>> You can try this (untested) patch to revert this behavior, just to see if it
>> works for your use case, but it won't change this fact
>> from Documentation/filesystems/overlayfs.txt:
>> " Changes to the underlying filesystems while part of a mounted overlay
>> filesystem are not allowed.  If the underlying filesystem is changed,
>> the behavior of the overlay is undefined, though it will not result in
>> a crash or deadlock."
>>
>> Specifically, renaming directories and files in lower that were already
>> copied up is going to have a weird outcome.
>>
>> Also, the situation with changing files in lower remote fs could be worse
>> than changing files on lower local fs, simply because right now, this
>> use case is not tested (i.e. it results in ESTALE).
>>
>> I believe that fixing this use case, if at all possible, would require quite
>> a bit of work, a lot of documentation (about expected behavior) and
>> even more testing.
>>
>> Amir.
>>
>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>> index e8ef9d1..6806ef3 100644
>> --- a/fs/overlayfs/super.c
>> +++ b/fs/overlayfs/super.c
>> @@ -106,16 +106,11 @@ static int ovl_dentry_revalidate(struct dentry *dentry, unsigned int flags)
>>
>>                 if (d->d_flags & DCACHE_OP_REVALIDATE) {
>>                         ret = d->d_op->d_revalidate(d, flags);
>> -                       if (ret < 0)
>> +                       if (ret <= 0)
>>                                 return ret;
>> -                       if (!ret) {
>> -                               if (!(flags & LOOKUP_RCU))
>> -                                       d_invalidate(d);
>> -                               return -ESTALE;
>> -                       }
>>                 }
>>         }
>> -       return 1;
>> +       return ret;
>>  }
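
The control-flow change in the patch above can be modeled in userspace.
This is only a sketch under stated assumptions: struct mock_lower,
revalidate_original() and revalidate_patched() are hypothetical names that
do not exist in the kernel, the lower fs d_revalidate() call is replaced by
a canned return value, and the d_invalidate() call of the original code is
left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lower dentry: only what the model needs. */
struct mock_lower {
	bool has_revalidate;    /* models DCACHE_OP_REVALIDATE being set */
	int revalidate_result;  /* canned result of the lower d_revalidate() */
};

/*
 * Model of the logic before the patch: a lower entry reported as
 * invalid (0) is turned into -ESTALE.
 */
static int revalidate_original(const struct mock_lower *lower, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret;

		if (!lower[i].has_revalidate)
			continue;
		ret = lower[i].revalidate_result;
		if (ret < 0)
			return ret;
		if (ret == 0)
			return -ESTALE;
	}
	return 1;
}

/*
 * Model of the logic with the patch applied: both errors (< 0) and
 * "invalid" (0) are passed back to the caller unchanged.
 */
static int revalidate_patched(const struct mock_lower *lower, size_t n)
{
	int ret = 1;

	for (size_t i = 0; i < n; i++) {
		if (!lower[i].has_revalidate)
			continue;
		ret = lower[i].revalidate_result;
		if (ret <= 0)
			return ret;
	}
	return ret;
}
```

For a lower stack where one layer reports its entry as invalid (0), the
original logic returns -ESTALE while the patched logic returns 0, which the
VFS normally treats as a request to drop the dentry and look the name up
again. Whether that is safe over a remote lower fs is the open question in
this thread.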

Thread overview: 15+ messages
2017-02-13 21:41 overlayfs: allowing for changes to lowerdir Josh England
2017-02-14 14:01 ` Amir Goldstein
2017-02-14 17:14   ` Josh England
2017-02-21 23:08   ` Josh England
2017-02-22  9:00     ` Ian Kent
2017-02-27 10:40     ` Amir Goldstein [this message]
2017-02-28 19:08       ` Josh England
2017-02-28 19:44         ` Al Viro
2017-03-01 11:15           ` Amir Goldstein
2017-03-01 18:22           ` Josh England
2017-03-01 20:22         ` Colin Walters
2017-03-09 10:37   ` Miklos Szeredi
2017-03-09 11:22     ` Amir Goldstein
2017-03-09 13:12       ` Miklos Szeredi
2017-02-15  1:29 ` J. R. Okajima
