linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
@ 2018-08-24  9:02 Amir Goldstein
  2018-08-24 23:39 ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Amir Goldstein @ 2018-08-24  9:02 UTC (permalink / raw)
  To: Darrick J . Wong, Dave Chinner
  Cc: Eryu Guan, Miklos Szeredi, linux-xfs, linux-fsdevel

Since overlayfs implements stacked file operations, f_inode
is no longer euqivalent to f_mapping->host and xfs should use
the latter, same as generic_swapfile_activate().

Using f_inode results in an attempt to dereference an xfs_inode
struct from an ovl_inode pointer:

 CPU: 0 PID: 2462 Comm: swapon Not tainted
 4.18.0-xfstests-12721-g33e17876ea4e #3402
 RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f
 Call Trace:
  xfs_iomap_swapfile_activate+0x1f/0x43
  __se_sys_swapon+0xb1a/0xee9

Fixes: d1d04ef8572b ("ovl: stack file ops")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Darrick/Dave,

Running "./check -overlay" on master crashes kernel on swap tests.
Please send this fix for rc1/rc2 or ack it so Miklos could apply it.

Thanks,
Amir.

 fs/xfs/xfs_aops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 49f5f5896a43..09f093f89b19 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1012,7 +1012,7 @@ xfs_iomap_swapfile_activate(
 	struct file			*swap_file,
 	sector_t			*span)
 {
-	sis->bdev = xfs_find_bdev_for_inode(file_inode(swap_file));
+	sis->bdev = xfs_find_bdev_for_inode(swap_file->f_mapping->host);
 	return iomap_swapfile_activate(sis, swap_file, span, &xfs_iomap_ops);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
  2018-08-24  9:02 [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs Amir Goldstein
@ 2018-08-24 23:39 ` Dave Chinner
  2018-08-25 10:47   ` Amir Goldstein
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2018-08-24 23:39 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Darrick J . Wong, Eryu Guan, Miklos Szeredi, linux-xfs, linux-fsdevel

On Fri, Aug 24, 2018 at 12:02:51PM +0300, Amir Goldstein wrote:
> Since overlayfs implements stacked file operations, f_inode
> is no longer euqivalent to f_mapping->host and xfs should use
> the latter, same as generic_swapfile_activate().

Since when has file_inode() not pointed to the inode backing the
struct file?

> Using f_inode results in an attempt to dereference an xfs_inode
> struct from an ovl_inode pointer:
> 
>  CPU: 0 PID: 2462 Comm: swapon Not tainted
>  4.18.0-xfstests-12721-g33e17876ea4e #3402
>  RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f
>  Call Trace:
>   xfs_iomap_swapfile_activate+0x1f/0x43
>   __se_sys_swapon+0xb1a/0xee9
> 
> Fixes: d1d04ef8572b ("ovl: stack file ops")

Oh, since about 3 days ago.

So now we've got to deal with a vfs interface change that isn't
documented, hasn't been clearly communicated prior to merging, and
it subtly breaks a subset of callers.

So, please enlighten me with a documentation patch before changing
any XFS code: What is the new behaviour and the rules we must follow
for calling file_inode()?

I'm also interested in understanding whether anyone auditted all the
callers to see if they follow those rules.  There are another
20-odd file_inode() calls in XFS, and ~750 across fs/. How many
other places have been checked for correctness under the new rules?

i.e.  I can't review this patch without also confirming that all the
other file_inode() callers in XFS are still correct, and I can't do
that until I understand what the new behaviour of file_inode() is
and when it is not safe to use.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
  2018-08-24 23:39 ` Dave Chinner
@ 2018-08-25 10:47   ` Amir Goldstein
  2018-08-25 20:04     ` Miklos Szeredi
  2018-08-26 22:52     ` Dave Chinner
  0 siblings, 2 replies; 8+ messages in thread
From: Amir Goldstein @ 2018-08-25 10:47 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Darrick J. Wong, Miklos Szeredi, linux-xfs, linux-fsdevel,
	overlayfs, Al Viro

[+cc: Al,linux-unionfs]

On Sat, Aug 25, 2018 at 2:39 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Fri, Aug 24, 2018 at 12:02:51PM +0300, Amir Goldstein wrote:
> > Since overlayfs implements stacked file operations, f_inode
> > is no longer euqivalent to f_mapping->host and xfs should use
> > the latter, same as generic_swapfile_activate().
>
> Since when has file_inode() not pointed to the inode backing the
> struct file?
>
> > Using f_inode results in an attempt to dereference an xfs_inode
> > struct from an ovl_inode pointer:
> >
> >  CPU: 0 PID: 2462 Comm: swapon Not tainted
> >  4.18.0-xfstests-12721-g33e17876ea4e #3402
> >  RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f
> >  Call Trace:
> >   xfs_iomap_swapfile_activate+0x1f/0x43
> >   __se_sys_swapon+0xb1a/0xee9
> >
> > Fixes: d1d04ef8572b ("ovl: stack file ops")
>
> Oh, since about 3 days ago.
>
> So now we've got to deal with a vfs interface change that isn't
> documented, hasn't been clearly communicated prior to merging, and
> it subtly breaks a subset of callers.
>

Well, when you put it this way... ;-)

First of all - self NACK.
My fix is papering over a bigger issue, that is leaking of overlay
file/inode into xfs f_aops.
I believe the correct fix right now would be to add an overlayfs hack
in swapon(), as well as some other hacks in mm/* syscalls
(e.g. readahead()).

The virtue of merging stacked file operations was getting rid of many
VFS hacks, but the last chapter has not been written yet, or to put it
in Al's words [1]:

"Uses of ->vm_file (and rules for those) are too convoluted to untangle
at the moment.  I still would love to get that straightened out, but
it's not this cycle fodder, more's the pity..."

So I expect this cycle will require adding a few temporary mm
syscall hacks, in the hope they will be more short lived than the
departing VFS hacks.

> So, please enlighten me with a documentation patch before changing
> any XFS code: What is the new behaviour and the rules we must follow
> for calling file_inode()?
>

Actually, I believe the intention was that fs developers don't need to worry
about using file_inode() at all, because before the change we had:

- file passed in to xfs f_op's and a_ops is either overlay file OR xfs file
- file_inode() of either overlay/xfs file in xfs context is always xfs inode
- file->f_path in xfs context, BTW, was overlay path and therefore,
  XFS_IOC_OPEN_BY_HANDLE was slightly broken in overlayfs over xfs,
  as were several other fs specific ioctls

After stacked file operations change we should have the rules:

1. file passed in to xfs f_op's is always xfs file (*)
2. file passed in to xfs a_ops is always xfs file (**)
3. file_inode() of overlay file is an overlay inode

(*) as explicit file argument or on iocb->ki_filp
(**) as explicit file argument or on ->vm_file

I believe that swapfile leaking an overlay file into xfs was an oversight,
that is breaking rule #2.

> I'm also interested in understanding whether anyone auditted all the
> callers to see if they follow those rules.  There are another
> 20-odd file_inode() calls in XFS, and ~750 across fs/. How many
> other places have been checked for correctness under the new rules?
>

It is my understanding that code was reviewed by Al with the
assumptions made above about users of file_inode() not being affected
by this change. If those assumptions hold, then filesystem specific code
should not be affected.

So looking for users of file_inode() is looking for the wrong thing.
We need to be looking for entry points that can pass an overlay file
to fs f_aops. I found two and I'll keep looking.

> i.e.  I can't review this patch without also confirming that all the
> other file_inode() callers in XFS are still correct, and I can't do
> that until I understand what the new behaviour of file_inode() is
> and when it is not safe to use.
>

I audited the XFS uses of file_inode() and they seem to be correct
because they are all inside implementations of file or address_space
operations and operate on file argument passed in directly as one of
the operation arguments or indirectly via ->vm_file or iocb->ki_filp.

All calls to file_inode() in  xfs_file.c are in static helpers that are not
called by any non static (or non fs operation) function.
In xfs_ioctl*.c, some calls to file_inode() are not in static helpers,
but those are only called from both ioctl operation variants.

The one exception I found is file_inode(tmp.file) call in
xfs_ioc_swapext() which is operating on a file passed in by
user via ioctl arg, but that case is well covered by a validation
check (tmp.file->f_op != &xfs_file_operations).

Cheers,
Amir.

[1] https://marc.info/?l=linux-fsdevel&m=152882788408566&w=2

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
  2018-08-25 10:47   ` Amir Goldstein
@ 2018-08-25 20:04     ` Miklos Szeredi
  2018-08-26 11:32       ` Amir Goldstein
  2018-08-26 22:52     ` Dave Chinner
  1 sibling, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2018-08-25 20:04 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Dave Chinner, Darrick J. Wong, linux-xfs, linux-fsdevel,
	overlayfs, Al Viro

On Sat, Aug 25, 2018 at 12:47 PM, Amir Goldstein <amir73il@gmail.com> wrote:

> Actually, I believe the intention was that fs developers don't need to worry
> about using file_inode() at all, because before the change we had:
>
> - file passed in to xfs f_op's and a_ops is either overlay file OR xfs file
> - file_inode() of either overlay/xfs file in xfs context is always xfs inode
> - file->f_path in xfs context, BTW, was overlay path and therefore,
>   XFS_IOC_OPEN_BY_HANDLE was slightly broken in overlayfs over xfs,
>   as were several other fs specific ioctls
>
> After stacked file operations change we should have the rules:
>
> 1. file passed in to xfs f_op's is always xfs file (*)
> 2. file passed in to xfs a_ops is always xfs file (**)
> 3. file_inode() of overlay file is an overlay inode
>
> (*) as explicit file argument or on iocb->ki_filp
> (**) as explicit file argument or on ->vm_file
>
> I believe that swapfile leaking an overlay file into xfs was an oversight,
> that is breaking rule #2.

Correct.

I believe the root cause is this

    /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */
    file->f_mapping = realfile->f_mapping;

in ovl_open().  So lets start with removing that.  That should fix any
oopses related to this, but we'll have some other issues:

 1) open(..., O_DIRECT) will return an error

This is easy to fix:  add ovl_file_aops with a dummy ovl_direct_IO()
function that will never be called.

 2) swapon() will return an error

First question that comes to mind:  does anybody care?  I wouldn't
think swapfiles would be an important feature for overlayfs, but we
did support them up till now, so removing support might cause a
regression somewhere out there.   Unfortunate...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
  2018-08-25 20:04     ` Miklos Szeredi
@ 2018-08-26 11:32       ` Amir Goldstein
  2018-08-26 22:59         ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Amir Goldstein @ 2018-08-26 11:32 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Dave Chinner, Darrick J. Wong, linux-xfs, linux-fsdevel,
	overlayfs, Al Viro

On Sat, Aug 25, 2018 at 11:04 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Sat, Aug 25, 2018 at 12:47 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
> > Actually, I believe the intention was that fs developers don't need to worry
> > about using file_inode() at all, because before the change we had:
> >
> > - file passed in to xfs f_op's and a_ops is either overlay file OR xfs file
> > - file_inode() of either overlay/xfs file in xfs context is always xfs inode
> > - file->f_path in xfs context, BTW, was overlay path and therefore,
> >   XFS_IOC_OPEN_BY_HANDLE was slightly broken in overlayfs over xfs,
> >   as were several other fs specific ioctls
> >
> > After stacked file operations change we should have the rules:
> >
> > 1. file passed in to xfs f_op's is always xfs file (*)
> > 2. file passed in to xfs a_ops is always xfs file (**)
> > 3. file_inode() of overlay file is an overlay inode
> >
> > (*) as explicit file argument or on iocb->ki_filp
> > (**) as explicit file argument or on ->vm_file
> >
> > I believe that swapfile leaking an overlay file into xfs was an oversight,
> > that is breaking rule #2.
>
> Correct.
>
> I believe the root cause is this
>
>     /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */
>     file->f_mapping = realfile->f_mapping;
>
> in ovl_open().  So lets start with removing that.  That should fix any
> oopses related to this, but we'll have some other issues:
>
>  1) open(..., O_DIRECT) will return an error
>
> This is easy to fix:  add ovl_file_aops with a dummy ovl_direct_IO()
> function that will never be called.
>

Nice. I pushed a patch to ovl-fixes branch [1].

>  2) swapon() will return an error
>
> First question that comes to mind:  does anybody care?  I wouldn't
> think swapfiles would be an important feature for overlayfs, but we
> did support them up till now, so removing support might cause a
> regression somewhere out there.   Unfortunate...
>

I think we better introduce this "regression" and see if there are any real
world users out there. If there are, I have a WIP patch to make it work,
but it involves cloning an accountable file from real file - not a pretty sight.

3) readahead() will return an error and fadvise() will ignore request

I have patches [1] to fix those by familiarizing vfs with file_real().

4) I am afraid we may end up with more vfs hacks -
    right off the top of my grep f_mapping fs/*.c:
- FIBMAP
- sync_file_range()

But wasn't able to demonstrate a problem with those yet.
I am running [1] through xfstests now and if it looks fine, I'll post
the patches.

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/ovl-fixes

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
  2018-08-25 10:47   ` Amir Goldstein
  2018-08-25 20:04     ` Miklos Szeredi
@ 2018-08-26 22:52     ` Dave Chinner
  1 sibling, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2018-08-26 22:52 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Darrick J. Wong, Miklos Szeredi, linux-xfs, linux-fsdevel,
	overlayfs, Al Viro

On Sat, Aug 25, 2018 at 01:47:52PM +0300, Amir Goldstein wrote:
> [+cc: Al,linux-unionfs]
> 
> On Sat, Aug 25, 2018 at 2:39 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Fri, Aug 24, 2018 at 12:02:51PM +0300, Amir Goldstein wrote:
> > > Since overlayfs implements stacked file operations, f_inode
> > > is no longer euqivalent to f_mapping->host and xfs should use
> > > the latter, same as generic_swapfile_activate().
> >
> > Since when has file_inode() not pointed to the inode backing the
> > struct file?
> >
> > > Using f_inode results in an attempt to dereference an xfs_inode
> > > struct from an ovl_inode pointer:
> > >
> > >  CPU: 0 PID: 2462 Comm: swapon Not tainted
> > >  4.18.0-xfstests-12721-g33e17876ea4e #3402
> > >  RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f
> > >  Call Trace:
> > >   xfs_iomap_swapfile_activate+0x1f/0x43
> > >   __se_sys_swapon+0xb1a/0xee9
> > >
> > > Fixes: d1d04ef8572b ("ovl: stack file ops")
> >
> > Oh, since about 3 days ago.
> >
> > So now we've got to deal with a vfs interface change that isn't
> > documented, hasn't been clearly communicated prior to merging, and
> > it subtly breaks a subset of callers.
> >
> 
> Well, when you put it this way... ;-)
> 
> First of all - self NACK.
> My fix is papering over a bigger issue, that is leaking of overlay
> file/inode into xfs f_aops.

That means any new operation vector introduced that passes a struct
file is always going to need an overlay interposer function to
ensure that the correct file is passed to the lower filesystem, yes?

That seems like a bit of a landmine to leave for anyone implementing
a new generic operation vector. Documentation patch?

> I believe the correct fix right now would be to add an overlayfs hack
> in swapon(), as well as some other hacks in mm/* syscalls
> (e.g. readahead()).
> 
> The virtue of merging stacked file operations was getting rid of many
> VFS hacks, but the last chapter has not been written yet, or to put it
> in Al's words [1]:
> 
> "Uses of ->vm_file (and rules for those) are too convoluted to untangle
> at the moment.  I still would love to get that straightened out, but
> it's not this cycle fodder, more's the pity..."
> 
> So I expect this cycle will require adding a few temporary mm
> syscall hacks, in the hope they will be more short lived than the
> departing VFS hacks.

Yuck. 

> > So, please enlighten me with a documentation patch before changing
> > any XFS code: What is the new behaviour and the rules we must follow
> > for calling file_inode()?
> >
> 
> Actually, I believe the intention was that fs developers don't need to worry
> about using file_inode() at all, because before the change we had:
> 
> - file passed in to xfs f_op's and a_ops is either overlay file OR xfs file
> - file_inode() of either overlay/xfs file in xfs context is always xfs inode
> - file->f_path in xfs context, BTW, was overlay path and therefore,
>   XFS_IOC_OPEN_BY_HANDLE was slightly broken in overlayfs over xfs,
>   as were several other fs specific ioctls
> 
> After stacked file operations change we should have the rules:
> 
> 1. file passed in to xfs f_op's is always xfs file (*)
> 2. file passed in to xfs a_ops is always xfs file (**)
> 3. file_inode() of overlay file is an overlay inode
> 
> (*) as explicit file argument or on iocb->ki_filp
> (**) as explicit file argument or on ->vm_file
> 
> I believe that swapfile leaking an overlay file into xfs was an oversight,
> that is breaking rule #2.

Please add documentation explaining how this all works so pepole
don't have to ask every time we come across a bug as a result of a
missing/incorrect translation in overlay.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
  2018-08-26 11:32       ` Amir Goldstein
@ 2018-08-26 22:59         ` Dave Chinner
  2018-08-27  7:17           ` Amir Goldstein
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2018-08-26 22:59 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Darrick J. Wong, linux-xfs, linux-fsdevel,
	overlayfs, Al Viro

On Sun, Aug 26, 2018 at 02:32:33PM +0300, Amir Goldstein wrote:
> On Sat, Aug 25, 2018 at 11:04 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Sat, Aug 25, 2018 at 12:47 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > > Actually, I believe the intention was that fs developers don't need to worry
> > > about using file_inode() at all, because before the change we had:
> > >
> > > - file passed in to xfs f_op's and a_ops is either overlay file OR xfs file
> > > - file_inode() of either overlay/xfs file in xfs context is always xfs inode
> > > - file->f_path in xfs context, BTW, was overlay path and therefore,
> > >   XFS_IOC_OPEN_BY_HANDLE was slightly broken in overlayfs over xfs,
> > >   as were several other fs specific ioctls
> > >
> > > After stacked file operations change we should have the rules:
> > >
> > > 1. file passed in to xfs f_op's is always xfs file (*)
> > > 2. file passed in to xfs a_ops is always xfs file (**)
> > > 3. file_inode() of overlay file is an overlay inode
> > >
> > > (*) as explicit file argument or on iocb->ki_filp
> > > (**) as explicit file argument or on ->vm_file
> > >
> > > I believe that swapfile leaking an overlay file into xfs was an oversight,
> > > that is breaking rule #2.
> >
> > Correct.
> >
> > I believe the root cause is this
> >
> >     /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */
> >     file->f_mapping = realfile->f_mapping;
> >
> > in ovl_open().  So lets start with removing that.  That should fix any
> > oopses related to this, but we'll have some other issues:
> >
> >  1) open(..., O_DIRECT) will return an error
> >
> > This is easy to fix:  add ovl_file_aops with a dummy ovl_direct_IO()
> > function that will never be called.
> >
> 
> Nice. I pushed a patch to ovl-fixes branch [1].
> 
> >  2) swapon() will return an error
> >
> > First question that comes to mind:  does anybody care?  I wouldn't
> > think swapfiles would be an important feature for overlayfs, but we
> > did support them up till now, so removing support might cause a
> > regression somewhere out there.   Unfortunate...
> >
> 
> I think we better introduce this "regression" and see if there are any real
> world users out there. If there are, I have a WIP patch to make it work,
> but it involves cloning an accountable file from real file - not a pretty sight.
> 
> 3) readahead() will return an error and fadvise() will ignore request
> 
> I have patches [1] to fix those by familiarizing vfs with file_real().

What's file_real()?

I don't see it in 4.19-rc1 - is this a new vfs interface we're
expected to know about and use correctly without being told about it
and having no documentation explaining it's use to refer to?

> 4) I am afraid we may end up with more vfs hacks -
>     right off the top of my grep f_mapping fs/*.c:
> - FIBMAP
> - sync_file_range()

.... and this is how we found out that the light wasn't the end of
the tunnel, it was the oncoming train... :/

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs
  2018-08-26 22:59         ` Dave Chinner
@ 2018-08-27  7:17           ` Amir Goldstein
  0 siblings, 0 replies; 8+ messages in thread
From: Amir Goldstein @ 2018-08-27  7:17 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Miklos Szeredi, Darrick J. Wong, linux-xfs, linux-fsdevel,
	overlayfs, Al Viro

On Mon, Aug 27, 2018 at 1:59 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sun, Aug 26, 2018 at 02:32:33PM +0300, Amir Goldstein wrote:
> > On Sat, Aug 25, 2018 at 11:04 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Sat, Aug 25, 2018 at 12:47 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > > Actually, I believe the intention was that fs developers don't need to worry
> > > > about using file_inode() at all, because before the change we had:
> > > >
> > > > - file passed in to xfs f_op's and a_ops is either overlay file OR xfs file
> > > > - file_inode() of either overlay/xfs file in xfs context is always xfs inode
> > > > - file->f_path in xfs context, BTW, was overlay path and therefore,
> > > >   XFS_IOC_OPEN_BY_HANDLE was slightly broken in overlayfs over xfs,
> > > >   as were several other fs specific ioctls
> > > >
> > > > After stacked file operations change we should have the rules:
> > > >
> > > > 1. file passed in to xfs f_op's is always xfs file (*)
> > > > 2. file passed in to xfs a_ops is always xfs file (**)
> > > > 3. file_inode() of overlay file is an overlay inode
> > > >
> > > > (*) as explicit file argument or on iocb->ki_filp
> > > > (**) as explicit file argument or on ->vm_file
> > > >
> > > > I believe that swapfile leaking an overlay file into xfs was an oversight,
> > > > that is breaking rule #2.
> > >
> > > Correct.
> > >
> > > I believe the root cause is this
> > >
> > >     /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */
> > >     file->f_mapping = realfile->f_mapping;
> > >
> > > in ovl_open().  So lets start with removing that.  That should fix any
> > > oopses related to this, but we'll have some other issues:
> > >
> > >  1) open(..., O_DIRECT) will return an error
> > >
> > > This is easy to fix:  add ovl_file_aops with a dummy ovl_direct_IO()
> > > function that will never be called.
> > >
> >
> > Nice. I pushed a patch to ovl-fixes branch [1].
> >
> > >  2) swapon() will return an error
> > >
> > > First question that comes to mind:  does anybody care?  I wouldn't
> > > think swapfiles would be an important feature for overlayfs, but we
> > > did support them up till now, so removing support might cause a
> > > regression somewhere out there.   Unfortunate...
> > >
> >
> > I think we better introduce this "regression" and see if there are any real
> > world users out there. If there are, I have a WIP patch to make it work,
> > but it involves cloning an accountable file from real file - not a pretty sight.
> >
> > 3) readahead() will return an error and fadvise() will ignore request
> >
> > I have patches [1] to fix those by familiarizing vfs with file_real().
>
> What's file_real()?
>
> I don't see it in 4.19-rc1 - is this a new vfs interface we're
> expected to know about and use correctly without being told about it
> and having no documentation explaining it's use to refer to?
>

At this point, file_real() is a proposal (from patch [1/5]) that has been
rejected... If a similar patch does go through, I will add documentation.
For the record, file_real() is the alter ego of d_real{,_inode}().
v4.19-rc1 got rid of all d_real{,_inode}() calls in vfs code expect for
one in probe_event_enable() a thorny one in file_dentry()!
filesystem code is not *supposed* to be aware of those quirks, but
if you ever try to access file->f_path directly from filesystem code
and assume it's an xfs path/mnt/dentry, you are in for an unpleasant
surprise.

Is d_real() documented? Yes, in vfs.txt.
Is this sufficient for you to understand the implications for filesystem
developers - probably not.

> > 4) I am afraid we may end up with more vfs hacks -
> >     right off the top of my grep f_mapping fs/*.c:
> > - FIBMAP
> > - sync_file_range()
>
> .... and this is how we found out that the light wasn't the end of
> the tunnel, it was the oncoming train... :/
>

How can you frown about overlayfs breaking the two interfaces you
hate the most? ;-)

Kidding aside, the reason why v4.19-rc1 changes are good for
filesystem developers is because they shift more responsibility to
overlayfs code and fewer surrounding code needs to be aware of
its quirks. For example, in v4.19-rc1 overlayfs does not let filesystem
specific ioctls go through - whether or not users will be happy about
this, only time will tell...

Cheers,
Amir.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-08-27 11:01 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-24  9:02 [PATCH] xfs: fix GPF in swapfile_activate of file from overlayfs Amir Goldstein
2018-08-24 23:39 ` Dave Chinner
2018-08-25 10:47   ` Amir Goldstein
2018-08-25 20:04     ` Miklos Szeredi
2018-08-26 11:32       ` Amir Goldstein
2018-08-26 22:59         ` Dave Chinner
2018-08-27  7:17           ` Amir Goldstein
2018-08-26 22:52     ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).