linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
@ 2019-04-23  6:50 Murphy Zhou
  2019-04-23 10:59 ` Amir Goldstein
  0 siblings, 1 reply; 9+ messages in thread
From: Murphy Zhou @ 2019-04-23  6:50 UTC (permalink / raw)
  To: miklos, linux-unionfs, linux-fsdevel; +Cc: amir73il, Murphy Zhou

Overlays ovl_iter_write calls vfs_iter_write to write on real file,
in which calls fsnotify_modify on this change, however vfs_write also
calls fsnotify_modify after ovl_iter_write. The first notification
sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
because of its fake path. The second one sent by vfs_write grabs marks
from ovl inode and ovl mnt.

LTP fanotify06 add modify mark for mnt point, then add ignore modify
mask on testfile, then truncate and write the file. Because the ignore
mask is marked on ovl inode, not the upper inode, the first event is not
masked like the second one. So we get a modification event even with a
mask on the file.

Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
vfs_iter_write used by ovl can use this flag to skip one duplicate event.

Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
---
 fs/overlayfs/file.c     | 2 +-
 fs/read_write.c         | 2 +-
 include/uapi/linux/fs.h | 5 ++++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 84dd957efa24..0827199a5311 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -242,7 +242,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	file_start_write(real.file);
 	ret = vfs_iter_write(real.file, iter, &iocb->ki_pos,
-			     ovl_iocb_to_rwf(iocb));
+			     ovl_iocb_to_rwf(iocb)|RWF_NONOTIFY);
 	file_end_write(real.file);
 	revert_creds(old_cred);
 
diff --git a/fs/read_write.c b/fs/read_write.c
index 61b43ad7608e..aec751cacedf 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -957,7 +957,7 @@ static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
 		ret = do_iter_readv_writev(file, iter, pos, WRITE, flags);
 	else
 		ret = do_loop_readv_writev(file, iter, pos, WRITE, flags);
-	if (ret > 0)
+	if (ret > 0 && !(flags & RWF_NONOTIFY))
 		fsnotify_modify(file);
 	return ret;
 }
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 121e82ce296b..103f1a9375f2 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -342,8 +342,11 @@ typedef int __bitwise __kernel_rwf_t;
 /* per-IO O_APPEND */
 #define RWF_APPEND	((__force __kernel_rwf_t)0x00000010)
 
+/* do not notify this IO */
+#define RWF_NONOTIFY	((__force __kernel_rwf_t)0x00000020)
+
 /* mask of flags supported by the kernel */
 #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
-			 RWF_APPEND)
+			 RWF_APPEND | RWF_NONOTIFY)
 
 #endif /* _UAPI_LINUX_FS_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23  6:50 [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event Murphy Zhou
@ 2019-04-23 10:59 ` Amir Goldstein
  2019-04-23 11:39   ` Miklos Szeredi
  0 siblings, 1 reply; 9+ messages in thread
From: Amir Goldstein @ 2019-04-23 10:59 UTC (permalink / raw)
  To: Murphy Zhou; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, Amir Goldstein

[-- Attachment #1: Type: text/plain, Size: 3370 bytes --]

On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
>
> Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> in which calls fsnotify_modify on this change, however vfs_write also
> calls fsnotify_modify after ovl_iter_write. The first notification
> sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> because of its fake path. The second one sent by vfs_write grabs marks
> from ovl inode and ovl mnt.
>
> LTP fanotify06 add modify mark for mnt point, then add ignore modify
> mask on testfile, then truncate and write the file. Because the ignore
> mask is marked on ovl inode, not the upper inode, the first event is not
> masked like the second one. So we get a modification event even with a
> mask on the file.

Care to extend fanotify06 in a similar manner to the way readahead02
was extended to test overlay test case regardless of the base LTP filesystem?

>
> Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> vfs_iter_write used by ovl can use this flag to skip one duplicate event.
>

This fix is wrong for several reasons:
- It exports RWF_NONOTIFY to uapi
- It will cause no events at all when overlay writes to file even when user
requested events on upper inode

Please try attached patch.

Thanks,
Amir.

> Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
> ---
>  fs/overlayfs/file.c     | 2 +-
>  fs/read_write.c         | 2 +-
>  include/uapi/linux/fs.h | 5 ++++-
>  3 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 84dd957efa24..0827199a5311 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -242,7 +242,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
>         old_cred = ovl_override_creds(file_inode(file)->i_sb);
>         file_start_write(real.file);
>         ret = vfs_iter_write(real.file, iter, &iocb->ki_pos,
> -                            ovl_iocb_to_rwf(iocb));
> +                            ovl_iocb_to_rwf(iocb)|RWF_NONOTIFY);
>         file_end_write(real.file);
>         revert_creds(old_cred);
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 61b43ad7608e..aec751cacedf 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -957,7 +957,7 @@ static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
>                 ret = do_iter_readv_writev(file, iter, pos, WRITE, flags);
>         else
>                 ret = do_loop_readv_writev(file, iter, pos, WRITE, flags);
> -       if (ret > 0)
> +       if (ret > 0 && !(flags & RWF_NONOTIFY))
>                 fsnotify_modify(file);
>         return ret;
>  }
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 121e82ce296b..103f1a9375f2 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -342,8 +342,11 @@ typedef int __bitwise __kernel_rwf_t;
>  /* per-IO O_APPEND */
>  #define RWF_APPEND     ((__force __kernel_rwf_t)0x00000010)
>
> +/* do not notify this IO */
> +#define RWF_NONOTIFY   ((__force __kernel_rwf_t)0x00000020)
> +
>  /* mask of flags supported by the kernel */
>  #define RWF_SUPPORTED  (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
> -                        RWF_APPEND)
> +                        RWF_APPEND | RWF_NONOTIFY)
>
>  #endif /* _UAPI_LINUX_FS_H */
> --
> 2.21.0
>

[-- Attachment #2: 0001-ovl-use-real-path-for-open_with_fake_path.patch.txt --]
[-- Type: text/plain, Size: 1492 bytes --]

From c511a103ba6690cbf6748f25700cb8e95dd77982 Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il@gmail.com>
Date: Tue, 23 Apr 2019 13:11:10 +0300
Subject: [PATCH] ovl: use real path for open_with_fake_path()

ovl_open() uses open_with_fake_path() to create a file object with a
given path.  Currently overlayfs uses the overlay path as the "fake"
path. This causes some issues when VFS code refers to file->f_path->mnt.

One issue, reported by Murphy Zhou, is that fsnotify hooks check
for mount marks against the wrong vfsmount.

Link: https://lore.kernel.org/linux-fsdevel/20190423065024.12695-1-jencce.kernel@gmail.com/
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/file.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index ddfd93f13cc5..ec9f8763fe08 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -28,10 +28,13 @@ static struct file *ovl_open_realfile(const struct file *file,
 {
 	struct inode *inode = file_inode(file);
 	struct file *realfile;
+	struct path realpath;
 	const struct cred *old_cred;
 
+	ovl_path_real(file_dentry(file), &realpath);
+
 	old_cred = ovl_override_creds(inode->i_sb);
-	realfile = open_with_fake_path(&file->f_path, file->f_flags | O_NOATIME,
+	realfile = open_with_fake_path(&realpath, file->f_flags | O_NOATIME,
 				       realinode, current_cred());
 	revert_creds(old_cred);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23 10:59 ` Amir Goldstein
@ 2019-04-23 11:39   ` Miklos Szeredi
  2019-04-23 12:44     ` Amir Goldstein
  0 siblings, 1 reply; 9+ messages in thread
From: Miklos Szeredi @ 2019-04-23 11:39 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Murphy Zhou, overlayfs, linux-fsdevel

On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
> >
> > Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> > in which calls fsnotify_modify on this change, however vfs_write also
> > calls fsnotify_modify after ovl_iter_write. The first notification
> > sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> > because of its fake path. The second one sent by vfs_write grabs marks
> > from ovl inode and ovl mnt.
> >
> > LTP fanotify06 add modify mark for mnt point, then add ignore modify
> > mask on testfile, then truncate and write the file. Because the ignore
> > mask is marked on ovl inode, not the upper inode, the first event is not
> > masked like the second one. So we get a modification event even with a
> > mask on the file.
>
> Care to extend fanotify06 in a similar manner to the way readahead02
> was extended to test overlay test case regardless of the base LTP filesystem?
>
> >
> > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> > vfs_iter_write used by ovl can use this flag to skip one duplicate event.
> >
>
> This fix is wrong for several reasons:
> - It exports RWF_NONOTIFY to uapi
> - It will cause no events at all when overlay writes to file even when user
> requested events on upper inode
>
> Please try attached patch.

Would be nice, but until mmap stops using realfile this isn't a good solution.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23 11:39   ` Miklos Szeredi
@ 2019-04-23 12:44     ` Amir Goldstein
  2019-04-23 13:41       ` Miklos Szeredi
  0 siblings, 1 reply; 9+ messages in thread
From: Amir Goldstein @ 2019-04-23 12:44 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Murphy Zhou, overlayfs, linux-fsdevel

On Tue, Apr 23, 2019 at 2:40 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
> > >
> > > Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> > > in which calls fsnotify_modify on this change, however vfs_write also
> > > calls fsnotify_modify after ovl_iter_write. The first notification
> > > sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> > > because of its fake path. The second one sent by vfs_write grabs marks
> > > from ovl inode and ovl mnt.
> > >
> > > LTP fanotify06 add modify mark for mnt point, then add ignore modify
> > > mask on testfile, then truncate and write the file. Because the ignore
> > > mask is marked on ovl inode, not the upper inode, the first event is not
> > > masked like the second one. So we get a modification event even with a
> > > mask on the file.
> >
> > Care to extend fanotify06 in a similar manner to the way readahead02
> > was extended to test overlay test case regardless of the base LTP filesystem?
> >
> > >
> > > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> > > vfs_iter_write used by ovl can use this flag to skip one duplicate event.
> > >
> >
> > This fix is wrong for several reasons:
> > - It exports RWF_NONOTIFY to uapi
> > - It will cause no events at all when overlay writes to file even when user
> > requested events on upper inode
> >
> > Please try attached patch.
>
> Would be nice, but until mmap stops using realfile this isn't a good solution.
>

Sigh! I figured there was a catch...
Will it be ok if fake path used a cloned private mount of overlay mount?

This will have averted the ignore mask problem.
Users cannot set a mark on a private mount.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23 12:44     ` Amir Goldstein
@ 2019-04-23 13:41       ` Miklos Szeredi
  2019-04-23 13:53         ` Amir Goldstein
  0 siblings, 1 reply; 9+ messages in thread
From: Miklos Szeredi @ 2019-04-23 13:41 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Murphy Zhou, overlayfs, linux-fsdevel

On Tue, Apr 23, 2019 at 2:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Apr 23, 2019 at 2:40 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
> > > >
> > > > Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> > > > in which calls fsnotify_modify on this change, however vfs_write also
> > > > calls fsnotify_modify after ovl_iter_write. The first notification
> > > > sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> > > > because of its fake path. The second one sent by vfs_write grabs marks
> > > > from ovl inode and ovl mnt.
> > > >
> > > > LTP fanotify06 add modify mark for mnt point, then add ignore modify
> > > > mask on testfile, then truncate and write the file. Because the ignore
> > > > mask is marked on ovl inode, not the upper inode, the first event is not
> > > > masked like the second one. So we get a modification event even with a
> > > > mask on the file.
> > >
> > > Care to extend fanotify06 in a similar manner to the way readahead02
> > > was extended to test overlay test case regardless of the base LTP filesystem?
> > >
> > > >
> > > > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> > > > vfs_iter_write used by ovl can use this flag to skip one duplicate event.
> > > >
> > >
> > > This fix is wrong for several reasons:
> > > - It exports RWF_NONOTIFY to uapi
> > > - It will cause no events at all when overlay writes to file even when user
> > > requested events on upper inode
> > >
> > > Please try attached patch.
> >
> > Would be nice, but until mmap stops using realfile this isn't a good solution.
> >
>
> Sigh! I figured there was a catch...
> Will it be ok if fake path used a cloned private mount of overlay mount?

So the reason we have the fake path is e.g. /proc/$$/maps.  That can
only work with the original mount, AFAICS.

We could have one realfile for regular I/O and a separate one for mmap
but that would increase complexity as well as resource use, so I'm not
quite sure if that's the right solution.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23 13:41       ` Miklos Szeredi
@ 2019-04-23 13:53         ` Amir Goldstein
  2019-04-23 14:16           ` Miklos Szeredi
  0 siblings, 1 reply; 9+ messages in thread
From: Amir Goldstein @ 2019-04-23 13:53 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Murphy Zhou, overlayfs, linux-fsdevel

On Tue, Apr 23, 2019 at 4:41 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Tue, Apr 23, 2019 at 2:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Tue, Apr 23, 2019 at 2:40 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
> > > > >
> > > > > Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> > > > > in which calls fsnotify_modify on this change, however vfs_write also
> > > > > calls fsnotify_modify after ovl_iter_write. The first notification
> > > > > sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> > > > > because of its fake path. The second one sent by vfs_write grabs marks
> > > > > from ovl inode and ovl mnt.
> > > > >
> > > > > LTP fanotify06 add modify mark for mnt point, then add ignore modify
> > > > > mask on testfile, then truncate and write the file. Because the ignore
> > > > > mask is marked on ovl inode, not the upper inode, the first event is not
> > > > > masked like the second one. So we get a modification event even with a
> > > > > mask on the file.
> > > >
> > > > Care to extend fanotify06 in a similar manner to the way readahead02
> > > > was extended to test overlay test case regardless of the base LTP filesystem?
> > > >
> > > > >
> > > > > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> > > > > vfs_iter_write used by ovl can use this flag to skip one duplicate event.
> > > > >
> > > >
> > > > This fix is wrong for several reasons:
> > > > - It exports RWF_NONOTIFY to uapi
> > > > - It will cause no events at all when overlay writes to file even when user
> > > > requested events on upper inode
> > > >
> > > > Please try attached patch.
> > >
> > > Would be nice, but until mmap stops using realfile this isn't a good solution.
> > >
> >
> > Sigh! I figured there was a catch...
> > Will it be ok if fake path used a cloned private mount of overlay mount?
>
> So the reason we have the fake path is e.g. /proc/$$/maps.  That can
> only work with the original mount, AFAICS.
>
> We could have one realfile for regular I/O and a separate one for mmap
> but that would increase complexity as well as resource use, so I'm not
> quite sure if that's the right solution.
>

I see. Well what we could do is set a file flag for fake path, so at least
fsnotify will be able to ignore f_path->mnt when calculating ignore mask.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23 13:53         ` Amir Goldstein
@ 2019-04-23 14:16           ` Miklos Szeredi
  2019-04-23 15:05             ` Amir Goldstein
  0 siblings, 1 reply; 9+ messages in thread
From: Miklos Szeredi @ 2019-04-23 14:16 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Murphy Zhou, overlayfs, linux-fsdevel

On Tue, Apr 23, 2019 at 3:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Apr 23, 2019 at 4:41 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Tue, Apr 23, 2019 at 2:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Tue, Apr 23, 2019 at 2:40 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
> > > > > >
> > > > > > Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> > > > > > in which calls fsnotify_modify on this change, however vfs_write also
> > > > > > calls fsnotify_modify after ovl_iter_write. The first notification
> > > > > > sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> > > > > > because of its fake path. The second one sent by vfs_write grabs marks
> > > > > > from ovl inode and ovl mnt.
> > > > > >
> > > > > > LTP fanotify06 add modify mark for mnt point, then add ignore modify
> > > > > > mask on testfile, then truncate and write the file. Because the ignore
> > > > > > mask is marked on ovl inode, not the upper inode, the first event is not
> > > > > > masked like the second one. So we get a modification event even with a
> > > > > > mask on the file.
> > > > >
> > > > > Care to extend fanotify06 in a similar manner to the way readahead02
> > > > > was extended to test overlay test case regardless of the base LTP filesystem?
> > > > >
> > > > > >
> > > > > > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> > > > > > vfs_iter_write used by ovl can use this flag to skip one duplicate event.
> > > > > >
> > > > >
> > > > > This fix is wrong for several reasons:
> > > > > - It exports RWF_NONOTIFY to uapi
> > > > > - It will cause no events at all when overlay writes to file even when user
> > > > > requested events on upper inode
> > > > >
> > > > > Please try attached patch.
> > > >
> > > > Would be nice, but until mmap stops using realfile this isn't a good solution.
> > > >
> > >
> > > Sigh! I figured there was a catch...
> > > Will it be ok if fake path used a cloned private mount of overlay mount?
> >
> > So the reason we have the fake path is e.g. /proc/$$/maps.  That can
> > only work with the original mount, AFAICS.
> >
> > We could have one realfile for regular I/O and a separate one for mmap
> > but that would increase complexity as well as resource use, so I'm not
> > quite sure if that's the right solution.
> >
>
> I see. Well what we could do is set a file flag for fake path, so at least
> fsnotify will be able to ignore f_path->mnt when calculating ignore mask.

FMODE_NONOTIFY_MNT: like FMODE_NONOTIFY, but only affect mount marks?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23 14:16           ` Miklos Szeredi
@ 2019-04-23 15:05             ` Amir Goldstein
  2019-04-23 16:24               ` Amir Goldstein
  0 siblings, 1 reply; 9+ messages in thread
From: Amir Goldstein @ 2019-04-23 15:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Murphy Zhou, overlayfs, linux-fsdevel, Jan Kara

On Tue, Apr 23, 2019 at 5:16 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Tue, Apr 23, 2019 at 3:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Tue, Apr 23, 2019 at 4:41 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Tue, Apr 23, 2019 at 2:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 23, 2019 at 2:40 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > > >
> > > > > On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
> > > > > > >
> > > > > > > Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> > > > > > > in which calls fsnotify_modify on this change, however vfs_write also
> > > > > > > calls fsnotify_modify after ovl_iter_write. The first notification
> > > > > > > sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> > > > > > > because of its fake path. The second one sent by vfs_write grabs marks
> > > > > > > from ovl inode and ovl mnt.
> > > > > > >
> > > > > > > LTP fanotify06 add modify mark for mnt point, then add ignore modify
> > > > > > > mask on testfile, then truncate and write the file. Because the ignore
> > > > > > > mask is marked on ovl inode, not the upper inode, the first event is not
> > > > > > > masked like the second one. So we get a modification event even with a
> > > > > > > mask on the file.
> > > > > >
> > > > > > Care to extend fanotify06 in a similar manner to the way readahead02
> > > > > > was extended to test overlay test case regardless of the base LTP filesystem?
> > > > > >
> > > > > > >
> > > > > > > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> > > > > > > vfs_iter_write used by ovl can use this flag to skip one duplicate event.
> > > > > > >
> > > > > >
> > > > > > This fix is wrong for several reasons:
> > > > > > - It exports RWF_NONOTIFY to uapi
> > > > > > - It will cause no events at all when overlay writes to file even when user
> > > > > > requested events on upper inode
> > > > > >
> > > > > > Please try attached patch.
> > > > >
> > > > > Would be nice, but until mmap stops using realfile this isn't a good solution.
> > > > >
> > > >
> > > > Sigh! I figured there was a catch...
> > > > Will it be ok if fake path used a cloned private mount of overlay mount?
> > >
> > > So the reason we have the fake path is e.g. /proc/$$/maps.  That can
> > > only work with the original mount, AFAICS.
> > >
> > > We could have one realfile for regular I/O and a separate one for mmap
> > > but that would increase complexity as well as resource use, so I'm not
> > > quite sure if that's the right solution.
> > >
> >
> > I see. Well what we could do is set a file flag for fake path, so at least
> > fsnotify will be able to ignore f_path->mnt when calculating ignore mask.
>
> FMODE_NONOTIFY_MNT: like FMODE_NONOTIFY, but only affect mount marks?
>

On second thought I don't think there a need for an explicit
FMODE_NONOTIFY_MNT/FMODE_FAKE_PATH flag because code
can just check for file_inode(f) != d_inode(f->f_path->dentry).
The question is what do do about this.
I think situation is actually worse than reported (CC Jan).

This problem existed even before stacked f_op, but is now manifesting
differently because of stacked f_op.

When user sets a mark on real inode or real parent dir inode, the
user events that user gets when operations are performed via overlayfs file
are with event->fd created by dentry_open(&event->path, ...
So even when ignore mount marks are not involved, user will get the wrong
path (overlayfs path) and referenced by event->fd while the mark was set
on the real inode.

The good new is that I believe fanotify marks are most commonly set on
mount marks in the wild (?), so the events via overlayfs file on real inode
won't be triggered in this case anyway and that's fine.
FAN_MARK_FILESYSTEM may have opened up the possibility of exposing
this bug to new users.

In fsnotify_path(), when inode mismatches path->dentry->d_inode
calling fsnotify_parent() is wrong (we are looking for parent watch of
inode) and passing FSNOTIFY_EVENT_PATH info is wrong, because
fsnotify() will filter by wrong mount and wrong path will we reported to
user via event->fd.

We could still report the event with FSNOTIFY_EVENT_INODE info,
so at least inotify/audit watch on inode itself will work, but parent
watches will not work as well as legacy fanotify inode watches.
New FAN_REPORT_FID fanotify watches should work just fine.

Any other ideas how to untangle?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event
  2019-04-23 15:05             ` Amir Goldstein
@ 2019-04-23 16:24               ` Amir Goldstein
  0 siblings, 0 replies; 9+ messages in thread
From: Amir Goldstein @ 2019-04-23 16:24 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Murphy Zhou, overlayfs, linux-fsdevel, Jan Kara

[-- Attachment #1: Type: text/plain, Size: 5102 bytes --]

On Tue, Apr 23, 2019 at 6:05 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Apr 23, 2019 at 5:16 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Tue, Apr 23, 2019 at 3:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Tue, Apr 23, 2019 at 4:41 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Tue, Apr 23, 2019 at 2:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > On Tue, Apr 23, 2019 at 2:40 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > > > >
> > > > > > On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou <jencce.kernel@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Overlays ovl_iter_write calls vfs_iter_write to write on real file,
> > > > > > > > in which calls fsnotify_modify on this change, however vfs_write also
> > > > > > > > calls fsnotify_modify after ovl_iter_write. The first notification
> > > > > > > > sent by vfs_iter_write grabs marks from upper inode and overlay mnt,
> > > > > > > > because of its fake path. The second one sent by vfs_write grabs marks
> > > > > > > > from ovl inode and ovl mnt.
> > > > > > > >
> > > > > > > > LTP fanotify06 add modify mark for mnt point, then add ignore modify
> > > > > > > > mask on testfile, then truncate and write the file. Because the ignore
> > > > > > > > mask is marked on ovl inode, not the upper inode, the first event is not
> > > > > > > > masked like the second one. So we get a modification event even with a
> > > > > > > > mask on the file.
> > > > > > >
> > > > > > > Care to extend fanotify06 in a similar manner to the way readahead02
> > > > > > > was extended to test overlay test case regardless of the base LTP filesystem?
> > > > > > >
> > > > > > > >
> > > > > > > > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO.
> > > > > > > > vfs_iter_write used by ovl can use this flag to skip one duplicate event.
> > > > > > > >
> > > > > > >
> > > > > > > This fix is wrong for several reasons:
> > > > > > > - It exports RWF_NONOTIFY to uapi
> > > > > > > - It will cause no events at all when overlay writes to file even when user
> > > > > > > requested events on upper inode
> > > > > > >
> > > > > > > Please try attached patch.
> > > > > >
> > > > > > Would be nice, but until mmap stops using realfile this isn't a good solution.
> > > > > >
> > > > >
> > > > > Sigh! I figured there was a catch...
> > > > > Will it be ok if fake path used a cloned private mount of overlay mount?
> > > >
> > > > So the reason we have the fake path is e.g. /proc/$$/maps.  That can
> > > > only work with the original mount, AFAICS.
> > > >
> > > > We could have one realfile for regular I/O and a separate one for mmap
> > > > but that would increase complexity as well as resource use, so I'm not
> > > > quite sure if that's the right solution.
> > > >
> > >
> > > I see. Well what we could do is set a file flag for fake path, so at least
> > > fsnotify will be able to ignore f_path->mnt when calculating ignore mask.
> >
> > FMODE_NONOTIFY_MNT: like FMODE_NONOTIFY, but only affect mount marks?
> >
>
> On second thought I don't think there a need for an explicit
> FMODE_NONOTIFY_MNT/FMODE_FAKE_PATH flag because code
> can just check for file_inode(f) != d_inode(f->f_path->dentry).
> The question is what do do about this.
> I think situation is actually worse than reported (CC Jan).
>
> This problem existed even before stacked f_op, but is now manifesting
> differently because of stacked f_op.
>
> When user sets a mark on real inode or real parent dir inode, the
> user events that user gets when operations are performed via overlayfs file
> are with event->fd created by dentry_open(&event->path, ...
> So even when ignore mount marks are not involved, user will get the wrong
> path (overlayfs path) and referenced by event->fd while the mark was set
> on the real inode.
>
> The good new is that I believe fanotify marks are most commonly set on
> mount marks in the wild (?), so the events via overlayfs file on real inode
> won't be triggered in this case anyway and that's fine.
> FAN_MARK_FILESYSTEM may have opened up the possibility of exposing
> this bug to new users.
>
> In fsnotify_path(), when inode mismatches path->dentry->d_inode
> calling fsnotify_parent() is wrong (we are looking for parent watch of
> inode) and passing FSNOTIFY_EVENT_PATH info is wrong, because
> fsnotify() will filter by wrong mount and wrong path will we reported to
> user via event->fd.
>
> We could still report the event with FSNOTIFY_EVENT_INODE info,
> so at least inotify/audit watch on inode itself will work, but parent
> watches will not work as well as legacy fanotify inode watches.
> New FAN_REPORT_FID fanotify watches should work just fine.
>
> Any other ideas how to untangle?
>

How about this patch?

If prevents duplicate events due to "fake" path and addresses
the reported issue with ignore mask, but it leaves the "wrong"
paths reported with event->fd as is.

Murpphy, can you test it?

Thanks,
Amir.

[-- Attachment #2: 0001-fsnotify-do-not-generate-duplicate-fsnotify-events-f.patch.txt --]
[-- Type: text/plain, Size: 3044 bytes --]

From 2cc6fe16130e482176de105d6f906cf0047e2db8 Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il@gmail.com>
Date: Tue, 23 Apr 2019 18:24:12 +0300
Subject: [PATCH] fsnotify: do not generate duplicate fsnotify events for
 "fake" path

Overlayfs "fake" path is used for stacked file operations on
underlying files.  Operations on files with "fake" path must not
generate events on mount marks and on parent watches, because
those events have already been generated at overlayfs layer.

The reported event->fd for inode/sb marks will have the wrong path
(overlayfs path), but we have no choice but to report them anyway.

Link: https://lore.kernel.org/linux-fsdevel/20190423065024.12695-1-jencce.kernel@gmail.com/
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Fixes: d1d04ef8572b ("ovl: stack file ops")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c     | 18 ++++++++++++++++--
 include/linux/fsnotify.h | 14 +++++++++++---
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index df06f3da166c..ee5e4ea7e5ca 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -335,9 +335,23 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	__u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
 
 	if (data_is == FSNOTIFY_EVENT_PATH) {
-		mnt = real_mount(((const struct path *)data)->mnt);
-		mnt_or_sb_mask |= mnt->mnt_fsnotify_mask;
+		const struct path *path = (const struct path *)data;
+
+		/*
+		 * Overlayfs "fake" path used for stacked file operations on
+		 * underlying files must not generate events on mount marks,
+		 * because those events have already been generated at
+		 * overlayfs layer.  The reported event->fd for inode/sb marks
+		 * will have the wrong path (overlayfs path), but we have no
+		 * choice but to report them anyway.
+		 */
+		if (likely((mask & FS_EVENT_ON_CHILD) ||
+		    to_tell == d_inode(path->dentry))) {
+			mnt = real_mount(((const struct path *)data)->mnt);
+			mnt_or_sb_mask |= mnt->mnt_fsnotify_mask;
+		}
 	}
+
 	/* An event "on child" is not intended for a mount/sb mark */
 	if (mask & FS_EVENT_ON_CHILD)
 		mnt_or_sb_mask = 0;
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 09587e2860b5..2a197799c06d 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -47,10 +47,18 @@ static inline int fsnotify_parent(const struct path *path,
 static inline int fsnotify_path(struct inode *inode, const struct path *path,
 				__u32 mask)
 {
-	int ret = fsnotify_parent(path, NULL, mask);
+	/*
+	 * Overlayfs "fake" path used for stacked file operations on underlying
+	 * files must not generate events on parent, because those events have
+	 * already been generated at overlayfs layer.
+	 */
+	if (likely(inode == d_inode(path->dentry))) {
+		int ret = fsnotify_parent(path, NULL, mask);
+
+		if (ret)
+			return ret;
+	}
 
-	if (ret)
-		return ret;
 	return fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0);
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-04-23 16:24 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-23  6:50 [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event Murphy Zhou
2019-04-23 10:59 ` Amir Goldstein
2019-04-23 11:39   ` Miklos Szeredi
2019-04-23 12:44     ` Amir Goldstein
2019-04-23 13:41       ` Miklos Szeredi
2019-04-23 13:53         ` Amir Goldstein
2019-04-23 14:16           ` Miklos Szeredi
2019-04-23 15:05             ` Amir Goldstein
2019-04-23 16:24               ` Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).