From: Vivek Goyal <vgoyal@redhat.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	linux-unionfs@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Jeremy Eder <jeder@redhat.com>,
	David Howells <dhowells@redhat.com>,
	Ratna Bolla <rbolla@portworx.com>, Gou Rao <grao@portworx.com>,
	Vinod Jayaraman <jv@portworx.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [POC/RFC PATCH] overlayfs: fix data inconsistency at copy up
Date: Fri, 21 Oct 2016 16:13:35 -0400
Message-ID: <20161021201335.GB20129@redhat.com>
In-Reply-To: <CAOQ4uxh7cfh5ZtNX8hkeTuye6hRxbMpzZ4p9mGAn2z5rq3RGvA@mail.gmail.com>

On Fri, Oct 21, 2016 at 11:53:41AM +0300, Amir Goldstein wrote:
> On Thu, Oct 20, 2016 at 11:54 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Thu, Oct 20, 2016 at 04:46:30PM -0400, Vivek Goyal wrote:
> >
> > [..]
> >> > +static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *to)
> >> > +{
> >> > +   struct file *file = iocb->ki_filp;
> >> > +   bool isupper = OVL_TYPE_UPPER(ovl_path_type(file->f_path.dentry));
> >> > +   ssize_t ret = -EINVAL;
> >> > +
> >> > +   if (likely(!isupper)) {
> >> > +           const struct file_operations *fop = ovl_real_fop(file);
> >> > +
> >> > +           if (likely(fop->read_iter))
> >> > +                   ret = fop->read_iter(iocb, to);
> >> > +   } else {
> >> > +           struct file *upperfile = filp_clone_open(file);
> >> > +
> >>
> >> IIUC, every read of lower file will call filp_clone_open(). Looking at the
> >> code of filp_clone_open(), I am concerned about the overhead of this call.
> >> Is it significant? Don't want to be paying too much of penalty for read
> >> operation on lower files. That would be a common case for containers.
> >>
> >
> > Looks like I read the code in reverse. So if I open a file read-only,
> > and if it has not been copied up, I will simply call read_iter() on
> > lower filesystem. But if file has been copied up, then I will call
> > filp_clone_open() and pay the cost. And this will continue till this
> > file is closed by caller.
> >
> 
> I wonder if that cost could be reduced by calling replace_fd() or
> some variant of it to install the cloned file onto the rofd after the
> first access??

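Hmm, interesting. Would something like the following work? This applies on
top of Miklos's patch. It adds a replace_file() helper that walks the
current task's fd table and swaps every slot pointing at the old file for
the cloned upper file. It seems to work for me, though it might be
completely broken/racy. Somebody who understands this code well will have
to take a look.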

---
 fs/file.c            |   41 +++++++++++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c |    1 +
 2 files changed, 42 insertions(+)

Index: rhvgoyal-linux/fs/overlayfs/inode.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/inode.c	2016-10-21 15:43:05.391488406 -0400
+++ rhvgoyal-linux/fs/overlayfs/inode.c	2016-10-21 16:07:57.409420795 -0400
@@ -416,6 +416,7 @@ static ssize_t ovl_read_iter(struct kioc
 		if (IS_ERR(upperfile)) {
 			ret = PTR_ERR(upperfile);
 		} else {
+			replace_file(file, upperfile);
 			ret = vfs_iter_read(upperfile, to, &iocb->ki_pos);
 			fput(upperfile);
 		}
Index: rhvgoyal-linux/fs/file.c
===================================================================
--- rhvgoyal-linux.orig/fs/file.c	2016-10-21 15:43:05.391488406 -0400
+++ rhvgoyal-linux/fs/file.c	2016-10-21 16:08:18.168420795 -0400
@@ -864,6 +864,47 @@ Ebusy:
 	return -EBUSY;
 }
 
+
+int replace_file(struct file *old_file, struct file *new_file)
+{
+#define MAX_TO_FREE	8
+	int n, idx = 0;
+	struct files_struct *files = current->files;
+	struct fdtable *fdt;
+	struct file *to_free[MAX_TO_FREE];
+	bool retry = false;
+
+try_again:
+	spin_lock(&files->file_lock);
+	for (n = 0, fdt = files_fdtable(files); n < fdt->max_fds; n++) {
+		struct file *file;
+		file = rcu_dereference_check_fdtable(files, fdt->fd[n]);
+		if (!file)
+			continue;
+		if (file == old_file) {
+			get_file(new_file);
+			rcu_assign_pointer(fdt->fd[n], new_file);
+			to_free[idx++] = file;
+			if (idx >= MAX_TO_FREE) {
+				retry = true;
+				break;
+			}
+		}
+	}
+	spin_unlock(&files->file_lock);
+	while (idx) {
+		filp_close(to_free[--idx], files);
+	}
+
+	if (retry) {
+		retry = false;
+		idx = 0;
+		goto try_again;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(replace_file);
+
 int replace_fd(unsigned fd, struct file *file, unsigned flags)
 {
 	int err;
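
For anyone who wants to poke at the scan-and-swap logic outside the kernel,
here is a minimal userspace sketch of what replace_file() does to the fd
table. Everything in it is a toy assumption: toy_file, toy_get(), toy_put()
and toy_replace_file() are hypothetical names, refcounts are plain ints, and
there is no locking or RCU. It only models swapping the matching slots and
dropping the displaced references in batches of MAX_TO_FREE.

```c
#include <stddef.h>

#define MAX_FDS     32
#define MAX_TO_FREE 8	/* same batch size as the patch */

/* Toy stand-in for struct file: just a reference count. */
struct toy_file {
	int refcount;
};

static void toy_get(struct toy_file *f) { f->refcount++; }
static void toy_put(struct toy_file *f) { f->refcount--; }

/*
 * Swap every slot in 'table' that points at 'old_file' to 'new_file'.
 * Displaced references are collected and dropped in batches, mirroring
 * how the patch calls filp_close() only after releasing file_lock.
 * Returns the number of slots replaced.
 */
static int toy_replace_file(struct toy_file **table, size_t nr,
			    struct toy_file *old_file,
			    struct toy_file *new_file)
{
	struct toy_file *to_free[MAX_TO_FREE];
	int replaced = 0;
	int idx, retry;
	size_t n;

	/* Guard added for the toy only: swapping a file with itself is a no-op. */
	if (old_file == new_file)
		return 0;

	do {
		idx = 0;
		retry = 0;
		/* In the kernel this loop would run under files->file_lock. */
		for (n = 0; n < nr; n++) {
			if (table[n] != old_file)
				continue;
			toy_get(new_file);
			table[n] = new_file;
			to_free[idx++] = old_file;
			replaced++;
			if (idx >= MAX_TO_FREE) {
				/* Batch full: drop refs, then rescan from slot 0. */
				retry = 1;
				break;
			}
		}
		/* "Unlocked" section: drop the references we displaced. */
		while (idx)
			toy_put(to_free[--idx]);
	} while (retry);

	return replaced;
}
```

Like the patch, which only walks current->files, the sketch touches a single
table. Descriptors held in another task's table would not be updated.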
