* [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite
@ 2012-03-01 11:41 Jan Kara
  2012-03-01 11:41 ` [PATCH 1/9] fb_defio: Push file_update_time() into fb_deferred_io_mkwrite() Jan Kara
                   ` (11 more replies)
  0 siblings, 12 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Jaya Kumar, Sage Weil, ceph-devel, Steve French, linux-cifs,
	Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
	v9fs-developer, Miklos Szeredi, fuse-devel, Steven Whitehouse,
	cluster-devel, Greg Kroah-Hartman

  Hello,

  to provide reliable support for filesystem freezing, filesystems need to have
complete control over when metadata is changed. In particular,
file_update_time() calls from page fault code make it impossible for
filesystems to prevent inodes from being dirtied while the filesystem is
frozen.

To fix the issue, this patch set changes page fault code to call
file_update_time() only when ->page_mkwrite() callback is not provided. If the
callback is provided, it is the responsibility of the filesystem to perform
update of i_mtime / i_ctime if needed. We also push file_update_time() call
to all existing ->page_mkwrite() implementations if the time update does not
obviously happen by other means. If you know your filesystem does not need
update of modification times in ->page_mkwrite() handler, please speak up and
I'll drop the patch for your filesystem.
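
The dispatch rule described above can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not the kernel code:
the struct layouts are reduced to the fields the rule needs, the
file_update_time() here is a stand-in that merely counts calls, and
fs_page_mkwrite(), fs_page_mkwrite_no_times() and model_write_fault() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these types merely mimic the kernel's names. */
struct file {
	int time_updates;	/* counts modelled file_update_time() calls */
};

struct vm_area_struct;

struct vm_operations_struct {
	/* NULL when no ->page_mkwrite() callback is provided */
	int (*page_mkwrite)(struct vm_area_struct *vma);
};

struct vm_area_struct {
	struct file *vm_file;
	const struct vm_operations_struct *vm_ops;
};

/* Stand-in for the kernel's file_update_time(); it only counts calls. */
static void file_update_time(struct file *file)
{
	assert(file != NULL);
	file->time_updates++;
}

/* Hypothetical handler of a filesystem that updates the times itself. */
static int fs_page_mkwrite(struct vm_area_struct *vma)
{
	file_update_time(vma->vm_file);
	return 0;
}

/* Hypothetical handler of a filesystem that does not need the update. */
static int fs_page_mkwrite_no_times(struct vm_area_struct *vma)
{
	(void)vma;
	return 0;
}

/* Modelled write-fault path with this series applied. */
static int model_write_fault(struct vm_area_struct *vma)
{
	/* Callback provided: the time update is the filesystem's job. */
	if (vma->vm_ops && vma->vm_ops->page_mkwrite)
		return vma->vm_ops->page_mkwrite(vma);

	/* No callback: generic code still updates the times. */
	if (vma->vm_file)
		file_update_time(vma->vm_file);
	return 0;
}
```

In this model a filesystem that wants to block timestamp updates while frozen
only has to do so inside its own handler, because the generic path no longer
calls file_update_time() behind its back.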

As a side note, an alternative would be to remove call of file_update_time()
from page fault code altogether and require all filesystems needing it to do
that in their ->page_mkwrite() implementation. That is certainly possible
although maybe slightly inefficient and would require auditing 100+
vm_operations_structs *shiver*.

Changes since v1:
* Dropped patches for filesystems which don't need them
* Added some acks
* Improved sysfs patch per Alex Elder's suggestion

Andrew, would you be willing to merge these patches via your tree?

								Honza

CC: Jaya Kumar <jayalk@intworks.biz>
CC: Sage Weil <sage@newdream.net>
CC: ceph-devel@vger.kernel.org
CC: Steve French <sfrench@samba.org>
CC: linux-cifs@vger.kernel.org
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Ron Minnich <rminnich@sandia.gov>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: v9fs-developer@lists.sourceforge.net
CC: Miklos Szeredi <miklos@szeredi.hu>
CC: fuse-devel@lists.sourceforge.net
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* [PATCH 1/9] fb_defio: Push file_update_time() into fb_deferred_io_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 11:41 ` [PATCH 2/9] fs: Push file_update_time() into __block_page_mkwrite() Jan Kara
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara, Jaya Kumar

CC: Jaya Kumar <jayalk@intworks.biz>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/video/fb_defio.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/video/fb_defio.c b/drivers/video/fb_defio.c
index c27e153..7a09c06 100644
--- a/drivers/video/fb_defio.c
+++ b/drivers/video/fb_defio.c
@@ -104,6 +104,8 @@ static int fb_deferred_io_mkwrite(struct vm_area_struct *vma,
 	deferred framebuffer IO. then if userspace touches a page
 	again, we repeat the same scheme */
 
+	file_update_time(vma->vm_file);
+
 	/* protect against the workqueue changing the page list */
 	mutex_lock(&fbdefio->lock);
 
-- 
1.7.1



* [PATCH 2/9] fs: Push file_update_time() into __block_page_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
  2012-03-01 11:41 ` [PATCH 1/9] fb_defio: Push file_update_time() into fb_deferred_io_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 11:41 ` [PATCH 3/9] ceph: Push file_update_time() into ceph_page_mkwrite() Jan Kara
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/buffer.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 1a30db7..5294a33 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2300,6 +2300,12 @@ int __block_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf,
 	loff_t size;
 	int ret;
 
+	/*
+	 * Update file times before taking page lock. We may end up failing the
+	 * fault so this update may be superfluous but who really cares...
+	 */
+	file_update_time(vma->vm_file);
+
 	lock_page(page);
 	size = i_size_read(inode);
 	if ((page->mapping != inode->i_mapping) ||
-- 
1.7.1



* [PATCH 3/9] ceph: Push file_update_time() into ceph_page_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
  2012-03-01 11:41 ` [PATCH 1/9] fb_defio: Push file_update_time() into fb_deferred_io_mkwrite() Jan Kara
  2012-03-01 11:41 ` [PATCH 2/9] fs: Push file_update_time() into __block_page_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 11:41 ` [PATCH 4/9] cifs: Push file_update_time() into cifs_page_mkwrite() Jan Kara
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Sage Weil, ceph-devel

CC: Sage Weil <sage@newdream.net>
CC: ceph-devel@vger.kernel.org
Acked-by: Sage Weil <sage@newdream.net>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ceph/addr.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 173b1d2..12b139f 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1181,6 +1181,9 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 	loff_t size, len;
 	int ret;
 
+	/* Update time before taking page lock */
+	file_update_time(vma->vm_file);
+
 	size = i_size_read(inode);
 	if (off + PAGE_CACHE_SIZE <= size)
 		len = PAGE_CACHE_SIZE;
-- 
1.7.1



* [PATCH 4/9] cifs: Push file_update_time() into cifs_page_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (2 preceding siblings ...)
  2012-03-01 11:41 ` [PATCH 3/9] ceph: Push file_update_time() into ceph_page_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 12:25   ` Jeff Layton
  2012-03-01 11:41 ` [PATCH 5/9] 9p: Push file_update_time() into v9fs_vm_page_mkwrite() Jan Kara
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Steve French, linux-cifs

CC: Steve French <sfrench@samba.org>
CC: linux-cifs@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/cifs/file.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4dd9283..8e3b23b 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2425,6 +2425,9 @@ cifs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct page *page = vmf->page;
 
+	/* Update file times before taking page lock */
+	file_update_time(vma->vm_file);
+
 	lock_page(page);
 	return VM_FAULT_LOCKED;
 }
-- 
1.7.1



* [PATCH 5/9] 9p: Push file_update_time() into v9fs_vm_page_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (3 preceding siblings ...)
  2012-03-01 11:41 ` [PATCH 4/9] cifs: Push file_update_time() into cifs_page_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 11:41 ` [PATCH 6/9] fuse: Push file_update_time() into fuse_page_mkwrite() Jan Kara
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
	v9fs-developer

CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Ron Minnich <rminnich@sandia.gov>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: v9fs-developer@lists.sourceforge.net
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/9p/vfs_file.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index fc06fd2..dd6f7ee 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -610,6 +610,9 @@ v9fs_vm_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 	p9_debug(P9_DEBUG_VFS, "page %p fid %lx\n",
 		 page, (unsigned long)filp->private_data);
 
+	/* Update file times before taking page lock */
+	file_update_time(filp);
+
 	v9inode = V9FS_I(inode);
 	/* make sure the cache has finished storing the page */
 	v9fs_fscache_wait_on_page_write(inode, page);
-- 
1.7.1



* [PATCH 6/9] fuse: Push file_update_time() into fuse_page_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (4 preceding siblings ...)
  2012-03-01 11:41 ` [PATCH 5/9] 9p: Push file_update_time() into v9fs_vm_page_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 19:31   ` Miklos Szeredi
  2012-03-01 11:41 ` [PATCH 7/9] gfs2: Push file_update_time() into gfs2_page_mkwrite() Jan Kara
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Miklos Szeredi, fuse-devel

CC: Miklos Szeredi <miklos@szeredi.hu>
CC: fuse-devel@lists.sourceforge.net
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fuse/file.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 4a199fd..eade72e 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1323,6 +1323,7 @@ static int fuse_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 	 */
 	struct inode *inode = vma->vm_file->f_mapping->host;
 
+	file_update_time(vma->vm_file);
 	fuse_wait_on_page_writeback(inode, page->index);
 	return 0;
 }
-- 
1.7.1



* [PATCH 7/9] gfs2: Push file_update_time() into gfs2_page_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (5 preceding siblings ...)
  2012-03-01 11:41 ` [PATCH 6/9] fuse: Push file_update_time() into fuse_page_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 11:41 ` [PATCH 8/9] sysfs: Push file_update_time() into bin_page_mkwrite() Jan Kara
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Steven Whitehouse, cluster-devel

CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/gfs2/file.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index c5fb359..1f03531 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -375,6 +375,9 @@ static int gfs2_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 	 */
 	vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE);
 
+	/* Update file times before taking page lock */
+	file_update_time(vma->vm_file);
+
 	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
 	ret = gfs2_glock_nq(&gh);
 	if (ret)
-- 
1.7.1



* [PATCH 8/9] sysfs: Push file_update_time() into bin_page_mkwrite()
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (6 preceding siblings ...)
  2012-03-01 11:41 ` [PATCH 7/9] gfs2: Push file_update_time() into gfs2_page_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 11:41 ` [PATCH 9/9] mm: Update file times from fault path only if .page_mkwrite is not set Jan Kara
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Greg Kroah-Hartman

CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/sysfs/bin.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index a475983..614b2b5 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -228,6 +228,8 @@ static int bin_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 	ret = 0;
 	if (bb->vm_ops->page_mkwrite)
 		ret = bb->vm_ops->page_mkwrite(vma, vmf);
+	else
+		file_update_time(file);
 
 	sysfs_put_active(attr_sd);
 	return ret;
-- 
1.7.1



* [PATCH 9/9] mm: Update file times from fault path only if .page_mkwrite is not set
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (7 preceding siblings ...)
  2012-03-01 11:41 ` [PATCH 8/9] sysfs: Push file_update_time() into bin_page_mkwrite() Jan Kara
@ 2012-03-01 11:41 ` Jan Kara
  2012-03-01 12:23 ` [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 11:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara

Filesystems wanting to properly support freezing need to have control
over when file_update_time() is called. After pushing file_update_time()
to all relevant .page_mkwrite implementations we can just stop calling
file_update_time() when the filesystem implements .page_mkwrite.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/memory.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index fa2f04e..1dfe9a1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2621,6 +2621,9 @@ reuse:
 		if (!page_mkwrite) {
 			wait_on_page_locked(dirty_page);
 			set_page_dirty_balance(dirty_page, page_mkwrite);
+			/* file_update_time outside page_lock */
+			if (vma->vm_file)
+				file_update_time(vma->vm_file);
 		}
 		put_page(dirty_page);
 		if (page_mkwrite) {
@@ -2638,10 +2641,6 @@ reuse:
 			}
 		}
 
-		/* file_update_time outside page_lock */
-		if (vma->vm_file)
-			file_update_time(vma->vm_file);
-
 		return ret;
 	}
 
@@ -3310,12 +3309,13 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	if (dirty_page) {
 		struct address_space *mapping = page->mapping;
+		int dirtied = 0;
 
 		if (set_page_dirty(dirty_page))
-			page_mkwrite = 1;
+			dirtied = 1;
 		unlock_page(dirty_page);
 		put_page(dirty_page);
-		if (page_mkwrite && mapping) {
+		if ((dirtied || page_mkwrite) && mapping) {
 			/*
 			 * Some device drivers do not set page.mapping but still
 			 * dirty their pages
@@ -3324,7 +3324,7 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		}
 
 		/* file_update_time outside page_lock */
-		if (vma->vm_file)
+		if (vma->vm_file && !page_mkwrite)
 			file_update_time(vma->vm_file);
 	} else {
 		unlock_page(vmf.page);
-- 
1.7.1



* Re: [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (8 preceding siblings ...)
  2012-03-01 11:41 ` [PATCH 9/9] mm: Update file times from fault path only if .page_mkwrite is not set Jan Kara
@ 2012-03-01 12:23 ` Jan Kara
  2012-03-01 23:29 ` Ted Ts'o
  2012-03-08 23:12 ` Andy Lutomirski
  11 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 12:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Al Viro, linux-fsdevel, dchinner, Jan Kara,
	Jaya Kumar, Sage Weil, ceph-devel, Steve French, linux-cifs,
	Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
	v9fs-developer, Miklos Szeredi, fuse-devel, Steven Whitehouse,
	cluster-devel, Greg Kroah-Hartman

  Bah, the subject should have been 0/9... Sorry.

								Honza
On Thu 01-03-12 12:41:34, Jan Kara wrote:
>   Hello,
> 
>   to provide reliable support for filesystem freezing, filesystems need to have
> complete control over when metadata is changed. In particular,
> file_update_time() calls from page fault code make it impossible for
> filesystems to prevent inodes from being dirtied while the filesystem is
> frozen.
> 
> To fix the issue, this patch set changes page fault code to call
> file_update_time() only when ->page_mkwrite() callback is not provided. If the
> callback is provided, it is the responsibility of the filesystem to perform
> update of i_mtime / i_ctime if needed. We also push file_update_time() call
> to all existing ->page_mkwrite() implementations if the time update does not
> obviously happen by other means. If you know your filesystem does not need
> update of modification times in ->page_mkwrite() handler, please speak up and
> I'll drop the patch for your filesystem.
> 
> As a side note, an alternative would be to remove call of file_update_time()
> from page fault code altogether and require all filesystems needing it to do
> that in their ->page_mkwrite() implementation. That is certainly possible
> although maybe slightly inefficient and would require auditing 100+
> vm_operations_structs *shiver*.
> 
> Changes since v1:
> * Dropped patches for filesystems which don't need them
> * Added some acks
> * Improved sysfs patch per Alex Elder's suggestion
> 
> Andrew, would you be willing to merge these patches via your tree?
> 
> 								Honza
> 
> CC: Jaya Kumar <jayalk@intworks.biz>
> CC: Sage Weil <sage@newdream.net>
> CC: ceph-devel@vger.kernel.org
> CC: Steve French <sfrench@samba.org>
> CC: linux-cifs@vger.kernel.org
> CC: Eric Van Hensbergen <ericvh@gmail.com>
> CC: Ron Minnich <rminnich@sandia.gov>
> CC: Latchesar Ionkov <lucho@ionkov.net>
> CC: v9fs-developer@lists.sourceforge.net
> CC: Miklos Szeredi <miklos@szeredi.hu>
> CC: fuse-devel@lists.sourceforge.net
> CC: Steven Whitehouse <swhiteho@redhat.com>
> CC: cluster-devel@redhat.com
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 4/9] cifs: Push file_update_time() into cifs_page_mkwrite()
  2012-03-01 11:41 ` [PATCH 4/9] cifs: Push file_update_time() into cifs_page_mkwrite() Jan Kara
@ 2012-03-01 12:25   ` Jeff Layton
  2012-03-01 12:30     ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff Layton @ 2012-03-01 12:25 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel, dchinner,
	Steve French, linux-cifs

On Thu,  1 Mar 2012 12:41:38 +0100
Jan Kara <jack@suse.cz> wrote:

> CC: Steve French <sfrench@samba.org>
> CC: linux-cifs@vger.kernel.org
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/cifs/file.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 4dd9283..8e3b23b 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -2425,6 +2425,9 @@ cifs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
>  {
>  	struct page *page = vmf->page;
>  
> +	/* Update file times before taking page lock */
> +	file_update_time(vma->vm_file);
> +
>  	lock_page(page);
>  	return VM_FAULT_LOCKED;
>  }

Sorry, I meant to comment on this earlier...

I think we can probably drop this patch. cifs doesn't currently set
S_NOCMTIME on all inodes (only when MS_NOATIME is set currently), but I
think that it probably should. Timestamps should be the purview of the
server.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH 4/9] cifs: Push file_update_time() into cifs_page_mkwrite()
  2012-03-01 12:25   ` Jeff Layton
@ 2012-03-01 12:30     ` Jan Kara
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 12:30 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Jan Kara, Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel,
	dchinner, Steve French, linux-cifs

On Thu 01-03-12 07:25:37, Jeff Layton wrote:
> On Thu,  1 Mar 2012 12:41:38 +0100
> Jan Kara <jack@suse.cz> wrote:
> 
> > CC: Steve French <sfrench@samba.org>
> > CC: linux-cifs@vger.kernel.org
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/cifs/file.c |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > index 4dd9283..8e3b23b 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -2425,6 +2425,9 @@ cifs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  {
> >  	struct page *page = vmf->page;
> >  
> > +	/* Update file times before taking page lock */
> > +	file_update_time(vma->vm_file);
> > +
> >  	lock_page(page);
> >  	return VM_FAULT_LOCKED;
> >  }
> 
> Sorry, I meant to comment on this earlier...
> 
> I think we can probably drop this patch. cifs doesn't currently set
> S_NOCMTIME on all inodes (only when MS_NOATIME is set currently), but I
> think that it probably should. Timestamps should be the purview of the
> server.
  OK, thanks for letting me know. Patch dropped.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 6/9] fuse: Push file_update_time() into fuse_page_mkwrite()
  2012-03-01 11:41 ` [PATCH 6/9] fuse: Push file_update_time() into fuse_page_mkwrite() Jan Kara
@ 2012-03-01 19:31   ` Miklos Szeredi
  2012-03-01 20:36     ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Miklos Szeredi @ 2012-03-01 19:31 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel, dchinner,
	fuse-devel

Jan Kara <jack@suse.cz> writes:

> CC: Miklos Szeredi <miklos@szeredi.hu>
> CC: fuse-devel@lists.sourceforge.net
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/fuse/file.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 4a199fd..eade72e 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1323,6 +1323,7 @@ static int fuse_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
>  	 */
>  	struct inode *inode = vma->vm_file->f_mapping->host;
>  
> +	file_update_time(vma->vm_file);

Fuse sets S_NOCMTIME in inode flags, so this is a no-op.  IOW the patch is
not needed.
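
Why the call is a no-op can be sketched with a small userspace model,
assuming the flag meant here is S_NOCMTIME and that file_update_time()
returns early when that flag is set on the inode. The struct layouts, the
flag value used for the model, and the name model_file_update_time() are
illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag value chosen for this model; treat it as illustrative only. */
#define S_NOCMTIME 128

/* Reduced userspace stand-ins for the kernel structures. */
struct inode {
	unsigned int i_flags;
	long i_mtime;
	long i_ctime;
};

struct file {
	struct inode *f_inode;
};

/*
 * Modelled file_update_time(): returns true if the timestamps were
 * changed, false if the inode opted out via S_NOCMTIME.
 */
static bool model_file_update_time(struct file *file, long now)
{
	struct inode *inode = file->f_inode;

	assert(inode != NULL);

	/* Timestamps are managed elsewhere (e.g. by a server or daemon). */
	if (inode->i_flags & S_NOCMTIME)
		return false;

	inode->i_mtime = now;
	inode->i_ctime = now;
	return true;
}
```

Under these assumptions, adding the call to a ->page_mkwrite() handler of a
filesystem that sets the flag on its inodes changes nothing, which matches
the reasoning for dropping the patch.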

Thanks,
Miklos


* Re: [PATCH 6/9] fuse: Push file_update_time() into fuse_page_mkwrite()
  2012-03-01 19:31   ` Miklos Szeredi
@ 2012-03-01 20:36     ` Jan Kara
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-01 20:36 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Jan Kara, Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel,
	dchinner, fuse-devel

On Thu 01-03-12 20:31:17, Miklos Szeredi wrote:
> Jan Kara <jack@suse.cz> writes:
> 
> > CC: Miklos Szeredi <miklos@szeredi.hu>
> > CC: fuse-devel@lists.sourceforge.net
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/fuse/file.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index 4a199fd..eade72e 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -1323,6 +1323,7 @@ static int fuse_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  	 */
> >  	struct inode *inode = vma->vm_file->f_mapping->host;
> >  
> > +	file_update_time(vma->vm_file);
> 
> > Fuse sets S_NOCMTIME in inode flags, so this is a no-op.  IOW the patch is
> > not needed.
  I see. Thanks. Patch dropped.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (9 preceding siblings ...)
  2012-03-01 12:23 ` [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
@ 2012-03-01 23:29 ` Ted Ts'o
  2012-03-02  9:41   ` Jan Kara
  2012-03-08 23:12 ` Andy Lutomirski
  11 siblings, 1 reply; 19+ messages in thread
From: Ted Ts'o @ 2012-03-01 23:29 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel, dchinner,
	Jaya Kumar, Sage Weil, ceph-devel, Steve French, linux-cifs,
	Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
	v9fs-developer, Miklos Szeredi, fuse-devel, Steven Whitehouse,
	cluster-devel, Greg Kroah-Hartman

On Thu, Mar 01, 2012 at 12:41:34PM +0100, Jan Kara wrote:
> 
> To fix the issue, this patch set changes page fault code to call
> file_update_time() only when ->page_mkwrite() callback is not provided. If the
> callback is provided, it is the responsibility of the filesystem to perform
> update of i_mtime / i_ctime if needed. We also push file_update_time() call
> to all existing ->page_mkwrite() implementations if the time update does not
> obviously happen by other means. If you know your filesystem does not need
> update of modification times in ->page_mkwrite() handler, please speak up and
> I'll drop the patch for your filesystem.

I don't know if this introductory text is going to be saved anywhere
permanent, such as the merge commit (since git now has the ability to
have much more informative merge descriptions).  But if it is going to
be preserved, it might be worth mentioning that if the filesystem uses
block_page_mkwrite(), it will be handled automatically for them since the
patch series does push the call to file_update_time() into
__block_page_mkwrite().

					- Ted


* Re: [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite
  2012-03-01 23:29 ` Ted Ts'o
@ 2012-03-02  9:41   ` Jan Kara
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-02  9:41 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Jan Kara, Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel,
	dchinner, Jaya Kumar, Sage Weil, ceph-devel, Steve French,
	linux-cifs, Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
	v9fs-developer, Miklos Szeredi, fuse-devel, Steven Whitehouse,
	cluster-devel, Greg Kroah-Hartman

On Thu 01-03-12 18:29:42, Ted Ts'o wrote:
> On Thu, Mar 01, 2012 at 12:41:34PM +0100, Jan Kara wrote:
> > 
> > To fix the issue, this patch set changes page fault code to call
> > file_update_time() only when ->page_mkwrite() callback is not provided. If the
> > callback is provided, it is the responsibility of the filesystem to perform
> > update of i_mtime / i_ctime if needed. We also push file_update_time() call
> > to all existing ->page_mkwrite() implementations if the time update does not
> > obviously happen by other means. If you know your filesystem does not need
> > update of modification times in ->page_mkwrite() handler, please speak up and
> > I'll drop the patch for your filesystem.
> 
> I don't know if this introductory text is going to be saved anywhere
> permanent, such as the merge commit (since git now has the ability to
> have much more informative merge descriptions).  But if it is going to
> be preserved, it might be worth mentioning that if the filesystem uses
> block_page_mkwrite(), it will be handled automatically for them since the
> patch series does push the call to file_update_time() into
> __block_page_mkwrite().
  Good point, added to description.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite
  2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
                   ` (10 preceding siblings ...)
  2012-03-01 23:29 ` Ted Ts'o
@ 2012-03-08 23:12 ` Andy Lutomirski
  2012-03-09  8:19   ` Jan Kara
  11 siblings, 1 reply; 19+ messages in thread
From: Andy Lutomirski @ 2012-03-08 23:12 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel, dchinner,
	Jaya Kumar, Sage Weil, ceph-devel, Steve French, linux-cifs,
	Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
	v9fs-developer, Miklos Szeredi, fuse-devel, Steven Whitehouse,
	cluster-devel, Greg Kroah-Hartman

On 03/01/2012 03:41 AM, Jan Kara wrote:
>   Hello,
> 
>   to provide reliable support for filesystem freezing, filesystems need to have
> complete control over when metadata is changed. In particular,
> file_update_time() calls from page fault code make it impossible for
> filesystems to prevent inodes from being dirtied while the filesystem is
> frozen.
> 
> To fix the issue, this patch set changes page fault code to call
> file_update_time() only when ->page_mkwrite() callback is not provided. If the
> callback is provided, it is the responsibility of the filesystem to perform
> update of i_mtime / i_ctime if needed. We also push file_update_time() call
> to all existing ->page_mkwrite() implementations if the time update does not
> obviously happen by other means. If you know your filesystem does not need
> update of modification times in ->page_mkwrite() handler, please speak up and
> I'll drop the patch for your filesystem.
> 
> As a side note, an alternative would be to remove call of file_update_time()
> from page fault code altogether and require all filesystems needing it to do
> that in their ->page_mkwrite() implementation. That is certainly possible
> although maybe slightly inefficient and would require auditing 100+
> vm_operations_structs *shiver*.



IMO updating file times should happen when changes get written out, not
when a page is made writable, for two reasons:

1. Correctness.  With the current approach, it's very easy for files to
be changed after the last mtime update -- any changes between mkwrite
and actual writeback won't affect mtime.

2. Performance.  I have an application (presumably guessable from my
email address) for which blocking in page_mkwrite is an absolute
show-stopper.  (In fact it's so bad that we reverted back to running on
Windows until I hacked up a kernel to not do this.)  I have an incorrect
patch [1] to fix it, but I haven't gotten around to a real fix.  (I also
have stable pages reverted in my kernel.  Some day I'll submit a patch
to make it a filesystem option.  Or maybe it should even be a block
device / queue property like the alignment offset and optimal io size --
there are plenty of block device and file combinations which don't
benefit at all from stable pages.)

I'd prefer if file_update_time in page_mkwrite didn't proliferate.  A
better fix is probably to introduce a new inode flag, update it when a
page is undirtied, and then dirty and write the inode from the writeback
path.  (Kind of like my patch, but with an inode flag instead of a page
flag, and with the file_update_time done from the fs.)
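
A rough userspace sketch of that idea, next to the current mkwrite-time
update, might look like the following. Everything here is an assumption made
for illustration: the toy_* names, the times_pending flag, and the
simplification of a file to a single page. It is not the patch in [1] and not
kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy single-page file; every field is a hypothetical stand-in. */
struct toy_file {
    long mtime_mkwrite;    /* policy A: stamped when the page is made writable */
    long mtime_writeback;  /* policy B: stamped from the writeback path */
    long last_store;       /* time of the most recent store to the page */
    bool page_writable;    /* mapping is writable, so stores do not fault */
    bool page_dirty;       /* page holds data not yet written out */
    bool times_pending;    /* hypothetical inode flag from the proposal */
};

/* A store through a shared mapping; only the first after writeback faults. */
static void toy_store(struct toy_file *f, long now)
{
    assert(f != NULL);
    if (!f->page_writable) {
        f->mtime_mkwrite = now;   /* policy A updates the time in the fault */
        f->page_writable = true;
    }
    f->page_dirty = true;
    f->last_store = now;
}

/* Writeback undirties the page and records that the times need updating. */
static void toy_clear_dirty_for_io(struct toy_file *f)
{
    assert(f != NULL);
    if (!f->page_dirty)
        return;
    f->page_dirty = false;
    f->page_writable = false;     /* the next store will fault again */
    f->times_pending = true;
}

/* Later in the writeback path, the filesystem consumes the flag. */
static void toy_flush_times(struct toy_file *f, long now)
{
    assert(f != NULL);
    if (f->times_pending) {
        f->mtime_writeback = now; /* policy B updates the time here */
        f->times_pending = false;
    }
}
```

Storing at times 10 and 20 and writing back at 30 leaves the mkwrite-time
stamp at 10, older than the last store, while the writeback-time stamp of 30
covers both stores.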

[1] http://patchwork.ozlabs.org/patch/122516/

--Andy

* Re: [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite
  2012-03-08 23:12 ` Andy Lutomirski
@ 2012-03-09  8:19   ` Jan Kara
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Kara @ 2012-03-09  8:19 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kara, Andrew Morton, linux-mm, LKML, Al Viro, linux-fsdevel,
	dchinner, Jaya Kumar, Sage Weil, ceph-devel, Steve French,
	linux-cifs, Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
	v9fs-developer, Miklos Szeredi, fuse-devel, Steven Whitehouse,
	cluster-devel, Greg Kroah-Hartman

  Hello,

On Thu 08-03-12 15:12:56, Andy Lutomirski wrote:
> On 03/01/2012 03:41 AM, Jan Kara wrote:
> >   Hello,
> > 
> >   to provide reliable support for filesystem freezing, filesystems need to have
> > complete control over when metadata is changed. In particular,
> > file_update_time() calls from page fault code make it impossible for
> > filesystems to prevent inodes from being dirtied while the filesystem is
> > frozen.
> > 
> > To fix the issue, this patch set changes page fault code to call
> > file_update_time() only when ->page_mkwrite() callback is not provided. If the
> > callback is provided, it is the responsibility of the filesystem to perform
> > update of i_mtime / i_ctime if needed. We also push file_update_time() call
> > to all existing ->page_mkwrite() implementations if the time update does not
> > obviously happen by other means. If you know your filesystem does not need
> > update of modification times in ->page_mkwrite() handler, please speak up and
> > I'll drop the patch for your filesystem.
> > 
> > As a side note, an alternative would be to remove call of file_update_time()
> > from page fault code altogether and require all filesystems needing it to do
> > that in their ->page_mkwrite() implementation. That is certainly possible
> > although maybe slightly inefficient and would require auditing 100+
> > vm_operations_structs *shiver*.
> 
> 
> 
> IMO updating file times should happen when changes get written out, not
> when a page is made writable, for two reasons:
> 
> 1. Correctness.  With the current approach, it's very easy for files to
> be changed after the last mtime update -- any changes between mkwrite
> and actual writeback won't affect mtime.
> 
> 2. Performance.  I have an application (presumably guessable from my
> email address) for which blocking in page_mkwrite is an absolute
> show-stopper.  (In fact it's so bad that we reverted back to running on
> Windows until I hacked up a kernel to not do this.)  I have an incorrect
> patch [1] to fix it, but I haven't gotten around to a real fix.  (I also
> have stable pages reverted in my kernel.  Some day I'll submit a patch
> to make it a filesystem option.  Or maybe it should even be a block
> device / queue property like the alignment offset and optimal io size --
> there are plenty of block device and file combinations which don't
> benefit at all from stable pages.)
> 
> I'd prefer if file_update_time in page_mkwrite didn't proliferate.  A
> better fix is probably to introduce a new inode flag, update it when a
> page is undirtied, and then dirty and write the inode from the writeback
> path.  (Kind of like my patch, but with an inode flag instead of a page
> flag, and with the file_update_time done from the fs.)
> 
> [1] http://patchwork.ozlabs.org/patch/122516/

  Andy, I'm aware of your problems. But firstly, I would rather not
complicate the filesystem freezing patch set even more by improving unrelated
things. And secondly, I think these changes won't make fixing your problem
harder. I'd even argue they will make it easier, because you can do the
conversion filesystem by filesystem. Getting lock ordering and other things
right for all filesystems at once is much harder.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

Thread overview: 19+ messages
2012-03-01 11:41 [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
2012-03-01 11:41 ` [PATCH 1/9] fb_defio: Push file_update_time() into fb_deferred_io_mkwrite() Jan Kara
2012-03-01 11:41 ` [PATCH 2/9] fs: Push file_update_time() into __block_page_mkwrite() Jan Kara
2012-03-01 11:41 ` [PATCH 3/9] ceph: Push file_update_time() into ceph_page_mkwrite() Jan Kara
2012-03-01 11:41 ` [PATCH 4/9] cifs: Push file_update_time() into cifs_page_mkwrite() Jan Kara
2012-03-01 12:25   ` Jeff Layton
2012-03-01 12:30     ` Jan Kara
2012-03-01 11:41 ` [PATCH 5/9] 9p: Push file_update_time() into v9fs_vm_page_mkwrite() Jan Kara
2012-03-01 11:41 ` [PATCH 6/9] fuse: Push file_update_time() into fuse_page_mkwrite() Jan Kara
2012-03-01 19:31   ` Miklos Szeredi
2012-03-01 20:36     ` Jan Kara
2012-03-01 11:41 ` [PATCH 7/9] gfs2: Push file_update_time() into gfs2_page_mkwrite() Jan Kara
2012-03-01 11:41 ` [PATCH 8/9] sysfs: Push file_update_time() into bin_page_mkwrite() Jan Kara
2012-03-01 11:41 ` [PATCH 9/9] mm: Update file times from fault path only if .page_mkwrite is not set Jan Kara
2012-03-01 12:23 ` [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite Jan Kara
2012-03-01 23:29 ` Ted Ts'o
2012-03-02  9:41   ` Jan Kara
2012-03-08 23:12 ` Andy Lutomirski
2012-03-09  8:19   ` Jan Kara
