All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] md: stop using do_sync_mapping_range
@ 2009-09-23 13:18 Christoph Hellwig
  2009-09-23 13:47 ` Jamie Lokier
  2009-09-23 22:52 ` Neil Brown
  0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2009-09-23 13:18 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid, linux-fsdevel

It's a very awkward way to write out all data and wait for it, so just
call filemap_write_and_wait.  I still can't figure what the point of
all this is, so a comment would surely be helpful.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: vfs-2.6.git/drivers/md/bitmap.c
===================================================================
--- vfs-2.6.git.orig/drivers/md/bitmap.c	2009-09-22 14:31:08.698262797 -0300
+++ vfs-2.6.git/drivers/md/bitmap.c	2009-09-22 14:33:01.573762756 -0300
@@ -1621,10 +1621,7 @@ int bitmap_create(mddev_t *mddev)
 	bitmap->offset = mddev->bitmap_offset;
 	if (file) {
 		get_file(file);
-		do_sync_mapping_range(file->f_mapping, 0, LLONG_MAX,
-				      SYNC_FILE_RANGE_WAIT_BEFORE |
-				      SYNC_FILE_RANGE_WRITE |
-				      SYNC_FILE_RANGE_WAIT_AFTER);
+		filemap_write_and_wait(file->f_mapping);
 	}
 	/* read superblock from bitmap file (this sets bitmap->chunksize) */
 	err = bitmap_read_sb(bitmap);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] md: stop using do_sync_mapping_range
  2009-09-23 13:18 [PATCH] md: stop using do_sync_mapping_range Christoph Hellwig
@ 2009-09-23 13:47 ` Jamie Lokier
  2009-09-26 15:11   ` Christoph Hellwig
  2009-09-23 22:52 ` Neil Brown
  1 sibling, 1 reply; 7+ messages in thread
From: Jamie Lokier @ 2009-09-23 13:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Neil Brown, linux-raid, linux-fsdevel

Christoph Hellwig wrote:
> -		do_sync_mapping_range(file->f_mapping, 0, LLONG_MAX,
> -				      SYNC_FILE_RANGE_WAIT_BEFORE |
> -				      SYNC_FILE_RANGE_WRITE |
> -				      SYNC_FILE_RANGE_WAIT_AFTER);
> +		filemap_write_and_wait(file->f_mapping);
>
> It's a very awkward way to write out all data and wait for it, so just
> call filemap_write_and_wait.  I still can't figure what the point of
> all this is, so a comment would surely be helpful.

The SYNC_FILE_RANGE_WAIT_BEFORE is to make sure that writes which were
started before the most recent dirtying are completed before
initiating a second write on the same pages to make sure the most
recently dirty data is written.

Does filemap_write_and_wait() do that?

Fwiw, It's partially redundant: it isn't necessary to have a second
write when the first write request was queued in the elevator but
hadn't yet reached the device.

-- Jamie

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] md: stop using do_sync_mapping_range
  2009-09-23 13:18 [PATCH] md: stop using do_sync_mapping_range Christoph Hellwig
  2009-09-23 13:47 ` Jamie Lokier
@ 2009-09-23 22:52 ` Neil Brown
  2009-09-26 15:13   ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Neil Brown @ 2009-09-23 22:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-raid, linux-fsdevel

On Wednesday September 23, hch@lst.de wrote:
> It's a very awkward way to write out all data and wait for it, so just
> call filemap_write_and_wait.  I still can't figure what the point of
> all this is, so a comment would surely be helpful.

When md/bitmap accesses a file, it uses bmap to find addresses and then
submit_bh to do IO, so it completely by-passes the page cache.
So this code is present to ensure that the page cache has no dirty
pages for the file before we start using the file.

I don't recall exactly why I used do_sync_mapping_range.  I suspect
that I looked at what "sys_fsync" used (do_fsync?) and found that
wasn't exported, so I looked at what sys_sync_file_range used, found
that was exported, and so used that.

Looking at the current state of the VFS, I think I would rather use
vfs_fsync.

So would you be happy with something like the following?

Thanks,
NeilBrown



diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 6986b00..60e2b32 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1624,10 +1624,11 @@ int bitmap_create(mddev_t *mddev)
 	bitmap->offset = mddev->bitmap_offset;
 	if (file) {
 		get_file(file);
-		do_sync_mapping_range(file->f_mapping, 0, LLONG_MAX,
-				      SYNC_FILE_RANGE_WAIT_BEFORE |
-				      SYNC_FILE_RANGE_WRITE |
-				      SYNC_FILE_RANGE_WAIT_AFTER);
+		/* As future accesses to this file will use bmap,
+		 * and bypass the page cache, we must sync the file
+		 * first.
+		 */
+		vfs_fsync(file, file->f_dentry, 1);
 	}
 	/* read superblock from bitmap file (this sets bitmap->chunksize) */
 	err = bitmap_read_sb(bitmap);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] md: stop using do_sync_mapping_range
  2009-09-23 13:47 ` Jamie Lokier
@ 2009-09-26 15:11   ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2009-09-26 15:11 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Christoph Hellwig, Neil Brown, linux-raid, linux-fsdevel

On Wed, Sep 23, 2009 at 02:47:15PM +0100, Jamie Lokier wrote:
> Christoph Hellwig wrote:
> > -		do_sync_mapping_range(file->f_mapping, 0, LLONG_MAX,
> > -				      SYNC_FILE_RANGE_WAIT_BEFORE |
> > -				      SYNC_FILE_RANGE_WRITE |
> > -				      SYNC_FILE_RANGE_WAIT_AFTER);
> > +		filemap_write_and_wait(file->f_mapping);
> >
> > It's a very awkward way to write out all data and wait for it, so just
> > call filemap_write_and_wait.  I still can't figure what the point of
> > all this is, so a comment would surely be helpful.
> 
> The SYNC_FILE_RANGE_WAIT_BEFORE is to make sure that writes which were
> started before the most recent dirtying are completed before
> initiating a second write on the same pages to make sure the most
> recently dirty data is written.
> 
> Does filemap_write_and_wait() do that?

Yes, for WB_SYNC_ALL syncs write_cache_pages or the fs-specific
equivalent waits for pages marked writeback before starting starting
the writeout.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] md: stop using do_sync_mapping_range
  2009-09-23 22:52 ` Neil Brown
@ 2009-09-26 15:13   ` Christoph Hellwig
  2009-09-26 21:51       ` NeilBrown
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2009-09-26 15:13 UTC (permalink / raw)
  To: Neil Brown; +Cc: Christoph Hellwig, linux-raid, linux-fsdevel

On Thu, Sep 24, 2009 at 08:52:36AM +1000, Neil Brown wrote:
> On Wednesday September 23, hch@lst.de wrote:
> > It's a very awkward way to write out all data and wait for it, so just
> > call filemap_write_and_wait.  I still can't figure what the point of
> > all this is, so a comment would surely be helpful.
> 
> When md/bitmap accesses a file, it uses bmap to find addresses and then
> submit_bh to do IO, so it completely by-passes the page cache.
> So this code is present to ensure that the page cache has no dirty
> pages for the file before we start using the file.
> 
> I don't recall exactly why I used do_sync_mapping_range.  I suspect
> that I looked at what "sys_fsync" used (do_fsync?) and found that
> wasn't exported, so I looked at what sys_sync_file_range used, found
> that was exported, and so used that.
> 
> Looking at the current state of the VFS, I think I would rather use
> vfs_fsync.

Using vfs_fsync is better, but using bmap is extremly dangerous, I'm
kinda suprised how this managed to get sneaked in without notice.  bmap
really is just an advisory interface as the mappings by change
underneath all the time.  Some modern filesystems like btrfs really
can't implement it at all.


> So would you be happy with something like the following?

It's at least a small improvement..


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] md: stop using do_sync_mapping_range
  2009-09-26 15:13   ` Christoph Hellwig
@ 2009-09-26 21:51       ` NeilBrown
  0 siblings, 0 replies; 7+ messages in thread
From: NeilBrown @ 2009-09-26 21:51 UTC (permalink / raw)
  Cc: Christoph Hellwig, linux-raid, linux-fsdevel

On Sun, September 27, 2009 1:13 am, Christoph Hellwig wrote:
> On Thu, Sep 24, 2009 at 08:52:36AM +1000, Neil Brown wrote:
>> On Wednesday September 23, hch@lst.de wrote:
>> > It's a very awkward way to write out all data and wait for it, so just
>> > call filemap_write_and_wait.  I still can't figure what the point of
>> > all this is, so a comment would surely be helpful.
>>
>> When md/bitmap accesses a file, it uses bmap to find addresses and then
>> submit_bh to do IO, so it completely by-passes the page cache.
>> So this code is present to ensure that the page cache has no dirty
>> pages for the file before we start using the file.
>>
>> I don't recall exactly why I used do_sync_mapping_range.  I suspect
>> that I looked at what "sys_fsync" used (do_fsync?) and found that
>> wasn't exported, so I looked at what sys_sync_file_range used, found
>> that was exported, and so used that.
>>
>> Looking at the current state of the VFS, I think I would rather use
>> vfs_fsync.
>
> Using vfs_fsync is better, but using bmap is extremly dangerous, I'm
> kinda suprised how this managed to get sneaked in without notice.  bmap
> really is just an advisory interface as the mappings by change
> underneath all the time.  Some modern filesystems like btrfs really
> can't implement it at all.
>

I used bmap because I wanted writeout characteristics similar to
swapout, and mm/swapfile.c uses bmap.  I copied the ideas from that
code.  If bmap is unsafe for bitmap files, then presumably it is
unsafe for swap files too.
This is exactly the argument that I used when Andrew Morton queried
the code on the way in.

The swap-over-NFS patches introduce a new interface to the filesystem
that supports swapout (i.e. writeout that is guaranteed not to block
on mem allocation etc).  If they ever get upstream, I will change
bitmap files to use that interface when it is present.


>
>> So would you be happy with something like the following?
>
> It's at least a small improvement..

Thanks.

NeilBrown



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] md: stop using do_sync_mapping_range
@ 2009-09-26 21:51       ` NeilBrown
  0 siblings, 0 replies; 7+ messages in thread
From: NeilBrown @ 2009-09-26 21:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Christoph Hellwig, linux-raid, linux-fsdevel

On Sun, September 27, 2009 1:13 am, Christoph Hellwig wrote:
> On Thu, Sep 24, 2009 at 08:52:36AM +1000, Neil Brown wrote:
>> On Wednesday September 23, hch@lst.de wrote:
>> > It's a very awkward way to write out all data and wait for it, so just
>> > call filemap_write_and_wait.  I still can't figure what the point of
>> > all this is, so a comment would surely be helpful.
>>
>> When md/bitmap accesses a file, it uses bmap to find addresses and then
>> submit_bh to do IO, so it completely by-passes the page cache.
>> So this code is present to ensure that the page cache has no dirty
>> pages for the file before we start using the file.
>>
>> I don't recall exactly why I used do_sync_mapping_range.  I suspect
>> that I looked at what "sys_fsync" used (do_fsync?) and found that
>> wasn't exported, so I looked at what sys_sync_file_range used, found
>> that was exported, and so used that.
>>
>> Looking at the current state of the VFS, I think I would rather use
>> vfs_fsync.
>
> Using vfs_fsync is better, but using bmap is extremly dangerous, I'm
> kinda suprised how this managed to get sneaked in without notice.  bmap
> really is just an advisory interface as the mappings by change
> underneath all the time.  Some modern filesystems like btrfs really
> can't implement it at all.
>

I used bmap because I wanted writeout characteristics similar to
swapout, and mm/swapfile.c uses bmap.  I copied the ideas from that
code.  If bmap is unsafe for bitmap files, then presumably it is
unsafe for swap files too.
This is exactly the argument that I used when Andrew Morton queried
the code on the way in.

The swap-over-NFS patches introduce a new interface to the filesystem
that supports swapout (i.e. writeout that is guaranteed not to block
on mem allocation etc).  If they ever get upstream, I will change
bitmap files to use that interface when it is present.


>
>> So would you be happy with something like the following?
>
> It's at least a small improvement..

Thanks.

NeilBrown



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-09-26 21:51 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-23 13:18 [PATCH] md: stop using do_sync_mapping_range Christoph Hellwig
2009-09-23 13:47 ` Jamie Lokier
2009-09-26 15:11   ` Christoph Hellwig
2009-09-23 22:52 ` Neil Brown
2009-09-26 15:13   ` Christoph Hellwig
2009-09-26 21:51     ` NeilBrown
2009-09-26 21:51       ` NeilBrown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.