[patch] RFC directio: partial writes support

* [patch] RFC directio: partial writes support
@ 2010-02-25 12:45 Dmitry Monakhov
  2010-02-27 11:10 ` Dmitry Monakhov
  2010-03-01 23:21 ` Andrew Morton
  0 siblings, 2 replies; 6+ messages in thread
From: Dmitry Monakhov @ 2010-02-25 12:45 UTC
  To: linux-kernel; +Cc: linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 663 bytes --]

Can someone please explain to me why direct I/O denies partial writes?
For example, if someone tries to write 100MB but the file system has less
free space, the write hits ENOSPC in the middle of block allocation.
All allocated blocks will be truncated (possibly 100MB - 4k) and
ENOSPC will be returned. As far as I remember, direct_io has always acted
like this, but I never asked why.
Why do we have to give up all the progress we made?
In fact, partial writes are possible in the case of holes, when we
fall back to buffered write. XFS has implemented partial writes.

I've made some trivial changes and it works like a charm.
Let's enable partial write support and allow the caller to define
this behavior.
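
For illustration, here is how a userspace caller might cope with a short
(partial) write instead of an all-or-nothing ENOSPC. This is only a sketch:
write_fully() is a hypothetical helper, not part of the patch, and it assumes
the kernel reports the bytes already written as a short count from write(2).
With O_DIRECT the caller would also have to keep the buffer, offset, and
length suitably aligned, which this sketch does not do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): try to write len bytes,
 * continuing after short writes. Returns 0 on success or -errno on
 * failure. *written always holds the number of bytes that were accepted,
 * so the caller keeps its progress even when an error is returned.
 */
static int write_fully(int fd, const void *buf, size_t len, size_t *written)
{
	const char *p = buf;

	assert(buf != NULL && written != NULL);
	*written = 0;

	while (*written < len) {
		ssize_t n = write(fd, p + *written, len - *written);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -errno;		/* e.g. -ENOSPC */
		}
		if (n == 0)
			return -EIO;		/* no progress, do not spin */
		*written += (size_t)n;
	}
	return 0;
}
```

On -ENOSPC the caller can inspect *written to see how much data reached the
file and decide whether to keep it or truncate it.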


[-- Attachment #2: 0001-direct_io-Allow-partial-writes.patch --]
[-- Type: text/plain, Size: 2189 bytes --]

From 4a72c4a61e133140750d05853b8dafecd8ef5d87 Mon Sep 17 00:00:00 2001
From: Dmitry Monakhov <dmonakhov@openvz.org>
Date: Thu, 25 Feb 2010 15:14:48 +0300
Subject: [PATCH] direct_io: Allow partial writes

The current direct I/O allocation behavior is inconvenient: partial writes
are not supported. If we try to issue a 10MB chunk but only 5MB is
available, we allocate those 5MB until we hit ENOSPC, and then
drop that space and return ENOSPC.
But in fact partial writes are possible in the case of holes.
There seems to be no good reason to deny partial writes.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/direct-io.c     |    4 ++++
 fs/ext4/inode.c    |   11 +++++++----
 include/linux/fs.h |    2 ++
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index e82adc2..250a041 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -229,6 +229,10 @@ static int dio_complete(struct dio *dio, loff_t offset, int ret)
 		ret = 0;
 
 	if (dio->result) {
+		/* Ignore error if we have written some data */
+		if (dio->flags & DIO_PARTIAL_WRITE)
+			ret = 0;
+
 		transferred = dio->result;
 
 		/* Check for short read case */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 218ea0b..8c00127 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3447,10 +3447,13 @@ retry:
 				 offset, nr_segs,
 				 ext4_get_block, NULL);
 	else
-		ret = blockdev_direct_IO(rw, iocb, inode,
-				 inode->i_sb->s_bdev, iov,
-				 offset, nr_segs,
-				 ext4_get_block, NULL);
+		ret = __blockdev_direct_IO(rw, iocb, inode,
+					inode->i_sb->s_bdev, iov,
+					offset, nr_segs,
+					ext4_get_block, NULL,
+					DIO_LOCKING | DIO_SKIP_HOLES |
+					DIO_PARTIAL_WRITE);
+
 	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
 		goto retry;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9147ca8..d887685 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2259,6 +2259,8 @@ enum {
 
 	/* filesystem does not support filling holes */
 	DIO_SKIP_HOLES	= 0x02,
+	/* allow partial writes */
+	DIO_PARTIAL_WRITE = 0x04,
 };
 
 static inline ssize_t blockdev_direct_IO(int rw, struct kiocb *iocb,
-- 
1.6.6
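
The effect of the dio_complete() hunk on the return value can be sketched
with a small userspace model. This is a simplification, not the kernel
function: model_dio_complete() and MODEL_DIO_PARTIAL_WRITE are hypothetical
names, and the model covers only the lines visible in the hunk. It leaves out
the short-read check and any other error sources the real function consults.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical stand-in mirroring the flag value added by the patch. */
enum {
	MODEL_DIO_PARTIAL_WRITE = 0x04,
};

/*
 * Simplified model of the patched logic:
 *   result - bytes transferred so far (dio->result in the patch)
 *   ret    - 0 or a negative errno from the I/O path
 *   flags  - DIO flags passed by the filesystem
 * Returns the byte count on success, or the negative errno.
 */
static ssize_t model_dio_complete(ssize_t result, int ret, unsigned int flags)
{
	ssize_t transferred = 0;

	assert(result >= 0);

	if (result) {
		/* Ignore the error if some data has been written. */
		if (flags & MODEL_DIO_PARTIAL_WRITE)
			ret = 0;
		transferred = result;
	}

	return ret ? ret : transferred;
}
```

Under this model, a 10MB request that fails with ENOSPC after 5MB returns
-ENOSPC without the flag and the 5MB byte count with it. When nothing was
written, the error is returned in both cases.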

