All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write
@ 2010-04-19 12:34 Stefan Hajnoczi
  2010-04-19 12:34 ` [Qemu-devel] [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls Stefan Hajnoczi
  2010-04-20  8:37 ` [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write Kevin Wolf
  0 siblings, 2 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2010-04-19 12:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Jan Kiszka, Christoph Hellwig, Stefan Hajnoczi

This patch combines the lseek+read/write calls to use pread/pwrite
instead.  This will result in fewer system calls and is already used by
AIO.

Thanks to Jan Kiszka <jan.kiszka@siemens.com> for identifying excessive
lseek and Christoph Hellwig <hch@lst.de> for confirming that this
approach should work.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 block/raw-posix.c |   37 ++++---------------------------------
 1 files changed, 4 insertions(+), 33 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 598ea19..7541ed2 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -105,7 +105,6 @@
 typedef struct BDRVRawState {
     int fd;
     int type;
-    unsigned int lseek_err_cnt;
     int open_flags;
 #if defined(__linux__)
     /* linux floppy specific */
@@ -134,8 +133,6 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
     BDRVRawState *s = bs->opaque;
     int fd, ret;
 
-    s->lseek_err_cnt = 0;
-
     s->open_flags = open_flags | O_BINARY;
     s->open_flags &= ~O_ACCMODE;
     if (bdrv_flags & BDRV_O_RDWR) {
@@ -243,19 +240,7 @@ static int raw_pread_aligned(BlockDriverState *bs, int64_t offset,
     if (ret < 0)
         return ret;
 
-    if (offset >= 0 && lseek(s->fd, offset, SEEK_SET) == (off_t)-1) {
-        ++(s->lseek_err_cnt);
-        if(s->lseek_err_cnt <= 10) {
-            DEBUG_BLOCK_PRINT("raw_pread(%d:%s, %" PRId64 ", %p, %d) [%" PRId64
-                              "] lseek failed : %d = %s\n",
-                              s->fd, bs->filename, offset, buf, count,
-                              bs->total_sectors, errno, strerror(errno));
-        }
-        return -1;
-    }
-    s->lseek_err_cnt=0;
-
-    ret = read(s->fd, buf, count);
+    ret = pread(s->fd, buf, count, offset);
     if (ret == count)
         goto label__raw_read__success;
 
@@ -276,12 +261,10 @@ static int raw_pread_aligned(BlockDriverState *bs, int64_t offset,
 
     /* Try harder for CDrom. */
     if (bs->type == BDRV_TYPE_CDROM) {
-        lseek(s->fd, offset, SEEK_SET);
-        ret = read(s->fd, buf, count);
+        ret = pread(s->fd, buf, count, offset);
         if (ret == count)
             goto label__raw_read__success;
-        lseek(s->fd, offset, SEEK_SET);
-        ret = read(s->fd, buf, count);
+        ret = pread(s->fd, buf, count, offset);
         if (ret == count)
             goto label__raw_read__success;
 
@@ -313,19 +296,7 @@ static int raw_pwrite_aligned(BlockDriverState *bs, int64_t offset,
     if (ret < 0)
         return -errno;
 
-    if (offset >= 0 && lseek(s->fd, offset, SEEK_SET) == (off_t)-1) {
-        ++(s->lseek_err_cnt);
-        if(s->lseek_err_cnt) {
-            DEBUG_BLOCK_PRINT("raw_pwrite(%d:%s, %" PRId64 ", %p, %d) [%"
-                              PRId64 "] lseek failed : %d = %s\n",
-                              s->fd, bs->filename, offset, buf, count,
-                              bs->total_sectors, errno, strerror(errno));
-        }
-        return -EIO;
-    }
-    s->lseek_err_cnt = 0;
-
-    ret = write(s->fd, buf, count);
+    ret = pwrite(s->fd, buf, count, offset);
     if (ret == count)
         goto label__raw_write__success;
 
-- 
1.7.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls
  2010-04-19 12:34 [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write Stefan Hajnoczi
@ 2010-04-19 12:34 ` Stefan Hajnoczi
  2010-04-19 14:10   ` [Qemu-devel] " Kevin Wolf
  2010-04-20  8:37 ` [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write Kevin Wolf
  1 sibling, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2010-04-19 12:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Jan Kiszka, Christoph Hellwig, Stefan Hajnoczi

The BlockDriver bdrv_getlength function is called from the I/O code path
when checking that the request falls within the device.  Unfortunately
this involves an lseek system call in the raw protocol; every read or
write request will incur this lseek cost.

Jan Kiszka <jan.kiszka@siemens.com> identified this issue and its
latency overhead.  This patch caches device length in the existing
total_sectors variable so lseek calls can be avoided for fixed size
devices.

Growable devices fall back to the full bdrv_getlength code path because
I have not added logic to detect extending the size of the device in a
write.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 block.c |   28 ++++++++++++++++++++++------
 1 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index def3400..d5a3ba7 100644
--- a/block.c
+++ b/block.c
@@ -363,6 +363,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
     assert(drv != NULL);
 
     bs->file = NULL;
+    bs->total_sectors = 0;
     bs->is_temporary = 0;
     bs->encrypted = 0;
     bs->valid_key = 0;
@@ -416,9 +417,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
     }
 
     bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
-    if (drv->bdrv_getlength) {
-        bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
-    }
+    bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
 #ifndef _WIN32
     if (bs->is_temporary) {
         unlink(filename);
@@ -959,13 +958,26 @@ int bdrv_pwrite(BlockDriverState *bs, int64_t offset,
 int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 {
     BlockDriver *drv = bs->drv;
+    int ret;
     if (!drv)
         return -ENOMEDIUM;
     if (!drv->bdrv_truncate)
         return -ENOTSUP;
     if (bs->read_only)
         return -EACCES;
-    return drv->bdrv_truncate(bs, offset);
+    ret = drv->bdrv_truncate(bs, offset);
+    if (ret < 0) {
+        return ret;
+    }
+
+    /* refresh total sectors */
+    if (drv->bdrv_getlength) {
+        bs->total_sectors = 0; /* discard cached value */
+        bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
+    } else {
+        bs->total_sectors = offset >> BDRV_SECTOR_BITS;
+    }
+    return ret;
 }
 
 /**
@@ -976,8 +988,12 @@ int64_t bdrv_getlength(BlockDriverState *bs)
     BlockDriver *drv = bs->drv;
     if (!drv)
         return -ENOMEDIUM;
-    if (!drv->bdrv_getlength) {
-        /* legacy mode */
+
+    /* Fixed size devices use the total_sectors value for speed instead of
+       issuing a length query (like lseek) on each call.  Also, legacy block
+       drivers don't provide a bdrv_getlength function and must use
+       total_sectors. */
+    if ((bs->total_sectors && !bs->growable) || !drv->bdrv_getlength) {
         return bs->total_sectors * BDRV_SECTOR_SIZE;
     }
     return drv->bdrv_getlength(bs);
-- 
1.7.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] Re: [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls
  2010-04-19 12:34 ` [Qemu-devel] [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls Stefan Hajnoczi
@ 2010-04-19 14:10   ` Kevin Wolf
  2010-04-19 14:26     ` Stefan Hajnoczi
  0 siblings, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2010-04-19 14:10 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Jan Kiszka, qemu-devel, Christoph Hellwig

Am 19.04.2010 14:34, schrieb Stefan Hajnoczi:
> The BlockDriver bdrv_getlength function is called from the I/O code path
> when checking that the request falls within the device.  Unfortunately
> this involves an lseek system call in the raw protocol; every read or
> write request will incur this lseek cost.
> 
> Jan Kiszka <jan.kiszka@siemens.com> identified this issue and its
> latency overhead.  This patch caches device length in the existing
> total_sectors variable so lseek calls can be avoided for fixed size
> devices.
> 
> Growable devices fall back to the full bdrv_getlength code path because
> I have not added logic to detect extending the size of the device in a
> write.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
>  block.c |   28 ++++++++++++++++++++++------
>  1 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/block.c b/block.c
> index def3400..d5a3ba7 100644
> --- a/block.c
> +++ b/block.c
> @@ -363,6 +363,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>      assert(drv != NULL);
>  
>      bs->file = NULL;
> +    bs->total_sectors = 0;
>      bs->is_temporary = 0;
>      bs->encrypted = 0;
>      bs->valid_key = 0;
> @@ -416,9 +417,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>      }
>  
>      bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
> -    if (drv->bdrv_getlength) {
> -        bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
> -    }
> +    bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;

Does this hunk make a difference? If drv->bdrv_getlength == NULL, we'll
just get back the current value.

But now that you sent this hunk for review, one thing about the existing
code: We should probably check the return value of bdrv_getlength.

Otherwise both patches look good to me.

Kevin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls
  2010-04-19 14:10   ` [Qemu-devel] " Kevin Wolf
@ 2010-04-19 14:26     ` Stefan Hajnoczi
  2010-04-19 14:31       ` Kevin Wolf
  0 siblings, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2010-04-19 14:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Jan Kiszka, Christoph Hellwig, Stefan Hajnoczi, qemu-devel

On Mon, Apr 19, 2010 at 3:10 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> @@ -416,9 +417,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>>      }
>>
>>      bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
>> -    if (drv->bdrv_getlength) {
>> -        bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>> -    }
>> +    bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>
> Does this hunk make a difference? If drv->bdrv_getlength == NULL, we'll
> just get back the current value.

The if statement could be left as is.  I removed it to reduce the
number of places where if (drv->bdrv_getlength) is explicitly checked.
 If callers don't know the internals of bdrv_getlength() then it is
easier to extend it without auditing and changing callers.

Having said that, I did add an if (drv->bdrv_getlength) check into
bdrv_truncate...

> But now that you sent this hunk for review, one thing about the existing
> code: We should probably check the return value of bdrv_getlength.

You're right.

I'll clean this up and send a v2.

Stefan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls
  2010-04-19 14:26     ` Stefan Hajnoczi
@ 2010-04-19 14:31       ` Kevin Wolf
  0 siblings, 0 replies; 6+ messages in thread
From: Kevin Wolf @ 2010-04-19 14:31 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Jan Kiszka, Christoph Hellwig, Stefan Hajnoczi, qemu-devel

Am 19.04.2010 16:26, schrieb Stefan Hajnoczi:
> On Mon, Apr 19, 2010 at 3:10 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>> @@ -416,9 +417,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>>>      }
>>>
>>>      bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
>>> -    if (drv->bdrv_getlength) {
>>> -        bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>> -    }
>>> +    bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>
>> Does this hunk make a difference? If drv->bdrv_getlength == NULL, we'll
>> just get back the current value.
> 
> The if statement could be left as is.  I removed it to reduce the
> number of places where if (drv->bdrv_getlength) is explicitly checked.
>  If callers don't know the internals of bdrv_getlength() then it is
> easier to extend it without auditing and changing callers.

Makes sense, I'm not opposed to it.

> Having said that, I did add an if (drv->bdrv_getlength) check into
> bdrv_truncate...

Well, you probably can't do much about it there.

Kevin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write
  2010-04-19 12:34 [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write Stefan Hajnoczi
  2010-04-19 12:34 ` [Qemu-devel] [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls Stefan Hajnoczi
@ 2010-04-20  8:37 ` Kevin Wolf
  1 sibling, 0 replies; 6+ messages in thread
From: Kevin Wolf @ 2010-04-20  8:37 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Jan Kiszka, qemu-devel, Christoph Hellwig

Am 19.04.2010 14:34, schrieb Stefan Hajnoczi:
> This patch combines the lseek+read/write calls to use pread/pwrite
> instead.  This will result in fewer system calls and is already used by
> AIO.
> 
> Thanks to Jan Kiszka <jan.kiszka@siemens.com> for identifying excessive
> lseek and Christoph Hellwig <hch@lst.de> for confirming that this
> approach should work.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Thanks, applied to the block branch.

Kevin

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-04-20  8:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-04-19 12:34 [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write Stefan Hajnoczi
2010-04-19 12:34 ` [Qemu-devel] [PATCH 2/2] block: Cache total_sectors to reduce bdrv_getlength calls Stefan Hajnoczi
2010-04-19 14:10   ` [Qemu-devel] " Kevin Wolf
2010-04-19 14:26     ` Stefan Hajnoczi
2010-04-19 14:31       ` Kevin Wolf
2010-04-20  8:37 ` [Qemu-devel] [PATCH 1/2] raw-posix: Use pread/pwrite instead of lseek+read/write Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.