All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*)
@ 2010-07-14 13:35 Christian Brunner
  2010-07-20 20:42 ` Yehuda Sadeh Weinraub
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Brunner @ 2010-07-14 13:35 UTC (permalink / raw)
  To: ceph-devel

Hi Yehuda,

this is an updated version of the patch I've sent yesterday. It is now based on the
current rbd branch and it is using qemu_cond_* and qemu_mutex_*.

Regards,
Christian

---
 Makefile.objs |    1 +
 block/rbd.c   |   33 ++++++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 1 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 56a13c1..e1b8513 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -12,6 +12,7 @@ block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 block-obj-$(CONFIG_POSIX) += compatfd.o
+block-obj-$(CONFIG_RBD) += qemu-thread.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
diff --git a/block/rbd.c b/block/rbd.c
index e7d4083..01786da 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -24,7 +24,7 @@
 #include <rados/librados.h>
 
 #include <signal.h>
-
+#include <qemu-thread.h>
 
 int eventfd(unsigned int initval, int flags);
 
@@ -50,6 +50,7 @@ int eventfd(unsigned int initval, int flags);
  */
 
 #define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
+#define MAX_QUEUE_SIZE 33554432 // 32MB
 
 typedef struct RBDAIOCB {
     BlockDriverAIOCB common;
@@ -82,6 +83,9 @@ typedef struct BDRVRBDState {
     uint64_t objsize;
     int qemu_aio_count;
     int read_only;
+    uint64_t queuesize;
+    QemuMutex *queue_mutex;
+    QemuCond *queue_threshold;
 } BDRVRBDState;
 
 typedef struct rbd_obj_header_ondisk RbdHeader1;
@@ -487,6 +491,13 @@ static int rbd_open(BlockDriverState *bs, const char *filename, int flags)
 
     s->read_only = (snap != NULL);
 
+    s->queuesize = 0;
+
+    s->queue_mutex = qemu_malloc(sizeof(QemuMutex));
+    qemu_mutex_init(s->queue_mutex);
+    s->queue_threshold = qemu_malloc(sizeof(QemuCond));
+    qemu_cond_init(s->queue_threshold);
+
     s->efd = eventfd(0, 0);
     if (s->efd < 0) {
         error_report("error opening eventfd");
@@ -523,6 +534,12 @@ static void rbd_close(BlockDriverState *bs)
 {
     BDRVRBDState *s = bs->opaque;
 
+    // The following do not exist in qemu:
+    // qemu_cond_destroy(s->queue_threshold);
+    // qemu_mutex_destroy(s->queue_mutex);
+    qemu_free(s->queue_threshold);
+    qemu_free(s->queue_mutex);
+
     rados_close_pool(s->header_pool);
     rados_close_pool(s->pool);
     rados_deinitialize();
@@ -613,6 +630,12 @@ static void rbd_finish_aiocb(rados_completion_t c, RADOSCB *rcb)
     int i;
 
     acb->aiocnt--;
+    acb->s->queuesize -= rcb->segsize;
+    if (acb->s->queuesize+rcb->segsize > MAX_QUEUE_SIZE && acb->s->queuesize <= MAX_QUEUE_SIZE) {
+        qemu_mutex_lock(acb->s->queue_mutex);
+        qemu_cond_signal(acb->s->queue_threshold);
+        qemu_mutex_unlock(acb->s->queue_mutex);
+    }
     r = rados_aio_get_return_value(c);
     rados_aio_release(c);
     if (acb->write) {
@@ -735,6 +758,14 @@ static BlockDriverAIOCB *rbd_aio_rw_vector(BlockDriverState *bs,
         rcb->segsize = segsize;
         rcb->buf = buf;
 
+        while  (s->queuesize > MAX_QUEUE_SIZE) {
+            qemu_mutex_lock(s->queue_mutex);
+            qemu_cond_wait(s->queue_threshold, s->queue_mutex);
+            qemu_mutex_unlock(s->queue_mutex);
+        }
+
+        s->queuesize += segsize;
+
         if (write) {
             rados_aio_create_completion(rcb, NULL,
                                         (rados_callback_t) rbd_finish_aiocb,
-- 
1.6.5.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*)
  2010-07-14 13:35 [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*) Christian Brunner
@ 2010-07-20 20:42 ` Yehuda Sadeh Weinraub
  2010-07-21 19:18   ` Christian Brunner
  0 siblings, 1 reply; 6+ messages in thread
From: Yehuda Sadeh Weinraub @ 2010-07-20 20:42 UTC (permalink / raw)
  To: Christian Brunner; +Cc: ceph-devel

On Wed, Jul 14, 2010 at 6:35 AM, Christian Brunner <chb@muc.de> wrote:
>
> Hi Yehuda,
>
> this is an updated version of the patch I've sent yesterday. It is now based on the
> current rbd branch and it is using qemu_cond_* and qemu_mutex_*.

<snip>

>
> +        while  (s->queuesize > MAX_QUEUE_SIZE) {
> +            qemu_mutex_lock(s->queue_mutex);
> +            qemu_cond_wait(s->queue_threshold, s->queue_mutex);
> +            qemu_mutex_unlock(s->queue_mutex);
> +        }

Actually we shouldn't be waiting inside the aio handler. We should
probably be feeding the request into some wait queue and have a
separate thread that drains all those requests out. Though this
wouldn't help us in throttling the client memory, it's probably a more
correct way to handle the problem.

Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*)
  2010-07-20 20:42 ` Yehuda Sadeh Weinraub
@ 2010-07-21 19:18   ` Christian Brunner
  2010-07-23  8:56     ` Kevin Wolf
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Brunner @ 2010-07-21 19:18 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub, Kevin Wolf; +Cc: ceph-devel

2010/7/20 Yehuda Sadeh Weinraub <yehudasa@gmail.com>:
>> +        while  (s->queuesize > MAX_QUEUE_SIZE) {
>> +            qemu_mutex_lock(s->queue_mutex);
>> +            qemu_cond_wait(s->queue_threshold, s->queue_mutex);
>> +            qemu_mutex_unlock(s->queue_mutex);
>> +        }
>
> Actually we shouldn't be waiting inside the aio handler. We should
> probably be feeding the request into some wait queue and have a
> separate thread that drains all those requests out. Though this
> wouldn't help us in throttling the client memory, it's probably a more
> correct way to handle the problem.

I agree with you, however I don't think that this is worth the effort.
I've never seen this occur when running a virtual machine, only when
running qemu-io.

@Kevin: Did you follow this thread? What is your opinion?

Christian
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*)
  2010-07-21 19:18   ` Christian Brunner
@ 2010-07-23  8:56     ` Kevin Wolf
  2010-07-23 16:46       ` Sage Weil
  0 siblings, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2010-07-23  8:56 UTC (permalink / raw)
  To: Christian Brunner; +Cc: Yehuda Sadeh Weinraub, ceph-devel

Am 21.07.2010 21:18, schrieb Christian Brunner:
> 2010/7/20 Yehuda Sadeh Weinraub <yehudasa@gmail.com>:
>>> +        while  (s->queuesize > MAX_QUEUE_SIZE) {
>>> +            qemu_mutex_lock(s->queue_mutex);
>>> +            qemu_cond_wait(s->queue_threshold, s->queue_mutex);
>>> +            qemu_mutex_unlock(s->queue_mutex);
>>> +        }
>>
>> Actually we shouldn't be waiting inside the aio handler. We should
>> probably be feeding the request into some wait queue and have a
>> separate thread that drains all those requests out. Though this
>> wouldn't help us in throttling the client memory, it's probably a more
>> correct way to handle the problem.
> 
> I agree with you, however I don't think that this is worth the effort.
> I've never seen this occur when running a virtual machine, only when
> running qemu-io.
> 
> @Kevin: Did you follow this thread? What is your opinion?

No, I'm not subscribed to your mailing list and qemu-devel isn't CCed.

I'm not entirely sure what kind of queue this is about, but are you sure
that it can only happen with large requests as issued by qemu-io? Can
many small requests cause the same, especially with a slow network
connection (or it might even go down for some reason?

If I understand correctly, waiting here could cause qemu to become
unresponsive, right?

Kevin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*)
  2010-07-23  8:56     ` Kevin Wolf
@ 2010-07-23 16:46       ` Sage Weil
  2010-07-23 18:51         ` Christian Brunner
  0 siblings, 1 reply; 6+ messages in thread
From: Sage Weil @ 2010-07-23 16:46 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Christian Brunner, Yehuda Sadeh Weinraub, ceph-devel

On Fri, 23 Jul 2010, Kevin Wolf wrote:

> Am 21.07.2010 21:18, schrieb Christian Brunner:
> > 2010/7/20 Yehuda Sadeh Weinraub <yehudasa@gmail.com>:
> >>> +        while  (s->queuesize > MAX_QUEUE_SIZE) {
> >>> +            qemu_mutex_lock(s->queue_mutex);
> >>> +            qemu_cond_wait(s->queue_threshold, s->queue_mutex);
> >>> +            qemu_mutex_unlock(s->queue_mutex);
> >>> +        }
> >>
> >> Actually we shouldn't be waiting inside the aio handler. We should
> >> probably be feeding the request into some wait queue and have a
> >> separate thread that drains all those requests out. Though this
> >> wouldn't help us in throttling the client memory, it's probably a more
> >> correct way to handle the problem.
> > 
> > I agree with you, however I don't think that this is worth the effort.
> > I've never seen this occur when running a virtual machine, only when
> > running qemu-io.
> > 
> > @Kevin: Did you follow this thread? What is your opinion?
> 
> No, I'm not subscribed to your mailing list and qemu-devel isn't CCed.
> 
> I'm not entirely sure what kind of queue this is about, but are you sure
> that it can only happen with large requests as issued by qemu-io? Can
> many small requests cause the same, especially with a slow network
> connection (or it might even go down for some reason?

In theory, small requests can, yes.  We haven't seen it in practice, I 
don't think.

In any case, I think any waiting/throttling just needs to happen inside 
the librados library, as that is the only place that may have any 
knowledge of the cluster size, throughput, and so forth.  This code should 
be dropped from the qemu-kvm/rbd side.

sage

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*)
  2010-07-23 16:46       ` Sage Weil
@ 2010-07-23 18:51         ` Christian Brunner
  0 siblings, 0 replies; 6+ messages in thread
From: Christian Brunner @ 2010-07-23 18:51 UTC (permalink / raw)
  To: Sage Weil; +Cc: Kevin Wolf, Yehuda Sadeh Weinraub, ceph-devel

2010/7/23 Sage Weil <sage@newdream.net>:
> On Fri, 23 Jul 2010, Kevin Wolf wrote:
>
>> >> Actually we shouldn't be waiting inside the aio handler. We should
>> >> probably be feeding the request into some wait queue and have a
>> >> separate thread that drains all those requests out. Though this
>> >> wouldn't help us in throttling the client memory, it's probably a more
>> >> correct way to handle the problem.
>> >
>> > I agree with you, however I don't think that this is worth the effort.
>> > I've never seen this occur when running a virtual machine, only when
>> > running qemu-io.
>> >
>> > @Kevin: Did you follow this thread? What is your opinion?
>>
>> No, I'm not subscribed to your mailing list and qemu-devel isn't CCed.
>>
>> I'm not entirely sure what kind of queue this is about, but are you sure
>> that it can only happen with large requests as issued by qemu-io? Can
>> many small requests cause the same, especially with a slow network
>> connection (or it might even go down for some reason?
>
> In theory, small requests can, yes.  We haven't seen it in practice, I
> don't think.
>
> In any case, I think any waiting/throttling just needs to happen inside
> the librados library, as that is the only place that may have any
> knowledge of the cluster size, throughput, and so forth.  This code should
> be dropped from the qemu-kvm/rbd side.

That would be the best solution.

Christian
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-07-23 18:51 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-14 13:35 [PATCH] qemu-kvm/rbd: add queueing delay based on queuesize (using use qemu_mutex_* and qemu_cond_*) Christian Brunner
2010-07-20 20:42 ` Yehuda Sadeh Weinraub
2010-07-21 19:18   ` Christian Brunner
2010-07-23  8:56     ` Kevin Wolf
2010-07-23 16:46       ` Sage Weil
2010-07-23 18:51         ` Christian Brunner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.