* [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-02 15:02 UTC
  To: Jens Axboe, linux-kernel; +Cc: Kick In, Chris J Arges, Ming Lei

QUEUE_FLAG_NO_SG_MERGE is set by default for blk-mq devices, so
the computed bio->bi_phys_segments may be bigger than
queue_max_segments(q), and drivers then fail to handle the case;
for example, the BUG_ON() in virtio_queue_rq() can be triggered
for virtio-blk:

	https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1359146

This patch fixes the issue by ignoring the QUEUE_FLAG_NO_SG_MERGE
flag if the computed bio->bi_phys_segments is bigger than
queue_max_segments(q). The regression was introduced by commit
05f1dd53152173 ("block: add queue flag for disabling SG merging").

Reported-by: Kick In <pierre-andre.morey@canonical.com>
Tested-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block/blk-merge.c |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 5453583..7788179 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -10,10 +10,11 @@
 #include "blk.h"
 
 static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
-					     struct bio *bio)
+					     struct bio *bio,
+					     bool no_sg_merge)
 {
 	struct bio_vec bv, bvprv = { NULL };
-	int cluster, high, highprv = 1, no_sg_merge;
+	int cluster, high, highprv = 1;
 	unsigned int seg_size, nr_phys_segs;
 	struct bio *fbio, *bbio;
 	struct bvec_iter iter;
@@ -35,7 +36,6 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	cluster = blk_queue_cluster(q);
 	seg_size = 0;
 	nr_phys_segs = 0;
-	no_sg_merge = test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags);
 	high = 0;
 	for_each_bio(bio) {
 		bio_for_each_segment(bv, bio, iter) {
@@ -88,18 +88,23 @@ new_segment:
 
 void blk_recalc_rq_segments(struct request *rq)
 {
-	rq->nr_phys_segments = __blk_recalc_rq_segments(rq->q, rq->bio);
+	bool no_sg_merge = !!test_bit(QUEUE_FLAG_NO_SG_MERGE,
+			&rq->q->queue_flags);
+
+	rq->nr_phys_segments = __blk_recalc_rq_segments(rq->q, rq->bio,
+			no_sg_merge);
 }
 
 void blk_recount_segments(struct request_queue *q, struct bio *bio)
 {
-	if (test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags))
+	if (test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags) &&
+			bio->bi_vcnt < queue_max_segments(q))
 		bio->bi_phys_segments = bio->bi_vcnt;
 	else {
 		struct bio *nxt = bio->bi_next;
 
 		bio->bi_next = NULL;
-		bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio);
+		bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, false);
 		bio->bi_next = nxt;
 	}
 
-- 
1.7.9.5



* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Christoph Hellwig @ 2014-09-02 16:21 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-kernel, Kick In, Chris J Arges

Btw, one thing we should reconsider is where we set
QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
doing the S/G merge should be a lot cheaper than fanning out into the
indirect descriptors.



* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Jens Axboe @ 2014-09-02 16:24 UTC
  To: Christoph Hellwig, Ming Lei; +Cc: linux-kernel, Kick In, Chris J Arges

On 09/02/2014 10:21 AM, Christoph Hellwig wrote:
> Btw, one thing we should reconsider is where we set
> QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
> doing the S/G merge should be a lot cheaper than fanning out into the
> indirect descriptors.

And we might consider flipping the default, as most would probably like
to have the sg merging on.
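
Roughly, the flip would happen where blk_mq_init_queue() sets the
defaults today -- an untested sketch, not an actual patch:

	/* blk-mq currently enables the flag for every mq device: */
	q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT | 1 << QUEUE_FLAG_NO_SG_MERGE;

	/* With the default flipped, a driver that really wants to skip
	 * S/G merging would opt in explicitly after queue init: */
	queue_flag_set_unlocked(QUEUE_FLAG_NO_SG_MERGE, q);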

-- 
Jens Axboe



* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Jens Axboe @ 2014-09-02 16:24 UTC
  To: Ming Lei, linux-kernel; +Cc: Kick In, Chris J Arges

On 09/02/2014 09:02 AM, Ming Lei wrote:
> QUEUE_FLAG_NO_SG_MERGE is set by default for blk-mq devices, so
> the computed bio->bi_phys_segments may be bigger than
> queue_max_segments(q), and drivers then fail to handle the case;
> for example, the BUG_ON() in virtio_queue_rq() can be triggered
> for virtio-blk:
> 
> 	https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1359146
> 
> This patch fixes the issue by ignoring the QUEUE_FLAG_NO_SG_MERGE
> flag if the computed bio->bi_phys_segments is bigger than
> queue_max_segments(q). The regression was introduced by commit
> 05f1dd53152173 ("block: add queue flag for disabling SG merging").
> 
> Reported-by: Kick In <pierre-andre.morey@canonical.com>
> Tested-by: Chris J Arges <chris.j.arges@canonical.com>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>

Thanks Ming, this looks nice and clean. Will apply for 3.17.

-- 
Jens Axboe



* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Jeff Moyer @ 2014-09-02 17:01 UTC
  To: Jens Axboe; +Cc: Ming Lei, linux-kernel, Kick In, Chris J Arges

Jens Axboe <axboe@kernel.dk> writes:

> On 09/02/2014 09:02 AM, Ming Lei wrote:
>> QUEUE_FLAG_NO_SG_MERGE is set by default for blk-mq devices, so
>> the computed bio->bi_phys_segments may be bigger than
>> queue_max_segments(q), and drivers then fail to handle the case;
>> for example, the BUG_ON() in virtio_queue_rq() can be triggered
>> for virtio-blk:
>> 
>> 	https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1359146
>> 
>> This patch fixes the issue by ignoring the QUEUE_FLAG_NO_SG_MERGE
>> flag if the computed bio->bi_phys_segments is bigger than
>> queue_max_segments(q). The regression was introduced by commit
>> 05f1dd53152173 ("block: add queue flag for disabling SG merging").
>> 
>> Reported-by: Kick In <pierre-andre.morey@canonical.com>
>> Tested-by: Chris J Arges <chris.j.arges@canonical.com>
>> Signed-off-by: Ming Lei <ming.lei@canonical.com>
>
> Thanks Ming, this looks nice and clean. Will apply for 3.17.

This sounds an awful lot like the bug I fixed in dm:

commit 200612ec33e555a356eebc717630b866ae2b694f
Author: Jeff Moyer <jmoyer@redhat.com>
Date:   Fri Aug 8 11:03:41 2014 -0400

    dm table: propagate QUEUE_FLAG_NO_SG_MERGE
    
    Commit 05f1dd5 ("block: add queue flag for disabling SG merging")
    introduced a new queue flag: QUEUE_FLAG_NO_SG_MERGE.  This gets set by
    default in blk_mq_init_queue for mq-enabled devices.  The effect of
    the flag is to bypass the SG segment merging.  Instead, the
    bio->bi_vcnt is used as the number of hardware segments.
    
    With a device mapper target on top of a device with
    QUEUE_FLAG_NO_SG_MERGE set, we can end up sending down more segments
    than a driver is prepared to handle.  I ran into this when backporting
    the virtio_blk mq support.  It triggered this BUG_ON in
    virtio_queue_rq:
    
            BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);

    ...

Is bcache a stacking driver in this case?  Should it not inherit this
flag, just as DM now does?
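
For reference, the dm commit keys the flag off the underlying devices,
approximately like this (quoted from memory, so treat it as a sketch):

static int queue_supports_sg_merge(struct dm_target *ti, struct dm_dev *dev,
				   sector_t start, sector_t len, void *data)
{
	struct request_queue *q = bdev_get_queue(dev->bdev);

	return q && !test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags);
}

/* in dm_table_set_restrictions(): set NO_SG_MERGE only when every
 * underlying device has it set, otherwise keep merging enabled */
if (dm_table_all_devices_attribute(t, queue_supports_sg_merge))
	queue_flag_clear_unlocked(QUEUE_FLAG_NO_SG_MERGE, q);
else
	queue_flag_set_unlocked(QUEUE_FLAG_NO_SG_MERGE, q);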

-Jeff


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-03  4:19 UTC
  To: Jens Axboe, Rusty Russell
  Cc: Christoph Hellwig, linux-kernel, Kick In, Chris J Arges

On Tue, 02 Sep 2014 10:24:24 -0600
Jens Axboe <axboe@kernel.dk> wrote:

> On 09/02/2014 10:21 AM, Christoph Hellwig wrote:
> > Btw, one thing we should reconsider is where we set
> > QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
> > doing the S/G merge should be a lot cheaper than fanning out into the
> > indirect descriptors.

Indirect is always considered first no matter whether SG merge is off
or on, at least in the current virtio-blk implementation.

But it is a good idea to try direct descriptors first: the simple change
below improves randread (libaio, O_DIRECT, multi-queue) by 7% in my test,
and 77% of transfers start using direct descriptors, whereas almost all
transfers use only indirect descriptors in the current upstream
implementation.

From 8803341503dedab282c15f2801fdb6d7420cea6f Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei@canonical.com>
Date: Wed, 3 Sep 2014 10:42:32 +0800
Subject: [PATCH] virtio_ring: use direct descriptor as far as possible

Direct descriptor can avoid one kmalloc() and should have
better cache utilization, so try to use it if there are enough
free direct descriptors.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/virtio/virtio_ring.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 4d08f45a..c76b7a4 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -224,9 +224,13 @@ static inline int virtqueue_add(struct virtqueue *_vq,
 
 	total_sg = total_in + total_out;
 
-	/* If the host supports indirect descriptor tables, and we have multiple
-	 * buffers, then go indirect. FIXME: tune this threshold */
-	if (vq->indirect && total_sg > 1 && vq->vq.num_free) {
+	/*
+	 * If the host supports indirect descriptor tables, and we have
+	 * multiple buffers, then go indirect only if free descriptors
+	 * are less than 16.
+	 */
+	if (vq->indirect && total_sg > 1 && vq->vq.num_free &&
+			vq->vq.num_free < 16) {
 		head = vring_add_indirect(vq, sgs, next, total_sg, total_out,
 					  total_in,
 					  out_sgs, in_sgs, gfp);
-- 
1.7.9.5

 

> 
> And we might consider flipping the default, as most would probably like
> to have the sg merging on.
>

With this patch, I think SG merge can be kept off by default if there is
no extra cost introduced in the driver or HW for handling more segments.

If SG merge is turned on in my virtio-blk fio test above, a small
throughput drop can be observed.


Thanks,
-- 
Ming Lei


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-03  6:59 UTC
  To: Jens Axboe, Rusty Russell
  Cc: Christoph Hellwig, Linux Kernel Mailing List, Kick In, Chris J Arges

On Wed, Sep 3, 2014 at 12:19 PM, Ming Lei <ming.lei@canonical.com> wrote:
> On Tue, 02 Sep 2014 10:24:24 -0600
> Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 09/02/2014 10:21 AM, Christoph Hellwig wrote:
>> > Btw, one thing we should reconsider is where we set
>> > QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
>> > doing the S/G merge should be a lot cheaper than fanning out into the
>> > indirect descriptors.
>
> Indirect is always considered first no matter whether SG merge is off
> or on, at least in the current virtio-blk implementation.
>
> But it is a good idea to try direct descriptors first: the simple change
> below improves randread (libaio, O_DIRECT, multi-queue) by 7% in my test,
> and 77% of transfers start using direct descriptors, whereas almost all
> transfers use only indirect descriptors in the current upstream
> implementation.
>
> From 8803341503dedab282c15f2801fdb6d7420cea6f Mon Sep 17 00:00:00 2001
> From: Ming Lei <ming.lei@canonical.com>
> Date: Wed, 3 Sep 2014 10:42:32 +0800
> Subject: [PATCH] virtio_ring: use direct descriptor as far as possible
>
> Direct descriptor can avoid one kmalloc() and should have
> better cache utilization, so try to use it if there are enough
> free direct descriptors.
>
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> ---
>  drivers/virtio/virtio_ring.c |   10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 4d08f45a..c76b7a4 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -224,9 +224,13 @@ static inline int virtqueue_add(struct virtqueue *_vq,
>
>         total_sg = total_in + total_out;
>
> -       /* If the host supports indirect descriptor tables, and we have multiple
> -        * buffers, then go indirect. FIXME: tune this threshold */
> -       if (vq->indirect && total_sg > 1 && vq->vq.num_free) {
> +       /*
> +        * If the host supports indirect descriptor tables, and we have
> +        * multiple buffers, then go indirect only if free descriptors
> +        * are less than 16.
> +        */
> +       if (vq->indirect && total_sg > 1 && vq->vq.num_free &&
> +                       vq->vq.num_free < 16) {

Oops, the condition above should have been:

        if (vq->indirect && total_sg > 1 && vq->vq.num_free &&
                        (vq->vq.num_free < 16 ||
                         (vq->vq.num_free < total_sg))) {


Thanks,


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-03  7:39 UTC
  To: Jeff Moyer; +Cc: Jens Axboe, Linux Kernel Mailing List, Kick In, Chris J Arges

On Wed, Sep 3, 2014 at 1:01 AM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>
>> On 09/02/2014 09:02 AM, Ming Lei wrote:
>>> QUEUE_FLAG_NO_SG_MERGE is set by default for blk-mq devices, so
>>> the computed bio->bi_phys_segments may be bigger than
>>> queue_max_segments(q), and drivers then fail to handle the case;
>>> for example, the BUG_ON() in virtio_queue_rq() can be triggered
>>> for virtio-blk:
>>>
>>>      https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1359146
>>>
>>> This patch fixes the issue by ignoring the QUEUE_FLAG_NO_SG_MERGE
>>> flag if the computed bio->bi_phys_segments is bigger than
>>> queue_max_segments(q). The regression was introduced by commit
>>> 05f1dd53152173 ("block: add queue flag for disabling SG merging").
>>>
>>> Reported-by: Kick In <pierre-andre.morey@canonical.com>
>>> Tested-by: Chris J Arges <chris.j.arges@canonical.com>
>>> Signed-off-by: Ming Lei <ming.lei@canonical.com>
>>
>> Thanks Ming, this looks nice and clean. Will apply for 3.17.
>
> This sounds an awful lot like the bug I fixed in dm:

Anyway, the block layer should respect the max segments constraint even
when QUEUE_FLAG_NO_SG_MERGE is set, and that is what this patch does.

>
> commit 200612ec33e555a356eebc717630b866ae2b694f
> Author: Jeff Moyer <jmoyer@redhat.com>
> Date:   Fri Aug 8 11:03:41 2014 -0400
>
>     dm table: propagate QUEUE_FLAG_NO_SG_MERGE
>
>     Commit 05f1dd5 ("block: add queue flag for disabling SG merging")
>     introduced a new queue flag: QUEUE_FLAG_NO_SG_MERGE.  This gets set by
>     default in blk_mq_init_queue for mq-enabled devices.  The effect of
>     the flag is to bypass the SG segment merging.  Instead, the
>     bio->bi_vcnt is used as the number of hardware segments.
>
>     With a device mapper target on top of a device with
>     QUEUE_FLAG_NO_SG_MERGE set, we can end up sending down more segments
>     than a driver is prepared to handle.  I ran into this when backporting
>     the virtio_blk mq support.  It triggered this BUG_ON in
>     virtio_queue_rq:
>
>             BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
>
>     ...
>
> Is bcache a stacking driver in this case?  Should it not inherit this
> flag, just as DM now does?

IMO your patch might make things complicated because:

- the max segment constraint is a HW-related thing, so DM/bcache
shouldn't have to touch the related flag

- it isn't good to let both DM and the underlying device decide whether
to do SG merging; performance may be hurt if both do the merge.

So why not let the driver and the block core handle it? If a driver has
a max segment constraint, the block core just meets that requirement,
as this patch does.

Thanks,


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Rusty Russell @ 2014-09-05  5:43 UTC
  To: Ming Lei, Jens Axboe
  Cc: Christoph Hellwig, linux-kernel, Kick In, Chris J Arges

Ming Lei <ming.lei@canonical.com> writes:
> On Tue, 02 Sep 2014 10:24:24 -0600
> Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 09/02/2014 10:21 AM, Christoph Hellwig wrote:
>> > Btw, one thing we should reconsider is where we set
>> > QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
>> > doing the S/G merge should be a lot cheaper than fanning out into the
>> > indirect descriptors.
>
> Indirect is always considered first no matter whether SG merge is off
> or on, at least in the current virtio-blk implementation.
>
> But it is a good idea to try direct descriptors first: the simple change
> below improves randread (libaio, O_DIRECT, multi-queue) by 7% in my test,
> and 77% of transfers start using direct descriptors, whereas almost all
> transfers use only indirect descriptors in the current upstream
> implementation.

Hi Ming!

        In general, we want to use direct descriptors if we have plenty
of descriptors, and indirect if the ring is going to fill up.  I was
thinking about this just yesterday, in fact.

I've been trying to use EWMA to figure out how full the ring gets, but
so far it's not working well.  I'm still hacking on a solution though,
and your thoughts would be welcome.
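
The core of what I've tried looks something like this (field names
hypothetical; a fixed-point average scaled by 8, in the style of the
TCP srtt estimator):

	/* On each add: smooth the ring occupancy with weight 1/8.
	 * avg_in_use holds 8x the smoothed value. */
	vq->avg_in_use += vq->num_in_use - (vq->avg_in_use >> 3);

	/* Go indirect if requests of this size would overflow the
	 * ring at the smoothed occupancy level. */
	use_indirect = (vq->avg_in_use >> 3) * total_sg > vq->vq.num_free;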

Cheers,
Rusty.


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-05  6:26 UTC
  To: Rusty Russell
  Cc: Jens Axboe, Christoph Hellwig, Linux Kernel Mailing List,
	Kick In, Chris J Arges

On Fri, Sep 5, 2014 at 1:43 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> Ming Lei <ming.lei@canonical.com> writes:
>> On Tue, 02 Sep 2014 10:24:24 -0600
>> Jens Axboe <axboe@kernel.dk> wrote:
>>
>>> On 09/02/2014 10:21 AM, Christoph Hellwig wrote:
>>> > Btw, one thing we should reconsider is where we set
>>> > QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
>>> > doing the S/G merge should be a lot cheaper than fanning out into the
>>> > indirect descriptors.
>>
>> Indirect is always considered first no matter whether SG merge is off
>> or on, at least in the current virtio-blk implementation.
>>
>> But it is a good idea to try direct descriptors first: the simple change
>> below improves randread (libaio, O_DIRECT, multi-queue) by 7% in my test,
>> and 77% of transfers start using direct descriptors, whereas almost all
>> transfers use only indirect descriptors in the current upstream
>> implementation.
>
> Hi Ming!
>
>         In general, we want to use direct descriptors if we have plenty
> of descriptors, and indirect if the ring is going to fill up.  I was
> thinking about this just yesterday, in fact.

I thought about the idea further, and it looks like it isn't mature, at
least for virtio-blk:

- the queue size is a bit small, for example 128 as returned from QEMU
- so it is easy to exhaust all direct descriptors, and the queue has to
be stopped

My previous test was based on null_blk, which is quite fast, so the
above problem didn't show up.

IMO, there are at least two advantages to using indirect descriptors:

- the queue won't be stopped, because there are enough descriptors (this
may not be true for other virtio devices; for virtio-net, rx traffic is
difficult to predict)
- good cache utilization, because all the descriptors are put together

The main problem is the extra kmalloc(), which might be improved
with a memory pool.
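
Something along these lines, say (a rough sketch, names made up):

#define IND_CACHE_SG	16
static struct kmem_cache *ind_cache;	/* created at module init */

/* Back common-sized indirect tables with a kmem_cache, falling back
 * to kmalloc() for bigger ones; the free side then has to remember
 * which allocator the table came from. */
static struct vring_desc *alloc_ind_table(unsigned int total_sg, gfp_t gfp)
{
	if (total_sg <= IND_CACHE_SG)
		return kmem_cache_alloc(ind_cache, gfp);
	return kmalloc(total_sg * sizeof(struct vring_desc), gfp);
}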

>
> I've been trying to use EWMA to figure out how full the ring gets, but

Can't how full the ring is be figured out directly from num_free?

> so far it's not working well.  I'm still hacking on a solution though,
> and your thoughts would be welcome.

I am wondering whether it is easy to predict how many transfers are
coming with some mathematical model. My concern is that the cost of
stopping the queue may outweigh the advantage of using direct
descriptors.


Thanks,


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-05  6:28 UTC
  To: Rusty Russell
  Cc: Jens Axboe, Christoph Hellwig, Linux Kernel Mailing List,
	Kick In, Chris J Arges

On Fri, Sep 5, 2014 at 2:26 PM, Ming Lei <ming.lei@canonical.com> wrote:
> On Fri, Sep 5, 2014 at 1:43 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>
> The main problem is the extra kmalloc(), which might be improved
> with a memory pool.

Or use a kind of EWMA model to cache previously allocated indirect descriptors.
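
I.e. keep the most recent table per virtqueue instead of freeing it,
something like (a sketch; cached_desc/cached_sg would be new
vring_virtqueue fields):

/* Reuse the cached indirect table when the next request fits. */
static struct vring_desc *get_ind_table(struct vring_virtqueue *vq,
					unsigned int total_sg, gfp_t gfp)
{
	struct vring_desc *desc = vq->cached_desc;

	if (desc && total_sg <= vq->cached_sg) {
		vq->cached_desc = NULL;
		return desc;
	}
	return kmalloc(total_sg * sizeof(struct vring_desc), gfp);
}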


Thanks,


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Rusty Russell @ 2014-09-05 11:59 UTC
  To: Ming Lei, Jens Axboe
  Cc: Christoph Hellwig, linux-kernel, Kick In, Chris J Arges

Rusty Russell <rusty@rustcorp.com.au> writes:
> Ming Lei <ming.lei@canonical.com> writes:
>> On Tue, 02 Sep 2014 10:24:24 -0600
>> Jens Axboe <axboe@kernel.dk> wrote:
>>
>>> On 09/02/2014 10:21 AM, Christoph Hellwig wrote:
>>> > Btw, one thing we should reconsider is where we set
>>> > QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
>>> > doing the S/G merge should be a lot cheaper than fanning out into the
>>> > indirect descriptors.
>>
>> Indirect is always considered first no matter whether SG merge is off
>> or on, at least in the current virtio-blk implementation.
>>
>> But it is a good idea to try direct descriptors first: the simple change
>> below improves randread (libaio, O_DIRECT, multi-queue) by 7% in my test,
>> and 77% of transfers start using direct descriptors, whereas almost all
>> transfers use only indirect descriptors in the current upstream
>> implementation.
>
> Hi Ming!
>
>         In general, we want to use direct descriptors if we have plenty
> of descriptors, and indirect if the ring is going to fill up.  I was
> thinking about this just yesterday, in fact.
>
> I've been trying to use EWMA to figure out how full the ring gets, but
> so far it's not working well.  I'm still hacking on a solution though,
> and your thoughts would be welcome.

Here's what I have.  It seems to work as expected, but I haven't
benchmarked it.

Subject: virtio_ring: try to use direct descriptors when we're not likely to fill ring

Indirect virtio descriptors allow us to use a single ring entry for a
large scatter-gather list, at the cost of a kmalloc.  If our ring
isn't heavily used, there's no point preserving descriptors.

This patch tracks the maximum number of descriptors in the ring, with
a slow decay.  When we add a new buffer, we assume there will be that
maximum number of descriptors, and use a direct buffer if there would
be room for that many descriptors of this size.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 6d2b5310c991..2ff583477139 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -78,6 +78,11 @@ struct vring_virtqueue
 	/* Number we've added since last sync. */
 	unsigned int num_added;
 
+	/* How many descriptors have been added. */
+	unsigned int num_in_use;
+	/* Maximum descriptors in use (degrades over time). */
+	unsigned int max_in_use;
+
 	/* Last used index we've seen. */
 	u16 last_used_idx;
 
@@ -120,6 +125,31 @@ static struct vring_desc *alloc_indirect(unsigned int total_sg, gfp_t gfp)
 	return desc;
 }
 
+static bool try_indirect(struct vring_virtqueue *vq, unsigned int total_sg)
+{
+	unsigned long num_expected;
+
+	if (!vq->indirect)
+		return false;
+
+	/* Completely full?  Don't even bother with indirect alloc */
+	if (!vq->vq.num_free)
+		return false;
+
+	/* We're not going to fit?  Try indirect. */
+	if (total_sg > vq->vq.num_free)
+		return true;
+
+	/* We should be tracking this. */
+	BUG_ON(vq->max_in_use < vq->num_in_use);
+
+	/* How many more descriptors do we expect at peak usage? */
+	num_expected = vq->max_in_use - vq->num_in_use;
+
+	/* If each were this size, would they overflow? */
+	return (total_sg * num_expected > vq->vq.num_free);
+}
+
 static inline int virtqueue_add(struct virtqueue *_vq,
 				struct scatterlist *sgs[],
 				unsigned int total_sg,
@@ -162,9 +192,7 @@ static inline int virtqueue_add(struct virtqueue *_vq,
 
 	head = vq->free_head;
 
-	/* If the host supports indirect descriptor tables, and we have multiple
-	 * buffers, then go indirect. FIXME: tune this threshold */
-	if (vq->indirect && total_sg > 1 && vq->vq.num_free)
+	if (try_indirect(vq, total_sg))
 		desc = alloc_indirect(total_sg, gfp);
 	else
 		desc = NULL;
@@ -243,6 +271,14 @@ static inline int virtqueue_add(struct virtqueue *_vq,
 	virtio_wmb(vq->weak_barriers);
 	vq->vring.avail->idx++;
 	vq->num_added++;
+	vq->num_in_use++;
+
+	/* Every vq->vring.num descriptors, decay the maximum value */
+	if (unlikely(avail == 0))
+		vq->max_in_use >>= 1;
+
+	if (vq->num_in_use > vq->max_in_use)
+		vq->max_in_use = vq->num_in_use;
 
 	/* This is very unlikely, but theoretically possible.  Kick
 	 * just in case. */
@@ -515,6 +551,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
 		virtio_mb(vq->weak_barriers);
 	}
 
+	vq->num_in_use--;
 #ifdef DEBUG
 	vq->last_add_time_valid = false;
 #endif
@@ -737,6 +774,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int index,
 	vq->last_used_idx = 0;
 	vq->num_added = 0;
 	list_add_tail(&vq->vq.list, &vdev->vqs);
+	vq->num_in_use = 0;
+	vq->max_in_use = 0;
 #ifdef DEBUG
 	vq->in_use = false;
 	vq->last_add_time_valid = false;


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Rusty Russell @ 2014-09-10 23:38 UTC
  To: Ming Lei, Jens Axboe
  Cc: Christoph Hellwig, linux-kernel, Kick In, Chris J Arges

Rusty Russell <rusty@rustcorp.com.au> writes:
> Rusty Russell <rusty@rustcorp.com.au> writes:
> Here's what I have.  It seems to work as expected, but I haven't
> benchmarked it.

Hi Ming,

        Any chance you can run your benchmarks against this patch?

Thanks,
Rusty.


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-10 23:58 UTC
  To: Rusty Russell
  Cc: Jens Axboe, Christoph Hellwig, Linux Kernel Mailing List,
	Kick In, Chris J Arges

On Thu, Sep 11, 2014 at 7:38 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> Rusty Russell <rusty@rustcorp.com.au> writes:
>> Rusty Russell <rusty@rustcorp.com.au> writes:
>> Here's what I have.  It seems to work as expected, but I haven't
>> benchmarked it.
>
> Hi Ming,
>
>         Any chance you can run your benchmarks against this patch?

I can run the previous benchmark against this patch, but I am wondering
whether that is enough, since the change needs lots of test cases to verify.

BTW, it looks like your patch isn't against the upstream kernel, since I
can't find alloc_indirect() in drivers/virtio/virtio_ring.c.

Thanks,


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Rusty Russell @ 2014-09-12  1:43 UTC
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, Linux Kernel Mailing List,
	Kick In, Chris J Arges

Ming Lei <ming.lei@canonical.com> writes:
> On Thu, Sep 11, 2014 at 7:38 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>> Here's what I have.  It seems to work as expected, but I haven't
>>> benchmarked it.
>>
>> Hi Ming,
>>
>>         Any chance you can run your benchmarks against this patch?
>
> I can run the previous benchmark against this patch, but I am wondering
> whether that is enough, since the change needs lots of test cases to verify.

Indeed, I'll particularly need to run network tests as well, but you're
the one who suggested the fix for block, so I'm relying on you to see
if this helps :)

> BTW, it looks like your patch isn't against the upstream kernel, since I
> can't find alloc_indirect() in drivers/virtio/virtio_ring.c.

Yes, the code in my pending-rebases tree has been reworked.  Here's
the backport for you:

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 4d08f45a9c29..5a911e1f7462 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -78,6 +78,11 @@ struct vring_virtqueue
 	/* Number we've added since last sync. */
 	unsigned int num_added;
 
+	/* How many descriptors have been added. */
+	unsigned int num_in_use;
+	/* Maximum descriptors in use (degrades over time). */
+	unsigned int max_in_use;
+
 	/* Last used index we've seen. */
 	u16 last_used_idx;
 
@@ -184,6 +189,31 @@ static inline int vring_add_indirect(struct vring_virtqueue *vq,
 	return head;
 }
 
+static bool try_indirect(struct vring_virtqueue *vq, unsigned int total_sg)
+{
+	unsigned long num_expected;
+
+	if (!vq->indirect)
+		return false;
+
+	/* Completely full?  Don't even bother with indirect alloc */
+	if (!vq->vq.num_free)
+		return false;
+
+	/* We're not going to fit?  Try indirect. */
+	if (total_sg > vq->vq.num_free)
+		return true;
+
+	/* We should be tracking this. */
+	BUG_ON(vq->max_in_use < vq->num_in_use);
+
+	/* How many more descriptors do we expect at peak usage? */
+	num_expected = vq->max_in_use - vq->num_in_use;
+
+	/* If each were this size, would they overflow? */
+	return (total_sg * num_expected > vq->vq.num_free);
+}
+
 static inline int virtqueue_add(struct virtqueue *_vq,
 				struct scatterlist *sgs[],
 				struct scatterlist *(*next)
@@ -226,7 +256,7 @@ static inline int virtqueue_add(struct virtqueue *_vq,
 
 	/* If the host supports indirect descriptor tables, and we have multiple
 	 * buffers, then go indirect. FIXME: tune this threshold */
-	if (vq->indirect && total_sg > 1 && vq->vq.num_free) {
+	if (try_indirect(vq, total_sg)) {
 		head = vring_add_indirect(vq, sgs, next, total_sg, total_out,
 					  total_in,
 					  out_sgs, in_sgs, gfp);
@@ -291,6 +321,14 @@ add_head:
 	virtio_wmb(vq->weak_barriers);
 	vq->vring.avail->idx++;
 	vq->num_added++;
+	vq->num_in_use++;
+
+	/* Every vq->vring.num descriptors, decay the maximum value */
+	if (unlikely(avail == 0))
+		vq->max_in_use >>= 1;
+
+	if (vq->num_in_use > vq->max_in_use)
+		vq->max_in_use = vq->num_in_use;
 
 	/* This is very unlikely, but theoretically possible.  Kick
 	 * just in case. */
@@ -569,6 +607,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
 		virtio_mb(vq->weak_barriers);
 	}
 
+	vq->num_in_use--;
 #ifdef DEBUG
 	vq->last_add_time_valid = false;
 #endif
@@ -791,6 +830,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int index,
 	vq->last_used_idx = 0;
 	vq->num_added = 0;
 	list_add_tail(&vq->vq.list, &vdev->vqs);
+	vq->num_in_use = 0;
+	vq->max_in_use = 0;
 #ifdef DEBUG
 	vq->in_use = false;
 	vq->last_add_time_valid = false;


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Ming Lei @ 2014-09-13 15:15 UTC
  To: Rusty Russell
  Cc: Jens Axboe, Christoph Hellwig, Linux Kernel Mailing List,
	Kick In, Chris J Arges

Hi Rusty,

On Fri, Sep 12, 2014 at 9:43 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> Ming Lei <ming.lei@canonical.com> writes:
>> On Thu, Sep 11, 2014 at 7:38 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>>> Here's what I have.  It seems to work as expected, but I haven't
>>>> benchmarked it.
>>>
>>> Hi Ming,
>>>
>>>         Any chance you can run your benchmarks against this patch?
>>
>> I can run the previous benchmark against this patch, but I am wondering
>> whether that is enough, since the change needs lots of test cases to verify.
>
> Indeed, I'll particularly need to run network tests as well, but you're
> the one who suggested the fix for block, so I'm relying on you to see
> if this helps :)
>
>> BTW, it looks like your patch isn't against the upstream kernel, since I
>> can't find alloc_indirect() in drivers/virtio/virtio_ring.c.
>
> Yes, the code in my pending-rebases tree has been reworked.  Here's
> the backport for you:

I didn't know about your virtio tree, otherwise I could have pulled it myself :-)

My virtio-blk test and results follow:

1, FIO script
[global]
direct=1
size=128G
bsrange=${BS}-${BS}
timeout=60
numjobs=4
ioengine=libaio
iodepth=64
filename=/dev/vdb     #backed by /dev/nullb0, 4 virtqueues per virtio-blk
group_reporting=1

[f]
rw=randread

2, hypervisor (patched QEMU with multi virtqueue support)

git://kernel.ubuntu.com/ming/linux.git v3.17-block-dev_v3

3, host
- 2 sockets, 8 physical cores(16 threads), Intel Xeon 2.13GHz
- 24G RAM

4, QEMU command line
$QEMUPATH \
    -name 'kvm-test'  \
    -M pc  \
    -vga none  \
    -drive id=drive_image1,if=none,format=raw,cache=none,aio=native,file=rootfs.img
\
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,scsi=off,config-wce=off,x-data-plane=on,bus=pci.0,addr=02
\
    -drive id=drive_image3,if=none,format=raw,cache=none,aio=native,file=/dev/nullb0
\
    -device virtio-blk-pci,id=image3,drive=drive_image3,bootindex=3,scsi=off,config-wce=off,x-data-plane=on,vectors=5,num_queues=4,bus=pci.0,addr=04
\
    -device virtio-net-pci,mac=9a:be:bf:c0:c1:c2,id=idDyAIbK,vectors=4,netdev=idabMX4S,bus=pci.0,addr=05
\
    -netdev user,id=idabMX4S,hostfwd=tcp::5000-:22  \
    -m 8192  \
    -smp 4,maxcpus=4  \
    -kernel $KERNEL_IMG \
    -append 'earlyprintk console=ttyS0 mem=8192M rootfstype=ext4
root=/dev/vda rw virtio_blk.queue_depth=128 loglevel=9
no_console_suspend ip=dhcp'  \
    -nographic  \
    -rtc base=utc,clock=host,driftfix=none \
    -enable-kvm

5, result
5.1 without Rusty's virtio-vring patch
- BS=4K, throughput: 179K
- BS=256K, throughput: 27540

5.2 with Rusty's virtio-vring patch
- BS=4K, throughput: 173K
- BS=256K, throughput: 25350

It looks like throughput decreases when BS is 256K with your patch.

Thanks,
--
Ming Lei
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 4d08f45a9c29..5a911e1f7462 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -78,6 +78,11 @@ struct vring_virtqueue
>         /* Number we've added since last sync. */
>         unsigned int num_added;
>
> +       /* How many descriptors have been added. */
> +       unsigned int num_in_use;
> +       /* Maximum descriptors in use (degrades over time). */
> +       unsigned int max_in_use;
> +
>         /* Last used index we've seen. */
>         u16 last_used_idx;
>
> @@ -184,6 +189,31 @@ static inline int vring_add_indirect(struct vring_virtqueue *vq,
>         return head;
>  }
>
> +static bool try_indirect(struct vring_virtqueue *vq, unsigned int total_sg)
> +{
> +       unsigned long num_expected;
> +
> +       if (!vq->indirect)
> +               return false;
> +
> +       /* Completely full?  Don't even bother with indirect alloc */
> +       if (!vq->vq.num_free)
> +               return false;
> +
> +       /* We're not going to fit?  Try indirect. */
> +       if (total_sg > vq->vq.num_free)
> +               return true;
> +
> +       /* We should be tracking this. */
> +       BUG_ON(vq->max_in_use < vq->num_in_use);
> +
> +       /* How many more descriptors do we expect at peak usage? */
> +       num_expected = vq->max_in_use - vq->num_in_use;
> +
> +       /* If each were this size, would they overflow? */
> +       return (total_sg * num_expected > vq->vq.num_free);
> +}
> +
>  static inline int virtqueue_add(struct virtqueue *_vq,
>                                 struct scatterlist *sgs[],
>                                 struct scatterlist *(*next)
> @@ -226,7 +256,7 @@ static inline int virtqueue_add(struct virtqueue *_vq,
>
>         /* If the host supports indirect descriptor tables, and we have multiple
>          * buffers, then go indirect. FIXME: tune this threshold */
> -       if (vq->indirect && total_sg > 1 && vq->vq.num_free) {
> +       if (try_indirect(vq, total_sg)) {
>                 head = vring_add_indirect(vq, sgs, next, total_sg, total_out,
>                                           total_in,
>                                           out_sgs, in_sgs, gfp);
> @@ -291,6 +321,14 @@ add_head:
>         virtio_wmb(vq->weak_barriers);
>         vq->vring.avail->idx++;
>         vq->num_added++;
> +       vq->num_in_use++;
> +
> +       /* Every vq->vring.num descriptors, decay the maximum value */
> +       if (unlikely(avail == 0))
> +               vq->max_in_use >>= 1;
> +
> +       if (vq->num_in_use > vq->max_in_use)
> +               vq->max_in_use = vq->num_in_use;
>
>         /* This is very unlikely, but theoretically possible.  Kick
>          * just in case. */
> @@ -569,6 +607,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
>                 virtio_mb(vq->weak_barriers);
>         }
>
> +       vq->num_in_use--;
>  #ifdef DEBUG
>         vq->last_add_time_valid = false;
>  #endif
> @@ -791,6 +830,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int index,
>         vq->last_used_idx = 0;
>         vq->num_added = 0;
>         list_add_tail(&vq->vq.list, &vdev->vqs);
> +       vq->num_in_use = 0;
> +       vq->max_in_use = 0;
>  #ifdef DEBUG
>         vq->in_use = false;
>         vq->last_add_time_valid = false;


* Re: [PATCH] blk-merge: fix blk_recount_segments
From: Rusty Russell @ 2014-10-14  3:54 UTC
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, Linux Kernel Mailing List,
	Kick In, Chris J Arges

Ming Lei <ming.lei@canonical.com> writes:
> Hi Rusty,

Hi Ming! 

        Sorry, I was on vacation.  I'm back and slowly working through
all my mail...

> 1, FIO script
> [global]
> direct=1
> size=128G
> bsrange=${BS}-${BS}
> timeout=60
> numjobs=4
> ioengine=libaio
> iodepth=64
> filename=/dev/vdb     #backed by /dev/nullb0, 4 virtqueues per virtio-blk
> group_reporting=1
>
> [f]
> rw=randread

> 5, result
> 5.1 without Rusty's virtio-vring patch
> - BS=4K, throughput: 179K
> - BS=256K, throughput: 27540

(ie. always using indirect)

> 5.2 with Rusty's virtio-vring patch
> - BS=4K, throughput: 173K
> - BS=256K, throughput: 25350

(ie. usually using indirect).

> It looks like throughput decreases when BS is 256K with your patch.

Interesting.  Looks like we're ending up with fewer descs in flight,
though I'm surprised: with 256K blocks I'd expect us to hit the
steady state of "all indirect" almost immediately.

Hmm... I think the heuristic in my patch is flawed.  Let me try again,
and get back to you.

Thanks,
Rusty.

