* bcache strange behaviour in write back mode
From: Jack Wang @ 2013-04-22 19:15 UTC
  To: linux-bcache@vger.kernel.org, koverstreet@google.com
  Cc: dongsu.park@profitbricks.com

Hi all,

We've seen strange behaviour in bcache write-back mode on the current
bcache-testing branch with the "Possible allocator fix" applied:

Once I start writing data with "dd if=/dev/zero of=/dev/bcache0 bs=4k
count=10000 oflag=sync", all SSDs in the pool go close to 100% util and
I see about 3600 writes/second in iostat for each disk in the pool, BUT
essentially no throughput in terms of data written.

Then, after some seconds (the flush interval of bcache), I see the
writeback flush and data written to the pool SSDs, which looks very
much as if that data had been reordered and merged.

bcache-3.2 does not have this problem; only bcache (master) and
bcache-testing do.

What could be the reason?

Regards,
Jack
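
For reference, oflag=sync makes dd open the device O_SYNC, so every
4 KiB write has to reach stable storage before write() returns; on a
device that advertises a volatile write cache, that typically means a
cache flush per write. A minimal standalone C equivalent of the
reproducer (illustration only; it assumes /dev/bcache0 exists and
overwrites it with zeros):

/*
 * Minimal C equivalent of the dd reproducer above (illustration
 * only: this overwrites /dev/bcache0 with zeros, like the dd run).
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	static char buf[4096];			/* bs=4k, if=/dev/zero */

	/* oflag=sync == O_SYNC: every write(2) must reach stable
	 * storage before it returns. */
	int fd = open("/dev/bcache0", O_WRONLY | O_SYNC);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(buf, 0, sizeof(buf));

	for (int i = 0; i < 10000; i++) {	/* count=10000 */
		if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
			perror("write");
			return 1;
		}
	}

	return close(fd) ? 1 : 0;
}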


* Re: bcache strange behaviour in write back mode
From: Kent Overstreet @ 2013-04-22 20:26 UTC
  To: Jack Wang
  Cc: linux-bcache@vger.kernel.org, Kent Overstreet,
	dongsu.park@profitbricks.com

So you can easily reproduce this? If so, that's _awesome_ news - would
you be willing to try out a debug kernel with some tracing stuff
added? Maybe we can finally nail this.

On Mon, Apr 22, 2013 at 12:15 PM, Jack Wang <jinpu.wang@profitbricks.com> wrote:
> Hi all,
>
> We've seen strange behaviour in bcache write-back mode on the current
> bcache-testing branch with the "Possible allocator fix" applied:
>
> Once I start writing data with "dd if=/dev/zero of=/dev/bcache0 bs=4k
> count=10000 oflag=sync", all SSDs in the pool go close to 100% util and
> I see about 3600 writes/second in iostat for each disk in the pool, BUT
> essentially no throughput in terms of data written.
>
> Then, after some seconds (the flush interval of bcache), I see the
> writeback flush and data written to the pool SSDs, which looks very
> much as if that data had been reordered and merged.
>
> bcache-3.2 does not have this problem; only bcache (master) and
> bcache-testing do.
>
> What could be the reason?
>
> Regards,
> Jack


* Re: bcache strange behaviour in write back mode
From: Jack Wang @ 2013-04-22 20:27 UTC
  To: Kent Overstreet
  Cc: linux-bcache@vger.kernel.org, Kent Overstreet,
	dongsu.park@profitbricks.com

On 2013-04-22 22:19, Kent Overstreet wrote:
> So you can easily reproduce this? If so, that's _awesome_ news - would
> you be willing to try out a debug kernel with some tracing stuff added?
> Maybe we can finally nail this.
> 
> 
Thanks for the reply, Kent. Two of my colleagues saw this behaviour, so
I think we can reproduce it. If you could give me a more detailed guide
to narrow it down, I can try it on my side.

Regards,
Jack
> On Mon, Apr 22, 2013 at 12:15 PM, Jack Wang <jinpu.wang@profitbricks.com> wrote:
> 
>     Hi all,
> 
>     We've seen strange behaviour in bcache write-back mode on the current
>     bcache-testing branch with the "Possible allocator fix" applied:
> 
>     Once I start writing data with "dd if=/dev/zero of=/dev/bcache0 bs=4k
>     count=10000 oflag=sync", all SSDs in the pool go close to 100% util and
>     I see about 3600 writes/second in iostat for each disk in the pool, BUT
>     essentially no throughput in terms of data written.
> 
>     Then, after some seconds (the flush interval of bcache), I see the
>     writeback flush and data written to the pool SSDs, which looks very
>     much as if that data had been reordered and merged.
> 
>     bcache-3.2 does not have this problem; only bcache (master) and
>     bcache-testing do.
> 
>     What could be the reason?
> 
>     Regards,
>     Jack


* Re: bcache strange behaviour in write back mode
From: Kent Overstreet @ 2013-04-22 21:51 UTC
  To: Jack Wang
  Cc: Kent Overstreet, linux-bcache@vger.kernel.org,
	dongsu.park@profitbricks.com

On Mon, Apr 22, 2013 at 10:27:03PM +0200, Jack Wang wrote:
> Thanks for the reply, Kent. Two of my colleagues saw this behaviour,
> so I think we can reproduce it. If you could give me a more detailed
> guide to narrow it down, I can try it on my side.

So, my current hypothesis is that the problem is the allocator spinning,
and the IO is from it continually rewriting prios/gens.

But I'm still not sure what's causing the allocator to spin; that's
what the last patch was supposed to fix.

Can you see if you can reproduce it with this patch, and then tell me
what shows up in the dmesg log? I expect you'll get a _lot_ of output -
flip timestamps on in your kernel config, if they're not already on.
Thanks!

commit 60a09d37d301f88dd0f0f413408821a067966d1a
Author: Kent Overstreet <koverstreet@google.com>
Date:   Mon Apr 22 14:49:33 2013 -0700

    bcache: Allocator debug patch

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index 2879487..37c22c6 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -393,12 +393,15 @@ void bch_allocator_thread(struct closure *cl)
 				allocator_wait(ca, !list_empty(&ca->discards));
 				do_discard(ca, bucket);
 			} else {
-				fifo_push(&ca->free, bucket);
+				BUG_ON(!fifo_push(&ca->free, bucket));
 				closure_wake_up(&ca->set->bucket_wait);
 			}
 		}
 
 		allocator_wait(ca, ca->set->gc_mark_valid);
+
+		printk(KERN_DEBUG "bcache: invalidating buckets: free_inc %zu/%zu\n",
+		       fifo_used(&ca->free_inc), ca->free_inc.size);
 		invalidate_buckets(ca);
 
 		allocator_wait(ca, !atomic_read(&ca->set->prio_blocked) ||
@@ -407,8 +410,12 @@ void bch_allocator_thread(struct closure *cl)
 		if (CACHE_SYNC(&ca->set->sb) &&
 		    (!fifo_empty(&ca->free_inc) ||
 		     ca->need_save_prio > 64)) {
+			printk(KERN_DEBUG "bcache: writing prios: free_inc %zu/%zu\n",
+			       fifo_used(&ca->free_inc), ca->free_inc.size);
 			bch_prio_write(ca);
-		}
+		} else
+			printk(KERN_DEBUG "bcache: not writing prios: free_inc %zu/%zu\n",
+			       fifo_used(&ca->free_inc), ca->free_inc.size);
 	}
 }
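
The fifo_push()/fifo_used() calls above are bcache's ring-buffer FIFO
from drivers/md/bcache/util.h. As a rough standalone model of the
semantics the patch relies on (not the actual macro-based kernel
implementation):

/*
 * Model of bcache's FIFO semantics: fifo_push() fails when the
 * buffer is full (hence the BUG_ON above), and fifo_used() is the
 * occupancy the new printks report as "free_inc N/size".
 */
#include <stdbool.h>
#include <stddef.h>

struct fifo {
	size_t	*data;
	size_t	size;	/* capacity; cf. the "/127" later in the thread */
	size_t	front;
	size_t	used;
};

static size_t fifo_used(const struct fifo *f)
{
	return f->used;
}

static bool fifo_push(struct fifo *f, size_t v)
{
	if (f->used == f->size)
		return false;	/* full: would trip the BUG_ON */
	f->data[(f->front + f->used++) % f->size] = v;
	return true;
}

static bool fifo_pop(struct fifo *f, size_t *v)
{
	if (!f->used)
		return false;	/* empty: "free_inc 0/..." */
	*v = f->data[f->front];
	f->front = (f->front + 1) % f->size;
	f->used--;
	return true;
}

int main(void)
{
	size_t slots[127], v;
	struct fifo free_inc = { .data = slots, .size = 127 };

	fifo_push(&free_inc, 42);
	return fifo_pop(&free_inc, &v) ? 0 : 1;
}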


* Re: bcache strange behaviour in write back mode
From: Jack Wang @ 2013-04-23 11:56 UTC
  To: Kent Overstreet
  Cc: Kent Overstreet, linux-bcache@vger.kernel.org,
	dongsu.park@profitbricks.com

Hi Kent,

After some testing, we saw logs like the following in dmesg:

[ 1505.282400] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
[ 1505.285843] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
[ 1508.005957] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
[ 1508.009357] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
[ 1512.493609] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
[ 1512.497070] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127

Does this give you any clue?

Regards,
Jack
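
Read against the debug patch, this output says that free_inc is empty
on every pass (0 of 127 slots used): the allocator keeps invalidating
buckets without ever obtaining a reusable one, so the prio-write
branch is never taken. A reduced, runnable sketch of the loop that
produces exactly these log pairs (the variable names and the 64
threshold come from the patch; everything else is simplified):

/*
 * Reduced sketch of the patched bch_allocator_thread() loop: when
 * invalidate_buckets() never yields buckets, every pass prints the
 * "invalidating buckets" / "not writing prios" pair seen above.
 */
#include <stdbool.h>
#include <stdio.h>

static size_t free_inc_used;		/* stuck at 0 in this failure */
static const size_t free_inc_size = 127;
static unsigned need_save_prio;		/* stays below the threshold */
static const bool cache_sync = true;	/* CACHE_SYNC() in the patch */

static void invalidate_buckets(void)
{
	/* Failure mode: no bucket is eligible, so nothing is ever
	 * moved into free_inc and free_inc_used stays 0. */
}

int main(void)
{
	for (int pass = 0; pass < 3; pass++) {
		printf("bcache: invalidating buckets: free_inc %zu/%zu\n",
		       free_inc_used, free_inc_size);
		invalidate_buckets();

		if (cache_sync &&
		    (free_inc_used != 0 || need_save_prio > 64))
			printf("bcache: writing prios: free_inc %zu/%zu\n",
			       free_inc_used, free_inc_size);
		else
			printf("bcache: not writing prios: free_inc %zu/%zu\n",
			       free_inc_used, free_inc_size);
	}
	return 0;
}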

On 04/22/2013 11:51 PM, Kent Overstreet wrote:
> On Mon, Apr 22, 2013 at 10:27:03PM +0200, Jack Wang wrote:
>> Thanks for the reply, Kent. Two of my colleagues saw this behaviour,
>> so I think we can reproduce it. If you could give me a more detailed
>> guide to narrow it down, I can try it on my side.
> 
> So, my current hypothesis is that the problem is the allocator spinning,
> and the IO is from it continually rewriting prios/gens.
> 
> But I'm still not sure what's causing the allocator to spin; that's
> what the last patch was supposed to fix.
> 
> Can you see if you can reproduce it with this patch, and then tell me
> what shows up in the dmesg log? I expect you'll get a _lot_ of output -
> flip timestamps on in your kernel config, if they're not already on.
> Thanks!
> 
> commit 60a09d37d301f88dd0f0f413408821a067966d1a
> Author: Kent Overstreet <koverstreet@google.com>
> Date:   Mon Apr 22 14:49:33 2013 -0700
> 
>     bcache: Allocator debug patch
> 
> diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
> index 2879487..37c22c6 100644
> --- a/drivers/md/bcache/alloc.c
> +++ b/drivers/md/bcache/alloc.c
> @@ -393,12 +393,15 @@ void bch_allocator_thread(struct closure *cl)
>  				allocator_wait(ca, !list_empty(&ca->discards));
>  				do_discard(ca, bucket);
>  			} else {
> -				fifo_push(&ca->free, bucket);
> +				BUG_ON(!fifo_push(&ca->free, bucket));
>  				closure_wake_up(&ca->set->bucket_wait);
>  			}
>  		}
>  
>  		allocator_wait(ca, ca->set->gc_mark_valid);
> +
> +		printk(KERN_DEBUG "bcache: invalidating buckets: free_inc %zu/%zu\n",
> +		       fifo_used(&ca->free_inc), ca->free_inc.size);
>  		invalidate_buckets(ca);
>  
>  		allocator_wait(ca, !atomic_read(&ca->set->prio_blocked) ||
> @@ -407,8 +410,12 @@ void bch_allocator_thread(struct closure *cl)
>  		if (CACHE_SYNC(&ca->set->sb) &&
>  		    (!fifo_empty(&ca->free_inc) ||
>  		     ca->need_save_prio > 64)) {
> +			printk(KERN_DEBUG "bcache: writing prios: free_inc %zu/%zu\n",
> +			       fifo_used(&ca->free_inc), ca->free_inc.size);
>  			bch_prio_write(ca);
> -		}
> +		} else
> +			printk(KERN_DEBUG "bcache: not writing prios: free_inc %zu/%zu\n",
> +			       fifo_used(&ca->free_inc), ca->free_inc.size);
>  	}
>  }
> 


* Re: bcache strange behaviour in write back mode
From: Kent Overstreet @ 2013-04-23 18:40 UTC
  To: Jack Wang
  Cc: Kent Overstreet, linux-bcache@vger.kernel.org,
	dongsu.park@profitbricks.com

Well _that's_ interesting: the implication is that garbage collection
is somehow getting stuck and unable to finish.

I'll probably have another debug patch for you later today.
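
The dependency being described: the allocator may only invalidate
buckets whose marks garbage collection has refreshed, so a gc pass
that never finishes starves the allocator indefinitely. A userspace
model of that handshake (the real code is built on bcache closures,
not pthreads; only the gc_mark_valid name comes from the patch):

/*
 * Model of the allocator_wait(ca, ca->set->gc_mark_valid) handshake:
 * the allocator parks until gc republishes valid bucket marks.  If
 * the gc thread never gets there, the allocator waits forever.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool gc_mark_valid;

static void *gc_thread(void *arg)
{
	(void)arg;
	sleep(1);			/* a gc pass that takes a while */

	pthread_mutex_lock(&lock);
	gc_mark_valid = true;		/* never reached if gc is stuck */
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t gc;

	pthread_create(&gc, NULL, gc_thread, NULL);

	/* allocator_wait(ca, ca->set->gc_mark_valid), roughly: */
	pthread_mutex_lock(&lock);
	while (!gc_mark_valid)
		pthread_cond_wait(&cond, &lock);
	pthread_mutex_unlock(&lock);

	puts("allocator: gc marks valid, invalidating buckets");
	pthread_join(gc, NULL);
	return 0;
}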

On Tue, Apr 23, 2013 at 4:56 AM, Jack Wang <jinpu.wang@profitbricks.com> wrote:
> Hi Kent,
>
> After some testing, we saw logs like the following in dmesg:
>
> [ 1505.282400] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
> [ 1505.285843] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
> [ 1508.005957] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
> [ 1508.009357] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
> [ 1512.493609] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
> [ 1512.497070] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
>
> Does this give you any clue?
>
> Regards,
> Jack
>
> On 04/22/2013 11:51 PM, Kent Overstreet wrote:
>> On Mon, Apr 22, 2013 at 10:27:03PM +0200, Jack Wang wrote:
>>> Thanks for the reply, Kent. Two of my colleagues saw this behaviour,
>>> so I think we can reproduce it. If you could give me a more detailed
>>> guide to narrow it down, I can try it on my side.
>>
>> So, my current hypothesis is that the problem is the allocator spinning,
>> and the IO is from it continually rewriting prios/gens.
>>
>> But I'm still not sure what's causing the allocator to spin; that's
>> what the last patch was supposed to fix.
>>
>> Can you see if you can reproduce it with this patch, and then tell me
>> what shows up in the dmesg log? I expect you'll get a _lot_ of output -
>> flip timestamps on in your kernel config, if they're not already on.
>> Thanks!
>>
>> commit 60a09d37d301f88dd0f0f413408821a067966d1a
>> Author: Kent Overstreet <koverstreet@google.com>
>> Date:   Mon Apr 22 14:49:33 2013 -0700
>>
>>     bcache: Allocator debug patch
>>
>> diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
>> index 2879487..37c22c6 100644
>> --- a/drivers/md/bcache/alloc.c
>> +++ b/drivers/md/bcache/alloc.c
>> @@ -393,12 +393,15 @@ void bch_allocator_thread(struct closure *cl)
>>                               allocator_wait(ca, !list_empty(&ca->discards));
>>                               do_discard(ca, bucket);
>>                       } else {
>> -                             fifo_push(&ca->free, bucket);
>> +                             BUG_ON(!fifo_push(&ca->free, bucket));
>>                               closure_wake_up(&ca->set->bucket_wait);
>>                       }
>>               }
>>
>>               allocator_wait(ca, ca->set->gc_mark_valid);
>> +
>> +             printk(KERN_DEBUG "bcache: invalidating buckets: free_inc %zu/%zu\n",
>> +                    fifo_used(&ca->free_inc), ca->free_inc.size);
>>               invalidate_buckets(ca);
>>
>>               allocator_wait(ca, !atomic_read(&ca->set->prio_blocked) ||
>> @@ -407,8 +410,12 @@ void bch_allocator_thread(struct closure *cl)
>>               if (CACHE_SYNC(&ca->set->sb) &&
>>                   (!fifo_empty(&ca->free_inc) ||
>>                    ca->need_save_prio > 64)) {
>> +                     printk(KERN_DEBUG "bcache: writing prios: free_inc %zu/%zu\n",
>> +                            fifo_used(&ca->free_inc), ca->free_inc.size);
>>                       bch_prio_write(ca);
>> -             }
>> +             } else
>> +                     printk(KERN_DEBUG "bcache: not writing prios: free_inc %zu/%zu\n",
>> +                            fifo_used(&ca->free_inc), ca->free_inc.size);
>>       }
>>  }
>>
>


* Re: bcache strange behaviour in write back mode
From: Jack Wang @ 2013-04-24 6:56 UTC
  To: Kent Overstreet
  Cc: Kent Overstreet, linux-bcache@vger.kernel.org,
	dongsu.park@profitbricks.com

On 04/23/2013 08:40 PM, Kent Overstreet wrote:
> Well _that's_ interesting: the implication is that garbage collection
> is somehow getting stuck and unable to finish.
> 
> I'll probably have another debug patch for you later today.

Hi Kent,

If you have any possible patch, please don't hesitate to send it to
me; I can help test it, and I hope we can finally fix this.

Regards,
Jack
> 
> On Tue, Apr 23, 2013 at 4:56 AM, Jack Wang <jinpu.wang@profitbricks.com> wrote:
>> Hi Kent,
>>
>> After some testing, we saw logs like the following in dmesg:
>>
>> [ 1505.282400] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
>> [ 1505.285843] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
>> [ 1508.005957] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
>> [ 1508.009357] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
>> [ 1512.493609] bcache: bch_allocator_thread() bcache: invalidating buckets: free_inc 0/127
>> [ 1512.497070] bcache: bch_allocator_thread() bcache: not writing prios: free_inc 0/127
>>
>> Does this give you any clue?
>>
>> Regards,
>> Jack
>>
>> On 04/22/2013 11:51 PM, Kent Overstreet wrote:
>>> On Mon, Apr 22, 2013 at 10:27:03PM +0200, Jack Wang wrote:
>>>> Thanks for the reply, Kent. Two of my colleagues saw this behaviour,
>>>> so I think we can reproduce it. If you could give me a more detailed
>>>> guide to narrow it down, I can try it on my side.
>>>
>>> So, my current hypothesis is that the problem is the allocator spinning,
>>> and the IO is from it continually rewriting prios/gens.
>>>
>>> But I'm still not sure what's causing the allocator to spin; that's
>>> what the last patch was supposed to fix.
>>>
>>> Can you see if you can reproduce it with this patch, and then tell me
>>> what shows up in the dmesg log? I expect you'll get a _lot_ of output -
>>> flip timestamps on in your kernel config, if they're not already on.
>>> Thanks!
>>>
>>> commit 60a09d37d301f88dd0f0f413408821a067966d1a
>>> Author: Kent Overstreet <koverstreet@google.com>
>>> Date:   Mon Apr 22 14:49:33 2013 -0700
>>>
>>>     bcache: Allocator debug patch
>>>
>>> diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
>>> index 2879487..37c22c6 100644
>>> --- a/drivers/md/bcache/alloc.c
>>> +++ b/drivers/md/bcache/alloc.c
>>> @@ -393,12 +393,15 @@ void bch_allocator_thread(struct closure *cl)
>>>                               allocator_wait(ca, !list_empty(&ca->discards));
>>>                               do_discard(ca, bucket);
>>>                       } else {
>>> -                             fifo_push(&ca->free, bucket);
>>> +                             BUG_ON(!fifo_push(&ca->free, bucket));
>>>                               closure_wake_up(&ca->set->bucket_wait);
>>>                       }
>>>               }
>>>
>>>               allocator_wait(ca, ca->set->gc_mark_valid);
>>> +
>>> +             printk(KERN_DEBUG "bcache: invalidating buckets: free_inc %zu/%zu\n",
>>> +                    fifo_used(&ca->free_inc), ca->free_inc.size);
>>>               invalidate_buckets(ca);
>>>
>>>               allocator_wait(ca, !atomic_read(&ca->set->prio_blocked) ||
>>> @@ -407,8 +410,12 @@ void bch_allocator_thread(struct closure *cl)
>>>               if (CACHE_SYNC(&ca->set->sb) &&
>>>                   (!fifo_empty(&ca->free_inc) ||
>>>                    ca->need_save_prio > 64)) {
>>> +                     printk(KERN_DEBUG "bcache: writing prios: free_inc %zu/%zu\n",
>>> +                            fifo_used(&ca->free_inc), ca->free_inc.size);
>>>                       bch_prio_write(ca);
>>> -             }
>>> +             } else
>>> +                     printk(KERN_DEBUG "bcache: not writing prios: free_inc %zu/%zu\n",
>>> +                            fifo_used(&ca->free_inc), ca->free_inc.size);
>>>       }
>>>  }
>>>
>>


* Re: bcache strange behaviour in write back mode
From: Jack Wang @ 2013-04-24 19:49 UTC
  To: linux-bcache@vger.kernel.org, koverstreet@google.com
  Cc: dongsu.park@profitbricks.com

I reverted this commit:

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index e5ff12e..2f36743 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -489,6 +489,12 @@ static void bch_insert_data_loop(struct closure *cl)
 		bch_queue_gc(op->c);
 	}

+	/*
+	 * Journal writes are marked REQ_FLUSH; if the original write was a
+	 * flush, it'll wait on the journal write.
+	 */
+	bio->bi_rw &= ~(REQ_FLUSH|REQ_FUA);
+
 	do {
 		unsigned i;
 		struct bkey *k;
@@ -716,7 +722,7 @@ static struct search *search_alloc(struct bio *bio, struct bcache_device *d)
 	s->task			= current;
 	s->orig_bio		= bio;
 	s->write		= (bio->bi_rw & REQ_WRITE) != 0;
-	s->op.flush_journal	= (bio->bi_rw & REQ_FLUSH) != 0;
+	s->op.flush_journal	= (bio->bi_rw & (REQ_FLUSH|REQ_FUA)) != 0;
 	s->op.skip		= (bio->bi_rw & REQ_DISCARD) != 0;
 	s->recoverable		= 1;
 	s->start_time		= jiffies;
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 6817ea4..0932580 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -766,6 +768,8 @@ static int bcache_device_init(struct bcache_device *d, unsigned block_size)
 	set_bit(QUEUE_FLAG_NONROT,	&d->disk->queue->queue_flags);
 	set_bit(QUEUE_FLAG_DISCARD,	&d->disk->queue->queue_flags);

+	blk_queue_flush(q, REQ_FLUSH|REQ_FUA);
+
 	return 0;
 }

With that reverted, the strange behaviour is gone. I also checked
bcache-testing; it no longer contains that commit, so maybe I got lost
in the git tree updates. Anyway, thanks, Kent, for your kind support.

Jack
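
For context on why the revert helps: the commit above makes the
bcache device advertise FLUSH/FUA support and routes each flagged
write through a journal flush, so an oflag=sync workload now causes a
flush round trip per 4 KiB write; without blk_queue_flush() set, the
block layer filters those flags out before they reach bcache. A
standalone sketch of the flag handling the commit adds (simplified;
the helper name bch_write_prep() is invented for illustration, and
the real code operates on struct bio inside the kernel):

/*
 * Sketch of the flush handling in the commit above: record that the
 * caller wanted durability, then strip FLUSH/FUA from the data write
 * itself, because the journal write (issued with REQ_FLUSH) already
 * provides it.
 */
#include <stdbool.h>
#include <stdio.h>

#define REQ_FLUSH	(1u << 0)	/* illustrative bit values */
#define REQ_FUA		(1u << 1)

struct op {
	bool flush_journal;
};

static unsigned bch_write_prep(unsigned bi_rw, struct op *op)
{
	/* s->op.flush_journal = (bio->bi_rw & (REQ_FLUSH|REQ_FUA)) != 0; */
	op->flush_journal = (bi_rw & (REQ_FLUSH | REQ_FUA)) != 0;

	/* bio->bi_rw &= ~(REQ_FLUSH|REQ_FUA); */
	return bi_rw & ~(REQ_FLUSH | REQ_FUA);
}

int main(void)
{
	struct op op = { false };
	unsigned rw = bch_write_prep(REQ_FLUSH, &op);

	printf("flush_journal=%d, FLUSH still on data write=%d\n",
	       op.flush_journal, !!(rw & REQ_FLUSH));
	return 0;
}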

On 2013-04-22 21:15, Jack Wang wrote:
> Hi all,
> 
> We've seen strange behaviour in bcache write-back mode on the current
> bcache-testing branch with the "Possible allocator fix" applied:
> 
> Once I start writing data with "dd if=/dev/zero of=/dev/bcache0 bs=4k
> count=10000 oflag=sync", all SSDs in the pool go close to 100% util and
> I see about 3600 writes/second in iostat for each disk in the pool, BUT
> essentially no throughput in terms of data written.
> 
> Then, after some seconds (the flush interval of bcache), I see the
> writeback flush and data written to the pool SSDs, which looks very
> much as if that data had been reordered and merged.
> 
> bcache-3.2 does not have this problem; only bcache (master) and
> bcache-testing do.
> 
> What could be the reason?
> 
> Regards,
> Jack
> 


* Re: bcache strange behaviour in write back mode
From: Kent Overstreet @ 2013-04-24 20:13 UTC
  To: Jack Wang
  Cc: linux-bcache@vger.kernel.org, koverstreet@google.com,
	dongsu.park@profitbricks.com

Ah, you would've been running an old version of the bcache-testing branch...

Someone else bisected a different bug to that commit, so that one's
staying in the dev branch for now.

I just pushed the allocator fix to the bcache-for-upstream branch, can
you give that a try?

On Wed, Apr 24, 2013 at 12:49 PM, Jack Wang <jinpu.wang@profitbricks.com> wrote:
> I reverted this commit:
>
> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
> index e5ff12e..2f36743 100644
> --- a/drivers/md/bcache/request.c
> +++ b/drivers/md/bcache/request.c
> @@ -489,6 +489,12 @@ static void bch_insert_data_loop(struct closure *cl)
>                 bch_queue_gc(op->c);
>         }
>
> +       /*
> +        * Journal writes are marked REQ_FLUSH; if the original write was a
> +        * flush, it'll wait on the journal write.
> +        */
> +       bio->bi_rw &= ~(REQ_FLUSH|REQ_FUA);
> +
>         do {
>                 unsigned i;
>                 struct bkey *k;
> @@ -716,7 +722,7 @@ static struct search *search_alloc(struct bio *bio, struct bcache_device *d)
>         s->task                 = current;
>         s->orig_bio             = bio;
>         s->write                = (bio->bi_rw & REQ_WRITE) != 0;
> -       s->op.flush_journal     = (bio->bi_rw & REQ_FLUSH) != 0;
> +       s->op.flush_journal     = (bio->bi_rw & (REQ_FLUSH|REQ_FUA)) != 0;
>         s->op.skip              = (bio->bi_rw & REQ_DISCARD) != 0;
>         s->recoverable          = 1;
>         s->start_time           = jiffies;
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 6817ea4..0932580 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -766,6 +768,8 @@ static int bcache_device_init(struct bcache_device *d, unsigned block_size)
>         set_bit(QUEUE_FLAG_NONROT,      &d->disk->queue->queue_flags);
>         set_bit(QUEUE_FLAG_DISCARD,     &d->disk->queue->queue_flags);
>
> +       blk_queue_flush(q, REQ_FLUSH|REQ_FUA);
> +
>         return 0;
>  }
>
> With that reverted, the strange behaviour is gone. I also checked
> bcache-testing; it no longer contains that commit, so maybe I got lost
> in the git tree updates. Anyway, thanks, Kent, for your kind support.
>
> Jack
>
> On 2013-04-22 21:15, Jack Wang wrote:
>> Hi all,
>>
>> We've seen strange behaviour in bcache write-back mode on the current
>> bcache-testing branch with the "Possible allocator fix" applied:
>>
>> Once I start writing data with "dd if=/dev/zero of=/dev/bcache0 bs=4k
>> count=10000 oflag=sync", all SSDs in the pool go close to 100% util and
>> I see about 3600 writes/second in iostat for each disk in the pool, BUT
>> essentially no throughput in terms of data written.
>>
>> Then, after some seconds (the flush interval of bcache), I see the
>> writeback flush and data written to the pool SSDs, which looks very
>> much as if that data had been reordered and merged.
>>
>> bcache-3.2 does not have this problem; only bcache (master) and
>> bcache-testing do.
>>
>> What could be the reason?
>>
>> Regards,
>> Jack
>>
>


* Re: bcache strange behaviour in write back mode
From: Jack Wang @ 2013-04-24 20:29 UTC
  To: Kent Overstreet
  Cc: linux-bcache@vger.kernel.org, koverstreet@google.com,
	dongsu.park@profitbricks.com

On 2013-04-24 22:13, Kent Overstreet wrote:
> Ah, you would've been running an old version of the bcache-testing branch...
> 
> Someone else bisected a different bug to that commit, so that one's
> staying in the dev branch for now.
> 
> I just pushed the allocator fix to the bcache-for-upstream branch, can
> you give that a try?
Yeah, I must have been sitting on an old version.

Sure, I will try it and report back the results.
> 
> On Wed, Apr 24, 2013 at 12:49 PM, Jack Wang <jinpu.wang@profitbricks.com> wrote:
>> I reverted this commit:
>>
>> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
>> index e5ff12e..2f36743 100644
>> --- a/drivers/md/bcache/request.c
>> +++ b/drivers/md/bcache/request.c
>> @@ -489,6 +489,12 @@ static void bch_insert_data_loop(struct closure *cl)
>>                 bch_queue_gc(op->c);
>>         }
>>
>> +       /*
>> +        * Journal writes are marked REQ_FLUSH; if the original write was a
>> +        * flush, it'll wait on the journal write.
>> +        */
>> +       bio->bi_rw &= ~(REQ_FLUSH|REQ_FUA);
>> +
>>         do {
>>                 unsigned i;
>>                 struct bkey *k;
>> @@ -716,7 +722,7 @@ static struct search *search_alloc(struct bio *bio, struct bcache_device *d)
>>         s->task                 = current;
>>         s->orig_bio             = bio;
>>         s->write                = (bio->bi_rw & REQ_WRITE) != 0;
>> -       s->op.flush_journal     = (bio->bi_rw & REQ_FLUSH) != 0;
>> +       s->op.flush_journal     = (bio->bi_rw & (REQ_FLUSH|REQ_FUA)) != 0;
>>         s->op.skip              = (bio->bi_rw & REQ_DISCARD) != 0;
>>         s->recoverable          = 1;
>>         s->start_time           = jiffies;
>> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
>> index 6817ea4..0932580 100644
>> --- a/drivers/md/bcache/super.c
>> +++ b/drivers/md/bcache/super.c
>> @@ -766,6 +768,8 @@ static int bcache_device_init(struct bcache_device *d, unsigned block_size)
>>         set_bit(QUEUE_FLAG_NONROT,      &d->disk->queue->queue_flags);
>>         set_bit(QUEUE_FLAG_DISCARD,     &d->disk->queue->queue_flags);
>>
>> +       blk_queue_flush(q, REQ_FLUSH|REQ_FUA);
>> +
>>         return 0;
>>  }
>>
>> With that reverted, the strange behaviour is gone. I also checked
>> bcache-testing; it no longer contains that commit, so maybe I got lost
>> in the git tree updates. Anyway, thanks, Kent, for your kind support.
>>
>> Jack
>>
>> On 2013-04-22 21:15, Jack Wang wrote:
>>> Hi all,
>>>
>>> We've seen strange behaviour in bcache write-back mode on the current
>>> bcache-testing branch with the "Possible allocator fix" applied:
>>>
>>> Once I start writing data with "dd if=/dev/zero of=/dev/bcache0 bs=4k
>>> count=10000 oflag=sync", all SSDs in the pool go close to 100% util and
>>> I see about 3600 writes/second in iostat for each disk in the pool, BUT
>>> essentially no throughput in terms of data written.
>>>
>>> Then, after some seconds (the flush interval of bcache), I see the
>>> writeback flush and data written to the pool SSDs, which looks very
>>> much as if that data had been reordered and merged.
>>>
>>> bcache-3.2 does not have this problem; only bcache (master) and
>>> bcache-testing do.
>>>
>>> What could be the reason?
>>>
>>> Regards,
>>> Jack
>>>
>>

