All of lore.kernel.org
* [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
@ 2019-03-07  5:15 shile.zhang
  2019-03-07 10:34 ` Coly Li
  0 siblings, 1 reply; 9+ messages in thread
From: shile.zhang @ 2019-03-07  5:15 UTC (permalink / raw)
  To: Coly Li, Kent Overstreet; +Cc: linux-bcache, linux-kernel, Shile Zhang

From: Shile Zhang <shile.zhang@linux.alibaba.com>

Reading /sys/fs/bcache/<uuid>/cacheN/priority_stats can take a very long
time on a huge cache device after a long run.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
 drivers/md/bcache/sysfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 557a8a3..028fea1 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct kobject *k)
 
 static int __bch_cache_cmp(const void *l, const void *r)
 {
+	cond_resched();
 	return *((uint16_t *)r) - *((uint16_t *)l);
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-03-07  5:15 [PATCH] bcache: add cond_resched() in __bch_cache_cmp() shile.zhang
@ 2019-03-07 10:34 ` Coly Li
  2019-03-07 15:06   ` Shile Zhang
  0 siblings, 1 reply; 9+ messages in thread
From: Coly Li @ 2019-03-07 10:34 UTC (permalink / raw)
  To: shile.zhang; +Cc: Kent Overstreet, linux-bcache, linux-kernel

On 2019/3/7 1:15 PM, shile.zhang@linux.alibaba.com wrote:
> From: Shile Zhang <shile.zhang@linux.alibaba.com>
> 
> Reading /sys/fs/bcache/<uuid>/cacheN/priority_stats can take a very long
> time on a huge cache device after a long run.
> 
> Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>

Hi Shile,

Did you test your change? It would be helpful to have more performance data
(showing what problem you improved).

Thanks.

Coly Li

> ---
>  drivers/md/bcache/sysfs.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
> index 557a8a3..028fea1 100644
> --- a/drivers/md/bcache/sysfs.c
> +++ b/drivers/md/bcache/sysfs.c
> @@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct kobject *k)
>  
>  static int __bch_cache_cmp(const void *l, const void *r)
>  {
> +	cond_resched();
>  	return *((uint16_t *)r) - *((uint16_t *)l);
>  }
>  
> 


-- 

Coly Li


* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-03-07 10:34 ` Coly Li
@ 2019-03-07 15:06   ` Shile Zhang
  2019-03-07 15:36     ` Coly Li
  0 siblings, 1 reply; 9+ messages in thread
From: Shile Zhang @ 2019-03-07 15:06 UTC (permalink / raw)
  To: Coly Li; +Cc: Kent Overstreet, linux-bcache, linux-kernel


On 2019/3/7 18:34, Coly Li wrote:
> On 2019/3/7 1:15 PM, shile.zhang@linux.alibaba.com wrote:
>> From: Shile Zhang <shile.zhang@linux.alibaba.com>
>>
>> Reading /sys/fs/bcache/<uuid>/cacheN/priority_stats can take a very long
>> time on a huge cache device after a long run.
>>
>> Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
> Hi Shile,
>
> Did you test your change? It would be helpful to have more performance data
> (showing what problem you improved).

In the case of a 960 GB SSD cache device, one read of 'priority_stats'
costs about 600 ms in our test environment.

The perf tool showed that nearly 50% of the CPU time is consumed by
'sort()', which means a single sort holds the CPU for about 300 ms.

In our case, a statistics collector reads 'priority_stats' periodically,
which triggers scheduling-latency jitter for tasks sharing the same CPU
core.

>
> Thanks.
>
> Coly Li
>
>> ---
>>   drivers/md/bcache/sysfs.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
>> index 557a8a3..028fea1 100644
>> --- a/drivers/md/bcache/sysfs.c
>> +++ b/drivers/md/bcache/sysfs.c
>> @@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct kobject *k)
>>   
>>   static int __bch_cache_cmp(const void *l, const void *r)
>>   {
>> +	cond_resched();
>>   	return *((uint16_t *)r) - *((uint16_t *)l);
>>   }
>>   
>>
>


* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-03-07 15:06   ` Shile Zhang
@ 2019-03-07 15:36     ` Coly Li
  2019-03-07 15:44       ` Vojtech Pavlik
  0 siblings, 1 reply; 9+ messages in thread
From: Coly Li @ 2019-03-07 15:36 UTC (permalink / raw)
  To: Shile Zhang; +Cc: Kent Overstreet, linux-bcache, linux-kernel

On 2019/3/7 11:06 PM, Shile Zhang wrote:
> 
> On 2019/3/7 18:34, Coly Li wrote:
>> On 2019/3/7 1:15 PM, shile.zhang@linux.alibaba.com wrote:
>>> From: Shile Zhang <shile.zhang@linux.alibaba.com>
>>>
>>> Reading /sys/fs/bcache/<uuid>/cacheN/priority_stats can take a very long
>>> time on a huge cache device after a long run.
>>>
>>> Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
>> Hi Shile,
>>
>> Did you test your change? It would be helpful to have more performance data
>> (showing what problem you improved).
> 
> In the case of a 960 GB SSD cache device, one read of 'priority_stats'
> costs about 600 ms in our test environment.
> 

After the fix, how much time does it take?


> The perf tool showed that nearly 50% of the CPU time is consumed by
> 'sort()', which means a single sort holds the CPU for about 300 ms.
> 
> In our case, a statistics collector reads 'priority_stats' periodically,
> which triggers scheduling-latency jitter for tasks sharing the same CPU
> core.
> 

Hmm, it seems you just make the sort slower, and nothing else changes.
Am I right?


Coly Li


>>
>> Thanks.
>>
>> Coly Li
>>
>>> ---
>>>   drivers/md/bcache/sysfs.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
>>> index 557a8a3..028fea1 100644
>>> --- a/drivers/md/bcache/sysfs.c
>>> +++ b/drivers/md/bcache/sysfs.c
>>> @@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct
>>> kobject *k)
>>>     static int __bch_cache_cmp(const void *l, const void *r)
>>>   {
>>> +    cond_resched();
>>>       return *((uint16_t *)r) - *((uint16_t *)l);
>>>   }
>>>  
>>


-- 

Coly Li


* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-03-07 15:36     ` Coly Li
@ 2019-03-07 15:44       ` Vojtech Pavlik
  2019-03-08  0:35         ` Shile Zhang
  0 siblings, 1 reply; 9+ messages in thread
From: Vojtech Pavlik @ 2019-03-07 15:44 UTC (permalink / raw)
  To: Coly Li; +Cc: Shile Zhang, Kent Overstreet, linux-bcache, linux-kernel

On Thu, Mar 07, 2019 at 11:36:18PM +0800, Coly Li wrote:
> On 2019/3/7 11:06 PM, Shile Zhang wrote:
> > 
> > On 2019/3/7 18:34, Coly Li wrote:
> >> On 2019/3/7 1:15 PM, shile.zhang@linux.alibaba.com wrote:
> >>> From: Shile Zhang <shile.zhang@linux.alibaba.com>
> >>>
> >>> Reading /sys/fs/bcache/<uuid>/cacheN/priority_stats can take a very long
> >>> time on a huge cache device after a long run.
> >>>
> >>> Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
> >> Hi Shile,
> >>
> >> Did you test your change? It would be helpful to have more performance data
> >> (showing what problem you improved).
> > 
> > In the case of a 960 GB SSD cache device, one read of 'priority_stats'
> > costs about 600 ms in our test environment.
> > 
> 
> After the fix, how much time does it take?
> 
> 
> > The perf tool showed that nearly 50% of the CPU time is consumed by
> > 'sort()', which means a single sort holds the CPU for about 300 ms.
> > 
> > In our case, a statistics collector reads 'priority_stats' periodically,
> > which triggers scheduling-latency jitter for tasks sharing the same CPU
> > core.
> > 
> 
> Hmm, it seems you just make the sort slower, and nothing else changes.
> Am I right?

Well, it has to make the sort slower, but it'll also avoid hogging the
CPU (on a non-preemptible kernel), avoiding a potential soft lockup
warning and allowing other tasks to run.

-- 
Vojtech Pavlik
VP SUSE Labs


* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-03-07 15:44       ` Vojtech Pavlik
@ 2019-03-08  0:35         ` Shile Zhang
  2019-03-08  2:19           ` Coly Li
  0 siblings, 1 reply; 9+ messages in thread
From: Shile Zhang @ 2019-03-08  0:35 UTC (permalink / raw)
  To: Vojtech Pavlik, Coly Li; +Cc: Kent Overstreet, linux-bcache, linux-kernel


On 2019/3/7 23:44, Vojtech Pavlik wrote:
> On Thu, Mar 07, 2019 at 11:36:18PM +0800, Coly Li wrote:
>> On 2019/3/7 11:06 PM, Shile Zhang wrote:
>>> On 2019/3/7 18:34, Coly Li wrote:
>>>> On 2019/3/7 1:15 PM, shile.zhang@linux.alibaba.com wrote:
>>>>> From: Shile Zhang <shile.zhang@linux.alibaba.com>
>>>>>
>>>>> Reading /sys/fs/bcache/<uuid>/cacheN/priority_stats can take a very long
>>>>> time on a huge cache device after a long run.
>>>>>
>>>>> Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
>>>> Hi Shile,
>>>>
>>>> Did you test your change? It would be helpful to have more performance data
>>>> (showing what problem you improved).
>>> In the case of a 960 GB SSD cache device, one read of 'priority_stats'
>>> costs about 600 ms in our test environment.
>>>
>> After the fix, how much time does it take?
>>
>>
>>> The perf tool showed that nearly 50% of the CPU time is consumed by
>>> 'sort()', which means a single sort holds the CPU for about 300 ms.
>>>
>>> In our case, a statistics collector reads 'priority_stats' periodically,
>>> which triggers scheduling-latency jitter for tasks sharing the same CPU
>>> core.
>>>
>> Hmm, it seems you just make the sort slower, and nothing else changes.
>> Am I right?
> Well, it has to make the sort slower, but it'll also avoid hogging the
> CPU (on a non-preemptible kernel), avoiding a potential soft lockup
> warning and allowing other tasks to run.
>
Yes, there is a risk that other tasks get no chance to run because the
sort hogs the CPU, which is harmful to schedule-latency-sensitive tasks.
This change just tries to reduce the impact of the sort; it is not a
performance improvement of the sort itself. I'm not sure whether there is
a better way to handle this more efficiently.

Thanks,

Shile




* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-03-08  0:35         ` Shile Zhang
@ 2019-03-08  2:19           ` Coly Li
  2019-08-14 14:23             ` Heitor Alves de Siqueira
  0 siblings, 1 reply; 9+ messages in thread
From: Coly Li @ 2019-03-08  2:19 UTC (permalink / raw)
  To: Shile Zhang; +Cc: Vojtech Pavlik, Kent Overstreet, linux-bcache, linux-kernel

On 2019/3/8 8:35 AM, Shile Zhang wrote:
> 
> On 2019/3/7 23:44, Vojtech Pavlik wrote:
>> On Thu, Mar 07, 2019 at 11:36:18PM +0800, Coly Li wrote:
>>> On 2019/3/7 11:06 PM, Shile Zhang wrote:
>>>> On 2019/3/7 18:34, Coly Li wrote:
>>>>> On 2019/3/7 1:15 PM, shile.zhang@linux.alibaba.com wrote:
>>>>>> From: Shile Zhang <shile.zhang@linux.alibaba.com>
>>>>>>
>>>>>> Reading /sys/fs/bcache/<uuid>/cacheN/priority_stats can take a very long
>>>>>> time on a huge cache device after a long run.
>>>>>>
>>>>>> Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
>>>>> Hi Shile,
>>>>>
>>>>> Did you test your change? It would be helpful to have more
>>>>> performance data (showing what problem you improved).
>>>> In the case of a 960 GB SSD cache device, one read of 'priority_stats'
>>>> costs about 600 ms in our test environment.
>>>>
>>> After the fix, how much time does it take?
>>>
>>>
>>>> The perf tool showed that nearly 50% of the CPU time is consumed by
>>>> 'sort()', which means a single sort holds the CPU for about 300 ms.
>>>>
>>>> In our case, a statistics collector reads 'priority_stats' periodically,
>>>> which triggers scheduling-latency jitter for tasks sharing the same CPU
>>>> core.
>>>>
>>> Hmm, it seems you just make the sort slower, and nothing else changes.
>>> Am I right?
>> Well, it has to make the sort slower, but it'll also avoid hogging the
>> CPU (on a non-preemptible kernel), avoiding a potential soft lockup
>> warning and allowing other tasks to run.
>>
> Yes, there is a risk that other tasks get no chance to run because the
> sort hogs the CPU, which is harmful to schedule-latency-sensitive tasks.
> This change just tries to reduce the impact of the sort; it is not a
> performance improvement of the sort itself. I'm not sure whether there is
> a better way to handle this more efficiently.
I understand the concept above, but when people talk about a performance
improvement I would expect real performance numbers to be provided. Under
some conditions that might be hard, but in this exact case it isn't.

Could you please provide some data on how much slower 'sort' becomes with
your configuration?

Also, reading priority_stats periodically for performance monitoring is
not a good idea. The problem is not the 'sort'; it is the
mutex_lock(&ca->set->bucket_lock) a few lines above the sort in sysfs.c.
When you have a quite large or very busy cache device, you may see normal
I/O performance drop because of the time spent holding bucket_lock here.

Anyway, this is just my suggestion. Back to this patch: please provide
performance numbers.

Thanks.

Coly Li


* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-03-08  2:19           ` Coly Li
@ 2019-08-14 14:23             ` Heitor Alves de Siqueira
  2019-08-14 16:50               ` Coly Li
  0 siblings, 1 reply; 9+ messages in thread
From: Heitor Alves de Siqueira @ 2019-08-14 14:23 UTC (permalink / raw)
  To: colyli; +Cc: kent.overstreet, linux-bcache, linux-kernel, shile.zhang, vojtech

Hi Coly,

We've had users impacted by system stalls and were able to trace it back to the
bcache priority_stats query. After investigating a bit further, it seems that
the sorting step in the quantiles calculation can cause heavy CPU
contention. This has a severe performance impact on any task that is running in
the same CPU as the sysfs query, and caused issues even for non-bcache
workloads.

We did some test runs with fio to get a better picture of the impact on
read/write workloads while a priority_stats query is running, and came up with
some interesting results. The bucket locking doesn't seem to have that
much performance impact even in full-write workloads, but the 'sort' can affect
bcache device throughput and latency pretty heavily (and any other tasks that
are "unlucky" to be scheduled together with it). In some of our tests, there was
a performance reduction of almost 90% in random read IOPS to the bcache device
(refer to the comparison graph at [0]). There are a few more details on the
Launchpad bug [1] we've created to track this, together with the complete fio
results + comparison graphs.

The cond_resched() patch suggested by Shile Zhang actually improved performance
a lot, and eliminated the stalls we've observed during the priority_stats
query. Even though it may cause the sysfs query to take a bit longer, it seems
like a decent tradeoff for general performance when running that query on a
system under heavy load. It's also a cheap short-term solution until we can look
into a more complex re-write of the priority_stats calculation (perhaps one that
doesn't require the locking?). Could we revisit Shile's patch, and consider
merging it?

Thanks!
Heitor

[0] https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png
[1] https://bugs.launchpad.net/bugs/1840043


* Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
  2019-08-14 14:23             ` Heitor Alves de Siqueira
@ 2019-08-14 16:50               ` Coly Li
  0 siblings, 0 replies; 9+ messages in thread
From: Coly Li @ 2019-08-14 16:50 UTC (permalink / raw)
  To: Heitor Alves de Siqueira
  Cc: kent.overstreet, linux-bcache, linux-kernel, shile.zhang, vojtech

On 2019/8/14 10:23 PM, Heitor Alves de Siqueira wrote:
> Hi Coly,
> 
> We've had users impacted by system stalls and were able to trace it back to the
> bcache priority_stats query. After investigating a bit further, it seems that
> the sorting step in the quantiles calculation can cause heavy CPU
> contention. This has a severe performance impact on any task that is running in
> the same CPU as the sysfs query, and caused issues even for non-bcache
> workloads.
> 
> We did some test runs with fio to get a better picture of the impact on
> read/write workloads while a priority_stats query is running, and came up with
> some interesting results. The bucket locking doesn't seem to have that
> much performance impact even in full-write workloads, but the 'sort' can affect
> bcache device throughput and latency pretty heavily (and any other tasks that
> are "unlucky" to be scheduled together with it). In some of our tests, there was
> a performance reduction of almost 90% in random read IOPS to the bcache device
> (refer to the comparison graph at [0]). There's a few more details on the
> Launchpad bug [1] we've created to track this, together with the complete fio
> results + comparison graphs.
> 
> The cond_resched() patch suggested by Shile Zhang actually improved performance
> a lot, and eliminated the stalls we've observed during the priority_stats
> query. Even though it may cause the sysfs query to take a bit longer, it seems
> like a decent tradeoff for general performance when running that query on a
> system under heavy load. It's also a cheap short-term solution until we can look
> into a more complex re-write of the priority_stats calculation (perhaps one that
> doesn't require the locking?). Could we revisit Shile's patch, and consider
> merging it?
> 
> Thanks!
> Heitor
> 
> [0] https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png
> [1] https://bugs.launchpad.net/bugs/1840043
> 

Hi Heitor,

With your very detailed explanation I have come to understand why people
care about the performance impact of priority_stats. In the case of system
monitoring, how long priority_stats takes to return is less important than
overall system throughput.

Yes, I agree with your opinion, and the benchmark chart makes me confident
in the original patch. I will add this patch to the v5.4 window with
Tested-by: Heitor Alves de Siqueira <halves@canonical.com>

Thanks for your detailed information. And thanks to Shile Zhang for
originally composing this patch.

-- 

Coly Li


end of thread, other threads:[~2019-08-14 16:50 UTC | newest]

Thread overview: 9+ messages
2019-03-07  5:15 [PATCH] bcache: add cond_resched() in __bch_cache_cmp() shile.zhang
2019-03-07 10:34 ` Coly Li
2019-03-07 15:06   ` Shile Zhang
2019-03-07 15:36     ` Coly Li
2019-03-07 15:44       ` Vojtech Pavlik
2019-03-08  0:35         ` Shile Zhang
2019-03-08  2:19           ` Coly Li
2019-08-14 14:23             ` Heitor Alves de Siqueira
2019-08-14 16:50               ` Coly Li
