All of lore.kernel.org
* bcache I/O performance tests on 5.15.0-40-generic
@ 2022-06-25  6:29 Nikhil Kshirsagar
  2022-06-25 12:08 ` Coly Li
  0 siblings, 1 reply; 9+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-25  6:29 UTC (permalink / raw)
  To: linux-bcache; +Cc: Coly Li

Hello,

I've been doing some performance tests of bcache on 5.15.0-40-generic.

The baseline figures for the fast and slow disks for random writes are
consistently around 225 MiB/s and 3046 KiB/s respectively.

But the bcache results sometimes drop inexplicably to around 10 MiB/s for a
random write test using fio like this:

fio --rw=randwrite --size=1G --ioengine=libaio --direct=1
--gtod_reduce=1 --iodepth=128 --bs=4k --name=MY_TEST1

  WRITE: bw=168MiB/s (176MB/s), 168MiB/s-168MiB/s (176MB/s-176MB/s),
io=1024MiB (1074MB), run=6104-6104msec
  WRITE: bw=283MiB/s (297MB/s), 283MiB/s-283MiB/s (297MB/s-297MB/s),
io=1024MiB (1074MB), run=3621-3621msec
  WRITE: bw=10.3MiB/s (10.9MB/s), 10.3MiB/s-10.3MiB/s
(10.9MB/s-10.9MB/s), io=1024MiB (1074MB), run=98945-98945msec
  WRITE: bw=8236KiB/s (8434kB/s), 8236KiB/s-8236KiB/s
(8434kB/s-8434kB/s), io=1024MiB (1074MB), run=127317-127317msec
  WRITE: bw=9657KiB/s (9888kB/s), 9657KiB/s-9657KiB/s
(9888kB/s-9888kB/s), io=1024MiB (1074MB), run=108587-108587msec
  WRITE: bw=4543KiB/s (4652kB/s), 4543KiB/s-4543KiB/s
(4652kB/s-4652kB/s), io=1024MiB (1074MB), run=230819-230819msec

This seems to happen after two runs of 1 GB writes (the cache disk is 4 GB).

Some details are here: https://pastebin.com/V9mpLCbY . I will share the
full testing results soon, but I was wondering about this performance
drop, with no apparent cause, once the cache gets about 50% full.

I've tested in writeback mode, and have also set
congested_read_threshold_us and congested_write_threshold_us to 0.
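
For reference, those settings map onto sysfs roughly like this (a sketch;
<UUID> stands for the cache set UUID and bcache0 for the bcache device,
both of which are environment-specific):

```shell
# Put the bcache device into writeback mode.
echo writeback > /sys/block/bcache0/bcache/cache_mode
# Disable the congestion thresholds so I/O is never diverted to the
# backing device just because the SSD looks congested.
echo 0 > /sys/fs/bcache/<UUID>/congested_read_threshold_us
echo 0 > /sys/fs/bcache/<UUID>/congested_write_threshold_us
```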

I did not notice this issue while testing on an older kernel,
4.15.0-188-generic.

Regards,
Nikhil.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
  2022-06-25  6:29 bcache I/O performance tests on 5.15.0-40-generic Nikhil Kshirsagar
@ 2022-06-25 12:08 ` Coly Li
  2022-06-30  4:24   ` Nikhil Kshirsagar
  2022-07-05 20:49   ` Eric Wheeler
  0 siblings, 2 replies; 9+ messages in thread
From: Coly Li @ 2022-06-25 12:08 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> On 25 Jun 2022, at 14:29, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
> 
> Hello,
> 
> I've been doing some performance tests of bcache on 5.15.0-40-generic.
> 
> The baseline figures for the fast and slow disk for random writes are
> consistent at around 225MiB/s and 3046KiB/s.
> 
> But the bcache results inexplicably drop sometimes to 10Mib/s, for
> random write test using fio like this -
> 
> fio --rw=randwrite --size=1G --ioengine=libaio --direct=1
> --gtod_reduce=1 --iodepth=128 --bs=4k --name=MY_TEST1
> 
>  WRITE: bw=168MiB/s (176MB/s), 168MiB/s-168MiB/s (176MB/s-176MB/s),
> io=1024MiB (1074MB), run=6104-6104msec
>  WRITE: bw=283MiB/s (297MB/s), 283MiB/s-283MiB/s (297MB/s-297MB/s),
> io=1024MiB (1074MB), run=3621-3621msec
>  WRITE: bw=10.3MiB/s (10.9MB/s), 10.3MiB/s-10.3MiB/s
> (10.9MB/s-10.9MB/s), io=1024MiB (1074MB), run=98945-98945msec
>  WRITE: bw=8236KiB/s (8434kB/s), 8236KiB/s-8236KiB/s
> (8434kB/s-8434kB/s), io=1024MiB (1074MB), run=127317-127317msec
>  WRITE: bw=9657KiB/s (9888kB/s), 9657KiB/s-9657KiB/s
> (9888kB/s-9888kB/s), io=1024MiB (1074MB), run=108587-108587msec
>  WRITE: bw=4543KiB/s (4652kB/s), 4543KiB/s-4543KiB/s
> (4652kB/s-4652kB/s), io=1024MiB (1074MB), run=230819-230819msec
> 
> This seems to happen after 2 runs of 1gb writes (cache disk is 4gb size)
> 
> Some details are here - https://pastebin.com/V9mpLCbY , I will share
> the full testing results soon, but just was wondering about this
> performance drop for no apparent reason once the cache gets about 50%
> full.


It seems you are being stalled by garbage collection. A 4GB cache is small, so garbage collection might be invoked quite frequently. You could check the output of 'top -H' to see whether there is a kernel thread named bcache_gc.

Anyway, a 4GB cache is too small.
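
One way to check for that kernel thread (a sketch using ps; on a machine
with no bcache device registered it simply prints the fallback message):

```shell
# List kernel threads whose name mentions bcache (bcache_gc,
# bcache_writeback, ...). The [b] trick stops grep matching itself.
ps -eLo pid,comm | grep '[b]cache' || echo "no bcache kernel threads found"
```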

Coly Li


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
  2022-06-25 12:08 ` Coly Li
@ 2022-06-30  4:24   ` Nikhil Kshirsagar
       [not found]     ` <CAC6jXv0yOZ98XqG=quDcONuZ9ggqK4doM8EzVTc=Sk1m-H=_Xw@mail.gmail.com>
  2022-07-05 20:49   ` Eric Wheeler
  1 sibling, 1 reply; 9+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-30  4:24 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Thanks Coly!

Can garbage collection be turned off by echoing 1 into
/sys/fs/bcache/<UUID>/internal/gc_after_writeback?

The issue I'm seeing is that garbage collection causes write performance
(in writeback mode) to drop whenever the cache gets about 50% full.

With a 10 GB cache device, an 8 GB write (using fio randwrite) should
give SSD-like speed, but it does not. I am wondering if it's due to the
GC threads.

Regards,
Nikhil.

On Sat, 25 Jun 2022 at 17:38, Coly Li <colyli@suse.de> wrote:
>
> It seems you are stuck by garbage collection. 4GB cache is small, the garbage collection might be invoked quite frequently. Maybe you can see the output of ’top -H’ to check whether there is kernel thread named bache_gc.
>
> Anyway, 4GB cache is too small.
>
> Coly Li
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
       [not found]     ` <CAC6jXv0yOZ98XqG=quDcONuZ9ggqK4doM8EzVTc=Sk1m-H=_Xw@mail.gmail.com>
@ 2022-06-30  6:49       ` Coly Li
       [not found]         ` <CAC6jXv2u_0s-hvj4J4gurfoxgKNHFcHYq9F8cfcf-s6oG+pU+Q@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Coly Li @ 2022-06-30  6:49 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> On 30 Jun 2022, at 13:07, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
> 
> Hi Coly,
> 
> even after turning it on by echoing 1 into
> /sys/fs/bcache/<UUID>/internal/gc_after_writeback,

gc_after_writeback is a switch to trigger a GC operation once writeback has finished flushing all dirty data to the backing device, which might be good for future write I/Os.
It does not help GC performance.



> 
> I still see [bcache_gc] threads appear about 70% of the way into writing
> the 8 GB of I/O into the 10 GB cache, with the result that the 8 GB write
> takes very long, in spite of having more than enough SSD cache for it.
> 

This is as designed. The GC thread is triggered every time 1/16 of the cache space is used; if there were no GC, the whole bcache device would very probably lock up for lack of space for metadata or cached data.

This is why I suggest a larger cache device. GC is unavoidable: when the cache device is small, all allocations wait for GC to make more free room, and in order to free that space the dirty sectors must be written back to the backing device, which is why you see everything slow down.
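
A back-of-the-envelope sketch of that 1/16 heuristic (a hypothetical
helper, not kernel code) shows why small caches invoke GC so often:

```python
def gc_invocations(cache_bytes, written_bytes):
    """Estimate how many times the GC thread wakes during a write burst,
    given that it is triggered about once per 1/16 of cache space used
    (per the description above; the kernel's accounting differs)."""
    return int(written_bytes // (cache_bytes / 16))

GiB = 1024 ** 3
print(gc_invocations(4 * GiB, 1 * GiB))    # 4  -> the 1 GiB test on the 4 GiB cache
print(gc_invocations(10 * GiB, 8 * GiB))   # 12 -> the 8 GiB test on the 10 GiB cache
```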


Coly Li



> Regards,
> Nikhil.
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
       [not found]         ` <CAC6jXv2u_0s-hvj4J4gurfoxgKNHFcHYq9F8cfcf-s6oG+pU+Q@mail.gmail.com>
@ 2022-06-30  7:36           ` Coly Li
  2022-06-30  7:39             ` Nikhil Kshirsagar
  0 siblings, 1 reply; 9+ messages in thread
From: Coly Li @ 2022-06-30  7:36 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> On 30 Jun 2022, at 15:26, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
> 
> Thank you for the clarification. But my testing results show that even
> with a 15 GB cache device, if I write 12 GB it still slows down, so you do
> not get "close to SSD" speed for such an I/O write, even when it is
> smaller than the cache size.
> 
> Attached are the results of testing comparing dm-cache with bcache. The
> command used was "fio --rw=randwrite --size=12G --ioengine=libaio
> --direct=1 --gtod_reduce=1 --iodepth=128 --bs=4k"
> 
> 

I cannot tell from your performance numbers why dm-cache is so good. But if the peak write speed is around 550 MB/s, the test may take only around 20 seconds. What happens if the I/O testing runs longer, e.g. 1 hour?

BTW, people cannot get "close to SSD" speed from bcache: for each read/write I/O request, bcache must update the B+tree index, cache the data, write the journal, and maybe split a B+tree node, and the I/O path can be interfered with by I/Os from GC and writeback. So it is good enough, but cannot be close to raw SSD speed.

Coly Li

> 
> 
> -Nikhil.
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
  2022-06-30  7:36           ` Coly Li
@ 2022-06-30  7:39             ` Nikhil Kshirsagar
  2022-06-30  7:47               ` Coly Li
  0 siblings, 1 reply; 9+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-30  7:39 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Yes, I understand, but if you look at the graphs, for a 12 GB random write
(with a 15 GB SSD, so in theory enough cache for the entire write), bcache
gets speeds very close to the SLOW DISK (3 MB/s consistently, while
dm-cache gets 400 MB/s consistently, except on the first run where it
"warms up"). That is why I wanted to understand whether there is any
tunable that gets "close to SSD", or even 300-400 MB/s, speed (the SSD
averages 500 MB/s).

Regards,
Nikhil.


On Thu, 30 Jun 2022 at 13:06, Coly Li <colyli@suse.de> wrote:
>
>
>
> > On 30 Jun 2022, at 15:26, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
> >
> > Thank you for the clarification. But my testing results show that even with 15GB cache device, if I write 12gb, it still slow down, so you do not get  "close to ssd" speed for such IO write.. even if its smaller than cache size.
> >
> > Attached results of testing comparing dm-cache with bcache. command used was "fio --rw=randwrite --size=12G --ioengine=libaio --direct=1 --gtod_reduce=1 --iodepth=128 --bs=4k"
> >
> >
>
> I cannot tell why dmcache is so good from your performance number. But if the peak write speed is around 550MB/s, it may take around 20 seconds. What happens if the I/O testing may take longer, e.g. 1 hours?
>
> BTW, people cannot get “close to ssd” speed on bcache, for each write/read I/O request, bcache will update B+tree index, cache data, write journal, and maybe split B+tree node, and the I/O procedure might be interfered by I/Os from gc and writeback. So it is good enough, but cannot be close to SSD speed.
>
> Coly Li
>
> >
> >
> > -Nikhil.
> >

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
  2022-06-30  7:39             ` Nikhil Kshirsagar
@ 2022-06-30  7:47               ` Coly Li
  0 siblings, 0 replies; 9+ messages in thread
From: Coly Li @ 2022-06-30  7:47 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> On 30 Jun 2022, at 15:39, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
> 
> Yes, I understand, but if you see the graphs, for a 12gb random write
> IO (with a 15gb SSD, so enough cache in theory for entire write),
> bcache gets speeds very close to SLOW DISK! (3MB/s consistently, while
> dmcache gets 400mb/s consistently except the first run where it "warms
> up"), so that is why I wanted to understand whether there's any
> tunable to get the "close to ssd" or even 300-400MB/s speed (ssd speed
> is 500mb/s avg)
> 
> Regards,
> Nikhil.


Every time around 900 MB of data is written into the cache device, the GC thread is woken up to work. When the dirty data exceeds around 1.5 GB, the writeback thread starts and throttles the front-end write speed; the more dirty data there is, the more the front-end I/Os are throttled. And when the dirty data exceeds 70% of the cache size, which is around 10.5 GB in your case, all following I/Os go directly to the backing device and are not cached.

I guess this is why you observe slow I/O speeds in your testing, which looks as expected IMHO.
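
Those thresholds can be sketched numerically for the 15 GB cache in this
test (an approximation with a hypothetical helper; the kernel's actual
accounting and cutoffs differ):

```python
GiB = 1024 ** 3

def cache_thresholds(cache_bytes):
    """Rough model of the behaviour described above."""
    return {
        # GC wakes roughly once per 1/16 of cache space allocated.
        "gc_interval": cache_bytes / 16,
        # Writeback starts throttling front-end writes once dirty data
        # builds up (~1.5 GB on a 15 GB cache, i.e. about 10%).
        "writeback_throttle": cache_bytes * 0.10,
        # Past ~70% dirty, writes bypass the cache entirely.
        "bypass": cache_bytes * 0.70,
    }

t = cache_thresholds(15 * GiB)
print(round(t["gc_interval"] / GiB, 2))   # 0.94 GiB per GC wakeup ("around 900MB")
print(round(t["bypass"] / GiB, 1))        # 10.5 GiB, the 70% cutoff cited above
```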

Coly Li




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
  2022-06-25 12:08 ` Coly Li
  2022-06-30  4:24   ` Nikhil Kshirsagar
@ 2022-07-05 20:49   ` Eric Wheeler
  2022-07-28 15:51     ` Coly Li
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Wheeler @ 2022-07-05 20:49 UTC (permalink / raw)
  To: Coly Li; +Cc: Nikhil Kshirsagar, linux-bcache

[-- Attachment #1: Type: text/plain, Size: 2179 bytes --]

On Sat, 25 Jun 2022, Coly Li wrote:
> 
> It seems you are being stalled by garbage collection. A 4GB cache is 
> small, so garbage collection might be invoked quite frequently. Maybe you 
> can check the output of 'top -H' to see whether there is a kernel thread 
> named bcache_gc.

Hi Nikhil,

Do you have Mingzhe's GC patch?  It might help:
  https://www.spinics.net/lists/linux-bcache/msg11185.html

Coly, did Mingzhe's patch get into your testing tree?  It looks like it 
could be a good addition to bcache.
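
As a side note, back-computing the bandwidth from the quoted io size and 
run times confirms the fio numbers are internally consistent, and puts the 
spread between the fastest and slowest runs at roughly 64x (a quick sanity 
check on the report above, nothing more):

```python
# Back-compute the fio bandwidths from io size and run time to sanity-check
# the quoted numbers: io=1024 MiB for every run, run times in milliseconds
# taken from the WRITE lines in the report above.
IO_MIB = 1024
runtimes_ms = [6104, 3621, 98945, 127317, 108587, 230819]

bw_mib_s = [IO_MIB / (t / 1000) for t in runtimes_ms]
for t, bw in zip(runtimes_ms, bw_mib_s):
    print(f"run={t:>6} ms -> {bw:7.2f} MiB/s")

# Spread between the fastest and slowest runs
print(f"slowdown: {max(bw_mib_s) / min(bw_mib_s):.0f}x")
```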

--
Eric Wheeler



> 
> Anyway, 4GB cache is too small.
> 
> Coly Li
> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bcache I/O performance tests on 5.15.0-40-generic
  2022-07-05 20:49   ` Eric Wheeler
@ 2022-07-28 15:51     ` Coly Li
  0 siblings, 0 replies; 9+ messages in thread
From: Coly Li @ 2022-07-28 15:51 UTC (permalink / raw)
  To: Eric Wheeler; +Cc: Nikhil Kshirsagar, linux-bcache



> 2022年7月6日 04:49,Eric Wheeler <bcache@lists.ewheeler.net> 写道:
> 
> On Sat, 25 Jun 2022, Coly Li wrote:
>>> 2022年6月25日 14:29,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>> 
>>> Hello,
>>> 
>>> I've been doing some performance tests of bcache on 5.15.0-40-generic.
>>> 
>>> The baseline figures for the fast and slow disk for random writes are
>>> consistent at around 225MiB/s and 3046KiB/s.
>>> 
>>> But the bcache results sometimes inexplicably drop to 10MiB/s for a
>>> random write test using fio like this -
>>> 
>>> fio --rw=randwrite --size=1G --ioengine=libaio --direct=1
>>> --gtod_reduce=1 --iodepth=128 --bs=4k --name=MY_TEST1
>>> 
>>> WRITE: bw=168MiB/s (176MB/s), 168MiB/s-168MiB/s (176MB/s-176MB/s),
>>> io=1024MiB (1074MB), run=6104-6104msec
>>> WRITE: bw=283MiB/s (297MB/s), 283MiB/s-283MiB/s (297MB/s-297MB/s),
>>> io=1024MiB (1074MB), run=3621-3621msec
>>> WRITE: bw=10.3MiB/s (10.9MB/s), 10.3MiB/s-10.3MiB/s
>>> (10.9MB/s-10.9MB/s), io=1024MiB (1074MB), run=98945-98945msec
>>> WRITE: bw=8236KiB/s (8434kB/s), 8236KiB/s-8236KiB/s
>>> (8434kB/s-8434kB/s), io=1024MiB (1074MB), run=127317-127317msec
>>> WRITE: bw=9657KiB/s (9888kB/s), 9657KiB/s-9657KiB/s
>>> (9888kB/s-9888kB/s), io=1024MiB (1074MB), run=108587-108587msec
>>> WRITE: bw=4543KiB/s (4652kB/s), 4543KiB/s-4543KiB/s
>>> (4652kB/s-4652kB/s), io=1024MiB (1074MB), run=230819-230819msec
>>> 
>>> This seems to happen after two runs of 1GB writes (the cache disk is 4GB in size).
>>> 
>>> Some details are here - https://pastebin.com/V9mpLCbY . I will share
>>> the full testing results soon, but I was wondering about this
>>> performance drop, which occurs for no apparent reason once the cache
>>> gets about 50% full.
>> 
>> 
>> It seems you are being stalled by garbage collection. A 4GB cache is 
>> small, so garbage collection might be invoked quite frequently. Maybe you 
>> can check the output of 'top -H' to see whether there is a kernel thread 
>> named bcache_gc.
> 
> Hi Nikhil,
> 
> Do you have Mingzhe's GC patch? It might help:
> https://www.spinics.net/lists/linux-bcache/msg11185.html
> 
> Coly, did Mingzhe's patch get into your testing tree? It looks like it 
> could be a good addition to bcache.

No. That patch just defers the early GC work to make I/O faster for a while, but it accumulates a large amount of GC work that is finally done at once, causing an even longer period of low I/O.
That later, larger GC may stall I/O for a longer and unpredictable time, which doesn't follow current bcache behavior.

It may be helpful for some I/O workloads, but for continuous heavy I/O loads it won't help much.
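
To confirm it really is GC (rather than, say, writeback) stalling the I/O, one can watch the cache set while fio runs. A minimal sketch, assuming the sysfs layout documented in Documentation/admin-guide/bcache.rst; the cache-set UUID directory and bcache device name vary per system, so the glob patterns below are examples, and the script is harmless on a machine without bcache:

```python
# Sketch: report cache occupancy, dirty data, and whether the bcache_gc
# kernel thread exists. Glob patterns are examples; the script prints
# nothing for attributes that are absent on this machine.
import glob
from pathlib import Path

def safe_read(path: str) -> str:
    # Tolerate vanished /proc entries and permission errors.
    try:
        return Path(path).read_text(errors="ignore").strip()
    except OSError:
        return ""

# Cache occupancy lives on the cache set; dirty data on the backing device.
patterns = [
    "/sys/fs/bcache/*/cache_available_percent",
    "/sys/block/bcache*/bcache/dirty_data",
]
stats = {p: safe_read(p) for pat in patterns for p in glob.glob(pat)}
for path, value in sorted(stats.items()):
    print(f"{path}: {value}")

# Count kernel threads named bcache_gc; if GC is the bottleneck, this
# thread will also show up burning CPU in 'top -H' during the stalls.
gc_threads = sum(
    safe_read(p).startswith("bcache_gc") for p in glob.glob("/proc/*/comm")
)
print(f"bcache_gc threads: {gc_threads}")
```

Sampling these in a loop alongside the fio runs should show cache_available_percent dropping just before the bandwidth collapses.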

Coly Li

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-07-28 15:51 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-25  6:29 bcache I/O performance tests on 5.15.0-40-generic Nikhil Kshirsagar
2022-06-25 12:08 ` Coly Li
2022-06-30  4:24   ` Nikhil Kshirsagar
     [not found]     ` <CAC6jXv0yOZ98XqG=quDcONuZ9ggqK4doM8EzVTc=Sk1m-H=_Xw@mail.gmail.com>
2022-06-30  6:49       ` Coly Li
     [not found]         ` <CAC6jXv2u_0s-hvj4J4gurfoxgKNHFcHYq9F8cfcf-s6oG+pU+Q@mail.gmail.com>
2022-06-30  7:36           ` Coly Li
2022-06-30  7:39             ` Nikhil Kshirsagar
2022-06-30  7:47               ` Coly Li
2022-07-05 20:49   ` Eric Wheeler
2022-07-28 15:51     ` Coly Li
