* Btrfs performance with small blocksize on SSD
@ 2017-09-24 13:24 Fuhrmann, Carsten
  2017-09-24 13:40 ` Qu Wenruo
  2017-09-26 20:33 ` Peter Grandi
  0 siblings, 2 replies; 10+ messages in thread
From: Fuhrmann, Carsten @ 2017-09-24 13:24 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 745 bytes --]

Hello,

I ran a few performance tests comparing mdadm, hardware RAID, and Btrfs RAID. I noticed that performance for small block sizes (2k) is very bad on SSD in general, and on HDD for sequential writes.
I am surprised by that result, because the wiki says that btrfs is very effective for small files.

I attached my results for RAID 1 random write on HDD (rH1) and SSD (rS1), and for sequential write on HDD (sH1) and SSD (sS1).

Hopefully you have an explanation for that.

raid@raid-PowerEdge-T630:~$ uname -a
Linux raid-PowerEdge-T630 4.10.0-33-generic #37~16.04.1-Ubuntu SMP Fri Aug 11 14:07:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
raid@raid-PowerEdge-T630:~$ btrfs --version
btrfs-progs v4.4


best regards

Carsten


[-- Attachment #2: rH1.png --]
[-- Type: image/png, Size: 9075 bytes --]

[-- Attachment #3: rS1.png --]
[-- Type: image/png, Size: 8625 bytes --]

[-- Attachment #4: sH1.png --]
[-- Type: image/png, Size: 8582 bytes --]

[-- Attachment #5: sS1.png --]
[-- Type: image/png, Size: 9438 bytes --]


* Re: Btrfs performance with small blocksize on SSD
  2017-09-24 13:24 Btrfs performance with small blocksize on SSD Fuhrmann, Carsten
@ 2017-09-24 13:40 ` Qu Wenruo
  2017-09-24 13:53   ` AW: " Fuhrmann, Carsten
  2017-09-26 20:33 ` Peter Grandi
  1 sibling, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2017-09-24 13:40 UTC (permalink / raw)
  To: Fuhrmann, Carsten, linux-btrfs



On 2017-09-24 21:24, Fuhrmann, Carsten wrote:
> Hello,
> 
> I ran a few performance tests comparing mdadm, hardware RAID, and Btrfs RAID. I noticed that performance for small block sizes (2k) is very bad on SSD in general, and on HDD for sequential writes.

2K is smaller than the minimum btrfs sectorsize (4K on the x86 family).

Unaligned access commonly hurts performance, but we need more info
about your test cases, including:
1) How is the write done?
    Buffered? DIO? O_SYNC? fdatasync?
    I can't read German, so I'm not sure what the results mean. (I can
    guess that the Y axis is latency, but I don't know what the X axis
    means.)
    Also, how many files are involved, and how large are they?

2) Data/meta/sys profiles
    All RAID1?

3) Mkfs profile
    E.g. nodesize if not default, and any incompat features enabled.

> I am surprised by that result, because the wiki says that btrfs is very effective for small files.

It can be space efficient or performance efficient.

If we *ignore* the metadata profile, btrfs is space efficient because it
inlines small file data into metadata, avoiding padding it to
sectorsize, so it can save some space.

This behavior can also help performance somewhat by avoiding an extra
seek for the data: when the metadata is read, the inlined data has
already been read along with it.

But this efficiency comes at a cost.

One obvious cost arises when inline data has to be converted into a
regular extent. That may cause extra tree balancing and increase latency.

Would you please retest with the "-o max_inline=0" mount option, which
disables inline data (making btrfs behave like ext*/xfs), to see if
it's related?
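
For reference, such a retest could look roughly like the sketch below.
It assumes the filesystem is currently mounted; /dev/sda and
/mnt/btrfs-test are placeholder names (assumptions, not taken from the
report), and by default the script only prints the commands instead of
running them.

```shell
#!/bin/sh
# Sketch: remount the test filesystem with inline extents disabled.
# DEV and MNT are placeholders (assumptions), not taken from the report.
DEV="${DEV:-/dev/sda}"
MNT="${MNT:-/mnt/btrfs-test}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN=0 (needs root).
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Unmount, then mount again with max_inline=0.
retest_mount() {
    run umount "$MNT"
    run mount -o max_inline=0 "$DEV" "$MNT"
}

retest_mount
```

Running it with DRY_RUN=0 as root would perform the remount. The
benchmark should then be repeated on freshly created files, so that all
of the data is written under the new option.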

Thanks,
Qu



* AW: Btrfs performance with small blocksize on SSD
  2017-09-24 13:40 ` Qu Wenruo
@ 2017-09-24 13:53   ` Fuhrmann, Carsten
  2017-09-24 14:10     ` Qu Wenruo
  2017-09-24 16:43     ` Andrei Borzenkov
  0 siblings, 2 replies; 10+ messages in thread
From: Fuhrmann, Carsten @ 2017-09-24 13:53 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3039 bytes --]

Hello,

1)
I used direct writes (no page cache), but I didn't disable the disk cache of the HDD/SSD itself. In every test I wrote 1GB and measured the runtime of that write process.
I ran every test 5 times with different block sizes (2k, 8k, 32k, 128k, 512k). Those values are on the X axis; the Y axis shows the runtime of the test.

2)
Yes, every test uses RAID1 for data and metadata.

3)
Everything is default:
mkfs.btrfs -d raid1 -m raid1 /dev/sda /dev/sdb /dev/sdc /dev/sdd


best regards

Carsten



* Re: AW: Btrfs performance with small blocksize on SSD
  2017-09-24 13:53   ` AW: " Fuhrmann, Carsten
@ 2017-09-24 14:10     ` Qu Wenruo
  2017-09-24 14:22       ` AW: " Fuhrmann, Carsten
  2017-09-24 16:43     ` Andrei Borzenkov
  1 sibling, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2017-09-24 14:10 UTC (permalink / raw)
  To: Fuhrmann, Carsten, linux-btrfs



On 2017-09-24 21:53, Fuhrmann, Carsten wrote:
> Hello,
> 
> 1)
> I used direct writes (no page cache), but I didn't disable the disk cache of the HDD/SSD itself. In every test I wrote 1GB and measured the runtime of that write process.

Are you writing all the 1G into one file?
Or into different files?

> I ran every test 5 times with different block sizes (2k, 8k, 32k, 128k, 512k). Those values are on the X axis; the Y axis shows the runtime of the test.

Good to know that.

Then there may be 2 factors impacting performance:

1) Conversion between inline and regular data extents
    The 1st 2K write will be inlined, and then the 2nd 2K write will
    convert it to a regular data extent.
    The overhead can be quite high.

    Retesting with "-o max_inline=0" disables this behavior, so all
    writes will only create regular data extents.

2) Unaligned data size causing extra rewrite/CoW
    Btrfs stores its data in units of sectorsize, which in your case is
    4K.
    Writing 2K at a time causes btrfs to read back the half-written
    sector and then CoW it somewhere else.
    The overhead can be quite large.

    I assume 2) is the main overhead.

    Retest with a 4K blocksize to see if it's related.

    Please note that 4K and 2K blocksizes go through different code
    paths (the 4K path has no extra CoW overhead, so its result should
    be close to the 8K blocksize result).
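
To put rough numbers on the two block-size cases, here is a small
sketch that only does arithmetic: it counts the write calls needed for
the 1 GiB test at each block size and checks whether the block size is
a multiple of the 4K sectorsize. It does not model btrfs internals, and
the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: how many write calls does a 1 GiB test need per block size,
# and is each write a whole multiple of the 4 KiB sectorsize?
TOTAL=$((1024 * 1024 * 1024))   # 1 GiB written per test
SECTOR=4096                     # btrfs sectorsize in this setup

# Number of write calls needed for a given block size in bytes.
io_count() {
    echo $((TOTAL / $1))
}

# "aligned" if the block size is a multiple of the sectorsize.
alignment() {
    if [ $(($1 % SECTOR)) -eq 0 ]; then
        echo aligned
    else
        echo sub-sector
    fi
}

for kib in 2 4 8 32 128 512; do
    bs=$((kib * 1024))
    printf '%4sK  %7s writes  %s\n' "$kib" "$(io_count "$bs")" "$(alignment "$bs")"
done
```

Under these assumptions only the 2K case is smaller than a sector, and
it also needs the most write calls (524288, twice as many as 4K), so
both effects point in the same direction.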

Thanks,
Qu



* AW: AW: Btrfs performance with small blocksize on SSD
  2017-09-24 14:10     ` Qu Wenruo
@ 2017-09-24 14:22       ` Fuhrmann, Carsten
  0 siblings, 0 replies; 10+ messages in thread
From: Fuhrmann, Carsten @ 2017-09-24 14:22 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 5010 bytes --]


1)
Every test has its own file, so the 2k blocksize test writes to a different file than the 4k blocksize test. In the end there are 5 files on the disk (2k, 8k, ...).

2)
Well, I think it is 2) as well, since performance is much better for 4k and higher.
I'm going to test with the -o max_inline option and with a 4k blocksize as well.

Thank you for your help

Carsten



* Re: AW: Btrfs performance with small blocksize on SSD
  2017-09-24 13:53   ` AW: " Fuhrmann, Carsten
  2017-09-24 14:10     ` Qu Wenruo
@ 2017-09-24 16:43     ` Andrei Borzenkov
  2017-09-24 20:39       ` Kai Krakow
  2017-09-25  7:04       ` AW: AW: " Fuhrmann, Carsten
  1 sibling, 2 replies; 10+ messages in thread
From: Andrei Borzenkov @ 2017-09-24 16:43 UTC (permalink / raw)
  To: Fuhrmann, Carsten, Qu Wenruo, linux-btrfs

24.09.2017 16:53, Fuhrmann, Carsten wrote:
> Hello,
> 
> 1)
> I used direct writes (no page cache), but I didn't disable the disk cache of the HDD/SSD itself. In every test I wrote 1GB and measured the runtime of that write process.

So "latency" on your diagram means the total time to write a 1GiB file?
That is a highly unusual meaning for "latency", which normally means the
time to perform a single IO. If so, you had better rename the Y-axis to
something like "total run time".
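
If the writes are issued one at a time, the total run time can be
converted into an approximate mean time per IO and a throughput. A
minimal sketch of that arithmetic follows; the 100-second run time is a
made-up placeholder, not a value read from the attached graphs, and the
function names are hypothetical.

```shell
#!/bin/sh
# Sketch: derive per-IO latency and throughput from a total run time.
TOTAL=$((1024 * 1024 * 1024))   # 1 GiB written per test

# Mean time per write in microseconds (integer division).
# Args: run time in seconds, block size in bytes.
mean_latency_us() {
    ops=$((TOTAL / $2))
    echo $(($1 * 1000000 / ops))
}

# Throughput in MiB/s (integer division). Arg: run time in seconds.
throughput_mib_s() {
    echo $((TOTAL / 1048576 / $1))
}

RUNTIME_S=100   # hypothetical value, NOT a measurement from the graphs
echo "mean latency: $(mean_latency_us "$RUNTIME_S" 2048) us per 2K write"
echo "throughput:   $(throughput_mib_s "$RUNTIME_S") MiB/s"
```

With real numbers substituted for the placeholder, the same run can be
labelled as total run time, mean latency, or throughput without
ambiguity.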


* Re: Btrfs performance with small blocksize on SSD
  2017-09-24 16:43     ` Andrei Borzenkov
@ 2017-09-24 20:39       ` Kai Krakow
  2017-09-25  7:04       ` AW: AW: " Fuhrmann, Carsten
  1 sibling, 0 replies; 10+ messages in thread
From: Kai Krakow @ 2017-09-24 20:39 UTC (permalink / raw)
  To: linux-btrfs

On Sun, 24 Sep 2017 19:43:05 +0300,
Andrei Borzenkov <arvidjaar@gmail.com> wrote:

> 24.09.2017 16:53, Fuhrmann, Carsten wrote:
> > Hello,
> > 
> > 1)
> > I used direct writes (no page cache), but I didn't disable the disk
> > cache of the HDD/SSD itself. In every test I wrote 1GB and measured
> > the runtime of that write process.
> 
> So "latency" on your diagram means the total time to write a 1GiB file?
> That is a highly unusual meaning for "latency", which normally means
> the time to perform a single IO. If so, you had better rename the
> Y-axis to something like "total run time".

If you look closely it says "Laufzeit" which visually looks similar to
"latency" but really means "run time". ;-)


-- 
Regards,
Kai

Replies to list-only preferred.




* AW: AW: Btrfs performance with small blocksize on SSD
  2017-09-24 16:43     ` Andrei Borzenkov
  2017-09-24 20:39       ` Kai Krakow
@ 2017-09-25  7:04       ` Fuhrmann, Carsten
  2017-09-25  8:36         ` Kai Krakow
  1 sibling, 1 reply; 10+ messages in thread
From: Fuhrmann, Carsten @ 2017-09-25  7:04 UTC (permalink / raw)
  To: Andrei Borzenkov, Qu Wenruo, linux-btrfs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1100 bytes --]

Well, the correct translation of "Laufzeit" is runtime, not latency. But thank you for the hint; I'll change it to "gesamt Laufzeit" to make it clearer.


Best regards

Carsten



* Re: Btrfs performance with small blocksize on SSD
  2017-09-25  7:04       ` AW: AW: " Fuhrmann, Carsten
@ 2017-09-25  8:36         ` Kai Krakow
  0 siblings, 0 replies; 10+ messages in thread
From: Kai Krakow @ 2017-09-25  8:36 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 25 Sep 2017 07:04:14 +0000,
"Fuhrmann, Carsten" <carsten.fuhrmann@rwth-aachen.de> wrote:

> Well, the correct translation of "Laufzeit" is runtime, not
> latency. But thank you for the hint; I'll change it to "gesamt
> Laufzeit" to make it clearer.

How about translating it to English in the first place, as you're
trying to reach an international community?

Also, it would be nice to post the exact test you ran as a command line
or configuration file, so it can be replayed on other systems and, most
importantly, by the developers, to easily uncover what is causing the
behavior...
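
Something along the following lines would already make the test
replayable. This is only a sketch under assumptions: the thread does
not say which tool produced the graphs, so GNU dd stands in for it, the
test directory is a placeholder, and only the sequential case is
covered. By default it just prints the command lines.

```shell
#!/bin/sh
# Sketch of a replayable test: sequential direct writes of 1 GiB per
# block size. TESTDIR is a placeholder (assumption); using dd is also an
# assumption, since the original benchmark tool is not named.
TESTDIR="${TESTDIR:-/mnt/btrfs-test}"
TOTAL=$((1024 * 1024 * 1024))   # 1 GiB written per test
DRY_RUN="${DRY_RUN:-1}"

# Build the dd command line for one block size given in KiB.
dd_cmd() {
    bs=$(($1 * 1024))
    echo "dd if=/dev/zero of=$TESTDIR/bs_${1}k.bin bs=$bs count=$((TOTAL / bs)) oflag=direct conv=fsync"
}

for kib in 2 4 8 32 128 512; do
    cmd=$(dd_cmd "$kib")
    echo "+ $cmd"
    # Only touch the disk when explicitly requested.
    if [ "$DRY_RUN" = "0" ]; then
        # Unquoted on purpose so the string splits into arguments;
        # this breaks if TESTDIR contains spaces.
        $cmd
    fi
done
```

With DRY_RUN=0, GNU dd reports the elapsed time and transfer rate for
each block size on stderr, which gives one run-time figure per block
size.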




-- 
Regards,
Kai

Replies to list-only preferred.




* Re: Btrfs performance with small blocksize on SSD
  2017-09-24 13:24 Btrfs performance with small blocksize on SSD Fuhrmann, Carsten
  2017-09-24 13:40 ` Qu Wenruo
@ 2017-09-26 20:33 ` Peter Grandi
  1 sibling, 0 replies; 10+ messages in thread
From: Peter Grandi @ 2017-09-26 20:33 UTC (permalink / raw)
  To: Linux fs Btrfs

> I ran a few performance tests comparing mdadm, hardware RAID,
> and Btrfs RAID.

Fantastic beginning already! :-)

> I noticed that the performance

I have seen over the years a lot of messages like this where
there is a wanton display of amusing misuses of terminology, of
which the misuse of the word "performance" to mean "speed" is
common, and your results are work-per-time which is a "speed":
http://www.sabi.co.uk/blog/15-two.html?151023#151023

The "tl;dr" is: you and another guy are told to race the 100m to
win a €10,000 prize, but you have to carry a sack with a 50 kg
weight. It takes you a lot longer, as your speed is much lower,
and the other guy gets the prize. Was that because your
performance was much worse? :-)

> for small blocksizes (2k) is very bad on SSD in general and on
> HDD for sequential writing.

Your graphs show pretty decent performance for small-file IO on
Btrfs, depending on conditions, and you are very astutely not
explaining the conditions, even if some can be guessed.

> I wonder about that result, because you say on the wiki that
> btrfs is very effective for small files.

Effectiveness/efficiency are not the same as performance or speed
either. My own simplistic but somewhat meaningful tests show
that Btrfs does relatively well on small files:

  http://www.sabi.co.uk/blog/17-one.html?170302#170302

As to "small files" in general I have read about many attempts
to use filesystems as DBMSes, and I consider them intensely
stupid:

  http://www.sabi.co.uk/blog/anno05-4th.html?051016#051016

> I attached my results from raid 1 random write HDD (rH1), SSD
> (rS1) and from sequential write HDD (sH1), SSD (sS1)

Ah, so it was specifically about small *writes* (and presumably
because of other wording not small-updates-in-place of large
files, but creating and writing small files).

It is a very basic beginner level notion that most storage
systems are very anisotropic as to IO size, and also for read
vs. write, and never mind with and without 'fsync'. SSDs without
supercapacitor backed buffers in particular are an issue.

Btrfs has a performance envelope where the speed of small writes
(in particular small in-place updates, but also because of POSIX
small file creation) has been sacrificed for good reasons:

https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Copy_on_Write_.28CoW.29
https://btrfs.wiki.kernel.org/index.php/Gotchas#Fragmentation

Also consider the consequences of the 'max_inline' option for
'mount' and the 'nodesize' option for 'mkfs.btrfs'.

> Hopefully you have an explanation for that.

The best explanation seems to me (euphemism alert) quite
extensive "misknowledge" in the message I am responding to.

