* speeding up slow btrfs filesystem
@ 2011-12-16 17:51 Martin Steigerwald
  2011-12-16 17:54 ` Martin Steigerwald
  2011-12-17 11:11 ` Chris Samuel
  0 siblings, 2 replies; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-16 17:51 UTC (permalink / raw)
  To: linux-btrfs

Hi!

On apt-get dist-upgrading my Amarok ThinkPad T23 with BTRFS as /
and as /home I get extremely slow operation - my ThinkPad T42 with Ext4
is running circles around it and that's likely not only due to the
faster CPU.

vmstat 1 shows:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 0  4 151452  75016     68 382084    0    0  1164    28  595 1959 31 19  0 50
 0  3 151452  74272     68 382560    0    0   488     0  538 1735 10  9  0 81
 4  2 151452  71644     68 385776    0    0  3804     0  663 1886 56 38  0  6
 3  2 151452  66916     68 387192    0    0  1264     0  633 1018 74 24  0  2
 1  3 151452  63296     68 389336    0    0  1580     0  656 4095 80 20  0  0
 2  3 151452  66272     68 390028    8    0   572     0  601 3449 40 17  0 43
 3  2 151452  65032     68 390828    0    0   760     0  673 2364 42 25  0 32
 3  2 151452  61816     68 393508    0    0  2672     0  748 2203 52 29  0 19
 2  2 151452  60824     68 394236    0    0   724     0  660 2338 51 22  0 27
 4  2 151452  59832     68 395024    0    0   808     0  662 2309 40 20  0 40
 0  2 151452  58708     68 395856    0    0   812    12  683 2217 46 23  0 30
 0  2 151452  57964     68 396416    0    0   512     0  619 2196 41 24  0 35

I know laptop harddisks aren't the fastest, but AFAIR the T23 felt way
faster with Ext3/4.

I get quite some stalls when opening a new window in "screen". It can
take 10-20 seconds to load the new Z-Shell into it. Also Amarok stops
playing music for a while sometimes, which it didn't with Ext3/4. I
suspect that the kernel does not service an I/O request of Amarok
quickly enough.

Surprisingly I do not see an excessive amount of CPU usage of btrfs
kernel threads with atop. But the disk seems to be quite busy with
block-in rates in vmstat of merely a few thousand at maximum.

Thus I suspect fragmentation of btrfs trees or files.

The filesystems have the following specifics - apt-get will work on /
only, obviously:

deepdance:~> btrfs filesystem show
failed to read /dev/sr0
Label: 'debian'  uuid: 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
        Total devices 1 FS bytes used 7.34GB
        devid    1 size 15.00GB used 14.97GB path /dev/dm-0

Label: 'home'  uuid: a600de65-e1ab-4cbf-b150-bbaeaf9fa98d
        Total devices 1 FS bytes used 28.13GB
        devid    1 size 80.00GB used 40.54GB path /dev/dm-2

Btrfs Btrfs v0.19
deepdance:~> btrfs filesystem df /
Data: total=11.23GB, used=6.84GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.86GB, used=510.99MB

I cleaned out a lot of packages due to the slow dist-upgrades already
and also because I do not need them on that laptop anymore. Thus the
data tree only uses a bit more than half of the allocated space. BTRFS
doesn't seem to give space back to the pool for all trees. Maybe it
will do that on btrfs filesystem balance?
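
As a rough aid for reading the "btrfs filesystem df" figures above, the
following sketch computes how much of each allocation is actually in
use. It is only an illustration: the function name "unused_report" is
made up here, the sample lines are copied from the output above, and it
assumes the tool's GB/MB/KB suffixes are multiples of 1024.

```shell
#!/bin/sh
# Sketch: summarise "btrfs filesystem df" lines read from stdin.
# Assumption: GB/MB/KB in the v0.19 output are multiples of 1024.
unused_report() {
    awk '
    # Convert a size such as 11.23GB, 510.99MB, 4.00KB or 0.00 to MB.
    function mb(v) {
        if (v ~ /GB$/) return substr(v, 1, length(v) - 2) * 1024
        if (v ~ /MB$/) return substr(v, 1, length(v) - 2) + 0
        if (v ~ /KB$/) return substr(v, 1, length(v) - 2) / 1024
        return v + 0
    }
    {
        pos = index($0, ": total=")
        if (pos == 0) next                      # not a block group line
        name = substr($0, 1, pos - 1)           # e.g. "Metadata, DUP"
        split(substr($0, pos + 8), f, ", used=")
        total = mb(f[1]); used = mb(f[2])
        pct = (total > 0) ? 100 * used / total : 0
        printf "%s: %.0f MB allocated but unused (%.0f%% in use)\n", name, total - used, pct
    }'
}

# Sample taken from the output above; on a live system one would run
#   btrfs filesystem df / | unused_report
unused_report <<'EOF'
Data: total=11.23GB, used=6.84GB
Metadata, DUP: total=1.86GB, used=510.99MB
EOF
```

For the root filesystem this reports about 4.4 GB of the Data
allocation as unused (61% in use).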

home is:

deepdance:~> btrfs filesystem df /home
Data: total=37.01GB, used=27.54GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.75GB, used=598.76MB
Metadata: total=8.00MB, used=0.00
deepdance:~>

BTW why does it have two metadata and system trees while /
only has one?

Currently I have:

deepdance:~> cat /proc/version
Linux version 3.0.0-2-686-pae (Debian 3.0.0-6) (ben@decadent.org.uk)
(gcc version 4.5.3 (Debian 4.5.3-9) ) #1 SMP Wed Nov 2 05:29:50
UTC 2011

from Debian Wheezy.

Free memory is quite okay:

deepdance:~> free -m
             total       used       free     shared    buffers     cached
Mem:           755        699         55          0          0        347
-/+ buffers/cache:        352        402
Swap:         2047        148       1899

I am wondering on how to optimize performance on the /
BTRFS filesystem.

I am not sure whether to try btrfs filesystem balance or
btrfs filesystem defragment /.

I also wonder whether some Debian package management related
file might be fragmented. But the ones I tested do not seem to be:

deepdance:/var/lib/dpkg> filefrag available
available: 1 extent found
deepdance:/var/lib/dpkg> filefrag status
status: 1 extent found
deepdance:/var/lib/dpkg>

But then I also do not know whether "filefrag" from "e2fsprogs"
1.42~WIP-2011-10-16-1 will work with BTRFS.

Any advice?

It's not critical for me to fix these issues (soon), but I am curious
whether it's possible to get the filesystem speedier by some
maintenance.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: speeding up slow btrfs filesystem
  2011-12-16 17:51 speeding up slow btrfs filesystem Martin Steigerwald
@ 2011-12-16 17:54 ` Martin Steigerwald
  2011-12-16 18:38   ` Goffredo Baroncelli
  2011-12-17 11:11 ` Chris Samuel
  1 sibling, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-16 17:54 UTC (permalink / raw)
  To: linux-btrfs

On Friday, 16 December 2011, Martin Steigerwald wrote:
> Its not critical for me to fix these issues (soon), but I am curious
> whether its possible to get the filesystem speedier by some
> maintenance.

Maybe after it is clear why it is so slow in the first place ;).

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


* Re: speeding up slow btrfs filesystem
  2011-12-16 17:54 ` Martin Steigerwald
@ 2011-12-16 18:38   ` Goffredo Baroncelli
  2011-12-16 19:53     ` Martin Steigerwald
  2011-12-18 18:41     ` Andrea Gelmini
  0 siblings, 2 replies; 24+ messages in thread
From: Goffredo Baroncelli @ 2011-12-16 18:38 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs

On Friday, 16 December, 2011 18:54:46 you wrote:
> On Friday, 16 December 2011, Martin Steigerwald wrote:
> > Its not critical for me to fix these issues (soon), but I am curious
> > whether its possible to get the filesystem speedier by some
> > maintenance.
> 
> Maybe after it is clear why it is so slow in the first place ;).

I had the same experience. apt-get upgrade was a frustrating experience!

IIRC, a copy-on-write filesystem has to merge write requests as much as
possible in order to perform well.

apt-get instead makes a lot of sync calls, which prevent btrfs from
merging the write requests. This explains why btrfs is slow in this case.

I found a solution, but it requires a bit of setup.

The idea is to avoid performing syncs during the package installation. To
avoid data loss in case of failure, I create a snapshot before upgrading.
If something goes wrong (e.g. a power failure), I reboot the system from
the snapshot. If the installation finishes without problems, I flush all
the data to disk and remove the snapshot.

For the details, see my old post titled "[RFC] aptitude & BTRFS slow"
(2011-10-19)
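
The workflow described above could be sketched roughly as follows. This
is not the script from the referenced post; the snapshot path, the
variable names and the use of "eatmydata" to suppress the syncs are
assumptions made for illustration. By default it only prints the
commands it would run.

```shell
#!/bin/sh
# Sketch of the snapshot-then-upgrade idea; not the script from the RFC post.
# Assumptions: / is a btrfs subvolume, eatmydata is installed, and SNAP is
# a hypothetical snapshot path. RUN defaults to "echo" (dry run); set
# RUN= (empty) to execute the commands for real.
SNAP=${SNAP-/pre-upgrade-snapshot}
RUN=${RUN-echo}

upgrade_with_snapshot() {
    # 1. Keep the old system state around in case the upgrade is interrupted.
    $RUN btrfs subvolume snapshot / "$SNAP" || return 1
    # 2. Upgrade with fsync()/sync() turned into no-ops for this process tree.
    if $RUN eatmydata apt-get dist-upgrade; then
        # 3. Success: flush everything to disk, then drop the snapshot.
        $RUN sync
        $RUN btrfs subvolume delete "$SNAP"
    else
        echo "upgrade failed; snapshot kept at $SNAP" >&2
        return 1
    fi
}

upgrade_with_snapshot
```

If the machine loses power mid-upgrade, the snapshot still holds the
pre-upgrade state; booting from it (for example via a subvol= mount
option) is left out of this sketch.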

BR
G.Baroncelli




-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@inwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512


* Re: speeding up slow btrfs filesystem
  2011-12-16 18:38   ` Goffredo Baroncelli
@ 2011-12-16 19:53     ` Martin Steigerwald
  2011-12-16 20:58       ` Martin Steigerwald
  2011-12-17 11:39       ` Goffredo Baroncelli
  2011-12-18 18:41     ` Andrea Gelmini
  1 sibling, 2 replies; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-16 19:53 UTC (permalink / raw)
  To: linux-btrfs, Goffredo Baroncelli

On Friday, 16 December 2011, Goffredo Baroncelli wrote:
> On Friday, 16 December, 2011 18:54:46 you wrote:
> > On Friday, 16 December 2011, Martin Steigerwald wrote:
> > > Its not critical for me to fix these issues (soon), but I am
> > > curious whether its possible to get the filesystem speedier by
> > > some maintenance.
> > 
> > Maybe after it is clear why it is so slow in the first place ;).
> 
> I had the same experience. apt-get upgrade was a frustrating
> experience!
> 
> IIRC the copy-on-write file-system in order to have good performance
> have to merge the write requests most as possible.
> 
> Instead apt-get makes a lot of sync calls which don't allow btrfs to
> merge the write requests. This explains why btrfs is slow in this
> case.

Ah, I see. AFAIR an option has been added to apt/aptitude to omit
the fsync itself.

Hmmm, a co-worker had the issue of Iceweasel with lots of tabs open being 
slow and I suspected that high fsync() usage of SQLite3 databases for 
bookmarks and stuff might be the culprit. The issue went away for him after 
switching to Ext4.

> I found a solution, but requires a bit of setup.
> 
> The idea is to avoid do perform sync during the package installation.
> In order to avoid data loss in case of failure, I create a snapshot
> before the upgrading. If something goes wrong (i.e. a power failure) I
> rebooot the system from the snapshot. If the installation finish
> without problem, I flush all the data to the disk and remove the
> snapshot.
> 
> For the detail, see a my old post titled "[RFC] aptitude & BTRFS slow"
> (2011-10-19)

Sounds more like a workaround to me than a solution.

I feel reluctant about working around what seems to be a filesystem 
limitation. (A filesystem should not break, i.e. slow down an existing user 
space application beyond a certain limit I think).

I wonder whether it might be a good idea to have nodatacow for /:

nodatacow - Do not copy-on-write data. datacow is used to ensure the user 
either has access to the old version of a file, or to the newer version of 
the file. datacow makes sure we never have partially updated files written 
to disk. nodatacow gives slight performance boost by directly overwriting 
data (like ext[234]), at the expense of potentially getting partially 
updated files on system failures. Performance gain is usually < 5% unless 
the workload is random writes to large database files, where the difference 
can become very large 

(see https://btrfs.wiki.kernel.org/articles/g/e/t/Getting_started.html)

Then writing of files would be back to the Ext3/4 way of doing it.

What do you think?

PS: I am not sure whether it's just aptitude. I have occasional audio
stalls even while not upgrading the system. But then that might be
pulseaudio, although sound playback threads are running with realtime
priority.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


* Re: speeding up slow btrfs filesystem
  2011-12-16 19:53     ` Martin Steigerwald
@ 2011-12-16 20:58       ` Martin Steigerwald
  2011-12-17  7:03         ` Sergei Trofimovich
  2011-12-17 11:39       ` Goffredo Baroncelli
  1 sibling, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-16 20:58 UTC (permalink / raw)
  To: linux-btrfs

On Friday, 16 December 2011, Martin Steigerwald wrote:
> I wonder whether it might be a good idea to have nodatacow for /:

Nope. Doesn't seem to help much.

How to turn it off, after turning it on?

deepdance:~> LANG=C mount -o remount,datacow /
mount: / not mounted already, or bad option

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

* Re: speeding up slow btrfs filesystem
  2011-12-16 20:58       ` Martin Steigerwald
@ 2011-12-17  7:03         ` Sergei Trofimovich
  2011-12-17 11:09           ` Martin Steigerwald
  0 siblings, 1 reply; 24+ messages in thread
From: Sergei Trofimovich @ 2011-12-17  7:03 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs


On Fri, 16 Dec 2011 21:58:45 +0100
Martin Steigerwald <Martin@lichtvoll.de> wrote:

> Nope. Doesn´t seem to help much.
> 
> How to turn it off, after turning it on?
> 
> deepdance:~> LANG=C mount -o remount,datacow / 
> mount: / not mounted already, or bad option

In Debian you can disable syncing on a per-process basis:
    http://packages.debian.org/sid/eatmydata

$ eatmydata apt-get install foo
$ eatmydata firefox
$ eatmydata liferea

makes things more bearable

HTH

-- 

  Sergei


* Re: speeding up slow btrfs filesystem
  2011-12-17  7:03         ` Sergei Trofimovich
@ 2011-12-17 11:09           ` Martin Steigerwald
  2011-12-17 11:26             ` Hugo Mills
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 11:09 UTC (permalink / raw)
  To: Sergei Trofimovich; +Cc: linux-btrfs

On Saturday, 17 December 2011, Sergei Trofimovich wrote:
> On Fri, 16 Dec 2011 21:58:45 +0100
>
> Martin Steigerwald <Martin@lichtvoll.de> wrote:
> > Nope. Doesn't seem to help much.
> >
> > How to turn it off, after turning it on?
> >
> > deepdance:~> LANG=C mount -o remount,datacow /
> > mount: / not mounted already, or bad option
>
> In debian you can disable syncing on per-process basis:
>     http://packages.debian.org/sid/eatmydata
>
> $ eatmydata apt-get install foo
> $ eatmydata firefox
> $ eatmydata liferea
>
> makes things more bearable

I am not ready to accept that this is the proper answer to what I
experience. Applications using fsync() are realistic real world
scenarios and I think BTRFS has to cope with that.

Yesterday I upgraded the laptop to 3.2-rc4. After converting the inode
cache the filesystem appeared to be faster, but I have to wait for some
Debian packages to pile up on the repository servers to get a real
impression.

I think I will scrub / balance / defragment the filesystem after a
backup. But I am not sure in what order.

I understand that defragment defragments files. But then what does
balance do? For RAID setup I have seen it distributing data evenly
across drives when I echo > /sys/block/sda/[…]/delete a drive before
and BTRFS had to distribute unevenly cause of that. But what does it
do on a filesystem on a single drive? I bet it would balance out
trees? Will it resize trees with lots of unused space as well?

According to

deepdance:~> btrfs filesystem df /
Data: total=11.23GB, used=6.98GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.86GB, used=511.35MB
deepdance:~> btrfs filesystem show
[…]
Label: 'debian'  uuid: 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
        Total devices 1 FS bytes used 7.48GB
        devid    1 size 15.00GB used 14.97GB path /dev/dm-0

Btrfs Btrfs v0.19

the filesystem might have had some chances to fragment heavily, cause
the tree sizes add up almost to the 15 GB of space available.

I also remember that for some time the filesystem was nearly full
which would explain the tree sizes.
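
The observation that the sizes add up to nearly the whole device can be
checked by hand. The sketch below assumes that DUP block groups occupy
twice their reported size on disk and that the GB/MB units are
multiples of 1024; the function name is made up for the example.

```shell
#!/bin/sh
# Sketch: add up the "total" values from "btrfs filesystem df /" above.
# Assumption: DUP entries are stored twice, so they count double.
raw_allocated_gb() {
    awk 'BEGIN {
        data = 11.23                  # Data (GB), single copy
        meta = 2 * 1.86               # Metadata, DUP (GB)
        sys  = (2 * 8 + 4) / 1024     # System, DUP 8MB twice + System 4MB
        printf "%.2f GB\n", data + meta + sys
    }'
}

raw_allocated_gb
```

Under these assumptions the sum comes to 14.97 GB, which matches the
"used 14.97GB" figure on the devid line of btrfs filesystem show.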

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

* Re: speeding up slow btrfs filesystem
  2011-12-16 17:51 speeding up slow btrfs filesystem Martin Steigerwald
  2011-12-16 17:54 ` Martin Steigerwald
@ 2011-12-17 11:11 ` Chris Samuel
  2011-12-17 12:00   ` Martin Steigerwald
  1 sibling, 1 reply; 24+ messages in thread
From: Chris Samuel @ 2011-12-17 11:11 UTC (permalink / raw)
  To: linux-btrfs


On Sat, 17 Dec 2011 04:51:51 AM Martin Steigerwald wrote:

> Currently I have:
> 
> deepdance:~> cat /proc/version
> Linux version 3.0.0-2-686-pae (Debian 3.0.0-6)

You are using a fairly old kernel btrfs-wise, I believe there's been 
work done in the 3.2 rc's to improve performance so I'd suggest it's 
well worth testing with 3.2-rc6 to see whether that helps.

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP


* Re: speeding up slow btrfs filesystem
  2011-12-17 11:09           ` Martin Steigerwald
@ 2011-12-17 11:26             ` Hugo Mills
  2011-12-17 11:38               ` Martin Steigerwald
  0 siblings, 1 reply; 24+ messages in thread
From: Hugo Mills @ 2011-12-17 11:26 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: Sergei Trofimovich, linux-btrfs


On Sat, Dec 17, 2011 at 12:09:56PM +0100, Martin Steigerwald wrote:
> I think I will scrub / balance / defragment the filesystem after a backup. 
> But I am not sure in what order.
> 
> I understand that defragment defragments files. But then what does balance 
> do? For RAID setup I have seen it distributing data evenly across drives 
> when I echo > /sys/block/sda/[…]/delete a drive before and BTRFS had to 
> distribute unevenly cause of that. But what does it do on a filesystem on a 
> single drive? I bet it would balance out trees? Will it resize trees with 
> lots of unused space as well?

   The metadata trees are automatically balanced, simply by the nature
of the B-tree algorithms used. Balance won't, in general, affect them.
The only thing that a balance will achieve on a single-disk filesystem
is to reclaim unused space from allocated block groups -- so the
"total" value in your Data and Metadata entries below will go down.

> According to
> 
> deepdance:~> btrfs filesystem df / 
> Data: total=11.23GB, used=6.98GB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.86GB, used=511.35MB
> deepdance:~> btrfs filesystem show
> […]
> Label: 'debian'  uuid: 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
>         Total devices 1 FS bytes used 7.48GB
>         devid    1 size 15.00GB used 14.97GB path /dev/dm-0
> 
> Btrfs Btrfs v0.19
> 
> the filesystem might have had some chances to fragment heavily, cause the 
> tree sizes add up almost to the 15 GB of space available.
> 
> I also remember that for some time the filesystem was nearly full which 
> would explain the tree sizes.

   For metadata, the lower bound on size is 0.1% of the data size
(because checksums are computed at 4 bytes for every 4096 bytes of
data). However, data usage can be very much greater than this with
inline extents, where small files can get embedded directly in the
metadata section. This is probably more likely what explains the tree
sizes.
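
To put the 0.1% lower bound into numbers for this filesystem, here is a
small sketch. It assumes the 4-bytes-per-4096-bytes ratio stated above
and the 6.98GB of data reported by btrfs filesystem df; the function
name is invented for the example.

```shell
#!/bin/sh
# Sketch: minimum checksum metadata for a given amount of data.
# Assumption: 4 bytes of checksum per 4096-byte block, GB = 1024 MB.
min_csum_mb() {
    awk -v data_gb="$1" 'BEGIN {
        printf "%.2f MB\n", data_gb * 1024 * 4 / 4096
    }'
}

min_csum_mb 6.98
```

That is about 7 MB, against roughly 511 MB of metadata in use, so
checksums alone account for only a small part of it.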

   I understand (although I've not done the analysis myself) that the
maximum "wasted" space in btrfs's B-tree implementation is 50%. To the
best of my knowledge, there's no compaction process for btrfs's trees
available -- nor, in general, should you need one, as a fully-
compacted tree would only have to be rearranged when more data is
added to it, thus slowing the system down after compaction.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- I'll take your bet, but make it ten thousand francs. I'm only ---  
                       a _poor_ corrupt official.                        


* Re: speeding up slow btrfs filesystem
  2011-12-17 11:26             ` Hugo Mills
@ 2011-12-17 11:38               ` Martin Steigerwald
  2011-12-17 11:45                 ` Hugo Mills
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 11:38 UTC (permalink / raw)
  To: Hugo Mills; +Cc: Sergei Trofimovich, linux-btrfs

On Saturday, 17 December 2011, Hugo Mills wrote:
> On Sat, Dec 17, 2011 at 12:09:56PM +0100, Martin Steigerwald wrote:
> > I think I will scrub / balance / defragment the filesystem after a
> > backup. But I am not sure in what order.
> >
> > I understand that defragment defragments files. But then what does
> > balance do? For RAID setup I have seen it distributing data evenly
> > across drives when I echo > /sys/block/sda/[…]/delete a drive before
> > and BTRFS had to distribute unevenly cause of that. But what does it
> > do on a filesystem on a single drive? I bet it would balance out
> > trees? Will it resize trees with lots of unused space as well?
>
>    The metadata trees are automatically balanced, simply by the nature
> of the B-tree algorithms used. Balance won't, in general, affect them.
> The only thing that a balance will achieve on a single-disk filesystem
> is to reclaim unused space from allocated block groups -- so the
> "total" value in your Data and Metadata entries below will go down.

But that's only for optical viewing pleasure as far as I understood you?

Only if there would be not enough free space for one tree to extend then
a balance would make sense? I.e. when I had a lot of metadata so that
the metadata would need to extend (which seems unlikely given below
figures).

> > According to
> >
> > deepdance:~> btrfs filesystem df /
> > Data: total=11.23GB, used=6.98GB
> > System, DUP: total=8.00MB, used=4.00KB
> > System: total=4.00MB, used=0.00
> > Metadata, DUP: total=1.86GB, used=511.35MB
> > deepdance:~> btrfs filesystem show
> > […]
> > Label: 'debian'  uuid: 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
> >
> >         Total devices 1 FS bytes used 7.48GB
> >         devid    1 size 15.00GB used 14.97GB path /dev/dm-0
> >
> > Btrfs Btrfs v0.19
> >
> > the filesystem might have had some chances to fragment heavily, cause
> > the tree sizes add up almost to the 15 GB of space available.
> >
> > I also remember that for some time the filesystem was nearly full
> > which would explain the tree sizes.
>
>    For metadata, the lower bound on size is 0.1% of the data size
> (because checksums are computed at 4 bytes for every 4096 bytes of
> data). However, data usage can be very much greater than this with
> inline extents, where small files can get embedded directly in the
> metadata section. This is probably more likely what explains the tree
> sizes.
>
>    I understand (although I've not done the analysis myself) that the
> maximum "wasted" space in btrfs's B-tree implementation is 50%. To the
> best of my knowledge, there's no compaction process for btrfs's trees
> available -- nor, in general, should you need one, as a fully-
> compacted tree would only have to be rearranged when more data is
> added to it, thus slowing the system down after compaction.

If I understand this correctly this means I can skip the balance step
completely.

I might still be doing the balance for that optical viewing pleasure ;).

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

* Re: speeding up slow btrfs filesystem
  2011-12-16 19:53     ` Martin Steigerwald
  2011-12-16 20:58       ` Martin Steigerwald
@ 2011-12-17 11:39       ` Goffredo Baroncelli
  1 sibling, 0 replies; 24+ messages in thread
From: Goffredo Baroncelli @ 2011-12-17 11:39 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs

On Friday, 16 December, 2011 20:53:58 Martin Steigerwald wrote:
> > I found a solution, but requires a bit of setup.
> >
> > The idea is to avoid do perform sync during the package installation.
> > In order to avoid data loss in case of failure, I create a snapshot
> > before the upgrading. If something goes wrong (i.e. a power failure) I
> > rebooot the system from the snapshot. If the installation finish
> > without problem, I flush all the data to the disk and remove the
> > snapshot.
> >
> > For the detail, see a my old post titled "[RFC] aptitude & BTRFS slow"
> > (2011-10-19)
> 
> Sounds more like a workaround to me than a solution.

Sorry but I strongly disagree.

Aptitude was designed for an ordinary filesystem, where the only way to
keep the filesystem consistent is to issue a lot of syncs for every
package. But this does not prevent a half-installed package: think of an
"openoffice" upgrade, where after a power failure you could end up with
neither the old openoffice nor the new one.
With the snapshot, instead, you always have either the old system or the
new system. No half-installed packages.

With BTRFS, I would say that the workaround[*] is using sync, not the
snapshot.

The truth is that BTRFS is different from ext4 (or ext3, xfs, ...). You
can use BTRFS like ext4, but you will find a lot of regressions like this
one.

BTRFS is very different from an ordinary filesystem, and you have to
change some habits to take advantage of its peculiarities.

Using a snapshot during an upgrade opens up a lot of possibilities that
are not available with ext4. With a snapshot you can always go back if
something goes wrong during an upgrade (like strange package
dependencies), or keep the previous configuration to return to in case
of trouble.



[*] Of course this is due to the fact that most filesystems are like
ext4. Supporting BTRFS may not be the highest priority.


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@inwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512


* Re: speeding up slow btrfs filesystem
  2011-12-17 11:38               ` Martin Steigerwald
@ 2011-12-17 11:45                 ` Hugo Mills
  2011-12-17 11:57                   ` Martin Steigerwald
  2011-12-17 16:35                   ` Martin Steigerwald
  0 siblings, 2 replies; 24+ messages in thread
From: Hugo Mills @ 2011-12-17 11:45 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: Sergei Trofimovich, linux-btrfs


On Sat, Dec 17, 2011 at 12:38:07PM +0100, Martin Steigerwald wrote:
> On Saturday, 17 December 2011, Hugo Mills wrote:
> > On Sat, Dec 17, 2011 at 12:09:56PM +0100, Martin Steigerwald wrote:
> > > I think I will scrub / balance / defragment the filesystem after a
> > > backup. But I am not sure in what order.
> > > 
> > > I understand that defragment defragments files. But then what does
> > > balance do? For RAID setup I have seen it distributing data evenly
> > > across drives when I echo > /sys/block/sda/[…]/delete a drive before
> > > and BTRFS had to distribute unevenly cause of that. But what does it
> > > do on a filesystem on a single drive? I bet it would balance out
> > > trees? Will it resize trees with lots of unused space as well?
> > 
> >    The metadata trees are automatically balanced, simply by the nature
> > of the B-tree algorithms used. Balance won't, in general, affect them.
> > The only thing that a balance will achieve on a single-disk filesystem
> > is to reclaim unused space from allocated block groups -- so the
> > "total" value in your Data and Metadata entries below will go down.
> 
> But thats only for optical viewing pleasure as far as I understood you?
> 
> Only if there would be not enough free space for one tree to extend then a 
> balance would make sense? I.e. when I had a lot of metadata so that the 
> metadata would need to extend (which seems unlikely given below figures).

   From the context, I think you're misusing the term "tree" here to
mean "block group type" (i.e. data or metadata).

   That aside, though, yes, you're right, it's effectively only
cosmetic -- although it can be useful if you have a fully-allocated
filesystem where (for example) data is full and there's lots of
metadata space free, and you want to write more data. In that case,
the FS wants to allocate another Data block group, but can't because
there's no raw storage left to allocate from, despite there being lots
of free space in the allocated Metadata block groups. A balance in
that case would free up some of the metadata block groups and allow
that space to be reallocated as data. (I think it tries to do this
anyway, but I'm not 100% sure about that).
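
Whether a filesystem is "fully allocated" in this sense can be read off
the devid line of btrfs filesystem show. A small sketch, assuming both
sizes are printed in GB as in the output quoted in this thread; the
function name is hypothetical.

```shell
#!/bin/sh
# Sketch: report raw space not yet allocated to any block group.
# Assumption: size and used are both printed with a GB suffix.
unallocated_report() {
    awk '$1 == "devid" {
        size = $4; used = $6
        sub(/GB$/, "", size); sub(/GB$/, "", used)
        printf "%s: %.2f GB unallocated of %.2f GB\n", $8, size - used, size
    }'
}

# Sample lines from this thread; live use would be
#   btrfs filesystem show | unallocated_report
unallocated_report <<'EOF'
        devid    1 size 15.00GB used 14.97GB path /dev/dm-0
        devid    1 size 80.00GB used 40.54GB path /dev/dm-2
EOF
```

With only 0.03 GB unallocated on /dev/dm-0, the root filesystem is close
to the situation described above, while /home still has plenty of raw
space to allocate from.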

> > > According to
> > > 
> > > deepdance:~> btrfs filesystem df /
> > > Data: total=11.23GB, used=6.98GB
> > > System, DUP: total=8.00MB, used=4.00KB
> > > System: total=4.00MB, used=0.00
> > > Metadata, DUP: total=1.86GB, used=511.35MB
> > > deepdance:~> btrfs filesystem show
> > > […]
> > > Label: 'debian'  uuid: 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
> > > 
> > >         Total devices 1 FS bytes used 7.48GB
> > >         devid    1 size 15.00GB used 14.97GB path /dev/dm-0
> > > 
> > > Btrfs Btrfs v0.19
> > > 
> > > the filesystem might have had some chances to fragment heavily, cause
> > > the tree sizes add up almost to the 15 GB of space available.
> > > 
> > > I also remember that for some time the filesystem was nearly full
> > > which would explain the tree sizes.
> > 
> >    For metadata, the lower bound on size is 0.1% of the data size
> > (because checksums are computed at 4 bytes for every 4096 bytes of
> > data). However, data usage can be very much greater than this with
> > inline extents, where small files can get embedded directly in the
> > metadata section. This is probably more likely what explains the tree
> > sizes.
> > 
> >    I understand (although I've not done the analysis myself) that the
> > maximum "wasted" space in btrfs's B-tree implementation is 50%. To the
> > best of my knowledge, there's no compaction process for btrfs's trees
> > available -- nor, in general, should you need one, as a fully-
> > compacted tree would only have to be rearranged when more data is
> > added to it, thus slowing the system down after compaction.
> 
> If I understand this correctly this means I can skip the balance step 
> completely.

   Pretty much.

> I might still be doing the balance for that optical viewing pleasure ;).

   :)

   It can't hurt, and with such a small FS it probably won't take
long.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- I'll take your bet, but make it ten thousand francs. I'm only ---  
                       a _poor_ corrupt official.                        


* Re: speeding up slow btrfs filesystem
  2011-12-17 11:45                 ` Hugo Mills
@ 2011-12-17 11:57                   ` Martin Steigerwald
  2011-12-17 16:35                   ` Martin Steigerwald
  1 sibling, 0 replies; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 11:57 UTC (permalink / raw)
  To: Hugo Mills, Sergei Trofimovich, linux-btrfs

Am Samstag, 17. Dezember 2011 schrieb Hugo Mills:
> > >    The metadata trees are automatically balanced, simply by the
> > > nature of the B-tree algorithms used. Balance won't, in general,
> > > affect them. The only thing that a balance will achieve on a
> > > single-disk filesystem is to reclaim unused space from allocated
> > > block groups -- so the "total" value in your Data and Metadata
> > > entries below will go down.
> >
> > But that's only for optical viewing pleasure, as far as I understood
> > you?
> >
> > Only if there were not enough free space for one tree to extend
> > would a balance make sense? I.e. when I had a lot of metadata
> > so that the metadata would need to extend (which seems unlikely
> > given the figures below).
> 
>    From the context, I think you're misusing the term "tree" here to
> mean "block group type" (i.e. data or metadata).
> 
>    That aside, though, yes, you're right, it's effectively only
> cosmetic -- although it can be useful if you have a fully-allocated
> filesystem where (for example) data is full and there's lots of
> metadata space free, and you want to write more data. In that case,
> the FS wants to allocate another Data block group, but can't because
> there's no raw storage left to allocate from, despite there being lots
> of free space in the allocated Metadata block groups. A balance in
> that case would free up some of the metadata block groups and allow
> that space to be reallocated as data. (I think it tries to do this
> anyway, but I'm not 100% sure about that).

Okay, that's the more likely case then ;).

Thanks for clearing that up,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 11:11 ` Chris Samuel
@ 2011-12-17 12:00   ` Martin Steigerwald
  2011-12-17 12:42     ` David McBride
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 12:00 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Chris Samuel

Am Samstag, 17. Dezember 2011 schrieb Chris Samuel:
> On Sat, 17 Dec 2011 04:51:51 AM Martin Steigerwald wrote:
> > Currently I have:
> > 
> > deepdance:~> cat /proc/version
> > Linux version 3.0.0-2-686-pae (Debian 3.0.0-6)
> 
> You are using a fairly old kernel btrfs-wise, I believe there's been
> work done in the 3.2 rc's to improve performance so I'd suggest it's
> well worth testing with 3.2-rc6 to see whether that helps.

I am already using 3.2-rc4 from the Debian package. Currently I do not 
build my own kernels.

I have the subjective impression that after the initial rebuild of the 
inode cache it became faster.

I have the following mount options:

deepdance:~> grep btrfs /proc/mounts
/dev/mapper/deepdance-debian / btrfs rw,relatime,space_cache,inode_cache 0 0
/dev/mapper/deepdance-home /home btrfs rw,relatime,space_cache,inode_cache 0 0

Might be good to use noatime for harddisks as well.

BTW on my ThinkPad T520 I do not perceive performance issues with BTRFS as 
/. But then that's located on an Intel SSD 320, where seeks should not 
matter much.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 12:00   ` Martin Steigerwald
@ 2011-12-17 12:42     ` David McBride
  2011-12-17 16:14       ` Martin Steigerwald
  0 siblings, 1 reply; 24+ messages in thread
From: David McBride @ 2011-12-17 12:42 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs

On Sat, 2011-12-17 at 13:00 +0100, Martin Steigerwald wrote:

> BTW on my ThinkPad T520 I do not perceive performance issues for BTRFS as 
> /. But then thats located on an Intel SSD 320 where seeks should not 
> matter much.

Okay, that would be consistent with the slow behaviour observed by
others on fsync()-heavy workloads.  Presumably this produces much more
seeky IO patterns than current common filesystems; I wonder if this is a
limitation of the current implementation or something that is an
inherent property of the data structures being used?

Cheers,
David
-- 
David McBride <dwm@doc.ic.ac.uk>
Department of Computing, Imperial College, London


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 12:42     ` David McBride
@ 2011-12-17 16:14       ` Martin Steigerwald
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 16:14 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David McBride

Am Samstag, 17. Dezember 2011 schrieb David McBride:
> On Sat, 2011-12-17 at 13:00 +0100, Martin Steigerwald wrote:
> > BTW on my ThinkPad T520 I do not perceive performance issues for
> > BTRFS as /. But then thats located on an Intel SSD 320 where seeks
> > should not matter much.
> 
> Okay, that would be consistent with the slow behaviour observed by
> others on fsync()-heavy workloads.  Presumably this produces much more
> seeky IO patterns than current common filesystems; I wonder if this is
> a limitation of the current implementation or something that is an
> inherent property of the data structures being used?

All I can say is that the ThinkPad T520 doesn't seem the best machine for
testing the performance of software. I have seen nothing that's actually
been slow on that machine yet. It's not a machine for triggering
bottlenecks easily, it seems to me.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 11:45                 ` Hugo Mills
  2011-12-17 11:57                   ` Martin Steigerwald
@ 2011-12-17 16:35                   ` Martin Steigerwald
  2011-12-17 17:27                     ` Hugo Mills
  1 sibling, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 16:35 UTC (permalink / raw)
  To: Hugo Mills, Sergei Trofimovich, linux-btrfs

Am Samstag, 17. Dezember 2011 schrieb Hugo Mills:
> > I might still be doing the balance for that optical viewing pleasure
> > ;).
> 
>    :)
> 
>    It can't hurt, and with such a small FS it probably won't take
> long.

Now I first did a defrag and then a balance. The balance was heavier: I had 
music stalls of about 5 to 10 seconds at a time.

The defrag aborted quickly with a non-zero return code on the second run:

deepdance:~> btrfs filesystem defragment /
^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C

I wanted to start it via time.

deepdance:~> /usr/bin/time btrfs filesystem defragment /
Command exited with non-zero status 20
0.00user 1.26system 0:03.86elapsed 32%CPU (0avgtext+0avgdata 2160maxresident)k
2656inputs+70712outputs (2major+184minor)pagefaults 0swaps

Nothing in dmesg. Does 20 as return code mean "already defragmented"? ;)

I am looking forward to the new asynchronous defrag interface I read about 
somewhere.

Current state now is:

deepdance:~> btrfs filesystem df /                   
Data: total=7.75GB, used=6.91GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=896.00MB, used=506.47MB

Lets see how that fares.
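
For anyone who wants to track these numbers over time, here is a
minimal sketch that parses output in the format shown above and
reports how much allocated space is unused per block group type. It
assumes the "GB"/"MB"/"KB" suffixes are powers of 1024 and that a
value without a suffix is in bytes; the names parse_df and Row are
made up for this example and are not part of btrfs-progs.

```python
import re
from typing import List, NamedTuple

# Assumption: the units printed by "btrfs filesystem df" are binary multiples.
UNIT_BYTES = {None: 1, "B": 1, "KB": 1024, "MB": 1024**2,
              "GB": 1024**3, "TB": 1024**4}

LINE_RE = re.compile(
    r"^(?P<kind>[^:]+):\s*"
    r"total=(?P<total>[\d.]+)(?P<total_unit>[KMGT]?B)?,\s*"
    r"used=(?P<used>[\d.]+)(?P<used_unit>[KMGT]?B)?$"
)


class Row(NamedTuple):
    kind: str   # e.g. "Data" or "Metadata, DUP"
    total: int  # bytes allocated to block groups of this type
    used: int   # bytes actually used inside those block groups


def parse_df(text: str) -> List[Row]:
    """Parse 'btrfs filesystem df' output; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        total = int(float(m["total"]) * UNIT_BYTES[m["total_unit"]])
        used = int(float(m["used"]) * UNIT_BYTES[m["used_unit"]])
        rows.append(Row(m["kind"], total, used))
    return rows


if __name__ == "__main__":
    sample = """\
Data: total=7.75GB, used=6.91GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=896.00MB, used=506.47MB
"""
    for row in parse_df(sample):
        slack = (row.total - row.used) / 1024**2
        print("%-14s allocated but unused: %8.2f MiB" % (row.kind, slack))
```

The "total" column is what a balance can shrink; "used" is not
affected by it.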

Balance did log something:

[24065.740937] btrfs: found 4207 extents
[24075.581494] btrfs: found 4207 extents
[24077.982099] btrfs: relocating block group 24465375232 flags 1
[24090.418623] btrfs: found 1152 extents
[24099.195646] btrfs: found 1152 extents
[24100.994087] btrfs: relocating block group 24196939776 flags 1
[24124.823654] btrfs: found 3857 extents
[24140.208385] btrfs: found 3857 extents
[24142.334232] btrfs: relocating block group 23928504320 flags 1
[24164.219827] btrfs: found 534 extents
[24171.483027] btrfs: found 534 extents
[24176.021604] btrfs: relocating block group 23391633408 flags 1
[24230.123062] btrfs: found 8607 extents
[24255.193673] btrfs: found 8607 extents
[24258.142945] btrfs: relocating block group 22586327040 flags 1
[24271.875868] btrfs: relocating block group 22452109312 flags 36
[24322.334007] btrfs: found 19112 extents
[24324.253074] btrfs: relocating block group 22317891584 flags 36
[24361.999904] btrfs: found 6934 extents
[24362.927413] btrfs: relocating block group 22183673856 flags 36
[24393.151548] btrfs: found 9031 extents
[24395.447755] btrfs: relocating block group 22049456128 flags 36
[24432.611355] btrfs: found 13216 extents
[24435.508280] btrfs: relocating block group 20975714304 flags 1
[24574.903545] btrfs: found 14600 extents
[24642.613698] btrfs: found 14586 extents
[24647.144462] btrfs: relocating block group 20841496576 flags 36
[24730.473343] btrfs: found 19754 extents
[24735.912210] btrfs: relocating block group 20707278848 flags 36
[24852.827906] btrfs: found 26482 extents
[24853.838002] btrfs: relocating block group 20698890240 flags 34
[24854.825685] btrfs: found 1 extents
[24855.858015] btrfs: relocating block group 20564672512 flags 36
[25001.321705] btrfs: found 31648 extents
[25002.330616] btrfs: relocating block group 20430454784 flags 36
[25170.694953] btrfs: found 30709 extents
[25173.027484] btrfs: relocating block group 20296237056 flags 36
[25240.022780] btrfs: found 19729 extents
[25242.373217] btrfs: relocating block group 20162019328 flags 36
[25293.659547] btrfs: found 11857 extents
[25294.514415] btrfs: relocating block group 20027801600 flags 36
[25381.873449] btrfs: found 20892 extents
[25382.837313] btrfs: relocating block group 18954059776 flags 1
[25407.731124] btrfs: relocating block group 17880317952 flags 1
[25528.179185] btrfs: found 13850 extents
[25572.737920] btrfs: found 13831 extents
[25574.017807] btrfs: found 1 extents
[25577.603801] btrfs: relocating block group 16806576128 flags 1
[25667.266953] btrfs: found 2448 extents
[25689.503862] btrfs: found 2448 extents
[25691.924348] btrfs: relocating block group 15732834304 flags 1
[25796.270409] btrfs: found 11264 extents
[25838.860555] btrfs: found 11264 extents
[25843.971106] btrfs: relocating block group 14659092480 flags 1
[25959.486034] btrfs: found 18680 extents
[26037.370148] btrfs: found 18680 extents
[26040.637078] btrfs: relocating block group 13585350656 flags 1
[26131.997384] btrfs: found 26798 extents
[26211.759652] btrfs: found 26787 extents
[26215.846016] btrfs: relocating block group 12511608832 flags 1
[26331.196068] btrfs: found 33247 extents
[26470.846542] btrfs: found 33197 extents
[26479.487194] btrfs: relocating block group 12377391104 flags 36
[26503.391492] btrfs: found 4410 extents
[26507.133189] btrfs: relocating block group 11303649280 flags 1
[26607.401285] btrfs: found 32999 extents
[26770.759705] btrfs: found 32926 extents
[26778.218628] btrfs: relocating block group 11169431552 flags 36
[26921.757006] btrfs: found 23449 extents
[26922.956668] btrfs: relocating block group 11035213824 flags 36
[27047.652332] btrfs: found 21526 extents

It appears quite fragmented to me, but as I do not understand what exactly 
is behind these numbers, I will leave it as it is.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 16:35                   ` Martin Steigerwald
@ 2011-12-17 17:27                     ` Hugo Mills
  0 siblings, 0 replies; 24+ messages in thread
From: Hugo Mills @ 2011-12-17 17:27 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: Sergei Trofimovich, linux-btrfs

On Sat, Dec 17, 2011 at 05:35:15PM +0100, Martin Steigerwald wrote:
> Am Samstag, 17. Dezember 2011 schrieb Hugo Mills:
> > > I might still be doing the balance for that optical viewing pleasure
> > > ;).
> > 
> >    :)
> > 
> >    It can't hurt, and with such a small FS it probably won't take
> > long.
> 
> Now I first did a defrag and then a balance. The balance was heavier: I had 
> music stalls of about 5 to 10 seconds at a time.
> 
> The defrag aborted quickly with a non-zero return code on the second run:
> 
> deepdance:~> btrfs filesystem defragment /
> ^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C
> 
> I wanted to start it via time.
> 
> deepdance:~> /usr/bin/time btrfs filesystem defragment /
> Command exited with non-zero status 20
> 0.00user 1.26system 0:03.86elapsed 32%CPU (0avgtext+0avgdata 2160maxresident)k
> 2656inputs+70712outputs (2major+184minor)pagefaults 0swaps
> 
> Nothing in dmesg. Does 20 as return code mean "already defragmented"? ;)

   I'd have to check what return code 20 means, but... btrfs fi defrag
is *not* recursive, so what you did is effectively a no-op anyway.
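
Since the command only acts on the path it is given, one way to cover a
whole tree is to walk it and call defragment once per file. The sketch
below does that; it is an illustration only. The helper names are
invented, it stays on one filesystem (like find -xdev), and the run
parameter exists so the walk can be tried without actually invoking
btrfs. Newer btrfs-progs releases have, as far as I know, a -r option
that makes a wrapper like this unnecessary.

```python
import os
import stat
import subprocess
from typing import Callable, Iterator, List


def regular_files(root: str) -> Iterator[str]:
    """Yield regular files below root without crossing filesystem boundaries."""
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories that live on another filesystem.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # file vanished while walking
            if stat.S_ISREG(mode):
                yield path


def defragment_tree(root: str,
                    run: Callable[[List[str]], int] = subprocess.call) -> int:
    """Call 'btrfs filesystem defragment' per file; return the failure count."""
    failures = 0
    for path in regular_files(root):
        if run(["btrfs", "filesystem", "defragment", path]) != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    defragment_tree(".", run=lambda cmd: print(" ".join(cmd)) or 0)
```

Expect this to be slow and I/O heavy on a rotating disk, so it is
probably best run while the machine is otherwise idle.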

> I am looking forward to the new asynchronous defrag interface I read about 
> somewhere.
> 
> Current state now is:
> 
> deepdance:~> btrfs filesystem df /                   
> Data: total=7.75GB, used=6.91GB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=896.00MB, used=506.47MB
> 
> Lets see how that fares.
> 
> Balance did log something:
> 
> [24065.740937] btrfs: found 4207 extents
> [24075.581494] btrfs: found 4207 extents
> [24077.982099] btrfs: relocating block group 24465375232 flags 1
[snip]
> [24730.473343] btrfs: found 19754 extents
> [24735.912210] btrfs: relocating block group 20707278848 flags 36
> [24852.827906] btrfs: found 26482 extents
> [24853.838002] btrfs: relocating block group 20698890240 flags 34
[snip]

> It appears quite fragmented to me, but as I do not understand what exactly 
> is behind these numbers, I will leave it as it is.

   The long numbers are block group IDs. These correspond to a
position in the FS's internal address space (which doesn't, in the
general case, map directly to anything -- there's an internal tree
that holds the map). The flags indicate what type of block group is
being moved. These correspond to the line headings in "btrfs fi df",
and are a bitmap. "flags 1" is a non-RAIDed data block group. "flags
34" is a DUP system block group, and "flags 36" is a DUP metadata
block group. You'll probably find a single reference to a block group
with flags 2, which is the vestigial non-RAID System group you can see
in your "btrfs fi df" output above.

   Extents are simply contiguous regions of storage, corresponding to
parts (or all) of a file, or to individual tree blocks (which are 4k
in size). The "found <N> extents" messages just indicate how many
extents there are to move in the block group it's currently looking
at.
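
To make the bitmap concrete, here is a small sketch that decodes the
"flags N" values from the balance log. The values 1, 2, 34 and 36 are
the ones explained above; the bit positions for RAID0, RAID1 and
RAID10 are taken from my reading of the kernel headers and should be
treated as an assumption here. decode_flags and summarize are names
invented for this example.

```python
import re
from collections import Counter
from typing import List

# Block group flag bits. Data/System/Metadata/DUP follow from the values
# explained above; the RAID bits are an assumption from the kernel headers.
BLOCK_GROUP_FLAGS = [
    (1 << 0, "Data"),
    (1 << 1, "System"),
    (1 << 2, "Metadata"),
    (1 << 3, "RAID0"),
    (1 << 4, "RAID1"),
    (1 << 5, "DUP"),
    (1 << 6, "RAID10"),
]

RELOC_RE = re.compile(r"relocating block group (\d+) flags (\d+)")


def decode_flags(flags: int) -> List[str]:
    """Return the names of all bits set in a block group flags value."""
    names = [name for bit, name in BLOCK_GROUP_FLAGS if flags & bit]
    known = sum(bit for bit, _ in BLOCK_GROUP_FLAGS)
    if flags & ~known:
        names.append("unknown(0x%x)" % (flags & ~known))
    return names


def summarize(log: str) -> Counter:
    """Count relocated block groups per decoded type in a dmesg excerpt."""
    counts = Counter()
    for _start, flags in RELOC_RE.findall(log):
        counts["+".join(decode_flags(int(flags)))] += 1
    return counts


if __name__ == "__main__":
    excerpt = """\
[24077.982099] btrfs: relocating block group 24465375232 flags 1
[24735.912210] btrfs: relocating block group 20707278848 flags 36
[24853.838002] btrfs: relocating block group 20698890240 flags 34
"""
    for kind, n in summarize(excerpt).items():
        print(kind, n)
```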

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
       --- "I lost my leg in 1942.  Some bastard stole it in a ---       
                            pub in Pimlico."                             

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-16 18:38   ` Goffredo Baroncelli
  2011-12-16 19:53     ` Martin Steigerwald
@ 2011-12-18 18:41     ` Andrea Gelmini
  2011-12-20 19:46       ` Goffredo Baroncelli
  1 sibling, 1 reply; 24+ messages in thread
From: Andrea Gelmini @ 2011-12-18 18:41 UTC (permalink / raw)
  To: Goffredo Baroncelli; +Cc: Martin Steigerwald, linux-btrfs

2011/12/16 Goffredo Baroncelli <kreijack@inwind.it>:
> I found a solution, but requires a bit of setup.

Did you try:
echo force-unsafe-io >> /etc/dpkg/dpkg.cfg

You need dpkg 1.16.

Ciao,
Andrea

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-18 18:41     ` Andrea Gelmini
@ 2011-12-20 19:46       ` Goffredo Baroncelli
  0 siblings, 0 replies; 24+ messages in thread
From: Goffredo Baroncelli @ 2011-12-20 19:46 UTC (permalink / raw)
  To: Andrea Gelmini; +Cc: linux-btrfs

Ciao Andrea,

On Sunday, 18 December, 2011 19:41:49 you wrote:
> 2011/12/16 Goffredo Baroncelli <kreijack@inwind.it>:
> > I found a solution, but requires a bit of setup.
> 
> Did you try:
> echo force-unsafe-io >> /etc/dpkg/dpkg.cfg

Stracing an apt-get update, it seems that --force-unsafe-io doesn't stop all 
the sync calls.

See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=613428

> You need dpkg 1.16.

# dpkg --version
Debian `dpkg' package management program version 1.16.1.2 (amd64).
This is free software; see the GNU General Public License version 2 or
later for copying conditions. There is NO warranty.



> 
> Ciao,
> Andrea
-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@inwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 12:50 ` Goffredo Baroncelli
@ 2011-12-17 16:10   ` Martin Steigerwald
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 16:10 UTC (permalink / raw)
  To: Goffredo Baroncelli, linux-btrfs

Am Samstag, 17. Dezember 2011 schrieb Goffredo Baroncelli:
> > Adding new possibilities is one thing. And supporting snapshots
> > properly would depend on some support side from the applications. I
> > think that using snapshots for upgrades is a good idea.
> >
> > But OTOH I think that BTRFS should not break or slow down existing
> > userspace. I think that existing approaches like using fsync() like
> > according to quite some filesystem developers it should be used
> > should continue to work nicely.
> 
> Nobody wants to slow down the application. But life is full of
> compromises. If you want the speed of ext4, you can use ext4. If you
> want the snapshot capability and the COW guarantee you can use BTRFS,
> but you have some slowness.
> 
> Of course the best would be to have the speed of ext4 with the
> capabilities of btrfs.... :-) Unfortunately this is not available
> today.

It's perfectly acceptable for me that BTRFS does not deliver this yet.

I understood your initial answer as saying that BTRFS is simply different
and thus performs poorly in fsync()-based workloads, and that's about it:
that it's an issue in principle. That part I didn't agree with. Heck, given
the design differences of a COW filesystem it might even be some sort of
issue in principle. But then I like to see this as a challenge, not as a
show stopper.

Actually for me, especially for that Amarok ThinkPad T23, there is no hurry.
It's my BTRFS play machine. I just want to see what I can do with
filesystem maintenance to bring it up to speed.

Everything else is following development and upgrading kernels ;).

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 11:54 Martin Steigerwald
  2011-12-17 12:02 ` Martin Steigerwald
@ 2011-12-17 12:50 ` Goffredo Baroncelli
  2011-12-17 16:10   ` Martin Steigerwald
  1 sibling, 1 reply; 24+ messages in thread
From: Goffredo Baroncelli @ 2011-12-17 12:50 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs

On Saturday, 17 December, 2011 12:54:47 you wrote:
[...]
> 
> This reminds me of the delayed allocation discussion as Ext4 introduced
> that feature.
> 
> Ext3/4 developer Theodore Ts'o said if the applications are not using
> fsync() its their fault. But before OTOH applications began to avoid using
> fsync() since it has had serious performance drawbacks on ext3 (not ext4)
> with data=ordered.
> 
> Ext4 now has workarounds for the rename and truncate cases, after Linus
> requested boldly to not break existing userspace.

IIRC the problem there was data loss. You, instead, are (correctly)
complaining about slowness. These are two very different problems.

> Now applications that use fsync() the way Theodore Ts'o and other see it
> correctly used should now skip the fsync() on a BTRFS?

I never said not to use the fsync() call. I am only arguing that, for a
package manager, the fsync() call is not the best API.

Package managers were designed with the capabilities of the old filesystems
in mind. At the time, the sync(2) API was the only one available.
With this API it is impossible to have an atomic (all-or-nothing) upgrade
of a package.

With the new filesystems (BTRFS and ZFS), package managers have more
options. They can create a snapshot (of the old filesystem) at the
beginning and roll back if something goes wrong (I am simplifying a bit).
But the package managers have to be updated.

As a bonus you can avoid using sync(2), which has performance drawbacks
(especially with BTRFS).
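
To make the sequence concrete, a minimal sketch of such a wrapper could
look like the following. It is not what any package manager does today.
The snapshot path, the function names and the default upgrade command
are placeholders for this example, and how the package manager is told
to skip its own syncs is left to the caller. The only external commands
used are "btrfs subvolume snapshot", "btrfs subvolume delete" and
"sync".

```python
import subprocess
from typing import Callable, List, Sequence


def upgrade_plan(root: str = "/",
                 snapshot: str = "/.pre-upgrade",  # hypothetical location
                 upgrade_cmd: Sequence[str] = ("apt-get", "dist-upgrade"),
                 ) -> List[List[str]]:
    """Build the command sequence: snapshot, upgrade, flush, drop snapshot."""
    return [
        ["btrfs", "subvolume", "snapshot", root, snapshot],
        list(upgrade_cmd),
        ["sync"],  # one flush at the end instead of many during the upgrade
        ["btrfs", "subvolume", "delete", snapshot],
    ]


def run_plan(plan: List[List[str]],
             run: Callable[[List[str]], int] = subprocess.call) -> bool:
    """Run the steps in order and stop at the first failure.

    On failure the snapshot is deliberately left in place, so that the
    administrator can boot from it or copy files back.
    """
    for cmd in plan:
        if run(cmd) != 0:
            return False
    return True


if __name__ == "__main__":
    # Dry run: only print what would be executed.
    run_plan(upgrade_plan(), run=lambda cmd: print(" ".join(cmd)) or 0)
```

The rollback itself (booting from the snapshot after a power failure)
is not covered here, since it depends on the boot setup.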

[...]


> 
> > Using the snapshot during an upgrade open a lot of possibility which
> > are not allowed with EXT4. With snapshot you can always go back if
> > during an upgrade if something goes wrong (like strange packages
> > dependencies). Or you can have the previous configuration to go back
> > in case of trouble.
> 
> Adding new possibilities is one thing. And supporting snapshots properly
> would depend on some support side from the applications. I think that
> using snapshots for upgrades is a good idea.
> 
> But OTOH I think that BTRFS should not break or slow down existing
> userspace. I think that existing approaches like using fsync() like
> according to quite some filesystem developers it should be used should
> continue to work nicely.

Nobody wants to slow down the application. But life is full of compromises.
If you want the speed of ext4, you can use ext4. If you want the snapshot
capability and the COW guarantee you can use BTRFS, but you have some
slowness.

Of course the best would be to have the speed of ext4 with the capabilities
of btrfs.... :-) Unfortunately this is not available today.

[....]


> 
> Thanks,
Regards
-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@inwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
  2011-12-17 11:54 Martin Steigerwald
@ 2011-12-17 12:02 ` Martin Steigerwald
  2011-12-17 12:50 ` Goffredo Baroncelli
  1 sibling, 0 replies; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 12:02 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goffredo Baroncelli

Am Samstag, 17. Dezember 2011 schrieb Martin Steigerwald:
> If BTRFS has other means to guarantee filesystem consistency that is
> faster  it might still make fsync() a no-op or just creating a
> snapshot temporarily automatically.

To clear this up: It should only make it a no-op if it guarantees the 
consistency without it. Otherwise it should do whatever is necessary to 
guarantee it.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: speeding up slow btrfs filesystem
@ 2011-12-17 11:54 Martin Steigerwald
  2011-12-17 12:02 ` Martin Steigerwald
  2011-12-17 12:50 ` Goffredo Baroncelli
  0 siblings, 2 replies; 24+ messages in thread
From: Martin Steigerwald @ 2011-12-17 11:54 UTC (permalink / raw)
  To: Goffredo Baroncelli, linux-btrfs

Am Samstag, 17. Dezember 2011 schrieben Sie:
> On Friday, 16 December, 2011 20:53:58 Martin Steigerwald wrote:
> > > I found a solution, but it requires a bit of setup.
> > > 
> > > The idea is to avoid performing sync during the package
> > > installation. In order to avoid data loss in case of failure, I
> > > create a snapshot before upgrading. If something goes wrong
> > > (e.g. a power failure) I reboot the system from the snapshot. If
> > > the installation finishes without problems, I flush all the data to
> > > the disk and remove the snapshot.
> > > 
> > > For details, see my old post titled "[RFC] aptitude & BTRFS
> > > slow" (2011-10-19)
> > 
> > Sounds more like a workaround to me than a solution.
> 
> Sorry but I strongly disagree.
> 
> Aptitude was designed for an ordinary filesystem, where the only way to
> have filesystem consistency is to issue a lot of syncs for every
> package. But this doesn't prevent a half-installed package (think
> about an "openoffice" upgrade: in case of power failure, you could end
> up with neither the old openoffice nor the new one).
> Instead, with the snapshot you can always have the old system or the new
> system. No half packages.
> 
> With BTRFS, I can say that the workaround[*] is using the sync and not
> the snapshot.
> 
> The truth is that BTRFS is different from ext4 (or ext3, xfs....). You
> can use BTRFS like ext4 and you will find a lot of regressions like
> this.
> 
> BTRFS is very different from an ordinary filesystem, and you have to
> change some behaviour to take advantage of its peculiarities.

This reminds me of the delayed allocation discussion when Ext4 introduced
that feature.

Ext3/4 developer Theodore Ts'o said that if applications are not using
fsync(), it's their fault. But OTOH, before that, applications had begun to
avoid using fsync() since it had serious performance drawbacks on ext3
(not ext4) with data=ordered.

Ext4 now has workarounds for the rename and truncate cases, after Linus
boldly requested not to break existing userspace.

Now applications that use fsync() the way Theodore Ts'o and others see it
as correctly used should skip the fsync() on BTRFS?

I find it *highly* problematic when applications are required to adapt
their behavior depending on the filesystem in use.

This just doesn't make sense to me.

If BTRFS has other means to guarantee filesystem consistency that are
faster, it might still make fsync() a no-op or just create a snapshot
temporarily and automatically.

> Using the snapshot during an upgrade open a lot of possibility which
> are not allowed with EXT4. With snapshot you can always go back if
> during an upgrade if something goes wrong (like strange packages
> dependencies). Or you can have the previous configuration to go back
> in case of trouble.

Adding new possibilities is one thing. And supporting snapshots properly
would depend on some support from the application side. I think that
using snapshots for upgrades is a good idea.

But OTOH I think that BTRFS should not break or slow down existing
userspace. I think that existing approaches, like using fsync() the way
quite a few filesystem developers say it should be used, should
continue to work nicely.

Similar goes for the hardlink limit.

> [*] Of course this is due to the fact that the most part of the
> filesystem is like ext4. Supporting BTRFS could be not the highest
> priority.

I do think that a

if fs=ext4 then do this

if fs=btrfs then do this and

if fs=ext3 + data=ordered then do this

if fs=ext3 + data=ordered + kernel=whatnot then do it a tad bit differently

if fs=unknown then assume this

in an application is just kind of broken, and I always thought that one
main task of a filesystem would be to lift the burden of the details of
how data is saved from the application.

Ok, some guidelines might be needed, like that saving 10 bytes 1000 times
might be less performant than saving 10000 bytes at once, but aside
from that...

So I think BTRFS should have a fast fsync - one that fulfils the consistency
guarantee in whatever compatible way it sees fit - and for the system
partition I would even trade in the COW functionality. I didn't have it
with Ext4 anyway.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2011-12-20 19:46 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-16 17:51 speeding up slow btrfs filesystem Martin Steigerwald
2011-12-16 17:54 ` Martin Steigerwald
2011-12-16 18:38   ` Goffredo Baroncelli
2011-12-16 19:53     ` Martin Steigerwald
2011-12-16 20:58       ` Martin Steigerwald
2011-12-17  7:03         ` Sergei Trofimovich
2011-12-17 11:09           ` Martin Steigerwald
2011-12-17 11:26             ` Hugo Mills
2011-12-17 11:38               ` Martin Steigerwald
2011-12-17 11:45                 ` Hugo Mills
2011-12-17 11:57                   ` Martin Steigerwald
2011-12-17 16:35                   ` Martin Steigerwald
2011-12-17 17:27                     ` Hugo Mills
2011-12-17 11:39       ` Goffredo Baroncelli
2011-12-18 18:41     ` Andrea Gelmini
2011-12-20 19:46       ` Goffredo Baroncelli
2011-12-17 11:11 ` Chris Samuel
2011-12-17 12:00   ` Martin Steigerwald
2011-12-17 12:42     ` David McBride
2011-12-17 16:14       ` Martin Steigerwald
2011-12-17 11:54 Martin Steigerwald
2011-12-17 12:02 ` Martin Steigerwald
2011-12-17 12:50 ` Goffredo Baroncelli
2011-12-17 16:10   ` Martin Steigerwald
