* netapp-alike snapshots?
@ 2017-08-22 13:22 Ulli Horlacher
  2017-08-22 13:44 ` Peter Becker
  2017-09-09 13:26 ` Ulli Horlacher
  0 siblings, 2 replies; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 13:22 UTC (permalink / raw)
  To: linux-btrfs

With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
You can find these snapshots in every local directory (readonly).
Example:

framstag@fex:/sw/share: ll .snapshot/
drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-15_0010
drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-16_0010
drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-17_0010
drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-18_0010
drwxr-xr-x  framstag root - 2017-08-18 23:59:29 .snapshot/daily.2017-08-19_0010
drwxr-xr-x  framstag root - 2017-08-19 21:01:25 .snapshot/daily.2017-08-20_0010
drwxr-xr-x  framstag root - 2017-08-20 19:48:40 .snapshot/daily.2017-08-21_0010
drwxr-xr-x  framstag root - 2017-08-20 02:50:18 .snapshot/hourly.2017-08-20_1210
drwxr-xr-x  framstag root - 2017-08-20 02:50:18 .snapshot/hourly.2017-08-20_1610
drwxr-xr-x  framstag root - 2017-08-20 19:48:40 .snapshot/hourly.2017-08-20_2010
drwxr-xr-x  framstag root - 2017-08-21 00:42:28 .snapshot/hourly.2017-08-21_0810
drwxr-xr-x  framstag root - 2017-08-21 00:42:28 .snapshot/hourly.2017-08-21_1210
drwxr-xr-x  framstag root - 2017-08-21 13:05:28 .snapshot/hourly.2017-08-21_1610

I would like to have something similar with btrfs.
Programming such a feature is not a problem for me in general, but I think I
am not the first one who wants this kind of auto-snapshotting.
Does such a tool already exist (and where)?

I know snapper, but it has a totally different approach.
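
Roughly what I have in mind, as a sketch (assuming /sw/share is a btrfs
subvolume; expiry of old snapshots would have to be handled elsewhere):

  # crontab entry: read-only snapshot at 10 minutes past every hour
  10 * * * * btrfs subvolume snapshot -r /sw/share /sw/share/.snapshot/hourly.$(date +\%F_\%H\%M)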

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20170822132208.GD14804@rus.uni-stuttgart.de>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 13:22 netapp-alike snapshots? Ulli Horlacher
@ 2017-08-22 13:44 ` Peter Becker
  2017-08-22 14:24   ` Ulli Horlacher
  2017-09-09 13:26 ` Ulli Horlacher
  1 sibling, 1 reply; 44+ messages in thread
From: Peter Becker @ 2017-08-22 13:44 UTC (permalink / raw)
  To: linux-btrfs

I use: https://github.com/jf647/btrfs-snap

2017-08-22 15:22 GMT+02:00 Ulli Horlacher <framstag@rus.uni-stuttgart.de>:
> With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> You can find these snapshots in every local directory (readonly).
> Example:
>
> framstag@fex:/sw/share: ll .snapshot/
> drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-15_0010
> drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-16_0010
> drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-17_0010
> drwxr-xr-x  framstag root - 2017-08-14 10:21:47 .snapshot/daily.2017-08-18_0010
> drwxr-xr-x  framstag root - 2017-08-18 23:59:29 .snapshot/daily.2017-08-19_0010
> drwxr-xr-x  framstag root - 2017-08-19 21:01:25 .snapshot/daily.2017-08-20_0010
> drwxr-xr-x  framstag root - 2017-08-20 19:48:40 .snapshot/daily.2017-08-21_0010
> drwxr-xr-x  framstag root - 2017-08-20 02:50:18 .snapshot/hourly.2017-08-20_1210
> drwxr-xr-x  framstag root - 2017-08-20 02:50:18 .snapshot/hourly.2017-08-20_1610
> drwxr-xr-x  framstag root - 2017-08-20 19:48:40 .snapshot/hourly.2017-08-20_2010
> drwxr-xr-x  framstag root - 2017-08-21 00:42:28 .snapshot/hourly.2017-08-21_0810
> drwxr-xr-x  framstag root - 2017-08-21 00:42:28 .snapshot/hourly.2017-08-21_1210
> drwxr-xr-x  framstag root - 2017-08-21 13:05:28 .snapshot/hourly.2017-08-21_1610
>
> I would like to have something similar with btrfs.
> Programming such a feature is not a problem for me in general, but I think I
> am not the first one who wants this kind of auto-snapshotting.
> Does such a tool already exist (and where)?
>
> I know snapper, but it has a totally different approach.
>
> --
> Ullrich Horlacher              Server und Virtualisierung
> Rechenzentrum TIK
> Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
> Allmandring 30a                Tel:    ++49-711-68565868
> 70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
> REF:<20170822132208.GD14804@rus.uni-stuttgart.de>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 13:44 ` Peter Becker
@ 2017-08-22 14:24   ` Ulli Horlacher
  2017-08-22 16:08     ` Peter Becker
  2017-08-22 16:45     ` Roman Mamedov
  0 siblings, 2 replies; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 14:24 UTC (permalink / raw)
  To: linux-btrfs

On Tue 2017-08-22 (15:44), Peter Becker wrote:
> I use: https://github.com/jf647/btrfs-snap
> 
> 2017-08-22 15:22 GMT+02:00 Ulli Horlacher <framstag@rus.uni-stuttgart.de>:
> > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> > You can find these snapshots in every local directory (readonly).
> > Example:
> >
> > framstag@fex:/sw/share: ll .snapshot/
> > [snapshot listing snipped]

btrfs-snap does not create local .snapshot/ sub-directories, but saves the
snapshots in the top-level root volume directory.



-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<CAEtw4r1cDuE0F_MXe6mHDUr03BhEueBWeq7YaRYjNhhf=UD6KQ@mail.gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 14:24   ` Ulli Horlacher
@ 2017-08-22 16:08     ` Peter Becker
  2017-08-22 16:48       ` Ulli Horlacher
  2017-08-22 16:45     ` Roman Mamedov
  1 sibling, 1 reply; 44+ messages in thread
From: Peter Becker @ 2017-08-22 16:08 UTC (permalink / raw)
  To: linux-btrfs

This is possible. Use the -b or -B option.

-b basedir places the snapshot in basedir with a directory structure
that mimics the mountpoint
-B basedir places the snapshots in basedir with NO additional
subdirectory structure
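
For example, driven from cron (a sketch; check the exact argument order
against the btrfs-snap README):

  # hourly snapshots of /home, kept out of the tree under /snapshots, 24 retained
  0 * * * * /usr/local/bin/btrfs-snap -B /snapshots /home hourly 24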

2017-08-22 16:24 GMT+02:00 Ulli Horlacher <framstag@rus.uni-stuttgart.de>:
> On Tue 2017-08-22 (15:44), Peter Becker wrote:
>> I use: https://github.com/jf647/btrfs-snap
>>
>> 2017-08-22 15:22 GMT+02:00 Ulli Horlacher <framstag@rus.uni-stuttgart.de>:
>> > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
>> > You can find these snapshots in every local directory (readonly).
>> > Example:
>> >
>> > framstag@fex:/sw/share: ll .snapshot/
>> > [snapshot listing snipped]
>
> btrfs-snap does not create local .snapshot/ sub-directories, but saves the
> snapshots in the top-level root volume directory.
>
>
>
> --
> Ullrich Horlacher              Server und Virtualisierung
> Rechenzentrum TIK
> Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
> Allmandring 30a                Tel:    ++49-711-68565868
> 70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
> REF:<CAEtw4r1cDuE0F_MXe6mHDUr03BhEueBWeq7YaRYjNhhf=UD6KQ@mail.gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 14:24   ` Ulli Horlacher
  2017-08-22 16:08     ` Peter Becker
@ 2017-08-22 16:45     ` Roman Mamedov
  2017-08-22 16:57       ` Ulli Horlacher
  1 sibling, 1 reply; 44+ messages in thread
From: Roman Mamedov @ 2017-08-22 16:45 UTC (permalink / raw)
  To: Ulli Horlacher; +Cc: linux-btrfs

On Tue, 22 Aug 2017 16:24:51 +0200
Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:

> On Tue 2017-08-22 (15:44), Peter Becker wrote:
> > I use: https://github.com/jf647/btrfs-snap
> > 
> > 2017-08-22 15:22 GMT+02:00 Ulli Horlacher <framstag@rus.uni-stuttgart.de>:
> > > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> > > You can find these snapshots in every local directory (readonly).
> > > Example:
> > >
> > > framstag@fex:/sw/share: ll .snapshot/
> > > [snapshot listing snipped]
> 
> btrfs-snap does not create local .snapshot/ sub-directories, but saves the
> snapshots in the top-level root volume directory.

It is beneficial to not have snapshots in-place. With a local directory of
snapshots, issuing things like "find", "grep -r" or even "du" will take an
inordinate amount of time and will produce a result you do not expect.

For some of those tools the problem can be avoided (by always keeping in mind
to use "-x" with du, or "--one-file-system" with tar), but not for all of them.

Personally I prefer to have a /snapshots directory on every FS, where e.g. timed
snapshots of /home/username/src will live in /snapshots/home-username-src/. No
point in hiding it there with a dot either, as it's convenient to be able to
browse older snapshots with GUI file managers (which hide dot-files by default).

-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 16:08     ` Peter Becker
@ 2017-08-22 16:48       ` Ulli Horlacher
  0 siblings, 0 replies; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 16:48 UTC (permalink / raw)
  To: linux-btrfs

On Tue 2017-08-22 (18:08), Peter Becker wrote:
> This is possible. Use the -b or -B option.
> 
> -b basedir places the snapshot in basedir with a directory structure
> that mimics the mountpoint
> -B basedir places the snapshots in basedir with NO additional
> subdirectory structure
> 
> 2017-08-22 16:24 GMT+02:00 Ulli Horlacher <framstag@rus.uni-stuttgart.de>:
> > On Tue 2017-08-22 (15:44), Peter Becker wrote:
> >> I use: https://github.com/jf647/btrfs-snap
> >>
> >> 2017-08-22 15:22 GMT+02:00 Ulli Horlacher <framstag@rus.uni-stuttgart.de>:
> >> > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> >> > You can find these snapshots in every local directory (readonly).
> >> > Example:
> >> >
> >> > framstag@fex:/sw/share: ll .snapshot/
> >> > [snapshot listing snipped]
> >
> > btrfs-snap does not create local .snapshot/ sub-directories, but saves the
> > snapshots in the top-level root volume directory.

No, I want a subdirectory named .snapshot in EVERY directory of the source
tree, for example:

framstag@fex:/sw/share: ll .snapshot a*/.snapshot a*/*/.snapshot
drwxrwxrwx  root     root     - 2017-08-22 16:10:01 .snapshot
drwxrwxrwx  root     root     - 2017-08-22 16:10:01 aggis-1.0/.snapshot
drwxrwxrwx  root     root     - 2017-08-22 16:10:01 aggis-1.0/bin/.snapshot
drwxrwxrwx  root     root     - 2017-08-22 16:10:01 aggis-1.0/man/.snapshot

(this is on a Netapp NFS volume)

btrfs-snap creates the snapshot directory tree under a different path.


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<CAEtw4r1Dj4APzHuLfKR6LDNkjoDvX89mV-92Q0ntNueR+sss_Q@mail.gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 16:45     ` Roman Mamedov
@ 2017-08-22 16:57       ` Ulli Horlacher
  2017-08-22 17:19         ` A L
  2017-08-22 17:36         ` netapp-alike snapshots? Roman Mamedov
  0 siblings, 2 replies; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 16:57 UTC (permalink / raw)
  To: linux-btrfs

On Tue 2017-08-22 (21:45), Roman Mamedov wrote:

> It is beneficial to not have snapshots in-place. With a local directory of
> snapshots, issuing things like "find", "grep -r" or even "du" will take an
> inordinate amount of time and will produce a result you do not expect.

Netapp snapshots are invisible to tools doing opendir()/readdir().
One could simulate this with symlinks for the snapshot directory:
store the snapshot elsewhere (not in place) and create a symlink to it in
every directory.
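
A rough sketch of that idea (hypothetical paths; no cleanup or expiry):

  SRC=/sw/share                  # the live tree (a btrfs subvolume)
  SNAPROOT=/snapshots/sw-share   # where the real read-only snapshots live
  find "$SRC" -xdev -type d -name .snapshot -prune -o -type d -print |
  while read -r dir; do
      rel=${dir#"$SRC"}
      mkdir -p "$dir/.snapshot"
      for snap in "$SNAPROOT"/*; do
          # e.g. .snapshot/hourly.2017-08-21_1610 -> $SNAPROOT/hourly.2017-08-21_1610$rel
          ln -sfn "$snap$rel" "$dir/.snapshot/${snap##*/}"
      done
  done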


> Personally I prefer to have a /snapshots directory on every FS

My users want the snapshots locally in a .snapshot subdirectory,
because Netapp has done it this way for at least 20 years and we have a
multi-PB Netapp storage environment.
There is no chance of changing this.

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20170822214531.44538589@natsu>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 16:57       ` Ulli Horlacher
@ 2017-08-22 17:19         ` A L
  2017-08-22 18:01           ` Ulli Horlacher
  2017-08-22 17:36         ` netapp-alike snapshots? Roman Mamedov
  1 sibling, 1 reply; 44+ messages in thread
From: A L @ 2017-08-22 17:19 UTC (permalink / raw)
  To: linux-btrfs

Perhaps using a bind mount? It would look and work the same as an ordinary fs. You just need to make sure du stays on one filesystem.
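
E.g. something like (a sketch with hypothetical paths):

  mkdir -p /home/.snapshot
  mount --bind /snapshots/home /home/.snapshot

Since the snapshots themselves are subvolumes with their own device
numbers, du -x and find -xdev will not descend into them anyway.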

---- From: Ulli Horlacher <framstag@rus.uni-stuttgart.de> -- Sent: 2017-08-22 - 18:57 ----

> On Tue 2017-08-22 (21:45), Roman Mamedov wrote:
> 
>> It is beneficial to not have snapshots in-place. With a local directory of
>> snapshots, issuing things like "find", "grep -r" or even "du" will take an
>> inordinate amount of time and will produce a result you do not expect.
> 
> Netapp snapshots are invisible to tools doing opendir()/readdir().
> One could simulate this with symlinks for the snapshot directory:
> store the snapshot elsewhere (not in place) and create a symlink to it in
> every directory.
> 
> 
>> Personally I prefer to have a /snapshots directory on every FS
> 
> My users want the snapshots locally in a .snapshot subdirectory,
> because Netapp has done it this way for at least 20 years and we have a
> multi-PB Netapp storage environment.
> There is no chance of changing this.
> 
> -- 
> Ullrich Horlacher              Server und Virtualisierung
> Rechenzentrum TIK         
> Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
> Allmandring 30a                Tel:    ++49-711-68565868
> 70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
> REF:<20170822214531.44538589@natsu>



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 16:57       ` Ulli Horlacher
  2017-08-22 17:19         ` A L
@ 2017-08-22 17:36         ` Roman Mamedov
  2017-08-22 18:10           ` Ulli Horlacher
  1 sibling, 1 reply; 44+ messages in thread
From: Roman Mamedov @ 2017-08-22 17:36 UTC (permalink / raw)
  To: Ulli Horlacher; +Cc: linux-btrfs

On Tue, 22 Aug 2017 18:57:25 +0200
Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:

> On Tue 2017-08-22 (21:45), Roman Mamedov wrote:
> 
> > It is beneficial to not have snapshots in-place. With a local directory of
> > snapshots, issuing things like "find", "grep -r" or even "du" will take an
> > inordinate amount of time and will produce a result you do not expect.
> 
> Netapp snapshots are invisible to tools doing opendir()/readdir().
> One could simulate this with symlinks for the snapshot directory:
> store the snapshot elsewhere (not in place) and create a symlink to it in
> every directory.
> 
> 
> > Personally I prefer to have a /snapshots directory on every FS
> 
> My users want the snapshots locally in a .snapshot subdirectory,
> because Netapp has done it this way for at least 20 years and we have a
> multi-PB Netapp storage environment.
> There is no chance of changing this.

Just a side note: you do know that only subvolumes can be snapshotted on Btrfs,
not any regular directory? And that snapshots are not recursive, i.e. if a
subvolume "contains" other subvolumes (hint: it really doesn't), snapshots of
the parent one will not include the content of subvolumes below it in the tree.
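
To illustrate (a sketch):

  # assuming /data/projects is itself a subvolume:
  btrfs subvolume create /data/projects/nested
  btrfs subvolume snapshot -r /data/projects /data/projects-snap
  # 'nested' appears in the snapshot only as an empty directory;
  # its content is not included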

I don't know how Netapp does this, but from the way you describe that setup
it feels like with Btrfs you're still in for some bad surprises and part of
your expectations will not be met.

Do you plan to make each and every directory and subdirectory a subvolume (so
that it could have a trail of its own snapshots)? There will be performance
implications to that. Also, deleting subvolumes can only be done via the
"btrfs" tool; they won't delete like normal dirs, e.g. when trying to do that
remotely via an NFS or Samba share.

-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 17:19         ` A L
@ 2017-08-22 18:01           ` Ulli Horlacher
  2017-08-22 18:36             ` Peter Grandi
  0 siblings, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 18:01 UTC (permalink / raw)
  To: linux-btrfs

On Tue 2017-08-22 (19:19), A L wrote:
> Perhaps using a bind mount? It would look and work the same as a ordinary fs. Just need to make sure du uses one filesystem.
> 
> ---- From: Ulli Horlacher <framstag@rus.uni-stuttgart.de> -- Sent: 2017-08-22 - 18:57 ----
> 
> > On Tue 2017-08-22 (21:45), Roman Mamedov wrote:
> > 
> >> It is beneficial to not have snapshots in-place. With a local directory of
> >> snapshots, issuing things like "find", "grep -r" or even "du" will take an
> >> inordinate amount of time and will produce a result you do not expect.
> > 
> > Netapp snapshots are invisible to tools doing opendir()/readdir().
> > One could simulate this with symlinks for the snapshot directory:
> > store the snapshot elsewhere (not in place) and create a symlink to it in
> > every directory.

Not only du works recursively, but also find, and with the right options also
ls, grep, etc.

And it would require a bind mount for EVERY directory. There can be
hundreds... thousands!


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<ab52f32.7ade2c3f.15e0af46ee9@gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 17:36         ` netapp-alike snapshots? Roman Mamedov
@ 2017-08-22 18:10           ` Ulli Horlacher
  0 siblings, 0 replies; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 18:10 UTC (permalink / raw)
  To: linux-btrfs

On Tue 2017-08-22 (22:36), Roman Mamedov wrote:

> > My users want the snapshots locally in a .snapshot subdirectory,
> > because Netapp has done it this way for at least 20 years and we have a
> > multi-PB Netapp storage environment.
> 
> Just a side note: you do know that only subvolumes can be snapshotted on Btrfs,
> not any regular directory? And that snapshots are not recursive, i.e. if a
> subvolume "contains" other subvolumes (hint: it really doesn't), snapshots of
> the parent one will not include the content of subvolumes below it in the tree.

Yes, I know this. But thanks for your hints! (Other readers here may not be
aware of this.)


> I don't know how Netapp does this

I am only a Netapp/waffle user, so I do not know the internals.
Netapp is not Linux-based and definitely a lot older than btrfs.


> from the way you describe that setup it feels like with Btrfs you're
> still in for some bad surprises and part of your expectations will not
> be met.

I will take care :-)


> Do you plan to make each and every directory and subdirectory a subvolume

No. My idea is to place a symlink in every subdirectory pointing to the
snapshot directory. Not yet programmed...
I was hoping someone had already implemented such a feature.


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20170822223647.350ca27d@natsu>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 18:01           ` Ulli Horlacher
@ 2017-08-22 18:36             ` Peter Grandi
  2017-08-22 20:48               ` Ulli Horlacher
  2017-08-22 21:53               ` user snapshots Ulli Horlacher
  0 siblings, 2 replies; 44+ messages in thread
From: Peter Grandi @ 2017-08-22 18:36 UTC (permalink / raw)
  To: Linux fs Btrfs

[ ... ]

>>>> It is beneficial to not have snapshots in-place. With a local
>>>> directory of snapshots, [ ... ]

Indeed, and there is a fair description of some options for
subvolume nesting policies here, which may be interesting to the
original poster:

  https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Layout

It is unsurprising to me that there are tradeoffs involved in
every choice. I find the "Flat" layout particularly desirable.

>>> Netapp snapshots are invisible for tools doing opendir()/
>>> readdir() One could simulate this with symlinks for the
>>> snapshot directory: store the snapshot elsewhere (not inplace)
>>> and create a symlink to it, in every directory.

More precisely in every subvolume root directory.

>>> My users want the snapshots locally in a .snapshot
>>> subdirectory.

Btrfs snapshots can only be done for a whole subvolume. Subvolumes
and snapshots can be created by users, but too many snapshots (see
below) can cause trouble. For somewhat good reasons, though, subvolumes
(including snapshots) cannot be deleted by users unless the mount
option 'user_subvol_rm_allowed' is used.
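
E.g. in /etc/fstab (a sketch, with a hypothetical filesystem label):

  # let users delete subvolumes/snapshots they have access to
  LABEL=data  /data  btrfs  defaults,user_subvol_rm_allowed  0  0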

>>> Because Netapp do it this way - for at least 20 years and we
>>> have a multi-PB Netapp storage environment. No chance to change
>>> this.

Send patches :-).

> Not only du works recursively, but also find, and with the right options
> also ls, grep, etc.

Note also that subvolume root directory inodes are indeed root
directory inodes, so they can be 'mount'ed, and therefore the
transition from a subvolume into a contained subvolume can be
detected just like a mountpoint crossing.

So 'find' has the '-xdev' option, 'du' has the '-x' option, and nearly
all other tools have something similar, so perhaps someone expects
that to happen :-).
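
For example:

  du -x -sh /home              # '-x': skip directories on other filesystems
  find /home -xdev -name core  # '-xdev': don't descend into other filesystems

Both stop at subvolume (and therefore snapshot) boundaries, since each
Btrfs subvolume reports its own device number.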

> And it would require a bind mount for EVERY directory. There can
> be hundreds... thousands!

Assumptions that all Btrfs features such as snapshots are
infinitely scalable at no cost may be optimistic:

  https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 18:36             ` Peter Grandi
@ 2017-08-22 20:48               ` Ulli Horlacher
  2017-08-23  7:18                 ` number of subvolumes Ulli Horlacher
  2017-08-22 21:53               ` user snapshots Ulli Horlacher
  1 sibling, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 20:48 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue 2017-08-22 (19:36), Peter Grandi wrote:

> Indeed and there is a fair description of some options for
> subvolume nesting policies here which may be interesting to the
> original poster:
> 
>   https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Layout
> 
> It is unsurprising to me that there are tradeoffs involved in
> every choice. I find the "Flat" layout particularly desirable.

My layout is already nearly "flat".
It seems my decision was right :-)



> Btrfs snapshots can only be done for a whole subvolume.

I know this.

> Subvolumes and snapshots can be created by users, but too many snapshots
> (see below) can cause trouble. For somewhat good reasons, though, subvolumes
> (including snapshots) cannot be deleted by users unless the mount option
> 'user_subvol_rm_allowed' is used.

Oops, this is new to me!

framstag@fex:~: btrfs subvolume create xx
Create subvolume './xx'

framstag@fex:~: btrfs subvolume delete xx
Delete subvolume '/local/home/framstag/xx'
ERROR: cannot delete '/local/home/framstag/xx' - Operation not permitted

This means root has to remove the subvolume.
Is it possible to disallow the creation of subvolumes for normal users?



> >>> Because Netapp do it this way - for at least 20 years and we
> >>> have a multi-PB Netapp storage environment. No chance to change
> >>> this.
> 
> Send patches :-).

For waffle or btrfs? :-)


> Assumptions that all Btrfs features such as snapshots are
> infinitely scalable at no cost may be optimistic:
> 
>   https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow

"when you do device removes on file systems with a lot of snapshots, it
 is unbelievably slow ... took nearly a week to move 20GB of FS data from
 one device to the other using that method"
  
"a balance on 2TB of data that was heavily snapshotted - it took 3 months" 

ARGH!!
Thanks for this warning!
I will rethink my multi-snapshot plan!

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<22940.31139.194399.982315@tree.ty.sabi.co.uk>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* user snapshots
  2017-08-22 18:36             ` Peter Grandi
  2017-08-22 20:48               ` Ulli Horlacher
@ 2017-08-22 21:53               ` Ulli Horlacher
  2017-08-23  6:28                 ` Dmitrii Tcvetkov
  1 sibling, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-22 21:53 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue 2017-08-22 (19:36), Peter Grandi wrote:

> For somewhat good reasons, though, subvolumes (including snapshots) cannot be
> deleted by users unless the mount option 'user_subvol_rm_allowed' is
> used.

Also in https://btrfs.wiki.kernel.org/index.php/Mount_options
"user_subvol_rm_allowed (...) Use with caution."

Why? What is the problem?

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<22940.31139.194399.982315@tree.ty.sabi.co.uk>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: user snapshots
  2017-08-22 21:53               ` user snapshots Ulli Horlacher
@ 2017-08-23  6:28                 ` Dmitrii Tcvetkov
  2017-08-23  7:16                   ` Dmitrii Tcvetkov
  0 siblings, 1 reply; 44+ messages in thread
From: Dmitrii Tcvetkov @ 2017-08-23  6:28 UTC (permalink / raw)
  To: Ulli Horlacher, Linux fs Btrfs



>Also in https://btrfs.wiki.kernel.org/index.php/Mount_options
>"user_subvol_rm_allowed (...) Use with caution."
>
>Why? What is the problem?

Because with the mount option any user can delete any subvolume, including the root one (subvol ID 5).

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: user snapshots
  2017-08-23  6:28                 ` Dmitrii Tcvetkov
@ 2017-08-23  7:16                   ` Dmitrii Tcvetkov
  2017-08-23  7:20                     ` Ulli Horlacher
  0 siblings, 1 reply; 44+ messages in thread
From: Dmitrii Tcvetkov @ 2017-08-23  7:16 UTC (permalink / raw)
  To: Ulli Horlacher, Linux fs Btrfs

> >Also in https://btrfs.wiki.kernel.org/index.php/Mount_options
> >"user_subvol_rm_allowed (...) Use with caution."
> >
> >Why? What is the problem?  
> 
> Because with the mount option any user can delete any subvolume,
> including the root one (subvol ID 5)

Apologies, it works somewhat differently:
the filesystem does not allow deleting the subvolume with ID 5, and POSIX
access permissions are checked before deleting a subvolume with the
user_subvol_rm_allowed mount option.

From btrfs-progs cmds-subvolume.c:

res = ioctl(fd, BTRFS_IOC_SNAP_DESTROY, &args);
if (res < 0) {
        error("cannot delete '%s/%s': %s", dname, vname,
              strerror(errno));

^ permalink raw reply	[flat|nested] 44+ messages in thread

* number of subvolumes
  2017-08-22 20:48               ` Ulli Horlacher
@ 2017-08-23  7:18                 ` Ulli Horlacher
  2017-08-23  8:37                   ` A L
  2017-08-23 12:11                   ` Peter Grandi
  0 siblings, 2 replies; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-23  7:18 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue 2017-08-22 (22:48), Ulli Horlacher wrote:

> > Assumptions that all Btrfs features such as snapshots are
> > infinitely scalable at no cost may be optimistic:
> > 
> >   https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow
> 
> "when you do device removes on file systems with a lot of snapshots, it
>  is unbelievably slow ... took nearly a week to move 20GB of FS data from
>  one device to the other using that method"
>   
> "a balance on 2TB of data that was heavily snapshotted - it took 3 months" 

This is a vanilla SLES12 installation:

root@ptm1:~# grep PRETTY_NAME /etc/os-release 
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP1"

root@ptm1:~# btrfs subvolume list /
ID 257 gen 358277 top level 5 path @
ID 258 gen 978361 top level 257 path @/home
ID 259 gen 1252501 top level 257 path @/opt
ID 260 gen 883012 top level 257 path @/srv
ID 261 gen 1252673 top level 257 path @/tmp
ID 262 gen 1252501 top level 257 path @/usr/local
ID 263 gen 882958 top level 257 path @/var/crash
ID 264 gen 1252673 top level 257 path @/var/log
ID 265 gen 882923 top level 257 path @/var/opt
ID 266 gen 1252673 top level 257 path @/var/spool
ID 267 gen 1252668 top level 257 path @/var/tmp
ID 270 gen 1252668 top level 257 path @/.snapshots
ID 452 gen 358277 top level 270 path @/.snapshots/127/snapshot
ID 453 gen 1252670 top level 270 path @/.snapshots/128/snapshot
ID 540 gen 368554 top level 270 path @/.snapshots/191/snapshot
ID 542 gen 419566 top level 270 path @/.snapshots/192/snapshot
ID 1035 gen 1027889 top level 270 path @/.snapshots/539/snapshot
ID 1036 gen 1027889 top level 270 path @/.snapshots/540/snapshot
ID 1045 gen 1048327 top level 270 path @/.snapshots/545/snapshot
ID 1046 gen 1048327 top level 270 path @/.snapshots/546/snapshot
ID 1062 gen 1068800 top level 270 path @/.snapshots/555/snapshot
ID 1063 gen 1068800 top level 270 path @/.snapshots/556/snapshot
ID 1122 gen 1130369 top level 270 path @/.snapshots/595/snapshot
ID 1123 gen 1130369 top level 270 path @/.snapshots/596/snapshot
ID 1124 gen 1171229 top level 270 path @/.snapshots/597/snapshot
ID 1125 gen 1171229 top level 270 path @/.snapshots/598/snapshot
ID 1135 gen 1171229 top level 270 path @/.snapshots/605/snapshot
ID 1136 gen 1171229 top level 270 path @/.snapshots/606/snapshot
ID 1137 gen 1171229 top level 270 path @/.snapshots/607/snapshot
ID 1138 gen 1171229 top level 270 path @/.snapshots/608/snapshot
ID 1139 gen 1171229 top level 270 path @/.snapshots/609/snapshot
ID 1140 gen 1171229 top level 270 path @/.snapshots/610/snapshot
ID 1141 gen 1171229 top level 270 path @/.snapshots/611/snapshot
ID 1142 gen 1171229 top level 270 path @/.snapshots/612/snapshot
ID 1158 gen 1172970 top level 270 path @/.snapshots/613/snapshot
ID 1159 gen 1172972 top level 270 path @/.snapshots/614/snapshot

Why does SUSE ignore this "not too many subvolumes" warning?


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20170822204811.GO14804@rus.uni-stuttgart.de>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: user snapshots
  2017-08-23  7:16                   ` Dmitrii Tcvetkov
@ 2017-08-23  7:20                     ` Ulli Horlacher
  2017-08-23 11:42                       ` Peter Grandi
  0 siblings, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-23  7:20 UTC (permalink / raw)
  To: Linux fs Btrfs

On Wed 2017-08-23 (10:16), Dmitrii Tcvetkov wrote:
> > >Also in https://btrfs.wiki.kernel.org/index.php/Mount_options
> > >"user_subvol_rm_allowed (...) Use with caution."
> > >
> > >Why? What is the problem?  
> > 
> > Because with the mount option any user can delete any subvolume,
> > including the root one (subvol ID 5)
> 
> Apologies, it works somewhat differently:
> the filesystem does not allow deleting the subvolume with ID 5, and POSIX
> access permissions are checked before deleting a subvolume with the
> user_subvol_rm_allowed mount option.

I have tested it already :-)

So, still: What is the problem with user_subvol_rm_allowed?

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20170823101635.114d02d2@job>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-23  7:18                 ` number of subvolumes Ulli Horlacher
@ 2017-08-23  8:37                   ` A L
  2017-08-23 16:48                     ` Ferry Toth
  2017-08-23 12:11                   ` Peter Grandi
  1 sibling, 1 reply; 44+ messages in thread
From: A L @ 2017-08-23  8:37 UTC (permalink / raw)
  To: Linux fs Btrfs



---- From: Ulli Horlacher <framstag@rus.uni-stuttgart.de> -- Sent: 2017-08-23 - 09:18 ----

> On Tue 2017-08-22 (22:48), Ulli Horlacher wrote:
> 
>> > Assumptions that all Btrfs features such as snapshots are
>> > infinitely scalable at no cost may be optimistic:
>> > 
>> >   https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow
>> 
>> "when you do device removes on file systems with a lot of snapshots, it
>>  is unbelievably slow ... took nearly a week to move 20GB of FS data from
>>  one device to the other using that method"
>>   
>> "a balance on 2TB of data that was heavily snapshotted - it took 3 months" 
> 
> This is a vanilla SLES12 installation:
> 
> root@ptm1:~# grep PRETTY_NAME /etc/os-release 
> PRETTY_NAME="SUSE Linux Enterprise Server 12 SP1"
> 
> root@ptm1:~# btrfs subvolume list /
> [subvolume listing snipped; full list in the previous message]
> 
> Why does SUSE ignore this "not too many subvolumes" warning?

Using hundreds or thousands of snapshots is probably mostly fine. Perhaps the slow performance is more related to how much changed between them? I regularly have that many snapshots and export many of them as "Previous Versions" to Windows clients over Samba without any performance issues. But my data doesn't change that much.

I think those comments on the Wiki are a little misleading without better details about which workloads are affected this way.
Perhaps someone can set up some tests and publish the results?

> 
> 
> -- 
> Ullrich Horlacher              Server und Virtualisierung
> Rechenzentrum TIK         
> Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
> Allmandring 30a                Tel:    ++49-711-68565868
> 70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
> REF:<20170822204811.GO14804@rus.uni-stuttgart.de>



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: user snapshots
  2017-08-23  7:20                     ` Ulli Horlacher
@ 2017-08-23 11:42                       ` Peter Grandi
  2017-08-23 21:13                         ` Ulli Horlacher
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Grandi @ 2017-08-23 11:42 UTC (permalink / raw)
  To: Linux fs Btrfs

> So, still: What is the problem with user_subvol_rm_allowed?

As usual, it is complicated: mostly that while subvol creation
is very cheap, subvol deletion can be very expensive. But then
so can creating many snapshots, as in this:

  https://www.spinics.net/lists/linux-btrfs/msg62760.html

Also, deleting a subvol can delete a lot of stuff
"inadvertently", including things that the user could not delete
using UNIX-style permissions. But then many of the Btrfs semantics
feel a bit "arbitrary", in part because they break new ground, in
part because of happenstance.

  http://linux-btrfs.vger.kernel.narkive.com/eTtmsQdL/patch-1-2-btrfs-don-t-check-the-permission-of-the-subvolume-which-we-want-to-delete
  http://linux-btrfs.vger.kernel.narkive.com/nR17xtw7/patch-btrfs-allow-subvol-deletion-by-unprivileged-user-with-o-user-subvol-rm-allowed

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-23  7:18                 ` number of subvolumes Ulli Horlacher
  2017-08-23  8:37                   ` A L
@ 2017-08-23 12:11                   ` Peter Grandi
  1 sibling, 0 replies; 44+ messages in thread
From: Peter Grandi @ 2017-08-23 12:11 UTC (permalink / raw)
  To: Linux fs Btrfs

> This is a vanilla SLES12 installation: [ ... ] Why does SUSE
> ignore this "not too many subvolumes" warning?

As in many cases with Btrfs "it's complicated" because of the
interaction of advanced features among themselves and the chosen
implementation and properties of storage; anisotropy rules.

IIRC the main problem actually is not with "too many subvolumes",
but with too many "reflinks"/"backrefs"; subvolumes, in particular
snapshots, are just the main way to create them:

  https://www.spinics.net/lists/linux-btrfs/msg42808.html

A couple dozen subvolumes without reflinks as in the '/' scheme
used by SUSE are going to be almost always fine.
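
Reflinks can also be created without any snapshots involved, e.g.:

  # the clone shares all extents with the original; later partial
  # rewrites multiply the backrefs on the still-shared extents
  cp --reflink=always big.db big.db.clone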

Then there is a different issue: I remember seeing a post by a SUSE
guy saying that the 10/10/10/10 (hourly/daily/monthly/yearly)
snapshots in the default settings for 'snapper' were a bad idea
because they would create way too many snapshots, but that he was
told to set those defaults that high. I can imagine a cowardly but
plausible reason why "management" would want those defaults.

Some semi-useful links:

* Home page for 'snapper'
  https://snapper.io/
* Announcement of 'snapper'
  https://lizards.opensuse.org/2011/04/01/introducing-snapper/
* Useful maintenance scripts
  https://github.com/kdave/btrfsmaintenance

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-23  8:37                   ` A L
@ 2017-08-23 16:48                     ` Ferry Toth
  2017-08-24 17:45                       ` Peter Grandi
  2017-08-24 19:40                       ` Marat Khalili
  0 siblings, 2 replies; 44+ messages in thread
From: Ferry Toth @ 2017-08-23 16:48 UTC (permalink / raw)
  To: linux-btrfs

On Wed, 23 Aug 2017 10:37:07 +0200, A L wrote:

> ---- From: Ulli Horlacher <framstag@rus.uni-stuttgart.de> -- Sent:
> 2017-08-23 - 09:18 ----
> 
>> On Tue 2017-08-22 (22:48), Ulli Horlacher wrote:
>> 
>>> > Assumptions that all Btrfs features such as snapshots are infinitely
>>> > scalable at no cost may be optimistic:
>>> > 
>>> >   https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow
>>> 
>>> "when you do device removes on file systems with a lot of snapshots, it
>>>  is unbelievably slow ... took nearly a week to move 20GB of FS data from
>>>  one device to the other using that method"
>>> 
>>> "a balance on 2TB of data that was heavily snapshotted - it took 3 months"
>> 
>> This is a vanilla SLES12 installation:
>> 
>> root@ptm1:~# grep PRETTY_NAME /etc/os-release
>> PRETTY_NAME="SUSE Linux Enterprise Server 12 SP1"
>>
>> root@ptm1:~# btrfs subvolume list /
>> [subvolume listing snipped; full list quoted above]
>> 
>> Why does SUSE ignore this "not too many subvolumes" warning?
> 
> Using hundreds or thousands of snapshots is probably mostly fine.
> Perhaps the slow performance is more related to how much changed between
> them? I regularly have that many snapshots and export many of them as
> "Previous Versions" to Windows clients over Samba without any performance
> issues. But my data doesn't change that much.
> 
> I think those comments on the Wiki are a little misleading without
> better details about which workloads are affected this way.
> Perhaps someone can set up some tests and publish the results?
> 

We find that typically apt is very slow on a machine with 50 or so
snapshots and raid10. Slow as in probably 10x slower than doing the same
update on a machine with 'single' and no snapshots.

Other operations seem to be the same speed, especially disk benchmarks do 
not seem to indicate any performance degradation.




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: user snapshots
  2017-08-23 11:42                       ` Peter Grandi
@ 2017-08-23 21:13                         ` Ulli Horlacher
  2017-08-25 11:28                           ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-23 21:13 UTC (permalink / raw)
  To: Linux fs Btrfs

On Wed 2017-08-23 (12:42), Peter Grandi wrote:
> > So, still: What is the problem with user_subvol_rm_allowed?
> 
> As usual, it is complicated: mostly that while subvol creation
> is very cheap, subvol deletion can be very expensive. But then
> so can be creating many snapshots, as in this:

But it seems one cannot prohibit a user making snapshots?
Then root must delete them?

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<22941.27164.739577.517915@tree.ty.sabi.co.uk>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-23 16:48                     ` Ferry Toth
@ 2017-08-24 17:45                       ` Peter Grandi
  2017-08-31  6:49                         ` Ulli Horlacher
  2017-08-24 19:40                       ` Marat Khalili
  1 sibling, 1 reply; 44+ messages in thread
From: Peter Grandi @ 2017-08-24 17:45 UTC (permalink / raw)
  To: Linux fs Btrfs

>> Using hundreds or thousands of snapshots is probably mostly
>> fine.

As I mentioned previously, with a link to the relevant email
describing the details, the real issue is reflinks/backrefs.
Usually subvolumes and snapshots involve them.

> We find that typically apt is very slow on a machine with 50
> or so snapshots and raid10. Slow as in probably 10x slower than
> doing the same update on a machine with 'single' and no
> snapshots.

That seems to indicate using snapshots on a '/' volume to
provide a "rollback machine" like SUSE. Since '/' usually has
many small files and installation of upgraded packages involves
only a small part of them, that usually involves a lot of
reflinks/backrefs.

But that you find that the system has slowed down significantly
in ordinary operations is unusual, because what is slow in
situations with many reflinks/backrefs per extent is not access,
but operations like 'balance' or 'delete'.

Guessing wildly, what you describe seems more like the effect of low
locality (aka high fragmentation), which is often the result of
the 'ssd' option, which should always be explicitly disabled
(even for volumes on flash SSD storage). I would suggest some
use of 'filefrag' to analyze, and perhaps use of 'defrag' and
'balance'.
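
E.g. something like (a sketch; note the defrag/reflink caveats quoted
below):

  filefrag -v /var/lib/dpkg/status      # inspect extent counts
  btrfs filesystem defragment -r /var   # NB: breaks reflinks to shared extents
  btrfs balance start -dusage=50 /      # rewrite data chunks at most 50% full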

Another possibility is having compression enabled together with
many in-place updates on some files, which can also result
in low locality (high fragmentation).

As usual with Btrfs, there are corner cases to avoid: 'defrag'
should be done before 'balance' and with compression switched
off (IIRC):

https://wiki.archlinux.org/index.php/Btrfs#Defragmentation

  Defragmenting a file which has a COW copy (either a snapshot
  copy or one made with cp --reflink or bcp) plus using the -c
  switch with a compression algorithm may result in two
  unrelated files effectively increasing the disk usage.

https://wiki.debian.org/Btrfs

  Mounting with -o autodefrag will duplicate reflinked or
  snapshotted files when you run a balance. Also, whenever a
  portion of the fs is defragmented with "btrfs filesystem
  defragment" those files will lose their reflinks and the data
  will be "duplicated" with n-copies. The effect of this is that
  volumes that make heavy use of reflinks or snapshots will run
  out of space.

  Additionally, if you have a lot of snapshots or reflinked files,
  please use "-f" to flush data for each file before going to the
  next file.

I prefer dump-and-reload.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-23 16:48                     ` Ferry Toth
  2017-08-24 17:45                       ` Peter Grandi
@ 2017-08-24 19:40                       ` Marat Khalili
  2017-08-24 21:56                         ` Ferry Toth
  1 sibling, 1 reply; 44+ messages in thread
From: Marat Khalili @ 2017-08-24 19:40 UTC (permalink / raw)
  To: Ferry Toth, linux-btrfs

> We find that typically apt is very slow on a machine with 50 or so snapshots and raid10. Slow as in probably 10x slower than doing the same update on a machine with 'single' and no snapshots.
>
> Other operations seem to be the same speed, especially disk benchmarks do not seem to indicate any performance degradation.

For meaningful discussion it is important to take into account the fact that dpkg infamously calls fsync after changing every bit of information, so basically you're measuring fsync speed. That is slow on btrfs (compared to simpler filesystems), but unrelated to normal work.

I've got two near-identical servers here with several containers each, differing only in filesystem: btrfs-raid1 on one (for historical reasons) and ext4/mdadm-raid1 on the other, no snapshots, no reflinks. Each time, containers on ext4 update several times faster, but in everyday operation there's no significant difference.
-- 

With Best Regards,
Marat Khalili

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-24 19:40                       ` Marat Khalili
@ 2017-08-24 21:56                         ` Ferry Toth
  2017-08-25  5:54                           ` Chris Murphy
  2017-08-25 11:45                           ` Austin S. Hemmelgarn
  0 siblings, 2 replies; 44+ messages in thread
From: Ferry Toth @ 2017-08-24 21:56 UTC (permalink / raw)
  To: linux-btrfs

On Thu, 24 Aug 2017 22:40:54 +0300, Marat Khalili wrote:

>> We find that typically apt is very slow on a machine with 50 or so
>> snapshots and raid10. Slow as in probably 10x slower than doing the same
>> update on a machine with 'single' and no snapshots.
>>
>> Other operations seem to be the same speed, especially disk benchmarks
>> do not seem to indicate any performance degradation.
> 
> For meaningful discussion it is important to take into account the fact

Doing daily updates on a desktop is not uncommon and when 3 minutes 
become 30 then many would call that meaningful.

Similar for a single office server, which is upgraded twice a year, where 
an upgrade normally would take an hour or 2, but now takes more than a 
working day. In the meantime, samba and postgresql have to be taken 
offline, preventing people from working for a few hours.

My point is: fsync is not targeted specifically in many common disk 
benchmarks (phoronix?); it might be possible that there is no trigger to 
spend much time on optimizations in that area. That doesn't make it 
meaningless.

> that dpkg infamously calls fsync after changing every bit of
> information, so basically you're measuring fsync speed. Which is slow on
> btrfs (compared to simpler filesystems), but unrelated to normal work.

OTOH it would be nice if dpkg would at last start making use of btrfs 
snapshot features and abandon these unnecessary fsyncs completely, instead 
restoring a failed install from a snapshot. This would probably result in 
a performance improvement compared to ext4.

> I've got two near-identical servers here with several containers each,
> differing only in filesystem: btrfs-raid1 on one (for historical
> reasons) and ext4/mdadm-raid1 on the other, no snapshots, no reflinks.
> Each time, containers on ext4 update several times faster, but in
> everyday operation there's no significant difference.
> --
> 
> With Best Regards,
> Marat Khalili



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-24 21:56                         ` Ferry Toth
@ 2017-08-25  5:54                           ` Chris Murphy
  2017-08-25 11:45                           ` Austin S. Hemmelgarn
  1 sibling, 0 replies; 44+ messages in thread
From: Chris Murphy @ 2017-08-25  5:54 UTC (permalink / raw)
  To: Ferry Toth; +Cc: Btrfs BTRFS

On Thu, Aug 24, 2017 at 3:56 PM, Ferry Toth <ftoth@telfort.nl> wrote:
> Op Thu, 24 Aug 2017 22:40:54 +0300, schreef Marat Khalili:
>
>>> We find that typically apt is very slow on a machine with 50 or so
>>> snapshots and raid10. Slow as in probably 10x slower than doing the same
>>> update on a machine with 'single' and no snapshots.
>>>
>>> Other operations seem to be the same speed, especially disk benchmarks
>>> do not seem to indicate any performance degradation.
>>
>> For meaningful discussion it is important to take into account the fact
>
> Doing daily updates on a desktop is not uncommon and when 3 minutes
> become 30 then many would call that meaningful.
>
> Similar for a single office server, which is upgraded twice a year, where
> an upgrade normally would take an hour or 2, but now takes more than a
> working day. In the meantime, samba and postgresql have to be taken
> offline, preventing people from working for a few hours.
>
> My point is: fsync is not targeted specifically in many common disk
> benchmarks (phoronix?); it might be possible that there is no trigger to
> spend much time on optimizations in that area. That doesn't make it meaningless.
>
>> that dpkg infamously calls fsync after changing every bit of
>> information, so basically you're measuring fsync speed. Which is slow on
>> btrfs (compared to simpler filesystems), but unrelated to normal work.
>
> OTOH it would be nice if dpkg would at last start making use of btrfs
> snapshot features and abandon these unnecessary fsyncs completely, instead
> restoring a failed install from a snapshot. This would probably result in
> a performance improvement compared to ext4.
>
>> I've got two near-identical servers here with several containers each,
>> differing only in filesystem: btrfs-raid1 on one (for historical
>> reasons) and ext4/mdadm-raid1 on the other, no snapshots, no reflinks.
>> Each time, containers on ext4 update several times faster, but in
>> everyday operation there's no significant difference.

In the thread "Containers, Btrfs vs Btrfs + overlayfs" there's the
idea of nullifying fsyncs in container contexts. Maybe it could be
adapted for out of band system software updates, i.e.

1. updater runs in a container
2. takes a snapshot of the system
3. assembles the snapshot per fstab
4. performs the OS update (within the container, on the snapshot),
filtering out fsync
5. does a sync() after completion of update
6. update bootloader configuration to point to the updated snapshot/file tree
7. quit container
8. user reboots whenever convenient to run the updated system

Basically this is still an atomic update that won't nerf either the
file system or the OS if there's a crash or power failure prior to
step 7. Any failure, just delete the snapshot (failed out of band
update). The existing tree isn't affected by either the update or the
failure so there's not even a problem running it while the user is
working, insofar as binaries and libraries aren't being yanked out
from under running processes, they're all out of band changes.

Something like this exists, but it's not package based, rather it's
"git like", and also has no optimizations for Btrfs. Its updates are
out of band, and always atomic, no matter the file system.

OSTree.

https://github.com/ostreedev/ostree
https://ostree.readthedocs.io/en/latest/

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: user snapshots
  2017-08-23 21:13                         ` Ulli Horlacher
@ 2017-08-25 11:28                           ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 44+ messages in thread
From: Austin S. Hemmelgarn @ 2017-08-25 11:28 UTC (permalink / raw)
  To: Linux fs Btrfs

On 2017-08-23 17:13, Ulli Horlacher wrote:
> On Wed 2017-08-23 (12:42), Peter Grandi wrote:
>>> So, still: What is the problem with user_subvol_rm_allowed?
>>
>> As usual, it is complicated: mostly that while subvol creation
>> is very cheap, subvol deletion can be very expensive. But then
>> so can be creating many snapshots, as in this:
> 
> But it seems one cannot prevent a user from making snapshots?
> Then root must delete them?
> 
That is correct.  This is one of the big outstanding issues with BTRFS 
being practical for enterprise usage, because it means anyone with basic 
shell access and either the ability to run arbitrary byte code or access 
to execute /sbin/btrfs can exhaust system resources with no effort 
whatsoever.  Taken together with how subvolume creation interacts with 
qgroups, it also means that qgroups are useless in the same situation 
because it's trivial to escape them.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-24 21:56                         ` Ferry Toth
  2017-08-25  5:54                           ` Chris Murphy
@ 2017-08-25 11:45                           ` Austin S. Hemmelgarn
  2017-08-25 12:55                             ` Ferry Toth
  1 sibling, 1 reply; 44+ messages in thread
From: Austin S. Hemmelgarn @ 2017-08-25 11:45 UTC (permalink / raw)
  To: Ferry Toth, linux-btrfs

On 2017-08-24 17:56, Ferry Toth wrote:
> Op Thu, 24 Aug 2017 22:40:54 +0300, schreef Marat Khalili:
> 
>>> We find that typically apt is very slow on a machine with 50 or so
>>> snapshots and raid10. Slow as in probably 10x slower than doing the
>>> update on a machine with 'single' and no snapshots.
>>>
>>> Other operations seem to be the same speed, especially disk benchmarks
>>> do not seem to indicate any performance degradation.
>>
>> For meaningful discussion it is important to take into account the fact
> 
> Doing daily updates on a desktop is not uncommon and when 3 minutes
> become 30 then many would call that meaningful.
I think the more meaningful aspect here is that it's 30 minutes where 
persistent storage is liable to be unusable, not necessarily that it's 
30 minutes.
> 
> Similar for a single office server, which is upgraded twice a year, where
> an upgrade normally would take an hour or 2, but now takes more than a
> working day. In the meantime, samba and postgresql have to be taken
> offline, preventing people from working for a few hours.
That should only be the case if:
1. You don't have your data set properly segregated from the rest of 
your system (it should not be part of the upgrade snapshot, but an 
independent snapshot taken separately).
2. You are updating the main system, instead of updating the snapshot 
you took.

The ideal method of handling an upgrade in this case is:
1. Snapshot the system, but not the data set.
2. Run your updates on the snapshot of the system.
3. Rename the snapshot and the root subvolume so that you boot into the 
snapshot.
4. During the next maintenance window (or overnight), shutdown the 
system services, snapshot the data set (so you can roll back if the 
update screws up the database).
5. Reboot.

That provides minimal downtime, and removes the need to roll-back if the 
upgrade fails part way through (you just nuke the snapshot and start 
over, instead of having to manually switch to the snapshot and reboot).
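
In btrfs terms, steps 1-3 could look roughly like this (a sketch only; it 
assumes the pool's top level is mounted at /mnt/pool, the running root is 
the subvolume /mnt/pool/root, and a Debian-style system):

  btrfs subvolume snapshot /mnt/pool/root /mnt/pool/root-new    # step 1
  for d in proc sys dev; do mount --bind /$d /mnt/pool/root-new/$d; done
  chroot /mnt/pool/root-new apt-get dist-upgrade                # step 2
  for d in proc sys dev; do umount /mnt/pool/root-new/$d; done
  mv /mnt/pool/root /mnt/pool/root-old                          # step 3: the
  mv /mnt/pool/root-new /mnt/pool/root                          # next boot uses it
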
> 
> My point is: fsync is not targeted specifically in many common disk
> benchmarks (phoronix?); it might be possible that there is no trigger to
> spend much time on optimizations in that area. That doesn't make it meaningless.
> 
>> that dpkg infamously calls fsync after changing every bit of
>> information, so basically you're measuring fsync speed. Which is slow on
>> btrfs (compared to simpler filesystems), but unrelated to normal work.
> 
> OTOH it would be nice if dpkg would at last start making use of btrfs
> snapshot features and abandon these unnecessary fsyncs completely, instead
> restoring a failed install from a snapshot. This would probably result in
> a performance improvement compared to ext4.
Not dpkg, but apt-get and whatever other frontend you use (although all the 
other dpkg frontends I know of are actually apt-get frontends).  Take a 
look at how SUSE actually does this integration, it's done through 
Zypper/YaST2, not RPM.  If you do it through dpkg, or RPM, or whatever 
other low-level package tool, you need to do a snapshot per package so 
that it works reliably, while what you really need is a snapshot per 
high-level transaction.

FWIW, if you can guarantee that the system won't crash during an update 
(or are actually able to roll back by hand easily if it won't boot), you 
can install libeatmydata and LD_PRELOAD it for the apt-get (or aptitude, 
or synaptic, or whatever else) call, then call sync afterwards and 
probably see a significant performance improvement.  The library itself 
overloads *sync() calls to be no-ops, so it's not safe to use when you 
don't have good fallback options, but it tends to severely improve 
performance for stuff like dpkg.
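
For instance, using the eatmydata wrapper that the libeatmydata package 
ships (a sketch; the wrapper sets up LD_PRELOAD for you):

  eatmydata apt-get dist-upgrade   # *sync() calls become no-ops
  sync                             # flush everything once at the end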

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-25 11:45                           ` Austin S. Hemmelgarn
@ 2017-08-25 12:55                             ` Ferry Toth
  2017-08-25 19:18                               ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 44+ messages in thread
From: Ferry Toth @ 2017-08-25 12:55 UTC (permalink / raw)
  To: linux-btrfs

Op Fri, 25 Aug 2017 07:45:44 -0400, schreef Austin S. Hemmelgarn:

> On 2017-08-24 17:56, Ferry Toth wrote:
>> Op Thu, 24 Aug 2017 22:40:54 +0300, schreef Marat Khalili:
>> 
>>>> We find that typically apt is very slow on a machine with 50 or so
>>>> snapshots and raid10. Slow as in probably 10x slower than doing the
>>>> same update on a machine with 'single' and no snapshots.
>>>>
>>>> Other operations seem to be the same speed, especially disk
>>>> benchmarks do not seem to indicate any performance degradation.
>>>
>>> For meaningful discussion it is important to take into account the
>>> fact
>> 
>> Doing daily updates on a desktop is not uncommon and when 3 minutes
>> become 30 then many would call that meaningful.
> I think the more meaningful aspect here is that it's 30 minutes where
> persistent storage is liable to be unusable, not necessarily that it's
> 30 minutes.

Unusable - probably less (depending on your definition)
Irritating - yes

>> Similar for a single office server, which is upgraded twice a year,
>> where an upgrade normally would take an hour or 2, but now takes more
>> than a working day. In the meantime, samba and postgresql have to be
>> taken offline, preventing people from working for a few hours.
> That should only be the case if:
> 1. You don't have your data set properly segregated from the rest of
> your system (it should not be part of the upgrade snapshot, but an
> independent snapshot taken separately).
> 2. You are updating the main system, instead of updating the snapshot
> you took.
> 
> The ideal method of handling an upgrade in this case is:
> 1. Snapshot the system, but not the data set.
> 2. Run your updates on the snapshot of the system.
> 3. Rename the snapshot and the root subvolume so that you boot into the
> snapshot.
> 4. During the next maintenance window (or overnight), shutdown the
> system services, snapshot the data set (so you can roll back if the
> update screws up the database).
> 5. Reboot.
> 
> That provides minimal downtime, and removes the need to roll-back if the
> upgrade fails part way through (you just nuke the snapshot and start
> over, instead of having to manually switch to the snapshot and reboot).

Wow, yes that does sound ideal. Is that how you do it?

Now I just need Canonical to update their installer to take care of this 
(that is: tell it to update the system on another partition (subvolume) 
than the one mounted on /, and not stop any running system services).

Or run a virtual machine on the server that boots from the snapshot and 
does the update. Oh no, the virtual machine would slow down my running 
server too much.

Eh, share the snapshot via cifs or nfs to another machine that does a 
netboot and let that do the update.

Oh wait, I forgot, I installed btrfs to make our system maintenance easier 
than with ext. Maybe that was a mistake, at least until distros take 
advantage of the advantages and start avoiding the pitfalls?

>> My point is: fsync is not targeted specifically in many common disk
>> benchmarks (phoronix?); it might be possible that there is no trigger
>> to spend much time on optimizations in that area. That doesn't make it
>> meaningless.
>> 
>>> that dpkg infamously calls fsync after changing every bit of
>>> information, so basically you're measuring fsync speed. Which is slow
>>> on btrfs (compared to simpler filesystems), but unrelated to normal
>>> work.
>> 
>> OTOH it would be nice if dpkg would at last start making use of btrfs
>> snapshot features and abandon these unnecessary fsyncs completely,
>> instead restoring a failed install from a snapshot. This would probably
>> result in a performance improvement compared to ext4.
> Not dpkg, but apt-get and whatever other frontend you use (although all the
> other dpkg frontends I know of are actually apt-get frontends).  Take a
> look at how SUSE actually does this integration, it's done through
> Zypper/YaST2, not RPM.  If you do it through dpkg, or RPM, or whatever
> other low-level package tool, you need to do a snapshot per package so
> that it works reliably, while what you really need is a snapshot per
> high-level transaction.
> 
> FWIW, if you can guarantee that the system won't crash during an update
> (or are actually able to roll back by hand easily if it won't boot), you
> can install libeatmydata and LD_PRELOAD it for the apt-get (or aptitude,
> or synaptic, or whatever else) call, then call sync afterwards and
> probably see a significant performance improvement.  The library itself
> overloads *sync() calls to be no-ops, so it's not safe to use when you
> don't have good fallback options, but it tends to severely improve
> performance for stuff like dpkg.

Yeah, I can guarantee that it can crash... All you need to do is start the 
upgrade from a remote terminal, forget to use screen, and then close the 
terminal. And if not, the installer will run until the first 'do you want 
to replace the conf file Y/n' in the background, at which point you have 
no choice but to nuke it.

But probably if you take a snapshot before eating the data you should be 
able to recover.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-25 12:55                             ` Ferry Toth
@ 2017-08-25 19:18                               ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 44+ messages in thread
From: Austin S. Hemmelgarn @ 2017-08-25 19:18 UTC (permalink / raw)
  To: Ferry Toth, linux-btrfs

On 2017-08-25 08:55, Ferry Toth wrote:
> Op Fri, 25 Aug 2017 07:45:44 -0400, schreef Austin S. Hemmelgarn:
> 
>> On 2017-08-24 17:56, Ferry Toth wrote:
>>> Op Thu, 24 Aug 2017 22:40:54 +0300, schreef Marat Khalili:
>>>
>>>>> We find that typically apt is very slow on a machine with 50 or so
>>>>> snapshots and raid10. Slow as in probably 10x slower than doing the
>>>>> same update on a machine with 'single' and no snapshots.
>>>>>
>>>>> Other operations seem to be the same speed, especially disk
>>>>> benchmarks do not seem to indicate any performance degradation.
>>>>
>>>> For meaningful discussion it is important to take into account the
>>>> fact
>>>
>>> Doing daily updates on a desktop is not uncommon and when 3 minutes
>>> become 30 then many would call that meaningful.
>> I think the more meaningful aspect here is that it's 30 minutes where
>> persistent storage is liable to be unusable, not necessarily that it's
>> 30 minutes.
> 
> Unusable - probably less (depending on your definition)
And depending on both your hardware and how many updates you're handling 
(if things are stalling with dpkg calling fsync, it's probably going to 
be bad for other stuff too, even if it's still 'usable').

> Irritating - yes
Still nothing compared to Windows 10 deciding to download updates right 
after you start a round in an FPS or RTS game...

In all seriousness though, I'm kind of used to half-hour daily updates 
(or 2-plus-hour ones if GCC, LLVM, or glibc are being updated); I run 
Gentoo on most of my systems, so everything gets built locally.
> 
>>> Similar for a single office server, which is upgraded twice a year,
>>> where an upgrade normally would take an hour or 2, but now takes more
>>> than a working day. In the meantime, samba and postgresql have to be
>>> taken offline, preventing people from working for a few hours.
>> That should only be the case if:
>> 1. You don't have your data set properly segregated from the rest of
>> your system (it should not be part of the upgrade snapshot, but an
>> independent snapshot taken separately).
>> 2. You are updating the main system, instead of updating the snapshot
>> you took.
>>
>> The ideal method of handling an upgrade in this case is:
>> 1. Snapshot the system, but not the data set.
>> 2. Run your updates on the snapshot of the system.
>> 3. Rename the snapshot and the root subvolume so that you boot into the
>> snapshot.
>> 4. During the next maintenance window (or overnight), shutdown the
>> system services, snapshot the data set (so you can roll back if the
>> update screws up the database).
>> 5. Reboot.
>>
>> That provides minimal downtime, and removes the need to roll-back if the
>> upgrade fails part way through (you just nuke the snapshot and start
>> over, instead of having to manually switch to the snapshot and reboot).
> 
> Wow, yes that does sound ideal. Is that how you do it?
It's what I've been trying to get working on my Gentoo systems for 
months now with varying degrees of success.  We do some similar trickery 
with updates for our embedded systems where I work (albeit a bit 
differently), but we're also directly calling RPM for package installs 
instead of using higher level tools like Yum or DNF, so it's easy to 
point things somewhere else for the install target.  I'm pretty sure 
this is essentially what NixOS does for updates too, but I don't think 
it has BTRFS snapshot support (they use LVM2), and it's a pain to get 
used to the declarative system configuration it uses.

FWIW, it would be even easier if BTRFS supported operation as a lower 
layer for OverlayFS; then you could put /etc and similar stuff on 
OverlayFS, and the package manager would never need to ask if you want to 
update config files because it just puts them on the lower layer.
> 
> Now I just need Canonical to update their installer to take care of this
> (that is: tell it to update the system on another partition (subvolume)
> than the one mounted on /, and not stop any running system services).
Yeah, that's one of the things I kind of hate about many mainstream 
distros.  Gentoo is nice in this respect, you can just run the update 
command with ROOT=/whatever in the environment, and it will use that for 
installing packages, and it expects the admin to handle restarting 
services.  RPM actually has an option to handle this too these days, but 
not Yum or DNF (don't know about Zypper or YaST2, but I think they do 
have such an option).  dpkg itself probably does, but last I checked, 
apt-get, aptitude, and most of the graphical frontends don't.
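
On Gentoo that looks something like this (a sketch; /mnt/pool/root-new 
stands in for wherever the snapshot is mounted):

  ROOT=/mnt/pool/root-new emerge --update --deep @world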

It's technically possible to work around this using containers (that is, 
you spawn a container with the snapshot as its root filesystem, and run 
the update from there), it's just not easy to do right now.
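
With systemd-nspawn, for example (again a sketch, with the same assumed 
snapshot path):

  systemd-nspawn -D /mnt/pool/root-new apt-get dist-upgrade
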
> 
> Or run a virtual machine on the server that boots from the snapshot and
> that update. Oh no, the virtual machine would slow down my running server
> to much.
Depends on how you put it together.  This also has issues because of 
starting a new instance of init.
> 
> Eh, share the snapshot via cifs or nfs to another machine that does a
> netboot and let that do the update.
This won't work reliably for other reasons, and will probably kill your 
network performance.
> 
> Oh wait, I forgot, I installed btrfs to make our system maintenance easier
> than with ext. Maybe that was a mistake, at least until distros take
> advantage of the advantages and start avoiding the pitfalls?
It really depends on your particular use case.
> 
>>> My point is: fsync is not targeted specifically in many common disk
>>> benchmarks (phoronix?); it might be possible that there is no trigger
>>> to spend much time on optimizations in that area. That doesn't make it
>>> meaningless.
>>>
>>>> that dpkg infamously calls fsync after changing every bit of
>>>> information, so basically you're measuring fsync speed. Which is slow
>>>> on btrfs (compared to simpler filesystems), but unrelated to normal
>>>> work.
>>>
>>> OTOH it would be nice if dpkg would at last start making use of btrfs
>>> snapshot features and abandon these unnecessary fsyncs completely,
>>> instead restoring a failed install from a snapshot. This would probably
>>> result in a performance improvement compared to ext4.
>> Not dpkg, but apt-get and whatever other frontend you use (although all the
>> other dpkg frontends I know of are actually apt-get frontends).  Take a
>> look at how SUSE actually does this integration, it's done through
>> Zypper/YaST2, not RPM.  If you do it through dpkg, or RPM, or whatever
>> other low-level package tool, you need to do a snapshot per package so
>> that it works reliably, while what you really need is a snapshot per
>> high-level transaction.
>>
>> FWIW, if you can guarantee that the system won't crash during an update
>> (or are actually able to roll back by hand easily if it won't boot), you
>> can install libeatmydata and LD_PRELOAD it for the apt-get (or aptitude,
>> or synaptic, or whatever else) call, then call sync afterwards and
>> probably see a significant performance improvement.  The library itself
>> overloads *sync() calls to be no-ops, so it's not safe to use when you
>> don't have good fallback options, but it tends to severely improve
>> performance for stuff like dpkg.
> 
> Yeah, I can guarantee that it can crash... All you need to do is start the
> upgrade from a remote terminal, forget to use screen, and then close the
> terminal. And if not, the installer will run until the first 'do you want
> to replace the conf file Y/n' in the background, at which point you have
> no choice but to nuke it.
I meant the system itself, not the update.  What matters is that the 
filesystem isn't shut down unexpectedly, not that the update fails part 
way through (if you're using snapshots to give restore points, you've 
already got the case of the update failing covered).
> 
> But probably if you take a snapshot before eating the data you should be
> able to recover.
Indeed.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-24 17:45                       ` Peter Grandi
@ 2017-08-31  6:49                         ` Ulli Horlacher
  2017-08-31 11:18                           ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-08-31  6:49 UTC (permalink / raw)
  To: Linux fs Btrfs

On Thu 2017-08-24 (18:45), Peter Grandi wrote:

> As usual with Btrfs, there are corner cases to avoid: 'defrag'
> should be done before 'balance'

Good hint. So far I did it the other way: balance before defrag. 
I will switch.


> and with compression switched off

I have filesystems with compress mount option:

framstag@fex:~: grep /local /etc/fstab
LABEL=local /local btrfs  defaults,compress,user_subvol_rm_allowed 0 2

and a weekly cronjob, which does a defrag and balance.
I cannot disable compression.
Any hint here?


> I prefer dump-and-reload.

What do you mean by this?


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<22943.4266.793339.528061@tree.ty.sabi.co.uk>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-31  6:49                         ` Ulli Horlacher
@ 2017-08-31 11:18                           ` Austin S. Hemmelgarn
  2017-08-31 14:38                             ` Michał Sokołowski
  0 siblings, 1 reply; 44+ messages in thread
From: Austin S. Hemmelgarn @ 2017-08-31 11:18 UTC (permalink / raw)
  To: Linux fs Btrfs

On 2017-08-31 02:49, Ulli Horlacher wrote:
> On Thu 2017-08-24 (18:45), Peter Grandi wrote:
> 
>> As usual with Btrfs, there are corner cases to avoid: 'defrag'
>> should be done before 'balance'
> 
> Good hint. So far I did it the other way: balance before defrag.
> I will switch.
For reference, the reason to do things this way is that defragmenting a 
filesystem may result in undoing some of the work balance did.
> 
> 
>> and with compression switched off
> 
> I have filesystems with compress mount option:
> 
> framstag@fex:~: grep /local /etc/fstab
> LABEL=local /local btrfs  defaults,compress,user_subvol_rm_allowed 0 2
> 
> and a weekly cronjob, which does a defrag and balance.
> I cannot disable compression.
> Any hint here?
Having compression enabled causes no issues with defrag and balance. 
There appears to be a prevalent belief, however, that defrag is pointless 
if you're using compression, probably because some versions of 
`filefrag` don't report compressed extents properly (they list each 128k 
compressed unit as one extent, which is wrong).
> 
> 
>> I prefer dump-and-reload.
> 
> What do you mean by this?
I believe he means to copy everything off the filesystem, recreate it, 
and copy everything back in.  That will actually get you much closer to 
an optimal layout than a defrag+balance cycle, but it also takes a long 
time, requires extra space, and the layout will usually become 
sub-optimal almost immediately when you start writing to the filesystem.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-31 11:18                           ` Austin S. Hemmelgarn
@ 2017-08-31 14:38                             ` Michał Sokołowski
  2017-08-31 16:18                               ` Duncan
  0 siblings, 1 reply; 44+ messages in thread
From: Michał Sokołowski @ 2017-08-31 14:38 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, Linux fs Btrfs

On 08/31/2017 01:18 PM, Austin S. Hemmelgarn wrote:
> [...]
>> Any hint here?
> Having compression enabled causes no issues with defrag and balance.
> There appears to be a prevalent belief, however, that defrag is
> pointless if you're using compression, probably because some versions
> of `filefrag` don't report compressed extents properly (they list each
> 128k compressed unit as one extent, which is wrong).
Is there another tool to verify the number of fragments of a given file
when using compression?




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-31 14:38                             ` Michał Sokołowski
@ 2017-08-31 16:18                               ` Duncan
  2017-09-01 10:21                                 ` ein
  0 siblings, 1 reply; 44+ messages in thread
From: Duncan @ 2017-08-31 16:18 UTC (permalink / raw)
  To: linux-btrfs

Michał Sokołowski posted on Thu, 31 Aug 2017 16:38:14 +0200 as excerpted:

> On 08/31/2017 01:18 PM, Austin S. Hemmelgarn wrote:
>> [...]
>>> Any hint here?
>> Having compression enabled causes no issues with defrag and balance.
>> There appears to be a prevalent belief, however, that defrag is pointless
>> if you're using compression, probably because some versions of
>> `filefrag` don't report compressed extents properly (they list each
>> 128k compressed unit as one extent, which is wrong).
> Is there another tool to verify the number of fragments of a given file
> when using compression?

AFAIK there isn't an official one, tho someone posted a script (python, 
IIRC) at one point and may repost it here.

You can actually get the information needed from filefrag -v (and the 
script does), but it takes a bit more effort than usual, whether scripted 
or by brain-power, to convert the results into real fragmentation numbers.

The problem is that btrfs compression works in 128 KiB blocks, and 
filefrag sees each of those as a fragment.  The extra effort involves 
checking the addresses of the reported 128 KiB blocks to see if they are 
actually contiguous, that is, one starts just after the previous one 
ends.  If so it's actually not fragmented at that point.  But if the 
addresses aren't contiguous, there's fragmentation at that point.
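
A rough sketch of that check (untested; it assumes the column layout of 
recent filefrag -v output, which is the main assumption here):

  filefrag -v somefile | awk '
    /^ *[0-9]+:/ {                            # extent lines look like "  N: ..."
      sub(/\.\..*/, "", $4); start = $4 + 0   # physical start of this extent
      sub(/:.*/,    "", $5); end   = $5 + 0   # physical end of this extent
      if (seen && start != prev + 1) frags++  # gap => real fragment boundary
      prev = end; seen = 1
    }
    END { if (seen) print frags + 1, "real fragment(s)" }'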

I don't personally worry too much about it here, for two reasons.  First, 
I /always/ run with the autodefrag mount option, which keeps 
fragmentation manageable in any case[1], and second, I'm on ssd, where 
the effects of fragmentation aren't as pronounced.  (On spinning rust 
it's generally the seek times that dominate.  On ssds that's 0, but 
there's still an IOPS cost.)

So while I've run filefrag -v and looked at the results a few times out 
of curiosity, and indeed could see the reported fragmentation that was 
actually contiguous, it was simply a curiosity to me, thus my not 
grabbing that script and putting it to immediate use.

---
[1] AFAIK autodefrag checks fragmentation on writes, and rewrites 16 MiB 
blocks if necessary.  If like me you always run it from the moment you 
start putting data on the filesystem, that should work pretty well.  If 
however you haven't been running it or doing manual defrag, because 
defrag only works on writes and the free space may be fragmented enough 
that there are no 16 MiB blocks to write into, it may take a while to 
"catch up", and of course won't defrag anything that's never written to 
again, but is often reread, making its existing fragmentation an issue.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-08-31 16:18                               ` Duncan
@ 2017-09-01 10:21                                 ` ein
  2017-09-01 11:47                                   ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 44+ messages in thread
From: ein @ 2017-09-01 10:21 UTC (permalink / raw)
  To: Duncan, linux-btrfs

On 08/31/2017 06:18 PM, Duncan wrote:
[...]
> Michał Sokołowski posted on Thu, 31 Aug 2017 16:38:14 +0200 as excerpted:
>> Is there another tool to verify fragments number of given file when
>> using compression?
> AFAIK there isn't an official one, tho someone posted a script (python, 
> IIRC) at one point and may repost it here.
>
> You can actually get the information needed from filefrag -v (and the 
> script does), but it takes a bit more effort than usual, scripted or 
> brain-power, to convert the results into real fragmentation numbers.
>
> The problem is that btrfs compression works in 128 KiB blocks, and 
> filefrag sees each of those as a fragment.  The extra effort involves 
> checking the addresses of the reported 128 KiB blocks to see if they are 
> actually contiguous, that is, one starts just after the previous one 
> ends.  If so it's actually not fragmented at that point.  But if the 
> addresses aren't contiguous, there's fragmentation at that point.
>
> I don't personally worry too much about it here, for two reasons.  First, 
> I /always/ run with the autodefrag mount option, which keeps 
> fragmentation manageable in any case[1], and second, I'm on ssd, where 
> the effects of fragmentation aren't as pronounced.  (On spinning rust 
> it's generally the seek times that dominate.  On ssds that's 0, but 
> there's still an IOPS cost.)
>
> So while I've run filefrag -v and looked at the results a few times out 
> of curiosity, and indeed could see the reported fragmentation that was 
> actually contiguous, it was simply a curiosity to me, thus my not 
> grabbing that script and putting it to immediate use.
>
> ---
> [1] AFAIK autodefrag checks fragmentation on writes, and rewrites 16 MiB 
> blocks if necessary.  If like me you always run it from the moment you 
> start putting data on the filesystem, that should work pretty well.  If 
> however you haven't been running it or doing manual defrag, because 
> defrag only works on writes and the free space may be fragmented enough 
> that there are no 16 MiB blocks to write into, it may take a while to "catch 
> up", and of course won't defrag anything that's never written to again, 
> but is often reread, making its existing fragmentation an issue.

Very comprehensive, thank you. I was asking because I'd like to learn
how really random writes by a VM affect BTRFS (vs XFS, Ext4) performance
and try to develop some workaround to reduce/prevent it while keeping
csums, cow (snapshots) and compression.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: number of subvolumes
  2017-09-01 10:21                                 ` ein
@ 2017-09-01 11:47                                   ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 44+ messages in thread
From: Austin S. Hemmelgarn @ 2017-09-01 11:47 UTC (permalink / raw)
  To: ein, linux-btrfs

On 2017-09-01 06:21, ein wrote:
> Very comprehensive, thank you. I was asking because I'd like to learn
> how really random writes by VM affects BTRFS (vs XFS,Ext4) performance
> and try to develop some workaround to reduce/prevent it while having
> csums, cow (snapshots) and compression.

I've actually done some significant testing of VM's using BTRFS as 
backing storage, and would suggest the following:

1. Use raw images in sparse files, and expose them to the VM like SSD's 
(so that DISCARD support works).  Assuming your guests actually support 
DISCARD commands, this will save you just as much space as using 
allocate-on-demand formats like qcow2 or VDI, while providing access 
patterns that BTRFS is a bit better at dealing with.  There are a couple 
of specific caveats to doing this however, namely that you can't use the 
regular VirtIO block device when using QEMU, and have to use VirtIO SCSI 
instead, and you'll need smarter tools to copy images around (ddrescue 
is what I would personally suggest, it can handle sparse files 
correctly, and gives you a nice progress indicator).  A sketch of such a 
setup follows this list.

2. Provide separate disks to the VM for any data you need to store in 
them.  By keeping this separate from the root filesystem's disk, you 
reduce what gets fragmented, and also aren't tied to a particular guest 
OS to the same degree.  As a more concrete example, if you're running a 
database server in a VM, give it a separate disk image for storing the 
database from the one the root filesystem is on.

3. Run with autodefrag on the host system.  Unless you're hitting the 
disks hard from inside the VM, this will actually do a reasonably good 
job of handling fragmentation.

4. Make sure the filesystem you're storing the disk images on has extra 
space.  Ideally, I would set things up so that it has enough room to 
store a second copy of the largest disk image you expect to use.  By 
keeping lots of extra space, you give the allocator in BTRFS more 
opportunity to arrange things intelligently.

5. Defragment your disk images regularly (even when using autodefrag), 
both inside and outside the VM's.  Ideally, run a full defrag inside the 
VM, then shut it down and defragment the disk image outside the VM. 
Defragmenting inside the VM first will compact free space, which in turn 
means that defragmenting outside the VM will do a better job, and that 
future fragmentation will be less likely.  In general, I'd suggest doing 
this at least monthly, and possibly a lot more frequently if you're 
running databases or similar random access workloads in the VM; the 
host-side command is in the sketch after this list.

6. Avoid using a CoW filesystem inside the VM's.  This sounds odd at 
first, but is actually one of the biggest things you can do to reduce 
fragmentation.  In particular, try to avoid using BTRFS or ZFS on a VM 
disk that is stored on BTRFS; both of them will make usual fragmentation 
problems look nonexistent in comparison.
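
To make points 1 and 5 concrete, something along these lines (a sketch 
only; the image path, size, and 32M target extent size are placeholders, 
and the VM will need the rest of its usual configuration):

  qemu-img create -f raw /var/lib/vm/disk.img 40G   # raw image, sparse file
  qemu-system-x86_64 -m 2048 \
    -device virtio-scsi-pci,id=scsi0 \
    -drive if=none,id=hd0,file=/var/lib/vm/disk.img,format=raw,discard=unmap \
    -device scsi-hd,drive=hd0,bus=scsi0.0

  # later, with the VM shut down, defragment the image on the host:
  btrfs filesystem defragment -t 32M /var/lib/vm/disk.img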

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-08-22 13:22 netapp-alike snapshots? Ulli Horlacher
  2017-08-22 13:44 ` Peter Becker
@ 2017-09-09 13:26 ` Ulli Horlacher
  2017-09-09 13:36   ` Marc MERLIN
  1 sibling, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-09-09 13:26 UTC (permalink / raw)
  To: linux-btrfs

On Tue 2017-08-22 (15:22), Ulli Horlacher wrote:
> With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> You can find these snapshots in every local directory (readonly).

> I would like to have something similar with btrfs.
> Is there (where?) such a tool?

I have found none, so I have implemented it by myself:

https://fex.rus.uni-stuttgart.de/snaprotate.html

In contrast to Netapp, with snaprotate the local host administrator can
create a snapshot at any time or by cronjob.

Example:

root@fex:~# snaprotate single 3 /local/home
Create a readonly snapshot of '/local/home' in '/local/home/.snapshot/2017-09-09_1518.single'
Delete subvolume '/local/home/.snapshot/2017-09-09_1255.single'

root@fex:~# snaprotate -l
/local/home/.snapshot/2017-09-08_0000.daily
/local/home/.snapshot/2017-09-09_0000.daily
/local/home/.snapshot/2017-09-09_1331.single
/local/home/.snapshot/2017-09-09_1332.single
/local/home/.snapshot/2017-09-09_1400.hourly
/local/home/.snapshot/2017-09-09_1500.hourly
/local/home/.snapshot/2017-09-09_1518.single

root@fex:~# crontab -l | grep snaprotate
0 * * * * /root/bin/snaprotate -q hourly 2 /local/home
0 0 * * * /root/bin/snaprotate -q daily  3 /local/home
0 0 * * 1 /root/bin/snaprotate -q weekly 1 /local/home

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20170822132208.GD14804@rus.uni-stuttgart.de>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-09-09 13:26 ` Ulli Horlacher
@ 2017-09-09 13:36   ` Marc MERLIN
  2017-09-09 13:44     ` Ulli Horlacher
  0 siblings, 1 reply; 44+ messages in thread
From: Marc MERLIN @ 2017-09-09 13:36 UTC (permalink / raw)
  To: linux-btrfs

On Sat, Sep 09, 2017 at 03:26:14PM +0200, Ulli Horlacher wrote:
> On Tue 2017-08-22 (15:22), Ulli Horlacher wrote:
> > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> > You can find these snapshots in every local directory (readonly).
> 
> > I would like to have something similar with btrfs.
> > Is there (where?) such a tool?
> 
> I have found none, so I have implemented it by myself:
> 
> https://fex.rus.uni-stuttgart.de/snaprotate.html

Not sure how you looked :)
https://www.google.com/search?q=btrfs+netapp+snapshot
http://marc.merlins.org/perso/btrfs/post_2014-03-21_Btrfs-Tips_-How-To-Setup-Netapp-Style-Snapshots.html

Might not be exactly what you wanted, but been using it for 3 years.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-09-09 13:36   ` Marc MERLIN
@ 2017-09-09 13:44     ` Ulli Horlacher
  2017-09-09 19:43       ` Andrei Borzenkov
  0 siblings, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-09-09 13:44 UTC (permalink / raw)
  To: linux-btrfs

On Sat 2017-09-09 (06:36), Marc MERLIN wrote:

> > On Tue 2017-08-22 (15:22), Ulli Horlacher wrote:
> > > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> > > You can find these snapshots in every local directory (readonly).
> > 
> > I have found none, so I have implemented it by myself:
> > 
> > https://fex.rus.uni-stuttgart.de/snaprotate.html
> 
> Not sure how you looked :)
> http://marc.merlins.org/perso/btrfs/post_2014-03-21_Btrfs-Tips_-How-To-Setup-Netapp-Style-Snapshots.html
> 
> Might not be exactly what you wanted, but been using it for 3 years.

Your tool does not create .snapshot subdirectories in EVERY directory like
Netapp does.
Example:

framstag@fex:~: cd ~/Mail/.snapshot/
framstag@fex:~/Mail/.snapshot: l
lR-X - 2017-09-09 09:55 2017-09-09_0000.daily -> /local/home/.snapshot/2017-09-09_0000.daily/framstag/Mail
lR-X - 2017-09-09 14:00 2017-09-09_1400.hourly -> /local/home/.snapshot/2017-09-09_1400.hourly/framstag/Mail
lR-X - 2017-09-09 15:00 2017-09-09_1500.hourly -> /local/home/.snapshot/2017-09-09_1500.hourly/framstag/Mail
lR-X - 2017-09-09 15:18 2017-09-09_1518.single -> /local/home/.snapshot/2017-09-09_1518.single/framstag/Mail
lR-X - 2017-09-09 15:20 2017-09-09_1520.single -> /local/home/.snapshot/2017-09-09_1520.single/framstag/Mail
lR-X - 2017-09-09 15:22 2017-09-09_1522.single -> /local/home/.snapshot/2017-09-09_1522.single/framstag/Mail

My users (and I) need snapshots in this way.


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20170909133612.7iqwr6cbjxzvfny6@merlins.org>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-09-09 13:44     ` Ulli Horlacher
@ 2017-09-09 19:43       ` Andrei Borzenkov
  2017-09-09 19:52         ` Ulli Horlacher
  2017-09-10 14:54         ` Marc MERLIN
  0 siblings, 2 replies; 44+ messages in thread
From: Andrei Borzenkov @ 2017-09-09 19:43 UTC (permalink / raw)
  To: linux-btrfs

09.09.2017 16:44, Ulli Horlacher wrote:
> 
> Your tool does not create .snapshot subdirectories in EVERY directory like

Neither does NetApp. Those "directories" are magic handles that do not
really exist.

> Netapp does.
> Example:
> 
> framstag@fex:~: cd ~/Mail/.snapshot/
> framstag@fex:~/Mail/.snapshot: l
> lR-X - 2017-09-09 09:55 2017-09-09_0000.daily -> /local/home/.snapshot/2017-09-09_0000.daily/framstag/Mail

Apart from the obvious problem with recursive directory traversal (NetApp
.snapshot directories are not visible in a normal directory listing), those
will also be captured in snapshots and cannot be removed. NetApp snapshots
themselves do not expose .snapshot "directories".

> lR-X - 2017-09-09 14:00 2017-09-09_1400.hourly -> /local/home/.snapshot/2017-09-09_1400.hourly/framstag/Mail
> lR-X - 2017-09-09 15:00 2017-09-09_1500.hourly -> /local/home/.snapshot/2017-09-09_1500.hourly/framstag/Mail
> lR-X - 2017-09-09 15:18 2017-09-09_1518.single -> /local/home/.snapshot/2017-09-09_1518.single/framstag/Mail
> lR-X - 2017-09-09 15:20 2017-09-09_1520.single -> /local/home/.snapshot/2017-09-09_1520.single/framstag/Mail
> lR-X - 2017-09-09 15:22 2017-09-09_1522.single -> /local/home/.snapshot/2017-09-09_1522.single/framstag/Mail
> 
> My users (and I) need snapshots in this way.
> 
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-09-09 19:43       ` Andrei Borzenkov
@ 2017-09-09 19:52         ` Ulli Horlacher
  2017-09-10  7:10           ` A L
  2017-09-10 14:54         ` Marc MERLIN
  1 sibling, 1 reply; 44+ messages in thread
From: Ulli Horlacher @ 2017-09-09 19:52 UTC (permalink / raw)
  To: linux-btrfs

On Sat 2017-09-09 (22:43), Andrei Borzenkov wrote:

> > Your tool does not create .snapshot subdirectories in EVERY directory like
> 
> Neither does NetApp. Those "directories" are magic handles that do not
> really exist.

I know.
But symbolic links are the next best thing (I am not a kernel programmer).


> Apart from obvious problem with recursive directory traversal (NetApp
> .snapshot are not visible with normal directory list)

Yes, they are, at least sometimes; e.g. tar includes the snapshots.


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<14c87878-a5a0-d7d3-4a76-c55812e751a3@gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-09-09 19:52         ` Ulli Horlacher
@ 2017-09-10  7:10           ` A L
  0 siblings, 0 replies; 44+ messages in thread
From: A L @ 2017-09-10  7:10 UTC (permalink / raw)
  To: linux-btrfs

Perhaps netapp is using a VFS overlay. There is really only one snapshot, but it is shown in the overlay on every folder. Kind of the same as with Samba Shadow Copies.

---- From: Ulli Horlacher <framstag@rus.uni-stuttgart.de> -- Sent: 2017-09-09 - 21:52 ----

> On Sat 2017-09-09 (22:43), Andrei Borzenkov wrote:
> 
>> > Your tool does not create .snapshot subdirectories in EVERY directory like
>> 
>> Neither does NetApp. Those "directories" are magic handles that do not
>> really exist.
> 
> I know.
> But symbolic links are the next best thing (I am not a kernel programmer).
> 
> 
>> Apart from obvious problem with recursive directory traversal (NetApp
>> .snapshot are not visible with normal directory list)
> 
> Yes, they are, at least sometimes; e.g. tar includes the snapshots.
> 
> 
> -- 
> Ullrich Horlacher              Server und Virtualisierung
> Rechenzentrum TIK         
> Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
> Allmandring 30a                Tel:    ++49-711-68565868
> 70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
> REF:<14c87878-a5a0-d7d3-4a76-c55812e751a3@gmail.com>



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: netapp-alike snapshots?
  2017-09-09 19:43       ` Andrei Borzenkov
  2017-09-09 19:52         ` Ulli Horlacher
@ 2017-09-10 14:54         ` Marc MERLIN
  1 sibling, 0 replies; 44+ messages in thread
From: Marc MERLIN @ 2017-09-10 14:54 UTC (permalink / raw)
  To: Andrei Borzenkov, framstag; +Cc: linux-btrfs

On Sat, Sep 09, 2017 at 10:43:16PM +0300, Andrei Borzenkov wrote:
> 09.09.2017 16:44, Ulli Horlacher wrote:
> > 
> > Your tool does not create .snapshot subdirectories in EVERY directory like
> 
> Neither does NetApp. Those "directories" are magic handles that do not
> really exist.
 
Correct, thanks for saving me typing the same thing (I actually did work
at netapp many years back, so I'm familiar with how they work)

> > Netapp does.
> > Example:
> > 
> > framstag@fex:~: cd ~/Mail/.snapshot/
> > framstag@fex:~/Mail/.snapshot: l
> > lR-X - 2017-09-09 09:55 2017-09-09_0000.daily -> /local/home/.snapshot/2017-09-09_0000.daily/framstag/Mail
> 
> Apart from obvious problem with recursive directory traversal (NetApp
> .snapshot are not visible with normal directory list) those will also be
> captured in snapshots and cannot be removed. NetApp snapshots themselves
> do not expose .snapshot "directories".

Correct. Netapp knows this of course, which is why those .snapshot
directories are "magic" and hidden from ls(1), find(1) and others when
they do a readdir(3).

> > lR-X - 2017-09-09 14:00 2017-09-09_1400.hourly -> /local/home/.snapshot/2017-09-09_1400.hourly/framstag/Mail
> > lR-X - 2017-09-09 15:00 2017-09-09_1500.hourly -> /local/home/.snapshot/2017-09-09_1500.hourly/framstag/Mail
> > lR-X - 2017-09-09 15:18 2017-09-09_1518.single -> /local/home/.snapshot/2017-09-09_1518.single/framstag/Mail
> > lR-X - 2017-09-09 15:20 2017-09-09_1520.single -> /local/home/.snapshot/2017-09-09_1520.single/framstag/Mail
> > lR-X - 2017-09-09 15:22 2017-09-09_1522.single -> /local/home/.snapshot/2017-09-09_1522.single/framstag/Mail
> > 
> > My users (and I) need snapshots in this way.

You are used to them being there, I was too :)
While you could create lots of symlinks, I opted not to since it would
have littered the filesystem.
I can simply cd $SNAPROOT/volname_hourly/$PWD
and end up where I wanted to be.

I suppose you could make a snapcd shell function that does this for you.
The only issue is that volname_hourly comes before the rest of the path,
so you aren't given a list of all the snapshots available for a given
path, you have to cd into the given snapshot first, and then add the
path.
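
Such a snapcd function could look roughly like this (an untested sketch; 
SNAPROOT and the volname_hourly naming follow the example above):

  snapcd () {
      # usage: snapcd [snapshot-name]; defaults to the hourly snapshot
      local snap=${1:-volname_hourly}
      cd "$SNAPROOT/$snap$PWD" || return
  }
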
I agree it's not as nice as netapp, but honestly I don't think you can
do better with btrfs at this point.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2017-09-10 14:54 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-22 13:22 netapp-alike snapshots? Ulli Horlacher
2017-08-22 13:44 ` Peter Becker
2017-08-22 14:24   ` Ulli Horlacher
2017-08-22 16:08     ` Peter Becker
2017-08-22 16:48       ` Ulli Horlacher
2017-08-22 16:45     ` Roman Mamedov
2017-08-22 16:57       ` Ulli Horlacher
2017-08-22 17:19         ` A L
2017-08-22 18:01           ` Ulli Horlacher
2017-08-22 18:36             ` Peter Grandi
2017-08-22 20:48               ` Ulli Horlacher
2017-08-23  7:18                 ` number of subvolumes Ulli Horlacher
2017-08-23  8:37                   ` A L
2017-08-23 16:48                     ` Ferry Toth
2017-08-24 17:45                       ` Peter Grandi
2017-08-31  6:49                         ` Ulli Horlacher
2017-08-31 11:18                           ` Austin S. Hemmelgarn
2017-08-31 14:38                             ` Michał Sokołowski
2017-08-31 16:18                               ` Duncan
2017-09-01 10:21                                 ` ein
2017-09-01 11:47                                   ` Austin S. Hemmelgarn
2017-08-24 19:40                       ` Marat Khalili
2017-08-24 21:56                         ` Ferry Toth
2017-08-25  5:54                           ` Chris Murphy
2017-08-25 11:45                           ` Austin S. Hemmelgarn
2017-08-25 12:55                             ` Ferry Toth
2017-08-25 19:18                               ` Austin S. Hemmelgarn
2017-08-23 12:11                   ` Peter Grandi
2017-08-22 21:53               ` user snapshots Ulli Horlacher
2017-08-23  6:28                 ` Dmitrii Tcvetkov
2017-08-23  7:16                   ` Dmitrii Tcvetkov
2017-08-23  7:20                     ` Ulli Horlacher
2017-08-23 11:42                       ` Peter Grandi
2017-08-23 21:13                         ` Ulli Horlacher
2017-08-25 11:28                           ` Austin S. Hemmelgarn
2017-08-22 17:36         ` netapp-alike snapshots? Roman Mamedov
2017-08-22 18:10           ` Ulli Horlacher
2017-09-09 13:26 ` Ulli Horlacher
2017-09-09 13:36   ` Marc MERLIN
2017-09-09 13:44     ` Ulli Horlacher
2017-09-09 19:43       ` Andrei Borzenkov
2017-09-09 19:52         ` Ulli Horlacher
2017-09-10  7:10           ` A L
2017-09-10 14:54         ` Marc MERLIN

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.