linux-btrfs.vger.kernel.org archive mirror
* Atomic replacement of subvolumes is not possible
@ 2010-06-26 17:25 Daniel Baumann
  2010-06-28  0:44 ` C Anthony Risinger
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Baumann @ 2010-06-26 17:25 UTC
  To: linux-btrfs; +Cc: Roger Leigh

Hi,

this is basically a forward from
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=587253

"rename(2) allows for the atomic replacement of files.  Being able to
atomically replace subvolume snapshots would be equally invaluable,
since it would permit lock-free replacement of subvolumes.

  % btrfs subvolume snapshot <src> <dest>

creates dest as a snapshot of src. However, if I want to do the
converse,

  % btrfs subvolume snapshot <dest> <src>

then <dest> is snapshotted as <src>/<dest>, i.e. not replacing the
original subvolume, but going inside the original subvolume.

Use case 1:
  I have a subvolume of data under active use, which I want to
  periodically update.  I'd like to do this by atomically
  replacing its contents.  I can replace the content right now
  by deleting the old subvolume and then snapshotting the new
  one in its place, but it's racy.  It really needs to be
  replaced in a single operation, or else there's a small window
  where there is no data, and I'd need to resort to some external
  locking to protect myself.

Use case 2:
  In schroot, we create btrfs subvolume snapshots to get copy-on-
  write chroots.  This works just fine.  We also provide direct
  access to the "source" subvolume, but since it could be
  snapshotted in an inconsistent state while being updated, we
  want to do the following:

  · snapshot source subvolume
  · update snapshot
  · replace source volume with updated snapshot"
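
As a point of comparison, the file-level pattern the report refers to can be sketched in shell. This is only an illustration: the directory and file names are invented, and it assumes the staging file and the target sit on the same filesystem, so that mv(1) ends up as a single rename(2) call.

```shell
# Sketch: atomic replacement of a regular file via rename(2).
# All paths are made up for illustration.
workdir=$(mktemp -d)
target="$workdir/dataset.txt"
printf 'old contents\n' > "$target"

# Write the new version under a temporary name in the SAME directory,
# so the final mv is a rename within one filesystem rather than a copy.
staging=$(mktemp "$workdir/.dataset.XXXXXX")
printf 'new contents\n' > "$staging"

# Readers see either the old file or the new one, never a missing file.
mv -f "$staging" "$target"

result=$(cat "$target")
echo "$result"
rm -rf "$workdir"
```

The report asks for the same guarantee when the object being replaced is a subvolume rather than a file.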

Please keep roger in the cc for any replies, thanks.

Regards,
Daniel

-- 
Address:        Daniel Baumann, Burgunderstrasse 3, CH-4562 Biberist
Email:          daniel.baumann@panthera-systems.net
Internet:       http://people.panthera-systems.net/~daniel-baumann/
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Atomic replacement of subvolumes is not possible
  2010-06-26 17:25 Atomic replacement of subvolumes is not possible Daniel Baumann
@ 2010-06-28  0:44 ` C Anthony Risinger
  2010-06-30 13:31   ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: C Anthony Risinger @ 2010-06-28  0:44 UTC
  To: daniel; +Cc: linux-btrfs, Roger Leigh

On Sat, Jun 26, 2010 at 12:25 PM, Daniel Baumann <daniel@debian.org> wrote:
> [...]

i am also looking for functionality similar to this, except i would
like to be able to replace the DEFAULT subvolume, with an empty or
existing subvolume, and put the original default subvolume INSIDE the
new root (or drop it completely), outlined by this post and the thread
it's in:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05278.html

is there any feedback on these actions?  no one seems to even respond :-(

it would seem we need ways to swap subvolumes around, _including_ the
default, providing the on-disk format supports such operations.

C Anthony

* Re: Atomic replacement of subvolumes is not possible
  2010-06-28  0:44 ` C Anthony Risinger
@ 2010-06-30 13:31   ` Chris Mason
  2010-06-30 14:26     ` C Anthony Risinger
  2010-07-02 21:39     ` Bug#587253: " Roger Leigh
  0 siblings, 2 replies; 9+ messages in thread
From: Chris Mason @ 2010-06-30 13:31 UTC
  To: C Anthony Risinger; +Cc: daniel, linux-btrfs, Roger Leigh

On Sun, Jun 27, 2010 at 07:44:12PM -0500, C Anthony Risinger wrote:
> On Sat, Jun 26, 2010 at 12:25 PM, Daniel Baumann <daniel@debian.org> wrote:
> > [...]
> >
> > Use case 1:
> >  I have a subvolume of data under active use, which I want to
> >  periodically update.  I'd like to do this by atomically
> >  replacing its contents.  I can replace the content right now
> >  by deleting the old subvolume and then snapshotting the new
> >  one in its place, but it's racy.  It really needs to be
> >  replaced in a single operation, or else there's a small window
> >  where there is no data, and I'd need to resort to some external
> >  locking to protect myself.

I'm not sure I understand use case #1.  The problem is that you'll have
files open in the subvolume and you can't just pull the rug out from
under them.  Could you tell me a little more about what you're trying to
do?

> > [...]
>
> i am also looking for functionality similar to this, except i would
> like to be able to replace the DEFAULT subvolume, with an empty or
> existing subvolume, and put the original default subvolume INSIDE the
> new root (or drop it completely), outlined by this post and the thread
> it's in:
>
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05278.html
>
> is there any feedback on these actions?  no one seems to even respond :-(
>
> it would seem we need ways to swap subvolumes around, _including_ the
> default, providing the on-disk format supports such operations.

Moving 'default' generally involves a reboot for the same reasons.  We
have to worry about open files and their view of the filesystem.  mv on
a directory won't affect file handles that are open, and renaming
subvolumes needs to follow a similar model.
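
The directory case mentioned here is easy to reproduce. The sketch below uses an ordinary directory as a stand-in for a subvolume (all names are made up): a file is opened, its parent directory is renamed, and the descriptor keeps working even though the old path no longer resolves.

```shell
# Sketch: an open file descriptor survives a rename of its parent directory.
workdir=$(mktemp -d)
mkdir "$workdir/vol"
printf 'payload\n' > "$workdir/vol/data"

# Open the file on descriptor 3 before the directory is renamed.
exec 3< "$workdir/vol/data"

# Rename the containing directory; the old path disappears.
mv "$workdir/vol" "$workdir/vol.old"

old_path_exists=no
if [ -e "$workdir/vol/data" ]; then
    old_path_exists=yes
fi

# The descriptor still refers to the same inode and stays readable.
seen=$(cat <&3)
exec 3<&-

echo "read via fd: $seen; old path exists: $old_path_exists"
rm -rf "$workdir"
```

Whether a subvolume rename can offer exactly the same behaviour is the open question in this thread; the sketch only shows the model being referred to.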

-chris



* Re: Atomic replacement of subvolumes is not possible
  2010-06-30 13:31   ` Chris Mason
@ 2010-06-30 14:26     ` C Anthony Risinger
  2010-07-02  1:30       ` Chris Mason
  2010-07-02 21:39     ` Bug#587253: " Roger Leigh
  1 sibling, 1 reply; 9+ messages in thread
From: C Anthony Risinger @ 2010-06-30 14:26 UTC
  To: Chris Mason, C Anthony Risinger, daniel, linux-btrfs, Roger Leigh

On Wed, Jun 30, 2010 at 8:31 AM, Chris Mason <chris.mason@oracle.com> wrote:
> [...]
>
> Moving 'default' generally involves a reboot for the same reasons.  We
> have to worry about open files and their view of the filesystem.  mv on
> a directory won't affect file handles that are open, and renaming
> subvolumes needs to follow a similar model.

could we fail if the user tries to replace a subvolume while it's
being used?  what if the root device is _not_ the default (".")
subvolume, then can it be swapped?

in my use case, i am running in initramfs, so the root device has not
even been mounted or pivoted to; it should be safe to do whatever i
want to the filesystem.  i want to move the user's installation to a
dedicated subvolume.

what about this:  would it be possible to have TWO subvolumes by
"default"?  the regular one (current directory, "."):

mount -o subvol=. <btrfs_dev> /mnt

would behave as it does now.  BUT... there would then be a special,
permanent (like "." is right now) subvol, say "parent directory"
(".."):

mount -o subvol=.. <btrfs_dev> /mnt

TWO dots would mount the parent of ".", where i could then swap out
the real default (".").

would that work?

C Anthony

* Re: Atomic replacement of subvolumes is not possible
  2010-06-30 14:26     ` C Anthony Risinger
@ 2010-07-02  1:30       ` Chris Mason
  2010-07-02 16:26         ` C Anthony Risinger
  2010-07-02 19:38         ` Goffredo Baroncelli
  0 siblings, 2 replies; 9+ messages in thread
From: Chris Mason @ 2010-07-02  1:30 UTC
  To: C Anthony Risinger; +Cc: daniel, linux-btrfs, Roger Leigh

On Wed, Jun 30, 2010 at 09:26:11AM -0500, C Anthony Risinger wrote:
> [...]
>
> could we fail if the user tries to replace a subvolume while it's
> being used?  what if the root device is _not_ the default (".")
> subvolume, then can it be swapped?
>
> in my use case, i am running in initramfs, so the root device has not
> even been mounted or pivoted to; it should be safe to do whatever i
> want to the filesystem.  i want to move the user's installation to a
> dedicated subvolume.
>
> what about this:  would it be possible to have TWO subvolumes by
> "default"?  the regular one (current directory, "."):
>
> mount -o subvol=. <btrfs_dev> /mnt
>
> would behave as it does now.  BUT... there would then be a special,
> permanent (like "." is right now) subvol, say "parent directory"
> (".."):
>
> mount -o subvol=.. <btrfs_dev> /mnt
>
> TWO dots would mount the parent of ".", where i could then swap out
> the real default (".").
>
> would that work?

We do provide a set-default ioctl that can be used to change the default
for the next mount.   This is pretty close to what you want, let me
think about ways to make it easier to use.

-chris


* Re: Atomic replacement of subvolumes is not possible
  2010-07-02  1:30       ` Chris Mason
@ 2010-07-02 16:26         ` C Anthony Risinger
  2010-07-02 19:38         ` Goffredo Baroncelli
  1 sibling, 0 replies; 9+ messages in thread
From: C Anthony Risinger @ 2010-07-02 16:26 UTC
  To: Chris Mason, C Anthony Risinger, daniel, linux-btrfs, Roger Leigh

On Thu, Jul 1, 2010 at 8:30 PM, Chris Mason <chris.mason@oracle.com> wrote:
> [...]
>
> We do provide a set-default ioctl that can be used to change the default
> for the next mount.   This is pretty close to what you want, let me
> think about ways to make it easier to use.

that's the thing; set-default is not the effect i need to achieve.

now, if there was a way i could use "set-default" AND promote that
subvol to become the real root/default subvol (.), then that would
work.  maybe as a destructive option to set-default.

i need to effectively move the user's installation from subvol ".", to
subvol "__active".  this is easy with any subvol _except_ (.).
without this, the user has a bunch of dead files they will never see,
and will eventually consume space.  the only way to remove them, as of
now, is to tell the user to mount the (.) subvol, and "rm -rf"
bin/lib/usr/etc... because there is no way to manage the "." subvol.

C Anthony

* Re: Atomic replacement of subvolumes is not possible
  2010-07-02  1:30       ` Chris Mason
  2010-07-02 16:26         ` C Anthony Risinger
@ 2010-07-02 19:38         ` Goffredo Baroncelli
  2010-07-03 15:19           ` C Anthony Risinger
  1 sibling, 1 reply; 9+ messages in thread
From: Goffredo Baroncelli @ 2010-07-02 19:38 UTC
  To: Chris Mason, C Anthony Risinger, daniel, linux-btrfs, Roger Leigh

On Friday, July 02, 2010, Chris Mason wrote:
> On Wed, Jun 30, 2010 at 09:26:11AM -0500, C Anthony Risinger wrote:
[...]
> > what about this:  would it be possible to have TWO subvolumes by
> > "default"?  the regular one (current directory, "."):
> > 
> > mount -o subvol=. <btrfs_dev> /mnt
> > 
> > would behave as it does now.  BUT... there would then be a special,
> > permanent (like "." is right now) subvol, say "parent directory"
> > (".."):
> > 
> > mount -o subvol=.. <btrfs_dev> /mnt
> > 
> > TWO dots would mount the parent of ".", where i could then swap out
> > the real default (".").
> > 
> > would that work?
> 
> We do provide a set-default ioctl that can be used to change the default
> for the next mount.   This is pretty close to what you want, let me
> think about ways to make it easier to use.
> 
> -chris

Hello Chris,

to me it seems that Anthony's request makes sense, and it is not so
difficult to implement. We have all the pieces; we need only a "policy"
regarding subvolume use and a bit of glue.
It should be sufficient to "replace" the standard mkfs.btrfs command with
the following command sequence:

# mkfs.btrfs <device>
# mount <device> /mnt/<tmp>
# btrfs subvol create /mnt/<tmp>/__root__
# btrfs subvol set-default <id of __root__> /mnt/<tmp>/
# umount <device>

So if a user doesn't want to care about subvolumes, he simply mounts the
btrfs filesystem without any option. This user will work inside the
__root__ subvolume, where he can create snapshots, subvolumes...

If instead a user wants to play with different roots in different
subvolumes, he has to mount the "." subvolume, where he can manage the
root subvolume(s) (renaming, moving, snapshotting/branching ...).

The key is to think of the "." subvolume as a place only for handling
subvolumes, not for storing files. If you don't want to use it, you can
simply ignore it, because the default is to mount the __root__ subvolume.


Goffredo

-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijackATinwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512


* Bug#587253: Atomic replacement of subvolumes is not possible
  2010-06-30 13:31   ` Chris Mason
  2010-06-30 14:26     ` C Anthony Risinger
@ 2010-07-02 21:39     ` Roger Leigh
  1 sibling, 0 replies; 9+ messages in thread
From: Roger Leigh @ 2010-07-02 21:39 UTC
  To: Chris Mason, C Anthony Risinger, daniel, linux-btrfs,
	Roger Leigh, 587253

On Wed, Jun 30, 2010 at 09:31:42AM -0400, Chris Mason wrote:
> On Sun, Jun 27, 2010 at 07:44:12PM -0500, C Anthony Risinger wrote:
> > On Sat, Jun 26, 2010 at 12:25 PM, Daniel Baumann <daniel@debian.org> wrote:
> > > [...]
> > >
> > > Use case 1:
> > >  I have a subvolume of data under active use, which I want to
> > >  periodically update.  I'd like to do this by atomically
> > >  replacing its contents.  I can replace the content right now
> > >  by deleting the old subvolume and then snapshotting the new
> > >  one in its place, but it's racy.  It really needs to be
> > >  replaced in a single operation, or else there's a small window
> > >  where there is no data, and I'd need to resort to some external
> > >  locking to protect myself.
> 
> I'm not sure I understand use case #1.  The problem is that you'll have
> files open in the subvolume and you can't just pull the rug out from
> under them.  Could you tell me a little more about what you're trying to
> do?

This case was slightly contrived, but one example would be that I have
programs using generated/downloaded datasets.  I periodically update
these datasets.  The programs using these datasets should use the old
data or the replacement new data, but not a mixture of the two during
the replacement, hence the need to atomically update.

A real-world example: I download entire genome databases from the
internet which are regularly updated.  Programs querying/analysing
the databases might take a while to run and I might have many running
concurrently.  But, I do need to update them without interrupting
running programs.

> [...]
>
> Moving 'default' generally involves a reboot for the same reasons.  We
> have to worry about open files and their view of the filesystem.  mv on
> a directory won't affect file handles that are open, and renaming
> subvolumes needs to follow a similar model.

Thinking more about the problem, there are some possibilities I'd
like to suggest.  I'm currently unfamiliar with the btrfs internals,
so please forgive me if this is not feasible.

Firstly, would it be possible to swap subvolumes?  Sort of like
pivot_root but to atomically replace one subvolume with another.

  % btrfs subvolume swap /path/to/fs/subvol1 /path/to/fs/subvol2

would exchange /path/to/fs/subvol1 and /path/to/fs/subvol2 so that
the subvol at /path/to/fs/subvol2 would be visible at
/path/to/fs/subvol1 (and vice versa, of course).  Because both
subvolumes remain intact, this shouldn't affect programs with open
files or directories since nothing is deleted.  I guess this is
semantically equivalent to rename(2) of in-use directories.  At
least for use case 2, above, this would be sufficient to work around
the lack of atomic replace, since we can then delete the unwanted
subvol.
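
Until something like this exists, one workaround that avoids the delete-then-recreate window is to leave both subvolumes where they are and switch a symlink instead. This is a different technique from the swap proposed above, and it only helps consumers that go through the link. The sketch uses plain directories as stand-ins for subvolumes, invented names, and assumes GNU mv for the -T option.

```shell
# Sketch: atomically repoint a "current" symlink from one tree to another.
workdir=$(mktemp -d)
mkdir "$workdir/data.v1" "$workdir/data.v2"   # stand-ins for two subvolumes
echo v1 > "$workdir/data.v1/release"
echo v2 > "$workdir/data.v2/release"

# Consumers always go through the "current" symlink.
ln -s data.v1 "$workdir/current"

# Build the replacement link under a temporary name, then rename it over
# the live one. -T (GNU mv) treats the destination as a plain name, so mv
# replaces the link itself instead of moving into the directory behind it.
ln -s data.v2 "$workdir/current.tmp"
mv -T "$workdir/current.tmp" "$workdir/current"

now=$(cat "$workdir/current/release")
echo "$now"
rm -rf "$workdir"
```

Because data.v1 is left intact, processes that already have files open in it are not disturbed; it can be removed later once nothing uses it.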

There's the requirement that programs using the old subvolume still
have access to open files.  I see that each subvolume is a separate
device, so I assume that deleting a subvolume means any open
filehandles are no longer valid?  A suggestion here: akin to an
unlink(2)ed file remaining open until the last user close()s the
last file descriptor referencing it, would it be possible for the
btrfs subvolume to only be deleted when the last user finishes
referencing it?  I.e. the subvolume deletion is "lazy", so it's no
longer visible/accessible but remains intact until the last file/
directory fd is closed (including processes with this as their cwd).
Or, at least, it would behave similarly to being in a directory which
has been "rm -rf"ed, since this is effectively what we did.

This would allow direct atomic replacement of subvolumes without
impacting on running processes except as would be expected if running
on a traditional filesystem where the directory has been removed.
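
For comparison, this is the behaviour on an ordinary directory tree
that the "lazy" deletion would mimic.  It is a minimal sketch with
made-up names (scratch, file), and it says nothing about how btrfs
subvolume deletion actually behaves:

```shell
#!/bin/sh
# Sketch: an unlinked file stays readable through an already-open fd.
scratch=$(mktemp -d)
echo "still here" > "$scratch/file"

exec 4< "$scratch/file"     # keep the file open on fd 4
rm -rf "$scratch"           # remove the whole tree underneath it

# The path is gone for everybody else...
if [ -e "$scratch" ]; then gone=no; else gone=yes; fi

# ...but the data remains until the last descriptor is closed.
read -r content <&4
exec 4<&-

echo "path gone: $gone, content: $content"
```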

Lastly, regarding the comments about the default subvolume, ".".
When I first started using btrfs some months ago, I read the
documentation as mkfs.btrfs creating a default subvolume named
"default" similar to the __root__ suggestion and was quite
confused by the actual behaviour.  IMHO, having an initial
default subvolume named "default", "__root__" or whatever
makes a lot more sense than allowing normal files to go
into "." by default.  Users who never use subvolumes will never
need to be aware of this, but it will make use of subvolumes
much more straightforward for the rest of us!


Kind regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]


* Re: Atomic replacement of subvolumes is not possible
  2010-07-02 19:38         ` Goffredo Baroncelli
@ 2010-07-03 15:19           ` C Anthony Risinger
  0 siblings, 0 replies; 9+ messages in thread
From: C Anthony Risinger @ 2010-07-03 15:19 UTC (permalink / raw)
  To: kreijack; +Cc: Chris Mason, daniel, linux-btrfs, Roger Leigh

On Fri, Jul 2, 2010 at 2:38 PM, Goffredo Baroncelli <kreijack@gmail.com> wrote:
> On Friday, July 02, 2010, Chris Mason wrote:
>> On Wed, Jun 30, 2010 at 09:26:11AM -0500, C Anthony Risinger wrote:
> [...]
>> > what about this:  would it be possible to have TWO subvolumes by
>> > "default"?  the regular one (current directory, "."):
>> >
>> > mount -o subvol=. <btrfs_dev> /mnt
>> >
>> > would behave as it does now.  BUT... there would then be a special,
>> > permanent (like "." is right now) subvol, say "parent directory"
>> > (".."):
>> >
>> > mount -o subvol=.. <btrfs_dev> /mnt
>> >
>> > TWO dots would mount the parent of ".", where i could then swap out
>> > the real default (".").
>> >
>> > would that work?
>>
>> We do provide a set-default ioctl that can be used to change the default
>> for the next mount.  This is pretty close to what you want, let me
>> think about ways to make it easier to use.
>>
>> -chris
>
> Hello Chris,
>
> to me it seems that the Anthony request make sense. And it not so difficult to
> have. We have all the pieces, we need only a "policy" regarding the subvolume
> use and a bit of glue
> It should be sufficent to "replace" the standard mkfs.btrfs command with the
> following commands sequence
>
> # mkfs.btrfs <device>
> # mount <device> /mnt/<tmp>
> # btrfs subvol create /mnt/<tmp>/__root__
> # btrfs subvol set-default __root__ /mnt/<tmp>/
> # umount <device>
>
> So if an user don't want to care about a subvolume, he simply mount a btrfs
> filesystem without any option. This user will work inside the __root__
> subvolume, where he can create snapshot, subvolume...
>
> Instead if an user want to play with different root in different subvolumes,
> he have to mount the ".", where he can manage the root-subvolume(s) (renaming,
> moving, snapshotting/branching ... ).
>
> The key is to think the "." subvolume only to handling the subvolumes and not
> to storing files. If you don't want to use it, you can simply ignore it,
> because the default is to mount the __root__ subvolume.

i don't want to comment anymore on this thread, as i feel i kind of
hijacked it :-), but what Goffredo has suggested above is a great idea
and would solve my default subvol problems completely.

the real problem is that users are installing into the "." subvol not
knowing they cannot easily manipulate the system after that.  as
Goffredo hinted:

"The key is to think the "." subvolume only to handling the subvolumes
and not to storing files."

if an empty subvol is created, then marked as the new mount default,
users would never know the difference, and integrators like me could
still get "underneath" to prepare the system for all the cool
distribution-specific btrfs features.
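
for illustration, the sequence quoted above could be wrapped up roughly
like this.  it is only a sketch under assumptions: the helper names
(run, make_rooted_btrfs), the BTRFS_SKETCH_RUN switch and the
/dev/sdX and /mnt/tmp arguments are all made up, and as far as i know
"btrfs subvolume set-default" expects a numeric subvolume ID rather
than a name, so the ID is looked up first.  by default it only prints
the commands instead of running them:

```shell
#!/bin/sh
# Sketch: create a filesystem whose default subvolume is "__root__".
# Dry-run unless BTRFS_SKETCH_RUN=1 is set (hypothetical switch).
run() {
    if [ "${BTRFS_SKETCH_RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

make_rooted_btrfs() {
    dev=$1   # block device, placeholder
    mnt=$2   # temporary mount point, placeholder
    run mkfs.btrfs "$dev"
    run mount "$dev" "$mnt"
    run btrfs subvolume create "$mnt/__root__"
    if [ "${BTRFS_SKETCH_RUN:-0}" = 1 ]; then
        # "btrfs subvolume list" prints lines like
        # "ID 256 ... path __root__"; take the ID of our subvolume.
        id=$(btrfs subvolume list "$mnt" |
             awk '$NF == "__root__" { print $2 }')
    else
        id='<id-of-__root__>'
    fi
    run btrfs subvolume set-default "$id" "$mnt"
    run umount "$mnt"
}

# Dry run with placeholder arguments: prints the five commands.
make_rooted_btrfs /dev/sdX /mnt/tmp
```

after that, a plain mount should land in __root__, and the top level
should still be reachable by asking for it explicitly (i believe
"mount -o subvolid=5", but treat that as an assumption to verify).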

C Anthony
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


end of thread, other threads:[~2010-07-03 15:19 UTC | newest]

Thread overview: 9+ messages
2010-06-26 17:25 Atomic replacement of subvolumes is not possible Daniel Baumann
2010-06-28  0:44 ` C Anthony Risinger
2010-06-30 13:31   ` Chris Mason
2010-06-30 14:26     ` C Anthony Risinger
2010-07-02  1:30       ` Chris Mason
2010-07-02 16:26         ` C Anthony Risinger
2010-07-02 19:38         ` Goffredo Baroncelli
2010-07-03 15:19           ` C Anthony Risinger
2010-07-02 21:39     ` Bug#587253: " Roger Leigh
