All of lore.kernel.org
 help / color / mirror / Atom feed
* btrfs receive leaves new subvolume modifiable during operation
@ 2017-01-31 23:32 Christian Lupien
  2017-02-01  5:09 ` Duncan
  0 siblings, 1 reply; 20+ messages in thread
From: Christian Lupien @ 2017-01-31 23:32 UTC (permalink / raw)
  To: linux-btrfs

I have been testing btrfs send/receive. I like it.

During those tests I discovered that it is possible to access and
modify (add files, delete files ...) of the new receive snapshot during
the transfer. After the transfer it becomes readonly but it could
already have been modified.

So you can end up with a source and a destination which are not the
same. Therefore during a subsequent incremental transfers I can get
receive to crash (trying to unlink a file that is not in the parent but
should).

Is this behavior by design or will it be prevented in the future?

I can of course just not modify the subvolume during receive but is
there a way to make sure no user/program modifies it?

I can also get in the same kind of trouble by modifying a parent (after
changing its property temporarily to ro=false). send/receive is
checking that the same parent uuid is available on both sides but not
that generation has not changed. Of course in this case it requires
direct user intervention. Never changing the ro property of subvolumes
would prevent the problem. 

Again is this by design?
Otherwise I would suggest finding a way to avoid those conditions
(using the generation maybe?). There could be an override option to
allow more flexibility if needed.

Thanks
Christian	

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-01-31 23:32 btrfs receive leaves new subvolume modifiable during operation Christian Lupien
@ 2017-02-01  5:09 ` Duncan
  2017-02-01 12:28   ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 20+ messages in thread
From: Duncan @ 2017-02-01  5:09 UTC (permalink / raw)
  To: linux-btrfs

Christian Lupien posted on Tue, 31 Jan 2017 18:32:58 -0500 as excerpted:

> I have been testing btrfs send/receive. I like it.
> 
> During those tests I discovered that it is possible to access and modify
> (add files, delete files ...) of the new receive snapshot during the
> transfer. After the transfer it becomes readonly but it could already
> have been modified.
> 
> So you can end up with a source and a destination which are not the
> same. Therefore during a subsequent incremental transfers I can get
> receive to crash (trying to unlink a file that is not in the parent but
> should).
> 
> Is this behavior by design or will it be prevented in the future?
> 
> I can of course just not modify the subvolume during receive but is
> there a way to make sure no user/program modifies it?

I'm just a btrfs-using list regular not a dev, but AFAIK, the behavior is 
likely to be by design and difficult to change, because the send stream 
is simply a stream of userspace-context commands for receive to act upon, 
and any other suitably privileged userspace program could run the same 
commands.  (If your btrfs-progs is new enough receive even has a dump 
option, that prints the metadata operations in human readable form, one 
operation per line.)

So making the receive snapshot read-only during the transfer would 
prevent receive itself working.

> I can also get in the same kind of trouble by modifying a parent (after
> changing its property temporarily to ro=false). send/receive is checking
> that the same parent uuid is available on both sides but not that
> generation has not changed. Of course in this case it requires direct
> user intervention. Never changing the ro property of subvolumes would
> prevent the problem.
> 
> Again is this by design?

Again, yes.  The ability to toggle snapshots between ro/rw is a useful 
feature and was added deliberately.  This one would seem to me to be much 
like the (no doubt apocryphal) guy who went to the doctor complaining 
that when he beat his head against the wall, it hurt.  The doctor said, 
"Stop doing that then."

> Otherwise I would suggest finding a way to avoid those conditions (using
> the generation maybe?). There could be an override option to allow more
> flexibility if needed.

There's a send-stream format version bump planned, that should fix 
various issues and eliminate various limitations.  However, in ordered to 
minimize the number of format versions that must continue to be supported 
into the future, they don't plan to do that bump until they're relatively 
sure their list of changes to make is complete.  They don't want to do 
the bump and then a kernel series or two later discover they need yet 
another tweak.

Remember, btrfs' status remains "stabilizing, but not yet fully stable 
and mature."  A lot of stuff hasn't been optimized either, because 
they're focused on eliminating the bugs and adding missing features 
still, not optimizing, at this point, and they don't want to spend a 
bunch of time optimizing something, only to have to rewrite or even just 
tweak it, perhaps to support say N-way-mirroring, and have to redo the 
optimization as a result.

This sort of additional sync guarantees may be in the final generally 
considered stabilized product, but that's yet some time (years) away.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-01  5:09 ` Duncan
@ 2017-02-01 12:28   ` Austin S. Hemmelgarn
  2017-02-01 17:43     ` Graham Cobb
  0 siblings, 1 reply; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-01 12:28 UTC (permalink / raw)
  To: linux-btrfs

On 2017-02-01 00:09, Duncan wrote:
> Christian Lupien posted on Tue, 31 Jan 2017 18:32:58 -0500 as excerpted:
>
>> I have been testing btrfs send/receive. I like it.
>>
>> During those tests I discovered that it is possible to access and modify
>> (add files, delete files ...) of the new receive snapshot during the
>> transfer. After the transfer it becomes readonly but it could already
>> have been modified.
>>
>> So you can end up with a source and a destination which are not the
>> same. Therefore during a subsequent incremental transfers I can get
>> receive to crash (trying to unlink a file that is not in the parent but
>> should).
>>
>> Is this behavior by design or will it be prevented in the future?
>>
>> I can of course just not modify the subvolume during receive but is
>> there a way to make sure no user/program modifies it?
>
> I'm just a btrfs-using list regular not a dev, but AFAIK, the behavior is
> likely to be by design and difficult to change, because the send stream
> is simply a stream of userspace-context commands for receive to act upon,
> and any other suitably privileged userspace program could run the same
> commands.  (If your btrfs-progs is new enough receive even has a dump
> option, that prints the metadata operations in human readable form, one
> operation per line.)
>
> So making the receive snapshot read-only during the transfer would
> prevent receive itself working.
That's correct.  Fixing this completely would require implementing 
receive on the kernel side, which is not a practical option for multiple 
reasons.

That said, some improvements could be made, such as making everything 
0700 and owned by the user running receive initially, then updating the 
metadata at the end of the operation.

There are also a handful of other security concerns with send/receive 
(the most notable being that receive doesn't validate the send stream 
against the receiving system as well as it should), so I would generally 
recommend not using send/receive outside of a tightly controlled 
environment.
>
>> I can also get in the same kind of trouble by modifying a parent (after
>> changing its property temporarily to ro=false). send/receive is checking
>> that the same parent uuid is available on both sides but not that
>> generation has not changed. Of course in this case it requires direct
>> user intervention. Never changing the ro property of subvolumes would
>> prevent the problem.
>>
>> Again is this by design?
>
> Again, yes.  The ability to toggle snapshots between ro/rw is a useful
> feature and was added deliberately.  This one would seem to me to be much
> like the (no doubt apocryphal) guy who went to the doctor complaining
> that when he beat his head against the wall, it hurt.  The doctor said,
> "Stop doing that then."
Agreed, especially considering that some of the most interesting 
use-cases for send/receive (which requires the sent subvolume to be 
read-only) require the subvolume to be made writable again on the other end.
>
>> Otherwise I would suggest finding a way to avoid those conditions (using
>> the generation maybe?). There could be an override option to allow more
>> flexibility if needed.
>
> There's a send-stream format version bump planned, that should fix
> various issues and eliminate various limitations.  However, in ordered to
> minimize the number of format versions that must continue to be supported
> into the future, they don't plan to do that bump until they're relatively
> sure their list of changes to make is complete.  They don't want to do
> the bump and then a kernel series or two later discover they need yet
> another tweak.
>
> Remember, btrfs' status remains "stabilizing, but not yet fully stable
> and mature."  A lot of stuff hasn't been optimized either, because
> they're focused on eliminating the bugs and adding missing features
> still, not optimizing, at this point, and they don't want to spend a
> bunch of time optimizing something, only to have to rewrite or even just
> tweak it, perhaps to support say N-way-mirroring, and have to redo the
> optimization as a result.
>
> This sort of additional sync guarantees may be in the final generally
> considered stabilized product, but that's yet some time (years) away.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-01 12:28   ` Austin S. Hemmelgarn
@ 2017-02-01 17:43     ` Graham Cobb
  2017-02-01 22:27       ` Duncan
  2017-02-05 11:54       ` Kai Krakow
  0 siblings, 2 replies; 20+ messages in thread
From: Graham Cobb @ 2017-02-01 17:43 UTC (permalink / raw)
  To: linux-btrfs

On 01/02/17 12:28, Austin S. Hemmelgarn wrote:
> On 2017-02-01 00:09, Duncan wrote:
>> Christian Lupien posted on Tue, 31 Jan 2017 18:32:58 -0500 as excerpted:
>>
>>> I have been testing btrfs send/receive. I like it.
>>>
>>> During those tests I discovered that it is possible to access and modify
>>> (add files, delete files ...) of the new receive snapshot during the
>>> transfer. After the transfer it becomes readonly but it could already
>>> have been modified.
>>>
>>> So you can end up with a source and a destination which are not the
>>> same. Therefore during a subsequent incremental transfers I can get
>>> receive to crash (trying to unlink a file that is not in the parent but
>>> should).
>>>
>>> Is this behavior by design or will it be prevented in the future?
>>>
>>> I can of course just not modify the subvolume during receive but is
>>> there a way to make sure no user/program modifies it?
>>
>> I'm just a btrfs-using list regular not a dev, but AFAIK, the behavior is
>> likely to be by design and difficult to change, because the send stream
>> is simply a stream of userspace-context commands for receive to act upon,
>> and any other suitably privileged userspace program could run the same
>> commands.  (If your btrfs-progs is new enough receive even has a dump
>> option, that prints the metadata operations in human readable form, one
>> operation per line.)
>>
>> So making the receive snapshot read-only during the transfer would
>> prevent receive itself working.
> That's correct.  Fixing this completely would require implementing
> receive on the kernel side, which is not a practical option for multiple
> reasons.

I am with Christian on this. Both the effects he discovered go against
my expectation of how send/receive would or should work.

This first bug is more serious because it appears to allow a
non-privileged user to disrupt the correct operation of receive,
creating a form of denial-of-service of a send/receive based backup
process. If I decided that I didn't want my pron collection (or my
incriminating emails) appearing in the backups I could just make sure
that I removed them from the receive snapshots while they were still
writeable.

You may be right that fixing this would require receive in the kernel,
and that is undesirable, although it seems to me that it should be
possible to do something like allow receive to create the snapshot with
a special flag that would cause the kernel to treat it as read-only to
any requests not delivered through the same file descriptor, or
something like that (or, if that can't be done, at least require root
access to make any changes). In any case, I believe it should be treated
as a bug, even if low priority, with an explicit warning about the
possible corruption of receive-based backups in the btrfs-receive man page.

>>> I can also get in the same kind of trouble by modifying a parent (after
>>> changing its property temporarily to ro=false). send/receive is checking
>>> that the same parent uuid is available on both sides but not that
>>> generation has not changed. Of course in this case it requires direct
>>> user intervention. Never changing the ro property of subvolumes would
>>> prevent the problem.
>>>
>>> Again is this by design?
>>
>> Again, yes.  The ability to toggle snapshots between ro/rw is a useful
>> feature and was added deliberately.  This one would seem to me to be much
>> like the (no doubt apocryphal) guy who went to the doctor complaining
>> that when he beat his head against the wall, it hurt.  The doctor said,
>> "Stop doing that then."
> Agreed, especially considering that some of the most interesting
> use-cases for send/receive (which requires the sent subvolume to be
> read-only) require the subvolume to be made writable again on the other
> end.

I agree that there are good reasons why subvolumes should be switchable
between ro and rw. However, receive should detect and issue warnings
when this problem has happened (for example by checking the generation).
Again, this may be low priority, and may need to wait for a send stream
format change, but it can't be claimed that this is correct behaviour.

Graham

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-01 17:43     ` Graham Cobb
@ 2017-02-01 22:27       ` Duncan
  2017-02-01 22:51         ` Graham Cobb
  2017-02-05 11:54       ` Kai Krakow
  1 sibling, 1 reply; 20+ messages in thread
From: Duncan @ 2017-02-01 22:27 UTC (permalink / raw)
  To: linux-btrfs

Graham Cobb posted on Wed, 01 Feb 2017 17:43:32 +0000 as excerpted:

> This first bug is more serious because it appears to allow a
> non-privileged user to disrupt the correct operation of receive,
> creating a form of denial-of-service of a send/receive based backup
> process. If I decided that I didn't want my pron collection (or my
> incriminating emails) appearing in the backups I could just make sure
> that I removed them from the receive snapshots while they were still
> writeable.

I'll prefix this question by noting that my own use-case doesn't use send/
receive, so while I know about it in general from following the list, 
I've no personal experience with it...

With that said, couldn't the entire problem be eliminated by properly 
setting the permissions on a directory/subvol upstream of the received 
snapshot?  If said upstream parent is only readable/enterable by root (or 
some specific user), then one would have to be root or that user in 
ordered to interfere, as nobody else could even get to the receiving 
snapshot to commit mayhem.

IOW, it should work like directory permissions have always worked.  If 
you don't have enter access to the parent, you can't read/write the 
child, thus no need for btrfs-receive specific permission-hoop-jumping.  
(And of course SELinux or similar could be used to tighten permissions 
even further, should that be justified by the use-case.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-01 22:27       ` Duncan
@ 2017-02-01 22:51         ` Graham Cobb
  2017-02-02  0:02           ` Duncan
  0 siblings, 1 reply; 20+ messages in thread
From: Graham Cobb @ 2017-02-01 22:51 UTC (permalink / raw)
  To: linux-btrfs

On 01/02/17 22:27, Duncan wrote:
> Graham Cobb posted on Wed, 01 Feb 2017 17:43:32 +0000 as excerpted:
> 
>> This first bug is more serious because it appears to allow a
>> non-privileged user to disrupt the correct operation of receive,
>> creating a form of denial-of-service of a send/receive based backup
>> process. If I decided that I didn't want my pron collection (or my
>> incriminating emails) appearing in the backups I could just make sure
>> that I removed them from the receive snapshots while they were still
>> writeable.
> 
> I'll prefix this question by noting that my own use-case doesn't use send/
> receive, so while I know about it in general from following the list, 
> I've no personal experience with it...
> 
> With that said, couldn't the entire problem be eliminated by properly 
> setting the permissions on a directory/subvol upstream of the received 
> snapshot?  

I (honestly) don't know. But even if that does work, it is clearly only
a workround for the bug. Where in the documentation does it warn the
system manager about the problem? Where does it tell them that they had
better make sure they only receive into a directory tree which does not
allow users read or execute access (not just not write access!)? What if
part of the point of the backup strategy is that user's have read access
to these snapshots so they can restore their own files?

The possibility of a knowledgeable system manager being able to
workround the problem by limiting how they use it doesn't stop it being
a bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-01 22:51         ` Graham Cobb
@ 2017-02-02  0:02           ` Duncan
  2017-02-02 10:52             ` Graham Cobb
  0 siblings, 1 reply; 20+ messages in thread
From: Duncan @ 2017-02-02  0:02 UTC (permalink / raw)
  To: linux-btrfs

Graham Cobb posted on Wed, 01 Feb 2017 22:51:34 +0000 as excerpted:

>> [C]ouldn't the entire problem be eliminated by properly
>> setting the permissions on a directory/subvol upstream of the received
>> snapshot?
> 
> I (honestly) don't know. But even if that does work, it is clearly only
> a workround for the bug. Where in the documentation does it warn the
> system manager about the problem? Where does it tell them that they had
> better make sure they only receive into a directory tree which does not
> allow users read or execute access (not just not write access!)? What if
> part of the point of the backup strategy is that user's have read access
> to these snapshots so they can restore their own files?
> 
> The possibility of a knowledgeable system manager being able to
> workround the problem by limiting how they use it doesn't stop it being
> a bug.

If it's a workaround, then many of the Linux procedures we as admins and 
users use every day are equally workarounds.  Setting 007 perms on a dir 
that doesn't have anything immediately security vulnerable in it, simply 
to keep other users from even potentially seeing or being able to write 
to something N layers down the subdir tree, is standard practice.

Which is my point.  This is no different than standard security practice, 
that an admin should be familiar with and using without even having to 
think about it.  Btrfs is simply making the same assumptions that 
everyone else does, that an admin knows what they are doing and sets the 
upstream permissions with that in mind.  If they don't, how is that 
btrfs' fault?

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-02  0:02           ` Duncan
@ 2017-02-02 10:52             ` Graham Cobb
  2017-02-02 12:49               ` Austin S. Hemmelgarn
                                 ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Graham Cobb @ 2017-02-02 10:52 UTC (permalink / raw)
  To: linux-btrfs

On 02/02/17 00:02, Duncan wrote:
> If it's a workaround, then many of the Linux procedures we as admins and 
> users use every day are equally workarounds.  Setting 007 perms on a dir 
> that doesn't have anything immediately security vulnerable in it, simply 
> to keep other users from even potentially seeing or being able to write 
> to something N layers down the subdir tree, is standard practice.

No. There is no need to normally place a read-only snapshot below a
no-execute directory just to prevent write access to it. That is not
part of the admin's expectation.

> Which is my point.  This is no different than standard security practice, 
> that an admin should be familiar with and using without even having to 
> think about it.  Btrfs is simply making the same assumptions that 
> everyone else does, that an admin knows what they are doing and sets the 
> upstream permissions with that in mind.  If they don't, how is that 
> btrfs' fault?

Because btrfs intends the receive snapshot to be read-only. That is the
expectation of the sysadmin. It is an important and useful feature which
makes send/receive very useful for creating
user-readable-but-not-modifiable backups (without it, send/receive are
useful for many things but less useful for creating backups). That
feature has a bug.

Just because you don't personally use the feature, doesn't mean it isn't
a bug! Many of us do rely on that feature.

Even though it is security-related, I agree it isn't the highest
priority btrfs bug. It can probably wait until receive is being worked
on for other reasons. But if it isn't going to be fixed any time soon,
it should be documented in the Wiki and the man page, with the suggested
workround for anyone who needs to make sure the receive won't be
tampered with.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-02 10:52             ` Graham Cobb
@ 2017-02-02 12:49               ` Austin S. Hemmelgarn
  2017-02-03  9:14               ` Duncan
  2017-02-05 12:08               ` Kai Krakow
  2 siblings, 0 replies; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-02 12:49 UTC (permalink / raw)
  To: Graham Cobb, linux-btrfs

On 2017-02-02 05:52, Graham Cobb wrote:
> On 02/02/17 00:02, Duncan wrote:
>> If it's a workaround, then many of the Linux procedures we as admins and
>> users use every day are equally workarounds.  Setting 007 perms on a dir
>> that doesn't have anything immediately security vulnerable in it, simply
>> to keep other users from even potentially seeing or being able to write
>> to something N layers down the subdir tree, is standard practice.
>
> No. There is no need to normally place a read-only snapshot below a
> no-execute directory just to prevent write access to it. That is not
> part of the admin's expectation.
That depends on the admin.  I for example would absolutely expect that 
to be needed _while the snapshot is being created_ if the operation 
isn't being done by the kernel.
>
>> Which is my point.  This is no different than standard security practice,
>> that an admin should be familiar with and using without even having to
>> think about it.  Btrfs is simply making the same assumptions that
>> everyone else does, that an admin knows what they are doing and sets the
>> upstream permissions with that in mind.  If they don't, how is that
>> btrfs' fault?
>
> Because btrfs intends the receive snapshot to be read-only. That is the
> expectation of the sysadmin. It is an important and useful feature which
> makes send/receive very useful for creating
> user-readable-but-not-modifiable backups (without it, send/receive are
> useful for many things but less useful for creating backups). That
> feature has a bug.
If that's your use case, then from a consistency perspective, you should 
be receiving into a location the user can't read and then moving the 
subvolume into a user readable location once receive is done anyway. 
Otherwise, they may see a partial snapshot, and think something has been 
deleted that really hasn't.  This is a pretty basic method of ensuring 
consistency that's used literally all over the place on UNIX systems to 
atomically replace or update data sets.  Doing so also cleanly avoids 
needing to manipulate permissions on the directory the snapshots are in 
to prevent users from modifying the snapshots being received.
>
> Just because you don't personally use the feature, doesn't mean it isn't
> a bug! Many of us do rely on that feature.
>
> Even though it is security-related, I agree it isn't the highest
> priority btrfs bug. It can probably wait until receive is being worked
> on for other reasons. But if it isn't going to be fixed any time soon,
> it should be documented in the Wiki and the man page, with the suggested
> workround for anyone who needs to make sure the receive won't be
> tampered with.
I agree that it should be documented.  I don't agree that a specific 
workaround needs to be stated, as whether or not any particular 
workaround is needed is entirely dependent on the use case.  For 
example, if your users can only access the received snapshots over a 
networked filesystem, there's a much simpler option of just exporting 
the directories the snapshots are received into as read-only.  All that 
needs to be stated is that the way to prevent this is to make sure users 
can't write to the snapshots while they're being received.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-02 10:52             ` Graham Cobb
  2017-02-02 12:49               ` Austin S. Hemmelgarn
@ 2017-02-03  9:14               ` Duncan
  2017-02-03 12:44                 ` Austin S. Hemmelgarn
  2017-02-05 12:08               ` Kai Krakow
  2 siblings, 1 reply; 20+ messages in thread
From: Duncan @ 2017-02-03  9:14 UTC (permalink / raw)
  To: linux-btrfs

Graham Cobb posted on Thu, 02 Feb 2017 10:52:26 +0000 as excerpted:

> On 02/02/17 00:02, Duncan wrote:
>> If it's a workaround, then many of the Linux procedures we as admins
>> and users use every day are equally workarounds.  Setting 007 perms on
>> a dir that doesn't have anything immediately security vulnerable in it,
>> simply to keep other users from even potentially seeing or being able
>> to write to something N layers down the subdir tree, is standard
>> practice.
> 
> No. There is no need to normally place a read-only snapshot below a
> no-execute directory just to prevent write access to it. That is not
> part of the admin's expectation.
> 
>> Which is my point.  This is no different than standard security
>> practice,
>> that an admin should be familiar with and using without even having to
>> think about it.  Btrfs is simply making the same assumptions that
>> everyone else does, that an admin knows what they are doing and sets
>> the upstream permissions with that in mind.  If they don't, how is that
>> btrfs' fault?
> 
> Because btrfs intends the receive snapshot to be read-only. That is the
> expectation of the sysadmin.

Read-only *after* completion, yes.  But a sysadmin that believes really 
setting something read-only and then trying to write to it from 
userspace, which is what btrfs does until the receive is done, should 
work, doesn't understand the meaning of read-only.

Meanwhile, Austin said most of what I'd say, probably better than I'd say 
it, so I won't repeat it here, but I agree with him.

> Even though it is security-related, I agree it isn't the highest
> priority btrfs bug. It can probably wait until receive is being worked
> on for other reasons. But if it isn't going to be fixed any time soon,
> it should be documented in the Wiki and the man page, with the suggested
> workround for anyone who needs to make sure the receive won't be
> tampered with.

One thing I was going to say in the previous post and forgot, is that not 
withstanding all the technical reasons, I do agree that documenting it in 
the manpage, etc, would be a good idea.  In that I agree with both you 
and Austin. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-03  9:14               ` Duncan
@ 2017-02-03 12:44                 ` Austin S. Hemmelgarn
  2017-02-03 15:44                   ` Graham Cobb
  0 siblings, 1 reply; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-03 12:44 UTC (permalink / raw)
  To: linux-btrfs

On 2017-02-03 04:14, Duncan wrote:
> Graham Cobb posted on Thu, 02 Feb 2017 10:52:26 +0000 as excerpted:
>
>> On 02/02/17 00:02, Duncan wrote:
>>> If it's a workaround, then many of the Linux procedures we as admins
>>> and users use every day are equally workarounds.  Setting 007 perms on
>>> a dir that doesn't have anything immediately security vulnerable in it,
>>> simply to keep other users from even potentially seeing or being able
>>> to write to something N layers down the subdir tree, is standard
>>> practice.
>>
>> No. There is no need to normally place a read-only snapshot below a
>> no-execute directory just to prevent write access to it. That is not
>> part of the admin's expectation.
>>
>>> Which is my point.  This is no different than standard security
>>> practice,
>>> that an admin should be familiar with and using without even having to
>>> think about it.  Btrfs is simply making the same assumptions that
>>> everyone else does, that an admin knows what they are doing and sets
>>> the upstream permissions with that in mind.  If they don't, how is that
>>> btrfs' fault?
>>
>> Because btrfs intends the receive snapshot to be read-only. That is the
>> expectation of the sysadmin.
>
> Read-only *after* completion, yes.  But a sysadmin that believes really
> setting something read-only and then trying to write to it from
> userspace, which is what btrfs does until the receive is done, should
> work, doesn't understand the meaning of read-only.
>
> Meanwhile, Austin said most of what I'd say, probably better than I'd say
> it, so I won't repeat it here, but I agree with him.
>
>> Even though it is security-related, I agree it isn't the highest
>> priority btrfs bug. It can probably wait until receive is being worked
>> on for other reasons. But if it isn't going to be fixed any time soon,
>> it should be documented in the Wiki and the man page, with the suggested
>> workround for anyone who needs to make sure the receive won't be
>> tampered with.
>
> One thing I was going to say in the previous post and forgot, is that not
> withstanding all the technical reasons, I do agree that documenting it in
> the manpage, etc, would be a good idea.  In that I agree with both you
> and Austin. =:^)
I can look at making a patch for this, but it may be next week before I 
have time (I'm not great at multi-tasking when it comes to software 
development, and I'm in the middle of helping to fix a bug in Ansible 
right now).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-03 12:44                 ` Austin S. Hemmelgarn
@ 2017-02-03 15:44                   ` Graham Cobb
  2017-02-03 16:01                     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 20+ messages in thread
From: Graham Cobb @ 2017-02-03 15:44 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 569 bytes --]

On 03/02/17 12:44, Austin S. Hemmelgarn wrote:
> I can look at making a patch for this, but it may be next week before I
> have time (I'm not great at multi-tasking when it comes to software
> development, and I'm in the middle of helping to fix a bug in Ansible
> right now).

That would be great, Austin! It is about 15 years since I last submitted
a patch under kernel development patch rules and things have changed a
fair bit in that time. So if you are set up to do it that sounds good.

As a starting point, I have created a suggested text (patch attached).





[-- Attachment #2: btrfs-receive-rw-window --]
[-- Type: text/plain, Size: 1299 bytes --]

diff --git a/Documentation/btrfs-receive.asciidoc b/Documentation/btrfs-receive.asciidoc
index 6be4aa6..db525d9 100644
--- a/Documentation/btrfs-receive.asciidoc
+++ b/Documentation/btrfs-receive.asciidoc
@@ -31,7 +31,7 @@ the stream, and print the stream metadata, one operation per line.
 
 3. default subvolume has changed or you didn't mount the filesystem at the toplevel subvolume
 
-A subvolume is made read-only after the receiving process finishes succesfully.
+A subvolume is made read-only after the receiving process finishes succesfully (see BUGS below).
 
 `Options`
 
@@ -73,6 +73,16 @@ EXIT STATUS
 *btrfs receive* returns a zero exit status if it succeeds. Non zero is
 returned in case of failure.
 
+BUGS
+----
+*btrfs receive* sets the subvolume read-only after it completes successfully.
+However, while the receive is in progress, users who have write access to files
+or directories in the receiving 'path' can add, remove or modify files, in which
+case the resulting read-only subvolume will not be a copy of the sending subvolume.
+
+If the intention is to create an exact copy, the receiving 'path' should be protected
+from access by users until the receive has completed and the subvolume set to read-only.
+
 AVAILABILITY
 ------------
 *btrfs* is part of btrfs-progs.

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-03 15:44                   ` Graham Cobb
@ 2017-02-03 16:01                     ` Austin S. Hemmelgarn
  2017-02-03 19:17                       ` Graham Cobb
  0 siblings, 1 reply; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-03 16:01 UTC (permalink / raw)
  To: Graham Cobb, linux-btrfs

On 2017-02-03 10:44, Graham Cobb wrote:
> On 03/02/17 12:44, Austin S. Hemmelgarn wrote:
>> I can look at making a patch for this, but it may be next week before I
>> have time (I'm not great at multi-tasking when it comes to software
>> development, and I'm in the middle of helping to fix a bug in Ansible
>> right now).
>
> That would be great, Austin! It is about 15 years since I last submitted
> a patch under kernel development patch rules and things have changed a
> fair bit in that time. So if you are set up to do it that sounds good.
>
> As a starting point, I have created a suggested text (patch attached).
Ironically, I ended up having time sooner than I thought.  The message 
doesn't appear to be in any of the archives yet, but the message ID is:
<20170203134858.75210-1-ahferroin7@gmail.com>

I actually like how you explained things a bit better though, so if you 
are OK with it I'll update the patch I sent using your description (and 
credit you in the commit message too of course).


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-03 16:01                     ` Austin S. Hemmelgarn
@ 2017-02-03 19:17                       ` Graham Cobb
  2017-02-03 19:37                         ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 20+ messages in thread
From: Graham Cobb @ 2017-02-03 19:17 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, linux-btrfs

On 03/02/17 16:01, Austin S. Hemmelgarn wrote:
> Ironically, I ended up having time sooner than I thought.  The message
> doesn't appear to be in any of the archives yet, but the message ID is:
> <20170203134858.75210-1-ahferroin7@gmail.com>

Ah. I didn't notice it until after I had sent my message.

> I actually like how you explained things a bit better though, so if you
> are OK with it I'll update the patch I sent using your description (and
> credit you in the commit message too of course).

You are welcome to use any of my phrasing or approach, of course!



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-03 19:17                       ` Graham Cobb
@ 2017-02-03 19:37                         ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-03 19:37 UTC (permalink / raw)
  To: Graham Cobb, linux-btrfs

On 2017-02-03 14:17, Graham Cobb wrote:
> On 03/02/17 16:01, Austin S. Hemmelgarn wrote:
>> Ironically, I ended up having time sooner than I thought.  The message
>> doesn't appear to be in any of the archives yet, but the message ID is:
>> <20170203134858.75210-1-ahferroin7@gmail.com>
>
> Ah. I didn't notice it until after I had sent my message.
No worries.
>
>> I actually like how you explained things a bit better though, so if you
>> are OK with it I'll update the patch I sent using your description (and
>> credit you in the commit message too of course).
>
> You are welcome to use any of my phrasing or approach, of course!
>
I'll send out the new version shortly then.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-01 17:43     ` Graham Cobb
  2017-02-01 22:27       ` Duncan
@ 2017-02-05 11:54       ` Kai Krakow
  2017-02-06 12:30         ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 20+ messages in thread
From: Kai Krakow @ 2017-02-05 11:54 UTC (permalink / raw)
  To: linux-btrfs

Am Wed, 1 Feb 2017 17:43:32 +0000
schrieb Graham Cobb <g.btrfs@cobb.uk.net>:

> On 01/02/17 12:28, Austin S. Hemmelgarn wrote:
> > On 2017-02-01 00:09, Duncan wrote:  
> >> Christian Lupien posted on Tue, 31 Jan 2017 18:32:58 -0500 as
> >> excerpted: 
>  [...]  
> >>
> >> I'm just a btrfs-using list regular not a dev, but AFAIK, the
> >> behavior is likely to be by design and difficult to change,
> >> because the send stream is simply a stream of userspace-context
> >> commands for receive to act upon, and any other suitably
> >> privileged userspace program could run the same commands.  (If
> >> your btrfs-progs is new enough receive even has a dump option,
> >> that prints the metadata operations in human readable form, one
> >> operation per line.)
> >>
> >> So making the receive snapshot read-only during the transfer would
> >> prevent receive itself working.  
> > That's correct.  Fixing this completely would require implementing
> > receive on the kernel side, which is not a practical option for
> > multiple reasons.  
> 
> I am with Christian on this. Both the effects he discovered go against
> my expectation of how send/receive would or should work.
> 
> This first bug is more serious because it appears to allow a
> non-privileged user to disrupt the correct operation of receive,
> creating a form of denial-of-service of a send/receive based backup
> process. If I decided that I didn't want my pron collection (or my
> incriminating emails) appearing in the backups I could just make sure
> that I removed them from the receive snapshots while they were still
> writeable.
> 
> You may be right that fixing this would require receive in the kernel,
> and that is undesirable, although it seems to me that it should be
> possible to do something like allow receive to create the snapshot
> with a special flag that would cause the kernel to treat it as
> read-only to any requests not delivered through the same file
> descriptor, or something like that (or, if that can't be done, at
> least require root access to make any changes). In any case, I
> believe it should be treated as a bug, even if low priority, with an
> explicit warning about the possible corruption of receive-based
> backups in the btrfs-receive man page.

How about mounting the receiver below a directory only traversable by
root (chmod og-rwx)? Backups shouldn't be directly accessible by
ordinary users anyway.

If you want a backup becoming accessible, you can later snapshot it to
an accessible location after send/receive finished without errors.

An in-transit backup, however, should always be protected from possible
disruptive access. This is an issue with any backup software. So place
the receive within an inaccessible directory. This is not something the
backup process should directly bother with for simplicity.


-- 
Regards,
Kai

Replies to list-only preferred.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-02 10:52             ` Graham Cobb
  2017-02-02 12:49               ` Austin S. Hemmelgarn
  2017-02-03  9:14               ` Duncan
@ 2017-02-05 12:08               ` Kai Krakow
  2017-02-06 22:56                 ` Graham Cobb
  2 siblings, 1 reply; 20+ messages in thread
From: Kai Krakow @ 2017-02-05 12:08 UTC (permalink / raw)
  To: linux-btrfs

Am Thu, 2 Feb 2017 10:52:26 +0000
schrieb Graham Cobb <g.btrfs@cobb.uk.net>:

> On 02/02/17 00:02, Duncan wrote:
> > If it's a workaround, then many of the Linux procedures we as
> > admins and users use every day are equally workarounds.  Setting
> > 007 perms on a dir that doesn't have anything immediately security
> > vulnerable in it, simply to keep other users from even potentially
> > seeing or being able to write to something N layers down the subdir
> > tree, is standard practice.  
> 
> No. There is no need to normally place a read-only snapshot below a
> no-execute directory just to prevent write access to it. That is not
> part of the admin's expectation.

Wrong. If you tend to not be in control of the permissions below a
mountpoint, you prevent access to it by restricting permissions on a
parent directory of the mountpoint. It's that easy and it always has
been. That is standard practice. While your backup is running, you have
no control of it - thus use this standard practice!

If you want to make selective or general access to the backups,
snapshot them to a different location later. You should really learn
how to work with scratch locations for a backup. The receiver is such a
scratch location and should be handled like it.

Scratch directories for backup locations are not a bug and not a
workaround. It's just how you should handle it and how it works. By
definition, the target directory cannot be read-only while the backup
is running - so by definition you need other means to protect it.

That in turn defines all your current locations for snapshot as scratch
locations. Put your final snapshots somewhere else if you want your
users access these after successful finishing a backup. You never want
people to access in-transit or partial backups. In-transit and partial
always goes to the scratch directory. Only then, after success of the
backup, a backup should be made available in a final directory.

Maybe your design of your backup is wrong, and not how btrfs handles
that? Using scratch locations is a design you should really really
always use for backups. Every sophisticated admin would agree that this
should be best practice.


-- 
Regards,
Kai

Replies to list-only preferred.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-05 11:54       ` Kai Krakow
@ 2017-02-06 12:30         ` Austin S. Hemmelgarn
  2017-02-06 21:40           ` Kai Krakow
  0 siblings, 1 reply; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-06 12:30 UTC (permalink / raw)
  To: linux-btrfs

On 2017-02-05 06:54, Kai Krakow wrote:
> Am Wed, 1 Feb 2017 17:43:32 +0000
> schrieb Graham Cobb <g.btrfs@cobb.uk.net>:
>
>> On 01/02/17 12:28, Austin S. Hemmelgarn wrote:
>>> On 2017-02-01 00:09, Duncan wrote:
>>>> Christian Lupien posted on Tue, 31 Jan 2017 18:32:58 -0500 as
>>>> excerpted:
>>  [...]
>>>>
>>>> I'm just a btrfs-using list regular not a dev, but AFAIK, the
>>>> behavior is likely to be by design and difficult to change,
>>>> because the send stream is simply a stream of userspace-context
>>>> commands for receive to act upon, and any other suitably
>>>> privileged userspace program could run the same commands.  (If
>>>> your btrfs-progs is new enough receive even has a dump option,
>>>> that prints the metadata operations in human readable form, one
>>>> operation per line.)
>>>>
>>>> So making the receive snapshot read-only during the transfer would
>>>> prevent receive itself working.
>>> That's correct.  Fixing this completely would require implementing
>>> receive on the kernel side, which is not a practical option for
>>> multiple reasons.
>>
>> I am with Christian on this. Both the effects he discovered go against
>> my expectation of how send/receive would or should work.
>>
>> This first bug is more serious because it appears to allow a
>> non-privileged user to disrupt the correct operation of receive,
>> creating a form of denial-of-service of a send/receive based backup
>> process. If I decided that I didn't want my pron collection (or my
>> incriminating emails) appearing in the backups I could just make sure
>> that I removed them from the receive snapshots while they were still
>> writeable.
>>
>> You may be right that fixing this would require receive in the kernel,
>> and that is undesirable, although it seems to me that it should be
>> possible to do something like allow receive to create the snapshot
>> with a special flag that would cause the kernel to treat it as
>> read-only to any requests not delivered through the same file
>> descriptor, or something like that (or, if that can't be done, at
>> least require root access to make any changes). In any case, I
>> believe it should be treated as a bug, even if low priority, with an
>> explicit warning about the possible corruption of receive-based
>> backups in the btrfs-receive man page.
>
> How about mounting the receiver below a directory only traversable by
> root (chmod og-rwx)? Backups shouldn't be directly accessible by
> ordinary users anyway.
There are perfectly legitimate reasons to have backups directly 
accessible by users.  If you're running automated backups for _their_ 
systems _they_ absolutely should have at least read access to the 
backups _once they're stable_.  This is not a common case, but it is a 
perfectly legitimate one.  In the same way, if you're storing backups 
for your users (in this case you absolutely should not be4 using 
send/receive for other reasons), then the use case dictates that they 
have some way to access them.
>
> If you want a backup becoming accessible, you can later snapshot it to
> an accessible location after send/receive finished without errors.
>
> An in-transit backup, however, should always be protected from possible
> disruptive access. This is an issue with any backup software. So place
> the receive within an inaccessible directory. This is not something the
> backup process should directly bother with for simplicity.
I agree on this point however.  Doing a backup directly into the final 
persistent storage location is never a good idea.  It makes error 
handling more complicated, it makes handling of multi-tier storage more 
complicated (and usually less efficient), and it makes security more 
difficult.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-06 12:30         ` Austin S. Hemmelgarn
@ 2017-02-06 21:40           ` Kai Krakow
  0 siblings, 0 replies; 20+ messages in thread
From: Kai Krakow @ 2017-02-06 21:40 UTC (permalink / raw)
  To: linux-btrfs

Am Mon, 6 Feb 2017 07:30:31 -0500
schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>:

> > How about mounting the receiver below a directory only traversable
> > by root (chmod og-rwx)? Backups shouldn't be directly accessible by
> > ordinary users anyway.  
> There are perfectly legitimate reasons to have backups directly 
> accessible by users.  If you're running automated backups for _their_ 
> systems _they_ absolutely should have at least read access to the 
> backups _once they're stable_.

This is not what I tried to explain. The OPs question mixes the
creation process with later access. The creation process, however,
should always be isolated. See below, you're even agreeing. ;-)

> This is not a common case, but it is
> a perfectly legitimate one.  In the same way, if you're storing
> backups for your users (in this case you absolutely should not be4
> using send/receive for other reasons), then the use case dictates
> that they have some way to access them.

I don't deny that. But the OP should understand to properly isolate
both operations from each other. This is best practice, this is how it
should be done.

> > If you want a backup becoming accessible, you can later snapshot it
> > to an accessible location after send/receive finished without
> > errors.
> >
> > An in-transit backup, however, should always be protected from
> > possible disruptive access. This is an issue with any backup
> > software. So place the receive within an inaccessible directory.
> > This is not something the backup process should directly bother
> > with for simplicity.  

> I agree on this point however.  Doing a backup directly into the
> final persistent storage location is never a good idea.  It makes
> error handling more complicated, it makes handling of multi-tier
> storage more complicated (and usually less efficient), and it makes
> security more difficult.

Agreed. It complicates a lot of things. In conclusion: If done right,
the original request isn't a problem, neither is anything wrong by
design. It's a question of isolation of operations.

This is simply one of the most basic principles of a safe and secure
backup.

Personally, if I use rsync for backups, I always rsync to a scratch
location only accessible by the backup process. This scratch area may
even be incomplete, inconsistent or broken in other ways. Only when
rsync successfully finished, the scratch area will be snapshot to its
final destination - which is accessible by its users/owners. This also
has the benefit of the snapshots being self-contained deltas which can
be removed without rewriting or reorganizing partial or complete backup
history. That's a plus-point for backup safety, performance, and
retention policies.

Currently, I'm using borgbackup for my personal backups. It has a
similar approach by using checkpoints for resuming a partial backup.
Only a successful backup process creates the final checkpoint. The
intermediate checkpoints can be thrown away at any time later. It
currently stores a backup history of 95.8 TB (multiple months) on a 3 TB
hard disk. Unfortunately, I don't yet sync this to an offsite location.
My most important data (photos, mental work like programming) is stored
in a different location through other means (Git, cloud sync).

For customers, I prefer to use a local cache where the backup is
stored, then it will be synced offsite using deduplication algorithms
to reduce transfer overhead. A second, different backup software stores
another local copy for fast recovery in case of disaster. There's only
need to sync back from offsite storage in case of total local data
loss. And there's a backup for the backup. If one doesn't work, there's
always the other.

In all cases, the intermediate storage is protected from tampering by
the user, or even completely blocked to be accessed by the user. Only
final and clean snapshots are made available.

Also, error handling and cleanup is easy because errors don't leak or
propagate into the final storage. You simply can clean caches,
intermediate checkpoints, or scratch/staging areas. You can even loose
it for whatever reason (hardware problems, storage errors etc). The
only downside would be that the next backups takes longer to complete.

-- 
Regards,
Kai

Replies to list-only preferred.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs receive leaves new subvolume modifiable during operation
  2017-02-05 12:08               ` Kai Krakow
@ 2017-02-06 22:56                 ` Graham Cobb
  0 siblings, 0 replies; 20+ messages in thread
From: Graham Cobb @ 2017-02-06 22:56 UTC (permalink / raw)
  To: linux-btrfs

On 05/02/17 12:08, Kai Krakow wrote:
> Wrong. If you tend to not be in control of the permissions below a
> mountpoint, you prevent access to it by restricting permissions on a
> parent directory of the mountpoint. It's that easy and it always has
> been. That is standard practice. While your backup is running, you have
> no control of it - thus use this standard practice!

Sorry, you are missing the point. This isn't about backups, it is about
snapshots.

To the sysadmin who is not a developer and does not know how receive is
actually implemented, send/receive appears to work exactly like taking a
readonly snapshot, but between two different disks. That is the mental
model they have of the process.

Taking a snapshot does not require hiding the target: it either works or
it doesn't, and it cannot be interfered with. The sysadmin's natural
expectation is that send/receive works the same way.

You may say, from your position of knowledge about how it is
implemented, that is an unrealistic expectation but it is a natural and
common expectation. I very firmly believe that 80% of ordinary btrfs
sysadmins would be surprised by this behaviour.

But, in any case, we can all agree that this unexpected behaviour needs
to be documented.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-02-06 22:56 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-31 23:32 btrfs receive leaves new subvolume modifiable during operation Christian Lupien
2017-02-01  5:09 ` Duncan
2017-02-01 12:28   ` Austin S. Hemmelgarn
2017-02-01 17:43     ` Graham Cobb
2017-02-01 22:27       ` Duncan
2017-02-01 22:51         ` Graham Cobb
2017-02-02  0:02           ` Duncan
2017-02-02 10:52             ` Graham Cobb
2017-02-02 12:49               ` Austin S. Hemmelgarn
2017-02-03  9:14               ` Duncan
2017-02-03 12:44                 ` Austin S. Hemmelgarn
2017-02-03 15:44                   ` Graham Cobb
2017-02-03 16:01                     ` Austin S. Hemmelgarn
2017-02-03 19:17                       ` Graham Cobb
2017-02-03 19:37                         ` Austin S. Hemmelgarn
2017-02-05 12:08               ` Kai Krakow
2017-02-06 22:56                 ` Graham Cobb
2017-02-05 11:54       ` Kai Krakow
2017-02-06 12:30         ` Austin S. Hemmelgarn
2017-02-06 21:40           ` Kai Krakow

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.