* Subvolume UUID, data corruption?
@ 2015-12-04 12:05 S.J
  2015-12-04 13:07 ` Hugo Mills
  0 siblings, 1 reply; 51+ messages in thread
From: S.J @ 2015-12-04 12:05 UTC (permalink / raw)
  To: linux-btrfs

Hello

As we know, two file systems with the same UUID (as reported by e.g. "blkid") are problematic; in particular, mounting both at the same time leads to data corruption. So copying a BTRFS partition to another one with e.g. dd and using the copy immediately is a bad idea. To prevent this, "btrfstune -u /dev/sdaX" changes the UUID of the given partition.

However, BTRFS subvolumes have their own UUIDs, which can be viewed e.g. with "btrfs sub list -u /mountpoint". These UUIDs are not changed by the command above, and apparently there is no other way to change them.

My question is: is this a problem similar to the main UUID? Can mounting two BTRFS partitions with equal subvolume UUIDs (but different main UUIDs) cause data corruption?

(...well, and maybe someone could explain to me what these subvol UUIDs are for in the first place. Subvolumes already have a unique number, and from the user's p.o.v. there doesn't seem to be anything the subvol UUIDs can be used for at all (?))

Thank you

PS: Apologies for sending a second mail, somehow my first try didn't contain any text

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: Subvolume UUID, data corruption? 2015-12-04 12:05 Subvolume UUID, data corruption? S.J @ 2015-12-04 13:07 ` Hugo Mills 2015-12-05 3:28 ` Christoph Anton Mitterer 0 siblings, 1 reply; 51+ messages in thread From: Hugo Mills @ 2015-12-04 13:07 UTC (permalink / raw) To: S.J; +Cc: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 2166 bytes --] On Fri, Dec 04, 2015 at 01:05:28PM +0100, S.J wrote: > Hello > > As we know, two file systems with the same UUID (like reported by eg. "blkid") are problematic, especially if both are mounted at the same time it leads to data corruption. So, copying a BTRFS partition with eg. dd to another and use it immediately is bad. To prevent this, "btrfstune -u /dev/sdaX" changes the UUID of the given partition. > > However, BTRFS subvolumes have their own UUID, which can be viewed eg. with "btrfs sub list -u /mountpoint". This UUIDs are not changed by the command above, and apparently there is no other way to do this. > > My question is: Is this a problem similar to the main UUID? Can mounting two BTRFS partitions with equal subvolume UUIDs (but different main UUID) can cause data corruption? I don't think it'll cause problems. The UUIDs on subvols are only really used internally to that filesystem, so the kernel doesn't have a chance to get confused. The main thing that could be confused is send/receive, but that's a matter of possibly losing some validation (thus allowing you to do something that will fail) rather than causing active damage, as in the duplicate-FS-UUID case. > (...well, and maybe someone could explain me what these subvol UUIDs are for in the first place. Subvolumes already have an unique number, and from user p.o.v, there isn't anything where the subvol UUIDs can be used at all (?)) The subvol UUIDs are used to identify them through send/receive operations. 
There are three main UUID fields on a subvol: the actual UUID (u), the Received_UUID (r) and the Parent_UUID (p), and these are used to identify whether an incremental send could function correctly when received. (I can give you chapter and verse on how they're used if you like, but that's a bit excessive just for answering your question here). Hugo. > Thank you > > PS: Apologies for sending a second mail, somehow my first try didn't contain any text -- Hugo Mills | Do not meddle in the affairs of system hugo@... carfax.org.uk | administrators, for they are subtle, and quick to http://carfax.org.uk/ | anger. PGP: E2AB1DE4 | [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
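A rough illustration of the three fields Hugo lists (UUID, Received_UUID, Parent_UUID) and how a receiver could use them to decide whether an incremental stream has a usable parent. This is only a sketch under stated assumptions, not the btrfs implementation: the `Subvol` record and the `find_parent` helper are hypothetical names, and the lookup order (Received_UUID first, then UUID) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Subvol:
    """Hypothetical model of the UUID fields shown by 'btrfs sub list -u'."""
    path: str
    uuid: str                            # the subvolume's own UUID (u)
    parent_uuid: Optional[str] = None    # UUID of the subvol it was snapshotted from (p)
    received_uuid: Optional[str] = None  # UUID it had on the sending side (r)


def find_parent(receiver_subvols: Iterable[Subvol],
                stream_parent_uuid: str) -> Optional[Subvol]:
    """Return the local subvol an incremental stream could be applied on top of.

    Assumption for this sketch: a subvol that was itself received is
    identified by received_uuid; otherwise fall back to its own uuid.
    """
    subvols = list(receiver_subvols)
    for sv in subvols:
        if sv.received_uuid == stream_parent_uuid:
            return sv
    for sv in subvols:
        if sv.uuid == stream_parent_uuid:
            return sv
    return None


# Hypothetical example: "snap1" on the receiver was received from a sender
# subvol whose UUID was "a1", so a stream naming parent "a1" can be applied.
local = [Subvol(path="snap1", uuid="b1", received_uuid="a1")]
match = find_parent(local, "a1")
print(match.path if match else "no usable parent")
```

Since a dd clone carries identical subvolume UUIDs, a lookup of this kind would presumably also succeed against the clone, which is the "losing some validation" case described above rather than active damage.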
* Re: Subvolume UUID, data corruption?
  2015-12-04 13:07 ` Hugo Mills
@ 2015-12-05  3:28   ` Christoph Anton Mitterer
  2015-12-05  5:52     ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer
                       ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Christoph Anton Mitterer @ 2015-12-05  3:28 UTC (permalink / raw)
  To: Hugo Mills; +Cc: linux-btrfs

On Fri, 2015-12-04 at 13:07 +0000, Hugo Mills wrote:
>    I don't think it'll cause problems.
Is there any guaranteed behaviour when btrfs encounters two filesystems (i.e. not talking about the subvols now) with the same UUID?

Given that it's long-standing behaviour that people could clone filesystems (dd, etc.) and this just worked™, btrfs should at least handle such a case gracefully.
For example, when more than one block device with a btrfs of the same UUID is already known, it should refuse to mount any of them.

And if one is already known and another device pops up, it should refuse to mount that one and continue to use the already mounted one normally.

Cheers,
Chris.

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)
  2015-12-05  3:28   ` Christoph Anton Mitterer
@ 2015-12-05  5:52     ` Christoph Anton Mitterer
  2015-12-05 12:01     ` Subvolume UUID, data corruption? Hugo Mills
  2015-12-05 13:19     ` Duncan
  2 siblings, 0 replies; 51+ messages in thread
From: Christoph Anton Mitterer @ 2015-12-05  5:52 UTC (permalink / raw)
  To: Hugo Mills; +Cc: linux-btrfs

Thinking a bit more about that, I came to the conclusion that it's actually security relevant that btrfs deals gracefully with filesystems having the same UUID:

Getting to know someone else's filesystem UUID may be easier than one might think.
It's usually not considered secret and is, for example, included in debug reports (e.g. several Debian packages do this).

The only thing an attacker then needs to do is somehow make another filesystem with that UUID available on the victim's system.
The simplest way is via a USB stick when he has local access.
Thanks to some stupid desktop environments, chances aren't too bad that the system will even automount the stick.

If btrfs doesn't handle this gracefully, the attacker may damage or destroy the original filesystem, or, if things get awkwardly corrupted (and data is written to the fake btrfs), even get data out of such a system (despite any screen locks or dm-crypt).

Cheers
Chris.

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: Subvolume UUID, data corruption? 2015-12-05 3:28 ` Christoph Anton Mitterer 2015-12-05 5:52 ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer @ 2015-12-05 12:01 ` Hugo Mills 2015-12-06 1:51 ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer 2015-12-11 12:33 ` Subvolume UUID, data corruption? Austin S. Hemmelgarn 2015-12-05 13:19 ` Duncan 2 siblings, 2 replies; 51+ messages in thread From: Hugo Mills @ 2015-12-05 12:01 UTC (permalink / raw) To: Christoph Anton Mitterer; +Cc: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 1731 bytes --] On Sat, Dec 05, 2015 at 04:28:24AM +0100, Christoph Anton Mitterer wrote: > On Fri, 2015-12-04 at 13:07 +0000, Hugo Mills wrote: > > I don't think it'll cause problems. > Is there any guaranteed behaviour when btrfs encounters two filesystems > (i.e. not talking about the subvols now) with the same UUID? Nothing guaranteed, but the likelihood is that things will go badly wrong, in the sense of corrupt filesystems. > Given that it's long standing behaviour that people could clone > filesystems (dd, etc.) and this just worked™, btrfs should at least > handle such case gracefully. > For example, when already more than one block device with a btrfs of > the same UUID are known, then it should refuse to mount any of them. > And if one is already known and another device pops up it should refuse > to mount that and continue to normally use the already mounted one. Except that that's exactly the mechanism that btrfs uses to handle multi-device filesystems, so you've just broken anything with more than one device in the FS. If you inspect the devid on each device as well, and refuse duplicates of those, you've just broken any multipathing configurations. 
Even if you can handle that, if you have two copies of dev1, and two copies of dev2, how do you guarantee that the "right" pair of dev1 and dev2 is selected? (e.g. if you have them as network devices, and the device enumeration order is unstable on each boot). Hugo. -- Hugo Mills | Geek, n.: hugo@... carfax.org.uk | Circus sideshow performer specialising in the eating http://carfax.org.uk/ | of live animals. PGP: E2AB1DE4 | OED [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
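To make the selection problem concrete: members of one multi-device filesystem share the filesystem UUID and are told apart by devid, so once clones are present each devid has several candidates. The sketch below is only a hypothetical model (the function name, the tuple layout and the device paths are invented for illustration); it enumerates every assembly that looks valid if fsid and devid are the only information available.

```python
from collections import defaultdict
from itertools import product


def candidate_assemblies(devices):
    """Enumerate possible device sets per filesystem UUID.

    devices: iterable of (path, fsid, devid) tuples (hypothetical layout).
    Returns {fsid: [tuple_of_paths, ...]} with one path chosen per devid.
    """
    by_fs = defaultdict(lambda: defaultdict(list))
    for path, fsid, devid in devices:
        by_fs[fsid][devid].append(path)

    result = {}
    for fsid, by_devid in by_fs.items():
        # One "slot" per devid, each slot holding every device claiming it.
        slots = [sorted(by_devid[devid]) for devid in sorted(by_devid)]
        result[fsid] = list(product(*slots))
    return result


# Hypothetical example: a 2-device filesystem plus a block-level copy of it.
seen = [
    ("/dev/sda1", "fs-A", 1), ("/dev/sdb1", "fs-A", 2),   # original pair
    ("/dev/sdc1", "fs-A", 1), ("/dev/sdd1", "fs-A", 2),   # cloned pair
]
for combo in candidate_assemblies(seen)["fs-A"]:
    print(combo)
```

With two copies of each of two devices there are four combinations, and nothing in the fields modelled here says which dev1 belongs with which dev2; with unstable enumeration order, "first seen" is not a reliable tiebreaker either.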
* Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) 2015-12-05 12:01 ` Subvolume UUID, data corruption? Hugo Mills @ 2015-12-06 1:51 ` Christoph Anton Mitterer 2015-12-11 12:33 ` Subvolume UUID, data corruption? Austin S. Hemmelgarn 1 sibling, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-06 1:51 UTC (permalink / raw) To: Hugo Mills; +Cc: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 2797 bytes --] On Sat, 2015-12-05 at 12:01 +0000, Hugo Mills wrote: > On Sat, Dec 05, 2015 at 04:28:24AM +0100, Christoph Anton Mitterer > wrote: > > On Fri, 2015-12-04 at 13:07 +0000, Hugo Mills wrote: > > > I don't think it'll cause problems. > > Is there any guaranteed behaviour when btrfs encounters two > > filesystems > > (i.e. not talking about the subvols now) with the same UUID? > > Nothing guaranteed, but the likelihood is that things will go > badly > wrong, in the sense of corrupt filesystems. Phew... well sorry, but I think that's really something that makes btrfs not productively usable until fixed. > Except that that's exactly the mechanism that btrfs uses to handle > multi-device filesystems, so you've just broken anything with more > than one device in the FS. Don't other containers (e.g. LVM) do something similar, and yet they don't fail badly in case e.g. multipl PVs with the same UUID appear, AFAIC. And shouldn't there be some kind of device UUID, which differs different parts of the same btrfs (with the same fs UUID) but on different devices?! > If you inspect the devid on each device as well, and refuse > duplicates of those, you've just broken any multipathing > configurations. Well, how many people are actually doing this? A minority. So then it would be simply necessary that multipathing doesn't work out of the box and one need to specifically tell the kernel to consider a device with the same btrfs UUID as not a clone but another path to the same device. 
In any case, a rare feature like multipathing cannot justify the possibility of data corruption. The situation as it is now is IMHO completely unacceptable.

>    Even if you can handle that, if you have two copies of dev1, and
> two copies of dev2, how do you guarantee that the "right" pair of dev1
> and dev2 is selected? (e.g. if you have them as network devices, and
> the device enumeration order is unstable on each boot).
Not sure what you mean now: the multipathing case? Then, as I've said, such situations would simply require setting things up manually and explicitly telling the kernel that the devices foo and bar are to be used (despite their dup UUID).

If you mean what happens when I have e.g. two clones of a 2-device btrfs, as in
fsdev1
fsdev2
fsdev1_clone
fsdev2_clone
then as I've said before... if one pair of them is already mounted (i.e. when the *_clone devices appear), then it's likely that these actually belong together, and the kernel should continue to use them and ignore any others.
If all appear before any is mounted, then it should either refuse to mount/use any of them, or require the user to specify manually which devices are to be used (i.e. via /dev/sda or so).

Cheers,
Chris.

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: Subvolume UUID, data corruption? 2015-12-05 12:01 ` Subvolume UUID, data corruption? Hugo Mills 2015-12-06 1:51 ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer @ 2015-12-11 12:33 ` Austin S. Hemmelgarn 1 sibling, 0 replies; 51+ messages in thread From: Austin S. Hemmelgarn @ 2015-12-11 12:33 UTC (permalink / raw) To: Hugo Mills, Christoph Anton Mitterer, linux-btrfs On 2015-12-05 07:01, Hugo Mills wrote: > On Sat, Dec 05, 2015 at 04:28:24AM +0100, Christoph Anton Mitterer wrote: >> On Fri, 2015-12-04 at 13:07 +0000, Hugo Mills wrote: >>> I don't think it'll cause problems. >> Is there any guaranteed behaviour when btrfs encounters two filesystems >> (i.e. not talking about the subvols now) with the same UUID? > > Nothing guaranteed, but the likelihood is that things will go badly > wrong, in the sense of corrupt filesystems. > >> Given that it's long standing behaviour that people could clone >> filesystems (dd, etc.) and this just worked™, btrfs should at least >> handle such case gracefully. >> For example, when already more than one block device with a btrfs of >> the same UUID are known, then it should refuse to mount any of them. >> And if one is already known and another device pops up it should refuse >> to mount that and continue to normally use the already mounted one. > > Except that that's exactly the mechanism that btrfs uses to handle > multi-device filesystems, so you've just broken anything with more > than one device in the FS. > > If you inspect the devid on each device as well, and refuse > duplicates of those, you've just broken any multipathing > configurations. This already potentially breaks multipath configurations, as well as dm-cache, some soft raid configurations, and probably other things as well. > > Even if you can handle that, if you have two copies of dev1, and > two copies of dev2, how do you guarantee that the "right" pair of dev1 > and dev2 is selected? (e.g. 
if you have them as network devices, and > the device enumeration order is unstable on each boot). In some cases it can be done without much effort. Take dm-cache for example. The hierarchy of devices in a dm-cache device looks like this: cached-device + backing-device + cache-pool + pool-storage + pool-metadata At a minimum, the cached device and the backing device contain identical data (the cached-device just has a writeback or writethrough cache on it), and the pool storage device may under some circumstances look like a BTRFS filesystem as well. In this case, it's pretty obvious that the only device that BTRFS should be accessing is the cached device, not the backing device or the pool storage device. For this, if we simply blacklist all devices that are themselves components in device-mapper tables, then we avoid the issue here, and possibly in some other as of yet undiscovered cases. ^ permalink raw reply [flat|nested] 51+ messages in thread
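The blacklist idea above can be sketched as a simple filter: skip every candidate that is itself a component of another block device. This is a hypothetical illustration, not existing btrfs or udev behaviour; `usable_candidates` and the device names are made up, and the `holders` mapping is assumed to be filled from something like the `/sys/block/<dev>/holders/` directories on Linux.

```python
def usable_candidates(candidates, holders):
    """Drop devices that are components of another (stacked) block device.

    candidates: list of device names that look like btrfs members.
    holders: dict mapping a device name to the set of devices stacked on
             top of it (assumed to come from sysfs; an empty or missing
             entry means nothing is built on that device).
    """
    return [dev for dev in candidates if not holders.get(dev)]


# Hypothetical dm-cache layout: dm-3 is the cached device; the backing
# device (sdb1) and the pool storage (dm-1) are components of it.
candidates = ["sdb1", "dm-1", "dm-3"]
holders = {"sdb1": {"dm-3"}, "dm-1": {"dm-3"}}
print(usable_candidates(candidates, holders))
```

Only the top of the stack survives the filter, which matches the observation that btrfs should be accessing the cached device and nothing beneath it. A plain dd clone on a separate disk has no holders, so this filter alone would not catch that case.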
* Re: Subvolume UUID, data corruption? 2015-12-05 3:28 ` Christoph Anton Mitterer 2015-12-05 5:52 ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer 2015-12-05 12:01 ` Subvolume UUID, data corruption? Hugo Mills @ 2015-12-05 13:19 ` Duncan 2015-12-06 1:51 ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer 2 siblings, 1 reply; 51+ messages in thread From: Duncan @ 2015-12-05 13:19 UTC (permalink / raw) To: linux-btrfs Christoph Anton Mitterer posted on Sat, 05 Dec 2015 04:28:24 +0100 as excerpted: > On Fri, 2015-12-04 at 13:07 +0000, Hugo Mills wrote: >> I don't think it'll cause problems. > Is there any guaranteed behaviour when btrfs encounters two filesystems > (i.e. not talking about the subvols now) with the same UUID? > > Given that it's long standing behaviour that people could clone > filesystems (dd, etc.) and this just worked™, btrfs should at least > handle such case gracefully. > For example, when already more than one block device with a btrfs of the > same UUID are known, then it should refuse to mount any of them. > > And if one is already known and another device pops up it should refuse > to mount that and continue to normally use the already mounted one. The problem with btrfs is that because (unlike traditional filesystems) it's multi-device, it needs some way to identify what devices belong to a particular filesystem. And UUID is, by definition and expansion, Universally Unique ID. Btrfs simply depends on it being what it says on the the tin, universally unique, to ID the components of the filesystem and assemble them correctly. Besides dd, etc, LVM snapshots are another case where this goes screwy. 
If the UUID isn't UUID, do a btrfs device scan (which udev normally does by default these days) so the duplicate UUID is detected, and btrfs *WILL* eventually start trying to write to all the "newly added" devices that scan found, identified by their Universally Unique IDs, aka UUIDs. It's not a matter of if, but when. And the UUID is embedded so deeply within the filesystem and its operations, as an inextricable part of the metadata (thus avoiding the problem reiserfs had where a reiserfs stored in a loopback file on a reiserfs, would screw up reiserfsck, on btrfs, the loopback file would have a different UUID and thus couldn't be mixed up), that changing the UUID is not the simple operation of changing a few bytes in the superblock that it is on other filesystems, which is why there's now a tool to go thru all those metadata entries and change it. So an aware btrfs admin simply takes pains to avoid triggering a btrfs device scan at the wrong time, and to immediately hide their LVM snapshots, immediately unplug their directly dd-ed devices, etc, and thus doesn't have to deal with the filesystem corruption that'd be a when not if, if they didn't take such precautions with their dupped UUIDs that actually aren't as UUID as the name suggests... And as your followup suggests in a security context, they consider masking out their UUIDs before posting them, as well, tho most kernel hackers generally consider unsupervised physical access to be game-over, security-wise. (After all, in that case there's often little or nothing preventing a reboot to that USB stick, if desired, or simply yanking the devices and duping them or plugging them in elsewhere, if the BIOS is password protected, with the only thing standing in the way at that point being possible device encryption.) The UUID *as* a UUID, _unique_ at least on that system (if not actually universally) as it says on the tin, is so deeply embedded in btrfs that at this point it's not going to be removed. 
The only real alternative if you don't like it is using a different filesystem. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) 2015-12-05 13:19 ` Duncan @ 2015-12-06 1:51 ` Christoph Anton Mitterer 2015-12-06 4:06 ` Duncan 2015-12-06 14:34 ` attacking btrfs filesystems via UUID collisions? Qu Wenruo 0 siblings, 2 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-06 1:51 UTC (permalink / raw) To: Duncan, linux-btrfs [-- Attachment #1: Type: text/plain, Size: 6099 bytes --] On Sat, 2015-12-05 at 13:19 +0000, Duncan wrote: > The problem with btrfs is that because (unlike traditional > filesystems) > it's multi-device, it needs some way to identify what devices belong > to a > particular filesystem. Sure, but that applies to lvm, or MD as well... and I wouldn't know of any random corruption issues there. > And UUID is, by definition and expansion, Universally Unique ID. Nitpicking doesn't help here,... reality is they're not,.. either by people doing stuff like dd, other forms of clones, LVM, etc. ... or as I've described maliciously. > Btrfs > simply depends on it being what it says on the the tin, universally > unique, to ID the components of the filesystem and assemble them > correctly. Admittedly, I'm not an expert to the internals of btrfs, but it seems other multi-device containers can handle UUID duplicates fine, or at least so that you don't get any data corruption (or leaks). This is a showstopper - maybe not under lab conditions but surely under real world scenarios. I'm actually quite surprised that no-one else didn't complain about that before, given how long btrfs exists. > Besides dd, etc, LVM snapshots are another case where this goes > screwy. > If the UUID isn't UUID, do a btrfs device scan (which udev normally > does > by default these days) so the duplicate UUID is detected, and btrfs > *WILL* eventually start trying to write to all the "newly added" > devices > that scan found, identified by their Universally Unique IDs, aka > UUIDs. 
> It's not a matter of if, but when. Well.. as I said... quite scary, with respect to both, accidental and malicious cases of duplicate UUIDs. > And the UUID is embedded so deeply within the filesystem and its > operations, as an inextricable part of the metadata (thus avoiding > the > problem reiserfs had where a reiserfs stored in a loopback file on a > reiserfs, would screw up reiserfsck, on btrfs, the loopback file > would > have a different UUID and thus couldn't be mixed up), that changing > the > UUID is not the simple operation of changing a few bytes in the > superblock > that it is on other filesystems, which is why there's now a tool to > go > thru all those metadata entries and change it. I don't think that this design is per se bad and prevents the kernel to handle such situations gracefully. I would expect that in addition to the fs UUID, it needs a form of device ID... so why not simply ignoring any new device for which there already is a matching fs UUID and device ID, unless the respective tool (mount, btrfs, etc.) is explicitly told so via some device=/dev/sda,/dev/sdb option. If that means that less things work out of the box (in the sense of "auto-assembly") well than this is simply necessary. data security and consistency is definitely much more important than any fancy auto-magic. > So an aware btrfs admin simply takes pains to avoid triggering a > btrfs > device scan at the wrong time, and to immediately hide their LVM > snapshots, immediately unplug their directly dd-ed devices, etc, and > thus > doesn't have to deal with the filesystem corruption that'd be a when > not > if, if they didn't take such precautions with their dupped UUIDs that > actually aren't as UUID as the name suggests... a) People shouldn't need to do days of study to be able to use btrfs securely. Of course it's more advanced and not everything can be simplified in a way so that users don't need to know anything (e.g. all the well-known effects of CoW)... 
but when the point is reached where security and data integrity is threatened, there's definitely a hard border that mustn't be crossed. b) Given how complex software is, I doubt that it's easily possible, even for the aware admin, to really prevent all situations that can lead to such situations. Not to talk about about any attack-scenarios. > And as your followup suggests in a security context, they consider > masking out their UUIDs before posting them, as well, tho most kernel > hackers generally consider unsupervised physical access to be game- > over, > security-wise. Do they? I rather thought many of them had a rather practical and real- world-situations-based POV. > (After all, in that case there's often little or nothing > preventing a reboot to that USB stick, if desired, or simply yanking > the > devices and duping them or plugging them in elsewhere, if the BIOS is > password protected, with the only thing standing in the way at that > point > being possible device encryption.) There's hardware which would, when it detects physicals intrusion (like yanking) lock up itself (securely clearing the memory, disconnecting itself from other nodes, which may be compromised as well, when the filesystem on the attacked node would go crazy. You have things like ATMs, which are physically usually quite well secured, but which do have rather easily accessible maintenance ports. All of us have seen such embedded devices rebooting themselves, where you see kernel messages. That's the point where an attacker could easily get the btrfs UUID: [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.2.0-1-amd64 root=UUID=bd1ea5a0-9bba-11e5-82fa-502690aa641f If you can attack such devices already by just having access to a USB port... then holly sh**... > The only real > alternative if > you don't like it is using a different filesystem. As I've said, I don't have a problem with UUIDs... 
I just can't quite believe that btrfs and the userland cannot be modified so that they handle such cases gracefully.
If not, then, to be quite honest, that would really be a major showstopper for many usage areas.
And I'm not only talking about ATMs (or any other embedded devices to which people may have unsupervised access, e.g. TVs in a mall, entertainment systems in planes) but also about normal desktops/laptops where colleagues, fellow students, etc. may want to play some "prank".

Cheers,
Chris.

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) 2015-12-06 1:51 ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer @ 2015-12-06 4:06 ` Duncan 2015-12-09 5:07 ` Christoph Anton Mitterer 2015-12-06 14:34 ` attacking btrfs filesystems via UUID collisions? Qu Wenruo 1 sibling, 1 reply; 51+ messages in thread From: Duncan @ 2015-12-06 4:06 UTC (permalink / raw) To: linux-btrfs Christoph Anton Mitterer posted on Sun, 06 Dec 2015 02:51:20 +0100 as excerpted: > You have things like ATMs, which are physically usually quite well > secured, but which do have rather easily accessible maintenance ports. > All of us have seen such embedded devices rebooting themselves, where > you see kernel messages. > That's the point where an attacker could easily get the btrfs UUID: > [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.2.0-1-amd64 > root=UUID=bd1ea5a0-9bba-11e5-82fa-502690aa641f > > If you can attack such devices already by just having access to a USB > port... then holly sh**... There's actually a number of USB-based hardware and software vulns out there, from the under $10 common-component-capacitor-based charge-and-zap (charges off the 5V USB line, zaps the port with several hundred volts reverse-polarity, if the machine survives the first pulse and continues supplying 5V power, repeat...), to the ones that act like USB-based input devices and "type" in whatever commands, to simple USB-boot to a forensic distro and let you inspect attached hardware (which is where the encrypted storage comes in, they've got everything that's not encrypted), to the plain old fashioned boot-sector viruses that quickly jump to everything else on the system that's not boot-sector protected and/or secure-boot locked, to... Which is why most people in the know say if you have unsupervised physical access, you effectively own the machine and everything on it, at least that's not encrypted. 
There's a reason some places hot-glue the USB ports. If you're plugging anything untrusted into them... and that's a well known social engineering hack as well, simply drop a few thumb drives in the target parking lot and wait to see who picks them up and plugs them in, so they can call home... Pen-testers do it. NSA does it. It's said a form of that is how they bridged the air-gap to the Iranian centrifuges... If you haven't been keeping up, you really have some reading to do. If you're plugging in untrusted USB devices, seriously, a thumb drive with a few duplicated btrfs UUIDs is the least of your worries! >> The only real alternative if you don't like it is using a different >> filesystem. > As I've said, I don't have a problem with UUIDs... I just can't quite > believe that btrfs and the userland cannot be modified so that it > handles such cases gracefully. As I implied, UUIDs usage is so deeply embedded, fixing btrfs to not work that way is pretty much impossible. You'd be pretty much starting from scratch and using some of the same ideas; it wouldn't be btrfs any longer. > If not, than, to be quite honest, that would be really a major > showstopper for many usage areas. Consider the show stopped, then. > And I'm not talking about ATMs (or any other embedded devices where > people may have non-supervides access - e.g. TVs in a mall, > entertainment systems in planes) but also the normal desktops/laptops > where colleagues, fellow students, etc. may want to play some "prank". As I said, if you're plugging in or allowing to be plugged in untrusted USB devices, show's over, they're already playing pretty much any prank they want, including zapping the hardware. USB's now less trusted than a raw Internet hookup with all services exposed. The only controlling factor now is the physical presence limitation, and if you're plugging in devices you get for instance as "gifts just for trying us out" or whatever, that someone mails to you... 
worse than running MS and mindlessly running any exe someone sends you.

BTW, this is documented (in somewhat simpler "do not do XX" form) on the wiki, gotchas page.

https://btrfs.wiki.kernel.org/index.php/Gotchas#Block-level_copies_of_devices

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) 2015-12-06 4:06 ` Duncan @ 2015-12-09 5:07 ` Christoph Anton Mitterer 2015-12-09 11:54 ` Duncan 0 siblings, 1 reply; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-09 5:07 UTC (permalink / raw) To: Duncan, linux-btrfs [-- Attachment #1: Type: text/plain, Size: 4170 bytes --] On Sun, 2015-12-06 at 04:06 +0000, Duncan wrote: > There's actually a number of USB-based hardware and software vulns > out > there, from the under $10 common-component-capacitor-based charge- > and-zap > (charges off the 5V USB line, zaps the port with several hundred > volts > reverse-polarity, if the machine survives the first pulse and > continues > supplying 5V power, repeat...), to the ones that act like USB-based > input > devices and "type" in whatever commands, to simple USB-boot to a > forensic > distro and let you inspect attached hardware (which is where the > encrypted > storage comes in, they've got everything that's not encrypted), > to the plain old fashioned boot-sector viruses that quickly jump to > everything else on the system that's not boot-sector protected and/or > secure-boot locked, to... Well this is all well known - at least to security folks ;) - but to be quite honest: Not an excuse for allowing even more attack surface, in this case via the filesystem. One will *always* find a weaker element in the security chain, and could always argue with that not to fixe one's own issues. "Well, there's no need to fix that possible collision-data-leakage- issue in btrfs[0]! Why? Well an attacker could still simply abduct the bank manager, torture him for hours until he gives any secret with pleasure" ;-) > Which is why most people in the know say if you have unsupervised > physical > access, you effectively own the machine and everything on it, at > least > that's not encrypted. Sorry, I wouldn't say so. 
Ultimately you're of course right, which is why my fully-dm-crypted notebook is never left alone when it runs (cold boot or USB firmware attacks)... but in practise things are a bit different I think. Take the ATM example. Or take real world life in big computing centres. Fact is, many people usually have access, from the actual main personnel, over electricians to the cleaning personnel. Whacking a device or attacking it via USB firmware tricks, is of course possible for them, but it's much more likely to be noticed (making noise, taking time and so on),... so there is no need to give another attack surface by this.
> If you haven't been keeping up, you really have some reading to do. If you're plugging in untrusted USB devices, seriously, a thumb drive with a few duplicated btrfs UUIDs is the least of your worries!
Well as I've said, getting that in via USB may be only one way. We're already so far that GNOME&Co. automount devices when plugged... who says the next step isn't that this happens remotely in some form, e.g. btrfs-image on dropbox, automounted by nautilus. Okay, that may be a bit constructed, but it should demonstrate that there could be plenty of ways for that to happen, which we don't even think of (and usually these are the worst in security). You said it's basically not fixable in btrfs: It's absolutely clear that I'm no btrfs expert (or even developer), but my poor man approach which I think I've written before doesn't seem so impossible, does it?
1) Don't simply "activate" btrfs devices that are found but rather:
2) Check if there are other devices of the same fs UUID + device ID, or more generally said: check if there are any collisions
3) If there are, and some of them are already active, continue to use them, don't activate the newly appeared ones
4) If there are, and none of them are already active, refuse to activate *any* of them unless the user manually instructs to do so via device= like options.
> BTW, this is documented (in somewhat simpler "do not do XX" form) on the wiki, gotchas page.
> https://btrfs.wiki.kernel.org/index.php/Gotchas#Block-level_copies_of_devices
I know, but it doesn't really tell all possible consequences, and again, it's unlikely that the end-user (even if possibly heavily affected by it) will stumble over that. Cheers, Chris.
[0] Assuming there is actually one, I haven't really verified that and base it solely on what people said, namely that basically arbitrary corruptions may happen on both devices.
[-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
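The four-step proposal in the message above can be sketched in a few lines of Python. This is only an illustration of the proposed policy, not how btrfs or btrfs-progs behaves: the Device record, the devices_to_activate function and the explicit_paths parameter (standing in for a device= style option) are all hypothetical names. One further assumption goes beyond the proposal: when nothing in a colliding group is active, a device is accepted only if the user named exactly one member of that group.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A scanned block device (hypothetical record, not a kernel structure)."""
    path: str
    fsid: str             # filesystem UUID
    devid: int            # device ID within that filesystem
    active: bool = False  # already in use by a mounted filesystem


def devices_to_activate(scanned, explicit_paths=frozenset()):
    """Return the sorted paths that may be newly activated."""
    # Step 2: group by (fs UUID, device ID) to find collisions.
    groups = defaultdict(list)
    for dev in scanned:
        groups[(dev.fsid, dev.devid)].append(dev)

    allowed = []
    for members in groups.values():
        if len(members) == 1:
            # No collision: activate as usual unless it is already active.
            if not members[0].active:
                allowed.append(members[0].path)
            continue
        if any(dev.active for dev in members):
            # Step 3: keep using the active device, ignore the newcomers.
            continue
        # Step 4: nothing active, so require an unambiguous explicit choice
        # (assumption: exactly one member of the group was named).
        chosen = [dev for dev in members if dev.path in explicit_paths]
        if len(chosen) == 1:
            allowed.append(chosen[0].path)
    return sorted(allowed)
```

With an active /dev/sda1 and a clone /dev/sdc1 sharing its fs UUID and device ID, the clone is never returned; two inactive clones are both held back until exactly one of them is named explicitly.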
* Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) 2015-12-09 5:07 ` Christoph Anton Mitterer @ 2015-12-09 11:54 ` Duncan 0 siblings, 0 replies; 51+ messages in thread From: Duncan @ 2015-12-09 11:54 UTC (permalink / raw) To: linux-btrfs Christoph Anton Mitterer posted on Wed, 09 Dec 2015 06:07:38 +0100 as excerpted: > Well as I've said, getting that in via USB may be only one way. > We're already so far that GNOME&Co. automount devices when plugged... Ugh. ... And many know that's the sort of thing that made MS so much of a security headache, and want no part of it! FWIW, of course gentoo allows far more configurability in this regard than many distros, but no automount here, and while I don't do gnome because I like my system configurable and they'd just as soon it be their way or the highway (echoes of proprietaryware attitude there if you ask me, but I'm very glad gnome's available for them to work on as otherwise they'd be troubling kde and etc to go the same way), I do have a much more limited than usual kde installed, without stuff like the device notifier plasmoid or underlying infrastructure like udisks, as the only things I want mounted are the things I've either configured to be mounted via fstab, or the thing's I've manually mounted. (FWIW, the semantic- desktop crap is opted out at build-time too, so it's not even there to turn off at runtime, the best most distros allow for those not interested in that stuff. It meant dumping a few apps and some missing features in others, but I don't have indexing taking gigs of space and major IO bandwidth at the most inconvenient times (any time!) for nothing I'm going to make use of, either!) > You said it's basically not fixable in btrfs: > It's absolutely clear that I'm no btrfs expert (or even developer), but > my poor man approach which I think I've written before doesn't seem so > impossible, does it? 
> 1) Don't simply "activate" btrfs devices that are found but rather:
> 2) Check if there are other devices of the same fs UUID + device ID, or more generally said: check if there are any collisions
> 3) If there are, and some of them are already active, continue to use them, don't activate the newly appeared ones
> 4) If there are, and none of them are already active, refuse to activate *any* of them unless the user manually instructs to do so via device= like options.
The underlying issue pretty much isn't fixable, but as Qu has suggested on that subthread, there are ameliorations that can be done, basically in line with your suggestions above, and you've indicated that you'd consider that fixed, tho neither he nor I consider it "fixed", only hidden to some extent. Anyway, he's a dev and actively involved now, while I'm not a dev, so he can take it from there. =:^)
-- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-06 1:51 ` attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?) Christoph Anton Mitterer 2015-12-06 4:06 ` Duncan @ 2015-12-06 14:34 ` Qu Wenruo 2015-12-06 20:55 ` Chris Murphy 2015-12-09 5:39 ` Christoph Anton Mitterer 1 sibling, 2 replies; 51+ messages in thread From: Qu Wenruo @ 2015-12-06 14:34 UTC (permalink / raw) To: Christoph Anton Mitterer, Duncan, linux-btrfs On 12/06/2015 09:51 AM, Christoph Anton Mitterer wrote: > On Sat, 2015-12-05 at 13:19 +0000, Duncan wrote: >> The problem with btrfs is that because (unlike traditional >> filesystems) >> it's multi-device, it needs some way to identify what devices belong >> to a >> particular filesystem. > Sure, but that applies to lvm, or MD as well... and I wouldn't know of > any random corruption issues there. Not sure about LVM/MD, but they should suffer the same UUID conflict problem. The only idea I have can only enhance the behavior, but never fix it. For example, if found multiple btrfs devices with same devid, just refuse to mount. And for already mounted btrfs, ignore any duplicated fsid/devid. The problem can get even tricky for case like device missing for a while and appear again case. But just as you mentioned, it *IS* a real problem, and we should need to enhance it. > > >> And UUID is, by definition and expansion, Universally Unique ID. > Nitpicking doesn't help here,... reality is they're not,.. either by > people doing stuff like dd, other forms of clones, LVM, etc. ... or as > I've described maliciously. > > >> Btrfs >> simply depends on it being what it says on the the tin, universally >> unique, to ID the components of the filesystem and assemble them >> correctly. > Admittedly, I'm not an expert to the internals of btrfs, but it seems > other multi-device containers can handle UUID duplicates fine, or at > least so that you don't get any data corruption (or leaks). 
I'd like to see how LVM/DM behaves first, at least as a reference if they are really so safe. For example, I have a whole disk as the following configuration:

0              10G             20G
|    test_lv    |               |
-----------------
|            test_vg            |
---------------------------------
|            test_pv            |
---------------------------------
|           /dev/sdb            |
---------------------------------

If I did a dd copy of /dev/sdb to /dev/sdc, what will pv/vg/lv rescan show if test_pv/vg/lv is already active? And what will rescan show if they are not active? Or after a reboot?
> > This is a showstopper - maybe not under lab conditions but surely under > real world scenarios. > I'm actually quite surprised that no-one else didn't complain about > that before, given how long btrfs exists. > > >> Besides dd, etc, LVM snapshots are another case where this goes >> screwy. >> If the UUID isn't UUID, do a btrfs device scan (which udev normally >> does >> by default these days) so the duplicate UUID is detected, and btrfs >> *WILL* eventually start trying to write to all the "newly added" >> devices >> that scan found, identified by their Universally Unique IDs, aka >> UUIDs. >> It's not a matter of if, but when. > Well.. as I said... quite scary, with respect to both, accidental and > malicious cases of duplicate UUIDs. > > >> And the UUID is embedded so deeply within the filesystem and its >> operations, as an inextricable part of the metadata (thus avoiding >> the >> problem reiserfs had where a reiserfs stored in a loopback file on a >> reiserfs, would screw up reiserfsck, on btrfs, the loopback file >> would >> have a different UUID and thus couldn't be mixed up), that changing >> the >> UUID is not the simple operation of changing a few bytes in the >> superblock >> that it is on other filesystems, which is why there's now a tool to >> go >> thru all those metadata entries and change it. > I don't think that this design is per se bad and prevents the kernel to > handle such situations gracefully.
> > I would expect that in addition to the fs UUID, it needs a form of > device ID... so why not simply ignoring any new device for which there > already is a matching fs UUID and device ID, unless the respective tool > (mount, btrfs, etc.) is explicitly told so via some > device=/dev/sda,/dev/sdb option. IIRC, there were some btrfs-progs patches for such behavior, not sure about kernel part though. But at least an interesting method to solve the problem. (Better than just rejecting mounting any) > > If that means that less things work out of the box (in the sense of > "auto-assembly") well than this is simply necessary. > data security and consistency is definitely much more important than > any fancy auto-magic. Can't agree any more. Especially when auto leads to wrong behavior (Like kernel version based probing). And after all, this topic makes me remember the bugreport of fuzzed (but csum recalculated) images. I used to ignore them and I think that wouldn't happen. But the reporter is right, it's a btrfs security problem, and now I'm super happy to see such report. As it's easy to fix, I can always submit some patches if there is no other guy faster than me. :) So for this one, as long as we find a good behavior to solve it, it won't be a big thing. Thanks, Qu > > > >> So an aware btrfs admin simply takes pains to avoid triggering a >> btrfs >> device scan at the wrong time, and to immediately hide their LVM >> snapshots, immediately unplug their directly dd-ed devices, etc, and >> thus >> doesn't have to deal with the filesystem corruption that'd be a when >> not >> if, if they didn't take such precautions with their dupped UUIDs that >> actually aren't as UUID as the name suggests... > a) People shouldn't need to do days of study to be able to use btrfs > securely. > Of course it's more advanced and not everything can be > simplified in a way so that users don't need to know anything (e.g. all > the well-known effects of CoW)... 
but when the point is reached where > security and data integrity is threatened, there's definitely a hard > border that mustn't be crossed. > > b) Given how complex software is, I doubt that it's easily possible, > even for the aware admin, to really prevent all situations that can > lead to such situations. > Not to talk about about any attack-scenarios. > > > >> And as your followup suggests in a security context, they consider >> masking out their UUIDs before posting them, as well, tho most kernel >> hackers generally consider unsupervised physical access to be game- >> over, >> security-wise. > Do they? I rather thought many of them had a rather practical and real- > world-situations-based POV. > >> (After all, in that case there's often little or nothing >> preventing a reboot to that USB stick, if desired, or simply yanking >> the >> devices and duping them or plugging them in elsewhere, if the BIOS is >> password protected, with the only thing standing in the way at that >> point >> being possible device encryption.) > There's hardware which would, when it detects physicals intrusion (like > yanking) lock up itself (securely clearing the memory, disconnecting > itself from other nodes, which may be compromised as well, when the > filesystem on the attacked node would go crazy. > > You have things like ATMs, which are physically usually quite well > secured, but which do have rather easily accessible maintenance ports. > All of us have seen such embedded devices rebooting themselves, where > you see kernel messages. > That's the point where an attacker could easily get the btrfs UUID: > [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.2.0-1-amd64 > root=UUID=bd1ea5a0-9bba-11e5-82fa-502690aa641f > > If you can attack such devices already by just having access to a USB > port... then holly sh**... > > >> The only real >> alternative if >> you don't like it is using a different filesystem. > As I've said, I don't have a problem with UUIDs... 
I just can't quite > believe that btrfs and the userland cannot be modified so that it > handles such cases gracefully. > > If not, than, to be quite honest, that would be really a major > showstopper for many usage areas. > And I'm not talking about ATMs (or any other embedded devices where > people may have non-supervides access - e.g. TVs in a mall, > entertainment systems in planes) but also the normal desktops/laptops > where colleagues, fellow students, etc. may want to play some "prank". > > > Cheers, > Chris. > ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-06 14:34 ` attacking btrfs filesystems via UUID collisions? Qu Wenruo @ 2015-12-06 20:55 ` Chris Murphy 2015-12-09 5:39 ` Christoph Anton Mitterer 1 sibling, 0 replies; 51+ messages in thread From: Chris Murphy @ 2015-12-06 20:55 UTC (permalink / raw) To: Qu Wenruo; +Cc: Christoph Anton Mitterer, Duncan, Btrfs BTRFS On Sun, Dec 6, 2015 at 7:34 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > But just as you mentioned, it *IS* a real problem, and we should need to > enhance it. LVM sorta avoids the problem, because its snapshots aren't active by default so the underlying fs (and its UUID and superblock) don't appear to the kernel. In no order: 1. better practices, we really need to tell users, and documentation writers, that using dd (or variant) to copy Btrfs volumes has a consequence and should not be used to make copies. 2. Btrfs needs a better way to make a copy of a volume when there are snapshots (including even rw snapshots); e.g. permit send/receive to work on rw snapshots if the fs is ro mounted; e.g. a way to do "recursive" send/receive. 3. Some way to fail gracefully, when there's ambiguity that cannot be resolved. Once there are duplicate devs (dd or lvm snapshots, etc) then there's simply no way to resolve the ambiguity automatically, and the volume should just refuse to rw mount until the user resolves the ambiguity. I think it's OK to fallback to ro mount (maybe) by default in such a case rather than totally fail to mount. > I'd like to see how LVM/DM behaves first, at least as a reference if they > are really so safe. > For example, I have a whole disk as the following configuration: > > 0 10G 20G > | test_lv | | > ---------------- > | test_vg | > ----------------------- > | test_pv | > ----------------------- > | /dev/sdb | > ----------------------- > > If I did a dd copy of /dev/sdb to /dev/sdc, > what will pv/vg/lv rescan show if test_pv/vg/lv is already active? 
> And what will rescan show if they are not active? Or after a reboot?
I haven't tested it recently but my recollection is that it flatly refused to activate the VG/LV whenever two PVs with identical UUIDs were visible, that is, it would not use either PV until I resolved the ambiguity by physical PV removal, or using pvremove, or using wipefs. -- Chris Murphy ^ permalink raw reply [flat|nested] 51+ messages in thread
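Point 3 of the message above (refuse to mount read-write while duplicates are visible, possibly falling back to read-only) can be written down as a small decision function. This is a hedged sketch only: mount_decision and its parameters are hypothetical names, it does not reflect actual btrfs or LVM code, and the read-only fallback is off by default because the message itself marks it with "maybe".

```python
from collections import Counter


def mount_decision(members, requested="rw", allow_ro_fallback=False):
    """Decide how to mount a volume given the devices that claim to belong to it.

    members: iterable of (fsid, devid) pairs, one per visible device.
    Returns (mode, ambiguous), where mode is the requested mode, "ro" or
    "refuse", and ambiguous lists every (fsid, devid) pair seen more than once.
    """
    counts = Counter(members)
    ambiguous = sorted(key for key, seen in counts.items() if seen > 1)
    if not ambiguous:
        # Every device is unique, so honour the requested mode.
        return requested, []
    if allow_ro_fallback:
        # Optional fallback: mount read-only instead of failing.
        return "ro", ambiguous
    # Default: leave it to the user to remove or re-UUID a duplicate.
    return "refuse", ambiguous
```

Returning the list of ambiguous pairs alongside the mode lets a caller tell the user which devices need to be removed or given a new UUID before a read-write mount is attempted again.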
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-06 14:34 ` attacking btrfs filesystems via UUID collisions? Qu Wenruo 2015-12-06 20:55 ` Chris Murphy @ 2015-12-09 5:39 ` Christoph Anton Mitterer 2015-12-09 21:48 ` S.J. 1 sibling, 1 reply; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-09 5:39 UTC (permalink / raw) To: Qu Wenruo, Duncan, linux-btrfs [-- Attachment #1: Type: text/plain, Size: 8444 bytes --] On Sun, 2015-12-06 at 22:34 +0800, Qu Wenruo wrote: > Not sure about LVM/MD, but they should suffer the same UUID conflict > problem. Well I had that actually quite often in LVM (i.e. same UUIDs visible on the same system), basically because we made clones from one template VM image and when that is normally booted, LVM doesn't allow to change the UUIDs of already active PV/VG/LVs (or maybe just some of these three, forgot the details) But there was never any issue, LVM on the host system, when one set was already used, continues to use that just fine and the toolset reports which it would use (more below). > The only idea I have can only enhance the behavior, but never fix it. > For example, if found multiple btrfs devices with same devid, just > refuse to mount. > And for already mounted btrfs, ignore any duplicated fsid/devid. Well I think that's already a perfectly valid solution... basically the idea that I had before. I'd call that a 100% fix, not just a workaround. If then the tools (i.e. btrfstune) allows to change the UUID of the duplicate set of devices (perhaps again with the necessity to specify each of them via device=/dev/sda,etc.) I'd be completely happy again,... and the show could get on ;) > The problem can get even tricky for case like device missing for a > while > and appear again case. I had thought about that too: a) In the non-malicious case, this could e.g. 
mean that a device from a btrfs RAID was missing and a clone with the same UUID / dev ID gets added to the system. Possible consequences, AFAICS:
- The data is simply auto-rebuilt on the clone.
- Some corruptions occur when the clone is older, and data that was only on the newer device is now missing (not sure if this can happen at all or whether generation IDs prevent it).
b) In the malicious/attack case, one possible scenario could be: A device is missing from a btrfs RAID... the machine is left unattended. An attacker comes and plugs in the USB stick with the missing UUID. Is the rebuild (and thus data leakage) now happening automatically?
In any case though, a simple solution could be that no automatic assembly happens by default, and the people who still want it are properly warned about the possible implications in the docs.
> But just as you mentioned, it *IS* a real problem, and we should need to enhance it.
Should one (or I) add this as a ticket to the kernel bugzilla, or as an entry to the btrfs wiki?
> I'd like to see how LVM/DM behaves first, at least as a reference if they are really so safe.
Well that's very simple to check, I did it here for the LV case only:

root@lcg-lrz-admin:~# truncate -s 1G image1
root@lcg-lrz-admin:~# losetup -f image1
root@lcg-lrz-admin:~# pvcreate /dev/loop0
  Physical volume "/dev/loop0" successfully created
root@lcg-lrz-admin:~# losetup -d /dev/loop0
root@lcg-lrz-admin:~# cp image1 image2
root@lcg-lrz-admin:~# losetup -f image1
root@lcg-lrz-admin:~# pvscan
  PV /dev/sdb     VG vg_data     lvm2 [50,00 GiB / 0 free]
  PV /dev/sda1    VG vg_system   lvm2 [9,99 GiB / 0 free]
  PV /dev/loop0                  lvm2 [1,00 GiB]
  Total: 3 [60,99 GiB] / in use: 2 [59,99 GiB] / in no VG: 1 [1,00 GiB]
root@lcg-lrz-admin:~# losetup -f image2
root@lcg-lrz-admin:~# pvscan
  Found duplicate PV tSK9Cdpw6bcmocZnxFPD6ThNz1opRXsB: using /dev/loop1 not /dev/loop0
  PV /dev/sdb     VG vg_data     lvm2 [50,00 GiB / 0 free]
  PV /dev/sda1    VG vg_system   lvm2 [9,99 GiB / 0 free]
  PV /dev/loop1                  lvm2 [1,00 GiB]
  Total: 3 [60,99 GiB] / in use: 2 [59,99 GiB] / in no VG: 1 [1,00 GiB]

Obviously, with PVs alone, there is no "x is already used" case. As one can see it just says it would ignore one of them, which I think is rather stupid in that particular case (i.e. none of the devices already used somehow), because it probably just "randomly" decides which is to be used, which is ambiguous.
> And what will rescan show if they are not active?
My experience was always (it's just quite late and I don't want to simulate everything right now, which is trivial anyway):
- It shows warnings about the duplicates in the tools
- It continues to use the already active devices (if any)
- Unfortunately, while the kernel continues to use the already used devices, the toolset may use another device (kinda stupid, but at least it warns and the already used devices seem to be still properly used):

continuation from the setup above:
root@lcg-lrz-admin:~# losetup -d /dev/loop1
(now only image1 is seen as loop0)
root@lcg-lrz-admin:~# vgcreate vg_test /dev/loop0
  Volume group "vg_test" successfully created
root@lcg-lrz-admin:~# lvcreate -n test vg_test -l 100
  Logical volume "test" created
root@lcg-lrz-admin:~# mkfs.ext4 /dev/vg_test/test
mke2fs 1.42.12 (29-Aug-2014)
...
root@lcg-lrz-admin:~# mount /dev/vg_test/test /mnt/
root@lcg-lrz-admin:~# losetup -a
/dev/loop0: [64768]:518297 (/root/image1)
root@lcg-lrz-admin:~# losetup -f image2
root@lcg-lrz-admin:~# vgs
  Found duplicate PV tSK9Cdpw6bcmocZnxFPD6ThNz1opRXsB: using /dev/loop1 not /dev/loop0
  VG        #PV #LV #SN Attr   VSize  VFree
  vg_data     1   1   0 wz--n- 50,00g    0
  vg_system   1   2   0 wz--n-  9,99g    0
root@lcg-lrz-admin:~# lvs
  Found duplicate PV tSK9Cdpw6bcmocZnxFPD6ThNz1opRXsB: using /dev/loop1 not /dev/loop0
  LV   VG        Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data vg_data   -wi-ao----   50,00g
  root vg_system -wi-ao----    9,02g
  swap vg_system -wi-ao---- 1000,00m

As you can see, even though loop0 is used (by the kernel) the toolset would use loop1... o.O Yeah, don't ask me why... I once had a discussion with Alastair from the LVM people about that, forgot the exact reasons (if there were any) and I was simply happy that it continued to use the already open devices properly.
> Or after a reboot?
Haven't checked this right now but I guess it again just decides on one of them (which is pretty bad).
> > I would expect that in addition to the fs UUID, it needs a form of > > device ID...
so why not simply ignoring any new device for which > > there > > already is a matching fs UUID and device ID, unless the respective > > tool > > (mount, btrfs, etc.) is explicitly told so via some > > device=/dev/sda,/dev/sdb option. > > IIRC, there were some btrfs-progs patches for such behavior, not sure > about kernel part though. > But at least an interesting method to solve the problem. > (Better than just rejecting mounting any) Of course if the user wouldn't specify those, it would still need to reject mounting/using/activating/fsck'ing/etc. ... > > If that means that less things work out of the box (in the sense of > > "auto-assembly") well than this is simply necessary. > > data security and consistency is definitely much more important > > than > > any fancy auto-magic. > > Can't agree any more. > Especially when auto leads to wrong behavior (Like kernel version > based > probing). Good to hear... well... you're the developer... spread the word :D > And after all, this topic makes me remember the bugreport of fuzzed > (but > csum recalculated) images. > I used to ignore them and I think that wouldn't happen. > > But the reporter is right, it's a btrfs security problem, and now I'm > super happy to see such report. As I've said, I've been quite surprised that no one seems to have thought about that before (especially the security aspect of that issue). > As it's easy to fix, I can always submit some patches if there is no > other guy faster than me. :) Awesome... showstopper number #1 just seems to be about to walk away :D > So for this one, as long as we find a good behavior to solve it, it > won't be a big thing. Great... keep me/us updated :) Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-09 5:39 ` Christoph Anton Mitterer @ 2015-12-09 21:48 ` S.J. 2015-12-10 12:08 ` Austin S Hemmelgarn ` (2 more replies) 0 siblings, 3 replies; 51+ messages in thread From: S.J. @ 2015-12-09 21:48 UTC (permalink / raw) To: linux-btrfs
> 1. better practices, we really need to tell users, and documentation writers, that using dd (or variant) to copy Btrfs volumes has a consequence and should not be used to make copies.
> 2. Btrfs needs a better way to make a copy of a volume when there are snapshots (including even rw snapshots); e.g. permit send/receive to work on rw snapshots if the fs is ro mounted; e.g. a way to do "recursive" send/receive.
> 3. Some way to fail gracefully, when there's ambiguity that cannot be resolved. Once there are duplicate devs (dd or lvm snapshots, etc) then there's simply no way to resolve the ambiguity automatically, and the volume should just refuse to rw mount until the user resolves the ambiguity. I think it's OK to fallback to ro mount (maybe) by default in such a case rather than totally fail to mount.
About 3: RO fallback for the second device/partition is not good. It won't stop the two partitions from being confused, and even if both are RO, thinking it's ok to read and then reading the wrong data is bad.
About 1 and 2 ... if 3 gets fulfilled, why? dd itself is not a problem "if" the UUID is changed after it (which is a command as simple as dd), and if someone doesn't know that, he/she will notice when mount refuses to work because of the duplicate UUID.
PS: Kudos to C.A. Mitterer for discovering that problem
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-09 21:48 ` S.J. @ 2015-12-10 12:08 ` Austin S Hemmelgarn 2015-12-10 12:41 ` Hugo Mills 2015-12-10 19:42 ` Chris Murphy 2015-12-11 22:06 ` Christoph Anton Mitterer 2 siblings, 1 reply; 51+ messages in thread From: Austin S Hemmelgarn @ 2015-12-10 12:08 UTC (permalink / raw) To: S.J., linux-btrfs [-- Attachment #1: Type: text/plain, Size: 1633 bytes --] On 2015-12-09 16:48, S.J. wrote: >> 1. better practices, we really need to tell users, and documentation >> writers, that using dd (or variant) to copy Btrfs volumes has a >> consequence and should not be used to make copies. > >> 2. Btrfs needs a better way to make a copy of a volume when there are >> snapshots (including even rw snapshots); e.g. permit send/receive to >> work on rw snapshots if the fs is ro mounted; e.g. a way to do >> "recursive" send/receive. > >> 3. Some way to fail gracefully, when there's ambiguity that cannot be >> resolved. Once there are duplicate devs (dd or lvm snapshots, etc) >> then there's simply no way to resolve the ambiguity automatically, and >> the volume should just refuse to rw mount until the user resolves the >> ambiguity. I think it's OK to fallback to ro mount (maybe) by default >> in such a case rather than totally fail to mount. > > About 3: > RO fallback for the second device/partitions is not good. > It won't stop confusing the two partitions, and even if both are RO, > thinking it's ok to read and then reading the wrong data is bad. > > About 1 and 2 ... if 3 gets fulfilled, why? > DD itself is not a problem "if" the UUID is changed after it > (which is a command as simple as dd), and if someone doesn't > know that, he/she will notice when mount refuses to work > because UUID duplicate. Unless things have changed significantly, changing the UUID on a BTRFS image is not anywhere near as simple as copying it with dd. 
The UUID gets used internally somehow, and changing it would require rewriting _all_ the metadata blocks. [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 3019 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-10 12:08 ` Austin S Hemmelgarn @ 2015-12-10 12:41 ` Hugo Mills 2015-12-10 12:57 ` S.J. 0 siblings, 1 reply; 51+ messages in thread From: Hugo Mills @ 2015-12-10 12:41 UTC (permalink / raw) To: Austin S Hemmelgarn; +Cc: S.J., linux-btrfs [-- Attachment #1: Type: text/plain, Size: 1984 bytes --] On Thu, Dec 10, 2015 at 07:08:51AM -0500, Austin S Hemmelgarn wrote: > On 2015-12-09 16:48, S.J. wrote: > >> 1. better practices, we really need to tell users, and documentation > >> writers, that using dd (or variant) to copy Btrfs volumes has a > >> consequence and should not be used to make copies. > > > >> 2. Btrfs needs a better way to make a copy of a volume when there are > >> snapshots (including even rw snapshots); e.g. permit send/receive to > >> work on rw snapshots if the fs is ro mounted; e.g. a way to do > >> "recursive" send/receive. > > > >> 3. Some way to fail gracefully, when there's ambiguity that cannot be > >> resolved. Once there are duplicate devs (dd or lvm snapshots, etc) > >> then there's simply no way to resolve the ambiguity automatically, and > >> the volume should just refuse to rw mount until the user resolves the > >> ambiguity. I think it's OK to fallback to ro mount (maybe) by default > >> in such a case rather than totally fail to mount. > > > > About 3: > > RO fallback for the second device/partitions is not good. > > It won't stop confusing the two partitions, and even if both are RO, > > thinking it's ok to read and then reading the wrong data is bad. > > > > About 1 and 2 ... if 3 gets fulfilled, why? > > DD itself is not a problem "if" the UUID is changed after it > > (which is a command as simple as dd), and if someone doesn't > > know that, he/she will notice when mount refuses to work > > because UUID duplicate. > Unless things have changed significantly, changing the UUID on a BTRFS > image is not anywhere near as simple as copying it with dd. 
The UUID > gets used internally somehow, and changing it would require rewriting > _all_ the metadata blocks. Indeed, but there is now a tool to do that. :) (btrfstune -u or -U) Hugo. -- Hugo Mills | Go not to the elves for counsel, for they will say hugo@... carfax.org.uk | both no and yes. http://carfax.org.uk/ | PGP: E2AB1DE4 | [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-10 12:41 ` Hugo Mills @ 2015-12-10 12:57 ` S.J. 0 siblings, 0 replies; 51+ messages in thread From: S.J. @ 2015-12-10 12:57 UTC (permalink / raw) To: linux-btrfs On 10.12.2015 13:41, Hugo Mills wrote: > On Thu, Dec 10, 2015 at 07:08:51AM -0500, Austin S Hemmelgarn wrote: >> On 2015-12-09 16:48, S.J. wrote: >>>> 1. better practices, we really need to tell users, and documentation >>>> writers, that using dd (or variant) to copy Btrfs volumes has a >>>> consequence and should not be used to make copies. >>>> 2. Btrfs needs a better way to make a copy of a volume when there are >>>> snapshots (including even rw snapshots); e.g. permit send/receive to >>>> work on rw snapshots if the fs is ro mounted; e.g. a way to do >>>> "recursive" send/receive. >>>> 3. Some way to fail gracefully, when there's ambiguity that cannot be >>>> resolved. Once there are duplicate devs (dd or lvm snapshots, etc) >>>> then there's simply no way to resolve the ambiguity automatically, and >>>> the volume should just refuse to rw mount until the user resolves the >>>> ambiguity. I think it's OK to fallback to ro mount (maybe) by default >>>> in such a case rather than totally fail to mount. >>> About 3: >>> RO fallback for the second device/partitions is not good. >>> It won't stop confusing the two partitions, and even if both are RO, >>> thinking it's ok to read and then reading the wrong data is bad. >>> >>> About 1 and 2 ... if 3 gets fulfilled, why? >>> DD itself is not a problem "if" the UUID is changed after it >>> (which is a command as simple as dd), and if someone doesn't >>> know that, he/she will notice when mount refuses to work >>> because UUID duplicate. >> Unless things have changed significantly, changing the UUID on a BTRFS >> image is not anywhere near as simple as copying it with dd. The UUID >> gets used internally somehow, and changing it would require rewriting >> _all_ the metadata blocks. 
> Indeed, but there is now a tool to do that. :) (btrfstune -u or -U) > > Hugo. > Yes, I meant that :) I'm not saying that the tool is internally as simple as a "dumb" dd block copy, but calling it certainly is. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-09 21:48 ` S.J. 2015-12-10 12:08 ` Austin S Hemmelgarn @ 2015-12-10 19:42 ` Chris Murphy 2015-12-11 22:21 ` Christoph Anton Mitterer 2015-12-11 22:06 ` Christoph Anton Mitterer 2 siblings, 1 reply; 51+ messages in thread From: Chris Murphy @ 2015-12-10 19:42 UTC (permalink / raw) To: S.J.; +Cc: Btrfs BTRFS On Wed, Dec 9, 2015 at 2:48 PM, S.J. <sorry@anonym.com> wrote: >> 1. better practices, we really need to tell users, and documentation >> writers, that using dd (or variant) to copy Btrfs volumes has a >> consequence and should not be used to make copies. > > >> 2. Btrfs needs a better way to make a copy of a volume when there are >> snapshots (including even rw snapshots); e.g. permit send/receive to >> work on rw snapshots if the fs is ro mounted; e.g. a way to do >> "recursive" send/receive. > > >> 3. Some way to fail gracefully, when there's ambiguity that cannot be >> resolved. Once there are duplicate devs (dd or lvm snapshots, etc) >> then there's simply no way to resolve the ambiguity automatically, and >> the volume should just refuse to rw mount until the user resolves the >> ambiguity. I think it's OK to fallback to ro mount (maybe) by default >> in such a case rather than totally fail to mount. > > > About 3: > RO fallback for the second device/partitions is not good. > It won't stop confusing the two partitions, and even if both are RO, > thinking it's ok to read and then reading the wrong data is bad. That isn't what I'm suggesting. In the multiple device volume case where there are two exact (same UUID, same devid, same generation) instances of one of the block devices, Btrfs could randomly choose either one if it's an RO mount. 
It may very well be safer to just refuse to mount it with an error indicating the ambiguity, and suggesting the user explicitly specify the devices to use to assemble the volume, and if the generations differ on those chosen devices, at least warn about that also. > > About 1 and 2 ... if 3 gets fulfilled, why? > DD itself is not a problem "if" the UUID is changed after it > (which is a command as simple as dd), and if someone doesn't > know that, he/she will notice when mount refuses to work > because UUID duplicate. dd is not a copy operation. It's creating a 2nd original. You don't end up with an original and a copy (or clone). A copy or clone has some distinguishing difference. Volume UUID is used throughout Btrfs metadata, it's not just in the superblocks. Changing volume UUID requires a rewrite of all metadata. This is inefficient for two reasons: one dd copies unused sectors; two it copies metadata that will have to be completely rewritten by btrfstune to change volume UUID; and also the subvolume UUIDs aren't changed, so it's an incomplete solution that has problems (see other threads). If your workflow requires making an exact copy (for the shelf or for an emergency) then dd might be OK. But most often it's used because it's been easy, not because it's a good practice. Note that Btrfs is not unique, XFS v5 does a very similar thing with volume UUID as well, and resulted in this change: http://oss.sgi.com/pipermail/xfs/2015-April/041267.html Using dd also means the volume is offline. For even medium sized multiple device volumes, it's a huge penalty. dd does not scale. Using dd means source and destination physical configurations are identical (at least the number of devices and the data and metadata profiles) which I may not want or need for a clone. Maybe I want a 1x6TB clone for the 5x1TB raid5 volume. 
Even for an online full volume copy/clone of a 5x1TB raid5, moving all subvolume+snapshots to a new 3x4TB raid5 (or whatever), that could be hundreds of subvolumes to btrfs send/receive. OK yeah script it. But that's tedious even assuming I have a script friendly subvolume naming convention to get the send/receive order correct, which I don't. Anyway, I think it's a nice to have now, that'll eventually be a need. And dd is just totally disqualified outside of very specific edge case need. -- Chris Murphy ^ permalink raw reply [flat|nested] 51+ messages in thread
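The ambiguity discussed in this message (two block devices presenting the same volume UUID and devid) can be sketched as a small check that refuses assembly and reports whether the generations differ. This is a minimal illustration in Python under stated assumptions, not how the kernel or btrfs-progs scans devices: `DeviceInfo`, `find_ambiguous` and `check_mountable` are hypothetical names, and the fields stand in for whatever a superblock scan would report.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical summary of one scanned block device's superblock."""
    path: str
    fsid: str        # volume UUID
    devid: int       # device id within the volume
    generation: int  # superblock generation


def find_ambiguous(devices):
    """Return {(fsid, devid): [devices]} for every member seen more than once."""
    seen = defaultdict(list)
    for dev in devices:
        seen[(dev.fsid, dev.devid)].append(dev)
    return {key: devs for key, devs in seen.items() if len(devs) > 1}


def check_mountable(devices):
    """Refuse assembly when any (fsid, devid) pair is ambiguous.

    Returns (ok, messages); each message names the clashing paths and
    says whether their generations match, so the user can pick explicitly.
    """
    clashes = find_ambiguous(devices)
    messages = []
    for (fsid, devid), devs in sorted(clashes.items()):
        paths = ", ".join(d.path for d in devs)
        gens = {d.generation for d in devs}
        note = "same generation" if len(gens) == 1 else "generations differ"
        messages.append(f"fsid {fsid} devid {devid}: {paths} ({note})")
    return (not clashes, messages)
```

The policy here is the conservative one from the message: no automatic choice between duplicates, only an error that names them.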
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-10 19:42 ` Chris Murphy @ 2015-12-11 22:21 ` Christoph Anton Mitterer 2015-12-11 22:32 ` Christoph Anton Mitterer ` (2 more replies) 0 siblings, 3 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-11 22:21 UTC (permalink / raw) To: Chris Murphy, S.J.; +Cc: Btrfs BTRFS On Thu, 2015-12-10 at 12:42 -0700, Chris Murphy wrote: > That isn't what I'm suggesting. In the multiple device volume case > where there are two exact (same UUID, same devid, same generation) > instances of one of the block devices, Btrfs could randomly choose > either one if it's an RO mount. No, for the same reasons as just stated in my mail a few minutes ago. An attacker could probably find out the UUID/devid/generation... it would probably be possible for him to craft a device with exactly those and try to use it. If btrfs then selected any of these, it might also select the wrong one; ro or rw, this would likely lead to problems. > > About 1 and 2 ... if 3 gets fulfilled, why? > > DD itself is not a problem "if" the UUID is changed after it > > (which is a command as simple as dd), and if someone doesn't > > know that, he/she will notice when mount refuses to work > > because UUID duplicate. > > dd is not a copy operation. It's creating a 2nd original. You don't > end up with an original and a copy (or clone). A copy or clone has > some distinguishing difference. Volume UUID is used throughout Btrfs > metadata, it's not just in the superblocks. Changing volume UUID > requires a rewrite of all metadata. This is inefficient for two > reasons: one dd copies unused sectors; two it copies metadata that > will have to be completely rewritten by btrfstune to change volume > UUID; and also the subvolume UUIDs aren't changed, so it's an > incomplete solution that has problems (see other threads). Well dd is surely not the only thing that can be used to create a clone (i.e. 
a bitwise identical copy - I guess we don't really care which is the "original" and which are the "clones", or whether these are "2nd originals"). We always just use it here as an example for scenarios in which bitwise identical copies are created. And even if internally it's a big thing, from the user's PoV, changing the UUID is pretty simple (I guess that's what S.J. meant). > If your workflow requires making an exact copy (for the shelf or for > an emergency) then dd might be OK. But most often it's used because > it's been easy, not because it's a good practice. Ufff.. I wouldn't go so far as to call something here bad or good practice. At least, I do not see any reason to call it a bad practice, except that systems got much more complex over time and haven't dealt properly with the problems that can occur by using dd. Again, I don't demand magical "solutions" (i.e. the btrfs or LVM people getting code into all dd like tools, so that these auto-detect when they duplicate such data and auto-change the UUIDs)... they just should handle the situations gracefully. > Note that Btrfs is > not unique, XFS v5 does a very similar thing with volume UUID as > well, > and resulted in this change: > http://oss.sgi.com/pipermail/xfs/2015-April/041267.html Do you mean that xfs may suffer from the same issues that we're talking about here? If so, one should probably give them a notice. > Using dd also means the volume is offline. Not really, you could do it on a snapshotted LV, while the "original" is still running. Or in emergency cases one could do it on a ro-remounted... probably not guaranteed to work, but may do so in practice. Cheers, Chris. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-11 22:21 ` Christoph Anton Mitterer @ 2015-12-11 22:32 ` Christoph Anton Mitterer 2015-12-11 23:06 ` Chris Murphy 2015-12-11 23:14 ` Eric Sandeen 2 siblings, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-11 22:32 UTC (permalink / raw) To: Chris Murphy, S.J.; +Cc: Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 234 bytes --] Sorry, I'm just about to change my mail system, and used a bogus test From: address in the previous mail (please replace fo@fo with calestyo@scientia.net). Apologies for any inconveniences and this noise here. Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-11 22:21 ` Christoph Anton Mitterer 2015-12-11 22:32 ` Christoph Anton Mitterer @ 2015-12-11 23:06 ` Chris Murphy 2015-12-12 1:34 ` S.J. 2015-12-14 0:27 ` Christoph Anton Mitterer 2015-12-11 23:14 ` Eric Sandeen 2 siblings, 2 replies; 51+ messages in thread From: Chris Murphy @ 2015-12-11 23:06 UTC (permalink / raw) To: Btrfs BTRFS On Fri, Dec 11, 2015 at 3:21 PM, Christoph Anton Mitterer <fo@fo> wrote: > On Thu, 2015-12-10 at 12:42 -0700, Chris Murphy wrote: >> That isn't what I'm suggesting. In the multiple device volume case >> where there are two exact (same UUID, same devid, same generation) >> instances of one of the block devices, Btrfs could randomly choose >> either one if it's an RO mount. > No, for the same reasons as just stated in my mail few minutes ago. > An attacker could probably find out the UUID/devid/generation... it > would probably possible for him to craft a device with exactly those > and try to use it. For anything but a new and empty Btrfs volume, this hypothetical attack would be a ton easier to do on LVM and mdadm raid because they have a tiny amount of metadata to spoof compared to a Btrfs volume with even a little bit of data on it. I think this concern is overblown. > If then btrfs would select any of these, it may also select the wrong > one - ro or rw, this may likely lead to problems. >> dd is not a copy operation. It's creating a 2nd original. You don't >> end up with an original and a copy (or clone). A copy or clone has >> some distinguishing difference. Volume UUID is used throughout Btrfs >> metadata, it's not just in the superblocks. Changing volume UUID >> requires a rewrite of all metadata. 
This is inefficient for two >> reasons: one dd copies unused sectors; two it copies metadata that >> will have to be completely rewritten by btrfstune to change volume >> UUID; and also the subvolume UUIDs aren't changed, so it's an >> incomplete solution that has problems (see other threads). > Well dd is surely not the only thing that can be used to create a clone > (i.e. a bitwise identical copy - I guess we don't really care which is > the "original" and which are the "clones", or whether these are "2nd > originals). > We always just use it here as an example for scenarios in which bitwise > identical copies are created. I'm suggesting bitwise identical copies being created is not what is wanted most of the time, except in edge cases. > > And even if internally it's a big thing, from the user's PoV, changing > the UUID is pretty simple (I guess that's what S.J. meant). > > >> If your workflow requires making an exact copy (for the shelf or for >> an emergency) then dd might be OK. But most often it's used because >> it's been easy, not because it's a good practice. > Ufff.. I wouldn't got that far to call something here bad or good > practice. It's not just bad practice, it's sufficiently sloppy that it's very nearly user sabotage. That this is due to innocent ignorance, and a long standing practice that's bad advice being handed down from previous generations doesn't absolve the practice and mean we should invent esoteric work arounds for what is not a good practice. We have all sorts of exhibits why it's not a good idea. > At least, I do not see any reason to call it a bad practice, except > that systems got over time much more complex and haven't dealt properly > with the problems that can occur by using dd. 
The lack of maturity in tools to make it just as easy, or easier, and faster, to make a *data* bitwise identical copy, that preserves the intent and integrity of UUID by ensuring there aren't duplicates of them floating around, as well as profile reshaping on the fly, as well as a means to account for format changes, etc is a completely reasonable excuse for continuing to use dd - but it's still suboptimal which is what I mean by bad idea. > Again, I don't demand magical "solutions" (i.e. the btrfs or LVM people > getting code into all dd like tools, so that these auto-detect when the > duplicate such data and auto-change the UUIDs)... they just should > handle the situations gracefully. I disagree. It was due to the rudimentary nature of earlier filesystems' metadata paradigm that it worked. That's no longer the case. Sure, the kernel code should get smarter about refusing to mount in ambiguous cases, so that a file system isn't nerfed. That shouldn't happen. But we also need to get away from this idea that dd is actually an appropriate tool for making a file system copy. > > >> Note that Btrfs is >> not unique, XFS v5 does a very similar thing with volume UUID as >> well, >> and resulted in this change: >> http://oss.sgi.com/pipermail/xfs/2015-April/041267.html > Do you mean that xfs may suffer from the same issues that we're talking > about here? If so, one should probably give them a notice. They're aware, that's why xfs_db had the option to change the UUID in the first place. And the XFS kernel code knows not to mount a 2nd instance of a volume UUID. But it doesn't support multiple devices, so it's no where near as prone to problems in this area. If you're using LVM snapshots, the duplicate UUID problem certainly comes up. While there is a 'nouuid' mount option for XFS, I have no idea what problems this might cause for V5 filesystems. -- Chris Murphy ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-11 23:06 ` Chris Murphy @ 2015-12-12 1:34 ` S.J. 2015-12-14 0:28 ` Christoph Anton Mitterer 2015-12-14 0:27 ` Christoph Anton Mitterer 1 sibling, 1 reply; 51+ messages in thread From: S.J. @ 2015-12-12 1:34 UTC (permalink / raw) To: Btrfs BTRFS A bit more about the dd-is-bad topic: IMHO it doesn't matter at all. a) For this specific problem here, fixing the security problem automatically fixes the risk of data corruption from careless cloning+mounting (without UUID adjustments) too. So, if the user likes to use dd with its disadvantages, like waiting hours to copy lots of free space, and bad practice, etc., why should it concern the Btrfs developers and/or us here? b) In a wider scope: while Btrfs is more complex than XFS etc., currently there is no other reason why things could go bad when dd'ing something. As long as this holds, is there really a place in the official Btrfs documentation for telling the users "dd is bad [practice]"? ... ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-12 1:34 ` S.J. @ 2015-12-14 0:28 ` Christoph Anton Mitterer 0 siblings, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-14 0:28 UTC (permalink / raw) To: Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 894 bytes --] On Sat, 2015-12-12 at 02:34 +0100, S.J. wrote: > A bit more about the dd-is-bad-topic: > > IMHO it doesn't matter at all. Yes, fully agree. > a) For this specific problem here, fixing a security problem > automatically > fixes the risk of data corruption because careless cloning+mounting > (without UUID adjustments) too. > So, if the user likes to use dd with its disadvantages, like waiting > hours to > copy lots of free space, and bad practice, etc.etc., why should it > concern > the Btrfs developers and/or us here? > > b) At wider scope; while Btrfs is more complex than Xfs etc., > currently > there is no other reason why things could go bad when dd'ing > something. > As long as this holds, is there really a place in the official Btrfs > documentation > for telling the users "dd is bad [practice]"? > ... fully agree as well. :-) Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-11 23:06 ` Chris Murphy 2015-12-12 1:34 ` S.J. @ 2015-12-14 0:27 ` Christoph Anton Mitterer 2015-12-14 13:23 ` Austin S. Hemmelgarn 2015-12-14 20:55 ` Chris Murphy 1 sibling, 2 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-14 0:27 UTC (permalink / raw) To: Chris Murphy, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 4640 bytes --] On Fri, 2015-12-11 at 16:06 -0700, Chris Murphy wrote: > For anything but a new and empty Btrfs volume What's the influence of the fs being new/empty? > this hypothetical > attack would be a ton easier to do on LVM and mdadm raid because they > have a tiny amount of metadata to spoof compared to a Btrfs volume > with even a little bit of data on it. Uhm I haven't said that other systems properly handle this kind of attack. ;-) Guess that would need to be evaluated... > I think this concern is overblown. I don't think so. Let me give you an example: There is an attack[0] against crypto, where the attacker listens, via a smartphone's microphone, to the acoustics of a computer where gnupg runs. This is surely not an attack many people would have considered even remotely possible, but in fact it works, at least under lab conditions. I guess the same applies for possible attack vectors like this here. The stronger actual crypto and the stronger software gets in terms of classical security holes (buffer overruns and so on), the more attackers will try to go alternative ways. > I'm suggesting bitwise identical copies being created is not what is > wanted most of the time, except in edge cases. mhh,.. well there's the VM case, e.g. duplicating a template VM, booting it, deploying software. Guess that's already common enough. There are people who want to use btrfs on top of LVM and use the snapshot functionality of that... another use case. Some people may want to use it on top of MD (for whatever reason)... 
at least in the mirroring RAID case, the kernel would see the same btrfs twice. Apart from that, btrfs should be a general purpose fs, and not just a desktop or server fs. So edge cases like forensics (where it's common that you create bitwise identical images) shouldn't be forgotten either. > > >If your workflow requires making an exact copy (for the shelf or > > > for > > > an emergency) then dd might be OK. But most often it's used > > > because > > > it's been easy, not because it's a good practice. > > Ufff.. I wouldn't got that far to call something here bad or good > > practice. > > It's not just bad practice, it's sufficiently sloppy that it's very > nearly user sabotage. That this is due to innocent ignorance, and a > long standing practice that's bad advice being handed down from > previous generations doesn't absolve the practice and mean we should > invent esoteric work arounds for what is not a good practice. We have > all sorts of exhibits why it's not a good idea. Well if you don't give any real arguments or technical reasons (apart from "working around software that doesn't handle this well") I consider this just repetition of the baseless claim that long standing practice would be bad. > I disagree. It was due to the rudimentary nature of earlier > filesystems' metadata paradigm that it worked. That's no longer the > case. Well in the end it's of course up to the developers to decide whether this is acceptable or not, but being on the admin/end-user side, I can at least say that not everyone there would accept "this is no longer the case" as a valid explanation when their fs was corrupted or attacked. > Sure, the kernel code should get smarter about refusing to mount in > ambiguous cases, so that a file system isn't nerfed. That shouldn't > happen. But we also need to get away from this idea that dd is > actually an appropriate tool for making a file system copy. Uhm... your view is a bit narrow-sighted... again take the forensics example. 
But apart from that,... I never said that dd should be the regular tool for people to copy a btrfs image. Typically it would be simply slower than other means. But for some solutions, it may still be the better choice, or at least the only choice implemented right now (e.g. I wouldn't know of a hypervisor system that looks at an existing disk image, finds any btrfs in that (possibly "hidden" below further block layers), and cleanly copies the data into a freshly created btrfs image with the same structure). AFAIK, there's not even a solution right now that copies a complete btrfs, with snapshots, etc., preserving all ref-links. At least nothing official that works in one command. Long story short, I think we can agree that - dd or not - corruptions or attack vectors shouldn't be possible. And be it just to protect against the btrfs on hardware RAID1 case, which is accidentally switched to JBOD mode... Cheers, Chris. [0] http://www.tau.ac.il/~tromer/papers/acoustic-20131218.pdf [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-14 0:27 ` Christoph Anton Mitterer @ 2015-12-14 13:23 ` Austin S. Hemmelgarn 2015-12-14 21:26 ` Chris Murphy 2015-12-15 0:08 ` Christoph Anton Mitterer 2015-12-14 20:55 ` Chris Murphy 1 sibling, 2 replies; 51+ messages in thread From: Austin S. Hemmelgarn @ 2015-12-14 13:23 UTC (permalink / raw) To: Christoph Anton Mitterer, Chris Murphy, Btrfs BTRFS On 2015-12-13 19:27, Christoph Anton Mitterer wrote: > On Fri, 2015-12-11 at 16:06 -0700, Chris Murphy wrote: >> For anything but a new and empty Btrfs volume > What's the influence of the fs being new/empty? > >> this hypothetical >> attack would be a ton easier to do on LVM and mdadm raid because they >> have a tiny amount of metadata to spoof compared to a Btrfs volume >> with even a little bit of data on it. > Uhm I haven't said that other systems properly handle this kind of > attack. ;-) > Guess that would need to be evaluated... > > >> I think this concern is overblown. > I don't think so. Let me give you an example: There is an attack[0] > against crypto, where the attacker listens via a smartphone's > microphone, and based on the acoustics of a computer where gnupg runs. > This is surely not an attack many people would have considered even > remotely possible, but in fact it works, at least under lab conditions. > > I guess the same applies for possible attack vectors like this here. > The stronger actual crypto and the strong software gets in terms of > classical security holes (buffer overruns and so), the more attackers > will try to go alternative ways. The reason that this isn't quite as high of a concern is because performing this attack requires either root access, or direct physical access to the hardware, and in either case, your system is already compromised. 
I still think that that isn't a sufficient excuse for not fixing the issue, as there are a number of non-security related issues that can result from this (there are some things that are common practice with LVM or mdraid that can't be done with BTRFS because of this). > >> I'm suggesting bitwise identical copies being created is not what is >> wanted most of the time, except in edge cases. > mhh,.. well there's the VM case, e.g. duplicating a template VM, > booting it deploying software. Guess that's already common enough. > There are people who want to use btrfs on top of LVM and using the > snapshot functionality of that... another use case. > Some people may want to use it on top of MD (for whatever reason)... at > least in the mirroring RAID case, the kernel would see the same btrfs > twice. Also, using flat DM-RAID (and yes, people do use DM-RAID without LVM), using the DM-cache target, some multi-path setups, some shared storage setups, a couple of other DM targets, and probably a number of other things I haven't thought of yet. > > Apart from that, btrfs should be a general purpose fs, and not just a > desktop or server fs. > So edge cases like forensics (where it's common that you create bitwise > identical images) shouln't be forgotten either. While I would normally agree, there are ways to work around this in the forensics case that don't work for any other case (namely, if BTRFS is built as a module, you can unmount everything, unload the module, reload it, and only scan the devices you want). > > >>>> If your workflow requires making an exact copy (for the shelf or >>>> for >>>> an emergency) then dd might be OK. But most often it's used >>>> because >>>> it's been easy, not because it's a good practice. >>> Ufff.. I wouldn't got that far to call something here bad or good >>> practice. >> >> It's not just bad practice, it's sufficiently sloppy that it's very >> nearly user sabotage. 
That this is due to innocent ignorance, and a >> long standing practice that's bad advice being handed down from >> previous generations doesn't absolve the practice and mean we should >> invent esoteric work arounds for what is not a good practice. We have >> all sorts of exhibits why it's not a good idea. > Well if you don't give any real arguments or technical reasons (apart > from "working around software that doesn't handle this well") I > consider this just repetition of the baseless claim that long standing > practise would be bad. Agreed, if you can't substantiate _why_ it's bad practice, then you aren't making a valid argument. The fact that there is software that doesn't handle it well would say to me, based on established practice, that the software is what's broken, not common practice. The assumption that a UUID is actually unique is an inherently flawed one, because it depends both on the method of generation guaranteeing it's unique (and none of the defined methods guarantee that), and a distinct absence of malicious intent. > >> I disagree. It was due to the rudimentary nature of earlier >> filesystems' metadata paradigm that it worked. That's no longer the >> case. > Well in the end it's of course up to the developers to decide whether > this is acceptable or not, but being on the admin/end-user side, I can > at least say that not everyone on there would accept "this is no longer > the case" as valid explanation when their fs was corrupted or attacked. On that note, why exactly is it better to make the filesystem UUID such an integral part of the filesystem? The other thing I'm reading out of all this is that by writing a total of 64 bytes to a specific location in a single disk in a multi-device BTRFS filesystem, you can make the whole filesystem fall apart, which is absolutely absurd. > >> Sure, the kernel code should get smarter about refusing to mount in >> ambiguous cases, so that a file system isn't nerfed. That shouldn't >> happen. 
But we also need to get away from this idea that dd is >> actually an appropriate tool for making a file system copy. > Uhm... your view is a bit narrow-sighted... again take the forensics > example. And some recovery situations (think along the lines of no recovery disk, and you only have busybox or something similar to work with). > > But apart from that,... I never said that dd should be the regular tool > for people to copy a btrfs image. Typically it would be simply slower > than other means. > > But for some solutions, it may still be the better choice, or at least > the only choice implemented right now (e.g. I wouldn't now of a > hypervisor system, that looks at an existing disk image, finds any > btrfs in that (possibly "hidden" below further block layers), and > cleanly copies the data into freshly created btrfs image, with the same > structure. > AFAIK, there's not even a solution right now, that copies a complete > btrfs, with snapshots, etc. preserving all ref-links. At least nothing > official that works in one command. Send-receive kind of works for that, but requires down time because the subvolumes all have to be read-only. In theory, it's possible, but it would take a lot of work, and a lot of special case handling to implement properly. > > Long story, short, I think we can agree, that - dd or not - corruptions > or attack vectors shouldn't be possible. > And be it just, to protect against the btrfs on hardware RAID1 case, > which is accidentally switched to JBOD mode... ^ permalink raw reply [flat|nested] 51+ messages in thread
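On the point made earlier in this message, that UUID uniqueness depends on the generation method and on the absence of malicious intent, here is a minimal sketch using Python's standard `uuid` module. RFC 4122 version 4 UUIDs are random, version 1 UUIDs combine a timestamp, a clock sequence and a node ID (normally the MAC address), and a bitwise copy of either is identical by construction. The fixed node and clock sequence values below are illustrative assumptions only; nothing here claims which method mkfs.btrfs uses.

```python
import uuid

# Version 4: 122 random bits; uniqueness is probabilistic and depends on the RNG.
random_id = uuid.uuid4()

# Version 1: timestamp + clock sequence + node (normally the MAC address).
# The node and clock sequence are fixed here purely for illustration.
time_id = uuid.uuid1(node=0x001122334455, clock_seq=0)

# A bitwise copy (what dd produces) carries the identical identifier.
cloned_id = uuid.UUID(bytes=random_id.bytes)

print(random_id.version, time_id.version)   # 4 1
print(hex(time_id.node))                    # 0x1122334455
print(cloned_id == random_id)               # True
```

Nothing in the identifier itself distinguishes the clone from the source, which is why any disambiguation has to come from somewhere other than the UUID.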
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-14 13:23 ` Austin S. Hemmelgarn @ 2015-12-14 21:26 ` Chris Murphy 2015-12-15 0:35 ` Christoph Anton Mitterer 2015-12-15 13:54 ` Austin S. Hemmelgarn 2015-12-15 0:08 ` Christoph Anton Mitterer 1 sibling, 2 replies; 51+ messages in thread From: Chris Murphy @ 2015-12-14 21:26 UTC (permalink / raw) To: Austin S. Hemmelgarn; +Cc: Christoph Anton Mitterer, Btrfs BTRFS On Mon, Dec 14, 2015 at 6:23 AM, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: > > Agreed, if you can't substantiate _why_ it's bad practice, then you aren't > making a valid argument. The fact that there is software that doesn't > handle it well would say to me based on established practice that that > software is what's broken, not common practice. The automobile was invented and, due to the ensuing chaos, the common practice of doing whatever the F you wanted came to an end in favor of rules of the road and traffic lights. I'm sure some people went ballistic, but for the most part things were much better without the brokenness or prior common practice. So the fact we're going to have this problem with all file systems that incorporate the volume UUID into the metadata stream tells me that the very rudimentary common practice of using dd needs to go away, in general practice. I've already said data recovery (including forensics) and sticking drives away on a shelf could be reasonable. > The assumption that a UUID is actually unique is an inherently flawed one, > because it depends both on the method of generation guaranteeing it's unique > (and none of the defined methods guarantee that), and a distinct absence of > malicious intent. http://www.ietf.org/rfc/rfc4122.txt "A UUID is 128 bits long, and can guarantee uniqueness across space and time." Also see security considerations in section 6. > On that note, why exactly is it better to make the filesystem UUID such an > integral part of the filesystem? 
The other thing I'm reading out of this > all, is that by writing a total of 64 bytes to a specific location in a > single disk in a multi-device BTRFS filesystem, you can make the whole > filesystem fall apart, which is absolutely absurd. OK maybe I'm missing something. 1. UUID is 128 bits (16 bytes). So where are you getting the additional 48 bytes from? 2. The volume UUID is in every superblock, which for all practical purposes means at least two instances of that UUID per device. Are you saying the file system falls apart when changing just one of those volume UUIDs in one superblock? And how does it fall apart? I'd say all volume UUID instances (each superblock, on every device) should be checked and if any of them mismatch then fail to mount. There could be some leveraging of the device WWN, or absent that its serial number, propagated into all of the volume's devices (cross referencing each other's devid to WWN or serial). And then that way there's a way to differentiate. In the dd case, there would be a mismatch between the real device WWN/serial number and the one written in metadata on all drives, including the copy. This doesn't say what policy should happen next, just that at least it's known there's a mismatch. -- Chris Murphy ^ permalink raw reply [flat|nested] 51+ messages in thread
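The size arithmetic and the cross-check proposed in this message can be made concrete. A 128-bit UUID is 16 bytes, so a 64-byte write is 48 bytes more than one UUID. The sketch below, in Python, compares a recorded devid-to-serial map against what the scanned hardware reports. It is a hedged illustration only: `serial_mismatches` and both data layouts are hypothetical, and the thread presents the WWN/serial cross-reference as a proposal rather than an existing Btrfs feature.

```python
import uuid

UUID_BYTES = 128 // 8  # a UUID is 128 bits, i.e. 16 bytes
assert len(uuid.uuid4().bytes) == UUID_BYTES


def serial_mismatches(recorded, observed):
    """Compare recorded devid->serial entries with observed hardware serials.

    recorded: {devid: serial} as it would be stored in the volume metadata
    observed: {path: (devid, serial)} as reported by the scanned devices
    Returns the sorted paths whose serial differs from the recorded one.
    """
    return sorted(
        path
        for path, (devid, serial) in observed.items()
        if recorded.get(devid) != serial
    )
```

For a dd copy of devid 2 onto a different drive, the copy still claims devid 2 but reports its own serial, so it is the path that gets flagged. As in the message, this only detects the mismatch; what policy follows is a separate decision.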
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-14 21:26 ` Chris Murphy @ 2015-12-15 0:35 ` Christoph Anton Mitterer 2015-12-15 13:54 ` Austin S. Hemmelgarn 1 sibling, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-15 0:35 UTC (permalink / raw) To: Chris Murphy, Austin S. Hemmelgarn; +Cc: Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 2133 bytes --] On Mon, 2015-12-14 at 14:26 -0700, Chris Murphy wrote: > The automobile is invented and due to the ensuing chaos, common > practice of doing whatever the F you wanted came to an end in favor > of > rules of the road and traffic lights. I'm sure some people went > ballistic, but for the most part things were much better without the > brokenness of prior common practice. Okay, then take your road traffic example and apply it to filesystems. In road traffic you have rules, e.g. pedestrians may cross the road when their light shows green and that of the cars red. That could be the rule, similar to "don't have duplicate UUIDs with btrfs". Even though we have the rule (cars stop at red, pedestrians walk at green), we still teach our kids: "look both ways on the road, and only cross if there's no car (or tank or whatever ;) ) coming". Applying that to filesystems would be: "hope that everyone plays by the rules, but don't kill yourself if someone doesn't and there are duplicate IDs". > So the fact we're going to have this problem with all file systems > that incorporate the volume UUID into the metadata stream, tells me > that the very rudimentary common practice of using dd needs to go > away, in general practice. Sure, for those that use multiple devices (LVM, MD, etc.), or for those that actually use the UUID to select the block device for each write/read (rather than using it only "once" to get the right major/minor dev id, or whatever the kernel uses internally for path-based addressing). 
> http://www.ietf.org/rfc/rfc4122.txt > "A UUID is 128 bits long, and can guarantee uniqueness across space > and time." But of course not in terms of the problems we're talking about here, where UUIDs may be accidentally or maliciously duplicated. > Also see security considerations in section 6. Doesn't section 6 basically imply that you cannot 100% guarantee they're unique? E.g. bad random seed on multiple systems? Also, IIRC, one of the UUID algos just used some combination of MAC, time and PID... which especially in VMs may even lead to dupes. Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-14 21:26 ` Chris Murphy 2015-12-15 0:35 ` Christoph Anton Mitterer @ 2015-12-15 13:54 ` Austin S. Hemmelgarn 2015-12-15 14:18 ` Hugo Mills 2015-12-16 12:03 ` Christoph Anton Mitterer 1 sibling, 2 replies; 51+ messages in thread From: Austin S. Hemmelgarn @ 2015-12-15 13:54 UTC (permalink / raw) To: Chris Murphy; +Cc: Christoph Anton Mitterer, Btrfs BTRFS On 2015-12-14 16:26, Chris Murphy wrote: > On Mon, Dec 14, 2015 at 6:23 AM, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: >> >> Agreed, if you can't substantiate _why_ it's bad practice, then you aren't >> making a valid argument. The fact that there is software that doesn't >> handle it well would say to me based on established practice that that >> software is what's broken, not common practice. > > The automobile is invented and due to the ensuing chaos, common > practice of doing whatever the F you wanted came to an end in favor of > rules of the road and traffic lights. I'm sure some people went > ballistic, but for the most part things were much better without the > brokenness of prior common practice. Except for one thing: Automobiles actually provide a measurable significant benefit to society. What specific benefit does embedding the filesystem UUID in the metadata actually provide? > > So the fact we're going to have this problem with all file systems > that incorporate the volume UUID into the metadata stream, tells me > that the very rudimentary common practice of using dd needs to go > away, in general practice. I've already said data recovery (including > forensics) and sticking drives away on a shelf could be reasonable. > >> The assumption that a UUID is actually unique is an inherently flawed one, >> because it depends both on the method of generation guaranteeing it's unique >> (and none of the defined methods guarantee that), and a distinct absence of >> malicious intent. 
> > http://www.ietf.org/rfc/rfc4122.txt > "A UUID is 128 bits long, and can guarantee uniqueness across space and time." > > Also see security considerations in section 6. Both aspects ignore the facts that: Version 1 is easy to cause a collision with (MAC addresses are by no means unique, and are easy to spoof, and so are timestamps). Version 2 is relatively easy to cause a collision with, because UID and GID numbers are a fixed size namespace. Version 3 is slightly better, but still not by any means unique because you just have to guess the seed string (or a collision for it). Version 4 is probably the hardest to get a collision with, but only if you are using a true RNG, and even then, 122 bits of entropy is not much protection. Version 5 has the same issues as Version 3, but is more secure against hash collisions. In general, you should only use UUIDs when either: a. You have absolutely 100% complete control of the storage of them, such that you can guarantee they don't get reused. b. They can be guaranteed to be relatively unique for the system using them. > > >> On that note, why exactly is it better to make the filesystem UUID such an >> integral part of the filesystem? The other thing I'm reading out of this >> all, is that by writing a total of 64 bytes to a specific location in a >> single disk in a multi-device BTRFS filesystem, you can make the whole >> filesystem fall apart, which is absolutely absurd. > > > OK maybe I'm missing something. > > 1. UUID is 128 bits. So where are you getting the additional 48 bytes from? > 2. The volume UUID is in every superblock, which for all practical > purposes means at least two instances of that UUID per device. > > Are you saying the file system falls apart when changing just one of > those volume UUIDs in one superblock? And how does it fall apart? I'd > say all volume UUID instances (each superblock, on every device) > should be checked and if any of them mismatch then fail to mount. 
You're right, it would probably take writing all the SB's (although I'm not 100% certain that we actually check that the SB UUID's match). The extra bytes, which I grossly miscalculated, are for the SB checksum, which would have to be updated to match the new SB. > > There could be some leveraging of the device WWN, or absent that its > serial number, propagated into all of the volume's devices (cross > referencing each other's devid to WWN or serial). And then that way > there's a way to differentiate. In the dd case, there would be > mismatching real device WWN/serial number and the one written in > metadata on all drives, including the copy. This doesn't say what > policy should happen next, just that at least it's known there's a > mismatch. > That gets tricky too, because for example you have stuff like flat files used as filesystem images. However, if we then use some separate UUID (possibly hashed off of the file location) in place of the device serial/WWN, that could theoretically provide some better protection. The obvious solution in the case of a mismatch would be to refuse the mount until either the issue is fixed using the tools, or the user specifies some particular mount option to either fix it automatically, or ignore copies with a mismatching serial. ^ permalink raw reply [flat|nested] 51+ messages in thread
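As a small illustration of the points about UUID versions above, Python's standard `uuid` module can show where the "122 bits" figure for version 4 comes from, that versions 3 and 5 are deterministic, and that nothing prevents a UUID from being copied deliberately. This is a sketch for orientation only; it makes no claim about how btrfs generates its UUIDs, and the name `example.org` is just a placeholder.

```python
import uuid

# A version 4 UUID is 128 bits, of which 4 encode the version and 2 the
# variant, leaving 122 bits that can be random.
RANDOM_BITS_V4 = 128 - 4 - 2

u = uuid.uuid4()
print(u.version, RANDOM_BITS_V4)

# Versions 3 and 5 are deterministic: the same namespace and name always
# give the same UUID, so anyone who knows both can reproduce it.
a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")
b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")
print(a == b)

# A UUID can also simply be constructed from chosen bytes, which is the
# "malicious intent" case: uniqueness only holds for honest generation.
forged = uuid.UUID(bytes=u.bytes)
print(forged == u)
```

The last comparison is the whole problem in miniature: a block-level copy of a device carries the same 16 bytes, whatever version generated them.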
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 13:54 ` Austin S. Hemmelgarn @ 2015-12-15 14:18 ` Hugo Mills 2015-12-15 14:27 ` Austin S. Hemmelgarn 2015-12-16 12:03 ` Christoph Anton Mitterer 2015-12-16 12:03 ` Christoph Anton Mitterer 1 sibling, 2 replies; 51+ messages in thread From: Hugo Mills @ 2015-12-15 14:18 UTC (permalink / raw) To: Austin S. Hemmelgarn; +Cc: Chris Murphy, Christoph Anton Mitterer, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 6073 bytes --] On Tue, Dec 15, 2015 at 08:54:01AM -0500, Austin S. Hemmelgarn wrote: > On 2015-12-14 16:26, Chris Murphy wrote: > >On Mon, Dec 14, 2015 at 6:23 AM, Austin S. Hemmelgarn > ><ahferroin7@gmail.com> wrote: > >> > >>Agreed, if you can't substantiate _why_ it's bad practice, then you aren't > >>making a valid argument. The fact that there is software that doesn't > >>handle it well would say to me based on established practice that that > >>software is what's broken, not common practice. > > > >The automobile is invented and due to the ensuing chaos, common > >practice of doing whatever the F you wanted came to an end in favor of > >rules of the road and traffic lights. I'm sure some people went > >ballistic, but for the most part things were much better without the > >brokenness of prior common practice. > Except for one thing: Automobiles actually provide a measurable > significant benefit to society. What specific benefit does > embedding the filesystem UUID in the metadata actually provide? That one's easy to answer. 
It deals with a major issue that reiserfs had: if you have a filesystem with another filesystem image stored on it, reiserfsck could end up deciding that both the metadata blocks of the main filesystem *and* the metadata blocks of the image were part of the same FS (because they're on the same block device), and so would splice both filesystems into one, generally complaining loudly along the way that there was a lot of corruption present that it was trying to fix. Putting the UUID of the FS into the metadata blocks means that the kind of low-level check/repair attempt which scans for "stuff that looks like metadata" can at least distinguish between the stuff that's really metadata and the stuff that's just data that looks like metadata. Hugo. > >So the fact we're going to have this problem with all file systems > >that incorporate the volume UUID into the metadata stream, tells me > >that the very rudimentary common practice of using dd needs to go > >away, in general practice. I've already said data recovery (including > >forensics) and sticking drives away on a shelf could be reasonable. > > > >>The assumption that a UUID is actually unique is an inherently flawed one, > >>because it depends both on the method of generation guaranteeing it's unique > >>(and none of the defined methods guarantee that), and a distinct absence of > >>malicious intent. > > > >http://www.ietf.org/rfc/rfc4122.txt > >"A UUID is 128 bits long, and can guarantee uniqueness across space and time." > > > >Also see security considerations in section 6. > Both aspects ignore the facts that: > Version 1 is easy to cause a collision with (MAC addresses are by no > means unique, and are easy to spoof, and so are timestamps). > Version 2 is relatively easy to cause a collision with, because UID > and GID numbers are a fixed size namespace. > Version 3 is slightly better, but still not by any means unique > because you just have to guess the seed string (or a collision for > it). 
> Version 4 is probably the hardest to get a collision with, but only > if you are using a true RNG, and even then, 122 bits of entropy is > not much protection. > Version 5 has the same issues as Version 3, but is more secure > against hash collisions. > > In general, you should only use UUID's when either: > a. You have absolutely 100% complete control of the storage of them, > such that you can guarantee they don't get reused. > b. They can be guaranteed to be relatively unique for the system using them. > > > > > >>On that note, why exactly is it better to make the filesystem UUID such an > >>integral part of the filesystem? The other thing I'm reading out of this > >>all, is that by writing a total of 64 bytes to a specific location in a > >>single disk in a multi-device BTRFS filesystem, you can make the whole > >>filesystem fall apart, which is absolutely absurd. > > > > > >OK maybe I'm missing something. > > > >1. UUID is 128 bits. So where are you getting the additional 48 bytes from? > >2. The volume UUID is in every superblock, which for all practical > >purposes means at least two instances of that UUID per device. > > > >Are you saying the file system falls apart when changing just one of > >those volume UUIDs in one superblock? And how does it fall apart? I'd > >say all volume UUID instances (each superblock, on every device) > >should be checked and if any of them mismatch then fail to mount. > You're right, it would probably take writing all the SB's (although > I'm not 100% certain that we actually check that the SB UUID's > match). > The extra bytes, which I grossly miscalculated, are for the SB > checksum, which would have to be updated to match the new SB. > > > >There could be some leveraging of the device WWN, or absent that its > >serial number, propagated into all of the volume's devices (cross > >referencing each other's devid to WWN or serial). And then that way > >there's a way to differentiate. 
In the dd case, there would be > >mismatching real device WWN/serial number and the one written in > >metadata on all drives, including the copy. This doesn't say what > >policy should happen next, just that at least it's known there's a > >mismatch. > > > That gets tricky too, because for example you have stuff like flat > files used as filesystem images. > > However, if we then use some separate UUID (possibly hashed off of > the file location) in place of the device serial/WWN, that could > theoretically provide some better protection. The obvious solution > in the case of a mismatch would be to refuse the mount until either > the issue is fixed using the tools, or the user specifies some > particular mount option to either fix it automatically, or ignore > copies with a mismatching serial. > -- Hugo Mills | I think that everything darkling says is actually a hugo@... carfax.org.uk | joke. It's just that we haven't worked out most of http://carfax.org.uk/ | them yet. PGP: E2AB1DE4 | Vashka [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 14:18 ` Hugo Mills @ 2015-12-15 14:27 ` Austin S. Hemmelgarn 2015-12-15 14:42 ` Hugo Mills 2015-12-16 12:03 ` Christoph Anton Mitterer 1 sibling, 1 reply; 51+ messages in thread From: Austin S. Hemmelgarn @ 2015-12-15 14:27 UTC (permalink / raw) To: Hugo Mills, Chris Murphy, Christoph Anton Mitterer, Btrfs BTRFS On 2015-12-15 09:18, Hugo Mills wrote: > On Tue, Dec 15, 2015 at 08:54:01AM -0500, Austin S. Hemmelgarn wrote: >> On 2015-12-14 16:26, Chris Murphy wrote: >>> On Mon, Dec 14, 2015 at 6:23 AM, Austin S. Hemmelgarn >>> <ahferroin7@gmail.com> wrote: >>>> >>>> Agreed, if you can't substantiate _why_ it's bad practice, then you aren't >>>> making a valid argument. The fact that there is software that doesn't >>>> handle it well would say to me based on established practice that that >>>> software is what's broken, not common practice. >>> >>> The automobile is invented and due to the ensuing chaos, common >>> practice of doing whatever the F you wanted came to an end in favor of >>> rules of the road and traffic lights. I'm sure some people went >>> ballistic, but for the most part things were much better without the >>> brokenness of prior common practice. >> Except for one thing: Automobiles actually provide a measurable >> significant benefit to society. What specific benefit does >> embedding the filesystem UUID in the metadata actually provide? > > That one's easy to answer. It deals with a major issue that > reiserfs had: if you have a filesystem with another filesystem image > stored on it, reiserfsck could end up deciding that both the metadata > blocks of the main filesystem *and* the metadata blocks of the image > were part of the same FS (because they're on the same block device), > and so would splice both filesystems into one, generally complaining > loudly along the way that there was a lot of corruption present that > it was trying to fix. 
IIRC, that was because of the way the SB was designed, and is why other filesystems have a UUID in the superblock. I probably should have been clearer with my statement, what I meant was: What specific benefit does using the UUID for multi-device filesystems to identify the various devices provide? ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 14:27 ` Austin S. Hemmelgarn @ 2015-12-15 14:42 ` Hugo Mills 2015-12-15 16:03 ` Austin S. Hemmelgarn 2015-12-16 12:10 ` Christoph Anton Mitterer 0 siblings, 2 replies; 51+ messages in thread From: Hugo Mills @ 2015-12-15 14:42 UTC (permalink / raw) To: Austin S. Hemmelgarn; +Cc: Chris Murphy, Christoph Anton Mitterer, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 4036 bytes --] On Tue, Dec 15, 2015 at 09:27:12AM -0500, Austin S. Hemmelgarn wrote: > On 2015-12-15 09:18, Hugo Mills wrote: > >On Tue, Dec 15, 2015 at 08:54:01AM -0500, Austin S. Hemmelgarn wrote: > >>On 2015-12-14 16:26, Chris Murphy wrote: > >>>On Mon, Dec 14, 2015 at 6:23 AM, Austin S. Hemmelgarn > >>><ahferroin7@gmail.com> wrote: > >>>> > >>>>Agreed, if you can't substantiate _why_ it's bad practice, then you aren't > >>>>making a valid argument. The fact that there is software that doesn't > >>>>handle it well would say to me based on established practice that that > >>>>software is what's broken, not common practice. > >>> > >>>The automobile is invented and due to the ensuing chaos, common > >>>practice of doing whatever the F you wanted came to an end in favor of > >>>rules of the road and traffic lights. I'm sure some people went > >>>ballistic, but for the most part things were much better without the > >>>brokenness of prior common practice. > >>Except for one thing: Automobiles actually provide a measurable > >>significant benefit to society. What specific benefit does > >>embedding the filesystem UUID in the metadata actually provide? > > > > That one's easy to answer. 
It deals with a major issue that > >reiserfs had: if you have a filesystem with another filesystem image > >stored on it, reiserfsck could end up deciding that both the metadata > >blocks of the main filesystem *and* the metadata blocks of the image > >were part of the same FS (because they're on the same block device), > >and so would splice both filesystems into one, generally complaining > >loudly along the way that there was a lot of corruption present that > >it was trying to fix. > IIRC, that was because of the way the SB was designed, and is why > other filesystems have a UUID in the superblock. > > I probably should have been clearer with my statement, what I meant was: > What specific benefit does using the UUID for multi-device > filesystems to identify the various devices provide? Well, given a bunch of block devices, how do you identify which ones to use for each of the (unknown number of) filesystems in the system? You can either use some kind of config file, which is going to get out of date as device enumeration orders change or as devices are added/deleted from the FS, or you can try to identify the devices that belong together automatically in some way. btrfs uses the latter option (with the former option kind of supported using the device= mount option). The use of a UUID isn't fundamental to the latter process, but anything that you replaced the UUID with would have the same issues that we're seeing here -- make a duplicate of the device at the block level, and you get additional devices that look like they should be part of the FS. The question is not how you avoid duplicating the UUIDs, but how you identify that there are duplicates present, and how you deal with that issue once you've detected them. 
This is complicated by the fact that it's perfectly legitimate to have two block devices in the system that identify themselves as the same device for the same filesystem -- this happens when they're different views of the same underlying storage through multipathing. I would suggest trying to migrate to a state where detecting more than one device with the same UUID and devid is cause to prevent the FS from mounting, unless there's also a "mount_duplicates_yes_i_know_this_is_dangerous_and_i_know_what_im_doing" mount flag present, for the multipathing people. That will break existing userspace behaviour for the multipathing case, but the migration can probably be managed. (e.g. NFS has successfully changed default behaviour for one of its mount options in the last few(?) years). Hugo. -- Hugo Mills | I think that everything darkling says is actually a hugo@... carfax.org.uk | joke. It's just that we haven't worked out most of http://carfax.org.uk/ | them yet. PGP: E2AB1DE4 | Vashka [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
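The policy suggested in this message (refuse to mount when the same UUID and devid pair is seen on more than one block device, unless explicitly overridden) can be sketched as follows. Everything here is hypothetical: the function names, the `(path, fsid, devid)` tuples standing in for a device scan, and the `allow_duplicates` parameter standing in for the proposed mount flag. It is not btrfs code, only a model of the decision.

```python
from collections import defaultdict


def find_duplicate_members(scanned):
    """Group scanned devices by (fsid, devid); return groups seen on >1 path.

    scanned is an iterable of (path, fsid, devid) tuples (hypothetical input).
    """
    groups = defaultdict(list)
    for path, fsid, devid in scanned:
        groups[(fsid, devid)].append(path)
    return {key: paths for key, paths in groups.items() if len(paths) > 1}


def may_mount(scanned, fsid, allow_duplicates=False):
    """Refuse the mount if any member of fsid appears twice, unless overridden."""
    dupes = {k: v for k, v in find_duplicate_members(scanned).items() if k[0] == fsid}
    return allow_duplicates or not dupes


scanned = [
    ("/dev/sda", "fs-A", 1),
    ("/dev/sdb", "fs-A", 2),
    ("/dev/sdc", "fs-A", 2),  # a dd copy, or a second multipath view, of devid 2
    ("/dev/sdd", "fs-B", 1),
]

print(find_duplicate_members(scanned))
print(may_mount(scanned, "fs-A"))
print(may_mount(scanned, "fs-B"))
print(may_mount(scanned, "fs-A", allow_duplicates=True))
```

Note that the check cannot tell a dd copy from a legitimate multipath view: both look like `/dev/sdc` above, which is exactly why an override is needed at all.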
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 14:42 ` Hugo Mills @ 2015-12-15 16:03 ` Austin S. Hemmelgarn 2015-12-16 12:14 ` Christoph Anton Mitterer 2015-12-16 12:10 ` Christoph Anton Mitterer 1 sibling, 1 reply; 51+ messages in thread From: Austin S. Hemmelgarn @ 2015-12-15 16:03 UTC (permalink / raw) To: Hugo Mills, Chris Murphy, Christoph Anton Mitterer, Btrfs BTRFS On 2015-12-15 09:42, Hugo Mills wrote: > On Tue, Dec 15, 2015 at 09:27:12AM -0500, Austin S. Hemmelgarn wrote: >> On 2015-12-15 09:18, Hugo Mills wrote: >>> On Tue, Dec 15, 2015 at 08:54:01AM -0500, Austin S. Hemmelgarn wrote: >>>> On 2015-12-14 16:26, Chris Murphy wrote: >>>>> On Mon, Dec 14, 2015 at 6:23 AM, Austin S. Hemmelgarn >>>>> <ahferroin7@gmail.com> wrote: >>>>>> >>>>>> Agreed, if you can't substantiate _why_ it's bad practice, then you aren't >>>>>> making a valid argument. The fact that there is software that doesn't >>>>>> handle it well would say to me based on established practice that that >>>>>> software is what's broken, not common practice. >>>>> >>>>> The automobile is invented and due to the ensuing chaos, common >>>>> practice of doing whatever the F you wanted came to an end in favor of >>>>> rules of the road and traffic lights. I'm sure some people went >>>>> ballistic, but for the most part things were much better without the >>>>> brokenness of prior common practice. >>>> Except for one thing: Automobiles actually provide a measurable >>>> significant benefit to society. What specific benefit does >>>> embedding the filesystem UUID in the metadata actually provide? >>> >>> That one's easy to answer. 
It deals with a major issue that >>> reiserfs had: if you have a filesystem with another filesystem image >>> stored on it, reiserfsck could end up deciding that both the metadata >>> blocks of the main filesystem *and* the metadata blocks of the image >>> were part of the same FS (because they're on the same block device), >>> and so would splice both filesystems into one, generally complaining >>> loudly along the way that there was a lot of corruption present that >>> it was trying to fix. >> IIRC, that was because of the way the SB was designed, and is why >> other filesystems have a UUID in the superblock. >> >> I probably should have been clearer with my statement, what I meant was: >> What specific benefit does using the UUID for multi-device >> filesystems to identify the various devices provide? > > Well, given a bunch of block devices, how do you identify which > ones to use for each of the (unknown number of) filesystems in the > system? > > You can either use some kind of config file, which is going to get > out of date as device enumeration orders change or as devices are > added/deleted from the FS, or you can try to identify the devices that > belong together automatically in some way. btrfs uses the latter > option (with the former option kind of supported using the device= > mount option). The use of a UUID isn't fundamental to the latter > process, but anything that you replaced the UUID with would have the > same issues that we're seeing here -- make a duplicate of the device > at the block level, and you get additional devices that look like they > should be part of the FS. > > The question is not how you avoid duplicating the UUIDs, but how > you identify that there are duplicates present, and how you deal with > that issue once you've detected them. 
This is complicated by the fact > that it's perfectly legitimate to have two block devices in the system > that identify themselves as the same device for the same filesystem -- > this happens when they're different views of the same underlying > storage through multipathing. > > I would suggest trying to migrate to a state where detecting more > than one device with the same UUID and devid is cause to prevent the > FS from mounting, unless there's also a > "mount_duplicates_yes_i_know_this_is_dangerous_and_i_know_what_im_doing" mount flag present, > for the multipathing people. That will break existing userspace > behaviour for the multipathing case, but the migration can probably be > managed. (e.g. NFS has successfully changed default behaviour for one > of its mount options in the last few(?) years). May I propose the alternative option of adding a flag to tell mount to _only_ use the devices specified in the options? That would allow people to work around the common issues (multipath, dm-cache, etc), and would give people who have stable device enumeration a way to mitigate the possibility of an attack. ^ permalink raw reply [flat|nested] 51+ messages in thread
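The alternative proposed here, restricting assembly to an explicit list of devices, can be modelled like this. It is a sketch under assumptions: `select_members` and its `only_devices` parameter are made-up names standing in for the proposed mount flag combined with `device=` entries, and the scan results are hypothetical tuples rather than anything read from real hardware.

```python
def select_members(scanned, fsid, only_devices=None):
    """Pick the block devices to assemble a filesystem from.

    scanned is an iterable of (path, fsid, devid) tuples. When only_devices
    is given, paths outside it are ignored even if their fsid matches.
    """
    allowed = None if only_devices is None else set(only_devices)
    return [
        path
        for path, dev_fsid, _devid in scanned
        if dev_fsid == fsid and (allowed is None or path in allowed)
    ]


scanned = [
    ("/dev/sda", "fs-A", 1),
    ("/dev/sdb", "fs-A", 2),
    ("/dev/loop0", "fs-A", 2),  # a clone carrying the same fsid and devid
]

# Automatic assembly picks up the clone as well.
print(select_members(scanned, "fs-A"))

# With an explicit list, the clone is never considered.
print(select_members(scanned, "fs-A", only_devices=["/dev/sda", "/dev/sdb"]))
```

The design trade-off is the one raised earlier in the thread: an explicit list is only as stable as the device names in it, so it helps most where enumeration does not change between boots.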
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 16:03 ` Austin S. Hemmelgarn @ 2015-12-16 12:14 ` Christoph Anton Mitterer 0 siblings, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-16 12:14 UTC (permalink / raw) To: Austin S. Hemmelgarn, Hugo Mills, Chris Murphy, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 891 bytes --] On Tue, 2015-12-15 at 11:03 -0500, Austin S. Hemmelgarn wrote: > May I propose the alternative option of adding a flag to tell mount > to > _only_ use the devices specified in the options? That's one part of exactly what I have been proposing for a few days :-P (no one seems to read my mails ;-) ) Plus, this applies not only to mounts, but also to fsck, repair, and all other userland tool operations. But it's only part of the solution to the whole problem; the other part is that automatic device activations/rebuilds/etc. of _already active_ devices should generally not happen (manual ones of course may happen, again with device= options, specifying *which* devices are actually meant). See my mail from "Fri, 11 Dec 2015 23:06:03 +0100" (http://thread.gmane.org/gmane.comp.file-systems.btrfs/50909/focus=51147) which I think covers pretty much all cases. Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 14:42 ` Hugo Mills 2015-12-15 16:03 ` Austin S. Hemmelgarn @ 2015-12-16 12:10 ` Christoph Anton Mitterer 1 sibling, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-16 12:10 UTC (permalink / raw) To: Hugo Mills, Austin S. Hemmelgarn; +Cc: Chris Murphy, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 1377 bytes --] On Tue, 2015-12-15 at 14:42 +0000, Hugo Mills wrote: > I would suggest trying to migrate to a state where detecting more > than one device with the same UUID and devid is cause to prevent the > FS from mounting, unless there's also a > "mount_duplicates_yes_i_know_this_is_dangerous_and_i_know_what_im_doing" mount flag present, > for the multipathing people. That will break existing userspace > behaviour for the multipathing case, but the migration can probably > be > managed. (e.g. NFS has successfully changed default behaviour for one > of its mount options in the last few(?) years). I don't think that a single mount option a la "force-and-do-it" is a proper solution here. It would still leave surface open for attacks and also for accidents. In the case where multipathing is used, the only realistic way seems to be manually specifying the devices a la device=/dev/sda,/dev/sdb. Of course btrfs would still use the UUIDs/deviceIDs of these, but *only* of those devices that have been whitelisted with the device= option. In the case of a general "mount_duplicates_yes_iknow_th..." option you could end up with having e.g. three duplicates, two being actually multipaths, and the third one being a losetup or USB clone of the image,... again allowing for the aforementioned attacks to happen, and again allowing for severe corruption to occur. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 14:18 ` Hugo Mills 2015-12-15 14:27 ` Austin S. Hemmelgarn @ 2015-12-16 12:03 ` Christoph Anton Mitterer 2015-12-16 14:41 ` Chris Mason 1 sibling, 1 reply; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-16 12:03 UTC (permalink / raw) To: Hugo Mills, Austin S. Hemmelgarn; +Cc: Chris Murphy, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 1746 bytes --] On Tue, 2015-12-15 at 14:18 +0000, Hugo Mills wrote: > That one's easy to answer. It deals with a major issue that > reiserfs had: if you have a filesystem with another filesystem image > stored on it, reiserfsck could end up deciding that both the metadata > blocks of the main filesystem *and* the metadata blocks of the image > were part of the same FS (because they're on the same block device), > and so would splice both filesystems into one, generally complaining > loudly along the way that there was a lot of corruption present that > it was trying to fix. Hmm that's a bit strange though, and to me it rather sounds like other bugs... You can have an ext4 on a file in an ext4, with or without the same UUIDs, and it will just work. If the filesystem takes contents from a normal file as possible metadata, then something else is severely screwed up... or in case of the fsck: it probably means it's a bit too liberal in searching places. I'd be quite shocked if this is the case in btrfs, cause it would mean again, that we have a vulnerability against UUID collisions. Imagine some attacker finds out the UUID of a filesystem (which is probably rather easy)... next he uploads some file (e.g. it's a webserver which allows image uploads, a forum perhaps) that in reality contains what looks like btrfs metadata and uses a matching UUID. It would run into the same issues as what you describe for reiser... the UUID would be no real help to solve that problem. Does anyone know whether btrfsck (or other userland) tools do such things? I.e. 
search more or less arbitrary blocks, where it cannot be sure it's *not* data, for what it would interpret as meta-data subsequently? Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-16 12:03 ` Christoph Anton Mitterer @ 2015-12-16 14:41 ` Chris Mason 2015-12-16 15:04 ` Christoph Anton Mitterer 0 siblings, 1 reply; 51+ messages in thread From: Chris Mason @ 2015-12-16 14:41 UTC (permalink / raw) To: Christoph Anton Mitterer Cc: Hugo Mills, Austin S. Hemmelgarn, Chris Murphy, Btrfs BTRFS On Wed, Dec 16, 2015 at 01:03:38PM +0100, Christoph Anton Mitterer wrote: > On Tue, 2015-12-15 at 14:18 +0000, Hugo Mills wrote: > > That one's easy to answer. It deals with a major issue that > > reiserfs had: if you have a filesystem with another filesystem image > > stored on it, reiserfsck could end up deciding that both the metadata > > blocks of the main filesystem *and* the metadata blocks of the image > > were part of the same FS (because they're on the same block device), > > and so would splice both filesystems into one, generally complaining > > loudly along the way that there was a lot of corruption present that > > it was trying to fix. > Hmm that's a bit strange though, and to me it rather sounds like other > bugs... > You can have an ext4 on a file in an ext4, with or without the same > UUIDs, and it will just work. Hugo is right here. reiserfs had tools that would scan an entire block device for metadata blocks and try to reconstruct the filesystem based on what it found. Since there was no uuid, it was impossible to tell if a block from the scan was really part of this filesystem or part of some image file that happened to be sitting there. Adding UUIDs doesn't make that whole class of problem go away (you could have an image of the filesystem inside that filesystem), but it does make it dramatically less likely. At the end of the day it's just a best practice mechanism to help recovery and prevent admin mistakes. It's also a building block of the multi-device support. We could change the multi-device support to allow duplicate uuids in single device filesystems. 
But I'd much rather see a variation on seed devices enable transitioning from one uuid to another.

> If the filesystem takes contents from a normal file as possible metadata, then something else is severely screwed up... or in case of the fsck: it probably means it's a bit too liberal in searching places.
>
> I'd be quite shocked if this is the case in btrfs, cause it would mean again, that we have a vulnerability against UUID collisions.
> Imagine some attacker finds out the UUID of a filesystem (which is probably rather easy)... next he uploads some file (e.g. it's a webserver which allows image uploads, a forum perhaps) that in reality contains what looks like btrfs metadata and uses a matching UUID.
>
> It would run into the same issues as what you describe for reiser,.. the UUID would be no real help to solve that problem.
>
> Does anyone know whether btrfsck (or other userland) tools do such things? I.e. search more or less arbitrary blocks, where it cannot be sure it's *not* data, for what it would interpret as meta-data subsequently?

These are emergency tools; btrfs restore and find-roots can do some scanning. We don't do it the way reiserfs did because it would be very difficult to reconstruct shared data and metadata from snapshots.

-chris
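The scanning problem discussed in this message can be made concrete with a toy model. The sketch below is not reiserfsck or any btrfs tool: the block size, the `TOYMETA!` magic, the header layout (magic followed by a 16-byte filesystem UUID) and all function names are assumptions made purely for illustration. It shows why tagging metadata blocks with the filesystem UUID filters out a foreign image stored in a regular file, and why that filter stops helping once the embedded image carries the same UUID.

```python
import uuid
from typing import List, Optional

BLOCK_SIZE = 64       # toy block size, far smaller than any real filesystem
MAGIC = b"TOYMETA!"   # hypothetical marker identifying a metadata block


def make_meta_block(fsid: uuid.UUID, payload: bytes) -> bytes:
    """Build a toy metadata block: magic, 16-byte fsid, zero-padded payload."""
    return (MAGIC + fsid.bytes + payload).ljust(BLOCK_SIZE, b"\0")


def make_data_block(payload: bytes) -> bytes:
    """Build a toy data block with no header at all."""
    return payload.ljust(BLOCK_SIZE, b"\0")


def scan_for_metadata(device: bytes, fsid: Optional[uuid.UUID] = None) -> List[int]:
    """Return the block numbers that a whole-device scan would accept.

    With fsid=None the scanner has no UUID to compare against, so every
    block carrying the magic is accepted, whoever it belongs to.
    """
    hits = []
    for offset in range(0, len(device), BLOCK_SIZE):
        block = device[offset:offset + BLOCK_SIZE]
        if not block.startswith(MAGIC):
            continue
        block_fsid = block[len(MAGIC):len(MAGIC) + 16]
        if fsid is None or block_fsid == fsid.bytes:
            hits.append(offset // BLOCK_SIZE)
    return hits


host_fsid = uuid.UUID(int=1)
guest_fsid = uuid.UUID(int=2)

device = b"".join([
    make_meta_block(host_fsid, b"root tree"),     # block 0: genuine metadata
    make_data_block(b"plain file data"),          # block 1: ordinary data
    make_meta_block(guest_fsid, b"image meta"),   # block 2: image file, other UUID
    make_meta_block(host_fsid, b"forged meta"),   # block 3: image file, same UUID
])

print(scan_for_metadata(device))             # [0, 2, 3]  no UUID check at all
print(scan_for_metadata(device, host_fsid))  # [0, 3]     block 3 still slips through
```

In this model the UUID check removes block 2 but not block 3, which matches the point that UUIDs make the problem much less likely without eliminating it.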
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-16 14:41 ` Chris Mason @ 2015-12-16 15:04 ` Christoph Anton Mitterer 2015-12-17 3:25 ` Duncan 0 siblings, 1 reply; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-16 15:04 UTC (permalink / raw) To: Chris Mason; +Cc: Hugo Mills, Austin S. Hemmelgarn, Chris Murphy, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 1682 bytes --] On Wed, 2015-12-16 at 09:41 -0500, Chris Mason wrote: > Hugo is right here. reiserfs had tools that would scan and entire > block > device for metadata blocks and try to reconstruct the filesystem > based > on what it found. Creepy... at least when talking about a "normal" fsck... good that btrfs is going to be the next-gen-ext, and not reiser4 ;) > Adding UUIDs doesn't make that whole class of problem go away (you > could > have an image of the filesystem inside that filesystem), but it does > make it dramatically less likely. Sure... > > Does anyone know whether btrfsck (or other userland) tools do such > > things? I.e. search more or less arbitrary blocks, where it cannot > > be > > sure it's *not* data, for what it would interpret as meta-data > > subsequently? > > These are emergency tools, btrfs restore and find-roots can do some > scanning. We don't do it the way reiserfs did because it would be > very > difficult to reconstruct shared data and metadata from snapshots. Hmm I agree, that it's valid for such tools, to do these kinds of scans (i.e. scan for meta-data in places that are not known for sure to be meta-data) when doing some last-resort-rescue tries... or for rescue operations, where it's clearly documented that this is done. But I think it shouldn't happen e.g. during a normal fsck - only when special options are given. And it should be properly documented (i.e. telling people in the docs, that this does a block for block scan for meta-data even within normal data, and that if they'd had e.g. 
another fs with the same UUID within, the results may be completely bogus). Cheers, Chris.
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-16 15:04 ` Christoph Anton Mitterer @ 2015-12-17 3:25 ` Duncan 2015-12-18 0:56 ` Christoph Anton Mitterer 2015-12-22 2:13 ` Kai Krakow 0 siblings, 2 replies; 51+ messages in thread From: Duncan @ 2015-12-17 3:25 UTC (permalink / raw) To: linux-btrfs Christoph Anton Mitterer posted on Wed, 16 Dec 2015 16:04:03 +0100 as excerpted: > On Wed, 2015-12-16 at 09:41 -0500, Chris Mason wrote: >> Hugo is right here. reiserfs had tools that would scan and entire >> block device for metadata blocks and try to reconstruct the filesystem >> based on what it found. > Creepy... at least when talking about a "normal" fsck... good that btrfs > is going to be the next-gen-ext, and not reiser4 ;) What often gets lost in discussions of this nature is that it _wasn't_ "normal" fsck that had the problem, but rather, a parameter (--rebuild-tree, IIRC) much like btrfs check (--init-csum-tree, init-extent-tree) and rescue (chunk-recover) use for blowing away and recreating the checksum tree, extent tree, chunk tree, etc. So it's definitely _not_ something that reiserfsck would do in a "normal" fsck, only when doing "I'm desperate and don't have backups, go to the ends of the earth if necessary to recover what you can of my data, and yes, I understand it could be a bit risky or end up rather disordered, but I'm willing to take that risk because I _am_ that desperate", level recovery. Arguably, however, the problem was that reiserfs (heh, that's the second time I almost wrote btrfs and caught it, hope I didn't miss any! =:^) had a rather minor items repair mode, and an "I'm desperate, ends of the earth and I don't care about the risk as anything is better than nothing" mode, but not a lot of choice in between the two. Additionally, now looking at btrfs (a correct reference this time! 
=:^), the "desperate" solution in btrfs is rather more fine-grained, including at least the three above options plus one for the superblock, with an additional read- only restore tool that can often restore most or all data to elsewhere, in the case of a missed or not current backup, that reiserfs never had. But AFAIK reiser4 (which I never actually tried as it never made mainline, which in general I prefer to stick to, but I read about it) improved on the reiserfs model in this regard as well -- indeed, it would have been surprising if it didn't, since both reiser4 and btrfs had the lessens of reiserfs to build upon. And of course reiserfs might have gotten the same sort of tool changes too, except for Hans Reiser's controversial policy of letting stable be stable, and putting the improvements into reiser4, which of course was intended to get into mainline in some reasonable time and thus wouldn't have left reiserfs users so in the lurch as actually happened, because reiser4 never did hit mainline due to $reasons, most/all of which I agree with, or at least understand, where I don't entirely agree. But anyway, for anyone with half a tech-oriented brain, it was very evident that the required options were "desperate" level, and for people without half a tech-oriented brain, the documentation clearly suggested danger ahead, you should have backups if you're going to do this as it's a risky process that could destroy chances of recovery instead of fixing things, as well. But of course so many don't read the docs, they just do it... and sometimes they suffer the consequences when they do... and sometimes then try to blame others for it. <shrug> That's the way of the world; not something we're going to change. Even the required actually spelled out "yes" confirmation, not just "y", didn't stop people, either from doing it or for blaming reiserfs for problems that were in fact mostly their own, when they went ahead anyway. -- Duncan - List replies preferred. 
No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-17 3:25 ` Duncan @ 2015-12-18 0:56 ` Christoph Anton Mitterer 2015-12-22 2:13 ` Kai Krakow 1 sibling, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-18 0:56 UTC (permalink / raw) To: Duncan, linux-btrfs

On Thu, 2015-12-17 at 03:25 +0000, Duncan wrote:
> So it's definitely _not_ something that reiserfsck would do in a "normal" fsck, only when doing "I'm desperate and don't have backups, go to the ends of the earth if necessary to recover what you can of my data, and yes, I understand it could be a bit risky or end up rather disordered, but I'm willing to take that risk because I _am_ that desperate", level recovery.

Well, as long as that was/is clearly documented (which for btrfs would need to include any warnings about issues with multi-dev, if any), then it's IMHO completely okay. :)

Cheers, Chris.
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-17 3:25 ` Duncan 2015-12-18 0:56 ` Christoph Anton Mitterer @ 2015-12-22 2:13 ` Kai Krakow 1 sibling, 0 replies; 51+ messages in thread From: Kai Krakow @ 2015-12-22 2:13 UTC (permalink / raw) To: linux-btrfs Am Thu, 17 Dec 2015 03:25:50 +0000 (UTC) schrieb Duncan <1i5t5.duncan@cox.net>: > So it's definitely _not_ something that reiserfsck would do in a > "normal" fsck, only when doing "I'm desperate and don't have backups, > go to the ends of the earth if necessary to recover what you can of > my data, and yes, I understand it could be a bit risky or end up > rather disordered, but I'm willing to take that risk because I _am_ > that desperate", level recovery. What's fascinating: reiserfs was actually quite good at that and actually saved me from "I'm desperate and don't have backups, go to the ends of the earth if necessary to recover what you can of my data, and yes, I understand it could be a bit risky or end up rather disordered, but I'm willing to take that risk because I _am_ that desperate" (phew that's long). According to checksums all files except some inflight temporary data was completely intact (in addition to many files which came back out of nowhere - even ending up in their original directory but not so intact). Lucky me... :-D Cause of this was an unstable RAID controller which switched one hard disk after the next into offline mode, then completely went offline itself - leaving me with a system still running acceptably from cache only. It was strange... And reiserfs did this magic twice for me (but the second time I had current backups, just wanted to have a copy of files created since the nightly backup). BTW: Ext3 partitions on the same hardware were broken beyond repair and had to be recreated. e2fsck only made it worse. 
Apparently, reiserfs did not scale at all to multithreaded workloads, which is why I switched to xfs (it seemed pretty good at that, especially on RAID with its behavior of distributing data diagonally across the disks, though I wouldn't recommend it without a BBU, as it tends to nullify file contents during log replay). It has proven similarly stable in case of hardware havoc.

BTW2: The server with the "RAID controller accident" is still in production, but has meanwhile been converted to XFS and migrated into virtualization. And yes: it has a daily backup schedule. :-)

--
Regards, Kai
Replies to list-only preferred.
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 13:54 ` Austin S. Hemmelgarn 2015-12-15 14:18 ` Hugo Mills @ 2015-12-16 12:03 ` Christoph Anton Mitterer 2015-12-17 2:43 ` Duncan 1 sibling, 1 reply; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-16 12:03 UTC (permalink / raw) To: Austin S. Hemmelgarn, Chris Murphy; +Cc: Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 4904 bytes --] On Tue, 2015-12-15 at 08:54 -0500, Austin S. Hemmelgarn wrote: > Except for one thing: Automobiles actually provide a measurable > significant benefit to society. What specific benefit does embedding > the filesystem UUID in the metadata actually provide? I guess that's quite obvious. You want something that can be used to address devices stable (i.e. not their "path" like sda,sdb). So either some ID or a label. Human readable lables are basically guaranteed to collide, so UUIDs are the clean solution. Since there is however no guarantee that they don't collide (either by accident or malicious intent), you need to protect against that. Analogous for the device IDs of multi-device fs or containers. > > "A UUID is 128 bits long, and can guarantee uniqueness across space > > and time." > > > > Also see security considerations in section 6. > Both aspects ignore the facts that: > Version 1 is easy to cause a collision with (MAC addresses are by no > means unique, and are easy to spoof, and so are timestamps). > Version 2 is relatively easy to cause a collision with, because UID > and > GID numbers are a fixed size namespace. > Version 3 is slightly better, but still not by any means unique > because > you just have to guess the seed string (or a collision for it). > Version 4 is probably the hardest to get a collision with, but only > if > you are using a true RNG, and evne then, 122 bits of entropy is not > much > protection. > Version 5 has the same issues as Version 3, but is more secure > against > hash collisions. 
I guess we don't need to discuss how unique UUIDs are when they're *freshly created*, since this is the only thing what the RFC "guarantees"... That's mostly irrelevant for us here, as we have two far more stronger cases, accidental duplication and malicious collisions. The possible case, that by normal means (e.g. mkfs.btrfs) a UUID collision occurs, are small, but solving the actual two cases here, will solve that one as well. Apart from that, I've noticed in several of your mails that either something with the indention level goes wrong, or you mix contents from multiple mails from different people. E.g. that "Also see security considerations in section 6." wasn't from me, which was at quotation level 1 in your mail, but the example with the automobile, which was also on level 1, was from me. That's kinda confusing... > In general, you should only use UUID's when either: > a. You have absolutely 100% complete control of the storage of them, > such that you can guarantee they don't get reused. > b. They can be guaranteed to be relatively unique for the system > using them. No, this aren't necessary constraints. And in fact would make multi- device practically impossible (you always need some ID, unless you want to open the door for countless of errors, where people wrongly assemble their devices... whether it's UUID or anything else, doesn't matter). The only thing that one needs to do, is handle collisions gracefully and don't do auto-assemblies,.. all as I've described in the mail from "Fri, 11 Dec 2015 23:06:03 +0100" (http://thread.gmane.org/gmane.comp.file-systems.btrfs/50909/focus=51147) > > There could be some leveraging of the device WWN, or absent that > > its > > serial number, propogated into all of the volume's devices (cross > > referencing each other's devid to WWN or serial). And then that way > > there's a way to differentiate. 
In the dd case, there would be > > mismatching real device WWN/serial number and the one written in > > metadata on all drives, including the copy. This doesn't say what > > policy should happen next, just that at least it's known there's a > > mismatch. > > > That gets tricky too, because for example you have stuff like flat > files > used as filesystem images. plus... one cannot be sure whether any hardware device IDs, like serial numbers, are unique... a powerful attacker could surely change these as well. Or imagine you have a failing harddisk, and dd it's content to another... the btrfs part would stay identical, while the hardware device IDs change and confuse everything. > However, if we then use some separate UUID (possibly hashed off of > the > file location) in place of the device serial/WWN, that could > theoretically provide some better protection. Not really... it just delegates the problem one level further. The only real protection is, that the kernel and userland tools deal correctly with the situation. > The obvious solution in > the case of a mismatch would be to refuse the mount until either the > issue is fixed using the tools, or the user specifies some particular > mount option to either fix ti automatically, or ignore copies with a > mismatching serial. Sure, as I've said before :-) Cheers, Chris [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
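The remarks about UUID versions in this message can be tried out with Python's standard `uuid` module. This is only a sketch: the helper `collision_probability` is a name made up here, it uses the usual birthday-bound approximation, and it assumes perfectly random version 4 UUIDs with 122 random bits. It says nothing about the two cases this thread is actually worried about (duplicates created by dd-style copies, and deliberate collisions), which the last lines illustrate. The name `example.org` is an arbitrary placeholder.

```python
import uuid


def collision_probability(n: int, random_bits: int = 122) -> float:
    """Birthday-bound approximation for n uniformly random identifiers."""
    return n * (n - 1) / 2 / 2 ** random_bits


# Version 4: random, but 6 of the 128 bits are fixed (version and variant).
u4 = uuid.uuid4()
print(u4.version)  # 4

# Versions 3 and 5: deterministic hashes of (namespace, name), so anyone who
# knows or guesses the inputs reproduces exactly the same UUID.
a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")
b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")
print(a == b)  # True

# Accidental collision among a billion freshly generated version 4 UUIDs:
# a vanishingly small number, on the order of 1e-19.
p = collision_probability(10 ** 9)
print(p)

# A copied UUID (the dd case) or a deliberately chosen one collides with
# certainty, no matter how it was originally generated.
copied = uuid.UUID(str(u4))
print(copied == u4)  # True
```

The contrast between the last two results is the argument made above: freshness guarantees are largely irrelevant once copies and attackers are in the picture, so the tools have to handle collisions gracefully.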
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-16 12:03 ` Christoph Anton Mitterer @ 2015-12-17 2:43 ` Duncan 0 siblings, 0 replies; 51+ messages in thread From: Duncan @ 2015-12-17 2:43 UTC (permalink / raw) To: linux-btrfs Christoph Anton Mitterer posted on Wed, 16 Dec 2015 13:03:24 +0100 as excerpted: > Human readable lables are basically guaranteed to collide, Heh, not here, tho one could argue that my labels aren't "human readable", I suppose. grep LABEL= /etc/fstab | cut -f1 LABEL=bt0238gcn1+35l0 LABEL=bt0238gcn0+35l0 LABEL=bt0465gsg0+47f0 LABEL=rt0238gcnx+35l0 LABEL=rt0238gcnx+35l1 LABEL=rt0465gsg0+47f0 LABEL=hm0238gcnx+35l0 LABEL=pk0238gcnx+35l0 LABEL=nr0238gcnx+35l0 LABEL=hm0238gcnx+35l1 LABEL=pk0238gcnx+35l1 LABEL=nr0238gcnx+35l1 LABEL=hm0465gsg0+47f0 LABEL=pk0465gsg0+47f0 LABEL=nr0465gsg0+47f0 LABEL=lg0238gcnx+35l0 LABEL=lg0465gsg0+47f0 LABEL=mm0465gsg0+2550 LABEL=mm0465gsg0+2551 #LABEL=sw0465gsg0+47f0 The scheme was originally designed with reiserfs' 15-char limited labels in mind, so it's 15-char. These days I use it for both fs labels and gpt partition names/labels, with the two generally matched except for the device sequential, which is x in the multi-device case. * function: 2 chars bt=boot, hm=home, etc * device-id: 8 uniq-in-scope device id ** size: 5 0238g=238 GiB ** brand: 2 sg=seagate, cn=corsair neutron, etc ** dev-seq: 1 can be more than one 465 GiB seagate * target: 1 +=home workstation, . for the netbook, etc * date: 3 date of original partition creation ** year: 1 last digit of year, gives decade scope ** month 1 1-9abc ** day 1 1-9a-v (2char would be nice here, but...) * func-seq 1 0=working, backup-N 2+8+1+3+1=15 chars =:^) So for example rt0238gcnx+35l0 is root, on 238 GiB Corsair Neutron (multi- devices), targeted at the workstation, with the partitions originally setup on 2013, June (something, whatever l is), working copy. (Hmm... 
Only apropos to this thread due to the tangential btrfs angle, but that's two and a half years ago. Which since that's when I first deployed btrfs permanently, I've been running btrfs for two and a half years now. ... =:^) The function tells me at a glance what it's intended to be used for. The target (which also functions as a visual separator) tells me at a glance where the device is intended to be used. The func-seq tells me at a glance whether I'm dealing with the working copy or what level of backup, and taken together with the function and target, uniquely ID the partition/filesystem "software device". The dev-id is uniq-in-scope, easily IDing size, brand, and number of "hardware device", and size is ridiculously scalable from bytes to PiB and beyond. For multi-device btrfs, dev-seq is "x", while the individual device partitions composing it still have their sequence numbers in their gpt labels. The date (along with size, of course) provides some idea of the age of the device, or at least the partitioning scheme on it, as well as providing more bits of "software device" and overall unique-id. Both sequence numbers can easily and intuitively scale to 61 (1-9a-zA-Z) if needed, and less intuitively a bit higher if it's really necessary. Target would lose its separator status if it scaled too far, but certainly gives me as an individual /reasonable/ number of machines flexibility. This scheme self-evidently and easily scales to a library well into the multi-hundreds if not thousands of physical devices, portable or permanently installed, partitioned up as needed. I haven't yet found the need as my "device library" is small enough, but were I to need to, I could reasonably easily put together a database tracking where various files (and even various versions of those files) are located. With the "software device" and "hardware device" IDed separately, I can easily substitute out or add/remove hardware devices from software devices, or the reverse, as necessary. 
The biggest problem is the 15-char limit; I had to pack the fields rather tighter and more cryptically than I'd have liked, so it's not as easily human readable as I'd have liked. And of course it'd need adapted for deployment scales on the level of facebook/google/nsa, where 60-some device-scaling in the sequence numbers, and the target scaling as well, is pitifully laughable, but it's certainly reasonable on an individual scale, and with a couple revisions for mdraid and btrfs (basically, md for brand when I was doing partitioned mdraid, and substituting x for individual sequence number for multi-device), the scheme has served me surprisingly well over the years since I came up with it, and should continue to do so, I suppose, until I no longer have the need (death, or near-vegetable in a nursing home or whatever). Tho if HP's "the machine" were to ever take off in my lifetime, it could prove somewhat... challenging to the mental and nomenclature model, but that pretty much applies to the entire computer field, both hardware and software, as we know it, so I'm far from alone, there. But, despite the debatable human-readability, it's a h*** of a lot more readable than UUIDs, and works very well indeed in LABEL= usage in fstab, being a h*** of a lot easier to work with there than UUIDs! =:^) -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 51+ messages in thread
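The 15-character label scheme described in this message is regular enough to parse mechanically. The sketch below follows the field widths as listed (2 + 5 + 2 + 1 + 1 + 3 + 1 = 15) and the "1-9a-zA-Z" symbol sequence for the month and day; the `Label` type and `parse_label` function are hypothetical names, not part of any existing tool. Read literally, the month field of the example label decodes to 5, while the post recalls June; the code simply follows the stated encoding and does not try to resolve that.

```python
import string
from typing import NamedTuple

# Value of each symbol in the "1-9a-zA-Z" sequence (1..61).
SYMBOLS = "123456789" + string.ascii_lowercase + string.ascii_uppercase
VALUE = {ch: i + 1 for i, ch in enumerate(SYMBOLS)}


class Label(NamedTuple):
    function: str    # 2 chars, e.g. "rt" = root, "hm" = home
    size: str        # 5 chars, e.g. "0238g" = 238 GiB
    brand: str       # 2 chars, e.g. "cn" = corsair neutron
    dev_seq: str     # 1 char, "x" for multi-device
    target: str      # 1 char, "+" = home workstation
    year_digit: int  # last digit of the year
    month: int       # decoded from 1-9abc
    day: int         # decoded from 1-9a-v
    func_seq: str    # 1 char, "0" = working copy


def parse_label(label: str) -> Label:
    """Split a 15-character label into the fields described in the post."""
    if len(label) != 15:
        raise ValueError(f"expected 15 characters, got {len(label)}")
    return Label(
        function=label[0:2],
        size=label[2:7],
        brand=label[7:9],
        dev_seq=label[9],
        target=label[10],
        year_digit=int(label[11]),
        month=VALUE[label[12]],
        day=VALUE[label[13]],
        func_seq=label[14],
    )


parsed = parse_label("rt0238gcnx+35l0")
print(parsed.function, parsed.size, parsed.brand, parsed.dev_seq)  # rt 0238g cn x
print(parsed.year_digit, parsed.month, parsed.day)                 # 3 5 21
```

Keeping the device sequence and function sequence as plain characters preserves the special values such as `x` without inventing a numeric meaning for them.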
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-14 13:23 ` Austin S. Hemmelgarn 2015-12-14 21:26 ` Chris Murphy @ 2015-12-15 0:08 ` Christoph Anton Mitterer 2015-12-15 14:19 ` Austin S. Hemmelgarn 1 sibling, 1 reply; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-15 0:08 UTC (permalink / raw) To: Austin S. Hemmelgarn, Chris Murphy, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 4049 bytes --] On Mon, 2015-12-14 at 08:23 -0500, Austin S. Hemmelgarn wrote: > The reason that this isn't quite as high of a concern is because > performing this attack requires either root access, or direct > physical > access to the hardware, and in either case, your system is already > compromised. No necessarily. Apart from the ATM image (where most people wouldn't call it compromised, just because it's openly accessible on the street) imageine you're running a VM hosting service, where you allow users to upload images and have them deployed. In the cheap" case these will end up as regular files, where they couldn't do any harm (even if colliding UUIDs)... but even there one would have to expect, that the hypervisor admin may losetup them for whichever reason. But if you offer more professional services, you may give your clients e.g. direct access to some storage backend, which are then probably also seen on the host by its kernel. And here we already have the case, that a client could remotely trigger such collision. And remember, things only sounds far-fetched until it actually happens the first time ;) > I still think that that isn't a sufficient excuse for not fixing the > issue, as there are a number of non-security related issues that can > result from this (there are some things that are common practice with > LVM or mdraid that can't be done with BTRFS because of this). Sure, I guess we agree on that,... > > Apart from that, btrfs should be a general purpose fs, and not just > > a > > desktop or server fs. 
> > So edge cases like forensics (where it's common that you create > > bitwise > > identical images) shouln't be forgotten either. > While I would normally agree, there are ways to work around this in > the > forensics case that don't work for any other case (namely, if BTRFS > is > built as a module, you can unmount everything, unload the module, > reload > it, and only scan the devices you want). see below (*) > On that note, why exactly is it better to make the filesystem UUID > such > an integral part of the filesystem? Well I think it's a proper way to e.g. handle the multi-device case. You have n devices, you want to differ them,... using a pseudo-random UUID is surely better than giving them numbers. Same for the fs UUID, e.g. when used for mounting devices whose paths aren't stable. As said before, using the UUID isn't the problem - not protecting against collisions is. > The other thing I'm reading out of > this all, is that by writing a total of 64 bytes to a specific > location > in a single disk in a multi-device BTRFS filesystem, you can make the > whole filesystem fall apart, which is absolutely absurd. Well,... I don't think that writing *into* the filesystem is covered by common practise anymore. In UNIX, a device (which holds the filesystem) is a file. Therefore one can argue: if one copies/duplicates one file (i.e. the fs) neither of the two's contents should get corrupted. But if you actively write *into* the file by yourself,... then you're simply on your own, either you know what you do, or just may just corrupt *that* specific file. Of course it should again not lead to any of it's clones or become corrupted as well. > And some recovery situations (think along the lines of no recovery > disk, > and you only have busybox or something similar to work with). (*) which is however also, why you may not be able to unmount the device anymore or unload btrfs. Maybe you have reasons you must/want to do any forensics in the running system. 
> > AFAIK, there's not even a solution right now that copies a complete btrfs, with snapshots etc., preserving all ref-links. At least nothing official that works in one command.
> Send-receive kind of works for that

I added the "in one command" for exactly that reason... O:-)
If the btrfs has subvols/snapshots, the user would need to do the recursion himself...

Cheers, Chris.
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 0:08 ` Christoph Anton Mitterer @ 2015-12-15 14:19 ` Austin S. Hemmelgarn 2015-12-16 12:56 ` Christoph Anton Mitterer 0 siblings, 1 reply; 51+ messages in thread From: Austin S. Hemmelgarn @ 2015-12-15 14:19 UTC (permalink / raw) To: Christoph Anton Mitterer, Chris Murphy, Btrfs BTRFS On 2015-12-14 19:08, Christoph Anton Mitterer wrote: > On Mon, 2015-12-14 at 08:23 -0500, Austin S. Hemmelgarn wrote: >> The reason that this isn't quite as high of a concern is because >> performing this attack requires either root access, or direct >> physical >> access to the hardware, and in either case, your system is already >> compromised. > No necessarily. > Apart from the ATM image (where most people wouldn't call it > compromised, just because it's openly accessible on the street) Um, no you don't have direct physical access to the hardware with an ATM, at least, not unless you are going to take apart the cover and anything else in your way (and probably set off internal alarms). And even without that, it's still possible to DoS an ATM without much effort. Most of them have a 3.5mm headphone jack for TTS for people with poor vision, and that's more than enough to overload at least part of the system with a relatively simple to put together bit of electronics that would cost you less than 10 USD. > imageine you're running a VM hosting service, where you allow users to > upload images and have them deployed. > In the cheap" case these will end up as regular files, where they > couldn't do any harm (even if colliding UUIDs)... but even there one > would have to expect, that the hypervisor admin may losetup them for > whichever reason. > But if you offer more professional services, you may give your clients > e.g. direct access to some storage backend, which are then probably > also seen on the host by its kernel. > And here we already have the case, that a client could remotely trigger > such collision. 
In that particular situation, it's not relevant unless the host admin goes to mount them. UUID collisions are only an issue if the filesystems get mounted. > > And remember, things only sounds far-fetched until it actually happens > the first time ;) > > >> I still think that that isn't a sufficient excuse for not fixing the >> issue, as there are a number of non-security related issues that can >> result from this (there are some things that are common practice with >> LVM or mdraid that can't be done with BTRFS because of this). > Sure, I guess we agree on that,... > > >>> Apart from that, btrfs should be a general purpose fs, and not just >>> a >>> desktop or server fs. >>> So edge cases like forensics (where it's common that you create >>> bitwise >>> identical images) shouln't be forgotten either. >> While I would normally agree, there are ways to work around this in >> the >> forensics case that don't work for any other case (namely, if BTRFS >> is >> built as a module, you can unmount everything, unload the module, >> reload >> it, and only scan the devices you want). > see below (*) > > >> On that note, why exactly is it better to make the filesystem UUID >> such >> an integral part of the filesystem? > Well I think it's a proper way to e.g. handle the multi-device case. > You have n devices, you want to differ them,... using a pseudo-random > UUID is surely better than giving them numbers. That's debatable, the same issues are obviously present in both cases (individual numbers can collide too). > Same for the fs UUID, e.g. when used for mounting devices whose paths > aren't stable. In the case of a sanely designed system using LVM for example, device paths are stable. > > As said before, using the UUID isn't the problem - not protecting > against collisions is. No, the issues are: 1. We assume that the UUID will be unique for the life of the filesystem, which is not a safe assumption. 2. We don't sanely handle things if it isn't unique. 
> > >> The other thing I'm reading out of >> this all, is that by writing a total of 64 bytes to a specific >> location >> in a single disk in a multi-device BTRFS filesystem, you can make the >> whole filesystem fall apart, which is absolutely absurd. > Well,... I don't think that writing *into* the filesystem is covered by > common practise anymore. For end users, I agree. Part of the discussion involves attacks on the system, and for a attacker it's not a far stretch to write directly to the block device if possible (and it's even common practice for bypassing permission checks done in the VFS layer). > > In UNIX, a device (which holds the filesystem) is a file. Therefore one > can argue: if one copies/duplicates one file (i.e. the fs) neither of > the two's contents should get corrupted. > But if you actively write *into* the file by yourself,... then you're > simply on your own, either you know what you do, or just may just > corrupt *that* specific file. Of course it should again not lead to any > of it's clones or become corrupted as well. My point is that by changing the UUID in a superblock (and properly updating the checksum for the superblock), you can trivially break a multi-device filesystem. And it's a whole lot easier to do that than it is to do the equivalent for LVM. > > >> And some recovery situations (think along the lines of no recovery >> disk, >> and you only have busybox or something similar to work with). > (*) which is however also, why you may not be able to unmount the > device anymore or unload btrfs. > Maybe you have reasons you must/want to do any forensics in the running > system. > > >>> AFAIK, there's not even a solution right now, that copies a >>> complete >>> btrfs, with snapshots, etc. preserving all ref-links. At least >>> nothing >>> official that works in one command. >> Send-receive kind of works for that > I've added the "in one command" for that... O:-) > In case the btrfs would have subvols/snapshots... 
the user would need > to make the recursion himself... ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-15 14:19 ` Austin S. Hemmelgarn @ 2015-12-16 12:56 ` Christoph Anton Mitterer 0 siblings, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-16 12:56 UTC (permalink / raw) To: Austin S. Hemmelgarn, Chris Murphy, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 6035 bytes --] On Tue, 2015-12-15 at 09:19 -0500, Austin S. Hemmelgarn wrote: > Um, no you don't have direct physical access to the hardware with an > ATM, at least, not unless you are going to take apart the cover and > anything else in your way (and probably set off internal alarms). Well access to the services ports (which may be USB) is typically much easier, and doesn't require to completely dismantle the steel and so... Simply because service teams also need to access these "regularly". But even if we don't count ATMs here, use any other publicly accessible computer terminals. Library computer, the entertainment systems in airplanes, TVs in a shopping centre, etc. pp. > And > even without that, it's still possible to DoS an ATM without much > effort. Most of them have a 3.5mm headphone jack for TTS for people > with poor vision, and that's more than enough to overload at least > part > of the system with a relatively simple to put together bit of > electronics that would cost you less than 10 USD. As I've said before,.. you always find another weak link, of course,... as it was pointed out before, USB itself is quite a security problem (firmware attacks and that like). But just because there are other issues, right now, there is no justification to make btrfs "weak" as well... because this just leads to the vicious circle, that everyone has security issues, not willing to solve them, pointing to others as an excuse. > > imageine you're running a VM hosting service, where you allow users > > to > > upload images and have them deployed. 
> > In the "cheap" case these will end up as regular files, where they > > couldn't do any harm (even with colliding UUIDs)... but even there > > one > > would have to expect, that the hypervisor admin may losetup them > > for > > whichever reason. > > But if you offer more professional services, you may give your > > clients > > e.g. direct access to some storage backend, which is then probably > > also seen on the host by its kernel. > > And here we already have the case, that a client could remotely > > trigger > > such a collision. > In that particular situation, it's not relevant unless the host admin > goes to mount them. UUID collisions are only an issue if the > filesystems get mounted. Hmm, from the impression I got so far, it was not only a problem when actually mounting... but even if it were... this doesn't change the situation. Same problem as before: the host system may have btrfs filesystems whose IDs have leaked, the attacker may upload them as VM images as described above, and even if the host's admin doesn't want to mount those, he may mount what he considers his own filesystems, which however also collide. Boom. Same issues as before. Turn it as you want, resistance is futile ;-) > > Well I think it's a proper way to e.g. handle the multi-device > > case. > > You have n devices, you want to distinguish them,... using a pseudo-random > > UUID is surely better than giving them numbers. > That's debatable, the same issues are obviously present in both cases > (individual numbers can collide too). Sure, as I've said. You always must handle the case of accidentally or maliciously colliding IDs if you count on data integrity and security. But using UUIDs at least makes the chances small that you run into collisions (that users must then manually resolve somehow) *even when* you just create a fresh filesystem, have no attacker, and no dd or the like gets in your way. > > Same for the fs UUID, e.g. when used for mounting devices whose > > paths > > aren't stable.
> In the case of a sanely designed system using LVM for example, device > paths are stable. Well, but LVM itself works with UUIDs again, so you just delegate the problem. And apart from that, with btrfs, I thought, we rather want to avoid using LVM below. > > As said before, using the UUID isn't the problem - not protecting > > against collisions is. > No, the issues are: > 1. We assume that the UUID will be unique for the life of the > filesystem, which is not a safe assumption. > 2. We don't sanely handle things if it isn't unique. Well, isn't that what I've said? At least it's what I've meant ;) > > Well,... I don't think that writing *into* the filesystem is > > covered by > > common practise anymore. > For end users, I agree. Part of the discussion involves attacks on > the > system, and for an attacker it's not a far stretch to write directly > to > the block device if possible (and it's even common practice for > bypassing permission checks done in the VFS layer). Well, but that's something else that I don't think we can cover here. What we must assume is that devices show up with colliding IDs, either by "accident" or means like dd... or by an attacker somehow being able to make them show up (USB, the image upload scenarios I've described before, and so on). If the attacker can however write to *arbitrary* (and not just "his") devices, bypassing checks in the VFS layer or anything else... well, then it's game over. He wouldn't need to bother with a probably complex attack based on colliding btrfs UUIDs - he could simply overwrite the root filesystem and reboot with his own malicious kernel/etc. > My point is that by changing the UUID in a superblock (and properly > updating the checksum for the superblock), you can trivially break a > multi-device filesystem. And it's a whole lot easier to do that than > it > is to do the equivalent for LVM. I'm a bit unsure what you're trying to show: - Arbitrarily writing into a device (e.g.
in the superblock) is IMHO not common practise, and not justified by common or historical use. - If one does however do it (and it doesn't matter if the admin does it or an attacker), one would of course end up in the situation where btrfs should detect this, and refuse mounting, fsck'ing, and the like. Problem solved. Cheers, Chris. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-14 0:27 ` Christoph Anton Mitterer 2015-12-14 13:23 ` Austin S. Hemmelgarn @ 2015-12-14 20:55 ` Chris Murphy 2015-12-15 0:22 ` Christoph Anton Mitterer 1 sibling, 1 reply; 51+ messages in thread From: Chris Murphy @ 2015-12-14 20:55 UTC (permalink / raw) To: Christoph Anton Mitterer; +Cc: Chris Murphy, Btrfs BTRFS On Sun, Dec 13, 2015 at 5:27 PM, Christoph Anton Mitterer <calestyo@scientia.net> wrote: > On Fri, 2015-12-11 at 16:06 -0700, Chris Murphy wrote: >> For anything but a new and empty Btrfs volume > What's the influence of the fs being new/empty? > >> this hypothetical >> attack would be a ton easier to do on LVM and mdadm raid because they >> have a tiny amount of metadata to spoof compared to a Btrfs volume >> with even a little bit of data on it. > Uhm I haven't said that other systems properly handle this kind of > attack. ;-) > Guess that would need to be evaluated... > > >> I think this concern is overblown. > I don't think so. Let me give you an example: There is an attack[0] > against crypto, where the attacker listens via a smartphone's > microphone, and based on the acoustics of a computer where gnupg runs. > This is surely not an attack many people would have considered even > remotely possible, but in fact it works, at least under lab conditions. I'm aware of this proof of concept. I'd put it, and this one, in the realm of a targeted attack, so it's not nearly as likely as other problems needing fixing. That doesn't mean don't understand it better so it can be fixed. It means understand before arriving at risk assessment let alone conclusions. > Apart from that, btrfs should be a general purpose fs, and not just a > desktop or server fs. > So edge cases like forensics (where it's common that you create bitwise > identical images) shouln't be forgotten either. I didn't. I did state there are edge cases, not normal use. 
My criticism of dd for copying a volume is for general purpose copying, not edge cases. > > >> > >If your workflow requires making an exact copy (for the shelf or >> > > for >> > > an emergency) then dd might be OK. But most often it's used >> > > because >> > > it's been easy, not because it's a good practice. >> > Ufff.. I wouldn't go that far to call something here bad or good >> > practice. >> >> It's not just bad practice, it's sufficiently sloppy that it's very >> nearly user sabotage. That this is due to innocent ignorance, and a >> long standing practice that's bad advice being handed down from >> previous generations doesn't absolve the practice and mean we should >> invent esoteric workarounds for what is not a good practice. We have >> all sorts of exhibits why it's not a good idea. > Well if you don't give any real arguments or technical reasons (apart > from "working around software that doesn't handle this well") I > consider this just repetition of the baseless claim that long standing > practise would be bad. I already have, as have others. Does the user want cake or pie? The computer doesn't have that level of granular information when there are two apparently bitwise identical devices. The file system sees them both as dessert, without other distinction. So option a is to simply fail and let the user resolve the ambiguity. Option b is maybe to leverage btrfs check code and find out if there's more to the story, some indication that one of the apparently identical copies isn't really identical. But that's a lot of work for something that probably won't happen. What's more likely is they aren't just apparently identical, they are in fact identical because it's an LVM snapshot or a dd copy that's making them appear identical. That's not something btrfs can resolve alone. Automating the distinction requires more information.
If it's LVM, possibly LVM and Btrfs could work together where LVM LV UUID * Btrfs volume UUID = Btrfs volume UUID' (as in a derivative) and treat it internally with a new temporary UUID that's thrown away afterwards. If it's a raw device, I still see this as the user's problem. They created it, they'll have to help resolve the ambiguity by yanking one of the drives. > Long story short, I think we can agree that - dd or not - corruptions > or attack vectors shouldn't be possible. Yes. -- Chris Murphy ^ permalink raw reply [flat|nested] 51+ messages in thread
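The "LV UUID * volume UUID = volume UUID'" idea in the message above could be sketched roughly as follows. This is only an illustration, assuming a name-based UUID (version 5) as the combining step; the thread does not say how the derivation would be done, and the function name, the LV identifier strings and the example filesystem UUID are all made up for the sketch.

```python
import uuid


def derived_fs_uuid(fs_uuid: uuid.UUID, container_id: str) -> uuid.UUID:
    """Derive a temporary, per-copy identity for a filesystem.

    fs_uuid      -- the UUID stored in the filesystem itself (identical
                    on every bitwise copy)
    container_id -- something that differs per copy, e.g. the identifier
                    of the logical volume holding it (hypothetical input)

    uuid5 hashes the name into the given namespace, so the result is
    deterministic for a given pair and differs between containers.
    """
    return uuid.uuid5(fs_uuid, container_id)


# Hypothetical values, for illustration only.
fs_uuid = uuid.UUID("11111111-2222-3333-4444-555555555555")
origin_id = derived_fs_uuid(fs_uuid, "lv-origin")
snapshot_id = derived_fs_uuid(fs_uuid, "lv-snapshot")
```

Nothing is written back to disk in this sketch; the derived value only exists in memory, which matches the "temporary" part of the proposal. It does nothing for the raw-device case, where no container identifier is available.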
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-14 20:55 ` Chris Murphy @ 2015-12-15 0:22 ` Christoph Anton Mitterer 0 siblings, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-15 0:22 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 3604 bytes --] On Mon, 2015-12-14 at 13:55 -0700, Chris Murphy wrote: > I'm aware of this proof of concept. I'd put it, and this one, in the > realm of a targeted attack, so it's not nearly as likely as other > problems needing fixing. That doesn't mean don't understand it better > so it can be fixed. It means understand before arriving at risk > assessment let alone conclusions. Assessing the actual risk of any such attack vector is IMHO quite difficult... but at least past experience has shown countless times over and over again, that any system, where people already saw it would have issues, were sooner or later actively attacked. Take all the things from online banking... TAN, iTAN... at some point the two-factor auth via mobileTAN were some people already warned, that this would be rather easy to attack... banks and proponents of the system said, that this is rather not realistic in practise. I think alone in Germany we had some 8 million Euros that were stolen by hacking mTANs last year. > I didn't. I did state there are edge cases, not normal use. My > criticism of dd for copying a volume is for general purpose copying, > not edge cases. Sure... but I guess we've never needed to argue about that. If a howto were to be written on "how to best copy a btrfs filesystem" and someone would say "me! take dd"... I'd be surely on your side, sayin "Naaahh... stupid... you copy empty blocks and that like". But here we talk about something completely different... namely all those cases where UUID collisions could happen, including those where a bit-identical copy is, for whichever reason, the best solution. > I already have, as have others. 
So far you've only said it would be bad practise as it wouldn't work well with filesystems that do use UUIDs. I agree with the answer Austin gave you on that. > Does the user want cake or pie? The computer doesn't have that level > of granular information when there are two apparently bitwise > identical devices. I'm quite sure the computer has some concept of device path, and UUID isn't the only way to identify a device. If that were so, then any cloned ext4 would suffer from corruptions as well, as the fs would choose the device based on UUID. btrfs of course does more, especially in the multi-device case,... where it needs to distinguish devices based on their content, not on their path (which may be unstable). But such a case can surely be detected, and as you said yourself below: > So option a is to simply fail and let the user > resolve the ambiguity. ... one could e.g. simply require the user to resolve the situation manually. And I guess that's exactly what I've written here several times in this thread, for mounting situations, for rebuild/fsck/repair/etc. situations. > Option b is maybe to leverage btrfs check code > and find out if there's more to the story, some indication that one > of > the apparently identical copies isn't really identical. Can't believe that this would be possible... if they're bitwise identical, they're bitwise identical; the only thing that distinguishes them is how they're connected, e.g. USB port 1, SATA port 2, etc. But as this is unstable (just swap two SATA disks) it cannot be used. > That's not something btrfs can resolve alone. Sure, I've never demanded that. I always said "handle it gracefully" (i.e. no corruptions, no new mounts, fsck's, etc.) and require the user to manually sort things out. Not automagically determine which of the devices are actually the right ones and use them. Cheers, Chris.
* Re: attacking btrfs filesystems via UUID collisions?
  2015-12-11 22:21 ` Christoph Anton Mitterer
  2015-12-11 22:32   ` Christoph Anton Mitterer
  2015-12-11 23:06   ` Chris Murphy
@ 2015-12-11 23:14   ` Eric Sandeen
  2 siblings, 0 replies; 51+ messages in thread
From: Eric Sandeen @ 2015-12-11 23:14 UTC (permalink / raw)
  To: Christoph Anton Mitterer, Chris Murphy, S.J.; +Cc: Btrfs BTRFS

On 12/11/15 4:21 PM, Christoph Anton Mitterer wrote:
>> Note that Btrfs is not unique, XFS v5 does a very similar thing with
>> volume UUID as well, and resulted in this change:
>> http://oss.sgi.com/pipermail/xfs/2015-April/041267.html
> Do you mean that xfs may suffer from the same issues that we're talking
> about here? If so, one should probably give them a notice.

That was disabled temporarily because changing the fs UUID meant that
every piece of checksummed metadata with an embedded UUID would then
mismatch.

It was fixed (re-allowed) with

  ce748ea xfs: create new metadata UUID field and incompat flag

in the kernel and

  9c4e12f xfsprogs: Add new sb_meta_uuid field, update userspace tools to manipulate it

in xfsprogs.

-Eric

^ permalink raw reply [flat|nested] 51+ messages in thread
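The reason a separate metadata UUID field helps can be shown with a toy model. This is not the XFS on-disk format or its code; the class and field names below are invented, and the only thing taken from the message above is the idea that metadata blocks carry an embedded UUID which must keep matching after the user-visible UUID is changed.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class MetadataBlock:
    stamped_uuid: uuid.UUID  # UUID embedded in the block when it was written
    payload: bytes


@dataclass
class ToyFilesystem:
    fs_uuid: uuid.UUID    # user-visible UUID, the one an admin may change
    meta_uuid: uuid.UUID  # UUID that metadata blocks are checked against
    blocks: list = field(default_factory=list)

    @classmethod
    def create(cls) -> "ToyFilesystem":
        # At creation time both fields hold the same value.
        initial = uuid.uuid4()
        return cls(fs_uuid=initial, meta_uuid=initial)

    def write_block(self, payload: bytes) -> None:
        # Every metadata block is stamped with the metadata UUID.
        self.blocks.append(MetadataBlock(self.meta_uuid, payload))

    def change_uuid(self, new_uuid: uuid.UUID) -> None:
        # Only the user-visible UUID changes; no block is rewritten.
        self.fs_uuid = new_uuid

    def verify(self) -> bool:
        # Blocks are validated against meta_uuid, not fs_uuid.
        return all(b.stamped_uuid == self.meta_uuid for b in self.blocks)


fs = ToyFilesystem.create()
fs.write_block(b"inode table")
old_uuid = fs.fs_uuid
fs.change_uuid(uuid.uuid4())
```

With a single UUID field, `change_uuid` would either have to rewrite every stamped block or leave `verify()` failing, which is the mismatch described above.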
* Re: attacking btrfs filesystems via UUID collisions? 2015-12-09 21:48 ` S.J. 2015-12-10 12:08 ` Austin S Hemmelgarn 2015-12-10 19:42 ` Chris Murphy @ 2015-12-11 22:06 ` Christoph Anton Mitterer 2 siblings, 0 replies; 51+ messages in thread From: Christoph Anton Mitterer @ 2015-12-11 22:06 UTC (permalink / raw) To: linux-btrfs; +Cc: Qu Wenruo, S.J. [-- Attachment #1: Type: text/plain, Size: 8768 bytes --] On Wed, 2015-12-09 at 22:48 +0100, S.J. wrote: > > 3. Some way to fail gracefully, when there's ambiguity that cannot > > be > > resolved. Once there are duplicate devs (dd or lvm snapshots, etc) > > then there's simply no way to resolve the ambiguity automatically, > > and > > the volume should just refuse to rw mount until the user resolves > > the > > ambiguity. I think it's OK to fallback to ro mount (maybe) by > > default > > in such a case rather than totally fail to mount. > About 3: > RO fallback for the second device/partitions is not good. > It won't stop confusing the two partitions, and even if both are RO, > thinking it's ok to read and then reading the wrong data is bad. Adding my two cents about that, just to emphasise it, even though S.J. already covered it: Even romounts, if anything is ambiguous, are evil: Even if the filesystem itself wouldn't be destroyed by that, it could mean that bogus data (or even evil data by an attacker) shows up in the system that is then used and causes damage by being used. In the "accidental" scenario, data from the wrong device could e.g. contain outdated binaries, that still have security holes, or they could contain lists of datasets to be deleted by some software, but since being outdated or simply garbage, the wrong data could be deleted. In the "attacker" scenario,... well again as above, old binaries could get used, or garbage data injected into the system (even if ro) could make it compromised or be used for DoS. 
In general, the longer I think about it, the more I come to the conclusion that any form of auto activation (mounting, assembling, rebuilding, etc.) is kind of dangerous... (see below) And this applies in general, not just when using UUIDs,... but since in btrfs UUIDs are the main criterion for selecting/auto-assembling these devices, it's what applies for us here. We have several stages, where wrong devices could be picked up and lead to damage (either accidentally or as part of a tricky attack): 1) When the system boots, i.e. replacing parts of the system (e.g. root fs) itself. There's little we can do here in general (regardless of UUID, labels or device=/dev/sda,/dev/sdb). If an attacker can exchange one of the devices, he may do evil things. That's bad of course, but I think "fixing" it, is beyond the scope of btrfs. - If e.g. the ATM has an unsecured BIOS/UEFI/bootloader and allows the attacker easily to access these and select which device to boot from,... well than I feel no sorry for the owner (their fault). - If they configure their grub/initrd/etc. to boot LABEL/UUID... well that's certainly handy, but it's also stupid if these boots happen unattended, and there is an way around it (specify the device paths or e.g. /dev/sda)... if the HDDs are properly secured by steel, and attacker cannot use the possibly more easily accessible USB bus. - Another way to partially help here is: use disk dm-crypt and boot/assemble your system based on the dm-crypt devices. E.g. boot from the multi-device-btrfs device=/dev/mapper/crypt1,/dev/mapper/crypt2 and so on. 
As long as the kernel and initrd (which does all that) are secure (which is assumed here), then even when the attacker manages to replace one of the devices, it wouldn't help him, as the he couldn't present a device for which a dm-crypt mapping can be set up (unless he has the keys, but then game's over anyway) => Long story short, if the system boots unattended, then people should not use UUID/LABEL to select the device, if they do, their fault, not btrfs scope. If boots are attended, there's anyway not problem. => IHMO, this conceptually "fixes" (in the sense, that there's nothing to do specifically from the btrfs side) the possible problems of such a system being booted, with an attacker having replaced or added some devices to it (especially when unattended). And also the situation, that such system was left back, in an incomplete multi-device state (i.e. left back unattended with a degraded RAID) In other words, I think any problems, resulting of auto- assembly/activation/mounting, based on UUIDs/device-scanning/etc. that affect the valid system becoming running (i.e. booting) are beyond our scope here. Yes there are problems, but one can at least try to avoid them, by using dm-crypt or device paths instead of LABELS/UUIDs, and properly securing (i.e. steel and so on) the system disks, mainboard, bios, etc. So the remaining issues are those we discussed already before: The system runs already. 1) Further devices show up with colliding UUIDs /device IDs. a) Either none of them are used (mounted, fsck, etc.) already. b) Or some of them are used (mounted, fsck, etc.) already. 2) Further devices show up, that have no UUID / device ID collisions, but that may fit to an already used multi-device btrfs. E.g. in the sense of: I have degraded RAID1 btrfs where my system runs upon. A new device shows up that would fit to that btrfs. 
(1) we already discussed: Effects: - it leads to data corruption - attackers may use it to cause damage or even get out data Possible solutions: If such situations occur: - In case (a) refuse to do (mount, fsck, anything else from the btrfs tools) anything unless the user specified the devices to be used manually (i.e. device=/dev/sda,/dev/sdb), perhaps even checking for, whether the given value, may be accidentally a UUID or label, e.g. /dev/disk/by-uuid/* - In case (b), continue to use the already used/active/assembled devices (because we must assume they actually belong together), refuse to do anything (including mounting, adding to a multi-device fs, starting rebuild, etc. pp.) with the others unless the user manually says so via device=foo,bar,baz (2) is similar to (1), but I think we haven't discussed it already in depth. The effects here are the same as above (i.e. accidental data corruption, or possible attacks), but here they would happen if btrfs would ever automatically assemble/add devices to an already active (possibly degraded) fs. Examples: - I have a degraded RAID6, one disk missing, the system is e.g. unattended and an attacker can plug in a USB stick with IDs that just match perfectly. If btrfs would then start to automatically add that newly appeared device to the fs, being happy about the fact that it can now start to rebuild, we'd have a problem. In that example, because the attacker may use that to get data out of the system. Take the same example without an attacker, a sysadmin may just accidentally plug in wrong HDD, that should actually serve as backup... it would start to get written at (this is why many HW RAID controllers have auto-activation disabled). - One has a *non-degraded* RAID1, and an attacker manages to plug a device with matching IDs... 
If then btrfs would be happy about being able to enlarge the RAID to one more device, and automatically start to use that new device, perhaps even starting a balance, then same problem as above. Possible solutions: Long story short, never do auto-assemblies (i.e. add to an already active fs) in multi-device scenarios. That is, don't do it per default. I'd be fine if it was an option, e.g. a kernel parameter or whatever that enables btrfs to such auto assemblies, and if the documentation clearly explains the possible issues (especially security issues) implied by it.... but it shouln't be the default. - IMHO, a fs should be secure by default, thus I think, adding devices to an already active fs (e.g. for rebuild), should never happen (by default) automatically. - But perhaps it would be useful to have one additional option, which generally disables that (i.e. not just in case of already active devices). That option would make it mandatory in *all* cases, that the user specifies device=/dev/foo,/dev/bar. That behaviour may be preferred for some special use cases, and having a true option for it, may be better than just trying to get it by removing any udev scripts or so (which may get accidentally added back by the distro). > PS: Kudos to C.A. Mitterer for discovering that problem Thanks, guess I have a hand for thinking about such "higher-level" attacks,... unfortunately in most cases the people aren't that open about it as here :-/ Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5313 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
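The policy proposed in the message above (refuse on ambiguity, keep using devices that are already active, never auto-add, accept an explicit device list) can be written down as a small sketch. It is not btrfs code: the `ScannedDevice` record, the function names and the exception are invented for illustration, and the only assumption about btrfs is that each member device carries a filesystem UUID plus a per-device ID.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScannedDevice:
    path: str   # where the device currently shows up, e.g. /dev/sdb
    fsid: str   # filesystem UUID read from the device
    devid: int  # per-device ID within that filesystem


class AmbiguousDevicesError(Exception):
    """More than one device claims the same (fsid, devid) identity."""


def find_collisions(devices):
    """Return {(fsid, devid): [paths]} for identities claimed more than once."""
    claims = defaultdict(list)
    for dev in devices:
        claims[(dev.fsid, dev.devid)].append(dev.path)
    return {key: sorted(paths) for key, paths in claims.items() if len(paths) > 1}


def select_devices(scanned, fsid, explicit_paths=None, active_paths=()):
    """Choose the devices to use for `fsid` without ever guessing.

    explicit_paths -- devices named by the user (device=/dev/foo,/dev/bar)
    active_paths   -- devices of this filesystem that are already in use
    """
    candidates = [d for d in scanned if d.fsid == fsid]
    if explicit_paths is not None:
        # The user resolved the ambiguity by naming devices.
        wanted = set(explicit_paths)
        chosen = [d for d in candidates if d.path in wanted]
    elif active_paths:
        # Case (b) and case (2): stick with what is already in use and
        # ignore newcomers, whether or not their IDs collide.
        active = set(active_paths)
        chosen = [d for d in candidates if d.path in active]
    else:
        # Case (a): nothing active yet, every scanned device is a candidate.
        chosen = candidates
    collisions = find_collisions(chosen)
    if collisions:
        raise AmbiguousDevicesError(f"refusing to assemble {fsid}: {collisions}")
    return chosen


# /dev/sdc is a clone (or a forgery) of /dev/sda.
scanned = [
    ScannedDevice("/dev/sda", "fs-A", 1),
    ScannedDevice("/dev/sdb", "fs-A", 2),
    ScannedDevice("/dev/sdc", "fs-A", 1),
]
```

Here `select_devices(scanned, "fs-A")` raises `AmbiguousDevicesError`, while passing `explicit_paths=["/dev/sda", "/dev/sdb"]` or `active_paths=["/dev/sda", "/dev/sdb"]` returns the two named devices and leaves `/dev/sdc` alone. Even an explicit list is rejected if it still contains a colliding pair.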