* LVM snapshot broke between 4.14 and 4.16 @ 2018-08-02 12:26 WGH 2018-08-02 13:31 ` Ilya Dryomov 0 siblings, 1 reply; 46+ messages in thread From: WGH @ 2018-08-02 12:26 UTC (permalink / raw) To: axboe, idryomov; +Cc: linux-block, linux-kernel, sagi (I originally reported this problem here: https://bugzilla.kernel.org/show_bug.cgi?id=200439) When I updated from 4.14 to 4.16, my LVM snapshotting script broke for no apparent reason. My script has the following line, and it fails like this: + lvcreate --size 5G --snapshot --name snap0 --permission r /dev/mapper/vg0-lvol_rootfs device-mapper: create ioctl on vg0-snap0-cowLVM-sDdIeh9cecWdaNyRfZC31mxgfwTa4sOeHMJXVOykGVRtfP6Aii7IHvwS066AOLOM-cow failed: Device or resource busy Failed to lock logical volume vg0/lvol_rootfs. Aborting. Manual intervention required. At the same time, some errors appear in dmesg as well: [ 26.145279] generic_make_request: Trying to write to read-only block-device dm-3 (partno 0) [ 26.145288] device-mapper: persistent snapshot: write_header failed [ 26.145847] device-mapper: table: 253:4: snapshot: Failed to read snapshot metadata [ 26.145851] device-mapper: ioctl: error adding target to table I bisected the vanilla kernel, and the first bad commit is [721c7fc701c71f693307d274d2b346a1ecd4a534] block: fail op_is_write() requests to read-only partitions ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 12:26 LVM snapshot broke between 4.14 and 4.16 WGH @ 2018-08-02 13:31 ` Ilya Dryomov 0 siblings, 0 replies; 46+ messages in thread From: Ilya Dryomov @ 2018-08-02 13:31 UTC (permalink / raw) To: wgh Cc: Jens Axboe, linux-block, linux-kernel, Sagi Grimberg, Mike Snitzer, dm-devel On Thu, Aug 2, 2018 at 2:26 PM WGH <wgh@torlan.ru> wrote: > > (I originally reported this problem here: > https://bugzilla.kernel.org/show_bug.cgi?id=200439) > > When I updated from 4.14 to 4.16, my LVM snapshotting script broke for > no apparent reason. > > My script has the following line, and it fails like this: > + lvcreate --size 5G --snapshot --name snap0 --permission r > /dev/mapper/vg0-lvol_rootfs > device-mapper: create ioctl on > vg0-snap0-cowLVM-sDdIeh9cecWdaNyRfZC31mxgfwTa4sOeHMJXVOykGVRtfP6Aii7IHvwS066AOLOM-cow > failed: Device or resource busy > Failed to lock logical volume vg0/lvol_rootfs. > Aborting. Manual intervention required. > > At the same time, some errors appear in dmesg as well: > [ 26.145279] generic_make_request: Trying to write to read-only > block-device dm-3 (partno 0) > [ 26.145288] device-mapper: persistent snapshot: write_header failed > [ 26.145847] device-mapper: table: 253:4: snapshot: Failed to read > snapshot metadata > [ 26.145851] device-mapper: ioctl: error adding target to table > > I bisected the vanilla kernel, and the first bad commit is > [721c7fc701c71f693307d274d2b346a1ecd4a534] block: fail op_is_write() > requests to read-only partitions Adding Mike and dm-devel. >From a quick look, --permission r sets DM_READONLY_FLAG, which makes dm mark the disk read-only with set_disk_ro(dm_disk(md), 1) in do_resume(). A bit later it tries to write to the disk from write_header(): return chunk_io(ps, ps->header_area, 0, REQ_OP_WRITE, 0, 1); Thanks, Ilya ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 @ 2018-08-02 13:31 ` Ilya Dryomov 0 siblings, 0 replies; 46+ messages in thread From: Ilya Dryomov @ 2018-08-02 13:31 UTC (permalink / raw) To: wgh Cc: Jens Axboe, linux-block, linux-kernel, Sagi Grimberg, Mike Snitzer, dm-devel On Thu, Aug 2, 2018 at 2:26 PM WGH <wgh@torlan.ru> wrote: > > (I originally reported this problem here: > https://bugzilla.kernel.org/show_bug.cgi?id=200439) > > When I updated from 4.14 to 4.16, my LVM snapshotting script broke for > no apparent reason. > > My script has the following line, and it fails like this: > + lvcreate --size 5G --snapshot --name snap0 --permission r > /dev/mapper/vg0-lvol_rootfs > device-mapper: create ioctl on > vg0-snap0-cowLVM-sDdIeh9cecWdaNyRfZC31mxgfwTa4sOeHMJXVOykGVRtfP6Aii7IHvwS066AOLOM-cow > failed: Device or resource busy > Failed to lock logical volume vg0/lvol_rootfs. > Aborting. Manual intervention required. > > At the same time, some errors appear in dmesg as well: > [ 26.145279] generic_make_request: Trying to write to read-only > block-device dm-3 (partno 0) > [ 26.145288] device-mapper: persistent snapshot: write_header failed > [ 26.145847] device-mapper: table: 253:4: snapshot: Failed to read > snapshot metadata > [ 26.145851] device-mapper: ioctl: error adding target to table > > I bisected the vanilla kernel, and the first bad commit is > [721c7fc701c71f693307d274d2b346a1ecd4a534] block: fail op_is_write() > requests to read-only partitions Adding Mike and dm-devel. From a quick look, --permission r sets DM_READONLY_FLAG, which makes dm mark the disk read-only with set_disk_ro(dm_disk(md), 1) in do_resume(). A bit later it tries to write to the disk from write_header(): return chunk_io(ps, ps->header_area, 0, REQ_OP_WRITE, 0, 1); Thanks, Ilya ^ permalink raw reply [flat|nested] 46+ messages in thread
* LVM snapshot broke between 4.14 and 4.16 2018-08-02 13:31 ` Ilya Dryomov (?) @ 2018-08-02 15:10 ` WGH 2018-08-02 16:41 ` Linus Torvalds -1 siblings, 1 reply; 46+ messages in thread From: WGH @ 2018-08-02 15:10 UTC (permalink / raw) To: Ilya Dryomov Cc: Jens Axboe, linux-block, linux-kernel, Sagi Grimberg, Mike Snitzer, dm-devel On 08/02/2018 04:31 PM, Ilya Dryomov wrote: > On Thu, Aug 2, 2018 at 2:26 PM WGH <wgh@torlan.ru> wrote: >> (I originally reported this problem here: >> https://bugzilla.kernel.org/show_bug.cgi?id=200439) >> >> When I updated from 4.14 to 4.16, my LVM snapshotting script broke for >> no apparent reason. >> >> My script has the following line, and it fails like this: >> + lvcreate --size 5G --snapshot --name snap0 --permission r >> /dev/mapper/vg0-lvol_rootfs >> device-mapper: create ioctl on >> vg0-snap0-cowLVM-sDdIeh9cecWdaNyRfZC31mxgfwTa4sOeHMJXVOykGVRtfP6Aii7IHvwS066AOLOM-cow >> failed: Device or resource busy >> Failed to lock logical volume vg0/lvol_rootfs. >> Aborting. Manual intervention required. >> >> At the same time, some errors appear in dmesg as well: >> [ 26.145279] generic_make_request: Trying to write to read-only >> block-device dm-3 (partno 0) >> [ 26.145288] device-mapper: persistent snapshot: write_header failed >> [ 26.145847] device-mapper: table: 253:4: snapshot: Failed to read >> snapshot metadata >> [ 26.145851] device-mapper: ioctl: error adding target to table >> >> I bisected the vanilla kernel, and the first bad commit is >> [721c7fc701c71f693307d274d2b346a1ecd4a534] block: fail op_is_write() >> requests to read-only partitions > Adding Mike and dm-devel. > > From a quick look, --permission r sets DM_READONLY_FLAG, which makes dm > mark the disk read-only with set_disk_ro(dm_disk(md), 1) in do_resume(). > A bit later it tries to write to the disk from write_header(): > > return chunk_io(ps, ps->header_area, 0, REQ_OP_WRITE, 0, 1); > > Thanks, > > Ilya After further investigation, this was fixed on lvm2 side (userspace) in https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3 (snapshot: keep COW writable for read-only volumes). So I guess that's it. Time to poke my distribution package maintainers to bump the package version. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 15:10 ` WGH @ 2018-08-02 16:41 ` Linus Torvalds 2018-08-02 18:18 ` Ilya Dryomov 0 siblings, 1 reply; 46+ messages in thread From: Linus Torvalds @ 2018-08-02 16:41 UTC (permalink / raw) To: wgh Cc: Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, Mike Snitzer, dm-devel On Thu, Aug 2, 2018 at 8:16 AM WGH <wgh@torlan.ru> wrote: > > On 08/02/2018 04:31 PM, Ilya Dryomov wrote: > > > > From a quick look, --permission r sets DM_READONLY_FLAG, which makes dm > > mark the disk read-only with set_disk_ro(dm_disk(md), 1) in do_resume(). > > A bit later it tries to write to the disk from write_header(): > > > > return chunk_io(ps, ps->header_area, 0, REQ_OP_WRITE, 0, 1); > > > > Thanks, > > > > Ilya > > After further investigation, this was fixed on lvm2 side (userspace) in > https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3 > (snapshot: keep COW writable for read-only volumes). > > So I guess that's it. Time to poke my distribution package maintainers > to bump the package version. That is *not* how kernel development is supposed to work. If your script used to work, it should continue to work. Why did it use to work despite that read-only flag? And what was it that actually broke this? We remain bug-for-bug compatible with older kernel versions when people depend on the bugs. Unless the old bugs are security issues, and even then we try to make it _look_ like we work the same way. Or was it a user-space lvm tool that broke in the first place, and the kernel release update was a red herring? Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 16:41 ` Linus Torvalds @ 2018-08-02 18:18 ` Ilya Dryomov 2018-08-02 18:32 ` Linus Torvalds 0 siblings, 1 reply; 46+ messages in thread From: Ilya Dryomov @ 2018-08-02 18:18 UTC (permalink / raw) To: Linus Torvalds Cc: wgh, Jens Axboe, linux-block, linux-kernel, Sagi Grimberg, Mike Snitzer, dm-devel On Thu, Aug 2, 2018 at 6:41 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Aug 2, 2018 at 8:16 AM WGH <wgh@torlan.ru> wrote: > > > > On 08/02/2018 04:31 PM, Ilya Dryomov wrote: > > > > > > From a quick look, --permission r sets DM_READONLY_FLAG, which makes dm > > > mark the disk read-only with set_disk_ro(dm_disk(md), 1) in do_resume(). > > > A bit later it tries to write to the disk from write_header(): > > > > > > return chunk_io(ps, ps->header_area, 0, REQ_OP_WRITE, 0, 1); > > > > > > Thanks, > > > > > > Ilya > > > > After further investigation, this was fixed on lvm2 side (userspace) in > > https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3 > > (snapshot: keep COW writable for read-only volumes). > > > > So I guess that's it. Time to poke my distribution package maintainers > > to bump the package version. > > That is *not* how kernel development is supposed to work. If your > script used to work, it should continue to work. > > Why did it use to work despite that read-only flag? And what was it > that actually broke this? I think it was my commit 721c7fc701c7 ("block: fail op_is_write() requests to read-only partitions"). The block layer was previously allowing non-blkdev_write_iter() writes to read-only disks and partitions. dm's chunk_io() boils down to submit_bio(), so in 4.16 it started to fail if the underlying device was marked read-only. > > We remain bug-for-bug compatible with older kernel versions when > people depend on the bugs. Unless the old bugs are security issues, > and even then we try to make it _look_ like we work the same way. > > Or was it a user-space lvm tool that broke in the first place, and the > kernel release update was a red herring? Apparently at least some versions of lvm(8) instruct dm to mark the COW device read-only even though it is always written to. It comes up only if the user asks for a read-only snapshot (the default is writable). One option might be to ignore the supplied DM_READONLY_FLAG for COW devices. Marking the COW device (i.e. the exception store) read-only is probably never sane... Thanks, Ilya ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 18:18 ` Ilya Dryomov @ 2018-08-02 18:32 ` Linus Torvalds 2018-08-02 21:32 ` WGH 0 siblings, 1 reply; 46+ messages in thread From: Linus Torvalds @ 2018-08-02 18:32 UTC (permalink / raw) To: Ilya Dryomov Cc: wgh, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, Mike Snitzer, dm-devel On Thu, Aug 2, 2018 at 11:18 AM Ilya Dryomov <idryomov@gmail.com> wrote: > > I think it was my commit 721c7fc701c7 ("block: fail op_is_write() > requests to read-only partitions"). The block layer was previously > allowing non-blkdev_write_iter() writes to read-only disks and > partitions. dm's chunk_io() boils down to submit_bio(), so in 4.16 > it started to fail if the underlying device was marked read-only. Ugh. That commit does look like the obviously right thing to do, and our old behavior was obviously wrong. At the same time, we really really shouldn't break user flow, even if that flow was due to an obvious bug. Dang. > Apparently at least some versions of lvm(8) instruct dm to mark the COW > device read-only even though it is always written to. It comes up only > if the user asks for a read-only snapshot (the default is writable). > > One option might be to ignore the supplied DM_READONLY_FLAG for COW > devices. Marking the COW device (i.e. the exception store) read-only > is probably never sane... That sounds like possibly the sanest fixlet for this - perhaps together with a warning message saying "ignoring DM_READONLY_FLAG for COW device" so that people see that they are doing insane things. But some of our internal DM_READONLY_FLAG use obviously comes from other sources that we have to honor (ie some grepping shows me code like if (get_disk_ro(disk)) param->flags |= DM_READONLY_FLAG; where we obviously can *not* override the get_disk_ro() state. So it depends a lot on how this all happened. But it looks like in this particular case, it all came from one inconsistent lvmcreate command. Or - if there really is only _one_ user that noticed this, and a new lvm2 release fixes his problem, maybe we can say "we broke it, but the only person who reported it can work around it". WGH (sorry, no idea what your real name is) - what's the source of the script that broke? Was it some system script you got from outside and likely to affect others too? Or was it just some local thing you wrote yourself and was unintentionally buggy and nobody else is likely to hit this? Because if the latter, if you can work around it and you're the only user this hits, we might just be able to ignore it. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 18:32 ` Linus Torvalds @ 2018-08-02 21:32 ` WGH 2018-08-02 21:39 ` WGH 0 siblings, 1 reply; 46+ messages in thread From: WGH @ 2018-08-02 21:32 UTC (permalink / raw) To: Linus Torvalds, Ilya Dryomov Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, Mike Snitzer, dm-devel On 08/02/2018 09:32 PM, Linus Torvalds wrote: > WGH (sorry, no idea what your real name is) - what's the source of the > script that broke? Was it some system script you got from outside and > likely to affect others too? > > Or was it just some local thing you wrote yourself and was > unintentionally buggy and nobody else is likely to hit this? > > Because if the latter, if you can work around it and you're the only > user this hits, we might just be able to ignore it. The script in question is written by me and contains literally two lines: lvcreate --size 5G --snapshot --name snap0 --permission r /dev/mapper/vg0-lvol_rootfs mount /dev/mapper/vg0-snap0 /mnt/rootfs_snapshot The script is not buggy (I think), it was written under simple assumption that --permission r works. And it does - unless you happen to have combination of kernel >=4.16 and lvm2 <2.02.178. The commit in question appeared only in 4.16, and this kernel version is not widespread yet. You have to be running both recent kernel and not-so-recent lvm2 to be bitten by this. This probably explains why no one else reported this problem. Workaround certainly exists: I can just create read-write snapshot, but mount it read-only. The reason why I didn't discover the workaround earlier is that after unsuccessful read-only snapshot creation system ends up in some weird state where read-write snapshots are also failing with the same error (until reboot). So I got a wrong impression that read-write snapshots were broken as well. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 21:32 ` WGH @ 2018-08-02 21:39 ` WGH 2018-08-02 21:52 ` Linus Torvalds 0 siblings, 1 reply; 46+ messages in thread From: WGH @ 2018-08-02 21:39 UTC (permalink / raw) To: Linus Torvalds, Ilya Dryomov Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, Mike Snitzer, dm-devel On 08/03/2018 12:32 AM, WGH wrote: > On 08/02/2018 09:32 PM, Linus Torvalds wrote: >> WGH (sorry, no idea what your real name is) - what's the source of the >> script that broke? Was it some system script you got from outside and >> likely to affect others too? >> >> Or was it just some local thing you wrote yourself and was >> unintentionally buggy and nobody else is likely to hit this? >> >> Because if the latter, if you can work around it and you're the only >> user this hits, we might just be able to ignore it. > The script in question is written by me and contains literally two lines: > > lvcreate --size 5G --snapshot --name snap0 --permission r > /dev/mapper/vg0-lvol_rootfs > mount /dev/mapper/vg0-snap0 /mnt/rootfs_snapshot > > The script is not buggy (I think), it was written under simple > assumption that --permission r works. And it does - unless you happen to > have combination of kernel >=4.16 and lvm2 <2.02.178. > > The commit in question appeared only in 4.16, and this kernel version is > not widespread yet. You have to be running both recent kernel and > not-so-recent lvm2 to be bitten by this. This probably explains why no > one else reported this problem. > > Workaround certainly exists: I can just create read-write snapshot, but > mount it read-only. The reason why I didn't discover the workaround > earlier is that after unsuccessful read-only snapshot creation system > ends up in some weird state where read-write snapshots are also failing > with the same error (until reboot). So I got a wrong impression that > read-write snapshots were broken as well. > I've just found one public report of this bug, though: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442 ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 21:39 ` WGH @ 2018-08-02 21:52 ` Linus Torvalds 2018-08-03 13:31 ` Mike Snitzer 2018-08-03 13:31 ` Zdenek Kabelac 0 siblings, 2 replies; 46+ messages in thread From: Linus Torvalds @ 2018-08-02 21:52 UTC (permalink / raw) To: wgh Cc: Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, Mike Snitzer, dm-devel On Thu, Aug 2, 2018 at 2:39 PM WGH <wgh@torlan.ru> wrote: > > I've just found one public report of this bug, though: > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442 Yeah, it does sound like we should fix this issue. Ilya? Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 21:52 ` Linus Torvalds @ 2018-08-03 13:31 ` Mike Snitzer 2018-08-03 15:20 ` [dm-devel] " Theodore Y. Ts'o 2018-08-03 13:31 ` Zdenek Kabelac 1 sibling, 1 reply; 46+ messages in thread From: Mike Snitzer @ 2018-08-03 13:31 UTC (permalink / raw) To: Linus Torvalds Cc: wgh, Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, dm-devel On Thu, Aug 02 2018 at 5:52pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Aug 2, 2018 at 2:39 PM WGH <wgh@torlan.ru> wrote: > > > > I've just found one public report of this bug, though: > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442 > > Yeah, it does sound like we should fix this issue. Debian is notorious for having a stale and/or custom lvm2. Generally speaking, it is recommended that lvm2 not be older than the kernel (but the opposite is fine). Given the narrow scope of users impacted (read-only snapshot creation is very niche, certainly not the default): I don't think it is worth making fiddly changes in dm-snapshot to warn the user and override the permissions for the cow device. Mike ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 13:31 ` Mike Snitzer @ 2018-08-03 15:20 ` Theodore Y. Ts'o 2018-08-03 18:39 ` Mike Snitzer 2018-08-03 19:56 ` [dm-devel] " Alasdair G Kergon 0 siblings, 2 replies; 46+ messages in thread From: Theodore Y. Ts'o @ 2018-08-03 15:20 UTC (permalink / raw) To: Mike Snitzer Cc: Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 03, 2018 at 09:31:03AM -0400, Mike Snitzer wrote: > > Debian is notorious for having a stale and/or custom lvm2. > Generally speaking, it is recommended that lvm2 not be older than the > kernel (but the opposite is fine). On Fri, Aug 03, 2018 at 03:31:18PM +0200, Zdenek Kabelac wrote: > IMHO (as the author of fixing lvm2 patch) user should not be upgrading > kernels and keep running older lvm2 user-land tool (and there are very good > reasons for this). I'm going to have to strenuously disagree. In *general* it's quite common for users to update their kernel without updating their userspace. For example, I as a *developer*, I am often running bleeding kernels (e.g., at the moment I am running something based on 4.18-rc6 on a Debian testing system; and it's not at all uncommon for users to run a newer kernel on older distribution). This is the *first* I've heard that I should be continuously updating lvm because I'm running bleeding edge kernels --- and I would claim that this is entirely unreasonable. I'll also note that very often users will update kernels while running distribution userspace. And if you are using Linode, very often *Linode* will offer a newer kernel to better take advantage of the Linode VM, and this is done without needing to install the Linode kernel into the userspace. It *used* to be the case that users running RHEL 2 or RHEL 3 could try updating to the latest upstream kernel, and everything would break and fall apart. This was universally considered to be a failure, and a Bad Thing. So if LVM2 is not backwards compatible, and breaks in the face of newer kernels running older distributions, that is a bug. If there is a fundamental bug in the userspace API, and it can't be fixed without a serious security bug, sometimes we need to have an exception to the "you can't mandate newer userspace" rule. But I don't think this falls into this category; how would a user "exploit" what people are calling a "security bug" to break root? - Ted ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 15:20 ` [dm-devel] " Theodore Y. Ts'o @ 2018-08-03 18:39 ` Mike Snitzer 2018-08-03 18:57 ` Linus Torvalds 2018-08-03 19:56 ` [dm-devel] " Alasdair G Kergon 1 sibling, 1 reply; 46+ messages in thread From: Mike Snitzer @ 2018-08-03 18:39 UTC (permalink / raw) To: Theodore Y. Ts'o, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 03 2018 at 11:20am -0400, Theodore Y. Ts'o <tytso@mit.edu> wrote: > On Fri, Aug 03, 2018 at 09:31:03AM -0400, Mike Snitzer wrote: > > > > Debian is notorious for having a stale and/or custom lvm2. > > Generally speaking, it is recommended that lvm2 not be older than the > > kernel (but the opposite is fine). > > On Fri, Aug 03, 2018 at 03:31:18PM +0200, Zdenek Kabelac wrote: > > IMHO (as the author of fixing lvm2 patch) user should not be upgrading > > kernels and keep running older lvm2 user-land tool (and there are very good > > reasons for this). > > I'm going to have to strenuously disagree. > > In *general* it's quite common for users to update their kernel > without updating their userspace. For example, I as a *developer*, I > am often running bleeding kernels (e.g., at the moment I am running > something based on 4.18-rc6 on a Debian testing system; and it's not > at all uncommon for users to run a newer kernel on older > distribution). > > This is the *first* I've heard that I should be continuously updating > lvm because I'm running bleeding edge kernels --- and I would claim > that this is entirely unreasonable. It isn't a hard requirement. But relative to a newer kernel, you simply cannot deny that a newer lvm2 will work better than on older lvm2. Not even speaking about this block regression. Lessons are learned, fixes are made, support for the newer kernel's targets are available, etc etc. That is where the suggestion to keep lvm2 updated along with the kernel comes from. It isn't about "oh we regress _all_ the time.. screw users!". > I'll also note that very often users will update kernels while running > distribution userspace. And if you are using Linode, very often > *Linode* will offer a newer kernel to better take advantage of the > Linode VM, and this is done without needing to install the Linode > kernel into the userspace. > > It *used* to be the case that users running RHEL 2 or RHEL 3 could try > updating to the latest upstream kernel, and everything would break and > fall apart. This was universally considered to be a failure, and a > Bad Thing. So if LVM2 is not backwards compatible, and breaks in the > face of newer kernels running older distributions, that is a bug. Please stop with the overreaction and making this something it isn't. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 18:39 ` Mike Snitzer @ 2018-08-03 18:57 ` Linus Torvalds 2018-08-03 19:06 ` Mike Snitzer 0 siblings, 1 reply; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 18:57 UTC (permalink / raw) To: Mike Snitzer Cc: Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 3, 2018 at 11:39 AM Mike Snitzer <snitzer@redhat.com> wrote: > > Please stop with the overreaction and making this something it isn't. It's not an overreaction when people get their scripts broken, and some developers then argue "that's not a serious bug". Guys, this needs to be fixed. With all the stupid and fundamentyally incorrect excuses, I am now no longer even willing to maintain any other course of action. If you develop for the Linux kernel, you need to realize that "breaking user space" is simply not acceptable. And if you cannot live with that, then you should stop working on the kernel. Because I will refuse to continue to pull from you as a developer. At worst, I'll just revert the original commit entirely. I was hoping we'd be able to avoid that, partly because the commit looks fine, but partly because it also doesn't revert cleanly. Or I'll just do something like this, since it seems like it's the lvm people who have the hardest time with understanding the simple rules: diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c index b810ea77e6b1..fcfab812e025 100644 --- a/drivers/md/dm-ioctl.c +++ b/drivers/md/dm-ioctl.c @@ -1049,7 +1049,12 @@ static int do_resume(struct dm_ioctl *param) return PTR_ERR(old_map); } - if (dm_table_get_mode(new_map) & FMODE_WRITE) + /* + * This used to do + * dm_table_get_mode(new_map) & FMODE_WRITE + * but the lvm tools got this wrong, and will + * continue to write to "read-only" volumes. + if (0) set_disk_ro(dm_disk(md), 0); else set_disk_ro(dm_disk(md), 1); which seems to target the actual problem more directly. Linus ^ permalink raw reply related [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 18:57 ` Linus Torvalds @ 2018-08-03 19:06 ` Mike Snitzer 2018-08-03 19:11 ` Linus Torvalds 2018-08-03 19:22 ` Linus Torvalds 0 siblings, 2 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-03 19:06 UTC (permalink / raw) To: Linus Torvalds Cc: Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 03 2018 at 2:57pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Aug 3, 2018 at 11:39 AM Mike Snitzer <snitzer@redhat.com> wrote: > > > > Please stop with the overreaction and making this something it isn't. > > It's not an overreaction when people get their scripts broken, and > some developers then argue "that's not a serious bug". > > Guys, this needs to be fixed. With all the stupid and fundamentyally > incorrect excuses, I am now no longer even willing to maintain any > other course of action. > > If you develop for the Linux kernel, you need to realize that > "breaking user space" is simply not acceptable. And if you cannot live > with that, then you should stop working on the kernel. Because I will > refuse to continue to pull from you as a developer. WTF!? > At worst, I'll just revert the original commit entirely. I was hoping > we'd be able to avoid that, partly because the commit looks fine, but > partly because it also doesn't revert cleanly. > > Or I'll just do something like this, since it seems like it's the lvm > people who have the hardest time with understanding the simple rules: I'll be your whipping boy all you like. But you're making Zdenek's response into mine and threathening to no longer pull from me. Over what? A block regression that an lvm2 developer papered over. > diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c > index b810ea77e6b1..fcfab812e025 100644 > --- a/drivers/md/dm-ioctl.c > +++ b/drivers/md/dm-ioctl.c > @@ -1049,7 +1049,12 @@ static int do_resume(struct dm_ioctl *param) > return PTR_ERR(old_map); > } > > - if (dm_table_get_mode(new_map) & FMODE_WRITE) > + /* > + * This used to do > + * dm_table_get_mode(new_map) & FMODE_WRITE > + * but the lvm tools got this wrong, and will > + * continue to write to "read-only" volumes. > + if (0) > set_disk_ro(dm_disk(md), 0); > else > set_disk_ro(dm_disk(md), 1); > > which seems to target the actual problem more directly. How does that pass for a fix to this issue? That'll unilaterally mark all dm device's readonly. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:06 ` Mike Snitzer @ 2018-08-03 19:11 ` Linus Torvalds 2018-08-03 19:33 ` Mike Snitzer 2018-08-03 19:22 ` Linus Torvalds 1 sibling, 1 reply; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 19:11 UTC (permalink / raw) To: Mike Snitzer Cc: Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 3, 2018 at 12:06 PM Mike Snitzer <snitzer@redhat.com> wrote: > > But you're making Zdenek's response into mine and threathening to no > longer pull from me. No. I'm *very* unhappy about how you seem to think that Zdenek's response was even half-way ok, and you jumped in when Ted said it wasn't. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:11 ` Linus Torvalds @ 2018-08-03 19:33 ` Mike Snitzer 0 siblings, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-03 19:33 UTC (permalink / raw) To: Linus Torvalds Cc: Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 03 2018 at 3:11pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Aug 3, 2018 at 12:06 PM Mike Snitzer <snitzer@redhat.com> wrote: > > > > But you're making Zdenek's response into mine and threathening to no > > longer pull from me. > > No. I'm *very* unhappy about how you seem to think that Zdenek's > response was even half-way ok, and you jumped in when Ted said it > wasn't. I'm inclined not to make people who aren't kernel developers cry over spilt milk. But zero tolerance of kernel regressions is important. And that mantra wasn't properly impressed upon lvm2 developers. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:06 ` Mike Snitzer 2018-08-03 19:11 ` Linus Torvalds @ 2018-08-03 19:22 ` Linus Torvalds 2018-08-04 10:01 ` WGH 1 sibling, 1 reply; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 19:22 UTC (permalink / raw) To: Mike Snitzer Cc: Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac [-- Attachment #1: Type: text/plain, Size: 730 bytes --] On Fri, Aug 3, 2018 at 12:06 PM Mike Snitzer <snitzer@redhat.com> wrote: > > How does that pass for a fix to this issue? > > That'll unilaterally mark all dm device's readonly. Well, if it wasn't honored before anyway, then... But yes, a much more targeted patch would be preferred. And in fact, maybe the right thing to do is to not revert the original commit entirely, but instead just weaken it a lot. Turn the "you did a write request on a RO disk" into a WARN_ON_ONCE() instead of a hard error. Something like the attached patch. WGH, do you build your own kernels? Does this attached (untested) patch make things work for you? It should give a big warning, but let the old behavior through.. Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 664 bytes --] block/blk-core.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index f84a9b7b6f5a..95ca45bc4fc9 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -2155,11 +2155,12 @@ static inline bool bio_check_ro(struct bio *bio, struct hd_struct *part) if (part->policy && op_is_write(bio_op(bio))) { char b[BDEVNAME_SIZE]; - printk(KERN_ERR + WARN_ONCE( "generic_make_request: Trying to write " "to read-only block-device %s (partno %d)\n", bio_devname(bio, b), part->partno); - return true; + /* Older lvm-tools actually trigger this */ + return false; } return false; ^ permalink raw reply related [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:22 ` Linus Torvalds @ 2018-08-04 10:01 ` WGH 2018-08-04 17:04 ` Linus Torvalds 0 siblings, 1 reply; 46+ messages in thread From: WGH @ 2018-08-04 10:01 UTC (permalink / raw) To: Linus Torvalds, Mike Snitzer Cc: Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, Zdenek Kabelac On 08/03/2018 10:22 PM, Linus Torvalds wrote: > And in fact, maybe the right thing to do is to not revert the original > commit entirely, but instead just weaken it a lot. Turn the "you did a > write request on a RO disk" into a WARN_ON_ONCE() instead of a hard > error. > > Something like the attached patch. > > WGH, do you build your own kernels? Does this attached (untested) > patch make things work for you? It should give a big warning, but let > the old behavior through.. The patch works for me. However, there's no text messsage in the kernel log, just a traceback. I think that's because WARN_ONCE is supposed to take condition as a first argument. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-04 10:01 ` WGH @ 2018-08-04 17:04 ` Linus Torvalds 0 siblings, 0 replies; 46+ messages in thread From: Linus Torvalds @ 2018-08-04 17:04 UTC (permalink / raw) To: wgh Cc: Mike Snitzer, Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, Zdenek Kabelac On Sat, Aug 4, 2018 at 3:03 AM WGH <wgh@torlan.ru> wrote: > > > > The patch works for me. > > However, there's no text messsage in the kernel log, just a traceback. I > think that's because WARN_ONCE is supposed to take condition as a first > argument. Duh. It needs to be WARN_ONCE(1, ...); I obviously didn't test that patch, but I _did_ compile it. I wonder why I didn't get a compiler warning for it... [ Goes off and looks ] Oh, because the "bio_devname(bio, b)" argument ended up being interpreted as the format string, and since it was a dynamic string the compiler felt it was all fine. Just bad luck. Anyway, just out of curiosity, what was the traceback? I'm not entirely happy with that patch either (even after the obvious fix to add the "1" argument), but it does seem like the minimal temporary workaround for now. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 @ 2018-08-04 17:04 ` Linus Torvalds 0 siblings, 0 replies; 46+ messages in thread From: Linus Torvalds @ 2018-08-04 17:04 UTC (permalink / raw) To: wgh Cc: Jens Axboe, Theodore Ts'o, Sagi Grimberg, Mike Snitzer, Linux Kernel Mailing List, linux-block, dm-devel, Zdenek Kabelac, Ilya Dryomov On Sat, Aug 4, 2018 at 3:03 AM WGH <wgh@torlan.ru> wrote: > > > > The patch works for me. > > However, there's no text messsage in the kernel log, just a traceback. I > think that's because WARN_ONCE is supposed to take condition as a first > argument. Duh. It needs to be WARN_ONCE(1, ...); I obviously didn't test that patch, but I _did_ compile it. I wonder why I didn't get a compiler warning for it... [ Goes off and looks ] Oh, because the "bio_devname(bio, b)" argument ended up being interpreted as the format string, and since it was a dynamic string the compiler felt it was all fine. Just bad luck. Anyway, just out of curiosity, what was the traceback? I'm not entirely happy with that patch either (even after the obvious fix to add the "1" argument), but it does seem like the minimal temporary workaround for now. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-04 17:04 ` Linus Torvalds (?) @ 2018-08-04 18:19 ` Mike Snitzer -1 siblings, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-04 18:19 UTC (permalink / raw) To: Linus Torvalds Cc: wgh, Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, Zdenek Kabelac On Sat, Aug 04 2018 at 1:04pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Sat, Aug 4, 2018 at 3:03 AM WGH <wgh@torlan.ru> wrote: > > > > > > > The patch works for me. > > > > However, there's no text messsage in the kernel log, just a traceback. I > > think that's because WARN_ONCE is supposed to take condition as a first > > argument. > > Duh. > > It needs to be WARN_ONCE(1, ...); > > I obviously didn't test that patch, but I _did_ compile it. I wonder > why I didn't get a compiler warning for it... > > [ Goes off and looks ] > > Oh, because the "bio_devname(bio, b)" argument ended up being > interpreted as the format string, and since it was a dynamic string > the compiler felt it was all fine. Just bad luck. > > Anyway, just out of curiosity, what was the traceback? > > I'm not entirely happy with that patch either (even after the obvious > fix to add the "1" argument), but it does seem like the minimal > temporary workaround for now. I agree. Please feel free to add my: Acked-by: Mike Snitzer <snitzer@redhat.com> ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-04 17:04 ` Linus Torvalds (?) (?) @ 2018-08-04 20:29 ` WGH -1 siblings, 0 replies; 46+ messages in thread From: WGH @ 2018-08-04 20:29 UTC (permalink / raw) To: Linus Torvalds Cc: Mike Snitzer, Theodore Ts'o, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, Zdenek Kabelac [-- Attachment #1: Type: text/plain, Size: 289 bytes --] On 08/04/2018 08:04 PM, Linus Torvalds wrote: > Anyway, just out of curiosity, what was the traceback? > > I'm not entirely happy with that patch either (even after the obvious > fix to add the "1" argument), but it does seem like the minimal > temporary workaround for now. Here it is. [-- Attachment #2: traceback.log --] [-- Type: text/x-log, Size: 2526 bytes --] [ 68.313024] ------------[ cut here ]------------ [ 68.313030] dm-3 [ 68.313049] WARNING: CPU: 1 PID: 77 at block/blk-core.c:2180 generic_make_request_checks+0x4ac/0x600 [ 68.313054] Modules linked in: nf_tables xt_cgroup xt_owner kvm_intel iwlmvm kvm irqbypass iwlwifi [ 68.313068] CPU: 1 PID: 77 Comm: kworker/1:1 Not tainted 4.17.9-gentoo #3 [ 68.313071] Hardware name: LENOVO 20B6A019RT/20B6A019RT, BIOS GJET91WW (2.41 ) 09/21/2016 [ 68.313077] Workqueue: ksnaphd do_metadata [ 68.313083] RIP: 0010:generic_make_request_checks+0x4ac/0x600 [ 68.313087] RSP: 0018:ffffc900002e7ba8 EFLAGS: 00010282 [ 68.313091] RAX: 0000000000000000 RBX: ffff8801ef6da300 RCX: ffffffff82636218 [ 68.313094] RDX: 0000000000000001 RSI: 0000000000000086 RDI: ffffffff82cf57cc [ 68.313097] RBP: ffff880211c2b150 R08: 0000000000000004 R09: 0000000000000307 [ 68.313100] R10: ffffc900002e7c80 R11: ffffffff82cf780d R12: 0000000000000000 [ 68.313103] R13: 0000000000000001 R14: 0000000000001000 R15: ffff8801ef6da300 [ 68.313107] FS: 0000000000000000(0000) GS:ffff88021f240000(0000) knlGS:0000000000000000 [ 68.313111] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 68.313114] CR2: 000056166d9e8520 CR3: 0000000003a0a006 CR4: 00000000001606e0 [ 68.313117] Call Trace: [ 68.313126] ? scsi_init_sgtable+0x42/0x80 [ 68.313134] ? kmem_cache_alloc+0x170/0x1a0 [ 68.313140] generic_make_request+0x64/0x400 [ 68.313146] ? submit_bio+0x6c/0x140 [ 68.313150] ? bio_alloc_bioset+0xa9/0x1f0 [ 68.313155] submit_bio+0x6c/0x140 [ 68.313160] ? vm_get_page+0x20/0x50 [ 68.313165] dispatch_io+0x287/0x430 [ 68.313171] ? bio_next_page+0xb0/0xb0 [ 68.313175] ? bio_get_page+0x40/0x40 [ 68.313180] ? cpumask_next+0x16/0x20 [ 68.313185] sync_io+0xc3/0x120 [ 68.313191] dm_io+0x1f8/0x220 [ 68.313196] ? bio_next_page+0xb0/0xb0 [ 68.313200] ? bio_get_page+0x40/0x40 [ 68.313204] ? cpumask_next+0x16/0x20 [ 68.313209] do_metadata+0x1d/0x30 [ 68.313216] process_one_work+0x1b9/0x3e0 [ 68.313222] worker_thread+0x2b/0x3c0 [ 68.313227] ? wq_update_unbound_numa+0x10/0x10 [ 68.313232] kthread+0x113/0x130 [ 68.313237] ? kthread_create_worker_on_cpu+0x70/0x70 [ 68.313242] ret_from_fork+0x35/0x40 [ 68.313246] Code: 3e fc ff ff 45 8b a4 24 4c 03 00 00 48 8d 74 24 08 48 89 df c6 05 3a 49 35 01 01 e8 df 68 01 00 48 89 c7 44 89 e6 e8 c4 a0 cd ff <0f> 0b 4c 8b 63 08 e9 0c fc ff ff 80 3d 18 49 35 01 00 0f 85 29 [ 68.313300] ---[ end trace 2a4da4181ebee969 ]--- ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 15:20 ` [dm-devel] " Theodore Y. Ts'o 2018-08-03 18:39 ` Mike Snitzer @ 2018-08-03 19:56 ` Alasdair G Kergon 2018-08-03 20:08 ` Alasdair G Kergon 1 sibling, 1 reply; 46+ messages in thread From: Alasdair G Kergon @ 2018-08-03 19:56 UTC (permalink / raw) To: Theodore Y. Ts'o Cc: Mike Snitzer, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 03, 2018 at 11:20:34AM -0400, Theodore Y. Ts'o wrote: > It *used* to be the case that users running RHEL 2 or RHEL 3 could try > updating to the latest upstream kernel, and everything would break and > fall apart. This was universally considered to be a failure, and a > Bad Thing. So if LVM2 is not backwards compatible, and breaks in the > face of newer kernels running older distributions, that is a bug. Brings back memories! For those who wonder what this is all about, with LVM1, if the version of kernel and userspace didn't match it simply stopped. No booting into the previous version of your kernel after upgrading unless you reverted userspace as well! Led to all sorts of not-so-fancy workarounds. As a reaction to this, LVM2 (through libdevmapper and anything else using the dm ioctls as documented in dm-ioctl.h) passes a version number up to the kernel (to say "I know about kernels with dm up to and including version x.y.z"), so there is an option for an in-kernel workaround here to limit its compatibility mode to the broken userspace versions. Anything passing in version 4.37.0 or earlier (which is the version in dm-ioctl.h when this kernel patch was applied) could be assumed to require the old behaviour. check_version() is where this version is seen, so it would either store it for later checking or do the check and store a flag to invoke compatible behaviour later. Alasdair ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:56 ` [dm-devel] " Alasdair G Kergon @ 2018-08-03 20:08 ` Alasdair G Kergon 2018-08-03 20:42 ` Linus Torvalds 0 siblings, 1 reply; 46+ messages in thread From: Alasdair G Kergon @ 2018-08-03 20:08 UTC (permalink / raw) To: Theodore Y. Ts'o, Mike Snitzer, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 03, 2018 at 08:56:36PM +0100, Alasdair G Kergon wrote: > Anything passing in version 4.37.0 or earlier (which is the version in If taking this approach, it might be better to use the current version i.e. where we add the kernel-side fix. IOW anything compiling against a uapi header taken from a kernel up to now will see the old behaviour, but anything newer (after we next increment the kernel dm version) will be required to change its userspace code if it has this problem in it. The problematic versions of lvm2 ship with versions up to and including 4.36.0 in this field so should be caught either way. Alasdair ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 20:08 ` Alasdair G Kergon @ 2018-08-03 20:42 ` Linus Torvalds 2018-08-03 21:26 ` Alasdair G Kergon 0 siblings, 1 reply; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 20:42 UTC (permalink / raw) To: Theodore Ts'o, Mike Snitzer, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 3, 2018 at 1:08 PM Alasdair G Kergon <agk@redhat.com> wrote: > > If taking this approach, it might be better to use the current version > i.e. where we add the kernel-side fix. IOW anything compiling against > a uapi header taken from a kernel up to now will see the old behaviour, > but anything newer (after we next increment the kernel dm version) will > be required to change its userspace code if it has this problem in it. No, please don't do things like that. It's a nightmare. It just means that different people randomly have different behavior depending on what development environment they had, which is horrible for any situation where you do cross-builds, but it's also horrible from a testing standpoint. And it's a *huge* nightmare for stable releases and backporting fixes. Absolutely *no* user space shjould ever care about kernel versions. If you want to use a new system call or something, just *use* if, and then if it fails, fall back to the old one. Never ever check "what's the kernel version" or anything like that. Instead, what I suggest we do is: Case 1: if a user program notice that a new kernel breaks something, then (a) report it to kernel people with a test-case (b) if you notice that the *reason* a new kernel broke something is because you had a bug and you can just fix that bug and it will always work on all kernels, by all means just fix the bug. Obviously you don't want to just keep buggy code around just because it was found by a kernel change (c) but even if (b) happens, do that report. In fact, you now have a perfect test-case for the kernel people you report it to: you can tell them *exactly* what changed, and what the two situations were. (d) and if whoever you reported the kernel breakage to doesn't seem to take it seriously, just escalate it to me. Note that for the (d) case, there are situations where even _I_ won't necessarily take it seriously. If you're the only user of some program, I may just go "Oh, you already fixed your own problem, I don't need to worry". Or if it turns out that the breakage wasn't "user flow", but some test-suite regression, I will tend to ignore those. Test suites are by definition for finding _behavior_, but behavior can change - it's actual _breakage_ I worry about. But also notice how none of that "case 1" has any versioning related to it. Yes, you'll have old versions of the user program before the bug was fixed, but they will *not* look at kernel versions, and the new versions certainly shouldn't either. Case 2: the kernel side. We get that breakage reported, but it turns out that different versions of the workflow act differently, and maybe some versions depend on the new behavior, and some depend on the old behavior. And yes, this happens. We have been in the situation where we get a bug-report for something we changed three years ago, and by now, all modern user space depends on the new "fixed" behavior, but some embedded user who hadn't upgraded a kernel in years is unhappy. It's happened several times, and if it really is "it's been years", even I will go "tough luck, I guess you're stuck on your old kernel". Because we can't fix it. But in a situation like this, where we really want to encourage new behavior but we have somebody who reported the breakage in a timely manner (ie it's fairly recent), _and_ it's a system tool like lvm2 that actually gives us a version number, at that point the *kernel* might decide "this is an old binary, I will now use some versioning to fall back on old behavior". Because for the kernel, the compatibility really is a #1 concern, and if we have to check version numbers or infer compatibility issues some other way, we'll do it. But that "different behavior for different application versions" is something we generally avoid at all costs. We do have a couple of cases where we deduce what people want based on behavior. But it is very very rare, and we generally want to avoid it if there is _any_ other model for it. So even in case 2, we do try to avoid versioning. More often we add a new flag, and say "hey, if you want the new behavior, use the new flag to say so". Not versioning, but explicit "I want the new behavior" Not all system calls take flags, of course. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 20:42 ` Linus Torvalds @ 2018-08-03 21:26 ` Alasdair G Kergon 0 siblings, 0 replies; 46+ messages in thread From: Alasdair G Kergon @ 2018-08-03 21:26 UTC (permalink / raw) To: Linus Torvalds Cc: Theodore Ts'o, Mike Snitzer, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh, Zdenek Kabelac On Fri, Aug 03, 2018 at 01:42:52PM -0700, Linus Torvalds wrote: > So even in case 2, we do try to avoid versioning. More often we add a > new flag, and say "hey, if you want the new behavior, use the new flag > to say so". Not versioning, but explicit "I want the new behavior" There are spare flags available in dm-ioctl - one could indeed turn on the new more-restrictive behaviour. Alasdair ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-02 21:52 ` Linus Torvalds 2018-08-03 13:31 ` Mike Snitzer @ 2018-08-03 13:31 ` Zdenek Kabelac 2018-08-03 16:37 ` Linus Torvalds 1 sibling, 1 reply; 46+ messages in thread From: Zdenek Kabelac @ 2018-08-03 13:31 UTC (permalink / raw) To: Linus Torvalds, wgh Cc: Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, Mike Snitzer, dm-devel Dne 2.8.2018 v 23:52 Linus Torvalds napsal(a): > On Thu, Aug 2, 2018 at 2:39 PM WGH <wgh@torlan.ru> wrote: >> >> I've just found one public report of this bug, though: >> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442 > > Yeah, it does sound like we should fix this issue. > Hi IMHO (as the author of fixing lvm2 patch) user should not be upgrading kernels and keep running older lvm2 user-land tool (and there are very good reasons for this). Kernel had a bug which has been fixed, lvm2 misused this kernel bug and was also fixed. Keeping kernel bug present allowing certain device to write to read-only devices can be possibly seen as some security risk. Also the number of users who ever create a read-only snapshot is probably very low. Maybe there could be some 'dm-snapshot' loading modinfo option to allowing to create in case user really wants to have this bug being present in kernel (reinstantiate old buggy logic), but on default the user should get error when it tries to write to read-only volume and should upgrade lvm2. Regards Zdenek ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 13:31 ` Zdenek Kabelac @ 2018-08-03 16:37 ` Linus Torvalds 2018-08-03 18:54 ` Mike Snitzer 2018-08-03 19:18 ` Zdenek Kabelac 0 siblings, 2 replies; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 16:37 UTC (permalink / raw) To: Zdenek Kabelac Cc: wgh, Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, Mike Snitzer, dm-devel [ Dammit. I haven't had to shout and curse at people for a while, but this is ABSOLUTELY THE MOST IMPORTANT THING IN THE UNIVERSE WHEN IT COMES TO SOFTWARE DEVELOPMENT ] On Fri, Aug 3, 2018 at 6:31 AM Zdenek Kabelac <zkabelac@redhat.com> wrote: > > IMHO (as the author of fixing lvm2 patch) user should not be upgrading kernels > and keep running older lvm2 user-land tool (and there are very good reasons > for this). Yeah, HELL NO! Guess what? You're wrong. YOU ARE MISSING THE #1 KERNEL RULE. We do not regress, and we do not regress exactly because your are 100% wrong. And the reason you state for your opinion is in fact exactly *WHY* you are wrong. Your "good reasons" are pure and utter garbage. The whole point of "we do not regress" is so that people can upgrade the kernel and never have to worry about it. > Kernel had a bug which has been fixed That is *ENTIRELY* immaterial. Guys, whether something was buggy or not DOES NOT MATTER. Why? Bugs happen. That's a fact of life. Arguing that "we had to break something because we were fixing a bug" is completely insane. We fix tens of bugs every single day, thinking that "fixing a bug" means that we can break something is simply NOT TRUE. So bugs simply aren't even relevant to the discussion. They happen, they get found, they get fixed, and it has nothing to do with "we break users". Because the only thing that matters IS THE USER. How hard is that to understand? Anybody who uses "but it was buggy" as an argument is entirely missing the point. As far as the USER was concerned, it wasn't buggy - it worked for him/her. Maybe it worked *because* the user had taken the bug into account, maybe it worked because the user didn't notice - again, it doesn't matter. It worked for the user. Breaking a user workflow for a "bug" is absolutely the WORST reason for breakage you can imagine. It's basically saying "I took something that worked, and I broke it, but now it's better". Do you not see how f*cking insane that statement is? And without users, your program is not a program, it's a pointless piece of code that you might as well throw away. Seriously. This is *why* the #1 rule for kernel development is "we don't break users". Because "I fixed a bug" is absolutely NOT AN ARGUMENT if that bug fix broke a user setup. You actually introduced a MUCH BIGGER bug by "fixing" something that the user clearly didn't even care about. And dammit, we upgrade the kernel ALL THE TIME without upgrading any other programs at all. It is absolutely required, because flag-days and dependencies are horribly bad. And it is also required simply because I as a kernel developer do not upgrade random other tools that I don't even care about as I develop the kernel, and I want any of my users to feel safe doing the same time. So no. Your rule is COMPLETELY wrong. If you cannot upgrade a kernel without upgrading some other random binary, then we have a problem. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 16:37 ` Linus Torvalds @ 2018-08-03 18:54 ` Mike Snitzer 2018-08-03 19:18 ` Zdenek Kabelac 1 sibling, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-03 18:54 UTC (permalink / raw) To: Linus Torvalds Cc: Zdenek Kabelac, wgh, Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, dm-devel On Fri, Aug 03 2018 at 12:37pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > [ Dammit. I haven't had to shout and curse at people for a while, but > this is ABSOLUTELY THE MOST IMPORTANT THING IN THE UNIVERSE WHEN IT > COMES TO SOFTWARE DEVELOPMENT ] > > On Fri, Aug 3, 2018 at 6:31 AM Zdenek Kabelac <zkabelac@redhat.com> wrote: > > > > IMHO (as the author of fixing lvm2 patch) user should not be upgrading kernels > > and keep running older lvm2 user-land tool (and there are very good reasons > > for this). > > Yeah, HELL NO! > > Guess what? You're wrong. YOU ARE MISSING THE #1 KERNEL RULE. Nobody ever said there wasn't breakage. And yes, Zdenek papered over the regression introduced by commit 721c7fc701c71 in userspace (lvm2) rather than sound the alarm that the kernel regressed. > We do not regress, and we do not regress exactly because your are 100% wrong. We clearly _do_ regress. Hence this thread. > And the reason you state for your opinion is in fact exactly *WHY* you > are wrong. > > Your "good reasons" are pure and utter garbage. > > The whole point of "we do not regress" is so that people can upgrade > the kernel and never have to worry about it. > > > Kernel had a bug which has been fixed > > That is *ENTIRELY* immaterial. > > Guys, whether something was buggy or not DOES NOT MATTER. > > Why? > > Bugs happen. That's a fact of life. Arguing that "we had to break > something because we were fixing a bug" is completely insane. We fix > tens of bugs every single day, thinking that "fixing a bug" means that > we can break something is simply NOT TRUE. > > So bugs simply aren't even relevant to the discussion. They happen, > they get found, they get fixed, and it has nothing to do with "we > break users". > > Because the only thing that matters IS THE USER. > > How hard is that to understand? It isn't. But you're tearing the head off of a userspace developer (Zdenek). > Anybody who uses "but it was buggy" as an argument is entirely missing > the point. As far as the USER was concerned, it wasn't buggy - it > worked for him/her. > > Maybe it worked *because* the user had taken the bug into account, > maybe it worked because the user didn't notice - again, it doesn't > matter. It worked for the user. > > Breaking a user workflow for a "bug" is absolutely the WORST reason > for breakage you can imagine. > > It's basically saying "I took something that worked, and I broke it, > but now it's better". Do you not see how f*cking insane that statement > is? > > And without users, your program is not a program, it's a pointless > piece of code that you might as well throw away. > > Seriously. This is *why* the #1 rule for kernel development is "we > don't break users". Because "I fixed a bug" is absolutely NOT AN > ARGUMENT if that bug fix broke a user setup. You actually introduced a > MUCH BIGGER bug by "fixing" something that the user clearly didn't > even care about. > > And dammit, we upgrade the kernel ALL THE TIME without upgrading any > other programs at all. It is absolutely required, because flag-days > and dependencies are horribly bad. > > And it is also required simply because I as a kernel developer do not > upgrade random other tools that I don't even care about as I develop > the kernel, and I want any of my users to feel safe doing the same > time. > > So no. Your rule is COMPLETELY wrong. If you cannot upgrade a kernel > without upgrading some other random binary, then we have a problem. As I explained to Ted in my previous reply to this thread: using an lvm2 that is of the same vintage of the kernel is generally going to provide a more robust user experience (fixes, features, etc have been introduced). But both lvm2 and dm strive to never break users -- just like the rest of Linux. Anyway, what would you like done about this block regression? The dm-snapshot use-case isn't compelling because it impacts 2 users (that we know of) but there could be other scenarios (outside of DM) that could impact more -- though you'd think we'd have heard about them by now. Do you want to revert 4.16's commit 721c7fc701c71 ? Mike ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 @ 2018-08-03 18:54 ` Mike Snitzer 0 siblings, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-03 18:54 UTC (permalink / raw) To: Linus Torvalds Cc: Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Zdenek Kabelac, Ilya Dryomov, wgh On Fri, Aug 03 2018 at 12:37pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > [ Dammit. I haven't had to shout and curse at people for a while, but > this is ABSOLUTELY THE MOST IMPORTANT THING IN THE UNIVERSE WHEN IT > COMES TO SOFTWARE DEVELOPMENT ] > > On Fri, Aug 3, 2018 at 6:31 AM Zdenek Kabelac <zkabelac@redhat.com> wrote: > > > > IMHO (as the author of fixing lvm2 patch) user should not be upgrading kernels > > and keep running older lvm2 user-land tool (and there are very good reasons > > for this). > > Yeah, HELL NO! > > Guess what? You're wrong. YOU ARE MISSING THE #1 KERNEL RULE. Nobody ever said there wasn't breakage. And yes, Zdenek papered over the regression introduced by commit 721c7fc701c71 in userspace (lvm2) rather than sound the alarm that the kernel regressed. > We do not regress, and we do not regress exactly because your are 100% wrong. We clearly _do_ regress. Hence this thread. > And the reason you state for your opinion is in fact exactly *WHY* you > are wrong. > > Your "good reasons" are pure and utter garbage. > > The whole point of "we do not regress" is so that people can upgrade > the kernel and never have to worry about it. > > > Kernel had a bug which has been fixed > > That is *ENTIRELY* immaterial. > > Guys, whether something was buggy or not DOES NOT MATTER. > > Why? > > Bugs happen. That's a fact of life. Arguing that "we had to break > something because we were fixing a bug" is completely insane. We fix > tens of bugs every single day, thinking that "fixing a bug" means that > we can break something is simply NOT TRUE. > > So bugs simply aren't even relevant to the discussion. They happen, > they get found, they get fixed, and it has nothing to do with "we > break users". > > Because the only thing that matters IS THE USER. > > How hard is that to understand? It isn't. But you're tearing the head off of a userspace developer (Zdenek). > Anybody who uses "but it was buggy" as an argument is entirely missing > the point. As far as the USER was concerned, it wasn't buggy - it > worked for him/her. > > Maybe it worked *because* the user had taken the bug into account, > maybe it worked because the user didn't notice - again, it doesn't > matter. It worked for the user. > > Breaking a user workflow for a "bug" is absolutely the WORST reason > for breakage you can imagine. > > It's basically saying "I took something that worked, and I broke it, > but now it's better". Do you not see how f*cking insane that statement > is? > > And without users, your program is not a program, it's a pointless > piece of code that you might as well throw away. > > Seriously. This is *why* the #1 rule for kernel development is "we > don't break users". Because "I fixed a bug" is absolutely NOT AN > ARGUMENT if that bug fix broke a user setup. You actually introduced a > MUCH BIGGER bug by "fixing" something that the user clearly didn't > even care about. > > And dammit, we upgrade the kernel ALL THE TIME without upgrading any > other programs at all. It is absolutely required, because flag-days > and dependencies are horribly bad. > > And it is also required simply because I as a kernel developer do not > upgrade random other tools that I don't even care about as I develop > the kernel, and I want any of my users to feel safe doing the same > time. > > So no. Your rule is COMPLETELY wrong. If you cannot upgrade a kernel > without upgrading some other random binary, then we have a problem. As I explained to Ted in my previous reply to this thread: using an lvm2 that is of the same vintage of the kernel is generally going to provide a more robust user experience (fixes, features, etc have been introduced). But both lvm2 and dm strive to never break users -- just like the rest of Linux. Anyway, what would you like done about this block regression? The dm-snapshot use-case isn't compelling because it impacts 2 users (that we know of) but there could be other scenarios (outside of DM) that could impact more -- though you'd think we'd have heard about them by now. Do you want to revert 4.16's commit 721c7fc701c71 ? Mike ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 18:54 ` Mike Snitzer (?) @ 2018-08-03 19:09 ` Linus Torvalds 2018-08-03 19:30 ` Mike Snitzer -1 siblings, 1 reply; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 19:09 UTC (permalink / raw) To: Mike Snitzer Cc: Zdenek Kabelac, wgh, Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, dm-devel On Fri, Aug 3, 2018 at 11:54 AM Mike Snitzer <snitzer@redhat.com> wrote: > > > As I explained to Ted in my previous reply to this thread: using an lvm2 > that is of the same vintage of the kernel is generally going to provide > a more robust user experience You said that yes. And it is completely irrelevant. The fact is, if you use an older lvm2, then a newer kernel still needs to work. Your "more robust experience" argument has nothing what-so-ever to do with that. Will you get new features from newer user land tools? Sure, usually. And entirely immaterial to a kernel regression. Will newer user land tools hopefully fix other issues? You'd hope so, but again - immaterial. So why are you bringing up a complete red herring? It's entirely immaterial to the actual issue at hand. I would _hope_ that other projects hjave the same "no regressions" rule that the kernel has, but I know many don't. But whatever other projects are out there, and whatever other rules _they_ have for their development is also entirely immaterial to the kernel. The kernel has a simple rule: no user regressions. Yes, we've had to break that rule very occasionally - when the semantics are a huge honking security issue and cannot possibly be hidden any other way, then we obviously have to break them. So it has happened. It's happily quite rare. But in this case, the issue is that the block layer now enforces the read-only protection more. And it seems to be the case that the lvm tools set the read-only flag even when they then depended on being able to write to them, because we didn't use to. So just judging from that description, I do suspect that "we can't depend on the lvm read-only flag", so a patch like "let's not turn DM_READONLY_FLAG into actually set_disk_ro(dm_disk(md), 1)" makes sense. Obviously, if we can limit that more, that would be lovely. But dammit, NOBODY gets to say "oh, you should just update user land tools". Because when they do, I will explode. And I'm 1000% serious that I will refuse to work with people who continue to say that or continue to make excuses. And user land developers should damn well know about this. The fact that they are apparently not clued in about kernel rules is what allowed this bug to go undiscovered and unreported for much too long. Apparently the lvm2 user land developers *did* notice the breakage, but instead of reporting it as a kernel bug, they worked around it. So user land developers should actually know that if the kernel stops working for them, they should *not* work around it. Sure, fix your program, but let the kernel people know. And kernel people should know that "oh, the user land people already changed their behavior" is *not* a "I don't need to care about it". Unless the user land fix was so long ago that nobody cares any more. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:09 ` Linus Torvalds @ 2018-08-03 19:30 ` Mike Snitzer 2018-08-03 19:36 ` Linus Torvalds 2018-08-04 5:20 ` [dm-devel] " Theodore Y. Ts'o 0 siblings, 2 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-03 19:30 UTC (permalink / raw) To: Linus Torvalds Cc: Zdenek Kabelac, wgh, Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, dm-devel On Fri, Aug 03 2018 at 3:09pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Aug 3, 2018 at 11:54 AM Mike Snitzer <snitzer@redhat.com> wrote: > > > > > > As I explained to Ted in my previous reply to this thread: using an lvm2 > > that is of the same vintage of the kernel is generally going to provide > > a more robust user experience > > You said that yes. > > And it is completely irrelevant. > > The fact is, if you use an older lvm2, then a newer kernel still needs > to work. Your "more robust experience" argument has nothing > what-so-ever to do with that. > > Will you get new features from newer user land tools? Sure, usually. > And entirely immaterial to a kernel regression. I was merely giving context for the suggestion of keeping lvm2 updated. Not saying it was relevant for this regression. > Will newer user land tools hopefully fix other issues? You'd hope so, > but again - immaterial. > > So why are you bringing up a complete red herring? It's entirely > immaterial to the actual issue at hand. I was trying to give context for the "best to update lvm2 anyway" disclaimer that was used. Yeah, it was specious. And Zdenek exposed way more surface area for you to attack with his reply to this thread. My initial response to this thread was far more understated but was effectively: read-only dm-snapshot is rare, I'm inclined to just let this be. And yeah, that isn't a good excuse to ignore it but: dm-snapshot is a steaming pile as compared to dm thin-provisioning so dm-snapshot users who then go off the beaten path are already masochistic. SO the 2 users who noticed can cope.. But that too is a cop-out. > I would _hope_ that other projects hjave the same "no regressions" > rule that the kernel has, but I know many don't. But whatever other > projects are out there, and whatever other rules _they_ have for their > development is also entirely immaterial to the kernel. > > The kernel has a simple rule: no user regressions. > > Yes, we've had to break that rule very occasionally - when the > semantics are a huge honking security issue and cannot possibly be > hidden any other way, then we obviously have to break them. > > So it has happened. It's happily quite rare. > > But in this case, the issue is that the block layer now enforces the > read-only protection more. And it seems to be the case that the lvm > tools set the read-only flag even when they then depended on being > able to write to them, because we didn't use to. > > So just judging from that description, I do suspect that "we can't > depend on the lvm read-only flag", so a patch like > > "let's not turn DM_READONLY_FLAG into actually set_disk_ro(dm_disk(md), 1)" > > makes sense. > > Obviously, if we can limit that more, that would be lovely. > > But dammit, NOBODY gets to say "oh, you should just update user land tools". I'll have a closer look at all this. Could be DM in general is lacking for read-only permissions when you have complex stacking involved. > Because when they do, I will explode. And I'm 1000% serious that I > will refuse to work with people who continue to say that or continue > to make excuses. > > And user land developers should damn well know about this. The fact > that they are apparently not clued in about kernel rules is what > allowed this bug to go undiscovered and unreported for much too long. > Apparently the lvm2 user land developers *did* notice the breakage, > but instead of reporting it as a kernel bug, they worked around it. Yeap, they did.. I was unaware myself. > So user land developers should actually know that if the kernel stops > working for them, they should *not* work around it. Sure, fix your > program, but let the kernel people know. Agreed. > And kernel people should know that "oh, the user land people already > changed their behavior" is *not* a "I don't need to care about it". > Unless the user land fix was so long ago that nobody cares any more. I never didn't care. I just didn't care much. Because "dm-snapshot". Mike ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:30 ` Mike Snitzer @ 2018-08-03 19:36 ` Linus Torvalds 2018-08-04 5:20 ` [dm-devel] " Theodore Y. Ts'o 1 sibling, 0 replies; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 19:36 UTC (permalink / raw) To: Mike Snitzer Cc: Zdenek Kabelac, wgh, Ilya Dryomov, Jens Axboe, linux-block, Linux Kernel Mailing List, Sagi Grimberg, dm-devel On Fri, Aug 3, 2018 at 12:30 PM Mike Snitzer <snitzer@redhat.com> wrote: > > I was trying to give context for the "best to update lvm2 anyway" > disclaimer that was used. Yeah, it was specious. So I'll apologize for the strong words, and I'll just say - not excuse, but explanation - that this is the *ONE* thing that continues to drive me absolutely bat-shit insane. It's the *ONE* thing I will refuse to even discuss. Pretty much any technical thing, I'll happily discuss, and I'll admit when I was completely wrong and a f*cking moron. Hey, it happens. But the "no regressions" thing is _so_ important to me, and _so_ frustrating when I get the feeling that people ignore it, that I just go ballistic. I'm sorry for going ballistic, but there it is. It keeps happening every once in a while. I thought we'd gotten mostly over it. Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:30 ` Mike Snitzer 2018-08-03 19:36 ` Linus Torvalds @ 2018-08-04 5:20 ` Theodore Y. Ts'o 2018-08-04 8:36 ` Zdenek Kabelac 2018-08-04 15:19 ` Mike Snitzer 1 sibling, 2 replies; 46+ messages in thread From: Theodore Y. Ts'o @ 2018-08-04 5:20 UTC (permalink / raw) To: Mike Snitzer Cc: Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Zdenek Kabelac, Ilya Dryomov, wgh On Fri, Aug 03, 2018 at 03:30:37PM -0400, Mike Snitzer wrote: > > I was trying to give context for the "best to update lvm2 anyway" > disclaimer that was used. Yeah, it was specious. Well, it seemed to indicate a certain attitude that both Linus and I are concerned about. I tried to use more of a "pursuading" style to impress why that attitude was not ideal/correct. Linus used a much more assertive style (e.g., "Hell, no!"). > And yeah, that isn't a good excuse to ignore it but: dm-snapshot is a > steaming pile as compared to dm thin-provisioning... On a side note, this is the first that I've heard the assertion that dm-thin was better than dm-snapshot. My impression was that dm-snapshot was a proven code base, that only did one thing and (as far as I could tell) did it well. In contrast, dm-thin is much newer code, **far** more complex, with functionality and corner cases approaching that of a file system --- and just to be even more exciting, it doesn't have an fsck/repair tool to deal with corrupted metadata. In your opinion, is it because you disagree with the assumption that dm-thin is scary? Or is the argument that dm-snapshot is even scarier? - Ted P.S. It could be that my impression is wrong/out-dated, but the kernel documentation still says that userspace tools for checking and repairing the metadata are "under development". As a file system developer, the reaction this inspires is best summed up as: https://imgflip.com/memetemplate/50971393/Scared-Face ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-04 5:20 ` [dm-devel] " Theodore Y. Ts'o @ 2018-08-04 8:36 ` Zdenek Kabelac 2018-08-04 16:22 ` Theodore Y. Ts'o 2018-08-04 15:19 ` Mike Snitzer 1 sibling, 1 reply; 46+ messages in thread From: Zdenek Kabelac @ 2018-08-04 8:36 UTC (permalink / raw) To: Theodore Y. Ts'o, Mike Snitzer, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh Dne 4.8.2018 v 07:20 Theodore Y. Ts'o napsal(a): > On Fri, Aug 03, 2018 at 03:30:37PM -0400, Mike Snitzer wrote: >> >> I was trying to give context for the "best to update lvm2 anyway" >> disclaimer that was used. Yeah, it was specious. > > Well, it seemed to indicate a certain attitude that both Linus and I > are concerned about. I tried to use more of a "pursuading" style to > impress why that attitude was not ideal/correct. Linus used a much > more assertive style (e.g., "Hell, no!"). My whole point was - when someone has 'enough time' to compile fresh new kernel, spending couple seconds to build also recent lvm2 should have be no issue. Bugs are continually fixed, new feature from newer kernel are being used, known problems are being addressed - that's all what it means. Bug are fixed not just in kernel, but also in user-space. > >> And yeah, that isn't a good excuse to ignore it but: dm-snapshot is a >> steaming pile as compared to dm thin-provisioning... > > On a side note, this is the first that I've heard the assertion that > dm-thin was better than dm-snapshot. My impression was that > dm-snapshot was a proven code base, that only did one thing and (as > far as I could tell) did it well. In contrast, dm-thin is much newer > code, **far** more complex, with functionality and corner cases > approaching that of a file system --- and just to be even more > exciting, it doesn't have an fsck/repair tool to deal with corrupted > metadata. > > In your opinion, is it because you disagree with the assumption that > dm-thin is scary? Or is the argument that dm-snapshot is even > scarier? dm-snapshost has really outdated design - it's been useful in the old age where megabyte was hell lot of space. Nowadays, when users do need to handle snapshots in multi gigabyte sizes and moreover have number of snapshots from the same volume taken over the time, want to take snapshot of snapshot of snapshot, the old snapshot simple kills all the performance, uses tons of resources and becomes serious bottleneck of your system and has lots of usability limitation. That's where thin provisioning will shine.... However if you still need to take short-living smallish snapshot and you want to use non-provisioned origin device, there the snapshot will still work as proven technology - just no one can expect it will scale for the recent device size explosion... Regards Zdenek ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-04 8:36 ` Zdenek Kabelac @ 2018-08-04 16:22 ` Theodore Y. Ts'o 2018-08-04 18:18 ` Mike Snitzer 0 siblings, 1 reply; 46+ messages in thread From: Theodore Y. Ts'o @ 2018-08-04 16:22 UTC (permalink / raw) To: Zdenek Kabelac Cc: Mike Snitzer, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh On Sat, Aug 04, 2018 at 10:36:50AM +0200, Zdenek Kabelac wrote: > dm-snapshost has really outdated design - it's been useful in the old age > where megabyte was hell lot of space. > > Nowadays, when users do need to handle snapshots in multi gigabyte sizes and > moreover have number of snapshots from the same volume taken over the time, > want to take snapshot of snapshot of snapshot, the old snapshot simple kills > all the performance, uses tons of resources and becomes serious bottleneck > of your system and has lots of usability limitation. Fair enough. I don't think I would consider that makes dm-snapshot a "steaming pile". For me, protection against data loss is Job One. > That's where thin provisioning will shine.... The dm-thin development might want to take a look at what's currently in Documentation/device-mapper/thin-privisioning.txt: Status ====== These targets are very much still in the EXPERIMENTAL state. Please do not yet rely on them in production. But do experiment and offer us feedback. Different use cases will have different performance characteristics, for example due to fragmentation of the data volume. If you find this software is not performing as expected please mail dm-devel@redhat.com with details and we'll try our best to improve things for you. Userspace tools for checking and repairing the metadata are under development. Saying that dm-snapshot is a steaming pile and dm-thin is what everyone should use doesn't seem to be consistent with the above. Cheers, - Ted ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-04 16:22 ` Theodore Y. Ts'o @ 2018-08-04 18:18 ` Mike Snitzer 0 siblings, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-04 18:18 UTC (permalink / raw) To: Theodore Y. Ts'o Cc: Zdenek Kabelac, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh On Sat, Aug 04 2018 at 12:22pm -0400, Theodore Y. Ts'o <tytso@mit.edu> wrote: > On Sat, Aug 04, 2018 at 10:36:50AM +0200, Zdenek Kabelac wrote: > > dm-snapshost has really outdated design - it's been useful in the old age > > where megabyte was hell lot of space. > > > > Nowadays, when users do need to handle snapshots in multi gigabyte sizes and > > moreover have number of snapshots from the same volume taken over the time, > > want to take snapshot of snapshot of snapshot, the old snapshot simple kills > > all the performance, uses tons of resources and becomes serious bottleneck > > of your system and has lots of usability limitation. > > Fair enough. I don't think I would consider that makes dm-snapshot a > "steaming pile". For me, protection against data loss is Job One. What's your point Ted? Do you have _any_ intention of actually using anything DM or is this just a way for you to continue to snipe at it? > > That's where thin provisioning will shine.... > > The dm-thin development might want to take a look at what's currently > in Documentation/device-mapper/thin-privisioning.txt: > > Status > ====== > > These targets are very much still in the EXPERIMENTAL state. Please > do not yet rely on them in production. But do experiment and offer us > feedback. Different use cases will have different performance > characteristics, for example due to fragmentation of the data volume. > > If you find this software is not performing as expected please mail > dm-devel@redhat.com with details and we'll try our best to improve > things for you. > > Userspace tools for checking and repairing the metadata are under > development. > > Saying that dm-snapshot is a steaming pile and dm-thin is what > everyone should use doesn't seem to be consistent with the above. Maybe read your email from earlier today before repeating yourself: https://lkml.org/lkml/2018/8/4/366 ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 @ 2018-08-04 18:18 ` Mike Snitzer 0 siblings, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-04 18:18 UTC (permalink / raw) To: Theodore Y. Ts'o Cc: Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Zdenek Kabelac, Ilya Dryomov, Linus Torvalds, wgh On Sat, Aug 04 2018 at 12:22pm -0400, Theodore Y. Ts'o <tytso@mit.edu> wrote: > On Sat, Aug 04, 2018 at 10:36:50AM +0200, Zdenek Kabelac wrote: > > dm-snapshost has really outdated design - it's been useful in the old age > > where megabyte was hell lot of space. > > > > Nowadays, when users do need to handle snapshots in multi gigabyte sizes and > > moreover have number of snapshots from the same volume taken over the time, > > want to take snapshot of snapshot of snapshot, the old snapshot simple kills > > all the performance, uses tons of resources and becomes serious bottleneck > > of your system and has lots of usability limitation. > > Fair enough. I don't think I would consider that makes dm-snapshot a > "steaming pile". For me, protection against data loss is Job One. What's your point Ted? Do you have _any_ intention of actually using anything DM or is this just a way for you to continue to snipe at it? > > That's where thin provisioning will shine.... > > The dm-thin development might want to take a look at what's currently > in Documentation/device-mapper/thin-privisioning.txt: > > Status > ====== > > These targets are very much still in the EXPERIMENTAL state. Please > do not yet rely on them in production. But do experiment and offer us > feedback. Different use cases will have different performance > characteristics, for example due to fragmentation of the data volume. > > If you find this software is not performing as expected please mail > dm-devel@redhat.com with details and we'll try our best to improve > things for you. > > Userspace tools for checking and repairing the metadata are under > development. > > Saying that dm-snapshot is a steaming pile and dm-thin is what > everyone should use doesn't seem to be consistent with the above. Maybe read your email from earlier today before repeating yourself: https://lkml.org/lkml/2018/8/4/366 ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-04 18:18 ` Mike Snitzer @ 2018-08-04 19:37 ` Theodore Y. Ts'o -1 siblings, 0 replies; 46+ messages in thread From: Theodore Y. Ts'o @ 2018-08-04 19:37 UTC (permalink / raw) To: Mike Snitzer Cc: Zdenek Kabelac, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh On Sat, Aug 04, 2018 at 02:18:47PM -0400, Mike Snitzer wrote: > > Fair enough. I don't think I would consider that makes dm-snapshot a > > "steaming pile". For me, protection against data loss is Job One. > > What's your point Ted? Do you have _any_ intention of actually using > anything DM or is this just a way for you to continue to snipe at it? My point is that putting down dm-snapshot by calling it a "steaming pile" because it can't perform well on workloads that weren't a requirement when it was first designed is neither accurate nor fair. And steering users away from it by badmouthing to a technology which ever so often, requires enterprise support to recover, is something that *I* at least would classify as "marginal". Maybe it's just that file system developers have higher standards. I know that Dave Chinner at LSF/MM commented that using some of the things he has been developing for XFS subvolume support might be interesting precisely because it could provide some of the facilities currently provided by thin provisioning (perhaps not all of them; I'm not sure how well his virtual block remapping layer would handle hundreds of snapshots) but with file system tools which have a lot more seasoning and where people have spent a lot of effort on data recovery tools. In any case, I do use DM quite a lot. I use LVM2 and dm-snapshot (and it's been working just *fine* for my use cases). I've wanted to use dm-thin, but have been put off by it being labeled as experimental and by some of the comments about how robust its recovery tools are. If there was documentation about how an advanced user/developer could use low level tools to do manual repair of a thin pool when the automated procedures didn't work, without having to pay $$$ to some company for "enterprise support", I'd be a lot more willing to give it a try. Sorry, I just care a *lot* about data robustness. > Maybe read your email from earlier today before repeating yourself: > https://lkml.org/lkml/2018/8/4/366 Apologies. I'm currently staying at an Assisted Living facility keeping an eye on my Dad this week, and the internet at the senior living center has been.... marginal. As a result I've been reading my e-mail in batches, and so I hadn't seen the e-mail you had posted earlier before I had sent my reply. - Ted ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 @ 2018-08-04 19:37 ` Theodore Y. Ts'o 0 siblings, 0 replies; 46+ messages in thread From: Theodore Y. Ts'o @ 2018-08-04 19:37 UTC (permalink / raw) To: Mike Snitzer Cc: Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Zdenek Kabelac, Ilya Dryomov, Linus Torvalds, wgh On Sat, Aug 04, 2018 at 02:18:47PM -0400, Mike Snitzer wrote: > > Fair enough. I don't think I would consider that makes dm-snapshot a > > "steaming pile". For me, protection against data loss is Job One. > > What's your point Ted? Do you have _any_ intention of actually using > anything DM or is this just a way for you to continue to snipe at it? My point is that putting down dm-snapshot by calling it a "steaming pile" because it can't perform well on workloads that weren't a requirement when it was first designed is neither accurate nor fair. And steering users away from it by badmouthing to a technology which ever so often, requires enterprise support to recover, is something that *I* at least would classify as "marginal". Maybe it's just that file system developers have higher standards. I know that Dave Chinner at LSF/MM commented that using some of the things he has been developing for XFS subvolume support might be interesting precisely because it could provide some of the facilities currently provided by thin provisioning (perhaps not all of them; I'm not sure how well his virtual block remapping layer would handle hundreds of snapshots) but with file system tools which have a lot more seasoning and where people have spent a lot of effort on data recovery tools. In any case, I do use DM quite a lot. I use LVM2 and dm-snapshot (and it's been working just *fine* for my use cases). I've wanted to use dm-thin, but have been put off by it being labeled as experimental and by some of the comments about how robust its recovery tools are. If there was documentation about how an advanced user/developer could use low level tools to do manual repair of a thin pool when the automated procedures didn't work, without having to pay $$$ to some company for "enterprise support", I'd be a lot more willing to give it a try. Sorry, I just care a *lot* about data robustness. > Maybe read your email from earlier today before repeating yourself: > https://lkml.org/lkml/2018/8/4/366 Apologies. I'm currently staying at an Assisted Living facility keeping an eye on my Dad this week, and the internet at the senior living center has been.... marginal. As a result I've been reading my e-mail in batches, and so I hadn't seen the e-mail you had posted earlier before I had sent my reply. - Ted ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-04 19:37 ` Theodore Y. Ts'o (?) @ 2018-08-04 21:48 ` Mike Snitzer -1 siblings, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-04 21:48 UTC (permalink / raw) To: Theodore Y. Ts'o, Zdenek Kabelac, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh On Sat, Aug 04 2018 at 3:37pm -0400, Theodore Y. Ts'o <tytso@mit.edu> wrote: > On Sat, Aug 04, 2018 at 02:18:47PM -0400, Mike Snitzer wrote: > > > Fair enough. I don't think I would consider that makes dm-snapshot a > > > "steaming pile". For me, protection against data loss is Job One. > > > > What's your point Ted? Do you have _any_ intention of actually using > > anything DM or is this just a way for you to continue to snipe at it? > > My point is that putting down dm-snapshot by calling it a "steaming > pile" because it can't perform well on workloads that weren't a > requirement when it was first designed is neither accurate nor fair. As a person who has written a fair amount of dm-snapshot code I'm free to have my opinion. It is slow. Period. If it works for you, great. But it isn't adequate for most modern usecases I'm aware of. > And steering users away from it by badmouthing to a technology which > ever so often, requires enterprise support to recover, is something > that *I* at least would classify as "marginal". dm-snapshot is slow, as such I will badmouth it because dm-thinp is a much more capable replacement. I have to maintain both, so I'm free to steer people according to my experience. > Maybe it's just that file system developers have higher standards. I > know that Dave Chinner at LSF/MM commented that using some of the > things he has been developing for XFS subvolume support might be > interesting precisely because it could provide some of the facilities > currently provided by thin provisioning (perhaps not all of them; I'm > not sure how well his virtual block remapping layer would handle > hundreds of snapshots) but with file system tools which have a lot > more seasoning and where people have spent a lot of effort on data > recovery tools. Even new XFS features will have bugs. Just because XFS's fsck is historically robust, oversights and bugs happen when new features are added. And AFAIK future XFS would be looking to leverage DM thinp via its producer/consumer model. But this is going off on a tangent now. > In any case, I do use DM quite a lot. I use LVM2 and dm-snapshot (and > it's been working just *fine* for my use cases). I've wanted to use > dm-thin, but have been put off by it being labeled as experimental and > by some of the comments about how robust its recovery tools are. The Documentation was stale. I personally don't reference it so the need to update it got overlooked. > If there was documentation about how an advanced user/developer could > use low level tools to do manual repair of a thin pool when the > automated procedures didn't work, without having to pay $$$ to some > company for "enterprise support", I'd be a lot more willing to give it > a try. We could certainly improve out documentation for the use of thin_check and thin_repair. I know lvm2 has seen improvements to allow the metdata voulme to be activated in standalone mode (activate the metadata volume without the thin-pool or thin devices ontop) so that the thin_check and thin_repair tools can be used on it. I'd imagine you aren't aware of the lvm2 package's lvmthin manpage? See: man lvmthin It'd likely be one of the documentation locations to see improvements. I'll talk with others about where we can improve docs, the manpages for thin_check and thin_repair are _very_ sparse. Anyway, room for improvement for sure. But demonizing "enterprise support" like you don't provide that to your stake holders is bizarre. I again was candid and forthcoming about what drives/catches the need for thin_check and thin_repair fixes and improvements: it just so happens that "enterprise" deployments make use of DM-thinp and have exposed the need for support more than community users. I'm not saying I, or other DM thinp oriented developers, wouldn't provided the same type of support if a community user like yourself hit a problem. It is just that enterprise users are the prontlines of advanced usage and scale. Deploying hundreds of Gluster servers with every brick layered ontop of DM thinp historically exposed issues. Those issues get fixed and benefit everyone. This discussion, and my need to explain how "enterprise support" drives innovation, is so.. weird. > Sorry, I just care a *lot* about data robustness. You aren't alone. > > Maybe read your email from earlier today before repeating yourself: > > https://lkml.org/lkml/2018/8/4/366 > > Apologies. I'm currently staying at an Assisted Living facility > keeping an eye on my Dad this week, and the internet at the senior > living center has been.... marginal. As a result I've been reading my > e-mail in batches, and so I hadn't seen the e-mail you had posted > earlier before I had sent my reply. Best wishes. I've been dealing with stresses in my personal life myself. Might explain why we've had the awkwardness in this thread. Mike ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 2018-08-04 5:20 ` [dm-devel] " Theodore Y. Ts'o 2018-08-04 8:36 ` Zdenek Kabelac @ 2018-08-04 15:19 ` Mike Snitzer 1 sibling, 0 replies; 46+ messages in thread From: Mike Snitzer @ 2018-08-04 15:19 UTC (permalink / raw) To: Theodore Y. Ts'o, Linus Torvalds, Jens Axboe, Sagi Grimberg, Linux Kernel Mailing List, linux-block, dm-devel, Zdenek Kabelac, Ilya Dryomov, wgh On Sat, Aug 04 2018 at 1:20am -0400, Theodore Y. Ts'o <tytso@mit.edu> wrote: > On Fri, Aug 03, 2018 at 03:30:37PM -0400, Mike Snitzer wrote: > > > > I was trying to give context for the "best to update lvm2 anyway" > > disclaimer that was used. Yeah, it was specious. > > Well, it seemed to indicate a certain attitude that both Linus and I > are concerned about. I tried to use more of a "pursuading" style to > impress why that attitude was not ideal/correct. Linus used a much > more assertive style (e.g., "Hell, no!"). [I debated just ignoring this portion of your reply but it needs to be dealt with directly] I prefer how Linus handled it (at least he was timely with his follow-ups). Your initial reply where you joined a fragment of my initial reply with Zdenek's (we sent simultaneously, each half way around the world) served to merge Zdenek and myself into one fictional straw-man you both could attack. If you have something to say to _me_ address me directly; don't put words in my mouth because you thought I had a complete mind-meld with someone else. And please don't act like this wasn't already beaten to death yesterday; which left me (as DM maintainer) initially _unwarrantedly_ compromised. There was a block regression that I wasn't aware of but someone on my broader team (Zdenek) papered over it in userspace rather than report it as a regression. I did brush off the seriousness of side-effects on readonly dm-snapshot ("Because dm-snapshot"). But that doesn't speak to some systemic "problem" you seem to be concerned about. > > And yeah, that isn't a good excuse to ignore it but: dm-snapshot is a > > steaming pile as compared to dm thin-provisioning... > > On a side note, this is the first that I've heard the assertion that > dm-thin was better than dm-snapshot. You don't follow DM much, that's fine. But thinp is considerably more powerful for modern use-cases. > My impression was that dm-snapshot was a proven code base, that only > did one thing and (as far as I could tell) did it well. In contrast, > dm-thin is much newer code, **far** more complex, with functionality > and corner cases approaching that of a file system --- dm-snapshot's scaling is _awful_. This is due to the N-way copy-out penalty associated with N snapshots. So lots of snapshots perform very very slowly. Even one snapshot is slow compared to dm-thinp. dm-thin (2011) certainly is newer than dm-snapshot (well before 2005), and yes dm-thin is complex, but dm-snapshot's code isn't exactly "simple". The on-disk layout is but that simplicity contributes to why it doesn't scale at all. DM thin is a modern approach to snapshots grounded in the same btree-based concepts and design used by btrfs. Given dm-thinp's requirements and how it has been deployed and pushed _hard_ it really is holding up amazingly well. > and just to be even more exciting, it [dm-thin] doesn't have an > fsck/repair tool to deal with corrupted metadata. That's one definition for "exciting" on your Friday night ... ;) The documentation was outdated, see this thread: https://www.redhat.com/archives/dm-devel/2018-July/msg00200.html Where I shared that this Documentation update was staged for 4.19: https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.19&id=6c7413c0f5ab61d2339cf516d25bb44d71f4bad4 That said, thin_repair has shown itself to be hit-or-miss. There are certain corruptions that it doesn't cope with well (leaving the metadata "repaired" but the result is an empty thin-pool). Those cases are more rare but still occur. So repairing thinp corruption can require escalations through "enterprise support" (which results in fixes to thin_repair, etc). > In your opinion, is it because you disagree with the assumption that > dm-thin is scary? Or is the argument that dm-snapshot is even > scarier? Apples and oranges. DM thinp is complex but necessarily so. dm-snapshot is still complex yet only covers legacy and narrow (read: now useless) use-cases. In the same thread I referenced above, see how Drew Hastings is looking to use DM thinp to host VM guest storage, which implies a scaling dm-snapshot has _zero_ hope of providing: https://www.redhat.com/archives/dm-devel/2018-August/msg00088.html > P.S. It could be that my impression is wrong/out-dated, but the > kernel documentation still says that userspace tools for checking and > repairing the metadata are "under development". As a file system > developer, the reaction this inspires is best summed up as: > > https://imgflip.com/memetemplate/50971393/Scared-Face Already addressed this. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 16:37 ` Linus Torvalds @ 2018-08-03 19:18 ` Zdenek Kabelac 2018-08-03 19:18 ` Zdenek Kabelac 1 sibling, 0 replies; 46+ messages in thread From: Zdenek Kabelac @ 2018-08-03 19:18 UTC (permalink / raw) To: Linus Torvalds Cc: Jens Axboe, Sagi Grimberg, Mike Snitzer, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh Dne 3.8.2018 v 18:37 Linus Torvalds napsal(a): > [ Dammit. I haven't had to shout and curse at people for a while, but > this is ABSOLUTELY THE MOST IMPORTANT THING IN THE UNIVERSE WHEN IT > COMES TO SOFTWARE DEVELOPMENT ] > > On Fri, Aug 3, 2018 at 6:31 AM Zdenek Kabelac <zkabelac@redhat.com> wrote: >> >> IMHO (as the author of fixing lvm2 patch) user should not be upgrading kernels >> and keep running older lvm2 user-land tool (and there are very good reasons >> for this). > > Yeah, HELL NO! > > Guess what? You're wrong. YOU ARE MISSING THE #1 KERNEL RULE. > > We do not regress, and we do not regress exactly because your are 100% wrong. > Hi Linus Sorry to hear you so crazy about this :) - but my comment was mainly targeted to 'distribution maintainers' trying to push the latest kernel and preserving old tools (not providing upgraded packages). If it's worth to upgrade kernel - it should be worth to upgrade surrounding packages, especially if they are very tightly bind to work with kernel. I just think you are overreacting here and you could trust me I personally do A LOT to keep everything as much usable & compatible in lvm2 as we can across all kernels, all distributions and many architectures. Lvm2 is always tracking individual bugs in kernel and reports them to user or tries to bypass... >> Kernel had a bug which has been fixed > > That is *ENTIRELY* immaterial. > > Guys, whether something was buggy or not DOES NOT MATTER. > > Why? > > Bugs happen. That's a fact of life. Arguing that "we had to break > something because we were fixing a bug" is completely insane. We fix > tens of bugs every single day, thinking that "fixing a bug" means that > we can break something is simply NOT TRUE. From my userland POV - leaving kernel write to devices that are supposed to be read-only 'just because' it's kernel is wrong - the fact it has NOT been discover for so long means - there are not many users and not many testers of this combination. But if kernel people do want to make a big stress case about this - I'm the last one to object - I'm just unsure what kind of bugs are supposed to be preserved and how it's going to be recognized which bugs are usable and which are fixable.... On a funny note - security exploits had also many users - so why fixing them.... > So no. Your rule is COMPLETELY wrong. If you cannot upgrade a kernel > without upgrading some other random binary, then we have a problem. It's was not meant as a rule, just recommendation - and I think we can end up this fight over nothing here - there is probably very low number of users of this combination.... Regards Zdenek ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: LVM snapshot broke between 4.14 and 4.16 @ 2018-08-03 19:18 ` Zdenek Kabelac 0 siblings, 0 replies; 46+ messages in thread From: Zdenek Kabelac @ 2018-08-03 19:18 UTC (permalink / raw) To: Linus Torvalds Cc: Jens Axboe, Sagi Grimberg, Mike Snitzer, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh Dne 3.8.2018 v 18:37 Linus Torvalds napsal(a): > [ Dammit. I haven't had to shout and curse at people for a while, but > this is ABSOLUTELY THE MOST IMPORTANT THING IN THE UNIVERSE WHEN IT > COMES TO SOFTWARE DEVELOPMENT ] > > On Fri, Aug 3, 2018 at 6:31 AM Zdenek Kabelac <zkabelac@redhat.com> wrote: >> >> IMHO (as the author of fixing lvm2 patch) user should not be upgrading kernels >> and keep running older lvm2 user-land tool (and there are very good reasons >> for this). > > Yeah, HELL NO! > > Guess what? You're wrong. YOU ARE MISSING THE #1 KERNEL RULE. > > We do not regress, and we do not regress exactly because your are 100% wrong. > Hi Linus Sorry to hear you so crazy about this :) - but my comment was mainly targeted to 'distribution maintainers' trying to push the latest kernel and preserving old tools (not providing upgraded packages). If it's worth to upgrade kernel - it should be worth to upgrade surrounding packages, especially if they are very tightly bind to work with kernel. I just think you are overreacting here and you could trust me I personally do A LOT to keep everything as much usable & compatible in lvm2 as we can across all kernels, all distributions and many architectures. Lvm2 is always tracking individual bugs in kernel and reports them to user or tries to bypass... >> Kernel had a bug which has been fixed > > That is *ENTIRELY* immaterial. > > Guys, whether something was buggy or not DOES NOT MATTER. > > Why? > > Bugs happen. That's a fact of life. Arguing that "we had to break > something because we were fixing a bug" is completely insane. We fix > tens of bugs every single day, thinking that "fixing a bug" means that > we can break something is simply NOT TRUE. From my userland POV - leaving kernel write to devices that are supposed to be read-only 'just because' it's kernel is wrong - the fact it has NOT been discover for so long means - there are not many users and not many testers of this combination. But if kernel people do want to make a big stress case about this - I'm the last one to object - I'm just unsure what kind of bugs are supposed to be preserved and how it's going to be recognized which bugs are usable and which are fixable.... On a funny note - security exploits had also many users - so why fixing them.... > So no. Your rule is COMPLETELY wrong. If you cannot upgrade a kernel > without upgrading some other random binary, then we have a problem. It's was not meant as a rule, just recommendation - and I think we can end up this fight over nothing here - there is probably very low number of users of this combination.... Regards Zdenek ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [dm-devel] LVM snapshot broke between 4.14 and 4.16 2018-08-03 19:18 ` Zdenek Kabelac (?) @ 2018-08-03 19:30 ` Linus Torvalds -1 siblings, 0 replies; 46+ messages in thread From: Linus Torvalds @ 2018-08-03 19:30 UTC (permalink / raw) To: Zdenek Kabelac Cc: Jens Axboe, Sagi Grimberg, Mike Snitzer, Linux Kernel Mailing List, linux-block, dm-devel, Ilya Dryomov, wgh On Fri, Aug 3, 2018 at 12:18 PM Zdenek Kabelac <zkabelac@redhat.com> wrote: > > From my userland POV - leaving kernel write to devices that are supposed to > be read-only 'just because' it's kernel is wrong - the fact it has NOT been > discover for so long means - there are not many users and not many testers of > this combination. Sure. But at the same time, the "read-only" issue from a _security_ standpoint should never be about some device state. Why? Because the Unix security rules aren't about "read-only devices". They are about permissions, and if you don't want your users to have write permissions to a device, then they shouldn't have those permissions. So the "set_disk_ro()" interface is simply not for security. Anybody who uses it for security is seriously confused. No, the "set_disk_ro()" interface is so that you can say "you physically cannot write to this medium". It's meant for things like CD-ROM devices, or for a floppy device when you notice that the controller reports that the floppy itself is physically write-protected. THAT is what the new code checks for, and that is also why ignoring the check really shouldn't be a security issue. Because if it turns out that somebody wrote to it, and the write succeeded, then obviously the "set_disk_ro()" usage was simply wrong. Now, I do agree that it 100% makes sense to have a layer like md/lvm be able to virtually mark volumes read-only. If only exactly to emulate the "this is like a write-protected floppy or a cd-rom" behavior. So the DM_READONLY_FLAG makes conceptual sense. But at the same time, if the DM_READONLY_FLAG was _wrong_, then it also makes a ton of sense to just say "oh, it was wrong, we'll ignore it". Exactly because it was never supposed to be about security, and it was about other things. See? Linus ^ permalink raw reply [flat|nested] 46+ messages in thread
end of thread, other threads:[~2018-08-04 21:48 UTC | newest] Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-08-02 12:26 LVM snapshot broke between 4.14 and 4.16 WGH 2018-08-02 13:31 ` Ilya Dryomov 2018-08-02 13:31 ` Ilya Dryomov 2018-08-02 15:10 ` WGH 2018-08-02 16:41 ` Linus Torvalds 2018-08-02 18:18 ` Ilya Dryomov 2018-08-02 18:32 ` Linus Torvalds 2018-08-02 21:32 ` WGH 2018-08-02 21:39 ` WGH 2018-08-02 21:52 ` Linus Torvalds 2018-08-03 13:31 ` Mike Snitzer 2018-08-03 15:20 ` [dm-devel] " Theodore Y. Ts'o 2018-08-03 18:39 ` Mike Snitzer 2018-08-03 18:57 ` Linus Torvalds 2018-08-03 19:06 ` Mike Snitzer 2018-08-03 19:11 ` Linus Torvalds 2018-08-03 19:33 ` Mike Snitzer 2018-08-03 19:22 ` Linus Torvalds 2018-08-04 10:01 ` WGH 2018-08-04 17:04 ` Linus Torvalds 2018-08-04 17:04 ` Linus Torvalds 2018-08-04 18:19 ` Mike Snitzer 2018-08-04 20:29 ` WGH 2018-08-03 19:56 ` [dm-devel] " Alasdair G Kergon 2018-08-03 20:08 ` Alasdair G Kergon 2018-08-03 20:42 ` Linus Torvalds 2018-08-03 21:26 ` Alasdair G Kergon 2018-08-03 13:31 ` Zdenek Kabelac 2018-08-03 16:37 ` Linus Torvalds 2018-08-03 18:54 ` Mike Snitzer 2018-08-03 18:54 ` Mike Snitzer 2018-08-03 19:09 ` Linus Torvalds 2018-08-03 19:30 ` Mike Snitzer 2018-08-03 19:36 ` Linus Torvalds 2018-08-04 5:20 ` [dm-devel] " Theodore Y. Ts'o 2018-08-04 8:36 ` Zdenek Kabelac 2018-08-04 16:22 ` Theodore Y. Ts'o 2018-08-04 18:18 ` Mike Snitzer 2018-08-04 18:18 ` Mike Snitzer 2018-08-04 19:37 ` Theodore Y. Ts'o 2018-08-04 19:37 ` Theodore Y. Ts'o 2018-08-04 21:48 ` Mike Snitzer 2018-08-04 15:19 ` Mike Snitzer 2018-08-03 19:18 ` [dm-devel] " Zdenek Kabelac 2018-08-03 19:18 ` Zdenek Kabelac 2018-08-03 19:30 ` [dm-devel] " Linus Torvalds
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.