From: Ross Zwisler <email@example.com>
To: Mike Snitzer <firstname.lastname@example.org>
Cc: Ross Zwisler <email@example.com>, Toshi Kani <firstname.lastname@example.org>,
	email@example.com, firstname.lastname@example.org, email@example.com,
	firstname.lastname@example.org, email@example.com
Subject: Re: [PATCH v2 5/7] dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode
Date: Wed, 6 Jun 2018 11:24:21 -0600
Message-ID: <20180606172421.GA2208@linux.intel.com>
In-Reply-To: <20180605004558.GB6898@redhat.com>

On Mon, Jun 04, 2018 at 08:46:28PM -0400, Mike Snitzer wrote:
> On Mon, Jun 04 2018 at 7:24pm -0400,
> Ross Zwisler <firstname.lastname@example.org> wrote:
>
> > On Fri, Jun 01, 2018 at 06:04:43PM -0400, Mike Snitzer wrote:
> > > On Tue, May 29 2018 at 3:51pm -0400,
> > > Ross Zwisler <email@example.com> wrote:
> > >
> > > > The DM_TYPE_DAX_BIO_BASED dm_queue_mode was introduced to prevent DM
> > > > devices that could possibly support DAX from transitioning into DM devices
> > > > that cannot support DAX.
> > > >
> > > > For example, the following transition will currently fail:
> > > >
> > > >  dm-linear: [fsdax pmem][fsdax pmem] => [fsdax pmem][fsdax raw]
> > > >               DM_TYPE_DAX_BIO_BASED       DM_TYPE_BIO_BASED
> > > >
> > > > but these will both succeed:
> > > >
> > > >  dm-linear: [fsdax pmem][brd ramdisk] => [fsdax pmem][fsdax raw]
> > > >               DM_TYPE_DAX_BIO_BASED        DM_TYPE_BIO_BASED
> > > >
> > >
> > > I fail to see how this succeeds given
> > > drivers/md/dm-ioctl.c:is_valid_type() only allows transitions from:
> > >
> > > DM_TYPE_BIO_BASED => DM_TYPE_DAX_BIO_BASED
> >
> > Right, sorry, that was a typo. What I meant was:
> >
> > > For example, the following transition will currently fail:
> > >
> > >  dm-linear: [fsdax pmem][fsdax pmem] => [fsdax pmem][fsdax raw]
> > >               DM_TYPE_DAX_BIO_BASED       DM_TYPE_BIO_BASED
> > >
> > > but these will both succeed:
> > >
> > >  dm-linear: [fsdax pmem][brd ramdisk] => [fsdax pmem][fsdax raw]
> > >               DM_TYPE_BIO_BASED            DM_TYPE_BIO_BASED
> > >
> > >  dm-linear: [fsdax pmem][fsdax raw] => [fsdax pmem][fsdax pmem]
> > >               DM_TYPE_BIO_BASED          DM_TYPE_DAX_BIO_BASED
> >
> > So we allow 2 of the 3 transitions, but the reason that we disallow the third
> > isn't fully clear to me.
> >
> > > >  dm-linear: [fsdax pmem][fsdax raw] => [fsdax pmem][fsdax pmem]
> > > >               DM_TYPE_BIO_BASED          DM_TYPE_DAX_BIO_BASED
> > > >
> > > > This seems arbitrary, as really the choice on whether to use DAX happens at
> > > > filesystem mount time. There's no guarantee that the in the first case
> > > > (double fsdax pmem) we were using the dax mount option with our file
> > > > system.
> > > >
> > > > Instead, get rid of DM_TYPE_DAX_BIO_BASED and all the special casing around
> > > > it, and instead make the request queue's QUEUE_FLAG_DAX be our one source
> > > > of truth. If this is set, we can use DAX, and if not, not. We keep this
> > > > up to date in table_load() as the table changes. As with regular block
> > > > devices the filesystem will then know at mount time whether DAX is a
> > > > supported mount option or not.
> > >
> > > If you don't think you need this specialization that is fine.. but DM
> > > devices supporting suspending (as part of table reloads) so is there any
> > > risk that there will be inflight IO (say if someone did 'dmsetup suspend
> > > --noflush').. and then upon reload the device type changed out from
> > > under us.. anyway, I don't have all the PMEM DAX stuff paged back into
> > > my head yet.
> > >
> > > But this just seems like we really shouldn't be allowing the
> > > transition from what was DM_TYPE_DAX_BIO_BASED back to DM_TYPE_BIO_BASED
> >
> > I admit I don't fully understand all the ways that DM supports suspending and
> > resuming devices. Is there actually a case where we can change out the DM
> > devices while I/O is running, and somehow end up trying to issue a DAX I/O to
> > a device that doesn't support DAX?
>
> Yes, provided root permissions, it's very easy to dmsetup suspend/load/resume
> to replace any portion of the DM device's logical address space to map to an
> entirely different DM target (with a different backing store). It's
> pretty intrusive to do such things, but easily done and powerful.
>
> Mike

Hmmm, I don't understand how you can do this if there is a filesystem built on
your DM device? Say you have a DM device, either striped or linear, that is
made up of 2 devices, and then you use dmsetup to replace one of the DM member
devices with something else. You've just swapped out half of your LBA space
with new data, right?

I don't understand how you can expect a filesystem built on the old DM device
to still work? You especially can't do this while the filesystem is mounted -
all the in-core filesystem metadata would be garbage because the on-media data
would have totally changed.

So, when dealing with a filesystem, the flow must be:

unmount your filesystem
redo your DM device, changing out devices
reformat your filesystem on the new DM device
remount your filesystem

Right? If so, then I don't see how a transition of the DM device from
supporting DAX to not supporting DAX or vice versa could harm us, as we can't
be doing filesystem I/O at the time when we change the composition of the DM
device.
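
As an illustration of the "one source of truth" idea discussed in this thread, here is a minimal user-space sketch in C. It is not kernel code: every type and function name below (member_dev, dm_table_model, queue_model, model_table_load, and so on) is hypothetical, and the rule that a table is DAX-capable only when all of its member devices are is an assumption drawn from the transition examples quoted above. The dax_flag field merely stands in for QUEUE_FLAG_DAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct member_dev {
	const char *name;
	bool dax_capable;	/* in the thread: fsdax pmem yes, fsdax raw / brd no */
};

struct dm_table_model {
	const struct member_dev *devs;
	size_t ndevs;
};

struct queue_model {
	bool dax_flag;		/* stands in for QUEUE_FLAG_DAX */
};

/* Assumption: a table can offer DAX only if every member device can. */
bool table_supports_dax(const struct dm_table_model *t)
{
	assert(t != NULL);
	if (t->ndevs == 0)
		return false;
	for (size_t i = 0; i < t->ndevs; i++) {
		if (!t->devs[i].dax_capable)
			return false;
	}
	return true;
}

/*
 * Recompute the flag on every table load. No transition is rejected;
 * the flag simply follows the composition of the newly loaded table.
 */
void model_table_load(struct queue_model *q, const struct dm_table_model *t)
{
	assert(q != NULL);
	q->dax_flag = table_supports_dax(t);
}

/* What a filesystem would consult at mount time in this model. */
bool model_dax_mount_allowed(const struct queue_model *q)
{
	assert(q != NULL);
	return q->dax_flag;
}
```

In this model, loading a [pmem][pmem] table sets the flag, reloading with [pmem][raw] clears it, and reloading [pmem][pmem] sets it again. Whether DAX is actually used is then decided only when the filesystem is mounted, which matches the unmount / redo / reformat / remount flow described above.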