From: Matthew Wilcox <willy@infradead.org> To: Dan Williams <dan.j.williams@intel.com> Cc: Seema Pandit <seema.pandit@intel.com>, linux-nvdimm <linux-nvdimm@lists.01.org>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, stable <stable@vger.kernel.org>, Robert Barror <robert.barror@intel.com>, linux-fsdevel <linux-fsdevel@vger.kernel.org>, Jan Kara <jack@suse.cz> Subject: Re: [PATCH] filesystem-dax: Disable PMD support Date: Mon, 1 Jul 2019 20:34:10 -0700 [thread overview] Message-ID: <20190702033410.GB1729@bombadil.infradead.org> (raw) In-Reply-To: <CAPcyv4j2NBPBEUU3UW1Q5OyOEuo9R5e90HpkowpeEkMsAKiUyQ@mail.gmail.com> On Sun, Jun 30, 2019 at 02:37:32PM -0700, Dan Williams wrote: > On Sun, Jun 30, 2019 at 8:23 AM Matthew Wilcox <willy@infradead.org> wrote: > > I think my theory was slightly mistaken, but your fix has the effect of > > fixing the actual problem too. > > > > The xas->xa_index for a PMD is going to be PMD-aligned (ie a multiple of > > 512), but xas_find_conflict() does _not_ adjust xa_index (... which I > > really should have mentioned in the documentation). So we go to sleep > > on the PMD-aligned index instead of the index of the PTE. Your patch > > fixes this by using the PMD-aligned index for PTEs too. > > > > I'm trying to come up with a clean fix for this. Clearly we > > shouldn't wait for a PTE entry if we're looking for a PMD entry. > > But what should get_unlocked_entry() return if it detects that case? > > We could have it return an error code encoded as an internal entry, > > like grab_mapping_entry() does. Or we could have it return the _locked_ > > PTE entry, and have callers interpret that. > > > > At least get_unlocked_entry() is static, but it's got quite a few callers. > > Trying to discern which ones might ask for a PMD entry is a bit tricky. > > So this seems like a large patch which might have bugs. > > > > Thoughts? > > ...but if it was a problem of just mismatched waitqueue's I would have > expected it to trigger prior to commit b15cd800682f "dax: Convert page > fault handlers to XArray". That commit converts grab_mapping_entry() (called by dax_iomap_pmd_fault()) from calling get_unlocked_mapping_entry() to calling get_unlocked_entry(). get_unlocked_mapping_entry() (eventually) called __radix_tree_lookup() instead of dax_find_conflict(). > This hunk, if I'm reading it correctly, > looks suspicious: @index in this case is coming directly from > vm->pgoff without pmd alignment adjustment whereas after the > conversion it's always pmd aligned from the xas->xa_index. So perhaps > the issue is that the lock happens at pte granularity. I expect it > would cause the old put_locked_mapping_entry() to WARN, but maybe that > avoids the lockup and was missed in the bisect. I don't think that hunk is the problem. The __radix_tree_lookup() is going to return a 'slot' which points to the canonical slot, no matter which of the 512 indices corresponding to that slot is chosen. So I think it's going to do essentially the same thing. > @@ -884,21 +711,18 @@ static void *dax_insert_entry(struct > address_space *mapping, > * existing entry is a PMD, we will just leave the PMD in the > * tree and dirty it if necessary. > */ > - struct radix_tree_node *node; > - void **slot; > - void *ret; > - > - ret = __radix_tree_lookup(pages, index, &node, &slot); > - WARN_ON_ONCE(ret != entry); > - __radix_tree_replace(pages, node, slot, > - new_entry, NULL); > + void *old = dax_lock_entry(xas, new_entry); > + WARN_ON_ONCE(old != xa_mk_value(xa_to_value(entry) | > + DAX_LOCKED)); > entry = new_entry; > + } else { > + xas_load(xas); /* Walk the xa_state */ > } > > if (dirty) > - radix_tree_tag_set(pages, index, PAGECACHE_TAG_DIRTY); > + xas_set_mark(xas, PAGECACHE_TAG_DIRTY); > > - xa_unlock_irq(pages); > + xas_unlock_irq(xas); > return entry; > } _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
WARNING: multiple messages have this Message-ID (diff)
From: Matthew Wilcox <willy@infradead.org> To: Dan Williams <dan.j.williams@intel.com> Cc: Seema Pandit <seema.pandit@intel.com>, linux-nvdimm <linux-nvdimm@lists.01.org>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, stable <stable@vger.kernel.org>, Robert Barror <robert.barror@intel.com>, linux-fsdevel <linux-fsdevel@vger.kernel.org>, Jan Kara <jack@suse.cz> Subject: Re: [PATCH] filesystem-dax: Disable PMD support Date: Mon, 1 Jul 2019 20:34:10 -0700 [thread overview] Message-ID: <20190702033410.GB1729@bombadil.infradead.org> (raw) In-Reply-To: <CAPcyv4j2NBPBEUU3UW1Q5OyOEuo9R5e90HpkowpeEkMsAKiUyQ@mail.gmail.com> On Sun, Jun 30, 2019 at 02:37:32PM -0700, Dan Williams wrote: > On Sun, Jun 30, 2019 at 8:23 AM Matthew Wilcox <willy@infradead.org> wrote: > > I think my theory was slightly mistaken, but your fix has the effect of > > fixing the actual problem too. > > > > The xas->xa_index for a PMD is going to be PMD-aligned (ie a multiple of > > 512), but xas_find_conflict() does _not_ adjust xa_index (... which I > > really should have mentioned in the documentation). So we go to sleep > > on the PMD-aligned index instead of the index of the PTE. Your patch > > fixes this by using the PMD-aligned index for PTEs too. > > > > I'm trying to come up with a clean fix for this. Clearly we > > shouldn't wait for a PTE entry if we're looking for a PMD entry. > > But what should get_unlocked_entry() return if it detects that case? > > We could have it return an error code encoded as an internal entry, > > like grab_mapping_entry() does. Or we could have it return the _locked_ > > PTE entry, and have callers interpret that. > > > > At least get_unlocked_entry() is static, but it's got quite a few callers. > > Trying to discern which ones might ask for a PMD entry is a bit tricky. > > So this seems like a large patch which might have bugs. > > > > Thoughts? > > ...but if it was a problem of just mismatched waitqueue's I would have > expected it to trigger prior to commit b15cd800682f "dax: Convert page > fault handlers to XArray". That commit converts grab_mapping_entry() (called by dax_iomap_pmd_fault()) from calling get_unlocked_mapping_entry() to calling get_unlocked_entry(). get_unlocked_mapping_entry() (eventually) called __radix_tree_lookup() instead of dax_find_conflict(). > This hunk, if I'm reading it correctly, > looks suspicious: @index in this case is coming directly from > vm->pgoff without pmd alignment adjustment whereas after the > conversion it's always pmd aligned from the xas->xa_index. So perhaps > the issue is that the lock happens at pte granularity. I expect it > would cause the old put_locked_mapping_entry() to WARN, but maybe that > avoids the lockup and was missed in the bisect. I don't think that hunk is the problem. The __radix_tree_lookup() is going to return a 'slot' which points to the canonical slot, no matter which of the 512 indices corresponding to that slot is chosen. So I think it's going to do essentially the same thing. > @@ -884,21 +711,18 @@ static void *dax_insert_entry(struct > address_space *mapping, > * existing entry is a PMD, we will just leave the PMD in the > * tree and dirty it if necessary. > */ > - struct radix_tree_node *node; > - void **slot; > - void *ret; > - > - ret = __radix_tree_lookup(pages, index, &node, &slot); > - WARN_ON_ONCE(ret != entry); > - __radix_tree_replace(pages, node, slot, > - new_entry, NULL); > + void *old = dax_lock_entry(xas, new_entry); > + WARN_ON_ONCE(old != xa_mk_value(xa_to_value(entry) | > + DAX_LOCKED)); > entry = new_entry; > + } else { > + xas_load(xas); /* Walk the xa_state */ > } > > if (dirty) > - radix_tree_tag_set(pages, index, PAGECACHE_TAG_DIRTY); > + xas_set_mark(xas, PAGECACHE_TAG_DIRTY); > > - xa_unlock_irq(pages); > + xas_unlock_irq(xas); > return entry; > }
next prev parent reply other threads:[~2019-07-02 3:34 UTC|newest] Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-06-27 0:15 [PATCH] filesystem-dax: Disable PMD support Dan Williams 2019-06-27 0:15 ` Dan Williams 2019-06-27 12:34 ` Matthew Wilcox 2019-06-27 12:34 ` Matthew Wilcox 2019-06-27 16:06 ` Dan Williams 2019-06-27 16:06 ` Dan Williams 2019-06-27 18:29 ` Dan Williams 2019-06-27 18:29 ` Dan Williams 2019-06-27 18:58 ` Dan Williams 2019-06-27 18:58 ` Dan Williams 2019-06-27 19:09 ` Dan Williams 2019-06-27 19:09 ` Dan Williams 2019-06-27 19:59 ` Matthew Wilcox 2019-06-27 19:59 ` Matthew Wilcox 2019-06-28 2:39 ` Dan Williams 2019-06-28 2:39 ` Dan Williams 2019-06-28 16:37 ` Matthew Wilcox 2019-06-28 16:37 ` Matthew Wilcox 2019-06-28 16:39 ` Dan Williams 2019-06-28 16:39 ` Dan Williams 2019-06-28 16:54 ` Matthew Wilcox 2019-06-28 16:54 ` Matthew Wilcox 2019-06-29 16:03 ` Matthew Wilcox 2019-06-29 16:03 ` Matthew Wilcox 2019-06-30 7:27 ` Dan Williams 2019-06-30 7:27 ` Dan Williams 2019-06-30 8:01 ` Dan Williams 2019-06-30 8:01 ` Dan Williams 2019-06-30 15:23 ` Matthew Wilcox 2019-06-30 15:23 ` Matthew Wilcox 2019-06-30 21:37 ` Dan Williams 2019-06-30 21:37 ` Dan Williams 2019-07-02 3:34 ` Matthew Wilcox [this message] 2019-07-02 3:34 ` Matthew Wilcox 2019-07-02 15:37 ` Dan Williams 2019-07-02 15:37 ` Dan Williams 2019-07-03 0:22 ` Boaz Harrosh 2019-07-03 0:22 ` Boaz Harrosh 2019-07-03 0:42 ` Dan Williams 2019-07-03 0:42 ` Dan Williams 2019-07-03 1:39 ` Boaz Harrosh 2019-07-03 1:39 ` Boaz Harrosh 2019-07-01 12:11 ` Jan Kara 2019-07-01 12:11 ` Jan Kara 2019-07-03 15:47 ` Matthew Wilcox 2019-07-03 15:47 ` Matthew Wilcox 2019-07-04 16:40 ` Jan Kara 2019-07-04 16:40 ` Jan Kara
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190702033410.GB1729@bombadil.infradead.org \ --to=willy@infradead.org \ --cc=dan.j.williams@intel.com \ --cc=jack@suse.cz \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-nvdimm@lists.01.org \ --cc=robert.barror@intel.com \ --cc=seema.pandit@intel.com \ --cc=stable@vger.kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.