linux-kernel.vger.kernel.org archive mirror
From: Logan Gunthorpe <logang@deltatee.com>
To: Dan Williams <dan.j.williams@intel.com>, akpm@linux-foundation.org
Cc: stable@vger.kernel.org, "Jérôme Glisse" <jglisse@redhat.com>,
	"Christoph Hellwig" <hch@lst.de>,
	torvalds@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Stephen Bates" <sbates@raithlin.com>
Subject: Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
Date: Tue, 27 Nov 2018 14:43:52 -0700
Message-ID: <6875ca04-a36a-89ae-825b-f629ab011d47@deltatee.com>
In-Reply-To: <154275558526.76910.7535251937849268605.stgit@dwillia2-desk3.amr.corp.intel.com>

Hey Dan,

On 2018-11-20 4:13 p.m., Dan Williams wrote:
> The last step before devm_memremap_pages() returns success is to
> allocate a release action, devm_memremap_pages_release(), to tear the
> entire setup down. However, the result from devm_add_action() is not
> checked.
> 
> Checking the error from devm_add_action() is not enough. The api
> currently relies on the fact that the percpu_ref it is using is killed
> by the time the devm_memremap_pages_release() is run. Rather than
> continue this awkward situation, offload the responsibility of killing
> the percpu_ref to devm_memremap_pages_release() directly. This allows
> devm_memremap_pages() to do the right thing  relative to init failures
> and shutdown.
> 
> Without this change we could fail to register the teardown of
> devm_memremap_pages(). The likelihood of hitting this failure is tiny as
> small memory allocations almost always succeed. However, the impact of
> the failure is large given any future reconfiguration, or
> disable/enable, of an nvdimm namespace will fail forever as subsequent
> calls to devm_memremap_pages() will fail to setup the pgmap_radix since
> there will be stale entries for the physical address range.
> 
> An argument could be made to require that the ->kill() operation be set
> in the @pgmap arg rather than passed in separately. However, it helps
> code readability, tracking the lifetime of a given instance, to be able
> to grep the kill routine directly at the devm_memremap_pages() call
> site.
> 
> Cc: <stable@vger.kernel.org>
> Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...")
> Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
> Reported-by: Logan Gunthorpe <logang@deltatee.com>
> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

I just realized that this patch, which was recently added to the mm
tree, will break p2pdma. This is largely because the patch was written
and reviewed before p2pdma was merged (in 4.20). Originally, I think we
both expected this patch to be merged before p2pdma, but that's not what
happened.

Also, while testing this, I found the teardown is still not quite
correct. In p2pdma, the struct pages will be removed before all of the
percpu references have been released, so if the device is unbound while
pages are in use, there will be a kernel panic. This is because we wait
on the completion that indicates all references have been freed only
after devm_memremap_pages_release() has been called and the pages have
been removed. This is fairly easily fixed by waiting for the completion
in the kill function and moving the kill call after the last put_page().
I suspect device DAX also has this problem, but I'm not entirely certain
whether something else might be preventing us from hitting this bug.
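
The ordering problem can be sketched in user space. What follows is a
toy, single-threaded model and not kernel code: every toy_* name is
hypothetical, a plain counter stands in for the percpu_ref, and the
"wait" simply drains the references still held by in-flight users
instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a percpu_ref plus its completion (assumption). */
struct toy_ref {
	long count;	/* initial ref + one per page + in-flight users */
	bool killed;	/* set once the ref has been killed */
	bool done;	/* stands in for complete_all() at count zero */
};

struct toy_pgmap {
	struct toy_ref ref;
	int npages;
	int user_refs;		/* references still held by users */
	bool pages_present;
	bool removed_while_busy;	/* pages removed with refs outstanding */
};

static void toy_ref_put(struct toy_ref *ref)
{
	assert(ref->count > 0);
	if (--ref->count == 0)
		ref->done = true;	/* release callback would fire here */
}

static void toy_ref_kill(struct toy_ref *ref)
{
	ref->killed = true;
	toy_ref_put(ref);		/* drop the initial reference */
}

/* Model of waiting: users finish and drop their references. */
static void toy_wait_for_completion(struct toy_pgmap *pgmap)
{
	while (!pgmap->ref.done && pgmap->user_refs > 0) {
		pgmap->user_refs--;
		toy_ref_put(&pgmap->ref);
	}
}

static void toy_pgmap_init(struct toy_pgmap *pgmap, int npages, int user_refs)
{
	pgmap->ref.count = 1 + npages + user_refs;
	pgmap->ref.killed = false;
	pgmap->ref.done = false;
	pgmap->npages = npages;
	pgmap->user_refs = user_refs;
	pgmap->pages_present = true;
	pgmap->removed_while_busy = false;
}

/* Stand-in for the put_page() loop over every pfn. */
static void toy_put_pages(struct toy_pgmap *pgmap)
{
	for (int i = 0; i < pgmap->npages; i++)
		toy_ref_put(&pgmap->ref);
}

static void toy_remove_pages(struct toy_pgmap *pgmap)
{
	if (pgmap->ref.count != 0)
		pgmap->removed_while_busy = true;
	pgmap->pages_present = false;
}

/* Ordering described as broken: kill, put pages, remove, then wait. */
static bool toy_teardown_wait_late(struct toy_pgmap *pgmap)
{
	toy_ref_kill(&pgmap->ref);
	toy_put_pages(pgmap);
	toy_remove_pages(pgmap);
	toy_wait_for_completion(pgmap);
	return !pgmap->removed_while_busy;
}

/* Proposed ordering: put pages, kill and wait, only then remove. */
static bool toy_teardown_wait_in_kill(struct toy_pgmap *pgmap)
{
	toy_put_pages(pgmap);
	toy_ref_kill(&pgmap->ref);
	toy_wait_for_completion(pgmap);
	toy_remove_pages(pgmap);
	return !pgmap->removed_while_busy;
}
```

In this model, with four pages and one in-flight user,
toy_teardown_wait_late() reports that the pages were removed while a
reference was still outstanding, whereas toy_teardown_wait_in_kill()
does not. With no in-flight users, both orderings come out clean, which
matches the bug only showing up when pages are in use at unbind time.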

Ideally, as part of this patch, we would update the p2pdma call site
for devm_memremap_pages() and fix the completion issue. The diff for all
this is below, but if you'd like, I can send a proper patch.

Thanks,

Logan

--


diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index ae3c5b25dcc7..1df7bdb45eab 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -82,9 +82,10 @@ static void pci_p2pdma_percpu_release(struct percpu_ref *ref)
        complete_all(&p2p->devmap_ref_done);
 }

-static void pci_p2pdma_percpu_kill(void *data)
+static void pci_p2pdma_percpu_kill(struct percpu_ref *ref)
 {
-       struct percpu_ref *ref = data;
+       struct pci_p2pdma *p2p =
+               container_of(ref, struct pci_p2pdma, devmap_ref);

        /*
         * pci_p2pdma_add_resource() may be called multiple times
@@ -96,6 +97,7 @@ static void pci_p2pdma_percpu_kill(void *data)
                return;

        percpu_ref_kill(ref);
+       wait_for_completion(&p2p->devmap_ref_done);
 }

 static void pci_p2pdma_release(void *data)
@@ -105,7 +107,6 @@ static void pci_p2pdma_release(void *data)
        if (!pdev->p2pdma)
                return;

-       wait_for_completion(&pdev->p2pdma->devmap_ref_done);
        percpu_ref_exit(&pdev->p2pdma->devmap_ref);

        gen_pool_destroy(pdev->p2pdma->pool);
@@ -198,6 +199,7 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
        pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
        pgmap->pci_p2pdma_bus_offset = pci_bus_address(pdev, bar) -
                pci_resource_start(pdev, bar);
+       pgmap->kill = pci_p2pdma_percpu_kill;

        addr = devm_memremap_pages(&pdev->dev, pgmap);
        if (IS_ERR(addr)) {
@@ -211,11 +213,6 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
        if (error)
                goto pgmap_free;

-       error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_percpu_kill,
-                                         &pdev->p2pdma->devmap_ref);
-       if (error)
-               goto pgmap_free;
-
        pci_info(pdev, "added peer-to-peer DMA memory %pR\n",
                 &pgmap->res);

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 5e45f0c327a5..dd9a953e796a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -88,9 +88,9 @@ static void devm_memremap_pages_release(void *data)
        resource_size_t align_start, align_size;
        unsigned long pfn;

-       pgmap->kill(pgmap->ref);
        for_each_device_pfn(pfn, pgmap)
                put_page(pfn_to_page(pfn));
+       pgmap->kill(pgmap->ref);

        /* pages are dead and unused, undo the arch mapping */
        align_start = res->start & ~(SECTION_SIZE - 1);
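
In the diff above, the kill callback now receives the percpu_ref itself
and recovers the enclosing pci_p2pdma with container_of(). A minimal
user-space sketch of that pattern follows; the toy_* types, the macro
name, and the pool_id field are hypothetical stand-ins, not the kernel
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* User-space approximation of the kernel's container_of() (assumption). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_percpu_ref { long count; };
struct toy_completion { int done; };

struct toy_p2pdma {
	int pool_id;				/* hypothetical field */
	struct toy_percpu_ref devmap_ref;	/* embedded, not a pointer */
	struct toy_completion devmap_ref_done;
};

/* Given only the embedded ref, find the structure that contains it. */
static struct toy_p2pdma *toy_p2pdma_from_ref(struct toy_percpu_ref *ref)
{
	assert(ref != NULL);
	return toy_container_of(ref, struct toy_p2pdma, devmap_ref);
}
```

This only works because the ref is embedded in the outer structure, so
a callback that is handed the ref can reach sibling members such as the
completion without a separate data pointer.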
