* [PATCH 1/2] powerpc/npu: Use flush_all_mm() instead of flush_tlb_mm()
@ 2017-09-05  3:57 Alistair Popple
  2017-09-05  3:57 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
  0 siblings, 1 reply; 7+ messages in thread
From: Alistair Popple @ 2017-09-05  3:57 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: fbarrat, arbab, mpe, andrew.donnellan, Alistair Popple

With the optimisations introduced by commit a46cc7a908 ("powerpc/mm/radix:
Improve TLB/PWC flushes"), flush_tlb_mm() no longer flushes the page walk
cache with radix. Switch to using flush_all_mm() to ensure the pwc and tlb
are properly flushed on the nmmu.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
---

Michael,

This depends on Frederic's series:

http://patchwork.ozlabs.org/patch/809343/
http://patchwork.ozlabs.org/patch/809344/

Thanks.

Alistair

 arch/powerpc/platforms/powernv/npu-dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 2cb6cbe..2fff9a65 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -549,7 +549,7 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	 * Unfortunately the nest mmu does not support flushing specific
 	 * addresses so we have to flush the whole mm.
 	 */
-	flush_tlb_mm(npu_context->mm);
+	flush_all_mm(npu_context->mm);
 
 	/*
 	 * Loop over all the NPUs this process is active on and launch
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  3:57 [PATCH 1/2] powerpc/npu: Use flush_all_mm() instead of flush_tlb_mm() Alistair Popple
@ 2017-09-05  3:57 ` Alistair Popple
  2017-09-05  8:10   ` Frederic Barrat
  2017-09-08  4:36   ` kbuild test robot
  0 siblings, 2 replies; 7+ messages in thread
From: Alistair Popple @ 2017-09-05  3:57 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: fbarrat, arbab, mpe, andrew.donnellan, Alistair Popple

The nest mmu required an explicit flush as a tlbi would not flush it in the
same way as the core. However an alternate firmware fix exists which should
eliminate the need for this flush, so instead add a device-tree property
(ibm,nmmu-flush) on the NVLink2 PHB to enable it only if required.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
---
 arch/powerpc/platforms/powernv/npu-dma.c | 28 +++++++++++++++++++++++-----
 arch/powerpc/platforms/powernv/pci.h     |  3 +++
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 2fff9a65..4b4fcac 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -395,6 +395,7 @@ struct npu_context {
 	struct pci_dev *npdev[NV_MAX_NPUS][NV_MAX_LINKS];
 	struct mmu_notifier mn;
 	struct kref kref;
+	bool nmmu_flush;
 
 	/* Callback to stop translation requests on a given GPU */
 	struct npu_context *(*release_cb)(struct npu_context *, void *);
@@ -545,11 +546,13 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
 	unsigned long pid = npu_context->mm->context.id;
 
-	/*
-	 * Unfortunately the nest mmu does not support flushing specific
-	 * addresses so we have to flush the whole mm.
-	 */
-	flush_all_mm(npu_context->mm);
+	if (npu_context->nmmu_flush)
+		/*
+		 * Unfortunately the nest mmu does not support flushing specific
+		 * addresses so we have to flush the whole mm once before
+		 * shooting down the GPU translation.
+		 */
+		flush_all_mm(npu_context->mm);
 
 	/*
 	 * Loop over all the NPUs this process is active on and launch
@@ -722,6 +725,16 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
 		return ERR_PTR(-ENODEV);
 	npu_context->npdev[npu->index][nvlink_index] = npdev;
 
+	if (!nphb->npu.nmmu_flush) {
+		/*
+		 * If we're not explicitly flushing ourselves we need to mark
+		 * the thread for global flushes
+		 */
+		npu_context->nmmu_flush = false;
+		inc_mm_active_cpus(mm);
+	} else
+		npu_context->nmmu_flush = true;
+
 	return npu_context;
 }
 EXPORT_SYMBOL(pnv_npu2_init_context);
@@ -731,6 +744,9 @@ static void pnv_npu2_release_context(struct kref *kref)
 	struct npu_context *npu_context =
 		container_of(kref, struct npu_context, kref);
 
+	if (!npu_context->nmmu_flush)
+		dec_mm_active_cpus(npu_context->mm);
+
 	npu_context->mm->context.npu_context = NULL;
 	mmu_notifier_unregister(&npu_context->mn,
 				npu_context->mm);
@@ -819,6 +835,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
 	static int npu_index;
 	uint64_t rc = 0;
 
+	phb->npu.nmmu_flush =
+		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
 	for_each_child_of_node(phb->hose->dn, dn) {
 		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
 		if (gpdev) {
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index a95273c..22025c6 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -187,6 +187,9 @@ struct pnv_phb {
 
 		/* Bitmask for MMIO register usage */
 		unsigned long mmio_atsd_usage;
+
+		/* Do we need to explicitly flush the nest mmu? */
+		bool nmmu_flush;
 	} npu;
 
 #ifdef CONFIG_CXL_BASE
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  3:57 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
@ 2017-09-05  8:10   ` Frederic Barrat
  2017-09-05 11:48     ` Alistair Popple
  0 siblings, 1 reply; 7+ messages in thread
From: Frederic Barrat @ 2017-09-05  8:10 UTC (permalink / raw)
  To: Alistair Popple, linuxppc-dev; +Cc: arbab, mpe, andrew.donnellan

On 05/09/2017 at 05:57, Alistair Popple wrote:
> The nest mmu required an explicit flush as a tlbi would not flush it in the
> same way as the core. However an alternate firmware fix exists which should
> eliminate the need for this flush, so instead add a device-tree property
> (ibm,nmmu-flush) on the NVLink2 PHB to enable it only if required.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 28 +++++++++++++++++++++++-----
>  arch/powerpc/platforms/powernv/pci.h     |  3 +++
>  2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 2fff9a65..4b4fcac 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -395,6 +395,7 @@ struct npu_context {
>  	struct pci_dev *npdev[NV_MAX_NPUS][NV_MAX_LINKS];
>  	struct mmu_notifier mn;
>  	struct kref kref;
> +	bool nmmu_flush;
>
>  	/* Callback to stop translation requests on a given GPU */
>  	struct npu_context *(*release_cb)(struct npu_context *, void *);
> @@ -545,11 +546,13 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
>  	struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
>  	unsigned long pid = npu_context->mm->context.id;
>
> -	/*
> -	 * Unfortunately the nest mmu does not support flushing specific
> -	 * addresses so we have to flush the whole mm.
> -	 */
> -	flush_all_mm(npu_context->mm);
> +	if (npu_context->nmmu_flush)
> +		/*
> +		 * Unfortunately the nest mmu does not support flushing specific
> +		 * addresses so we have to flush the whole mm once before
> +		 * shooting down the GPU translation.
> +		 */
> +		flush_all_mm(npu_context->mm);
>
>  	/*
>  	 * Loop over all the NPUs this process is active on and launch
> @@ -722,6 +725,16 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
>  		return ERR_PTR(-ENODEV);
>  	npu_context->npdev[npu->index][nvlink_index] = npdev;
>
> +	if (!nphb->npu.nmmu_flush) {
> +		/*
> +		 * If we're not explicitly flushing ourselves we need to mark
> +		 * the thread for global flushes
> +		 */
> +		npu_context->nmmu_flush = false;
> +		inc_mm_active_cpus(mm);

I can see that you use the inc_mm_active_cpus()/dec_mm_active_cpus() and
not the new mm_context_add_copro()/mm_context_remove_copro(). The only
difference is that mm_context_remove_copro() adds an extra flush of the
full mm (pwc and tlb) before decrementing the active cpu count.

But don't you need the flush when releasing the context? As soon as the
active cpu count is decremented, the next TLBI may be local and not
visible to the nMMU, so the nMMU could keep obsolete/wrong entries in
its cache.

  Fred

> +	} else
> +		npu_context->nmmu_flush = true;
> +
>  	return npu_context;
>  }
>  EXPORT_SYMBOL(pnv_npu2_init_context);
> @@ -731,6 +744,9 @@ static void pnv_npu2_release_context(struct kref *kref)
>  	struct npu_context *npu_context =
>  		container_of(kref, struct npu_context, kref);
>
> +	if (!npu_context->nmmu_flush)
> +		dec_mm_active_cpus(npu_context->mm);
> +
>  	npu_context->mm->context.npu_context = NULL;
>  	mmu_notifier_unregister(&npu_context->mn,
>  				npu_context->mm);
> @@ -819,6 +835,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
>  	static int npu_index;
>  	uint64_t rc = 0;
>
> +	phb->npu.nmmu_flush =
> +		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
>  	for_each_child_of_node(phb->hose->dn, dn) {
>  		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
>  		if (gpdev) {
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index a95273c..22025c6 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -187,6 +187,9 @@ struct pnv_phb {
>
>  		/* Bitmask for MMIO register usage */
>  		unsigned long mmio_atsd_usage;
> +
> +		/* Do we need to explicitly flush the nest mmu? */
> +		bool nmmu_flush;
>  	} npu;
>
> #ifdef CONFIG_CXL_BASE
>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  8:10   ` Frederic Barrat
@ 2017-09-05 11:48     ` Alistair Popple
  0 siblings, 0 replies; 7+ messages in thread
From: Alistair Popple @ 2017-09-05 11:48 UTC (permalink / raw)
  To: Frederic Barrat; +Cc: linuxppc-dev, arbab, mpe, andrew.donnellan

Hi,

On Tue, 5 Sep 2017 10:10:03 AM Frederic Barrat wrote:
> >
> > +	if (!nphb->npu.nmmu_flush) {
> > +		/*
> > +		 * If we're not explicitly flushing ourselves we need to mark
> > +		 * the thread for global flushes
> > +		 */
> > +		npu_context->nmmu_flush = false;
> > +		inc_mm_active_cpus(mm);
>
> I can see that you use the inc_mm_active_cpus()/dec_mm_active_cpus() and
> not the new mm_context_add_copro()/mm_context_remove_copro(). The only
> difference is that mm_context_remove_copro() adds an extra flush of the
> full mm (pwc and tlb) before decrementing the active cpu count.

Thanks for pointing that out, I'd overlooked the difference between the two.

> But don't you need the flush when releasing the context? As soon as the
> active cpu count is decremented, the next TLBI may be local and not
> visible to the nMMU, so the nMMU could keep obsolete/wrong entries in
> its cache.

Yes, I think you're right. In practice I doubt we'd ever hit it as the driver
never calls pnv_npu2_destroy_context() prior to the process being torn down
(which should trigger the right global tlbies). But obviously something that
needs fixing so I will post a v2 ... thanks for reviewing!

Regards,

Alistair

> Fred
>
> > +	} else
> > +		npu_context->nmmu_flush = true;
> > +
> >  	return npu_context;
> >  }
> >  EXPORT_SYMBOL(pnv_npu2_init_context);
> > @@ -731,6 +744,9 @@ static void pnv_npu2_release_context(struct kref *kref)
> >  	struct npu_context *npu_context =
> >  		container_of(kref, struct npu_context, kref);
> >
> > +	if (!npu_context->nmmu_flush)
> > +		dec_mm_active_cpus(npu_context->mm);
> > +
> >  	npu_context->mm->context.npu_context = NULL;
> >  	mmu_notifier_unregister(&npu_context->mn,
> >  				npu_context->mm);
> > @@ -819,6 +835,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
> >  	static int npu_index;
> >  	uint64_t rc = 0;
> >
> > +	phb->npu.nmmu_flush =
> > +		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
> >  	for_each_child_of_node(phb->hose->dn, dn) {
> >  		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
> >  		if (gpdev) {
> > diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> > index a95273c..22025c6 100644
> > --- a/arch/powerpc/platforms/powernv/pci.h
> > +++ b/arch/powerpc/platforms/powernv/pci.h
> > @@ -187,6 +187,9 @@ struct pnv_phb {
> >
> >  		/* Bitmask for MMIO register usage */
> >  		unsigned long mmio_atsd_usage;
> > +
> > +		/* Do we need to explicitly flush the nest mmu? */
> > +		bool nmmu_flush;
> >  	} npu;
> >
> > #ifdef CONFIG_CXL_BASE
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  3:57 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
  2017-09-05  8:10   ` Frederic Barrat
@ 2017-09-08  4:36   ` kbuild test robot
  1 sibling, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2017-09-08  4:36 UTC (permalink / raw)
  To: Alistair Popple
  Cc: kbuild-all, linuxppc-dev, andrew.donnellan, arbab, fbarrat,
	Alistair Popple

[-- Attachment #1: Type: text/plain, Size: 10022 bytes --]

Hi Alistair,

[auto build test ERROR on powerpc/next]
[also build test ERROR on next-20170907]
[cannot apply to v4.13]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Alistair-Popple/powerpc-npu-Use-flush_all_mm-instead-of-flush_tlb_mm/20170908-072908
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/powernv/npu-dma.c: In function 'mmio_invalidate':
   arch/powerpc/platforms/powernv/npu-dma.c:555:3: error: implicit declaration of function 'flush_all_mm' [-Werror=implicit-function-declaration]
      flush_all_mm(npu_context->mm);
      ^~~~~~~~~~~~
   arch/powerpc/platforms/powernv/npu-dma.c: In function 'pnv_npu2_init_context':
>> arch/powerpc/platforms/powernv/npu-dma.c:744:3: error: implicit declaration of function 'inc_mm_active_cpus' [-Werror=implicit-function-declaration]
      inc_mm_active_cpus(mm);
      ^~~~~~~~~~~~~~~~~~
   arch/powerpc/platforms/powernv/npu-dma.c: In function 'pnv_npu2_release_context':
>> arch/powerpc/platforms/powernv/npu-dma.c:758:3: error: implicit declaration of function 'dec_mm_active_cpus' [-Werror=implicit-function-declaration]
      dec_mm_active_cpus(npu_context->mm);
      ^~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors

vim +/inc_mm_active_cpus +744 arch/powerpc/platforms/powernv/npu-dma.c

   534	
   535	/*
   536	 * Invalidate either a single address or an entire PID depending on
   537	 * the value of va.
   538	 */
   539	static void mmio_invalidate(struct npu_context *npu_context, int va,
   540			unsigned long address, bool flush)
   541	{
   542		int i, j;
   543		struct npu *npu;
   544		struct pnv_phb *nphb;
   545		struct pci_dev *npdev;
   546		struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
   547		unsigned long pid = npu_context->mm->context.id;
   548	
   549		if (npu_context->nmmu_flush)
   550			/*
   551			 * Unfortunately the nest mmu does not support flushing specific
   552			 * addresses so we have to flush the whole mm once before
   553			 * shooting down the GPU translation.
   554			 */
 > 555			flush_all_mm(npu_context->mm);
   556	
   557		/*
   558		 * Loop over all the NPUs this process is active on and launch
   559		 * an invalidate.
   560		 */
   561		for (i = 0; i <= max_npu2_index; i++) {
   562			mmio_atsd_reg[i].reg = -1;
   563			for (j = 0; j < NV_MAX_LINKS; j++) {
   564				npdev = npu_context->npdev[i][j];
   565				if (!npdev)
   566					continue;
   567	
   568				nphb = pci_bus_to_host(npdev->bus)->private_data;
   569				npu = &nphb->npu;
   570				mmio_atsd_reg[i].npu = npu;
   571	
   572				if (va)
   573					mmio_atsd_reg[i].reg =
   574						mmio_invalidate_va(npu, address, pid,
   575								flush);
   576				else
   577					mmio_atsd_reg[i].reg =
   578						mmio_invalidate_pid(npu, pid, flush);
   579	
   580				/*
   581				 * The NPU hardware forwards the shootdown to all GPUs
   582				 * so we only have to launch one shootdown per NPU.
   583				 */
   584				break;
   585			}
   586		}
   587	
   588		mmio_invalidate_wait(mmio_atsd_reg, flush);
   589		if (flush)
   590			/* Wait for the flush to complete */
   591			mmio_invalidate_wait(mmio_atsd_reg, false);
   592	}
   593	
   594	static void pnv_npu2_mn_release(struct mmu_notifier *mn,
   595					struct mm_struct *mm)
   596	{
   597		struct npu_context *npu_context = mn_to_npu_context(mn);
   598	
   599		/* Call into device driver to stop requests to the NMMU */
   600		if (npu_context->release_cb)
   601			npu_context->release_cb(npu_context, npu_context->priv);
   602	
   603		/*
   604		 * There should be no more translation requests for this PID, but we
   605		 * need to ensure any entries for it are removed from the TLB.
   606		 */
   607		mmio_invalidate(npu_context, 0, 0, true);
   608	}
   609	
   610	static void pnv_npu2_mn_change_pte(struct mmu_notifier *mn,
   611					struct mm_struct *mm,
   612					unsigned long address,
   613					pte_t pte)
   614	{
   615		struct npu_context *npu_context = mn_to_npu_context(mn);
   616	
   617		mmio_invalidate(npu_context, 1, address, true);
   618	}
   619	
   620	static void pnv_npu2_mn_invalidate_page(struct mmu_notifier *mn,
   621					struct mm_struct *mm,
   622					unsigned long address)
   623	{
   624		struct npu_context *npu_context = mn_to_npu_context(mn);
   625	
   626		mmio_invalidate(npu_context, 1, address, true);
   627	}
   628	
   629	static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
   630					struct mm_struct *mm,
   631					unsigned long start, unsigned long end)
   632	{
   633		struct npu_context *npu_context = mn_to_npu_context(mn);
   634		unsigned long address;
   635	
   636		for (address = start; address < end; address += PAGE_SIZE)
   637			mmio_invalidate(npu_context, 1, address, false);
   638	
   639		/* Do the flush only on the final addess == end */
   640		mmio_invalidate(npu_context, 1, address, true);
   641	}
   642	
   643	static const struct mmu_notifier_ops nv_nmmu_notifier_ops = {
   644		.release = pnv_npu2_mn_release,
   645		.change_pte = pnv_npu2_mn_change_pte,
   646		.invalidate_page = pnv_npu2_mn_invalidate_page,
   647		.invalidate_range = pnv_npu2_mn_invalidate_range,
   648	};
   649	
   650	/*
   651	 * Call into OPAL to setup the nmmu context for the current task in
   652	 * the NPU. This must be called to setup the context tables before the
   653	 * GPU issues ATRs. pdev should be a pointed to PCIe GPU device.
   654	 *
   655	 * A release callback should be registered to allow a device driver to
   656	 * be notified that it should not launch any new translation requests
   657	 * as the final TLB invalidate is about to occur.
   658	 *
   659	 * Returns an error if there no contexts are currently available or a
   660	 * npu_context which should be passed to pnv_npu2_handle_fault().
   661	 *
   662	 * mmap_sem must be held in write mode.
   663	 */
   664	struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
   665				unsigned long flags,
   666				struct npu_context *(*cb)(struct npu_context *, void *),
   667				void *priv)
   668	{
   669		int rc;
   670		u32 nvlink_index;
   671		struct device_node *nvlink_dn;
   672		struct mm_struct *mm = current->mm;
   673		struct pnv_phb *nphb;
   674		struct npu *npu;
   675		struct npu_context *npu_context;
   676	
   677		/*
   678		 * At present we don't support GPUs connected to multiple NPUs and I'm
   679		 * not sure the hardware does either.
   680		 */
   681		struct pci_dev *npdev = pnv_pci_get_npu_dev(gpdev, 0);
   682	
   683		if (!firmware_has_feature(FW_FEATURE_OPAL))
   684			return ERR_PTR(-ENODEV);
   685	
   686		if (!npdev)
   687			/* No nvlink associated with this GPU device */
   688			return ERR_PTR(-ENODEV);
   689	
   690		if (!mm || mm->context.id == 0) {
   691			/*
   692			 * Kernel thread contexts are not supported and context id 0 is
   693			 * reserved on the GPU.
   694			 */
   695			return ERR_PTR(-EINVAL);
   696		}
   697	
   698		nphb = pci_bus_to_host(npdev->bus)->private_data;
   699		npu = &nphb->npu;
   700	
   701		/*
   702		 * Setup the NPU context table for a particular GPU. These need to be
   703		 * per-GPU as we need the tables to filter ATSDs when there are no
   704		 * active contexts on a particular GPU.
   705		 */
   706		rc = opal_npu_init_context(nphb->opal_id, mm->context.id, flags,
   707					PCI_DEVID(gpdev->bus->number, gpdev->devfn));
   708		if (rc < 0)
   709			return ERR_PTR(-ENOSPC);
   710	
   711		/*
   712		 * We store the npu pci device so we can more easily get at the
   713		 * associated npus.
   714		 */
   715		npu_context = mm->context.npu_context;
   716		if (!npu_context) {
   717			npu_context = kzalloc(sizeof(struct npu_context), GFP_KERNEL);
   718			if (!npu_context)
   719				return ERR_PTR(-ENOMEM);
   720	
   721			mm->context.npu_context = npu_context;
   722			npu_context->mm = mm;
   723			npu_context->mn.ops = &nv_nmmu_notifier_ops;
   724			__mmu_notifier_register(&npu_context->mn, mm);
   725			kref_init(&npu_context->kref);
   726		} else {
   727			kref_get(&npu_context->kref);
   728		}
   729	
   730		npu_context->release_cb = cb;
   731		npu_context->priv = priv;
   732		nvlink_dn = of_parse_phandle(npdev->dev.of_node, "ibm,nvlink", 0);
   733		if (WARN_ON(of_property_read_u32(nvlink_dn, "ibm,npu-link-index",
   734					&nvlink_index)))
   735			return ERR_PTR(-ENODEV);
   736		npu_context->npdev[npu->index][nvlink_index] = npdev;
   737	
   738		if (!nphb->npu.nmmu_flush) {
   739			/*
   740			 * If we're not explicitly flushing ourselves we need to mark
   741			 * the thread for global flushes
   742			 */
   743			npu_context->nmmu_flush = false;
 > 744			inc_mm_active_cpus(mm);
   745		} else
   746			npu_context->nmmu_flush = true;
   747	
   748		return npu_context;
   749	}
   750	EXPORT_SYMBOL(pnv_npu2_init_context);
   751	
   752	static void pnv_npu2_release_context(struct kref *kref)
   753	{
   754		struct npu_context *npu_context =
   755			container_of(kref, struct npu_context, kref);
   756	
   757		if (!npu_context->nmmu_flush)
 > 758			dec_mm_active_cpus(npu_context->mm);
   759	
   760		npu_context->mm->context.npu_context = NULL;
   761		mmu_notifier_unregister(&npu_context->mn,
   762					npu_context->mm);
   763	
   764		kfree(npu_context);
   765	}
   766	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 23672 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread
* [PATCH 1/2] powerpc/powernv/npu: Move tlb flush before launching ATSD
@ 2017-08-11  6:22 Alistair Popple
  2017-08-11  6:22 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
  0 siblings, 1 reply; 7+ messages in thread
From: Alistair Popple @ 2017-08-11  6:22 UTC (permalink / raw)
  To: mpe; +Cc: arbab, linuxppc-dev, sbaskaran, fbarrat, Alistair Popple, stable

The nest mmu tlb flush needs to happen before the GPU translation shootdown
is launched to avoid the GPU refilling its tlb with stale nmmu translations
prior to the nmmu flush completing.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Cc: stable@vger.kernel.org
---
 arch/powerpc/platforms/powernv/npu-dma.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index b5d960d..3d4f879 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -546,6 +546,12 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	unsigned long pid = npu_context->mm->context.id;
 
+	/*
+	 * Unfortunately the nest mmu does not support flushing specific
+	 * addresses so we have to flush the whole mm.
+	 */
+	flush_tlb_mm(npu_context->mm);
+
 	/*
 	 * Loop over all the NPUs this process is active on and launch
 	 * an invalidate.
 	 */
@@ -576,12 +582,6 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 		}
 	}
 
-	/*
-	 * Unfortunately the nest mmu does not support flushing specific
-	 * addresses so we have to flush the whole mm.
-	 */
-	flush_tlb_mm(npu_context->mm);
-
 	mmio_invalidate_wait(mmio_atsd_reg, flush);
 	if (flush)
 		/* Wait for the flush to complete */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-08-11  6:22 [PATCH 1/2] powerpc/powernv/npu: Move tlb flush before launching ATSD Alistair Popple
@ 2017-08-11  6:22 ` Alistair Popple
  2017-08-13 17:04   ` kbuild test robot
  0 siblings, 1 reply; 7+ messages in thread
From: Alistair Popple @ 2017-08-11  6:22 UTC (permalink / raw)
  To: mpe; +Cc: arbab, linuxppc-dev, sbaskaran, fbarrat, Alistair Popple

The nest mmu required an explicit flush as a tlbi would not flush it in the
same way as the core. However an alternate firmware fix exists which should
eliminate the need for this flush, so instead add a device-tree property
(ibm,nmmu-flush) on the NVLink2 PHB to enable it only if required.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
---

Michael,

This patch depends on http://patchwork.ozlabs.org/patch/796775/ - [v3,1/3]
powerpc/mm: Add marker for contexts requiring global TLB invalidations.

- Alistair

 arch/powerpc/platforms/powernv/npu-dma.c | 27 +++++++++++++++++++++------
 arch/powerpc/platforms/powernv/pci.h     |  3 +++
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 3d4f879..ac07800 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -544,12 +544,7 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	struct pci_dev *npdev;
 	struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
 	unsigned long pid = npu_context->mm->context.id;
-
-	/*
-	 * Unfortunately the nest mmu does not support flushing specific
-	 * addresses so we have to flush the whole mm.
-	 */
-	flush_tlb_mm(npu_context->mm);
+	bool nmmu_flushed = false;
 
 	/*
 	 * Loop over all the NPUs this process is active on and launch
@@ -566,6 +561,17 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 			npu = &nphb->npu;
 			mmio_atsd_reg[i].npu = npu;
 
+			if (nphb->npu.nmmu_flush && !nmmu_flushed) {
+				/*
+				 * Unfortunately the nest mmu does not support
+				 * flushing specific addresses so we have to
+				 * flush the whole mm once before shooting down
+				 * the GPU translation.
+				 */
+				flush_tlb_mm(npu_context->mm);
+				nmmu_flushed = true;
+			}
+
 			if (va)
 				mmio_atsd_reg[i].reg =
 					mmio_invalidate_va(npu, address, pid,
@@ -732,6 +738,13 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
 		return ERR_PTR(-ENODEV);
 	npu_context->npdev[npu->index][nvlink_index] = npdev;
 
+	if (!nphb->npu.nmmu_flush)
+		/*
+		 * If we're not explicitly flushing ourselves we need to mark
+		 * the thread for global flushes
+		 */
+		mm_context_set_global_tlbi(&mm->context);
+
 	return npu_context;
 }
 EXPORT_SYMBOL(pnv_npu2_init_context);
@@ -829,6 +842,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
 	static int npu_index;
 	uint64_t rc = 0;
 
+	phb->npu.nmmu_flush =
+		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
 	for_each_child_of_node(phb->hose->dn, dn) {
 		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
 		if (gpdev) {
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index f16bc40..e8e3e20 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -184,6 +184,9 @@ struct pnv_phb {
 
 		/* Bitmask for MMIO register usage */
 		unsigned long mmio_atsd_usage;
+
+		/* Do we need to explicitly flush the nest mmu? */
+		bool nmmu_flush;
 	} npu;
 
 #ifdef CONFIG_CXL_BASE
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-08-11  6:22 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
@ 2017-08-13 17:04   ` kbuild test robot
  0 siblings, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2017-08-13 17:04 UTC (permalink / raw)
  To: Alistair Popple
  Cc: kbuild-all, mpe, linuxppc-dev, sbaskaran, fbarrat, arbab,
	Alistair Popple

[-- Attachment #1: Type: text/plain, Size: 5035 bytes --]

Hi Alistair,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.13-rc4 next-20170811]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Alistair-Popple/powerpc-powernv-npu-Move-tlb-flush-before-launching-ATSD/20170813-211752
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/powernv/npu-dma.c: In function 'pnv_npu2_init_context':
>> arch/powerpc/platforms/powernv/npu-dma.c:746:3: error: implicit declaration of function 'mm_context_set_global_tlbi' [-Werror=implicit-function-declaration]
      mm_context_set_global_tlbi(&mm->context);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors

vim +/mm_context_set_global_tlbi +746 arch/powerpc/platforms/powernv/npu-dma.c

   652	
   653	/*
   654	 * Call into OPAL to setup the nmmu context for the current task in
   655	 * the NPU. This must be called to setup the context tables before the
   656	 * GPU issues ATRs. pdev should be a pointed to PCIe GPU device.
   657	 *
   658	 * A release callback should be registered to allow a device driver to
   659	 * be notified that it should not launch any new translation requests
   660	 * as the final TLB invalidate is about to occur.
   661	 *
   662	 * Returns an error if there no contexts are currently available or a
   663	 * npu_context which should be passed to pnv_npu2_handle_fault().
   664	 *
   665	 * mmap_sem must be held in write mode.
   666	 */
   667	struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
   668				unsigned long flags,
   669				struct npu_context *(*cb)(struct npu_context *, void *),
   670				void *priv)
   671	{
   672		int rc;
   673		u32 nvlink_index;
   674		struct device_node *nvlink_dn;
   675		struct mm_struct *mm = current->mm;
   676		struct pnv_phb *nphb;
   677		struct npu *npu;
   678		struct npu_context *npu_context;
   679	
   680		/*
   681		 * At present we don't support GPUs connected to multiple NPUs and I'm
   682		 * not sure the hardware does either.
   683		 */
   684		struct pci_dev *npdev = pnv_pci_get_npu_dev(gpdev, 0);
   685	
   686		if (!firmware_has_feature(FW_FEATURE_OPAL))
   687			return ERR_PTR(-ENODEV);
   688	
   689		if (!npdev)
   690			/* No nvlink associated with this GPU device */
   691			return ERR_PTR(-ENODEV);
   692	
   693		if (!mm || mm->context.id == 0) {
   694			/*
   695			 * Kernel thread contexts are not supported and context id 0 is
   696			 * reserved on the GPU.
   697			 */
   698			return ERR_PTR(-EINVAL);
   699		}
   700	
   701		nphb = pci_bus_to_host(npdev->bus)->private_data;
   702		npu = &nphb->npu;
   703	
   704		/*
   705		 * Setup the NPU context table for a particular GPU. These need to be
   706		 * per-GPU as we need the tables to filter ATSDs when there are no
   707		 * active contexts on a particular GPU.
   708		 */
   709		rc = opal_npu_init_context(nphb->opal_id, mm->context.id, flags,
   710					PCI_DEVID(gpdev->bus->number, gpdev->devfn));
   711		if (rc < 0)
   712			return ERR_PTR(-ENOSPC);
   713	
   714		/*
   715		 * We store the npu pci device so we can more easily get at the
   716		 * associated npus.
   717		 */
   718		npu_context = mm->context.npu_context;
   719		if (!npu_context) {
   720			npu_context = kzalloc(sizeof(struct npu_context), GFP_KERNEL);
   721			if (!npu_context)
   722				return ERR_PTR(-ENOMEM);
   723	
   724			mm->context.npu_context = npu_context;
   725			npu_context->mm = mm;
   726			npu_context->mn.ops = &nv_nmmu_notifier_ops;
   727			__mmu_notifier_register(&npu_context->mn, mm);
   728			kref_init(&npu_context->kref);
   729		} else {
   730			kref_get(&npu_context->kref);
   731		}
   732	
   733		npu_context->release_cb = cb;
   734		npu_context->priv = priv;
   735		nvlink_dn = of_parse_phandle(npdev->dev.of_node, "ibm,nvlink", 0);
   736		if (WARN_ON(of_property_read_u32(nvlink_dn, "ibm,npu-link-index",
   737					&nvlink_index)))
   738			return ERR_PTR(-ENODEV);
   739		npu_context->npdev[npu->index][nvlink_index] = npdev;
   740	
   741		if (!nphb->npu.nmmu_flush)
   742			/*
   743			 * If we're not explicitly flushing ourselves we need to mark
   744			 * the thread for global flushes
   745			 */
 > 746			mm_context_set_global_tlbi(&mm->context);
   747	
   748		return npu_context;
   749	}
   750	EXPORT_SYMBOL(pnv_npu2_init_context);
   751	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 23572 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread