* [PATCH 1/2] powerpc/npu: Use flush_all_mm() instead of flush_tlb_mm()
@ 2017-09-05  3:57 Alistair Popple
  2017-09-05  3:57 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
  0 siblings, 1 reply; 7+ messages in thread
From: Alistair Popple @ 2017-09-05  3:57 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: fbarrat, arbab, mpe, andrew.donnellan, Alistair Popple

With the optimisations introduced by commit a46cc7a908 ("powerpc/mm/radix:
Improve TLB/PWC flushes"), flush_tlb_mm() no longer flushes the page walk
cache with radix. Switch to using flush_all_mm() to ensure the pwc and tlb
are properly flushed on the nmmu.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
---

Michael,

This depends on Frederic's series:

http://patchwork.ozlabs.org/patch/809343/
http://patchwork.ozlabs.org/patch/809344/

Thanks.

Alistair

 arch/powerpc/platforms/powernv/npu-dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 2cb6cbe..2fff9a65 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -549,7 +549,7 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	 * Unfortunately the nest mmu does not support flushing specific
 	 * addresses so we have to flush the whole mm.
 	 */
-	flush_tlb_mm(npu_context->mm);
+	flush_all_mm(npu_context->mm);
 
 	/*
 	 * Loop over all the NPUs this process is active on and launch
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  3:57 [PATCH 1/2] powerpc/npu: Use flush_all_mm() instead of flush_tlb_mm() Alistair Popple
@ 2017-09-05  3:57 ` Alistair Popple
  2017-09-05  8:10   ` Frederic Barrat
  2017-09-08  4:36   ` kbuild test robot
  0 siblings, 2 replies; 7+ messages in thread
From: Alistair Popple @ 2017-09-05  3:57 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: fbarrat, arbab, mpe, andrew.donnellan, Alistair Popple

The nest mmu required an explicit flush as a tlbi would not flush it in the
same way as the core. However an alternate firmware fix exists which should
eliminate the need for this flush, so instead add a device-tree property
(ibm,nmmu-flush) on the NVLink2 PHB to enable it only if required.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
---
 arch/powerpc/platforms/powernv/npu-dma.c | 28 +++++++++++++++++++++++-----
 arch/powerpc/platforms/powernv/pci.h     |  3 +++
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 2fff9a65..4b4fcac 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -395,6 +395,7 @@ struct npu_context {
 	struct pci_dev *npdev[NV_MAX_NPUS][NV_MAX_LINKS];
 	struct mmu_notifier mn;
 	struct kref kref;
+	bool nmmu_flush;
 
 	/* Callback to stop translation requests on a given GPU */
 	struct npu_context *(*release_cb)(struct npu_context *, void *);
@@ -545,11 +546,13 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
 	unsigned long pid = npu_context->mm->context.id;
 
-	/*
-	 * Unfortunately the nest mmu does not support flushing specific
-	 * addresses so we have to flush the whole mm.
-	 */
-	flush_all_mm(npu_context->mm);
+	if (npu_context->nmmu_flush)
+		/*
+		 * Unfortunately the nest mmu does not support flushing specific
+		 * addresses so we have to flush the whole mm once before
+		 * shooting down the GPU translation.
+		 */
+		flush_all_mm(npu_context->mm);
 
 	/*
 	 * Loop over all the NPUs this process is active on and launch
@@ -722,6 +725,16 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
 		return ERR_PTR(-ENODEV);
 	npu_context->npdev[npu->index][nvlink_index] = npdev;
 
+	if (!nphb->npu.nmmu_flush) {
+		/*
+		 * If we're not explicitly flushing ourselves we need to mark
+		 * the thread for global flushes
+		 */
+		npu_context->nmmu_flush = false;
+		inc_mm_active_cpus(mm);
+	} else
+		npu_context->nmmu_flush = true;
+
 	return npu_context;
 }
 EXPORT_SYMBOL(pnv_npu2_init_context);
@@ -731,6 +744,9 @@ static void pnv_npu2_release_context(struct kref *kref)
 	struct npu_context *npu_context =
 		container_of(kref, struct npu_context, kref);
 
+	if (!npu_context->nmmu_flush)
+		dec_mm_active_cpus(npu_context->mm);
+
 	npu_context->mm->context.npu_context = NULL;
 	mmu_notifier_unregister(&npu_context->mn,
 				npu_context->mm);
@@ -819,6 +835,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
 	static int npu_index;
 	uint64_t rc = 0;
 
+	phb->npu.nmmu_flush =
+		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
 	for_each_child_of_node(phb->hose->dn, dn) {
 		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
 		if (gpdev) {
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index a95273c..22025c6 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -187,6 +187,9 @@ struct pnv_phb {
 
 		/* Bitmask for MMIO register usage */
 		unsigned long mmio_atsd_usage;
+
+		/* Do we need to explicitly flush the nest mmu? */
+		bool nmmu_flush;
 	} npu;
 
 #ifdef CONFIG_CXL_BASE
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  3:57 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
@ 2017-09-05  8:10   ` Frederic Barrat
  2017-09-05 11:48     ` Alistair Popple
  0 siblings, 1 reply; 7+ messages in thread
From: Frederic Barrat @ 2017-09-05  8:10 UTC (permalink / raw)
  To: Alistair Popple, linuxppc-dev; +Cc: arbab, mpe, andrew.donnellan

On 05/09/2017 at 05:57, Alistair Popple wrote:
> The nest mmu required an explicit flush as a tlbi would not flush it in the
> same way as the core. However an alternate firmware fix exists which should
> eliminate the need for this flush, so instead add a device-tree property
> (ibm,nmmu-flush) on the NVLink2 PHB to enable it only if required.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 28 +++++++++++++++++++++++-----
>  arch/powerpc/platforms/powernv/pci.h     |  3 +++
>  2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 2fff9a65..4b4fcac 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -395,6 +395,7 @@ struct npu_context {
>  	struct pci_dev *npdev[NV_MAX_NPUS][NV_MAX_LINKS];
>  	struct mmu_notifier mn;
>  	struct kref kref;
> +	bool nmmu_flush;
>
>  	/* Callback to stop translation requests on a given GPU */
>  	struct npu_context *(*release_cb)(struct npu_context *, void *);
> @@ -545,11 +546,13 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
>  	struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
>  	unsigned long pid = npu_context->mm->context.id;
>
> -	/*
> -	 * Unfortunately the nest mmu does not support flushing specific
> -	 * addresses so we have to flush the whole mm.
> -	 */
> -	flush_all_mm(npu_context->mm);
> +	if (npu_context->nmmu_flush)
> +		/*
> +		 * Unfortunately the nest mmu does not support flushing specific
> +		 * addresses so we have to flush the whole mm once before
> +		 * shooting down the GPU translation.
> +		 */
> +		flush_all_mm(npu_context->mm);
>
>  	/*
>  	 * Loop over all the NPUs this process is active on and launch
> @@ -722,6 +725,16 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
>  		return ERR_PTR(-ENODEV);
>  	npu_context->npdev[npu->index][nvlink_index] = npdev;
>
> +	if (!nphb->npu.nmmu_flush) {
> +		/*
> +		 * If we're not explicitly flushing ourselves we need to mark
> +		 * the thread for global flushes
> +		 */
> +		npu_context->nmmu_flush = false;
> +		inc_mm_active_cpus(mm);

I can see that you use the inc_mm_active_cpus()/dec_mm_active_cpus() and
not the new mm_context_add_copro()/mm_context_remove_copro(). The only
difference is that mm_context_remove_copro() adds an extra flush of the
full mm (pwc and tlb) before decrementing the active cpu count.

But don't you need the flush when releasing the context? As soon as the
active cpu count is decremented, the next TLBI may be local and not
visible to the nMMU, so the nMMU could keep obsolete/wrong entries in
its cache.

  Fred

> +	} else
> +		npu_context->nmmu_flush = true;
> +
>  	return npu_context;
>  }
>  EXPORT_SYMBOL(pnv_npu2_init_context);
> @@ -731,6 +744,9 @@ static void pnv_npu2_release_context(struct kref *kref)
>  	struct npu_context *npu_context =
>  		container_of(kref, struct npu_context, kref);
>
> +	if (!npu_context->nmmu_flush)
> +		dec_mm_active_cpus(npu_context->mm);
> +
>  	npu_context->mm->context.npu_context = NULL;
>  	mmu_notifier_unregister(&npu_context->mn,
>  				npu_context->mm);
> @@ -819,6 +835,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
>  	static int npu_index;
>  	uint64_t rc = 0;
>
> +	phb->npu.nmmu_flush =
> +		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
>  	for_each_child_of_node(phb->hose->dn, dn) {
>  		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
>  		if (gpdev) {
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index a95273c..22025c6 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -187,6 +187,9 @@ struct pnv_phb {
>
>  		/* Bitmask for MMIO register usage */
>  		unsigned long mmio_atsd_usage;
> +
> +		/* Do we need to explicitly flush the nest mmu? */
> +		bool nmmu_flush;
>  	} npu;
>
> #ifdef CONFIG_CXL_BASE
>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  8:10   ` Frederic Barrat
@ 2017-09-05 11:48     ` Alistair Popple
  0 siblings, 0 replies; 7+ messages in thread
From: Alistair Popple @ 2017-09-05 11:48 UTC (permalink / raw)
  To: Frederic Barrat; +Cc: linuxppc-dev, arbab, mpe, andrew.donnellan

Hi,

On Tue, 5 Sep 2017 10:10:03 AM Frederic Barrat wrote:
> >
> > +	if (!nphb->npu.nmmu_flush) {
> > +		/*
> > +		 * If we're not explicitly flushing ourselves we need to mark
> > +		 * the thread for global flushes
> > +		 */
> > +		npu_context->nmmu_flush = false;
> > +		inc_mm_active_cpus(mm);
>
> I can see that you use the inc_mm_active_cpus()/dec_mm_active_cpus() and
> not the new mm_context_add_copro()/mm_context_remove_copro(). The only
> difference is that mm_context_remove_copro() adds an extra flush of the
> full mm (pwc and tlb) before decrementing the active cpu count.

Thanks for pointing that out, I'd overlooked the difference between the two.

> But don't you need the flush when releasing the context? As soon as the
> active cpu count is decremented, the next TLBI may be local and not
> visible to the nMMU, so the nMMU could keep obsolete/wrong entries in
> its cache.

Yes, I think you're right. In practice I doubt we'd ever hit it as the driver
never calls pnv_npu2_destroy_context() prior to the process being torn down
(which should trigger the right global tlbies). But obviously something that
needs fixing so I will post a v2 ... thanks for reviewing!

Regards,

Alistair

> Fred
>
> > +	} else
> > +		npu_context->nmmu_flush = true;
> > +
> >  	return npu_context;
> >  }
> >  EXPORT_SYMBOL(pnv_npu2_init_context);
> > @@ -731,6 +744,9 @@ static void pnv_npu2_release_context(struct kref *kref)
> >  	struct npu_context *npu_context =
> >  		container_of(kref, struct npu_context, kref);
> >
> > +	if (!npu_context->nmmu_flush)
> > +		dec_mm_active_cpus(npu_context->mm);
> > +
> >  	npu_context->mm->context.npu_context = NULL;
> >  	mmu_notifier_unregister(&npu_context->mn,
> >  				npu_context->mm);
> > @@ -819,6 +835,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
> >  	static int npu_index;
> >  	uint64_t rc = 0;
> >
> > +	phb->npu.nmmu_flush =
> > +		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
> >  	for_each_child_of_node(phb->hose->dn, dn) {
> >  		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
> >  		if (gpdev) {
> > diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> > index a95273c..22025c6 100644
> > --- a/arch/powerpc/platforms/powernv/pci.h
> > +++ b/arch/powerpc/platforms/powernv/pci.h
> > @@ -187,6 +187,9 @@ struct pnv_phb {
> >
> >  		/* Bitmask for MMIO register usage */
> >  		unsigned long mmio_atsd_usage;
> > +
> > +		/* Do we need to explicitly flush the nest mmu? */
> > +		bool nmmu_flush;
> >  	} npu;
> >
> > #ifdef CONFIG_CXL_BASE
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-09-05  3:57 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
  2017-09-05  8:10   ` Frederic Barrat
@ 2017-09-08  4:36   ` kbuild test robot
  1 sibling, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2017-09-08  4:36 UTC (permalink / raw)
  To: Alistair Popple
  Cc: kbuild-all, linuxppc-dev, andrew.donnellan, arbab, fbarrat,
	Alistair Popple

[-- Attachment #1: Type: text/plain, Size: 10022 bytes --]

Hi Alistair,

[auto build test ERROR on powerpc/next]
[also build test ERROR on next-20170907]
[cannot apply to v4.13]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Alistair-Popple/powerpc-npu-Use-flush_all_mm-instead-of-flush_tlb_mm/20170908-072908
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/powernv/npu-dma.c: In function 'mmio_invalidate':
   arch/powerpc/platforms/powernv/npu-dma.c:555:3: error: implicit declaration of function 'flush_all_mm' [-Werror=implicit-function-declaration]
      flush_all_mm(npu_context->mm);
      ^~~~~~~~~~~~
   arch/powerpc/platforms/powernv/npu-dma.c: In function 'pnv_npu2_init_context':
>> arch/powerpc/platforms/powernv/npu-dma.c:744:3: error: implicit declaration of function 'inc_mm_active_cpus' [-Werror=implicit-function-declaration]
      inc_mm_active_cpus(mm);
      ^~~~~~~~~~~~~~~~~~
   arch/powerpc/platforms/powernv/npu-dma.c: In function 'pnv_npu2_release_context':
>> arch/powerpc/platforms/powernv/npu-dma.c:758:3: error: implicit declaration of function 'dec_mm_active_cpus' [-Werror=implicit-function-declaration]
      dec_mm_active_cpus(npu_context->mm);
      ^~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors

vim +/inc_mm_active_cpus +744 arch/powerpc/platforms/powernv/npu-dma.c

   534	
   535	/*
   536	 * Invalidate either a single address or an entire PID depending on
   537	 * the value of va.
   538	 */
   539	static void mmio_invalidate(struct npu_context *npu_context, int va,
   540			unsigned long address, bool flush)
   541	{
   542		int i, j;
   543		struct npu *npu;
   544		struct pnv_phb *nphb;
   545		struct pci_dev *npdev;
   546		struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
   547		unsigned long pid = npu_context->mm->context.id;
   548	
   549		if (npu_context->nmmu_flush)
   550			/*
   551			 * Unfortunately the nest mmu does not support flushing specific
   552			 * addresses so we have to flush the whole mm once before
   553			 * shooting down the GPU translation.
   554			 */
 > 555			flush_all_mm(npu_context->mm);
   556	
   557		/*
   558		 * Loop over all the NPUs this process is active on and launch
   559		 * an invalidate.
   560		 */
   561		for (i = 0; i <= max_npu2_index; i++) {
   562			mmio_atsd_reg[i].reg = -1;
   563			for (j = 0; j < NV_MAX_LINKS; j++) {
   564				npdev = npu_context->npdev[i][j];
   565				if (!npdev)
   566					continue;
   567	
   568				nphb = pci_bus_to_host(npdev->bus)->private_data;
   569				npu = &nphb->npu;
   570				mmio_atsd_reg[i].npu = npu;
   571	
   572				if (va)
   573					mmio_atsd_reg[i].reg =
   574						mmio_invalidate_va(npu, address, pid,
   575								flush);
   576				else
   577					mmio_atsd_reg[i].reg =
   578						mmio_invalidate_pid(npu, pid, flush);
   579	
   580				/*
   581				 * The NPU hardware forwards the shootdown to all GPUs
   582				 * so we only have to launch one shootdown per NPU.
   583				 */
   584				break;
   585			}
   586		}
   587	
   588		mmio_invalidate_wait(mmio_atsd_reg, flush);
   589		if (flush)
   590			/* Wait for the flush to complete */
   591			mmio_invalidate_wait(mmio_atsd_reg, false);
   592	}
   593	
   594	static void pnv_npu2_mn_release(struct mmu_notifier *mn,
   595					struct mm_struct *mm)
   596	{
   597		struct npu_context *npu_context = mn_to_npu_context(mn);
   598	
   599		/* Call into device driver to stop requests to the NMMU */
   600		if (npu_context->release_cb)
   601			npu_context->release_cb(npu_context, npu_context->priv);
   602	
   603		/*
   604		 * There should be no more translation requests for this PID, but we
   605		 * need to ensure any entries for it are removed from the TLB.
   606		 */
   607		mmio_invalidate(npu_context, 0, 0, true);
   608	}
   609	
   610	static void pnv_npu2_mn_change_pte(struct mmu_notifier *mn,
   611					struct mm_struct *mm,
   612					unsigned long address,
   613					pte_t pte)
   614	{
   615		struct npu_context *npu_context = mn_to_npu_context(mn);
   616	
   617		mmio_invalidate(npu_context, 1, address, true);
   618	}
   619	
   620	static void pnv_npu2_mn_invalidate_page(struct mmu_notifier *mn,
   621					struct mm_struct *mm,
   622					unsigned long address)
   623	{
   624		struct npu_context *npu_context = mn_to_npu_context(mn);
   625	
   626		mmio_invalidate(npu_context, 1, address, true);
   627	}
   628	
   629	static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
   630					struct mm_struct *mm,
   631					unsigned long start, unsigned long end)
   632	{
   633		struct npu_context *npu_context = mn_to_npu_context(mn);
   634		unsigned long address;
   635	
   636		for (address = start; address < end; address += PAGE_SIZE)
   637			mmio_invalidate(npu_context, 1, address, false);
   638	
   639		/* Do the flush only on the final addess == end */
   640		mmio_invalidate(npu_context, 1, address, true);
   641	}
   642	
   643	static const struct mmu_notifier_ops nv_nmmu_notifier_ops = {
   644		.release = pnv_npu2_mn_release,
   645		.change_pte = pnv_npu2_mn_change_pte,
   646		.invalidate_page = pnv_npu2_mn_invalidate_page,
   647		.invalidate_range = pnv_npu2_mn_invalidate_range,
   648	};
   649	
   650	/*
   651	 * Call into OPAL to setup the nmmu context for the current task in
   652	 * the NPU. This must be called to setup the context tables before the
   653	 * GPU issues ATRs. pdev should be a pointed to PCIe GPU device.
   654	 *
   655	 * A release callback should be registered to allow a device driver to
   656	 * be notified that it should not launch any new translation requests
   657	 * as the final TLB invalidate is about to occur.
   658	 *
   659	 * Returns an error if there no contexts are currently available or a
   660	 * npu_context which should be passed to pnv_npu2_handle_fault().
   661	 *
   662	 * mmap_sem must be held in write mode.
   663	 */
   664	struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
   665				unsigned long flags,
   666				struct npu_context *(*cb)(struct npu_context *, void *),
   667				void *priv)
   668	{
   669		int rc;
   670		u32 nvlink_index;
   671		struct device_node *nvlink_dn;
   672		struct mm_struct *mm = current->mm;
   673		struct pnv_phb *nphb;
   674		struct npu *npu;
   675		struct npu_context *npu_context;
   676	
   677		/*
   678		 * At present we don't support GPUs connected to multiple NPUs and I'm
   679		 * not sure the hardware does either.
   680		 */
   681		struct pci_dev *npdev = pnv_pci_get_npu_dev(gpdev, 0);
   682	
   683		if (!firmware_has_feature(FW_FEATURE_OPAL))
   684			return ERR_PTR(-ENODEV);
   685	
   686		if (!npdev)
   687			/* No nvlink associated with this GPU device */
   688			return ERR_PTR(-ENODEV);
   689	
   690		if (!mm || mm->context.id == 0) {
   691			/*
   692			 * Kernel thread contexts are not supported and context id 0 is
   693			 * reserved on the GPU.
   694			 */
   695			return ERR_PTR(-EINVAL);
   696		}
   697	
   698		nphb = pci_bus_to_host(npdev->bus)->private_data;
   699		npu = &nphb->npu;
   700	
   701		/*
   702		 * Setup the NPU context table for a particular GPU. These need to be
   703		 * per-GPU as we need the tables to filter ATSDs when there are no
   704		 * active contexts on a particular GPU.
   705		 */
   706		rc = opal_npu_init_context(nphb->opal_id, mm->context.id, flags,
   707					PCI_DEVID(gpdev->bus->number, gpdev->devfn));
   708		if (rc < 0)
   709			return ERR_PTR(-ENOSPC);
   710	
   711		/*
   712		 * We store the npu pci device so we can more easily get at the
   713		 * associated npus.
   714		 */
   715		npu_context = mm->context.npu_context;
   716		if (!npu_context) {
   717			npu_context = kzalloc(sizeof(struct npu_context), GFP_KERNEL);
   718			if (!npu_context)
   719				return ERR_PTR(-ENOMEM);
   720	
   721			mm->context.npu_context = npu_context;
   722			npu_context->mm = mm;
   723			npu_context->mn.ops = &nv_nmmu_notifier_ops;
   724			__mmu_notifier_register(&npu_context->mn, mm);
   725			kref_init(&npu_context->kref);
   726		} else {
   727			kref_get(&npu_context->kref);
   728		}
   729	
   730		npu_context->release_cb = cb;
   731		npu_context->priv = priv;
   732		nvlink_dn = of_parse_phandle(npdev->dev.of_node, "ibm,nvlink", 0);
   733		if (WARN_ON(of_property_read_u32(nvlink_dn, "ibm,npu-link-index",
   734					&nvlink_index)))
   735			return ERR_PTR(-ENODEV);
   736		npu_context->npdev[npu->index][nvlink_index] = npdev;
   737	
   738		if (!nphb->npu.nmmu_flush) {
   739			/*
   740			 * If we're not explicitly flushing ourselves we need to mark
   741			 * the thread for global flushes
   742			 */
   743			npu_context->nmmu_flush = false;
 > 744			inc_mm_active_cpus(mm);
   745		} else
   746			npu_context->nmmu_flush = true;
   747	
   748		return npu_context;
   749	}
   750	EXPORT_SYMBOL(pnv_npu2_init_context);
   751	
   752	static void pnv_npu2_release_context(struct kref *kref)
   753	{
   754		struct npu_context *npu_context =
   755			container_of(kref, struct npu_context, kref);
   756	
   757		if (!npu_context->nmmu_flush)
 > 758			dec_mm_active_cpus(npu_context->mm);
   759	
   760		npu_context->mm->context.npu_context = NULL;
   761		mmu_notifier_unregister(&npu_context->mn,
   762					npu_context->mm);
   763	
   764		kfree(npu_context);
   765	}
   766	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 23672 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread
* [PATCH 1/2] powerpc/powernv/npu: Move tlb flush before launching ATSD
@ 2017-08-11  6:22 Alistair Popple
  2017-08-11  6:22 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
  0 siblings, 1 reply; 7+ messages in thread
From: Alistair Popple @ 2017-08-11  6:22 UTC (permalink / raw)
  To: mpe; +Cc: arbab, linuxppc-dev, sbaskaran, fbarrat, Alistair Popple, stable

The nest mmu tlb flush needs to happen before the GPU translation shootdown
is launched to avoid the GPU refilling its tlb with stale nmmu translations
prior to the nmmu flush completing.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Cc: stable@vger.kernel.org
---
 arch/powerpc/platforms/powernv/npu-dma.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index b5d960d..3d4f879 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -546,6 +546,12 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	unsigned long pid = npu_context->mm->context.id;
 
+	/*
+	 * Unfortunately the nest mmu does not support flushing specific
+	 * addresses so we have to flush the whole mm.
+	 */
+	flush_tlb_mm(npu_context->mm);
+
 	/*
 	 * Loop over all the NPUs this process is active on and launch
 	 * an invalidate.
 	 */
@@ -576,12 +582,6 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 		}
 	}
 
-	/*
-	 * Unfortunately the nest mmu does not support flushing specific
-	 * addresses so we have to flush the whole mm.
-	 */
-	flush_tlb_mm(npu_context->mm);
-
 	mmio_invalidate_wait(mmio_atsd_reg, flush);
 	if (flush)
 		/* Wait for the flush to complete */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-08-11  6:22 [PATCH 1/2] powerpc/powernv/npu: Move tlb flush before launching ATSD Alistair Popple
@ 2017-08-11  6:22 ` Alistair Popple
  2017-08-13 17:04   ` kbuild test robot
  0 siblings, 1 reply; 7+ messages in thread
From: Alistair Popple @ 2017-08-11  6:22 UTC (permalink / raw)
  To: mpe; +Cc: arbab, linuxppc-dev, sbaskaran, fbarrat, Alistair Popple

The nest mmu required an explicit flush as a tlbi would not flush it in the
same way as the core. However an alternate firmware fix exists which should
eliminate the need for this flush, so instead add a device-tree property
(ibm,nmmu-flush) on the NVLink2 PHB to enable it only if required.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
---

Michael,

This patch depends on http://patchwork.ozlabs.org/patch/796775/ - [v3,1/3]
powerpc/mm: Add marker for contexts requiring global TLB invalidations.

- Alistair

 arch/powerpc/platforms/powernv/npu-dma.c | 27 +++++++++++++++++++++------
 arch/powerpc/platforms/powernv/pci.h     |  3 +++
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 3d4f879..ac07800 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -544,12 +544,7 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 	struct pci_dev *npdev;
 	struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
 	unsigned long pid = npu_context->mm->context.id;
-
-	/*
-	 * Unfortunately the nest mmu does not support flushing specific
-	 * addresses so we have to flush the whole mm.
-	 */
-	flush_tlb_mm(npu_context->mm);
+	bool nmmu_flushed = false;
 
 	/*
 	 * Loop over all the NPUs this process is active on and launch
@@ -566,6 +561,17 @@ static void mmio_invalidate(struct npu_context *npu_context, int va,
 			npu = &nphb->npu;
 			mmio_atsd_reg[i].npu = npu;
 
+			if (nphb->npu.nmmu_flush && !nmmu_flushed) {
+				/*
+				 * Unfortunately the nest mmu does not support
+				 * flushing specific addresses so we have to
+				 * flush the whole mm once before shooting down
+				 * the GPU translation.
+				 */
+				flush_tlb_mm(npu_context->mm);
+				nmmu_flushed = true;
+			}
+
 			if (va)
 				mmio_atsd_reg[i].reg =
 					mmio_invalidate_va(npu, address, pid,
@@ -732,6 +738,13 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
 		return ERR_PTR(-ENODEV);
 	npu_context->npdev[npu->index][nvlink_index] = npdev;
 
+	if (!nphb->npu.nmmu_flush)
+		/*
+		 * If we're not explicitly flushing ourselves we need to mark
+		 * the thread for global flushes
+		 */
+		mm_context_set_global_tlbi(&mm->context);
+
 	return npu_context;
 }
 EXPORT_SYMBOL(pnv_npu2_init_context);
@@ -829,6 +842,8 @@ int pnv_npu2_init(struct pnv_phb *phb)
 	static int npu_index;
 	uint64_t rc = 0;
 
+	phb->npu.nmmu_flush =
+		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
 	for_each_child_of_node(phb->hose->dn, dn) {
 		gpdev = pnv_pci_get_gpu_dev(get_pci_dev(dn));
 		if (gpdev) {
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index f16bc40..e8e3e20 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -184,6 +184,9 @@ struct pnv_phb {
 
 		/* Bitmask for MMIO register usage */
 		unsigned long mmio_atsd_usage;
+
+		/* Do we need to explicitly flush the nest mmu? */
+		bool nmmu_flush;
 	} npu;
 
 #ifdef CONFIG_CXL_BASE
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  2017-08-11  6:22 ` [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb Alistair Popple
@ 2017-08-13 17:04   ` kbuild test robot
  0 siblings, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2017-08-13 17:04 UTC (permalink / raw)
  To: Alistair Popple
  Cc: kbuild-all, mpe, linuxppc-dev, sbaskaran, fbarrat, arbab,
	Alistair Popple

[-- Attachment #1: Type: text/plain, Size: 5035 bytes --]

Hi Alistair,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.13-rc4 next-20170811]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Alistair-Popple/powerpc-powernv-npu-Move-tlb-flush-before-launching-ATSD/20170813-211752
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/powernv/npu-dma.c: In function 'pnv_npu2_init_context':
>> arch/powerpc/platforms/powernv/npu-dma.c:746:3: error: implicit declaration of function 'mm_context_set_global_tlbi' [-Werror=implicit-function-declaration]
      mm_context_set_global_tlbi(&mm->context);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors

vim +/mm_context_set_global_tlbi +746 arch/powerpc/platforms/powernv/npu-dma.c

   652	
   653	/*
   654	 * Call into OPAL to setup the nmmu context for the current task in
   655	 * the NPU. This must be called to setup the context tables before the
   656	 * GPU issues ATRs. pdev should be a pointed to PCIe GPU device.
   657	 *
   658	 * A release callback should be registered to allow a device driver to
   659	 * be notified that it should not launch any new translation requests
   660	 * as the final TLB invalidate is about to occur.
   661	 *
   662	 * Returns an error if there no contexts are currently available or a
   663	 * npu_context which should be passed to pnv_npu2_handle_fault().
   664	 *
   665	 * mmap_sem must be held in write mode.
   666	 */
   667	struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
   668				unsigned long flags,
   669				struct npu_context *(*cb)(struct npu_context *, void *),
   670				void *priv)
   671	{
   672		int rc;
   673		u32 nvlink_index;
   674		struct device_node *nvlink_dn;
   675		struct mm_struct *mm = current->mm;
   676		struct pnv_phb *nphb;
   677		struct npu *npu;
   678		struct npu_context *npu_context;
   679	
   680		/*
   681		 * At present we don't support GPUs connected to multiple NPUs and I'm
   682		 * not sure the hardware does either.
   683		 */
   684		struct pci_dev *npdev = pnv_pci_get_npu_dev(gpdev, 0);
   685	
   686		if (!firmware_has_feature(FW_FEATURE_OPAL))
   687			return ERR_PTR(-ENODEV);
   688	
   689		if (!npdev)
   690			/* No nvlink associated with this GPU device */
   691			return ERR_PTR(-ENODEV);
   692	
   693		if (!mm || mm->context.id == 0) {
   694			/*
   695			 * Kernel thread contexts are not supported and context id 0 is
   696			 * reserved on the GPU.
   697			 */
   698			return ERR_PTR(-EINVAL);
   699		}
   700	
   701		nphb = pci_bus_to_host(npdev->bus)->private_data;
   702		npu = &nphb->npu;
   703	
   704		/*
   705		 * Setup the NPU context table for a particular GPU. These need to be
   706		 * per-GPU as we need the tables to filter ATSDs when there are no
   707		 * active contexts on a particular GPU.
   708		 */
   709		rc = opal_npu_init_context(nphb->opal_id, mm->context.id, flags,
   710					PCI_DEVID(gpdev->bus->number, gpdev->devfn));
   711		if (rc < 0)
   712			return ERR_PTR(-ENOSPC);
   713	
   714		/*
   715		 * We store the npu pci device so we can more easily get at the
   716		 * associated npus.
   717		 */
   718		npu_context = mm->context.npu_context;
   719		if (!npu_context) {
   720			npu_context = kzalloc(sizeof(struct npu_context), GFP_KERNEL);
   721			if (!npu_context)
   722				return ERR_PTR(-ENOMEM);
   723	
   724			mm->context.npu_context = npu_context;
   725			npu_context->mm = mm;
   726			npu_context->mn.ops = &nv_nmmu_notifier_ops;
   727			__mmu_notifier_register(&npu_context->mn, mm);
   728			kref_init(&npu_context->kref);
   729		} else {
   730			kref_get(&npu_context->kref);
   731		}
   732	
   733		npu_context->release_cb = cb;
   734		npu_context->priv = priv;
   735		nvlink_dn = of_parse_phandle(npdev->dev.of_node, "ibm,nvlink", 0);
   736		if (WARN_ON(of_property_read_u32(nvlink_dn, "ibm,npu-link-index",
   737					&nvlink_index)))
   738			return ERR_PTR(-ENODEV);
   739		npu_context->npdev[npu->index][nvlink_index] = npdev;
   740	
   741		if (!nphb->npu.nmmu_flush)
   742			/*
   743			 * If we're not explicitly flushing ourselves we need to mark
   744			 * the thread for global flushes
   745			 */
 > 746			mm_context_set_global_tlbi(&mm->context);
   747	
   748		return npu_context;
   749	}
   750	EXPORT_SYMBOL(pnv_npu2_init_context);
   751	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 23572 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread