* [PATCH 0/2] Check VT-d Device-TLB flush error @ 2016-03-17 6:54 Quan Xu 2016-03-17 6:54 ` [PATCH 1/2] IOMMU/MMU: Adjust top level functions for " Quan Xu 2016-03-17 6:54 ` [PATCH 2/2] IOMMU/MMU: Adjust low " Quan Xu 0 siblings, 2 replies; 42+ messages in thread From: Quan Xu @ 2016-03-17 6:54 UTC (permalink / raw) To: xen-devel; +Cc: Quan Xu this patch set is a prereq patch set for Patch:'VT-d Device-TLB flush issue'. Current code would be panic(), when VT-d Device-TLB flush timed out. the panic() is going to be eliminated, so we must check all kinds of error and all the way up the call trees. Quan Xu (2): IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. This patch set was patch#1/patch#2 of 'VT-d Device-TLB flush issue' v6 patch sets. --Changes in this version: * Rebase against efc904263f483026672a5cdf3ccf9be8c4fdaa31. * Make a reasonable attempt at splitting things, adjusting top level functions first and then working the way down to leaf ones. * Remove some pointless initializers (Compiler helps me check them). * Log error and don't return error value for hardware_domain init and crashed system shutdown. * when to populate iommu page table for domu, try to tear down the iommu page table for iommu iotlb flush error. * when the flush_iotlb_qi() return value is positive, All we need are: -call iommu_flush_write_buffer() only when rc > 0 -return zero from this function when rc is positive (or 'rc = 0' after call iommu_flush_write_buffer()). * Fix v4 "VT-d Device-TLB flush issue" unaddressed issue: http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg01555.html xen/arch/x86/acpi/power.c | 14 ++- xen/arch/x86/mm.c | 13 +-- xen/arch/x86/mm/p2m-ept.c | 12 ++- xen/arch/x86/mm/p2m-pt.c | 12 ++- xen/common/grant_table.c | 5 +- xen/common/memory.c | 5 +- xen/drivers/passthrough/amd/iommu_init.c | 12 ++- xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- xen/drivers/passthrough/arm/smmu.c | 10 ++- xen/drivers/passthrough/iommu.c | 25 ++++-- xen/drivers/passthrough/vtd/extern.h | 2 +- xen/drivers/passthrough/vtd/iommu.c | 120 ++++++++++++++++++-------- xen/drivers/passthrough/vtd/quirks.c | 26 +++--- xen/drivers/passthrough/vtd/x86/vtd.c | 7 +- xen/drivers/passthrough/x86/iommu.c | 6 +- xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 2 +- xen/include/asm-x86/iommu.h | 2 +- xen/include/xen/iommu.h | 12 +-- 18 files changed, 199 insertions(+), 88 deletions(-) -- 1.9.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 [PATCH 0/2] Check VT-d Device-TLB flush error Quan Xu @ 2016-03-17 6:54 ` Quan Xu 2016-03-17 7:32 ` Tian, Kevin ` (2 more replies) 2016-03-17 6:54 ` [PATCH 2/2] IOMMU/MMU: Adjust low " Quan Xu 1 sibling, 3 replies; 42+ messages in thread From: Quan Xu @ 2016-03-17 6:54 UTC (permalink / raw) To: xen-devel Cc: Kevin Tian, Keir Fraser, Jan Beulich, George Dunlap, Liu Jinsong, Dario Faggioli, Jun Nakajima, Andrew Cooper, Quan Xu, Feng Wu Current code would be panic(), when VT-d Device-TLB flush timed out. the panic() is going to be eliminated, so we must check all kinds of error and all the way up the call trees. Signed-off-by: Quan Xu <quan.xu@intel.com> CC: Jan Beulich <jbeulich@suse.com> CC: Liu Jinsong <jinsong.liu@alibaba-inc.com> CC: Keir Fraser <keir@xen.org> CC: Andrew Cooper <andrew.cooper3@citrix.com> CC: Jun Nakajima <jun.nakajima@intel.com> CC: Kevin Tian <kevin.tian@intel.com> CC: George Dunlap <george.dunlap@eu.citrix.com> CC: Feng Wu <feng.wu@intel.com> CC: Dario Faggioli <dario.faggioli@citrix.com> --- xen/arch/x86/acpi/power.c | 14 +++++++++++++- xen/arch/x86/mm.c | 13 ++++++++----- xen/arch/x86/mm/p2m-ept.c | 10 +++++++++- xen/arch/x86/mm/p2m-pt.c | 12 ++++++++++-- xen/common/grant_table.c | 5 +++-- xen/common/memory.c | 5 +++-- xen/drivers/passthrough/iommu.c | 16 +++++++++++----- xen/drivers/passthrough/vtd/x86/vtd.c | 7 +++++-- xen/drivers/passthrough/x86/iommu.c | 6 +++++- xen/include/xen/iommu.h | 6 +++--- 10 files changed, 70 insertions(+), 24 deletions(-) diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c index 2885e31..50edf3f 100644 --- a/xen/arch/x86/acpi/power.c +++ b/xen/arch/x86/acpi/power.c @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void); static int device_power_down(void) { + int err; + console_suspend(); time_suspend(); @@ -53,11 +55,21 @@ static int device_power_down(void) ioapic_suspend(); - iommu_suspend(); + err = iommu_suspend(); + if ( err ) + goto iommu_suspend_error; lapic_suspend(); return 0; + +iommu_suspend_error: + ioapic_resume(); + i8259A_resume(); + time_resume(); + console_resume(); + + return err; } static void device_power_up(void) diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index c997b53..526548e 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, unsigned long type, int preemptible) { unsigned long nx, x, y = page->u.inuse.type_info; - int rc = 0; + int rc = 0, ret = 0; ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, unsigned long type, if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) { if ( (x & PGT_type_mask) == PGT_writable_page ) - iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); + ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); else if ( type == PGT_writable_page ) - iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), - page_to_mfn(page), - IOMMUF_readable|IOMMUF_writable); + ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), + page_to_mfn(page), + IOMMUF_readable|IOMMUF_writable); } } @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, unsigned long type, if ( (x & PGT_partial) && !(nx & PGT_partial) ) put_page(page); + if ( !rc ) + rc = ret; + return rc; } diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c index 3cb6868..f9bcce7 100644 --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -830,7 +830,15 @@ out: { if ( iommu_flags ) for ( i = 0; i < (1 << order); i++ ) - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); + { + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); + if ( rc ) + { + while ( i-- > 0 ) + iommu_unmap_page(d, gfn + i); + break; + } + } else for ( i = 0; i < (1 << order); i++ ) iommu_unmap_page(d, gfn + i); diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c index 3d80612..c33b753 100644 --- a/xen/arch/x86/mm/p2m-pt.c +++ b/xen/arch/x86/mm/p2m-pt.c @@ -680,8 +680,16 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, } else if ( iommu_pte_flags ) for ( i = 0; i < (1UL << page_order); i++ ) - iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i, - iommu_pte_flags); + { + rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i, + iommu_pte_flags); + if ( rc ) + { + while ( i-- > 0 ) + iommu_unmap_page(p2m->domain, gfn + i); + break; + } + } else for ( i = 0; i < (1UL << page_order); i++ ) iommu_unmap_page(p2m->domain, gfn + i); diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c index 8b22299..b410ffc 100644 --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -932,8 +932,9 @@ __gnttab_map_grant_ref( { nr_gets++; (void)get_page(pg, rd); - if ( !(op->flags & GNTMAP_readonly) ) - get_page_type(pg, PGT_writable_page); + if ( !(op->flags & GNTMAP_readonly) && + !get_page_type(pg, PGT_writable_page) ) + goto could_not_pin; } } } diff --git a/xen/common/memory.c b/xen/common/memory.c index ef57219..543647d 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -678,8 +678,9 @@ static int xenmem_add_to_physmap(struct domain *d, if ( need_iommu(d) ) { this_cpu(iommu_dont_flush_iotlb) = 0; - iommu_iotlb_flush(d, xatp->idx - done, done); - iommu_iotlb_flush(d, xatp->gpfn - done, done); + rc = iommu_iotlb_flush(d, xatp->idx - done, done); + if ( !rc ) + rc = iommu_iotlb_flush(d, xatp->gpfn - done, done); } #endif diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index b64676f..29efbfe 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -277,24 +277,28 @@ static void iommu_free_pagetables(unsigned long unused) cpumask_cycle(smp_processor_id(), &cpu_online_map)); } -void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count) +int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count) { struct hvm_iommu *hd = domain_hvm_iommu(d); if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush ) - return; + return 0; hd->platform_ops->iotlb_flush(d, gfn, page_count); + + return 0; } -void iommu_iotlb_flush_all(struct domain *d) +int iommu_iotlb_flush_all(struct domain *d) { struct hvm_iommu *hd = domain_hvm_iommu(d); if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all ) - return; + return 0; hd->platform_ops->iotlb_flush_all(d); + + return 0; } int __init iommu_setup(void) @@ -368,11 +372,13 @@ int iommu_do_domctl( return ret; } -void iommu_suspend() +int iommu_suspend() { const struct iommu_ops *ops = iommu_get_ops(); if ( iommu_enabled ) ops->suspend(); + + return 0; } void iommu_share_p2m_table(struct domain* d) diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c index c0d6aab..e5ab10a 100644 --- a/xen/drivers/passthrough/vtd/x86/vtd.c +++ b/xen/drivers/passthrough/vtd/x86/vtd.c @@ -140,8 +140,11 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d) tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K); for ( j = 0; j < tmp; j++ ) - iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, - IOMMUF_readable|IOMMUF_writable); + if ( iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, + IOMMUF_readable|IOMMUF_writable) ) + printk(XENLOG_G_ERR + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n", + pfn * tmp + j, pfn * tmp + j); if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K)))) process_pending_softirqs(); diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c index 8cbb655..d8e3c8f 100644 --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct domain *d) this_cpu(iommu_dont_flush_iotlb) = 0; if ( !rc ) - iommu_iotlb_flush_all(d); + { + rc = iommu_iotlb_flush_all(d); + if ( rc ) + iommu_teardown(d); + } else if ( rc != -ERESTART ) iommu_teardown(d); diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h index 8217cb7..d6d489a 100644 --- a/xen/include/xen/iommu.h +++ b/xen/include/xen/iommu.h @@ -167,7 +167,7 @@ struct iommu_ops { void (*dump_p2m_table)(struct domain *d); }; -void iommu_suspend(void); +int iommu_suspend(void); void iommu_resume(void); void iommu_crash_shutdown(void); int iommu_get_reserved_device_memory(iommu_grdm_t *, void *); @@ -182,8 +182,8 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d, int iommu_do_domctl(struct xen_domctl *, struct domain *d, XEN_GUEST_HANDLE_PARAM(xen_domctl_t)); -void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count); -void iommu_iotlb_flush_all(struct domain *d); +int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count); +int iommu_iotlb_flush_all(struct domain *d); /* * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to -- 1.9.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 ` [PATCH 1/2] IOMMU/MMU: Adjust top level functions for " Quan Xu @ 2016-03-17 7:32 ` Tian, Kevin 2016-03-17 7:58 ` Jan Beulich 2016-03-17 12:30 ` George Dunlap 2016-03-17 17:14 ` Jan Beulich 2 siblings, 1 reply; 42+ messages in thread From: Tian, Kevin @ 2016-03-17 7:32 UTC (permalink / raw) To: Xu, Quan, xen-devel Cc: Keir Fraser, Jan Beulich, George Dunlap, Liu Jinsong, Dario Faggioli, Nakajima, Jun, Andrew Cooper, Wu, Feng > From: Xu, Quan > Sent: Thursday, March 17, 2016 2:55 PM > > diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c > b/xen/drivers/passthrough/vtd/x86/vtd.c > index c0d6aab..e5ab10a 100644 > --- a/xen/drivers/passthrough/vtd/x86/vtd.c > +++ b/xen/drivers/passthrough/vtd/x86/vtd.c > @@ -140,8 +140,11 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d) > > tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K); > for ( j = 0; j < tmp; j++ ) > - iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, > - IOMMUF_readable|IOMMUF_writable); > + if ( iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, > + IOMMUF_readable|IOMMUF_writable) ) > + printk(XENLOG_G_ERR > + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n", > + pfn * tmp + j, pfn * tmp + j); > > if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K)))) > process_pending_softirqs(); Hi, Quan, this patch looks good to me in general. Just double confirm one thing. Above doesn't handle error from iommu_map_page, while only throwing out warning. Not sure whether it has been discussed before as the agreement or not. Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 7:32 ` Tian, Kevin @ 2016-03-17 7:58 ` Jan Beulich 2016-03-17 8:00 ` Tian, Kevin 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-17 7:58 UTC (permalink / raw) To: Kevin Tian Cc: Feng Wu, Quan Xu, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 17.03.16 at 08:32, <kevin.tian@intel.com> wrote: >> From: Xu, Quan >> Sent: Thursday, March 17, 2016 2:55 PM >> --- a/xen/drivers/passthrough/vtd/x86/vtd.c >> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c >> @@ -140,8 +140,11 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d) >> >> tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K); >> for ( j = 0; j < tmp; j++ ) >> - iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, >> - IOMMUF_readable|IOMMUF_writable); >> + if ( iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, >> + IOMMUF_readable|IOMMUF_writable) ) >> + printk(XENLOG_G_ERR >> + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n", >> + pfn * tmp + j, pfn * tmp + j); >> >> if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K)))) >> process_pending_softirqs(); > > Hi, Quan, this patch looks good to me in general. Just double confirm one > thing. Above doesn't handle error from iommu_map_page, while only > throwing out warning. Not sure whether it has been discussed before > as the agreement or not. For code paths involved in preparing the hardware domain only I had specifically asked to handle such in a best effort manner, instead of explicitly rendering a system unbootable. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 7:58 ` Jan Beulich @ 2016-03-17 8:00 ` Tian, Kevin 0 siblings, 0 replies; 42+ messages in thread From: Tian, Kevin @ 2016-03-17 8:00 UTC (permalink / raw) To: Jan Beulich Cc: Wu, Feng, Xu, Quan, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Thursday, March 17, 2016 3:59 PM > > >>> On 17.03.16 at 08:32, <kevin.tian@intel.com> wrote: > >> From: Xu, Quan > >> Sent: Thursday, March 17, 2016 2:55 PM > >> --- a/xen/drivers/passthrough/vtd/x86/vtd.c > >> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c > >> @@ -140,8 +140,11 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain > *d) > >> > >> tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K); > >> for ( j = 0; j < tmp; j++ ) > >> - iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, > >> - IOMMUF_readable|IOMMUF_writable); > >> + if ( iommu_map_page(d, pfn * tmp + j, pfn * tmp + j, > >> + IOMMUF_readable|IOMMUF_writable) ) > >> + printk(XENLOG_G_ERR > >> + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n", > >> + pfn * tmp + j, pfn * tmp + j); > >> > >> if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K)))) > >> process_pending_softirqs(); > > > > Hi, Quan, this patch looks good to me in general. Just double confirm one > > thing. Above doesn't handle error from iommu_map_page, while only > > throwing out warning. Not sure whether it has been discussed before > > as the agreement or not. > > For code paths involved in preparing the hardware domain only > I had specifically asked to handle such in a best effort manner, > instead of explicitly rendering a system unbootable. > OK, good to know that. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 ` [PATCH 1/2] IOMMU/MMU: Adjust top level functions for " Quan Xu 2016-03-17 7:32 ` Tian, Kevin @ 2016-03-17 12:30 ` George Dunlap 2016-03-17 12:33 ` George Dunlap 2016-03-18 7:54 ` Xu, Quan 2016-03-17 17:14 ` Jan Beulich 2 siblings, 2 replies; 42+ messages in thread From: George Dunlap @ 2016-03-17 12:30 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Liu Jinsong, Dario Faggioli, xen-devel, Jan Beulich, Andrew Cooper, Feng Wu On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c > index c997b53..526548e 100644 > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, unsigned long type, > int preemptible) > { > unsigned long nx, x, y = page->u.inuse.type_info; > - int rc = 0; > + int rc = 0, ret = 0; > > ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); > > @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, unsigned long type, > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > { > if ( (x & PGT_type_mask) == PGT_writable_page ) > - iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); > + ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); > else if ( type == PGT_writable_page ) > - iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), > - page_to_mfn(page), > - IOMMUF_readable|IOMMUF_writable); > + ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), > + page_to_mfn(page), > + IOMMUF_readable|IOMMUF_writable); > } > } > > @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, unsigned long type, > if ( (x & PGT_partial) && !(nx & PGT_partial) ) > put_page(page); > > + if ( !rc ) > + rc = ret; > + What's this about? If the iommu_[un]map_page() operation times out, we still go through with calling alloc_page_type(); and if alloc_page_type() fails we return its failure value, but if it succeeds, we return the error from iommu_[un]map_page()? > return rc; > } > > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > index 3cb6868..f9bcce7 100644 > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -830,7 +830,15 @@ out: > { > if ( iommu_flags ) > for ( i = 0; i < (1 << order); i++ ) > - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); > + { > + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); > + if ( rc ) > + { > + while ( i-- > 0 ) > + iommu_unmap_page(d, gfn + i); This won't unmap gfn+0 (since it will break out when i == 0 without calling unmap). If i were signed, we could make the conditional ">= 0"; but unfortunately it's unsigned, so you'll have to do something more complicated. While we're at it, is it worth adding "unlikely()" to the if() condition here? > + break; > + } > + } > else > for ( i = 0; i < (1 << order); i++ ) > iommu_unmap_page(d, gfn + i); > diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c > index 3d80612..c33b753 100644 > --- a/xen/arch/x86/mm/p2m-pt.c > +++ b/xen/arch/x86/mm/p2m-pt.c > @@ -680,8 +680,16 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, > } > else if ( iommu_pte_flags ) > for ( i = 0; i < (1UL << page_order); i++ ) > - iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i, > - iommu_pte_flags); > + { > + rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i, > + iommu_pte_flags); > + if ( rc ) > + { > + while ( i-- > 0 ) > + iommu_unmap_page(p2m->domain, gfn + i); Same with gfn+0. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 12:30 ` George Dunlap @ 2016-03-17 12:33 ` George Dunlap 2016-03-18 3:19 ` Xu, Quan 2016-03-18 7:54 ` Xu, Quan 1 sibling, 1 reply; 42+ messages in thread From: George Dunlap @ 2016-03-17 12:33 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Liu Jinsong, Dario Faggioli, xen-devel, Jan Beulich, Andrew Cooper, Feng Wu On Thu, Mar 17, 2016 at 12:30 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote: > On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: >> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c >> index c997b53..526548e 100644 >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, unsigned long type, >> int preemptible) >> { >> unsigned long nx, x, y = page->u.inuse.type_info; >> - int rc = 0; >> + int rc = 0, ret = 0; >> >> ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); >> >> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, unsigned long type, >> if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) >> { >> if ( (x & PGT_type_mask) == PGT_writable_page ) >> - iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); >> + ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); >> else if ( type == PGT_writable_page ) >> - iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), >> - page_to_mfn(page), >> - IOMMUF_readable|IOMMUF_writable); >> + ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), >> + page_to_mfn(page), >> + IOMMUF_readable|IOMMUF_writable); >> } >> } >> >> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, unsigned long type, >> if ( (x & PGT_partial) && !(nx & PGT_partial) ) >> put_page(page); >> >> + if ( !rc ) >> + rc = ret; >> + > > What's this about? If the iommu_[un]map_page() operation times out, > we still go through with calling alloc_page_type(); and if > alloc_page_type() fails we return its failure value, but if it > succeeds, we return the error from iommu_[un]map_page()? > >> return rc; >> } >> >> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c >> index 3cb6868..f9bcce7 100644 >> --- a/xen/arch/x86/mm/p2m-ept.c >> +++ b/xen/arch/x86/mm/p2m-ept.c >> @@ -830,7 +830,15 @@ out: >> { >> if ( iommu_flags ) >> for ( i = 0; i < (1 << order); i++ ) >> - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); >> + { >> + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); >> + if ( rc ) >> + { >> + while ( i-- > 0 ) >> + iommu_unmap_page(d, gfn + i); > > This won't unmap gfn+0 (since it will break out when i == 0 without > calling unmap). Oh, no it won't, because the decrement is postfix. For us mere mortals, I'd appreciate a comment here like this: /* Postfix operator means we will call unmap with i == 0 */ Thanks, -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 12:33 ` George Dunlap @ 2016-03-18 3:19 ` Xu, Quan 2016-03-18 8:09 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-18 3:19 UTC (permalink / raw) To: George Dunlap Cc: Tian, Kevin, Keir Fraser, Nakajima, Jun, Liu Jinsong, Dario Faggioli, xen-devel, Jan Beulich, Andrew Cooper, Wu, Feng On March 17, 2016 8:34pm, George Dunlap <George.Dunlap@eu.citrix.com> wrote: > On Thu, Mar 17, 2016 at 12:30 PM, George Dunlap > <George.Dunlap@eu.citrix.com> wrote: > > On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > >> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index > >> c997b53..526548e 100644 > >> --- a/xen/arch/x86/mm.c > >> +++ b/xen/arch/x86/mm.c > >> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info > *page, unsigned long type, > >> int preemptible) { > >> unsigned long nx, x, y = page->u.inuse.type_info; > >> - int rc = 0; > >> + int rc = 0, ret = 0; > >> > >> ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); > >> > >> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info > *page, unsigned long type, > >> if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > >> { > >> if ( (x & PGT_type_mask) == PGT_writable_page ) > >> - iommu_unmap_page(d, mfn_to_gmfn(d, > page_to_mfn(page))); > >> + ret = iommu_unmap_page(d, mfn_to_gmfn(d, > page_to_mfn(page))); > >> else if ( type == PGT_writable_page ) > >> - iommu_map_page(d, mfn_to_gmfn(d, > page_to_mfn(page)), > >> - page_to_mfn(page), > >> - > IOMMUF_readable|IOMMUF_writable); > >> + ret = iommu_map_page(d, mfn_to_gmfn(d, > page_to_mfn(page)), > >> + page_to_mfn(page), > >> + > IOMMUF_readable|IOMMUF_writable); > >> } > >> } > >> > >> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info > *page, unsigned long type, > >> if ( (x & PGT_partial) && !(nx & PGT_partial) ) > >> put_page(page); > >> > >> + if ( !rc ) > >> + rc = ret; > >> + > > > > What's this about? If the iommu_[un]map_page() operation times out, > > we still go through with calling alloc_page_type(); and if > > alloc_page_type() fails we return its failure value, but if it > > succeeds, we return the error from iommu_[un]map_page()? > > > >> return rc; > >> } > >> > >> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > >> index 3cb6868..f9bcce7 100644 > >> --- a/xen/arch/x86/mm/p2m-ept.c > >> +++ b/xen/arch/x86/mm/p2m-ept.c > >> @@ -830,7 +830,15 @@ out: > >> { > >> if ( iommu_flags ) > >> for ( i = 0; i < (1 << order); i++ ) > >> - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > iommu_flags); > >> + { > >> + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > iommu_flags); > >> + if ( rc ) > >> + { > >> + while ( i-- > 0 ) > >> + iommu_unmap_page(d, gfn + i); > > > > This won't unmap gfn+0 (since it will break out when i == 0 without > > calling unmap). > > Oh, no it won't, because the decrement is postfix. > > For us mere mortals, I'd appreciate a comment here like this: > > /* Postfix operator means we will call unmap with i == 0 */ > Agreed. For these 2 points, to summarize: - adding "unlikely()" to the if() condition, e.g. if ( unlikely(rc) ) - adding a comment: /* Postfix operator means we will call unmap with i == 0 */ Right? Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 3:19 ` Xu, Quan @ 2016-03-18 8:09 ` Jan Beulich 2016-03-24 6:45 ` Xu, Quan 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-18 8:09 UTC (permalink / raw) To: George Dunlap, Quan Xu Cc: Kevin Tian, Feng Wu, Liu Jinsong, DarioFaggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 18.03.16 at 04:19, <quan.xu@intel.com> wrote: > On March 17, 2016 8:34pm, George Dunlap <George.Dunlap@eu.citrix.com> wrote: >> On Thu, Mar 17, 2016 at 12:30 PM, George Dunlap >> <George.Dunlap@eu.citrix.com> wrote: >> > On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: >> >> --- a/xen/arch/x86/mm/p2m-ept.c >> >> +++ b/xen/arch/x86/mm/p2m-ept.c >> >> @@ -830,7 +830,15 @@ out: >> >> { >> >> if ( iommu_flags ) >> >> for ( i = 0; i < (1 << order); i++ ) >> >> - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, >> iommu_flags); >> >> + { >> >> + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, >> iommu_flags); >> >> + if ( rc ) >> >> + { >> >> + while ( i-- > 0 ) >> >> + iommu_unmap_page(d, gfn + i); >> > >> > This won't unmap gfn+0 (since it will break out when i == 0 without >> > calling unmap). >> >> Oh, no it won't, because the decrement is postfix. >> >> For us mere mortals, I'd appreciate a comment here like this: >> >> /* Postfix operator means we will call unmap with i == 0 */ >> > Agreed. > For these 2 points, to summarize: > - adding "unlikely()" to the if() condition, e.g. if ( unlikely(rc) ) > - adding a comment: > /* Postfix operator means we will call unmap with i == 0 */ To be honest, I'm opposed to the addition of such comments. See also the parallel discussion rooted at http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg01779.html Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 8:09 ` Jan Beulich @ 2016-03-24 6:45 ` Xu, Quan 0 siblings, 0 replies; 42+ messages in thread From: Xu, Quan @ 2016-03-24 6:45 UTC (permalink / raw) To: Jan Beulich, George Dunlap Cc: Tian, Kevin, Wu, Feng, Liu Jinsong, DarioFaggioli, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser On March 18, 2016 4:10pm, <JBeulich@suse.com> wrote: > >>> On 18.03.16 at 04:19, <quan.xu@intel.com> wrote: > > On March 17, 2016 8:34pm, George Dunlap <George.Dunlap@eu.citrix.com> > wrote: > >> On Thu, Mar 17, 2016 at 12:30 PM, George Dunlap > >> <George.Dunlap@eu.citrix.com> wrote: > >> > On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > >> >> --- a/xen/arch/x86/mm/p2m-ept.c > >> >> +++ b/xen/arch/x86/mm/p2m-ept.c > >> >> @@ -830,7 +830,15 @@ out: > >> >> { > >> >> if ( iommu_flags ) > >> >> for ( i = 0; i < (1 << order); i++ ) > >> >> - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > >> iommu_flags); > >> >> + { > >> >> + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + > >> >> + i, > >> iommu_flags); > >> >> + if ( rc ) > >> >> + { > >> >> + while ( i-- > 0 ) > >> >> + iommu_unmap_page(d, gfn + i); > >> > > >> > This won't unmap gfn+0 (since it will break out when i == 0 without > >> > calling unmap). > >> > >> Oh, no it won't, because the decrement is postfix. > >> > >> For us mere mortals, I'd appreciate a comment here like this: > >> > >> /* Postfix operator means we will call unmap with i == 0 */ > >> > > Agreed. > > For these 2 points, to summarize: > > - adding "unlikely()" to the if() condition, e.g. if ( unlikely(rc) ) > > - adding a comment: > > /* Postfix operator means we will call unmap with i == 0 */ > > To be honest, I'm opposed to the addition of such comments. > See also the parallel discussion rooted at > http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg01779.html > Reading the Follow-Ups email, it looks a pretty common cleanup pattern. now I don't fully get this point, but I would try to follow this pattern. Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 12:30 ` George Dunlap 2016-03-17 12:33 ` George Dunlap @ 2016-03-18 7:54 ` Xu, Quan 2016-03-18 8:19 ` Jan Beulich 1 sibling, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-18 7:54 UTC (permalink / raw) To: George Dunlap Cc: Tian, Kevin, Keir Fraser, Nakajima, Jun, Liu Jinsong, Dario Faggioli, xen-devel, Jan Beulich, Andrew Cooper, Wu, Feng On March 17, 2016 8:30pm, <dunlapg@gmail.com> wrote: > On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index > > c997b53..526548e 100644 > > --- a/xen/arch/x86/mm.c > > +++ b/xen/arch/x86/mm.c > > @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, > unsigned long type, > > int preemptible) { > > unsigned long nx, x, y = page->u.inuse.type_info; > > - int rc = 0; > > + int rc = 0, ret = 0; > > > > ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); > > > > @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info > *page, unsigned long type, > > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > > { > > if ( (x & PGT_type_mask) == PGT_writable_page ) > > - iommu_unmap_page(d, mfn_to_gmfn(d, > page_to_mfn(page))); > > + ret = iommu_unmap_page(d, mfn_to_gmfn(d, > page_to_mfn(page))); > > else if ( type == PGT_writable_page ) > > - iommu_map_page(d, mfn_to_gmfn(d, > page_to_mfn(page)), > > - page_to_mfn(page), > > - > IOMMUF_readable|IOMMUF_writable); > > + ret = iommu_map_page(d, mfn_to_gmfn(d, > page_to_mfn(page)), > > + page_to_mfn(page), > > + > IOMMUF_readable|IOMMUF_writable); > > } > > } > > > > @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, > unsigned long type, > > if ( (x & PGT_partial) && !(nx & PGT_partial) ) > > put_page(page); > > > > + if ( !rc ) > > + rc = ret; > > + > > What's this about? If the iommu_[un]map_page() operation times out, > we still go through with calling alloc_page_type(); and if > alloc_page_type() fails we return its failure value, but if it > succeeds, we return the error from iommu_[un]map_page()? > Yes. To be honest, to me, it is tricky too. I found it is not right to return directly if the iommu_[un]map_page() operation times out. """if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )""" Does IOMMU support pv domain? If not, we'd better remove the " if(...){...}" Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 7:54 ` Xu, Quan @ 2016-03-18 8:19 ` Jan Beulich 2016-03-18 9:09 ` Xu, Quan 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-18 8:19 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, DarioFaggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 18.03.16 at 08:54, <quan.xu@intel.com> wrote: > On March 17, 2016 8:30pm, <dunlapg@gmail.com> wrote: >> On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: >> > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index >> > c997b53..526548e 100644 >> > --- a/xen/arch/x86/mm.c >> > +++ b/xen/arch/x86/mm.c >> > @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, >> unsigned long type, >> > int preemptible) { >> > unsigned long nx, x, y = page->u.inuse.type_info; >> > - int rc = 0; >> > + int rc = 0, ret = 0; >> > >> > ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); >> > >> > @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info >> *page, unsigned long type, >> > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) >> > { >> > if ( (x & PGT_type_mask) == PGT_writable_page ) >> > - iommu_unmap_page(d, mfn_to_gmfn(d, >> page_to_mfn(page))); >> > + ret = iommu_unmap_page(d, mfn_to_gmfn(d, >> page_to_mfn(page))); >> > else if ( type == PGT_writable_page ) >> > - iommu_map_page(d, mfn_to_gmfn(d, >> page_to_mfn(page)), >> > - page_to_mfn(page), >> > - >> IOMMUF_readable|IOMMUF_writable); >> > + ret = iommu_map_page(d, mfn_to_gmfn(d, >> page_to_mfn(page)), >> > + page_to_mfn(page), >> > + >> IOMMUF_readable|IOMMUF_writable); >> > } >> > } >> > >> > @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, >> unsigned long type, >> > if ( (x & PGT_partial) && !(nx & PGT_partial) ) >> > put_page(page); >> > >> > + if ( !rc ) >> > + rc = ret; >> > + >> >> What's this about? If the iommu_[un]map_page() operation times out, >> we still go through with calling alloc_page_type(); and if >> alloc_page_type() fails we return its failure value, but if it >> succeeds, we return the error from iommu_[un]map_page()? >> > Yes. > To be honest, to me, it is tricky too. > I found it is not right to return directly if the iommu_[un]map_page() > operation times out. > > """if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )""" > Does IOMMU support pv domain? If not, we'd better remove the " if(...){...}" What strange a question: Of course it does. As you can infer form the reply I sent yesterday, you first need to settle on an abstract model: What do failures mean? If, in the flush case, a timeout is going to kill the domain anyway, then error handling can be lighter weight than if you mean to try to keep the domain running. Of course in this context you also should not forget that iommu_map_page() could already return errors prior to your changes (most notably -ENOMEM, but at least the AMD side also produces others, with -EFAULT generally being accompanied by domain_crash()). As mentioned elsewhere - it seems extremely bogus that these errors weren't check for from the beginning. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 8:19 ` Jan Beulich @ 2016-03-18 9:09 ` Xu, Quan 2016-03-18 9:29 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-18 9:09 UTC (permalink / raw) To: Jan Beulich Cc: Tian, Kevin, Wu, Feng, George Dunlap, Liu Jinsong, DarioFaggioli, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser On March 18, 2016 4:20pm, <JBeulich@suse.com> wrote: > >>> On 18.03.16 at 08:54, <quan.xu@intel.com> wrote: > > On March 17, 2016 8:30pm, <dunlapg@gmail.com> wrote: > >> On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > >> > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index > >> > c997b53..526548e 100644 > >> > --- a/xen/arch/x86/mm.c > >> > +++ b/xen/arch/x86/mm.c > >> > @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info > >> > *page, > >> unsigned long type, > >> > int preemptible) { > >> > unsigned long nx, x, y = page->u.inuse.type_info; > >> > - int rc = 0; > >> > + int rc = 0, ret = 0; > >> > > >> > ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); > >> > > >> > @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info > >> *page, unsigned long type, > >> > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > >> > { > >> > if ( (x & PGT_type_mask) == PGT_writable_page ) > >> > - iommu_unmap_page(d, mfn_to_gmfn(d, > >> page_to_mfn(page))); > >> > + ret = iommu_unmap_page(d, mfn_to_gmfn(d, > >> page_to_mfn(page))); > >> > else if ( type == PGT_writable_page ) > >> > - iommu_map_page(d, mfn_to_gmfn(d, > >> page_to_mfn(page)), > >> > - page_to_mfn(page), > >> > - > >> IOMMUF_readable|IOMMUF_writable); > >> > + ret = iommu_map_page(d, mfn_to_gmfn(d, > >> page_to_mfn(page)), > >> > + page_to_mfn(page), > >> > + > >> IOMMUF_readable|IOMMUF_writable); > >> > } > >> > } > >> > > >> > @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info > >> > *page, > >> unsigned long type, > >> > if ( (x & PGT_partial) && !(nx & PGT_partial) ) > >> > put_page(page); > >> > > >> > + if ( !rc ) > >> > + rc = ret; > >> > + > >> > >> What's this about? If the iommu_[un]map_page() operation times out, > >> we still go through with calling alloc_page_type(); and if > >> alloc_page_type() fails we return its failure value, but if it > >> succeeds, we return the error from iommu_[un]map_page()? > >> > > Yes. > > To be honest, to me, it is tricky too. > > I found it is not right to return directly if the iommu_[un]map_page() > > operation times out. > > > > """if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )""" > > Does IOMMU support pv domain? If not, we'd better remove the " if(...){...}" > > What strange a question: Of course it does. > Jan, thanks. To be honest, this stupid question was always in my mind. > As you can infer form the reply I sent yesterday, you first need to settle on an > abstract model: What do failures mean? If, in the flush case, a timeout is going > to kill the domain anyway, then error handling can be lighter weight than if you > mean to try to keep the domain running. Of course in this context you also > should not forget that iommu_map_page() could already return errors prior to > your changes (most notably -ENOMEM, but at least the AMD side also produces > others, with -EFAULT generally being accompanied by domain_crash()). As > mentioned elsewhere - it seems extremely bogus that these errors weren't > check for from the beginning. > Jan, I am not familiar/confident enough to this single point, could you tell me more how to fix it? Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 9:09 ` Xu, Quan @ 2016-03-18 9:29 ` Jan Beulich 2016-03-18 9:38 ` Dario Faggioli 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-18 9:29 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, DarioFaggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 18.03.16 at 10:09, <quan.xu@intel.com> wrote: > On March 18, 2016 4:20pm, <JBeulich@suse.com> wrote: >> As you can infer form the reply I sent yesterday, you first need to settle on an >> abstract model: What do failures mean? If, in the flush case, a timeout is going >> to kill the domain anyway, then error handling can be lighter weight than if you >> mean to try to keep the domain running. Of course in this context you also >> should not forget that iommu_map_page() could already return errors prior to >> your changes (most notably -ENOMEM, but at least the AMD side also produces >> others, with -EFAULT generally being accompanied by domain_crash()). As >> mentioned elsewhere - it seems extremely bogus that these errors weren't >> check for from the beginning. >> > Jan, I am not familiar/confident enough to this single point, could you tell > me more how to fix it? Not sure what exactly you're asking for: As said, we first need to settle on an abstract model. Do we want IOMMU mapping failures to be fatal to the domain (perhaps with the exception of the hardware one)? I think we do, and for the hardware domain we'd do things on a best effort basis (always erring on the side of unmapping). Which would probably mean crashing the domain could be centralized in iommu_{,un}map_page(). How much roll back would then still be needed in callers of these functions for the hardware domain's sake would need to be seen. So before you start coing, give others (namely but not limited to VT-d, AMD IOMMU, other x86, and x86/mm maintainers) a chance to voice differing opinions. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 9:29 ` Jan Beulich @ 2016-03-18 9:38 ` Dario Faggioli 2016-03-18 9:48 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Dario Faggioli @ 2016-03-18 9:38 UTC (permalink / raw) To: Jan Beulich, Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser [-- Attachment #1.1: Type: text/plain, Size: 1423 bytes --] On Fri, 2016-03-18 at 03:29 -0600, Jan Beulich wrote: > > > Not sure what exactly you're asking for: As said, we first need to > settle on an abstract model. Do we want IOMMU mapping failures > to be fatal to the domain (perhaps with the exception of the > hardware one)? I think we do, and for the hardware domain we'd > do things on a best effort basis (always erring on the side of > unmapping). Which would probably mean crashing the domain > could be centralized in iommu_{,un}map_page(). How much roll > back would then still be needed in callers of these functions for > the hardware domain's sake would need to be seen. > > So before you start coing, give others (namely but not limited to > VT-d, AMD IOMMU, other x86, and x86/mm maintainers) a chance > to voice differing opinions. > I'm nothing of the above but, FWIW, the behavior Jan described (crashing the domain for all domains but the hardware domain) was indeed the intended plan for this series, as far as I understood from talking to people and looking at previous email conversations and submissions. And it looks to me like it is a sane plan. Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 181 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 9:38 ` Dario Faggioli @ 2016-03-18 9:48 ` Jan Beulich 2016-03-21 6:18 ` Tian, Kevin 2016-03-24 9:02 ` Xu, Quan 0 siblings, 2 replies; 42+ messages in thread From: Jan Beulich @ 2016-03-18 9:48 UTC (permalink / raw) To: Dario Faggioli, Quan Xu Cc: KevinTian, Feng Wu, George Dunlap, Liu Jinsong, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 18.03.16 at 10:38, <dario.faggioli@citrix.com> wrote: > On Fri, 2016-03-18 at 03:29 -0600, Jan Beulich wrote: >> > >> Not sure what exactly you're asking for: As said, we first need to >> settle on an abstract model. Do we want IOMMU mapping failures >> to be fatal to the domain (perhaps with the exception of the >> hardware one)? I think we do, and for the hardware domain we'd >> do things on a best effort basis (always erring on the side of >> unmapping). Which would probably mean crashing the domain >> could be centralized in iommu_{,un}map_page(). How much roll >> back would then still be needed in callers of these functions for >> the hardware domain's sake would need to be seen. >> >> So before you start coing, give others (namely but not limited to >> VT-d, AMD IOMMU, other x86, and x86/mm maintainers) a chance >> to voice differing opinions. >> > I'm nothing of the above but, Don't you fall under "but not limited to" ;-) ? > FWIW, the behavior Jan described > (crashing the domain for all domains but the hardware domain) was > indeed the intended plan for this series, as far as I understood from > talking to people and looking at previous email conversations and > submissions. That was taking only the flush timeout as an error source into account. Now that we see that the lack of error handling pre-exists, we can't just extend that intended model to also cover those other error reasons without at least having given people a chance to object. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 9:48 ` Jan Beulich @ 2016-03-21 6:18 ` Tian, Kevin 2016-03-21 12:22 ` Jan Beulich 2016-03-24 9:02 ` Xu, Quan 1 sibling, 1 reply; 42+ messages in thread From: Tian, Kevin @ 2016-03-21 6:18 UTC (permalink / raw) To: Jan Beulich, Dario Faggioli, Xu, Quan Cc: Wu, Feng, George Dunlap, Liu Jinsong, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Friday, March 18, 2016 5:49 PM > > >>> On 18.03.16 at 10:38, <dario.faggioli@citrix.com> wrote: > > On Fri, 2016-03-18 at 03:29 -0600, Jan Beulich wrote: > >> > > >> Not sure what exactly you're asking for: As said, we first need to > >> settle on an abstract model. Do we want IOMMU mapping failures > >> to be fatal to the domain (perhaps with the exception of the > >> hardware one)? I think we do, and for the hardware domain we'd > >> do things on a best effort basis (always erring on the side of > >> unmapping). Which would probably mean crashing the domain > >> could be centralized in iommu_{,un}map_page(). How much roll > >> back would then still be needed in callers of these functions for > >> the hardware domain's sake would need to be seen. > >> > >> So before you start coing, give others (namely but not limited to > >> VT-d, AMD IOMMU, other x86, and x86/mm maintainers) a chance > >> to voice differing opinions. > >> > > I'm nothing of the above but, > > Don't you fall under "but not limited to" ;-) ? > > > FWIW, the behavior Jan described > > (crashing the domain for all domains but the hardware domain) was > > indeed the intended plan for this series, as far as I understood from > > talking to people and looking at previous email conversations and > > submissions. > > That was taking only the flush timeout as an error source into account. > Now that we see that the lack of error handling pre-exists, we can't > just extend that intended model to also cover those other error > reasons without at least having given people a chance to object. > It makes sense so I'm OK with this behavior change. Then regarding to Quan's next version (if nobody disagrees), is it better to first describe related callers and then reach agreement which caller still needs error handling for hardware domain, before Quan starts to change actual code (at least based on current discussion Quan didn't have thorough understanding of such necessity yet)? Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-21 6:18 ` Tian, Kevin @ 2016-03-21 12:22 ` Jan Beulich 0 siblings, 0 replies; 42+ messages in thread From: Jan Beulich @ 2016-03-21 12:22 UTC (permalink / raw) To: Dario Faggioli, Kevin Tian, Quan Xu Cc: Feng Wu, George Dunlap, Liu Jinsong, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 21.03.16 at 07:18, <kevin.tian@intel.com> wrote: >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: Friday, March 18, 2016 5:49 PM >> >> >>> On 18.03.16 at 10:38, <dario.faggioli@citrix.com> wrote: >> > On Fri, 2016-03-18 at 03:29 -0600, Jan Beulich wrote: >> >> > >> >> Not sure what exactly you're asking for: As said, we first need to >> >> settle on an abstract model. Do we want IOMMU mapping failures >> >> to be fatal to the domain (perhaps with the exception of the >> >> hardware one)? I think we do, and for the hardware domain we'd >> >> do things on a best effort basis (always erring on the side of >> >> unmapping). Which would probably mean crashing the domain >> >> could be centralized in iommu_{,un}map_page(). How much roll >> >> back would then still be needed in callers of these functions for >> >> the hardware domain's sake would need to be seen. >> >> >> >> So before you start coing, give others (namely but not limited to >> >> VT-d, AMD IOMMU, other x86, and x86/mm maintainers) a chance >> >> to voice differing opinions. >> >> >> > I'm nothing of the above but, >> >> Don't you fall under "but not limited to" ;-) ? >> >> > FWIW, the behavior Jan described >> > (crashing the domain for all domains but the hardware domain) was >> > indeed the intended plan for this series, as far as I understood from >> > talking to people and looking at previous email conversations and >> > submissions. >> >> That was taking only the flush timeout as an error source into account. >> Now that we see that the lack of error handling pre-exists, we can't >> just extend that intended model to also cover those other error >> reasons without at least having given people a chance to object. >> > > It makes sense so I'm OK with this behavior change. > > Then regarding to Quan's next version (if nobody disagrees), is it better > to first describe related callers and then reach agreement which caller > still needs error handling for hardware domain, before Quan starts to > change actual code (at least based on current discussion Quan didn't > have thorough understanding of such necessity yet)? Well, ideally we'd indeed reach agreement before starting to write or change code. However, in a case like this where error handling was just ignored in pre-existing code, I think the easiest way to get a complete picture is to see what places / paths need changing. I.e. here it may indeed be better to start from the patches. Whether that means posting the patches in patch form, or - prior to posting them - extracting what was learned into a textual description is a different aspect (and I'd leave it to Quan to pick the route more suitable to him). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-18 9:48 ` Jan Beulich 2016-03-21 6:18 ` Tian, Kevin @ 2016-03-24 9:02 ` Xu, Quan 2016-03-24 9:58 ` Jan Beulich 1 sibling, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-24 9:02 UTC (permalink / raw) To: Jan Beulich, Dario Faggioli Cc: Tian, Kevin, Wu, Feng, George Dunlap, Liu Jinsong, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser On March 18, 2016 5:49pm, <JBeulich@suse.com> wrote: > >>> On 18.03.16 at 10:38, <dario.faggioli@citrix.com> wrote: > > On Fri, 2016-03-18 at 03:29 -0600, Jan Beulich wrote: > >> > > >> Not sure what exactly you're asking for: As said, we first need to > >> settle on an abstract model. Do we want IOMMU mapping failures to be > >> fatal to the domain (perhaps with the exception of the hardware one)? > >> I think we do, and for the hardware domain we'd do things on a best > >> effort basis (always erring on the side of unmapping). Which would > >> probably mean crashing the domain could be centralized in > >> iommu_{,un}map_page(). How much roll back would then still be needed > >> in callers of these functions for the hardware domain's sake would > >> need to be seen. > >> > >> So before you start coing, give others (namely but not limited to > >> VT-d, AMD IOMMU, other x86, and x86/mm maintainers) a chance to voice > >> differing opinions. > >> > > FWIW, the behavior Jan described > > (crashing the domain for all domains but the hardware domain) was > > indeed the intended plan for this series, as far as I understood from > > talking to people and looking at previous email conversations and > > submissions. > > That was taking only the flush timeout as an error source into account. > Now that we see that the lack of error handling pre-exists, we can't just extend > that intended model to also cover those other error reasons without at least > having given people a chance to object. > For this abstract model, I assume we are on the same page for the precondition: If Device-TLB flush timed out, we would hide the target ATS device and crash the domain owning this ATS device. If impacted domain is hardware domain, just throw out a warning. Then IMO, 1. Try the best to return error code. 2. Log error and don't return error value for hardware_domain init or crashed system shutdown. 3. For iommu_{,un}map_page(), we'd better fix it as a normal error, as the error is not only from iommu flush, .e.g, '-ENOMEM'. So, we need to {,un}map from the IOMMU, return an error, and roll back the failed operation( .e.g, unmap EPT). 4. for the rest, we may return an error, but don't roll back the failed operation, and we need to analysis the different condition. Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-24 9:02 ` Xu, Quan @ 2016-03-24 9:58 ` Jan Beulich 2016-03-24 14:12 ` Xu, Quan 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-24 9:58 UTC (permalink / raw) To: Dario Faggioli, Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 24.03.16 at 10:02, <quan.xu@intel.com> wrote: > On March 18, 2016 5:49pm, <JBeulich@suse.com> wrote: >> That was taking only the flush timeout as an error source into account. >> Now that we see that the lack of error handling pre-exists, we can't just extend >> that intended model to also cover those other error reasons without at least >> having given people a chance to object. >> > > For this abstract model, > I assume we are on the same page for the precondition: > If Device-TLB flush timed out, we would hide the target ATS device and crash > the domain owning this ATS device. > If impacted domain is hardware domain, just throw out a warning. > > Then IMO, > 1. Try the best to return error code. > 2. Log error and don't return error value for hardware_domain init or > crashed system shutdown. > 3. For iommu_{,un}map_page(), we'd better fix it as a normal error, as > the error is not only from iommu flush, .e.g, '-ENOMEM'. > So, we need to {,un}map from the IOMMU, return an error, and roll back > the failed operation( .e.g, unmap EPT). Well, if that possible in a provably correct way, then sure. But be clear - when the failure occurs while unmapping, unmapping the EPT entry obviously can't be the solution, you'd need a true roll back. And of course you should keep in mind what happens to the guest if such an operation fails: If you can be certain it'll crash because of this later on anyway, you're likely better off crashing it right away (such that the reason for the crash is at least obvious). Jan > 4. for the rest, we may return an error, but don't roll back the failed > operation, and we need to analysis the different condition. > > Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-24 9:58 ` Jan Beulich @ 2016-03-24 14:12 ` Xu, Quan 2016-03-24 14:37 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-24 14:12 UTC (permalink / raw) To: Jan Beulich, Dario Faggioli Cc: Tian, Kevin, Wu, Feng, George Dunlap, Liu Jinsong, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser On March 24, 2016 5:59pm, <JBeulich@suse.com> wrote: > >>> On 24.03.16 at 10:02, <quan.xu@intel.com> wrote: > > On March 18, 2016 5:49pm, <JBeulich@suse.com> wrote: > > 3. For iommu_{,un}map_page(), we'd better fix it as a normal error, > > as the error is not only from iommu flush, .e.g, '-ENOMEM'. > > So, we need to {,un}map from the IOMMU, return an error, and roll > > back the failed operation( .e.g, unmap EPT). > > Well, if that possible in a provably correct way, then sure. But be clear - when > the failure occurs while unmapping, unmapping the EPT entry obviously can't be > the solution, I hope we discuss about the same point as bellow?: ept_set_entry() { .... if ( iommu_flags ) for ( i = 0; i < (1 << order); i++ ) iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); else for ( i = 0; i < (1 << order); i++ ) iommu_unmap_page(d, gfn + i); .... } > you'd need a true roll back. Does it refer to as: If the old entry is present, we need to write back the old entry? > And of course you should keep in mind > what happens to the guest if such an operation fails: If you can be certain it'll > crash because of this later on anyway, you're likely better off crashing it right > away (such that the reason for the crash is at least obvious). > I think, for iommu_{,un}map_page(), it would be not aware that the domain is going to crash. As mentioned, the error is not only from iommu flush. We need to fix it one by one. fortunately, it is limited. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-24 14:12 ` Xu, Quan @ 2016-03-24 14:37 ` Jan Beulich 0 siblings, 0 replies; 42+ messages in thread From: Jan Beulich @ 2016-03-24 14:37 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 24.03.16 at 15:12, <quan.xu@intel.com> wrote: > On March 24, 2016 5:59pm, <JBeulich@suse.com> wrote: >> >>> On 24.03.16 at 10:02, <quan.xu@intel.com> wrote: >> > On March 18, 2016 5:49pm, <JBeulich@suse.com> wrote: > > >> > 3. For iommu_{,un}map_page(), we'd better fix it as a normal error, >> > as the error is not only from iommu flush, .e.g, '-ENOMEM'. >> > So, we need to {,un}map from the IOMMU, return an error, and roll >> > back the failed operation( .e.g, unmap EPT). >> >> Well, if that possible in a provably correct way, then sure. But be clear - when >> the failure occurs while unmapping, unmapping the EPT entry obviously can't be >> the solution, > > I hope we discuss about the same point as bellow?: > ept_set_entry() > { > .... > if ( iommu_flags ) > for ( i = 0; i < (1 << order); i++ ) > iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); > else > for ( i = 0; i < (1 << order); i++ ) > iommu_unmap_page(d, gfn + i); > .... > } > > >> you'd need a true roll back. > > Does it refer to as: > If the old entry is present, we need to write back the old entry? Yes, but that's not as simple as it sounds: You also need to take care of the splitting of super pages which may have occurred. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 ` [PATCH 1/2] IOMMU/MMU: Adjust top level functions for " Quan Xu 2016-03-17 7:32 ` Tian, Kevin 2016-03-17 12:30 ` George Dunlap @ 2016-03-17 17:14 ` Jan Beulich 2016-03-28 3:33 ` Xu, Quan 2 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-17 17:14 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > @@ -53,11 +55,21 @@ static int device_power_down(void) > > ioapic_suspend(); > > - iommu_suspend(); > + err = iommu_suspend(); > + if ( err ) > + goto iommu_suspend_error; > > lapic_suspend(); > > return 0; > + > +iommu_suspend_error: Labels indented by at least one space please. > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -830,7 +830,15 @@ out: > { > if ( iommu_flags ) > for ( i = 0; i < (1 << order); i++ ) > - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); > + { > + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); > + if ( rc ) > + { > + while ( i-- > 0 ) > + iommu_unmap_page(d, gfn + i); > + break; > + } > + } > else > for ( i = 0; i < (1 << order); i++ ) > iommu_unmap_page(d, gfn + i); Earlier on in the PV mm code you also checked iommu_unmap_page()'s return code - why not here (and also in p2m-pt.c)? Also I'm quite unhappy about the inconsistent state you leave things in: You unmap from the IOMMU, return an error, but leave the EPT entry in place. > --- a/xen/common/grant_table.c > +++ b/xen/common/grant_table.c > @@ -932,8 +932,9 @@ __gnttab_map_grant_ref( > { > nr_gets++; > (void)get_page(pg, rd); > - if ( !(op->flags & GNTMAP_readonly) ) > - get_page_type(pg, PGT_writable_page); > + if ( !(op->flags & GNTMAP_readonly) && > + !get_page_type(pg, PGT_writable_page) ) > + goto could_not_pin; This needs explanation, as it doesn't look related to what your actual goal is: If an error was possible here, I think this would be a security issue. However, as also kind of documented by the explicitly ignored return value from get_page(), it is my understanding there here we only obtain an _extra_ reference. > --- a/xen/common/memory.c > +++ b/xen/common/memory.c > @@ -678,8 +678,9 @@ static int xenmem_add_to_physmap(struct domain *d, > if ( need_iommu(d) ) > { > this_cpu(iommu_dont_flush_iotlb) = 0; > - iommu_iotlb_flush(d, xatp->idx - done, done); > - iommu_iotlb_flush(d, xatp->gpfn - done, done); > + rc = iommu_iotlb_flush(d, xatp->idx - done, done); > + if ( !rc ) > + rc = iommu_iotlb_flush(d, xatp->gpfn - done, done); > } And the pattern repeats - you now return an error, but you don't roll back the now failed operation. But wait - maybe that intended: Are you meaning to crash the guest in such cases (somewhere deep in the flush code)? If so, I think that's fine, but you absolutely would need to say so in the commit message. > --- a/xen/drivers/passthrough/x86/iommu.c > +++ b/xen/drivers/passthrough/x86/iommu.c > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct domain *d) > this_cpu(iommu_dont_flush_iotlb) = 0; > > if ( !rc ) > - iommu_iotlb_flush_all(d); > + { > + rc = iommu_iotlb_flush_all(d); > + if ( rc ) > + iommu_teardown(d); > + } > else if ( rc != -ERESTART ) > iommu_teardown(d); Why can't you just use the existing call to iommu_teardown(), by simply deleting the "else"? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-17 17:14 ` Jan Beulich @ 2016-03-28 3:33 ` Xu, Quan 2016-03-29 7:20 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-28 3:33 UTC (permalink / raw) To: Jan Beulich Cc: Tian, Kevin, Wu, Feng, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser On March 18, 2016 1:15am, <JBeulich@suse.com> wrote: > >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > > @@ -53,11 +55,21 @@ static int device_power_down(void) > > > > ioapic_suspend(); > > > > - iommu_suspend(); > > + err = iommu_suspend(); > > + if ( err ) > > + goto iommu_suspend_error; > > > > lapic_suspend(); > > > > return 0; > > + > > +iommu_suspend_error: > > Labels indented by at least one space please. > Good, I wasn't aware of it. > > --- a/xen/arch/x86/mm/p2m-ept.c > > +++ b/xen/arch/x86/mm/p2m-ept.c > > @@ -830,7 +830,15 @@ out: > > { > > if ( iommu_flags ) > > for ( i = 0; i < (1 << order); i++ ) > > - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > iommu_flags); > > + { > > + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > iommu_flags); > > + if ( rc ) > > + { > > + while ( i-- > 0 ) > > + iommu_unmap_page(d, gfn + i); > > + break; > > + } > > + } > > else > > for ( i = 0; i < (1 << order); i++ ) > > iommu_unmap_page(d, gfn + i); > > Earlier on in the PV mm code you also checked iommu_unmap_page()'s return > code - why not here (and also in p2m-pt.c)? > > Also I'm quite unhappy about the inconsistent state you leave things > in: You unmap from the IOMMU, return an error, but leave the EPT entry in > place. > As I mentioned for the abstract model, For iommu_{,un}map_page(), we'd better fix it as a normal error, as the error is not only from iommu flush, .e.g, '-ENOMEM'. So, we need to {,un}map from the IOMMU, return an error, and roll back the failed operation. For iommu_unmap_page, it is still under discussion in another thread. I think we could hold it on, waiting for another discussion. > > --- a/xen/common/grant_table.c > > +++ b/xen/common/grant_table.c > > @@ -932,8 +932,9 @@ __gnttab_map_grant_ref( > > { > > nr_gets++; > > (void)get_page(pg, rd); > > - if ( !(op->flags & GNTMAP_readonly) ) > > - get_page_type(pg, PGT_writable_page); > > + if ( !(op->flags & GNTMAP_readonly) && > > + !get_page_type(pg, PGT_writable_page) ) > > + goto could_not_pin; > > This needs explanation, as it doesn't look related to what your actual goal is: If > an error was possible here, I think this would be a security issue. However, as > also kind of documented by the explicitly ignored return value from get_page(), > it is my understanding there here we only obtain an _extra_ reference. > For this point, I inferred from: map_vcpu_info() { ... if ( !get_page_type(page, PGT_writable_page) ) { put_page(page); return -EINVAL; } ... } , then for get_page_type(), I think the return value: 0 -- error, 1-- right. So if get_page_type() is failed, we should goto could_not_pin. btw, there is another issue in the call path: iommu_{,un}map_page() -- __get_page_type() -- get_page_type()--- I tried to return iommu_{,un}map_page() error code in __get_page_type(), is it right? > > --- a/xen/common/memory.c > > +++ b/xen/common/memory.c > > @@ -678,8 +678,9 @@ static int xenmem_add_to_physmap(struct domain > *d, > > if ( need_iommu(d) ) > > { > > this_cpu(iommu_dont_flush_iotlb) = 0; > > - iommu_iotlb_flush(d, xatp->idx - done, done); > > - iommu_iotlb_flush(d, xatp->gpfn - done, done); > > + rc = iommu_iotlb_flush(d, xatp->idx - done, done); > > + if ( !rc ) > > + rc = iommu_iotlb_flush(d, xatp->gpfn - done, done); > > } > > And the pattern repeats - you now return an error, but you don't roll back the > now failed operation. But wait - maybe that intended: > Are you meaning to crash the guest in such cases (somewhere deep in the flush > code)? If so, I think that's fine, but you absolutely would need to say so in the > commit message. > Yes, I should enhance the commit message. > > --- a/xen/drivers/passthrough/x86/iommu.c > > +++ b/xen/drivers/passthrough/x86/iommu.c > > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct > domain *d) > > this_cpu(iommu_dont_flush_iotlb) = 0; > > > > if ( !rc ) > > - iommu_iotlb_flush_all(d); > > + { > > + rc = iommu_iotlb_flush_all(d); > > + if ( rc ) > > + iommu_teardown(d); > > + } > > else if ( rc != -ERESTART ) > > iommu_teardown(d); > > Why can't you just use the existing call to iommu_teardown(), by simply deleting > the "else"? > Just check it, could I modify it as below: --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -105,7 +105,8 @@ int arch_iommu_populate_page_table(struct domain *d) if ( !rc ) iommu_iotlb_flush_all(d); - else if ( rc != -ERESTART ) + + if ( rc != -ERESTART ) iommu_teardown(d); Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-28 3:33 ` Xu, Quan @ 2016-03-29 7:20 ` Jan Beulich 2016-03-30 2:28 ` Xu, Quan 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-29 7:20 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 28.03.16 at 05:33, <quan.xu@intel.com> wrote: > On March 18, 2016 1:15am, <JBeulich@suse.com> wrote: >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: >> > --- a/xen/common/grant_table.c >> > +++ b/xen/common/grant_table.c >> > @@ -932,8 +932,9 @@ __gnttab_map_grant_ref( >> > { >> > nr_gets++; >> > (void)get_page(pg, rd); >> > - if ( !(op->flags & GNTMAP_readonly) ) >> > - get_page_type(pg, PGT_writable_page); >> > + if ( !(op->flags & GNTMAP_readonly) && >> > + !get_page_type(pg, PGT_writable_page) ) >> > + goto could_not_pin; >> >> This needs explanation, as it doesn't look related to what your actual goal is: If >> an error was possible here, I think this would be a security issue. However, as >> also kind of documented by the explicitly ignored return value from get_page(), >> it is my understanding there here we only obtain an _extra_ reference. >> > > For this point, I inferred from: > map_vcpu_info() > { > ... > if ( !get_page_type(page, PGT_writable_page) ) > { > put_page(page); > return -EINVAL; > } > ... > } > , then for get_page_type(), I think the return value: > 0 -- error, > 1-- right. > > So if get_page_type() is failed, we should goto could_not_pin. Did you read my reply at all? The explanation I'm expecting here is why error checking is all of the sudden needed _at all_. > btw, there is another issue in the call path: > iommu_{,un}map_page() -- __get_page_type() -- get_page_type()--- > > > I tried to return iommu_{,un}map_page() error code in __get_page_type(), is > it right? If the operation got fully rolled back - yes. Whether fully rolling back is feasible there though is - see the respective discussion - an open question. >> > --- a/xen/drivers/passthrough/x86/iommu.c >> > +++ b/xen/drivers/passthrough/x86/iommu.c >> > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct >> domain *d) >> > this_cpu(iommu_dont_flush_iotlb) = 0; >> > >> > if ( !rc ) >> > - iommu_iotlb_flush_all(d); >> > + { >> > + rc = iommu_iotlb_flush_all(d); >> > + if ( rc ) >> > + iommu_teardown(d); >> > + } >> > else if ( rc != -ERESTART ) >> > iommu_teardown(d); >> >> Why can't you just use the existing call to iommu_teardown(), by simply > deleting >> the "else"? >> > > Just check it, could I modify it as below: > --- a/xen/drivers/passthrough/x86/iommu.c > +++ b/xen/drivers/passthrough/x86/iommu.c > @@ -105,7 +105,8 @@ int arch_iommu_populate_page_table(struct domain *d) > > if ( !rc ) > iommu_iotlb_flush_all(d); > - else if ( rc != -ERESTART ) > + > + if ( rc != -ERESTART ) > iommu_teardown(d); Clearly not - not only are you losing the return value of iommu_iotlb_flush_all() now, you would then also call iommu_teardown() in the "success" case. My comment was related to code structure, yet you seem to have taken it literally. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-29 7:20 ` Jan Beulich @ 2016-03-30 2:28 ` Xu, Quan 2016-03-30 2:35 ` Xu, Quan 2016-03-30 8:05 ` Jan Beulich 0 siblings, 2 replies; 42+ messages in thread From: Xu, Quan @ 2016-03-30 2:28 UTC (permalink / raw) To: Jan Beulich Cc: Tian, Kevin, Wu, Feng, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser On March 29, 2016 3:21pm, <JBeulich@suse.com> wrote: > >>> On 28.03.16 at 05:33, <quan.xu@intel.com> wrote: > > On March 18, 2016 1:15am, <JBeulich@suse.com> wrote: > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > >> > --- a/xen/common/grant_table.c > >> > +++ b/xen/common/grant_table.c > >> > @@ -932,8 +932,9 @@ __gnttab_map_grant_ref( > >> > { > >> > nr_gets++; > >> > (void)get_page(pg, rd); > >> > - if ( !(op->flags & GNTMAP_readonly) ) > >> > - get_page_type(pg, PGT_writable_page); > >> > + if ( !(op->flags & GNTMAP_readonly) && > >> > + !get_page_type(pg, PGT_writable_page) ) > >> > + goto could_not_pin; > >> > >> This needs explanation, as it doesn't look related to what your > >> actual goal is: If an error was possible here, I think this would be > >> a security issue. However, as also kind of documented by the > >> explicitly ignored return value from get_page(), it is my understanding there > here we only obtain an _extra_ reference. > >> > > > > For this point, I inferred from: > > map_vcpu_info() > > { > > ... > > if ( !get_page_type(page, PGT_writable_page) ) > > { > > put_page(page); > > return -EINVAL; > > } > > ... > > } > > , then for get_page_type(), I think the return value: > > 0 -- error, > > 1-- right. > > > > So if get_page_type() is failed, we should goto could_not_pin. > > Did you read my reply at all? The explanation I'm expecting here is why error > checking is all of the sudden needed _at all_. > Sorry for my stupid reply. As in this version, before the open discussion, I try to return the iommu_{,un}map_page() error in this call tree: iommu_{,un}map_page() -- __get_page_type() -- get_page_type()--- then, in this point, I try to deal with this iommu_{,un}map_page() error. > > btw, there is another issue in the call path: > > iommu_{,un}map_page() -- __get_page_type() -- get_page_type()--- > > > > > > I tried to return iommu_{,un}map_page() error code in > > __get_page_type(), is it right? > > If the operation got fully rolled back - yes. Whether fully rolling back is feasible > there though is - see the respective discussion - an open question. > For the open question, does it refer to as below: """ As said, we first need to settle on an abstract model. Do we want IOMMU mapping failures to be fatal to the domain (perhaps with the exception of the hardware one)? I think we do, and for the hardware domain we'd do things on a best effort basis (always erring on the side of unmapping). Which would probably mean crashing the domain could be centralized in iommu_{,un}map_page(). How much roll back would then still be needed in callers of these functions for the hardware domain's sake would need to be seen. """ I hope it is yes. I read all of your emails again and again, I found I did get the point until this Monday. I am summarizing it and would send out in a new thread. > >> > --- a/xen/drivers/passthrough/x86/iommu.c > >> > +++ b/xen/drivers/passthrough/x86/iommu.c > >> > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct > >> domain *d) > >> > this_cpu(iommu_dont_flush_iotlb) = 0; > >> > > >> > if ( !rc ) > >> > - iommu_iotlb_flush_all(d); > >> > + { > >> > + rc = iommu_iotlb_flush_all(d); > >> > + if ( rc ) > >> > + iommu_teardown(d); > >> > + } > >> > else if ( rc != -ERESTART ) > >> > iommu_teardown(d); > >> > >> Why can't you just use the existing call to iommu_teardown(), by > >> simply > > deleting > >> the "else"? > >> > > > > Just check it, could I modify it as below: > > --- a/xen/drivers/passthrough/x86/iommu.c > > +++ b/xen/drivers/passthrough/x86/iommu.c > > @@ -105,7 +105,8 @@ int arch_iommu_populate_page_table(struct domain > > *d) > > > > if ( !rc ) > > iommu_iotlb_flush_all(d); > > - else if ( rc != -ERESTART ) > > + > > + if ( rc != -ERESTART ) > > iommu_teardown(d); > > Clearly not - not only are you losing the return value of > iommu_iotlb_flush_all() now, you would then also call > iommu_teardown() in the "success" case. My comment was related to code > structure, yet you seem to have taken it literally. > Then, what about this one: --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -104,8 +104,9 @@ int arch_iommu_populate_page_table(struct domain *d) this_cpu(iommu_dont_flush_iotlb) = 0; if ( !rc ) - iommu_iotlb_flush_all(d); - else if ( rc != -ERESTART ) + rc = iommu_iotlb_flush_all(d); + + if ( !rc && rc != -ERESTART ) iommu_teardown(d); IMO, my original modification is correct and redundant with 2 'iommu_teardown()'.. If this is still the correct one, could you help me send out the correct one? Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-30 2:28 ` Xu, Quan @ 2016-03-30 2:35 ` Xu, Quan 2016-03-30 8:05 ` Jan Beulich 1 sibling, 0 replies; 42+ messages in thread From: Xu, Quan @ 2016-03-30 2:35 UTC (permalink / raw) To: Xu, Quan, Jan Beulich Cc: Tian, Kevin, Wu, Feng, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Nakajima, Jun, Andrew Cooper, Keir Fraser On March 30, 2016 10:28am, Xu, Quan <quan.xu@intel.com> wrote: > If this is still the correct one, could you help me send out the correct one? > Sorry, a typo: If this is still _not_ the correct one, could you help me send out the correct one? Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error. 2016-03-30 2:28 ` Xu, Quan 2016-03-30 2:35 ` Xu, Quan @ 2016-03-30 8:05 ` Jan Beulich 1 sibling, 0 replies; 42+ messages in thread From: Jan Beulich @ 2016-03-30 8:05 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, George Dunlap, Liu Jinsong, Dario Faggioli, xen-devel, Jun Nakajima, Andrew Cooper, Keir Fraser >>> On 30.03.16 at 04:28, <quan.xu@intel.com> wrote: > On March 29, 2016 3:21pm, <JBeulich@suse.com> wrote: >> >>> On 28.03.16 at 05:33, <quan.xu@intel.com> wrote: >> > On March 18, 2016 1:15am, <JBeulich@suse.com> wrote: >> >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: >> >> > --- a/xen/common/grant_table.c >> >> > +++ b/xen/common/grant_table.c >> >> > @@ -932,8 +932,9 @@ __gnttab_map_grant_ref( >> >> > { >> >> > nr_gets++; >> >> > (void)get_page(pg, rd); >> >> > - if ( !(op->flags & GNTMAP_readonly) ) >> >> > - get_page_type(pg, PGT_writable_page); >> >> > + if ( !(op->flags & GNTMAP_readonly) && >> >> > + !get_page_type(pg, PGT_writable_page) ) >> >> > + goto could_not_pin; >> >> >> >> This needs explanation, as it doesn't look related to what your >> >> actual goal is: If an error was possible here, I think this would be >> >> a security issue. However, as also kind of documented by the >> >> explicitly ignored return value from get_page(), it is my understanding there >> here we only obtain an _extra_ reference. >> >> >> > >> > For this point, I inferred from: >> > map_vcpu_info() >> > { >> > ... >> > if ( !get_page_type(page, PGT_writable_page) ) >> > { >> > put_page(page); >> > return -EINVAL; >> > } >> > ... >> > } >> > , then for get_page_type(), I think the return value: >> > 0 -- error, >> > 1-- right. >> > >> > So if get_page_type() is failed, we should goto could_not_pin. >> >> Did you read my reply at all? The explanation I'm expecting here is why > error >> checking is all of the sudden needed _at all_. >> > > Sorry for my stupid reply. > As in this version, before the open discussion, I try to return the > iommu_{,un}map_page() error in this call tree: > iommu_{,un}map_page() -- __get_page_type() -- get_page_type()--- > then, in this point, I try to deal with this iommu_{,un}map_page() error. I still don't get it: We're talking about a get_page_type() invocation that previously was known to never fail (or at least so we hope, based on the existing code). What I'm expecting as an explanation is why this "cannot fail" state is not true any longer. And while sorting this out, please pay particular attention to the limited set of cases where __get_page_type() calls iommu_{,un}map_page() in the first place. >> > btw, there is another issue in the call path: >> > iommu_{,un}map_page() -- __get_page_type() -- get_page_type()--- >> > >> > >> > I tried to return iommu_{,un}map_page() error code in >> > __get_page_type(), is it right? >> >> If the operation got fully rolled back - yes. Whether fully rolling back is feasible >> there though is - see the respective discussion - an open question. >> > > For the open question, does it refer to as below: Partly. > """ > As said, we first need > to settle on an abstract model. Do we want IOMMU mapping > failures to be fatal to the domain (perhaps with the exception > of the hardware one)? I think we do, and for the hardware domain > we'd do things on a best effort basis (always erring on the side > of unmapping). Which would probably mean crashing the domain > could be centralized in iommu_{,un}map_page(). How much roll > back would then still be needed in callers of these functions > for the hardware domain's sake would need to be seen. > """ > > I hope it is yes. It is not clear to me what part of the above this is meant to refer to. Perhaps this is meant to answer the question in the 2nd sentence, but I think this really ought to take a little more than "yes". >> >> > --- a/xen/drivers/passthrough/x86/iommu.c >> >> > +++ b/xen/drivers/passthrough/x86/iommu.c >> >> > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct >> >> domain *d) >> >> > this_cpu(iommu_dont_flush_iotlb) = 0; >> >> > >> >> > if ( !rc ) >> >> > - iommu_iotlb_flush_all(d); >> >> > + { >> >> > + rc = iommu_iotlb_flush_all(d); >> >> > + if ( rc ) >> >> > + iommu_teardown(d); >> >> > + } >> >> > else if ( rc != -ERESTART ) >> >> > iommu_teardown(d); >> >> >> >> Why can't you just use the existing call to iommu_teardown(), by >> >> simply >> > deleting >> >> the "else"? >> >> >> > >> > Just check it, could I modify it as below: >> > --- a/xen/drivers/passthrough/x86/iommu.c >> > +++ b/xen/drivers/passthrough/x86/iommu.c >> > @@ -105,7 +105,8 @@ int arch_iommu_populate_page_table(struct domain >> > *d) >> > >> > if ( !rc ) >> > iommu_iotlb_flush_all(d); >> > - else if ( rc != -ERESTART ) >> > + >> > + if ( rc != -ERESTART ) >> > iommu_teardown(d); >> >> Clearly not - not only are you losing the return value of >> iommu_iotlb_flush_all() now, you would then also call >> iommu_teardown() in the "success" case. My comment was related to code >> structure, yet you seem to have taken it literally. >> > > Then, what about this one: > --- a/xen/drivers/passthrough/x86/iommu.c > +++ b/xen/drivers/passthrough/x86/iommu.c > @@ -104,8 +104,9 @@ int arch_iommu_populate_page_table(struct domain *d) > this_cpu(iommu_dont_flush_iotlb) = 0; > > if ( !rc ) > - iommu_iotlb_flush_all(d); > - else if ( rc != -ERESTART ) > + rc = iommu_iotlb_flush_all(d); > + > + if ( !rc && rc != -ERESTART ) > iommu_teardown(d); > > > IMO, my original modification is correct and redundant with 2 > 'iommu_teardown()'.. > If this is still the correct one, could you help me send out the correct > one? The above looks right to me. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 [PATCH 0/2] Check VT-d Device-TLB flush error Quan Xu 2016-03-17 6:54 ` [PATCH 1/2] IOMMU/MMU: Adjust top level functions for " Quan Xu @ 2016-03-17 6:54 ` Quan Xu 2016-03-17 7:37 ` Tian, Kevin ` (2 more replies) 1 sibling, 3 replies; 42+ messages in thread From: Quan Xu @ 2016-03-17 6:54 UTC (permalink / raw) To: xen-devel Cc: Kevin Tian, Keir Fraser, Jun Nakajima, George Dunlap, Andrew Cooper, Dario Faggioli, Julien Grall, Stefano Stabellini, Jan Beulich, Quan Xu, Feng Wu, Suravee Suthikulpanit Current code would be panic(), when VT-d Device-TLB flush timed out. the panic() is going to be eliminated, so we must check all kinds of error and all the way up the call trees. Signed-off-by: Quan Xu <quan.xu@intel.com> CC: Jun Nakajima <jun.nakajima@intel.com> CC: Kevin Tian <kevin.tian@intel.com> CC: George Dunlap <george.dunlap@eu.citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <jbeulich@suse.com> CC: Andrew Cooper <andrew.cooper3@citrix.com> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> CC: Stefano Stabellini <stefano.stabellini@citrix.com> CC: Julien Grall <julien.grall@arm.com> CC: Feng Wu <feng.wu@intel.com> CC: Dario Faggioli <dario.faggioli@citrix.com> --- xen/arch/x86/mm/p2m-ept.c | 2 +- xen/drivers/passthrough/amd/iommu_init.c | 12 ++- xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- xen/drivers/passthrough/arm/smmu.c | 10 ++- xen/drivers/passthrough/iommu.c | 17 ++-- xen/drivers/passthrough/vtd/extern.h | 2 +- xen/drivers/passthrough/vtd/iommu.c | 120 ++++++++++++++++++-------- xen/drivers/passthrough/vtd/quirks.c | 26 +++--- xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 2 +- xen/include/asm-x86/iommu.h | 2 +- xen/include/xen/iommu.h | 6 +- 11 files changed, 133 insertions(+), 68 deletions(-) diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c index f9bcce7..fa6c710 100644 --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -825,7 +825,7 @@ out: need_modify_vtd_table ) { if ( iommu_hap_pt_share ) - iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present); + rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present); else { if ( iommu_flags ) diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c index 4536106..8bb47b2 100644 --- a/xen/drivers/passthrough/amd/iommu_init.c +++ b/xen/drivers/passthrough/amd/iommu_init.c @@ -1339,12 +1339,14 @@ static void invalidate_all_devices(void) iterate_ivrs_mappings(_invalidate_all_devices); } -void amd_iommu_suspend(void) +int amd_iommu_suspend(void) { struct amd_iommu *iommu; for_each_amd_iommu ( iommu ) disable_iommu(iommu); + + return 0; } void amd_iommu_resume(void) @@ -1368,3 +1370,11 @@ void amd_iommu_resume(void) invalidate_all_domain_pages(); } } + +void amd_iommu_crash_shutdown(void) +{ + struct amd_iommu *iommu; + + for_each_amd_iommu ( iommu ) + disable_iommu(iommu); +} diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c index c8ee8dc..fb8e816 100644 --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -628,6 +628,6 @@ const struct iommu_ops amd_iommu_ops = { .suspend = amd_iommu_suspend, .resume = amd_iommu_resume, .share_p2m = amd_iommu_share_p2m, - .crash_shutdown = amd_iommu_suspend, + .crash_shutdown = amd_iommu_crash_shutdown, .dump_p2m_table = amd_dump_p2m_table, }; diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c index 62da087..36c2c83 100644 --- a/xen/drivers/passthrough/arm/smmu.c +++ b/xen/drivers/passthrough/arm/smmu.c @@ -2540,7 +2540,7 @@ static int force_stage = 2; */ static u32 platform_features = ARM_SMMU_FEAT_COHERENT_WALK; -static void arm_smmu_iotlb_flush_all(struct domain *d) +static int arm_smmu_iotlb_flush_all(struct domain *d) { struct arm_smmu_xen_domain *smmu_domain = domain_hvm_iommu(d)->arch.priv; struct iommu_domain *cfg; @@ -2557,13 +2557,15 @@ static void arm_smmu_iotlb_flush_all(struct domain *d) arm_smmu_tlb_inv_context(cfg->priv); } spin_unlock(&smmu_domain->lock); + + return 0; } -static void arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn, - unsigned int page_count) +static int arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn, + unsigned int page_count) { /* ARM SMMU v1 doesn't have flush by VMA and VMID */ - arm_smmu_iotlb_flush_all(d); + return arm_smmu_iotlb_flush_all(d); } static struct iommu_domain *arm_smmu_get_domain(struct domain *d, diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index 29efbfe..08bb6f7 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) ((page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page) ) mapping |= IOMMUF_writable; - hd->platform_ops->map_page(d, gfn, mfn, mapping); + if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) ) + printk(XENLOG_G_ERR + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n", + gfn, mfn); + if ( !(i++ & 0xfffff) ) process_pending_softirqs(); } @@ -284,9 +288,7 @@ int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_cou if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush ) return 0; - hd->platform_ops->iotlb_flush(d, gfn, page_count); - - return 0; + return hd->platform_ops->iotlb_flush(d, gfn, page_count); } int iommu_iotlb_flush_all(struct domain *d) @@ -296,9 +298,7 @@ int iommu_iotlb_flush_all(struct domain *d) if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all ) return 0; - hd->platform_ops->iotlb_flush_all(d); - - return 0; + return hd->platform_ops->iotlb_flush_all(d); } int __init iommu_setup(void) @@ -375,8 +375,9 @@ int iommu_do_domctl( int iommu_suspend() { const struct iommu_ops *ops = iommu_get_ops(); + if ( iommu_enabled ) - ops->suspend(); + return ops->suspend(); return 0; } diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h index cbe0286..d4d37c3 100644 --- a/xen/drivers/passthrough/vtd/extern.h +++ b/xen/drivers/passthrough/vtd/extern.h @@ -91,7 +91,7 @@ int is_igd_vt_enabled_quirk(void); void platform_quirks_init(void); void vtd_ops_preamble_quirk(struct iommu* iommu); void vtd_ops_postamble_quirk(struct iommu* iommu); -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map); +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map); void pci_vtd_quirk(const struct pci_dev *); bool_t platform_supports_intremap(void); bool_t platform_supports_x2apic(void); diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 5ad25dc..4f8b63a 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -542,11 +542,12 @@ static int iommu_flush_iotlb_psi( return status; } -static void iommu_flush_all(void) +static int iommu_flush_all(void) { struct acpi_drhd_unit *drhd; struct iommu *iommu; int flush_dev_iotlb; + int rc = 0; flush_all_cache(); for_each_drhd_unit ( drhd ) @@ -554,11 +555,24 @@ static void iommu_flush_all(void) iommu = drhd->iommu; iommu_flush_context_global(iommu, 0); flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); + rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); + + if ( rc > 0 ) + { + iommu_flush_write_buffer(iommu); + rc = 0; + } + else if ( rc < 0 ) + { + printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n"); + break; + } } + + return rc; } -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, +static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, int dma_old_pte_present, unsigned int page_count) { struct hvm_iommu *hd = domain_hvm_iommu(d); @@ -566,6 +580,7 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, struct iommu *iommu; int flush_dev_iotlb; int iommu_domid; + int rc = 0; /* * No need pcideves_lock here because we have flush @@ -584,37 +599,40 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, continue; if ( page_count != 1 || gfn == INVALID_GFN ) - { - if ( iommu_flush_iotlb_dsi(iommu, iommu_domid, - 0, flush_dev_iotlb) ) - iommu_flush_write_buffer(iommu); - } + rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, + 0, flush_dev_iotlb); else + rc = iommu_flush_iotlb_psi(iommu, iommu_domid, + (paddr_t)gfn << PAGE_SHIFT_4K, 0, + !dma_old_pte_present, + flush_dev_iotlb); + if ( rc > 0 ) { - if ( iommu_flush_iotlb_psi(iommu, iommu_domid, - (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K, - !dma_old_pte_present, flush_dev_iotlb) ) - iommu_flush_write_buffer(iommu); + iommu_flush_write_buffer(iommu); + rc = 0; } } + + return rc; } -static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count) +static int intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count) { - __intel_iommu_iotlb_flush(d, gfn, 1, page_count); + return __intel_iommu_iotlb_flush(d, gfn, 1, page_count); } -static void intel_iommu_iotlb_flush_all(struct domain *d) +static int intel_iommu_iotlb_flush_all(struct domain *d) { - __intel_iommu_iotlb_flush(d, INVALID_GFN, 0, 0); + return __intel_iommu_iotlb_flush(d, INVALID_GFN, 0, 0); } /* clear one page's page table */ -static void dma_pte_clear_one(struct domain *domain, u64 addr) +static int dma_pte_clear_one(struct domain *domain, u64 addr) { struct hvm_iommu *hd = domain_hvm_iommu(domain); struct dma_pte *page = NULL, *pte = NULL; u64 pg_maddr; + int rc = 0; spin_lock(&hd->arch.mapping_lock); /* get last level pte */ @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr) if ( pg_maddr == 0 ) { spin_unlock(&hd->arch.mapping_lock); - return; + return -ENOMEM; } page = (struct dma_pte *)map_vtd_domain_page(pg_maddr); @@ -632,7 +650,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr) { spin_unlock(&hd->arch.mapping_lock); unmap_vtd_domain_page(page); - return; + return 0; } dma_clear_pte(*pte); @@ -640,9 +658,11 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr) iommu_flush_cache_entry(pte, sizeof(struct dma_pte)); if ( !this_cpu(iommu_dont_flush_iotlb) ) - __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1); + rc = __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1); unmap_vtd_domain_page(page); + + return rc; } static void iommu_free_pagetable(u64 pt_maddr, int level) @@ -1281,6 +1301,7 @@ int domain_context_mapping_one( u64 maddr, pgd_maddr; u16 seg = iommu->intel->drhd->segment; int agaw; + int rc = 0; ASSERT(pcidevs_locked()); spin_lock(&iommu->lock); @@ -1400,7 +1421,14 @@ int domain_context_mapping_one( else { int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; - iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb); + + rc = iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb); + + if ( rc > 0 ) + { + iommu_flush_write_buffer(iommu); + rc = 0; + } } set_bit(iommu->index, &hd->arch.iommu_bitmap); @@ -1410,7 +1438,7 @@ int domain_context_mapping_one( if ( !seg ) me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC); - return 0; + return rc; } static int domain_context_mapping( @@ -1505,6 +1533,7 @@ int domain_context_unmap_one( struct context_entry *context, *context_entries; u64 maddr; int iommu_domid; + int rc = 0; ASSERT(pcidevs_locked()); spin_lock(&iommu->lock); @@ -1539,7 +1568,14 @@ int domain_context_unmap_one( else { int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; - iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb); + + rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb); + + if ( rc > 0 ) + { + iommu_flush_write_buffer(iommu); + rc = 0; + } } spin_unlock(&iommu->lock); @@ -1548,7 +1584,7 @@ int domain_context_unmap_one( if ( !iommu->intel->drhd->segment ) me_wifi_quirk(domain, bus, devfn, UNMAP_ME_PHANTOM_FUNC); - return 0; + return rc; } static int domain_context_unmap( @@ -1738,7 +1774,7 @@ static int intel_iommu_map_page( unmap_vtd_domain_page(page); if ( !this_cpu(iommu_dont_flush_iotlb) ) - __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1); + return __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1); return 0; } @@ -1749,19 +1785,18 @@ static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn) if ( iommu_passthrough && is_hardware_domain(d) ) return 0; - dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K); - - return 0; + return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K); } -void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, - int order, int present) +int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, + int order, int present) { struct acpi_drhd_unit *drhd; struct iommu *iommu = NULL; struct hvm_iommu *hd = domain_hvm_iommu(d); int flush_dev_iotlb; int iommu_domid; + int rc = 0; iommu_flush_cache_entry(pte, sizeof(struct dma_pte)); @@ -1775,11 +1810,17 @@ void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, iommu_domid= domain_iommu_domid(d, iommu); if ( iommu_domid == -1 ) continue; - if ( iommu_flush_iotlb_psi(iommu, iommu_domid, + rc = iommu_flush_iotlb_psi(iommu, iommu_domid, (paddr_t)gfn << PAGE_SHIFT_4K, - order, !present, flush_dev_iotlb) ) + order, !present, flush_dev_iotlb); + if ( rc > 0 ) + { iommu_flush_write_buffer(iommu); + rc = 0; + } } + + return rc; } static int __init vtd_ept_page_compatible(struct iommu *iommu) @@ -2099,8 +2140,8 @@ static int init_vtd_hw(void) return -EIO; } } - iommu_flush_all(); - return 0; + + return iommu_flush_all(); } static void __hwdom_init setup_hwdom_rmrr(struct domain *d) @@ -2389,16 +2430,19 @@ static int intel_iommu_group_id(u16 seg, u8 bus, u8 devfn) } static u32 iommu_state[MAX_IOMMUS][MAX_IOMMU_REGS]; -static void vtd_suspend(void) +static int vtd_suspend(void) { struct acpi_drhd_unit *drhd; struct iommu *iommu; u32 i; + int rc; if ( !iommu_enabled ) - return; + return 0; - iommu_flush_all(); + rc = iommu_flush_all(); + if ( rc ) + return rc; for_each_drhd_unit ( drhd ) { @@ -2427,6 +2471,8 @@ static void vtd_suspend(void) if ( !iommu_intremap && iommu_qinval ) disable_qinval(iommu); } + + return 0; } static void vtd_crash_shutdown(void) diff --git a/xen/drivers/passthrough/vtd/quirks.c b/xen/drivers/passthrough/vtd/quirks.c index 49df41d..a230b68 100644 --- a/xen/drivers/passthrough/vtd/quirks.c +++ b/xen/drivers/passthrough/vtd/quirks.c @@ -332,10 +332,11 @@ void __init platform_quirks_init(void) * assigning Intel integrated wifi device to a guest. */ -static void map_me_phantom_function(struct domain *domain, u32 dev, int map) +static int map_me_phantom_function(struct domain *domain, u32 dev, int map) { struct acpi_drhd_unit *drhd; struct pci_dev *pdev; + int rc; /* find ME VT-d engine base on a real ME device */ pdev = pci_get_pdev(0, 0, PCI_DEVFN(dev, 0)); @@ -343,23 +344,26 @@ static void map_me_phantom_function(struct domain *domain, u32 dev, int map) /* map or unmap ME phantom function */ if ( map ) - domain_context_mapping_one(domain, drhd->iommu, 0, - PCI_DEVFN(dev, 7), NULL); + rc = domain_context_mapping_one(domain, drhd->iommu, 0, + PCI_DEVFN(dev, 7), NULL); else - domain_context_unmap_one(domain, drhd->iommu, 0, - PCI_DEVFN(dev, 7)); + rc = domain_context_unmap_one(domain, drhd->iommu, 0, + PCI_DEVFN(dev, 7)); + + return rc; } -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) { u32 id; + int rc = 0; id = pci_conf_read32(0, 0, 0, 0, 0); if ( IS_CTG(id) ) { /* quit if ME does not exist */ if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff ) - return; + return -ENOENT; /* if device is WLAN device, map ME phantom device 0:3.7 */ id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0); @@ -373,7 +377,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) case 0x423b8086: case 0x423c8086: case 0x423d8086: - map_me_phantom_function(domain, 3, map); + rc = map_me_phantom_function(domain, 3, map); break; default: break; @@ -383,7 +387,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) { /* quit if ME does not exist */ if ( pci_conf_read32(0, 0, 22, 0, 0) == 0xffffffff ) - return; + return -ENOENT; /* if device is WLAN device, map ME phantom device 0:22.7 */ id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0); @@ -399,12 +403,14 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) case 0x42388086: /* Puma Peak */ case 0x422b8086: case 0x422c8086: - map_me_phantom_function(domain, 22, map); + rc = map_me_phantom_function(domain, 22, map); break; default: break; } } + + return rc; } void pci_vtd_quirk(const struct pci_dev *pdev) diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h index 9c51172..f540fc8 100644 --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h @@ -119,7 +119,7 @@ extern unsigned long *shared_intremap_inuse; /* power management support */ void amd_iommu_resume(void); -void amd_iommu_suspend(void); +int amd_iommu_suspend(void); void amd_iommu_crash_shutdown(void); /* guest iommu support */ diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h index f22b3a5..790701e 100644 --- a/xen/include/asm-x86/iommu.h +++ b/xen/include/asm-x86/iommu.h @@ -26,7 +26,7 @@ int iommu_setup_hpet_msi(struct msi_desc *); /* While VT-d specific, this must get declared in a generic header. */ int adjust_vtd_irq_affinities(void); -void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present); +int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present); bool_t iommu_supports_eim(void); int iommu_enable_x2apic_IR(void); void iommu_disable_x2apic_IR(void); diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h index d6d489a..8710dfe 100644 --- a/xen/include/xen/iommu.h +++ b/xen/include/xen/iommu.h @@ -157,12 +157,12 @@ struct iommu_ops { unsigned int (*read_apic_from_ire)(unsigned int apic, unsigned int reg); int (*setup_hpet_msi)(struct msi_desc *); #endif /* CONFIG_X86 */ - void (*suspend)(void); + int (*suspend)(void); void (*resume)(void); void (*share_p2m)(struct domain *d); void (*crash_shutdown)(void); - void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count); - void (*iotlb_flush_all)(struct domain *d); + int (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count); + int (*iotlb_flush_all)(struct domain *d); int (*get_reserved_device_memory)(iommu_grdm_t *, void *); void (*dump_p2m_table)(struct domain *d); }; -- 1.9.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 ` [PATCH 2/2] IOMMU/MMU: Adjust low " Quan Xu @ 2016-03-17 7:37 ` Tian, Kevin 2016-03-18 2:30 ` Xu, Quan 2016-03-17 15:31 ` George Dunlap 2016-03-18 10:20 ` Jan Beulich 2 siblings, 1 reply; 42+ messages in thread From: Tian, Kevin @ 2016-03-17 7:37 UTC (permalink / raw) To: Xu, Quan, xen-devel Cc: Nakajima, Jun, Keir Fraser, Jan Beulich, George Dunlap, Andrew Cooper, Dario Faggioli, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit, Wu, Feng > From: Xu, Quan > Sent: Thursday, March 17, 2016 2:55 PM > > Current code would be panic(), when VT-d Device-TLB flush timed out. > the panic() is going to be eliminated, so we must check all kinds of > error and all the way up the call trees. sorry that I'm unclear what is the criteria of defining high level and low level functions in two patches. Could you elaborate? Once I thought you may mean common code and vendor specific code, however... > > Signed-off-by: Quan Xu <quan.xu@intel.com> > > CC: Jun Nakajima <jun.nakajima@intel.com> > CC: Kevin Tian <kevin.tian@intel.com> > CC: George Dunlap <george.dunlap@eu.citrix.com> > CC: Keir Fraser <keir@xen.org> > CC: Jan Beulich <jbeulich@suse.com> > CC: Andrew Cooper <andrew.cooper3@citrix.com> > CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> > CC: Stefano Stabellini <stefano.stabellini@citrix.com> > CC: Julien Grall <julien.grall@arm.com> > CC: Feng Wu <feng.wu@intel.com> > CC: Dario Faggioli <dario.faggioli@citrix.com> > --- > xen/arch/x86/mm/p2m-ept.c | 2 +- > xen/drivers/passthrough/amd/iommu_init.c | 12 ++- > xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- > xen/drivers/passthrough/arm/smmu.c | 10 ++- > xen/drivers/passthrough/iommu.c | 17 ++-- > xen/drivers/passthrough/vtd/extern.h | 2 +- > xen/drivers/passthrough/vtd/iommu.c | 120 > ++++++++++++++++++-------- > xen/drivers/passthrough/vtd/quirks.c | 26 +++--- > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 2 +- > xen/include/asm-x86/iommu.h | 2 +- > xen/include/xen/iommu.h | 6 +- Above you have general passthrough/iommu.c though most of others are vendor specific. Then in PATCH [1/2], you have: xen/arch/x86/acpi/power.c | 14 +++++++++++++- xen/arch/x86/mm.c | 13 ++++++++----- xen/arch/x86/mm/p2m-ept.c | 10 +++++++++- xen/arch/x86/mm/p2m-pt.c | 12 ++++++++++-- xen/common/grant_table.c | 5 +++-- xen/common/memory.c | 5 +++-- xen/drivers/passthrough/iommu.c | 16 +++++++++++----- xen/drivers/passthrough/vtd/x86/vtd.c | 7 +++++-- xen/drivers/passthrough/x86/iommu.c | 6 +++++- xen/include/xen/iommu.h | 6 +++--- They are also mixed. Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-17 7:37 ` Tian, Kevin @ 2016-03-18 2:30 ` Xu, Quan 2016-03-18 8:06 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-18 2:30 UTC (permalink / raw) To: Tian, Kevin Cc: Nakajima, Jun, Keir Fraser, Jan Beulich, George Dunlap, Andrew Cooper, Dario Faggioli, xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit, Wu, Feng On March 17, 2016 3:38pm, Tian, Kevin <kevin.tian@intel.com> wrote: > > From: Xu, Quan > > Sent: Thursday, March 17, 2016 2:55 PM > > > > Current code would be panic(), when VT-d Device-TLB flush timed out. > > the panic() is going to be eliminated, so we must check all kinds of > > error and all the way up the call trees. > > sorry that I'm unclear what is the criteria of defining high level and low level > functions in two patches. Could you elaborate? Once I thought you may mean > common code and vendor specific code, however... > In this patch set, it is adjusting top level functions of the call tree first, and working my way down to leaf ones. I tried to define that top level is mainly about MMU, and the low level is mainly about IOMMU. Mixed things are in Consideration of compiling and simplification. For high level of these call trees, IMO it is a reasonable attempt at splitting things. > > xen/arch/x86/mm/p2m-ept.c | 2 +- > > xen/drivers/passthrough/amd/iommu_init.c | 12 ++- > > xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- > > xen/drivers/passthrough/arm/smmu.c | 10 ++- > > xen/drivers/passthrough/iommu.c | 17 ++-- > > xen/drivers/passthrough/vtd/extern.h | 2 +- > > xen/drivers/passthrough/vtd/iommu.c | 120 > > ++++++++++++++++++-------- > > xen/drivers/passthrough/vtd/quirks.c | 26 +++--- > > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 2 +- > > xen/include/asm-x86/iommu.h | 2 +- > > xen/include/xen/iommu.h | 6 +- > > Above you have general passthrough/iommu.c though most of others are > vendor specific. > .e.g, as the 'struct iommu_ops' is common structure for arm/amd/intel. > Then in PATCH [1/2], you have: > > xen/arch/x86/acpi/power.c | 14 +++++++++++++- > xen/arch/x86/mm.c | 13 ++++++++----- > xen/arch/x86/mm/p2m-ept.c | 10 +++++++++- > xen/arch/x86/mm/p2m-pt.c | 12 ++++++++++-- > xen/common/grant_table.c | 5 +++-- > xen/common/memory.c | 5 +++-- > xen/drivers/passthrough/iommu.c | 16 +++++++++++----- > xen/drivers/passthrough/vtd/x86/vtd.c | 7 +++++-- > xen/drivers/passthrough/x86/iommu.c | 6 +++++- > xen/include/xen/iommu.h | 6 +++--- > > They are also mixed. > It is in Consideration of compiling and simplification. e.g. for this call tree, ...--iommu_suspend()--device_power_down()--... device_power_down() is in -xen/arch/x86/acpi/power.c iommu_suspend() is in -xen/drivers/passthrough/iommu.c when I tried to return error code from iommu_suspend(), which is with 'void' annotation. I need change it from 'void' to 'int', then it is unavoidable to mix things. Any good idea? To be honest, I am very tired to at splitting things like this :). Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-18 2:30 ` Xu, Quan @ 2016-03-18 8:06 ` Jan Beulich 2016-03-21 5:01 ` Tian, Kevin 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-18 8:06 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, Jun Nakajima, George Dunlap, Andrew Cooper, Dario Faggioli, xen-devel, Julien Grall, Stefano Stabellini, SuraveeSuthikulpanit, Keir Fraser >>> On 18.03.16 at 03:30, <quan.xu@intel.com> wrote: > Any good idea? To be honest, I am very tired to at splitting things like > this :). I understand that this is a tedious task; the code should have been propagating errors from the beginning. The most natural way of splitting things would be to go function by function, top level ones first, and leaf ones last, one function per patch (maybe pairs of functions, as in the map/unmap case). Such model would be problematic (almost) only when there's recursion at some point, which I don't think would be the case anywhere here. As mentioned before, the __must_check annotation is a great help to not miss any callers - both to you as you put things together and to the reviewers to be ascertained that nothing was missed. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-18 8:06 ` Jan Beulich @ 2016-03-21 5:01 ` Tian, Kevin 0 siblings, 0 replies; 42+ messages in thread From: Tian, Kevin @ 2016-03-21 5:01 UTC (permalink / raw) To: Jan Beulich, Xu, Quan Cc: Wu, Feng, Nakajima, Jun, George Dunlap, Andrew Cooper, Dario Faggioli, xen-devel, Julien Grall, Stefano Stabellini, SuraveeSuthikulpanit, Keir Fraser > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Friday, March 18, 2016 4:06 PM > > >>> On 18.03.16 at 03:30, <quan.xu@intel.com> wrote: > > Any good idea? To be honest, I am very tired to at splitting things like > > this :). > > I understand that this is a tedious task; the code should have been > propagating errors from the beginning. > > The most natural way of splitting things would be to go function by > function, top level ones first, and leaf ones last, one function per > patch (maybe pairs of functions, as in the map/unmap case). Such > model would be problematic (almost) only when there's recursion at > some point, which I don't think would be the case anywhere here. > As mentioned before, the __must_check annotation is a great help > to not miss any callers - both to you as you put things together and > to the reviewers to be ascertained that nothing was missed. > Agree with this suggestion. It also allows different maintainers to focus on changes they really need to care about. Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 ` [PATCH 2/2] IOMMU/MMU: Adjust low " Quan Xu 2016-03-17 7:37 ` Tian, Kevin @ 2016-03-17 15:31 ` George Dunlap 2016-03-18 6:57 ` Xu, Quan 2016-03-18 10:20 ` Jan Beulich 2 siblings, 1 reply; 42+ messages in thread From: George Dunlap @ 2016-03-17 15:31 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Keir Fraser, Jan Beulich, Andrew Cooper, Dario Faggioli, xen-devel, Julien Grall, Stefano Stabellini, Jun Nakajima, Feng Wu, Suravee Suthikulpanit On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > Current code would be panic(), when VT-d Device-TLB flush timed out. > the panic() is going to be eliminated, so we must check all kinds of > error and all the way up the call trees. > > Signed-off-by: Quan Xu <quan.xu@intel.com> > > CC: Jun Nakajima <jun.nakajima@intel.com> > CC: Kevin Tian <kevin.tian@intel.com> > CC: George Dunlap <george.dunlap@eu.citrix.com> > CC: Keir Fraser <keir@xen.org> > CC: Jan Beulich <jbeulich@suse.com> > CC: Andrew Cooper <andrew.cooper3@citrix.com> > CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> > CC: Stefano Stabellini <stefano.stabellini@citrix.com> > CC: Julien Grall <julien.grall@arm.com> > CC: Feng Wu <feng.wu@intel.com> > CC: Dario Faggioli <dario.faggioli@citrix.com> > --- > xen/arch/x86/mm/p2m-ept.c | 2 +- > xen/drivers/passthrough/amd/iommu_init.c | 12 ++- > xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- > xen/drivers/passthrough/arm/smmu.c | 10 ++- > xen/drivers/passthrough/iommu.c | 17 ++-- > xen/drivers/passthrough/vtd/extern.h | 2 +- > xen/drivers/passthrough/vtd/iommu.c | 120 ++++++++++++++++++-------- > xen/drivers/passthrough/vtd/quirks.c | 26 +++--- > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 2 +- > xen/include/asm-x86/iommu.h | 2 +- > xen/include/xen/iommu.h | 6 +- > 11 files changed, 133 insertions(+), 68 deletions(-) > > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > index f9bcce7..fa6c710 100644 > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -825,7 +825,7 @@ out: > need_modify_vtd_table ) > { > if ( iommu_hap_pt_share ) > - iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present); > + rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present); So this sort changes the meaning of the "rc" check near the end of the function, when we check whether we want to update altp2m. As it happens, I *think* it doesn't matter, because you can't have altp2m and passthrough enabled at the same time, right? If so, this at least merits a comment above the altp2m check; something like: "NB that if altp2m is enabled, rc cannot be non-zero here due to iommu_pte_flush, since you can't have altp2m and pass-through enabled at the same time." If you *can* have both altp2m and pass-through, then we need to make sure that the altp2m gets updated when the hostp2m is updated, even if the iommu flush fails. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-17 15:31 ` George Dunlap @ 2016-03-18 6:57 ` Xu, Quan 0 siblings, 0 replies; 42+ messages in thread From: Xu, Quan @ 2016-03-18 6:57 UTC (permalink / raw) To: George Dunlap, Andrew Cooper Cc: Tian, Kevin, Keir Fraser, Jan Beulich, Dario Faggioli, xen-devel, Julien Grall, Stefano Stabellini, Nakajima, Jun, Wu, Feng, Suravee Suthikulpanit On March 17, 2016 11:31pm, <George.Dunlap@eu.citrix.com> wrote: > On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > > Current code would be panic(), when VT-d Device-TLB flush timed out. > > the panic() is going to be eliminated, so we must check all kinds of > > error and all the way up the call trees. > > > > Signed-off-by: Quan Xu <quan.xu@intel.com> > > > > CC: Jun Nakajima <jun.nakajima@intel.com> > > CC: Kevin Tian <kevin.tian@intel.com> > > CC: George Dunlap <george.dunlap@eu.citrix.com> > > CC: Keir Fraser <keir@xen.org> > > CC: Jan Beulich <jbeulich@suse.com> > > CC: Andrew Cooper <andrew.cooper3@citrix.com> > > CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> > > CC: Stefano Stabellini <stefano.stabellini@citrix.com> > > CC: Julien Grall <julien.grall@arm.com> > > CC: Feng Wu <feng.wu@intel.com> > > CC: Dario Faggioli <dario.faggioli@citrix.com> > > --- > > xen/arch/x86/mm/p2m-ept.c | 2 +- > > xen/drivers/passthrough/amd/iommu_init.c | 12 ++- > > xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- > > xen/drivers/passthrough/arm/smmu.c | 10 ++- > > xen/drivers/passthrough/iommu.c | 17 ++-- > > xen/drivers/passthrough/vtd/extern.h | 2 +- > > xen/drivers/passthrough/vtd/iommu.c | 120 > ++++++++++++++++++-------- > > xen/drivers/passthrough/vtd/quirks.c | 26 +++--- > > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 2 +- > > xen/include/asm-x86/iommu.h | 2 +- > > xen/include/xen/iommu.h | 6 +- > > 11 files changed, 133 insertions(+), 68 deletions(-) > > > > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > > index f9bcce7..fa6c710 100644 > > --- a/xen/arch/x86/mm/p2m-ept.c > > +++ b/xen/arch/x86/mm/p2m-ept.c > > @@ -825,7 +825,7 @@ out: > > need_modify_vtd_table ) > > { > > if ( iommu_hap_pt_share ) > > - iommu_pte_flush(d, gfn, &ept_entry->epte, order, > vtd_pte_present); > > + rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, > > + vtd_pte_present); > > So this sort changes the meaning of the "rc" check near the end of the function, > when we check whether we want to update altp2m. > > As it happens, I *think* it doesn't matter, because you can't have altp2m and > passthrough enabled at the same time, right? I think it is _not_. I have gone through the altp2m design/design discussion. It looks Xen _can_ have altp2m and passthrough enabled at the same time. Also you can refer to http://lists.xenproject.org/archives/html/xen-devel/2015-01/msg01963.html Andrew, could you help me double check it? > > If so, this at least merits a comment above the altp2m check; something like: > > "NB that if altp2m is enabled, rc cannot be non-zero here due to > iommu_pte_flush, since you can't have altp2m and pass-through enabled at the > same time." > > If you *can* have both altp2m and pass-through, then we need to make sure > that the altp2m gets updated when the hostp2m is updated, even if the iommu > flush fails. > I think it _can_, and let's wait for the above check first. Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-17 6:54 ` [PATCH 2/2] IOMMU/MMU: Adjust low " Quan Xu 2016-03-17 7:37 ` Tian, Kevin 2016-03-17 15:31 ` George Dunlap @ 2016-03-18 10:20 ` Jan Beulich 2016-03-25 9:27 ` Xu, Quan 2 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-18 10:20 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, Jun Nakajima, George Dunlap, Andrew Cooper, Dario Faggioli, xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit, Keir Fraser >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > --- a/xen/drivers/passthrough/amd/iommu_init.c > +++ b/xen/drivers/passthrough/amd/iommu_init.c > @@ -1339,12 +1339,14 @@ static void invalidate_all_devices(void) > iterate_ivrs_mappings(_invalidate_all_devices); > } > > -void amd_iommu_suspend(void) > +int amd_iommu_suspend(void) > { > struct amd_iommu *iommu; > > for_each_amd_iommu ( iommu ) > disable_iommu(iommu); > + > + return 0; > } > > void amd_iommu_resume(void) > @@ -1368,3 +1370,11 @@ void amd_iommu_resume(void) > invalidate_all_domain_pages(); > } > } > + > +void amd_iommu_crash_shutdown(void) > +{ > + struct amd_iommu *iommu; > + > + for_each_amd_iommu ( iommu ) > + disable_iommu(iommu); > +} One of the two should clearly call the other - no need to have the same code twice. > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > ((page->u.inuse.type_info & PGT_type_mask) > == PGT_writable_page) ) > mapping |= IOMMUF_writable; > - hd->platform_ops->map_page(d, gfn, mfn, mapping); > + if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) ) > + printk(XENLOG_G_ERR > + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n", > + gfn, mfn); > + Printing one message here is certainly necessary, but what if the failure repeats for very many pages? Also %#lx instead of 0x%lx please, and a blank before the opening parenthesis. > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) > iommu = drhd->iommu; > iommu_flush_context_global(iommu, 0); > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > + rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > + > + if ( rc > 0 ) > + { > + iommu_flush_write_buffer(iommu); Why is this needed all of the sudden? (Note that if you did a more fine grained split, it might also be easier for you to note/ explain all the not directly related changes in the respective commit messages. Unless of course they fix actual bugs, in which case they should be split out anyway; such individual fixes would also likely have a much faster route to commit, relieving you earlier from the burden of at least some of the changes you have to carry and re-base.) > + rc = 0; > + } > + else if ( rc < 0 ) > + { > + printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n"); > + break; > + } Is a log message really advisable here? > -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, > +static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, While I'm not VT-d maintainer, I think changes like this would be a good opportunity to also drop the stray double underscores: You need to touch all callers anyway. > @@ -584,37 +599,40 @@ static void __intel_iommu_iotlb_flush(struct domain *d, > unsigned long gfn, > continue; > > if ( page_count != 1 || gfn == INVALID_GFN ) > - { > - if ( iommu_flush_iotlb_dsi(iommu, iommu_domid, > - 0, flush_dev_iotlb) ) > - iommu_flush_write_buffer(iommu); > - } > + rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, > + 0, flush_dev_iotlb); > else > + rc = iommu_flush_iotlb_psi(iommu, iommu_domid, > + (paddr_t)gfn << PAGE_SHIFT_4K, 0, > + !dma_old_pte_present, > + flush_dev_iotlb); > + if ( rc > 0 ) > { > - if ( iommu_flush_iotlb_psi(iommu, iommu_domid, > - (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K, Note how this used PAGE_ORDER_4K so far? > - !dma_old_pte_present, flush_dev_iotlb) ) > - iommu_flush_write_buffer(iommu); > + iommu_flush_write_buffer(iommu); Same question again: Why is this all of the sudden needed on both paths? > @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr) > if ( pg_maddr == 0 ) > { > spin_unlock(&hd->arch.mapping_lock); > - return; > + return -ENOMEM; > } addr_to_dma_page_maddr() gets called with "alloc" being false, so there can't be any memory allocation failure here. There simply is nothing to do in this case. > -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) > +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) > { > u32 id; > + int rc = 0; > > id = pci_conf_read32(0, 0, 0, 0, 0); > if ( IS_CTG(id) ) > { > /* quit if ME does not exist */ > if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff ) > - return; > + return -ENOENT; Is this really an error? IOW, do all systems which satisfy IS_CTG() have such a device? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-18 10:20 ` Jan Beulich @ 2016-03-25 9:27 ` Xu, Quan 2016-03-29 7:36 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-03-25 9:27 UTC (permalink / raw) To: Jan Beulich Cc: Tian, Kevin, Wu, Feng, Nakajima, Jun, George Dunlap, Andrew Cooper, Dario Faggioli, xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit, Keir Fraser On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: > >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > > --- a/xen/drivers/passthrough/amd/iommu_init.c > > +++ b/xen/drivers/passthrough/amd/iommu_init.c > > @@ -1339,12 +1339,14 @@ static void invalidate_all_devices(void) > > iterate_ivrs_mappings(_invalidate_all_devices); > > } > > > > -void amd_iommu_suspend(void) > > +int amd_iommu_suspend(void) > > { > > struct amd_iommu *iommu; > > > > for_each_amd_iommu ( iommu ) > > disable_iommu(iommu); > > + > > + return 0; > > } > > > > void amd_iommu_resume(void) > > @@ -1368,3 +1370,11 @@ void amd_iommu_resume(void) > > invalidate_all_domain_pages(); > > } > > } > > + > > +void amd_iommu_crash_shutdown(void) > > +{ > > + struct amd_iommu *iommu; > > + > > + for_each_amd_iommu ( iommu ) > > + disable_iommu(iommu); > > +} > > One of the two should clearly call the other - no need to have the same code > twice. > Good idea. > > --- a/xen/drivers/passthrough/iommu.c > > +++ b/xen/drivers/passthrough/iommu.c > > @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct > domain *d) > > ((page->u.inuse.type_info & PGT_type_mask) > > == PGT_writable_page) ) > > mapping |= IOMMUF_writable; > > - hd->platform_ops->map_page(d, gfn, mfn, mapping); > > + if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) ) > > + printk(XENLOG_G_ERR > > + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) > failed.\n", > > + gfn, mfn); > > + > > Printing one message here is certainly necessary, but what if the failure repeats > for very many pages? Yes, to me, it is ok, but I am open to your suggestion. > Also %#lx instead of 0x%lx please, and a blank before the > opening parenthesis. > OK, just check it: .. "IOMMU: Map page gfn: %#lx (mfn: %#lx) failed.\n" .. Right? > > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) > > iommu = drhd->iommu; > > iommu_flush_context_global(iommu, 0); > > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; > > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > > + rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > > + > > + if ( rc > 0 ) > > + { > > + iommu_flush_write_buffer(iommu); > > Why is this needed all of the sudden? As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my machine, and I can find the following log message: """ (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. """ __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should be called to flush every IOMMU. > (Note that if you did a more fine grained > split, it might also be easier for you to note/ explain all the not directly related > changes in the respective commit messages. Unless of course they fix actual > bugs, in which case they should be split out anyway; such individual fixes would > also likely have a much faster route to commit, relieving you earlier from the > burden of at least some of the changes you have to carry and re-base.) > > > + rc = 0; > > + } > > + else if ( rc < 0 ) > > + { > > + printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n"); > > + break; > > + } > > Is a log message really advisable here? > To me, It looks tricky too. I was struggling to make decision. For scheme B, I would try to do as below: if ( iommu_flush_all() ) printk("... nnn ..."); but there are 4 function calls, if so, to me, it looks redundant. Or, could I ignore the print out for iommu_flush_all() failed? > > -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long > > gfn, > > +static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long > > +gfn, > > While I'm not VT-d maintainer, I think changes like this would be a good > opportunity to also drop the stray double underscores: You need to touch all > callers anyway. > I think this is optional. > > @@ -584,37 +599,40 @@ static void __intel_iommu_iotlb_flush(struct > > domain *d, unsigned long gfn, > > continue; > > > > if ( page_count != 1 || gfn == INVALID_GFN ) > > - { > > - if ( iommu_flush_iotlb_dsi(iommu, iommu_domid, > > - 0, flush_dev_iotlb) ) > > - iommu_flush_write_buffer(iommu); > > - } > > + rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, > > + 0, flush_dev_iotlb); > > else > > + rc = iommu_flush_iotlb_psi(iommu, iommu_domid, > > + (paddr_t)gfn << > PAGE_SHIFT_4K, 0, > > + !dma_old_pte_present, > > + flush_dev_iotlb); > > + if ( rc > 0 ) > > { > > - if ( iommu_flush_iotlb_psi(iommu, iommu_domid, > > - (paddr_t)gfn << PAGE_SHIFT_4K, > PAGE_ORDER_4K, > > Note how this used PAGE_ORDER_4K so far? Sorry, this is a rebasing mistake. > > > - !dma_old_pte_present, flush_dev_iotlb) ) > > - iommu_flush_write_buffer(iommu); > > + iommu_flush_write_buffer(iommu); > > Same question again: Why is this all of the sudden needed on both paths? > The same as above question. Hold on first. > > @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain > *domain, u64 addr) > > if ( pg_maddr == 0 ) > > { > > spin_unlock(&hd->arch.mapping_lock); > > - return; > > + return -ENOMEM; > > } > > addr_to_dma_page_maddr() gets called with "alloc" being false, so there can't > be any memory allocation failure here. There simply is nothing to do in this > case. > I copy it from iommu_map_page(). Good, then the error of iommu_unmap_page() looks only from flush (the crash is at least obvious), then error handling can be lighter weight-- We may return an error, but don't roll back the failed operation. Right? > > -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) > > +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) > > { > > u32 id; > > + int rc = 0; > > > > id = pci_conf_read32(0, 0, 0, 0, 0); > > if ( IS_CTG(id) ) > > { > > /* quit if ME does not exist */ > > if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff ) > > - return; > > + return -ENOENT; > > Is this really an error? IOW, do all systems which satisfy IS_CTG() have such a > device? > To be honest, I didn't know much about me_wifi_quirk. Now, IMO I don't need to deal with me_wifi_quirk(). Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-25 9:27 ` Xu, Quan @ 2016-03-29 7:36 ` Jan Beulich 2016-04-11 3:09 ` Xu, Quan 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-03-29 7:36 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, Jun Nakajima, George Dunlap, Andrew Cooper, DarioFaggioli, xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit, Keir Fraser >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote: > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: >> > --- a/xen/drivers/passthrough/iommu.c >> > +++ b/xen/drivers/passthrough/iommu.c >> > @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct >> domain *d) >> > ((page->u.inuse.type_info & PGT_type_mask) >> > == PGT_writable_page) ) >> > mapping |= IOMMUF_writable; >> > - hd->platform_ops->map_page(d, gfn, mfn, mapping); >> > + if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) ) >> > + printk(XENLOG_G_ERR >> > + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) >> failed.\n", >> > + gfn, mfn); >> > + >> >> Printing one message here is certainly necessary, but what if the failure > repeats >> for very many pages? > > Yes, to me, it is ok, but I am open to your suggestion. > >> Also %#lx instead of 0x%lx please, and a blank before the >> opening parenthesis. >> > OK, just check it: > > .. > "IOMMU: Map page gfn: %#lx (mfn: %#lx) failed.\n" > .. > > Right? Almost: Generally no full stop in log messages please. And I think the word "page" is redundant here. As rule of thumb: Log messages should give as much as possible useful information (which includes the requirement for distinct ones to be distinguishable in resulting logs) with as little as possible verbosity. >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) >> > iommu = drhd->iommu; >> > iommu_flush_context_global(iommu, 0); >> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; >> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); >> > + rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); >> > + >> > + if ( rc > 0 ) >> > + { >> > + iommu_flush_write_buffer(iommu); >> >> Why is this needed all of the sudden? > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my machine, and > I can find the following log message: > """ > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. > """ > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should be called to > flush every IOMMU. For one what you say suggests that right now this is being done for some (one?) IOMMU(s), which I don't see being the case. And then what you say _still_ doesn't say _why_ this is now needed all of the sudden. If, in the course of doing your re-work here, you find pre-existing issues with the code, please split the necessary fixes out of your re-work and submit them separately with proper explanations in their commit messages. >> > + rc = 0; >> > + } >> > + else if ( rc < 0 ) >> > + { >> > + printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n"); >> > + break; >> > + } >> >> Is a log message really advisable here? >> > > To me, It looks tricky too. I was struggling to make decision. For scheme B, > I would try to do as below: > > if ( iommu_flush_all() ) > printk("... nnn ..."); > > but there are 4 function calls, if so, to me, it looks redundant. > > Or, could I ignore the print out for iommu_flush_all() failed? Directing the question back is not helpful: You should know better than me why you had added a log message there in the first place. And it is this "why" which is to judge about whether it should stay there, move elsewhere, or get dropped altogether. >> > @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain >> *domain, u64 addr) >> > if ( pg_maddr == 0 ) >> > { >> > spin_unlock(&hd->arch.mapping_lock); >> > - return; >> > + return -ENOMEM; >> > } >> >> addr_to_dma_page_maddr() gets called with "alloc" being false, so there can't >> be any memory allocation failure here. There simply is nothing to do in this >> case. >> > > I copy it from iommu_map_page(). > > Good, then the error of iommu_unmap_page() looks only from flush (the crash > is at least obvious), then error handling can be lighter weight-- > We may return an error, but don't roll back the failed operation. > Right? I don't think so, and I can only re-iterate: There can't be any error here, so there's no error code to forward up the call tree. IOW the "pg_maddr == 0" case simply means "nothing to do" here. >> > -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) >> > +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) >> > { >> > u32 id; >> > + int rc = 0; >> > >> > id = pci_conf_read32(0, 0, 0, 0, 0); >> > if ( IS_CTG(id) ) >> > { >> > /* quit if ME does not exist */ >> > if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff ) >> > - return; >> > + return -ENOENT; >> >> Is this really an error? IOW, do all systems which satisfy IS_CTG() have such a >> device? >> > To be honest, I didn't know much about me_wifi_quirk. In such a case - how about asking the maintainers, who are your colleagues? And that of course after having looked at the history in an attempt to gain some understanding. > Now, IMO I don't need to deal with me_wifi_quirk(). Well, you clearly can't ignore it. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-03-29 7:36 ` Jan Beulich @ 2016-04-11 3:09 ` Xu, Quan 2016-04-11 3:27 ` Xu, Quan 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-04-11 3:09 UTC (permalink / raw) To: Jan Beulich Cc: Tian, Kevin, Wu, Feng, Suravee Suthikulpanit, George Dunlap, Andrew Cooper, DarioFaggioli, xen-devel, Julien Grall, Stefano Stabellini, Nakajima, Jun, Keir Fraser On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote: > >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote: > > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > >> > --- a/xen/drivers/passthrough/iommu.c > >> > +++ b/xen/drivers/passthrough/iommu.c > >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) > >> > iommu = drhd->iommu; > >> > iommu_flush_context_global(iommu, 0); > >> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; > >> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > >> > + rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > >> > + > >> > + if ( rc > 0 ) > >> > + { > >> > + iommu_flush_write_buffer(iommu); > >> > >> Why is this needed all of the sudden? > > > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my > > machine, and I can find the following log message: > > """ > > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. > > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. > > """ > > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should be > > called to flush every IOMMU. > > For one what you say suggests that right now this is being done for some (one?) > IOMMU(s), which I don't see being the case. And then what you say _still_ > doesn't say _why_ this is now needed all of the sudden. If, in the course of > doing your re-work here, you find pre-existing issues with the code, please split > the necessary fixes out of your re-work and submit them separately with proper > explanations in their commit messages. > I find out it is no need modification for this function. I overlooked the parameter 'flush_non_present_entry', which is '0' to call iommu_flush_iotlb_global(). At the bottom of the call tree, (Existing code) >> In flush->iotlb(): { .... if ( flush_non_present_entry ) { if ( !cap_caching_mode(iommu->cap) ) return 1; else did = 0; } .... } << it is impossible to return '1', which is used to indicate caller needs to flush cache. Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-04-11 3:09 ` Xu, Quan @ 2016-04-11 3:27 ` Xu, Quan 2016-04-11 16:34 ` Jan Beulich 0 siblings, 1 reply; 42+ messages in thread From: Xu, Quan @ 2016-04-11 3:27 UTC (permalink / raw) To: Jan Beulich Cc: Tian, Kevin, Wu, Feng, Nakajima, Jun, George Dunlap, Andrew Cooper, DarioFaggioli, xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit, Keir Fraser On April 11, 2016 11:10am, <quan.xu@intel.com> wrote: > On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote: > > >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote: > > > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: > > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > > >> > --- a/xen/drivers/passthrough/iommu.c > > >> > +++ b/xen/drivers/passthrough/iommu.c > > >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) > > >> > iommu = drhd->iommu; > > >> > iommu_flush_context_global(iommu, 0); > > >> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; > > >> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > > >> > + rc = iommu_flush_iotlb_global(iommu, 0, > > >> > + flush_dev_iotlb); > > >> > + > > >> > + if ( rc > 0 ) > > >> > + { > > >> > + iommu_flush_write_buffer(iommu); > > >> > > >> Why is this needed all of the sudden? > > > > > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my > > > machine, and I can find the following log message: > > > """ > > > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. > > > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. > > > """ > > > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should be > > > called to flush every IOMMU. > > > > For one what you say suggests that right now this is being done for > > some (one?) IOMMU(s), which I don't see being the case. And then what > > you say _still_ doesn't say _why_ this is now needed all of the > > sudden. If, in the course of doing your re-work here, you find > > pre-existing issues with the code, please split the necessary fixes > > out of your re-work and submit them separately with proper explanations in > their commit messages. > > > > I find out it is no need modification for this function. Sorry, this modification refers to as: " + if ( rc > 0 ) + { + iommu_flush_write_buffer(iommu); + rc = 0; + } " at least errors need to be propagated. Quan > I overlooked the parameter 'flush_non_present_entry', which is '0' to call > iommu_flush_iotlb_global(). > At the bottom of the call tree, > (Existing code) > >> > In flush->iotlb(): > { > .... > if ( flush_non_present_entry ) > { > if ( !cap_caching_mode(iommu->cap) ) > return 1; > else > did = 0; > } > > .... > } > << > > > it is impossible to return '1', which is used to indicate caller needs to flush > cache. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-04-11 3:27 ` Xu, Quan @ 2016-04-11 16:34 ` Jan Beulich 2016-04-12 1:09 ` Xu, Quan 0 siblings, 1 reply; 42+ messages in thread From: Jan Beulich @ 2016-04-11 16:34 UTC (permalink / raw) To: Quan Xu Cc: Kevin Tian, Feng Wu, Jun Nakajima, George Dunlap, Andrew Cooper, DarioFaggioli, xen-devel, Julien Grall, StefanoStabellini, Suravee Suthikulpanit, Keir Fraser >>> On 11.04.16 at 05:27, <quan.xu@intel.com> wrote: > On April 11, 2016 11:10am, <quan.xu@intel.com> wrote: >> On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote: >> > >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote: >> > > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: >> > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: >> > >> > --- a/xen/drivers/passthrough/iommu.c >> > >> > +++ b/xen/drivers/passthrough/iommu.c >> > >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) >> > >> > iommu = drhd->iommu; >> > >> > iommu_flush_context_global(iommu, 0); >> > >> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; >> > >> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); >> > >> > + rc = iommu_flush_iotlb_global(iommu, 0, >> > >> > + flush_dev_iotlb); >> > >> > + >> > >> > + if ( rc > 0 ) >> > >> > + { >> > >> > + iommu_flush_write_buffer(iommu); >> > >> >> > >> Why is this needed all of the sudden? >> > > >> > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my >> > > machine, and I can find the following log message: >> > > """ >> > > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. >> > > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. >> > > """ >> > > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should be >> > > called to flush every IOMMU. >> > >> > For one what you say suggests that right now this is being done for >> > some (one?) IOMMU(s), which I don't see being the case. And then what >> > you say _still_ doesn't say _why_ this is now needed all of the >> > sudden. If, in the course of doing your re-work here, you find >> > pre-existing issues with the code, please split the necessary fixes >> > out of your re-work and submit them separately with proper explanations in >> their commit messages. >> > >> >> I find out it is no need modification for this function. > Sorry, this modification refers to as: > " > + if ( rc > 0 ) > + { > + iommu_flush_write_buffer(iommu); > + rc = 0; > + } > " > > at least errors need to be propagated. How does error propagation correlate with suddenly calling iommu_flush_write_buffer()? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error. 2016-04-11 16:34 ` Jan Beulich @ 2016-04-12 1:09 ` Xu, Quan 0 siblings, 0 replies; 42+ messages in thread From: Xu, Quan @ 2016-04-12 1:09 UTC (permalink / raw) To: Jan Beulich Cc: Tian, Kevin, Wu, Feng, Nakajima, Jun, George Dunlap, Andrew Cooper, DarioFaggioli, xen-devel, Julien Grall, StefanoStabellini, Suravee Suthikulpanit, Keir Fraser On April 12, 2016 12:35am, Jan Beulich <JBeulich@suse.com> wrote: > >>> On 11.04.16 at 05:27, <quan.xu@intel.com> wrote: > > On April 11, 2016 11:10am, <quan.xu@intel.com> wrote: > >> On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote: > >> > >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote: > >> > > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: > >> > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: > >> > >> > --- a/xen/drivers/passthrough/iommu.c > >> > >> > +++ b/xen/drivers/passthrough/iommu.c > >> > >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) > >> > >> > iommu = drhd->iommu; > >> > >> > iommu_flush_context_global(iommu, 0); > >> > >> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; > >> > >> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); > >> > >> > + rc = iommu_flush_iotlb_global(iommu, 0, > >> > >> > + flush_dev_iotlb); > >> > >> > + > >> > >> > + if ( rc > 0 ) > >> > >> > + { > >> > >> > + iommu_flush_write_buffer(iommu); > >> > >> > >> > >> Why is this needed all of the sudden? > >> > > > >> > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my > >> > > machine, and I can find the following log message: > >> > > """ > >> > > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. > >> > > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. > >> > > """ > >> > > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should > >> > > be called to flush every IOMMU. > >> > > >> > For one what you say suggests that right now this is being done for > >> > some (one?) IOMMU(s), which I don't see being the case. And then > >> > what you say _still_ doesn't say _why_ this is now needed all of > >> > the sudden. If, in the course of doing your re-work here, you find > >> > pre-existing issues with the code, please split the necessary fixes > >> > out of your re-work and submit them separately with proper > >> > explanations in > >> their commit messages. > >> > > >> > >> I find out it is no need modification for this function. > > Sorry, this modification refers to as: > > " > > + if ( rc > 0 ) > > + { > > + iommu_flush_write_buffer(iommu); > > + rc = 0; > > + } > > " > > My bad description, what I mean is that I will drop above modification, then.. > > at least errors need to be propagated. > > How does error propagation correlate with suddenly calling > iommu_flush_write_buffer()? > iommu_flush_write_buffer() is no longer called suddenly in this function. Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2016-04-12 1:09 UTC | newest] Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-03-17 6:54 [PATCH 0/2] Check VT-d Device-TLB flush error Quan Xu 2016-03-17 6:54 ` [PATCH 1/2] IOMMU/MMU: Adjust top level functions for " Quan Xu 2016-03-17 7:32 ` Tian, Kevin 2016-03-17 7:58 ` Jan Beulich 2016-03-17 8:00 ` Tian, Kevin 2016-03-17 12:30 ` George Dunlap 2016-03-17 12:33 ` George Dunlap 2016-03-18 3:19 ` Xu, Quan 2016-03-18 8:09 ` Jan Beulich 2016-03-24 6:45 ` Xu, Quan 2016-03-18 7:54 ` Xu, Quan 2016-03-18 8:19 ` Jan Beulich 2016-03-18 9:09 ` Xu, Quan 2016-03-18 9:29 ` Jan Beulich 2016-03-18 9:38 ` Dario Faggioli 2016-03-18 9:48 ` Jan Beulich 2016-03-21 6:18 ` Tian, Kevin 2016-03-21 12:22 ` Jan Beulich 2016-03-24 9:02 ` Xu, Quan 2016-03-24 9:58 ` Jan Beulich 2016-03-24 14:12 ` Xu, Quan 2016-03-24 14:37 ` Jan Beulich 2016-03-17 17:14 ` Jan Beulich 2016-03-28 3:33 ` Xu, Quan 2016-03-29 7:20 ` Jan Beulich 2016-03-30 2:28 ` Xu, Quan 2016-03-30 2:35 ` Xu, Quan 2016-03-30 8:05 ` Jan Beulich 2016-03-17 6:54 ` [PATCH 2/2] IOMMU/MMU: Adjust low " Quan Xu 2016-03-17 7:37 ` Tian, Kevin 2016-03-18 2:30 ` Xu, Quan 2016-03-18 8:06 ` Jan Beulich 2016-03-21 5:01 ` Tian, Kevin 2016-03-17 15:31 ` George Dunlap 2016-03-18 6:57 ` Xu, Quan 2016-03-18 10:20 ` Jan Beulich 2016-03-25 9:27 ` Xu, Quan 2016-03-29 7:36 ` Jan Beulich 2016-04-11 3:09 ` Xu, Quan 2016-04-11 3:27 ` Xu, Quan 2016-04-11 16:34 ` Jan Beulich 2016-04-12 1:09 ` Xu, Quan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).