All of lore.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH v8 00/14] VT-d unit test
@ 2016-12-12  3:08 ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

This is v8 of vt-d unit test series.

(CC qemu-devel list this time.)

Patch "libcflat: add IS_ALIGNED() macro, and page sizes" is picked up
by Drew in the ARM GIC framework series, so please feel free to drop
it when needed.

Online repo:

  https://github.com/xzpeter/kvm-unit-tests.git iommu-ut-v8

Please review. Thanks,

v8:
- use ioremap() to access vt-d/edu registers (adding vtd_reg_base to
  store vt-d mmio base register vaddr, and pci_edu_dev.reg_base to
  store edu device base register vaddr) [Alex]
- check vt-d version [Alex]
- vtd_install_pte: do phys_to_virt() for root pointer [Alex]
- renaming VTD_FETCH_VIRT_ADDR to VTD_PHYS_TO_VIRT [Alex]
- define VTD_PAGE_SHIFT and VTD_PAGE_SIZE, use them when necessary
  [Alex]
- change define of VTD_PTE_ADDR to cover bits 63:12

v7:
- patch 13: add r-b for Drew
- change type of dev_offset in edu_dma() into unsigned [Alex]
- squash Alex's fix (thanks) for ARM build into patch 08/14 [Alex]

v6:
- pci: rename pci_dev.bar to pcidev.resource (to Drew: I didn't remove
  your r-b in several patches due to this change, assuming the change
  is trivial) [Alex]
- pci_cap_walk(): add an assert to avoid dead loop [Alex]
- pci_cap_walk(): another assertion to make sure cap_id is valid
  [Drew]

v5:
- add one more patch at the beginning of series to add "extern" to
  pci_testdev() header definition [Drew]
- add "extern" to all pci.h functions [Drew]
- update IS_ALIGNED patch to be aligned with Drew's gic series [Drew]
- replace one more PCI_DEVFN_MAX/PCI_BAR_NUM in lib/pci-host-generic.c
  [Drew]
- add {} in vtd_install_pte() when there are one line code and one
  line comment [Drew]
- drop Drew's suggested-by in patch "x86: intel-iommu: add IR MSI
  test" (I forgot why I added it, but anyway let me drop it here...)
  [Drew]

v4:
(same as v3, but with correct patches sent out)

v3:
- is_power_of_2(): remove outer "()" [Drew]
- fix indent for vtd_root_table() [Drew]
- pci_find_dev(): keep {} to avoid touching more codes [Drew]
- pci_scan_bars(): check whether bar is 64bits [Drew]
- pci_enable_defaults(): make it return nothing [Drew]
- fix strnang blanks in lib/x86/intel-iommu.h [Drew]
- pci_setup_msi(): fix correct assert() inside [Drew]
- squash patches: 03+17 08+09 10+11 12+14 [Drew]
- remove all "()" for raw defines (numbers) in macros [Drew]

v2:
- move cpu_relax patch to the beginning, and use them in all places
[Drew]
- replace all corresponding 256 into PCI_DEVFN_MAX, as well for
PCI_BAR_NUM [Drew]
- adding is_power_of_2() to replace ONE_BIT_ONLY() [Drew]
- add SZ_64K macro [Drew]
- declare pci_config_write[wb] in lib/asm-generic/pci-host-bridge.h [Alex]
- edu_reg_read/write() add "l" in func name [Drew]
- drop pci_set_master(), instead, provide pci_cmd_set_clr() [Drew]
- change return code into bool (always) for functions that apply
[Drew]
- keep old pci_find_dev() interface [Drew/Alex]
- use __raw_{read|write}*() for both vt-d and edu register read/writes
[Alex]
- remove pci_ prefix for all pci_dev fields [Drew]
- replace 0xff in cap_handlers[0xff] into (PCI_CAP_ID_MAX + 1) [Drew]
- make x86/unittest.cfg simpler by using q35 directly with eim=off
[Drew]

RFC -> v1:
- when init edu device fail, report_skip() rather than return error
[Radim]
- use asserts rather than "static bool inited" to avoid multiple init
of components (affects patch 1/2) [Drew]
- moving the first two patches out of the series [Drew]
- int vtd_init(), do not setup_idt() since smp_init() did it [Drew]
- when edu do not have MSI enabled, skip interrupt test [Radim]
- rename vtd_reg_*() into vtd_{read|write}[lq](), and move them to
header file [Drew]
- use PAGE_MASK when able [Drew]
- use "&" instead of "|" in intel-iommu init test (three places)
[Drew]
- use "vtd_init()" in unit test [Drew]
- mention that where intel-iommu.h comes from [Drew]
- re-written vtd_gcmd_or(), make it also work on even hardware [Drew]
- remove most of the oneline wrapper for VT-d registers, instead, use
vtd_{read|write}* with register names [Drew]
- remove useless BDF helpers [Drew]
- move edu device macros into header file [Drew]
- make edu_check_alive static inline [Drew]
- remove all useless wrappers in pci-edu.c [Drew]
- remove pci_dma_dir_t and all its users, instead, use "bool
from_device" [Drew]
- not use typedef for structs, to follow Linux/kvm-unit-tests coding
style [Drew]
- let pci_dev_init() clean and simple, then provide
pci_enable_defaults() for more complicated things [Drew]
- add one more patch to add intel-iommu test into x86/unittest [Radim]
- use 0x60 intr request instead of factorial to trigger edu device
interrupt [Drew]
- ...and some other changes I just forgot to note down...

Currently only a very small test scope is covered:

* VT-d init
* DMAR: 4 bytes copy
* IR: MSI

However this series could be a base point to add more test cases for
VT-d. The problem is, there are many IOMMU error conditions which are
very hard to be triggered in a real guest (IOMMU has merely no
interface for guest user, and it's totally running in the background).
This piece of work can be a start point if we want to do more
complicated things and play around with Intel IOMMU devices (also for
IOMMU regression tests).

Please review. Thanks,

Peter Xu (14):
  pci: fix missing extern for pci_testdev()
  x86/asm: add cpu_relax()
  libcflat: introduce is_power_of_2()
  x86: intel-iommu: add vt-d init test
  libcflat: add IS_ALIGNED() macro, and page sizes
  libcflat: moving MIN/MAX here
  vm/page: provide PGDIR_OFFSET() macro
  pci: introduce struct pci_dev
  pci: provide pci_scan_bars()
  pci: provide pci_enable_defaults()
  pci: edu: introduce pci-edu helpers
  x86: intel-iommu: add dmar test
  pci: add msi support for 32/64bit address
  x86: intel-iommu: add IR MSI test

 lib/alloc.c            |   3 -
 lib/libcflat.h         |  14 +++
 lib/pci-edu.c          |  73 +++++++++++
 lib/pci-edu.h          |  84 +++++++++++++
 lib/pci-host-generic.c |  19 +--
 lib/pci-testdev.c      |  10 +-
 lib/pci.c              | 161 ++++++++++++++++++++----
 lib/pci.h              |  42 +++++--
 lib/x86/asm/barrier.h  |  11 ++
 lib/x86/asm/page.h     |   3 +
 lib/x86/intel-iommu.c  | 326 +++++++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h  | 146 ++++++++++++++++++++++
 lib/x86/vm.c           |   4 +-
 x86/Makefile.common    |   1 +
 x86/Makefile.x86_64    |   2 +
 x86/intel-iommu.c      | 120 ++++++++++++++++++
 x86/unittests.cfg      |   7 ++
 x86/vmexit.c           |  27 ++--
 18 files changed, 989 insertions(+), 64 deletions(-)
 create mode 100644 lib/pci-edu.c
 create mode 100644 lib/pci-edu.h
 create mode 100644 lib/x86/intel-iommu.c
 create mode 100644 lib/x86/intel-iommu.h
 create mode 100644 x86/intel-iommu.c

-- 
2.7.4


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 00/14] VT-d unit test
@ 2016-12-12  3:08 ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

This is v8 of vt-d unit test series.

(CC qemu-devel list this time.)

Patch "libcflat: add IS_ALIGNED() macro, and page sizes" is picked up
by Drew in the ARM GIC framework series, so please feel free to drop
it when needed.

Online repo:

  https://github.com/xzpeter/kvm-unit-tests.git iommu-ut-v8

Please review. Thanks,

v8:
- use ioremap() to access vt-d/edu registers (adding vtd_reg_base to
  store vt-d mmio base register vaddr, and pci_edu_dev.reg_base to
  store edu device base register vaddr) [Alex]
- check vt-d version [Alex]
- vtd_install_pte: do phys_to_virt() for root pointer [Alex]
- renaming VTD_FETCH_VIRT_ADDR to VTD_PHYS_TO_VIRT [Alex]
- define VTD_PAGE_SHIFT and VTD_PAGE_SIZE, use them when necessary
  [Alex]
- change define of VTD_PTE_ADDR to cover bits 63:12

v7:
- patch 13: add r-b for Drew
- change type of dev_offset in edu_dma() into unsigned [Alex]
- squash Alex's fix (thanks) for ARM build into patch 08/14 [Alex]

v6:
- pci: rename pci_dev.bar to pcidev.resource (to Drew: I didn't remove
  your r-b in several patches due to this change, assuming the change
  is trivial) [Alex]
- pci_cap_walk(): add an assert to avoid dead loop [Alex]
- pci_cap_walk(): another assertion to make sure cap_id is valid
  [Drew]

v5:
- add one more patch at the beginning of series to add "extern" to
  pci_testdev() header definition [Drew]
- add "extern" to all pci.h functions [Drew]
- update IS_ALIGNED patch to be aligned with Drew's gic series [Drew]
- replace one more PCI_DEVFN_MAX/PCI_BAR_NUM in lib/pci-host-generic.c
  [Drew]
- add {} in vtd_install_pte() when there are one line code and one
  line comment [Drew]
- drop Drew's suggested-by in patch "x86: intel-iommu: add IR MSI
  test" (I forgot why I added it, but anyway let me drop it here...)
  [Drew]

v4:
(same as v3, but with correct patches sent out)

v3:
- is_power_of_2(): remove outer "()" [Drew]
- fix indent for vtd_root_table() [Drew]
- pci_find_dev(): keep {} to avoid touching more codes [Drew]
- pci_scan_bars(): check whether bar is 64bits [Drew]
- pci_enable_defaults(): make it return nothing [Drew]
- fix strnang blanks in lib/x86/intel-iommu.h [Drew]
- pci_setup_msi(): fix correct assert() inside [Drew]
- squash patches: 03+17 08+09 10+11 12+14 [Drew]
- remove all "()" for raw defines (numbers) in macros [Drew]

v2:
- move cpu_relax patch to the beginning, and use them in all places
[Drew]
- replace all corresponding 256 into PCI_DEVFN_MAX, as well for
PCI_BAR_NUM [Drew]
- adding is_power_of_2() to replace ONE_BIT_ONLY() [Drew]
- add SZ_64K macro [Drew]
- declare pci_config_write[wb] in lib/asm-generic/pci-host-bridge.h [Alex]
- edu_reg_read/write() add "l" in func name [Drew]
- drop pci_set_master(), instead, provide pci_cmd_set_clr() [Drew]
- change return code into bool (always) for functions that apply
[Drew]
- keep old pci_find_dev() interface [Drew/Alex]
- use __raw_{read|write}*() for both vt-d and edu register read/writes
[Alex]
- remove pci_ prefix for all pci_dev fields [Drew]
- replace 0xff in cap_handlers[0xff] into (PCI_CAP_ID_MAX + 1) [Drew]
- make x86/unittest.cfg simpler by using q35 directly with eim=off
[Drew]

RFC -> v1:
- when init edu device fail, report_skip() rather than return error
[Radim]
- use asserts rather than "static bool inited" to avoid multiple init
of components (affects patch 1/2) [Drew]
- moving the first two patches out of the series [Drew]
- int vtd_init(), do not setup_idt() since smp_init() did it [Drew]
- when edu do not have MSI enabled, skip interrupt test [Radim]
- rename vtd_reg_*() into vtd_{read|write}[lq](), and move them to
header file [Drew]
- use PAGE_MASK when able [Drew]
- use "&" instead of "|" in intel-iommu init test (three places)
[Drew]
- use "vtd_init()" in unit test [Drew]
- mention that where intel-iommu.h comes from [Drew]
- re-written vtd_gcmd_or(), make it also work on even hardware [Drew]
- remove most of the oneline wrapper for VT-d registers, instead, use
vtd_{read|write}* with register names [Drew]
- remove useless BDF helpers [Drew]
- move edu device macros into header file [Drew]
- make edu_check_alive static inline [Drew]
- remove all useless wrappers in pci-edu.c [Drew]
- remove pci_dma_dir_t and all its users, instead, use "bool
from_device" [Drew]
- not use typedef for structs, to follow Linux/kvm-unit-tests coding
style [Drew]
- let pci_dev_init() clean and simple, then provide
pci_enable_defaults() for more complicated things [Drew]
- add one more patch to add intel-iommu test into x86/unittest [Radim]
- use 0x60 intr request instead of factorial to trigger edu device
interrupt [Drew]
- ...and some other changes I just forgot to note down...

Currently only a very small test scope is covered:

* VT-d init
* DMAR: 4 bytes copy
* IR: MSI

However this series could be a base point to add more test cases for
VT-d. The problem is, there are many IOMMU error conditions which are
very hard to be triggered in a real guest (IOMMU has merely no
interface for guest user, and it's totally running in the background).
This piece of work can be a start point if we want to do more
complicated things and play around with Intel IOMMU devices (also for
IOMMU regression tests).

Please review. Thanks,

Peter Xu (14):
  pci: fix missing extern for pci_testdev()
  x86/asm: add cpu_relax()
  libcflat: introduce is_power_of_2()
  x86: intel-iommu: add vt-d init test
  libcflat: add IS_ALIGNED() macro, and page sizes
  libcflat: moving MIN/MAX here
  vm/page: provide PGDIR_OFFSET() macro
  pci: introduce struct pci_dev
  pci: provide pci_scan_bars()
  pci: provide pci_enable_defaults()
  pci: edu: introduce pci-edu helpers
  x86: intel-iommu: add dmar test
  pci: add msi support for 32/64bit address
  x86: intel-iommu: add IR MSI test

 lib/alloc.c            |   3 -
 lib/libcflat.h         |  14 +++
 lib/pci-edu.c          |  73 +++++++++++
 lib/pci-edu.h          |  84 +++++++++++++
 lib/pci-host-generic.c |  19 +--
 lib/pci-testdev.c      |  10 +-
 lib/pci.c              | 161 ++++++++++++++++++++----
 lib/pci.h              |  42 +++++--
 lib/x86/asm/barrier.h  |  11 ++
 lib/x86/asm/page.h     |   3 +
 lib/x86/intel-iommu.c  | 326 +++++++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h  | 146 ++++++++++++++++++++++
 lib/x86/vm.c           |   4 +-
 x86/Makefile.common    |   1 +
 x86/Makefile.x86_64    |   2 +
 x86/intel-iommu.c      | 120 ++++++++++++++++++
 x86/unittests.cfg      |   7 ++
 x86/vmexit.c           |  27 ++--
 18 files changed, 989 insertions(+), 64 deletions(-)
 create mode 100644 lib/pci-edu.c
 create mode 100644 lib/pci-edu.h
 create mode 100644 lib/x86/intel-iommu.c
 create mode 100644 lib/x86/intel-iommu.h
 create mode 100644 x86/intel-iommu.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 01/14] pci: fix missing extern for pci_testdev()
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/pci.h b/lib/pci.h
index 30f5381..86d6b42 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -43,7 +43,7 @@ extern bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num);
 extern void pci_bar_print(pcidevaddr_t dev, int bar_num);
 extern void pci_dev_print_id(pcidevaddr_t dev);
 
-int pci_testdev(void);
+extern int pci_testdev(void);
 
 /*
  * pci-testdev is a driver for the pci-testdev qemu pci device. The
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 01/14] pci: fix missing extern for pci_testdev()
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/pci.h b/lib/pci.h
index 30f5381..86d6b42 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -43,7 +43,7 @@ extern bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num);
 extern void pci_bar_print(pcidevaddr_t dev, int bar_num);
 extern void pci_dev_print_id(pcidevaddr_t dev);
 
-int pci_testdev(void);
+extern int pci_testdev(void);
 
 /*
  * pci-testdev is a driver for the pci-testdev qemu pci device. The
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 02/14] x86/asm: add cpu_relax()
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

This will be useful to be put inside loops.

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/x86/asm/barrier.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/lib/x86/asm/barrier.h b/lib/x86/asm/barrier.h
index 7c108bd..193fb4c 100644
--- a/lib/x86/asm/barrier.h
+++ b/lib/x86/asm/barrier.h
@@ -13,4 +13,15 @@
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 
+/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
+static inline void rep_nop(void)
+{
+	asm volatile("rep; nop" ::: "memory");
+}
+
+static inline void cpu_relax(void)
+{
+	rep_nop();
+}
+
 #endif
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 02/14] x86/asm: add cpu_relax()
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

This will be useful to be put inside loops.

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/x86/asm/barrier.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/lib/x86/asm/barrier.h b/lib/x86/asm/barrier.h
index 7c108bd..193fb4c 100644
--- a/lib/x86/asm/barrier.h
+++ b/lib/x86/asm/barrier.h
@@ -13,4 +13,15 @@
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 
+/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
+static inline void rep_nop(void)
+{
+	asm volatile("rep; nop" ::: "memory");
+}
+
+static inline void cpu_relax(void)
+{
+	rep_nop();
+}
+
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 03/14] libcflat: introduce is_power_of_2()
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/libcflat.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index c622198..8455c3d 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -104,4 +104,9 @@ do {									\
 	}								\
 } while (0)
 
+static inline bool is_power_of_2(unsigned long n)
+{
+	return n && !(n & (n - 1));
+}
+
 #endif
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 03/14] libcflat: introduce is_power_of_2()
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/libcflat.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index c622198..8455c3d 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -104,4 +104,9 @@ do {									\
 	}								\
 } while (0)
 
+static inline bool is_power_of_2(unsigned long n)
+{
+	return n && !(n & (n - 1));
+}
+
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 04/14] x86: intel-iommu: add vt-d init test
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Adding fundamental init test for Intel IOMMU. This includes basic
initialization of Intel IOMMU device, like DMAR (DMA Remapping),
IR (Interrupt Remapping), QI (Queue Invalidation), etc.

Further tests can use vtd_init() to initialize Intel IOMMU environment.

x86/unittests is updated to add this test.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/x86/intel-iommu.c |  99 +++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++
 x86/Makefile.x86_64   |   2 +
 x86/intel-iommu.c     |  27 ++++++++++
 x86/unittests.cfg     |   7 +++
 5 files changed, 276 insertions(+)
 create mode 100644 lib/x86/intel-iommu.c
 create mode 100644 lib/x86/intel-iommu.h
 create mode 100644 x86/intel-iommu.c

diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
new file mode 100644
index 0000000..1887998
--- /dev/null
+++ b/lib/x86/intel-iommu.c
@@ -0,0 +1,99 @@
+/*
+ * Intel IOMMU APIs
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ */
+
+#include "intel-iommu.h"
+
+#define VTD_RTA_MASK  (PAGE_MASK)
+#define VTD_IRTA_MASK (PAGE_MASK)
+
+void *vtd_reg_base;
+
+static uint64_t vtd_root_table(void)
+{
+	/* No extend root table support yet */
+	return vtd_readq(DMAR_RTADDR_REG) & VTD_RTA_MASK;
+}
+
+static uint64_t vtd_ir_table(void)
+{
+	return vtd_readq(DMAR_IRTA_REG) & VTD_IRTA_MASK;
+}
+
+static void vtd_gcmd_or(uint32_t cmd)
+{
+	uint32_t status;
+
+	/* We only allow set one bit for each time */
+	assert(is_power_of_2(cmd));
+
+	status = vtd_readl(DMAR_GSTS_REG);
+	vtd_writel(DMAR_GCMD_REG, status | cmd);
+
+	if (cmd & VTD_GCMD_ONE_SHOT_BITS) {
+		/* One-shot bits are taking effect immediately */
+		return;
+	}
+
+	/* Make sure IOMMU handled our command request */
+	while (!(vtd_readl(DMAR_GSTS_REG) & cmd))
+		cpu_relax();
+}
+
+static void vtd_dump_init_info(void)
+{
+	uint32_t version;
+
+	version = vtd_readl(DMAR_VER_REG);
+
+	/* Major version >= 1 */
+	assert(((version >> 3) & 0xf) >= 1);
+
+	printf("VT-d version:   0x%x\n", version);
+	printf("     cap:       0x%016lx\n", vtd_readq(DMAR_CAP_REG));
+	printf("     ecap:      0x%016lx\n", vtd_readq(DMAR_ECAP_REG));
+}
+
+static void vtd_setup_root_table(void)
+{
+	void *root = alloc_page();
+
+	memset(root, 0, PAGE_SIZE);
+	vtd_writeq(DMAR_RTADDR_REG, virt_to_phys(root));
+	vtd_gcmd_or(VTD_GCMD_ROOT);
+	printf("DMAR table address: 0x%016lx\n", vtd_root_table());
+}
+
+static void vtd_setup_ir_table(void)
+{
+	void *root = alloc_page();
+
+	memset(root, 0, PAGE_SIZE);
+	/* 0xf stands for table size (2^(0xf+1) == 65536) */
+	vtd_writeq(DMAR_IRTA_REG, virt_to_phys(root) | 0xf);
+	vtd_gcmd_or(VTD_GCMD_IR_TABLE);
+	printf("IR table address: 0x%016lx\n", vtd_ir_table());
+}
+
+void vtd_init(void)
+{
+	setup_vm();
+	smp_init();
+
+	vtd_reg_base = ioremap(Q35_HOST_BRIDGE_IOMMU_ADDR, PAGE_SIZE);
+
+	vtd_dump_init_info();
+	vtd_gcmd_or(VTD_GCMD_QI); /* Enable QI */
+	vtd_setup_root_table();
+	vtd_setup_ir_table();
+	vtd_gcmd_or(VTD_GCMD_DMAR); /* Enable DMAR */
+	vtd_gcmd_or(VTD_GCMD_IR);   /* Enable IR */
+}
diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
new file mode 100644
index 0000000..92cc85f
--- /dev/null
+++ b/lib/x86/intel-iommu.h
@@ -0,0 +1,141 @@
+/*
+ * Intel IOMMU header
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ *
+ * (From include/linux/intel-iommu.h)
+ */
+
+#ifndef __INTEL_IOMMU_H__
+#define __INTEL_IOMMU_H__
+
+#include "libcflat.h"
+#include "vm.h"
+#include "isr.h"
+#include "smp.h"
+#include "desc.h"
+#include "asm/io.h"
+
+#define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed90000ULL
+
+/*
+ * Intel IOMMU register specification
+ */
+#define DMAR_VER_REG            0x0  /* Arch version supported by this IOMMU */
+#define DMAR_CAP_REG            0x8  /* Hardware supported capabilities */
+#define DMAR_CAP_REG_HI         0xc  /* High 32-bit of DMAR_CAP_REG */
+#define DMAR_ECAP_REG           0x10 /* Extended capabilities supported */
+#define DMAR_ECAP_REG_HI        0X14
+#define DMAR_GCMD_REG           0x18 /* Global command */
+#define DMAR_GSTS_REG           0x1c /* Global status */
+#define DMAR_RTADDR_REG         0x20 /* Root entry table */
+#define DMAR_RTADDR_REG_HI      0X24
+#define DMAR_CCMD_REG           0x28 /* Context command */
+#define DMAR_CCMD_REG_HI        0x2c
+#define DMAR_FSTS_REG           0x34 /* Fault status */
+#define DMAR_FECTL_REG          0x38 /* Fault control */
+#define DMAR_FEDATA_REG         0x3c /* Fault event interrupt data */
+#define DMAR_FEADDR_REG         0x40 /* Fault event interrupt addr */
+#define DMAR_FEUADDR_REG        0x44 /* Upper address */
+#define DMAR_AFLOG_REG          0x58 /* Advanced fault control */
+#define DMAR_AFLOG_REG_HI       0X5c
+#define DMAR_PMEN_REG           0x64 /* Enable protected memory region */
+#define DMAR_PLMBASE_REG        0x68 /* PMRR low addr */
+#define DMAR_PLMLIMIT_REG       0x6c /* PMRR low limit */
+#define DMAR_PHMBASE_REG        0x70 /* PMRR high base addr */
+#define DMAR_PHMBASE_REG_HI     0X74
+#define DMAR_PHMLIMIT_REG       0x78 /* PMRR high limit */
+#define DMAR_PHMLIMIT_REG_HI    0x7c
+#define DMAR_IQH_REG            0x80 /* Invalidation queue head */
+#define DMAR_IQH_REG_HI         0X84
+#define DMAR_IQT_REG            0x88 /* Invalidation queue tail */
+#define DMAR_IQT_REG_HI         0X8c
+#define DMAR_IQA_REG            0x90 /* Invalidation queue addr */
+#define DMAR_IQA_REG_HI         0x94
+#define DMAR_ICS_REG            0x9c /* Invalidation complete status */
+#define DMAR_IRTA_REG           0xb8 /* Interrupt remapping table addr */
+#define DMAR_IRTA_REG_HI        0xbc
+#define DMAR_IECTL_REG          0xa0 /* Invalidation event control */
+#define DMAR_IEDATA_REG         0xa4 /* Invalidation event data */
+#define DMAR_IEADDR_REG         0xa8 /* Invalidation event address */
+#define DMAR_IEUADDR_REG        0xac /* Invalidation event address */
+#define DMAR_PQH_REG            0xc0 /* Page request queue head */
+#define DMAR_PQH_REG_HI         0xc4
+#define DMAR_PQT_REG            0xc8 /* Page request queue tail*/
+#define DMAR_PQT_REG_HI         0xcc
+#define DMAR_PQA_REG            0xd0 /* Page request queue address */
+#define DMAR_PQA_REG_HI         0xd4
+#define DMAR_PRS_REG            0xdc /* Page request status */
+#define DMAR_PECTL_REG          0xe0 /* Page request event control */
+#define DMAR_PEDATA_REG         0xe4 /* Page request event data */
+#define DMAR_PEADDR_REG         0xe8 /* Page request event address */
+#define DMAR_PEUADDR_REG        0xec /* Page event upper address */
+#define DMAR_MTRRCAP_REG        0x100 /* MTRR capability */
+#define DMAR_MTRRCAP_REG_HI     0x104
+#define DMAR_MTRRDEF_REG        0x108 /* MTRR default type */
+#define DMAR_MTRRDEF_REG_HI     0x10c
+
+#define VTD_GCMD_IR_TABLE       0x1000000
+#define VTD_GCMD_IR             0x2000000
+#define VTD_GCMD_QI             0x4000000
+#define VTD_GCMD_WBF            0x8000000  /* Write Buffer Flush */
+#define VTD_GCMD_SFL            0x20000000 /* Set Fault Log */
+#define VTD_GCMD_ROOT           0x40000000
+#define VTD_GCMD_DMAR           0x80000000
+#define VTD_GCMD_ONE_SHOT_BITS  (VTD_GCMD_IR_TABLE | VTD_GCMD_WBF | \
+				 VTD_GCMD_SFL | VTD_GCMD_ROOT)
+
+/* Supported Adjusted Guest Address Widths */
+#define VTD_CAP_SAGAW_SHIFT         8
+/* 39-bit AGAW, 3-level page-table */
+#define VTD_CAP_SAGAW_39bit         (0x2ULL << VTD_CAP_SAGAW_SHIFT)
+/* 48-bit AGAW, 4-level page-table */
+#define VTD_CAP_SAGAW_48bit         (0x4ULL << VTD_CAP_SAGAW_SHIFT)
+#define VTD_CAP_SAGAW               VTD_CAP_SAGAW_39bit
+
+/* Both 1G/2M huge pages */
+#define VTD_CAP_SLLPS               ((1ULL << 34) | (1ULL << 35))
+
+#define VTD_CONTEXT_TT_MULTI_LEVEL  0
+#define VTD_CONTEXT_TT_DEV_IOTLB    1
+#define VTD_CONTEXT_TT_PASS_THROUGH 2
+
+#define VTD_PTE_R                   (1 << 0)
+#define VTD_PTE_W                   (1 << 1)
+#define VTD_PTE_RW                  (VTD_PTE_R | VTD_PTE_W)
+#define VTD_PTE_ADDR                GENMASK_ULL(63, 12)
+#define VTD_PTE_HUGE                (1 << 7)
+
+extern void *vtd_reg_base;
+#define vtd_reg(reg) ({ assert(vtd_reg_base); \
+			(volatile void *)(vtd_reg_base + reg); })
+
+static inline void vtd_writel(unsigned int reg, uint32_t value)
+{
+	__raw_writel(value, vtd_reg(reg));
+}
+
+static inline void vtd_writeq(unsigned int reg, uint64_t value)
+{
+	__raw_writeq(value, vtd_reg(reg));
+}
+
+static inline uint32_t vtd_readl(unsigned int reg)
+{
+	return __raw_readl(vtd_reg(reg));
+}
+
+static inline uint64_t vtd_readq(unsigned int reg)
+{
+	return __raw_readq(vtd_reg(reg));
+}
+
+void vtd_init(void);
+
+#endif
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index f82492b..3e2821e 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -4,6 +4,7 @@ ldarch = elf64-x86-64
 CFLAGS += -mno-red-zone
 
 cflatobjs += lib/x86/setjmp64.o
+cflatobjs += lib/x86/intel-iommu.o
 
 tests = $(TEST_DIR)/access.flat $(TEST_DIR)/apic.flat \
 	  $(TEST_DIR)/emulator.flat $(TEST_DIR)/idt_test.flat \
@@ -14,6 +15,7 @@ tests = $(TEST_DIR)/access.flat $(TEST_DIR)/apic.flat \
 tests += $(TEST_DIR)/svm.flat
 tests += $(TEST_DIR)/vmx.flat
 tests += $(TEST_DIR)/tscdeadline_latency.flat
+tests += $(TEST_DIR)/intel-iommu.flat
 
 include $(TEST_DIR)/Makefile.common
 
diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
new file mode 100644
index 0000000..f247913
--- /dev/null
+++ b/x86/intel-iommu.c
@@ -0,0 +1,27 @@
+/*
+ * Intel IOMMU unit test.
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ */
+
+#include "intel-iommu.h"
+
+int main(int argc, char *argv[])
+{
+	vtd_init();
+
+	report("fault status check", vtd_readl(DMAR_FSTS_REG) == 0);
+	report("QI enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_QI);
+	report("DMAR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_ROOT);
+	report("IR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR_TABLE);
+	report("DMAR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_DMAR);
+	report("IR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR);
+
+	return report_summary();
+}
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 23395c6..5413838 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -217,3 +217,10 @@ extra_params = -cpu kvm64,hv_time,hv_synic,hv_stimer -device hyperv-testdev
 file = hyperv_clock.flat
 smp = 2
 extra_params = -cpu kvm64,hv_time
+
+[intel_iommu]
+file = intel-iommu.flat
+arch = x86_64
+timeout = 30
+smp = 4
+extra_params = -M q35,kernel-irqchip=split -device intel-iommu,intremap=on,eim=off -device edu
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 04/14] x86: intel-iommu: add vt-d init test
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Adding fundamental init test for Intel IOMMU. This includes basic
initialization of Intel IOMMU device, like DMAR (DMA Remapping),
IR (Interrupt Remapping), QI (Queue Invalidation), etc.

Further tests can use vtd_init() to initialize Intel IOMMU environment.

x86/unittests is updated to add this test.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/x86/intel-iommu.c |  99 +++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++
 x86/Makefile.x86_64   |   2 +
 x86/intel-iommu.c     |  27 ++++++++++
 x86/unittests.cfg     |   7 +++
 5 files changed, 276 insertions(+)
 create mode 100644 lib/x86/intel-iommu.c
 create mode 100644 lib/x86/intel-iommu.h
 create mode 100644 x86/intel-iommu.c

diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
new file mode 100644
index 0000000..1887998
--- /dev/null
+++ b/lib/x86/intel-iommu.c
@@ -0,0 +1,99 @@
+/*
+ * Intel IOMMU APIs
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ */
+
+#include "intel-iommu.h"
+
+#define VTD_RTA_MASK  (PAGE_MASK)
+#define VTD_IRTA_MASK (PAGE_MASK)
+
+void *vtd_reg_base;
+
+static uint64_t vtd_root_table(void)
+{
+	/* No extend root table support yet */
+	return vtd_readq(DMAR_RTADDR_REG) & VTD_RTA_MASK;
+}
+
+static uint64_t vtd_ir_table(void)
+{
+	return vtd_readq(DMAR_IRTA_REG) & VTD_IRTA_MASK;
+}
+
+static void vtd_gcmd_or(uint32_t cmd)
+{
+	uint32_t status;
+
+	/* We only allow set one bit for each time */
+	assert(is_power_of_2(cmd));
+
+	status = vtd_readl(DMAR_GSTS_REG);
+	vtd_writel(DMAR_GCMD_REG, status | cmd);
+
+	if (cmd & VTD_GCMD_ONE_SHOT_BITS) {
+		/* One-shot bits are taking effect immediately */
+		return;
+	}
+
+	/* Make sure IOMMU handled our command request */
+	while (!(vtd_readl(DMAR_GSTS_REG) & cmd))
+		cpu_relax();
+}
+
+static void vtd_dump_init_info(void)
+{
+	uint32_t version;
+
+	version = vtd_readl(DMAR_VER_REG);
+
+	/* Major version >= 1 */
+	assert(((version >> 3) & 0xf) >= 1);
+
+	printf("VT-d version:   0x%x\n", version);
+	printf("     cap:       0x%016lx\n", vtd_readq(DMAR_CAP_REG));
+	printf("     ecap:      0x%016lx\n", vtd_readq(DMAR_ECAP_REG));
+}
+
+static void vtd_setup_root_table(void)
+{
+	void *root = alloc_page();
+
+	memset(root, 0, PAGE_SIZE);
+	vtd_writeq(DMAR_RTADDR_REG, virt_to_phys(root));
+	vtd_gcmd_or(VTD_GCMD_ROOT);
+	printf("DMAR table address: 0x%016lx\n", vtd_root_table());
+}
+
+static void vtd_setup_ir_table(void)
+{
+	void *root = alloc_page();
+
+	memset(root, 0, PAGE_SIZE);
+	/* 0xf stands for table size (2^(0xf+1) == 65536) */
+	vtd_writeq(DMAR_IRTA_REG, virt_to_phys(root) | 0xf);
+	vtd_gcmd_or(VTD_GCMD_IR_TABLE);
+	printf("IR table address: 0x%016lx\n", vtd_ir_table());
+}
+
+void vtd_init(void)
+{
+	setup_vm();
+	smp_init();
+
+	vtd_reg_base = ioremap(Q35_HOST_BRIDGE_IOMMU_ADDR, PAGE_SIZE);
+
+	vtd_dump_init_info();
+	vtd_gcmd_or(VTD_GCMD_QI); /* Enable QI */
+	vtd_setup_root_table();
+	vtd_setup_ir_table();
+	vtd_gcmd_or(VTD_GCMD_DMAR); /* Enable DMAR */
+	vtd_gcmd_or(VTD_GCMD_IR);   /* Enable IR */
+}
diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
new file mode 100644
index 0000000..92cc85f
--- /dev/null
+++ b/lib/x86/intel-iommu.h
@@ -0,0 +1,141 @@
+/*
+ * Intel IOMMU header
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ *
+ * (From include/linux/intel-iommu.h)
+ */
+
+#ifndef __INTEL_IOMMU_H__
+#define __INTEL_IOMMU_H__
+
+#include "libcflat.h"
+#include "vm.h"
+#include "isr.h"
+#include "smp.h"
+#include "desc.h"
+#include "asm/io.h"
+
+#define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed90000ULL
+
+/*
+ * Intel IOMMU register specification
+ */
+#define DMAR_VER_REG            0x0  /* Arch version supported by this IOMMU */
+#define DMAR_CAP_REG            0x8  /* Hardware supported capabilities */
+#define DMAR_CAP_REG_HI         0xc  /* High 32-bit of DMAR_CAP_REG */
+#define DMAR_ECAP_REG           0x10 /* Extended capabilities supported */
+#define DMAR_ECAP_REG_HI        0X14
+#define DMAR_GCMD_REG           0x18 /* Global command */
+#define DMAR_GSTS_REG           0x1c /* Global status */
+#define DMAR_RTADDR_REG         0x20 /* Root entry table */
+#define DMAR_RTADDR_REG_HI      0X24
+#define DMAR_CCMD_REG           0x28 /* Context command */
+#define DMAR_CCMD_REG_HI        0x2c
+#define DMAR_FSTS_REG           0x34 /* Fault status */
+#define DMAR_FECTL_REG          0x38 /* Fault control */
+#define DMAR_FEDATA_REG         0x3c /* Fault event interrupt data */
+#define DMAR_FEADDR_REG         0x40 /* Fault event interrupt addr */
+#define DMAR_FEUADDR_REG        0x44 /* Upper address */
+#define DMAR_AFLOG_REG          0x58 /* Advanced fault control */
+#define DMAR_AFLOG_REG_HI       0X5c
+#define DMAR_PMEN_REG           0x64 /* Enable protected memory region */
+#define DMAR_PLMBASE_REG        0x68 /* PMRR low addr */
+#define DMAR_PLMLIMIT_REG       0x6c /* PMRR low limit */
+#define DMAR_PHMBASE_REG        0x70 /* PMRR high base addr */
+#define DMAR_PHMBASE_REG_HI     0X74
+#define DMAR_PHMLIMIT_REG       0x78 /* PMRR high limit */
+#define DMAR_PHMLIMIT_REG_HI    0x7c
+#define DMAR_IQH_REG            0x80 /* Invalidation queue head */
+#define DMAR_IQH_REG_HI         0X84
+#define DMAR_IQT_REG            0x88 /* Invalidation queue tail */
+#define DMAR_IQT_REG_HI         0X8c
+#define DMAR_IQA_REG            0x90 /* Invalidation queue addr */
+#define DMAR_IQA_REG_HI         0x94
+#define DMAR_ICS_REG            0x9c /* Invalidation complete status */
+#define DMAR_IRTA_REG           0xb8 /* Interrupt remapping table addr */
+#define DMAR_IRTA_REG_HI        0xbc
+#define DMAR_IECTL_REG          0xa0 /* Invalidation event control */
+#define DMAR_IEDATA_REG         0xa4 /* Invalidation event data */
+#define DMAR_IEADDR_REG         0xa8 /* Invalidation event address */
+#define DMAR_IEUADDR_REG        0xac /* Invalidation event address */
+#define DMAR_PQH_REG            0xc0 /* Page request queue head */
+#define DMAR_PQH_REG_HI         0xc4
+#define DMAR_PQT_REG            0xc8 /* Page request queue tail*/
+#define DMAR_PQT_REG_HI         0xcc
+#define DMAR_PQA_REG            0xd0 /* Page request queue address */
+#define DMAR_PQA_REG_HI         0xd4
+#define DMAR_PRS_REG            0xdc /* Page request status */
+#define DMAR_PECTL_REG          0xe0 /* Page request event control */
+#define DMAR_PEDATA_REG         0xe4 /* Page request event data */
+#define DMAR_PEADDR_REG         0xe8 /* Page request event address */
+#define DMAR_PEUADDR_REG        0xec /* Page event upper address */
+#define DMAR_MTRRCAP_REG        0x100 /* MTRR capability */
+#define DMAR_MTRRCAP_REG_HI     0x104
+#define DMAR_MTRRDEF_REG        0x108 /* MTRR default type */
+#define DMAR_MTRRDEF_REG_HI     0x10c
+
+#define VTD_GCMD_IR_TABLE       0x1000000
+#define VTD_GCMD_IR             0x2000000
+#define VTD_GCMD_QI             0x4000000
+#define VTD_GCMD_WBF            0x8000000  /* Write Buffer Flush */
+#define VTD_GCMD_SFL            0x20000000 /* Set Fault Log */
+#define VTD_GCMD_ROOT           0x40000000
+#define VTD_GCMD_DMAR           0x80000000
+#define VTD_GCMD_ONE_SHOT_BITS  (VTD_GCMD_IR_TABLE | VTD_GCMD_WBF | \
+				 VTD_GCMD_SFL | VTD_GCMD_ROOT)
+
+/* Supported Adjusted Guest Address Widths */
+#define VTD_CAP_SAGAW_SHIFT         8
+/* 39-bit AGAW, 3-level page-table */
+#define VTD_CAP_SAGAW_39bit         (0x2ULL << VTD_CAP_SAGAW_SHIFT)
+/* 48-bit AGAW, 4-level page-table */
+#define VTD_CAP_SAGAW_48bit         (0x4ULL << VTD_CAP_SAGAW_SHIFT)
+#define VTD_CAP_SAGAW               VTD_CAP_SAGAW_39bit
+
+/* Both 1G/2M huge pages */
+#define VTD_CAP_SLLPS               ((1ULL << 34) | (1ULL << 35))
+
+#define VTD_CONTEXT_TT_MULTI_LEVEL  0
+#define VTD_CONTEXT_TT_DEV_IOTLB    1
+#define VTD_CONTEXT_TT_PASS_THROUGH 2
+
+#define VTD_PTE_R                   (1 << 0)
+#define VTD_PTE_W                   (1 << 1)
+#define VTD_PTE_RW                  (VTD_PTE_R | VTD_PTE_W)
+#define VTD_PTE_ADDR                GENMASK_ULL(63, 12)
+#define VTD_PTE_HUGE                (1 << 7)
+
+extern void *vtd_reg_base;
+#define vtd_reg(reg) ({ assert(vtd_reg_base); \
+			(volatile void *)(vtd_reg_base + reg); })
+
+static inline void vtd_writel(unsigned int reg, uint32_t value)
+{
+	__raw_writel(value, vtd_reg(reg));
+}
+
+static inline void vtd_writeq(unsigned int reg, uint64_t value)
+{
+	__raw_writeq(value, vtd_reg(reg));
+}
+
+static inline uint32_t vtd_readl(unsigned int reg)
+{
+	return __raw_readl(vtd_reg(reg));
+}
+
+static inline uint64_t vtd_readq(unsigned int reg)
+{
+	return __raw_readq(vtd_reg(reg));
+}
+
+void vtd_init(void);
+
+#endif
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index f82492b..3e2821e 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -4,6 +4,7 @@ ldarch = elf64-x86-64
 CFLAGS += -mno-red-zone
 
 cflatobjs += lib/x86/setjmp64.o
+cflatobjs += lib/x86/intel-iommu.o
 
 tests = $(TEST_DIR)/access.flat $(TEST_DIR)/apic.flat \
 	  $(TEST_DIR)/emulator.flat $(TEST_DIR)/idt_test.flat \
@@ -14,6 +15,7 @@ tests = $(TEST_DIR)/access.flat $(TEST_DIR)/apic.flat \
 tests += $(TEST_DIR)/svm.flat
 tests += $(TEST_DIR)/vmx.flat
 tests += $(TEST_DIR)/tscdeadline_latency.flat
+tests += $(TEST_DIR)/intel-iommu.flat
 
 include $(TEST_DIR)/Makefile.common
 
diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
new file mode 100644
index 0000000..f247913
--- /dev/null
+++ b/x86/intel-iommu.c
@@ -0,0 +1,27 @@
+/*
+ * Intel IOMMU unit test.
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ */
+
+#include "intel-iommu.h"
+
+int main(int argc, char *argv[])
+{
+	vtd_init();
+
+	report("fault status check", vtd_readl(DMAR_FSTS_REG) == 0);
+	report("QI enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_QI);
+	report("DMAR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_ROOT);
+	report("IR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR_TABLE);
+	report("DMAR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_DMAR);
+	report("IR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR);
+
+	return report_summary();
+}
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 23395c6..5413838 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -217,3 +217,10 @@ extra_params = -cpu kvm64,hv_time,hv_synic,hv_stimer -device hyperv-testdev
 file = hyperv_clock.flat
 smp = 2
 extra_params = -cpu kvm64,hv_time
+
+[intel_iommu]
+file = intel-iommu.flat
+arch = x86_64
+timeout = 30
+smp = 4
+extra_params = -M q35,kernel-irqchip=split -device intel-iommu,intremap=on,eim=off -device edu
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 05/14] libcflat: add IS_ALIGNED() macro, and page sizes
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

These macros will be useful to do page alignment checks.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/libcflat.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index 8455c3d..88b10e2 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -33,6 +33,12 @@
 #define __ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))
 #define __ALIGN(x, a)		__ALIGN_MASK(x, (typeof(x))(a) - 1)
 #define ALIGN(x, a)		__ALIGN((x), (a))
+#define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)
+
+#define SZ_4K                  (1 << 12)
+#define SZ_64K                 (1 << 16)
+#define SZ_2M                  (1 << 21)
+#define SZ_1G                  (1 << 30)
 
 typedef uint8_t		u8;
 typedef int8_t		s8;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 05/14] libcflat: add IS_ALIGNED() macro, and page sizes
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

These macros will be useful to do page alignment checks.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/libcflat.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index 8455c3d..88b10e2 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -33,6 +33,12 @@
 #define __ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))
 #define __ALIGN(x, a)		__ALIGN_MASK(x, (typeof(x))(a) - 1)
 #define ALIGN(x, a)		__ALIGN((x), (a))
+#define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)
+
+#define SZ_4K                  (1 << 12)
+#define SZ_64K                 (1 << 16)
+#define SZ_2M                  (1 << 21)
+#define SZ_1G                  (1 << 30)
 
 typedef uint8_t		u8;
 typedef int8_t		s8;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 06/14] libcflat: moving MIN/MAX here
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

That's something can be used outside alloc.c.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/alloc.c    | 3 ---
 lib/libcflat.h | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/alloc.c b/lib/alloc.c
index e1d7b8a..58af52b 100644
--- a/lib/alloc.c
+++ b/lib/alloc.c
@@ -7,9 +7,6 @@
 #include "asm/spinlock.h"
 #include "asm/io.h"
 
-#define MIN(a, b)		((a) < (b) ? (a) : (b))
-#define MAX(a, b)		((a) > (b) ? (a) : (b))
-
 #define PHYS_ALLOC_NR_REGIONS	256
 
 struct phys_alloc_region {
diff --git a/lib/libcflat.h b/lib/libcflat.h
index 88b10e2..94d2a53 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -40,6 +40,9 @@
 #define SZ_2M                  (1 << 21)
 #define SZ_1G                  (1 << 30)
 
+#define MIN(a, b)		((a) < (b) ? (a) : (b))
+#define MAX(a, b)		((a) > (b) ? (a) : (b))
+
 typedef uint8_t		u8;
 typedef int8_t		s8;
 typedef uint16_t	u16;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 06/14] libcflat: moving MIN/MAX here
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

That's something can be used outside alloc.c.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/alloc.c    | 3 ---
 lib/libcflat.h | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/alloc.c b/lib/alloc.c
index e1d7b8a..58af52b 100644
--- a/lib/alloc.c
+++ b/lib/alloc.c
@@ -7,9 +7,6 @@
 #include "asm/spinlock.h"
 #include "asm/io.h"
 
-#define MIN(a, b)		((a) < (b) ? (a) : (b))
-#define MAX(a, b)		((a) > (b) ? (a) : (b))
-
 #define PHYS_ALLOC_NR_REGIONS	256
 
 struct phys_alloc_region {
diff --git a/lib/libcflat.h b/lib/libcflat.h
index 88b10e2..94d2a53 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -40,6 +40,9 @@
 #define SZ_2M                  (1 << 21)
 #define SZ_1G                  (1 << 30)
 
+#define MIN(a, b)		((a) < (b) ? (a) : (b))
+#define MAX(a, b)		((a) > (b) ? (a) : (b))
+
 typedef uint8_t		u8;
 typedef int8_t		s8;
 typedef uint16_t	u16;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 07/14] vm/page: provide PGDIR_OFFSET() macro
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

This can be used in further patches.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/x86/asm/page.h | 3 +++
 lib/x86/vm.c       | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/x86/asm/page.h b/lib/x86/asm/page.h
index 5044a49..c43bab2 100644
--- a/lib/x86/asm/page.h
+++ b/lib/x86/asm/page.h
@@ -41,5 +41,8 @@
 #define	PGDIR_MASK	1023
 #endif
 
+#define PGDIR_BITS(lvl)        (((lvl) - 1) * PGDIR_WIDTH + PAGE_SHIFT)
+#define PGDIR_OFFSET(va, lvl)  (((va) >> PGDIR_BITS(lvl)) & PGDIR_MASK)
+
 #endif /* !__ASSEMBLY__ */
 #endif
diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index f7e778b..cda4c5f 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -48,7 +48,7 @@ unsigned long *install_pte(unsigned long *cr3,
     unsigned offset;
 
     for (level = PAGE_LEVEL; level > pte_level; --level) {
-	offset = ((unsigned long)virt >> ((level-1) * PGDIR_WIDTH + 12)) & PGDIR_MASK;
+	offset = PGDIR_OFFSET((unsigned long)virt, level);
 	if (!(pt[offset] & PT_PRESENT_MASK)) {
 	    unsigned long *new_pt = pt_page;
             if (!new_pt)
@@ -60,7 +60,7 @@ unsigned long *install_pte(unsigned long *cr3,
 	}
 	pt = phys_to_virt(pt[offset] & PT_ADDR_MASK);
     }
-    offset = ((unsigned long)virt >> ((level-1) * PGDIR_WIDTH + 12)) & PGDIR_MASK;
+    offset = PGDIR_OFFSET((unsigned long)virt, level);
     pt[offset] = pte;
     return &pt[offset];
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 07/14] vm/page: provide PGDIR_OFFSET() macro
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

This can be used in further patches.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/x86/asm/page.h | 3 +++
 lib/x86/vm.c       | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/x86/asm/page.h b/lib/x86/asm/page.h
index 5044a49..c43bab2 100644
--- a/lib/x86/asm/page.h
+++ b/lib/x86/asm/page.h
@@ -41,5 +41,8 @@
 #define	PGDIR_MASK	1023
 #endif
 
+#define PGDIR_BITS(lvl)        (((lvl) - 1) * PGDIR_WIDTH + PAGE_SHIFT)
+#define PGDIR_OFFSET(va, lvl)  (((va) >> PGDIR_BITS(lvl)) & PGDIR_MASK)
+
 #endif /* !__ASSEMBLY__ */
 #endif
diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index f7e778b..cda4c5f 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -48,7 +48,7 @@ unsigned long *install_pte(unsigned long *cr3,
     unsigned offset;
 
     for (level = PAGE_LEVEL; level > pte_level; --level) {
-	offset = ((unsigned long)virt >> ((level-1) * PGDIR_WIDTH + 12)) & PGDIR_MASK;
+	offset = PGDIR_OFFSET((unsigned long)virt, level);
 	if (!(pt[offset] & PT_PRESENT_MASK)) {
 	    unsigned long *new_pt = pt_page;
             if (!new_pt)
@@ -60,7 +60,7 @@ unsigned long *install_pte(unsigned long *cr3,
 	}
 	pt = phys_to_virt(pt[offset] & PT_ADDR_MASK);
     }
-    offset = ((unsigned long)virt >> ((level-1) * PGDIR_WIDTH + 12)) & PGDIR_MASK;
+    offset = PGDIR_OFFSET((unsigned long)virt, level);
     pt[offset] = pte;
     return &pt[offset];
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 08/14] pci: introduce struct pci_dev
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

To extend current PCI framework, we need a per-device struct to store
device specific information. Time to have a pci_dev struct. Most of the
current PCI APIs are converted to use this pci_dev object as the first
argument. Currently it only contains one field "bdf", which is the bdf
of current device.

For a few APIs like pci_config_*() ops or pci_find_dev(), I kept the old
interface (use PCI BDF value rather than "struct pci_dev") since they
can be used in a open context that without any specific PCI device.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-host-generic.c | 17 ++++++++-------
 lib/pci-testdev.c      | 10 +++++----
 lib/pci.c              | 56 ++++++++++++++++++++++++++++++--------------------
 lib/pci.h              | 24 ++++++++++++++--------
 x86/vmexit.c           | 18 ++++++++--------
 5 files changed, 76 insertions(+), 49 deletions(-)

diff --git a/lib/pci-host-generic.c b/lib/pci-host-generic.c
index 6ac0f15..958ef70 100644
--- a/lib/pci-host-generic.c
+++ b/lib/pci-host-generic.c
@@ -165,7 +165,7 @@ static struct pci_host_bridge *pci_dt_probe(void)
 	return host;
 }
 
-static bool pci_alloc_resource(pcidevaddr_t dev, int bar_num, u64 *addr)
+static bool pci_alloc_resource(struct pci_dev *dev, int bar_num, u64 *addr)
 {
 	struct pci_host_bridge *host = pci_host_bridge;
 	struct pci_addr_space *as = &host->addr_space[0];
@@ -192,7 +192,7 @@ static bool pci_alloc_resource(pcidevaddr_t dev, int bar_num, u64 *addr)
 
 	if (i >= host->nr_addr_spaces) {
 		printf("%s: warning: can't satisfy request for ", __func__);
-		pci_dev_print_id(dev);
+		pci_dev_print_id(dev->bdf);
 		printf(" ");
 		pci_bar_print(dev, bar_num);
 		printf("\n");
@@ -211,6 +211,7 @@ static bool pci_alloc_resource(pcidevaddr_t dev, int bar_num, u64 *addr)
 
 bool pci_probe(void)
 {
+	struct pci_dev pci_dev;
 	pcidevaddr_t dev;
 	u8 header;
 	u32 cmd;
@@ -221,10 +222,12 @@ bool pci_probe(void)
 	if (!pci_host_bridge)
 		return false;
 
-	for (dev = 0; dev < 256; dev++) {
+	for (dev = 0; dev < PCI_DEVFN_MAX; dev++) {
 		if (!pci_dev_exists(dev))
 			continue;
 
+		pci_dev_init(&pci_dev, dev);
+
 		/* We are only interested in normal PCI devices */
 		header = pci_config_readb(dev, PCI_HEADER_TYPE);
 		if ((header & PCI_HEADER_TYPE_MASK) != PCI_HEADER_TYPE_NORMAL)
@@ -235,16 +238,16 @@ bool pci_probe(void)
 		for (i = 0; i < 6; i++) {
 			u64 addr;
 
-			if (pci_alloc_resource(dev, i, &addr)) {
-				pci_bar_set_addr(dev, i, addr);
+			if (pci_alloc_resource(&pci_dev, i, &addr)) {
+				pci_bar_set_addr(&pci_dev, i, addr);
 
-				if (pci_bar_is_memory(dev, i))
+				if (pci_bar_is_memory(&pci_dev, i))
 					cmd |= PCI_COMMAND_MEMORY;
 				else
 					cmd |= PCI_COMMAND_IO;
 			}
 
-			if (pci_bar_is64(dev, i))
+			if (pci_bar_is64(&pci_dev, i))
 				i++;
 		}
 
diff --git a/lib/pci-testdev.c b/lib/pci-testdev.c
index ad482d3..7d298e6 100644
--- a/lib/pci-testdev.c
+++ b/lib/pci-testdev.c
@@ -163,9 +163,10 @@ static int pci_testdev_all(struct pci_test_dev_hdr *test,
 
 int pci_testdev(void)
 {
+	struct pci_dev pci_dev;
+	pcidevaddr_t dev;
 	phys_addr_t addr;
 	void __iomem *mem, *io;
-	pcidevaddr_t dev;
 	int nr_tests = 0;
 	bool ret;
 
@@ -175,14 +176,15 @@ int pci_testdev(void)
 		       "check QEMU '-device pci-testdev' parameter\n");
 		return -1;
 	}
+	pci_dev_init(&pci_dev, dev);
 
-	ret = pci_bar_is_valid(dev, 0) && pci_bar_is_valid(dev, 1);
+	ret = pci_bar_is_valid(&pci_dev, 0) && pci_bar_is_valid(&pci_dev, 1);
 	assert(ret);
 
-	addr = pci_bar_get_addr(dev, 0);
+	addr = pci_bar_get_addr(&pci_dev, 0);
 	mem = ioremap(addr, PAGE_SIZE);
 
-	addr = pci_bar_get_addr(dev, 1);
+	addr = pci_bar_get_addr(&pci_dev, 1);
 	io = (void *)(unsigned long)addr;
 
 	nr_tests += pci_testdev_all(mem, &pci_testdev_mem_ops);
diff --git a/lib/pci.c b/lib/pci.c
index 6bd54cb..8f2356d 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -13,12 +13,18 @@ bool pci_dev_exists(pcidevaddr_t dev)
 		pci_config_readw(dev, PCI_DEVICE_ID) != 0xffff);
 }
 
+void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf)
+{
+       memset(dev, 0, sizeof(*dev));
+       dev->bdf = bdf;
+}
+
 /* Scan bus look for a specific device. Only bus 0 scanned for now. */
 pcidevaddr_t pci_find_dev(uint16_t vendor_id, uint16_t device_id)
 {
 	pcidevaddr_t dev;
 
-	for (dev = 0; dev < 256; ++dev) {
+	for (dev = 0; dev < PCI_DEVFN_MAX; ++dev) {
 		if (pci_config_readw(dev, PCI_VENDOR_ID) == vendor_id &&
 		    pci_config_readw(dev, PCI_DEVICE_ID) == device_id)
 			return dev;
@@ -33,12 +39,13 @@ uint32_t pci_bar_mask(uint32_t bar)
 		PCI_BASE_ADDRESS_IO_MASK : PCI_BASE_ADDRESS_MEM_MASK;
 }
 
-uint32_t pci_bar_get(pcidevaddr_t dev, int bar_num)
+uint32_t pci_bar_get(struct pci_dev *dev, int bar_num)
 {
-	return pci_config_readl(dev, PCI_BASE_ADDRESS_0 + bar_num * 4);
+	return pci_config_readl(dev->bdf, PCI_BASE_ADDRESS_0 +
+				bar_num * 4);
 }
 
-phys_addr_t pci_bar_get_addr(pcidevaddr_t dev, int bar_num)
+phys_addr_t pci_bar_get_addr(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar = pci_bar_get(dev, bar_num);
 	uint32_t mask = pci_bar_mask(bar);
@@ -47,17 +54,18 @@ phys_addr_t pci_bar_get_addr(pcidevaddr_t dev, int bar_num)
 	if (pci_bar_is64(dev, bar_num))
 		addr |= (uint64_t)pci_bar_get(dev, bar_num + 1) << 32;
 
-	return pci_translate_addr(dev, addr);
+	return pci_translate_addr(dev->bdf, addr);
 }
 
-void pci_bar_set_addr(pcidevaddr_t dev, int bar_num, phys_addr_t addr)
+void pci_bar_set_addr(struct pci_dev *dev, int bar_num, phys_addr_t addr)
 {
 	int off = PCI_BASE_ADDRESS_0 + bar_num * 4;
 
-	pci_config_writel(dev, off, (uint32_t)addr);
+	pci_config_writel(dev->bdf, off, (uint32_t)addr);
 
 	if (pci_bar_is64(dev, bar_num))
-		pci_config_writel(dev, off + 4, (uint32_t)(addr >> 32));
+		pci_config_writel(dev->bdf, off + 4,
+				  (uint32_t)(addr >> 32));
 }
 
 /*
@@ -70,20 +78,21 @@ void pci_bar_set_addr(pcidevaddr_t dev, int bar_num, phys_addr_t addr)
  * The following pci_bar_size_helper() and pci_bar_size() functions
  * implement the algorithm.
  */
-static uint32_t pci_bar_size_helper(pcidevaddr_t dev, int bar_num)
+static uint32_t pci_bar_size_helper(struct pci_dev *dev, int bar_num)
 {
 	int off = PCI_BASE_ADDRESS_0 + bar_num * 4;
+	uint16_t bdf = dev->bdf;
 	uint32_t bar, val;
 
-	bar = pci_config_readl(dev, off);
-	pci_config_writel(dev, off, ~0u);
-	val = pci_config_readl(dev, off);
-	pci_config_writel(dev, off, bar);
+	bar = pci_config_readl(bdf, off);
+	pci_config_writel(bdf, off, ~0u);
+	val = pci_config_readl(bdf, off);
+	pci_config_writel(bdf, off, bar);
 
 	return val;
 }
 
-phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num)
+phys_addr_t pci_bar_size(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar, size;
 
@@ -104,19 +113,19 @@ phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num)
 	}
 }
 
-bool pci_bar_is_memory(pcidevaddr_t dev, int bar_num)
+bool pci_bar_is_memory(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar = pci_bar_get(dev, bar_num);
 
 	return !(bar & PCI_BASE_ADDRESS_SPACE_IO);
 }
 
-bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num)
+bool pci_bar_is_valid(struct pci_dev *dev, int bar_num)
 {
 	return pci_bar_get(dev, bar_num);
 }
 
-bool pci_bar_is64(pcidevaddr_t dev, int bar_num)
+bool pci_bar_is64(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar = pci_bar_get(dev, bar_num);
 
@@ -127,7 +136,7 @@ bool pci_bar_is64(pcidevaddr_t dev, int bar_num)
 		      PCI_BASE_ADDRESS_MEM_TYPE_64;
 }
 
-void pci_bar_print(pcidevaddr_t dev, int bar_num)
+void pci_bar_print(struct pci_dev *dev, int bar_num)
 {
 	phys_addr_t size, start, end;
 	uint32_t bar;
@@ -186,8 +195,11 @@ static void pci_dev_print(pcidevaddr_t dev)
 	uint8_t progif = pci_config_readb(dev, PCI_CLASS_PROG);
 	uint8_t subclass = pci_config_readb(dev, PCI_CLASS_DEVICE);
 	uint8_t class = pci_config_readb(dev, PCI_CLASS_DEVICE + 1);
+	struct pci_dev pci_dev;
 	int i;
 
+	pci_dev_init(&pci_dev, dev);
+
 	pci_dev_print_id(dev);
 	printf(" type %02x progif %02x class %02x subclass %02x\n",
 	       header, progif, class, subclass);
@@ -196,12 +208,12 @@ static void pci_dev_print(pcidevaddr_t dev)
 		return;
 
 	for (i = 0; i < 6; i++) {
-		if (pci_bar_size(dev, i)) {
+		if (pci_bar_size(&pci_dev, i)) {
 			printf("\t");
-			pci_bar_print(dev, i);
+			pci_bar_print(&pci_dev, i);
 			printf("\n");
 		}
-		if (pci_bar_is64(dev, i))
+		if (pci_bar_is64(&pci_dev, i))
 			i++;
 	}
 }
@@ -210,7 +222,7 @@ void pci_print(void)
 {
 	pcidevaddr_t dev;
 
-	for (dev = 0; dev < 256; ++dev) {
+	for (dev = 0; dev < PCI_DEVFN_MAX; ++dev) {
 		if (pci_dev_exists(dev))
 			pci_dev_print(dev);
 	}
diff --git a/lib/pci.h b/lib/pci.h
index 86d6b42..355acd0 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -15,6 +15,14 @@ enum {
 	PCIDEVADDR_INVALID = 0xffff,
 };
 
+#define PCI_DEVFN_MAX                   256
+
+struct pci_dev {
+	uint16_t bdf;
+};
+
+extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
+
 extern bool pci_probe(void);
 extern void pci_print(void);
 extern bool pci_dev_exists(pcidevaddr_t dev);
@@ -32,15 +40,15 @@ extern pcidevaddr_t pci_find_dev(uint16_t vendor_id, uint16_t device_id);
  * It is expected the caller is aware of the device BAR layout and never
  * tries to address the middle of a 64-bit register.
  */
-extern phys_addr_t pci_bar_get_addr(pcidevaddr_t dev, int bar_num);
-extern void pci_bar_set_addr(pcidevaddr_t dev, int bar_num, phys_addr_t addr);
-extern phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num);
-extern uint32_t pci_bar_get(pcidevaddr_t dev, int bar_num);
+extern phys_addr_t pci_bar_get_addr(struct pci_dev *dev, int bar_num);
+extern void pci_bar_set_addr(struct pci_dev *dev, int bar_num, phys_addr_t addr);
+extern phys_addr_t pci_bar_size(struct pci_dev *dev, int bar_num);
+extern uint32_t pci_bar_get(struct pci_dev *dev, int bar_num);
 extern uint32_t pci_bar_mask(uint32_t bar);
-extern bool pci_bar_is64(pcidevaddr_t dev, int bar_num);
-extern bool pci_bar_is_memory(pcidevaddr_t dev, int bar_num);
-extern bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num);
-extern void pci_bar_print(pcidevaddr_t dev, int bar_num);
+extern bool pci_bar_is64(struct pci_dev *dev, int bar_num);
+extern bool pci_bar_is_memory(struct pci_dev *dev, int bar_num);
+extern bool pci_bar_is_valid(struct pci_dev *dev, int bar_num);
+extern void pci_bar_print(struct pci_dev *dev, int bar_num);
 extern void pci_dev_print_id(pcidevaddr_t dev);
 
 extern int pci_testdev(void);
diff --git a/x86/vmexit.c b/x86/vmexit.c
index 2d99d5f..63fa070 100644
--- a/x86/vmexit.c
+++ b/x86/vmexit.c
@@ -372,7 +372,8 @@ int main(int ac, char **av)
 	struct fadt_descriptor_rev1 *fadt;
 	int i;
 	unsigned long membar = 0;
-	pcidevaddr_t pcidev;
+	struct pci_dev pcidev;
+	int ret;
 
 	smp_init();
 	setup_vm();
@@ -385,21 +386,22 @@ int main(int ac, char **av)
 	pm_tmr_blk = fadt->pm_tmr_blk;
 	printf("PM timer port is %x\n", pm_tmr_blk);
 
-	pcidev = pci_find_dev(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_TEST);
-	if (pcidev != PCIDEVADDR_INVALID) {
+	ret = pci_find_dev(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_TEST);
+	if (ret != PCIDEVADDR_INVALID) {
+		pci_dev_init(&pcidev, ret);
 		for (i = 0; i < PCI_TESTDEV_NUM_BARS; i++) {
-			if (!pci_bar_is_valid(pcidev, i)) {
+			if (!pci_bar_is_valid(&pcidev, i)) {
 				continue;
 			}
-			if (pci_bar_is_memory(pcidev, i)) {
-				membar = pci_bar_get_addr(pcidev, i);
+			if (pci_bar_is_memory(&pcidev, i)) {
+				membar = pci_bar_get_addr(&pcidev, i);
 				pci_test.memaddr = ioremap(membar, PAGE_SIZE);
 			} else {
-				pci_test.iobar = pci_bar_get_addr(pcidev, i);
+				pci_test.iobar = pci_bar_get_addr(&pcidev, i);
 			}
 		}
 		printf("pci-testdev at 0x%x membar %lx iobar %x\n",
-		       pcidev, membar, pci_test.iobar);
+		       pcidev.bdf, membar, pci_test.iobar);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(tests); ++i)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 08/14] pci: introduce struct pci_dev
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

To extend current PCI framework, we need a per-device struct to store
device specific information. Time to have a pci_dev struct. Most of the
current PCI APIs are converted to use this pci_dev object as the first
argument. Currently it only contains one field "bdf", which is the bdf
of current device.

For a few APIs like pci_config_*() ops or pci_find_dev(), I kept the old
interface (use PCI BDF value rather than "struct pci_dev") since they
can be used in a open context that without any specific PCI device.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-host-generic.c | 17 ++++++++-------
 lib/pci-testdev.c      | 10 +++++----
 lib/pci.c              | 56 ++++++++++++++++++++++++++++++--------------------
 lib/pci.h              | 24 ++++++++++++++--------
 x86/vmexit.c           | 18 ++++++++--------
 5 files changed, 76 insertions(+), 49 deletions(-)

diff --git a/lib/pci-host-generic.c b/lib/pci-host-generic.c
index 6ac0f15..958ef70 100644
--- a/lib/pci-host-generic.c
+++ b/lib/pci-host-generic.c
@@ -165,7 +165,7 @@ static struct pci_host_bridge *pci_dt_probe(void)
 	return host;
 }
 
-static bool pci_alloc_resource(pcidevaddr_t dev, int bar_num, u64 *addr)
+static bool pci_alloc_resource(struct pci_dev *dev, int bar_num, u64 *addr)
 {
 	struct pci_host_bridge *host = pci_host_bridge;
 	struct pci_addr_space *as = &host->addr_space[0];
@@ -192,7 +192,7 @@ static bool pci_alloc_resource(pcidevaddr_t dev, int bar_num, u64 *addr)
 
 	if (i >= host->nr_addr_spaces) {
 		printf("%s: warning: can't satisfy request for ", __func__);
-		pci_dev_print_id(dev);
+		pci_dev_print_id(dev->bdf);
 		printf(" ");
 		pci_bar_print(dev, bar_num);
 		printf("\n");
@@ -211,6 +211,7 @@ static bool pci_alloc_resource(pcidevaddr_t dev, int bar_num, u64 *addr)
 
 bool pci_probe(void)
 {
+	struct pci_dev pci_dev;
 	pcidevaddr_t dev;
 	u8 header;
 	u32 cmd;
@@ -221,10 +222,12 @@ bool pci_probe(void)
 	if (!pci_host_bridge)
 		return false;
 
-	for (dev = 0; dev < 256; dev++) {
+	for (dev = 0; dev < PCI_DEVFN_MAX; dev++) {
 		if (!pci_dev_exists(dev))
 			continue;
 
+		pci_dev_init(&pci_dev, dev);
+
 		/* We are only interested in normal PCI devices */
 		header = pci_config_readb(dev, PCI_HEADER_TYPE);
 		if ((header & PCI_HEADER_TYPE_MASK) != PCI_HEADER_TYPE_NORMAL)
@@ -235,16 +238,16 @@ bool pci_probe(void)
 		for (i = 0; i < 6; i++) {
 			u64 addr;
 
-			if (pci_alloc_resource(dev, i, &addr)) {
-				pci_bar_set_addr(dev, i, addr);
+			if (pci_alloc_resource(&pci_dev, i, &addr)) {
+				pci_bar_set_addr(&pci_dev, i, addr);
 
-				if (pci_bar_is_memory(dev, i))
+				if (pci_bar_is_memory(&pci_dev, i))
 					cmd |= PCI_COMMAND_MEMORY;
 				else
 					cmd |= PCI_COMMAND_IO;
 			}
 
-			if (pci_bar_is64(dev, i))
+			if (pci_bar_is64(&pci_dev, i))
 				i++;
 		}
 
diff --git a/lib/pci-testdev.c b/lib/pci-testdev.c
index ad482d3..7d298e6 100644
--- a/lib/pci-testdev.c
+++ b/lib/pci-testdev.c
@@ -163,9 +163,10 @@ static int pci_testdev_all(struct pci_test_dev_hdr *test,
 
 int pci_testdev(void)
 {
+	struct pci_dev pci_dev;
+	pcidevaddr_t dev;
 	phys_addr_t addr;
 	void __iomem *mem, *io;
-	pcidevaddr_t dev;
 	int nr_tests = 0;
 	bool ret;
 
@@ -175,14 +176,15 @@ int pci_testdev(void)
 		       "check QEMU '-device pci-testdev' parameter\n");
 		return -1;
 	}
+	pci_dev_init(&pci_dev, dev);
 
-	ret = pci_bar_is_valid(dev, 0) && pci_bar_is_valid(dev, 1);
+	ret = pci_bar_is_valid(&pci_dev, 0) && pci_bar_is_valid(&pci_dev, 1);
 	assert(ret);
 
-	addr = pci_bar_get_addr(dev, 0);
+	addr = pci_bar_get_addr(&pci_dev, 0);
 	mem = ioremap(addr, PAGE_SIZE);
 
-	addr = pci_bar_get_addr(dev, 1);
+	addr = pci_bar_get_addr(&pci_dev, 1);
 	io = (void *)(unsigned long)addr;
 
 	nr_tests += pci_testdev_all(mem, &pci_testdev_mem_ops);
diff --git a/lib/pci.c b/lib/pci.c
index 6bd54cb..8f2356d 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -13,12 +13,18 @@ bool pci_dev_exists(pcidevaddr_t dev)
 		pci_config_readw(dev, PCI_DEVICE_ID) != 0xffff);
 }
 
+void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf)
+{
+       memset(dev, 0, sizeof(*dev));
+       dev->bdf = bdf;
+}
+
 /* Scan bus look for a specific device. Only bus 0 scanned for now. */
 pcidevaddr_t pci_find_dev(uint16_t vendor_id, uint16_t device_id)
 {
 	pcidevaddr_t dev;
 
-	for (dev = 0; dev < 256; ++dev) {
+	for (dev = 0; dev < PCI_DEVFN_MAX; ++dev) {
 		if (pci_config_readw(dev, PCI_VENDOR_ID) == vendor_id &&
 		    pci_config_readw(dev, PCI_DEVICE_ID) == device_id)
 			return dev;
@@ -33,12 +39,13 @@ uint32_t pci_bar_mask(uint32_t bar)
 		PCI_BASE_ADDRESS_IO_MASK : PCI_BASE_ADDRESS_MEM_MASK;
 }
 
-uint32_t pci_bar_get(pcidevaddr_t dev, int bar_num)
+uint32_t pci_bar_get(struct pci_dev *dev, int bar_num)
 {
-	return pci_config_readl(dev, PCI_BASE_ADDRESS_0 + bar_num * 4);
+	return pci_config_readl(dev->bdf, PCI_BASE_ADDRESS_0 +
+				bar_num * 4);
 }
 
-phys_addr_t pci_bar_get_addr(pcidevaddr_t dev, int bar_num)
+phys_addr_t pci_bar_get_addr(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar = pci_bar_get(dev, bar_num);
 	uint32_t mask = pci_bar_mask(bar);
@@ -47,17 +54,18 @@ phys_addr_t pci_bar_get_addr(pcidevaddr_t dev, int bar_num)
 	if (pci_bar_is64(dev, bar_num))
 		addr |= (uint64_t)pci_bar_get(dev, bar_num + 1) << 32;
 
-	return pci_translate_addr(dev, addr);
+	return pci_translate_addr(dev->bdf, addr);
 }
 
-void pci_bar_set_addr(pcidevaddr_t dev, int bar_num, phys_addr_t addr)
+void pci_bar_set_addr(struct pci_dev *dev, int bar_num, phys_addr_t addr)
 {
 	int off = PCI_BASE_ADDRESS_0 + bar_num * 4;
 
-	pci_config_writel(dev, off, (uint32_t)addr);
+	pci_config_writel(dev->bdf, off, (uint32_t)addr);
 
 	if (pci_bar_is64(dev, bar_num))
-		pci_config_writel(dev, off + 4, (uint32_t)(addr >> 32));
+		pci_config_writel(dev->bdf, off + 4,
+				  (uint32_t)(addr >> 32));
 }
 
 /*
@@ -70,20 +78,21 @@ void pci_bar_set_addr(pcidevaddr_t dev, int bar_num, phys_addr_t addr)
  * The following pci_bar_size_helper() and pci_bar_size() functions
  * implement the algorithm.
  */
-static uint32_t pci_bar_size_helper(pcidevaddr_t dev, int bar_num)
+static uint32_t pci_bar_size_helper(struct pci_dev *dev, int bar_num)
 {
 	int off = PCI_BASE_ADDRESS_0 + bar_num * 4;
+	uint16_t bdf = dev->bdf;
 	uint32_t bar, val;
 
-	bar = pci_config_readl(dev, off);
-	pci_config_writel(dev, off, ~0u);
-	val = pci_config_readl(dev, off);
-	pci_config_writel(dev, off, bar);
+	bar = pci_config_readl(bdf, off);
+	pci_config_writel(bdf, off, ~0u);
+	val = pci_config_readl(bdf, off);
+	pci_config_writel(bdf, off, bar);
 
 	return val;
 }
 
-phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num)
+phys_addr_t pci_bar_size(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar, size;
 
@@ -104,19 +113,19 @@ phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num)
 	}
 }
 
-bool pci_bar_is_memory(pcidevaddr_t dev, int bar_num)
+bool pci_bar_is_memory(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar = pci_bar_get(dev, bar_num);
 
 	return !(bar & PCI_BASE_ADDRESS_SPACE_IO);
 }
 
-bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num)
+bool pci_bar_is_valid(struct pci_dev *dev, int bar_num)
 {
 	return pci_bar_get(dev, bar_num);
 }
 
-bool pci_bar_is64(pcidevaddr_t dev, int bar_num)
+bool pci_bar_is64(struct pci_dev *dev, int bar_num)
 {
 	uint32_t bar = pci_bar_get(dev, bar_num);
 
@@ -127,7 +136,7 @@ bool pci_bar_is64(pcidevaddr_t dev, int bar_num)
 		      PCI_BASE_ADDRESS_MEM_TYPE_64;
 }
 
-void pci_bar_print(pcidevaddr_t dev, int bar_num)
+void pci_bar_print(struct pci_dev *dev, int bar_num)
 {
 	phys_addr_t size, start, end;
 	uint32_t bar;
@@ -186,8 +195,11 @@ static void pci_dev_print(pcidevaddr_t dev)
 	uint8_t progif = pci_config_readb(dev, PCI_CLASS_PROG);
 	uint8_t subclass = pci_config_readb(dev, PCI_CLASS_DEVICE);
 	uint8_t class = pci_config_readb(dev, PCI_CLASS_DEVICE + 1);
+	struct pci_dev pci_dev;
 	int i;
 
+	pci_dev_init(&pci_dev, dev);
+
 	pci_dev_print_id(dev);
 	printf(" type %02x progif %02x class %02x subclass %02x\n",
 	       header, progif, class, subclass);
@@ -196,12 +208,12 @@ static void pci_dev_print(pcidevaddr_t dev)
 		return;
 
 	for (i = 0; i < 6; i++) {
-		if (pci_bar_size(dev, i)) {
+		if (pci_bar_size(&pci_dev, i)) {
 			printf("\t");
-			pci_bar_print(dev, i);
+			pci_bar_print(&pci_dev, i);
 			printf("\n");
 		}
-		if (pci_bar_is64(dev, i))
+		if (pci_bar_is64(&pci_dev, i))
 			i++;
 	}
 }
@@ -210,7 +222,7 @@ void pci_print(void)
 {
 	pcidevaddr_t dev;
 
-	for (dev = 0; dev < 256; ++dev) {
+	for (dev = 0; dev < PCI_DEVFN_MAX; ++dev) {
 		if (pci_dev_exists(dev))
 			pci_dev_print(dev);
 	}
diff --git a/lib/pci.h b/lib/pci.h
index 86d6b42..355acd0 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -15,6 +15,14 @@ enum {
 	PCIDEVADDR_INVALID = 0xffff,
 };
 
+#define PCI_DEVFN_MAX                   256
+
+struct pci_dev {
+	uint16_t bdf;
+};
+
+extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
+
 extern bool pci_probe(void);
 extern void pci_print(void);
 extern bool pci_dev_exists(pcidevaddr_t dev);
@@ -32,15 +40,15 @@ extern pcidevaddr_t pci_find_dev(uint16_t vendor_id, uint16_t device_id);
  * It is expected the caller is aware of the device BAR layout and never
  * tries to address the middle of a 64-bit register.
  */
-extern phys_addr_t pci_bar_get_addr(pcidevaddr_t dev, int bar_num);
-extern void pci_bar_set_addr(pcidevaddr_t dev, int bar_num, phys_addr_t addr);
-extern phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num);
-extern uint32_t pci_bar_get(pcidevaddr_t dev, int bar_num);
+extern phys_addr_t pci_bar_get_addr(struct pci_dev *dev, int bar_num);
+extern void pci_bar_set_addr(struct pci_dev *dev, int bar_num, phys_addr_t addr);
+extern phys_addr_t pci_bar_size(struct pci_dev *dev, int bar_num);
+extern uint32_t pci_bar_get(struct pci_dev *dev, int bar_num);
 extern uint32_t pci_bar_mask(uint32_t bar);
-extern bool pci_bar_is64(pcidevaddr_t dev, int bar_num);
-extern bool pci_bar_is_memory(pcidevaddr_t dev, int bar_num);
-extern bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num);
-extern void pci_bar_print(pcidevaddr_t dev, int bar_num);
+extern bool pci_bar_is64(struct pci_dev *dev, int bar_num);
+extern bool pci_bar_is_memory(struct pci_dev *dev, int bar_num);
+extern bool pci_bar_is_valid(struct pci_dev *dev, int bar_num);
+extern void pci_bar_print(struct pci_dev *dev, int bar_num);
 extern void pci_dev_print_id(pcidevaddr_t dev);
 
 extern int pci_testdev(void);
diff --git a/x86/vmexit.c b/x86/vmexit.c
index 2d99d5f..63fa070 100644
--- a/x86/vmexit.c
+++ b/x86/vmexit.c
@@ -372,7 +372,8 @@ int main(int ac, char **av)
 	struct fadt_descriptor_rev1 *fadt;
 	int i;
 	unsigned long membar = 0;
-	pcidevaddr_t pcidev;
+	struct pci_dev pcidev;
+	int ret;
 
 	smp_init();
 	setup_vm();
@@ -385,21 +386,22 @@ int main(int ac, char **av)
 	pm_tmr_blk = fadt->pm_tmr_blk;
 	printf("PM timer port is %x\n", pm_tmr_blk);
 
-	pcidev = pci_find_dev(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_TEST);
-	if (pcidev != PCIDEVADDR_INVALID) {
+	ret = pci_find_dev(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_TEST);
+	if (ret != PCIDEVADDR_INVALID) {
+		pci_dev_init(&pcidev, ret);
 		for (i = 0; i < PCI_TESTDEV_NUM_BARS; i++) {
-			if (!pci_bar_is_valid(pcidev, i)) {
+			if (!pci_bar_is_valid(&pcidev, i)) {
 				continue;
 			}
-			if (pci_bar_is_memory(pcidev, i)) {
-				membar = pci_bar_get_addr(pcidev, i);
+			if (pci_bar_is_memory(&pcidev, i)) {
+				membar = pci_bar_get_addr(&pcidev, i);
 				pci_test.memaddr = ioremap(membar, PAGE_SIZE);
 			} else {
-				pci_test.iobar = pci_bar_get_addr(pcidev, i);
+				pci_test.iobar = pci_bar_get_addr(&pcidev, i);
 			}
 		}
 		printf("pci-testdev at 0x%x membar %lx iobar %x\n",
-		       pcidev, membar, pci_test.iobar);
+		       pcidev.bdf, membar, pci_test.iobar);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(tests); ++i)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 09/14] pci: provide pci_scan_bars()
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Let's provide a more general way to scan PCI bars, rather than read the
config registers every time.

Then let x86/vmexit.c leverage pci_scan_bars()

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-host-generic.c |  2 +-
 lib/pci.c              | 17 ++++++++++++++++-
 lib/pci.h              |  5 +++++
 x86/vmexit.c           | 17 ++++++-----------
 4 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/lib/pci-host-generic.c b/lib/pci-host-generic.c
index 958ef70..b1d2fc5 100644
--- a/lib/pci-host-generic.c
+++ b/lib/pci-host-generic.c
@@ -235,7 +235,7 @@ bool pci_probe(void)
 
 		cmd = PCI_COMMAND_SERR | PCI_COMMAND_PARITY;
 
-		for (i = 0; i < 6; i++) {
+		for (i = 0; i < PCI_BAR_NUM; i++) {
 			u64 addr;
 
 			if (pci_alloc_resource(&pci_dev, i, &addr)) {
diff --git a/lib/pci.c b/lib/pci.c
index 8f2356d..d4521be 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -207,7 +207,7 @@ static void pci_dev_print(pcidevaddr_t dev)
 	if ((header & PCI_HEADER_TYPE_MASK) != PCI_HEADER_TYPE_NORMAL)
 		return;
 
-	for (i = 0; i < 6; i++) {
+	for (i = 0; i < PCI_BAR_NUM; i++) {
 		if (pci_bar_size(&pci_dev, i)) {
 			printf("\t");
 			pci_bar_print(&pci_dev, i);
@@ -227,3 +227,18 @@ void pci_print(void)
 			pci_dev_print(dev);
 	}
 }
+
+void pci_scan_bars(struct pci_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < PCI_BAR_NUM; i++) {
+		if (!pci_bar_is_valid(dev, i))
+			continue;
+		dev->resource[i] = pci_bar_get_addr(dev, i);
+		if (pci_bar_is64(dev, i)) {
+			i++;
+			dev->resource[i] = (phys_addr_t)0;
+		}
+	}
+}
diff --git a/lib/pci.h b/lib/pci.h
index 355acd0..8215c9d 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -15,13 +15,16 @@ enum {
 	PCIDEVADDR_INVALID = 0xffff,
 };
 
+#define PCI_BAR_NUM                     6
 #define PCI_DEVFN_MAX                   256
 
 struct pci_dev {
 	uint16_t bdf;
+	phys_addr_t resource[PCI_BAR_NUM];
 };
 
 extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
+extern void pci_scan_bars(struct pci_dev *dev);
 
 extern bool pci_probe(void);
 extern void pci_print(void);
@@ -65,6 +68,8 @@ extern int pci_testdev(void);
  * pci-testdev supports at least three types of tests (via mmio and
  * portio BARs): no-eventfd, wildcard-eventfd and datamatch-eventfd
  */
+#define PCI_TESTDEV_BAR_MEM		0
+#define PCI_TESTDEV_BAR_IO		1
 #define PCI_TESTDEV_NUM_BARS		2
 #define PCI_TESTDEV_NUM_TESTS		3
 
diff --git a/x86/vmexit.c b/x86/vmexit.c
index 63fa070..f9a22de 100644
--- a/x86/vmexit.c
+++ b/x86/vmexit.c
@@ -389,17 +389,12 @@ int main(int ac, char **av)
 	ret = pci_find_dev(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_TEST);
 	if (ret != PCIDEVADDR_INVALID) {
 		pci_dev_init(&pcidev, ret);
-		for (i = 0; i < PCI_TESTDEV_NUM_BARS; i++) {
-			if (!pci_bar_is_valid(&pcidev, i)) {
-				continue;
-			}
-			if (pci_bar_is_memory(&pcidev, i)) {
-				membar = pci_bar_get_addr(&pcidev, i);
-				pci_test.memaddr = ioremap(membar, PAGE_SIZE);
-			} else {
-				pci_test.iobar = pci_bar_get_addr(&pcidev, i);
-			}
-		}
+		pci_scan_bars(&pcidev);
+		assert(pci_bar_is_memory(&pcidev, PCI_TESTDEV_BAR_MEM));
+		assert(!pci_bar_is_memory(&pcidev, PCI_TESTDEV_BAR_IO));
+		membar = pcidev.resource[PCI_TESTDEV_BAR_MEM];
+		pci_test.memaddr = ioremap(membar, PAGE_SIZE);
+		pci_test.iobar = pcidev.resource[PCI_TESTDEV_BAR_IO];
 		printf("pci-testdev at 0x%x membar %lx iobar %x\n",
 		       pcidev.bdf, membar, pci_test.iobar);
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 09/14] pci: provide pci_scan_bars()
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

Let's provide a more general way to scan PCI bars, rather than read the
config registers every time.

Then let x86/vmexit.c leverage pci_scan_bars()

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-host-generic.c |  2 +-
 lib/pci.c              | 17 ++++++++++++++++-
 lib/pci.h              |  5 +++++
 x86/vmexit.c           | 17 ++++++-----------
 4 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/lib/pci-host-generic.c b/lib/pci-host-generic.c
index 958ef70..b1d2fc5 100644
--- a/lib/pci-host-generic.c
+++ b/lib/pci-host-generic.c
@@ -235,7 +235,7 @@ bool pci_probe(void)
 
 		cmd = PCI_COMMAND_SERR | PCI_COMMAND_PARITY;
 
-		for (i = 0; i < 6; i++) {
+		for (i = 0; i < PCI_BAR_NUM; i++) {
 			u64 addr;
 
 			if (pci_alloc_resource(&pci_dev, i, &addr)) {
diff --git a/lib/pci.c b/lib/pci.c
index 8f2356d..d4521be 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -207,7 +207,7 @@ static void pci_dev_print(pcidevaddr_t dev)
 	if ((header & PCI_HEADER_TYPE_MASK) != PCI_HEADER_TYPE_NORMAL)
 		return;
 
-	for (i = 0; i < 6; i++) {
+	for (i = 0; i < PCI_BAR_NUM; i++) {
 		if (pci_bar_size(&pci_dev, i)) {
 			printf("\t");
 			pci_bar_print(&pci_dev, i);
@@ -227,3 +227,18 @@ void pci_print(void)
 			pci_dev_print(dev);
 	}
 }
+
+void pci_scan_bars(struct pci_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < PCI_BAR_NUM; i++) {
+		if (!pci_bar_is_valid(dev, i))
+			continue;
+		dev->resource[i] = pci_bar_get_addr(dev, i);
+		if (pci_bar_is64(dev, i)) {
+			i++;
+			dev->resource[i] = (phys_addr_t)0;
+		}
+	}
+}
diff --git a/lib/pci.h b/lib/pci.h
index 355acd0..8215c9d 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -15,13 +15,16 @@ enum {
 	PCIDEVADDR_INVALID = 0xffff,
 };
 
+#define PCI_BAR_NUM                     6
 #define PCI_DEVFN_MAX                   256
 
 struct pci_dev {
 	uint16_t bdf;
+	phys_addr_t resource[PCI_BAR_NUM];
 };
 
 extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
+extern void pci_scan_bars(struct pci_dev *dev);
 
 extern bool pci_probe(void);
 extern void pci_print(void);
@@ -65,6 +68,8 @@ extern int pci_testdev(void);
  * pci-testdev supports at least three types of tests (via mmio and
  * portio BARs): no-eventfd, wildcard-eventfd and datamatch-eventfd
  */
+#define PCI_TESTDEV_BAR_MEM		0
+#define PCI_TESTDEV_BAR_IO		1
 #define PCI_TESTDEV_NUM_BARS		2
 #define PCI_TESTDEV_NUM_TESTS		3
 
diff --git a/x86/vmexit.c b/x86/vmexit.c
index 63fa070..f9a22de 100644
--- a/x86/vmexit.c
+++ b/x86/vmexit.c
@@ -389,17 +389,12 @@ int main(int ac, char **av)
 	ret = pci_find_dev(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_TEST);
 	if (ret != PCIDEVADDR_INVALID) {
 		pci_dev_init(&pcidev, ret);
-		for (i = 0; i < PCI_TESTDEV_NUM_BARS; i++) {
-			if (!pci_bar_is_valid(&pcidev, i)) {
-				continue;
-			}
-			if (pci_bar_is_memory(&pcidev, i)) {
-				membar = pci_bar_get_addr(&pcidev, i);
-				pci_test.memaddr = ioremap(membar, PAGE_SIZE);
-			} else {
-				pci_test.iobar = pci_bar_get_addr(&pcidev, i);
-			}
-		}
+		pci_scan_bars(&pcidev);
+		assert(pci_bar_is_memory(&pcidev, PCI_TESTDEV_BAR_MEM));
+		assert(!pci_bar_is_memory(&pcidev, PCI_TESTDEV_BAR_IO));
+		membar = pcidev.resource[PCI_TESTDEV_BAR_MEM];
+		pci_test.memaddr = ioremap(membar, PAGE_SIZE);
+		pci_test.iobar = pcidev.resource[PCI_TESTDEV_BAR_IO];
 		printf("pci-testdev at 0x%x membar %lx iobar %x\n",
 		       pcidev.bdf, membar, pci_test.iobar);
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 10/14] pci: provide pci_enable_defaults()
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

One helper function to set/clear specific bit in PCI_COMMAND register is
introduced. Then, provide a function to do most of the common PCI init
work.

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.c | 19 +++++++++++++++++++
 lib/pci.h |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/lib/pci.c b/lib/pci.c
index d4521be..bf2a3df 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -7,6 +7,18 @@
 #include "pci.h"
 #include "asm/pci.h"
 
+void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr)
+{
+	uint16_t val = pci_config_readw(dev->bdf, PCI_COMMAND);
+
+	/* No overlap is allowed */
+	assert((set & clr) == 0);
+	val |= set;
+	val &= ~clr;
+
+	pci_config_writew(dev->bdf, PCI_COMMAND, val);
+}
+
 bool pci_dev_exists(pcidevaddr_t dev)
 {
 	return (pci_config_readw(dev, PCI_VENDOR_ID) != 0xffff &&
@@ -242,3 +254,10 @@ void pci_scan_bars(struct pci_dev *dev)
 		}
 	}
 }
+
+void pci_enable_defaults(struct pci_dev *dev)
+{
+	pci_scan_bars(dev);
+	/* Enable device DMA operations */
+	pci_cmd_set_clr(dev, PCI_COMMAND_MASTER, 0);
+}
diff --git a/lib/pci.h b/lib/pci.h
index 8215c9d..d3052ef 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -25,6 +25,8 @@ struct pci_dev {
 
 extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
 extern void pci_scan_bars(struct pci_dev *dev);
+extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
+extern void pci_enable_defaults(struct pci_dev *dev);
 
 extern bool pci_probe(void);
 extern void pci_print(void);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 10/14] pci: provide pci_enable_defaults()
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

One helper function to set/clear specific bit in PCI_COMMAND register is
introduced. Then, provide a function to do most of the common PCI init
work.

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.c | 19 +++++++++++++++++++
 lib/pci.h |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/lib/pci.c b/lib/pci.c
index d4521be..bf2a3df 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -7,6 +7,18 @@
 #include "pci.h"
 #include "asm/pci.h"
 
+void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr)
+{
+	uint16_t val = pci_config_readw(dev->bdf, PCI_COMMAND);
+
+	/* No overlap is allowed */
+	assert((set & clr) == 0);
+	val |= set;
+	val &= ~clr;
+
+	pci_config_writew(dev->bdf, PCI_COMMAND, val);
+}
+
 bool pci_dev_exists(pcidevaddr_t dev)
 {
 	return (pci_config_readw(dev, PCI_VENDOR_ID) != 0xffff &&
@@ -242,3 +254,10 @@ void pci_scan_bars(struct pci_dev *dev)
 		}
 	}
 }
+
+void pci_enable_defaults(struct pci_dev *dev)
+{
+	pci_scan_bars(dev);
+	/* Enable device DMA operations */
+	pci_cmd_set_clr(dev, PCI_COMMAND_MASTER, 0);
+}
diff --git a/lib/pci.h b/lib/pci.h
index 8215c9d..d3052ef 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -25,6 +25,8 @@ struct pci_dev {
 
 extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
 extern void pci_scan_bars(struct pci_dev *dev);
+extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
+extern void pci_enable_defaults(struct pci_dev *dev);
 
 extern bool pci_probe(void);
 extern void pci_print(void);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 11/14] pci: edu: introduce pci-edu helpers
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

QEMU edu device is a pci device that is originally written for
educational purpose, however it also suits for IOMMU unit test. Adding
helpers for this specific device to implement the device logic.

The device supports lots of functions, here only DMA operation is
supported.

The spec of the device can be found at:

  https://github.com/qemu/qemu/blob/master/docs/specs/edu.txt

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-edu.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/pci-edu.h | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 156 insertions(+)
 create mode 100644 lib/pci-edu.c
 create mode 100644 lib/pci-edu.h

diff --git a/lib/pci-edu.c b/lib/pci-edu.c
new file mode 100644
index 0000000..8096470
--- /dev/null
+++ b/lib/pci-edu.c
@@ -0,0 +1,73 @@
+/*
+ * Edu PCI device.
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ */
+
+#include "pci-edu.h"
+#include "asm/barrier.h"
+
+/* Return true if alive */
+static inline bool edu_check_alive(struct pci_edu_dev *dev)
+{
+	static uint32_t live_count = 1;
+	uint32_t value;
+
+	edu_reg_writel(dev, EDU_REG_ALIVE, live_count++);
+	value = edu_reg_readl(dev, EDU_REG_ALIVE);
+	return (live_count - 1 == ~value);
+}
+
+bool edu_init(struct pci_edu_dev *dev)
+{
+	pcidevaddr_t dev_addr;
+
+	dev_addr = pci_find_dev(PCI_VENDOR_ID_QEMU, PCI_DEVICE_ID_EDU);
+	if (dev_addr == PCIDEVADDR_INVALID)
+		return false;
+
+	pci_dev_init(&dev->pci_dev, dev_addr);
+	pci_enable_defaults(&dev->pci_dev);
+	dev->reg_base = ioremap(dev->pci_dev.resource[EDU_BAR], PAGE_SIZE);
+	assert(edu_check_alive(dev));
+	return true;
+}
+
+void edu_dma(struct pci_edu_dev *dev, iova_t iova,
+	     size_t size, unsigned int dev_offset, bool from_device)
+{
+	uint64_t from, to;
+	uint32_t cmd = EDU_CMD_DMA_START;
+
+	assert(size <= EDU_DMA_SIZE_MAX);
+	assert(dev_offset < EDU_DMA_SIZE_MAX);
+
+	printf("edu device DMA start %s addr %p size 0x%lu off 0x%x\n",
+	       from_device ? "FROM" : "TO",
+	       (void *)iova, size, dev_offset);
+
+	if (from_device) {
+		from = dev_offset + EDU_DMA_START;
+		to = iova;
+		cmd |= EDU_CMD_DMA_FROM;
+	} else {
+		from = iova;
+		to = EDU_DMA_START + dev_offset;
+		cmd |= EDU_CMD_DMA_TO;
+	}
+
+	edu_reg_writeq(dev, EDU_REG_DMA_SRC, from);
+	edu_reg_writeq(dev, EDU_REG_DMA_DST, to);
+	edu_reg_writeq(dev, EDU_REG_DMA_COUNT, size);
+	edu_reg_writel(dev, EDU_REG_DMA_CMD, cmd);
+
+	/* Wait until DMA finished */
+	while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
+		cpu_relax();
+}
diff --git a/lib/pci-edu.h b/lib/pci-edu.h
new file mode 100644
index 0000000..8395ebe
--- /dev/null
+++ b/lib/pci-edu.h
@@ -0,0 +1,83 @@
+/*
+ * Edu PCI device header.
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ *
+ * Edu device is a virtualized device in QEMU. Please refer to
+ * docs/specs/edu.txt in QEMU repository for EDU device manual.
+ */
+#ifndef __PCI_EDU_H__
+#define __PCI_EDU_H__
+
+#include "pci.h"
+#include "asm/io.h"
+
+#define  PCI_VENDOR_ID_QEMU              0x1234
+#define  PCI_DEVICE_ID_EDU               0x11e8
+
+/* The only bar used by EDU device */
+#define EDU_BAR                     0
+#define EDU_MAGIC                   0xed
+#define EDU_VERSION                 0x100
+#define EDU_DMA_BUF_SIZE            (1 << 20)
+#define EDU_INPUT_BUF_SIZE          256
+
+#define EDU_REG_ID                  0x0
+#define EDU_REG_ALIVE               0x4
+#define EDU_REG_FACTORIAL           0x8
+#define EDU_REG_STATUS              0x20
+#define EDU_REG_DMA_SRC             0x80
+#define EDU_REG_DMA_DST             0x88
+#define EDU_REG_DMA_COUNT           0x90
+#define EDU_REG_DMA_CMD             0x98
+
+#define EDU_CMD_DMA_START           0x01
+#define EDU_CMD_DMA_FROM            0x02
+#define EDU_CMD_DMA_TO              0x00
+
+#define EDU_STATUS_FACTORIAL        0x1
+#define EDU_STATUS_INT_ENABLE       0x80
+
+#define EDU_DMA_START               0x40000
+#define EDU_DMA_SIZE_MAX            4096
+
+struct pci_edu_dev {
+	struct pci_dev pci_dev;
+	volatile void *reg_base;
+};
+
+#define edu_reg(d, r) (volatile void *)((d)->reg_base + (r))
+
+static inline uint64_t edu_reg_readq(struct pci_edu_dev *dev, int reg)
+{
+	return __raw_readq(edu_reg(dev, reg));
+}
+
+static inline uint32_t edu_reg_readl(struct pci_edu_dev *dev, int reg)
+{
+	return __raw_readl(edu_reg(dev, reg));
+}
+
+static inline void edu_reg_writeq(struct pci_edu_dev *dev, int reg,
+				  uint64_t val)
+{
+	__raw_writeq(val, edu_reg(dev, reg));
+}
+
+static inline void edu_reg_writel(struct pci_edu_dev *dev, int reg,
+				  uint32_t val)
+{
+	__raw_writel(val, edu_reg(dev, reg));
+}
+
+bool edu_init(struct pci_edu_dev *dev);
+void edu_dma(struct pci_edu_dev *dev, iova_t iova,
+	     size_t size, unsigned int dev_offset, bool from_device);
+
+#endif
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 11/14] pci: edu: introduce pci-edu helpers
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

QEMU edu device is a pci device that is originally written for
educational purpose, however it also suits for IOMMU unit test. Adding
helpers for this specific device to implement the device logic.

The device supports lots of functions, here only DMA operation is
supported.

The spec of the device can be found at:

  https://github.com/qemu/qemu/blob/master/docs/specs/edu.txt

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-edu.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/pci-edu.h | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 156 insertions(+)
 create mode 100644 lib/pci-edu.c
 create mode 100644 lib/pci-edu.h

diff --git a/lib/pci-edu.c b/lib/pci-edu.c
new file mode 100644
index 0000000..8096470
--- /dev/null
+++ b/lib/pci-edu.c
@@ -0,0 +1,73 @@
+/*
+ * Edu PCI device.
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ */
+
+#include "pci-edu.h"
+#include "asm/barrier.h"
+
+/* Return true if alive */
+static inline bool edu_check_alive(struct pci_edu_dev *dev)
+{
+	static uint32_t live_count = 1;
+	uint32_t value;
+
+	edu_reg_writel(dev, EDU_REG_ALIVE, live_count++);
+	value = edu_reg_readl(dev, EDU_REG_ALIVE);
+	return (live_count - 1 == ~value);
+}
+
+bool edu_init(struct pci_edu_dev *dev)
+{
+	pcidevaddr_t dev_addr;
+
+	dev_addr = pci_find_dev(PCI_VENDOR_ID_QEMU, PCI_DEVICE_ID_EDU);
+	if (dev_addr == PCIDEVADDR_INVALID)
+		return false;
+
+	pci_dev_init(&dev->pci_dev, dev_addr);
+	pci_enable_defaults(&dev->pci_dev);
+	dev->reg_base = ioremap(dev->pci_dev.resource[EDU_BAR], PAGE_SIZE);
+	assert(edu_check_alive(dev));
+	return true;
+}
+
+void edu_dma(struct pci_edu_dev *dev, iova_t iova,
+	     size_t size, unsigned int dev_offset, bool from_device)
+{
+	uint64_t from, to;
+	uint32_t cmd = EDU_CMD_DMA_START;
+
+	assert(size <= EDU_DMA_SIZE_MAX);
+	assert(dev_offset < EDU_DMA_SIZE_MAX);
+
+	printf("edu device DMA start %s addr %p size 0x%lu off 0x%x\n",
+	       from_device ? "FROM" : "TO",
+	       (void *)iova, size, dev_offset);
+
+	if (from_device) {
+		from = dev_offset + EDU_DMA_START;
+		to = iova;
+		cmd |= EDU_CMD_DMA_FROM;
+	} else {
+		from = iova;
+		to = EDU_DMA_START + dev_offset;
+		cmd |= EDU_CMD_DMA_TO;
+	}
+
+	edu_reg_writeq(dev, EDU_REG_DMA_SRC, from);
+	edu_reg_writeq(dev, EDU_REG_DMA_DST, to);
+	edu_reg_writeq(dev, EDU_REG_DMA_COUNT, size);
+	edu_reg_writel(dev, EDU_REG_DMA_CMD, cmd);
+
+	/* Wait until DMA finished */
+	while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
+		cpu_relax();
+}
diff --git a/lib/pci-edu.h b/lib/pci-edu.h
new file mode 100644
index 0000000..8395ebe
--- /dev/null
+++ b/lib/pci-edu.h
@@ -0,0 +1,83 @@
+/*
+ * Edu PCI device header.
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *   Peter Xu <peterx@redhat.com>,
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or
+ * later.
+ *
+ * Edu device is a virtualized device in QEMU. Please refer to
+ * docs/specs/edu.txt in QEMU repository for EDU device manual.
+ */
+#ifndef __PCI_EDU_H__
+#define __PCI_EDU_H__
+
+#include "pci.h"
+#include "asm/io.h"
+
+#define  PCI_VENDOR_ID_QEMU              0x1234
+#define  PCI_DEVICE_ID_EDU               0x11e8
+
+/* The only bar used by EDU device */
+#define EDU_BAR                     0
+#define EDU_MAGIC                   0xed
+#define EDU_VERSION                 0x100
+#define EDU_DMA_BUF_SIZE            (1 << 20)
+#define EDU_INPUT_BUF_SIZE          256
+
+#define EDU_REG_ID                  0x0
+#define EDU_REG_ALIVE               0x4
+#define EDU_REG_FACTORIAL           0x8
+#define EDU_REG_STATUS              0x20
+#define EDU_REG_DMA_SRC             0x80
+#define EDU_REG_DMA_DST             0x88
+#define EDU_REG_DMA_COUNT           0x90
+#define EDU_REG_DMA_CMD             0x98
+
+#define EDU_CMD_DMA_START           0x01
+#define EDU_CMD_DMA_FROM            0x02
+#define EDU_CMD_DMA_TO              0x00
+
+#define EDU_STATUS_FACTORIAL        0x1
+#define EDU_STATUS_INT_ENABLE       0x80
+
+#define EDU_DMA_START               0x40000
+#define EDU_DMA_SIZE_MAX            4096
+
+struct pci_edu_dev {
+	struct pci_dev pci_dev;
+	volatile void *reg_base;
+};
+
+#define edu_reg(d, r) (volatile void *)((d)->reg_base + (r))
+
+static inline uint64_t edu_reg_readq(struct pci_edu_dev *dev, int reg)
+{
+	return __raw_readq(edu_reg(dev, reg));
+}
+
+static inline uint32_t edu_reg_readl(struct pci_edu_dev *dev, int reg)
+{
+	return __raw_readl(edu_reg(dev, reg));
+}
+
+static inline void edu_reg_writeq(struct pci_edu_dev *dev, int reg,
+				  uint64_t val)
+{
+	__raw_writeq(val, edu_reg(dev, reg));
+}
+
+static inline void edu_reg_writel(struct pci_edu_dev *dev, int reg,
+				  uint32_t val)
+{
+	__raw_writel(val, edu_reg(dev, reg));
+}
+
+bool edu_init(struct pci_edu_dev *dev);
+void edu_dma(struct pci_edu_dev *dev, iova_t iova,
+	     size_t size, unsigned int dev_offset, bool from_device);
+
+#endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 12/14] x86: intel-iommu: add dmar test
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

DMAR test is based on QEMU edu device. A 4B DMA memory copy is carried
out as the simplest DMAR test.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.h             |   5 ++
 lib/x86/intel-iommu.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h |   4 ++
 x86/Makefile.common   |   1 +
 x86/intel-iommu.c     |  51 +++++++++++++++++++
 5 files changed, 194 insertions(+)

diff --git a/lib/pci.h b/lib/pci.h
index d3052ef..26968b1 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -18,6 +18,9 @@ enum {
 #define PCI_BAR_NUM                     6
 #define PCI_DEVFN_MAX                   256
 
+#define PCI_BDF_GET_DEVFN(x)            ((x) & 0xff)
+#define PCI_BDF_GET_BUS(x)              (((x) >> 8) & 0xff)
+
 struct pci_dev {
 	uint16_t bdf;
 	phys_addr_t resource[PCI_BAR_NUM];
@@ -28,6 +31,8 @@ extern void pci_scan_bars(struct pci_dev *dev);
 extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
 extern void pci_enable_defaults(struct pci_dev *dev);
 
+typedef phys_addr_t iova_t;
+
 extern bool pci_probe(void);
 extern void pci_print(void);
 extern bool pci_dev_exists(pcidevaddr_t dev);
diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
index 1887998..86b1805 100644
--- a/lib/x86/intel-iommu.c
+++ b/lib/x86/intel-iommu.c
@@ -11,6 +11,42 @@
  */
 
 #include "intel-iommu.h"
+#include "libcflat.h"
+
+/*
+ * VT-d in QEMU currently only support 39 bits address width, which is
+ * 3-level translation.
+ */
+#define VTD_PAGE_LEVEL      3
+#define VTD_CE_AW_39BIT     0x1
+
+typedef uint64_t vtd_pte_t;
+
+struct vtd_root_entry {
+	/* Quad 1 */
+	uint64_t present:1;
+	uint64_t __reserved:11;
+	uint64_t context_table_p:52;
+	/* Quad 2 */
+	uint64_t __reserved_2;
+} __attribute__ ((packed));
+typedef struct vtd_root_entry vtd_re_t;
+
+struct vtd_context_entry {
+	/* Quad 1 */
+	uint64_t present:1;
+	uint64_t disable_fault_report:1;
+	uint64_t trans_type:2;
+	uint64_t __reserved:8;
+	uint64_t slptptr:52;
+	/* Quad 2 */
+	uint64_t addr_width:3;
+	uint64_t __ignore:4;
+	uint64_t __reserved_2:1;
+	uint64_t domain_id:16;
+	uint64_t __reserved_3:40;
+} __attribute__ ((packed));
+typedef struct vtd_context_entry vtd_ce_t;
 
 #define VTD_RTA_MASK  (PAGE_MASK)
 #define VTD_IRTA_MASK (PAGE_MASK)
@@ -83,6 +119,103 @@ static void vtd_setup_ir_table(void)
 	printf("IR table address: 0x%016lx\n", vtd_ir_table());
 }
 
+static void vtd_install_pte(vtd_pte_t *root, iova_t iova,
+			    phys_addr_t pa, int level_target)
+{
+	int level;
+	unsigned int offset;
+	void *page;
+
+	for (level = VTD_PAGE_LEVEL; level > level_target; level--) {
+		offset = PGDIR_OFFSET(iova, level);
+		if (!(root[offset] & VTD_PTE_RW)) {
+			page = alloc_page();
+			memset(page, 0, PAGE_SIZE);
+			root[offset] = virt_to_phys(page) | VTD_PTE_RW;
+		}
+		root = (uint64_t *)(phys_to_virt(root[offset] &
+						 VTD_PTE_ADDR));
+	}
+
+	offset = PGDIR_OFFSET(iova, level);
+	root[offset] = pa | VTD_PTE_RW;
+	if (level != 1) {
+		/* This is huge page */
+		root[offset] |= VTD_PTE_HUGE;
+	}
+}
+
+#define  VTD_PHYS_TO_VIRT(x) \
+	((void *)(((uint64_t)phys_to_virt(x)) >> VTD_PAGE_SHIFT))
+
+/**
+ * vtd_map_range: setup IO address mapping for specific memory range
+ *
+ * @sid: source ID of the device to setup
+ * @iova: start IO virtual address
+ * @pa: start physical address
+ * @size: size of the mapping area
+ */
+void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
+{
+	uint8_t bus_n, devfn;
+	void *slptptr;
+	vtd_ce_t *ce;
+	vtd_re_t *re = phys_to_virt(vtd_root_table());
+
+	assert(IS_ALIGNED(iova, SZ_4K));
+	assert(IS_ALIGNED(pa, SZ_4K));
+	assert(IS_ALIGNED(size, SZ_4K));
+
+	bus_n = PCI_BDF_GET_BUS(sid);
+	devfn = PCI_BDF_GET_DEVFN(sid);
+
+	/* Point to the correct root entry */
+	re += bus_n;
+
+	if (!re->present) {
+		ce = alloc_page();
+		memset(ce, 0, PAGE_SIZE);
+		memset(re, 0, sizeof(*re));
+		re->context_table_p = virt_to_phys(ce) >> VTD_PAGE_SHIFT;
+		re->present = 1;
+		printf("allocated vt-d root entry for PCI bus %d\n",
+		       bus_n);
+	} else
+		ce = VTD_PHYS_TO_VIRT(re->context_table_p);
+
+	/* Point to the correct context entry */
+	ce += devfn;
+
+	if (!ce->present) {
+		slptptr = alloc_page();
+		memset(slptptr, 0, PAGE_SIZE);
+		memset(ce, 0, sizeof(*ce));
+		/* To make it simple, domain ID is the same as SID */
+		ce->domain_id = sid;
+		/* We only test 39 bits width case (3-level paging) */
+		ce->addr_width = VTD_CE_AW_39BIT;
+		ce->slptptr = virt_to_phys(slptptr) >> VTD_PAGE_SHIFT;
+		ce->trans_type = VTD_CONTEXT_TT_MULTI_LEVEL;
+		ce->present = 1;
+		/* No error reporting yet */
+		ce->disable_fault_report = 1;
+		printf("allocated vt-d context entry for devfn 0x%x\n",
+		       devfn);
+	} else
+		slptptr = VTD_PHYS_TO_VIRT(ce->slptptr);
+
+	while (size) {
+		/* TODO: currently we only map 4K pages (level = 1) */
+		printf("map 4K page IOVA 0x%lx to 0x%lx (sid=0x%04x)\n",
+		       iova, pa, sid);
+		vtd_install_pte(slptptr, iova, pa, 1);
+		size -= VTD_PAGE_SIZE;
+		iova += VTD_PAGE_SIZE;
+		pa += VTD_PAGE_SIZE;
+	}
+}
+
 void vtd_init(void)
 {
 	setup_vm();
diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
index 92cc85f..50f67f1 100644
--- a/lib/x86/intel-iommu.h
+++ b/lib/x86/intel-iommu.h
@@ -20,9 +20,12 @@
 #include "isr.h"
 #include "smp.h"
 #include "desc.h"
+#include "pci.h"
 #include "asm/io.h"
 
 #define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed90000ULL
+#define VTD_PAGE_SHIFT              PAGE_SHIFT
+#define VTD_PAGE_SIZE               PAGE_SIZE
 
 /*
  * Intel IOMMU register specification
@@ -137,5 +140,6 @@ static inline uint64_t vtd_readq(unsigned int reg)
 }
 
 void vtd_init(void);
+void vtd_map_range(uint16_t sid, phys_addr_t iova, phys_addr_t pa, size_t size);
 
 #endif
diff --git a/x86/Makefile.common b/x86/Makefile.common
index 356d879..1dad18b 100644
--- a/x86/Makefile.common
+++ b/x86/Makefile.common
@@ -3,6 +3,7 @@
 all: test_cases
 
 cflatobjs += lib/pci.o
+cflatobjs += lib/pci-edu.o
 cflatobjs += lib/x86/io.o
 cflatobjs += lib/x86/smp.o
 cflatobjs += lib/x86/vm.o
diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
index f247913..21fd57f 100644
--- a/x86/intel-iommu.c
+++ b/x86/intel-iommu.c
@@ -11,9 +11,50 @@
  */
 
 #include "intel-iommu.h"
+#include "pci-edu.h"
+
+#define VTD_TEST_DMAR_4B ("DMAR 4B memcpy test")
+
+void vtd_test_dmar(struct pci_edu_dev *dev)
+{
+	void *page = alloc_page();
+
+#define DMA_TEST_WORD (0x12345678)
+	/* Modify the first 4 bytes of the page */
+	*(uint32_t *)page = DMA_TEST_WORD;
+
+	/*
+	 * Map the newly allocated page into IOVA address 0 (size 4K)
+	 * of the device address space. Root entry and context entry
+	 * will be automatically created when needed.
+	 */
+	vtd_map_range(dev->pci_dev.bdf, 0, virt_to_phys(page), PAGE_SIZE);
+
+	/*
+	 * DMA the first 4 bytes of the page to EDU device buffer
+	 * offset 0.
+	 */
+	edu_dma(dev, 0, 4, 0, false);
+
+	/*
+	 * DMA the first 4 bytes of EDU device buffer into the page
+	 * with offset 4 (so it'll be using 4-7 bytes).
+	 */
+	edu_dma(dev, 4, 4, 0, true);
+
+	/*
+	 * Check data match between 0-3 bytes and 4-7 bytes of the
+	 * page.
+	 */
+	report(VTD_TEST_DMAR_4B, *((uint32_t *)page + 1) == DMA_TEST_WORD);
+
+	free_page(page);
+}
 
 int main(int argc, char *argv[])
 {
+	struct pci_edu_dev dev;
+
 	vtd_init();
 
 	report("fault status check", vtd_readl(DMAR_FSTS_REG) == 0);
@@ -22,6 +63,16 @@ int main(int argc, char *argv[])
 	report("IR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR_TABLE);
 	report("DMAR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_DMAR);
 	report("IR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR);
+	report("DMAR support 39 bits address width",
+	       vtd_readq(DMAR_CAP_REG) & VTD_CAP_SAGAW);
+	report("DMAR support huge pages", vtd_readq(DMAR_CAP_REG) & VTD_CAP_SLLPS);
+
+	if (!edu_init(&dev)) {
+		printf("Please specify \"-device edu\" to do "
+		       "further IOMMU tests.\n");
+		report_skip(VTD_TEST_DMAR_4B);
+	} else
+		vtd_test_dmar(&dev);
 
 	return report_summary();
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 12/14] x86: intel-iommu: add dmar test
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

DMAR test is based on QEMU edu device. A 4B DMA memory copy is carried
out as the simplest DMAR test.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.h             |   5 ++
 lib/x86/intel-iommu.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h |   4 ++
 x86/Makefile.common   |   1 +
 x86/intel-iommu.c     |  51 +++++++++++++++++++
 5 files changed, 194 insertions(+)

diff --git a/lib/pci.h b/lib/pci.h
index d3052ef..26968b1 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -18,6 +18,9 @@ enum {
 #define PCI_BAR_NUM                     6
 #define PCI_DEVFN_MAX                   256
 
+#define PCI_BDF_GET_DEVFN(x)            ((x) & 0xff)
+#define PCI_BDF_GET_BUS(x)              (((x) >> 8) & 0xff)
+
 struct pci_dev {
 	uint16_t bdf;
 	phys_addr_t resource[PCI_BAR_NUM];
@@ -28,6 +31,8 @@ extern void pci_scan_bars(struct pci_dev *dev);
 extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
 extern void pci_enable_defaults(struct pci_dev *dev);
 
+typedef phys_addr_t iova_t;
+
 extern bool pci_probe(void);
 extern void pci_print(void);
 extern bool pci_dev_exists(pcidevaddr_t dev);
diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
index 1887998..86b1805 100644
--- a/lib/x86/intel-iommu.c
+++ b/lib/x86/intel-iommu.c
@@ -11,6 +11,42 @@
  */
 
 #include "intel-iommu.h"
+#include "libcflat.h"
+
+/*
+ * VT-d in QEMU currently only support 39 bits address width, which is
+ * 3-level translation.
+ */
+#define VTD_PAGE_LEVEL      3
+#define VTD_CE_AW_39BIT     0x1
+
+typedef uint64_t vtd_pte_t;
+
+struct vtd_root_entry {
+	/* Quad 1 */
+	uint64_t present:1;
+	uint64_t __reserved:11;
+	uint64_t context_table_p:52;
+	/* Quad 2 */
+	uint64_t __reserved_2;
+} __attribute__ ((packed));
+typedef struct vtd_root_entry vtd_re_t;
+
+struct vtd_context_entry {
+	/* Quad 1 */
+	uint64_t present:1;
+	uint64_t disable_fault_report:1;
+	uint64_t trans_type:2;
+	uint64_t __reserved:8;
+	uint64_t slptptr:52;
+	/* Quad 2 */
+	uint64_t addr_width:3;
+	uint64_t __ignore:4;
+	uint64_t __reserved_2:1;
+	uint64_t domain_id:16;
+	uint64_t __reserved_3:40;
+} __attribute__ ((packed));
+typedef struct vtd_context_entry vtd_ce_t;
 
 #define VTD_RTA_MASK  (PAGE_MASK)
 #define VTD_IRTA_MASK (PAGE_MASK)
@@ -83,6 +119,103 @@ static void vtd_setup_ir_table(void)
 	printf("IR table address: 0x%016lx\n", vtd_ir_table());
 }
 
+static void vtd_install_pte(vtd_pte_t *root, iova_t iova,
+			    phys_addr_t pa, int level_target)
+{
+	int level;
+	unsigned int offset;
+	void *page;
+
+	for (level = VTD_PAGE_LEVEL; level > level_target; level--) {
+		offset = PGDIR_OFFSET(iova, level);
+		if (!(root[offset] & VTD_PTE_RW)) {
+			page = alloc_page();
+			memset(page, 0, PAGE_SIZE);
+			root[offset] = virt_to_phys(page) | VTD_PTE_RW;
+		}
+		root = (uint64_t *)(phys_to_virt(root[offset] &
+						 VTD_PTE_ADDR));
+	}
+
+	offset = PGDIR_OFFSET(iova, level);
+	root[offset] = pa | VTD_PTE_RW;
+	if (level != 1) {
+		/* This is huge page */
+		root[offset] |= VTD_PTE_HUGE;
+	}
+}
+
+#define  VTD_PHYS_TO_VIRT(x) \
+	((void *)(((uint64_t)phys_to_virt(x)) >> VTD_PAGE_SHIFT))
+
+/**
+ * vtd_map_range: setup IO address mapping for specific memory range
+ *
+ * @sid: source ID of the device to setup
+ * @iova: start IO virtual address
+ * @pa: start physical address
+ * @size: size of the mapping area
+ */
+void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
+{
+	uint8_t bus_n, devfn;
+	void *slptptr;
+	vtd_ce_t *ce;
+	vtd_re_t *re = phys_to_virt(vtd_root_table());
+
+	assert(IS_ALIGNED(iova, SZ_4K));
+	assert(IS_ALIGNED(pa, SZ_4K));
+	assert(IS_ALIGNED(size, SZ_4K));
+
+	bus_n = PCI_BDF_GET_BUS(sid);
+	devfn = PCI_BDF_GET_DEVFN(sid);
+
+	/* Point to the correct root entry */
+	re += bus_n;
+
+	if (!re->present) {
+		ce = alloc_page();
+		memset(ce, 0, PAGE_SIZE);
+		memset(re, 0, sizeof(*re));
+		re->context_table_p = virt_to_phys(ce) >> VTD_PAGE_SHIFT;
+		re->present = 1;
+		printf("allocated vt-d root entry for PCI bus %d\n",
+		       bus_n);
+	} else
+		ce = VTD_PHYS_TO_VIRT(re->context_table_p);
+
+	/* Point to the correct context entry */
+	ce += devfn;
+
+	if (!ce->present) {
+		slptptr = alloc_page();
+		memset(slptptr, 0, PAGE_SIZE);
+		memset(ce, 0, sizeof(*ce));
+		/* To make it simple, domain ID is the same as SID */
+		ce->domain_id = sid;
+		/* We only test 39 bits width case (3-level paging) */
+		ce->addr_width = VTD_CE_AW_39BIT;
+		ce->slptptr = virt_to_phys(slptptr) >> VTD_PAGE_SHIFT;
+		ce->trans_type = VTD_CONTEXT_TT_MULTI_LEVEL;
+		ce->present = 1;
+		/* No error reporting yet */
+		ce->disable_fault_report = 1;
+		printf("allocated vt-d context entry for devfn 0x%x\n",
+		       devfn);
+	} else
+		slptptr = VTD_PHYS_TO_VIRT(ce->slptptr);
+
+	while (size) {
+		/* TODO: currently we only map 4K pages (level = 1) */
+		printf("map 4K page IOVA 0x%lx to 0x%lx (sid=0x%04x)\n",
+		       iova, pa, sid);
+		vtd_install_pte(slptptr, iova, pa, 1);
+		size -= VTD_PAGE_SIZE;
+		iova += VTD_PAGE_SIZE;
+		pa += VTD_PAGE_SIZE;
+	}
+}
+
 void vtd_init(void)
 {
 	setup_vm();
diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
index 92cc85f..50f67f1 100644
--- a/lib/x86/intel-iommu.h
+++ b/lib/x86/intel-iommu.h
@@ -20,9 +20,12 @@
 #include "isr.h"
 #include "smp.h"
 #include "desc.h"
+#include "pci.h"
 #include "asm/io.h"
 
 #define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed90000ULL
+#define VTD_PAGE_SHIFT              PAGE_SHIFT
+#define VTD_PAGE_SIZE               PAGE_SIZE
 
 /*
  * Intel IOMMU register specification
@@ -137,5 +140,6 @@ static inline uint64_t vtd_readq(unsigned int reg)
 }
 
 void vtd_init(void);
+void vtd_map_range(uint16_t sid, phys_addr_t iova, phys_addr_t pa, size_t size);
 
 #endif
diff --git a/x86/Makefile.common b/x86/Makefile.common
index 356d879..1dad18b 100644
--- a/x86/Makefile.common
+++ b/x86/Makefile.common
@@ -3,6 +3,7 @@
 all: test_cases
 
 cflatobjs += lib/pci.o
+cflatobjs += lib/pci-edu.o
 cflatobjs += lib/x86/io.o
 cflatobjs += lib/x86/smp.o
 cflatobjs += lib/x86/vm.o
diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
index f247913..21fd57f 100644
--- a/x86/intel-iommu.c
+++ b/x86/intel-iommu.c
@@ -11,9 +11,50 @@
  */
 
 #include "intel-iommu.h"
+#include "pci-edu.h"
+
+#define VTD_TEST_DMAR_4B ("DMAR 4B memcpy test")
+
+void vtd_test_dmar(struct pci_edu_dev *dev)
+{
+	void *page = alloc_page();
+
+#define DMA_TEST_WORD (0x12345678)
+	/* Modify the first 4 bytes of the page */
+	*(uint32_t *)page = DMA_TEST_WORD;
+
+	/*
+	 * Map the newly allocated page into IOVA address 0 (size 4K)
+	 * of the device address space. Root entry and context entry
+	 * will be automatically created when needed.
+	 */
+	vtd_map_range(dev->pci_dev.bdf, 0, virt_to_phys(page), PAGE_SIZE);
+
+	/*
+	 * DMA the first 4 bytes of the page to EDU device buffer
+	 * offset 0.
+	 */
+	edu_dma(dev, 0, 4, 0, false);
+
+	/*
+	 * DMA the first 4 bytes of EDU device buffer into the page
+	 * with offset 4 (so it'll be using 4-7 bytes).
+	 */
+	edu_dma(dev, 4, 4, 0, true);
+
+	/*
+	 * Check data match between 0-3 bytes and 4-7 bytes of the
+	 * page.
+	 */
+	report(VTD_TEST_DMAR_4B, *((uint32_t *)page + 1) == DMA_TEST_WORD);
+
+	free_page(page);
+}
 
 int main(int argc, char *argv[])
 {
+	struct pci_edu_dev dev;
+
 	vtd_init();
 
 	report("fault status check", vtd_readl(DMAR_FSTS_REG) == 0);
@@ -22,6 +63,16 @@ int main(int argc, char *argv[])
 	report("IR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR_TABLE);
 	report("DMAR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_DMAR);
 	report("IR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR);
+	report("DMAR support 39 bits address width",
+	       vtd_readq(DMAR_CAP_REG) & VTD_CAP_SAGAW);
+	report("DMAR support huge pages", vtd_readq(DMAR_CAP_REG) & VTD_CAP_SLLPS);
+
+	if (!edu_init(&dev)) {
+		printf("Please specify \"-device edu\" to do "
+		       "further IOMMU tests.\n");
+		report_skip(VTD_TEST_DMAR_4B);
+	} else
+		vtd_test_dmar(&dev);
 
 	return report_summary();
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 13/14] pci: add msi support for 32/64bit address
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

pci_cap_walk() is provided to allow walk through all the capability bits
for a specific PCI device. If a cap handler is provided, it'll be
triggered if the cap is detected along the cap walk.

MSI cap handler is the first one supported. We can add more cap handler
in the future.

Meanwhile, pci_setup_msi() API is provided to support basic 32/64 bit
address MSI setup.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/pci.h |  4 ++++
 2 files changed, 73 insertions(+)

diff --git a/lib/pci.c b/lib/pci.c
index bf2a3df..a51eb8e 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -7,6 +7,74 @@
 #include "pci.h"
 #include "asm/pci.h"
 
+typedef void (*pci_cap_handler)(struct pci_dev *dev, int cap_offset);
+
+static void pci_cap_msi_handler(struct pci_dev *dev, int cap_offset)
+{
+	printf("Detected MSI for device 0x%x offset 0x%x\n",
+	       dev->bdf, cap_offset);
+	dev->msi_offset = cap_offset;
+}
+
+static pci_cap_handler cap_handlers[PCI_CAP_ID_MAX + 1] = {
+	[PCI_CAP_ID_MSI] = pci_cap_msi_handler,
+};
+
+void pci_cap_walk(struct pci_dev *dev)
+{
+	uint8_t cap_offset;
+	uint8_t cap_id;
+	int count = 0;
+
+	cap_offset = pci_config_readb(dev->bdf, PCI_CAPABILITY_LIST);
+	while (cap_offset) {
+		cap_id = pci_config_readb(dev->bdf, cap_offset);
+		printf("PCI detected cap 0x%x\n", cap_id);
+		assert(cap_id < PCI_CAP_ID_MAX + 1);
+		if (cap_handlers[cap_id])
+			cap_handlers[cap_id](dev, cap_offset);
+		cap_offset = pci_config_readb(dev->bdf, cap_offset + 1);
+		/* Avoid dead loop during cap walk */
+		assert(++count <= 255);
+	}
+}
+
+bool pci_setup_msi(struct pci_dev *dev, uint64_t msi_addr, uint32_t msi_data)
+{
+	uint16_t msi_control;
+	uint16_t offset;
+	pcidevaddr_t addr;
+
+	assert(dev);
+
+	if (!dev->msi_offset) {
+		printf("MSI: dev 0x%x does not support MSI.\n", dev->bdf);
+		return false;
+	}
+
+	addr = dev->bdf;
+	offset = dev->msi_offset;
+	msi_control = pci_config_readw(addr, offset + PCI_MSI_FLAGS);
+	pci_config_writel(addr, offset + PCI_MSI_ADDRESS_LO,
+			  msi_addr & 0xffffffff);
+
+	if (msi_control & PCI_MSI_FLAGS_64BIT) {
+		pci_config_writel(addr, offset + PCI_MSI_ADDRESS_HI,
+				  (uint32_t)(msi_addr >> 32));
+		pci_config_writel(addr, offset + PCI_MSI_DATA_64, msi_data);
+		printf("MSI: dev 0x%x init 64bit address: ", addr);
+	} else {
+		pci_config_writel(addr, offset + PCI_MSI_DATA_32, msi_data);
+		printf("MSI: dev 0x%x init 32bit address: ", addr);
+	}
+	printf("addr=0x%lx, data=0x%x\n", msi_addr, msi_data);
+
+	msi_control |= PCI_MSI_FLAGS_ENABLE;
+	pci_config_writew(addr, offset + PCI_MSI_FLAGS, msi_control);
+
+	return true;
+}
+
 void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr)
 {
 	uint16_t val = pci_config_readw(dev->bdf, PCI_COMMAND);
@@ -260,4 +328,5 @@ void pci_enable_defaults(struct pci_dev *dev)
 	pci_scan_bars(dev);
 	/* Enable device DMA operations */
 	pci_cmd_set_clr(dev, PCI_COMMAND_MASTER, 0);
+	pci_cap_walk(dev);
 }
diff --git a/lib/pci.h b/lib/pci.h
index 26968b1..e71a05d 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -23,13 +23,17 @@ enum {
 
 struct pci_dev {
 	uint16_t bdf;
+	uint16_t msi_offset;
 	phys_addr_t resource[PCI_BAR_NUM];
 };
 
 extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
 extern void pci_scan_bars(struct pci_dev *dev);
 extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
+extern void pci_cap_walk(struct pci_dev *dev);
 extern void pci_enable_defaults(struct pci_dev *dev);
+extern bool pci_setup_msi(struct pci_dev *dev, uint64_t msi_addr,
+			  uint32_t msi_data);
 
 typedef phys_addr_t iova_t;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 13/14] pci: add msi support for 32/64bit address
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

pci_cap_walk() is provided to allow walk through all the capability bits
for a specific PCI device. If a cap handler is provided, it'll be
triggered if the cap is detected along the cap walk.

MSI cap handler is the first one supported. We can add more cap handler
in the future.

Meanwhile, pci_setup_msi() API is provided to support basic 32/64 bit
address MSI setup.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/pci.h |  4 ++++
 2 files changed, 73 insertions(+)

diff --git a/lib/pci.c b/lib/pci.c
index bf2a3df..a51eb8e 100644
--- a/lib/pci.c
+++ b/lib/pci.c
@@ -7,6 +7,74 @@
 #include "pci.h"
 #include "asm/pci.h"
 
+typedef void (*pci_cap_handler)(struct pci_dev *dev, int cap_offset);
+
+static void pci_cap_msi_handler(struct pci_dev *dev, int cap_offset)
+{
+	printf("Detected MSI for device 0x%x offset 0x%x\n",
+	       dev->bdf, cap_offset);
+	dev->msi_offset = cap_offset;
+}
+
+static pci_cap_handler cap_handlers[PCI_CAP_ID_MAX + 1] = {
+	[PCI_CAP_ID_MSI] = pci_cap_msi_handler,
+};
+
+void pci_cap_walk(struct pci_dev *dev)
+{
+	uint8_t cap_offset;
+	uint8_t cap_id;
+	int count = 0;
+
+	cap_offset = pci_config_readb(dev->bdf, PCI_CAPABILITY_LIST);
+	while (cap_offset) {
+		cap_id = pci_config_readb(dev->bdf, cap_offset);
+		printf("PCI detected cap 0x%x\n", cap_id);
+		assert(cap_id < PCI_CAP_ID_MAX + 1);
+		if (cap_handlers[cap_id])
+			cap_handlers[cap_id](dev, cap_offset);
+		cap_offset = pci_config_readb(dev->bdf, cap_offset + 1);
+		/* Avoid dead loop during cap walk */
+		assert(++count <= 255);
+	}
+}
+
+bool pci_setup_msi(struct pci_dev *dev, uint64_t msi_addr, uint32_t msi_data)
+{
+	uint16_t msi_control;
+	uint16_t offset;
+	pcidevaddr_t addr;
+
+	assert(dev);
+
+	if (!dev->msi_offset) {
+		printf("MSI: dev 0x%x does not support MSI.\n", dev->bdf);
+		return false;
+	}
+
+	addr = dev->bdf;
+	offset = dev->msi_offset;
+	msi_control = pci_config_readw(addr, offset + PCI_MSI_FLAGS);
+	pci_config_writel(addr, offset + PCI_MSI_ADDRESS_LO,
+			  msi_addr & 0xffffffff);
+
+	if (msi_control & PCI_MSI_FLAGS_64BIT) {
+		pci_config_writel(addr, offset + PCI_MSI_ADDRESS_HI,
+				  (uint32_t)(msi_addr >> 32));
+		pci_config_writel(addr, offset + PCI_MSI_DATA_64, msi_data);
+		printf("MSI: dev 0x%x init 64bit address: ", addr);
+	} else {
+		pci_config_writel(addr, offset + PCI_MSI_DATA_32, msi_data);
+		printf("MSI: dev 0x%x init 32bit address: ", addr);
+	}
+	printf("addr=0x%lx, data=0x%x\n", msi_addr, msi_data);
+
+	msi_control |= PCI_MSI_FLAGS_ENABLE;
+	pci_config_writew(addr, offset + PCI_MSI_FLAGS, msi_control);
+
+	return true;
+}
+
 void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr)
 {
 	uint16_t val = pci_config_readw(dev->bdf, PCI_COMMAND);
@@ -260,4 +328,5 @@ void pci_enable_defaults(struct pci_dev *dev)
 	pci_scan_bars(dev);
 	/* Enable device DMA operations */
 	pci_cmd_set_clr(dev, PCI_COMMAND_MASTER, 0);
+	pci_cap_walk(dev);
 }
diff --git a/lib/pci.h b/lib/pci.h
index 26968b1..e71a05d 100644
--- a/lib/pci.h
+++ b/lib/pci.h
@@ -23,13 +23,17 @@ enum {
 
 struct pci_dev {
 	uint16_t bdf;
+	uint16_t msi_offset;
 	phys_addr_t resource[PCI_BAR_NUM];
 };
 
 extern void pci_dev_init(struct pci_dev *dev, pcidevaddr_t bdf);
 extern void pci_scan_bars(struct pci_dev *dev);
 extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
+extern void pci_cap_walk(struct pci_dev *dev);
 extern void pci_enable_defaults(struct pci_dev *dev);
+extern bool pci_setup_msi(struct pci_dev *dev, uint64_t msi_addr,
+			  uint32_t msi_data);
 
 typedef phys_addr_t iova_t;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [kvm-unit-tests PATCH v8 14/14] x86: intel-iommu: add IR MSI test
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
@ 2016-12-12  3:08   ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

First of all, vtd_setup_msi() is provided. It setup IRTE entries,
meanwhile, setup PCI device MSI vectors corresponding to VT-d spec.

Meanwhile, IR MSI test is added to intel IOMMU unit test. The basic IR
test is carried out by a edu INTR raise request. When write to the intr
raise register, interrupt will be generated. Type of interrupt will
depend on the setup (either INTx or MSI).

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-edu.h         |  1 +
 lib/x86/intel-iommu.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h |  1 +
 x86/intel-iommu.c     | 44 +++++++++++++++++++++++-
 4 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/lib/pci-edu.h b/lib/pci-edu.h
index 8395ebe..9fe32c6 100644
--- a/lib/pci-edu.h
+++ b/lib/pci-edu.h
@@ -32,6 +32,7 @@
 #define EDU_REG_ALIVE               0x4
 #define EDU_REG_FACTORIAL           0x8
 #define EDU_REG_STATUS              0x20
+#define EDU_REG_INTR_RAISE          0x60
 #define EDU_REG_DMA_SRC             0x80
 #define EDU_REG_DMA_DST             0x88
 #define EDU_REG_DMA_COUNT           0x90
diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
index 86b1805..20b9240 100644
--- a/lib/x86/intel-iommu.c
+++ b/lib/x86/intel-iommu.c
@@ -12,6 +12,7 @@
 
 #include "intel-iommu.h"
 #include "libcflat.h"
+#include "pci.h"
 
 /*
  * VT-d in QEMU currently only support 39 bits address width, which is
@@ -48,6 +49,26 @@ struct vtd_context_entry {
 } __attribute__ ((packed));
 typedef struct vtd_context_entry vtd_ce_t;
 
+struct vtd_irte {
+	uint32_t present:1;
+	uint32_t fault_disable:1;    /* Fault Processing Disable */
+	uint32_t dest_mode:1;        /* Destination Mode */
+	uint32_t redir_hint:1;       /* Redirection Hint */
+	uint32_t trigger_mode:1;     /* Trigger Mode */
+	uint32_t delivery_mode:3;    /* Delivery Mode */
+	uint32_t __avail:4;          /* Available spaces for software */
+	uint32_t __reserved_0:3;     /* Reserved 0 */
+	uint32_t irte_mode:1;        /* IRTE Mode */
+	uint32_t vector:8;           /* Interrupt Vector */
+	uint32_t __reserved_1:8;     /* Reserved 1 */
+	uint32_t dest_id;            /* Destination ID */
+	uint16_t source_id:16;       /* Source-ID */
+	uint64_t sid_q:2;            /* Source-ID Qualifier */
+	uint64_t sid_vtype:2;        /* Source-ID Validation Type */
+	uint64_t __reserved_2:44;    /* Reserved 2 */
+} __attribute__ ((packed));
+typedef struct vtd_irte vtd_irte_t;
+
 #define VTD_RTA_MASK  (PAGE_MASK)
 #define VTD_IRTA_MASK (PAGE_MASK)
 
@@ -216,6 +237,79 @@ void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
 	}
 }
 
+static uint16_t vtd_intr_index_alloc(void)
+{
+	static int index_ctr = 0;
+	assert(index_ctr < 65535);
+	return index_ctr++;
+}
+
+static void vtd_setup_irte(struct pci_dev *dev, vtd_irte_t *irte,
+			   int vector, int dest_id)
+{
+	assert(sizeof(vtd_irte_t) == 16);
+	memset(irte, 0, sizeof(*irte));
+	irte->fault_disable = 1;
+	irte->dest_mode = 0;	 /* physical */
+	irte->trigger_mode = 0;	 /* edge */
+	irte->delivery_mode = 0; /* fixed */
+	irte->irte_mode = 0;	 /* remapped */
+	irte->vector = vector;
+	irte->dest_id = dest_id;
+	irte->source_id = dev->bdf;
+	irte->sid_q = 0;
+	irte->sid_vtype = 1;     /* full-sid verify */
+	irte->present = 1;
+}
+
+struct vtd_msi_addr {
+	uint32_t __dont_care:2;
+	uint32_t handle_15:1;	 /* handle[15] */
+	uint32_t shv:1;
+	uint32_t interrupt_format:1;
+	uint32_t handle_0_14:15; /* handle[0:14] */
+	uint32_t head:12;	 /* 0xfee */
+	uint32_t addr_hi;	 /* not used except with x2apic */
+} __attribute__ ((packed));
+typedef struct vtd_msi_addr vtd_msi_addr_t;
+
+struct vtd_msi_data {
+	uint16_t __reserved;
+	uint16_t subhandle;
+} __attribute__ ((packed));
+typedef struct vtd_msi_data vtd_msi_data_t;
+
+/**
+ * vtd_setup_msi - setup MSI message for a device
+ *
+ * @dev: PCI device to setup MSI
+ * @vector: interrupt vector
+ * @dest_id: destination processor
+ */
+bool vtd_setup_msi(struct pci_dev *dev, int vector, int dest_id)
+{
+	vtd_msi_data_t msi_data = {};
+	vtd_msi_addr_t msi_addr = {};
+	vtd_irte_t *irte = phys_to_virt(vtd_ir_table());
+	uint16_t index = vtd_intr_index_alloc();
+
+	assert(sizeof(vtd_msi_addr_t) == 8);
+	assert(sizeof(vtd_msi_data_t) == 4);
+
+	printf("INTR: setup IRTE index %d\n", index);
+	vtd_setup_irte(dev, irte + index, vector, dest_id);
+
+	msi_addr.handle_15 = index >> 15 & 1;
+	msi_addr.shv = 0;
+	msi_addr.interrupt_format = 1;
+	msi_addr.handle_0_14 = index & 0x7fff;
+	msi_addr.head = 0xfee;
+	msi_data.subhandle = 0;
+
+	return pci_setup_msi(dev, *(uint64_t *)&msi_addr,
+			     *(uint32_t *)&msi_data);
+}
+
 void vtd_init(void)
 {
 	setup_vm();
diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
index 50f67f1..e3e956d 100644
--- a/lib/x86/intel-iommu.h
+++ b/lib/x86/intel-iommu.h
@@ -141,5 +141,6 @@ static inline uint64_t vtd_readq(unsigned int reg)
 
 void vtd_init(void);
 void vtd_map_range(uint16_t sid, phys_addr_t iova, phys_addr_t pa, size_t size);
+bool vtd_setup_msi(struct pci_dev *dev, int vector, int dest_id);
 
 #endif
diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
index 21fd57f..753f90e 100644
--- a/x86/intel-iommu.c
+++ b/x86/intel-iommu.c
@@ -12,8 +12,10 @@
 
 #include "intel-iommu.h"
 #include "pci-edu.h"
+#include "x86/apic.h"
 
 #define VTD_TEST_DMAR_4B ("DMAR 4B memcpy test")
+#define VTD_TEST_IR_MSI ("IR MSI")
 
 void vtd_test_dmar(struct pci_edu_dev *dev)
 {
@@ -51,6 +53,43 @@ void vtd_test_dmar(struct pci_edu_dev *dev)
 	free_page(page);
 }
 
+static volatile bool edu_intr_recved;
+
+static void edu_isr(isr_regs_t *regs)
+{
+	edu_intr_recved = true;
+	eoi();
+}
+
+static void vtd_test_ir(struct pci_edu_dev *dev)
+{
+#define VTD_TEST_VECTOR (0xee)
+	/*
+	 * Setup EDU PCI device MSI, using interrupt remapping. By
+	 * default, EDU device is using INTx.
+	 */
+	if (!vtd_setup_msi(&dev->pci_dev, VTD_TEST_VECTOR, 0)) {
+		printf("edu device does not support MSI, skip test\n");
+		report_skip(VTD_TEST_IR_MSI);
+		return;
+	}
+
+	handle_irq(VTD_TEST_VECTOR, edu_isr);
+	irq_enable();
+
+	/* Manually trigger INTR */
+	edu_reg_writel(dev, EDU_REG_INTR_RAISE, 1);
+
+	while (!edu_intr_recved)
+		cpu_relax();
+
+	/* Clear INTR bits */
+	edu_reg_writel(dev, EDU_REG_INTR_RAISE, 0);
+
+	/* We are good as long as we reach here */
+	report(VTD_TEST_IR_MSI, true);
+}
+
 int main(int argc, char *argv[])
 {
 	struct pci_edu_dev dev;
@@ -71,8 +110,11 @@ int main(int argc, char *argv[])
 		printf("Please specify \"-device edu\" to do "
 		       "further IOMMU tests.\n");
 		report_skip(VTD_TEST_DMAR_4B);
-	} else
+		report_skip(VTD_TEST_IR_MSI);
+	} else {
 		vtd_test_dmar(&dev);
+		vtd_test_ir(&dev);
+	}
 
 	return report_summary();
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [kvm-unit-tests PATCH v8 14/14] x86: intel-iommu: add IR MSI test
@ 2016-12-12  3:08   ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-12  3:08 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: drjones, agordeev, jan.kiszka, rkrcmar, pbonzini, peterx

First of all, vtd_setup_msi() is provided. It setup IRTE entries,
meanwhile, setup PCI device MSI vectors corresponding to VT-d spec.

Meanwhile, IR MSI test is added to intel IOMMU unit test. The basic IR
test is carried out by a edu INTR raise request. When write to the intr
raise register, interrupt will be generated. Type of interrupt will
depend on the setup (either INTx or MSI).

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 lib/pci-edu.h         |  1 +
 lib/x86/intel-iommu.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/intel-iommu.h |  1 +
 x86/intel-iommu.c     | 44 +++++++++++++++++++++++-
 4 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/lib/pci-edu.h b/lib/pci-edu.h
index 8395ebe..9fe32c6 100644
--- a/lib/pci-edu.h
+++ b/lib/pci-edu.h
@@ -32,6 +32,7 @@
 #define EDU_REG_ALIVE               0x4
 #define EDU_REG_FACTORIAL           0x8
 #define EDU_REG_STATUS              0x20
+#define EDU_REG_INTR_RAISE          0x60
 #define EDU_REG_DMA_SRC             0x80
 #define EDU_REG_DMA_DST             0x88
 #define EDU_REG_DMA_COUNT           0x90
diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
index 86b1805..20b9240 100644
--- a/lib/x86/intel-iommu.c
+++ b/lib/x86/intel-iommu.c
@@ -12,6 +12,7 @@
 
 #include "intel-iommu.h"
 #include "libcflat.h"
+#include "pci.h"
 
 /*
  * VT-d in QEMU currently only support 39 bits address width, which is
@@ -48,6 +49,26 @@ struct vtd_context_entry {
 } __attribute__ ((packed));
 typedef struct vtd_context_entry vtd_ce_t;
 
+struct vtd_irte {
+	uint32_t present:1;
+	uint32_t fault_disable:1;    /* Fault Processing Disable */
+	uint32_t dest_mode:1;        /* Destination Mode */
+	uint32_t redir_hint:1;       /* Redirection Hint */
+	uint32_t trigger_mode:1;     /* Trigger Mode */
+	uint32_t delivery_mode:3;    /* Delivery Mode */
+	uint32_t __avail:4;          /* Available spaces for software */
+	uint32_t __reserved_0:3;     /* Reserved 0 */
+	uint32_t irte_mode:1;        /* IRTE Mode */
+	uint32_t vector:8;           /* Interrupt Vector */
+	uint32_t __reserved_1:8;     /* Reserved 1 */
+	uint32_t dest_id;            /* Destination ID */
+	uint16_t source_id:16;       /* Source-ID */
+	uint64_t sid_q:2;            /* Source-ID Qualifier */
+	uint64_t sid_vtype:2;        /* Source-ID Validation Type */
+	uint64_t __reserved_2:44;    /* Reserved 2 */
+} __attribute__ ((packed));
+typedef struct vtd_irte vtd_irte_t;
+
 #define VTD_RTA_MASK  (PAGE_MASK)
 #define VTD_IRTA_MASK (PAGE_MASK)
 
@@ -216,6 +237,79 @@ void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
 	}
 }
 
+static uint16_t vtd_intr_index_alloc(void)
+{
+	static int index_ctr = 0;
+	assert(index_ctr < 65535);
+	return index_ctr++;
+}
+
+static void vtd_setup_irte(struct pci_dev *dev, vtd_irte_t *irte,
+			   int vector, int dest_id)
+{
+	assert(sizeof(vtd_irte_t) == 16);
+	memset(irte, 0, sizeof(*irte));
+	irte->fault_disable = 1;
+	irte->dest_mode = 0;	 /* physical */
+	irte->trigger_mode = 0;	 /* edge */
+	irte->delivery_mode = 0; /* fixed */
+	irte->irte_mode = 0;	 /* remapped */
+	irte->vector = vector;
+	irte->dest_id = dest_id;
+	irte->source_id = dev->bdf;
+	irte->sid_q = 0;
+	irte->sid_vtype = 1;     /* full-sid verify */
+	irte->present = 1;
+}
+
+struct vtd_msi_addr {
+	uint32_t __dont_care:2;
+	uint32_t handle_15:1;	 /* handle[15] */
+	uint32_t shv:1;
+	uint32_t interrupt_format:1;
+	uint32_t handle_0_14:15; /* handle[0:14] */
+	uint32_t head:12;	 /* 0xfee */
+	uint32_t addr_hi;	 /* not used except with x2apic */
+} __attribute__ ((packed));
+typedef struct vtd_msi_addr vtd_msi_addr_t;
+
+struct vtd_msi_data {
+	uint16_t __reserved;
+	uint16_t subhandle;
+} __attribute__ ((packed));
+typedef struct vtd_msi_data vtd_msi_data_t;
+
+/**
+ * vtd_setup_msi - setup MSI message for a device
+ *
+ * @dev: PCI device to setup MSI
+ * @vector: interrupt vector
+ * @dest_id: destination processor
+ */
+bool vtd_setup_msi(struct pci_dev *dev, int vector, int dest_id)
+{
+	vtd_msi_data_t msi_data = {};
+	vtd_msi_addr_t msi_addr = {};
+	vtd_irte_t *irte = phys_to_virt(vtd_ir_table());
+	uint16_t index = vtd_intr_index_alloc();
+
+	assert(sizeof(vtd_msi_addr_t) == 8);
+	assert(sizeof(vtd_msi_data_t) == 4);
+
+	printf("INTR: setup IRTE index %d\n", index);
+	vtd_setup_irte(dev, irte + index, vector, dest_id);
+
+	msi_addr.handle_15 = index >> 15 & 1;
+	msi_addr.shv = 0;
+	msi_addr.interrupt_format = 1;
+	msi_addr.handle_0_14 = index & 0x7fff;
+	msi_addr.head = 0xfee;
+	msi_data.subhandle = 0;
+
+	return pci_setup_msi(dev, *(uint64_t *)&msi_addr,
+			     *(uint32_t *)&msi_data);
+}
+
 void vtd_init(void)
 {
 	setup_vm();
diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
index 50f67f1..e3e956d 100644
--- a/lib/x86/intel-iommu.h
+++ b/lib/x86/intel-iommu.h
@@ -141,5 +141,6 @@ static inline uint64_t vtd_readq(unsigned int reg)
 
 void vtd_init(void);
 void vtd_map_range(uint16_t sid, phys_addr_t iova, phys_addr_t pa, size_t size);
+bool vtd_setup_msi(struct pci_dev *dev, int vector, int dest_id);
 
 #endif
diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
index 21fd57f..753f90e 100644
--- a/x86/intel-iommu.c
+++ b/x86/intel-iommu.c
@@ -12,8 +12,10 @@
 
 #include "intel-iommu.h"
 #include "pci-edu.h"
+#include "x86/apic.h"
 
 #define VTD_TEST_DMAR_4B ("DMAR 4B memcpy test")
+#define VTD_TEST_IR_MSI ("IR MSI")
 
 void vtd_test_dmar(struct pci_edu_dev *dev)
 {
@@ -51,6 +53,43 @@ void vtd_test_dmar(struct pci_edu_dev *dev)
 	free_page(page);
 }
 
+static volatile bool edu_intr_recved;
+
+static void edu_isr(isr_regs_t *regs)
+{
+	edu_intr_recved = true;
+	eoi();
+}
+
+static void vtd_test_ir(struct pci_edu_dev *dev)
+{
+#define VTD_TEST_VECTOR (0xee)
+	/*
+	 * Setup EDU PCI device MSI, using interrupt remapping. By
+	 * default, EDU device is using INTx.
+	 */
+	if (!vtd_setup_msi(&dev->pci_dev, VTD_TEST_VECTOR, 0)) {
+		printf("edu device does not support MSI, skip test\n");
+		report_skip(VTD_TEST_IR_MSI);
+		return;
+	}
+
+	handle_irq(VTD_TEST_VECTOR, edu_isr);
+	irq_enable();
+
+	/* Manually trigger INTR */
+	edu_reg_writel(dev, EDU_REG_INTR_RAISE, 1);
+
+	while (!edu_intr_recved)
+		cpu_relax();
+
+	/* Clear INTR bits */
+	edu_reg_writel(dev, EDU_REG_INTR_RAISE, 0);
+
+	/* We are good as long as we reach here */
+	report(VTD_TEST_IR_MSI, true);
+}
+
 int main(int argc, char *argv[])
 {
 	struct pci_edu_dev dev;
@@ -71,8 +110,11 @@ int main(int argc, char *argv[])
 		printf("Please specify \"-device edu\" to do "
 		       "further IOMMU tests.\n");
 		report_skip(VTD_TEST_DMAR_4B);
-	} else
+		report_skip(VTD_TEST_IR_MSI);
+	} else {
 		vtd_test_dmar(&dev);
+		vtd_test_ir(&dev);
+	}
 
 	return report_summary();
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [kvm-unit-tests PATCH v8 00/14] VT-d unit test
  2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
                   ` (14 preceding siblings ...)
  (?)
@ 2016-12-22 11:06 ` Andrew Jones
  2016-12-22 11:52   ` Paolo Bonzini
  -1 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2016-12-22 11:06 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, kvm, rkrcmar, agordeev, jan.kiszka, pbonzini


Alex,

Are you happy with v8? If so, then, as far as I'm concerned, this
series is ready. I'd like to try driving the edu device from ARM
tests too.

Thanks,
drew


On Mon, Dec 12, 2016 at 11:08:06AM +0800, Peter Xu wrote:
> This is v8 of vt-d unit test series.
> 
> (CC qemu-devel list this time.)
> 
> Patch "libcflat: add IS_ALIGNED() macro, and page sizes" is picked up
> by Drew in the ARM GIC framework series, so please feel free to drop
> it when needed.
> 
> Online repo:
> 
>   https://github.com/xzpeter/kvm-unit-tests.git iommu-ut-v8
> 
> Please review. Thanks,
> 
> v8:
> - use ioremap() to access vt-d/edu registers (adding vtd_reg_base to
>   store vt-d mmio base register vaddr, and pci_edu_dev.reg_base to
>   store edu device base register vaddr) [Alex]
> - check vt-d version [Alex]
> - vtd_install_pte: do phys_to_virt() for root pointer [Alex]
> - renaming VTD_FETCH_VIRT_ADDR to VTD_PHYS_TO_VIRT [Alex]
> - define VTD_PAGE_SHIFT and VTD_PAGE_SIZE, use them when necessary
>   [Alex]
> - change define of VTD_PTE_ADDR to cover bits 63:12
> 
> v7:
> - patch 13: add r-b for Drew
> - change type of dev_offset in edu_dma() into unsigned [Alex]
> - squash Alex's fix (thanks) for ARM build into patch 08/14 [Alex]
> 
> v6:
> - pci: rename pci_dev.bar to pcidev.resource (to Drew: I didn't remove
>   your r-b in several patches due to this change, assuming the change
>   is trivial) [Alex]
> - pci_cap_walk(): add an assert to avoid dead loop [Alex]
> - pci_cap_walk(): another assertion to make sure cap_id is valid
>   [Drew]
> 
> v5:
> - add one more patch at the beginning of series to add "extern" to
>   pci_testdev() header definition [Drew]
> - add "extern" to all pci.h functions [Drew]
> - update IS_ALIGNED patch to be aligned with Drew's gic series [Drew]
> - replace one more PCI_DEVFN_MAX/PCI_BAR_NUM in lib/pci-host-generic.c
>   [Drew]
> - add {} in vtd_install_pte() when there are one line code and one
>   line comment [Drew]
> - drop Drew's suggested-by in patch "x86: intel-iommu: add IR MSI
>   test" (I forgot why I added it, but anyway let me drop it here...)
>   [Drew]
> 
> v4:
> (same as v3, but with correct patches sent out)
> 
> v3:
> - is_power_of_2(): remove outer "()" [Drew]
> - fix indent for vtd_root_table() [Drew]
> - pci_find_dev(): keep {} to avoid touching more codes [Drew]
> - pci_scan_bars(): check whether bar is 64bits [Drew]
> - pci_enable_defaults(): make it return nothing [Drew]
> - fix strnang blanks in lib/x86/intel-iommu.h [Drew]
> - pci_setup_msi(): fix correct assert() inside [Drew]
> - squash patches: 03+17 08+09 10+11 12+14 [Drew]
> - remove all "()" for raw defines (numbers) in macros [Drew]
> 
> v2:
> - move cpu_relax patch to the beginning, and use them in all places
> [Drew]
> - replace all corresponding 256 into PCI_DEVFN_MAX, as well for
> PCI_BAR_NUM [Drew]
> - adding is_power_of_2() to replace ONE_BIT_ONLY() [Drew]
> - add SZ_64K macro [Drew]
> - declare pci_config_write[wb] in lib/asm-generic/pci-host-bridge.h [Alex]
> - edu_reg_read/write() add "l" in func name [Drew]
> - drop pci_set_master(), instead, provide pci_cmd_set_clr() [Drew]
> - change return code into bool (always) for functions that apply
> [Drew]
> - keep old pci_find_dev() interface [Drew/Alex]
> - use __raw_{read|write}*() for both vt-d and edu register read/writes
> [Alex]
> - remove pci_ prefix for all pci_dev fields [Drew]
> - replace 0xff in cap_handlers[0xff] into (PCI_CAP_ID_MAX + 1) [Drew]
> - make x86/unittest.cfg simpler by using q35 directly with eim=off
> [Drew]
> 
> RFC -> v1:
> - when init edu device fail, report_skip() rather than return error
> [Radim]
> - use asserts rather than "static bool inited" to avoid multiple init
> of components (affects patch 1/2) [Drew]
> - moving the first two patches out of the series [Drew]
> - int vtd_init(), do not setup_idt() since smp_init() did it [Drew]
> - when edu do not have MSI enabled, skip interrupt test [Radim]
> - rename vtd_reg_*() into vtd_{read|write}[lq](), and move them to
> header file [Drew]
> - use PAGE_MASK when able [Drew]
> - use "&" instead of "|" in intel-iommu init test (three places)
> [Drew]
> - use "vtd_init()" in unit test [Drew]
> - mention that where intel-iommu.h comes from [Drew]
> - re-written vtd_gcmd_or(), make it also work on even hardware [Drew]
> - remove most of the oneline wrapper for VT-d registers, instead, use
> vtd_{read|write}* with register names [Drew]
> - remove useless BDF helpers [Drew]
> - move edu device macros into header file [Drew]
> - make edu_check_alive static inline [Drew]
> - remove all useless wrappers in pci-edu.c [Drew]
> - remove pci_dma_dir_t and all its users, instead, use "bool
> from_device" [Drew]
> - not use typedef for structs, to follow Linux/kvm-unit-tests coding
> style [Drew]
> - let pci_dev_init() clean and simple, then provide
> pci_enable_defaults() for more complicated things [Drew]
> - add one more patch to add intel-iommu test into x86/unittest [Radim]
> - use 0x60 intr request instead of factorial to trigger edu device
> interrupt [Drew]
> - ...and some other changes I just forgot to note down...
> 
> Currently only a very small test scope is covered:
> 
> * VT-d init
> * DMAR: 4 bytes copy
> * IR: MSI
> 
> However this series could be a base point to add more test cases for
> VT-d. The problem is, there are many IOMMU error conditions which are
> very hard to be triggered in a real guest (IOMMU has merely no
> interface for guest user, and it's totally running in the background).
> This piece of work can be a start point if we want to do more
> complicated things and play around with Intel IOMMU devices (also for
> IOMMU regression tests).
> 
> Please review. Thanks,
> 
> Peter Xu (14):
>   pci: fix missing extern for pci_testdev()
>   x86/asm: add cpu_relax()
>   libcflat: introduce is_power_of_2()
>   x86: intel-iommu: add vt-d init test
>   libcflat: add IS_ALIGNED() macro, and page sizes
>   libcflat: moving MIN/MAX here
>   vm/page: provide PGDIR_OFFSET() macro
>   pci: introduce struct pci_dev
>   pci: provide pci_scan_bars()
>   pci: provide pci_enable_defaults()
>   pci: edu: introduce pci-edu helpers
>   x86: intel-iommu: add dmar test
>   pci: add msi support for 32/64bit address
>   x86: intel-iommu: add IR MSI test
> 
>  lib/alloc.c            |   3 -
>  lib/libcflat.h         |  14 +++
>  lib/pci-edu.c          |  73 +++++++++++
>  lib/pci-edu.h          |  84 +++++++++++++
>  lib/pci-host-generic.c |  19 +--
>  lib/pci-testdev.c      |  10 +-
>  lib/pci.c              | 161 ++++++++++++++++++++----
>  lib/pci.h              |  42 +++++--
>  lib/x86/asm/barrier.h  |  11 ++
>  lib/x86/asm/page.h     |   3 +
>  lib/x86/intel-iommu.c  | 326 +++++++++++++++++++++++++++++++++++++++++++++++++
>  lib/x86/intel-iommu.h  | 146 ++++++++++++++++++++++
>  lib/x86/vm.c           |   4 +-
>  x86/Makefile.common    |   1 +
>  x86/Makefile.x86_64    |   2 +
>  x86/intel-iommu.c      | 120 ++++++++++++++++++
>  x86/unittests.cfg      |   7 ++
>  x86/vmexit.c           |  27 ++--
>  18 files changed, 989 insertions(+), 64 deletions(-)
>  create mode 100644 lib/pci-edu.c
>  create mode 100644 lib/pci-edu.h
>  create mode 100644 lib/x86/intel-iommu.c
>  create mode 100644 lib/x86/intel-iommu.h
>  create mode 100644 x86/intel-iommu.c
> 
> -- 
> 2.7.4
> 
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [kvm-unit-tests PATCH v8 00/14] VT-d unit test
  2016-12-22 11:06 ` [Qemu-devel] [kvm-unit-tests PATCH v8 00/14] VT-d unit test Andrew Jones
@ 2016-12-22 11:52   ` Paolo Bonzini
  2016-12-23  3:44     ` Peter Xu
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2016-12-22 11:52 UTC (permalink / raw)
  To: Andrew Jones, Peter Xu; +Cc: qemu-devel, kvm, rkrcmar, agordeev, jan.kiszka



On 22/12/2016 12:06, Andrew Jones wrote:
> Alex,
> 
> Are you happy with v8? If so, then, as far as I'm concerned, this
> series is ready. I'd like to try driving the edu device from ARM
> tests too.
> 
> Thanks,
> drew

There are some minor conflicts, but nothing that I cannot fix.  I've
pushed the series.

Paolo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [kvm-unit-tests PATCH v8 00/14] VT-d unit test
  2016-12-22 11:52   ` Paolo Bonzini
@ 2016-12-23  3:44     ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2016-12-23  3:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Andrew Jones, qemu-devel, kvm, rkrcmar, agordeev, jan.kiszka

On Thu, Dec 22, 2016 at 12:52:09PM +0100, Paolo Bonzini wrote:
> 
> 
> On 22/12/2016 12:06, Andrew Jones wrote:
> > Alex,
> > 
> > Are you happy with v8? If so, then, as far as I'm concerned, this
> > series is ready. I'd like to try driving the edu device from ARM
> > tests too.
> > 
> > Thanks,
> > drew
> 
> There are some minor conflicts, but nothing that I cannot fix.  I've
> pushed the series.

Thanks to both!

-- peterx

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [kvm-unit-tests PATCH v8 12/14] x86: intel-iommu: add dmar test
  2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
@ 2017-01-03 10:33     ` Alexander Gordeev
  -1 siblings, 0 replies; 37+ messages in thread
From: Alexander Gordeev @ 2017-01-03 10:33 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, kvm, drjones, jan.kiszka, rkrcmar, pbonzini

On Mon, Dec 12, 2016 at 11:08:18AM +0800, Peter Xu wrote:
> DMAR test is based on QEMU edu device. A 4B DMA memory copy is carried
> out as the simplest DMAR test.

Hi Peter,

Sorry for missing to provide feedback on your series.
I did just a cursory review, but did not notice few issue below.

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  lib/pci.h             |   5 ++
>  lib/x86/intel-iommu.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  lib/x86/intel-iommu.h |   4 ++
>  x86/Makefile.common   |   1 +
>  x86/intel-iommu.c     |  51 +++++++++++++++++++
>  5 files changed, 194 insertions(+)
> 
> diff --git a/lib/pci.h b/lib/pci.h
> index d3052ef..26968b1 100644
> --- a/lib/pci.h
> +++ b/lib/pci.h
> @@ -18,6 +18,9 @@ enum {
>  #define PCI_BAR_NUM                     6
>  #define PCI_DEVFN_MAX                   256
>  
> +#define PCI_BDF_GET_DEVFN(x)            ((x) & 0xff)
> +#define PCI_BDF_GET_BUS(x)              (((x) >> 8) & 0xff)
> +
>  struct pci_dev {
>  	uint16_t bdf;
>  	phys_addr_t resource[PCI_BAR_NUM];
> @@ -28,6 +31,8 @@ extern void pci_scan_bars(struct pci_dev *dev);
>  extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
>  extern void pci_enable_defaults(struct pci_dev *dev);
>  
> +typedef phys_addr_t iova_t;
> +
>  extern bool pci_probe(void);
>  extern void pci_print(void);
>  extern bool pci_dev_exists(pcidevaddr_t dev);
> diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
> index 1887998..86b1805 100644
> --- a/lib/x86/intel-iommu.c
> +++ b/lib/x86/intel-iommu.c
> @@ -11,6 +11,42 @@
>   */
>  
>  #include "intel-iommu.h"
> +#include "libcflat.h"
> +
> +/*
> + * VT-d in QEMU currently only support 39 bits address width, which is
> + * 3-level translation.
> + */
> +#define VTD_PAGE_LEVEL      3
> +#define VTD_CE_AW_39BIT     0x1
> +
> +typedef uint64_t vtd_pte_t;
> +
> +struct vtd_root_entry {
> +	/* Quad 1 */
> +	uint64_t present:1;
> +	uint64_t __reserved:11;
> +	uint64_t context_table_p:52;
> +	/* Quad 2 */
> +	uint64_t __reserved_2;
> +} __attribute__ ((packed));
> +typedef struct vtd_root_entry vtd_re_t;
> +
> +struct vtd_context_entry {
> +	/* Quad 1 */
> +	uint64_t present:1;
> +	uint64_t disable_fault_report:1;
> +	uint64_t trans_type:2;
> +	uint64_t __reserved:8;
> +	uint64_t slptptr:52;
> +	/* Quad 2 */
> +	uint64_t addr_width:3;
> +	uint64_t __ignore:4;
> +	uint64_t __reserved_2:1;
> +	uint64_t domain_id:16;
> +	uint64_t __reserved_3:40;
> +} __attribute__ ((packed));
> +typedef struct vtd_context_entry vtd_ce_t;
>  
>  #define VTD_RTA_MASK  (PAGE_MASK)
>  #define VTD_IRTA_MASK (PAGE_MASK)
> @@ -83,6 +119,103 @@ static void vtd_setup_ir_table(void)
>  	printf("IR table address: 0x%016lx\n", vtd_ir_table());
>  }
>  
> +static void vtd_install_pte(vtd_pte_t *root, iova_t iova,
> +			    phys_addr_t pa, int level_target)
> +{
> +	int level;
> +	unsigned int offset;
> +	void *page;
> +
> +	for (level = VTD_PAGE_LEVEL; level > level_target; level--) {
> +		offset = PGDIR_OFFSET(iova, level);
> +		if (!(root[offset] & VTD_PTE_RW)) {
> +			page = alloc_page();
> +			memset(page, 0, PAGE_SIZE);
> +			root[offset] = virt_to_phys(page) | VTD_PTE_RW;
> +		}
> +		root = (uint64_t *)(phys_to_virt(root[offset] &
> +						 VTD_PTE_ADDR));
> +	}
> +
> +	offset = PGDIR_OFFSET(iova, level);
> +	root[offset] = pa | VTD_PTE_RW;
> +	if (level != 1) {
> +		/* This is huge page */
> +		root[offset] |= VTD_PTE_HUGE;
> +	}
> +}
> +
> +#define  VTD_PHYS_TO_VIRT(x) \
> +	((void *)(((uint64_t)phys_to_virt(x)) >> VTD_PAGE_SHIFT))

I would say this macro is rather superfluous.

> +/**
> + * vtd_map_range: setup IO address mapping for specific memory range
> + *
> + * @sid: source ID of the device to setup
> + * @iova: start IO virtual address
> + * @pa: start physical address
> + * @size: size of the mapping area
> + */
> +void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
> +{
> +	uint8_t bus_n, devfn;
> +	void *slptptr;
> +	vtd_ce_t *ce;
> +	vtd_re_t *re = phys_to_virt(vtd_root_table());
> +
> +	assert(IS_ALIGNED(iova, SZ_4K));
> +	assert(IS_ALIGNED(pa, SZ_4K));
> +	assert(IS_ALIGNED(size, SZ_4K));
> +
> +	bus_n = PCI_BDF_GET_BUS(sid);
> +	devfn = PCI_BDF_GET_DEVFN(sid);
> +
> +	/* Point to the correct root entry */
> +	re += bus_n;
> +
> +	if (!re->present) {
> +		ce = alloc_page();
> +		memset(ce, 0, PAGE_SIZE);
> +		memset(re, 0, sizeof(*re));
> +		re->context_table_p = virt_to_phys(ce) >> VTD_PAGE_SHIFT;
> +		re->present = 1;
> +		printf("allocated vt-d root entry for PCI bus %d\n",
> +		       bus_n);
> +	} else
> +		ce = VTD_PHYS_TO_VIRT(re->context_table_p);

Seems, it should be instead something like:

		ce = phys_to_virt(re->context_table_p << VTD_PAGE_SHIFT);

> +
> +	/* Point to the correct context entry */
> +	ce += devfn;
> +
> +	if (!ce->present) {
> +		slptptr = alloc_page();
> +		memset(slptptr, 0, PAGE_SIZE);
> +		memset(ce, 0, sizeof(*ce));
> +		/* To make it simple, domain ID is the same as SID */
> +		ce->domain_id = sid;
> +		/* We only test 39 bits width case (3-level paging) */
> +		ce->addr_width = VTD_CE_AW_39BIT;
> +		ce->slptptr = virt_to_phys(slptptr) >> VTD_PAGE_SHIFT;
> +		ce->trans_type = VTD_CONTEXT_TT_MULTI_LEVEL;
> +		ce->present = 1;
> +		/* No error reporting yet */
> +		ce->disable_fault_report = 1;
> +		printf("allocated vt-d context entry for devfn 0x%x\n",
> +		       devfn);
> +	} else
> +		slptptr = VTD_PHYS_TO_VIRT(ce->slptptr);

		slptptr = phys_to_virt(ce->slptptr << VTD_PAGE_SHIFT);

> +	while (size) {
> +		/* TODO: currently we only map 4K pages (level = 1) */
> +		printf("map 4K page IOVA 0x%lx to 0x%lx (sid=0x%04x)\n",
> +		       iova, pa, sid);
> +		vtd_install_pte(slptptr, iova, pa, 1);
> +		size -= VTD_PAGE_SIZE;
> +		iova += VTD_PAGE_SIZE;
> +		pa += VTD_PAGE_SIZE;
> +	}
> +}
> +
>  void vtd_init(void)
>  {
>  	setup_vm();
> diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
> index 92cc85f..50f67f1 100644
> --- a/lib/x86/intel-iommu.h
> +++ b/lib/x86/intel-iommu.h
> @@ -20,9 +20,12 @@
>  #include "isr.h"
>  #include "smp.h"
>  #include "desc.h"
> +#include "pci.h"
>  #include "asm/io.h"
>  
>  #define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed90000ULL
> +#define VTD_PAGE_SHIFT              PAGE_SHIFT
> +#define VTD_PAGE_SIZE               PAGE_SIZE
>  
>  /*
>   * Intel IOMMU register specification
> @@ -137,5 +140,6 @@ static inline uint64_t vtd_readq(unsigned int reg)
>  }
>  
>  void vtd_init(void);
> +void vtd_map_range(uint16_t sid, phys_addr_t iova, phys_addr_t pa, size_t size);
>  
>  #endif
> diff --git a/x86/Makefile.common b/x86/Makefile.common
> index 356d879..1dad18b 100644
> --- a/x86/Makefile.common
> +++ b/x86/Makefile.common
> @@ -3,6 +3,7 @@
>  all: test_cases
>  
>  cflatobjs += lib/pci.o
> +cflatobjs += lib/pci-edu.o
>  cflatobjs += lib/x86/io.o
>  cflatobjs += lib/x86/smp.o
>  cflatobjs += lib/x86/vm.o
> diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
> index f247913..21fd57f 100644
> --- a/x86/intel-iommu.c
> +++ b/x86/intel-iommu.c
> @@ -11,9 +11,50 @@
>   */
>  
>  #include "intel-iommu.h"
> +#include "pci-edu.h"
> +
> +#define VTD_TEST_DMAR_4B ("DMAR 4B memcpy test")
> +
> +void vtd_test_dmar(struct pci_edu_dev *dev)
> +{
> +	void *page = alloc_page();
> +
> +#define DMA_TEST_WORD (0x12345678)
> +	/* Modify the first 4 bytes of the page */
> +	*(uint32_t *)page = DMA_TEST_WORD;
> +
> +	/*
> +	 * Map the newly allocated page into IOVA address 0 (size 4K)
> +	 * of the device address space. Root entry and context entry
> +	 * will be automatically created when needed.
> +	 */
> +	vtd_map_range(dev->pci_dev.bdf, 0, virt_to_phys(page), PAGE_SIZE);
> +
> +	/*
> +	 * DMA the first 4 bytes of the page to EDU device buffer
> +	 * offset 0.
> +	 */
> +	edu_dma(dev, 0, 4, 0, false);
> +
> +	/*
> +	 * DMA the first 4 bytes of EDU device buffer into the page
> +	 * with offset 4 (so it'll be using 4-7 bytes).
> +	 */
> +	edu_dma(dev, 4, 4, 0, true);
> +
> +	/*
> +	 * Check data match between 0-3 bytes and 4-7 bytes of the
> +	 * page.
> +	 */
> +	report(VTD_TEST_DMAR_4B, *((uint32_t *)page + 1) == DMA_TEST_WORD);
> +
> +	free_page(page);
> +}
>  
>  int main(int argc, char *argv[])
>  {
> +	struct pci_edu_dev dev;
> +
>  	vtd_init();
>  
>  	report("fault status check", vtd_readl(DMAR_FSTS_REG) == 0);
> @@ -22,6 +63,16 @@ int main(int argc, char *argv[])
>  	report("IR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR_TABLE);
>  	report("DMAR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_DMAR);
>  	report("IR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR);
> +	report("DMAR support 39 bits address width",
> +	       vtd_readq(DMAR_CAP_REG) & VTD_CAP_SAGAW);
> +	report("DMAR support huge pages", vtd_readq(DMAR_CAP_REG) & VTD_CAP_SLLPS);
> +
> +	if (!edu_init(&dev)) {
> +		printf("Please specify \"-device edu\" to do "
> +		       "further IOMMU tests.\n");
> +		report_skip(VTD_TEST_DMAR_4B);
> +	} else
> +		vtd_test_dmar(&dev);
>  
>  	return report_summary();
>  }
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [kvm-unit-tests PATCH v8 12/14] x86: intel-iommu: add dmar test
@ 2017-01-03 10:33     ` Alexander Gordeev
  0 siblings, 0 replies; 37+ messages in thread
From: Alexander Gordeev @ 2017-01-03 10:33 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, kvm, drjones, jan.kiszka, rkrcmar, pbonzini

On Mon, Dec 12, 2016 at 11:08:18AM +0800, Peter Xu wrote:
> DMAR test is based on QEMU edu device. A 4B DMA memory copy is carried
> out as the simplest DMAR test.

Hi Peter,

Sorry for missing to provide feedback on your series.
I did just a cursory review, but did not notice few issue below.

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  lib/pci.h             |   5 ++
>  lib/x86/intel-iommu.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  lib/x86/intel-iommu.h |   4 ++
>  x86/Makefile.common   |   1 +
>  x86/intel-iommu.c     |  51 +++++++++++++++++++
>  5 files changed, 194 insertions(+)
> 
> diff --git a/lib/pci.h b/lib/pci.h
> index d3052ef..26968b1 100644
> --- a/lib/pci.h
> +++ b/lib/pci.h
> @@ -18,6 +18,9 @@ enum {
>  #define PCI_BAR_NUM                     6
>  #define PCI_DEVFN_MAX                   256
>  
> +#define PCI_BDF_GET_DEVFN(x)            ((x) & 0xff)
> +#define PCI_BDF_GET_BUS(x)              (((x) >> 8) & 0xff)
> +
>  struct pci_dev {
>  	uint16_t bdf;
>  	phys_addr_t resource[PCI_BAR_NUM];
> @@ -28,6 +31,8 @@ extern void pci_scan_bars(struct pci_dev *dev);
>  extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
>  extern void pci_enable_defaults(struct pci_dev *dev);
>  
> +typedef phys_addr_t iova_t;
> +
>  extern bool pci_probe(void);
>  extern void pci_print(void);
>  extern bool pci_dev_exists(pcidevaddr_t dev);
> diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
> index 1887998..86b1805 100644
> --- a/lib/x86/intel-iommu.c
> +++ b/lib/x86/intel-iommu.c
> @@ -11,6 +11,42 @@
>   */
>  
>  #include "intel-iommu.h"
> +#include "libcflat.h"
> +
> +/*
> + * VT-d in QEMU currently only support 39 bits address width, which is
> + * 3-level translation.
> + */
> +#define VTD_PAGE_LEVEL      3
> +#define VTD_CE_AW_39BIT     0x1
> +
> +typedef uint64_t vtd_pte_t;
> +
> +struct vtd_root_entry {
> +	/* Quad 1 */
> +	uint64_t present:1;
> +	uint64_t __reserved:11;
> +	uint64_t context_table_p:52;
> +	/* Quad 2 */
> +	uint64_t __reserved_2;
> +} __attribute__ ((packed));
> +typedef struct vtd_root_entry vtd_re_t;
> +
> +struct vtd_context_entry {
> +	/* Quad 1 */
> +	uint64_t present:1;
> +	uint64_t disable_fault_report:1;
> +	uint64_t trans_type:2;
> +	uint64_t __reserved:8;
> +	uint64_t slptptr:52;
> +	/* Quad 2 */
> +	uint64_t addr_width:3;
> +	uint64_t __ignore:4;
> +	uint64_t __reserved_2:1;
> +	uint64_t domain_id:16;
> +	uint64_t __reserved_3:40;
> +} __attribute__ ((packed));
> +typedef struct vtd_context_entry vtd_ce_t;
>  
>  #define VTD_RTA_MASK  (PAGE_MASK)
>  #define VTD_IRTA_MASK (PAGE_MASK)
> @@ -83,6 +119,103 @@ static void vtd_setup_ir_table(void)
>  	printf("IR table address: 0x%016lx\n", vtd_ir_table());
>  }
>  
> +static void vtd_install_pte(vtd_pte_t *root, iova_t iova,
> +			    phys_addr_t pa, int level_target)
> +{
> +	int level;
> +	unsigned int offset;
> +	void *page;
> +
> +	for (level = VTD_PAGE_LEVEL; level > level_target; level--) {
> +		offset = PGDIR_OFFSET(iova, level);
> +		if (!(root[offset] & VTD_PTE_RW)) {
> +			page = alloc_page();
> +			memset(page, 0, PAGE_SIZE);
> +			root[offset] = virt_to_phys(page) | VTD_PTE_RW;
> +		}
> +		root = (uint64_t *)(phys_to_virt(root[offset] &
> +						 VTD_PTE_ADDR));
> +	}
> +
> +	offset = PGDIR_OFFSET(iova, level);
> +	root[offset] = pa | VTD_PTE_RW;
> +	if (level != 1) {
> +		/* This is huge page */
> +		root[offset] |= VTD_PTE_HUGE;
> +	}
> +}
> +
> +#define  VTD_PHYS_TO_VIRT(x) \
> +	((void *)(((uint64_t)phys_to_virt(x)) >> VTD_PAGE_SHIFT))

I would say this macro is rather superfluous.

> +/**
> + * vtd_map_range: setup IO address mapping for specific memory range
> + *
> + * @sid: source ID of the device to setup
> + * @iova: start IO virtual address
> + * @pa: start physical address
> + * @size: size of the mapping area
> + */
> +void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
> +{
> +	uint8_t bus_n, devfn;
> +	void *slptptr;
> +	vtd_ce_t *ce;
> +	vtd_re_t *re = phys_to_virt(vtd_root_table());
> +
> +	assert(IS_ALIGNED(iova, SZ_4K));
> +	assert(IS_ALIGNED(pa, SZ_4K));
> +	assert(IS_ALIGNED(size, SZ_4K));
> +
> +	bus_n = PCI_BDF_GET_BUS(sid);
> +	devfn = PCI_BDF_GET_DEVFN(sid);
> +
> +	/* Point to the correct root entry */
> +	re += bus_n;
> +
> +	if (!re->present) {
> +		ce = alloc_page();
> +		memset(ce, 0, PAGE_SIZE);
> +		memset(re, 0, sizeof(*re));
> +		re->context_table_p = virt_to_phys(ce) >> VTD_PAGE_SHIFT;
> +		re->present = 1;
> +		printf("allocated vt-d root entry for PCI bus %d\n",
> +		       bus_n);
> +	} else
> +		ce = VTD_PHYS_TO_VIRT(re->context_table_p);

Seems, it should be instead something like:

		ce = phys_to_virt(re->context_table_p << VTD_PAGE_SHIFT);

> +
> +	/* Point to the correct context entry */
> +	ce += devfn;
> +
> +	if (!ce->present) {
> +		slptptr = alloc_page();
> +		memset(slptptr, 0, PAGE_SIZE);
> +		memset(ce, 0, sizeof(*ce));
> +		/* To make it simple, domain ID is the same as SID */
> +		ce->domain_id = sid;
> +		/* We only test 39 bits width case (3-level paging) */
> +		ce->addr_width = VTD_CE_AW_39BIT;
> +		ce->slptptr = virt_to_phys(slptptr) >> VTD_PAGE_SHIFT;
> +		ce->trans_type = VTD_CONTEXT_TT_MULTI_LEVEL;
> +		ce->present = 1;
> +		/* No error reporting yet */
> +		ce->disable_fault_report = 1;
> +		printf("allocated vt-d context entry for devfn 0x%x\n",
> +		       devfn);
> +	} else
> +		slptptr = VTD_PHYS_TO_VIRT(ce->slptptr);

		slptptr = phys_to_virt(ce->slptptr << VTD_PAGE_SHIFT);

> +	while (size) {
> +		/* TODO: currently we only map 4K pages (level = 1) */
> +		printf("map 4K page IOVA 0x%lx to 0x%lx (sid=0x%04x)\n",
> +		       iova, pa, sid);
> +		vtd_install_pte(slptptr, iova, pa, 1);
> +		size -= VTD_PAGE_SIZE;
> +		iova += VTD_PAGE_SIZE;
> +		pa += VTD_PAGE_SIZE;
> +	}
> +}
> +
>  void vtd_init(void)
>  {
>  	setup_vm();
> diff --git a/lib/x86/intel-iommu.h b/lib/x86/intel-iommu.h
> index 92cc85f..50f67f1 100644
> --- a/lib/x86/intel-iommu.h
> +++ b/lib/x86/intel-iommu.h
> @@ -20,9 +20,12 @@
>  #include "isr.h"
>  #include "smp.h"
>  #include "desc.h"
> +#include "pci.h"
>  #include "asm/io.h"
>  
>  #define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed90000ULL
> +#define VTD_PAGE_SHIFT              PAGE_SHIFT
> +#define VTD_PAGE_SIZE               PAGE_SIZE
>  
>  /*
>   * Intel IOMMU register specification
> @@ -137,5 +140,6 @@ static inline uint64_t vtd_readq(unsigned int reg)
>  }
>  
>  void vtd_init(void);
> +void vtd_map_range(uint16_t sid, phys_addr_t iova, phys_addr_t pa, size_t size);
>  
>  #endif
> diff --git a/x86/Makefile.common b/x86/Makefile.common
> index 356d879..1dad18b 100644
> --- a/x86/Makefile.common
> +++ b/x86/Makefile.common
> @@ -3,6 +3,7 @@
>  all: test_cases
>  
>  cflatobjs += lib/pci.o
> +cflatobjs += lib/pci-edu.o
>  cflatobjs += lib/x86/io.o
>  cflatobjs += lib/x86/smp.o
>  cflatobjs += lib/x86/vm.o
> diff --git a/x86/intel-iommu.c b/x86/intel-iommu.c
> index f247913..21fd57f 100644
> --- a/x86/intel-iommu.c
> +++ b/x86/intel-iommu.c
> @@ -11,9 +11,50 @@
>   */
>  
>  #include "intel-iommu.h"
> +#include "pci-edu.h"
> +
> +#define VTD_TEST_DMAR_4B ("DMAR 4B memcpy test")
> +
> +void vtd_test_dmar(struct pci_edu_dev *dev)
> +{
> +	void *page = alloc_page();
> +
> +#define DMA_TEST_WORD (0x12345678)
> +	/* Modify the first 4 bytes of the page */
> +	*(uint32_t *)page = DMA_TEST_WORD;
> +
> +	/*
> +	 * Map the newly allocated page into IOVA address 0 (size 4K)
> +	 * of the device address space. Root entry and context entry
> +	 * will be automatically created when needed.
> +	 */
> +	vtd_map_range(dev->pci_dev.bdf, 0, virt_to_phys(page), PAGE_SIZE);
> +
> +	/*
> +	 * DMA the first 4 bytes of the page to EDU device buffer
> +	 * offset 0.
> +	 */
> +	edu_dma(dev, 0, 4, 0, false);
> +
> +	/*
> +	 * DMA the first 4 bytes of EDU device buffer into the page
> +	 * with offset 4 (so it'll be using 4-7 bytes).
> +	 */
> +	edu_dma(dev, 4, 4, 0, true);
> +
> +	/*
> +	 * Check data match between 0-3 bytes and 4-7 bytes of the
> +	 * page.
> +	 */
> +	report(VTD_TEST_DMAR_4B, *((uint32_t *)page + 1) == DMA_TEST_WORD);
> +
> +	free_page(page);
> +}
>  
>  int main(int argc, char *argv[])
>  {
> +	struct pci_edu_dev dev;
> +
>  	vtd_init();
>  
>  	report("fault status check", vtd_readl(DMAR_FSTS_REG) == 0);
> @@ -22,6 +63,16 @@ int main(int argc, char *argv[])
>  	report("IR table setup", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR_TABLE);
>  	report("DMAR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_DMAR);
>  	report("IR enablement", vtd_readl(DMAR_GSTS_REG) & VTD_GCMD_IR);
> +	report("DMAR support 39 bits address width",
> +	       vtd_readq(DMAR_CAP_REG) & VTD_CAP_SAGAW);
> +	report("DMAR support huge pages", vtd_readq(DMAR_CAP_REG) & VTD_CAP_SLLPS);
> +
> +	if (!edu_init(&dev)) {
> +		printf("Please specify \"-device edu\" to do "
> +		       "further IOMMU tests.\n");
> +		report_skip(VTD_TEST_DMAR_4B);
> +	} else
> +		vtd_test_dmar(&dev);
>  
>  	return report_summary();
>  }
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [kvm-unit-tests PATCH v8 12/14] x86: intel-iommu: add dmar test
  2017-01-03 10:33     ` [Qemu-devel] " Alexander Gordeev
@ 2017-01-03 11:16       ` Peter Xu
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2017-01-03 11:16 UTC (permalink / raw)
  To: Alexander Gordeev; +Cc: qemu-devel, kvm, drjones, jan.kiszka, rkrcmar, pbonzini

On Tue, Jan 03, 2017 at 11:33:27AM +0100, Alexander Gordeev wrote:
> On Mon, Dec 12, 2016 at 11:08:18AM +0800, Peter Xu wrote:
> > DMAR test is based on QEMU edu device. A 4B DMA memory copy is carried
> > out as the simplest DMAR test.
> 
> Hi Peter,
> 
> Sorry for missing to provide feedback on your series.
> I did just a cursory review, but did not notice few issue below.
> 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  lib/pci.h             |   5 ++
> >  lib/x86/intel-iommu.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  lib/x86/intel-iommu.h |   4 ++
> >  x86/Makefile.common   |   1 +
> >  x86/intel-iommu.c     |  51 +++++++++++++++++++
> >  5 files changed, 194 insertions(+)
> > 
> > diff --git a/lib/pci.h b/lib/pci.h
> > index d3052ef..26968b1 100644
> > --- a/lib/pci.h
> > +++ b/lib/pci.h
> > @@ -18,6 +18,9 @@ enum {
> >  #define PCI_BAR_NUM                     6
> >  #define PCI_DEVFN_MAX                   256
> >  
> > +#define PCI_BDF_GET_DEVFN(x)            ((x) & 0xff)
> > +#define PCI_BDF_GET_BUS(x)              (((x) >> 8) & 0xff)
> > +
> >  struct pci_dev {
> >  	uint16_t bdf;
> >  	phys_addr_t resource[PCI_BAR_NUM];
> > @@ -28,6 +31,8 @@ extern void pci_scan_bars(struct pci_dev *dev);
> >  extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
> >  extern void pci_enable_defaults(struct pci_dev *dev);
> >  
> > +typedef phys_addr_t iova_t;
> > +
> >  extern bool pci_probe(void);
> >  extern void pci_print(void);
> >  extern bool pci_dev_exists(pcidevaddr_t dev);
> > diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
> > index 1887998..86b1805 100644
> > --- a/lib/x86/intel-iommu.c
> > +++ b/lib/x86/intel-iommu.c
> > @@ -11,6 +11,42 @@
> >   */
> >  
> >  #include "intel-iommu.h"
> > +#include "libcflat.h"
> > +
> > +/*
> > + * VT-d in QEMU currently only support 39 bits address width, which is
> > + * 3-level translation.
> > + */
> > +#define VTD_PAGE_LEVEL      3
> > +#define VTD_CE_AW_39BIT     0x1
> > +
> > +typedef uint64_t vtd_pte_t;
> > +
> > +struct vtd_root_entry {
> > +	/* Quad 1 */
> > +	uint64_t present:1;
> > +	uint64_t __reserved:11;
> > +	uint64_t context_table_p:52;
> > +	/* Quad 2 */
> > +	uint64_t __reserved_2;
> > +} __attribute__ ((packed));
> > +typedef struct vtd_root_entry vtd_re_t;
> > +
> > +struct vtd_context_entry {
> > +	/* Quad 1 */
> > +	uint64_t present:1;
> > +	uint64_t disable_fault_report:1;
> > +	uint64_t trans_type:2;
> > +	uint64_t __reserved:8;
> > +	uint64_t slptptr:52;
> > +	/* Quad 2 */
> > +	uint64_t addr_width:3;
> > +	uint64_t __ignore:4;
> > +	uint64_t __reserved_2:1;
> > +	uint64_t domain_id:16;
> > +	uint64_t __reserved_3:40;
> > +} __attribute__ ((packed));
> > +typedef struct vtd_context_entry vtd_ce_t;
> >  
> >  #define VTD_RTA_MASK  (PAGE_MASK)
> >  #define VTD_IRTA_MASK (PAGE_MASK)
> > @@ -83,6 +119,103 @@ static void vtd_setup_ir_table(void)
> >  	printf("IR table address: 0x%016lx\n", vtd_ir_table());
> >  }
> >  
> > +static void vtd_install_pte(vtd_pte_t *root, iova_t iova,
> > +			    phys_addr_t pa, int level_target)
> > +{
> > +	int level;
> > +	unsigned int offset;
> > +	void *page;
> > +
> > +	for (level = VTD_PAGE_LEVEL; level > level_target; level--) {
> > +		offset = PGDIR_OFFSET(iova, level);
> > +		if (!(root[offset] & VTD_PTE_RW)) {
> > +			page = alloc_page();
> > +			memset(page, 0, PAGE_SIZE);
> > +			root[offset] = virt_to_phys(page) | VTD_PTE_RW;
> > +		}
> > +		root = (uint64_t *)(phys_to_virt(root[offset] &
> > +						 VTD_PTE_ADDR));
> > +	}
> > +
> > +	offset = PGDIR_OFFSET(iova, level);
> > +	root[offset] = pa | VTD_PTE_RW;
> > +	if (level != 1) {
> > +		/* This is huge page */
> > +		root[offset] |= VTD_PTE_HUGE;
> > +	}
> > +}
> > +
> > +#define  VTD_PHYS_TO_VIRT(x) \
> > +	((void *)(((uint64_t)phys_to_virt(x)) >> VTD_PAGE_SHIFT))
> 
> I would say this macro is rather superfluous.
> 
> > +/**
> > + * vtd_map_range: setup IO address mapping for specific memory range
> > + *
> > + * @sid: source ID of the device to setup
> > + * @iova: start IO virtual address
> > + * @pa: start physical address
> > + * @size: size of the mapping area
> > + */
> > +void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
> > +{
> > +	uint8_t bus_n, devfn;
> > +	void *slptptr;
> > +	vtd_ce_t *ce;
> > +	vtd_re_t *re = phys_to_virt(vtd_root_table());
> > +
> > +	assert(IS_ALIGNED(iova, SZ_4K));
> > +	assert(IS_ALIGNED(pa, SZ_4K));
> > +	assert(IS_ALIGNED(size, SZ_4K));
> > +
> > +	bus_n = PCI_BDF_GET_BUS(sid);
> > +	devfn = PCI_BDF_GET_DEVFN(sid);
> > +
> > +	/* Point to the correct root entry */
> > +	re += bus_n;
> > +
> > +	if (!re->present) {
> > +		ce = alloc_page();
> > +		memset(ce, 0, PAGE_SIZE);
> > +		memset(re, 0, sizeof(*re));
> > +		re->context_table_p = virt_to_phys(ce) >> VTD_PAGE_SHIFT;
> > +		re->present = 1;
> > +		printf("allocated vt-d root entry for PCI bus %d\n",
> > +		       bus_n);
> > +	} else
> > +		ce = VTD_PHYS_TO_VIRT(re->context_table_p);
> 
> Seems, it should be instead something like:
> 
> 		ce = phys_to_virt(re->context_table_p << VTD_PAGE_SHIFT);

Oops, you are right. It didn't encounter issue since current DMAR test
is too simple and only allocated one single page. :-)

I'll post a fix soon. Thanks!

-- peterx

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [kvm-unit-tests PATCH v8 12/14] x86: intel-iommu: add dmar test
@ 2017-01-03 11:16       ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2017-01-03 11:16 UTC (permalink / raw)
  To: Alexander Gordeev; +Cc: qemu-devel, kvm, drjones, jan.kiszka, rkrcmar, pbonzini

On Tue, Jan 03, 2017 at 11:33:27AM +0100, Alexander Gordeev wrote:
> On Mon, Dec 12, 2016 at 11:08:18AM +0800, Peter Xu wrote:
> > DMAR test is based on QEMU edu device. A 4B DMA memory copy is carried
> > out as the simplest DMAR test.
> 
> Hi Peter,
> 
> Sorry for missing to provide feedback on your series.
> I did just a cursory review, but did not notice few issue below.
> 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  lib/pci.h             |   5 ++
> >  lib/x86/intel-iommu.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  lib/x86/intel-iommu.h |   4 ++
> >  x86/Makefile.common   |   1 +
> >  x86/intel-iommu.c     |  51 +++++++++++++++++++
> >  5 files changed, 194 insertions(+)
> > 
> > diff --git a/lib/pci.h b/lib/pci.h
> > index d3052ef..26968b1 100644
> > --- a/lib/pci.h
> > +++ b/lib/pci.h
> > @@ -18,6 +18,9 @@ enum {
> >  #define PCI_BAR_NUM                     6
> >  #define PCI_DEVFN_MAX                   256
> >  
> > +#define PCI_BDF_GET_DEVFN(x)            ((x) & 0xff)
> > +#define PCI_BDF_GET_BUS(x)              (((x) >> 8) & 0xff)
> > +
> >  struct pci_dev {
> >  	uint16_t bdf;
> >  	phys_addr_t resource[PCI_BAR_NUM];
> > @@ -28,6 +31,8 @@ extern void pci_scan_bars(struct pci_dev *dev);
> >  extern void pci_cmd_set_clr(struct pci_dev *dev, uint16_t set, uint16_t clr);
> >  extern void pci_enable_defaults(struct pci_dev *dev);
> >  
> > +typedef phys_addr_t iova_t;
> > +
> >  extern bool pci_probe(void);
> >  extern void pci_print(void);
> >  extern bool pci_dev_exists(pcidevaddr_t dev);
> > diff --git a/lib/x86/intel-iommu.c b/lib/x86/intel-iommu.c
> > index 1887998..86b1805 100644
> > --- a/lib/x86/intel-iommu.c
> > +++ b/lib/x86/intel-iommu.c
> > @@ -11,6 +11,42 @@
> >   */
> >  
> >  #include "intel-iommu.h"
> > +#include "libcflat.h"
> > +
> > +/*
> > + * VT-d in QEMU currently only support 39 bits address width, which is
> > + * 3-level translation.
> > + */
> > +#define VTD_PAGE_LEVEL      3
> > +#define VTD_CE_AW_39BIT     0x1
> > +
> > +typedef uint64_t vtd_pte_t;
> > +
> > +struct vtd_root_entry {
> > +	/* Quad 1 */
> > +	uint64_t present:1;
> > +	uint64_t __reserved:11;
> > +	uint64_t context_table_p:52;
> > +	/* Quad 2 */
> > +	uint64_t __reserved_2;
> > +} __attribute__ ((packed));
> > +typedef struct vtd_root_entry vtd_re_t;
> > +
> > +struct vtd_context_entry {
> > +	/* Quad 1 */
> > +	uint64_t present:1;
> > +	uint64_t disable_fault_report:1;
> > +	uint64_t trans_type:2;
> > +	uint64_t __reserved:8;
> > +	uint64_t slptptr:52;
> > +	/* Quad 2 */
> > +	uint64_t addr_width:3;
> > +	uint64_t __ignore:4;
> > +	uint64_t __reserved_2:1;
> > +	uint64_t domain_id:16;
> > +	uint64_t __reserved_3:40;
> > +} __attribute__ ((packed));
> > +typedef struct vtd_context_entry vtd_ce_t;
> >  
> >  #define VTD_RTA_MASK  (PAGE_MASK)
> >  #define VTD_IRTA_MASK (PAGE_MASK)
> > @@ -83,6 +119,103 @@ static void vtd_setup_ir_table(void)
> >  	printf("IR table address: 0x%016lx\n", vtd_ir_table());
> >  }
> >  
> > +static void vtd_install_pte(vtd_pte_t *root, iova_t iova,
> > +			    phys_addr_t pa, int level_target)
> > +{
> > +	int level;
> > +	unsigned int offset;
> > +	void *page;
> > +
> > +	for (level = VTD_PAGE_LEVEL; level > level_target; level--) {
> > +		offset = PGDIR_OFFSET(iova, level);
> > +		if (!(root[offset] & VTD_PTE_RW)) {
> > +			page = alloc_page();
> > +			memset(page, 0, PAGE_SIZE);
> > +			root[offset] = virt_to_phys(page) | VTD_PTE_RW;
> > +		}
> > +		root = (uint64_t *)(phys_to_virt(root[offset] &
> > +						 VTD_PTE_ADDR));
> > +	}
> > +
> > +	offset = PGDIR_OFFSET(iova, level);
> > +	root[offset] = pa | VTD_PTE_RW;
> > +	if (level != 1) {
> > +		/* This is huge page */
> > +		root[offset] |= VTD_PTE_HUGE;
> > +	}
> > +}
> > +
> > +#define  VTD_PHYS_TO_VIRT(x) \
> > +	((void *)(((uint64_t)phys_to_virt(x)) >> VTD_PAGE_SHIFT))
> 
> I would say this macro is rather superfluous.
> 
> > +/**
> > + * vtd_map_range: setup IO address mapping for specific memory range
> > + *
> > + * @sid: source ID of the device to setup
> > + * @iova: start IO virtual address
> > + * @pa: start physical address
> > + * @size: size of the mapping area
> > + */
> > +void vtd_map_range(uint16_t sid, iova_t iova, phys_addr_t pa, size_t size)
> > +{
> > +	uint8_t bus_n, devfn;
> > +	void *slptptr;
> > +	vtd_ce_t *ce;
> > +	vtd_re_t *re = phys_to_virt(vtd_root_table());
> > +
> > +	assert(IS_ALIGNED(iova, SZ_4K));
> > +	assert(IS_ALIGNED(pa, SZ_4K));
> > +	assert(IS_ALIGNED(size, SZ_4K));
> > +
> > +	bus_n = PCI_BDF_GET_BUS(sid);
> > +	devfn = PCI_BDF_GET_DEVFN(sid);
> > +
> > +	/* Point to the correct root entry */
> > +	re += bus_n;
> > +
> > +	if (!re->present) {
> > +		ce = alloc_page();
> > +		memset(ce, 0, PAGE_SIZE);
> > +		memset(re, 0, sizeof(*re));
> > +		re->context_table_p = virt_to_phys(ce) >> VTD_PAGE_SHIFT;
> > +		re->present = 1;
> > +		printf("allocated vt-d root entry for PCI bus %d\n",
> > +		       bus_n);
> > +	} else
> > +		ce = VTD_PHYS_TO_VIRT(re->context_table_p);
> 
> Seems, it should be instead something like:
> 
> 		ce = phys_to_virt(re->context_table_p << VTD_PAGE_SHIFT);

Oops, you are right. It didn't encounter issue since current DMAR test
is too simple and only allocated one single page. :-)

I'll post a fix soon. Thanks!

-- peterx

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2017-01-03 11:16 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-12  3:08 [kvm-unit-tests PATCH v8 00/14] VT-d unit test Peter Xu
2016-12-12  3:08 ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 01/14] pci: fix missing extern for pci_testdev() Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 02/14] x86/asm: add cpu_relax() Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 03/14] libcflat: introduce is_power_of_2() Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 04/14] x86: intel-iommu: add vt-d init test Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 05/14] libcflat: add IS_ALIGNED() macro, and page sizes Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 06/14] libcflat: moving MIN/MAX here Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 07/14] vm/page: provide PGDIR_OFFSET() macro Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 08/14] pci: introduce struct pci_dev Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 09/14] pci: provide pci_scan_bars() Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 10/14] pci: provide pci_enable_defaults() Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 11/14] pci: edu: introduce pci-edu helpers Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 12/14] x86: intel-iommu: add dmar test Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2017-01-03 10:33   ` Alexander Gordeev
2017-01-03 10:33     ` [Qemu-devel] " Alexander Gordeev
2017-01-03 11:16     ` Peter Xu
2017-01-03 11:16       ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 13/14] pci: add msi support for 32/64bit address Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-12  3:08 ` [kvm-unit-tests PATCH v8 14/14] x86: intel-iommu: add IR MSI test Peter Xu
2016-12-12  3:08   ` [Qemu-devel] " Peter Xu
2016-12-22 11:06 ` [Qemu-devel] [kvm-unit-tests PATCH v8 00/14] VT-d unit test Andrew Jones
2016-12-22 11:52   ` Paolo Bonzini
2016-12-23  3:44     ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.