* [PATCH 0/7] AMD IOMMU emulation patchset v4
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
0 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
Hi,
I rebased my work on mst's PCI tree and, hopefully, fixed issues raised by
others. Here's a summary of the changes:
- made it apply to mst/pci
- moved some AMD IOMMU stuff into a reset handler
- dropped range_covers_range() (wasn't the same as ranges_overlap(), but the
latter was better anyway)
- used 'expand' to remove tabs in pci_regs.h before applying the useful changes
- fixed the endianness mistake spotted by Blue (though ldq_phys wasn't needed)
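For reference, ranges_overlap() takes (base, length) pairs describing inclusive address ranges; here is a minimal self-contained sketch of the test it performs (modeled on QEMU's helper, not copied verbatim):

```c
#include <stdint.h>
#include <stdbool.h>

/* (first, len) describes the inclusive range [first, first + len - 1]. */
static uint64_t range_get_last(uint64_t first, uint64_t len)
{
    return first + len - 1;
}

/* True iff the two ranges share at least one address. */
static bool ranges_overlap(uint64_t first1, uint64_t len1,
                           uint64_t first2, uint64_t len2)
{
    uint64_t last1 = range_get_last(first1, len1);
    uint64_t last2 = range_get_last(first2, len2);

    /* Disjoint iff one range ends before the other begins. */
    return !(last2 < first1 || last1 < first2);
}
```

Note this is strictly an overlap test; range_covers_range() additionally required full containment, which the callers here didn't actually need.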
As for Anthony's suggestion to simply sed-convert all devices, I'd rather go
through them one at a time and do it manually. 'sed' would not only mess up the
indentation; it also isn't straightforward to get the 'PCIDevice *' you
need to pass to the pci_* helpers. (I'll try to focus on conversion next so we
can poison the old stuff.)
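Why a plain sed pass falls short can be sketched with a stripped-down mock of the new interface (the names and shapes below are hypothetical, modeled on what the series does rather than taken from it): each access must be routed through the owning device so an IOMMU can translate it, so every call site needs a 'PCIDevice *' that the old flat cpu_physical_memory_*() calls never carried.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t pcibus_t;
typedef struct PCIDevice PCIDevice;

/* Optional per-device translation hook, as an IOMMU would install. */
typedef pcibus_t (*pci_translate_fn)(PCIDevice *dev, pcibus_t addr);

struct PCIDevice {
    pci_translate_fn translate;   /* NULL means identity mapping */
};

static uint8_t fake_ram[0x100];   /* stand-in for guest memory */

/* Mock of a pci_memory_read()-style helper: the PCIDevice argument is
 * what lets the access be remapped before it touches memory. */
static void pci_memory_read(PCIDevice *dev, pcibus_t addr,
                            void *buf, int len)
{
    if (dev->translate) {
        addr = dev->translate(dev, addr);   /* IOMMU remaps the access */
    }
    memcpy(buf, &fake_ram[addr], len);
}

/* Toy IOMMU that shifts every bus address up by 0x10. */
static pcibus_t shift_translate(PCIDevice *dev, pcibus_t addr)
{
    (void)dev;
    return addr + 0x10;
}
```

A mechanical sed rewrite can substitute the function name, but it can't conjure the right 'PCIDevice *' at each call site, which is why the conversion is done per device.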
I also added (read "spelled it out myself") malc's ACK to the ac97 patch.
Nothing changed since his last review.
Please have a look and merge if you like it.
Thanks,
Eduard
Eduard - Gabriel Munteanu (7):
pci: expand tabs to spaces in pci_regs.h
pci: memory access API and IOMMU support
AMD IOMMU emulation
ide: use the PCI memory access interface
rtl8139: use the PCI memory access interface
eepro100: use the PCI memory access interface
ac97: use the PCI memory access interface
Makefile.target | 2 +-
dma-helpers.c | 46 ++-
dma.h | 21 +-
hw/ac97.c | 6 +-
hw/amd_iommu.c | 663 ++++++++++++++++++++++++++
hw/eepro100.c | 86 ++--
hw/ide/core.c | 15 +-
hw/ide/internal.h | 39 ++
hw/ide/macio.c | 4 +-
hw/ide/pci.c | 7 +
hw/pc.c | 2 +
hw/pci.c | 185 ++++++++-
hw/pci.h | 74 +++
hw/pci_ids.h | 2 +
hw/pci_internals.h | 12 +
hw/pci_regs.h | 1331 ++++++++++++++++++++++++++--------------------------
hw/rtl8139.c | 99 +++--
qemu-common.h | 1 +
18 files changed, 1827 insertions(+), 768 deletions(-)
create mode 100644 hw/amd_iommu.c
rewrite hw/pci_regs.h (90%)
* [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
0 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
The conversion was done using the GNU 'expand' tool (default settings)
to make it obey the QEMU coding style.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/pci_regs.h | 1330 ++++++++++++++++++++++++++++----------------------------
1 files changed, 665 insertions(+), 665 deletions(-)
rewrite hw/pci_regs.h (90%)
diff --git a/hw/pci_regs.h b/hw/pci_regs.h
dissimilarity index 90%
index dd0bed4..0f9f84c 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -1,665 +1,665 @@
-/*
- * pci_regs.h
- *
- * PCI standard defines
- * Copyright 1994, Drew Eckhardt
- * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
- *
- * For more information, please consult the following manuals (look at
- * http://www.pcisig.com/ for how to get them):
- *
- * PCI BIOS Specification
- * PCI Local Bus Specification
- * PCI to PCI Bridge Specification
- * PCI System Design Guide
- *
- * For hypertransport information, please consult the following manuals
- * from http://www.hypertransport.org
- *
- * The Hypertransport I/O Link Specification
- */
-
-#ifndef LINUX_PCI_REGS_H
-#define LINUX_PCI_REGS_H
-
-/*
- * Under PCI, each device has 256 bytes of configuration address space,
- * of which the first 64 bytes are standardized as follows:
- */
-#define PCI_VENDOR_ID 0x00 /* 16 bits */
-#define PCI_DEVICE_ID 0x02 /* 16 bits */
-#define PCI_COMMAND 0x04 /* 16 bits */
-#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
-#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
-#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
-#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
-#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
-#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
-#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
-#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
-#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
-#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
-#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
-
-#define PCI_STATUS 0x06 /* 16 bits */
-#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
-#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
-#define PCI_STATUS_66MHZ 0x20 /* Support 66 Mhz PCI 2.1 bus */
-#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
-#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast-back to back */
-#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
-#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
-#define PCI_STATUS_DEVSEL_FAST 0x000
-#define PCI_STATUS_DEVSEL_MEDIUM 0x200
-#define PCI_STATUS_DEVSEL_SLOW 0x400
-#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
-#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
-#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
-#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
-#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
-
-#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
-#define PCI_REVISION_ID 0x08 /* Revision ID */
-#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
-#define PCI_CLASS_DEVICE 0x0a /* Device class */
-
-#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
-#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
-#define PCI_HEADER_TYPE 0x0e /* 8 bits */
-#define PCI_HEADER_TYPE_NORMAL 0
-#define PCI_HEADER_TYPE_BRIDGE 1
-#define PCI_HEADER_TYPE_CARDBUS 2
-
-#define PCI_BIST 0x0f /* 8 bits */
-#define PCI_BIST_CODE_MASK 0x0f /* Return result */
-#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
-#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
-
-/*
- * Base addresses specify locations in memory or I/O space.
- * Decoded size can be determined by writing a value of
- * 0xffffffff to the register, and reading it back. Only
- * 1 bits are decoded.
- */
-#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
-#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
-#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
-#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
-#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
-#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
-#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
-#define PCI_BASE_ADDRESS_SPACE_IO 0x01
-#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
-#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
-#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
-#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
-#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
-#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
-#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
-#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
-/* bit 1 is reserved if address_space = 1 */
-
-/* Header type 0 (normal devices) */
-#define PCI_CARDBUS_CIS 0x28
-#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
-#define PCI_SUBSYSTEM_ID 0x2e
-#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
-#define PCI_ROM_ADDRESS_ENABLE 0x01
-#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
-
-#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
-
-/* 0x35-0x3b are reserved */
-#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
-#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
-#define PCI_MIN_GNT 0x3e /* 8 bits */
-#define PCI_MAX_LAT 0x3f /* 8 bits */
-
-/* Header type 1 (PCI-to-PCI bridges) */
-#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
-#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
-#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
-#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
-#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
-#define PCI_IO_LIMIT 0x1d
-#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
-#define PCI_IO_RANGE_TYPE_16 0x00
-#define PCI_IO_RANGE_TYPE_32 0x01
-#define PCI_IO_RANGE_MASK (~0x0fUL)
-#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
-#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
-#define PCI_MEMORY_LIMIT 0x22
-#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
-#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
-#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
-#define PCI_PREF_MEMORY_LIMIT 0x26
-#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
-#define PCI_PREF_RANGE_TYPE_32 0x00
-#define PCI_PREF_RANGE_TYPE_64 0x01
-#define PCI_PREF_RANGE_MASK (~0x0fUL)
-#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
-#define PCI_PREF_LIMIT_UPPER32 0x2c
-#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
-#define PCI_IO_LIMIT_UPPER16 0x32
-/* 0x34 same as for htype 0 */
-/* 0x35-0x3b is reserved */
-#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
-/* 0x3c-0x3d are same as for htype 0 */
-#define PCI_BRIDGE_CONTROL 0x3e
-#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
-#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
-#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
-#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
-#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
-#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
-#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
-
-/* Header type 2 (CardBus bridges) */
-#define PCI_CB_CAPABILITY_LIST 0x14
-/* 0x15 reserved */
-#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
-#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
-#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
-#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
-#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
-#define PCI_CB_MEMORY_BASE_0 0x1c
-#define PCI_CB_MEMORY_LIMIT_0 0x20
-#define PCI_CB_MEMORY_BASE_1 0x24
-#define PCI_CB_MEMORY_LIMIT_1 0x28
-#define PCI_CB_IO_BASE_0 0x2c
-#define PCI_CB_IO_BASE_0_HI 0x2e
-#define PCI_CB_IO_LIMIT_0 0x30
-#define PCI_CB_IO_LIMIT_0_HI 0x32
-#define PCI_CB_IO_BASE_1 0x34
-#define PCI_CB_IO_BASE_1_HI 0x36
-#define PCI_CB_IO_LIMIT_1 0x38
-#define PCI_CB_IO_LIMIT_1_HI 0x3a
-#define PCI_CB_IO_RANGE_MASK (~0x03UL)
-/* 0x3c-0x3d are same as for htype 0 */
-#define PCI_CB_BRIDGE_CONTROL 0x3e
-#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
-#define PCI_CB_BRIDGE_CTL_SERR 0x02
-#define PCI_CB_BRIDGE_CTL_ISA 0x04
-#define PCI_CB_BRIDGE_CTL_VGA 0x08
-#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
-#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
-#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
-#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
-#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
-#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
-#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
-#define PCI_CB_SUBSYSTEM_ID 0x42
-#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
-/* 0x48-0x7f reserved */
-
-/* Capability lists */
-
-#define PCI_CAP_LIST_ID 0 /* Capability ID */
-#define PCI_CAP_ID_PM 0x01 /* Power Management */
-#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
-#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
-#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
-#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
-#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
-#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
-#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
-#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
-#define PCI_CAP_ID_DBG 0x0A /* Debug port */
-#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
-#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
-#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
-#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
-#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
-#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
-#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
-#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
-#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
-#define PCI_CAP_SIZEOF 4
-
-/* Power Management Registers */
-
-#define PCI_PM_PMC 2 /* PM Capabilities Register */
-#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
-#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
-#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
-#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
-#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxilliary power support mask */
-#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
-#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
-#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
-#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
-#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
-#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
-#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
-#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
-#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
-#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
-#define PCI_PM_CTRL 4 /* PM control and status register */
-#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
-#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
-#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
-#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
-#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
-#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
-#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
-#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
-#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
-#define PCI_PM_DATA_REGISTER 7 /* (??) */
-#define PCI_PM_SIZEOF 8
-
-/* AGP registers */
-
-#define PCI_AGP_VERSION 2 /* BCD version number */
-#define PCI_AGP_RFU 3 /* Rest of capability flags */
-#define PCI_AGP_STATUS 4 /* Status register */
-#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
-#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
-#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
-#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
-#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
-#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
-#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
-#define PCI_AGP_COMMAND 8 /* Control register */
-#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
-#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
-#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
-#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
-#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
-#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
-#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
-#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
-#define PCI_AGP_SIZEOF 12
-
-/* Vital Product Data */
-
-#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
-#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
-#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
-#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
-
-/* Slot Identification */
-
-#define PCI_SID_ESR 2 /* Expansion Slot Register */
-#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
-#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
-#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
-
-/* Message Signalled Interrupts registers */
-
-#define PCI_MSI_FLAGS 2 /* Various flags */
-#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
-#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
-#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
-#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
-#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
-#define PCI_MSI_RFU 3 /* Rest of capability flags */
-#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
-#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
-#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
-#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
-#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
-#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
-
-/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
-#define PCI_MSIX_FLAGS 2
-#define PCI_MSIX_FLAGS_QSIZE 0x7FF
-#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
-#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
-#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
-
-/* CompactPCI Hotswap Register */
-
-#define PCI_CHSWP_CSR 2 /* Control and Status Register */
-#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
-#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
-#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
-#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
-#define PCI_CHSWP_PI 0x30 /* Programming Interface */
-#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
-#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
-
-/* PCI Advanced Feature registers */
-
-#define PCI_AF_LENGTH 2
-#define PCI_AF_CAP 3
-#define PCI_AF_CAP_TP 0x01
-#define PCI_AF_CAP_FLR 0x02
-#define PCI_AF_CTRL 4
-#define PCI_AF_CTRL_FLR 0x01
-#define PCI_AF_STATUS 5
-#define PCI_AF_STATUS_TP 0x01
-
-/* PCI-X registers */
-
-#define PCI_X_CMD 2 /* Modes & Features */
-#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
-#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
-#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
-#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
-#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
-#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
-#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
- /* Max # of outstanding split transactions */
-#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
-#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
-#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
-#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
-#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
-#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
-#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
-#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
-#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
-#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
-#define PCI_X_STATUS 4 /* PCI-X capabilities */
-#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
-#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
-#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
-#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
-#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
-#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
-#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
-#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
-#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
-#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
-#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
-#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
-#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
-
-/* PCI Express capability registers */
-
-#define PCI_EXP_FLAGS 2 /* Capabilities register */
-#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
-#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
-#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
-#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
-#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
-#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
-#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
-#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
-#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
-#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
-#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
-#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
-#define PCI_EXP_DEVCAP 4 /* Device capabilities */
-#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
-#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
-#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
-#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
-#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
-#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
-#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
-#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
-#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
-#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
-#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
-#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
-#define PCI_EXP_DEVCTL 8 /* Device Control */
-#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
-#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
-#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
-#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
-#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
-#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
-#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
-#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
-#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
-#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
-#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
-#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
-#define PCI_EXP_DEVSTA 10 /* Device Status */
-#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
-#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
-#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
-#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
-#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
-#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
-#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
-#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
-#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
-#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
-#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
-#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
-#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
-#define PCI_EXP_LNKCAP_SDERC 0x00080000 /* Suprise Down Error Reporting Capable */
-#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
-#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
-#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
-#define PCI_EXP_LNKCTL 16 /* Link Control */
-#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
-#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
-#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
-#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
-#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
-#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
-#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
-#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
-#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
-#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Lnk Autonomous Bandwidth Interrupt Enable */
-#define PCI_EXP_LNKSTA 18 /* Link Status */
-#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
-#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Nogotiated Link Width */
-#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
-#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
-#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
-#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
-#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
-#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
-#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
-#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
-#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
-#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
-#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
-#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
-#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
-#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
-#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
-#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
-#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
-#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
-#define PCI_EXP_SLTCTL 24 /* Slot Control */
-#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
-#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
-#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
-#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
-#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
-#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
-#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
-#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
-#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
-#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
-#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
-#define PCI_EXP_SLTSTA 26 /* Slot Status */
-#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
-#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
-#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
-#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
-#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
-#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
-#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
-#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
-#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
-#define PCI_EXP_RTCTL 28 /* Root Control */
-#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
-#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
-#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
-#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
-#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
-#define PCI_EXP_RTCAP 30 /* Root Capabilities */
-#define PCI_EXP_RTSTA 32 /* Root Status */
-#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
-#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
-#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
-#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
-#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
-#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
-
-/* Extended Capabilities (PCI-X 2.0 and Express) */
-#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
-#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
-#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
-
-#define PCI_EXT_CAP_ID_ERR 1
-#define PCI_EXT_CAP_ID_VC 2
-#define PCI_EXT_CAP_ID_DSN 3
-#define PCI_EXT_CAP_ID_PWR 4
-#define PCI_EXT_CAP_ID_ARI 14
-#define PCI_EXT_CAP_ID_ATS 15
-#define PCI_EXT_CAP_ID_SRIOV 16
-
-/* Advanced Error Reporting */
-#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
-#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
-#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
-#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
-#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
-#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
-#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
-#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
-#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
-#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
-#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
-#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
-#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
- /* Same bits as above */
-#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
- /* Same bits as above */
-#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
-#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
-#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
-#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
-#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
-#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
-#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
- /* Same bits as above */
-#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
-#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
-#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
-#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
-#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
-#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
-#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
-#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
-/* Correctable Err Reporting Enable */
-#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
-/* Non-fatal Err Reporting Enable */
-#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
-/* Fatal Err Reporting Enable */
-#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
-#define PCI_ERR_ROOT_STATUS 48
-#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
-/* Multi ERR_COR Received */
-#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
-/* ERR_FATAL/NONFATAL Recevied */
-#define PCI_ERR_ROOT_UNCOR_RCV 0x00000004
-/* Multi ERR_FATAL/NONFATAL Recevied */
-#define PCI_ERR_ROOT_MULTI_UNCOR_RCV 0x00000008
-#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
-#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
-#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
-#define PCI_ERR_ROOT_COR_SRC 52
-#define PCI_ERR_ROOT_SRC 54
-
-/* Virtual Channel */
-#define PCI_VC_PORT_REG1 4
-#define PCI_VC_PORT_REG2 8
-#define PCI_VC_PORT_CTRL 12
-#define PCI_VC_PORT_STATUS 14
-#define PCI_VC_RES_CAP 16
-#define PCI_VC_RES_CTRL 20
-#define PCI_VC_RES_STATUS 26
-
-/* Power Budgeting */
-#define PCI_PWR_DSR 4 /* Data Select Register */
-#define PCI_PWR_DATA 8 /* Data Register */
-#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
-#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
-#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
-#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
-#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
-#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
-#define PCI_PWR_CAP 12 /* Capability */
-#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
-
-/*
- * Hypertransport sub capability types
- *
- * Unfortunately there are both 3 bit and 5 bit capability types defined
- * in the HT spec, catering for that is a little messy. You probably don't
- * want to use these directly, just use pci_find_ht_capability() and it
- * will do the right thing for you.
- */
-#define HT_3BIT_CAP_MASK 0xE0
-#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
-#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
-
-#define HT_5BIT_CAP_MASK 0xF8
-#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
-#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
-#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
-#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
-#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
-#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
-#define HT_MSI_FLAGS 0x02 /* Offset to flags */
-#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
-#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
-#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
-#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
-#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
-#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
-#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
-#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
-#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
-#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
-#define HT_CAPTYPE_PM 0xE0 /* Hypertransport powermanagement configuration */
-
-/* Alternative Routing-ID Interpretation */
-#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
-#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
-#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
-#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
-#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
-#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
-#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
-#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
-
-/* Address Translation Service */
-#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
-#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
-#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
-#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
-#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
-#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
-#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
-
-/* Single Root I/O Virtualization */
-#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
-#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
-#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
-#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
-#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
-#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
-#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
-#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
-#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
-#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
-#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
-#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
-#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
-#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
-#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
-#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
-#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
-#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
-#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
-#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
-#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
-#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
-#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset*/
-#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
-#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
-#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
-#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
-#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
-#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
-
-#endif /* LINUX_PCI_REGS_H */
+/*
+ * pci_regs.h
+ *
+ * PCI standard defines
+ * Copyright 1994, Drew Eckhardt
+ * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
+ *
+ * For more information, please consult the following manuals (look at
+ * http://www.pcisig.com/ for how to get them):
+ *
+ * PCI BIOS Specification
+ * PCI Local Bus Specification
+ * PCI to PCI Bridge Specification
+ * PCI System Design Guide
+ *
+ * For hypertransport information, please consult the following manuals
+ * from http://www.hypertransport.org
+ *
+ * The Hypertransport I/O Link Specification
+ */
+
+#ifndef LINUX_PCI_REGS_H
+#define LINUX_PCI_REGS_H
+
+/*
+ * Under PCI, each device has 256 bytes of configuration address space,
+ * of which the first 64 bytes are standardized as follows:
+ */
+#define PCI_VENDOR_ID 0x00 /* 16 bits */
+#define PCI_DEVICE_ID 0x02 /* 16 bits */
+#define PCI_COMMAND 0x04 /* 16 bits */
+#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
+#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
+#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
+#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
+#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
+#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
+#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
+#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
+#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
+#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
+#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
+
+#define PCI_STATUS 0x06 /* 16 bits */
+#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
+#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
+#define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
+#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
+#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast back-to-back */
+#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
+#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
+#define PCI_STATUS_DEVSEL_FAST 0x000
+#define PCI_STATUS_DEVSEL_MEDIUM 0x200
+#define PCI_STATUS_DEVSEL_SLOW 0x400
+#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
+#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
+#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
+#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
+#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
+
+#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
+#define PCI_REVISION_ID 0x08 /* Revision ID */
+#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
+#define PCI_CLASS_DEVICE 0x0a /* Device class */
+
+#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
+#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
+#define PCI_HEADER_TYPE 0x0e /* 8 bits */
+#define PCI_HEADER_TYPE_NORMAL 0
+#define PCI_HEADER_TYPE_BRIDGE 1
+#define PCI_HEADER_TYPE_CARDBUS 2
+
+#define PCI_BIST 0x0f /* 8 bits */
+#define PCI_BIST_CODE_MASK 0x0f /* Return result */
+#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
+#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
+
+/*
+ * Base addresses specify locations in memory or I/O space.
+ * Decoded size can be determined by writing a value of
+ * 0xffffffff to the register, and reading it back. Only
+ * 1 bits are decoded.
+ */
+#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
+#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
+#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
+#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
+#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
+#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
+#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
+#define PCI_BASE_ADDRESS_SPACE_IO 0x01
+#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
+#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
+#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
+#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
+#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
+#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
+#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
+#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
+/* bit 1 is reserved if address_space = 1 */
+
+/* Header type 0 (normal devices) */
+#define PCI_CARDBUS_CIS 0x28
+#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
+#define PCI_SUBSYSTEM_ID 0x2e
+#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
+#define PCI_ROM_ADDRESS_ENABLE 0x01
+#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
+
+#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
+
+/* 0x35-0x3b are reserved */
+#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
+#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
+#define PCI_MIN_GNT 0x3e /* 8 bits */
+#define PCI_MAX_LAT 0x3f /* 8 bits */
+
+/* Header type 1 (PCI-to-PCI bridges) */
+#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
+#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
+#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
+#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
+#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
+#define PCI_IO_LIMIT 0x1d
+#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
+#define PCI_IO_RANGE_TYPE_16 0x00
+#define PCI_IO_RANGE_TYPE_32 0x01
+#define PCI_IO_RANGE_MASK (~0x0fUL)
+#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
+#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
+#define PCI_MEMORY_LIMIT 0x22
+#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
+#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
+#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
+#define PCI_PREF_MEMORY_LIMIT 0x26
+#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
+#define PCI_PREF_RANGE_TYPE_32 0x00
+#define PCI_PREF_RANGE_TYPE_64 0x01
+#define PCI_PREF_RANGE_MASK (~0x0fUL)
+#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
+#define PCI_PREF_LIMIT_UPPER32 0x2c
+#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
+#define PCI_IO_LIMIT_UPPER16 0x32
+/* 0x34 same as for htype 0 */
+/* 0x35-0x3b is reserved */
+#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
+/* 0x3c-0x3d are same as for htype 0 */
+#define PCI_BRIDGE_CONTROL 0x3e
+#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
+#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
+#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
+#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
+#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
+#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
+#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
+
+/* Header type 2 (CardBus bridges) */
+#define PCI_CB_CAPABILITY_LIST 0x14
+/* 0x15 reserved */
+#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
+#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
+#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
+#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
+#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
+#define PCI_CB_MEMORY_BASE_0 0x1c
+#define PCI_CB_MEMORY_LIMIT_0 0x20
+#define PCI_CB_MEMORY_BASE_1 0x24
+#define PCI_CB_MEMORY_LIMIT_1 0x28
+#define PCI_CB_IO_BASE_0 0x2c
+#define PCI_CB_IO_BASE_0_HI 0x2e
+#define PCI_CB_IO_LIMIT_0 0x30
+#define PCI_CB_IO_LIMIT_0_HI 0x32
+#define PCI_CB_IO_BASE_1 0x34
+#define PCI_CB_IO_BASE_1_HI 0x36
+#define PCI_CB_IO_LIMIT_1 0x38
+#define PCI_CB_IO_LIMIT_1_HI 0x3a
+#define PCI_CB_IO_RANGE_MASK (~0x03UL)
+/* 0x3c-0x3d are same as for htype 0 */
+#define PCI_CB_BRIDGE_CONTROL 0x3e
+#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
+#define PCI_CB_BRIDGE_CTL_SERR 0x02
+#define PCI_CB_BRIDGE_CTL_ISA 0x04
+#define PCI_CB_BRIDGE_CTL_VGA 0x08
+#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
+#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
+#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
+#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
+#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
+#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
+#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
+#define PCI_CB_SUBSYSTEM_ID 0x42
+#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
+/* 0x48-0x7f reserved */
+
+/* Capability lists */
+
+#define PCI_CAP_LIST_ID 0 /* Capability ID */
+#define PCI_CAP_ID_PM 0x01 /* Power Management */
+#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
+#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
+#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
+#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
+#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
+#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
+#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
+#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
+#define PCI_CAP_ID_DBG 0x0A /* Debug port */
+#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
+#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
+#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
+#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
+#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
+#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
+#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
+#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
+#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
+#define PCI_CAP_SIZEOF 4
+
+/* Power Management Registers */
+
+#define PCI_PM_PMC 2 /* PM Capabilities Register */
+#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
+#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
+#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
+#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
+#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxiliary power support mask */
+#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
+#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
+#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
+#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
+#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
+#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
+#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
+#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
+#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
+#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
+#define PCI_PM_CTRL 4 /* PM control and status register */
+#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
+#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
+#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
+#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
+#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
+#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
+#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
+#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
+#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
+#define PCI_PM_DATA_REGISTER 7 /* (??) */
+#define PCI_PM_SIZEOF 8
+
+/* AGP registers */
+
+#define PCI_AGP_VERSION 2 /* BCD version number */
+#define PCI_AGP_RFU 3 /* Rest of capability flags */
+#define PCI_AGP_STATUS 4 /* Status register */
+#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
+#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
+#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
+#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
+#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
+#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
+#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
+#define PCI_AGP_COMMAND 8 /* Control register */
+#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
+#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
+#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
+#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
+#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
+#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
+#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
+#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
+#define PCI_AGP_SIZEOF 12
+
+/* Vital Product Data */
+
+#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
+#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
+#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
+#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
+
+/* Slot Identification */
+
+#define PCI_SID_ESR 2 /* Expansion Slot Register */
+#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
+#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
+#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
+
+/* Message Signalled Interrupts registers */
+
+#define PCI_MSI_FLAGS 2 /* Various flags */
+#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
+#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
+#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
+#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
+#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
+#define PCI_MSI_RFU 3 /* Rest of capability flags */
+#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
+#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
+#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
+#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
+#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
+#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
+
+/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
+#define PCI_MSIX_FLAGS 2
+#define PCI_MSIX_FLAGS_QSIZE 0x7FF
+#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
+#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
+#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
+
+/* CompactPCI Hotswap Register */
+
+#define PCI_CHSWP_CSR 2 /* Control and Status Register */
+#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
+#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
+#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
+#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
+#define PCI_CHSWP_PI 0x30 /* Programming Interface */
+#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
+#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
+
+/* PCI Advanced Feature registers */
+
+#define PCI_AF_LENGTH 2
+#define PCI_AF_CAP 3
+#define PCI_AF_CAP_TP 0x01
+#define PCI_AF_CAP_FLR 0x02
+#define PCI_AF_CTRL 4
+#define PCI_AF_CTRL_FLR 0x01
+#define PCI_AF_STATUS 5
+#define PCI_AF_STATUS_TP 0x01
+
+/* PCI-X registers */
+
+#define PCI_X_CMD 2 /* Modes & Features */
+#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
+#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
+#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
+#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
+#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
+#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
+#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
+ /* Max # of outstanding split transactions */
+#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
+#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
+#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
+#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
+#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
+#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
+#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
+#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
+#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
+#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
+#define PCI_X_STATUS 4 /* PCI-X capabilities */
+#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
+#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
+#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
+#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
+#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
+#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
+#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
+#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
+#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
+#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
+#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
+#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
+#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
+
+/* PCI Express capability registers */
+
+#define PCI_EXP_FLAGS 2 /* Capabilities register */
+#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
+#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
+#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
+#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
+#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
+#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
+#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
+#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
+#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
+#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
+#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
+#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
+#define PCI_EXP_DEVCAP 4 /* Device capabilities */
+#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
+#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
+#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
+#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
+#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
+#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
+#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
+#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
+#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
+#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
+#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
+#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
+#define PCI_EXP_DEVCTL 8 /* Device Control */
+#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
+#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
+#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
+#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
+#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
+#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
+#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
+#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
+#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
+#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
+#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
+#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
+#define PCI_EXP_DEVSTA 10 /* Device Status */
+#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
+#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
+#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
+#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
+#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
+#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
+#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
+#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
+#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
+#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
+#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
+#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
+#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
+#define PCI_EXP_LNKCAP_SDERC 0x00080000 /* Surprise Down Error Reporting Capable */
+#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
+#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
+#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
+#define PCI_EXP_LNKCTL 16 /* Link Control */
+#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
+#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
+#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
+#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
+#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
+#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
+#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
+#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
+#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
+#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Link Autonomous Bandwidth Interrupt Enable */
+#define PCI_EXP_LNKSTA 18 /* Link Status */
+#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
+#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Negotiated Link Width */
+#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
+#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
+#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
+#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
+#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
+#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
+#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
+#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
+#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
+#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
+#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
+#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
+#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
+#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
+#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
+#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
+#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
+#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
+#define PCI_EXP_SLTCTL 24 /* Slot Control */
+#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
+#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
+#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
+#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
+#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
+#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
+#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
+#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
+#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
+#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
+#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
+#define PCI_EXP_SLTSTA 26 /* Slot Status */
+#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
+#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
+#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
+#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
+#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
+#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
+#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
+#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
+#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
+#define PCI_EXP_RTCTL 28 /* Root Control */
+#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
+#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
+#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
+#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
+#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
+#define PCI_EXP_RTCAP 30 /* Root Capabilities */
+#define PCI_EXP_RTSTA 32 /* Root Status */
+#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
+#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
+#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
+#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
+#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
+#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
+
+/* Extended Capabilities (PCI-X 2.0 and Express) */
+#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
+#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
+#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
+
+#define PCI_EXT_CAP_ID_ERR 1
+#define PCI_EXT_CAP_ID_VC 2
+#define PCI_EXT_CAP_ID_DSN 3
+#define PCI_EXT_CAP_ID_PWR 4
+#define PCI_EXT_CAP_ID_ARI 14
+#define PCI_EXT_CAP_ID_ATS 15
+#define PCI_EXT_CAP_ID_SRIOV 16
+
+/* Advanced Error Reporting */
+#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
+#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
+#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
+#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
+#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
+#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
+#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
+#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
+#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
+#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
+#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
+#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
+#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
+ /* Same bits as above */
+#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
+ /* Same bits as above */
+#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
+#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
+#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
+#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
+#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
+#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
+#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
+ /* Same bits as above */
+#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
+#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
+#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
+#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
+#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
+#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
+#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
+#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
+/* Correctable Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
+/* Non-fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
+/* Fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
+#define PCI_ERR_ROOT_STATUS 48
+#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
+/* Multi ERR_COR Received */
+#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
+/* ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_UNCOR_RCV 0x00000004
+/* Multi ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_MULTI_UNCOR_RCV 0x00000008
+#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
+#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
+#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
+#define PCI_ERR_ROOT_COR_SRC 52
+#define PCI_ERR_ROOT_SRC 54
+
+/* Virtual Channel */
+#define PCI_VC_PORT_REG1 4
+#define PCI_VC_PORT_REG2 8
+#define PCI_VC_PORT_CTRL 12
+#define PCI_VC_PORT_STATUS 14
+#define PCI_VC_RES_CAP 16
+#define PCI_VC_RES_CTRL 20
+#define PCI_VC_RES_STATUS 26
+
+/* Power Budgeting */
+#define PCI_PWR_DSR 4 /* Data Select Register */
+#define PCI_PWR_DATA 8 /* Data Register */
+#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
+#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
+#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
+#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
+#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
+#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
+#define PCI_PWR_CAP 12 /* Capability */
+#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
+
+/*
+ * Hypertransport sub capability types
+ *
+ * Unfortunately there are both 3 bit and 5 bit capability types defined
+ * in the HT spec, catering for that is a little messy. You probably don't
+ * want to use these directly, just use pci_find_ht_capability() and it
+ * will do the right thing for you.
+ */
+#define HT_3BIT_CAP_MASK 0xE0
+#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
+#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
+
+#define HT_5BIT_CAP_MASK 0xF8
+#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
+#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
+#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
+#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
+#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
+#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
+#define HT_MSI_FLAGS 0x02 /* Offset to flags */
+#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
+#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
+#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
+#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
+#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
+#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
+#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
+#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
+#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
+#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
+#define HT_CAPTYPE_PM 0xE0 /* Hypertransport powermanagement configuration */
+
+/* Alternative Routing-ID Interpretation */
+#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
+#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
+#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
+#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
+#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
+#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
+#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
+#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
+
+/* Address Translation Service */
+#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
+#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
+#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
+#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
+#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
+#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
+#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
+
+/* Single Root I/O Virtualization */
+#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
+#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
+#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
+#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
+#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
+#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
+#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
+#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
+#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
+#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
+#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
+#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
+#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
+#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
+#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
+#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
+#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
+#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
+#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
+#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
+#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
+#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
+#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset */
+#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
+#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
+#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
+#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
+#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
+#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
+
+#endif /* LINUX_PCI_REGS_H */
--
1.7.1
* [Qemu-devel] [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: kvm, joro, qemu-devel, blauwirbel, yamahata, paul,
Eduard - Gabriel Munteanu, avi
The conversion was done using the GNU 'expand' tool (default settings)
to make it obey the QEMU coding style.
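(For reference, GNU expand's default behaviour — tab stops every 8 columns — can be
checked on a sample line; the invocation below is illustrative only, not necessarily
the exact command used for the conversion:)

```shell
# Illustrative: expand with default 8-column tab stops, applied to a
# sample line in the style of this header.
printf '#define PCI_VENDOR_ID\t0x00\n' | expand
```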
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/pci_regs.h | 1330 ++++++++++++++++++++++++++++----------------------------
1 files changed, 665 insertions(+), 665 deletions(-)
rewrite hw/pci_regs.h (90%)
diff --git a/hw/pci_regs.h b/hw/pci_regs.h
dissimilarity index 90%
index dd0bed4..0f9f84c 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -1,665 +1,665 @@
-/*
- * pci_regs.h
- *
- * PCI standard defines
- * Copyright 1994, Drew Eckhardt
- * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
- *
- * For more information, please consult the following manuals (look at
- * http://www.pcisig.com/ for how to get them):
- *
- * PCI BIOS Specification
- * PCI Local Bus Specification
- * PCI to PCI Bridge Specification
- * PCI System Design Guide
- *
- * For hypertransport information, please consult the following manuals
- * from http://www.hypertransport.org
- *
- * The Hypertransport I/O Link Specification
- */
-
-#ifndef LINUX_PCI_REGS_H
-#define LINUX_PCI_REGS_H
-
-/*
- * Under PCI, each device has 256 bytes of configuration address space,
- * of which the first 64 bytes are standardized as follows:
- */
-#define PCI_VENDOR_ID 0x00 /* 16 bits */
-#define PCI_DEVICE_ID 0x02 /* 16 bits */
-#define PCI_COMMAND 0x04 /* 16 bits */
-#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
-#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
-#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
-#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
-#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
-#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
-#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
-#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
-#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
-#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
-#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
-
-#define PCI_STATUS 0x06 /* 16 bits */
-#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
-#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
-#define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
-#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
-#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast-back to back */
-#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
-#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
-#define PCI_STATUS_DEVSEL_FAST 0x000
-#define PCI_STATUS_DEVSEL_MEDIUM 0x200
-#define PCI_STATUS_DEVSEL_SLOW 0x400
-#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
-#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
-#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
-#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
-#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
-
-#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
-#define PCI_REVISION_ID 0x08 /* Revision ID */
-#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
-#define PCI_CLASS_DEVICE 0x0a /* Device class */
-
-#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
-#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
-#define PCI_HEADER_TYPE 0x0e /* 8 bits */
-#define PCI_HEADER_TYPE_NORMAL 0
-#define PCI_HEADER_TYPE_BRIDGE 1
-#define PCI_HEADER_TYPE_CARDBUS 2
-
-#define PCI_BIST 0x0f /* 8 bits */
-#define PCI_BIST_CODE_MASK 0x0f /* Return result */
-#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
-#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
-
-/*
- * Base addresses specify locations in memory or I/O space.
- * Decoded size can be determined by writing a value of
- * 0xffffffff to the register, and reading it back. Only
- * 1 bits are decoded.
- */
-#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
-#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
-#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
-#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
-#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
-#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
-#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
-#define PCI_BASE_ADDRESS_SPACE_IO 0x01
-#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
-#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
-#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
-#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
-#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
-#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
-#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
-#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
-/* bit 1 is reserved if address_space = 1 */
-
-/* Header type 0 (normal devices) */
-#define PCI_CARDBUS_CIS 0x28
-#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
-#define PCI_SUBSYSTEM_ID 0x2e
-#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
-#define PCI_ROM_ADDRESS_ENABLE 0x01
-#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
-
-#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
-
-/* 0x35-0x3b are reserved */
-#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
-#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
-#define PCI_MIN_GNT 0x3e /* 8 bits */
-#define PCI_MAX_LAT 0x3f /* 8 bits */
-
-/* Header type 1 (PCI-to-PCI bridges) */
-#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
-#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
-#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
-#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
-#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
-#define PCI_IO_LIMIT 0x1d
-#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
-#define PCI_IO_RANGE_TYPE_16 0x00
-#define PCI_IO_RANGE_TYPE_32 0x01
-#define PCI_IO_RANGE_MASK (~0x0fUL)
-#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
-#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
-#define PCI_MEMORY_LIMIT 0x22
-#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
-#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
-#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
-#define PCI_PREF_MEMORY_LIMIT 0x26
-#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
-#define PCI_PREF_RANGE_TYPE_32 0x00
-#define PCI_PREF_RANGE_TYPE_64 0x01
-#define PCI_PREF_RANGE_MASK (~0x0fUL)
-#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
-#define PCI_PREF_LIMIT_UPPER32 0x2c
-#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
-#define PCI_IO_LIMIT_UPPER16 0x32
-/* 0x34 same as for htype 0 */
-/* 0x35-0x3b is reserved */
-#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
-/* 0x3c-0x3d are same as for htype 0 */
-#define PCI_BRIDGE_CONTROL 0x3e
-#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
-#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
-#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
-#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
-#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
-#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
-#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
-
-/* Header type 2 (CardBus bridges) */
-#define PCI_CB_CAPABILITY_LIST 0x14
-/* 0x15 reserved */
-#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
-#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
-#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
-#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
-#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
-#define PCI_CB_MEMORY_BASE_0 0x1c
-#define PCI_CB_MEMORY_LIMIT_0 0x20
-#define PCI_CB_MEMORY_BASE_1 0x24
-#define PCI_CB_MEMORY_LIMIT_1 0x28
-#define PCI_CB_IO_BASE_0 0x2c
-#define PCI_CB_IO_BASE_0_HI 0x2e
-#define PCI_CB_IO_LIMIT_0 0x30
-#define PCI_CB_IO_LIMIT_0_HI 0x32
-#define PCI_CB_IO_BASE_1 0x34
-#define PCI_CB_IO_BASE_1_HI 0x36
-#define PCI_CB_IO_LIMIT_1 0x38
-#define PCI_CB_IO_LIMIT_1_HI 0x3a
-#define PCI_CB_IO_RANGE_MASK (~0x03UL)
-/* 0x3c-0x3d are same as for htype 0 */
-#define PCI_CB_BRIDGE_CONTROL 0x3e
-#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
-#define PCI_CB_BRIDGE_CTL_SERR 0x02
-#define PCI_CB_BRIDGE_CTL_ISA 0x04
-#define PCI_CB_BRIDGE_CTL_VGA 0x08
-#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
-#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
-#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
-#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
-#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
-#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
-#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
-#define PCI_CB_SUBSYSTEM_ID 0x42
-#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
-/* 0x48-0x7f reserved */
-
-/* Capability lists */
-
-#define PCI_CAP_LIST_ID 0 /* Capability ID */
-#define PCI_CAP_ID_PM 0x01 /* Power Management */
-#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
-#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
-#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
-#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
-#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
-#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
-#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
-#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
-#define PCI_CAP_ID_DBG 0x0A /* Debug port */
-#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
-#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
-#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
-#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
-#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
-#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
-#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
-#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
-#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
-#define PCI_CAP_SIZEOF 4
-
-/* Power Management Registers */
-
-#define PCI_PM_PMC 2 /* PM Capabilities Register */
-#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
-#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
-#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
-#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
-#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxiliary power support mask */
-#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
-#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
-#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
-#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
-#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
-#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
-#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
-#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
-#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
-#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
-#define PCI_PM_CTRL 4 /* PM control and status register */
-#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
-#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
-#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
-#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
-#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
-#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
-#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
-#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
-#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
-#define PCI_PM_DATA_REGISTER 7 /* (??) */
-#define PCI_PM_SIZEOF 8
-
-/* AGP registers */
-
-#define PCI_AGP_VERSION 2 /* BCD version number */
-#define PCI_AGP_RFU 3 /* Rest of capability flags */
-#define PCI_AGP_STATUS 4 /* Status register */
-#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
-#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
-#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
-#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
-#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
-#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
-#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
-#define PCI_AGP_COMMAND 8 /* Control register */
-#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
-#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
-#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
-#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
-#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
-#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
-#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
-#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
-#define PCI_AGP_SIZEOF 12
-
-/* Vital Product Data */
-
-#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
-#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
-#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
-#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
-
-/* Slot Identification */
-
-#define PCI_SID_ESR 2 /* Expansion Slot Register */
-#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
-#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
-#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
-
-/* Message Signalled Interrupts registers */
-
-#define PCI_MSI_FLAGS 2 /* Various flags */
-#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
-#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
-#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
-#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
-#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
-#define PCI_MSI_RFU 3 /* Rest of capability flags */
-#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
-#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
-#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
-#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
-#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
-#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
-
-/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
-#define PCI_MSIX_FLAGS 2
-#define PCI_MSIX_FLAGS_QSIZE 0x7FF
-#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
-#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
-#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
-
-/* CompactPCI Hotswap Register */
-
-#define PCI_CHSWP_CSR 2 /* Control and Status Register */
-#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
-#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
-#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
-#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
-#define PCI_CHSWP_PI 0x30 /* Programming Interface */
-#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
-#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
-
-/* PCI Advanced Feature registers */
-
-#define PCI_AF_LENGTH 2
-#define PCI_AF_CAP 3
-#define PCI_AF_CAP_TP 0x01
-#define PCI_AF_CAP_FLR 0x02
-#define PCI_AF_CTRL 4
-#define PCI_AF_CTRL_FLR 0x01
-#define PCI_AF_STATUS 5
-#define PCI_AF_STATUS_TP 0x01
-
-/* PCI-X registers */
-
-#define PCI_X_CMD 2 /* Modes & Features */
-#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
-#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
-#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
-#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
-#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
-#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
-#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
- /* Max # of outstanding split transactions */
-#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
-#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
-#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
-#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
-#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
-#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
-#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
-#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
-#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
-#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
-#define PCI_X_STATUS 4 /* PCI-X capabilities */
-#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
-#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
-#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
-#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
-#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
-#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
-#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
-#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
-#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
-#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
-#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
-#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
-#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
-
-/* PCI Express capability registers */
-
-#define PCI_EXP_FLAGS 2 /* Capabilities register */
-#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
-#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
-#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
-#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
-#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
-#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
-#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
-#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
-#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
-#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
-#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
-#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
-#define PCI_EXP_DEVCAP 4 /* Device capabilities */
-#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
-#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
-#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
-#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
-#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
-#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
-#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
-#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
-#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
-#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
-#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
-#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
-#define PCI_EXP_DEVCTL 8 /* Device Control */
-#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
-#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
-#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
-#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
-#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
-#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
-#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
-#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
-#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
-#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
-#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
-#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
-#define PCI_EXP_DEVSTA 10 /* Device Status */
-#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
-#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
-#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
-#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
-#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
-#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
-#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
-#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
-#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
-#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
-#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
-#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
-#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
-#define PCI_EXP_LNKCAP_SDERC 0x00080000 /* Surprise Down Error Reporting Capable */
-#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
-#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
-#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
-#define PCI_EXP_LNKCTL 16 /* Link Control */
-#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
-#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
-#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
-#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
-#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
-#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
-#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
-#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
-#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
-#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Lnk Autonomous Bandwidth Interrupt Enable */
-#define PCI_EXP_LNKSTA 18 /* Link Status */
-#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
-#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Negotiated Link Width */
-#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
-#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
-#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
-#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
-#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
-#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
-#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
-#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
-#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
-#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
-#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
-#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
-#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
-#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
-#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
-#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
-#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
-#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
-#define PCI_EXP_SLTCTL 24 /* Slot Control */
-#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
-#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
-#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
-#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
-#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
-#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
-#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
-#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
-#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
-#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
-#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
-#define PCI_EXP_SLTSTA 26 /* Slot Status */
-#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
-#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
-#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
-#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
-#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
-#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
-#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
-#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
-#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
-#define PCI_EXP_RTCTL 28 /* Root Control */
-#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
-#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
-#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
-#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
-#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
-#define PCI_EXP_RTCAP 30 /* Root Capabilities */
-#define PCI_EXP_RTSTA 32 /* Root Status */
-#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
-#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
-#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
-#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
-#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
-#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
-
-/* Extended Capabilities (PCI-X 2.0 and Express) */
-#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
-#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
-#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
-
-#define PCI_EXT_CAP_ID_ERR 1
-#define PCI_EXT_CAP_ID_VC 2
-#define PCI_EXT_CAP_ID_DSN 3
-#define PCI_EXT_CAP_ID_PWR 4
-#define PCI_EXT_CAP_ID_ARI 14
-#define PCI_EXT_CAP_ID_ATS 15
-#define PCI_EXT_CAP_ID_SRIOV 16
-
-/* Advanced Error Reporting */
-#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
-#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
-#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
-#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
-#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
-#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
-#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
-#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
-#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
-#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
-#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
-#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
-#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
- /* Same bits as above */
-#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
- /* Same bits as above */
-#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
-#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
-#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
-#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
-#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
-#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
-#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
- /* Same bits as above */
-#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
-#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
-#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
-#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
-#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
-#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
-#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
-#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
-/* Correctable Err Reporting Enable */
-#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
-/* Non-fatal Err Reporting Enable */
-#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
-/* Fatal Err Reporting Enable */
-#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
-#define PCI_ERR_ROOT_STATUS 48
-#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
-/* Multi ERR_COR Received */
-#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
-/* ERR_FATAL/NONFATAL Received */
-#define PCI_ERR_ROOT_UNCOR_RCV 0x00000004
-/* Multi ERR_FATAL/NONFATAL Received */
-#define PCI_ERR_ROOT_MULTI_UNCOR_RCV 0x00000008
-#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
-#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
-#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
-#define PCI_ERR_ROOT_COR_SRC 52
-#define PCI_ERR_ROOT_SRC 54
-
-/* Virtual Channel */
-#define PCI_VC_PORT_REG1 4
-#define PCI_VC_PORT_REG2 8
-#define PCI_VC_PORT_CTRL 12
-#define PCI_VC_PORT_STATUS 14
-#define PCI_VC_RES_CAP 16
-#define PCI_VC_RES_CTRL 20
-#define PCI_VC_RES_STATUS 26
-
-/* Power Budgeting */
-#define PCI_PWR_DSR 4 /* Data Select Register */
-#define PCI_PWR_DATA 8 /* Data Register */
-#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
-#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
-#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
-#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
-#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
-#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
-#define PCI_PWR_CAP 12 /* Capability */
-#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
-
-/*
- * Hypertransport sub capability types
- *
- * Unfortunately there are both 3 bit and 5 bit capability types defined
- * in the HT spec, catering for that is a little messy. You probably don't
- * want to use these directly, just use pci_find_ht_capability() and it
- * will do the right thing for you.
- */
-#define HT_3BIT_CAP_MASK 0xE0
-#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
-#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
-
-#define HT_5BIT_CAP_MASK 0xF8
-#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
-#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
-#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
-#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
-#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
-#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
-#define HT_MSI_FLAGS 0x02 /* Offset to flags */
-#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
-#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
-#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
-#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
-#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
-#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
-#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
-#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
-#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
-#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
-#define HT_CAPTYPE_PM 0xE0 /* Hypertransport powermanagement configuration */
-
-/* Alternative Routing-ID Interpretation */
-#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
-#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
-#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
-#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
-#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
-#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
-#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
-#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
-
-/* Address Translation Service */
-#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
-#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
-#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
-#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
-#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
-#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
-#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
-
-/* Single Root I/O Virtualization */
-#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
-#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
-#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
-#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
-#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
-#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
-#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
-#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
-#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
-#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
-#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
-#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
-#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
-#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
-#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
-#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
-#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
-#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
-#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
-#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
-#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
-#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
-#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset*/
-#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
-#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
-#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
-#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
-#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
-#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
-
-#endif /* LINUX_PCI_REGS_H */
+/*
+ * pci_regs.h
+ *
+ * PCI standard defines
+ * Copyright 1994, Drew Eckhardt
+ * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
+ *
+ * For more information, please consult the following manuals (look at
+ * http://www.pcisig.com/ for how to get them):
+ *
+ * PCI BIOS Specification
+ * PCI Local Bus Specification
+ * PCI to PCI Bridge Specification
+ * PCI System Design Guide
+ *
+ * For hypertransport information, please consult the following manuals
+ * from http://www.hypertransport.org
+ *
+ * The Hypertransport I/O Link Specification
+ */
+
+#ifndef LINUX_PCI_REGS_H
+#define LINUX_PCI_REGS_H
+
+/*
+ * Under PCI, each device has 256 bytes of configuration address space,
+ * of which the first 64 bytes are standardized as follows:
+ */
+#define PCI_VENDOR_ID 0x00 /* 16 bits */
+#define PCI_DEVICE_ID 0x02 /* 16 bits */
+#define PCI_COMMAND 0x04 /* 16 bits */
+#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
+#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
+#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
+#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
+#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
+#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
+#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
+#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
+#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
+#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
+#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
+
+#define PCI_STATUS 0x06 /* 16 bits */
+#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
+#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
+#define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
+#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
+#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast-back to back */
+#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
+#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
+#define PCI_STATUS_DEVSEL_FAST 0x000
+#define PCI_STATUS_DEVSEL_MEDIUM 0x200
+#define PCI_STATUS_DEVSEL_SLOW 0x400
+#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
+#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
+#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
+#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
+#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
+
+#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
+#define PCI_REVISION_ID 0x08 /* Revision ID */
+#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
+#define PCI_CLASS_DEVICE 0x0a /* Device class */
+
+#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
+#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
+#define PCI_HEADER_TYPE 0x0e /* 8 bits */
+#define PCI_HEADER_TYPE_NORMAL 0
+#define PCI_HEADER_TYPE_BRIDGE 1
+#define PCI_HEADER_TYPE_CARDBUS 2
+
+#define PCI_BIST 0x0f /* 8 bits */
+#define PCI_BIST_CODE_MASK 0x0f /* Return result */
+#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
+#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
+
+/*
+ * Base addresses specify locations in memory or I/O space.
+ * Decoded size can be determined by writing a value of
+ * 0xffffffff to the register, and reading it back. Only
+ * 1 bits are decoded.
+ */
+#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
+#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
+#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
+#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
+#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
+#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
+#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
+#define PCI_BASE_ADDRESS_SPACE_IO 0x01
+#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
+#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
+#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
+#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
+#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
+#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
+#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
+#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
+/* bit 1 is reserved if address_space = 1 */
+
+/* Header type 0 (normal devices) */
+#define PCI_CARDBUS_CIS 0x28
+#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
+#define PCI_SUBSYSTEM_ID 0x2e
+#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
+#define PCI_ROM_ADDRESS_ENABLE 0x01
+#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
+
+#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
+
+/* 0x35-0x3b are reserved */
+#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
+#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
+#define PCI_MIN_GNT 0x3e /* 8 bits */
+#define PCI_MAX_LAT 0x3f /* 8 bits */
+
+/* Header type 1 (PCI-to-PCI bridges) */
+#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
+#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
+#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
+#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
+#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
+#define PCI_IO_LIMIT 0x1d
+#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
+#define PCI_IO_RANGE_TYPE_16 0x00
+#define PCI_IO_RANGE_TYPE_32 0x01
+#define PCI_IO_RANGE_MASK (~0x0fUL)
+#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
+#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
+#define PCI_MEMORY_LIMIT 0x22
+#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
+#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
+#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
+#define PCI_PREF_MEMORY_LIMIT 0x26
+#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
+#define PCI_PREF_RANGE_TYPE_32 0x00
+#define PCI_PREF_RANGE_TYPE_64 0x01
+#define PCI_PREF_RANGE_MASK (~0x0fUL)
+#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
+#define PCI_PREF_LIMIT_UPPER32 0x2c
+#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
+#define PCI_IO_LIMIT_UPPER16 0x32
+/* 0x34 same as for htype 0 */
+/* 0x35-0x3b is reserved */
+#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
+/* 0x3c-0x3d are same as for htype 0 */
+#define PCI_BRIDGE_CONTROL 0x3e
+#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
+#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
+#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
+#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
+#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
+#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
+#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
+
+/* Header type 2 (CardBus bridges) */
+#define PCI_CB_CAPABILITY_LIST 0x14
+/* 0x15 reserved */
+#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
+#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
+#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
+#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
+#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
+#define PCI_CB_MEMORY_BASE_0 0x1c
+#define PCI_CB_MEMORY_LIMIT_0 0x20
+#define PCI_CB_MEMORY_BASE_1 0x24
+#define PCI_CB_MEMORY_LIMIT_1 0x28
+#define PCI_CB_IO_BASE_0 0x2c
+#define PCI_CB_IO_BASE_0_HI 0x2e
+#define PCI_CB_IO_LIMIT_0 0x30
+#define PCI_CB_IO_LIMIT_0_HI 0x32
+#define PCI_CB_IO_BASE_1 0x34
+#define PCI_CB_IO_BASE_1_HI 0x36
+#define PCI_CB_IO_LIMIT_1 0x38
+#define PCI_CB_IO_LIMIT_1_HI 0x3a
+#define PCI_CB_IO_RANGE_MASK (~0x03UL)
+/* 0x3c-0x3d are same as for htype 0 */
+#define PCI_CB_BRIDGE_CONTROL 0x3e
+#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
+#define PCI_CB_BRIDGE_CTL_SERR 0x02
+#define PCI_CB_BRIDGE_CTL_ISA 0x04
+#define PCI_CB_BRIDGE_CTL_VGA 0x08
+#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
+#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
+#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
+#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
+#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
+#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
+#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
+#define PCI_CB_SUBSYSTEM_ID 0x42
+#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
+/* 0x48-0x7f reserved */
+
+/* Capability lists */
+
+#define PCI_CAP_LIST_ID 0 /* Capability ID */
+#define PCI_CAP_ID_PM 0x01 /* Power Management */
+#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
+#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
+#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
+#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
+#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
+#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
+#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
+#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
+#define PCI_CAP_ID_DBG 0x0A /* Debug port */
+#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
+#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
+#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
+#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
+#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
+#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
+#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
+#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
+#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
+#define PCI_CAP_SIZEOF 4
+
+/* Power Management Registers */
+
+#define PCI_PM_PMC 2 /* PM Capabilities Register */
+#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
+#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
+#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
+#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
+#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxiliary power support mask */
+#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
+#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
+#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
+#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
+#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
+#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
+#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
+#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
+#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
+#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
+#define PCI_PM_CTRL 4 /* PM control and status register */
+#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
+#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
+#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
+#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
+#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
+#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
+#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
+#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
+#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
+#define PCI_PM_DATA_REGISTER 7 /* (??) */
+#define PCI_PM_SIZEOF 8
+
+/* AGP registers */
+
+#define PCI_AGP_VERSION 2 /* BCD version number */
+#define PCI_AGP_RFU 3 /* Rest of capability flags */
+#define PCI_AGP_STATUS 4 /* Status register */
+#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
+#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
+#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
+#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
+#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
+#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
+#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
+#define PCI_AGP_COMMAND 8 /* Control register */
+#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
+#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
+#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
+#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
+#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
+#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
+#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
+#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
+#define PCI_AGP_SIZEOF 12
+
+/* Vital Product Data */
+
+#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
+#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
+#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
+#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
+
+/* Slot Identification */
+
+#define PCI_SID_ESR 2 /* Expansion Slot Register */
+#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
+#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
+#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
+
+/* Message Signalled Interrupts registers */
+
+#define PCI_MSI_FLAGS 2 /* Various flags */
+#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
+#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
+#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
+#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
+#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
+#define PCI_MSI_RFU 3 /* Rest of capability flags */
+#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
+#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
+#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
+#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
+#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
+#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
+
+/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
+#define PCI_MSIX_FLAGS 2
+#define PCI_MSIX_FLAGS_QSIZE 0x7FF
+#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
+#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
+#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
+
+/* CompactPCI Hotswap Register */
+
+#define PCI_CHSWP_CSR 2 /* Control and Status Register */
+#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
+#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
+#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
+#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
+#define PCI_CHSWP_PI 0x30 /* Programming Interface */
+#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
+#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
+
+/* PCI Advanced Feature registers */
+
+#define PCI_AF_LENGTH 2
+#define PCI_AF_CAP 3
+#define PCI_AF_CAP_TP 0x01
+#define PCI_AF_CAP_FLR 0x02
+#define PCI_AF_CTRL 4
+#define PCI_AF_CTRL_FLR 0x01
+#define PCI_AF_STATUS 5
+#define PCI_AF_STATUS_TP 0x01
+
+/* PCI-X registers */
+
+#define PCI_X_CMD 2 /* Modes & Features */
+#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
+#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
+#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
+#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
+#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
+#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
+#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
+ /* Max # of outstanding split transactions */
+#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
+#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
+#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
+#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
+#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
+#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
+#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
+#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
+#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
+#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
+#define PCI_X_STATUS 4 /* PCI-X capabilities */
+#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
+#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
+#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
+#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
+#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
+#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
+#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
+#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
+#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
+#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
+#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
+#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
+#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
+
+/* PCI Express capability registers */
+
+#define PCI_EXP_FLAGS 2 /* Capabilities register */
+#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
+#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
+#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
+#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
+#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
+#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
+#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
+#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
+#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
+#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
+#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
+#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
+#define PCI_EXP_DEVCAP 4 /* Device capabilities */
+#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
+#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
+#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
+#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
+#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
+#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
+#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
+#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
+#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
+#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
+#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
+#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
+#define PCI_EXP_DEVCTL 8 /* Device Control */
+#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
+#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
+#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
+#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
+#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
+#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
+#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
+#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
+#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
+#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
+#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
+#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
+#define PCI_EXP_DEVSTA 10 /* Device Status */
+#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
+#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
+#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
+#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
+#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
+#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
+#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
+#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
+#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
+#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
+#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
+#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
+#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
+#define PCI_EXP_LNKCAP_SDERC	0x00080000 /* Surprise Down Error Reporting Capable */
+#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
+#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
+#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
+#define PCI_EXP_LNKCTL 16 /* Link Control */
+#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
+#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
+#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
+#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
+#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
+#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
+#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
+#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
+#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
+#define PCI_EXP_LNKCTL_LABIE	0x0800	/* Link Autonomous Bandwidth Interrupt Enable */
+#define PCI_EXP_LNKSTA 18 /* Link Status */
+#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
+#define PCI_EXP_LNKSTA_NLW	0x03f0	/* Negotiated Link Width */
+#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
+#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
+#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
+#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
+#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
+#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
+#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
+#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
+#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
+#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
+#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
+#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
+#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
+#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
+#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
+#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
+#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
+#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
+#define PCI_EXP_SLTCTL 24 /* Slot Control */
+#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
+#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
+#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
+#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
+#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
+#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
+#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
+#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
+#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
+#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
+#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
+#define PCI_EXP_SLTSTA 26 /* Slot Status */
+#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
+#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
+#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
+#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
+#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
+#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
+#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
+#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
+#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
+#define PCI_EXP_RTCTL 28 /* Root Control */
+#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
+#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
+#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
+#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
+#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
+#define PCI_EXP_RTCAP 30 /* Root Capabilities */
+#define PCI_EXP_RTSTA 32 /* Root Status */
+#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
+#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
+#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
+#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
+#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
+#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
+
+/* Extended Capabilities (PCI-X 2.0 and Express) */
+#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
+#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
+#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
+
+#define PCI_EXT_CAP_ID_ERR 1
+#define PCI_EXT_CAP_ID_VC 2
+#define PCI_EXT_CAP_ID_DSN 3
+#define PCI_EXT_CAP_ID_PWR 4
+#define PCI_EXT_CAP_ID_ARI 14
+#define PCI_EXT_CAP_ID_ATS 15
+#define PCI_EXT_CAP_ID_SRIOV 16
+
+/* Advanced Error Reporting */
+#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
+#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
+#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
+#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
+#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
+#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
+#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
+#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
+#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
+#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
+#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
+#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
+#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
+ /* Same bits as above */
+#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
+ /* Same bits as above */
+#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
+#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
+#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
+#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
+#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
+#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
+#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
+ /* Same bits as above */
+#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
+#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
+#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
+#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
+#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
+#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
+#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
+#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
+/* Correctable Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
+/* Non-fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
+/* Fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
+#define PCI_ERR_ROOT_STATUS 48
+#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
+/* Multi ERR_COR Received */
+#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
+/* ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_UNCOR_RCV		0x00000004
+/* Multi ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_MULTI_UNCOR_RCV	0x00000008
+#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
+#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
+#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
+#define PCI_ERR_ROOT_COR_SRC 52
+#define PCI_ERR_ROOT_SRC 54
+
+/* Virtual Channel */
+#define PCI_VC_PORT_REG1 4
+#define PCI_VC_PORT_REG2 8
+#define PCI_VC_PORT_CTRL 12
+#define PCI_VC_PORT_STATUS 14
+#define PCI_VC_RES_CAP 16
+#define PCI_VC_RES_CTRL 20
+#define PCI_VC_RES_STATUS 26
+
+/* Power Budgeting */
+#define PCI_PWR_DSR 4 /* Data Select Register */
+#define PCI_PWR_DATA 8 /* Data Register */
+#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
+#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
+#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
+#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
+#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
+#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
+#define PCI_PWR_CAP 12 /* Capability */
+#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
+
+/*
+ * Hypertransport sub capability types
+ *
+ * Unfortunately there are both 3 bit and 5 bit capability types defined
+ * in the HT spec, catering for that is a little messy. You probably don't
+ * want to use these directly, just use pci_find_ht_capability() and it
+ * will do the right thing for you.
+ */
+#define HT_3BIT_CAP_MASK 0xE0
+#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
+#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
+
+#define HT_5BIT_CAP_MASK 0xF8
+#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
+#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
+#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
+#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
+#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
+#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
+#define HT_MSI_FLAGS 0x02 /* Offset to flags */
+#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
+#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
+#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
+#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
+#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
+#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
+#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
+#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
+#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
+#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
+#define HT_CAPTYPE_PM		0xE0	/* Hypertransport power management configuration */
+
+/* Alternative Routing-ID Interpretation */
+#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
+#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
+#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
+#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
+#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
+#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
+#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
+#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
+
+/* Address Translation Service */
+#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
+#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
+#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
+#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
+#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
+#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
+#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
+
+/* Single Root I/O Virtualization */
+#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
+#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
+#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
+#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
+#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
+#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
+#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
+#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
+#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
+#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
+#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
+#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
+#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
+#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
+#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
+#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
+#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
+#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
+#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
+#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
+#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
+#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
+#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset*/
+#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
+#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
+#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
+#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
+#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
+#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
+
+#endif /* LINUX_PCI_REGS_H */
--
1.7.1
* [PATCH 2/7] pci: memory access API and IOMMU support
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
1 sibling, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
PCI devices should access memory through pci_memory_*() instead of
cpu_physical_memory_*(). This also provides support for translation and
access checking in case an IOMMU is emulated.
Memory maps are treated as remote IOTLBs (that is, translation caches
belonging to the IOMMU-aware device itself). Clients (devices) must
provide callbacks for map invalidation in case these maps are
persistent beyond the current I/O context, e.g. AIO DMA transfers.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/pci.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++-
hw/pci.h | 74 +++++++++++++++++++++
hw/pci_internals.h | 12 ++++
qemu-common.h | 1 +
4 files changed, 271 insertions(+), 1 deletions(-)
diff --git a/hw/pci.c b/hw/pci.c
index 2dc1577..b460905 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -158,6 +158,19 @@ static void pci_device_reset(PCIDevice *dev)
pci_update_mappings(dev);
}
+static int pci_no_translate(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms)
+{
+ *paddr = addr;
+ *len = -1;
+
+ return 0;
+}
+
static void pci_bus_reset(void *opaque)
{
PCIBus *bus = opaque;
@@ -220,7 +233,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
{
qbus_create_inplace(&bus->qbus, &pci_bus_info, parent, name);
assert(PCI_FUNC(devfn_min) == 0);
- bus->devfn_min = devfn_min;
+
+ bus->devfn_min = devfn_min;
+ bus->iommu = NULL;
+ bus->translate = pci_no_translate;
/* host bridge */
QLIST_INIT(&bus->child);
@@ -1789,3 +1805,170 @@ static char *pcibus_get_dev_path(DeviceState *dev)
return strdup(path);
}
+void pci_register_iommu(PCIDevice *iommu,
+ PCITranslateFunc *translate)
+{
+ iommu->bus->iommu = iommu;
+ iommu->bus->translate = translate;
+}
+
+void pci_memory_rw(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len,
+ int is_write)
+{
+ int err;
+ unsigned perms;
+ PCIDevice *iommu = dev->bus->iommu;
+ target_phys_addr_t paddr, plen;
+
+ perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+ while (len) {
+ err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+ if (err)
+ return;
+
+ /* The translation might be valid for larger regions. */
+ if (plen > len)
+ plen = len;
+
+ cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+ len -= plen;
+ addr += plen;
+ buf += plen;
+ }
+}
+
+static void pci_memory_register_map(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len,
+ target_phys_addr_t paddr,
+ PCIInvalidateMapFunc *invalidate,
+ void *invalidate_opaque)
+{
+ PCIMemoryMap *map;
+
+ map = qemu_malloc(sizeof(PCIMemoryMap));
+ map->addr = addr;
+ map->len = len;
+ map->paddr = paddr;
+ map->invalidate = invalidate;
+ map->invalidate_opaque = invalidate_opaque;
+
+ QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
+}
+
+static void pci_memory_unregister_map(PCIDevice *dev,
+ target_phys_addr_t paddr,
+ target_phys_addr_t len)
+{
+ PCIMemoryMap *map;
+
+ QLIST_FOREACH(map, &dev->memory_maps, list) {
+ if (map->paddr == paddr && map->len == len) {
+ QLIST_REMOVE(map, list);
+ free(map);
+ }
+ }
+}
+
+void pci_memory_invalidate_range(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len)
+{
+ PCIMemoryMap *map;
+
+ QLIST_FOREACH(map, &dev->memory_maps, list) {
+ if (ranges_overlap(addr, len, map->addr, map->len)) {
+ map->invalidate(map->invalidate_opaque);
+ QLIST_REMOVE(map, list);
+ free(map);
+ }
+ }
+}
+
+void *pci_memory_map(PCIDevice *dev,
+ PCIInvalidateMapFunc *cb,
+ void *opaque,
+ pcibus_t addr,
+ target_phys_addr_t *len,
+ int is_write)
+{
+ int err;
+ unsigned perms;
+ PCIDevice *iommu = dev->bus->iommu;
+ target_phys_addr_t paddr, plen;
+
+ perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+ plen = *len;
+ err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+ if (err)
+ return NULL;
+
+ /*
+ * If this is true, the virtual region is contiguous,
+ * but the translated physical region isn't. We just
+ * clamp *len, much like cpu_physical_memory_map() does.
+ */
+ if (plen < *len)
+ *len = plen;
+
+ /* We treat maps as remote TLBs to cope with stuff like AIO. */
+ if (cb)
+ pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
+
+ return cpu_physical_memory_map(paddr, len, is_write);
+}
+
+void pci_memory_unmap(PCIDevice *dev,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len)
+{
+ cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+ pci_memory_unregister_map(dev, (target_phys_addr_t) buffer, len);
+}
+
+#define DEFINE_PCI_LD(suffix, size) \
+uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr) \
+{ \
+ int err; \
+ target_phys_addr_t paddr, plen; \
+ \
+ err = dev->bus->translate(dev->bus->iommu, dev, \
+ addr, &paddr, &plen, IOMMU_PERM_READ); \
+ if (err || (plen < size / 8)) \
+ return 0; \
+ \
+ return ld##suffix##_phys(paddr); \
+}
+
+#define DEFINE_PCI_ST(suffix, size) \
+void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val) \
+{ \
+ int err; \
+ target_phys_addr_t paddr, plen; \
+ \
+ err = dev->bus->translate(dev->bus->iommu, dev, \
+ addr, &paddr, &plen, IOMMU_PERM_WRITE); \
+ if (err || (plen < size / 8)) \
+ return; \
+ \
+ st##suffix##_phys(paddr, val); \
+}
+
+DEFINE_PCI_LD(ub, 8)
+DEFINE_PCI_LD(uw, 16)
+DEFINE_PCI_LD(l, 32)
+DEFINE_PCI_LD(q, 64)
+
+DEFINE_PCI_ST(b, 8)
+DEFINE_PCI_ST(w, 16)
+DEFINE_PCI_ST(l, 32)
+DEFINE_PCI_ST(q, 64)
+
diff --git a/hw/pci.h b/hw/pci.h
index c551f96..3131016 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -172,6 +172,8 @@ struct PCIDevice {
char *romfile;
ram_addr_t rom_offset;
uint32_t rom_bar;
+
+ QLIST_HEAD(, PCIMemoryMap) memory_maps;
};
PCIDevice *pci_register_device(PCIBus *bus, const char *name,
@@ -391,4 +393,76 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
return !(last2 < first1 || last1 < first2);
}
+/*
+ * Memory I/O and PCI IOMMU definitions.
+ */
+
+#define IOMMU_PERM_READ (1 << 0)
+#define IOMMU_PERM_WRITE (1 << 1)
+#define IOMMU_PERM_RW (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
+
+typedef int PCIInvalidateMapFunc(void *opaque);
+typedef int PCITranslateFunc(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms);
+
+extern void pci_memory_rw(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len,
+ int is_write);
+extern void *pci_memory_map(PCIDevice *dev,
+ PCIInvalidateMapFunc *cb,
+ void *opaque,
+ pcibus_t addr,
+ target_phys_addr_t *len,
+ int is_write);
+extern void pci_memory_unmap(PCIDevice *dev,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len);
+extern void pci_register_iommu(PCIDevice *dev,
+ PCITranslateFunc *translate);
+extern void pci_memory_invalidate_range(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len);
+
+#define DECLARE_PCI_LD(suffix, size) \
+extern uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
+
+#define DECLARE_PCI_ST(suffix, size) \
+extern void pci_st##suffix(PCIDevice *dev, \
+ pcibus_t addr, \
+ uint##size##_t val);
+
+DECLARE_PCI_LD(ub, 8)
+DECLARE_PCI_LD(uw, 16)
+DECLARE_PCI_LD(l, 32)
+DECLARE_PCI_LD(q, 64)
+
+DECLARE_PCI_ST(b, 8)
+DECLARE_PCI_ST(w, 16)
+DECLARE_PCI_ST(l, 32)
+DECLARE_PCI_ST(q, 64)
+
+static inline void pci_memory_read(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len)
+{
+ pci_memory_rw(dev, addr, buf, len, 0);
+}
+
+static inline void pci_memory_write(PCIDevice *dev,
+ pcibus_t addr,
+ const uint8_t *buf,
+ pcibus_t len)
+{
+ pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
+}
+
#endif
diff --git a/hw/pci_internals.h b/hw/pci_internals.h
index e3c93a3..fb134b9 100644
--- a/hw/pci_internals.h
+++ b/hw/pci_internals.h
@@ -33,6 +33,9 @@ struct PCIBus {
Keep a count of the number of devices with raised IRQs. */
int nirq;
int *irq_count;
+
+ PCIDevice *iommu;
+ PCITranslateFunc *translate;
};
struct PCIBridge {
@@ -44,4 +47,13 @@ struct PCIBridge {
const char *bus_name;
};
+struct PCIMemoryMap {
+ pcibus_t addr;
+ pcibus_t len;
+ target_phys_addr_t paddr;
+ PCIInvalidateMapFunc *invalidate;
+ void *invalidate_opaque;
+ QLIST_ENTRY(PCIMemoryMap) list;
+};
+
#endif /* QEMU_PCI_INTERNALS_H */
diff --git a/qemu-common.h b/qemu-common.h
index d735235..8b060e8 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
typedef struct PCIHostState PCIHostState;
typedef struct PCIExpressHost PCIExpressHost;
typedef struct PCIBus PCIBus;
+typedef struct PCIMemoryMap PCIMemoryMap;
typedef struct PCIDevice PCIDevice;
typedef struct PCIBridge PCIBridge;
typedef struct SerialState SerialState;
--
1.7.1
* [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
0 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: kvm, joro, qemu-devel, blauwirbel, yamahata, paul,
Eduard - Gabriel Munteanu, avi
PCI devices should access memory through pci_memory_*() instead of
cpu_physical_memory_*(). This also provides support for translation and
access checking in case an IOMMU is emulated.
Memory maps are treated as remote IOTLBs (that is, translation caches
belonging to the IOMMU-aware device itself). Clients (devices) must
provide callbacks for map invalidation in case these maps are
persistent beyond the current I/O context, e.g. AIO DMA transfers.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/pci.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++-
hw/pci.h | 74 +++++++++++++++++++++
hw/pci_internals.h | 12 ++++
qemu-common.h | 1 +
4 files changed, 271 insertions(+), 1 deletions(-)
diff --git a/hw/pci.c b/hw/pci.c
index 2dc1577..b460905 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -158,6 +158,19 @@ static void pci_device_reset(PCIDevice *dev)
pci_update_mappings(dev);
}
+static int pci_no_translate(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms)
+{
+ *paddr = addr;
+ *len = -1;
+
+ return 0;
+}
+
static void pci_bus_reset(void *opaque)
{
PCIBus *bus = opaque;
@@ -220,7 +233,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
{
qbus_create_inplace(&bus->qbus, &pci_bus_info, parent, name);
assert(PCI_FUNC(devfn_min) == 0);
- bus->devfn_min = devfn_min;
+
+ bus->devfn_min = devfn_min;
+ bus->iommu = NULL;
+ bus->translate = pci_no_translate;
/* host bridge */
QLIST_INIT(&bus->child);
@@ -1789,3 +1805,170 @@ static char *pcibus_get_dev_path(DeviceState *dev)
return strdup(path);
}
+void pci_register_iommu(PCIDevice *iommu,
+ PCITranslateFunc *translate)
+{
+ iommu->bus->iommu = iommu;
+ iommu->bus->translate = translate;
+}
+
+void pci_memory_rw(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len,
+ int is_write)
+{
+ int err;
+ unsigned perms;
+ PCIDevice *iommu = dev->bus->iommu;
+ target_phys_addr_t paddr, plen;
+
+ perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+ while (len) {
+ err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+ if (err)
+ return;
+
+ /* The translation might be valid for larger regions. */
+ if (plen > len)
+ plen = len;
+
+ cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+ len -= plen;
+ addr += plen;
+ buf += plen;
+ }
+}
+
+static void pci_memory_register_map(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len,
+ target_phys_addr_t paddr,
+ PCIInvalidateMapFunc *invalidate,
+ void *invalidate_opaque)
+{
+ PCIMemoryMap *map;
+
+ map = qemu_malloc(sizeof(PCIMemoryMap));
+ map->addr = addr;
+ map->len = len;
+ map->paddr = paddr;
+ map->invalidate = invalidate;
+ map->invalidate_opaque = invalidate_opaque;
+
+ QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
+}
+
+static void pci_memory_unregister_map(PCIDevice *dev,
+ target_phys_addr_t paddr,
+ target_phys_addr_t len)
+{
+ PCIMemoryMap *map;
+
+ QLIST_FOREACH(map, &dev->memory_maps, list) {
+ if (map->paddr == paddr && map->len == len) {
+ QLIST_REMOVE(map, list);
+ free(map);
+ }
+ }
+}
+
+void pci_memory_invalidate_range(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len)
+{
+ PCIMemoryMap *map;
+
+ QLIST_FOREACH(map, &dev->memory_maps, list) {
+ if (ranges_overlap(addr, len, map->addr, map->len)) {
+ map->invalidate(map->invalidate_opaque);
+ QLIST_REMOVE(map, list);
+ free(map);
+ }
+ }
+}
+
+void *pci_memory_map(PCIDevice *dev,
+ PCIInvalidateMapFunc *cb,
+ void *opaque,
+ pcibus_t addr,
+ target_phys_addr_t *len,
+ int is_write)
+{
+ int err;
+ unsigned perms;
+ PCIDevice *iommu = dev->bus->iommu;
+ target_phys_addr_t paddr, plen;
+
+ perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+ plen = *len;
+ err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+ if (err)
+ return NULL;
+
+ /*
+ * If this is true, the virtual region is contiguous,
+ * but the translated physical region isn't. We just
+ * clamp *len, much like cpu_physical_memory_map() does.
+ */
+ if (plen < *len)
+ *len = plen;
+
+ /* We treat maps as remote TLBs to cope with stuff like AIO. */
+ if (cb)
+ pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
+
+ return cpu_physical_memory_map(paddr, len, is_write);
+}
+
+void pci_memory_unmap(PCIDevice *dev,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len)
+{
+ cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+ pci_memory_unregister_map(dev, (target_phys_addr_t) buffer, len);
+}
+
+#define DEFINE_PCI_LD(suffix, size) \
+uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr) \
+{ \
+ int err; \
+ target_phys_addr_t paddr, plen; \
+ \
+ err = dev->bus->translate(dev->bus->iommu, dev, \
+ addr, &paddr, &plen, IOMMU_PERM_READ); \
+ if (err || (plen < size / 8)) \
+ return 0; \
+ \
+ return ld##suffix##_phys(paddr); \
+}
+
+#define DEFINE_PCI_ST(suffix, size) \
+void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val) \
+{ \
+ int err; \
+ target_phys_addr_t paddr, plen; \
+ \
+ err = dev->bus->translate(dev->bus->iommu, dev, \
+ addr, &paddr, &plen, IOMMU_PERM_WRITE); \
+ if (err || (plen < size / 8)) \
+ return; \
+ \
+ st##suffix##_phys(paddr, val); \
+}
+
+DEFINE_PCI_LD(ub, 8)
+DEFINE_PCI_LD(uw, 16)
+DEFINE_PCI_LD(l, 32)
+DEFINE_PCI_LD(q, 64)
+
+DEFINE_PCI_ST(b, 8)
+DEFINE_PCI_ST(w, 16)
+DEFINE_PCI_ST(l, 32)
+DEFINE_PCI_ST(q, 64)
+
diff --git a/hw/pci.h b/hw/pci.h
index c551f96..3131016 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -172,6 +172,8 @@ struct PCIDevice {
char *romfile;
ram_addr_t rom_offset;
uint32_t rom_bar;
+
+ QLIST_HEAD(, PCIMemoryMap) memory_maps;
};
PCIDevice *pci_register_device(PCIBus *bus, const char *name,
@@ -391,4 +393,76 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
return !(last2 < first1 || last1 < first2);
}
+/*
+ * Memory I/O and PCI IOMMU definitions.
+ */
+
+#define IOMMU_PERM_READ (1 << 0)
+#define IOMMU_PERM_WRITE (1 << 1)
+#define IOMMU_PERM_RW (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
+
+typedef int PCIInvalidateMapFunc(void *opaque);
+typedef int PCITranslateFunc(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms);
+
+extern void pci_memory_rw(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len,
+ int is_write);
+extern void *pci_memory_map(PCIDevice *dev,
+ PCIInvalidateMapFunc *cb,
+ void *opaque,
+ pcibus_t addr,
+ target_phys_addr_t *len,
+ int is_write);
+extern void pci_memory_unmap(PCIDevice *dev,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len);
+extern void pci_register_iommu(PCIDevice *dev,
+ PCITranslateFunc *translate);
+extern void pci_memory_invalidate_range(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len);
+
+#define DECLARE_PCI_LD(suffix, size) \
+extern uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
+
+#define DECLARE_PCI_ST(suffix, size) \
+extern void pci_st##suffix(PCIDevice *dev, \
+ pcibus_t addr, \
+ uint##size##_t val);
+
+DECLARE_PCI_LD(ub, 8)
+DECLARE_PCI_LD(uw, 16)
+DECLARE_PCI_LD(l, 32)
+DECLARE_PCI_LD(q, 64)
+
+DECLARE_PCI_ST(b, 8)
+DECLARE_PCI_ST(w, 16)
+DECLARE_PCI_ST(l, 32)
+DECLARE_PCI_ST(q, 64)
+
+static inline void pci_memory_read(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len)
+{
+ pci_memory_rw(dev, addr, buf, len, 0);
+}
+
+static inline void pci_memory_write(PCIDevice *dev,
+ pcibus_t addr,
+ const uint8_t *buf,
+ pcibus_t len)
+{
+ pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
+}
+
#endif
diff --git a/hw/pci_internals.h b/hw/pci_internals.h
index e3c93a3..fb134b9 100644
--- a/hw/pci_internals.h
+++ b/hw/pci_internals.h
@@ -33,6 +33,9 @@ struct PCIBus {
Keep a count of the number of devices with raised IRQs. */
int nirq;
int *irq_count;
+
+ PCIDevice *iommu;
+ PCITranslateFunc *translate;
};
struct PCIBridge {
@@ -44,4 +47,13 @@ struct PCIBridge {
const char *bus_name;
};
+struct PCIMemoryMap {
+ pcibus_t addr;
+ pcibus_t len;
+ target_phys_addr_t paddr;
+ PCIInvalidateMapFunc *invalidate;
+ void *invalidate_opaque;
+ QLIST_ENTRY(PCIMemoryMap) list;
+};
+
#endif /* QEMU_PCI_INTERNALS_H */
diff --git a/qemu-common.h b/qemu-common.h
index d735235..8b060e8 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
typedef struct PCIHostState PCIHostState;
typedef struct PCIExpressHost PCIExpressHost;
typedef struct PCIBus PCIBus;
+typedef struct PCIMemoryMap PCIMemoryMap;
typedef struct PCIDevice PCIDevice;
typedef struct PCIBridge PCIBridge;
typedef struct SerialState SerialState;
--
1.7.1
* [PATCH 3/7] AMD IOMMU emulation
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
0 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
This patch introduces emulation of the AMD IOMMU, as described in the "AMD I/O
Virtualization Technology (IOMMU) Specification".
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
Makefile.target | 2 +-
hw/amd_iommu.c | 663 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
hw/pc.c | 2 +
hw/pci_ids.h | 2 +
hw/pci_regs.h | 1 +
5 files changed, 669 insertions(+), 1 deletions(-)
create mode 100644 hw/amd_iommu.c
diff --git a/Makefile.target b/Makefile.target
index 3ef4666..d4eeccd 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -195,7 +195,7 @@ obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
obj-i386-y += vmmouse.o vmport.o hpet.o applesmc.o
obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
obj-i386-y += debugcon.o multiboot.o
-obj-i386-y += pc_piix.o
+obj-i386-y += pc_piix.o amd_iommu.o
# shared objects
obj-ppc-y = ppc.o
diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c
new file mode 100644
index 0000000..43e0426
--- /dev/null
+++ b/hw/amd_iommu.c
@@ -0,0 +1,663 @@
+/*
+ * AMD IOMMU emulation
+ *
+ * Copyright (c) 2010 Eduard - Gabriel Munteanu
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "pc.h"
+#include "hw.h"
+#include "pci.h"
+#include "qlist.h"
+
+/* Capability registers */
+#define CAPAB_HEADER 0x00
+#define CAPAB_REV_TYPE 0x02
+#define CAPAB_FLAGS 0x03
+#define CAPAB_BAR_LOW 0x04
+#define CAPAB_BAR_HIGH 0x08
+#define CAPAB_RANGE 0x0C
+#define CAPAB_MISC 0x10
+
+#define CAPAB_SIZE 0x14
+#define CAPAB_REG_SIZE 0x04
+
+/* Capability header data */
+#define CAPAB_FLAG_IOTLBSUP (1 << 0)
+#define CAPAB_FLAG_HTTUNNEL (1 << 1)
+#define CAPAB_FLAG_NPCACHE (1 << 2)
+#define CAPAB_INIT_REV (1 << 3)
+#define CAPAB_INIT_TYPE 3
+#define CAPAB_INIT_REV_TYPE (CAPAB_INIT_REV | CAPAB_INIT_TYPE)
+#define CAPAB_INIT_FLAGS (CAPAB_FLAG_NPCACHE | CAPAB_FLAG_HTTUNNEL)
+#define CAPAB_INIT_MISC ((64 << 15) | (48 << 8))
+#define CAPAB_BAR_MASK (~((1UL << 14) - 1))
+
+/* MMIO registers */
+#define MMIO_DEVICE_TABLE 0x0000
+#define MMIO_COMMAND_BASE 0x0008
+#define MMIO_EVENT_BASE 0x0010
+#define MMIO_CONTROL 0x0018
+#define MMIO_EXCL_BASE 0x0020
+#define MMIO_EXCL_LIMIT 0x0028
+#define MMIO_COMMAND_HEAD 0x2000
+#define MMIO_COMMAND_TAIL 0x2008
+#define MMIO_EVENT_HEAD 0x2010
+#define MMIO_EVENT_TAIL 0x2018
+#define MMIO_STATUS 0x2020
+
+#define MMIO_SIZE 0x4000
+
+#define MMIO_DEVTAB_SIZE_MASK ((1ULL << 12) - 1)
+#define MMIO_DEVTAB_BASE_MASK (((1ULL << 52) - 1) & ~MMIO_DEVTAB_SIZE_MASK)
+#define MMIO_DEVTAB_ENTRY_SIZE 32
+#define MMIO_DEVTAB_SIZE_UNIT 4096
+
+#define MMIO_CMDBUF_SIZE_BYTE (MMIO_COMMAND_BASE + 7)
+#define MMIO_CMDBUF_SIZE_MASK 0x0F
+#define MMIO_CMDBUF_BASE_MASK MMIO_DEVTAB_BASE_MASK
+#define MMIO_CMDBUF_DEFAULT_SIZE 8
+#define MMIO_CMDBUF_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F)
+#define MMIO_CMDBUF_TAIL_MASK MMIO_EVTLOG_HEAD_MASK
+
+#define MMIO_EVTLOG_SIZE_BYTE (MMIO_EVENT_BASE + 7)
+#define MMIO_EVTLOG_SIZE_MASK MMIO_CMDBUF_SIZE_MASK
+#define MMIO_EVTLOG_BASE_MASK MMIO_CMDBUF_BASE_MASK
+#define MMIO_EVTLOG_DEFAULT_SIZE MMIO_CMDBUF_DEFAULT_SIZE
+#define MMIO_EVTLOG_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F)
+#define MMIO_EVTLOG_TAIL_MASK MMIO_EVTLOG_HEAD_MASK
+
+#define MMIO_EXCL_BASE_MASK MMIO_DEVTAB_BASE_MASK
+#define MMIO_EXCL_ENABLED_MASK (1ULL << 0)
+#define MMIO_EXCL_ALLOW_MASK (1ULL << 1)
+#define MMIO_EXCL_LIMIT_MASK MMIO_DEVTAB_BASE_MASK
+#define MMIO_EXCL_LIMIT_LOW 0xFFF
+
+#define MMIO_CONTROL_IOMMUEN (1ULL << 0)
+#define MMIO_CONTROL_HTTUNEN (1ULL << 1)
+#define MMIO_CONTROL_EVENTLOGEN (1ULL << 2)
+#define MMIO_CONTROL_EVENTINTEN (1ULL << 3)
+#define MMIO_CONTROL_COMWAITINTEN (1ULL << 4)
+#define MMIO_CONTROL_CMDBUFEN (1ULL << 12)
+
+#define MMIO_STATUS_EVTLOG_OF (1ULL << 0)
+#define MMIO_STATUS_EVTLOG_INTR (1ULL << 1)
+#define MMIO_STATUS_COMWAIT_INTR (1ULL << 2)
+#define MMIO_STATUS_EVTLOG_RUN (1ULL << 3)
+#define MMIO_STATUS_CMDBUF_RUN (1ULL << 4)
+
+#define CMDBUF_ID_BYTE 0x07
+#define CMDBUF_ID_RSHIFT 4
+#define CMDBUF_ENTRY_SIZE 0x10
+
+#define CMD_COMPLETION_WAIT 0x01
+#define CMD_INVAL_DEVTAB_ENTRY 0x02
+#define CMD_INVAL_IOMMU_PAGES 0x03
+#define CMD_INVAL_IOTLB_PAGES 0x04
+#define CMD_INVAL_INTR_TABLE 0x05
+
+#define DEVTAB_ENTRY_SIZE 32
+
+/* Device table entry bits 0:63 */
+#define DEV_VALID (1ULL << 0)
+#define DEV_TRANSLATION_VALID (1ULL << 1)
+#define DEV_MODE_MASK 0x7
+#define DEV_MODE_RSHIFT 9
+#define DEV_PT_ROOT_MASK 0xFFFFFFFFFF000
+#define DEV_PT_ROOT_RSHIFT 12
+#define DEV_PERM_SHIFT 61
+#define DEV_PERM_READ (1ULL << 61)
+#define DEV_PERM_WRITE (1ULL << 62)
+
+/* Device table entry bits 64:127 */
+#define DEV_DOMAIN_ID_MASK ((1ULL << 16) - 1)
+#define DEV_IOTLB_SUPPORT (1ULL << 17)
+#define DEV_SUPPRESS_PF (1ULL << 18)
+#define DEV_SUPPRESS_ALL_PF (1ULL << 19)
+#define DEV_IOCTL_MASK (~3)
+#define DEV_IOCTL_RSHIFT 20
+#define DEV_IOCTL_DENY 0
+#define DEV_IOCTL_PASSTHROUGH 1
+#define DEV_IOCTL_TRANSLATE 2
+#define DEV_CACHE (1ULL << 37)
+#define DEV_SNOOP_DISABLE (1ULL << 38)
+#define DEV_EXCL (1ULL << 39)
+
+/* Event codes and flags, as stored in the info field */
+#define EVENT_ILLEGAL_DEVTAB_ENTRY (0x1U << 24)
+#define EVENT_IOPF (0x2U << 24)
+#define EVENT_IOPF_I (1U << 3)
+#define EVENT_IOPF_PR (1U << 4)
+#define EVENT_IOPF_RW (1U << 5)
+#define EVENT_IOPF_PE (1U << 6)
+#define EVENT_IOPF_RZ (1U << 7)
+#define EVENT_IOPF_TR (1U << 8)
+#define EVENT_DEV_TAB_HW_ERROR (0x3U << 24)
+#define EVENT_PAGE_TAB_HW_ERROR (0x4U << 24)
+#define EVENT_ILLEGAL_COMMAND_ERROR (0x5U << 24)
+#define EVENT_COMMAND_HW_ERROR (0x6U << 24)
+#define EVENT_IOTLB_INV_TIMEOUT (0x7U << 24)
+#define EVENT_INVALID_DEV_REQUEST (0x8U << 24)
+
+#define EVENT_LEN 16
+
+typedef struct AMDIOMMUState {
+ PCIDevice dev;
+
+ int capab_offset;
+ unsigned char *capab;
+
+ int mmio_index;
+ target_phys_addr_t mmio_addr;
+ unsigned char *mmio_buf;
+ int mmio_enabled;
+
+ int enabled;
+ int ats_enabled;
+
+ target_phys_addr_t devtab;
+ size_t devtab_len;
+
+ target_phys_addr_t cmdbuf;
+ int cmdbuf_enabled;
+ size_t cmdbuf_len;
+ size_t cmdbuf_head;
+ size_t cmdbuf_tail;
+ int completion_wait_intr;
+
+ target_phys_addr_t evtlog;
+ int evtlog_enabled;
+ int evtlog_intr;
+ target_phys_addr_t evtlog_len;
+ target_phys_addr_t evtlog_head;
+ target_phys_addr_t evtlog_tail;
+
+ target_phys_addr_t excl_base;
+ target_phys_addr_t excl_limit;
+ int excl_enabled;
+ int excl_allow;
+} AMDIOMMUState;
+
+typedef struct AMDIOMMUEvent {
+ uint16_t devfn;
+ uint16_t reserved;
+ uint16_t domid;
+ uint16_t info;
+ uint64_t addr;
+} __attribute__((packed)) AMDIOMMUEvent;
+
+static void amd_iommu_completion_wait(AMDIOMMUState *st,
+ uint8_t *cmd)
+{
+ uint64_t addr;
+
+ if (cmd[0] & 1) {
+ addr = le64_to_cpu(*(uint64_t *) cmd) & 0xFFFFFFFFFFFF8;
+ cpu_physical_memory_write(addr, cmd + 8, 8);
+ }
+
+ if (cmd[0] & 2) {
+ st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_COMWAIT_INTR;
+ }
+}
+
+static void amd_iommu_invalidate_iotlb(AMDIOMMUState *st,
+ uint8_t *cmd)
+{
+ PCIDevice *dev;
+ PCIBus *bus = st->dev.bus;
+ int bus_num = pci_bus_num(bus);
+ int devfn = *(uint16_t *) cmd;
+
+ dev = pci_find_device(bus, bus_num, PCI_SLOT(devfn), PCI_FUNC(devfn));
+ if (dev) {
+ pci_memory_invalidate_range(dev, 0, -1);
+ }
+}
+
+static void amd_iommu_cmdbuf_run(AMDIOMMUState *st)
+{
+ uint8_t cmd[16];
+ int type;
+
+ if (!st->cmdbuf_enabled) {
+ return;
+ }
+
+ /* Check if there's work to do. */
+ if (st->cmdbuf_head == st->cmdbuf_tail) {
+ return;
+ }
+
+ cpu_physical_memory_read(st->cmdbuf + st->cmdbuf_head, cmd, 16);
+ type = cmd[CMDBUF_ID_BYTE] >> CMDBUF_ID_RSHIFT;
+ switch (type) {
+ case CMD_COMPLETION_WAIT:
+ amd_iommu_completion_wait(st, cmd);
+ break;
+ case CMD_INVAL_DEVTAB_ENTRY:
+ break;
+ case CMD_INVAL_IOMMU_PAGES:
+ break;
+ case CMD_INVAL_IOTLB_PAGES:
+ amd_iommu_invalidate_iotlb(st, cmd);
+ break;
+ case CMD_INVAL_INTR_TABLE:
+ break;
+ default:
+ break;
+ }
+
+ /* Increment and wrap head pointer. */
+ st->cmdbuf_head += CMDBUF_ENTRY_SIZE;
+ if (st->cmdbuf_head >= st->cmdbuf_len) {
+ st->cmdbuf_head = 0;
+ }
+}
+
+static uint32_t amd_iommu_mmio_buf_read(AMDIOMMUState *st,
+ size_t offset,
+ size_t size)
+{
+ ssize_t i;
+ uint32_t ret;
+
+ if (!size) {
+ return 0;
+ }
+
+ ret = st->mmio_buf[offset + size - 1];
+ for (i = size - 2; i >= 0; i--) {
+ ret <<= 8;
+ ret |= st->mmio_buf[offset + i];
+ }
+
+ return ret;
+}
+
+static void amd_iommu_mmio_buf_write(AMDIOMMUState *st,
+ size_t offset,
+ size_t size,
+ uint32_t val)
+{
+ size_t i;
+
+ for (i = 0; i < size; i++) {
+ st->mmio_buf[offset + i] = val & 0xFF;
+ val >>= 8;
+ }
+}
+
+static void amd_iommu_update_mmio(AMDIOMMUState *st,
+ target_phys_addr_t addr)
+{
+ size_t reg = addr & ~0x07;
+ uint64_t *base = (uint64_t *) &st->mmio_buf[reg];
+ uint64_t val = le64_to_cpu(*base);
+
+ switch (reg) {
+ case MMIO_CONTROL:
+ st->enabled = !!(val & MMIO_CONTROL_IOMMUEN);
+ st->ats_enabled = !!(val & MMIO_CONTROL_HTTUNEN);
+ st->evtlog_enabled = st->enabled &&
+ !!(val & MMIO_CONTROL_EVENTLOGEN);
+ st->evtlog_intr = !!(val & MMIO_CONTROL_EVENTINTEN);
+ st->completion_wait_intr = !!(val & MMIO_CONTROL_COMWAITINTEN);
+ st->cmdbuf_enabled = st->enabled &&
+ !!(val & MMIO_CONTROL_CMDBUFEN);
+
+ /* Update status flags depending on the control register. */
+ if (st->cmdbuf_enabled) {
+ st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_CMDBUF_RUN;
+ } else {
+ st->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_CMDBUF_RUN;
+ }
+ if (st->evtlog_enabled) {
+ st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_RUN;
+ } else {
+ st->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_EVTLOG_RUN;
+ }
+
+ amd_iommu_cmdbuf_run(st);
+ break;
+ case MMIO_DEVICE_TABLE:
+ st->devtab = (target_phys_addr_t) (val & MMIO_DEVTAB_BASE_MASK);
+ st->devtab_len = ((val & MMIO_DEVTAB_SIZE_MASK) + 1) *
+ (MMIO_DEVTAB_SIZE_UNIT / MMIO_DEVTAB_ENTRY_SIZE);
+ break;
+ case MMIO_COMMAND_BASE:
+ st->cmdbuf = (target_phys_addr_t) (val & MMIO_CMDBUF_BASE_MASK);
+ st->cmdbuf_len = 1UL << (st->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] &
+ MMIO_CMDBUF_SIZE_MASK);
+ amd_iommu_cmdbuf_run(st);
+ break;
+ case MMIO_COMMAND_HEAD:
+ st->cmdbuf_head = val & MMIO_CMDBUF_HEAD_MASK;
+ amd_iommu_cmdbuf_run(st);
+ break;
+ case MMIO_COMMAND_TAIL:
+ st->cmdbuf_tail = val & MMIO_CMDBUF_TAIL_MASK;
+ amd_iommu_cmdbuf_run(st);
+ break;
+ case MMIO_EVENT_BASE:
+ st->evtlog = (target_phys_addr_t) (val & MMIO_EVTLOG_BASE_MASK);
+ st->evtlog_len = 1UL << (st->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] &
+ MMIO_EVTLOG_SIZE_MASK);
+ break;
+ case MMIO_EVENT_HEAD:
+ st->evtlog_head = val & MMIO_EVTLOG_HEAD_MASK;
+ break;
+ case MMIO_EVENT_TAIL:
+ st->evtlog_tail = val & MMIO_EVTLOG_TAIL_MASK;
+ break;
+ case MMIO_EXCL_BASE:
+ st->excl_base = (target_phys_addr_t) (val & MMIO_EXCL_BASE_MASK);
+ st->excl_enabled = val & MMIO_EXCL_ENABLED_MASK;
+ st->excl_allow = val & MMIO_EXCL_ALLOW_MASK;
+ break;
+ case MMIO_EXCL_LIMIT:
+ st->excl_limit = (target_phys_addr_t) ((val & MMIO_EXCL_LIMIT_MASK) |
+ MMIO_EXCL_LIMIT_LOW);
+ break;
+ default:
+ break;
+ }
+}
+
+static uint32_t amd_iommu_mmio_readb(void *opaque, target_phys_addr_t addr)
+{
+ AMDIOMMUState *st = opaque;
+
+ return amd_iommu_mmio_buf_read(st, addr, 1);
+}
+
+static uint32_t amd_iommu_mmio_readw(void *opaque, target_phys_addr_t addr)
+{
+ AMDIOMMUState *st = opaque;
+
+ return amd_iommu_mmio_buf_read(st, addr, 2);
+}
+
+static uint32_t amd_iommu_mmio_readl(void *opaque, target_phys_addr_t addr)
+{
+ AMDIOMMUState *st = opaque;
+
+ return amd_iommu_mmio_buf_read(st, addr, 4);
+}
+
+static void amd_iommu_mmio_writeb(void *opaque,
+ target_phys_addr_t addr,
+ uint32_t val)
+{
+ AMDIOMMUState *st = opaque;
+
+ amd_iommu_mmio_buf_write(st, addr, 1, val);
+ amd_iommu_update_mmio(st, addr);
+}
+
+static void amd_iommu_mmio_writew(void *opaque,
+ target_phys_addr_t addr,
+ uint32_t val)
+{
+ AMDIOMMUState *st = opaque;
+
+ amd_iommu_mmio_buf_write(st, addr, 2, val);
+ amd_iommu_update_mmio(st, addr);
+}
+
+static void amd_iommu_mmio_writel(void *opaque,
+ target_phys_addr_t addr,
+ uint32_t val)
+{
+ AMDIOMMUState *st = opaque;
+
+ amd_iommu_mmio_buf_write(st, addr, 4, val);
+ amd_iommu_update_mmio(st, addr);
+}
+
+static CPUReadMemoryFunc * const amd_iommu_mmio_read[] = {
+ amd_iommu_mmio_readb,
+ amd_iommu_mmio_readw,
+ amd_iommu_mmio_readl,
+};
+
+static CPUWriteMemoryFunc * const amd_iommu_mmio_write[] = {
+ amd_iommu_mmio_writeb,
+ amd_iommu_mmio_writew,
+ amd_iommu_mmio_writel,
+};
+
+static void amd_iommu_enable_mmio(AMDIOMMUState *st)
+{
+ target_phys_addr_t addr;
+ uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
+
+ st->mmio_index = cpu_register_io_memory(amd_iommu_mmio_read,
+ amd_iommu_mmio_write, st);
+ if (st->mmio_index < 0) {
+ return;
+ }
+
+ addr = le64_to_cpu(*(uint64_t *) &st->capab[CAPAB_BAR_LOW]) & CAPAB_BAR_MASK;
+ cpu_register_physical_memory(addr, MMIO_SIZE, st->mmio_index);
+
+ st->mmio_addr = addr;
+ st->mmio_enabled = 1;
+
+ /* Further changes to the capability are prohibited. */
+ memset(capab_wmask + CAPAB_BAR_LOW, 0x00, CAPAB_REG_SIZE);
+ memset(capab_wmask + CAPAB_BAR_HIGH, 0x00, CAPAB_REG_SIZE);
+}
+
+static void amd_iommu_write_capab(PCIDevice *dev,
+ uint32_t addr, uint32_t val, int len)
+{
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
+
+ pci_default_write_config(dev, addr, val, len);
+
+ if (!st->mmio_enabled && st->capab[CAPAB_BAR_LOW] & 0x1) {
+ amd_iommu_enable_mmio(st);
+ }
+}
+
+static void amd_iommu_reset(DeviceState *dev)
+{
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev.qdev, dev);
+ unsigned char *capab = st->capab;
+ uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
+
+ st->enabled = 0;
+ st->ats_enabled = 0;
+ st->mmio_enabled = 0;
+
+ capab[CAPAB_REV_TYPE] = CAPAB_INIT_REV_TYPE;
+ capab[CAPAB_FLAGS] = CAPAB_INIT_FLAGS;
+ capab[CAPAB_BAR_LOW] = 0;
+ capab[CAPAB_BAR_HIGH] = 0;
+ capab[CAPAB_RANGE] = 0;
+ *((uint32_t *) &capab[CAPAB_MISC]) = cpu_to_le32(CAPAB_INIT_MISC);
+
+ /* Changes to the capability are allowed after system reset. */
+ memset(capab_wmask + CAPAB_BAR_LOW, 0xFF, CAPAB_REG_SIZE);
+ memset(capab_wmask + CAPAB_BAR_HIGH, 0xFF, CAPAB_REG_SIZE);
+
+ memset(st->mmio_buf, 0, MMIO_SIZE);
+ st->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] = MMIO_CMDBUF_DEFAULT_SIZE;
+ st->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] = MMIO_EVTLOG_DEFAULT_SIZE;
+}
+
+static void amd_iommu_log_event(AMDIOMMUState *st, AMDIOMMUEvent *evt)
+{
+ if (!st->evtlog_enabled ||
+ (st->mmio_buf[MMIO_STATUS] & MMIO_STATUS_EVTLOG_OF)) {
+ return;
+ }
+
+ if (st->evtlog_tail >= st->evtlog_len) {
+ st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_OF;
+ return;
+ }
+
+ cpu_physical_memory_write(st->evtlog + st->evtlog_tail,
+ (uint8_t *) evt, EVENT_LEN);
+
+ st->evtlog_tail += EVENT_LEN;
+ st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_INTR;
+}
+
+static void amd_iommu_page_fault(AMDIOMMUState *st,
+ int devfn,
+ unsigned domid,
+ target_phys_addr_t addr,
+ int present,
+ int is_write)
+{
+ AMDIOMMUEvent evt;
+ unsigned info;
+
+ evt.devfn = cpu_to_le16(devfn);
+ evt.reserved = 0;
+ evt.domid = cpu_to_le16(domid);
+ evt.addr = cpu_to_le64(addr);
+
+ info = EVENT_IOPF;
+ if (present) {
+ info |= EVENT_IOPF_PR;
+ }
+ if (is_write) {
+ info |= EVENT_IOPF_RW;
+ }
+ evt.info = cpu_to_le16(info);
+
+ amd_iommu_log_event(st, &evt);
+}
+
+static inline uint64_t amd_iommu_get_perms(uint64_t entry)
+{
+ return (entry & (DEV_PERM_READ | DEV_PERM_WRITE)) >> DEV_PERM_SHIFT;
+}
+
+static int amd_iommu_translate(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms)
+{
+ int devfn, present;
+ target_phys_addr_t entry_addr, pte_addr;
+ uint64_t entry[4], pte, page_offset, pte_perms;
+ unsigned level, domid;
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, iommu);
+
+ if (!st->enabled) {
+ goto no_translation;
+ }
+
+ /* Get device table entry. */
+ devfn = dev->devfn;
+ entry_addr = st->devtab + devfn * DEVTAB_ENTRY_SIZE;
+ cpu_physical_memory_read(entry_addr, (uint8_t *) entry, DEVTAB_ENTRY_SIZE);
+
+ pte = le64_to_cpu(entry[0]);
+ if (!(pte & DEV_VALID) || !(pte & DEV_TRANSLATION_VALID)) {
+ goto no_translation;
+ }
+ domid = le64_to_cpu(entry[1]) & DEV_DOMAIN_ID_MASK;
+ level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
+ while (level > 0) {
+ /*
+ * Check permissions: the bitwise
+ * implication perms -> entry_perms must be true.
+ */
+ pte_perms = amd_iommu_get_perms(pte);
+ present = pte & 1;
+ if (!present || perms != (perms & pte_perms)) {
+ amd_iommu_page_fault(st, devfn, domid, addr,
+ present, !!(perms & IOMMU_PERM_WRITE));
+ return -EPERM;
+ }
+
+ /* Go to the next lower level. */
+ pte_addr = pte & DEV_PT_ROOT_MASK;
+ pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
+ pte = ldq_phys(pte_addr);
+ level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
+ }
+ page_offset = addr & 4095;
+ *paddr = (pte & DEV_PT_ROOT_MASK) + page_offset;
+ *len = 4096 - page_offset;
+
+ return 0;
+
+no_translation:
+ *paddr = addr;
+ *len = -1;
+ return 0;
+}
+
+static int amd_iommu_pci_initfn(PCIDevice *dev)
+{
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
+
+ pci_config_set_vendor_id(st->dev.config, PCI_VENDOR_ID_AMD);
+ pci_config_set_device_id(st->dev.config, PCI_DEVICE_ID_AMD_IOMMU);
+ pci_config_set_class(st->dev.config, PCI_CLASS_SYSTEM_IOMMU);
+
+ /* Secure Device capability */
+ st->capab_offset = pci_add_capability(&st->dev,
+ PCI_CAP_ID_SEC,
+ CAPAB_SIZE);
+ st->capab = st->dev.config + st->capab_offset;
+ dev->config_write = amd_iommu_write_capab;
+
+ /* Allocate backing space for the MMIO registers. */
+ st->mmio_buf = qemu_malloc(MMIO_SIZE);
+
+ pci_register_iommu(dev, amd_iommu_translate);
+
+ return 0;
+}
+
+static const VMStateDescription vmstate_amd_iommu = {
+ .name = "amd-iommu",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .minimum_version_id_old = 1,
+ .fields = (VMStateField []) {
+ VMSTATE_PCI_DEVICE(dev, AMDIOMMUState),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
+static PCIDeviceInfo amd_iommu_pci_info = {
+ .qdev.name = "amd-iommu",
+ .qdev.desc = "AMD IOMMU",
+ .qdev.size = sizeof(AMDIOMMUState),
+ .qdev.reset = amd_iommu_reset,
+ .qdev.vmsd = &vmstate_amd_iommu,
+ .init = amd_iommu_pci_initfn,
+};
+
+static void amd_iommu_register(void)
+{
+ pci_qdev_register(&amd_iommu_pci_info);
+}
+
+device_init(amd_iommu_register);
diff --git a/hw/pc.c b/hw/pc.c
index a96187f..e2456b0 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1068,6 +1068,8 @@ void pc_pci_device_init(PCIBus *pci_bus)
int max_bus;
int bus;
+ pci_create_simple(pci_bus, -1, "amd-iommu");
+
max_bus = drive_get_max_bus(IF_SCSI);
for (bus = 0; bus <= max_bus; bus++) {
pci_create_simple(pci_bus, -1, "lsi53c895a");
diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index 39e9f1d..d790312 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -26,6 +26,7 @@
#define PCI_CLASS_MEMORY_RAM 0x0500
+#define PCI_CLASS_SYSTEM_IOMMU 0x0806
#define PCI_CLASS_SYSTEM_OTHER 0x0880
#define PCI_CLASS_SERIAL_USB 0x0c03
@@ -56,6 +57,7 @@
#define PCI_VENDOR_ID_AMD 0x1022
#define PCI_DEVICE_ID_AMD_LANCE 0x2000
+#define PCI_DEVICE_ID_AMD_IOMMU 0x0000 /* FIXME */
#define PCI_VENDOR_ID_MOTOROLA 0x1057
#define PCI_DEVICE_ID_MOTOROLA_MPC106 0x0002
diff --git a/hw/pci_regs.h b/hw/pci_regs.h
index 0f9f84c..6695e41 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -209,6 +209,7 @@
#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
+#define PCI_CAP_ID_SEC 0x0F /* Secure Device (AMD IOMMU) */
#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
--
1.7.1
+}
+
+static void amd_iommu_mmio_writeb(void *opaque,
+ target_phys_addr_t addr,
+ uint32_t val)
+{
+ AMDIOMMUState *st = opaque;
+
+ amd_iommu_mmio_buf_write(st, addr, 1, val);
+ amd_iommu_update_mmio(st, addr);
+}
+
+static void amd_iommu_mmio_writew(void *opaque,
+ target_phys_addr_t addr,
+ uint32_t val)
+{
+ AMDIOMMUState *st = opaque;
+
+ amd_iommu_mmio_buf_write(st, addr, 2, val);
+ amd_iommu_update_mmio(st, addr);
+}
+
+static void amd_iommu_mmio_writel(void *opaque,
+ target_phys_addr_t addr,
+ uint32_t val)
+{
+ AMDIOMMUState *st = opaque;
+
+ amd_iommu_mmio_buf_write(st, addr, 4, val);
+ amd_iommu_update_mmio(st, addr);
+}
+
+static CPUReadMemoryFunc * const amd_iommu_mmio_read[] = {
+ amd_iommu_mmio_readb,
+ amd_iommu_mmio_readw,
+ amd_iommu_mmio_readl,
+};
+
+static CPUWriteMemoryFunc * const amd_iommu_mmio_write[] = {
+ amd_iommu_mmio_writeb,
+ amd_iommu_mmio_writew,
+ amd_iommu_mmio_writel,
+};
+
+static void amd_iommu_enable_mmio(AMDIOMMUState *st)
+{
+ target_phys_addr_t addr;
+ uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
+
+ st->mmio_index = cpu_register_io_memory(amd_iommu_mmio_read,
+ amd_iommu_mmio_write, st);
+ if (st->mmio_index < 0) {
+ return;
+ }
+
+ addr = le64_to_cpu(*(uint64_t *) &st->capab[CAPAB_BAR_LOW]) & CAPAB_BAR_MASK;
+ cpu_register_physical_memory(addr, MMIO_SIZE, st->mmio_index);
+
+ st->mmio_addr = addr;
+ st->mmio_enabled = 1;
+
+ /* Further changes to the capability are prohibited. */
+ memset(capab_wmask + CAPAB_BAR_LOW, 0x00, CAPAB_REG_SIZE);
+ memset(capab_wmask + CAPAB_BAR_HIGH, 0x00, CAPAB_REG_SIZE);
+}
+
+static void amd_iommu_write_capab(PCIDevice *dev,
+ uint32_t addr, uint32_t val, int len)
+{
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
+
+ pci_default_write_config(dev, addr, val, len);
+
+ if (!st->mmio_enabled && st->capab[CAPAB_BAR_LOW] & 0x1) {
+ amd_iommu_enable_mmio(st);
+ }
+}
+
+static void amd_iommu_reset(DeviceState *dev)
+{
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev.qdev, dev);
+ unsigned char *capab = st->capab;
+ uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
+
+ st->enabled = 0;
+ st->ats_enabled = 0;
+ st->mmio_enabled = 0;
+
+ capab[CAPAB_REV_TYPE] = CAPAB_REV_TYPE;
+ capab[CAPAB_FLAGS] = CAPAB_FLAGS;
+ capab[CAPAB_BAR_LOW] = 0;
+ capab[CAPAB_BAR_HIGH] = 0;
+ capab[CAPAB_RANGE] = 0;
+ *((uint32_t *) &capab[CAPAB_MISC]) = cpu_to_le32(CAPAB_INIT_MISC);
+
+ /* Changes to the capability are allowed after system reset. */
+ memset(capab_wmask + CAPAB_BAR_LOW, 0xFF, CAPAB_REG_SIZE);
+ memset(capab_wmask + CAPAB_BAR_HIGH, 0xFF, CAPAB_REG_SIZE);
+
+ memset(st->mmio_buf, 0, MMIO_SIZE);
+ st->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] = MMIO_CMDBUF_DEFAULT_SIZE;
+ st->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] = MMIO_EVTLOG_DEFAULT_SIZE;
+}
+
+static void amd_iommu_log_event(AMDIOMMUState *st, AMDIOMMUEvent *evt)
+{
+ if (!st->evtlog_enabled ||
+ (st->mmio_buf[MMIO_STATUS] & MMIO_STATUS_EVTLOG_OF)) {
+ return;
+ }
+
+ if (st->evtlog_tail >= st->evtlog_len) {
+ st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_OF;
+ return;
+ }
+
+ cpu_physical_memory_write(st->evtlog + st->evtlog_tail,
+ (uint8_t *) evt, EVENT_LEN);
+
+ st->evtlog_tail += EVENT_LEN;
+ st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_INTR;
+}
+
+static void amd_iommu_page_fault(AMDIOMMUState *st,
+ int devfn,
+ unsigned domid,
+ target_phys_addr_t addr,
+ int present,
+ int is_write)
+{
+ AMDIOMMUEvent evt;
+ unsigned info;
+
+ evt.devfn = cpu_to_le16(devfn);
+ evt.reserved = 0;
+ evt.domid = cpu_to_le16(domid);
+ evt.addr = cpu_to_le64(addr);
+
+ info = EVENT_IOPF;
+ if (present) {
+ info |= EVENT_IOPF_PR;
+ }
+ if (is_write) {
+ info |= EVENT_IOPF_RW;
+ }
+ evt.info = cpu_to_le16(info);
+
+ amd_iommu_log_event(st, &evt);
+}
+
+static inline uint64_t amd_iommu_get_perms(uint64_t entry)
+{
+ return (entry & (DEV_PERM_READ | DEV_PERM_WRITE)) >> DEV_PERM_SHIFT;
+}
+
+/* Translate a bus address for dev; returns 0 on success (filling *paddr
+ * and *len) or -EPERM after logging an I/O page fault event. */
+static int amd_iommu_translate(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms)
+{
+ int devfn, present;
+ target_phys_addr_t entry_addr, pte_addr;
+ uint64_t entry[4], pte, page_offset, pte_perms;
+ unsigned level, domid;
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, iommu);
+
+ if (!st->enabled) {
+ goto no_translation;
+ }
+
+ /* Get device table entry. */
+ devfn = dev->devfn;
+ entry_addr = st->devtab + devfn * DEVTAB_ENTRY_SIZE;
+ cpu_physical_memory_read(entry_addr, (uint8_t *) entry, 32);
+
+ pte = entry[0];
+ if (!(pte & DEV_VALID) || !(pte & DEV_TRANSLATION_VALID)) {
+ goto no_translation;
+ }
+ domid = entry[1] & DEV_DOMAIN_ID_MASK;
+ level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
+ while (level > 0) {
+ /*
+ * Check permissions: the bitwise
+ * implication perms -> entry_perms must be true.
+ */
+ pte_perms = amd_iommu_get_perms(pte);
+ present = pte & 1;
+ if (!present || perms != (perms & pte_perms)) {
+ amd_iommu_page_fault(st, devfn, domid, addr,
+ present, !!(perms & IOMMU_PERM_WRITE));
+ return -EPERM;
+ }
+
+ /* Go to the next lower level. */
+ pte_addr = pte & DEV_PT_ROOT_MASK;
+ pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
+ pte = ldq_phys(pte_addr);
+ level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
+ }
+ page_offset = addr & 4095;
+ *paddr = (pte & DEV_PT_ROOT_MASK) + page_offset;
+ *len = 4096 - page_offset;
+
+ return 0;
+
+no_translation:
+ *paddr = addr;
+ *len = -1;
+ return 0;
+}
+
+static int amd_iommu_pci_initfn(PCIDevice *dev)
+{
+ AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
+
+ pci_config_set_vendor_id(st->dev.config, PCI_VENDOR_ID_AMD);
+ pci_config_set_device_id(st->dev.config, PCI_DEVICE_ID_AMD_IOMMU);
+ pci_config_set_class(st->dev.config, PCI_CLASS_SYSTEM_IOMMU);
+
+ /* Secure Device capability */
+ st->capab_offset = pci_add_capability(&st->dev,
+ PCI_CAP_ID_SEC,
+ CAPAB_SIZE);
+ st->capab = st->dev.config + st->capab_offset;
+ dev->config_write = amd_iommu_write_capab;
+
+ /* Allocate backing space for the MMIO registers. */
+ st->mmio_buf = qemu_malloc(MMIO_SIZE);
+
+ pci_register_iommu(dev, amd_iommu_translate);
+
+ return 0;
+}
+
+static const VMStateDescription vmstate_amd_iommu = {
+ .name = "amd-iommu",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .minimum_version_id_old = 1,
+ .fields = (VMStateField []) {
+ VMSTATE_PCI_DEVICE(dev, AMDIOMMUState),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
+static PCIDeviceInfo amd_iommu_pci_info = {
+ .qdev.name = "amd-iommu",
+ .qdev.desc = "AMD IOMMU",
+ .qdev.size = sizeof(AMDIOMMUState),
+ .qdev.reset = amd_iommu_reset,
+ .qdev.vmsd = &vmstate_amd_iommu,
+ .init = amd_iommu_pci_initfn,
+};
+
+static void amd_iommu_register(void)
+{
+ pci_qdev_register(&amd_iommu_pci_info);
+}
+
+device_init(amd_iommu_register);
diff --git a/hw/pc.c b/hw/pc.c
index a96187f..e2456b0 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1068,6 +1068,8 @@ void pc_pci_device_init(PCIBus *pci_bus)
int max_bus;
int bus;
+ pci_create_simple(pci_bus, -1, "amd-iommu");
+
max_bus = drive_get_max_bus(IF_SCSI);
for (bus = 0; bus <= max_bus; bus++) {
pci_create_simple(pci_bus, -1, "lsi53c895a");
diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index 39e9f1d..d790312 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -26,6 +26,7 @@
#define PCI_CLASS_MEMORY_RAM 0x0500
+#define PCI_CLASS_SYSTEM_IOMMU 0x0806
#define PCI_CLASS_SYSTEM_OTHER 0x0880
#define PCI_CLASS_SERIAL_USB 0x0c03
@@ -56,6 +57,7 @@
#define PCI_VENDOR_ID_AMD 0x1022
#define PCI_DEVICE_ID_AMD_LANCE 0x2000
+#define PCI_DEVICE_ID_AMD_IOMMU 0x0000 /* FIXME */
#define PCI_VENDOR_ID_MOTOROLA 0x1057
#define PCI_DEVICE_ID_MOTOROLA_MPC106 0x0002
diff --git a/hw/pci_regs.h b/hw/pci_regs.h
index 0f9f84c..6695e41 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -209,6 +209,7 @@
#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
+#define PCI_CAP_ID_SEC 0x0F /* Secure Device (AMD IOMMU) */
#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
--
1.7.1
* [PATCH 4/7] ide: use the PCI memory access interface
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
Emulated PCI IDE controllers now use the memory access interface. This
also allows an emulated IOMMU to translate and check accesses.
Map invalidation results in cancelling in-flight DMA transfers. Since the
guest OS can't reliably recover the DMA results once a mapping changes
mid-transfer, cancelling is a fair approximation.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
dma-helpers.c | 46 +++++++++++++++++++++++++++++++++++++++++-----
dma.h | 21 ++++++++++++++++++++-
hw/ide/core.c | 15 ++++++++-------
hw/ide/internal.h | 39 +++++++++++++++++++++++++++++++++++++++
hw/ide/macio.c | 4 ++--
hw/ide/pci.c | 7 +++++++
6 files changed, 117 insertions(+), 15 deletions(-)
diff --git a/dma-helpers.c b/dma-helpers.c
index 712ed89..a0dcdb8 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -10,12 +10,36 @@
#include "dma.h"
#include "block_int.h"
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint)
+static void *qemu_sglist_default_map(void *opaque,
+ QEMUSGInvalMapFunc *inval_cb,
+ void *inval_opaque,
+ target_phys_addr_t addr,
+ target_phys_addr_t *len,
+ int is_write)
+{
+ return cpu_physical_memory_map(addr, len, is_write);
+}
+
+static void qemu_sglist_default_unmap(void *opaque,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len)
+{
+ cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+}
+
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint,
+ QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap, void *opaque)
{
qsg->sg = qemu_malloc(alloc_hint * sizeof(ScatterGatherEntry));
qsg->nsg = 0;
qsg->nalloc = alloc_hint;
qsg->size = 0;
+
+ qsg->map = map ? map : qemu_sglist_default_map;
+ qsg->unmap = unmap ? unmap : qemu_sglist_default_unmap;
+ qsg->opaque = opaque;
}
void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
@@ -73,12 +97,23 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
int i;
for (i = 0; i < dbs->iov.niov; ++i) {
- cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
- dbs->iov.iov[i].iov_len, !dbs->is_write,
- dbs->iov.iov[i].iov_len);
+ dbs->sg->unmap(dbs->sg->opaque,
+ dbs->iov.iov[i].iov_base,
+ dbs->iov.iov[i].iov_len, !dbs->is_write,
+ dbs->iov.iov[i].iov_len);
}
}
+static void dma_bdrv_cancel(void *opaque)
+{
+ DMAAIOCB *dbs = opaque;
+
+ bdrv_aio_cancel(dbs->acb);
+ dma_bdrv_unmap(dbs);
+ qemu_iovec_destroy(&dbs->iov);
+ qemu_aio_release(dbs);
+}
+
static void dma_bdrv_cb(void *opaque, int ret)
{
DMAAIOCB *dbs = (DMAAIOCB *)opaque;
@@ -100,7 +135,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
while (dbs->sg_cur_index < dbs->sg->nsg) {
cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
- mem = cpu_physical_memory_map(cur_addr, &cur_len, !dbs->is_write);
+ mem = dbs->sg->map(dbs->sg->opaque, dma_bdrv_cancel, dbs,
+ cur_addr, &cur_len, !dbs->is_write);
if (!mem)
break;
qemu_iovec_add(&dbs->iov, mem, cur_len);
diff --git a/dma.h b/dma.h
index f3bb275..d48f35c 100644
--- a/dma.h
+++ b/dma.h
@@ -15,6 +15,19 @@
#include "hw/hw.h"
#include "block.h"
+typedef void QEMUSGInvalMapFunc(void *opaque);
+typedef void *QEMUSGMapFunc(void *opaque,
+ QEMUSGInvalMapFunc *inval_cb,
+ void *inval_opaque,
+ target_phys_addr_t addr,
+ target_phys_addr_t *len,
+ int is_write);
+typedef void QEMUSGUnmapFunc(void *opaque,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len);
+
typedef struct {
target_phys_addr_t base;
target_phys_addr_t len;
@@ -25,9 +38,15 @@ typedef struct {
int nsg;
int nalloc;
target_phys_addr_t size;
+
+ QEMUSGMapFunc *map;
+ QEMUSGUnmapFunc *unmap;
+ void *opaque;
} QEMUSGList;
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint);
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint,
+ QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap,
+ void *opaque);
void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
target_phys_addr_t len);
void qemu_sglist_destroy(QEMUSGList *qsg);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index af52c2c..024a125 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -436,7 +436,8 @@ static int dma_buf_prepare(BMDMAState *bm, int is_write)
} prd;
int l, len;
- qemu_sglist_init(&s->sg, s->nsector / (IDE_PAGE_SIZE / 512) + 1);
+ qemu_sglist_init(&s->sg, s->nsector / (IDE_PAGE_SIZE / 512) + 1,
+ bm->map, bm->unmap, bm->opaque);
s->io_buffer_size = 0;
for(;;) {
if (bm->cur_prd_len == 0) {
@@ -444,7 +445,7 @@ static int dma_buf_prepare(BMDMAState *bm, int is_write)
if (bm->cur_prd_last ||
(bm->cur_addr - bm->addr) >= IDE_PAGE_SIZE)
return s->io_buffer_size != 0;
- cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8);
+ bmdma_memory_read(bm, bm->cur_addr, (uint8_t *)&prd, 8);
bm->cur_addr += 8;
prd.addr = le32_to_cpu(prd.addr);
prd.size = le32_to_cpu(prd.size);
@@ -527,7 +528,7 @@ static int dma_buf_rw(BMDMAState *bm, int is_write)
if (bm->cur_prd_last ||
(bm->cur_addr - bm->addr) >= IDE_PAGE_SIZE)
return 0;
- cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8);
+ bmdma_memory_read(bm, bm->cur_addr, (uint8_t *)&prd, 8);
bm->cur_addr += 8;
prd.addr = le32_to_cpu(prd.addr);
prd.size = le32_to_cpu(prd.size);
@@ -542,11 +543,11 @@ static int dma_buf_rw(BMDMAState *bm, int is_write)
l = bm->cur_prd_len;
if (l > 0) {
if (is_write) {
- cpu_physical_memory_write(bm->cur_prd_addr,
- s->io_buffer + s->io_buffer_index, l);
+ bmdma_memory_write(bm, bm->cur_prd_addr,
+ s->io_buffer + s->io_buffer_index, l);
} else {
- cpu_physical_memory_read(bm->cur_prd_addr,
- s->io_buffer + s->io_buffer_index, l);
+ bmdma_memory_read(bm, bm->cur_prd_addr,
+ s->io_buffer + s->io_buffer_index, l);
}
bm->cur_prd_addr += l;
bm->cur_prd_len -= l;
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 4165543..f686d38 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -477,6 +477,24 @@ struct IDEDeviceInfo {
#define BM_CMD_START 0x01
#define BM_CMD_READ 0x08
+typedef void BMDMAInvalMapFunc(void *opaque);
+typedef void BMDMARWFunc(void *opaque,
+ target_phys_addr_t addr,
+ uint8_t *buf,
+ target_phys_addr_t len,
+ int is_write);
+typedef void *BMDMAMapFunc(void *opaque,
+ BMDMAInvalMapFunc *inval_cb,
+ void *inval_opaque,
+ target_phys_addr_t addr,
+ target_phys_addr_t *len,
+ int is_write);
+typedef void BMDMAUnmapFunc(void *opaque,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len);
+
struct BMDMAState {
uint8_t cmd;
uint8_t status;
@@ -496,8 +514,29 @@ struct BMDMAState {
int64_t sector_num;
uint32_t nsector;
QEMUBH *bh;
+
+ BMDMARWFunc *rw;
+ BMDMAMapFunc *map;
+ BMDMAUnmapFunc *unmap;
+ void *opaque;
};
+static inline void bmdma_memory_read(BMDMAState *bm,
+ target_phys_addr_t addr,
+ uint8_t *buf,
+ target_phys_addr_t len)
+{
+ bm->rw(bm->opaque, addr, buf, len, 0);
+}
+
+static inline void bmdma_memory_write(BMDMAState *bm,
+ target_phys_addr_t addr,
+ uint8_t *buf,
+ target_phys_addr_t len)
+{
+ bm->rw(bm->opaque, addr, buf, len, 1);
+}
+
static inline IDEState *idebus_active_if(IDEBus *bus)
{
return bus->ifs + bus->unit;
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index bd1c73e..962ae13 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
s->io_buffer_size = io->len;
- qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
+ qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
qemu_sglist_add(&s->sg, io->addr, io->len);
io->addr += io->len;
io->len = 0;
@@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
s->io_buffer_index = 0;
s->io_buffer_size = io->len;
- qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
+ qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
qemu_sglist_add(&s->sg, io->addr, io->len);
io->addr += io->len;
io->len = 0;
diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index 4d95cc5..5879044 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
continue;
ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
}
+
+ for (i = 0; i < 2; i++) {
+ d->bmdma[i].rw = (void *) pci_memory_rw;
+ d->bmdma[i].map = (void *) pci_memory_map;
+ d->bmdma[i].unmap = (void *) pci_memory_unmap;
+ d->bmdma[i].opaque = dev;
+ }
}
--
1.7.1
* [PATCH 5/7] rtl8139: use the PCI memory access interface
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/rtl8139.c | 99 ++++++++++++++++++++++++++++++++-------------------------
1 files changed, 56 insertions(+), 43 deletions(-)
diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index d92981d..32dbff3 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -412,12 +412,6 @@ typedef struct RTL8139TallyCounters
uint16_t TxUndrn;
} RTL8139TallyCounters;
-/* Clears all tally counters */
-static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters);
-
-/* Writes tally counters to specified physical memory address */
-static void RTL8139TallyCounters_physical_memory_write(target_phys_addr_t tc_addr, RTL8139TallyCounters* counters);
-
typedef struct RTL8139State {
PCIDevice dev;
uint8_t phys[8]; /* mac address */
@@ -496,6 +490,14 @@ typedef struct RTL8139State {
} RTL8139State;
+/* Clears all tally counters */
+static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters);
+
+/* Writes tally counters to specified physical memory address */
+static void
+RTL8139TallyCounters_physical_memory_write(RTL8139State *s,
+ target_phys_addr_t tc_addr);
+
static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time);
static void prom9346_decode_command(EEprom9346 *eeprom, uint8_t command)
@@ -746,6 +748,8 @@ static int rtl8139_cp_transmitter_enabled(RTL8139State *s)
static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size)
{
+ PCIDevice *dev = &s->dev;
+
if (s->RxBufAddr + size > s->RxBufferSize)
{
int wrapped = MOD2(s->RxBufAddr + size, s->RxBufferSize);
@@ -757,15 +761,15 @@ static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size)
if (size > wrapped)
{
- cpu_physical_memory_write( s->RxBuf + s->RxBufAddr,
- buf, size-wrapped );
+ pci_memory_write(dev, s->RxBuf + s->RxBufAddr,
+ buf, size-wrapped);
}
/* reset buffer pointer */
s->RxBufAddr = 0;
- cpu_physical_memory_write( s->RxBuf + s->RxBufAddr,
- buf + (size-wrapped), wrapped );
+ pci_memory_write(dev, s->RxBuf + s->RxBufAddr,
+ buf + (size-wrapped), wrapped);
s->RxBufAddr = wrapped;
@@ -774,7 +778,7 @@ static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size)
}
/* non-wrapping path or overwrapping enabled */
- cpu_physical_memory_write( s->RxBuf + s->RxBufAddr, buf, size );
+ pci_memory_write(dev, s->RxBuf + s->RxBufAddr, buf, size);
s->RxBufAddr += size;
}
@@ -814,6 +818,7 @@ static int rtl8139_can_receive(VLANClientState *nc)
static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_t size_, int do_interrupt)
{
RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+ PCIDevice *dev = &s->dev;
int size = size_;
uint32_t packet_header = 0;
@@ -968,13 +973,13 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
uint32_t val, rxdw0,rxdw1,rxbufLO,rxbufHI;
- cpu_physical_memory_read(cplus_rx_ring_desc, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_rx_ring_desc, (uint8_t *)&val, 4);
rxdw0 = le32_to_cpu(val);
- cpu_physical_memory_read(cplus_rx_ring_desc+4, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_rx_ring_desc+4, (uint8_t *)&val, 4);
rxdw1 = le32_to_cpu(val);
- cpu_physical_memory_read(cplus_rx_ring_desc+8, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_rx_ring_desc+8, (uint8_t *)&val, 4);
rxbufLO = le32_to_cpu(val);
- cpu_physical_memory_read(cplus_rx_ring_desc+12, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_rx_ring_desc+12, (uint8_t *)&val, 4);
rxbufHI = le32_to_cpu(val);
DEBUG_PRINT(("RTL8139: +++ C+ mode RX descriptor %d %08x %08x %08x %08x\n",
@@ -1019,7 +1024,7 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
target_phys_addr_t rx_addr = rtl8139_addr64(rxbufLO, rxbufHI);
/* receive/copy to target memory */
- cpu_physical_memory_write( rx_addr, buf, size );
+ pci_memory_write(dev, rx_addr, buf, size);
if (s->CpCmd & CPlusRxChkSum)
{
@@ -1032,7 +1037,7 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
#else
val = 0;
#endif
- cpu_physical_memory_write( rx_addr+size, (uint8_t *)&val, 4);
+ pci_memory_write(dev, rx_addr + size, (uint8_t *)&val, 4);
/* first segment of received packet flag */
#define CP_RX_STATUS_FS (1<<29)
@@ -1081,9 +1086,9 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
/* update ring data */
val = cpu_to_le32(rxdw0);
- cpu_physical_memory_write(cplus_rx_ring_desc, (uint8_t *)&val, 4);
+ pci_memory_write(dev, cplus_rx_ring_desc, (uint8_t *)&val, 4);
val = cpu_to_le32(rxdw1);
- cpu_physical_memory_write(cplus_rx_ring_desc+4, (uint8_t *)&val, 4);
+ pci_memory_write(dev, cplus_rx_ring_desc+4, (uint8_t *)&val, 4);
/* update tally counter */
++s->tally_counters.RxOk;
@@ -1279,50 +1284,54 @@ static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters)
counters->TxUndrn = 0;
}
-static void RTL8139TallyCounters_physical_memory_write(target_phys_addr_t tc_addr, RTL8139TallyCounters* tally_counters)
+static void
+RTL8139TallyCounters_physical_memory_write(RTL8139State *s,
+ target_phys_addr_t tc_addr)
{
+ PCIDevice *dev = &s->dev;
+ RTL8139TallyCounters *tally_counters = &s->tally_counters;
uint16_t val16;
uint32_t val32;
uint64_t val64;
val64 = cpu_to_le64(tally_counters->TxOk);
- cpu_physical_memory_write(tc_addr + 0, (uint8_t *)&val64, 8);
+ pci_memory_write(dev, tc_addr + 0, (uint8_t *)&val64, 8);
val64 = cpu_to_le64(tally_counters->RxOk);
- cpu_physical_memory_write(tc_addr + 8, (uint8_t *)&val64, 8);
+ pci_memory_write(dev, tc_addr + 8, (uint8_t *)&val64, 8);
val64 = cpu_to_le64(tally_counters->TxERR);
- cpu_physical_memory_write(tc_addr + 16, (uint8_t *)&val64, 8);
+ pci_memory_write(dev, tc_addr + 16, (uint8_t *)&val64, 8);
val32 = cpu_to_le32(tally_counters->RxERR);
- cpu_physical_memory_write(tc_addr + 24, (uint8_t *)&val32, 4);
+ pci_memory_write(dev, tc_addr + 24, (uint8_t *)&val32, 4);
val16 = cpu_to_le16(tally_counters->MissPkt);
- cpu_physical_memory_write(tc_addr + 28, (uint8_t *)&val16, 2);
+ pci_memory_write(dev, tc_addr + 28, (uint8_t *)&val16, 2);
val16 = cpu_to_le16(tally_counters->FAE);
- cpu_physical_memory_write(tc_addr + 30, (uint8_t *)&val16, 2);
+ pci_memory_write(dev, tc_addr + 30, (uint8_t *)&val16, 2);
val32 = cpu_to_le32(tally_counters->Tx1Col);
- cpu_physical_memory_write(tc_addr + 32, (uint8_t *)&val32, 4);
+ pci_memory_write(dev, tc_addr + 32, (uint8_t *)&val32, 4);
val32 = cpu_to_le32(tally_counters->TxMCol);
- cpu_physical_memory_write(tc_addr + 36, (uint8_t *)&val32, 4);
+ pci_memory_write(dev, tc_addr + 36, (uint8_t *)&val32, 4);
val64 = cpu_to_le64(tally_counters->RxOkPhy);
- cpu_physical_memory_write(tc_addr + 40, (uint8_t *)&val64, 8);
+ pci_memory_write(dev, tc_addr + 40, (uint8_t *)&val64, 8);
val64 = cpu_to_le64(tally_counters->RxOkBrd);
- cpu_physical_memory_write(tc_addr + 48, (uint8_t *)&val64, 8);
+ pci_memory_write(dev, tc_addr + 48, (uint8_t *)&val64, 8);
val32 = cpu_to_le32(tally_counters->RxOkMul);
- cpu_physical_memory_write(tc_addr + 56, (uint8_t *)&val32, 4);
+ pci_memory_write(dev, tc_addr + 56, (uint8_t *)&val32, 4);
val16 = cpu_to_le16(tally_counters->TxAbt);
- cpu_physical_memory_write(tc_addr + 60, (uint8_t *)&val16, 2);
+ pci_memory_write(dev, tc_addr + 60, (uint8_t *)&val16, 2);
val16 = cpu_to_le16(tally_counters->TxUndrn);
- cpu_physical_memory_write(tc_addr + 62, (uint8_t *)&val16, 2);
+ pci_memory_write(dev, tc_addr + 62, (uint8_t *)&val16, 2);
}
/* Loads values of tally counters from VM state file */
@@ -1758,6 +1767,8 @@ static void rtl8139_transfer_frame(RTL8139State *s, const uint8_t *buf, int size
static int rtl8139_transmit_one(RTL8139State *s, int descriptor)
{
+ PCIDevice *dev = &s->dev;
+
if (!rtl8139_transmitter_enabled(s))
{
DEBUG_PRINT(("RTL8139: +++ cannot transmit from descriptor %d: transmitter disabled\n",
@@ -1780,7 +1791,7 @@ static int rtl8139_transmit_one(RTL8139State *s, int descriptor)
DEBUG_PRINT(("RTL8139: +++ transmit reading %d bytes from host memory at 0x%08x\n",
txsize, s->TxAddr[descriptor]));
- cpu_physical_memory_read(s->TxAddr[descriptor], txbuffer, txsize);
+ pci_memory_read(dev, s->TxAddr[descriptor], txbuffer, txsize);
/* Mark descriptor as transferred */
s->TxStatus[descriptor] |= TxHostOwns;
@@ -1886,6 +1897,8 @@ static uint16_t ip_checksum(void *data, size_t len)
static int rtl8139_cplus_transmit_one(RTL8139State *s)
{
+ PCIDevice *dev = &s->dev;
+
if (!rtl8139_transmitter_enabled(s))
{
DEBUG_PRINT(("RTL8139: +++ C+ mode: transmitter disabled\n"));
@@ -1911,14 +1924,14 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
uint32_t val, txdw0,txdw1,txbufLO,txbufHI;
- cpu_physical_memory_read(cplus_tx_ring_desc, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_tx_ring_desc, (uint8_t *)&val, 4);
txdw0 = le32_to_cpu(val);
/* TODO: implement VLAN tagging support, VLAN tag data is read to txdw1 */
- cpu_physical_memory_read(cplus_tx_ring_desc+4, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_tx_ring_desc+4, (uint8_t *)&val, 4);
txdw1 = le32_to_cpu(val);
- cpu_physical_memory_read(cplus_tx_ring_desc+8, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_tx_ring_desc+8, (uint8_t *)&val, 4);
txbufLO = le32_to_cpu(val);
- cpu_physical_memory_read(cplus_tx_ring_desc+12, (uint8_t *)&val, 4);
+ pci_memory_read(dev, cplus_tx_ring_desc+12, (uint8_t *)&val, 4);
txbufHI = le32_to_cpu(val);
DEBUG_PRINT(("RTL8139: +++ C+ mode TX descriptor %d %08x %08x %08x %08x\n",
@@ -2025,7 +2038,8 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
DEBUG_PRINT(("RTL8139: +++ C+ mode transmit reading %d bytes from host memory at %016" PRIx64 " to offset %d\n",
txsize, (uint64_t)tx_addr, s->cplus_txbuffer_offset));
- cpu_physical_memory_read(tx_addr, s->cplus_txbuffer + s->cplus_txbuffer_offset, txsize);
+ pci_memory_read(dev, tx_addr,
+ s->cplus_txbuffer + s->cplus_txbuffer_offset, txsize);
s->cplus_txbuffer_offset += txsize;
/* seek to next Rx descriptor */
@@ -2052,10 +2066,10 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
/* update ring data */
val = cpu_to_le32(txdw0);
- cpu_physical_memory_write(cplus_tx_ring_desc, (uint8_t *)&val, 4);
+ pci_memory_write(dev, cplus_tx_ring_desc, (uint8_t *)&val, 4);
/* TODO: implement VLAN tagging support, VLAN tag data is read to txdw1 */
// val = cpu_to_le32(txdw1);
-// cpu_physical_memory_write(cplus_tx_ring_desc+4, &val, 4);
+// pci_memory_write(dev, cplus_tx_ring_desc+4, &val, 4);
/* Now decide if descriptor being processed is holding the last segment of packet */
if (txdw0 & CP_TX_LS)
@@ -2364,7 +2378,6 @@ static void rtl8139_transmit(RTL8139State *s)
static void rtl8139_TxStatus_write(RTL8139State *s, uint32_t txRegOffset, uint32_t val)
{
-
int descriptor = txRegOffset/4;
/* handle C+ transmit mode register configuration */
@@ -2381,7 +2394,7 @@ static void rtl8139_TxStatus_write(RTL8139State *s, uint32_t txRegOffset, uint32
target_phys_addr_t tc_addr = rtl8139_addr64(s->TxStatus[0] & ~0x3f, s->TxStatus[1]);
/* dump tally counters to specified memory location */
- RTL8139TallyCounters_physical_memory_write( tc_addr, &s->tally_counters);
+ RTL8139TallyCounters_physical_memory_write(s, tc_addr);
/* mark dump completed */
s->TxStatus[0] &= ~0x8;
--
1.7.1
* [PATCH 6/7] eepro100: use the PCI memory access interface
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
1 sibling, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/eepro100.c | 86 ++++++++++++++++++++++++++++++--------------------------
1 files changed, 46 insertions(+), 40 deletions(-)
diff --git a/hw/eepro100.c b/hw/eepro100.c
index 2b75c8f..5b7d82a 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -306,10 +306,10 @@ static const uint16_t eepro100_mdi_mask[] = {
};
/* XXX: optimize */
-static void stl_le_phys(target_phys_addr_t addr, uint32_t val)
+static void stl_le_phys(EEPRO100State * s, pcibus_t addr, uint32_t val)
{
val = cpu_to_le32(val);
- cpu_physical_memory_write(addr, (const uint8_t *)&val, sizeof(val));
+ pci_memory_write(&s->dev, addr, (const uint8_t *)&val, sizeof(val));
}
#define POLYNOMIAL 0x04c11db6
@@ -692,12 +692,12 @@ static void dump_statistics(EEPRO100State * s)
* values which really matter.
* Number of data should check configuration!!!
*/
- cpu_physical_memory_write(s->statsaddr,
- (uint8_t *) & s->statistics, s->stats_size);
- stl_le_phys(s->statsaddr + 0, s->statistics.tx_good_frames);
- stl_le_phys(s->statsaddr + 36, s->statistics.rx_good_frames);
- stl_le_phys(s->statsaddr + 48, s->statistics.rx_resource_errors);
- stl_le_phys(s->statsaddr + 60, s->statistics.rx_short_frame_errors);
+ pci_memory_write(&s->dev, s->statsaddr,
+ (uint8_t *) & s->statistics, s->stats_size);
+ stl_le_phys(s, s->statsaddr + 0, s->statistics.tx_good_frames);
+ stl_le_phys(s, s->statsaddr + 36, s->statistics.rx_good_frames);
+ stl_le_phys(s, s->statsaddr + 48, s->statistics.rx_resource_errors);
+ stl_le_phys(s, s->statsaddr + 60, s->statistics.rx_short_frame_errors);
#if 0
stw_le_phys(s->statsaddr + 76, s->statistics.xmt_tco_frames);
stw_le_phys(s->statsaddr + 78, s->statistics.rcv_tco_frames);
@@ -707,7 +707,8 @@ static void dump_statistics(EEPRO100State * s)
static void read_cb(EEPRO100State *s)
{
- cpu_physical_memory_read(s->cb_address, (uint8_t *) &s->tx, sizeof(s->tx));
+ pci_memory_read(&s->dev,
+ s->cb_address, (uint8_t *) &s->tx, sizeof(s->tx));
s->tx.status = le16_to_cpu(s->tx.status);
s->tx.command = le16_to_cpu(s->tx.command);
s->tx.link = le32_to_cpu(s->tx.link);
@@ -737,18 +738,18 @@ static void tx_command(EEPRO100State *s)
}
assert(tcb_bytes <= sizeof(buf));
while (size < tcb_bytes) {
- uint32_t tx_buffer_address = ldl_phys(tbd_address);
- uint16_t tx_buffer_size = lduw_phys(tbd_address + 4);
+ uint32_t tx_buffer_address = pci_ldl(&s->dev, tbd_address);
+ uint16_t tx_buffer_size = pci_lduw(&s->dev, tbd_address + 4);
#if 0
- uint16_t tx_buffer_el = lduw_phys(tbd_address + 6);
+ uint16_t tx_buffer_el = pci_lduw(&s->dev, tbd_address + 6);
#endif
tbd_address += 8;
TRACE(RXTX, logout
("TBD (simplified mode): buffer address 0x%08x, size 0x%04x\n",
tx_buffer_address, tx_buffer_size));
tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
- cpu_physical_memory_read(tx_buffer_address, &buf[size],
- tx_buffer_size);
+ pci_memory_read(&s->dev,
+ tx_buffer_address, &buf[size], tx_buffer_size);
size += tx_buffer_size;
}
if (tbd_array == 0xffffffff) {
@@ -759,16 +760,16 @@ static void tx_command(EEPRO100State *s)
if (s->has_extended_tcb_support && !(s->configuration[6] & BIT(4))) {
/* Extended Flexible TCB. */
for (; tbd_count < 2; tbd_count++) {
- uint32_t tx_buffer_address = ldl_phys(tbd_address);
- uint16_t tx_buffer_size = lduw_phys(tbd_address + 4);
- uint16_t tx_buffer_el = lduw_phys(tbd_address + 6);
+ uint32_t tx_buffer_address = pci_ldl(&s->dev, tbd_address);
+ uint16_t tx_buffer_size = pci_lduw(&s->dev, tbd_address + 4);
+ uint16_t tx_buffer_el = pci_lduw(&s->dev, tbd_address + 6);
tbd_address += 8;
TRACE(RXTX, logout
("TBD (extended flexible mode): buffer address 0x%08x, size 0x%04x\n",
tx_buffer_address, tx_buffer_size));
tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
- cpu_physical_memory_read(tx_buffer_address, &buf[size],
- tx_buffer_size);
+ pci_memory_read(&s->dev,
+ tx_buffer_address, &buf[size], tx_buffer_size);
size += tx_buffer_size;
if (tx_buffer_el & 1) {
break;
@@ -777,16 +778,16 @@ static void tx_command(EEPRO100State *s)
}
tbd_address = tbd_array;
for (; tbd_count < s->tx.tbd_count; tbd_count++) {
- uint32_t tx_buffer_address = ldl_phys(tbd_address);
- uint16_t tx_buffer_size = lduw_phys(tbd_address + 4);
- uint16_t tx_buffer_el = lduw_phys(tbd_address + 6);
+ uint32_t tx_buffer_address = pci_ldl(&s->dev, tbd_address);
+ uint16_t tx_buffer_size = pci_lduw(&s->dev, tbd_address + 4);
+ uint16_t tx_buffer_el = pci_lduw(&s->dev, tbd_address + 6);
tbd_address += 8;
TRACE(RXTX, logout
("TBD (flexible mode): buffer address 0x%08x, size 0x%04x\n",
tx_buffer_address, tx_buffer_size));
tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
- cpu_physical_memory_read(tx_buffer_address, &buf[size],
- tx_buffer_size);
+ pci_memory_read(&s->dev,
+ tx_buffer_address, &buf[size], tx_buffer_size);
size += tx_buffer_size;
if (tx_buffer_el & 1) {
break;
@@ -811,7 +812,7 @@ static void set_multicast_list(EEPRO100State *s)
TRACE(OTHER, logout("multicast list, multicast count = %u\n", multicast_count));
for (i = 0; i < multicast_count; i += 6) {
uint8_t multicast_addr[6];
- cpu_physical_memory_read(s->cb_address + 10 + i, multicast_addr, 6);
+ pci_memory_read(&s->dev, s->cb_address + 10 + i, multicast_addr, 6);
TRACE(OTHER, logout("multicast entry %s\n", nic_dump(multicast_addr, 6)));
unsigned mcast_idx = compute_mcast_idx(multicast_addr);
assert(mcast_idx < 64);
@@ -845,12 +846,14 @@ static void action_command(EEPRO100State *s)
/* Do nothing. */
break;
case CmdIASetup:
- cpu_physical_memory_read(s->cb_address + 8, &s->conf.macaddr.a[0], 6);
+ pci_memory_read(&s->dev,
+ s->cb_address + 8, &s->conf.macaddr.a[0], 6);
TRACE(OTHER, logout("macaddr: %s\n", nic_dump(&s->conf.macaddr.a[0], 6)));
break;
case CmdConfigure:
- cpu_physical_memory_read(s->cb_address + 8, &s->configuration[0],
- sizeof(s->configuration));
+ pci_memory_read(&s->dev,
+ s->cb_address + 8,
+ &s->configuration[0], sizeof(s->configuration));
TRACE(OTHER, logout("configuration: %s\n", nic_dump(&s->configuration[0], 16)));
break;
case CmdMulticastList:
@@ -880,7 +883,7 @@ static void action_command(EEPRO100State *s)
break;
}
/* Write new status. */
- stw_phys(s->cb_address, s->tx.status | ok_status | STATUS_C);
+ pci_stw(&s->dev, s->cb_address, s->tx.status | ok_status | STATUS_C);
if (bit_i) {
/* CU completed action. */
eepro100_cx_interrupt(s);
@@ -947,7 +950,7 @@ static void eepro100_cu_command(EEPRO100State * s, uint8_t val)
/* Dump statistical counters. */
TRACE(OTHER, logout("val=0x%02x (dump stats)\n", val));
dump_statistics(s);
- stl_le_phys(s->statsaddr + s->stats_size, 0xa005);
+ stl_le_phys(s, s->statsaddr + s->stats_size, 0xa005);
break;
case CU_CMD_BASE:
/* Load CU base. */
@@ -958,7 +961,7 @@ static void eepro100_cu_command(EEPRO100State * s, uint8_t val)
/* Dump and reset statistical counters. */
TRACE(OTHER, logout("val=0x%02x (dump stats and reset)\n", val));
dump_statistics(s);
- stl_le_phys(s->statsaddr + s->stats_size, 0xa007);
+ stl_le_phys(s, s->statsaddr + s->stats_size, 0xa007);
memset(&s->statistics, 0, sizeof(s->statistics));
break;
case CU_SRESUME:
@@ -1259,10 +1262,10 @@ static void eepro100_write_port(EEPRO100State * s, uint32_t val)
case PORT_SELFTEST:
TRACE(OTHER, logout("selftest address=0x%08x\n", address));
eepro100_selftest_t data;
- cpu_physical_memory_read(address, (uint8_t *) & data, sizeof(data));
+ pci_memory_read(&s->dev, address, (uint8_t *) & data, sizeof(data));
data.st_sign = 0xffffffff;
data.st_result = 0;
- cpu_physical_memory_write(address, (uint8_t *) & data, sizeof(data));
+ pci_memory_write(&s->dev, address, (uint8_t *) & data, sizeof(data));
break;
case PORT_SELECTIVE_RESET:
TRACE(OTHER, logout("selective reset, selftest address=0x%08x\n", address));
@@ -1721,8 +1724,9 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
}
/* !!! */
eepro100_rx_t rx;
- cpu_physical_memory_read(s->ru_base + s->ru_offset, (uint8_t *) & rx,
- offsetof(eepro100_rx_t, packet));
+ pci_memory_read(&s->dev,
+ s->ru_base + s->ru_offset,
+ (uint8_t *) & rx, offsetof(eepro100_rx_t, packet));
uint16_t rfd_command = le16_to_cpu(rx.command);
uint16_t rfd_size = le16_to_cpu(rx.size);
@@ -1736,9 +1740,11 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
}
TRACE(OTHER, logout("command 0x%04x, link 0x%08x, addr 0x%08x, size %u\n",
rfd_command, rx.link, rx.rx_buf_addr, rfd_size));
- stw_phys(s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, status),
- rfd_status);
- stw_phys(s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, count), size);
+ pci_stw(&s->dev,
+ s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, status),
+ rfd_status);
+ pci_stw(&s->dev,
+ s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, count), size);
/* Early receive interrupt not supported. */
#if 0
eepro100_er_interrupt(s);
@@ -1752,8 +1758,8 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
#if 0
assert(!(s->configuration[17] & BIT(0)));
#endif
- cpu_physical_memory_write(s->ru_base + s->ru_offset +
- offsetof(eepro100_rx_t, packet), buf, size);
+ pci_memory_write(&s->dev, s->ru_base + s->ru_offset +
+ offsetof(eepro100_rx_t, packet), buf, size);
s->statistics.rx_good_frames++;
eepro100_fr_interrupt(s);
s->ru_offset = le32_to_cpu(rx.link);
--
1.7.1
break;
@@ -811,7 +812,7 @@ static void set_multicast_list(EEPRO100State *s)
TRACE(OTHER, logout("multicast list, multicast count = %u\n", multicast_count));
for (i = 0; i < multicast_count; i += 6) {
uint8_t multicast_addr[6];
- cpu_physical_memory_read(s->cb_address + 10 + i, multicast_addr, 6);
+ pci_memory_read(&s->dev, s->cb_address + 10 + i, multicast_addr, 6);
TRACE(OTHER, logout("multicast entry %s\n", nic_dump(multicast_addr, 6)));
unsigned mcast_idx = compute_mcast_idx(multicast_addr);
assert(mcast_idx < 64);
@@ -845,12 +846,14 @@ static void action_command(EEPRO100State *s)
/* Do nothing. */
break;
case CmdIASetup:
- cpu_physical_memory_read(s->cb_address + 8, &s->conf.macaddr.a[0], 6);
+ pci_memory_read(&s->dev,
+ s->cb_address + 8, &s->conf.macaddr.a[0], 6);
TRACE(OTHER, logout("macaddr: %s\n", nic_dump(&s->conf.macaddr.a[0], 6)));
break;
case CmdConfigure:
- cpu_physical_memory_read(s->cb_address + 8, &s->configuration[0],
- sizeof(s->configuration));
+ pci_memory_read(&s->dev,
+ s->cb_address + 8,
+ &s->configuration[0], sizeof(s->configuration));
TRACE(OTHER, logout("configuration: %s\n", nic_dump(&s->configuration[0], 16)));
break;
case CmdMulticastList:
@@ -880,7 +883,7 @@ static void action_command(EEPRO100State *s)
break;
}
/* Write new status. */
- stw_phys(s->cb_address, s->tx.status | ok_status | STATUS_C);
+ pci_stw(&s->dev, s->cb_address, s->tx.status | ok_status | STATUS_C);
if (bit_i) {
/* CU completed action. */
eepro100_cx_interrupt(s);
@@ -947,7 +950,7 @@ static void eepro100_cu_command(EEPRO100State * s, uint8_t val)
/* Dump statistical counters. */
TRACE(OTHER, logout("val=0x%02x (dump stats)\n", val));
dump_statistics(s);
- stl_le_phys(s->statsaddr + s->stats_size, 0xa005);
+ stl_le_phys(s, s->statsaddr + s->stats_size, 0xa005);
break;
case CU_CMD_BASE:
/* Load CU base. */
@@ -958,7 +961,7 @@ static void eepro100_cu_command(EEPRO100State * s, uint8_t val)
/* Dump and reset statistical counters. */
TRACE(OTHER, logout("val=0x%02x (dump stats and reset)\n", val));
dump_statistics(s);
- stl_le_phys(s->statsaddr + s->stats_size, 0xa007);
+ stl_le_phys(s, s->statsaddr + s->stats_size, 0xa007);
memset(&s->statistics, 0, sizeof(s->statistics));
break;
case CU_SRESUME:
@@ -1259,10 +1262,10 @@ static void eepro100_write_port(EEPRO100State * s, uint32_t val)
case PORT_SELFTEST:
TRACE(OTHER, logout("selftest address=0x%08x\n", address));
eepro100_selftest_t data;
- cpu_physical_memory_read(address, (uint8_t *) & data, sizeof(data));
+ pci_memory_read(&s->dev, address, (uint8_t *) & data, sizeof(data));
data.st_sign = 0xffffffff;
data.st_result = 0;
- cpu_physical_memory_write(address, (uint8_t *) & data, sizeof(data));
+ pci_memory_write(&s->dev, address, (uint8_t *) & data, sizeof(data));
break;
case PORT_SELECTIVE_RESET:
TRACE(OTHER, logout("selective reset, selftest address=0x%08x\n", address));
@@ -1721,8 +1724,9 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
}
/* !!! */
eepro100_rx_t rx;
- cpu_physical_memory_read(s->ru_base + s->ru_offset, (uint8_t *) & rx,
- offsetof(eepro100_rx_t, packet));
+ pci_memory_read(&s->dev,
+ s->ru_base + s->ru_offset,
+ (uint8_t *) & rx, offsetof(eepro100_rx_t, packet));
uint16_t rfd_command = le16_to_cpu(rx.command);
uint16_t rfd_size = le16_to_cpu(rx.size);
@@ -1736,9 +1740,11 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
}
TRACE(OTHER, logout("command 0x%04x, link 0x%08x, addr 0x%08x, size %u\n",
rfd_command, rx.link, rx.rx_buf_addr, rfd_size));
- stw_phys(s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, status),
- rfd_status);
- stw_phys(s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, count), size);
+ pci_stw(&s->dev,
+ s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, status),
+ rfd_status);
+ pci_stw(&s->dev,
+ s->ru_base + s->ru_offset + offsetof(eepro100_rx_t, count), size);
/* Early receive interrupt not supported. */
#if 0
eepro100_er_interrupt(s);
@@ -1752,8 +1758,8 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
#if 0
assert(!(s->configuration[17] & BIT(0)));
#endif
- cpu_physical_memory_write(s->ru_base + s->ru_offset +
- offsetof(eepro100_rx_t, packet), buf, size);
+ pci_memory_write(&s->dev, s->ru_base + s->ru_offset +
+ offsetof(eepro100_rx_t, packet), buf, size);
s->statistics.rx_good_frames++;
eepro100_fr_interrupt(s);
s->ru_offset = le32_to_cpu(rx.link);
--
1.7.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* [PATCH 7/7] ac97: use the PCI memory access interface
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 14:54 ` Eduard - Gabriel Munteanu
0 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 14:54 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: malc <av1474@comtv.ru>
---
hw/ac97.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/ac97.c b/hw/ac97.c
index d71072d..bad38fb 100644
--- a/hw/ac97.c
+++ b/hw/ac97.c
@@ -223,7 +223,7 @@ static void fetch_bd (AC97LinkState *s, AC97BusMasterRegs *r)
{
uint8_t b[8];
- cpu_physical_memory_read (r->bdbar + r->civ * 8, b, 8);
+ pci_memory_read (&s->dev, r->bdbar + r->civ * 8, b, 8);
r->bd_valid = 1;
r->bd.addr = le32_to_cpu (*(uint32_t *) &b[0]) & ~3;
r->bd.ctl_len = le32_to_cpu (*(uint32_t *) &b[4]);
@@ -972,7 +972,7 @@ static int write_audio (AC97LinkState *s, AC97BusMasterRegs *r,
while (temp) {
int copied;
to_copy = audio_MIN (temp, sizeof (tmpbuf));
- cpu_physical_memory_read (addr, tmpbuf, to_copy);
+ pci_memory_read (&s->dev, addr, tmpbuf, to_copy);
copied = AUD_write (s->voice_po, tmpbuf, to_copy);
dolog ("write_audio max=%x to_copy=%x copied=%x\n",
max, to_copy, copied);
@@ -1056,7 +1056,7 @@ static int read_audio (AC97LinkState *s, AC97BusMasterRegs *r,
*stop = 1;
break;
}
- cpu_physical_memory_write (addr, tmpbuf, acquired);
+ pci_memory_write (&s->dev, addr, tmpbuf, acquired);
temp -= acquired;
addr += acquired;
nread += acquired;
--
1.7.1
* Re: [PATCH 3/7] AMD IOMMU emulation
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 15:58 ` Blue Swirl
0 siblings, 0 replies; 97+ messages in thread
From: Blue Swirl @ 2010-08-28 15:58 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: mst, joro, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 2:54 PM, Eduard - Gabriel Munteanu
<eduard.munteanu@linux360.ro> wrote:
> This introduces emulation for the AMD IOMMU, described in "AMD I/O
> Virtualization Technology (IOMMU) Specification".
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> ---
> Makefile.target | 2 +-
> hw/amd_iommu.c | 663 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> hw/pc.c | 2 +
> hw/pci_ids.h | 2 +
> hw/pci_regs.h | 1 +
> 5 files changed, 669 insertions(+), 1 deletions(-)
> create mode 100644 hw/amd_iommu.c
>
> diff --git a/Makefile.target b/Makefile.target
> index 3ef4666..d4eeccd 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -195,7 +195,7 @@ obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
> obj-i386-y += vmmouse.o vmport.o hpet.o applesmc.o
> obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
> obj-i386-y += debugcon.o multiboot.o
> -obj-i386-y += pc_piix.o
> +obj-i386-y += pc_piix.o amd_iommu.o
>
> # shared objects
> obj-ppc-y = ppc.o
> diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c
> new file mode 100644
> index 0000000..43e0426
> --- /dev/null
> +++ b/hw/amd_iommu.c
> @@ -0,0 +1,663 @@
> +/*
> + * AMD IOMMU emulation
> + *
> + * Copyright (c) 2010 Eduard - Gabriel Munteanu
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "pc.h"
> +#include "hw.h"
> +#include "pci.h"
> +#include "qlist.h"
> +
> +/* Capability registers */
> +#define CAPAB_HEADER 0x00
> +#define CAPAB_REV_TYPE 0x02
> +#define CAPAB_FLAGS 0x03
> +#define CAPAB_BAR_LOW 0x04
> +#define CAPAB_BAR_HIGH 0x08
> +#define CAPAB_RANGE 0x0C
> +#define CAPAB_MISC 0x10
> +
> +#define CAPAB_SIZE 0x14
> +#define CAPAB_REG_SIZE 0x04
> +
> +/* Capability header data */
> +#define CAPAB_FLAG_IOTLBSUP (1 << 0)
> +#define CAPAB_FLAG_HTTUNNEL (1 << 1)
> +#define CAPAB_FLAG_NPCACHE (1 << 2)
> +#define CAPAB_INIT_REV (1 << 3)
> +#define CAPAB_INIT_TYPE 3
> +#define CAPAB_INIT_REV_TYPE (CAPAB_REV | CAPAB_TYPE)
> +#define CAPAB_INIT_FLAGS (CAPAB_FLAG_NPCACHE | CAPAB_FLAG_HTTUNNEL)
> +#define CAPAB_INIT_MISC (64 << 15) | (48 << 8)
> +#define CAPAB_BAR_MASK ~((1UL << 14) - 1)
> +
> +/* MMIO registers */
> +#define MMIO_DEVICE_TABLE 0x0000
> +#define MMIO_COMMAND_BASE 0x0008
> +#define MMIO_EVENT_BASE 0x0010
> +#define MMIO_CONTROL 0x0018
> +#define MMIO_EXCL_BASE 0x0020
> +#define MMIO_EXCL_LIMIT 0x0028
> +#define MMIO_COMMAND_HEAD 0x2000
> +#define MMIO_COMMAND_TAIL 0x2008
> +#define MMIO_EVENT_HEAD 0x2010
> +#define MMIO_EVENT_TAIL 0x2018
> +#define MMIO_STATUS 0x2020
> +
> +#define MMIO_SIZE 0x4000
> +
> +#define MMIO_DEVTAB_SIZE_MASK ((1ULL << 12) - 1)
> +#define MMIO_DEVTAB_BASE_MASK (((1ULL << 52) - 1) & ~MMIO_DEVTAB_SIZE_MASK)
> +#define MMIO_DEVTAB_ENTRY_SIZE 32
> +#define MMIO_DEVTAB_SIZE_UNIT 4096
> +
> +#define MMIO_CMDBUF_SIZE_BYTE (MMIO_COMMAND_BASE + 7)
> +#define MMIO_CMDBUF_SIZE_MASK 0x0F
> +#define MMIO_CMDBUF_BASE_MASK MMIO_DEVTAB_BASE_MASK
> +#define MMIO_CMDBUF_DEFAULT_SIZE 8
> +#define MMIO_CMDBUF_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F)
> +#define MMIO_CMDBUF_TAIL_MASK MMIO_EVTLOG_HEAD_MASK
> +
> +#define MMIO_EVTLOG_SIZE_BYTE (MMIO_EVENT_BASE + 7)
> +#define MMIO_EVTLOG_SIZE_MASK MMIO_CMDBUF_SIZE_MASK
> +#define MMIO_EVTLOG_BASE_MASK MMIO_CMDBUF_BASE_MASK
> +#define MMIO_EVTLOG_DEFAULT_SIZE MMIO_CMDBUF_DEFAULT_SIZE
> +#define MMIO_EVTLOG_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F)
> +#define MMIO_EVTLOG_TAIL_MASK MMIO_EVTLOG_HEAD_MASK
> +
> +#define MMIO_EXCL_BASE_MASK MMIO_DEVTAB_BASE_MASK
> +#define MMIO_EXCL_ENABLED_MASK (1ULL << 0)
> +#define MMIO_EXCL_ALLOW_MASK (1ULL << 1)
> +#define MMIO_EXCL_LIMIT_MASK MMIO_DEVTAB_BASE_MASK
> +#define MMIO_EXCL_LIMIT_LOW 0xFFF
> +
> +#define MMIO_CONTROL_IOMMUEN (1ULL << 0)
> +#define MMIO_CONTROL_HTTUNEN (1ULL << 1)
> +#define MMIO_CONTROL_EVENTLOGEN (1ULL << 2)
> +#define MMIO_CONTROL_EVENTINTEN (1ULL << 3)
> +#define MMIO_CONTROL_COMWAITINTEN (1ULL << 4)
> +#define MMIO_CONTROL_CMDBUFEN (1ULL << 12)
> +
> +#define MMIO_STATUS_EVTLOG_OF (1ULL << 0)
> +#define MMIO_STATUS_EVTLOG_INTR (1ULL << 1)
> +#define MMIO_STATUS_COMWAIT_INTR (1ULL << 2)
> +#define MMIO_STATUS_EVTLOG_RUN (1ULL << 3)
> +#define MMIO_STATUS_CMDBUF_RUN (1ULL << 4)
> +
> +#define CMDBUF_ID_BYTE 0x07
> +#define CMDBUF_ID_RSHIFT 4
> +#define CMDBUF_ENTRY_SIZE 0x10
> +
> +#define CMD_COMPLETION_WAIT 0x01
> +#define CMD_INVAL_DEVTAB_ENTRY 0x02
> +#define CMD_INVAL_IOMMU_PAGES 0x03
> +#define CMD_INVAL_IOTLB_PAGES 0x04
> +#define CMD_INVAL_INTR_TABLE 0x05
> +
> +#define DEVTAB_ENTRY_SIZE 32
> +
> +/* Device table entry bits 0:63 */
> +#define DEV_VALID (1ULL << 0)
> +#define DEV_TRANSLATION_VALID (1ULL << 1)
> +#define DEV_MODE_MASK 0x7
> +#define DEV_MODE_RSHIFT 9
> +#define DEV_PT_ROOT_MASK 0xFFFFFFFFFF000
> +#define DEV_PT_ROOT_RSHIFT 12
> +#define DEV_PERM_SHIFT 61
> +#define DEV_PERM_READ (1ULL << 61)
> +#define DEV_PERM_WRITE (1ULL << 62)
> +
> +/* Device table entry bits 64:127 */
> +#define DEV_DOMAIN_ID_MASK ((1ULL << 16) - 1)
> +#define DEV_IOTLB_SUPPORT (1ULL << 17)
> +#define DEV_SUPPRESS_PF (1ULL << 18)
> +#define DEV_SUPPRESS_ALL_PF (1ULL << 19)
> +#define DEV_IOCTL_MASK ~3
> +#define DEV_IOCTL_RSHIFT 20
> +#define DEV_IOCTL_DENY 0
> +#define DEV_IOCTL_PASSTHROUGH 1
> +#define DEV_IOCTL_TRANSLATE 2
> +#define DEV_CACHE (1ULL << 37)
> +#define DEV_SNOOP_DISABLE (1ULL << 38)
> +#define DEV_EXCL (1ULL << 39)
> +
> +/* Event codes and flags, as stored in the info field */
> +#define EVENT_ILLEGAL_DEVTAB_ENTRY (0x1U << 24)
> +#define EVENT_IOPF (0x2U << 24)
> +#define EVENT_IOPF_I (1U << 3)
> +#define EVENT_IOPF_PR (1U << 4)
> +#define EVENT_IOPF_RW (1U << 5)
> +#define EVENT_IOPF_PE (1U << 6)
> +#define EVENT_IOPF_RZ (1U << 7)
> +#define EVENT_IOPF_TR (1U << 8)
> +#define EVENT_DEV_TAB_HW_ERROR (0x3U << 24)
> +#define EVENT_PAGE_TAB_HW_ERROR (0x4U << 24)
> +#define EVENT_ILLEGAL_COMMAND_ERROR (0x5U << 24)
> +#define EVENT_COMMAND_HW_ERROR (0x6U << 24)
> +#define EVENT_IOTLB_INV_TIMEOUT (0x7U << 24)
> +#define EVENT_INVALID_DEV_REQUEST (0x8U << 24)
> +
> +#define EVENT_LEN 16
> +
> +typedef struct AMDIOMMUState {
> + PCIDevice dev;
> +
> + int capab_offset;
> + unsigned char *capab;
> +
> + int mmio_index;
> + target_phys_addr_t mmio_addr;
> + unsigned char *mmio_buf;
> + int mmio_enabled;
> +
> + int enabled;
> + int ats_enabled;
> +
> + target_phys_addr_t devtab;
> + size_t devtab_len;
> +
> + target_phys_addr_t cmdbuf;
> + int cmdbuf_enabled;
> + size_t cmdbuf_len;
> + size_t cmdbuf_head;
> + size_t cmdbuf_tail;
> + int completion_wait_intr;
> +
> + target_phys_addr_t evtlog;
> + int evtlog_enabled;
> + int evtlog_intr;
> + target_phys_addr_t evtlog_len;
> + target_phys_addr_t evtlog_head;
> + target_phys_addr_t evtlog_tail;
> +
> + target_phys_addr_t excl_base;
> + target_phys_addr_t excl_limit;
> + int excl_enabled;
> + int excl_allow;
> +} AMDIOMMUState;
> +
> +typedef struct AMDIOMMUEvent {
> + uint16_t devfn;
> + uint16_t reserved;
> + uint16_t domid;
> + uint16_t info;
> + uint64_t addr;
> +} __attribute__((packed)) AMDIOMMUEvent;
> +
> +static void amd_iommu_completion_wait(AMDIOMMUState *st,
> + uint8_t *cmd)
> +{
> + uint64_t addr;
> +
> + if (cmd[0] & 1) {
> + addr = le64_to_cpu(*(uint64_t *) cmd) & 0xFFFFFFFFFFFF8;
> + cpu_physical_memory_write(addr, cmd + 8, 8);
> + }
> +
> + if (cmd[0] & 2)
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_COMWAIT_INTR;
> +}
> +
> +static void amd_iommu_invalidate_iotlb(AMDIOMMUState *st,
> + uint8_t *cmd)
> +{
> + PCIDevice *dev;
> + PCIBus *bus = st->dev.bus;
> + int bus_num = pci_bus_num(bus);
> + int devfn = *(uint16_t *) cmd;
> +
> + dev = pci_find_device(bus, bus_num, PCI_SLOT(devfn), PCI_FUNC(devfn));
> + if (dev) {
> + pci_memory_invalidate_range(dev, 0, -1);
> + }
> +}
> +
> +static void amd_iommu_cmdbuf_run(AMDIOMMUState *st)
> +{
> + uint8_t cmd[16];
> + int type;
> +
> + if (!st->cmdbuf_enabled) {
> + return;
> + }
> +
> + /* Check if there's work to do. */
> + if (st->cmdbuf_head == st->cmdbuf_tail) {
> + return;
> + }
> +
> + cpu_physical_memory_read(st->cmdbuf + st->cmdbuf_head, cmd, 16);
> + type = cmd[CMDBUF_ID_BYTE] >> CMDBUF_ID_RSHIFT;
> + switch (type) {
> + case CMD_COMPLETION_WAIT:
> + amd_iommu_completion_wait(st, cmd);
> + break;
> + case CMD_INVAL_DEVTAB_ENTRY:
> + break;
> + case CMD_INVAL_IOMMU_PAGES:
> + break;
> + case CMD_INVAL_IOTLB_PAGES:
> + amd_iommu_invalidate_iotlb(st, cmd);
> + break;
> + case CMD_INVAL_INTR_TABLE:
> + break;
> + default:
> + break;
> + }
> +
> + /* Increment and wrap head pointer. */
> + st->cmdbuf_head += CMDBUF_ENTRY_SIZE;
> + if (st->cmdbuf_head >= st->cmdbuf_len) {
> + st->cmdbuf_head = 0;
> + }
> +}
> +
> +static uint32_t amd_iommu_mmio_buf_read(AMDIOMMUState *st,
> + size_t offset,
> + size_t size)
> +{
> + ssize_t i;
> + uint32_t ret;
> +
> + if (!size) {
> + return 0;
> + }
> +
> + ret = st->mmio_buf[offset + size - 1];
> + for (i = size - 2; i >= 0; i--) {
> + ret <<= 8;
> + ret |= st->mmio_buf[offset + i];
> + }
> +
> + return ret;
> +}
> +
> +static void amd_iommu_mmio_buf_write(AMDIOMMUState *st,
> + size_t offset,
> + size_t size,
> + uint32_t val)
> +{
> + size_t i;
> +
> + for (i = 0; i < size; i++) {
> + st->mmio_buf[offset + i] = val & 0xFF;
> + val >>= 8;
> + }
> +}
> +
> +static void amd_iommu_update_mmio(AMDIOMMUState *st,
> + target_phys_addr_t addr)
> +{
> + size_t reg = addr & ~0x07;
> + uint64_t *base = (uint64_t *) &st->mmio_buf[reg];
This is still buggy.
> + uint64_t val = le64_to_cpu(*base);
> +
> + switch (reg) {
> + case MMIO_CONTROL:
> + st->enabled = !!(val & MMIO_CONTROL_IOMMUEN);
> + st->ats_enabled = !!(val & MMIO_CONTROL_HTTUNEN);
> + st->evtlog_enabled = st->enabled &&
> + !!(val & MMIO_CONTROL_EVENTLOGEN);
> + st->evtlog_intr = !!(val & MMIO_CONTROL_EVENTINTEN);
> + st->completion_wait_intr = !!(val & MMIO_CONTROL_COMWAITINTEN);
> + st->cmdbuf_enabled = st->enabled &&
> + !!(val & MMIO_CONTROL_CMDBUFEN);
> +
> + /* Update status flags depending on the control register. */
> + if (st->cmdbuf_enabled) {
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_CMDBUF_RUN;
> + } else {
> + st->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_CMDBUF_RUN;
> + }
> + if (st->evtlog_enabled) {
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_RUN;
> + } else {
> + st->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_EVTLOG_RUN;
> + }
> +
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_DEVICE_TABLE:
> + st->devtab = (target_phys_addr_t) (val & MMIO_DEVTAB_BASE_MASK);
> + st->devtab_len = ((val & MMIO_DEVTAB_SIZE_MASK) + 1) *
> + (MMIO_DEVTAB_SIZE_UNIT / MMIO_DEVTAB_ENTRY_SIZE);
> + break;
> + case MMIO_COMMAND_BASE:
> + st->cmdbuf = (target_phys_addr_t) (val & MMIO_CMDBUF_BASE_MASK);
> + st->cmdbuf_len = 1UL << (st->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] &
> + MMIO_CMDBUF_SIZE_MASK);
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_COMMAND_HEAD:
> + st->cmdbuf_head = val & MMIO_CMDBUF_HEAD_MASK;
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_COMMAND_TAIL:
> + st->cmdbuf_tail = val & MMIO_CMDBUF_TAIL_MASK;
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_EVENT_BASE:
> + st->evtlog = (target_phys_addr_t) (val & MMIO_EVTLOG_BASE_MASK);
> + st->evtlog_len = 1UL << (st->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] &
> + MMIO_EVTLOG_SIZE_MASK);
> + break;
> + case MMIO_EVENT_HEAD:
> + st->evtlog_head = val & MMIO_EVTLOG_HEAD_MASK;
> + break;
> + case MMIO_EVENT_TAIL:
> + st->evtlog_tail = val & MMIO_EVTLOG_TAIL_MASK;
> + break;
> + case MMIO_EXCL_BASE:
> + st->excl_base = (target_phys_addr_t) (val & MMIO_EXCL_BASE_MASK);
> + st->excl_enabled = val & MMIO_EXCL_ENABLED_MASK;
> + st->excl_allow = val & MMIO_EXCL_ALLOW_MASK;
> + break;
> + case MMIO_EXCL_LIMIT:
> + st->excl_limit = (target_phys_addr_t) ((val & MMIO_EXCL_LIMIT_MASK) |
> + MMIO_EXCL_LIMIT_LOW);
> + break;
> + default:
> + break;
> + }
> +}
> +
> +static uint32_t amd_iommu_mmio_readb(void *opaque, target_phys_addr_t addr)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + return amd_iommu_mmio_buf_read(st, addr, 1);
> +}
> +
> +static uint32_t amd_iommu_mmio_readw(void *opaque, target_phys_addr_t addr)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + return amd_iommu_mmio_buf_read(st, addr, 2);
> +}
> +
> +static uint32_t amd_iommu_mmio_readl(void *opaque, target_phys_addr_t addr)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + return amd_iommu_mmio_buf_read(st, addr, 4);
> +}
> +
> +static void amd_iommu_mmio_writeb(void *opaque,
> + target_phys_addr_t addr,
> + uint32_t val)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + amd_iommu_mmio_buf_write(st, addr, 1, val);
> + amd_iommu_update_mmio(st, addr);
> +}
> +
> +static void amd_iommu_mmio_writew(void *opaque,
> + target_phys_addr_t addr,
> + uint32_t val)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + amd_iommu_mmio_buf_write(st, addr, 2, val);
> + amd_iommu_update_mmio(st, addr);
> +}
> +
> +static void amd_iommu_mmio_writel(void *opaque,
> + target_phys_addr_t addr,
> + uint32_t val)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + amd_iommu_mmio_buf_write(st, addr, 4, val);
> + amd_iommu_update_mmio(st, addr);
> +}
> +
> +static CPUReadMemoryFunc * const amd_iommu_mmio_read[] = {
> + amd_iommu_mmio_readb,
> + amd_iommu_mmio_readw,
> + amd_iommu_mmio_readl,
> +};
> +
> +static CPUWriteMemoryFunc * const amd_iommu_mmio_write[] = {
> + amd_iommu_mmio_writeb,
> + amd_iommu_mmio_writew,
> + amd_iommu_mmio_writel,
> +};
> +
> +static void amd_iommu_enable_mmio(AMDIOMMUState *st)
> +{
> + target_phys_addr_t addr;
> + uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
> +
> + st->mmio_index = cpu_register_io_memory(amd_iommu_mmio_read,
> + amd_iommu_mmio_write, st);
> + if (st->mmio_index < 0) {
> + return;
> + }
> +
> + addr = le64_to_cpu(*(uint64_t *) &st->capab[CAPAB_BAR_LOW]) & CAPAB_BAR_MASK;
> + cpu_register_physical_memory(addr, MMIO_SIZE, st->mmio_index);
> +
> + st->mmio_addr = addr;
> + st->mmio_enabled = 1;
> +
> + /* Further changes to the capability are prohibited. */
> + memset(capab_wmask + CAPAB_BAR_LOW, 0x00, CAPAB_REG_SIZE);
> + memset(capab_wmask + CAPAB_BAR_HIGH, 0x00, CAPAB_REG_SIZE);
> +}
> +
> +static void amd_iommu_write_capab(PCIDevice *dev,
> + uint32_t addr, uint32_t val, int len)
> +{
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
> +
> + pci_default_write_config(dev, addr, val, len);
> +
> + if (!st->mmio_enabled && st->capab[CAPAB_BAR_LOW] & 0x1) {
> + amd_iommu_enable_mmio(st);
> + }
> +}
> +
> +static void amd_iommu_reset(DeviceState *dev)
> +{
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev.qdev, dev);
> + unsigned char *capab = st->capab;
> + uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
> +
> + st->enabled = 0;
> + st->ats_enabled = 0;
> + st->mmio_enabled = 0;
> +
> + capab[CAPAB_REV_TYPE] = CAPAB_REV_TYPE;
> + capab[CAPAB_FLAGS] = CAPAB_FLAGS;
> + capab[CAPAB_BAR_LOW] = 0;
> + capab[CAPAB_BAR_HIGH] = 0;
> + capab[CAPAB_RANGE] = 0;
> + *((uint32_t *) &capab[CAPAB_MISC]) = cpu_to_le32(CAPAB_INIT_MISC);
> +
> + /* Changes to the capability are allowed after system reset. */
> + memset(capab_wmask + CAPAB_BAR_LOW, 0xFF, CAPAB_REG_SIZE);
> + memset(capab_wmask + CAPAB_BAR_HIGH, 0xFF, CAPAB_REG_SIZE);
> +
> + memset(st->mmio_buf, 0, MMIO_SIZE);
> + st->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] = MMIO_CMDBUF_DEFAULT_SIZE;
> + st->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] = MMIO_EVTLOG_DEFAULT_SIZE;
> +}
> +
> +static void amd_iommu_log_event(AMDIOMMUState *st, AMDIOMMUEvent *evt)
> +{
> + if (!st->evtlog_enabled ||
> + (st->mmio_buf[MMIO_STATUS] | MMIO_STATUS_EVTLOG_OF)) {
> + return;
> + }
> +
> + if (st->evtlog_tail >= st->evtlog_len) {
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_OF;
> + }
> +
> + cpu_physical_memory_write(st->evtlog + st->evtlog_tail,
> + (uint8_t *) evt, EVENT_LEN);
> +
> + st->evtlog_tail += EVENT_LEN;
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_INTR;
> +}
> +
> +static void amd_iommu_page_fault(AMDIOMMUState *st,
> + int devfn,
> + unsigned domid,
> + target_phys_addr_t addr,
> + int present,
> + int is_write)
> +{
> + AMDIOMMUEvent evt;
> + unsigned info;
> +
> + evt.devfn = cpu_to_le16(devfn);
> + evt.reserved = 0;
> + evt.domid = cpu_to_le16(domid);
> + evt.addr = cpu_to_le64(addr);
> +
> + info = EVENT_IOPF;
> + if (present) {
> + info |= EVENT_IOPF_PR;
> + }
> + if (is_write) {
> + info |= EVENT_IOPF_RW;
> + }
> + evt.info = cpu_to_le16(info);
> +
> + amd_iommu_log_event(st, &evt);
> +}
> +
> +static inline uint64_t amd_iommu_get_perms(uint64_t entry)
> +{
> + return (entry & (DEV_PERM_READ | DEV_PERM_WRITE)) >> DEV_PERM_SHIFT;
> +}
> +
> +static int amd_iommu_translate(PCIDevice *iommu,
> + PCIDevice *dev,
> + pcibus_t addr,
> + target_phys_addr_t *paddr,
> + target_phys_addr_t *len,
> + unsigned perms)
> +{
> + int devfn, present;
> + target_phys_addr_t entry_addr, pte_addr;
> + uint64_t entry[4], pte, page_offset, pte_perms;
> + unsigned level, domid;
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, iommu);
> +
> + if (!st->enabled) {
> + goto no_translation;
> + }
> +
> + /* Get device table entry. */
> + devfn = dev->devfn;
> + entry_addr = st->devtab + devfn * DEVTAB_ENTRY_SIZE;
> + cpu_physical_memory_read(entry_addr, (uint8_t *) entry, 32);
> +
> + pte = entry[0];
> + if (!(pte & DEV_VALID) || !(pte & DEV_TRANSLATION_VALID)) {
> + goto no_translation;
> + }
> + domid = entry[1] & DEV_DOMAIN_ID_MASK;
> + level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
> + while (level > 0) {
> + /*
> + * Check permissions: the bitwise
> + * implication perms -> entry_perms must be true.
> + */
> + pte_perms = amd_iommu_get_perms(pte);
> + present = pte & 1;
> + if (!present || perms != (perms & pte_perms)) {
> + amd_iommu_page_fault(st, devfn, domid, addr,
> + present, !!(perms & IOMMU_PERM_WRITE));
> + return -EPERM;
> + }
> +
> + /* Go to the next lower level. */
> + pte_addr = pte & DEV_PT_ROOT_MASK;
> + pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
> + pte = ldq_phys(pte_addr);
> + level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
> + }
> + page_offset = addr & 4095;
> + *paddr = (pte & DEV_PT_ROOT_MASK) + page_offset;
> + *len = 4096 - page_offset;
> +
> + return 0;
> +
> +no_translation:
> + *paddr = addr;
> + *len = -1;
> + return 0;
> +}
> +
> +static int amd_iommu_pci_initfn(PCIDevice *dev)
> +{
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
> +
> + pci_config_set_vendor_id(st->dev.config, PCI_VENDOR_ID_AMD);
> + pci_config_set_device_id(st->dev.config, PCI_DEVICE_ID_AMD_IOMMU);
> + pci_config_set_class(st->dev.config, PCI_CLASS_SYSTEM_IOMMU);
> +
> + /* Secure Device capability */
> + st->capab_offset = pci_add_capability(&st->dev,
> + PCI_CAP_ID_SEC,
> + CAPAB_SIZE);
> + st->capab = st->dev.config + st->capab_offset;
> + dev->config_write = amd_iommu_write_capab;
> +
> + /* Allocate backing space for the MMIO registers. */
> + st->mmio_buf = qemu_malloc(MMIO_SIZE);
> +
> + pci_register_iommu(dev, amd_iommu_translate);
> +
> + return 0;
> +}
> +
> +static const VMStateDescription vmstate_amd_iommu = {
> + .name = "amd-iommu",
> + .version_id = 1,
> + .minimum_version_id = 1,
> + .minimum_version_id_old = 1,
> + .fields = (VMStateField []) {
> + VMSTATE_PCI_DEVICE(dev, AMDIOMMUState),
> + VMSTATE_END_OF_LIST()
> + }
> +};
> +
> +static PCIDeviceInfo amd_iommu_pci_info = {
> + .qdev.name = "amd-iommu",
> + .qdev.desc = "AMD IOMMU",
> + .qdev.size = sizeof(AMDIOMMUState),
> + .qdev.reset = amd_iommu_reset,
> + .qdev.vmsd = &vmstate_amd_iommu,
> + .init = amd_iommu_pci_initfn,
> +};
> +
> +static void amd_iommu_register(void)
> +{
> + pci_qdev_register(&amd_iommu_pci_info);
> +}
> +
> +device_init(amd_iommu_register);
> diff --git a/hw/pc.c b/hw/pc.c
> index a96187f..e2456b0 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -1068,6 +1068,8 @@ void pc_pci_device_init(PCIBus *pci_bus)
> int max_bus;
> int bus;
>
> + pci_create_simple(pci_bus, -1, "amd-iommu");
> +
> max_bus = drive_get_max_bus(IF_SCSI);
> for (bus = 0; bus <= max_bus; bus++) {
> pci_create_simple(pci_bus, -1, "lsi53c895a");
> diff --git a/hw/pci_ids.h b/hw/pci_ids.h
> index 39e9f1d..d790312 100644
> --- a/hw/pci_ids.h
> +++ b/hw/pci_ids.h
> @@ -26,6 +26,7 @@
>
> #define PCI_CLASS_MEMORY_RAM 0x0500
>
> +#define PCI_CLASS_SYSTEM_IOMMU 0x0806
> #define PCI_CLASS_SYSTEM_OTHER 0x0880
>
> #define PCI_CLASS_SERIAL_USB 0x0c03
> @@ -56,6 +57,7 @@
>
> #define PCI_VENDOR_ID_AMD 0x1022
> #define PCI_DEVICE_ID_AMD_LANCE 0x2000
> +#define PCI_DEVICE_ID_AMD_IOMMU 0x0000 /* FIXME */
>
> #define PCI_VENDOR_ID_MOTOROLA 0x1057
> #define PCI_DEVICE_ID_MOTOROLA_MPC106 0x0002
> diff --git a/hw/pci_regs.h b/hw/pci_regs.h
> index 0f9f84c..6695e41 100644
> --- a/hw/pci_regs.h
> +++ b/hw/pci_regs.h
> @@ -209,6 +209,7 @@
> #define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
> #define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
> #define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
> +#define PCI_CAP_ID_SEC 0x0F /* Secure Device (AMD IOMMU) */
> #define PCI_CAP_ID_EXP 0x10 /* PCI Express */
> #define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
> #define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
> --
> 1.7.1
>
>
^ permalink raw reply [flat|nested] 97+ messages in thread
* [Qemu-devel] Re: [PATCH 3/7] AMD IOMMU emulation
@ 2010-08-28 15:58 ` Blue Swirl
0 siblings, 0 replies; 97+ messages in thread
From: Blue Swirl @ 2010-08-28 15:58 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu; +Cc: kvm, mst, joro, qemu-devel, yamahata, avi, paul
On Sat, Aug 28, 2010 at 2:54 PM, Eduard - Gabriel Munteanu
<eduard.munteanu@linux360.ro> wrote:
> This introduces emulation for the AMD IOMMU, described in "AMD I/O
> Virtualization Technology (IOMMU) Specification".
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> ---
> Makefile.target | 2 +-
> hw/amd_iommu.c | 663 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> hw/pc.c | 2 +
> hw/pci_ids.h | 2 +
> hw/pci_regs.h | 1 +
> 5 files changed, 669 insertions(+), 1 deletions(-)
> create mode 100644 hw/amd_iommu.c
>
> diff --git a/Makefile.target b/Makefile.target
> index 3ef4666..d4eeccd 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -195,7 +195,7 @@ obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
> obj-i386-y += vmmouse.o vmport.o hpet.o applesmc.o
> obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
> obj-i386-y += debugcon.o multiboot.o
> -obj-i386-y += pc_piix.o
> +obj-i386-y += pc_piix.o amd_iommu.o
>
> # shared objects
> obj-ppc-y = ppc.o
> diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c
> new file mode 100644
> index 0000000..43e0426
> --- /dev/null
> +++ b/hw/amd_iommu.c
> @@ -0,0 +1,663 @@
> +/*
> + * AMD IOMMU emulation
> + *
> + * Copyright (c) 2010 Eduard - Gabriel Munteanu
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "pc.h"
> +#include "hw.h"
> +#include "pci.h"
> +#include "qlist.h"
> +
> +/* Capability registers */
> +#define CAPAB_HEADER 0x00
> +#define CAPAB_REV_TYPE 0x02
> +#define CAPAB_FLAGS 0x03
> +#define CAPAB_BAR_LOW 0x04
> +#define CAPAB_BAR_HIGH 0x08
> +#define CAPAB_RANGE 0x0C
> +#define CAPAB_MISC 0x10
> +
> +#define CAPAB_SIZE 0x14
> +#define CAPAB_REG_SIZE 0x04
> +
> +/* Capability header data */
> +#define CAPAB_FLAG_IOTLBSUP (1 << 0)
> +#define CAPAB_FLAG_HTTUNNEL (1 << 1)
> +#define CAPAB_FLAG_NPCACHE (1 << 2)
> +#define CAPAB_INIT_REV (1 << 3)
> +#define CAPAB_INIT_TYPE 3
> +#define CAPAB_INIT_REV_TYPE (CAPAB_REV | CAPAB_TYPE)
> +#define CAPAB_INIT_FLAGS (CAPAB_FLAG_NPCACHE | CAPAB_FLAG_HTTUNNEL)
> +#define CAPAB_INIT_MISC (64 << 15) | (48 << 8)
> +#define CAPAB_BAR_MASK ~((1UL << 14) - 1)
> +
> +/* MMIO registers */
> +#define MMIO_DEVICE_TABLE 0x0000
> +#define MMIO_COMMAND_BASE 0x0008
> +#define MMIO_EVENT_BASE 0x0010
> +#define MMIO_CONTROL 0x0018
> +#define MMIO_EXCL_BASE 0x0020
> +#define MMIO_EXCL_LIMIT 0x0028
> +#define MMIO_COMMAND_HEAD 0x2000
> +#define MMIO_COMMAND_TAIL 0x2008
> +#define MMIO_EVENT_HEAD 0x2010
> +#define MMIO_EVENT_TAIL 0x2018
> +#define MMIO_STATUS 0x2020
> +
> +#define MMIO_SIZE 0x4000
> +
> +#define MMIO_DEVTAB_SIZE_MASK ((1ULL << 12) - 1)
> +#define MMIO_DEVTAB_BASE_MASK (((1ULL << 52) - 1) & ~MMIO_DEVTAB_SIZE_MASK)
> +#define MMIO_DEVTAB_ENTRY_SIZE 32
> +#define MMIO_DEVTAB_SIZE_UNIT 4096
> +
> +#define MMIO_CMDBUF_SIZE_BYTE (MMIO_COMMAND_BASE + 7)
> +#define MMIO_CMDBUF_SIZE_MASK 0x0F
> +#define MMIO_CMDBUF_BASE_MASK MMIO_DEVTAB_BASE_MASK
> +#define MMIO_CMDBUF_DEFAULT_SIZE 8
> +#define MMIO_CMDBUF_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F)
> +#define MMIO_CMDBUF_TAIL_MASK MMIO_EVTLOG_HEAD_MASK
> +
> +#define MMIO_EVTLOG_SIZE_BYTE (MMIO_EVENT_BASE + 7)
> +#define MMIO_EVTLOG_SIZE_MASK MMIO_CMDBUF_SIZE_MASK
> +#define MMIO_EVTLOG_BASE_MASK MMIO_CMDBUF_BASE_MASK
> +#define MMIO_EVTLOG_DEFAULT_SIZE MMIO_CMDBUF_DEFAULT_SIZE
> +#define MMIO_EVTLOG_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F)
> +#define MMIO_EVTLOG_TAIL_MASK MMIO_EVTLOG_HEAD_MASK
> +
> +#define MMIO_EXCL_BASE_MASK MMIO_DEVTAB_BASE_MASK
> +#define MMIO_EXCL_ENABLED_MASK (1ULL << 0)
> +#define MMIO_EXCL_ALLOW_MASK (1ULL << 1)
> +#define MMIO_EXCL_LIMIT_MASK MMIO_DEVTAB_BASE_MASK
> +#define MMIO_EXCL_LIMIT_LOW 0xFFF
> +
> +#define MMIO_CONTROL_IOMMUEN (1ULL << 0)
> +#define MMIO_CONTROL_HTTUNEN (1ULL << 1)
> +#define MMIO_CONTROL_EVENTLOGEN (1ULL << 2)
> +#define MMIO_CONTROL_EVENTINTEN (1ULL << 3)
> +#define MMIO_CONTROL_COMWAITINTEN (1ULL << 4)
> +#define MMIO_CONTROL_CMDBUFEN (1ULL << 12)
> +
> +#define MMIO_STATUS_EVTLOG_OF (1ULL << 0)
> +#define MMIO_STATUS_EVTLOG_INTR (1ULL << 1)
> +#define MMIO_STATUS_COMWAIT_INTR (1ULL << 2)
> +#define MMIO_STATUS_EVTLOG_RUN (1ULL << 3)
> +#define MMIO_STATUS_CMDBUF_RUN (1ULL << 4)
> +
> +#define CMDBUF_ID_BYTE 0x07
> +#define CMDBUF_ID_RSHIFT 4
> +#define CMDBUF_ENTRY_SIZE 0x10
> +
> +#define CMD_COMPLETION_WAIT 0x01
> +#define CMD_INVAL_DEVTAB_ENTRY 0x02
> +#define CMD_INVAL_IOMMU_PAGES 0x03
> +#define CMD_INVAL_IOTLB_PAGES 0x04
> +#define CMD_INVAL_INTR_TABLE 0x05
> +
> +#define DEVTAB_ENTRY_SIZE 32
> +
> +/* Device table entry bits 0:63 */
> +#define DEV_VALID (1ULL << 0)
> +#define DEV_TRANSLATION_VALID (1ULL << 1)
> +#define DEV_MODE_MASK 0x7
> +#define DEV_MODE_RSHIFT 9
> +#define DEV_PT_ROOT_MASK 0xFFFFFFFFFF000
> +#define DEV_PT_ROOT_RSHIFT 12
> +#define DEV_PERM_SHIFT 61
> +#define DEV_PERM_READ (1ULL << 61)
> +#define DEV_PERM_WRITE (1ULL << 62)
> +
> +/* Device table entry bits 64:127 */
> +#define DEV_DOMAIN_ID_MASK ((1ULL << 16) - 1)
> +#define DEV_IOTLB_SUPPORT (1ULL << 17)
> +#define DEV_SUPPRESS_PF (1ULL << 18)
> +#define DEV_SUPPRESS_ALL_PF (1ULL << 19)
> +#define DEV_IOCTL_MASK ~3
> +#define DEV_IOCTL_RSHIFT 20
> +#define DEV_IOCTL_DENY 0
> +#define DEV_IOCTL_PASSTHROUGH 1
> +#define DEV_IOCTL_TRANSLATE 2
> +#define DEV_CACHE (1ULL << 37)
> +#define DEV_SNOOP_DISABLE (1ULL << 38)
> +#define DEV_EXCL (1ULL << 39)
> +
> +/* Event codes and flags, as stored in the info field */
> +#define EVENT_ILLEGAL_DEVTAB_ENTRY (0x1U << 24)
> +#define EVENT_IOPF (0x2U << 24)
> +#define EVENT_IOPF_I (1U << 3)
> +#define EVENT_IOPF_PR (1U << 4)
> +#define EVENT_IOPF_RW (1U << 5)
> +#define EVENT_IOPF_PE (1U << 6)
> +#define EVENT_IOPF_RZ (1U << 7)
> +#define EVENT_IOPF_TR (1U << 8)
> +#define EVENT_DEV_TAB_HW_ERROR (0x3U << 24)
> +#define EVENT_PAGE_TAB_HW_ERROR (0x4U << 24)
> +#define EVENT_ILLEGAL_COMMAND_ERROR (0x5U << 24)
> +#define EVENT_COMMAND_HW_ERROR (0x6U << 24)
> +#define EVENT_IOTLB_INV_TIMEOUT (0x7U << 24)
> +#define EVENT_INVALID_DEV_REQUEST (0x8U << 24)
> +
> +#define EVENT_LEN 16
> +
> +typedef struct AMDIOMMUState {
> + PCIDevice dev;
> +
> + int capab_offset;
> + unsigned char *capab;
> +
> + int mmio_index;
> + target_phys_addr_t mmio_addr;
> + unsigned char *mmio_buf;
> + int mmio_enabled;
> +
> + int enabled;
> + int ats_enabled;
> +
> + target_phys_addr_t devtab;
> + size_t devtab_len;
> +
> + target_phys_addr_t cmdbuf;
> + int cmdbuf_enabled;
> + size_t cmdbuf_len;
> + size_t cmdbuf_head;
> + size_t cmdbuf_tail;
> + int completion_wait_intr;
> +
> + target_phys_addr_t evtlog;
> + int evtlog_enabled;
> + int evtlog_intr;
> + target_phys_addr_t evtlog_len;
> + target_phys_addr_t evtlog_head;
> + target_phys_addr_t evtlog_tail;
> +
> + target_phys_addr_t excl_base;
> + target_phys_addr_t excl_limit;
> + int excl_enabled;
> + int excl_allow;
> +} AMDIOMMUState;
> +
> +typedef struct AMDIOMMUEvent {
> + uint16_t devfn;
> + uint16_t reserved;
> + uint16_t domid;
> + uint16_t info;
> + uint64_t addr;
> +} __attribute__((packed)) AMDIOMMUEvent;
> +
> +static void amd_iommu_completion_wait(AMDIOMMUState *st,
> + uint8_t *cmd)
> +{
> + uint64_t addr;
> +
> + if (cmd[0] & 1) {
> + addr = le64_to_cpu(*(uint64_t *) cmd) & 0xFFFFFFFFFFFF8;
> + cpu_physical_memory_write(addr, cmd + 8, 8);
> + }
> +
> + if (cmd[0] & 2)
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_COMWAIT_INTR;
> +}
> +
> +static void amd_iommu_invalidate_iotlb(AMDIOMMUState *st,
> + uint8_t *cmd)
> +{
> + PCIDevice *dev;
> + PCIBus *bus = st->dev.bus;
> + int bus_num = pci_bus_num(bus);
> + int devfn = *(uint16_t *) cmd;
> +
> + dev = pci_find_device(bus, bus_num, PCI_SLOT(devfn), PCI_FUNC(devfn));
> + if (dev) {
> + pci_memory_invalidate_range(dev, 0, -1);
> + }
> +}
> +
> +static void amd_iommu_cmdbuf_run(AMDIOMMUState *st)
> +{
> + uint8_t cmd[16];
> + int type;
> +
> + if (!st->cmdbuf_enabled) {
> + return;
> + }
> +
> + /* Check if there's work to do. */
> + if (st->cmdbuf_head == st->cmdbuf_tail) {
> + return;
> + }
> +
> + cpu_physical_memory_read(st->cmdbuf + st->cmdbuf_head, cmd, 16);
> + type = cmd[CMDBUF_ID_BYTE] >> CMDBUF_ID_RSHIFT;
> + switch (type) {
> + case CMD_COMPLETION_WAIT:
> + amd_iommu_completion_wait(st, cmd);
> + break;
> + case CMD_INVAL_DEVTAB_ENTRY:
> + break;
> + case CMD_INVAL_IOMMU_PAGES:
> + break;
> + case CMD_INVAL_IOTLB_PAGES:
> + amd_iommu_invalidate_iotlb(st, cmd);
> + break;
> + case CMD_INVAL_INTR_TABLE:
> + break;
> + default:
> + break;
> + }
> +
> + /* Increment and wrap head pointer. */
> + st->cmdbuf_head += CMDBUF_ENTRY_SIZE;
> + if (st->cmdbuf_head >= st->cmdbuf_len) {
> + st->cmdbuf_head = 0;
> + }
> +}
> +
> +static uint32_t amd_iommu_mmio_buf_read(AMDIOMMUState *st,
> + size_t offset,
> + size_t size)
> +{
> + ssize_t i;
> + uint32_t ret;
> +
> + if (!size) {
> + return 0;
> + }
> +
> + ret = st->mmio_buf[offset + size - 1];
> + for (i = size - 2; i >= 0; i--) {
> + ret <<= 8;
> + ret |= st->mmio_buf[offset + i];
> + }
> +
> + return ret;
> +}
> +
> +static void amd_iommu_mmio_buf_write(AMDIOMMUState *st,
> + size_t offset,
> + size_t size,
> + uint32_t val)
> +{
> + size_t i;
> +
> + for (i = 0; i < size; i++) {
> + st->mmio_buf[offset + i] = val & 0xFF;
> + val >>= 8;
> + }
> +}
> +
> +static void amd_iommu_update_mmio(AMDIOMMUState *st,
> + target_phys_addr_t addr)
> +{
> + size_t reg = addr & ~0x07;
> + uint64_t *base = (uint64_t *) &st->mmio_buf[reg];
This is still buggy.
> + uint64_t val = le64_to_cpu(*base);
> +
> + switch (reg) {
> + case MMIO_CONTROL:
> + st->enabled = !!(val & MMIO_CONTROL_IOMMUEN);
> + st->ats_enabled = !!(val & MMIO_CONTROL_HTTUNEN);
> + st->evtlog_enabled = st->enabled &&
> + !!(val & MMIO_CONTROL_EVENTLOGEN);
> + st->evtlog_intr = !!(val & MMIO_CONTROL_EVENTINTEN);
> + st->completion_wait_intr = !!(val & MMIO_CONTROL_COMWAITINTEN);
> + st->cmdbuf_enabled = st->enabled &&
> + !!(val & MMIO_CONTROL_CMDBUFEN);
> +
> + /* Update status flags depending on the control register. */
> + if (st->cmdbuf_enabled) {
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_CMDBUF_RUN;
> + } else {
> + st->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_CMDBUF_RUN;
> + }
> + if (st->evtlog_enabled) {
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_RUN;
> + } else {
> + st->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_EVTLOG_RUN;
> + }
> +
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_DEVICE_TABLE:
> + st->devtab = (target_phys_addr_t) (val & MMIO_DEVTAB_BASE_MASK);
> + st->devtab_len = ((val & MMIO_DEVTAB_SIZE_MASK) + 1) *
> + (MMIO_DEVTAB_SIZE_UNIT / MMIO_DEVTAB_ENTRY_SIZE);
> + break;
> + case MMIO_COMMAND_BASE:
> + st->cmdbuf = (target_phys_addr_t) (val & MMIO_CMDBUF_BASE_MASK);
> + st->cmdbuf_len = 1UL << (st->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] &
> + MMIO_CMDBUF_SIZE_MASK);
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_COMMAND_HEAD:
> + st->cmdbuf_head = val & MMIO_CMDBUF_HEAD_MASK;
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_COMMAND_TAIL:
> + st->cmdbuf_tail = val & MMIO_CMDBUF_TAIL_MASK;
> + amd_iommu_cmdbuf_run(st);
> + break;
> + case MMIO_EVENT_BASE:
> + st->evtlog = (target_phys_addr_t) (val & MMIO_EVTLOG_BASE_MASK);
> + st->evtlog_len = 1UL << (st->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] &
> + MMIO_EVTLOG_SIZE_MASK);
> + break;
> + case MMIO_EVENT_HEAD:
> + st->evtlog_head = val & MMIO_EVTLOG_HEAD_MASK;
> + break;
> + case MMIO_EVENT_TAIL:
> + st->evtlog_tail = val & MMIO_EVTLOG_TAIL_MASK;
> + break;
> + case MMIO_EXCL_BASE:
> + st->excl_base = (target_phys_addr_t) (val & MMIO_EXCL_BASE_MASK);
> + st->excl_enabled = val & MMIO_EXCL_ENABLED_MASK;
> + st->excl_allow = val & MMIO_EXCL_ALLOW_MASK;
> + break;
> + case MMIO_EXCL_LIMIT:
> + st->excl_limit = (target_phys_addr_t) ((val & MMIO_EXCL_LIMIT_MASK) |
> + MMIO_EXCL_LIMIT_LOW);
> + break;
> + default:
> + break;
> + }
> +}
> +
> +static uint32_t amd_iommu_mmio_readb(void *opaque, target_phys_addr_t addr)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + return amd_iommu_mmio_buf_read(st, addr, 1);
> +}
> +
> +static uint32_t amd_iommu_mmio_readw(void *opaque, target_phys_addr_t addr)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + return amd_iommu_mmio_buf_read(st, addr, 2);
> +}
> +
> +static uint32_t amd_iommu_mmio_readl(void *opaque, target_phys_addr_t addr)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + return amd_iommu_mmio_buf_read(st, addr, 4);
> +}
> +
> +static void amd_iommu_mmio_writeb(void *opaque,
> + target_phys_addr_t addr,
> + uint32_t val)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + amd_iommu_mmio_buf_write(st, addr, 1, val);
> + amd_iommu_update_mmio(st, addr);
> +}
> +
> +static void amd_iommu_mmio_writew(void *opaque,
> + target_phys_addr_t addr,
> + uint32_t val)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + amd_iommu_mmio_buf_write(st, addr, 2, val);
> + amd_iommu_update_mmio(st, addr);
> +}
> +
> +static void amd_iommu_mmio_writel(void *opaque,
> + target_phys_addr_t addr,
> + uint32_t val)
> +{
> + AMDIOMMUState *st = opaque;
> +
> + amd_iommu_mmio_buf_write(st, addr, 4, val);
> + amd_iommu_update_mmio(st, addr);
> +}
> +
> +static CPUReadMemoryFunc * const amd_iommu_mmio_read[] = {
> + amd_iommu_mmio_readb,
> + amd_iommu_mmio_readw,
> + amd_iommu_mmio_readl,
> +};
> +
> +static CPUWriteMemoryFunc * const amd_iommu_mmio_write[] = {
> + amd_iommu_mmio_writeb,
> + amd_iommu_mmio_writew,
> + amd_iommu_mmio_writel,
> +};
> +
> +static void amd_iommu_enable_mmio(AMDIOMMUState *st)
> +{
> + target_phys_addr_t addr;
> + uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
> +
> + st->mmio_index = cpu_register_io_memory(amd_iommu_mmio_read,
> + amd_iommu_mmio_write, st);
> + if (st->mmio_index < 0) {
> + return;
> + }
> +
> + addr = le64_to_cpu(*(uint64_t *) &st->capab[CAPAB_BAR_LOW]) & CAPAB_BAR_MASK;
> + cpu_register_physical_memory(addr, MMIO_SIZE, st->mmio_index);
> +
> + st->mmio_addr = addr;
> + st->mmio_enabled = 1;
> +
> + /* Further changes to the capability are prohibited. */
> + memset(capab_wmask + CAPAB_BAR_LOW, 0x00, CAPAB_REG_SIZE);
> + memset(capab_wmask + CAPAB_BAR_HIGH, 0x00, CAPAB_REG_SIZE);
> +}
> +
> +static void amd_iommu_write_capab(PCIDevice *dev,
> + uint32_t addr, uint32_t val, int len)
> +{
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
> +
> + pci_default_write_config(dev, addr, val, len);
> +
> + if (!st->mmio_enabled && st->capab[CAPAB_BAR_LOW] & 0x1) {
> + amd_iommu_enable_mmio(st);
> + }
> +}
> +
> +static void amd_iommu_reset(DeviceState *dev)
> +{
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev.qdev, dev);
> + unsigned char *capab = st->capab;
> + uint8_t *capab_wmask = st->dev.wmask + st->capab_offset;
> +
> + st->enabled = 0;
> + st->ats_enabled = 0;
> + st->mmio_enabled = 0;
> +
> + capab[CAPAB_REV_TYPE] = CAPAB_REV_TYPE;
> + capab[CAPAB_FLAGS] = CAPAB_FLAGS;
> + capab[CAPAB_BAR_LOW] = 0;
> + capab[CAPAB_BAR_HIGH] = 0;
> + capab[CAPAB_RANGE] = 0;
> + *((uint32_t *) &capab[CAPAB_MISC]) = cpu_to_le32(CAPAB_INIT_MISC);
> +
> + /* Changes to the capability are allowed after system reset. */
> + memset(capab_wmask + CAPAB_BAR_LOW, 0xFF, CAPAB_REG_SIZE);
> + memset(capab_wmask + CAPAB_BAR_HIGH, 0xFF, CAPAB_REG_SIZE);
> +
> + memset(st->mmio_buf, 0, MMIO_SIZE);
> + st->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] = MMIO_CMDBUF_DEFAULT_SIZE;
> + st->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] = MMIO_EVTLOG_DEFAULT_SIZE;
> +}
> +
> +static void amd_iommu_log_event(AMDIOMMUState *st, AMDIOMMUEvent *evt)
> +{
> + if (!st->evtlog_enabled ||
> + (st->mmio_buf[MMIO_STATUS] | MMIO_STATUS_EVTLOG_OF)) {
> + return;
> + }
> +
> + if (st->evtlog_tail >= st->evtlog_len) {
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_OF;
> + }
> +
> + cpu_physical_memory_write(st->evtlog + st->evtlog_tail,
> + (uint8_t *) evt, EVENT_LEN);
> +
> + st->evtlog_tail += EVENT_LEN;
> + st->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_INTR;
> +}
> +
> +static void amd_iommu_page_fault(AMDIOMMUState *st,
> + int devfn,
> + unsigned domid,
> + target_phys_addr_t addr,
> + int present,
> + int is_write)
> +{
> + AMDIOMMUEvent evt;
> + unsigned info;
> +
> + evt.devfn = cpu_to_le16(devfn);
> + evt.reserved = 0;
> + evt.domid = cpu_to_le16(domid);
> + evt.addr = cpu_to_le64(addr);
> +
> + info = EVENT_IOPF;
> + if (present) {
> + info |= EVENT_IOPF_PR;
> + }
> + if (is_write) {
> + info |= EVENT_IOPF_RW;
> + }
> + evt.info = cpu_to_le16(info);
> +
> + amd_iommu_log_event(st, &evt);
> +}
> +
> +static inline uint64_t amd_iommu_get_perms(uint64_t entry)
> +{
> + return (entry & (DEV_PERM_READ | DEV_PERM_WRITE)) >> DEV_PERM_SHIFT;
> +}
> +
> +static int amd_iommu_translate(PCIDevice *iommu,
> + PCIDevice *dev,
> + pcibus_t addr,
> + target_phys_addr_t *paddr,
> + target_phys_addr_t *len,
> + unsigned perms)
> +{
> + int devfn, present;
> + target_phys_addr_t entry_addr, pte_addr;
> + uint64_t entry[4], pte, page_offset, pte_perms;
> + unsigned level, domid;
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, iommu);
> +
> + if (!st->enabled) {
> + goto no_translation;
> + }
> +
> + /* Get device table entry. */
> + devfn = dev->devfn;
> + entry_addr = st->devtab + devfn * DEVTAB_ENTRY_SIZE;
> + cpu_physical_memory_read(entry_addr, (uint8_t *) entry, 32);
> +
> + pte = entry[0];
> + if (!(pte & DEV_VALID) || !(pte & DEV_TRANSLATION_VALID)) {
> + goto no_translation;
> + }
> + domid = entry[1] & DEV_DOMAIN_ID_MASK;
> + level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
> + while (level > 0) {
> + /*
> + * Check permissions: the bitwise
> + * implication perms -> entry_perms must be true.
> + */
> + pte_perms = amd_iommu_get_perms(pte);
> + present = pte & 1;
> + if (!present || perms != (perms & pte_perms)) {
> + amd_iommu_page_fault(st, devfn, domid, addr,
> + present, !!(perms & IOMMU_PERM_WRITE));
> + return -EPERM;
> + }
> +
> + /* Go to the next lower level. */
> + pte_addr = pte & DEV_PT_ROOT_MASK;
> + pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
> + pte = ldq_phys(pte_addr);
> + level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK;
> + }
> + page_offset = addr & 4095;
> + *paddr = (pte & DEV_PT_ROOT_MASK) + page_offset;
> + *len = 4096 - page_offset;
> +
> + return 0;
> +
> +no_translation:
> + *paddr = addr;
> + *len = -1;
> + return 0;
> +}
> +
> +static int amd_iommu_pci_initfn(PCIDevice *dev)
> +{
> + AMDIOMMUState *st = DO_UPCAST(AMDIOMMUState, dev, dev);
> +
> + pci_config_set_vendor_id(st->dev.config, PCI_VENDOR_ID_AMD);
> + pci_config_set_device_id(st->dev.config, PCI_DEVICE_ID_AMD_IOMMU);
> + pci_config_set_class(st->dev.config, PCI_CLASS_SYSTEM_IOMMU);
> +
> + /* Secure Device capability */
> + st->capab_offset = pci_add_capability(&st->dev,
> + PCI_CAP_ID_SEC,
> + CAPAB_SIZE);
> + st->capab = st->dev.config + st->capab_offset;
> + dev->config_write = amd_iommu_write_capab;
> +
> + /* Allocate backing space for the MMIO registers. */
> + st->mmio_buf = qemu_malloc(MMIO_SIZE);
> +
> + pci_register_iommu(dev, amd_iommu_translate);
> +
> + return 0;
> +}
> +
> +static const VMStateDescription vmstate_amd_iommu = {
> + .name = "amd-iommu",
> + .version_id = 1,
> + .minimum_version_id = 1,
> + .minimum_version_id_old = 1,
> + .fields = (VMStateField []) {
> + VMSTATE_PCI_DEVICE(dev, AMDIOMMUState),
> + VMSTATE_END_OF_LIST()
> + }
> +};
> +
> +static PCIDeviceInfo amd_iommu_pci_info = {
> + .qdev.name = "amd-iommu",
> + .qdev.desc = "AMD IOMMU",
> + .qdev.size = sizeof(AMDIOMMUState),
> + .qdev.reset = amd_iommu_reset,
> + .qdev.vmsd = &vmstate_amd_iommu,
> + .init = amd_iommu_pci_initfn,
> +};
> +
> +static void amd_iommu_register(void)
> +{
> + pci_qdev_register(&amd_iommu_pci_info);
> +}
> +
> +device_init(amd_iommu_register);
> diff --git a/hw/pc.c b/hw/pc.c
> index a96187f..e2456b0 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -1068,6 +1068,8 @@ void pc_pci_device_init(PCIBus *pci_bus)
> int max_bus;
> int bus;
>
> + pci_create_simple(pci_bus, -1, "amd-iommu");
> +
> max_bus = drive_get_max_bus(IF_SCSI);
> for (bus = 0; bus <= max_bus; bus++) {
> pci_create_simple(pci_bus, -1, "lsi53c895a");
> diff --git a/hw/pci_ids.h b/hw/pci_ids.h
> index 39e9f1d..d790312 100644
> --- a/hw/pci_ids.h
> +++ b/hw/pci_ids.h
> @@ -26,6 +26,7 @@
>
> #define PCI_CLASS_MEMORY_RAM 0x0500
>
> +#define PCI_CLASS_SYSTEM_IOMMU 0x0806
> #define PCI_CLASS_SYSTEM_OTHER 0x0880
>
> #define PCI_CLASS_SERIAL_USB 0x0c03
> @@ -56,6 +57,7 @@
>
> #define PCI_VENDOR_ID_AMD 0x1022
> #define PCI_DEVICE_ID_AMD_LANCE 0x2000
> +#define PCI_DEVICE_ID_AMD_IOMMU 0x0000 /* FIXME */
>
> #define PCI_VENDOR_ID_MOTOROLA 0x1057
> #define PCI_DEVICE_ID_MOTOROLA_MPC106 0x0002
> diff --git a/hw/pci_regs.h b/hw/pci_regs.h
> index 0f9f84c..6695e41 100644
> --- a/hw/pci_regs.h
> +++ b/hw/pci_regs.h
> @@ -209,6 +209,7 @@
> #define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
> #define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
> #define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
> +#define PCI_CAP_ID_SEC 0x0F /* Secure Device (AMD IOMMU) */
> #define PCI_CAP_ID_EXP 0x10 /* PCI Express */
> #define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
> #define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
> --
> 1.7.1
>
>
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 0/7] AMD IOMMU emulation patchset v4
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-28 16:00 ` Blue Swirl
-1 siblings, 0 replies; 97+ messages in thread
From: Blue Swirl @ 2010-08-28 16:00 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: mst, joro, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 2:54 PM, Eduard - Gabriel Munteanu
<eduard.munteanu@linux360.ro> wrote:
> Hi,
>
> I rebased my work on mst's PCI tree and, hopefully, fixed issues raised by
> others. Here's a summary of the changes:
> - made it apply to mst/pci
> - moved some AMD IOMMU stuff in a reset handler
> - dropped range_covers_range() (wasn't the same as ranges_overlap(), but the
> latter was better anyway)
> - used 'expand' to remove tabs in pci_regs.h before applying the useful changes
> - fixed the endianness mistake spotted by Blue (though ldq_phys wasn't needed)
>
> As for Anthony's suggestion to simply sed-convert all devices, I'd rather go
> through them one at a time and do it manually. 'sed' would not only mess
> indentation, but also it isn't straightforward to get the 'PCIDevice *' you
> need to pass to the pci_* helpers. (I'll try to focus on conversion next so we
> can poison the old stuff.)
>
> I also added (read "spelled it out myself") malc's ACK to the ac97 patch.
> Nothing changed since his last review.
>
> Please have a look and merge if you like it.
The endianness bug still exists. I also had other comments on patch 2.
>
>
> Thanks,
> Eduard
>
>
> Eduard - Gabriel Munteanu (7):
> pci: expand tabs to spaces in pci_regs.h
> pci: memory access API and IOMMU support
> AMD IOMMU emulation
> ide: use the PCI memory access interface
> rtl8139: use the PCI memory access interface
> eepro100: use the PCI memory access interface
> ac97: use the PCI memory access interface
>
> Makefile.target | 2 +-
> dma-helpers.c | 46 ++-
> dma.h | 21 +-
> hw/ac97.c | 6 +-
> hw/amd_iommu.c | 663 ++++++++++++++++++++++++++
> hw/eepro100.c | 86 ++--
> hw/ide/core.c | 15 +-
> hw/ide/internal.h | 39 ++
> hw/ide/macio.c | 4 +-
> hw/ide/pci.c | 7 +
> hw/pc.c | 2 +
> hw/pci.c | 185 ++++++++-
> hw/pci.h | 74 +++
> hw/pci_ids.h | 2 +
> hw/pci_internals.h | 12 +
> hw/pci_regs.h | 1331 ++++++++++++++++++++++++++--------------------------
> hw/rtl8139.c | 99 +++--
> qemu-common.h | 1 +
> 18 files changed, 1827 insertions(+), 768 deletions(-)
> create mode 100644 hw/amd_iommu.c
> rewrite hw/pci_regs.h (90%)
>
>
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 3/7] AMD IOMMU emulation
2010-08-28 15:58 ` [Qemu-devel] " Blue Swirl
@ 2010-08-28 21:53 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-28 21:53 UTC (permalink / raw)
To: Blue Swirl
Cc: mst, joro, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 03:58:23PM +0000, Blue Swirl wrote:
> On Sat, Aug 28, 2010 at 2:54 PM, Eduard - Gabriel Munteanu
> <eduard.munteanu@linux360.ro> wrote:
> > This introduces emulation for the AMD IOMMU, described in "AMD I/O
> > Virtualization Technology (IOMMU) Specification".
> >
> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > ---
[snip]
> > diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c
> > new file mode 100644
> > index 0000000..43e0426
> > --- /dev/null
> > +++ b/hw/amd_iommu.c
[snip]
> > +static void amd_iommu_update_mmio(AMDIOMMUState *st,
> > +                                      target_phys_addr_t addr)
> > +{
> > +    size_t reg = addr & ~0x07;
> > +    uint64_t *base = (uint64_t *) &st->mmio_buf[reg];
>
> This is still buggy.
>
> > +    uint64_t val = le64_to_cpu(*base);
mmio_buf is always LE, so on a BE host *base will hold the bytes in
reversed order. But look at the next line, where I did the le64_to_cpu().
That should swap the bytes on a BE host, yielding the correct byte order.
On a LE host, *base is already correct and le64_to_cpu() is a no-op.
In any case, I only use 'val', not '*base' directly. I suppose it could
be rewritten for clarity (i.e. ditch 'base').
Do you still think it's wrong? Or is it for another reason?
Thanks,
Eduard
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 0/7] AMD IOMMU emulation patchset v4
2010-08-28 16:00 ` [Qemu-devel] " Blue Swirl
@ 2010-08-29 9:55 ` Joerg Roedel
-1 siblings, 0 replies; 97+ messages in thread
From: Joerg Roedel @ 2010-08-29 9:55 UTC (permalink / raw)
To: Blue Swirl
Cc: Eduard - Gabriel Munteanu, mst, paul, avi, anthony, av1474,
yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 04:00:31PM +0000, Blue Swirl wrote:
> On Sat, Aug 28, 2010 at 2:54 PM, Eduard - Gabriel Munteanu
> > Please have a look and merge if you like it.
>
> The endianess bug still exists. I had also other comments to 2.
I am very happy with this patch set. Besides your comments, is there
anything else that prevents merging of this patch set? Paul, what is
your opinion on this?
Joerg
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 3/7] AMD IOMMU emulation
2010-08-28 21:53 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-29 20:37 ` Blue Swirl
-1 siblings, 0 replies; 97+ messages in thread
From: Blue Swirl @ 2010-08-29 20:37 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: mst, joro, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 9:53 PM, Eduard - Gabriel Munteanu
<eduard.munteanu@linux360.ro> wrote:
> On Sat, Aug 28, 2010 at 03:58:23PM +0000, Blue Swirl wrote:
>> On Sat, Aug 28, 2010 at 2:54 PM, Eduard - Gabriel Munteanu
>> <eduard.munteanu@linux360.ro> wrote:
>> > This introduces emulation for the AMD IOMMU, described in "AMD I/O
>> > Virtualization Technology (IOMMU) Specification".
>> >
>> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>> > ---
>
> [snip]
>
>> > diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c
>> > new file mode 100644
>> > index 0000000..43e0426
>> > --- /dev/null
>> > +++ b/hw/amd_iommu.c
>
> [snip]
>
>> > +static void amd_iommu_update_mmio(AMDIOMMUState *st,
>> > +                                  target_phys_addr_t addr)
>> > +{
>> > +    size_t reg = addr & ~0x07;
>> > +    uint64_t *base = (uint64_t *) &st->mmio_buf[reg];
>>
>> This is still buggy.
>>
>> > +    uint64_t val = le64_to_cpu(*base);
>
> mmio_buf is always LE, so a BE host will have *base in reversed
> byteorder. But look at the next line, where I did the le64_to_cpu().
> That should swap the bytes on a BE host, yielding the correct byteorder.
Sorry, I missed that one when comparing the patch to the previous version.
> On a LE host, *base is right the first time and le64_to_cpu() is a nop.
>
> In any case, I only use 'val', not '*base' directly. I suppose it could
> be rewritten for clarity (i.e. ditch 'base').
Yes, someone could add more code later which accidentally uses 'base' directly.
> Do you still think it's wrong? Or is it for another reason?
I think it's OK for now. The rewrite can happen with a small patch later.
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 0/7] AMD IOMMU emulation patchset v4
2010-08-29 9:55 ` [Qemu-devel] " Joerg Roedel
@ 2010-08-29 20:44 ` Blue Swirl
-1 siblings, 0 replies; 97+ messages in thread
From: Blue Swirl @ 2010-08-29 20:44 UTC (permalink / raw)
To: Joerg Roedel
Cc: Eduard - Gabriel Munteanu, mst, paul, avi, anthony, av1474,
yamahata, kvm, qemu-devel
On Sun, Aug 29, 2010 at 9:55 AM, Joerg Roedel <joro@8bytes.org> wrote:
> On Sat, Aug 28, 2010 at 04:00:31PM +0000, Blue Swirl wrote:
>> On Sat, Aug 28, 2010 at 2:54 PM, Eduard - Gabriel Munteanu
>
>> > Please have a look and merge if you like it.
>>
>> The endianness bug still exists. I also had other comments on patch 2.
>
> I am very happy with this patch set. Besides your comments, is there
> anything else that prevents merging of this patch set? Paul, what is
> your opinion in this?
I also think it's a nice piece of work. It would be good to fix the
CODING_STYLE (missing braces) problem in patch 2 before merging. The
endianness problem is not so much of a problem, my mistake.
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH 2/7] pci: memory access API and IOMMU support
2010-08-29 20:44 ` [Qemu-devel] " Blue Swirl
@ 2010-08-29 22:08 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-29 22:08 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm,
qemu-devel, Eduard - Gabriel Munteanu
PCI devices should access memory through pci_memory_*() instead of
cpu_physical_memory_*(). This also provides support for translation and
access checking in case an IOMMU is emulated.
Memory maps are treated as remote IOTLBs (that is, translation caches
belonging to the IOMMU-aware device itself). Clients (devices) must
provide callbacks for map invalidation in case these maps are
persistent beyond the current I/O context, e.g. AIO DMA transfers.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/pci.c | 191 +++++++++++++++++++++++++++++++++++++++++++++++++++-
hw/pci.h | 69 +++++++++++++++++++
hw/pci_internals.h | 12 +++
qemu-common.h | 1 +
4 files changed, 272 insertions(+), 1 deletions(-)
diff --git a/hw/pci.c b/hw/pci.c
index 2dc1577..afcb33c 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -158,6 +158,19 @@ static void pci_device_reset(PCIDevice *dev)
pci_update_mappings(dev);
}
+static int pci_no_translate(PCIDevice *iommu,
+                            PCIDevice *dev,
+                            pcibus_t addr,
+                            target_phys_addr_t *paddr,
+                            target_phys_addr_t *len,
+                            unsigned perms)
+{
+    *paddr = addr;
+    *len = -1;
+
+    return 0;
+}
+
static void pci_bus_reset(void *opaque)
{
PCIBus *bus = opaque;
@@ -220,7 +233,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
{
qbus_create_inplace(&bus->qbus, &pci_bus_info, parent, name);
assert(PCI_FUNC(devfn_min) == 0);
- bus->devfn_min = devfn_min;
+
+ bus->devfn_min = devfn_min;
+ bus->iommu = NULL;
+ bus->translate = pci_no_translate;
/* host bridge */
QLIST_INIT(&bus->child);
@@ -1789,3 +1805,176 @@ static char *pcibus_get_dev_path(DeviceState *dev)
return strdup(path);
}
+void pci_register_iommu(PCIDevice *iommu,
+                        PCITranslateFunc *translate)
+{
+    iommu->bus->iommu = iommu;
+    iommu->bus->translate = translate;
+}
+
+void pci_memory_rw(PCIDevice *dev,
+                   pcibus_t addr,
+                   uint8_t *buf,
+                   pcibus_t len,
+                   int is_write)
+{
+    int err;
+    unsigned perms;
+    PCIDevice *iommu = dev->bus->iommu;
+    target_phys_addr_t paddr, plen;
+
+    perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+    while (len) {
+        err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+        if (err) {
+            return;
+        }
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len) {
+            plen = len;
+        }
+
+        cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+        len -= plen;
+        addr += plen;
+        buf += plen;
+    }
+}
+
+static void pci_memory_register_map(PCIDevice *dev,
+                                    pcibus_t addr,
+                                    pcibus_t len,
+                                    target_phys_addr_t paddr,
+                                    PCIInvalidateMapFunc *invalidate,
+                                    void *invalidate_opaque)
+{
+    PCIMemoryMap *map;
+
+    map = qemu_malloc(sizeof(PCIMemoryMap));
+    map->addr = addr;
+    map->len = len;
+    map->paddr = paddr;
+    map->invalidate = invalidate;
+    map->invalidate_opaque = invalidate_opaque;
+
+    QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
+}
+
+static void pci_memory_unregister_map(PCIDevice *dev,
+                                      target_phys_addr_t paddr,
+                                      target_phys_addr_t len)
+{
+    PCIMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->memory_maps, list) {
+        if (map->paddr == paddr && map->len == len) {
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void pci_memory_invalidate_range(PCIDevice *dev,
+                                 pcibus_t addr,
+                                 pcibus_t len)
+{
+    PCIMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->memory_maps, list) {
+        if (ranges_overlap(addr, len, map->addr, map->len)) {
+            map->invalidate(map->invalidate_opaque);
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void *pci_memory_map(PCIDevice *dev,
+                     PCIInvalidateMapFunc *cb,
+                     void *opaque,
+                     pcibus_t addr,
+                     target_phys_addr_t *len,
+                     int is_write)
+{
+    int err;
+    unsigned perms;
+    PCIDevice *iommu = dev->bus->iommu;
+    target_phys_addr_t paddr, plen;
+
+    perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+    plen = *len;
+    err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+    if (err) {
+        return NULL;
+    }
+
+    /*
+     * If this is true, the virtual region is contiguous,
+     * but the translated physical region isn't. We just
+     * clamp *len, much like cpu_physical_memory_map() does.
+     */
+    if (plen < *len) {
+        *len = plen;
+    }
+
+    /* We treat maps as remote TLBs to cope with stuff like AIO. */
+    if (cb) {
+        pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
+    }
+
+    return cpu_physical_memory_map(paddr, len, is_write);
+}
+
+void pci_memory_unmap(PCIDevice *dev,
+                      void *buffer,
+                      target_phys_addr_t len,
+                      int is_write,
+                      target_phys_addr_t access_len)
+{
+    cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+    pci_memory_unregister_map(dev, (target_phys_addr_t) buffer, len);
+}
+
+#define DEFINE_PCI_LD(suffix, size)                                     \
+uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr)            \
+{                                                                       \
+    int err;                                                            \
+    target_phys_addr_t paddr, plen;                                     \
+                                                                        \
+    err = dev->bus->translate(dev->bus->iommu, dev,                     \
+                              addr, &paddr, &plen, IOMMU_PERM_READ);    \
+    if (err || (plen < size / 8)) {                                     \
+        return 0;                                                       \
+    }                                                                   \
+                                                                        \
+    return ld##suffix##_phys(paddr);                                    \
+}
+
+#define DEFINE_PCI_ST(suffix, size)                                     \
+void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val)  \
+{                                                                       \
+    int err;                                                            \
+    target_phys_addr_t paddr, plen;                                     \
+                                                                        \
+    err = dev->bus->translate(dev->bus->iommu, dev,                     \
+                              addr, &paddr, &plen, IOMMU_PERM_WRITE);   \
+    if (err || (plen < size / 8)) {                                     \
+        return;                                                         \
+    }                                                                   \
+                                                                        \
+    st##suffix##_phys(paddr, val);                                      \
+}
+
+DEFINE_PCI_LD(ub, 8)
+DEFINE_PCI_LD(uw, 16)
+DEFINE_PCI_LD(l, 32)
+DEFINE_PCI_LD(q, 64)
+
+DEFINE_PCI_ST(b, 8)
+DEFINE_PCI_ST(w, 16)
+DEFINE_PCI_ST(l, 32)
+DEFINE_PCI_ST(q, 64)
diff --git a/hw/pci.h b/hw/pci.h
index c551f96..c95863a 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -172,6 +172,8 @@ struct PCIDevice {
char *romfile;
ram_addr_t rom_offset;
uint32_t rom_bar;
+
+ QLIST_HEAD(, PCIMemoryMap) memory_maps;
};
PCIDevice *pci_register_device(PCIBus *bus, const char *name,
@@ -391,4 +393,71 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
return !(last2 < first1 || last1 < first2);
}
+/*
+ * Memory I/O and PCI IOMMU definitions.
+ */
+
+#define IOMMU_PERM_READ   (1 << 0)
+#define IOMMU_PERM_WRITE  (1 << 1)
+#define IOMMU_PERM_RW     (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
+
+typedef int PCIInvalidateMapFunc(void *opaque);
+typedef int PCITranslateFunc(PCIDevice *iommu,
+                             PCIDevice *dev,
+                             pcibus_t addr,
+                             target_phys_addr_t *paddr,
+                             target_phys_addr_t *len,
+                             unsigned perms);
+
+void pci_memory_rw(PCIDevice *dev,
+                   pcibus_t addr,
+                   uint8_t *buf,
+                   pcibus_t len,
+                   int is_write);
+void *pci_memory_map(PCIDevice *dev,
+                     PCIInvalidateMapFunc *cb,
+                     void *opaque,
+                     pcibus_t addr,
+                     target_phys_addr_t *len,
+                     int is_write);
+void pci_memory_unmap(PCIDevice *dev,
+                      void *buffer,
+                      target_phys_addr_t len,
+                      int is_write,
+                      target_phys_addr_t access_len);
+void pci_register_iommu(PCIDevice *dev, PCITranslateFunc *translate);
+void pci_memory_invalidate_range(PCIDevice *dev, pcibus_t addr, pcibus_t len);
+
+#define DECLARE_PCI_LD(suffix, size) \
+uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
+
+#define DECLARE_PCI_ST(suffix, size) \
+void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val);
+
+DECLARE_PCI_LD(ub, 8)
+DECLARE_PCI_LD(uw, 16)
+DECLARE_PCI_LD(l, 32)
+DECLARE_PCI_LD(q, 64)
+
+DECLARE_PCI_ST(b, 8)
+DECLARE_PCI_ST(w, 16)
+DECLARE_PCI_ST(l, 32)
+DECLARE_PCI_ST(q, 64)
+
+static inline void pci_memory_read(PCIDevice *dev,
+                                   pcibus_t addr,
+                                   uint8_t *buf,
+                                   pcibus_t len)
+{
+    pci_memory_rw(dev, addr, buf, len, 0);
+}
+
+static inline void pci_memory_write(PCIDevice *dev,
+                                    pcibus_t addr,
+                                    const uint8_t *buf,
+                                    pcibus_t len)
+{
+    pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
+}
+
#endif
diff --git a/hw/pci_internals.h b/hw/pci_internals.h
index e3c93a3..fb134b9 100644
--- a/hw/pci_internals.h
+++ b/hw/pci_internals.h
@@ -33,6 +33,9 @@ struct PCIBus {
Keep a count of the number of devices with raised IRQs. */
int nirq;
int *irq_count;
+
+ PCIDevice *iommu;
+ PCITranslateFunc *translate;
};
struct PCIBridge {
@@ -44,4 +47,13 @@ struct PCIBridge {
const char *bus_name;
};
+struct PCIMemoryMap {
+    pcibus_t addr;
+    pcibus_t len;
+    target_phys_addr_t paddr;
+    PCIInvalidateMapFunc *invalidate;
+    void *invalidate_opaque;
+    QLIST_ENTRY(PCIMemoryMap) list;
+};
+
+
#endif /* QEMU_PCI_INTERNALS_H */
diff --git a/qemu-common.h b/qemu-common.h
index d735235..8b060e8 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
typedef struct PCIHostState PCIHostState;
typedef struct PCIExpressHost PCIExpressHost;
typedef struct PCIBus PCIBus;
+typedef struct PCIMemoryMap PCIMemoryMap;
typedef struct PCIDevice PCIDevice;
typedef struct PCIBridge PCIBridge;
typedef struct SerialState SerialState;
--
1.7.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH 2/7] pci: memory access API and IOMMU support
2010-08-29 22:08 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-29 22:11 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-29 22:11 UTC (permalink / raw)
To: mst
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Mon, Aug 30, 2010 at 01:08:23AM +0300, Eduard - Gabriel Munteanu wrote:
> PCI devices should access memory through pci_memory_*() instead of
> cpu_physical_memory_*(). This also provides support for translation and
> access checking in case an IOMMU is emulated.
>
> Memory maps are treated as remote IOTLBs (that is, translation caches
> belonging to the IOMMU-aware device itself). Clients (devices) must
> provide callbacks for map invalidation in case these maps are
> persistent beyond the current I/O context, e.g. AIO DMA transfers.
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> ---
Please merge this instead of the patch I sent with the series. I wanted
to avoid resubmitting the whole series.
Eduard
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH 3/7] AMD IOMMU emulation
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-30 3:07 ` Isaku Yamahata
-1 siblings, 0 replies; 97+ messages in thread
From: Isaku Yamahata @ 2010-08-30 3:07 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: mst, kvm, joro, qemu-devel, blauwirbel, paul, avi
On Sat, Aug 28, 2010 at 05:54:54PM +0300, Eduard - Gabriel Munteanu wrote:
> diff --git a/hw/pc.c b/hw/pc.c
> index a96187f..e2456b0 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -1068,6 +1068,8 @@ void pc_pci_device_init(PCIBus *pci_bus)
> int max_bus;
> int bus;
>
> + pci_create_simple(pci_bus, -1, "amd-iommu");
> +
> max_bus = drive_get_max_bus(IF_SCSI);
> for (bus = 0; bus <= max_bus; bus++) {
> pci_create_simple(pci_bus, -1, "lsi53c895a");
This always instantiates the IOMMU.
How would it coexist with other IOMMU (e.g. Intel VT-d) emulation?
--
yamahata
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH 3/7] AMD IOMMU emulation
2010-08-30 3:07 ` Isaku Yamahata
@ 2010-08-30 5:54 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-30 5:54 UTC (permalink / raw)
To: Isaku Yamahata; +Cc: mst, kvm, joro, qemu-devel, blauwirbel, paul, avi
On Mon, Aug 30, 2010 at 12:07:30PM +0900, Isaku Yamahata wrote:
> On Sat, Aug 28, 2010 at 05:54:54PM +0300, Eduard - Gabriel Munteanu wrote:
> > diff --git a/hw/pc.c b/hw/pc.c
> > index a96187f..e2456b0 100644
> > --- a/hw/pc.c
> > +++ b/hw/pc.c
> > @@ -1068,6 +1068,8 @@ void pc_pci_device_init(PCIBus *pci_bus)
> > int max_bus;
> > int bus;
> >
> > + pci_create_simple(pci_bus, -1, "amd-iommu");
> > +
> > max_bus = drive_get_max_bus(IF_SCSI);
> > for (bus = 0; bus <= max_bus; bus++) {
> > pci_create_simple(pci_bus, -1, "lsi53c895a");
>
> This always instantiate iommu.
> How to coexist with other iommu(Intel VT-d) emulation?
> --
> yamahata
I suppose it could be turned into a compile-time/runtime configurable
option when VT-d emulation arrives. Unless you mean having both IOMMUs
run at the same time, which is impossible unless some meaningful
topology is specified (presumably hardcoded as well).
Considering this is only a machine model I'm modifying, it's just like
other emulated pieces of PC hardware that are (at the moment) hardcoded.
Eduard
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-08-31 20:29 ` Michael S. Tsirkin
0 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-08-31 20:29 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 05:54:52PM +0300, Eduard - Gabriel Munteanu wrote:
> The conversion was done using the GNU 'expand' tool (default settings)
> to make it obey the QEMU coding style.
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
I'm not really interested in this: we copied pci_regs.h from linux
to help non-linux hosts, and keeping the code consistent
with the original makes detecting bugs and adding new stuff
from linux/pci_regs.h easier.
> ---
> hw/pci_regs.h | 1330 ++++++++++++++++++++++++++++----------------------------
> 1 files changed, 665 insertions(+), 665 deletions(-)
> rewrite hw/pci_regs.h (90%)
>
> diff --git a/hw/pci_regs.h b/hw/pci_regs.h
> dissimilarity index 90%
> index dd0bed4..0f9f84c 100644
> --- a/hw/pci_regs.h
> +++ b/hw/pci_regs.h
> @@ -1,665 +1,665 @@
> -/*
> - * pci_regs.h
> - *
> - * PCI standard defines
> - * Copyright 1994, Drew Eckhardt
> - * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
> - *
> - * For more information, please consult the following manuals (look at
> - * http://www.pcisig.com/ for how to get them):
> - *
> - * PCI BIOS Specification
> - * PCI Local Bus Specification
> - * PCI to PCI Bridge Specification
> - * PCI System Design Guide
> - *
> - * For hypertransport information, please consult the following manuals
> - * from http://www.hypertransport.org
> - *
> - * The Hypertransport I/O Link Specification
> - */
> -
> -#ifndef LINUX_PCI_REGS_H
> -#define LINUX_PCI_REGS_H
> -
> -/*
> - * Under PCI, each device has 256 bytes of configuration address space,
> - * of which the first 64 bytes are standardized as follows:
> - */
> -#define PCI_VENDOR_ID 0x00 /* 16 bits */
> -#define PCI_DEVICE_ID 0x02 /* 16 bits */
> -#define PCI_COMMAND 0x04 /* 16 bits */
> -#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
> -#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
> -#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
> -#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
> -#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
> -#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
> -#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
> -#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
> -#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
> -#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
> -#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
> -
> -#define PCI_STATUS 0x06 /* 16 bits */
> -#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
> -#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
> -#define PCI_STATUS_66MHZ 0x20 /* Support 66 Mhz PCI 2.1 bus */
> -#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
> -#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast-back to back */
> -#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
> -#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
> -#define PCI_STATUS_DEVSEL_FAST 0x000
> -#define PCI_STATUS_DEVSEL_MEDIUM 0x200
> -#define PCI_STATUS_DEVSEL_SLOW 0x400
> -#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
> -#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
> -#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
> -#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
> -#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
> -
> -#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
> -#define PCI_REVISION_ID 0x08 /* Revision ID */
> -#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
> -#define PCI_CLASS_DEVICE 0x0a /* Device class */
> -
> -#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
> -#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
> -#define PCI_HEADER_TYPE 0x0e /* 8 bits */
> -#define PCI_HEADER_TYPE_NORMAL 0
> -#define PCI_HEADER_TYPE_BRIDGE 1
> -#define PCI_HEADER_TYPE_CARDBUS 2
> -
> -#define PCI_BIST 0x0f /* 8 bits */
> -#define PCI_BIST_CODE_MASK 0x0f /* Return result */
> -#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
> -#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
> -
> -/*
> - * Base addresses specify locations in memory or I/O space.
> - * Decoded size can be determined by writing a value of
> - * 0xffffffff to the register, and reading it back. Only
> - * 1 bits are decoded.
> - */
> -#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
> -#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
> -#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
> -#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
> -#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
> -#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
> -#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
> -#define PCI_BASE_ADDRESS_SPACE_IO 0x01
> -#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
> -#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
> -#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
> -#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
> -#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
> -#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
> -#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
> -#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
> -/* bit 1 is reserved if address_space = 1 */
> -
> -/* Header type 0 (normal devices) */
> -#define PCI_CARDBUS_CIS 0x28
> -#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
> -#define PCI_SUBSYSTEM_ID 0x2e
> -#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
> -#define PCI_ROM_ADDRESS_ENABLE 0x01
> -#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
> -
> -#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
> -
> -/* 0x35-0x3b are reserved */
> -#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
> -#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
> -#define PCI_MIN_GNT 0x3e /* 8 bits */
> -#define PCI_MAX_LAT 0x3f /* 8 bits */
> -
> -/* Header type 1 (PCI-to-PCI bridges) */
> -#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
> -#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
> -#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
> -#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
> -#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
> -#define PCI_IO_LIMIT 0x1d
> -#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
> -#define PCI_IO_RANGE_TYPE_16 0x00
> -#define PCI_IO_RANGE_TYPE_32 0x01
> -#define PCI_IO_RANGE_MASK (~0x0fUL)
> -#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
> -#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
> -#define PCI_MEMORY_LIMIT 0x22
> -#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
> -#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
> -#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
> -#define PCI_PREF_MEMORY_LIMIT 0x26
> -#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
> -#define PCI_PREF_RANGE_TYPE_32 0x00
> -#define PCI_PREF_RANGE_TYPE_64 0x01
> -#define PCI_PREF_RANGE_MASK (~0x0fUL)
> -#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
> -#define PCI_PREF_LIMIT_UPPER32 0x2c
> -#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
> -#define PCI_IO_LIMIT_UPPER16 0x32
> -/* 0x34 same as for htype 0 */
> -/* 0x35-0x3b is reserved */
> -#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
> -/* 0x3c-0x3d are same as for htype 0 */
> -#define PCI_BRIDGE_CONTROL 0x3e
> -#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
> -#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
> -#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
> -#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
> -#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
> -#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
> -#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
> -
> -/* Header type 2 (CardBus bridges) */
> -#define PCI_CB_CAPABILITY_LIST 0x14
> -/* 0x15 reserved */
> -#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
> -#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
> -#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
> -#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
> -#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
> -#define PCI_CB_MEMORY_BASE_0 0x1c
> -#define PCI_CB_MEMORY_LIMIT_0 0x20
> -#define PCI_CB_MEMORY_BASE_1 0x24
> -#define PCI_CB_MEMORY_LIMIT_1 0x28
> -#define PCI_CB_IO_BASE_0 0x2c
> -#define PCI_CB_IO_BASE_0_HI 0x2e
> -#define PCI_CB_IO_LIMIT_0 0x30
> -#define PCI_CB_IO_LIMIT_0_HI 0x32
> -#define PCI_CB_IO_BASE_1 0x34
> -#define PCI_CB_IO_BASE_1_HI 0x36
> -#define PCI_CB_IO_LIMIT_1 0x38
> -#define PCI_CB_IO_LIMIT_1_HI 0x3a
> -#define PCI_CB_IO_RANGE_MASK (~0x03UL)
> -/* 0x3c-0x3d are same as for htype 0 */
> -#define PCI_CB_BRIDGE_CONTROL 0x3e
> -#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
> -#define PCI_CB_BRIDGE_CTL_SERR 0x02
> -#define PCI_CB_BRIDGE_CTL_ISA 0x04
> -#define PCI_CB_BRIDGE_CTL_VGA 0x08
> -#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
> -#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
> -#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
> -#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
> -#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
> -#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
> -#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
> -#define PCI_CB_SUBSYSTEM_ID 0x42
> -#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
> -/* 0x48-0x7f reserved */
> -
> -/* Capability lists */
> -
> -#define PCI_CAP_LIST_ID 0 /* Capability ID */
> -#define PCI_CAP_ID_PM 0x01 /* Power Management */
> -#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
> -#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
> -#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
> -#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
> -#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
> -#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
> -#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
> -#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
> -#define PCI_CAP_ID_DBG 0x0A /* Debug port */
> -#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
> -#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
> -#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
> -#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
> -#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
> -#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
> -#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
> -#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
> -#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
> -#define PCI_CAP_SIZEOF 4
> -
> -/* Power Management Registers */
> -
> -#define PCI_PM_PMC 2 /* PM Capabilities Register */
> -#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
> -#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
> -#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
> -#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
> -#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxilliary power support mask */
> -#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
> -#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
> -#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
> -#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
> -#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
> -#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
> -#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
> -#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
> -#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
> -#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
> -#define PCI_PM_CTRL 4 /* PM control and status register */
> -#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
> -#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
> -#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
> -#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
> -#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
> -#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
> -#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
> -#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
> -#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
> -#define PCI_PM_DATA_REGISTER 7 /* (??) */
> -#define PCI_PM_SIZEOF 8
> -
> -/* AGP registers */
> -
> -#define PCI_AGP_VERSION 2 /* BCD version number */
> -#define PCI_AGP_RFU 3 /* Rest of capability flags */
> -#define PCI_AGP_STATUS 4 /* Status register */
> -#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
> -#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
> -#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
> -#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
> -#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
> -#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
> -#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
> -#define PCI_AGP_COMMAND 8 /* Control register */
> -#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
> -#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
> -#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
> -#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
> -#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
> -#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
> -#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
> -#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
> -#define PCI_AGP_SIZEOF 12
> -
> -/* Vital Product Data */
> -
> -#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
> -#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
> -#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
> -#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
> -
> -/* Slot Identification */
> -
> -#define PCI_SID_ESR 2 /* Expansion Slot Register */
> -#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
> -#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
> -#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
> -
> -/* Message Signalled Interrupts registers */
> -
> -#define PCI_MSI_FLAGS 2 /* Various flags */
> -#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
> -#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
> -#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
> -#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
> -#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
> -#define PCI_MSI_RFU 3 /* Rest of capability flags */
> -#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
> -#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
> -#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
> -#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
> -#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
> -#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
> -
> -/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
> -#define PCI_MSIX_FLAGS 2
> -#define PCI_MSIX_FLAGS_QSIZE 0x7FF
> -#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
> -#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
> -#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
> -
> -/* CompactPCI Hotswap Register */
> -
> -#define PCI_CHSWP_CSR 2 /* Control and Status Register */
> -#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
> -#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
> -#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
> -#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
> -#define PCI_CHSWP_PI 0x30 /* Programming Interface */
> -#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
> -#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
> -
> -/* PCI Advanced Feature registers */
> -
> -#define PCI_AF_LENGTH 2
> -#define PCI_AF_CAP 3
> -#define PCI_AF_CAP_TP 0x01
> -#define PCI_AF_CAP_FLR 0x02
> -#define PCI_AF_CTRL 4
> -#define PCI_AF_CTRL_FLR 0x01
> -#define PCI_AF_STATUS 5
> -#define PCI_AF_STATUS_TP 0x01
> -
> -/* PCI-X registers */
> -
> -#define PCI_X_CMD 2 /* Modes & Features */
> -#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
> -#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
> -#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
> -#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
> -#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
> -#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
> -#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
> - /* Max # of outstanding split transactions */
> -#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
> -#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
> -#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
> -#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
> -#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
> -#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
> -#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
> -#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
> -#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
> -#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
> -#define PCI_X_STATUS 4 /* PCI-X capabilities */
> -#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
> -#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
> -#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
> -#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
> -#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
> -#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
> -#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
> -#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
> -#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
> -#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
> -#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
> -#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
> -#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
> -
> -/* PCI Express capability registers */
> -
> -#define PCI_EXP_FLAGS 2 /* Capabilities register */
> -#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
> -#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
> -#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
> -#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
> -#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
> -#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
> -#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
> -#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
> -#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
> -#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
> -#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
> -#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
> -#define PCI_EXP_DEVCAP 4 /* Device capabilities */
> -#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
> -#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
> -#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
> -#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
> -#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
> -#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
> -#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
> -#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
> -#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
> -#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
> -#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
> -#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
> -#define PCI_EXP_DEVCTL 8 /* Device Control */
> -#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
> -#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
> -#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
> -#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
> -#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
> -#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
> -#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
> -#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
> -#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
> -#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
> -#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
> -#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
> -#define PCI_EXP_DEVSTA 10 /* Device Status */
> -#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
> -#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
> -#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
> -#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
> -#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
> -#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
> -#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
> -#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
> -#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
> -#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
> -#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
> -#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
> -#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
> -#define PCI_EXP_LNKCAP_SDERC 0x00080000 /* Suprise Down Error Reporting Capable */
> -#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
> -#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
> -#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
> -#define PCI_EXP_LNKCTL 16 /* Link Control */
> -#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
> -#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
> -#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
> -#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
> -#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
> -#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
> -#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
> -#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
> -#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
> -#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Lnk Autonomous Bandwidth Interrupt Enable */
> -#define PCI_EXP_LNKSTA 18 /* Link Status */
> -#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
> -#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Nogotiated Link Width */
> -#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
> -#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
> -#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
> -#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
> -#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
> -#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
> -#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
> -#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
> -#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
> -#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
> -#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
> -#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
> -#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
> -#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
> -#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
> -#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
> -#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
> -#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
> -#define PCI_EXP_SLTCTL 24 /* Slot Control */
> -#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
> -#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
> -#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
> -#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
> -#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
> -#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
> -#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
> -#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
> -#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
> -#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
> -#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
> -#define PCI_EXP_SLTSTA 26 /* Slot Status */
> -#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
> -#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
> -#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
> -#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
> -#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
> -#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
> -#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
> -#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
> -#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
> -#define PCI_EXP_RTCTL 28 /* Root Control */
> -#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
> -#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
> -#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
> -#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
> -#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
> -#define PCI_EXP_RTCAP 30 /* Root Capabilities */
> -#define PCI_EXP_RTSTA 32 /* Root Status */
> -#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
> -#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
> -#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
> -#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
> -#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
> -#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
> -
> -/* Extended Capabilities (PCI-X 2.0 and Express) */
> -#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
> -#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
> -#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
> -
> -#define PCI_EXT_CAP_ID_ERR 1
> -#define PCI_EXT_CAP_ID_VC 2
> -#define PCI_EXT_CAP_ID_DSN 3
> -#define PCI_EXT_CAP_ID_PWR 4
> -#define PCI_EXT_CAP_ID_ARI 14
> -#define PCI_EXT_CAP_ID_ATS 15
> -#define PCI_EXT_CAP_ID_SRIOV 16
> -
> -/* Advanced Error Reporting */
> -#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
> -#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
> -#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
> -#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
> -#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
> -#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
> -#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
> -#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
> -#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
> -#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
> -#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
> -#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
> -#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
> - /* Same bits as above */
> -#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
> - /* Same bits as above */
> -#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
> -#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
> -#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
> -#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
> -#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
> -#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
> -#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
> - /* Same bits as above */
> -#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
> -#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
> -#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
> -#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
> -#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
> -#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
> -#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
> -#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
> -/* Correctable Err Reporting Enable */
> -#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
> -/* Non-fatal Err Reporting Enable */
> -#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
> -/* Fatal Err Reporting Enable */
> -#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
> -#define PCI_ERR_ROOT_STATUS 48
> -#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
> -/* Multi ERR_COR Received */
> -#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
> -/* ERR_FATAL/NONFATAL Recevied */
> -#define PCI_ERR_ROOT_UNCOR_RCV 0x00000004
> -/* Multi ERR_FATAL/NONFATAL Recevied */
> -#define PCI_ERR_ROOT_MULTI_UNCOR_RCV 0x00000008
> -#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
> -#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
> -#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
> -#define PCI_ERR_ROOT_COR_SRC 52
> -#define PCI_ERR_ROOT_SRC 54
> -
> -/* Virtual Channel */
> -#define PCI_VC_PORT_REG1 4
> -#define PCI_VC_PORT_REG2 8
> -#define PCI_VC_PORT_CTRL 12
> -#define PCI_VC_PORT_STATUS 14
> -#define PCI_VC_RES_CAP 16
> -#define PCI_VC_RES_CTRL 20
> -#define PCI_VC_RES_STATUS 26
> -
> -/* Power Budgeting */
> -#define PCI_PWR_DSR 4 /* Data Select Register */
> -#define PCI_PWR_DATA 8 /* Data Register */
> -#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
> -#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
> -#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
> -#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
> -#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
> -#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
> -#define PCI_PWR_CAP 12 /* Capability */
> -#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
> -
> -/*
> - * Hypertransport sub capability types
> - *
> - * Unfortunately there are both 3 bit and 5 bit capability types defined
> - * in the HT spec, catering for that is a little messy. You probably don't
> - * want to use these directly, just use pci_find_ht_capability() and it
> - * will do the right thing for you.
> - */
> -#define HT_3BIT_CAP_MASK 0xE0
> -#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
> -#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
> -
> -#define HT_5BIT_CAP_MASK 0xF8
> -#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
> -#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
> -#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
> -#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
> -#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
> -#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
> -#define HT_MSI_FLAGS 0x02 /* Offset to flags */
> -#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
> -#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
> -#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
> -#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
> -#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
> -#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
> -#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
> -#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
> -#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
> -#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
> -#define HT_CAPTYPE_PM 0xE0 /* Hypertransport powermanagement configuration */
> -
> -/* Alternative Routing-ID Interpretation */
> -#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
> -#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
> -#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
> -#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
> -#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
> -#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
> -#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
> -#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
> -
> -/* Address Translation Service */
> -#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
> -#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
> -#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
> -#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
> -#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
> -#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
> -#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
> -
> -/* Single Root I/O Virtualization */
> -#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
> -#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
> -#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
> -#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
> -#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
> -#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
> -#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
> -#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
> -#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
> -#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
> -#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
> -#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
> -#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
> -#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
> -#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
> -#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
> -#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
> -#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
> -#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
> -#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
> -#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
> -#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
> -#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset*/
> -#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
> -#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
> -#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
> -#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
> -#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
> -#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
> -
> -#endif /* LINUX_PCI_REGS_H */
> +/*
> + * pci_regs.h
> + *
> + * PCI standard defines
> + * Copyright 1994, Drew Eckhardt
> + * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
> + *
> + * For more information, please consult the following manuals (look at
> + * http://www.pcisig.com/ for how to get them):
> + *
> + * PCI BIOS Specification
> + * PCI Local Bus Specification
> + * PCI to PCI Bridge Specification
> + * PCI System Design Guide
> + *
> + * For hypertransport information, please consult the following manuals
> + * from http://www.hypertransport.org
> + *
> + * The Hypertransport I/O Link Specification
> + */
> +
> +#ifndef LINUX_PCI_REGS_H
> +#define LINUX_PCI_REGS_H
> +
> +/*
> + * Under PCI, each device has 256 bytes of configuration address space,
> + * of which the first 64 bytes are standardized as follows:
> + */
> +#define PCI_VENDOR_ID 0x00 /* 16 bits */
> +#define PCI_DEVICE_ID 0x02 /* 16 bits */
> +#define PCI_COMMAND 0x04 /* 16 bits */
> +#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
> +#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
> +#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
> +#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
> +#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
> +#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
> +#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
> +#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
> +#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
> +#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
> +#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
> +
> +#define PCI_STATUS 0x06 /* 16 bits */
> +#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
> +#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
> +#define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
> +#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
> +#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast back-to-back */
> +#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
> +#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
> +#define PCI_STATUS_DEVSEL_FAST 0x000
> +#define PCI_STATUS_DEVSEL_MEDIUM 0x200
> +#define PCI_STATUS_DEVSEL_SLOW 0x400
> +#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
> +#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
> +#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
> +#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
> +#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
> +
> +#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
> +#define PCI_REVISION_ID 0x08 /* Revision ID */
> +#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
> +#define PCI_CLASS_DEVICE 0x0a /* Device class */
> +
> +#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
> +#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
> +#define PCI_HEADER_TYPE 0x0e /* 8 bits */
> +#define PCI_HEADER_TYPE_NORMAL 0
> +#define PCI_HEADER_TYPE_BRIDGE 1
> +#define PCI_HEADER_TYPE_CARDBUS 2
> +
> +#define PCI_BIST 0x0f /* 8 bits */
> +#define PCI_BIST_CODE_MASK 0x0f /* Return result */
> +#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
> +#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
> +
> +/*
> + * Base addresses specify locations in memory or I/O space.
> + * Decoded size can be determined by writing a value of
> + * 0xffffffff to the register, and reading it back. Only
> + * 1 bits are decoded.
> + */
> +#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
> +#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
> +#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
> +#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
> +#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
> +#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
> +#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
> +#define PCI_BASE_ADDRESS_SPACE_IO 0x01
> +#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
> +#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
> +#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
> +#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
> +#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
> +#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
> +#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
> +#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
> +/* bit 1 is reserved if address_space = 1 */
> +
> +/* Header type 0 (normal devices) */
> +#define PCI_CARDBUS_CIS 0x28
> +#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
> +#define PCI_SUBSYSTEM_ID 0x2e
> +#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
> +#define PCI_ROM_ADDRESS_ENABLE 0x01
> +#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
> +
> +#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
> +
> +/* 0x35-0x3b are reserved */
> +#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
> +#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
> +#define PCI_MIN_GNT 0x3e /* 8 bits */
> +#define PCI_MAX_LAT 0x3f /* 8 bits */
> +
> +/* Header type 1 (PCI-to-PCI bridges) */
> +#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
> +#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
> +#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
> +#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
> +#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
> +#define PCI_IO_LIMIT 0x1d
> +#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
> +#define PCI_IO_RANGE_TYPE_16 0x00
> +#define PCI_IO_RANGE_TYPE_32 0x01
> +#define PCI_IO_RANGE_MASK (~0x0fUL)
> +#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
> +#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
> +#define PCI_MEMORY_LIMIT 0x22
> +#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
> +#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
> +#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
> +#define PCI_PREF_MEMORY_LIMIT 0x26
> +#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
> +#define PCI_PREF_RANGE_TYPE_32 0x00
> +#define PCI_PREF_RANGE_TYPE_64 0x01
> +#define PCI_PREF_RANGE_MASK (~0x0fUL)
> +#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
> +#define PCI_PREF_LIMIT_UPPER32 0x2c
> +#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
> +#define PCI_IO_LIMIT_UPPER16 0x32
> +/* 0x34 same as for htype 0 */
> +/* 0x35-0x3b is reserved */
> +#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
> +/* 0x3c-0x3d are same as for htype 0 */
> +#define PCI_BRIDGE_CONTROL 0x3e
> +#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
> +#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
> +#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
> +#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
> +#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
> +#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
> +#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
> +
> +/* Header type 2 (CardBus bridges) */
> +#define PCI_CB_CAPABILITY_LIST 0x14
> +/* 0x15 reserved */
> +#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
> +#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
> +#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
> +#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
> +#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
> +#define PCI_CB_MEMORY_BASE_0 0x1c
> +#define PCI_CB_MEMORY_LIMIT_0 0x20
> +#define PCI_CB_MEMORY_BASE_1 0x24
> +#define PCI_CB_MEMORY_LIMIT_1 0x28
> +#define PCI_CB_IO_BASE_0 0x2c
> +#define PCI_CB_IO_BASE_0_HI 0x2e
> +#define PCI_CB_IO_LIMIT_0 0x30
> +#define PCI_CB_IO_LIMIT_0_HI 0x32
> +#define PCI_CB_IO_BASE_1 0x34
> +#define PCI_CB_IO_BASE_1_HI 0x36
> +#define PCI_CB_IO_LIMIT_1 0x38
> +#define PCI_CB_IO_LIMIT_1_HI 0x3a
> +#define PCI_CB_IO_RANGE_MASK (~0x03UL)
> +/* 0x3c-0x3d are same as for htype 0 */
> +#define PCI_CB_BRIDGE_CONTROL 0x3e
> +#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
> +#define PCI_CB_BRIDGE_CTL_SERR 0x02
> +#define PCI_CB_BRIDGE_CTL_ISA 0x04
> +#define PCI_CB_BRIDGE_CTL_VGA 0x08
> +#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
> +#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
> +#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
> +#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
> +#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
> +#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
> +#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
> +#define PCI_CB_SUBSYSTEM_ID 0x42
> +#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
> +/* 0x48-0x7f reserved */
> +
> +/* Capability lists */
> +
> +#define PCI_CAP_LIST_ID 0 /* Capability ID */
> +#define PCI_CAP_ID_PM 0x01 /* Power Management */
> +#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
> +#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
> +#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
> +#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
> +#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
> +#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
> +#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
> +#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
> +#define PCI_CAP_ID_DBG 0x0A /* Debug port */
> +#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
> +#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
> +#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
> +#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
> +#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
> +#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
> +#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
> +#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
> +#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
> +#define PCI_CAP_SIZEOF 4
> +
> +/* Power Management Registers */
> +
> +#define PCI_PM_PMC 2 /* PM Capabilities Register */
> +#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
> +#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
> +#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
> +#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
> +#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxiliary power support mask */
> +#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
> +#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
> +#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
> +#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
> +#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
> +#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
> +#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
> +#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
> +#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
> +#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
> +#define PCI_PM_CTRL 4 /* PM control and status register */
> +#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
> +#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
> +#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
> +#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
> +#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
> +#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
> +#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
> +#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
> +#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
> +#define PCI_PM_DATA_REGISTER 7 /* (??) */
> +#define PCI_PM_SIZEOF 8
> +
> +/* AGP registers */
> +
> +#define PCI_AGP_VERSION 2 /* BCD version number */
> +#define PCI_AGP_RFU 3 /* Rest of capability flags */
> +#define PCI_AGP_STATUS 4 /* Status register */
> +#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
> +#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
> +#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
> +#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
> +#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
> +#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
> +#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
> +#define PCI_AGP_COMMAND 8 /* Control register */
> +#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
> +#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
> +#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
> +#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
> +#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
> +#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
> +#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
> +#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
> +#define PCI_AGP_SIZEOF 12
> +
> +/* Vital Product Data */
> +
> +#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
> +#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
> +#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
> +#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
> +
> +/* Slot Identification */
> +
> +#define PCI_SID_ESR 2 /* Expansion Slot Register */
> +#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
> +#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
> +#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
> +
> +/* Message Signalled Interrupts registers */
> +
> +#define PCI_MSI_FLAGS 2 /* Various flags */
> +#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
> +#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
> +#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
> +#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
> +#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
> +#define PCI_MSI_RFU 3 /* Rest of capability flags */
> +#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
> +#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
> +#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
> +#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
> +#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
> +#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
> +
> +/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
> +#define PCI_MSIX_FLAGS 2
> +#define PCI_MSIX_FLAGS_QSIZE 0x7FF
> +#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
> +#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
> +#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
> +
> +/* CompactPCI Hotswap Register */
> +
> +#define PCI_CHSWP_CSR 2 /* Control and Status Register */
> +#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
> +#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
> +#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
> +#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
> +#define PCI_CHSWP_PI 0x30 /* Programming Interface */
> +#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
> +#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
> +
> +/* PCI Advanced Feature registers */
> +
> +#define PCI_AF_LENGTH 2
> +#define PCI_AF_CAP 3
> +#define PCI_AF_CAP_TP 0x01
> +#define PCI_AF_CAP_FLR 0x02
> +#define PCI_AF_CTRL 4
> +#define PCI_AF_CTRL_FLR 0x01
> +#define PCI_AF_STATUS 5
> +#define PCI_AF_STATUS_TP 0x01
> +
> +/* PCI-X registers */
> +
> +#define PCI_X_CMD 2 /* Modes & Features */
> +#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
> +#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
> +#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
> +#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
> +#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
> +#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
> +#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
> + /* Max # of outstanding split transactions */
> +#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
> +#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
> +#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
> +#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
> +#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
> +#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
> +#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
> +#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
> +#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
> +#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
> +#define PCI_X_STATUS 4 /* PCI-X capabilities */
> +#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
> +#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
> +#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
> +#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
> +#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
> +#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
> +#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
> +#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
> +#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
> +#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
> +#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
> +#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
> +#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
> +
> +/* PCI Express capability registers */
> +
> +#define PCI_EXP_FLAGS 2 /* Capabilities register */
> +#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
> +#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
> +#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
> +#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
> +#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
> +#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
> +#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
> +#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
> +#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
> +#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
> +#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
> +#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
> +#define PCI_EXP_DEVCAP 4 /* Device capabilities */
> +#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
> +#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
> +#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
> +#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
> +#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
> +#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
> +#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
> +#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
> +#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
> +#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
> +#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
> +#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
> +#define PCI_EXP_DEVCTL 8 /* Device Control */
> +#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
> +#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
> +#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
> +#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
> +#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
> +#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
> +#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
> +#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
> +#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
> +#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
> +#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
> +#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
> +#define PCI_EXP_DEVSTA 10 /* Device Status */
> +#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
> +#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
> +#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
> +#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
> +#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
> +#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
> +#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
> +#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
> +#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
> +#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
> +#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
> +#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
> +#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
> +#define PCI_EXP_LNKCAP_SDERC 0x00080000 /* Surprise Down Error Reporting Capable */
> +#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
> +#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
> +#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
> +#define PCI_EXP_LNKCTL 16 /* Link Control */
> +#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
> +#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
> +#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
> +#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
> +#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
> +#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
> +#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
> +#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
> +#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
> +#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Lnk Autonomous Bandwidth Interrupt Enable */
> +#define PCI_EXP_LNKSTA 18 /* Link Status */
> +#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
> +#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Negotiated Link Width */
> +#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
> +#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
> +#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
> +#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
> +#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
> +#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
> +#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
> +#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
> +#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
> +#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
> +#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
> +#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
> +#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
> +#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
> +#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
> +#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
> +#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
> +#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
> +#define PCI_EXP_SLTCTL 24 /* Slot Control */
> +#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
> +#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
> +#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
> +#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
> +#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
> +#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
> +#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
> +#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
> +#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
> +#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
> +#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
> +#define PCI_EXP_SLTSTA 26 /* Slot Status */
> +#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
> +#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
> +#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
> +#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
> +#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
> +#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
> +#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
> +#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
> +#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
> +#define PCI_EXP_RTCTL 28 /* Root Control */
> +#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
> +#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
> +#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
> +#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
> +#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
> +#define PCI_EXP_RTCAP 30 /* Root Capabilities */
> +#define PCI_EXP_RTSTA 32 /* Root Status */
> +#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
> +#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
> +#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
> +#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
> +#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
> +#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
> +
> +/* Extended Capabilities (PCI-X 2.0 and Express) */
> +#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
> +#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
> +#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
> +
> +#define PCI_EXT_CAP_ID_ERR 1
> +#define PCI_EXT_CAP_ID_VC 2
> +#define PCI_EXT_CAP_ID_DSN 3
> +#define PCI_EXT_CAP_ID_PWR 4
> +#define PCI_EXT_CAP_ID_ARI 14
> +#define PCI_EXT_CAP_ID_ATS 15
> +#define PCI_EXT_CAP_ID_SRIOV 16
> +
> +/* Advanced Error Reporting */
> +#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
> +#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
> +#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
> +#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
> +#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
> +#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
> +#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
> +#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
> +#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
> +#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
> +#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
> +#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
> +#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
> + /* Same bits as above */
> +#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
> + /* Same bits as above */
> +#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
> +#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
> +#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
> +#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
> +#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
> +#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
> +#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
> + /* Same bits as above */
> +#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
> +#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
> +#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
> +#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
> +#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
> +#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
> +#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
> +#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
> +/* Correctable Err Reporting Enable */
> +#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
> +/* Non-fatal Err Reporting Enable */
> +#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
> +/* Fatal Err Reporting Enable */
> +#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
> +#define PCI_ERR_ROOT_STATUS 48
> +#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
> +/* Multi ERR_COR Received */
> +#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
> +/* ERR_FATAL/NONFATAL Received */
> +#define PCI_ERR_ROOT_UNCOR_RCV 0x00000004
> +/* Multi ERR_FATAL/NONFATAL Received */
> +#define PCI_ERR_ROOT_MULTI_UNCOR_RCV 0x00000008
> +#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
> +#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
> +#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
> +#define PCI_ERR_ROOT_COR_SRC 52
> +#define PCI_ERR_ROOT_SRC 54
> +
> +/* Virtual Channel */
> +#define PCI_VC_PORT_REG1 4
> +#define PCI_VC_PORT_REG2 8
> +#define PCI_VC_PORT_CTRL 12
> +#define PCI_VC_PORT_STATUS 14
> +#define PCI_VC_RES_CAP 16
> +#define PCI_VC_RES_CTRL 20
> +#define PCI_VC_RES_STATUS 26
> +
> +/* Power Budgeting */
> +#define PCI_PWR_DSR 4 /* Data Select Register */
> +#define PCI_PWR_DATA 8 /* Data Register */
> +#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
> +#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
> +#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
> +#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
> +#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
> +#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
> +#define PCI_PWR_CAP 12 /* Capability */
> +#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
> +
> +/*
> + * Hypertransport sub capability types
> + *
> + * Unfortunately there are both 3 bit and 5 bit capability types defined
> + * in the HT spec, catering for that is a little messy. You probably don't
> + * want to use these directly, just use pci_find_ht_capability() and it
> + * will do the right thing for you.
> + */
> +#define HT_3BIT_CAP_MASK 0xE0
> +#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
> +#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
> +
> +#define HT_5BIT_CAP_MASK 0xF8
> +#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
> +#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
> +#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
> +#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
> +#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
> +#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
> +#define HT_MSI_FLAGS 0x02 /* Offset to flags */
> +#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
> +#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
> +#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
> +#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
> +#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
> +#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
> +#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
> +#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
> +#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
> +#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
> +#define HT_CAPTYPE_PM 0xE0 /* Hypertransport power management configuration */
> +
> +/* Alternative Routing-ID Interpretation */
> +#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
> +#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
> +#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
> +#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
> +#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
> +#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
> +#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
> +#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
> +
> +/* Address Translation Service */
> +#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
> +#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
> +#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
> +#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
> +#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
> +#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
> +#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
> +
> +/* Single Root I/O Virtualization */
> +#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
> +#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
> +#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
> +#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
> +#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
> +#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
> +#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
> +#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
> +#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
> +#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
> +#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
> +#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
> +#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
> +#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
> +#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
> +#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
> +#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
> +#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
> +#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
> +#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
> +#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
> +#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
> +#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset */
> +#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
> +#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
> +#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
> +#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
> +#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
> +#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
> +
> +#endif /* LINUX_PCI_REGS_H */
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 97+ messages in thread
* [Qemu-devel] Re: [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h
@ 2010-08-31 20:29 ` Michael S. Tsirkin
0 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-08-31 20:29 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: kvm, joro, qemu-devel, blauwirbel, yamahata, paul, avi
On Sat, Aug 28, 2010 at 05:54:52PM +0300, Eduard - Gabriel Munteanu wrote:
> The conversion was done using the GNU 'expand' tool (default settings)
> to make it obey the QEMU coding style.
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
I'm not really interested in this: we copied pci_regs.h from linux
to help non-linux hosts, and keeping the code consistent
with the original makes detecting bugs and adding new stuff
from linux/pci_regs.h easier.
> ---
> hw/pci_regs.h | 1330 ++++++++++++++++++++++++++++----------------------------
> 1 files changed, 665 insertions(+), 665 deletions(-)
> rewrite hw/pci_regs.h (90%)
>
> diff --git a/hw/pci_regs.h b/hw/pci_regs.h
> dissimilarity index 90%
> index dd0bed4..0f9f84c 100644
> --- a/hw/pci_regs.h
> +++ b/hw/pci_regs.h
> @@ -1,665 +1,665 @@
> -/*
> - * pci_regs.h
> - *
> - * PCI standard defines
> - * Copyright 1994, Drew Eckhardt
> - * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
> - *
> - * For more information, please consult the following manuals (look at
> - * http://www.pcisig.com/ for how to get them):
> - *
> - * PCI BIOS Specification
> - * PCI Local Bus Specification
> - * PCI to PCI Bridge Specification
> - * PCI System Design Guide
> - *
> - * For hypertransport information, please consult the following manuals
> - * from http://www.hypertransport.org
> - *
> - * The Hypertransport I/O Link Specification
> - */
> -
> -#ifndef LINUX_PCI_REGS_H
> -#define LINUX_PCI_REGS_H
> -
> -/*
> - * Under PCI, each device has 256 bytes of configuration address space,
> - * of which the first 64 bytes are standardized as follows:
> - */
> -#define PCI_VENDOR_ID 0x00 /* 16 bits */
> -#define PCI_DEVICE_ID 0x02 /* 16 bits */
> -#define PCI_COMMAND 0x04 /* 16 bits */
> -#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
> -#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
> -#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
> -#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
> -#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
> -#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
> -#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
> -#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
> -#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
> -#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
> -#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
> -
> -#define PCI_STATUS 0x06 /* 16 bits */
> -#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
> -#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
> -#define PCI_STATUS_66MHZ 0x20 /* Support 66 Mhz PCI 2.1 bus */
> -#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
> -#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast-back to back */
> -#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
> -#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
> -#define PCI_STATUS_DEVSEL_FAST 0x000
> -#define PCI_STATUS_DEVSEL_MEDIUM 0x200
> -#define PCI_STATUS_DEVSEL_SLOW 0x400
> -#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
> -#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
> -#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
> -#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
> -#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
> -
> -#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
> -#define PCI_REVISION_ID 0x08 /* Revision ID */
> -#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
> -#define PCI_CLASS_DEVICE 0x0a /* Device class */
> -
> -#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
> -#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
> -#define PCI_HEADER_TYPE 0x0e /* 8 bits */
> -#define PCI_HEADER_TYPE_NORMAL 0
> -#define PCI_HEADER_TYPE_BRIDGE 1
> -#define PCI_HEADER_TYPE_CARDBUS 2
> -
> -#define PCI_BIST 0x0f /* 8 bits */
> -#define PCI_BIST_CODE_MASK 0x0f /* Return result */
> -#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
> -#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
> -
> -/*
> - * Base addresses specify locations in memory or I/O space.
> - * Decoded size can be determined by writing a value of
> - * 0xffffffff to the register, and reading it back. Only
> - * 1 bits are decoded.
> - */
> -#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
> -#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
> -#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
> -#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
> -#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
> -#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
> -#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
> -#define PCI_BASE_ADDRESS_SPACE_IO 0x01
> -#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
> -#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
> -#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
> -#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
> -#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
> -#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
> -#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
> -#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
> -/* bit 1 is reserved if address_space = 1 */
> -
> -/* Header type 0 (normal devices) */
> -#define PCI_CARDBUS_CIS 0x28
> -#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
> -#define PCI_SUBSYSTEM_ID 0x2e
> -#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
> -#define PCI_ROM_ADDRESS_ENABLE 0x01
> -#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
> -
> -#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
> -
> -/* 0x35-0x3b are reserved */
> -#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
> -#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
> -#define PCI_MIN_GNT 0x3e /* 8 bits */
> -#define PCI_MAX_LAT 0x3f /* 8 bits */
> -
> -/* Header type 1 (PCI-to-PCI bridges) */
> -#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
> -#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
> -#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
> -#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
> -#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
> -#define PCI_IO_LIMIT 0x1d
> -#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
> -#define PCI_IO_RANGE_TYPE_16 0x00
> -#define PCI_IO_RANGE_TYPE_32 0x01
> -#define PCI_IO_RANGE_MASK (~0x0fUL)
> -#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
> -#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
> -#define PCI_MEMORY_LIMIT 0x22
> -#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
> -#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
> -#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
> -#define PCI_PREF_MEMORY_LIMIT 0x26
> -#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
> -#define PCI_PREF_RANGE_TYPE_32 0x00
> -#define PCI_PREF_RANGE_TYPE_64 0x01
> -#define PCI_PREF_RANGE_MASK (~0x0fUL)
> -#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
> -#define PCI_PREF_LIMIT_UPPER32 0x2c
> -#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
> -#define PCI_IO_LIMIT_UPPER16 0x32
> -/* 0x34 same as for htype 0 */
> -/* 0x35-0x3b is reserved */
> -#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
> -/* 0x3c-0x3d are same as for htype 0 */
> -#define PCI_BRIDGE_CONTROL 0x3e
> -#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
> -#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
> -#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
> -#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
> -#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
> -#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
> -#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
> -
> -/* Header type 2 (CardBus bridges) */
> -#define PCI_CB_CAPABILITY_LIST 0x14
> -/* 0x15 reserved */
> -#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
> -#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
> -#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
> -#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
> -#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
> -#define PCI_CB_MEMORY_BASE_0 0x1c
> -#define PCI_CB_MEMORY_LIMIT_0 0x20
> -#define PCI_CB_MEMORY_BASE_1 0x24
> -#define PCI_CB_MEMORY_LIMIT_1 0x28
> -#define PCI_CB_IO_BASE_0 0x2c
> -#define PCI_CB_IO_BASE_0_HI 0x2e
> -#define PCI_CB_IO_LIMIT_0 0x30
> -#define PCI_CB_IO_LIMIT_0_HI 0x32
> -#define PCI_CB_IO_BASE_1 0x34
> -#define PCI_CB_IO_BASE_1_HI 0x36
> -#define PCI_CB_IO_LIMIT_1 0x38
> -#define PCI_CB_IO_LIMIT_1_HI 0x3a
> -#define PCI_CB_IO_RANGE_MASK (~0x03UL)
> -/* 0x3c-0x3d are same as for htype 0 */
> -#define PCI_CB_BRIDGE_CONTROL 0x3e
> -#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
> -#define PCI_CB_BRIDGE_CTL_SERR 0x02
> -#define PCI_CB_BRIDGE_CTL_ISA 0x04
> -#define PCI_CB_BRIDGE_CTL_VGA 0x08
> -#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
> -#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
> -#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
> -#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
> -#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
> -#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
> -#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
> -#define PCI_CB_SUBSYSTEM_ID 0x42
> -#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
> -/* 0x48-0x7f reserved */
> -
> -/* Capability lists */
> -
> -#define PCI_CAP_LIST_ID 0 /* Capability ID */
> -#define PCI_CAP_ID_PM 0x01 /* Power Management */
> -#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
> -#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
> -#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
> -#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
> -#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
> -#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
> -#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
> -#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
> -#define PCI_CAP_ID_DBG 0x0A /* Debug port */
> -#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
> -#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
> -#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
> -#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
> -#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
> -#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
> -#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
> -#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
> -#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
> -#define PCI_CAP_SIZEOF 4
> -
> -/* Power Management Registers */
> -
> -#define PCI_PM_PMC 2 /* PM Capabilities Register */
> -#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
> -#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
> -#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
> -#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
> -#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxilliary power support mask */
> -#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
> -#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
> -#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
> -#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
> -#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
> -#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
> -#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
> -#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
> -#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
> -#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
> -#define PCI_PM_CTRL 4 /* PM control and status register */
> -#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
> -#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
> -#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
> -#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
> -#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
> -#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
> -#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
> -#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
> -#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
> -#define PCI_PM_DATA_REGISTER 7 /* (??) */
> -#define PCI_PM_SIZEOF 8
> -
> -/* AGP registers */
> -
> -#define PCI_AGP_VERSION 2 /* BCD version number */
> -#define PCI_AGP_RFU 3 /* Rest of capability flags */
> -#define PCI_AGP_STATUS 4 /* Status register */
> -#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
> -#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
> -#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
> -#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
> -#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
> -#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
> -#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
> -#define PCI_AGP_COMMAND 8 /* Control register */
> -#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
> -#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
> -#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
> -#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
> -#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
> -#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
> -#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
> -#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
> -#define PCI_AGP_SIZEOF 12
> -
> -/* Vital Product Data */
> -
> -#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
> -#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
> -#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
> -#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
> -
> -/* Slot Identification */
> -
> -#define PCI_SID_ESR 2 /* Expansion Slot Register */
> -#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
> -#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
> -#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
> -
> -/* Message Signalled Interrupts registers */
> -
> -#define PCI_MSI_FLAGS 2 /* Various flags */
> -#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
> -#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
> -#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
> -#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
> -#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
> -#define PCI_MSI_RFU 3 /* Rest of capability flags */
> -#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
> -#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
> -#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
> -#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
> -#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
> -#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
> -
> -/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
> -#define PCI_MSIX_FLAGS 2
> -#define PCI_MSIX_FLAGS_QSIZE 0x7FF
> -#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
> -#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
> -#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
> -
> -/* CompactPCI Hotswap Register */
> -
> -#define PCI_CHSWP_CSR 2 /* Control and Status Register */
> -#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
> -#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
> -#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
> -#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
> -#define PCI_CHSWP_PI 0x30 /* Programming Interface */
> -#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
> -#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
> -
> -/* PCI Advanced Feature registers */
> -
> -#define PCI_AF_LENGTH 2
> -#define PCI_AF_CAP 3
> -#define PCI_AF_CAP_TP 0x01
> -#define PCI_AF_CAP_FLR 0x02
> -#define PCI_AF_CTRL 4
> -#define PCI_AF_CTRL_FLR 0x01
> -#define PCI_AF_STATUS 5
> -#define PCI_AF_STATUS_TP 0x01
> -
> -/* PCI-X registers */
> -
> -#define PCI_X_CMD 2 /* Modes & Features */
> -#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
> -#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
> -#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
> -#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
> -#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
> -#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
> -#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
> - /* Max # of outstanding split transactions */
> -#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
> -#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
> -#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
> -#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
> -#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
> -#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
> -#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
> -#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
> -#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
> -#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
> -#define PCI_X_STATUS 4 /* PCI-X capabilities */
> -#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
> -#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
> -#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
> -#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
> -#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
> -#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
> -#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
> -#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
> -#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
> -#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
> -#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
> -#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
> -#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
> -
> -/* PCI Express capability registers */
> -
> -#define PCI_EXP_FLAGS 2 /* Capabilities register */
> -#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
> -#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
> -#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
> -#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
> -#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
> -#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
> -#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
> -#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
> -#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
> -#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
> -#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
> -#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
> -#define PCI_EXP_DEVCAP 4 /* Device capabilities */
> -#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
> -#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
> -#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
> -#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
> -#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
> -#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
> -#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
> -#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
> -#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
> -#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
> -#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
> -#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
> -#define PCI_EXP_DEVCTL 8 /* Device Control */
> -#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
> -#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
> -#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
> -#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
> -#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
> -#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
> -#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
> -#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
> -#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
> -#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
> -#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
> -#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
> -#define PCI_EXP_DEVSTA 10 /* Device Status */
> -#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
> -#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
> -#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
> -#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
> -#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
> -#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
> -#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
> -#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
> -#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
> -#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
> -#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
> -#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
> -#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
> -#define PCI_EXP_LNKCAP_SDERC 0x00080000 /* Suprise Down Error Reporting Capable */
> -#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
> -#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
> -#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
> -#define PCI_EXP_LNKCTL 16 /* Link Control */
> -#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
> -#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
> -#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
> -#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
> -#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
> -#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
> -#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
> -#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
> -#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
> -#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Lnk Autonomous Bandwidth Interrupt Enable */
> -#define PCI_EXP_LNKSTA 18 /* Link Status */
> -#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
> -#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Nogotiated Link Width */
> -#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
> -#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
> -#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
> -#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
> -#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
> -#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
> -#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
> -#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
> -#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
> -#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
> -#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
> -#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
> -#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
> -#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
> -#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
> -#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
> -#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
> -#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
> -#define PCI_EXP_SLTCTL 24 /* Slot Control */
> -#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
> -#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
> -#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
> -#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
> -#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
> -#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
> -#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
> -#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
> -#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
> -#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
> -#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
> -#define PCI_EXP_SLTSTA 26 /* Slot Status */
> -#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
> -#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
> -#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
> -#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
> -#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
> -#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
> -#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
> -#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
> -#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
> -#define PCI_EXP_RTCTL 28 /* Root Control */
> -#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
> -#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
> -#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
> -#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
> -#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
> -#define PCI_EXP_RTCAP 30 /* Root Capabilities */
> -#define PCI_EXP_RTSTA 32 /* Root Status */
> -#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
> -#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
> -#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
> -#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
> -#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
> -#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
> -
> -/* Extended Capabilities (PCI-X 2.0 and Express) */
> -#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
> -#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
> -#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
> -
> -#define PCI_EXT_CAP_ID_ERR 1
> -#define PCI_EXT_CAP_ID_VC 2
> -#define PCI_EXT_CAP_ID_DSN 3
> -#define PCI_EXT_CAP_ID_PWR 4
> -#define PCI_EXT_CAP_ID_ARI 14
> -#define PCI_EXT_CAP_ID_ATS 15
> -#define PCI_EXT_CAP_ID_SRIOV 16
> -
> -/* Advanced Error Reporting */
> -#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
> -#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
> -#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
> -#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
> -#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
> -#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
> -#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
> -#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
> -#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
> -#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
> -#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
> -#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
> -#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
> - /* Same bits as above */
> -#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
> - /* Same bits as above */
> -#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
> -#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
> -#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
> -#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
> -#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
> -#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
> -#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
> - /* Same bits as above */
> -#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
> -#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
> -#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
> -#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
> -#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
> -#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
> -#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
> -#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
> -/* Correctable Err Reporting Enable */
> -#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
> -/* Non-fatal Err Reporting Enable */
> -#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
> -/* Fatal Err Reporting Enable */
> -#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
> -#define PCI_ERR_ROOT_STATUS 48
> -#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
> -/* Multi ERR_COR Received */
> -#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
> -/* ERR_FATAL/NONFATAL Recevied */
> -#define PCI_ERR_ROOT_UNCOR_RCV 0x00000004
> -/* Multi ERR_FATAL/NONFATAL Recevied */
> -#define PCI_ERR_ROOT_MULTI_UNCOR_RCV 0x00000008
> -#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
> -#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
> -#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
> -#define PCI_ERR_ROOT_COR_SRC 52
> -#define PCI_ERR_ROOT_SRC 54
> -
> -/* Virtual Channel */
> -#define PCI_VC_PORT_REG1 4
> -#define PCI_VC_PORT_REG2 8
> -#define PCI_VC_PORT_CTRL 12
> -#define PCI_VC_PORT_STATUS 14
> -#define PCI_VC_RES_CAP 16
> -#define PCI_VC_RES_CTRL 20
> -#define PCI_VC_RES_STATUS 26
> -
> -/* Power Budgeting */
> -#define PCI_PWR_DSR 4 /* Data Select Register */
> -#define PCI_PWR_DATA 8 /* Data Register */
> -#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
> -#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
> -#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
> -#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
> -#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
> -#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
> -#define PCI_PWR_CAP 12 /* Capability */
> -#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
> -
> -/*
> - * Hypertransport sub capability types
> - *
> - * Unfortunately there are both 3 bit and 5 bit capability types defined
> - * in the HT spec, catering for that is a little messy. You probably don't
> - * want to use these directly, just use pci_find_ht_capability() and it
> - * will do the right thing for you.
> - */
> -#define HT_3BIT_CAP_MASK 0xE0
> -#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
> -#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
> -
> -#define HT_5BIT_CAP_MASK 0xF8
> -#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
> -#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
> -#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
> -#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
> -#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
> -#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
> -#define HT_MSI_FLAGS 0x02 /* Offset to flags */
> -#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
> -#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
> -#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
> -#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
> -#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
> -#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
> -#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
> -#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
> -#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
> -#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
> -#define HT_CAPTYPE_PM 0xE0 /* Hypertransport powermanagement configuration */
> -
> -/* Alternative Routing-ID Interpretation */
> -#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
> -#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
> -#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
> -#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
> -#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
> -#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
> -#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
> -#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
> -
> -/* Address Translation Service */
> -#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
> -#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
> -#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
> -#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
> -#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
> -#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
> -#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
> -
> -/* Single Root I/O Virtualization */
> -#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
> -#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
> -#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
> -#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
> -#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
> -#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
> -#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
> -#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
> -#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
> -#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
> -#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
> -#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
> -#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
> -#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
> -#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
> -#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
> -#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
> -#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
> -#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
> -#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
> -#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
> -#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
> -#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset*/
> -#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
> -#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
> -#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
> -#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
> -#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
> -#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
> -
> -#endif /* LINUX_PCI_REGS_H */
> +/*
> + * pci_regs.h
> + *
> + * PCI standard defines
> + * Copyright 1994, Drew Eckhardt
> + * Copyright 1997--1999 Martin Mares <mj@ucw.cz>
> + *
> + * For more information, please consult the following manuals (look at
> + * http://www.pcisig.com/ for how to get them):
> + *
> + * PCI BIOS Specification
> + * PCI Local Bus Specification
> + * PCI to PCI Bridge Specification
> + * PCI System Design Guide
> + *
> + * For hypertransport information, please consult the following manuals
> + * from http://www.hypertransport.org
> + *
> + * The Hypertransport I/O Link Specification
> + */
> +
> +#ifndef LINUX_PCI_REGS_H
> +#define LINUX_PCI_REGS_H
> +
> +/*
> + * Under PCI, each device has 256 bytes of configuration address space,
> + * of which the first 64 bytes are standardized as follows:
> + */
> +#define PCI_VENDOR_ID 0x00 /* 16 bits */
> +#define PCI_DEVICE_ID 0x02 /* 16 bits */
> +#define PCI_COMMAND 0x04 /* 16 bits */
> +#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
> +#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
> +#define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
> +#define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
> +#define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
> +#define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
> +#define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
> +#define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
> +#define PCI_COMMAND_SERR 0x100 /* Enable SERR */
> +#define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
> +#define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
> +
> +#define PCI_STATUS 0x06 /* 16 bits */
> +#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */
> +#define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */
> +#define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
> +#define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */
> +#define PCI_STATUS_FAST_BACK 0x80 /* Accept fast-back to back */
> +#define PCI_STATUS_PARITY 0x100 /* Detected parity error */
> +#define PCI_STATUS_DEVSEL_MASK 0x600 /* DEVSEL timing */
> +#define PCI_STATUS_DEVSEL_FAST 0x000
> +#define PCI_STATUS_DEVSEL_MEDIUM 0x200
> +#define PCI_STATUS_DEVSEL_SLOW 0x400
> +#define PCI_STATUS_SIG_TARGET_ABORT 0x800 /* Set on target abort */
> +#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* Master ack of " */
> +#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
> +#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
> +#define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */
> +
> +#define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */
> +#define PCI_REVISION_ID 0x08 /* Revision ID */
> +#define PCI_CLASS_PROG 0x09 /* Reg. Level Programming Interface */
> +#define PCI_CLASS_DEVICE 0x0a /* Device class */
> +
> +#define PCI_CACHE_LINE_SIZE 0x0c /* 8 bits */
> +#define PCI_LATENCY_TIMER 0x0d /* 8 bits */
> +#define PCI_HEADER_TYPE 0x0e /* 8 bits */
> +#define PCI_HEADER_TYPE_NORMAL 0
> +#define PCI_HEADER_TYPE_BRIDGE 1
> +#define PCI_HEADER_TYPE_CARDBUS 2
> +
> +#define PCI_BIST 0x0f /* 8 bits */
> +#define PCI_BIST_CODE_MASK 0x0f /* Return result */
> +#define PCI_BIST_START 0x40 /* 1 to start BIST, 2 secs or less */
> +#define PCI_BIST_CAPABLE 0x80 /* 1 if BIST capable */
> +
> +/*
> + * Base addresses specify locations in memory or I/O space.
> + * Decoded size can be determined by writing a value of
> + * 0xffffffff to the register, and reading it back. Only
> + * 1 bits are decoded.
> + */
> +#define PCI_BASE_ADDRESS_0 0x10 /* 32 bits */
> +#define PCI_BASE_ADDRESS_1 0x14 /* 32 bits [htype 0,1 only] */
> +#define PCI_BASE_ADDRESS_2 0x18 /* 32 bits [htype 0 only] */
> +#define PCI_BASE_ADDRESS_3 0x1c /* 32 bits */
> +#define PCI_BASE_ADDRESS_4 0x20 /* 32 bits */
> +#define PCI_BASE_ADDRESS_5 0x24 /* 32 bits */
> +#define PCI_BASE_ADDRESS_SPACE 0x01 /* 0 = memory, 1 = I/O */
> +#define PCI_BASE_ADDRESS_SPACE_IO 0x01
> +#define PCI_BASE_ADDRESS_SPACE_MEMORY 0x00
> +#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
> +#define PCI_BASE_ADDRESS_MEM_TYPE_32 0x00 /* 32 bit address */
> +#define PCI_BASE_ADDRESS_MEM_TYPE_1M 0x02 /* Below 1M [obsolete] */
> +#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */
> +#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */
> +#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL)
> +#define PCI_BASE_ADDRESS_IO_MASK (~0x03UL)
> +/* bit 1 is reserved if address_space = 1 */
> +
> +/* Header type 0 (normal devices) */
> +#define PCI_CARDBUS_CIS 0x28
> +#define PCI_SUBSYSTEM_VENDOR_ID 0x2c
> +#define PCI_SUBSYSTEM_ID 0x2e
> +#define PCI_ROM_ADDRESS 0x30 /* Bits 31..11 are address, 10..1 reserved */
> +#define PCI_ROM_ADDRESS_ENABLE 0x01
> +#define PCI_ROM_ADDRESS_MASK (~0x7ffUL)
> +
> +#define PCI_CAPABILITY_LIST 0x34 /* Offset of first capability list entry */
> +
> +/* 0x35-0x3b are reserved */
> +#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */
> +#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */
> +#define PCI_MIN_GNT 0x3e /* 8 bits */
> +#define PCI_MAX_LAT 0x3f /* 8 bits */
> +
> +/* Header type 1 (PCI-to-PCI bridges) */
> +#define PCI_PRIMARY_BUS 0x18 /* Primary bus number */
> +#define PCI_SECONDARY_BUS 0x19 /* Secondary bus number */
> +#define PCI_SUBORDINATE_BUS 0x1a /* Highest bus number behind the bridge */
> +#define PCI_SEC_LATENCY_TIMER 0x1b /* Latency timer for secondary interface */
> +#define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
> +#define PCI_IO_LIMIT 0x1d
> +#define PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
> +#define PCI_IO_RANGE_TYPE_16 0x00
> +#define PCI_IO_RANGE_TYPE_32 0x01
> +#define PCI_IO_RANGE_MASK (~0x0fUL)
> +#define PCI_SEC_STATUS 0x1e /* Secondary status register, only bit 14 used */
> +#define PCI_MEMORY_BASE 0x20 /* Memory range behind */
> +#define PCI_MEMORY_LIMIT 0x22
> +#define PCI_MEMORY_RANGE_TYPE_MASK 0x0fUL
> +#define PCI_MEMORY_RANGE_MASK (~0x0fUL)
> +#define PCI_PREF_MEMORY_BASE 0x24 /* Prefetchable memory range behind */
> +#define PCI_PREF_MEMORY_LIMIT 0x26
> +#define PCI_PREF_RANGE_TYPE_MASK 0x0fUL
> +#define PCI_PREF_RANGE_TYPE_32 0x00
> +#define PCI_PREF_RANGE_TYPE_64 0x01
> +#define PCI_PREF_RANGE_MASK (~0x0fUL)
> +#define PCI_PREF_BASE_UPPER32 0x28 /* Upper half of prefetchable memory range */
> +#define PCI_PREF_LIMIT_UPPER32 0x2c
> +#define PCI_IO_BASE_UPPER16 0x30 /* Upper half of I/O addresses */
> +#define PCI_IO_LIMIT_UPPER16 0x32
> +/* 0x34 same as for htype 0 */
> +/* 0x35-0x3b are reserved */
> +#define PCI_ROM_ADDRESS1 0x38 /* Same as PCI_ROM_ADDRESS, but for htype 1 */
> +/* 0x3c-0x3d are same as for htype 0 */
> +#define PCI_BRIDGE_CONTROL 0x3e
> +#define PCI_BRIDGE_CTL_PARITY 0x01 /* Enable parity detection on secondary interface */
> +#define PCI_BRIDGE_CTL_SERR 0x02 /* The same for SERR forwarding */
> +#define PCI_BRIDGE_CTL_ISA 0x04 /* Enable ISA mode */
> +#define PCI_BRIDGE_CTL_VGA 0x08 /* Forward VGA addresses */
> +#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20 /* Report master aborts */
> +#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary bus reset */
> +#define PCI_BRIDGE_CTL_FAST_BACK 0x80 /* Fast Back2Back enabled on secondary interface */
> +
> +/* Header type 2 (CardBus bridges) */
> +#define PCI_CB_CAPABILITY_LIST 0x14
> +/* 0x15 reserved */
> +#define PCI_CB_SEC_STATUS 0x16 /* Secondary status */
> +#define PCI_CB_PRIMARY_BUS 0x18 /* PCI bus number */
> +#define PCI_CB_CARD_BUS 0x19 /* CardBus bus number */
> +#define PCI_CB_SUBORDINATE_BUS 0x1a /* Subordinate bus number */
> +#define PCI_CB_LATENCY_TIMER 0x1b /* CardBus latency timer */
> +#define PCI_CB_MEMORY_BASE_0 0x1c
> +#define PCI_CB_MEMORY_LIMIT_0 0x20
> +#define PCI_CB_MEMORY_BASE_1 0x24
> +#define PCI_CB_MEMORY_LIMIT_1 0x28
> +#define PCI_CB_IO_BASE_0 0x2c
> +#define PCI_CB_IO_BASE_0_HI 0x2e
> +#define PCI_CB_IO_LIMIT_0 0x30
> +#define PCI_CB_IO_LIMIT_0_HI 0x32
> +#define PCI_CB_IO_BASE_1 0x34
> +#define PCI_CB_IO_BASE_1_HI 0x36
> +#define PCI_CB_IO_LIMIT_1 0x38
> +#define PCI_CB_IO_LIMIT_1_HI 0x3a
> +#define PCI_CB_IO_RANGE_MASK (~0x03UL)
> +/* 0x3c-0x3d are same as for htype 0 */
> +#define PCI_CB_BRIDGE_CONTROL 0x3e
> +#define PCI_CB_BRIDGE_CTL_PARITY 0x01 /* Similar to standard bridge control register */
> +#define PCI_CB_BRIDGE_CTL_SERR 0x02
> +#define PCI_CB_BRIDGE_CTL_ISA 0x04
> +#define PCI_CB_BRIDGE_CTL_VGA 0x08
> +#define PCI_CB_BRIDGE_CTL_MASTER_ABORT 0x20
> +#define PCI_CB_BRIDGE_CTL_CB_RESET 0x40 /* CardBus reset */
> +#define PCI_CB_BRIDGE_CTL_16BIT_INT 0x80 /* Enable interrupt for 16-bit cards */
> +#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM0 0x100 /* Prefetch enable for both memory regions */
> +#define PCI_CB_BRIDGE_CTL_PREFETCH_MEM1 0x200
> +#define PCI_CB_BRIDGE_CTL_POST_WRITES 0x400
> +#define PCI_CB_SUBSYSTEM_VENDOR_ID 0x40
> +#define PCI_CB_SUBSYSTEM_ID 0x42
> +#define PCI_CB_LEGACY_MODE_BASE 0x44 /* 16-bit PC Card legacy mode base address (ExCa) */
> +/* 0x48-0x7f reserved */
> +
> +/* Capability lists */
> +
> +#define PCI_CAP_LIST_ID 0 /* Capability ID */
> +#define PCI_CAP_ID_PM 0x01 /* Power Management */
> +#define PCI_CAP_ID_AGP 0x02 /* Accelerated Graphics Port */
> +#define PCI_CAP_ID_VPD 0x03 /* Vital Product Data */
> +#define PCI_CAP_ID_SLOTID 0x04 /* Slot Identification */
> +#define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */
> +#define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */
> +#define PCI_CAP_ID_PCIX 0x07 /* PCI-X */
> +#define PCI_CAP_ID_HT 0x08 /* HyperTransport */
> +#define PCI_CAP_ID_VNDR 0x09 /* Vendor specific */
> +#define PCI_CAP_ID_DBG 0x0A /* Debug port */
> +#define PCI_CAP_ID_CCRC 0x0B /* CompactPCI Central Resource Control */
> +#define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */
> +#define PCI_CAP_ID_SSVID 0x0D /* Bridge subsystem vendor/device ID */
> +#define PCI_CAP_ID_AGP3 0x0E /* AGP Target PCI-PCI bridge */
> +#define PCI_CAP_ID_EXP 0x10 /* PCI Express */
> +#define PCI_CAP_ID_MSIX 0x11 /* MSI-X */
> +#define PCI_CAP_ID_AF 0x13 /* PCI Advanced Features */
> +#define PCI_CAP_LIST_NEXT 1 /* Next capability in the list */
> +#define PCI_CAP_FLAGS 2 /* Capability defined flags (16 bits) */
> +#define PCI_CAP_SIZEOF 4
> +
> +/* Power Management Registers */
> +
> +#define PCI_PM_PMC 2 /* PM Capabilities Register */
> +#define PCI_PM_CAP_VER_MASK 0x0007 /* Version */
> +#define PCI_PM_CAP_PME_CLOCK 0x0008 /* PME clock required */
> +#define PCI_PM_CAP_RESERVED 0x0010 /* Reserved field */
> +#define PCI_PM_CAP_DSI 0x0020 /* Device specific initialization */
> +#define PCI_PM_CAP_AUX_POWER 0x01C0 /* Auxiliary power support mask */
> +#define PCI_PM_CAP_D1 0x0200 /* D1 power state support */
> +#define PCI_PM_CAP_D2 0x0400 /* D2 power state support */
> +#define PCI_PM_CAP_PME 0x0800 /* PME pin supported */
> +#define PCI_PM_CAP_PME_MASK 0xF800 /* PME Mask of all supported states */
> +#define PCI_PM_CAP_PME_D0 0x0800 /* PME# from D0 */
> +#define PCI_PM_CAP_PME_D1 0x1000 /* PME# from D1 */
> +#define PCI_PM_CAP_PME_D2 0x2000 /* PME# from D2 */
> +#define PCI_PM_CAP_PME_D3 0x4000 /* PME# from D3 (hot) */
> +#define PCI_PM_CAP_PME_D3cold 0x8000 /* PME# from D3 (cold) */
> +#define PCI_PM_CAP_PME_SHIFT 11 /* Start of the PME Mask in PMC */
> +#define PCI_PM_CTRL 4 /* PM control and status register */
> +#define PCI_PM_CTRL_STATE_MASK 0x0003 /* Current power state (D0 to D3) */
> +#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No reset for D3hot->D0 */
> +#define PCI_PM_CTRL_PME_ENABLE 0x0100 /* PME pin enable */
> +#define PCI_PM_CTRL_DATA_SEL_MASK 0x1e00 /* Data select (??) */
> +#define PCI_PM_CTRL_DATA_SCALE_MASK 0x6000 /* Data scale (??) */
> +#define PCI_PM_CTRL_PME_STATUS 0x8000 /* PME pin status */
> +#define PCI_PM_PPB_EXTENSIONS 6 /* PPB support extensions (??) */
> +#define PCI_PM_PPB_B2_B3 0x40 /* Stop clock when in D3hot (??) */
> +#define PCI_PM_BPCC_ENABLE 0x80 /* Bus power/clock control enable (??) */
> +#define PCI_PM_DATA_REGISTER 7 /* (??) */
> +#define PCI_PM_SIZEOF 8
> +
> +/* AGP registers */
> +
> +#define PCI_AGP_VERSION 2 /* BCD version number */
> +#define PCI_AGP_RFU 3 /* Rest of capability flags */
> +#define PCI_AGP_STATUS 4 /* Status register */
> +#define PCI_AGP_STATUS_RQ_MASK 0xff000000 /* Maximum number of requests - 1 */
> +#define PCI_AGP_STATUS_SBA 0x0200 /* Sideband addressing supported */
> +#define PCI_AGP_STATUS_64BIT 0x0020 /* 64-bit addressing supported */
> +#define PCI_AGP_STATUS_FW 0x0010 /* FW transfers supported */
> +#define PCI_AGP_STATUS_RATE4 0x0004 /* 4x transfer rate supported */
> +#define PCI_AGP_STATUS_RATE2 0x0002 /* 2x transfer rate supported */
> +#define PCI_AGP_STATUS_RATE1 0x0001 /* 1x transfer rate supported */
> +#define PCI_AGP_COMMAND 8 /* Control register */
> +#define PCI_AGP_COMMAND_RQ_MASK 0xff000000 /* Master: Maximum number of requests */
> +#define PCI_AGP_COMMAND_SBA 0x0200 /* Sideband addressing enabled */
> +#define PCI_AGP_COMMAND_AGP 0x0100 /* Allow processing of AGP transactions */
> +#define PCI_AGP_COMMAND_64BIT 0x0020 /* Allow processing of 64-bit addresses */
> +#define PCI_AGP_COMMAND_FW 0x0010 /* Force FW transfers */
> +#define PCI_AGP_COMMAND_RATE4 0x0004 /* Use 4x rate */
> +#define PCI_AGP_COMMAND_RATE2 0x0002 /* Use 2x rate */
> +#define PCI_AGP_COMMAND_RATE1 0x0001 /* Use 1x rate */
> +#define PCI_AGP_SIZEOF 12
> +
> +/* Vital Product Data */
> +
> +#define PCI_VPD_ADDR 2 /* Address to access (15 bits!) */
> +#define PCI_VPD_ADDR_MASK 0x7fff /* Address mask */
> +#define PCI_VPD_ADDR_F 0x8000 /* Write 0, 1 indicates completion */
> +#define PCI_VPD_DATA 4 /* 32-bits of data returned here */
> +
> +/* Slot Identification */
> +
> +#define PCI_SID_ESR 2 /* Expansion Slot Register */
> +#define PCI_SID_ESR_NSLOTS 0x1f /* Number of expansion slots available */
> +#define PCI_SID_ESR_FIC 0x20 /* First In Chassis Flag */
> +#define PCI_SID_CHASSIS_NR 3 /* Chassis Number */
> +
> +/* Message Signalled Interrupts registers */
> +
> +#define PCI_MSI_FLAGS 2 /* Various flags */
> +#define PCI_MSI_FLAGS_64BIT 0x80 /* 64-bit addresses allowed */
> +#define PCI_MSI_FLAGS_QSIZE 0x70 /* Message queue size configured */
> +#define PCI_MSI_FLAGS_QMASK 0x0e /* Maximum queue size available */
> +#define PCI_MSI_FLAGS_ENABLE 0x01 /* MSI feature enabled */
> +#define PCI_MSI_FLAGS_MASKBIT 0x100 /* 64-bit mask bits allowed */
> +#define PCI_MSI_RFU 3 /* Rest of capability flags */
> +#define PCI_MSI_ADDRESS_LO 4 /* Lower 32 bits */
> +#define PCI_MSI_ADDRESS_HI 8 /* Upper 32 bits (if PCI_MSI_FLAGS_64BIT set) */
> +#define PCI_MSI_DATA_32 8 /* 16 bits of data for 32-bit devices */
> +#define PCI_MSI_MASK_32 12 /* Mask bits register for 32-bit devices */
> +#define PCI_MSI_DATA_64 12 /* 16 bits of data for 64-bit devices */
> +#define PCI_MSI_MASK_64 16 /* Mask bits register for 64-bit devices */
> +
> +/* MSI-X registers (these are at offset PCI_MSIX_FLAGS) */
> +#define PCI_MSIX_FLAGS 2
> +#define PCI_MSIX_FLAGS_QSIZE 0x7FF
> +#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
> +#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
> +#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
> +
> +/* CompactPCI Hotswap Register */
> +
> +#define PCI_CHSWP_CSR 2 /* Control and Status Register */
> +#define PCI_CHSWP_DHA 0x01 /* Device Hiding Arm */
> +#define PCI_CHSWP_EIM 0x02 /* ENUM# Signal Mask */
> +#define PCI_CHSWP_PIE 0x04 /* Pending Insert or Extract */
> +#define PCI_CHSWP_LOO 0x08 /* LED On / Off */
> +#define PCI_CHSWP_PI 0x30 /* Programming Interface */
> +#define PCI_CHSWP_EXT 0x40 /* ENUM# status - extraction */
> +#define PCI_CHSWP_INS 0x80 /* ENUM# status - insertion */
> +
> +/* PCI Advanced Feature registers */
> +
> +#define PCI_AF_LENGTH 2
> +#define PCI_AF_CAP 3
> +#define PCI_AF_CAP_TP 0x01
> +#define PCI_AF_CAP_FLR 0x02
> +#define PCI_AF_CTRL 4
> +#define PCI_AF_CTRL_FLR 0x01
> +#define PCI_AF_STATUS 5
> +#define PCI_AF_STATUS_TP 0x01
> +
> +/* PCI-X registers */
> +
> +#define PCI_X_CMD 2 /* Modes & Features */
> +#define PCI_X_CMD_DPERR_E 0x0001 /* Data Parity Error Recovery Enable */
> +#define PCI_X_CMD_ERO 0x0002 /* Enable Relaxed Ordering */
> +#define PCI_X_CMD_READ_512 0x0000 /* 512 byte maximum read byte count */
> +#define PCI_X_CMD_READ_1K 0x0004 /* 1Kbyte maximum read byte count */
> +#define PCI_X_CMD_READ_2K 0x0008 /* 2Kbyte maximum read byte count */
> +#define PCI_X_CMD_READ_4K 0x000c /* 4Kbyte maximum read byte count */
> +#define PCI_X_CMD_MAX_READ 0x000c /* Max Memory Read Byte Count */
> + /* Max # of outstanding split transactions */
> +#define PCI_X_CMD_SPLIT_1 0x0000 /* Max 1 */
> +#define PCI_X_CMD_SPLIT_2 0x0010 /* Max 2 */
> +#define PCI_X_CMD_SPLIT_3 0x0020 /* Max 3 */
> +#define PCI_X_CMD_SPLIT_4 0x0030 /* Max 4 */
> +#define PCI_X_CMD_SPLIT_8 0x0040 /* Max 8 */
> +#define PCI_X_CMD_SPLIT_12 0x0050 /* Max 12 */
> +#define PCI_X_CMD_SPLIT_16 0x0060 /* Max 16 */
> +#define PCI_X_CMD_SPLIT_32 0x0070 /* Max 32 */
> +#define PCI_X_CMD_MAX_SPLIT 0x0070 /* Max Outstanding Split Transactions */
> +#define PCI_X_CMD_VERSION(x) (((x) >> 12) & 3) /* Version */
> +#define PCI_X_STATUS 4 /* PCI-X capabilities */
> +#define PCI_X_STATUS_DEVFN 0x000000ff /* A copy of devfn */
> +#define PCI_X_STATUS_BUS 0x0000ff00 /* A copy of bus nr */
> +#define PCI_X_STATUS_64BIT 0x00010000 /* 64-bit device */
> +#define PCI_X_STATUS_133MHZ 0x00020000 /* 133 MHz capable */
> +#define PCI_X_STATUS_SPL_DISC 0x00040000 /* Split Completion Discarded */
> +#define PCI_X_STATUS_UNX_SPL 0x00080000 /* Unexpected Split Completion */
> +#define PCI_X_STATUS_COMPLEX 0x00100000 /* Device Complexity */
> +#define PCI_X_STATUS_MAX_READ 0x00600000 /* Designed Max Memory Read Count */
> +#define PCI_X_STATUS_MAX_SPLIT 0x03800000 /* Designed Max Outstanding Split Transactions */
> +#define PCI_X_STATUS_MAX_CUM 0x1c000000 /* Designed Max Cumulative Read Size */
> +#define PCI_X_STATUS_SPL_ERR 0x20000000 /* Rcvd Split Completion Error Msg */
> +#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
> +#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
> +
> +/* PCI Express capability registers */
> +
> +#define PCI_EXP_FLAGS 2 /* Capabilities register */
> +#define PCI_EXP_FLAGS_VERS 0x000f /* Capability version */
> +#define PCI_EXP_FLAGS_TYPE 0x00f0 /* Device/Port type */
> +#define PCI_EXP_TYPE_ENDPOINT 0x0 /* Express Endpoint */
> +#define PCI_EXP_TYPE_LEG_END 0x1 /* Legacy Endpoint */
> +#define PCI_EXP_TYPE_ROOT_PORT 0x4 /* Root Port */
> +#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
> +#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
> +#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
> +#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
> +#define PCI_EXP_TYPE_RC_EC 0x10 /* Root Complex Event Collector */
> +#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
> +#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
> +#define PCI_EXP_DEVCAP 4 /* Device capabilities */
> +#define PCI_EXP_DEVCAP_PAYLOAD 0x07 /* Max_Payload_Size */
> +#define PCI_EXP_DEVCAP_PHANTOM 0x18 /* Phantom functions */
> +#define PCI_EXP_DEVCAP_EXT_TAG 0x20 /* Extended tags */
> +#define PCI_EXP_DEVCAP_L0S 0x1c0 /* L0s Acceptable Latency */
> +#define PCI_EXP_DEVCAP_L1 0xe00 /* L1 Acceptable Latency */
> +#define PCI_EXP_DEVCAP_ATN_BUT 0x1000 /* Attention Button Present */
> +#define PCI_EXP_DEVCAP_ATN_IND 0x2000 /* Attention Indicator Present */
> +#define PCI_EXP_DEVCAP_PWR_IND 0x4000 /* Power Indicator Present */
> +#define PCI_EXP_DEVCAP_RBER 0x8000 /* Role-Based Error Reporting */
> +#define PCI_EXP_DEVCAP_PWR_VAL 0x3fc0000 /* Slot Power Limit Value */
> +#define PCI_EXP_DEVCAP_PWR_SCL 0xc000000 /* Slot Power Limit Scale */
> +#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
> +#define PCI_EXP_DEVCTL 8 /* Device Control */
> +#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
> +#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
> +#define PCI_EXP_DEVCTL_FERE 0x0004 /* Fatal Error Reporting Enable */
> +#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
> +#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
> +#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
> +#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
> +#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
> +#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
> +#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */
> +#define PCI_EXP_DEVCTL_READRQ 0x7000 /* Max_Read_Request_Size */
> +#define PCI_EXP_DEVCTL_BCR_FLR 0x8000 /* Bridge Configuration Retry / FLR */
> +#define PCI_EXP_DEVSTA 10 /* Device Status */
> +#define PCI_EXP_DEVSTA_CED 0x01 /* Correctable Error Detected */
> +#define PCI_EXP_DEVSTA_NFED 0x02 /* Non-Fatal Error Detected */
> +#define PCI_EXP_DEVSTA_FED 0x04 /* Fatal Error Detected */
> +#define PCI_EXP_DEVSTA_URD 0x08 /* Unsupported Request Detected */
> +#define PCI_EXP_DEVSTA_AUXPD 0x10 /* AUX Power Detected */
> +#define PCI_EXP_DEVSTA_TRPND 0x20 /* Transactions Pending */
> +#define PCI_EXP_LNKCAP 12 /* Link Capabilities */
> +#define PCI_EXP_LNKCAP_SLS 0x0000000f /* Supported Link Speeds */
> +#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
> +#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
> +#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
> +#define PCI_EXP_LNKCAP_L1EL 0x00038000 /* L1 Exit Latency */
> +#define PCI_EXP_LNKCAP_CLKPM 0x00040000 /* L1 Clock Power Management */
> +#define PCI_EXP_LNKCAP_SDERC 0x00080000 /* Surprise Down Error Reporting Capable */
> +#define PCI_EXP_LNKCAP_DLLLARC 0x00100000 /* Data Link Layer Link Active Reporting Capable */
> +#define PCI_EXP_LNKCAP_LBNC 0x00200000 /* Link Bandwidth Notification Capability */
> +#define PCI_EXP_LNKCAP_PN 0xff000000 /* Port Number */
> +#define PCI_EXP_LNKCTL 16 /* Link Control */
> +#define PCI_EXP_LNKCTL_ASPMC 0x0003 /* ASPM Control */
> +#define PCI_EXP_LNKCTL_RCB 0x0008 /* Read Completion Boundary */
> +#define PCI_EXP_LNKCTL_LD 0x0010 /* Link Disable */
> +#define PCI_EXP_LNKCTL_RL 0x0020 /* Retrain Link */
> +#define PCI_EXP_LNKCTL_CCC 0x0040 /* Common Clock Configuration */
> +#define PCI_EXP_LNKCTL_ES 0x0080 /* Extended Synch */
> +#define PCI_EXP_LNKCTL_CLKREQ_EN 0x100 /* Enable clkreq */
> +#define PCI_EXP_LNKCTL_HAWD 0x0200 /* Hardware Autonomous Width Disable */
> +#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
> +#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Lnk Autonomous Bandwidth Interrupt Enable */
> +#define PCI_EXP_LNKSTA 18 /* Link Status */
> +#define PCI_EXP_LNKSTA_CLS 0x000f /* Current Link Speed */
> +#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Negotiated Link Width */
> +#define PCI_EXP_LNKSTA_LT 0x0800 /* Link Training */
> +#define PCI_EXP_LNKSTA_SLC 0x1000 /* Slot Clock Configuration */
> +#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
> +#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */
> +#define PCI_EXP_LNKSTA_LABS 0x8000 /* Link Autonomous Bandwidth Status */
> +#define PCI_EXP_SLTCAP 20 /* Slot Capabilities */
> +#define PCI_EXP_SLTCAP_ABP 0x00000001 /* Attention Button Present */
> +#define PCI_EXP_SLTCAP_PCP 0x00000002 /* Power Controller Present */
> +#define PCI_EXP_SLTCAP_MRLSP 0x00000004 /* MRL Sensor Present */
> +#define PCI_EXP_SLTCAP_AIP 0x00000008 /* Attention Indicator Present */
> +#define PCI_EXP_SLTCAP_PIP 0x00000010 /* Power Indicator Present */
> +#define PCI_EXP_SLTCAP_HPS 0x00000020 /* Hot-Plug Surprise */
> +#define PCI_EXP_SLTCAP_HPC 0x00000040 /* Hot-Plug Capable */
> +#define PCI_EXP_SLTCAP_SPLV 0x00007f80 /* Slot Power Limit Value */
> +#define PCI_EXP_SLTCAP_SPLS 0x00018000 /* Slot Power Limit Scale */
> +#define PCI_EXP_SLTCAP_EIP 0x00020000 /* Electromechanical Interlock Present */
> +#define PCI_EXP_SLTCAP_NCCS 0x00040000 /* No Command Completed Support */
> +#define PCI_EXP_SLTCAP_PSN 0xfff80000 /* Physical Slot Number */
> +#define PCI_EXP_SLTCTL 24 /* Slot Control */
> +#define PCI_EXP_SLTCTL_ABPE 0x0001 /* Attention Button Pressed Enable */
> +#define PCI_EXP_SLTCTL_PFDE 0x0002 /* Power Fault Detected Enable */
> +#define PCI_EXP_SLTCTL_MRLSCE 0x0004 /* MRL Sensor Changed Enable */
> +#define PCI_EXP_SLTCTL_PDCE 0x0008 /* Presence Detect Changed Enable */
> +#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
> +#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
> +#define PCI_EXP_SLTCTL_AIC 0x00c0 /* Attention Indicator Control */
> +#define PCI_EXP_SLTCTL_PIC 0x0300 /* Power Indicator Control */
> +#define PCI_EXP_SLTCTL_PCC 0x0400 /* Power Controller Control */
> +#define PCI_EXP_SLTCTL_EIC 0x0800 /* Electromechanical Interlock Control */
> +#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
> +#define PCI_EXP_SLTSTA 26 /* Slot Status */
> +#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
> +#define PCI_EXP_SLTSTA_PFD 0x0002 /* Power Fault Detected */
> +#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
> +#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
> +#define PCI_EXP_SLTSTA_CC 0x0010 /* Command Completed */
> +#define PCI_EXP_SLTSTA_MRLSS 0x0020 /* MRL Sensor State */
> +#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */
> +#define PCI_EXP_SLTSTA_EIS 0x0080 /* Electromechanical Interlock Status */
> +#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */
> +#define PCI_EXP_RTCTL 28 /* Root Control */
> +#define PCI_EXP_RTCTL_SECEE 0x01 /* System Error on Correctable Error */
> +#define PCI_EXP_RTCTL_SENFEE 0x02 /* System Error on Non-Fatal Error */
> +#define PCI_EXP_RTCTL_SEFEE 0x04 /* System Error on Fatal Error */
> +#define PCI_EXP_RTCTL_PMEIE 0x08 /* PME Interrupt Enable */
> +#define PCI_EXP_RTCTL_CRSSVE 0x10 /* CRS Software Visibility Enable */
> +#define PCI_EXP_RTCAP 30 /* Root Capabilities */
> +#define PCI_EXP_RTSTA 32 /* Root Status */
> +#define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */
> +#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
> +#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
> +#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
> +#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
> +#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
> +
> +/* Extended Capabilities (PCI-X 2.0 and Express) */
> +#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
> +#define PCI_EXT_CAP_VER(header) ((header >> 16) & 0xf)
> +#define PCI_EXT_CAP_NEXT(header) ((header >> 20) & 0xffc)
> +
> +#define PCI_EXT_CAP_ID_ERR 1
> +#define PCI_EXT_CAP_ID_VC 2
> +#define PCI_EXT_CAP_ID_DSN 3
> +#define PCI_EXT_CAP_ID_PWR 4
> +#define PCI_EXT_CAP_ID_ARI 14
> +#define PCI_EXT_CAP_ID_ATS 15
> +#define PCI_EXT_CAP_ID_SRIOV 16
> +
> +/* Advanced Error Reporting */
> +#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
> +#define PCI_ERR_UNC_TRAIN 0x00000001 /* Training */
> +#define PCI_ERR_UNC_DLP 0x00000010 /* Data Link Protocol */
> +#define PCI_ERR_UNC_POISON_TLP 0x00001000 /* Poisoned TLP */
> +#define PCI_ERR_UNC_FCP 0x00002000 /* Flow Control Protocol */
> +#define PCI_ERR_UNC_COMP_TIME 0x00004000 /* Completion Timeout */
> +#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort */
> +#define PCI_ERR_UNC_UNX_COMP 0x00010000 /* Unexpected Completion */
> +#define PCI_ERR_UNC_RX_OVER 0x00020000 /* Receiver Overflow */
> +#define PCI_ERR_UNC_MALF_TLP 0x00040000 /* Malformed TLP */
> +#define PCI_ERR_UNC_ECRC 0x00080000 /* ECRC Error Status */
> +#define PCI_ERR_UNC_UNSUP 0x00100000 /* Unsupported Request */
> +#define PCI_ERR_UNCOR_MASK 8 /* Uncorrectable Error Mask */
> + /* Same bits as above */
> +#define PCI_ERR_UNCOR_SEVER 12 /* Uncorrectable Error Severity */
> + /* Same bits as above */
> +#define PCI_ERR_COR_STATUS 16 /* Correctable Error Status */
> +#define PCI_ERR_COR_RCVR 0x00000001 /* Receiver Error Status */
> +#define PCI_ERR_COR_BAD_TLP 0x00000040 /* Bad TLP Status */
> +#define PCI_ERR_COR_BAD_DLLP 0x00000080 /* Bad DLLP Status */
> +#define PCI_ERR_COR_REP_ROLL 0x00000100 /* REPLAY_NUM Rollover */
> +#define PCI_ERR_COR_REP_TIMER 0x00001000 /* Replay Timer Timeout */
> +#define PCI_ERR_COR_MASK 20 /* Correctable Error Mask */
> + /* Same bits as above */
> +#define PCI_ERR_CAP 24 /* Advanced Error Capabilities */
> +#define PCI_ERR_CAP_FEP(x) ((x) & 31) /* First Error Pointer */
> +#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
> +#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
> +#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
> +#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
> +#define PCI_ERR_HEADER_LOG 28 /* Header Log Register (16 bytes) */
> +#define PCI_ERR_ROOT_COMMAND 44 /* Root Error Command */
> +/* Correctable Err Reporting Enable */
> +#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001
> +/* Non-fatal Err Reporting Enable */
> +#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002
> +/* Fatal Err Reporting Enable */
> +#define PCI_ERR_ROOT_CMD_FATAL_EN 0x00000004
> +#define PCI_ERR_ROOT_STATUS 48
> +#define PCI_ERR_ROOT_COR_RCV 0x00000001 /* ERR_COR Received */
> +/* Multi ERR_COR Received */
> +#define PCI_ERR_ROOT_MULTI_COR_RCV 0x00000002
> +/* ERR_FATAL/NONFATAL Received */
> +#define PCI_ERR_ROOT_UNCOR_RCV 0x00000004
> +/* Multi ERR_FATAL/NONFATAL Received */
> +#define PCI_ERR_ROOT_MULTI_UNCOR_RCV 0x00000008
> +#define PCI_ERR_ROOT_FIRST_FATAL 0x00000010 /* First Fatal */
> +#define PCI_ERR_ROOT_NONFATAL_RCV 0x00000020 /* Non-Fatal Received */
> +#define PCI_ERR_ROOT_FATAL_RCV 0x00000040 /* Fatal Received */
> +#define PCI_ERR_ROOT_COR_SRC 52
> +#define PCI_ERR_ROOT_SRC 54
> +
> +/* Virtual Channel */
> +#define PCI_VC_PORT_REG1 4
> +#define PCI_VC_PORT_REG2 8
> +#define PCI_VC_PORT_CTRL 12
> +#define PCI_VC_PORT_STATUS 14
> +#define PCI_VC_RES_CAP 16
> +#define PCI_VC_RES_CTRL 20
> +#define PCI_VC_RES_STATUS 26
> +
> +/* Power Budgeting */
> +#define PCI_PWR_DSR 4 /* Data Select Register */
> +#define PCI_PWR_DATA 8 /* Data Register */
> +#define PCI_PWR_DATA_BASE(x) ((x) & 0xff) /* Base Power */
> +#define PCI_PWR_DATA_SCALE(x) (((x) >> 8) & 3) /* Data Scale */
> +#define PCI_PWR_DATA_PM_SUB(x) (((x) >> 10) & 7) /* PM Sub State */
> +#define PCI_PWR_DATA_PM_STATE(x) (((x) >> 13) & 3) /* PM State */
> +#define PCI_PWR_DATA_TYPE(x) (((x) >> 15) & 7) /* Type */
> +#define PCI_PWR_DATA_RAIL(x) (((x) >> 18) & 7) /* Power Rail */
> +#define PCI_PWR_CAP 12 /* Capability */
> +#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
> +
> +/*
> + * Hypertransport sub capability types
> + *
> + * Unfortunately there are both 3 bit and 5 bit capability types defined
> + * in the HT spec, catering for that is a little messy. You probably don't
> + * want to use these directly, just use pci_find_ht_capability() and it
> + * will do the right thing for you.
> + */
> +#define HT_3BIT_CAP_MASK 0xE0
> +#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */
> +#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */
> +
> +#define HT_5BIT_CAP_MASK 0xF8
> +#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */
> +#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */
> +#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */
> +#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */
> +#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */
> +#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */
> +#define HT_MSI_FLAGS 0x02 /* Offset to flags */
> +#define HT_MSI_FLAGS_ENABLE 0x1 /* Mapping enable */
> +#define HT_MSI_FLAGS_FIXED 0x2 /* Fixed mapping only */
> +#define HT_MSI_FIXED_ADDR 0x00000000FEE00000ULL /* Fixed addr */
> +#define HT_MSI_ADDR_LO 0x04 /* Offset to low addr bits */
> +#define HT_MSI_ADDR_LO_MASK 0xFFF00000 /* Low address bit mask */
> +#define HT_MSI_ADDR_HI 0x08 /* Offset to high addr bits */
> +#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */
> +#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */
> +#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */
> +#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */
> +#define HT_CAPTYPE_PM 0xE0 /* Hypertransport powermanagement configuration */
> +
> +/* Alternative Routing-ID Interpretation */
> +#define PCI_ARI_CAP 0x04 /* ARI Capability Register */
> +#define PCI_ARI_CAP_MFVC 0x0001 /* MFVC Function Groups Capability */
> +#define PCI_ARI_CAP_ACS 0x0002 /* ACS Function Groups Capability */
> +#define PCI_ARI_CAP_NFN(x) (((x) >> 8) & 0xff) /* Next Function Number */
> +#define PCI_ARI_CTRL 0x06 /* ARI Control Register */
> +#define PCI_ARI_CTRL_MFVC 0x0001 /* MFVC Function Groups Enable */
> +#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
> +#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
> +
> +/* Address Translation Service */
> +#define PCI_ATS_CAP 0x04 /* ATS Capability Register */
> +#define PCI_ATS_CAP_QDEP(x) ((x) & 0x1f) /* Invalidate Queue Depth */
> +#define PCI_ATS_MAX_QDEP 32 /* Max Invalidate Queue Depth */
> +#define PCI_ATS_CTRL 0x06 /* ATS Control Register */
> +#define PCI_ATS_CTRL_ENABLE 0x8000 /* ATS Enable */
> +#define PCI_ATS_CTRL_STU(x) ((x) & 0x1f) /* Smallest Translation Unit */
> +#define PCI_ATS_MIN_STU 12 /* shift of minimum STU block */
> +
> +/* Single Root I/O Virtualization */
> +#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
> +#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
> +#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
> +#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
> +#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
> +#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
> +#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
> +#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
> +#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
> +#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
> +#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
> +#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
> +#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
> +#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
> +#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
> +#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
> +#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
> +#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
> +#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
> +#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
> +#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
> +#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
> +#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset*/
> +#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
> +#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
> +#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
> +#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
> +#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
> +#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
> +
> +#endif /* LINUX_PCI_REGS_H */
> --
> 1.7.1
>
* Re: [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h
2010-08-31 20:29 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-08-31 22:58 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-31 22:58 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Tue, Aug 31, 2010 at 11:29:53PM +0300, Michael S. Tsirkin wrote:
> On Sat, Aug 28, 2010 at 05:54:52PM +0300, Eduard - Gabriel Munteanu wrote:
> > The conversion was done using the GNU 'expand' tool (default settings)
> > to make it obey the QEMU coding style.
> >
> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>
> I'm not really interested in this: we copied pci_regs.h from linux
> to help non-linux hosts, and keeping the code consistent
> with the original makes detecting bugs and adding new stuff
> from linux/pci_regs.h easier.
>
> > ---
> > hw/pci_regs.h | 1330 ++++++++++++++++++++++++++++----------------------------
> > 1 files changed, 665 insertions(+), 665 deletions(-)
> > rewrite hw/pci_regs.h (90%)
OK, I'll drop it. The only reason I did it was that one of my additions to
this file made the patch look awkwardly indented.
I'll use tabs instead and merge the additions into Linux as well.
Thanks,
Eduard
* Re: [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h
2010-08-31 22:58 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-01 10:39 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-01 10:39 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Wed, Sep 01, 2010 at 01:58:30AM +0300, Eduard - Gabriel Munteanu wrote:
> On Tue, Aug 31, 2010 at 11:29:53PM +0300, Michael S. Tsirkin wrote:
> > On Sat, Aug 28, 2010 at 05:54:52PM +0300, Eduard - Gabriel Munteanu wrote:
> > > The conversion was done using the GNU 'expand' tool (default settings)
> > > to make it obey the QEMU coding style.
> > >
> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> >
> > I'm not really interested in this: we copied pci_regs.h from linux
> > to help non-linux hosts, and keeping the code consistent
> > with the original makes detecting bugs and adding new stuff
> > from linux/pci_regs.h easier.
> >
> > > ---
> > > hw/pci_regs.h | 1330 ++++++++++++++++++++++++++++----------------------------
> > > 1 files changed, 665 insertions(+), 665 deletions(-)
> > > rewrite hw/pci_regs.h (90%)
>
> Ok, I'll drop it. The only reason I did it was one of my additions to
> this file made the patch look indented awkwardly.
>
> I'll use tabs and merge it into Linux as well.
>
>
> Thanks,
> Eduard
Good idea; this way more people with PCI knowledge can check it.
--
MST
* Re: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
2010-08-29 22:08 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-01 20:10 ` Stefan Weil
-1 siblings, 0 replies; 97+ messages in thread
From: Stefan Weil @ 2010-09-01 20:10 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: mst, kvm, joro, qemu-devel, blauwirbel, yamahata, paul, avi
Please see my comments at the end of this mail.
Am 30.08.2010 00:08, schrieb Eduard - Gabriel Munteanu:
> PCI devices should access memory through pci_memory_*() instead of
> cpu_physical_memory_*(). This also provides support for translation and
> access checking in case an IOMMU is emulated.
>
> Memory maps are treated as remote IOTLBs (that is, translation caches
> belonging to the IOMMU-aware device itself). Clients (devices) must
> provide callbacks for map invalidation in case these maps are
> persistent beyond the current I/O context, e.g. AIO DMA transfers.
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> ---
> hw/pci.c | 191 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> hw/pci.h | 69 +++++++++++++++++++
> hw/pci_internals.h | 12 +++
> qemu-common.h | 1 +
> 4 files changed, 272 insertions(+), 1 deletions(-)
>
> diff --git a/hw/pci.c b/hw/pci.c
> index 2dc1577..afcb33c 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
>
> ...
>
> diff --git a/hw/pci.h b/hw/pci.h
> index c551f96..c95863a 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -172,6 +172,8 @@ struct PCIDevice {
> char *romfile;
> ram_addr_t rom_offset;
> uint32_t rom_bar;
> +
> + QLIST_HEAD(, PCIMemoryMap) memory_maps;
> };
>
> PCIDevice *pci_register_device(PCIBus *bus, const char *name,
> @@ -391,4 +393,71 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
> return !(last2 < first1 || last1 < first2);
> }
>
> +/*
> + * Memory I/O and PCI IOMMU definitions.
> + */
> +
> +#define IOMMU_PERM_READ (1 << 0)
> +#define IOMMU_PERM_WRITE (1 << 1)
> +#define IOMMU_PERM_RW (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
> +
> +typedef int PCIInvalidateMapFunc(void *opaque);
> +typedef int PCITranslateFunc(PCIDevice *iommu,
> + PCIDevice *dev,
> + pcibus_t addr,
> + target_phys_addr_t *paddr,
> + target_phys_addr_t *len,
> + unsigned perms);
> +
> +void pci_memory_rw(PCIDevice *dev,
> + pcibus_t addr,
> + uint8_t *buf,
> + pcibus_t len,
> + int is_write);
> +void *pci_memory_map(PCIDevice *dev,
> + PCIInvalidateMapFunc *cb,
> + void *opaque,
> + pcibus_t addr,
> + target_phys_addr_t *len,
> + int is_write);
> +void pci_memory_unmap(PCIDevice *dev,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len);
> +void pci_register_iommu(PCIDevice *dev, PCITranslateFunc *translate);
> +void pci_memory_invalidate_range(PCIDevice *dev, pcibus_t addr, pcibus_t len);
> +
> +#define DECLARE_PCI_LD(suffix, size) \
> +uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
> +
> +#define DECLARE_PCI_ST(suffix, size) \
> +void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val);
> +
> +DECLARE_PCI_LD(ub, 8)
> +DECLARE_PCI_LD(uw, 16)
> +DECLARE_PCI_LD(l, 32)
> +DECLARE_PCI_LD(q, 64)
> +
> +DECLARE_PCI_ST(b, 8)
> +DECLARE_PCI_ST(w, 16)
> +DECLARE_PCI_ST(l, 32)
> +DECLARE_PCI_ST(q, 64)
> +
> +static inline void pci_memory_read(PCIDevice *dev,
> + pcibus_t addr,
> + uint8_t *buf,
> + pcibus_t len)
> +{
> + pci_memory_rw(dev, addr, buf, len, 0);
> +}
> +
> +static inline void pci_memory_write(PCIDevice *dev,
> + pcibus_t addr,
> + const uint8_t *buf,
> + pcibus_t len)
> +{
> + pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
> +}
> +
> #endif
The functions pci_memory_read and pci_memory_write not only read
or write byte data but also many other data types, which leads to
a lot of type casts in your other patches.
I'd prefer "void *buf" and "const void *buf" in the argument lists;
then all those type casts could be removed.
Regards
Stefan Weil
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-02 5:19 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 5:19 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
> Emulated PCI IDE controllers now use the memory access interface. This
> also allows an emulated IOMMU to translate and check accesses.
>
> Map invalidation results in cancelling DMA transfers. Since the guest OS
> can't properly recover the DMA results in case the mapping is changed,
> this is a fairly good approximation.
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> ---
> dma-helpers.c | 46 +++++++++++++++++++++++++++++++++++++++++-----
> dma.h | 21 ++++++++++++++++++++-
> hw/ide/core.c | 15 ++++++++-------
> hw/ide/internal.h | 39 +++++++++++++++++++++++++++++++++++++++
> hw/ide/macio.c | 4 ++--
> hw/ide/pci.c | 7 +++++++
> 6 files changed, 117 insertions(+), 15 deletions(-)
>
> diff --git a/dma-helpers.c b/dma-helpers.c
> index 712ed89..a0dcdb8 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -10,12 +10,36 @@
> #include "dma.h"
> #include "block_int.h"
>
> -void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint)
> +static void *qemu_sglist_default_map(void *opaque,
> + QEMUSGInvalMapFunc *inval_cb,
> + void *inval_opaque,
> + target_phys_addr_t addr,
> + target_phys_addr_t *len,
> + int is_write)
> +{
> + return cpu_physical_memory_map(addr, len, is_write);
> +}
> +
> +static void qemu_sglist_default_unmap(void *opaque,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len)
> +{
> + cpu_physical_memory_unmap(buffer, len, is_write, access_len);
> +}
> +
> +void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint,
> + QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap, void *opaque)
> {
> qsg->sg = qemu_malloc(alloc_hint * sizeof(ScatterGatherEntry));
> qsg->nsg = 0;
> qsg->nalloc = alloc_hint;
> qsg->size = 0;
> +
> + qsg->map = map ? map : qemu_sglist_default_map;
> + qsg->unmap = unmap ? unmap : qemu_sglist_default_unmap;
> + qsg->opaque = opaque;
> }
>
> void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
> @@ -73,12 +97,23 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
> int i;
>
> for (i = 0; i < dbs->iov.niov; ++i) {
> - cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
> - dbs->iov.iov[i].iov_len, !dbs->is_write,
> - dbs->iov.iov[i].iov_len);
> + dbs->sg->unmap(dbs->sg->opaque,
> + dbs->iov.iov[i].iov_base,
> + dbs->iov.iov[i].iov_len, !dbs->is_write,
> + dbs->iov.iov[i].iov_len);
> }
> }
>
> +static void dma_bdrv_cancel(void *opaque)
> +{
> + DMAAIOCB *dbs = opaque;
> +
> + bdrv_aio_cancel(dbs->acb);
> + dma_bdrv_unmap(dbs);
> + qemu_iovec_destroy(&dbs->iov);
> + qemu_aio_release(dbs);
> +}
> +
> static void dma_bdrv_cb(void *opaque, int ret)
> {
> DMAAIOCB *dbs = (DMAAIOCB *)opaque;
> @@ -100,7 +135,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
> while (dbs->sg_cur_index < dbs->sg->nsg) {
> cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
> cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
> - mem = cpu_physical_memory_map(cur_addr, &cur_len, !dbs->is_write);
> + mem = dbs->sg->map(dbs->sg->opaque, dma_bdrv_cancel, dbs,
> + cur_addr, &cur_len, !dbs->is_write);
> if (!mem)
> break;
> qemu_iovec_add(&dbs->iov, mem, cur_len);
> diff --git a/dma.h b/dma.h
> index f3bb275..d48f35c 100644
> --- a/dma.h
> +++ b/dma.h
> @@ -15,6 +15,19 @@
> #include "hw/hw.h"
> #include "block.h"
>
> +typedef void QEMUSGInvalMapFunc(void *opaque);
> +typedef void *QEMUSGMapFunc(void *opaque,
> + QEMUSGInvalMapFunc *inval_cb,
> + void *inval_opaque,
> + target_phys_addr_t addr,
> + target_phys_addr_t *len,
> + int is_write);
> +typedef void QEMUSGUnmapFunc(void *opaque,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len);
> +
> typedef struct {
> target_phys_addr_t base;
> target_phys_addr_t len;
> @@ -25,9 +38,15 @@ typedef struct {
> int nsg;
> int nalloc;
> target_phys_addr_t size;
> +
> + QEMUSGMapFunc *map;
> + QEMUSGUnmapFunc *unmap;
> + void *opaque;
> } QEMUSGList;
>
> -void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint);
> +void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint,
> + QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap,
> + void *opaque);
> void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
> target_phys_addr_t len);
> void qemu_sglist_destroy(QEMUSGList *qsg);
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index af52c2c..024a125 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -436,7 +436,8 @@ static int dma_buf_prepare(BMDMAState *bm, int is_write)
> } prd;
> int l, len;
>
> - qemu_sglist_init(&s->sg, s->nsector / (IDE_PAGE_SIZE / 512) + 1);
> + qemu_sglist_init(&s->sg, s->nsector / (IDE_PAGE_SIZE / 512) + 1,
> + bm->map, bm->unmap, bm->opaque);
> s->io_buffer_size = 0;
> for(;;) {
> if (bm->cur_prd_len == 0) {
> @@ -444,7 +445,7 @@ static int dma_buf_prepare(BMDMAState *bm, int is_write)
> if (bm->cur_prd_last ||
> (bm->cur_addr - bm->addr) >= IDE_PAGE_SIZE)
> return s->io_buffer_size != 0;
> - cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8);
> + bmdma_memory_read(bm, bm->cur_addr, (uint8_t *)&prd, 8);
> bm->cur_addr += 8;
> prd.addr = le32_to_cpu(prd.addr);
> prd.size = le32_to_cpu(prd.size);
> @@ -527,7 +528,7 @@ static int dma_buf_rw(BMDMAState *bm, int is_write)
> if (bm->cur_prd_last ||
> (bm->cur_addr - bm->addr) >= IDE_PAGE_SIZE)
> return 0;
> - cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8);
> + bmdma_memory_read(bm, bm->cur_addr, (uint8_t *)&prd, 8);
> bm->cur_addr += 8;
> prd.addr = le32_to_cpu(prd.addr);
> prd.size = le32_to_cpu(prd.size);
> @@ -542,11 +543,11 @@ static int dma_buf_rw(BMDMAState *bm, int is_write)
> l = bm->cur_prd_len;
> if (l > 0) {
> if (is_write) {
> - cpu_physical_memory_write(bm->cur_prd_addr,
> - s->io_buffer + s->io_buffer_index, l);
> + bmdma_memory_write(bm, bm->cur_prd_addr,
> + s->io_buffer + s->io_buffer_index, l);
> } else {
> - cpu_physical_memory_read(bm->cur_prd_addr,
> - s->io_buffer + s->io_buffer_index, l);
> + bmdma_memory_read(bm, bm->cur_prd_addr,
> + s->io_buffer + s->io_buffer_index, l);
> }
> bm->cur_prd_addr += l;
> bm->cur_prd_len -= l;
> diff --git a/hw/ide/internal.h b/hw/ide/internal.h
> index 4165543..f686d38 100644
> --- a/hw/ide/internal.h
> +++ b/hw/ide/internal.h
> @@ -477,6 +477,24 @@ struct IDEDeviceInfo {
> #define BM_CMD_START 0x01
> #define BM_CMD_READ 0x08
>
> +typedef void BMDMAInvalMapFunc(void *opaque);
> +typedef void BMDMARWFunc(void *opaque,
> + target_phys_addr_t addr,
> + uint8_t *buf,
> + target_phys_addr_t len,
> + int is_write);
> +typedef void *BMDMAMapFunc(void *opaque,
> + BMDMAInvalMapFunc *inval_cb,
> + void *inval_opaque,
> + target_phys_addr_t addr,
> + target_phys_addr_t *len,
> + int is_write);
> +typedef void BMDMAUnmapFunc(void *opaque,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len);
> +
> struct BMDMAState {
> uint8_t cmd;
> uint8_t status;
> @@ -496,8 +514,29 @@ struct BMDMAState {
> int64_t sector_num;
> uint32_t nsector;
> QEMUBH *bh;
> +
> + BMDMARWFunc *rw;
> + BMDMAMapFunc *map;
> + BMDMAUnmapFunc *unmap;
> + void *opaque;
> };
>
> +static inline void bmdma_memory_read(BMDMAState *bm,
> + target_phys_addr_t addr,
> + uint8_t *buf,
> + target_phys_addr_t len)
> +{
> + bm->rw(bm->opaque, addr, buf, len, 0);
> +}
> +
> +static inline void bmdma_memory_write(BMDMAState *bm,
> + target_phys_addr_t addr,
> + uint8_t *buf,
> + target_phys_addr_t len)
> +{
> + bm->rw(bm->opaque, addr, buf, len, 1);
> +}
> +
Here again, I am concerned about indirection and pointer chasing on the
data path. Can we have an IOMMU pointer in the device, and take a fast
path in case there is no IOMMU?
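A minimal sketch of the fast path being suggested (all type and function names below are hypothetical stand-ins for illustration, not QEMU's actual API): the device keeps an IOMMU pointer and branches on it once, falling through to direct physical access when it is NULL.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for QEMU types, for illustration only. */
typedef uint64_t target_phys_addr_t;

static uint8_t guest_ram[256];

/* Direct path: what cpu_physical_memory_read() would do. */
static void direct_read(target_phys_addr_t addr, uint8_t *buf, size_t len)
{
    memcpy(buf, guest_ram + addr, len);
}

typedef struct IOMMUState IOMMUState;
struct IOMMUState {
    /* Returns the physical address for a bus address. */
    target_phys_addr_t (*translate)(IOMMUState *s, target_phys_addr_t addr);
};

typedef struct {
    IOMMUState *iommu;    /* NULL when no IOMMU is present */
} BMDMAState;

/* Fast path when there is no IOMMU; one indirect call otherwise. */
static inline void bmdma_memory_read(BMDMAState *bm, target_phys_addr_t addr,
                                     uint8_t *buf, size_t len)
{
    if (__builtin_expect(!bm->iommu, 1)) {
        direct_read(addr, buf, len);    /* common case, no indirection */
    } else {
        direct_read(bm->iommu->translate(bm->iommu, addr), buf, len);
    }
}
```

The `__builtin_expect` hint keeps the common no-IOMMU case on the predicted branch, so the indirect call is only paid when an IOMMU is actually configured.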
> static inline IDEState *idebus_active_if(IDEBus *bus)
> {
> return bus->ifs + bus->unit;
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index bd1c73e..962ae13 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
>
> s->io_buffer_size = io->len;
>
> - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> qemu_sglist_add(&s->sg, io->addr, io->len);
> io->addr += io->len;
> io->len = 0;
> @@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
> s->io_buffer_index = 0;
> s->io_buffer_size = io->len;
>
> - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> qemu_sglist_add(&s->sg, io->addr, io->len);
> io->addr += io->len;
> io->len = 0;
> diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> index 4d95cc5..5879044 100644
> --- a/hw/ide/pci.c
> +++ b/hw/ide/pci.c
> @@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
> continue;
> ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
> }
> +
> + for (i = 0; i < 2; i++) {
> + d->bmdma[i].rw = (void *) pci_memory_rw;
> + d->bmdma[i].map = (void *) pci_memory_map;
> + d->bmdma[i].unmap = (void *) pci_memory_unmap;
> + d->bmdma[i].opaque = dev;
> + }
> }
These casts show something is wrong with the API, IMO.
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 2/7] pci: memory access API and IOMMU support
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-02 5:28 ` Michael S. Tsirkin
0 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 5:28 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
> PCI devices should access memory through pci_memory_*() instead of
> cpu_physical_memory_*(). This also provides support for translation and
> access checking in case an IOMMU is emulated.
>
> Memory maps are treated as remote IOTLBs (that is, translation caches
> belonging to the IOMMU-aware device itself). Clients (devices) must
> provide callbacks for map invalidation in case these maps are
> persistent beyond the current I/O context, e.g. AIO DMA transfers.
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
I am concerned about adding more pointer chasing on the data path.
Could we have:
1. an IOMMU pointer in each device, inherited by secondary buses
when they are created and by devices from their bus when they are attached;
2. the translation pointer in the IOMMU instead of the bus;
3. the pci_memory_XX functions inline, with a fast path for the non-IOMMU case:
if (__builtin_expect(!dev->iommu, 1))
    return cpu_memory_rw(...);
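The three suggestions above can be sketched together as follows (all names are illustrative stand-ins; `cpu_memory_rw` here is a local stub, not QEMU's real accessor):

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t pcibus_t;
typedef uint64_t target_phys_addr_t;

static uint8_t guest_ram[128];

typedef struct PCIIOMMU PCIIOMMU;
typedef struct PCIDevice PCIDevice;

/* Suggestion 2: the translate pointer lives in the IOMMU, not the bus. */
struct PCIIOMMU {
    int (*translate)(PCIIOMMU *iommu, PCIDevice *dev, pcibus_t addr,
                     target_phys_addr_t *paddr);
};

/* Suggestion 1: each device carries an IOMMU pointer, set when it is
 * attached to a bus (and inherited by secondary buses). */
struct PCIDevice {
    PCIIOMMU *iommu;    /* NULL on buses without an IOMMU */
};

/* Stub for the direct physical accessor. */
static void cpu_memory_rw(target_phys_addr_t pa, uint8_t *buf, size_t len,
                          int is_write)
{
    if (is_write)
        memcpy(guest_ram + pa, buf, len);
    else
        memcpy(buf, guest_ram + pa, len);
}

/* Suggestion 3: inline wrapper with a fast path for the non-IOMMU case. */
static inline void pci_memory_rw(PCIDevice *dev, pcibus_t addr,
                                 uint8_t *buf, size_t len, int is_write)
{
    target_phys_addr_t paddr = addr;

    if (__builtin_expect(dev->iommu != NULL, 0) &&
        dev->iommu->translate(dev->iommu, dev, addr, &paddr))
        return;                      /* translation fault: drop the access */

    cpu_memory_rw(paddr, buf, len, is_write);
}
```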
> ---
> hw/pci.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> hw/pci.h | 74 +++++++++++++++++++++
> hw/pci_internals.h | 12 ++++
> qemu-common.h | 1 +
> 4 files changed, 271 insertions(+), 1 deletions(-)
Almost nothing here is PCI-specific.
Can this code go into dma.c/dma.h?
We would have a struct DMADevice, APIs like device_dma_write, etc.
This would also help us get rid of the void * stuff.
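A rough sketch of what such a bus-agnostic layer could look like (all names hypothetical): the generic DMADevice is embedded in the bus-specific device, so the translate callback gets a typed pointer it can cast back, instead of a `void *` opaque.

```c
#include <stdint.h>
#include <string.h>

static uint8_t guest_ram[64];

typedef struct DMADevice DMADevice;

/* Generic, bus-agnostic DMA device: the callback receives the
 * DMADevice itself, so no void * opaque is needed. */
struct DMADevice {
    int (*translate)(DMADevice *dev, uint64_t addr, uint64_t *paddr);
};

/* dma.c-style helper: works for PCI or any other bus. */
static void device_dma_write(DMADevice *dev, uint64_t addr,
                             const uint8_t *buf, size_t len)
{
    uint64_t paddr = addr;

    if (dev->translate && dev->translate(dev, addr, &paddr))
        return;                      /* translation fault: drop the write */
    memcpy(guest_ram + paddr, buf, len);
}

/* A PCI device would embed the generic part as its first member. */
typedef struct {
    DMADevice dma;
    int devfn;                       /* ...bus-specific state... */
} PCIDevice;

static int pci_translate(DMADevice *dev, uint64_t addr, uint64_t *paddr)
{
    PCIDevice *pci = (PCIDevice *)dev;   /* valid: dma is the first member */

    *paddr = addr + pci->devfn;          /* toy "translation" */
    return 0;
}
```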
>
> diff --git a/hw/pci.c b/hw/pci.c
> index 2dc1577..b460905 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -158,6 +158,19 @@ static void pci_device_reset(PCIDevice *dev)
> pci_update_mappings(dev);
> }
>
> +static int pci_no_translate(PCIDevice *iommu,
> + PCIDevice *dev,
> + pcibus_t addr,
> + target_phys_addr_t *paddr,
> + target_phys_addr_t *len,
> + unsigned perms)
> +{
> + *paddr = addr;
> + *len = -1;
> +
> + return 0;
> +}
> +
> static void pci_bus_reset(void *opaque)
> {
> PCIBus *bus = opaque;
> @@ -220,7 +233,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
> {
> qbus_create_inplace(&bus->qbus, &pci_bus_info, parent, name);
> assert(PCI_FUNC(devfn_min) == 0);
> - bus->devfn_min = devfn_min;
> +
> + bus->devfn_min = devfn_min;
> + bus->iommu = NULL;
> + bus->translate = pci_no_translate;
>
> /* host bridge */
> QLIST_INIT(&bus->child);
> @@ -1789,3 +1805,170 @@ static char *pcibus_get_dev_path(DeviceState *dev)
> return strdup(path);
> }
>
> +void pci_register_iommu(PCIDevice *iommu,
> + PCITranslateFunc *translate)
> +{
> + iommu->bus->iommu = iommu;
> + iommu->bus->translate = translate;
> +}
> +
The above seems broken for secondary buses, right? Also, can we use
qdev for initialization in some way, instead of adding more APIs? E.g.
I think it would be nice if we could just use qdev command-line flags to
control which bus is behind an IOMMU and which isn't.
> +void pci_memory_rw(PCIDevice *dev,
> + pcibus_t addr,
> + uint8_t *buf,
> + pcibus_t len,
> + int is_write)
> +{
> + int err;
> + unsigned perms;
> + PCIDevice *iommu = dev->bus->iommu;
> + target_phys_addr_t paddr, plen;
> +
> + perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
> +
> + while (len) {
> + err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
> + if (err)
> + return;
> +
> + /* The translation might be valid for larger regions. */
> + if (plen > len)
> + plen = len;
> +
> + cpu_physical_memory_rw(paddr, buf, plen, is_write);
> +
> + len -= plen;
> + addr += plen;
> + buf += plen;
> + }
> +}
> +
> +static void pci_memory_register_map(PCIDevice *dev,
> + pcibus_t addr,
> + pcibus_t len,
> + target_phys_addr_t paddr,
> + PCIInvalidateMapFunc *invalidate,
> + void *invalidate_opaque)
> +{
> + PCIMemoryMap *map;
> +
> + map = qemu_malloc(sizeof(PCIMemoryMap));
> + map->addr = addr;
> + map->len = len;
> + map->paddr = paddr;
> + map->invalidate = invalidate;
> + map->invalidate_opaque = invalidate_opaque;
> +
> + QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
> +}
> +
> +static void pci_memory_unregister_map(PCIDevice *dev,
> + target_phys_addr_t paddr,
> + target_phys_addr_t len)
> +{
> + PCIMemoryMap *map;
> +
> + QLIST_FOREACH(map, &dev->memory_maps, list) {
> + if (map->paddr == paddr && map->len == len) {
> + QLIST_REMOVE(map, list);
> + free(map);
> + }
> + }
> +}
> +
> +void pci_memory_invalidate_range(PCIDevice *dev,
> + pcibus_t addr,
> + pcibus_t len)
> +{
> + PCIMemoryMap *map;
> +
> + QLIST_FOREACH(map, &dev->memory_maps, list) {
> + if (ranges_overlap(addr, len, map->addr, map->len)) {
> + map->invalidate(map->invalidate_opaque);
> + QLIST_REMOVE(map, list);
> + free(map);
> + }
> + }
> +}
> +
> +void *pci_memory_map(PCIDevice *dev,
> + PCIInvalidateMapFunc *cb,
> + void *opaque,
> + pcibus_t addr,
> + target_phys_addr_t *len,
> + int is_write)
> +{
> + int err;
> + unsigned perms;
> + PCIDevice *iommu = dev->bus->iommu;
> + target_phys_addr_t paddr, plen;
> +
> + perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
> +
> + plen = *len;
> + err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
> + if (err)
> + return NULL;
> +
> + /*
> + * If this is true, the virtual region is contiguous,
> + * but the translated physical region isn't. We just
> + * clamp *len, much like cpu_physical_memory_map() does.
> + */
> + if (plen < *len)
> + *len = plen;
> +
> + /* We treat maps as remote TLBs to cope with stuff like AIO. */
> + if (cb)
> + pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
> +
> + return cpu_physical_memory_map(paddr, len, is_write);
> +}
> +
All the above is really only useful when there is an IOMMU,
right? So maybe we should shortcut all of this if there's no IOMMU?
> +void pci_memory_unmap(PCIDevice *dev,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len)
> +{
> + cpu_physical_memory_unmap(buffer, len, is_write, access_len);
> + pci_memory_unregister_map(dev, (target_phys_addr_t) buffer, len);
> +}
> +
> +#define DEFINE_PCI_LD(suffix, size) \
> +uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr) \
> +{ \
> + int err; \
> + target_phys_addr_t paddr, plen; \
> + \
> + err = dev->bus->translate(dev->bus->iommu, dev, \
> + addr, &paddr, &plen, IOMMU_PERM_READ); \
> + if (err || (plen < size / 8)) \
> + return 0; \
> + \
> + return ld##suffix##_phys(paddr); \
> +}
> +
> +#define DEFINE_PCI_ST(suffix, size) \
> +void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val) \
> +{ \
> + int err; \
> + target_phys_addr_t paddr, plen; \
> + \
> + err = dev->bus->translate(dev->bus->iommu, dev, \
> + addr, &paddr, &plen, IOMMU_PERM_WRITE); \
> + if (err || (plen < size / 8)) \
> + return; \
> + \
> + st##suffix##_phys(paddr, val); \
> +}
> +
> +DEFINE_PCI_LD(ub, 8)
> +DEFINE_PCI_LD(uw, 16)
> +DEFINE_PCI_LD(l, 32)
> +DEFINE_PCI_LD(q, 64)
> +
> +DEFINE_PCI_ST(b, 8)
> +DEFINE_PCI_ST(w, 16)
> +DEFINE_PCI_ST(l, 32)
> +DEFINE_PCI_ST(q, 64)
> +
> diff --git a/hw/pci.h b/hw/pci.h
> index c551f96..3131016 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -172,6 +172,8 @@ struct PCIDevice {
> char *romfile;
> ram_addr_t rom_offset;
> uint32_t rom_bar;
> +
> + QLIST_HEAD(, PCIMemoryMap) memory_maps;
> };
>
> PCIDevice *pci_register_device(PCIBus *bus, const char *name,
> @@ -391,4 +393,76 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
> return !(last2 < first1 || last1 < first2);
> }
>
> +/*
> + * Memory I/O and PCI IOMMU definitions.
> + */
> +
> +#define IOMMU_PERM_READ (1 << 0)
> +#define IOMMU_PERM_WRITE (1 << 1)
> +#define IOMMU_PERM_RW (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
> +
> +typedef int PCIInvalidateMapFunc(void *opaque);
> +typedef int PCITranslateFunc(PCIDevice *iommu,
> + PCIDevice *dev,
> + pcibus_t addr,
> + target_phys_addr_t *paddr,
> + target_phys_addr_t *len,
> + unsigned perms);
> +
> +extern void pci_memory_rw(PCIDevice *dev,
> + pcibus_t addr,
> + uint8_t *buf,
> + pcibus_t len,
> + int is_write);
> +extern void *pci_memory_map(PCIDevice *dev,
> + PCIInvalidateMapFunc *cb,
> + void *opaque,
> + pcibus_t addr,
> + target_phys_addr_t *len,
> + int is_write);
> +extern void pci_memory_unmap(PCIDevice *dev,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len);
> +extern void pci_register_iommu(PCIDevice *dev,
> + PCITranslateFunc *translate);
> +extern void pci_memory_invalidate_range(PCIDevice *dev,
> + pcibus_t addr,
> + pcibus_t len);
> +
> +#define DECLARE_PCI_LD(suffix, size) \
> +extern uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
> +
> +#define DECLARE_PCI_ST(suffix, size) \
> +extern void pci_st##suffix(PCIDevice *dev, \
> + pcibus_t addr, \
> + uint##size##_t val);
> +
> +DECLARE_PCI_LD(ub, 8)
> +DECLARE_PCI_LD(uw, 16)
> +DECLARE_PCI_LD(l, 32)
> +DECLARE_PCI_LD(q, 64)
> +
> +DECLARE_PCI_ST(b, 8)
> +DECLARE_PCI_ST(w, 16)
> +DECLARE_PCI_ST(l, 32)
> +DECLARE_PCI_ST(q, 64)
> +
> +static inline void pci_memory_read(PCIDevice *dev,
> + pcibus_t addr,
> + uint8_t *buf,
> + pcibus_t len)
> +{
> + pci_memory_rw(dev, addr, buf, len, 0);
> +}
> +
> +static inline void pci_memory_write(PCIDevice *dev,
> + pcibus_t addr,
> + const uint8_t *buf,
> + pcibus_t len)
> +{
> + pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
> +}
> +
> #endif
> diff --git a/hw/pci_internals.h b/hw/pci_internals.h
> index e3c93a3..fb134b9 100644
> --- a/hw/pci_internals.h
> +++ b/hw/pci_internals.h
> @@ -33,6 +33,9 @@ struct PCIBus {
> Keep a count of the number of devices with raised IRQs. */
> int nirq;
> int *irq_count;
> +
> + PCIDevice *iommu;
> + PCITranslateFunc *translate;
> };
Why is the translate pointer in the bus? I think that's the IOMMU's job.
> struct PCIBridge {
> @@ -44,4 +47,13 @@ struct PCIBridge {
> const char *bus_name;
> };
>
> +struct PCIMemoryMap {
> + pcibus_t addr;
> + pcibus_t len;
> + target_phys_addr_t paddr;
> + PCIInvalidateMapFunc *invalidate;
> + void *invalidate_opaque;
Can we have a structure that encapsulates the mapping
data instead of a void *?
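One possible shape for that (illustrative only, not QEMU's actual API): pass the map itself to the invalidation callback and let the owner embed it, recovering its own state with a `container_of`-style cast instead of a `void *` opaque.

```c
#include <stddef.h>
#include <stdint.h>

/* Classic container_of: recover the enclosing struct from a member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct PCIMemoryMap PCIMemoryMap;
typedef void PCIInvalidateMapFunc(PCIMemoryMap *map);

struct PCIMemoryMap {
    uint64_t addr, len, paddr;
    PCIInvalidateMapFunc *invalidate;    /* gets the map, not a void * */
};

/* An owner (e.g. an in-flight AIO transfer) embeds the map. */
typedef struct {
    PCIMemoryMap map;
    int cancelled;
} DMATransfer;

static void transfer_invalidate(PCIMemoryMap *map)
{
    DMATransfer *t = container_of(map, DMATransfer, map);

    t->cancelled = 1;    /* cancel the transfer that owns this map */
}
```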
> + QLIST_ENTRY(PCIMemoryMap) list;
> +};
> +
> #endif /* QEMU_PCI_INTERNALS_H */
> diff --git a/qemu-common.h b/qemu-common.h
> index d735235..8b060e8 100644
> --- a/qemu-common.h
> +++ b/qemu-common.h
> @@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
> typedef struct PCIHostState PCIHostState;
> typedef struct PCIExpressHost PCIExpressHost;
> typedef struct PCIBus PCIBus;
> +typedef struct PCIMemoryMap PCIMemoryMap;
> typedef struct PCIDevice PCIDevice;
> typedef struct PCIBridge PCIBridge;
> typedef struct SerialState SerialState;
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 97+ messages in thread
> + if (cb)
> + pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
> +
> + return cpu_physical_memory_map(paddr, len, is_write);
> +}
> +
All the above is really only useful for when there is an iommu,
right? So maybe we should shortcut all this if there's no iommu?
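One possible shape for that shortcut, as a sketch with stubbed types (this
is not the patch's actual code): the no-IOMMU case stays inline and goes
straight to physical memory, and the translating path is only taken when
the bus actually has an IOMMU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch only: all types here are stubbed for illustration. */
typedef struct PCIIommuState PCIIommuState;

typedef struct PCIBus {
    PCIIommuState *iommu;   /* NULL when no IOMMU sits on this bus */
} PCIBus;

typedef struct PCIDevice {
    PCIBus *bus;
} PCIDevice;

static uint8_t guest_ram[128];
static int slow_path_taken;

static void cpu_physical_memory_rw(uint64_t paddr, uint8_t *buf,
                                   uint64_t len, int is_write)
{
    if (is_write) {
        memcpy(guest_ram + paddr, buf, len);
    } else {
        memcpy(buf, guest_ram + paddr, len);
    }
}

/* Out-of-line translating path, reached only when an IOMMU exists. */
static void pci_memory_rw_iommu(PCIDevice *dev, uint64_t addr,
                                uint8_t *buf, uint64_t len, int is_write)
{
    (void)dev;
    slow_path_taken = 1;
    /* ...would translate via dev->bus->iommu before accessing memory... */
    cpu_physical_memory_rw(addr, buf, len, is_write);
}

static inline void pci_memory_rw(PCIDevice *dev, uint64_t addr,
                                 uint8_t *buf, uint64_t len, int is_write)
{
    if (__builtin_expect(!dev->bus->iommu, 1)) {
        cpu_physical_memory_rw(addr, buf, len, is_write);
        return;
    }
    pci_memory_rw_iommu(dev, addr, buf, len, is_write);
}
```

Whether the branch hint is worth it is debated further down in the thread.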
> +void pci_memory_unmap(PCIDevice *dev,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len)
> +{
> + cpu_physical_memory_unmap(buffer, len, is_write, access_len);
> + pci_memory_unregister_map(dev, (target_phys_addr_t) buffer, len);
> +}
> +
> +#define DEFINE_PCI_LD(suffix, size) \
> +uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr) \
> +{ \
> + int err; \
> + target_phys_addr_t paddr, plen; \
> + \
> + err = dev->bus->translate(dev->bus->iommu, dev, \
> + addr, &paddr, &plen, IOMMU_PERM_READ); \
> + if (err || (plen < size / 8)) \
> + return 0; \
> + \
> + return ld##suffix##_phys(paddr); \
> +}
> +
> +#define DEFINE_PCI_ST(suffix, size) \
> +void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val) \
> +{ \
> + int err; \
> + target_phys_addr_t paddr, plen; \
> + \
> + err = dev->bus->translate(dev->bus->iommu, dev, \
> + addr, &paddr, &plen, IOMMU_PERM_WRITE); \
> + if (err || (plen < size / 8)) \
> + return; \
> + \
> + st##suffix##_phys(paddr, val); \
> +}
> +
> +DEFINE_PCI_LD(ub, 8)
> +DEFINE_PCI_LD(uw, 16)
> +DEFINE_PCI_LD(l, 32)
> +DEFINE_PCI_LD(q, 64)
> +
> +DEFINE_PCI_ST(b, 8)
> +DEFINE_PCI_ST(w, 16)
> +DEFINE_PCI_ST(l, 32)
> +DEFINE_PCI_ST(q, 64)
> +
> diff --git a/hw/pci.h b/hw/pci.h
> index c551f96..3131016 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -172,6 +172,8 @@ struct PCIDevice {
> char *romfile;
> ram_addr_t rom_offset;
> uint32_t rom_bar;
> +
> + QLIST_HEAD(, PCIMemoryMap) memory_maps;
> };
>
> PCIDevice *pci_register_device(PCIBus *bus, const char *name,
> @@ -391,4 +393,76 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
> return !(last2 < first1 || last1 < first2);
> }
>
> +/*
> + * Memory I/O and PCI IOMMU definitions.
> + */
> +
> +#define IOMMU_PERM_READ (1 << 0)
> +#define IOMMU_PERM_WRITE (1 << 1)
> +#define IOMMU_PERM_RW (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
> +
> +typedef int PCIInvalidateMapFunc(void *opaque);
> +typedef int PCITranslateFunc(PCIDevice *iommu,
> + PCIDevice *dev,
> + pcibus_t addr,
> + target_phys_addr_t *paddr,
> + target_phys_addr_t *len,
> + unsigned perms);
> +
> +extern void pci_memory_rw(PCIDevice *dev,
> + pcibus_t addr,
> + uint8_t *buf,
> + pcibus_t len,
> + int is_write);
> +extern void *pci_memory_map(PCIDevice *dev,
> + PCIInvalidateMapFunc *cb,
> + void *opaque,
> + pcibus_t addr,
> + target_phys_addr_t *len,
> + int is_write);
> +extern void pci_memory_unmap(PCIDevice *dev,
> + void *buffer,
> + target_phys_addr_t len,
> + int is_write,
> + target_phys_addr_t access_len);
> +extern void pci_register_iommu(PCIDevice *dev,
> + PCITranslateFunc *translate);
> +extern void pci_memory_invalidate_range(PCIDevice *dev,
> + pcibus_t addr,
> + pcibus_t len);
> +
> +#define DECLARE_PCI_LD(suffix, size) \
> +extern uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
> +
> +#define DECLARE_PCI_ST(suffix, size) \
> +extern void pci_st##suffix(PCIDevice *dev, \
> + pcibus_t addr, \
> + uint##size##_t val);
> +
> +DECLARE_PCI_LD(ub, 8)
> +DECLARE_PCI_LD(uw, 16)
> +DECLARE_PCI_LD(l, 32)
> +DECLARE_PCI_LD(q, 64)
> +
> +DECLARE_PCI_ST(b, 8)
> +DECLARE_PCI_ST(w, 16)
> +DECLARE_PCI_ST(l, 32)
> +DECLARE_PCI_ST(q, 64)
> +
> +static inline void pci_memory_read(PCIDevice *dev,
> + pcibus_t addr,
> + uint8_t *buf,
> + pcibus_t len)
> +{
> + pci_memory_rw(dev, addr, buf, len, 0);
> +}
> +
> +static inline void pci_memory_write(PCIDevice *dev,
> + pcibus_t addr,
> + const uint8_t *buf,
> + pcibus_t len)
> +{
> + pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
> +}
> +
> #endif
> diff --git a/hw/pci_internals.h b/hw/pci_internals.h
> index e3c93a3..fb134b9 100644
> --- a/hw/pci_internals.h
> +++ b/hw/pci_internals.h
> @@ -33,6 +33,9 @@ struct PCIBus {
> Keep a count of the number of devices with raised IRQs. */
> int nirq;
> int *irq_count;
> +
> + PCIDevice *iommu;
> + PCITranslateFunc *translate;
> };
Why is the translate pointer in the bus? Isn't that the iommu's job?
> struct PCIBridge {
> @@ -44,4 +47,13 @@ struct PCIBridge {
> const char *bus_name;
> };
>
> +struct PCIMemoryMap {
> + pcibus_t addr;
> + pcibus_t len;
> + target_phys_addr_t paddr;
> + PCIInvalidateMapFunc *invalidate;
> + void *invalidate_opaque;
Can we have a structure that encapsulates the mapping
data instead of a void *?
> + QLIST_ENTRY(PCIMemoryMap) list;
> +};
> +
> #endif /* QEMU_PCI_INTERNALS_H */
> diff --git a/qemu-common.h b/qemu-common.h
> index d735235..8b060e8 100644
> --- a/qemu-common.h
> +++ b/qemu-common.h
> @@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
> typedef struct PCIHostState PCIHostState;
> typedef struct PCIExpressHost PCIExpressHost;
> typedef struct PCIBus PCIBus;
> +typedef struct PCIMemoryMap PCIMemoryMap;
> typedef struct PCIDevice PCIDevice;
> typedef struct PCIBridge PCIBridge;
> typedef struct SerialState SerialState;
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-01 20:10 ` Stefan Weil
@ 2010-09-02 6:00 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 6:00 UTC (permalink / raw)
To: Stefan Weil
Cc: Eduard - Gabriel Munteanu, kvm, joro, qemu-devel, blauwirbel,
yamahata, paul, avi
On Wed, Sep 01, 2010 at 10:10:30PM +0200, Stefan Weil wrote:
> >+static inline void pci_memory_read(PCIDevice *dev,
> >+ pcibus_t addr,
> >+ uint8_t *buf,
> >+ pcibus_t len)
> >+{
> >+ pci_memory_rw(dev, addr, buf, len, 0);
> >+}
> >+
> >+static inline void pci_memory_write(PCIDevice *dev,
> >+ pcibus_t addr,
> >+ const uint8_t *buf,
> >+ pcibus_t len)
> >+{
> >+ pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
> >+}
> >+
> >#endif
>
> The functions pci_memory_read and pci_memory_write not only read
> or write byte data but many different data types which leads to
> a lot of type casts in your other patches.
>
> I'd prefer "void *buf" and "const void *buf" in the argument lists.
> Then all those type casts could be removed.
>
> Regards
> Stefan Weil
Further, I am not sure pcibus_t is a good type to use here.
This also forces use of PCI-specific types in e.g. ide, or resorting to
casts as this patch does. We probably should use a more generic type
for this.
--
MST
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-02 5:28 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-02 8:40 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-02 8:40 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 02, 2010 at 08:28:26AM +0300, Michael S. Tsirkin wrote:
> On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
> > PCI devices should access memory through pci_memory_*() instead of
> > cpu_physical_memory_*(). This also provides support for translation and
> > access checking in case an IOMMU is emulated.
> >
> > Memory maps are treated as remote IOTLBs (that is, translation caches
> > belonging to the IOMMU-aware device itself). Clients (devices) must
> > provide callbacks for map invalidation in case these maps are
> > persistent beyond the current I/O context, e.g. AIO DMA transfers.
> >
> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>
>
> I am concerned about adding more pointer chasing on the data path.
> Could we have
> 1. an iommu pointer in a device, inherited by secondary buses
> when they are created and by devices from buses when they are attached.
> 2. translation pointer in the iommu instead of the bus
The first solution I proposed was based on qdev, that is, each
DeviceState had an 'iommu' field. Translation would be done by
recursively looking in the parent bus/devs for an IOMMU.
But Anthony said we're better off with bus-specific APIs, mostly because
(IIRC) there may be different types of addresses and it might be
difficult to abstract those properly.
I suppose I could revisit the idea by integrating the IOMMU in a
PCIDevice as opposed to a DeviceState.
Anthony, Paul, any thoughts on this?
> 3. pci_memory_XX functions inline, doing fast path for non-iommu case:
>
> if (__builtin_expect(!dev->iommu, 1))
> return cpu_memory_rw
But isn't this some sort of 'if (likely(!dev->iommu))' from the Linux
kernel? If so, it puts the IOMMU-enabled case at a disadvantage.
I suppose most emulated systems would have at least some theoretical
reasons to enable the IOMMU, e.g. as a GART replacement (say for 32-bit
PCI devices) or for userspace drivers. So there are reasons to enable
the IOMMU even when you don't have a real host IOMMU and you're not
using nested guests.
> > ---
> > hw/pci.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> > hw/pci.h | 74 +++++++++++++++++++++
> > hw/pci_internals.h | 12 ++++
> > qemu-common.h | 1 +
> > 4 files changed, 271 insertions(+), 1 deletions(-)
>
> Almost nothing here is PCI specific.
> Can this code go into dma.c/dma.h?
> We would have struct DMADevice, APIs like device_dma_write etc.
> This would help us get rid of the void * stuff as well?
>
Yeah, I know, that's similar to what I intended to do at first. Though
I'm not sure that rids us of 'void *' stuff, quite on the contrary from
what I've seen.
Some stuff still needs to stay 'void *' (or an equivalent typedef, but
still an opaque) simply because of the level of abstraction that's
needed.
[snip]
> > +void pci_register_iommu(PCIDevice *iommu,
> > + PCITranslateFunc *translate)
> > +{
> > + iommu->bus->iommu = iommu;
> > + iommu->bus->translate = translate;
> > +}
> > +
>
> The above seems broken for secondary buses, right? Also, can we use
> qdev for initialization in some way, instead of adding more APIs? E.g.
> I think it would be nice if we could just use qdev command line flags to
> control which bus is behind iommu and which isn't.
>
>
Each bus must have its own IOMMU. The secondary bus should ask the
primary bus instead of going through cpu_physical_memory_*(). If that
isn't the case, it's broken and the secondary bus must be converted to
the new API just like regular devices. I'll have a look at that.
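The "ask the parent bus" idea can be sketched as follows. This is a toy
model (all types hypothetical) where each IOMMU level just adds an offset,
standing in for a real page-table walk:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of translation recursing up through bridges. */
typedef uint64_t bus_addr_t;

typedef struct Bus {
    struct Bus *parent;    /* NULL at the host bridge */
    int has_iommu;
    bus_addr_t offset;     /* placeholder "translation" for this level */
} Bus;

/* An access on 'bus' walks up to the host bridge, applying each level's
 * translation; levels without an IOMMU pass the address through. */
bus_addr_t bus_translate(const Bus *bus, bus_addr_t addr)
{
    if (!bus) {
        return addr;
    }
    if (bus->has_iommu) {
        addr += bus->offset;
    }
    return bus_translate(bus->parent, addr);
}
```

A device on a secondary bus without its own IOMMU thus still gets the
primary bus's translation applied, which is the behavior argued for above.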
> > +void pci_memory_rw(PCIDevice *dev,
> > + pcibus_t addr,
> > + uint8_t *buf,
> > + pcibus_t len,
> > + int is_write)
> > +{
> > + int err;
> > + unsigned perms;
> > + PCIDevice *iommu = dev->bus->iommu;
> > + target_phys_addr_t paddr, plen;
> > +
> > + perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
> > +
> > + while (len) {
> > + err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
> > + if (err)
> > + return;
> > +
> > + /* The translation might be valid for larger regions. */
> > + if (plen > len)
> > + plen = len;
> > +
> > + cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > +
> > + len -= plen;
> > + addr += plen;
> > + buf += plen;
> > + }
> > +}
> > +
> > +static void pci_memory_register_map(PCIDevice *dev,
> > + pcibus_t addr,
> > + pcibus_t len,
> > + target_phys_addr_t paddr,
> > + PCIInvalidateMapFunc *invalidate,
> > + void *invalidate_opaque)
> > +{
> > + PCIMemoryMap *map;
> > +
> > + map = qemu_malloc(sizeof(PCIMemoryMap));
> > + map->addr = addr;
> > + map->len = len;
> > + map->paddr = paddr;
> > + map->invalidate = invalidate;
> > + map->invalidate_opaque = invalidate_opaque;
> > +
> > + QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
> > +}
> > +
> > +static void pci_memory_unregister_map(PCIDevice *dev,
> > + target_phys_addr_t paddr,
> > + target_phys_addr_t len)
> > +{
> > + PCIMemoryMap *map;
> > +
> > + QLIST_FOREACH(map, &dev->memory_maps, list) {
> > + if (map->paddr == paddr && map->len == len) {
> > + QLIST_REMOVE(map, list);
> > + free(map);
> > + }
> > + }
> > +}
> > +
> > +void pci_memory_invalidate_range(PCIDevice *dev,
> > + pcibus_t addr,
> > + pcibus_t len)
> > +{
> > + PCIMemoryMap *map;
> > +
> > + QLIST_FOREACH(map, &dev->memory_maps, list) {
> > + if (ranges_overlap(addr, len, map->addr, map->len)) {
> > + map->invalidate(map->invalidate_opaque);
> > + QLIST_REMOVE(map, list);
> > + free(map);
> > + }
> > + }
> > +}
> > +
> > +void *pci_memory_map(PCIDevice *dev,
> > + PCIInvalidateMapFunc *cb,
> > + void *opaque,
> > + pcibus_t addr,
> > + target_phys_addr_t *len,
> > + int is_write)
> > +{
> > + int err;
> > + unsigned perms;
> > + PCIDevice *iommu = dev->bus->iommu;
> > + target_phys_addr_t paddr, plen;
> > +
> > + perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
> > +
> > + plen = *len;
> > + err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
> > + if (err)
> > + return NULL;
> > +
> > + /*
> > + * If this is true, the virtual region is contiguous,
> > + * but the translated physical region isn't. We just
> > + * clamp *len, much like cpu_physical_memory_map() does.
> > + */
> > + if (plen < *len)
> > + *len = plen;
> > +
> > + /* We treat maps as remote TLBs to cope with stuff like AIO. */
> > + if (cb)
> > + pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
> > +
> > + return cpu_physical_memory_map(paddr, len, is_write);
> > +}
> > +
>
> All the above is really only useful for when there is an iommu,
> right? So maybe we should shortcut all this if there's no iommu?
>
Some people (e.g. Blue) suggested I shouldn't make the IOMMU emulation a
compile-time option, like I originally did. And I'm not sure any runtime
"optimization" (as in likely()/unlikely()) is justified.
[snip]
> > diff --git a/hw/pci_internals.h b/hw/pci_internals.h
> > index e3c93a3..fb134b9 100644
> > --- a/hw/pci_internals.h
> > +++ b/hw/pci_internals.h
> > @@ -33,6 +33,9 @@ struct PCIBus {
> > Keep a count of the number of devices with raised IRQs. */
> > int nirq;
> > int *irq_count;
> > +
> > + PCIDevice *iommu;
> > + PCITranslateFunc *translate;
> > };
>
> Why is the translate pointer in the bus? Isn't that the iommu's job?
>
Anthony and Paul thought it's best to simply ask the parent bus for
translation. I somewhat agree with that: devices that aren't IOMMU-aware
simply attempt to do PCI requests to memory and the IOMMU translates
and checks them transparently.
> > struct PCIBridge {
> > @@ -44,4 +47,13 @@ struct PCIBridge {
> > const char *bus_name;
> > };
> >
> > +struct PCIMemoryMap {
> > + pcibus_t addr;
> > + pcibus_t len;
> > + target_phys_addr_t paddr;
> > + PCIInvalidateMapFunc *invalidate;
> > + void *invalidate_opaque;
>
> Can we have a structure that encapsulates the mapping
> data instead of a void *?
>
>
Not really. 'invalidate_opaque' belongs to device code. It's meant to be
a handle to easily identify the mapping. For example, DMA code wants to
cancel AIO transfers when the bus requests the map to be invalidated.
It's difficult to look that AIO transfer up using non-opaque data.
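A sketch of how a device might use 'invalidate_opaque' (names are
hypothetical, a simplified overlap test is used, and a flag stands in for
actually cancelling an AIO request):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void InvalidateFunc(void *opaque);

typedef struct MemoryMap {
    uint64_t addr, len;
    InvalidateFunc *invalidate;
    void *invalidate_opaque;
    struct MemoryMap *next;
} MemoryMap;

static MemoryMap *maps;     /* a per-device list in the real code */

static int ranges_overlap(uint64_t a1, uint64_t l1, uint64_t a2, uint64_t l2)
{
    return a1 < a2 + l2 && a2 < a1 + l1;
}

static void map_register(MemoryMap *map)
{
    map->next = maps;
    maps = map;
}

/* Bus side: drop every map overlapping [addr, addr+len) and notify its
 * owner through the opaque it registered. */
static void invalidate_range(uint64_t addr, uint64_t len)
{
    MemoryMap **p = &maps;

    while (*p) {
        MemoryMap *map = *p;

        if (ranges_overlap(addr, len, map->addr, map->len)) {
            map->invalidate(map->invalidate_opaque);
            *p = map->next;     /* unlink; the real code also frees it */
        } else {
            p = &map->next;
        }
    }
}

/* Device side: the opaque identifies which in-flight request to cancel. */
typedef struct AIORequest {
    int cancelled;
} AIORequest;

static void cancel_aio(void *opaque)
{
    ((AIORequest *)opaque)->cancelled = 1;
}
```

Only the device knows how to find its AIO request from the mapping, which
is why the opaque cannot easily be replaced by a common structure.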
> > + QLIST_ENTRY(PCIMemoryMap) list;
> > +};
> > +
> > #endif /* QEMU_PCI_INTERNALS_H */
> > diff --git a/qemu-common.h b/qemu-common.h
> > index d735235..8b060e8 100644
> > --- a/qemu-common.h
> > +++ b/qemu-common.h
> > @@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
> > typedef struct PCIHostState PCIHostState;
> > typedef struct PCIExpressHost PCIExpressHost;
> > typedef struct PCIBus PCIBus;
> > +typedef struct PCIMemoryMap PCIMemoryMap;
> > typedef struct PCIDevice PCIDevice;
> > typedef struct PCIBridge PCIBridge;
> > typedef struct SerialState SerialState;
> > --
> > 1.7.1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-01 20:10 ` Stefan Weil
@ 2010-09-02 8:51 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-02 8:51 UTC (permalink / raw)
To: Stefan Weil; +Cc: mst, kvm, joro, qemu-devel, blauwirbel, yamahata, paul, avi
On Wed, Sep 01, 2010 at 10:10:30PM +0200, Stefan Weil wrote:
> Please see my comments at the end of this mail.
>
>
> Am 30.08.2010 00:08, schrieb Eduard - Gabriel Munteanu:
> > PCI devices should access memory through pci_memory_*() instead of
> > cpu_physical_memory_*(). This also provides support for translation and
> > access checking in case an IOMMU is emulated.
> >
> > Memory maps are treated as remote IOTLBs (that is, translation caches
> > belonging to the IOMMU-aware device itself). Clients (devices) must
> > provide callbacks for map invalidation in case these maps are
> > persistent beyond the current I/O context, e.g. AIO DMA transfers.
> >
> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > ---
[snip]
> > +static inline void pci_memory_read(PCIDevice *dev,
> > + pcibus_t addr,
> > + uint8_t *buf,
> > + pcibus_t len)
> > +{
> > + pci_memory_rw(dev, addr, buf, len, 0);
> > +}
> > +
> > +static inline void pci_memory_write(PCIDevice *dev,
> > + pcibus_t addr,
> > + const uint8_t *buf,
> > + pcibus_t len)
> > +{
> > + pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
> > +}
> > +
> > #endif
>
> The functions pci_memory_read and pci_memory_write not only read
> or write byte data but many different data types which leads to
> a lot of type casts in your other patches.
>
> I'd prefer "void *buf" and "const void *buf" in the argument lists.
> Then all those type casts could be removed.
>
> Regards
> Stefan Weil
I only followed an approach similar to how cpu_physical_memory_{read,write}()
is defined. I think I should change both cpu_physical_memory_* stuff and
pci_memory_* stuff, not only the latter, if I decide to go with that
approach.
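For reference, Stefan's suggestion amounts to roughly the following (a
sketch with stubbed types so it stands alone; the real functions translate
through the IOMMU instead of copying from a plain array):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct PCIDevice PCIDevice;     /* opaque; stubbed for the sketch */

static uint8_t guest_ram[64];

static void pci_memory_rw(PCIDevice *dev, uint64_t addr,
                          uint8_t *buf, uint64_t len, int is_write)
{
    (void)dev;
    if (is_write) {
        memcpy(guest_ram + addr, buf, len);
    } else {
        memcpy(buf, guest_ram + addr, len);
    }
}

/* void * variants: callers can pass any object without casting. */
static inline void pci_memory_read(PCIDevice *dev, uint64_t addr,
                                   void *buf, uint64_t len)
{
    pci_memory_rw(dev, addr, buf, len, 0);
}

static inline void pci_memory_write(PCIDevice *dev, uint64_t addr,
                                    const void *buf, uint64_t len)
{
    pci_memory_rw(dev, addr, (uint8_t *)buf, len, 1);
}
```

With these signatures, a device can pass e.g. a descriptor struct directly,
and the uint8_t * casts scattered through the conversion patches go away.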
Eduard
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-02 6:00 ` Michael S. Tsirkin
@ 2010-09-02 9:08 ` Eduard - Gabriel Munteanu
1 sibling, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-02 9:08 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Stefan Weil, kvm, joro, qemu-devel, blauwirbel, yamahata, paul, avi
On Thu, Sep 02, 2010 at 09:00:46AM +0300, Michael S. Tsirkin wrote:
> On Wed, Sep 01, 2010 at 10:10:30PM +0200, Stefan Weil wrote:
> > >+static inline void pci_memory_read(PCIDevice *dev,
> > >+ pcibus_t addr,
> > >+ uint8_t *buf,
> > >+ pcibus_t len)
> > >+{
> > >+ pci_memory_rw(dev, addr, buf, len, 0);
> > >+}
> > >+
> > >+static inline void pci_memory_write(PCIDevice *dev,
> > >+ pcibus_t addr,
> > >+ const uint8_t *buf,
> > >+ pcibus_t len)
> > >+{
> > >+ pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
> > >+}
> > >+
> > >#endif
> >
> > The functions pci_memory_read and pci_memory_write not only read
> > or write byte data but many different data types which leads to
> > a lot of type casts in your other patches.
> >
> > I'd prefer "void *buf" and "const void *buf" in the argument lists.
> > Then all those type casts could be removed.
> >
> > Regards
> > Stefan Weil
>
> Further, I am not sure pcibus_t is a good type to use here.
> This also forces use of pci specific types in e.g. ide, or resorting to
> casts as this patch does. We probably should use a more generic type
> for this.
It only forces use of PCI-specific types in the IDE controller, which is
already a PCI device.
Eduard
> --
> MST
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-09-02 5:19 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-02 9:12 ` Eduard - Gabriel Munteanu
1 sibling, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-02 9:12 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 02, 2010 at 08:19:11AM +0300, Michael S. Tsirkin wrote:
> On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
> > Emulated PCI IDE controllers now use the memory access interface. This
> > also allows an emulated IOMMU to translate and check accesses.
> >
> > Map invalidation results in cancelling DMA transfers. Since the guest OS
> > can't properly recover the DMA results in case the mapping is changed,
> > this is a fairly good approximation.
> >
> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > ---
[snip]
> > +static inline void bmdma_memory_read(BMDMAState *bm,
> > + target_phys_addr_t addr,
> > + uint8_t *buf,
> > + target_phys_addr_t len)
> > +{
> > + bm->rw(bm->opaque, addr, buf, len, 0);
> > +}
> > +
> > +static inline void bmdma_memory_write(BMDMAState *bm,
> > + target_phys_addr_t addr,
> > + uint8_t *buf,
> > + target_phys_addr_t len)
> > +{
> > + bm->rw(bm->opaque, addr, buf, len, 1);
> > +}
> > +
>
> Here again, I am concerned about indirection and pointer chasing on the data path.
> Can we have an iommu pointer in the device, and do a fast path in case
> there is no iommu?
>
See my other reply.
> > static inline IDEState *idebus_active_if(IDEBus *bus)
> > {
> > return bus->ifs + bus->unit;
> > diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> > index bd1c73e..962ae13 100644
> > --- a/hw/ide/macio.c
> > +++ b/hw/ide/macio.c
> > @@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
> >
> > s->io_buffer_size = io->len;
> >
> > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > qemu_sglist_add(&s->sg, io->addr, io->len);
> > io->addr += io->len;
> > io->len = 0;
> > @@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
> > s->io_buffer_index = 0;
> > s->io_buffer_size = io->len;
> >
> > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > qemu_sglist_add(&s->sg, io->addr, io->len);
> > io->addr += io->len;
> > io->len = 0;
> > diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> > index 4d95cc5..5879044 100644
> > --- a/hw/ide/pci.c
> > +++ b/hw/ide/pci.c
> > @@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
> > continue;
> > ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
> > }
> > +
> > + for (i = 0; i < 2; i++) {
> > + d->bmdma[i].rw = (void *) pci_memory_rw;
> > + d->bmdma[i].map = (void *) pci_memory_map;
> > + d->bmdma[i].unmap = (void *) pci_memory_unmap;
> > + d->bmdma[i].opaque = dev;
> > + }
> > }
>
> These casts show something is wrong with the API, IMO.
>
Hm, here's an oversight on my part: I think I should provide explicit
bmdma hooks, since pcibus_t is a uint64_t and target_phys_addr_t is a
uint{32,64}_t depending on the guest machine, so it might be buggy on
32-bit wrt calling conventions. But that introduces yet another
non-inlined function call :-(. That would drop the (void *) cast,
though.
Eduard
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-02 8:40 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-02 9:49 ` Michael S. Tsirkin
1 sibling, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 9:49 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 02, 2010 at 11:40:58AM +0300, Eduard - Gabriel Munteanu wrote:
> On Thu, Sep 02, 2010 at 08:28:26AM +0300, Michael S. Tsirkin wrote:
> > On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
> > > PCI devices should access memory through pci_memory_*() instead of
> > > cpu_physical_memory_*(). This also provides support for translation and
> > > access checking in case an IOMMU is emulated.
> > >
> > > Memory maps are treated as remote IOTLBs (that is, translation caches
> > > belonging to the IOMMU-aware device itself). Clients (devices) must
> > > provide callbacks for map invalidation in case these maps are
> > > persistent beyond the current I/O context, e.g. AIO DMA transfers.
> > >
> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> >
> >
> > I am concerned about adding more pointer chasing on the data path.
> > Could we have
> > 1. an iommu pointer in a device, inherited by secondary buses
> > when they are created and by devices from buses when they are attached.
> > 2. translation pointer in the iommu instead of the bus
>
> The first solution I proposed was based on qdev, that is, each
> DeviceState had an 'iommu' field. Translation would be done by
> recursively looking in the parent bus/devs for an IOMMU.
>
> But Anthony said we're better off with bus-specific APIs, mostly because
> (IIRC) there may be different types of addresses and it might be
> difficult to abstract those properly.
Well we ended up with casting
away types to make pci callbacks fit in ide structure,
and silently assuming that all addresses are in fact 64 bit.
So maybe it's hard to abstract addresses properly, but
it appears we'll have to, to avoid even worse problems.
> I suppose I could revisit the idea by integrating the IOMMU in a
> PCIDevice as opposed to a DeviceState.
>
> Anthony, Paul, any thoughts on this?
Just to clarify: this is an optimization idea:
instead of a bus walk on each access, do the walk
when device is attached to the bus, and copy the iommu
from the root to the device itself.
This will also make it possible to create
DMADeviceState structure which would have this iommu field,
and we'd use this structure instead of the void pointers all over.
> > 3. pci_memory_XX functions inline, doing fast path for non-iommu case:
> >
> > if (__builtin_expect(!dev->iommu, 1)
> > return cpu_memory_rw
>
> But isn't this some sort of 'if (likely(!dev->iommu))' from the Linux
> kernel? If so, it puts the IOMMU-enabled case at disadvantage.
IOMMU has a ton of indirections anyway.
> I suppose most emulated systems would have at least some theoretical
> reasons to enable the IOMMU, e.g. as a GART replacement (say for 32-bit
> PCI devices) or for userspace drivers.
> So there are reasons to enable
> the IOMMU even when you don't have a real host IOMMU and you're not
> using nested guests.
The time when most people enable the IOMMU for all devices, in both real and
virtualized systems, appears distant; one reason is that it has a lot of overhead.
Let's start with not adding overhead for existing users, makes sense?
> > > ---
> > > hw/pci.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> > > hw/pci.h | 74 +++++++++++++++++++++
> > > hw/pci_internals.h | 12 ++++
> > > qemu-common.h | 1 +
> > > 4 files changed, 271 insertions(+), 1 deletions(-)
> >
> > Almost nothing here is PCI specific.
> > Can this code go into dma.c/dma.h?
> > We would have struct DMADevice, APIs like device_dma_write etc.
> > This would help us get rid of the void * stuff as well?
> >
>
> Yeah, I know, that's similar to what I intended to do at first. Though
> I'm not sure that rids us of 'void *' stuff, quite on the contrary from
> what I've seen.
>
> Some stuff still needs to stay 'void *' (or an equivalent typedef, but
> still an opaque) simply because of the required level of abstraction
> that's needed.
>
> [snip]
>
> > > +void pci_register_iommu(PCIDevice *iommu,
> > > + PCITranslateFunc *translate)
> > > +{
> > > + iommu->bus->iommu = iommu;
> > > + iommu->bus->translate = translate;
> > > +}
> > > +
> >
> > The above seems broken for secondary buses, right? Also, can we use
> > qdev for initialization in some way, instead of adding more APIs? E.g.
> > I think it would be nice if we could just use qdev command line flags to
> > control which bus is behind iommu and which isn't.
> >
> >
>
> Each bus must have its own IOMMU. The secondary bus should ask the
> primary bus instead of going through cpu_physical_memory_*(). If that
> isn't the case, it's broken and the secondary bus must be converted to
> the new API just like regular devices. I'll have a look at that.
>
> > > +void pci_memory_rw(PCIDevice *dev,
> > > + pcibus_t addr,
> > > + uint8_t *buf,
> > > + pcibus_t len,
> > > + int is_write)
> > > +{
> > > + int err;
> > > + unsigned perms;
> > > + PCIDevice *iommu = dev->bus->iommu;
> > > + target_phys_addr_t paddr, plen;
> > > +
> > > + perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
> > > +
> > > + while (len) {
> > > + err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
> > > + if (err)
> > > + return;
> > > +
> > > + /* The translation might be valid for larger regions. */
> > > + if (plen > len)
> > > + plen = len;
> > > +
> > > + cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > > +
> > > + len -= plen;
> > > + addr += plen;
> > > + buf += plen;
> > > + }
> > > +}
> > > +
> > > +static void pci_memory_register_map(PCIDevice *dev,
> > > + pcibus_t addr,
> > > + pcibus_t len,
> > > + target_phys_addr_t paddr,
> > > + PCIInvalidateMapFunc *invalidate,
> > > + void *invalidate_opaque)
> > > +{
> > > + PCIMemoryMap *map;
> > > +
> > > + map = qemu_malloc(sizeof(PCIMemoryMap));
> > > + map->addr = addr;
> > > + map->len = len;
> > > + map->paddr = paddr;
> > > + map->invalidate = invalidate;
> > > + map->invalidate_opaque = invalidate_opaque;
> > > +
> > > + QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
> > > +}
> > > +
> > > +static void pci_memory_unregister_map(PCIDevice *dev,
> > > + target_phys_addr_t paddr,
> > > + target_phys_addr_t len)
> > > +{
> > > + PCIMemoryMap *map;
> > > +
> > > + QLIST_FOREACH(map, &dev->memory_maps, list) {
> > > + if (map->paddr == paddr && map->len == len) {
> > > + QLIST_REMOVE(map, list);
> > > + free(map);
> > > + }
> > > + }
> > > +}
> > > +
> > > +void pci_memory_invalidate_range(PCIDevice *dev,
> > > + pcibus_t addr,
> > > + pcibus_t len)
> > > +{
> > > + PCIMemoryMap *map;
> > > +
> > > + QLIST_FOREACH(map, &dev->memory_maps, list) {
> > > + if (ranges_overlap(addr, len, map->addr, map->len)) {
> > > + map->invalidate(map->invalidate_opaque);
> > > + QLIST_REMOVE(map, list);
> > > + free(map);
> > > + }
> > > + }
> > > +}
> > > +
> > > +void *pci_memory_map(PCIDevice *dev,
> > > + PCIInvalidateMapFunc *cb,
> > > + void *opaque,
> > > + pcibus_t addr,
> > > + target_phys_addr_t *len,
> > > + int is_write)
> > > +{
> > > + int err;
> > > + unsigned perms;
> > > + PCIDevice *iommu = dev->bus->iommu;
> > > + target_phys_addr_t paddr, plen;
> > > +
> > > + perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
> > > +
> > > + plen = *len;
> > > + err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
> > > + if (err)
> > > + return NULL;
> > > +
> > > + /*
> > > + * If this is true, the virtual region is contiguous,
> > > + * but the translated physical region isn't. We just
> > > + * clamp *len, much like cpu_physical_memory_map() does.
> > > + */
> > > + if (plen < *len)
> > > + *len = plen;
> > > +
> > > + /* We treat maps as remote TLBs to cope with stuff like AIO. */
> > > + if (cb)
> > > + pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
> > > +
> > > + return cpu_physical_memory_map(paddr, len, is_write);
> > > +}
> > > +
> >
> > All the above is really only useful for when there is an iommu,
> > right? So maybe we should shortcut all this if there's no iommu?
> >
>
> Some people (e.g. Blue) suggested I shouldn't make the IOMMU emulation a
> compile-time option, like I originally did. And I'm not sure any runtime
> "optimization" (as in likely()/unlikely()) is justified.
>
> [snip]
>
> > > diff --git a/hw/pci_internals.h b/hw/pci_internals.h
> > > index e3c93a3..fb134b9 100644
> > > --- a/hw/pci_internals.h
> > > +++ b/hw/pci_internals.h
> > > @@ -33,6 +33,9 @@ struct PCIBus {
> > > Keep a count of the number of devices with raised IRQs. */
> > > int nirq;
> > > int *irq_count;
> > > +
> > > + PCIDevice *iommu;
> > > + PCITranslateFunc *translate;
> > > };
> >
> > Why is translate pointer in a bus? I think it's a work of an iommu?
> >
>
> Anthony and Paul thought it's best to simply ask the parent bus for
> translation. I somewhat agree to that: devices that aren't IOMMU-aware
> simply attempt to do PCI requests to memory and the IOMMU translates
> and checks them transparently.
>
> > > struct PCIBridge {
> > > @@ -44,4 +47,13 @@ struct PCIBridge {
> > > const char *bus_name;
> > > };
> > >
> > > +struct PCIMemoryMap {
> > > + pcibus_t addr;
> > > + pcibus_t len;
> > > + target_phys_addr_t paddr;
> > > + PCIInvalidateMapFunc *invalidate;
> > > + void *invalidate_opaque;
> >
> > Can we have a structure that encapsulates the mapping
> > data instead of a void *?
> >
> >
>
> Not really. 'invalidate_opaque' belongs to device code. It's meant to be
> a handle to easily identify the mapping. For example, DMA code wants to
> cancel AIO transfers when the bus requests the map to be invalidated.
> It's difficult to look that AIO transfer up using non-opaque data.
>
> > > + QLIST_ENTRY(PCIMemoryMap) list;
> > > +};
> > > +
> > > #endif /* QEMU_PCI_INTERNALS_H */
> > > diff --git a/qemu-common.h b/qemu-common.h
> > > index d735235..8b060e8 100644
> > > --- a/qemu-common.h
> > > +++ b/qemu-common.h
> > > @@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
> > > typedef struct PCIHostState PCIHostState;
> > > typedef struct PCIExpressHost PCIExpressHost;
> > > typedef struct PCIBus PCIBus;
> > > +typedef struct PCIMemoryMap PCIMemoryMap;
> > > typedef struct PCIDevice PCIDevice;
> > > typedef struct PCIBridge PCIBridge;
> > > typedef struct SerialState SerialState;
> > > --
> > > 1.7.1
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-09-02 9:12 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-02 9:58 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 9:58 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 02, 2010 at 12:12:00PM +0300, Eduard - Gabriel Munteanu wrote:
> On Thu, Sep 02, 2010 at 08:19:11AM +0300, Michael S. Tsirkin wrote:
> > On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
> > > Emulated PCI IDE controllers now use the memory access interface. This
> > > also allows an emulated IOMMU to translate and check accesses.
> > >
> > > Map invalidation results in cancelling DMA transfers. Since the guest OS
> > > can't properly recover the DMA results in case the mapping is changed,
> > > this is a fairly good approximation.
> > >
> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > > ---
>
> [snip]
>
> > > +static inline void bmdma_memory_read(BMDMAState *bm,
> > > + target_phys_addr_t addr,
> > > + uint8_t *buf,
> > > + target_phys_addr_t len)
> > > +{
> > > + bm->rw(bm->opaque, addr, buf, len, 0);
> > > +}
> > > +
> > > +static inline void bmdma_memory_write(BMDMAState *bm,
> > > + target_phys_addr_t addr,
> > > + uint8_t *buf,
> > > + target_phys_addr_t len)
> > > +{
> > > + bm->rw(bm->opaque, addr, buf, len, 1);
> > > +}
> > > +
> >
> > Here again, I am concerned about indirection and pointer chasing on the data path.
> > Can we have an iommu pointer in the device, and do a fast path in case
> > there is no iommu?
> >
>
> See my other reply.
I don't insist on this solution, but what other way do you propose to
avoid the overhead for everyone not using an iommu?
I'm all for a solution that would help the iommu case as well,
but none has been proposed yet.
> > > static inline IDEState *idebus_active_if(IDEBus *bus)
> > > {
> > > return bus->ifs + bus->unit;
> > > diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> > > index bd1c73e..962ae13 100644
> > > --- a/hw/ide/macio.c
> > > +++ b/hw/ide/macio.c
> > > @@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
> > >
> > > s->io_buffer_size = io->len;
> > >
> > > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > qemu_sglist_add(&s->sg, io->addr, io->len);
> > > io->addr += io->len;
> > > io->len = 0;
> > > @@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
> > > s->io_buffer_index = 0;
> > > s->io_buffer_size = io->len;
> > >
> > > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > qemu_sglist_add(&s->sg, io->addr, io->len);
> > > io->addr += io->len;
> > > io->len = 0;
> > > diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> > > index 4d95cc5..5879044 100644
> > > --- a/hw/ide/pci.c
> > > +++ b/hw/ide/pci.c
> > > @@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
> > > continue;
> > > ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
> > > }
> > > +
> > > + for (i = 0; i < 2; i++) {
> > > + d->bmdma[i].rw = (void *) pci_memory_rw;
> > > + d->bmdma[i].map = (void *) pci_memory_map;
> > > + d->bmdma[i].unmap = (void *) pci_memory_unmap;
> > > + d->bmdma[i].opaque = dev;
> > > + }
> > > }
> >
> > These casts show something is wrong with the API, IMO.
> >
>
> Hm, here's an oversight on my part: I think I should provide explicit
> bmdma hooks, since pcibus_t is a uint64_t and target_phys_addr_t is a
> uint{32,64}_t depending on the guest machine, so it might be buggy on
> 32-bit wrt calling conventions. But that introduces yet another
> non-inlined function call :-(. That would drop the (void *) cast,
> though.
>
>
> Eduard
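The explicit bmdma hooks Eduard proposes above might look roughly like this. It is a sketch with hypothetical stand-in types and a stubbed `pci_memory_rw`; the point is only that the `target_phys_addr_t` to `pcibus_t` conversion becomes a well-defined integer widening inside the wrapper, rather than a function-pointer cast between mismatched prototypes.

```c
#include <stdint.h>

/* Hypothetical stand-ins: pcibus_t is always 64-bit, while
 * target_phys_addr_t may be 32-bit on some targets, so the two
 * prototypes are genuinely different. */
typedef uint64_t pcibus_t;
typedef uint32_t target_phys_addr_t;

typedef struct PCIDevice PCIDevice;

/* Stub for the real pci_memory_rw(); here it only records the call. */
static pcibus_t last_addr;
static void pci_memory_rw(PCIDevice *dev, pcibus_t addr,
                          uint8_t *buf, pcibus_t len, int is_write)
{
    (void)dev; (void)buf; (void)len; (void)is_write;
    last_addr = addr;
}

/* Explicit hook with the bmdma-side prototype: addr and len are
 * implicitly widened to pcibus_t, and the opaque pointer converts to
 * PCIDevice * without any (void *) cast on the function pointer. */
static void bmdma_rw(void *opaque, target_phys_addr_t addr,
                     uint8_t *buf, target_phys_addr_t len, int is_write)
{
    pci_memory_rw(opaque, addr, buf, len, is_write);
}
```

The cost is the extra non-inlined call Eduard mentions, but the calling convention is now correct on every target.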
So we get away without casts, but only because the C compiler
will silently convert the types, possibly discarding
data in the process. Or we add a check that tries to detect
this, but there's no good way to report a DMA error to the user.
IOW, if our code only works because target fits in pcibus, what good
is the abstraction and the use of distinct types?
This is why I think we need a generic DMA API using dma addresses.
--
MST
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-02 9:08 ` Eduard - Gabriel Munteanu
@ 2010-09-02 13:24 ` Anthony Liguori
-1 siblings, 0 replies; 97+ messages in thread
From: Anthony Liguori @ 2010-09-02 13:24 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: Michael S. Tsirkin, Stefan Weil, kvm, joro, qemu-devel,
blauwirbel, yamahata, paul, avi
On 09/02/2010 04:08 AM, Eduard - Gabriel Munteanu wrote:
> On Thu, Sep 02, 2010 at 09:00:46AM +0300, Michael S. Tsirkin wrote:
>
>> On Wed, Sep 01, 2010 at 10:10:30PM +0200, Stefan Weil wrote:
>>
>>>> +static inline void pci_memory_read(PCIDevice *dev,
>>>> + pcibus_t addr,
>>>> + uint8_t *buf,
>>>> + pcibus_t len)
>>>> +{
>>>> + pci_memory_rw(dev, addr, buf, len, 0);
>>>> +}
>>>> +
>>>> +static inline void pci_memory_write(PCIDevice *dev,
>>>> + pcibus_t addr,
>>>> + const uint8_t *buf,
>>>> + pcibus_t len)
>>>> +{
>>>> + pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
>>>> +}
>>>> +
>>>> #endif
>>>>
>>> The functions pci_memory_read and pci_memory_write not only read
>>> or write byte data but many different data types which leads to
>>> a lot of type casts in your other patches.
>>>
>>> I'd prefer "void *buf" and "const void *buf" in the argument lists.
>>> Then all those type casts could be removed.
>>>
>>> Regards
>>> Stefan Weil
>>>
>> Further, I am not sure pcibus_t is a good type to use here.
>> This also forces use of pci specific types in e.g. ide, or resorting to
>> casts as this patch does. We probably should use a more generic type
>> for this.
>>
> It only forces use of PCI-specific types in the IDE controller, which is
> already a PCI device.
>
But IDE controllers are not always PCI devices... This isn't an issue
with your patch, per se, but with how we're modelling the IDE
controller today. There's no great solution but I think your patch is
an improvement over what we have today.
I do agree with Stefan though that void * would make a lot more sense.
Regards,
Anthony Liguori
> Eduard
>
>
>> --
>> MST
>>
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-09-02 9:58 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-02 15:01 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-02 15:01 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 02, 2010 at 12:58:13PM +0300, Michael S. Tsirkin wrote:
> On Thu, Sep 02, 2010 at 12:12:00PM +0300, Eduard - Gabriel Munteanu wrote:
> > On Thu, Sep 02, 2010 at 08:19:11AM +0300, Michael S. Tsirkin wrote:
> > > On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
>
> I don't insist on this solution, but what other way do you propose to
> avoid the overhead for everyone not using an iommu?
> I'm all for a solution that would help iommu as well,
> but one wasn't yet proposed.
>
Hm, we could get even better performance by simply making the IOMMU a
compile-time option. It also avoids problems in case some device hasn't
been converted yet, and involves little to no tradeoffs. What do you
think?
AFAICT, there are few uses for the IOMMU besides development and
avant-garde stuff, as you note. So distributions can continue supplying
prebuilt QEMU/KVM packages compiled with the IOMMU turned off for the
time being. The only practical (commercial) use right now would be in
the case of private virtual servers, which could be divided further into
nested guests (though real IOMMU hardware isn't widespread yet).
Blue Swirl, in the light of this, do you agree on making it a
compile-time option?
> > > > static inline IDEState *idebus_active_if(IDEBus *bus)
> > > > {
> > > > return bus->ifs + bus->unit;
> > > > diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> > > > index bd1c73e..962ae13 100644
> > > > --- a/hw/ide/macio.c
> > > > +++ b/hw/ide/macio.c
> > > > @@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
> > > >
> > > > s->io_buffer_size = io->len;
> > > >
> > > > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > > qemu_sglist_add(&s->sg, io->addr, io->len);
> > > > io->addr += io->len;
> > > > io->len = 0;
> > > > @@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
> > > > s->io_buffer_index = 0;
> > > > s->io_buffer_size = io->len;
> > > >
> > > > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > > qemu_sglist_add(&s->sg, io->addr, io->len);
> > > > io->addr += io->len;
> > > > io->len = 0;
> > > > diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> > > > index 4d95cc5..5879044 100644
> > > > --- a/hw/ide/pci.c
> > > > +++ b/hw/ide/pci.c
> > > > @@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
> > > > continue;
> > > > ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
> > > > }
> > > > +
> > > > + for (i = 0; i < 2; i++) {
> > > > + d->bmdma[i].rw = (void *) pci_memory_rw;
> > > > + d->bmdma[i].map = (void *) pci_memory_map;
> > > > + d->bmdma[i].unmap = (void *) pci_memory_unmap;
> > > > + d->bmdma[i].opaque = dev;
> > > > + }
> > > > }
> > >
> > > These casts show something is wrong with the API, IMO.
> > >
> >
> > Hm, here's an oversight on my part: I think I should provide explicit
> > bmdma hooks, since pcibus_t is a uint64_t and target_phys_addr_t is a
> > uint{32,64}_t depending on the guest machine, so it might be buggy on
> > 32-bit wrt calling conventions. But that introduces yet another
> > non-inlined function call :-(. That would drop the (void *) cast,
> > though.
> >
> >
> > Eduard
>
> So we get away with it without casts but only because C compiler
> will let us silently convert the types, possibly discarding
> data in the process. Or we'll add a check that will try and detect
> this, but there's no good way to report a DMA error to user.
> IOW, if our code only works because target fits in pcibus, what good
> is the abstraction and using distinct types?
>
> This is why I think we need a generic DMA APIs using dma addresses.
The API was designed not to report errors. That's because the
PCI bus doesn't provide any possibility of doing so (real devices can't
retry transfers in case an I/O page fault occurs).
In my previous generic IOMMU layer implementation pci_memory_*()
returned non-zero on failure, but I decided to drop it when switching to
a PCI-only (rather, a PCI-specific) approach.
In case target_phys_addr_t no longer fits in pcibus_t by a simple
implicit conversion, those explicit bmdma hooks I was going to add will
do the necessary conversions.
The idea of using distinct types is two-fold: let the programmer know
not to rely on them being the same thing, and let the compiler prevent
him from shooting himself in the foot (like I did). Even if there is a
dma_addr_t, some piece of code still needs to provide glue and
conversion between the DMA code and bus-specific code.
Eduard
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-09-02 15:01 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-02 15:24 ` Avi Kivity
-1 siblings, 0 replies; 97+ messages in thread
From: Avi Kivity @ 2010-09-02 15:24 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: Michael S. Tsirkin, joro, blauwirbel, paul, anthony, av1474,
yamahata, kvm, qemu-devel
On 09/02/2010 06:01 PM, Eduard - Gabriel Munteanu wrote:
> On Thu, Sep 02, 2010 at 12:58:13PM +0300, Michael S. Tsirkin wrote:
>> On Thu, Sep 02, 2010 at 12:12:00PM +0300, Eduard - Gabriel Munteanu wrote:
>>> On Thu, Sep 02, 2010 at 08:19:11AM +0300, Michael S. Tsirkin wrote:
>>>> On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
>> I don't insist on this solution, but what other way do you propose to
>> avoid the overhead for everyone not using an iommu?
>> I'm all for a solution that would help iommu as well,
>> but one wasn't yet proposed.
>>
> Hm, we could get even better performance by simply making the IOMMU a
> compile-time option. It also avoids problems in case some device hasn't
> been converted yet, and involves little to no tradeoffs. What do you
> think?
>
> AFAICT, there are few uses for the IOMMU besides development and
> avantgarde stuff, as you note. So distributions can continue supplying
> prebuilt QEMU/KVM packages compiled with the IOMMU turned off for the
> time being. The only practical (commercial) use right now would be in
> the case of private virtual servers, which could be divided further into
> nested guests (though real IOMMU hardware isn't widespread yet).
>
> Blue Swirl, in the light of this, do you agree on making it a
> compile-time option?
That's not a practical long term solution. Eventually everything gets
turned on.
I don't really see a problem with the additional indirection. By the
time we reach actual hardware to satisfy the request, we'll have gone
through many such indirections; modern processors deal very well with them.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-09-02 15:01 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-02 15:31 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 15:31 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 02, 2010 at 06:01:35PM +0300, Eduard - Gabriel Munteanu wrote:
> On Thu, Sep 02, 2010 at 12:58:13PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Sep 02, 2010 at 12:12:00PM +0300, Eduard - Gabriel Munteanu wrote:
> > > On Thu, Sep 02, 2010 at 08:19:11AM +0300, Michael S. Tsirkin wrote:
> > > > On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
> >
> > I don't insist on this solution, but what other way do you propose to
> > avoid the overhead for everyone not using an iommu?
> > I'm all for a solution that would help iommu as well,
> > but one wasn't yet proposed.
> >
>
> Hm, we could get even better performance by simply making the IOMMU a
> compile-time option. It also avoids problems in case some device hasn't
> been converted yet, and involves little to no tradeoffs. What do you
> think?
>
> AFAICT, there are few uses for the IOMMU besides development and
> avantgarde stuff, as you note. So distributions can continue supplying
> prebuilt QEMU/KVM packages compiled with the IOMMU turned off for the
> time being. The only practical (commercial) use right now would be in
> the case of private virtual servers, which could be divided further into
> nested guests (though real IOMMU hardware isn't widespread yet).
>
> Blue Swirl, in the light of this, do you agree on making it a
> compile-time option?
>
> > > > > static inline IDEState *idebus_active_if(IDEBus *bus)
> > > > > {
> > > > > return bus->ifs + bus->unit;
> > > > > diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> > > > > index bd1c73e..962ae13 100644
> > > > > --- a/hw/ide/macio.c
> > > > > +++ b/hw/ide/macio.c
> > > > > @@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
> > > > >
> > > > > s->io_buffer_size = io->len;
> > > > >
> > > > > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > > > qemu_sglist_add(&s->sg, io->addr, io->len);
> > > > > io->addr += io->len;
> > > > > io->len = 0;
> > > > > @@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
> > > > > s->io_buffer_index = 0;
> > > > > s->io_buffer_size = io->len;
> > > > >
> > > > > - qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > > + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > > > qemu_sglist_add(&s->sg, io->addr, io->len);
> > > > > io->addr += io->len;
> > > > > io->len = 0;
> > > > > diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> > > > > index 4d95cc5..5879044 100644
> > > > > --- a/hw/ide/pci.c
> > > > > +++ b/hw/ide/pci.c
> > > > > @@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
> > > > > continue;
> > > > > ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
> > > > > }
> > > > > +
> > > > > + for (i = 0; i < 2; i++) {
> > > > > + d->bmdma[i].rw = (void *) pci_memory_rw;
> > > > > + d->bmdma[i].map = (void *) pci_memory_map;
> > > > > + d->bmdma[i].unmap = (void *) pci_memory_unmap;
> > > > > + d->bmdma[i].opaque = dev;
> > > > > + }
> > > > > }
> > > >
> > > > These casts show something is wrong with the API, IMO.
> > > >
> > >
> > > Hm, here's an oversight on my part: I think I should provide explicit
> > > bmdma hooks, since pcibus_t is a uint64_t and target_phys_addr_t is a
> > > uint{32,64}_t depending on the guest machine, so it might be buggy on
> > > 32-bit wrt calling conventions. But that introduces yet another
> > > non-inlined function call :-(. That would drop the (void *) cast,
> > > though.
> > >
> > >
> > > Eduard
> >
> > So we get away with it without casts but only because C compiler
> > will let us silently convert the types, possibly discarding
> > data in the process. Or we'll add a check that will try and detect
> > this, but there's no good way to report a DMA error to user.
> > IOW, if our code only works because target fits in pcibus, what good
> > is the abstraction and using distinct types?
> >
> > This is why I think we need a generic DMA APIs using dma addresses.
>
> The API was made so that it doesn't report errors. That's because the
> PCI bus doesn't provide any possibility of doing so (real devices can't
> retry transfers in case an I/O page fault occurs).
This is what I am saying. We can't deal with errors.
> In my previous generic IOMMU layer implementation pci_memory_*()
> returned non-zero on failure, but I decided to drop it when switching to
> a PCI-only (rather a PCI-specific) approach.
>
> In case target_phys_addr_t no longer fits in pcibus_t by a simple
> implicit conversion, those explicit bmdma hooks I was going to add will
> do the necessary conversions.
>
> The idea of using distinct types is two-fold: let the programmer know
> not to rely on them being the same thing, and let the compiler prevent
> him from shooting himself in the foot (like I did). Even if there is a
> dma_addr_t, some piece of code still needs to provide glue and
> conversion between the DMA code and bus-specific code.
>
>
> Eduard
Nothing I see here is bus-specific, really. Without an mmu addresses
that make sense are target addresses, with iommu - whatever iommu
supports. So make iommu work with dma_addr_t and do the conversion.
--
MST
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-09-02 15:24 ` [Qemu-devel] " Avi Kivity
@ 2010-09-02 15:39 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 15:39 UTC (permalink / raw)
To: Avi Kivity
Cc: Eduard - Gabriel Munteanu, joro, blauwirbel, paul, anthony,
av1474, yamahata, kvm, qemu-devel
On Thu, Sep 02, 2010 at 06:24:25PM +0300, Avi Kivity wrote:
> That's not a practical long term solution. Eventually everything
> gets turned on.
That's why I wanted a simple !iommu check and fallback.
This way unless it's really used there's no overhead.
> I don't really see a problem with the additional indirection. By
> the time we reach actual hardware to satisfy the request,
> we'll have gone through many such indirections; modern processors deal
> very well with them.
>
> --
> error compiling committee.c: too many arguments to function
* Re: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-02 8:51 ` Eduard - Gabriel Munteanu
@ 2010-09-02 16:05 ` Stefan Weil
-1 siblings, 0 replies; 97+ messages in thread
From: Stefan Weil @ 2010-09-02 16:05 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: mst, kvm, joro, qemu-devel, blauwirbel, yamahata, paul, avi
Am 02.09.2010 10:51, schrieb Eduard - Gabriel Munteanu:
> On Wed, Sep 01, 2010 at 10:10:30PM +0200, Stefan Weil wrote:
>
>> Please see my comments at the end of this mail.
>>
>>
>> Am 30.08.2010 00:08, schrieb Eduard - Gabriel Munteanu:
>>
>>> PCI devices should access memory through pci_memory_*() instead of
>>> cpu_physical_memory_*(). This also provides support for translation and
>>> access checking in case an IOMMU is emulated.
>>>
>>> Memory maps are treated as remote IOTLBs (that is, translation caches
>>> belonging to the IOMMU-aware device itself). Clients (devices) must
>>> provide callbacks for map invalidation in case these maps are
>>> persistent beyond the current I/O context, e.g. AIO DMA transfers.
>>>
>>> Signed-off-by: Eduard - Gabriel Munteanu<eduard.munteanu@linux360.ro>
>>> ---
>>>
> [snip]
>
>
>>> +static inline void pci_memory_read(PCIDevice *dev,
>>> + pcibus_t addr,
>>> + uint8_t *buf,
>>> + pcibus_t len)
>>> +{
>>> + pci_memory_rw(dev, addr, buf, len, 0);
>>> +}
>>> +
>>> +static inline void pci_memory_write(PCIDevice *dev,
>>> + pcibus_t addr,
>>> + const uint8_t *buf,
>>> + pcibus_t len)
>>> +{
>>> + pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
>>> +}
>>> +
>>> #endif
>>>
>> The functions pci_memory_read and pci_memory_write not only read
>> or write byte data but many different data types which leads to
>> a lot of type casts in your other patches.
>>
>> I'd prefer "void *buf" and "const void *buf" in the argument lists.
>> Then all those type casts could be removed.
>>
>> Regards
>> Stefan Weil
>>
> I only followed an approach similar to how cpu_physical_memory_{read,write}()
> is defined. I think I should change both cpu_physical_memory_* stuff and
> pci_memory_* stuff, not only the latter, if I decide to go on that
> approach.
>
>
> Eduard
>
Yes, cpu_physical_memory_read, cpu_physical_memory_write
and cpu_physical_memory_rw should be changed, too.
They also require several type casts today.
But this change can be done in an independent patch.
Stefan
* Re: [PATCH 4/7] ide: use the PCI memory access interface
2010-09-02 15:39 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-02 16:07 ` Avi Kivity
-1 siblings, 0 replies; 97+ messages in thread
From: Avi Kivity @ 2010-09-02 16:07 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Eduard - Gabriel Munteanu, joro, blauwirbel, paul, anthony,
av1474, yamahata, kvm, qemu-devel
On 09/02/2010 06:39 PM, Michael S. Tsirkin wrote:
> On Thu, Sep 02, 2010 at 06:24:25PM +0300, Avi Kivity wrote:
>> That's not a practical long term solution. Eventually everything
>> gets turned on.
> That's why I wanted a simple !iommu check and fallback.
> This way unless it's really used there's no overhead.
>
It's not very different from an indirect function call. Modern branch
predictors store the target function address and supply it ahead of time.
I've never seen a function call instruction in a profile.
--
error compiling committee.c: too many arguments to function
* Re: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-02 16:05 ` Stefan Weil
@ 2010-09-02 16:14 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-02 16:14 UTC (permalink / raw)
To: Stefan Weil; +Cc: mst, kvm, joro, qemu-devel, blauwirbel, yamahata, paul, avi
On Thu, Sep 02, 2010 at 06:05:39PM +0200, Stefan Weil wrote:
> Am 02.09.2010 10:51, schrieb Eduard - Gabriel Munteanu:
[snip]
> >> The functions pci_memory_read and pci_memory_write not only read
> >> or write byte data but many different data types which leads to
> >> a lot of type casts in your other patches.
> >>
> >> I'd prefer "void *buf" and "const void *buf" in the argument lists.
> >> Then all those type casts could be removed.
> >>
> >> Regards
> >> Stefan Weil
> >>
> > I only followed an approach similar to how cpu_physical_memory_{read,write}()
> > is defined. I think I should change both cpu_physical_memory_* stuff and
> > pci_memory_* stuff, not only the latter, if I decide to go on that
> > approach.
> >
> >
> > Eduard
> >
>
>
> Yes, cpu_physical_memory_read, cpu_physical_memory_write
> and cpu_physical_memory_rw should be changed, too.
>
> They also require several type casts today.
>
> But this change can be done in an independent patch.
>
> Stefan
Roger, I'm on it. The existing casts could remain there AFAICT, so it's
a pretty simple change.
Eduard
* Re: [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-02 9:49 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-04 9:01 ` Blue Swirl
-1 siblings, 0 replies; 97+ messages in thread
From: Blue Swirl @ 2010-09-04 9:01 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Eduard - Gabriel Munteanu, joro, paul, avi, anthony, av1474,
yamahata, kvm, qemu-devel
On Thu, Sep 2, 2010 at 9:49 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Sep 02, 2010 at 11:40:58AM +0300, Eduard - Gabriel Munteanu wrote:
>> On Thu, Sep 02, 2010 at 08:28:26AM +0300, Michael S. Tsirkin wrote:
>> > On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
>> > > PCI devices should access memory through pci_memory_*() instead of
>> > > cpu_physical_memory_*(). This also provides support for translation and
>> > > access checking in case an IOMMU is emulated.
>> > >
>> > > Memory maps are treated as remote IOTLBs (that is, translation caches
>> > > belonging to the IOMMU-aware device itself). Clients (devices) must
>> > > provide callbacks for map invalidation in case these maps are
>> > > persistent beyond the current I/O context, e.g. AIO DMA transfers.
>> > >
>> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>> >
>> >
> >> > I am concerned about adding more pointer chasing on the data path.
>> > Could we have
>> > 1. an iommu pointer in a device, inherited by secondary buses
>> > when they are created and by devices from buses when they are attached.
>> > 2. translation pointer in the iommu instead of the bus
>>
>> The first solution I proposed was based on qdev, that is, each
>> DeviceState had an 'iommu' field. Translation would be done by
>> recursively looking in the parent bus/devs for an IOMMU.
>>
>> But Anthony said we're better off with bus-specific APIs, mostly because
>> (IIRC) there may be different types of addresses and it might be
>> difficult to abstract those properly.
>
> Well we ended up with casting
> away types to make pci callbacks fit in ide structure,
> and silently assuming that all addresses are in fact 64 bit.
> So maybe it's hard to abstract addresses properly, but
> it appears we'll have to, to avoid even worse problems.
>
>> I suppose I could revisit the idea by integrating the IOMMU in a
>> PCIDevice as opposed to a DeviceState.
>>
>> Anthony, Paul, any thoughts on this?
>
> Just to clarify: this is an optimization idea:
> instead of a bus walk on each access, do the walk
> when device is attached to the bus, and copy the iommu
> from the root to the device itself.
>
> This will also make it possible to create
> DMADeviceState structure which would have this iommu field,
> and we'd use this structure instead of the void pointers all over.
>
>
>> > 3. pci_memory_XX functions inline, doing fast path for non-iommu case:
>> >
>> > if (__builtin_expect(!dev->iommu, 1))
>> > return cpu_memory_rw
>>
>> But isn't this some sort of 'if (likely(!dev->iommu))' from the Linux
>> kernel? If so, it puts the IOMMU-enabled case at disadvantage.
>
> IOMMU has a ton of indirections anyway.
>
>> I suppose most emulated systems would have at least some theoretical
>> reasons to enable the IOMMU, e.g. as a GART replacement (say for 32-bit
>> PCI devices) or for userspace drivers.
>> So there are reasons to enable
>> the IOMMU even when you don't have a real host IOMMU and you're not
>> using nested guests.
>
> The time most people enable iommu for all devices in both real and virtualized
> systems appears distant, one of the reasons is because it has a lot of overhead.
> Let's start with not adding overhead for existing users, makes sense?
I think the goal architecture (not for IOMMU, but in general) is one
with zero copy DMA. This means we have stage one where the addresses
are translated to host pointers and stage two where the read/write
operation happens. The devices need to be converted.
Now, let's consider the IOMMU in this zero copy architecture. It's one
stage of address translation, for the access operation it will not
matter. We can add translation caching at device level (or even at any
intermediate levels), but that needs a cache invalidation callback
system as discussed earlier. This can be implemented later, we need
the zero copy stuff first.
Given this overall picture, I think eliminating some pointer
dereference overheads in non-zero copy architecture is a very
premature optimization and it may even direct the architecture to
wrong direction.
If the performance degradation at this point is not acceptable, we
could also postpone merging IOMMU until zero copy conversion has
happened, or make IOMMU a compile time option. But it would be nice to
back the decision by performance figures.
* Re: [PATCH 2/7] pci: memory access API and IOMMU support
2010-09-04 9:01 ` [Qemu-devel] " Blue Swirl
@ 2010-09-05 7:10 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-05 7:10 UTC (permalink / raw)
To: Blue Swirl
Cc: Eduard - Gabriel Munteanu, joro, paul, avi, anthony, av1474,
yamahata, kvm, qemu-devel
On Sat, Sep 04, 2010 at 09:01:06AM +0000, Blue Swirl wrote:
> On Thu, Sep 2, 2010 at 9:49 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Thu, Sep 02, 2010 at 11:40:58AM +0300, Eduard - Gabriel Munteanu wrote:
> >> On Thu, Sep 02, 2010 at 08:28:26AM +0300, Michael S. Tsirkin wrote:
> >> > On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
> >> > > PCI devices should access memory through pci_memory_*() instead of
> >> > > cpu_physical_memory_*(). This also provides support for translation and
> >> > > access checking in case an IOMMU is emulated.
> >> > >
> >> > > Memory maps are treated as remote IOTLBs (that is, translation caches
> >> > > belonging to the IOMMU-aware device itself). Clients (devices) must
> >> > > provide callbacks for map invalidation in case these maps are
> >> > > persistent beyond the current I/O context, e.g. AIO DMA transfers.
> >> > >
> >> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> >> >
> >> >
> >> > I am concerned about adding more pointer chasing on the data path.
> >> > Could we have
> >> > 1. an iommu pointer in a device, inherited by secondary buses
> >> > when they are created and by devices from buses when they are attached.
> >> > 2. translation pointer in the iommu instead of the bus
> >>
> >> The first solution I proposed was based on qdev, that is, each
> >> DeviceState had an 'iommu' field. Translation would be done by
> >> recursively looking in the parent bus/devs for an IOMMU.
> >>
> >> But Anthony said we're better off with bus-specific APIs, mostly because
> >> (IIRC) there may be different types of addresses and it might be
> >> difficult to abstract those properly.
> >
> > Well we ended up with casting
> > away types to make pci callbacks fit in ide structure,
> > and silently assuming that all addresses are in fact 64 bit.
> > So maybe it's hard to abstract addresses properly, but
> > it appears we'll have to, to avoid even worse problems.
> >
> >> I suppose I could revisit the idea by integrating the IOMMU in a
> >> PCIDevice as opposed to a DeviceState.
> >>
> >> Anthony, Paul, any thoughts on this?
> >
> > Just to clarify: this is an optimization idea:
> > instead of a bus walk on each access, do the walk
> > when device is attached to the bus, and copy the iommu
> > from the root to the device itself.
> >
> > This will also make it possible to create
> > DMADeviceState structure which would have this iommu field,
> > and we'd use this structure instead of the void pointers all over.
> >
> >
> >> > 3. pci_memory_XX functions inline, doing fast path for non-iommu case:
> >> >
> >> > if (__builtin_expect(!dev->iommu, 1))
> >> > return cpu_memory_rw
> >>
> >> But isn't this some sort of 'if (likely(!dev->iommu))' from the Linux
> >> kernel? If so, it puts the IOMMU-enabled case at disadvantage.
> >
> > IOMMU has a ton of indirections anyway.
> >
> >> I suppose most emulated systems would have at least some theoretical
> >> reasons to enable the IOMMU, e.g. as a GART replacement (say for 32-bit
> >> PCI devices) or for userspace drivers.
> >> So there are reasons to enable
> >> the IOMMU even when you don't have a real host IOMMU and you're not
> >> using nested guests.
> >
> > The day when most people enable an IOMMU for all devices, in both real
> > and virtualized systems, appears distant; one reason is that it has a lot
> > of overhead. Let's start by not adding overhead for existing users, makes sense?
>
> I think the goal architecture (not for IOMMU, but in general) is one
> with zero copy DMA. This means we have stage one where the addresses
> are translated to host pointers and stage two where the read/write
> operation happens. The devices need to be converted.
>
> Now, let's consider the IOMMU in this zero copy architecture. It's one
> stage of address translation, for the access operation it will not
> matter. We can add translation caching at device level (or even at any
> intermediate levels), but that needs a cache invalidation callback
> system as discussed earlier. This can be implemented later, we need
> the zero copy stuff first.
>
> Given this overall picture, I think eliminating some pointer
> dereference overheads in the non-zero-copy architecture is a very
> premature optimization, and it may even steer the architecture in the
> wrong direction.
>
> If the performance degradation at this point is not acceptable, we
> could also postpone merging the IOMMU until the zero copy conversion has
> happened, or make the IOMMU a compile-time option. But it would be nice
> to back the decision with performance figures.
I agree, a minimal benchmark showing no performance impact
when disabled would put these concerns to rest.
--
MST
* [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-13 20:01 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-13 20:01 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
So I think the following will give the idea of what an API
might look like that will let us avoid the scary hacks in
e.g. the ide layer and other generic layers that need to do DMA,
without either binding us to pci, adding more complexity with
callbacks, or losing type safety with casts and void*.
Basically we have DMADevice that we can use container_of on
to get a PCIDevice from, and DMAMmu that will get instantiated
in a specific MMU.
This is not complete code - just a header - I might complete
this later if/when there's interest or hopefully someone interested
in iommu emulation will.
Notes:
the IOMMU_PERM_RW code seems unused, so I replaced
this with plain is_write. Is it ever useful?
It seems that invalidate callback should be able to
get away with just a device, so I switched to that
from a void pointer for type safety.
Seems enough for the users I saw.
I saw devices do stl_le_phys and such; these
might need to be wrapped as well.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
diff --git a/hw/dma_rw.h b/hw/dma_rw.h
new file mode 100644
index 0000000..d63fd17
--- /dev/null
+++ b/hw/dma_rw.h
@@ -0,0 +1,122 @@
+#ifndef DMA_RW_H
+#define DMA_RW_H
+
+#include "qemu-common.h"
+
+/* We currently only have pci mmus, but using
+ a generic type makes it possible to use this
+ e.g. from the generic ide code without callbacks. */
+typedef uint64_t dma_addr_t;
+
+typedef struct DMAMmu DMAMmu;
+typedef struct DMADevice DMADevice;
+
+typedef int DMATranslateFunc(DMAMmu *mmu,
+ DMADevice *dev,
+ dma_addr_t addr,
+ dma_addr_t *paddr,
+ dma_addr_t *len,
+ int is_write);
+
+typedef int DMAInvalidateMapFunc(DMADevice *);
+
+struct DMAMmu {
+    /* invalidate, etc. */
+    DMATranslateFunc *translate;
+};
+
+struct DMADevice {
+ DMAMmu *mmu;
+ DMAInvalidateMapFunc *invalidate;
+};
+
+void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *);
+
+static inline void dma_memory_rw(DMADevice *dev,
+ dma_addr_t addr,
+ void *buf,
+ uint32_t len,
+ int is_write)
+{
+    dma_addr_t paddr, plen;
+    int err;
+
+    /* Fast-path non-iommu.
+     * More importantly, makes it obvious what this function does. */
+    if (!dev->mmu) {
+        cpu_physical_memory_rw(addr, buf, len, is_write);
+        return;
+    }
+    while (len) {
+        err = dev->mmu->translate(dev->mmu, dev, addr, &paddr, &plen, is_write);
+ if (err) {
+ return;
+ }
+
+ /* The translation might be valid for larger regions. */
+ if (plen > len) {
+ plen = len;
+ }
+
+ cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+ len -= plen;
+ addr += plen;
+ buf += plen;
+ }
+}
+
+void *dma_memory_map(DMADevice *dev,
+ dma_addr_t addr,
+ uint32_t *len,
+ int is_write);
+void dma_memory_unmap(DMADevice *dev,
+ void *buffer,
+ uint32_t len,
+ int is_write,
+ uint32_t access_len);
+
+
+#define DEFINE_DMA_LD(suffix, size)                                       \
+uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr)            \
+{                                                                         \
+    int err;                                                              \
+    dma_addr_t paddr, plen;                                               \
+                                                                          \
+    if (!dev->mmu) {                                                      \
+        return ld##suffix##_phys(addr);                                   \
+    }                                                                     \
+                                                                          \
+    err = dev->mmu->translate(dev->mmu, dev,                              \
+                              addr, &paddr, &plen, 0);                    \
+    if (err || (plen < size / 8))                                         \
+        return 0;                                                         \
+                                                                          \
+    return ld##suffix##_phys(paddr);                                      \
+}
+
+#define DEFINE_DMA_ST(suffix, size)                                       \
+void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val)  \
+{                                                                         \
+    int err;                                                              \
+    dma_addr_t paddr, plen;                                               \
+                                                                          \
+    if (!dev->mmu) {                                                      \
+        st##suffix##_phys(addr, val);                                     \
+        return;                                                           \
+    }                                                                     \
+    err = dev->mmu->translate(dev->mmu, dev,                              \
+                              addr, &paddr, &plen, 1);                    \
+    if (err || (plen < size / 8))                                         \
+        return;                                                           \
+                                                                          \
+    st##suffix##_phys(paddr, val);                                        \
+}
+
+DEFINE_DMA_LD(ub, 8)
+DEFINE_DMA_LD(uw, 16)
+DEFINE_DMA_LD(l, 32)
+DEFINE_DMA_LD(q, 64)
+
+DEFINE_DMA_ST(b, 8)
+DEFINE_DMA_ST(w, 16)
+DEFINE_DMA_ST(l, 32)
+DEFINE_DMA_ST(q, 64)
+
+#endif
diff --git a/hw/pci.h b/hw/pci.h
index 1c6075e..9737f0e 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -5,6 +5,7 @@
#include "qobject.h"
#include "qdev.h"
+#include "dma_rw.h"
/* PCI includes legacy ISA access. */
#include "isa.h"
@@ -119,6 +120,10 @@ enum {
struct PCIDevice {
DeviceState qdev;
+
+ /* For devices that do DMA. */
+ DMADevice dma;
+
/* PCI config space */
uint8_t *config;
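For what it's worth, here is a compilable sketch (my illustration, not part of the patch) of how an invalidate callback could recover the embedding PCIDevice from its DMADevice via container_of:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the structures in the proposed header. */
typedef struct DMADevice {
    void *mmu;
} DMADevice;

typedef struct PCIDevice {
    int devfn;            /* placeholder for the real PCI state */
    DMADevice dma;        /* embedded, as proposed for struct PCIDevice */
} PCIDevice;

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An invalidate callback gets only a DMADevice but can find its PCIDevice. */
int pci_invalidate_maps(DMADevice *dev)
{
    PCIDevice *pci = container_of(dev, PCIDevice, dma);
    return pci->devfn;    /* stand-in for the real invalidation work */
}
```

This is what keeps the API bus-agnostic without void pointers: only PCI code ever performs the container_of.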
* Re: [Qemu-devel] [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
2010-09-13 20:01 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-13 20:45 ` Anthony Liguori
-1 siblings, 0 replies; 97+ messages in thread
From: Anthony Liguori @ 2010-09-13 20:45 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Eduard - Gabriel Munteanu, kvm, joro, qemu-devel, blauwirbel,
yamahata, paul, avi
On 09/13/2010 03:01 PM, Michael S. Tsirkin wrote:
> So I think the following will give the idea of what an API
> might look like that will let us avoid the scary hacks in
> e.g. the ide layer and other generic layers that need to do DMA,
> without either binding us to pci, adding more complexity with
> callbacks, or losing type safety with casts and void*.
>
> Basically we have DMADevice that we can use container_of on
> to get a PCIDevice from, and DMAMmu that will get instanciated
> in a specific MMU.
>
> This is not complete code - just a header - I might complete
> this later if/when there's interest or hopefully someone interested
> in iommu emulation will.
>
> Notes:
> the IOMMU_PERM_RW code seem unused, so I replaced
> this with plain is_write. Is it ever useful?
>
> It seems that invalidate callback should be able to
> get away with just a device, so I switched to that
> from a void pointer for type safety.
> Seems enough for the users I saw.
>
> I saw devices do stl_le_phys and such, these
> might need to be wrapped as well.
>
> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>
One of the troubles with an interface like this is that I'm not sure a
generic model universally works.
For instance, I know some PCI busses do transparent byte swapping. For
this to work, there has to be a notion of generic memory reads/writes
vs. reads of a 32-bit, 16-bit, and 8-bit value.
With a generic API, we lose the flexibility to do this type of bus
interface.
Regards,
Anthony Liguori
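To illustrate the point: a byte-swapping bus has to know the access width, and a width-oblivious byte copy cannot apply the swap. A toy sketch (assumed names, not a real QEMU interface):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static inline uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical bus-level 32-bit load: only a sized accessor can apply
 * the transparent byte swap some PCI buses perform on 32-bit cycles. */
uint32_t bus_ldl(const uint8_t *mem, int byteswap)
{
    uint32_t v;
    memcpy(&v, mem, sizeof(v));
    return byteswap ? bswap32(v) : v;
}
```

A generic `rw(buf, len)` entry point has no equivalent place to hang that per-width behavior.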
* Re: [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
2010-09-13 20:01 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-16 7:06 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-16 7:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Mon, Sep 13, 2010 at 10:01:20PM +0200, Michael S. Tsirkin wrote:
> So I think the following will give the idea of what an API
> might look like that will let us avoid the scary hacks in
> e.g. the ide layer and other generic layers that need to do DMA,
> without either binding us to pci, adding more complexity with
> callbacks, or losing type safety with casts and void*.
>
> Basically we have DMADevice that we can use container_of on
> to get a PCIDevice from, and DMAMmu that will get instanciated
> in a specific MMU.
>
> This is not complete code - just a header - I might complete
> this later if/when there's interest or hopefully someone interested
> in iommu emulation will.
Hi,
I personally like this approach better. It also seems to make poisoning
cpu_physical_memory_*() easier if we convert every device to this API.
We could then ban cpu_physical_memory_*(), perhaps by requiring a
#define and #ifdef-ing those declarations.
> Notes:
> the IOMMU_PERM_RW code seem unused, so I replaced
> this with plain is_write. Is it ever useful?
The original idea made provisions for stuff like full R/W memory maps.
In that case cpu_physical_memory_map() would call the translation /
checking function with perms == IOMMU_PERM_RW. That's not there yet so
it can be removed at the moment, especially since it only affects these
helpers.
Also, I'm not sure if there are other sorts of accesses besides reads
and writes we want to check or translate.
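For reference, the permission flags described here might be checked like this (a sketch with assumed flag values; the RFC header itself only carries the read/write distinction):

```c
#include <assert.h>

/* Assumed flag values for illustration. */
enum {
    IOMMU_PERM_READ  = 1 << 0,
    IOMMU_PERM_WRITE = 1 << 1,
    IOMMU_PERM_RW    = IOMMU_PERM_READ | IOMMU_PERM_WRITE,
};

/* A map is only valid if every requested permission is granted, so a
 * full R/W map must have both bits set in the granted mask. */
int iommu_check_perms(unsigned granted, unsigned requested)
{
    return (granted & requested) == requested;
}
```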
> It seems that invalidate callback should be able to
> get away with just a device, so I switched to that
> from a void pointer for type safety.
> Seems enough for the users I saw.
I think this makes matters too complicated. Normally, a single DMADevice
should be embedded within a <bus>Device, so doing this makes it really
hard to invalidate a specific map when there are more of them. It forces
device code to act as a bus, provide fake 'DMADevice's for each map and
dispatch translation to the real DMATranslateFunc. I see no other way.
If you really want more type-safety (although I think this is a case of
a true opaque identifying something only device code understands), I
have another proposal: have a DMAMap embedded in the opaque. Example
from dma-helpers.c:
typedef struct {
DMADevice *owner;
[...]
} DMAMap;
typedef struct {
[...]
DMAMap map;
[...]
} DMAAIOCB;
/* The callback. */
static void dma_bdrv_cancel(DMAMap *map)
{
DMAAIOCB *dbs = container_of(map, DMAAIOCB, map);
[...]
}
The upside is we only need to pass the DMAMap. That can also contain
details of the actual map in case the device wants to release only the
relevant range and remap the rest.
> I saw devices do stl_le_phys and such, these
> might need to be wrapped as well.
stl_le_phys() is defined and used only by hw/eepro100.c. That's already
dealt with by converting the device.
Thanks,
Eduard
* Re: [Qemu-devel] [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
2010-09-13 20:45 ` Anthony Liguori
@ 2010-09-16 7:12 ` Eduard - Gabriel Munteanu
-1 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-16 7:12 UTC (permalink / raw)
To: Anthony Liguori
Cc: Michael S. Tsirkin, kvm, joro, qemu-devel, blauwirbel, yamahata,
paul, avi
On Mon, Sep 13, 2010 at 03:45:34PM -0500, Anthony Liguori wrote:
> On 09/13/2010 03:01 PM, Michael S. Tsirkin wrote:
> > So I think the following will give the idea of what an API
> > might look like that will let us avoid the scary hacks in
> > e.g. the ide layer and other generic layers that need to do DMA,
> > without either binding us to pci, adding more complexity with
> > callbacks, or losing type safety with casts and void*.
> >
> > Basically we have DMADevice that we can use container_of on
> > to get a PCIDevice from, and DMAMmu that will get instantiated
> > in a specific MMU.
> >
> > This is not complete code - just a header - I might complete
> > this later if/when there's interest or hopefully someone interested
> > in iommu emulation will.
> >
> > Notes:
> > the IOMMU_PERM_RW code seems unused, so I replaced
> > this with plain is_write. Is it ever useful?
> >
> > It seems that invalidate callback should be able to
> > get away with just a device, so I switched to that
> > from a void pointer for type safety.
> > Seems enough for the users I saw.
> >
> > I saw devices do stl_le_phys and such, these
> > might need to be wrapped as well.
> >
> > Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> >
>
> One of the troubles with an interface like this is that I'm not sure a
> generic model universally works.
>
> For instance, I know some PCI busses do transparent byte swapping. For
> this to work, there has to be a notion of generic memory reads/writes
> vs. reads of a 32-bit, 16-bit, and 8-bit value.
>
> With a generic API, we lose the flexibility to do this type of bus
> interface.
>
> Regards,
>
> Anthony Liguori
>
[snip]
I suppose additional callbacks that do the actual R/W could solve this.
If those aren't present, default to cpu_physical_memory_*().
It should be easy for such a callback to decide on a case-by-case basis
depending on the R/W transaction size, if this is ever needed.
Eduard
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
2010-09-16 7:06 ` [Qemu-devel] " Eduard - Gabriel Munteanu
@ 2010-09-16 9:20 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-16 9:20 UTC (permalink / raw)
To: Eduard - Gabriel Munteanu
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 16, 2010 at 10:06:16AM +0300, Eduard - Gabriel Munteanu wrote:
> On Mon, Sep 13, 2010 at 10:01:20PM +0200, Michael S. Tsirkin wrote:
> > So I think the following will give the idea of what an API
> > might look like that will let us avoid the scary hacks in
> > e.g. the ide layer and other generic layers that need to do DMA,
> > without either binding us to pci, adding more complexity with
> > callbacks, or losing type safety with casts and void*.
> >
> > Basically we have DMADevice that we can use container_of on
> > to get a PCIDevice from, and DMAMmu that will get instantiated
> > in a specific MMU.
> >
> > This is not complete code - just a header - I might complete
> > this later if/when there's interest or hopefully someone interested
> > in iommu emulation will.
>
> Hi,
>
> I personally like this approach better. It also seems to make poisoning
> cpu_physical_memory_*() easier if we convert every device to this API.
> We could then ban cpu_physical_memory_*(), perhaps by requiring a
> #define and #ifdef-ing those declarations.
>
> > Notes:
> > the IOMMU_PERM_RW code seems unused, so I replaced
> > this with plain is_write. Is it ever useful?
>
> The original idea made provisions for stuff like full R/W memory maps.
> In that case cpu_physical_memory_map() would call the translation /
> checking function with perms == IOMMU_PERM_RW. That's not there yet so
> it can be removed at the moment, especially since it only affects these
> helpers.
>
> Also, I'm not sure if there are other sorts of accesses besides reads
> and writes we want to check or translate.
>
> > It seems that invalidate callback should be able to
> > get away with just a device, so I switched to that
> > from a void pointer for type safety.
> > Seems enough for the users I saw.
>
> I think this makes matters too complicated. Normally, a single DMADevice
> should be embedded within a <bus>Device,
No, DMADevice is a device that does DMA.
So e.g. a PCI device would embed one.
Remember, translations are per device, right?
DMAMmu is part of the iommu object.
> so doing this makes it really
> hard to invalidate a specific map when there are more of them. It forces
> device code to act as a bus, provide fake 'DMADevice's for each map and
> dispatch translation to the real DMATranslateFunc. I see no other way.
>
> If you really want more type-safety (although I think this is a case of
> a true opaque identifying something only device code understands), I
> have another proposal: have a DMAMap embedded in the opaque. Example
> from dma-helpers.c:
>
> typedef struct {
> DMADevice *owner;
> [...]
> } DMAMap;
>
> typedef struct {
> [...]
> DMAMap map;
> [...]
> } DMAAIOCB;
>
> /* The callback. */
> static void dma_bdrv_cancel(DMAMap *map)
> {
> DMAAIOCB *dbs = container_of(map, DMAAIOCB, map);
>
> [...]
> }
>
> The upside is we only need to pass the DMAMap. That can also contain
> details of the actual map in case the device wants to release only the
> relevant range and remap the rest.
Fine.
Or maybe DMAAIOCB (just make some letters lower case: DMAIocb?).
Everyone will use it anyway, right?
> > I saw devices do stl_le_phys and such, these
> > might need to be wrapped as well.
>
> stl_le_phys() is defined and used only by hw/eepro100.c. That's already
> dealt with by converting the device.
>
I see. Need to get around to adding some prefix to it to make this clear.
> Thanks,
> Eduard
>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > ---
> >
> > diff --git a/hw/dma_rw.h b/hw/dma_rw.h
> > new file mode 100644
> > index 0000000..d63fd17
> > --- /dev/null
> > +++ b/hw/dma_rw.h
> > @@ -0,0 +1,122 @@
> > +#ifndef DMA_RW_H
> > +#define DMA_RW_H
> > +
> > +#include "qemu-common.h"
> > +
> > +/* We currently only have pci mmus, but using
> > + a generic type makes it possible to use this
> > + e.g. from the generic ide code without callbacks. */
> > +typedef uint64_t dma_addr_t;
> > +
> > +typedef struct DMAMmu DMAMmu;
> > +typedef struct DMADevice DMADevice;
> > +
> > +typedef int DMATranslateFunc(DMAMmu *mmu,
> > + DMADevice *dev,
> > + dma_addr_t addr,
> > + dma_addr_t *paddr,
> > + dma_addr_t *len,
> > + int is_write);
> > +
> > +typedef int DMAInvalidateMapFunc(DMADevice *);
> > +struct DMAMmu {
> > + /* invalidate, etc. */
> > + DmaTranslateFunc *translate;
> > +};
> > +
> > +struct DMADevice {
> > + DMAMmu *mmu;
> > + DMAInvalidateMapFunc *invalidate;
> > +};
> > +
> > +void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *);
> > +
> > +static inline void dma_memory_rw(DMADevice *dev,
> > + dma_addr_t addr,
> > + void *buf,
> > + uint32_t len,
> > + int is_write)
> > +{
> > + uint32_t plen;
> > + /* Fast-path non-iommu.
> > + * More importantly, makes it obvious what this function does. */
> > + if (!dev->mmu) {
> > + cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > + return;
> > + }
> > + while (len) {
> > + err = dev->mmu->translate(iommu, dev, addr, &paddr, &plen, is_write);
> > + if (err) {
> > + return;
> > + }
> > +
> > + /* The translation might be valid for larger regions. */
> > + if (plen > len) {
> > + plen = len;
> > + }
> > +
> > + cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > +
> > + len -= plen;
> > + addr += plen;
> > + buf += plen;
> > + }
> > +}
> > +
> > +void *dma_memory_map(DMADevice *dev,
> > + dma_addr_t addr,
> > + uint32_t *len,
> > + int is_write);
> > +void dma_memory_unmap(DMADevice *dev,
> > + void *buffer,
> > + uint32_t len,
> > + int is_write,
> > + uint32_t access_len);
> > +
> > +
> > ++#define DEFINE_DMA_LD(suffix, size) \
> > ++uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr) \
> > ++{ \
> > ++ int err; \
> > ++ target_phys_addr_t paddr, plen; \
> > ++ if (!dev->mmu) { \
> > ++ return ld##suffix##_phys(addr, val); \
> > ++ } \
> > ++ \
> > ++ err = dev->mmu->translate(dev->bus->iommu, dev, \
> > ++ addr, &paddr, &plen, IOMMU_PERM_READ); \
> > ++ if (err || (plen < size / 8)) \
> > ++ return 0; \
> > ++ \
> > ++ return ld##suffix##_phys(paddr); \
> > ++}
> > ++
> > ++#define DEFINE_DMA_ST(suffix, size) \
> > ++void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val) \
> > ++{ \
> > ++ int err; \
> > ++ target_phys_addr_t paddr, plen; \
> > ++ \
> > ++ if (!dev->mmu) { \
> > ++ st##suffix##_phys(addr, val); \
> > ++ return; \
> > ++ } \
> > ++ err = dev->mmu->translate(dev->bus->iommu, dev, \
> > ++ addr, &paddr, &plen, IOMMU_PERM_WRITE); \
> > ++ if (err || (plen < size / 8)) \
> > ++ return; \
> > ++ \
> > ++ st##suffix##_phys(paddr, val); \
> > ++}
> > +
> > +DEFINE_DMA_LD(ub, 8)
> > +DEFINE_DMA_LD(uw, 16)
> > +DEFINE_DMA_LD(l, 32)
> > +DEFINE_DMA_LD(q, 64)
> > +
> > +DEFINE_DMA_ST(b, 8)
> > +DEFINE_DMA_ST(w, 16)
> > +DEFINE_DMA_ST(l, 32)
> > +DEFINE_DMA_ST(q, 64)
> > +
> > +#endif
> > diff --git a/hw/pci.h b/hw/pci.h
> > index 1c6075e..9737f0e 100644
> > --- a/hw/pci.h
> > +++ b/hw/pci.h
> > @@ -5,6 +5,7 @@
> > #include "qobject.h"
> >
> > #include "qdev.h"
> > +#include "dma_rw.h"
> >
> > /* PCI includes legacy ISA access. */
> > #include "isa.h"
> > @@ -119,6 +120,10 @@ enum {
> >
> > struct PCIDevice {
> > DeviceState qdev;
> > +
> > + /* For devices that do DMA. */
> > + DMADevice dma;
> > +
> > /* PCI config space */
> > uint8_t *config;
> >
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
2010-09-13 20:45 ` Anthony Liguori
@ 2010-09-16 9:35 ` Michael S. Tsirkin
-1 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-16 9:35 UTC (permalink / raw)
To: Anthony Liguori
Cc: Eduard - Gabriel Munteanu, kvm, joro, qemu-devel, blauwirbel,
yamahata, paul, avi
On Mon, Sep 13, 2010 at 03:45:34PM -0500, Anthony Liguori wrote:
> On 09/13/2010 03:01 PM, Michael S. Tsirkin wrote:
> >So I think the following will give the idea of what an API
> >might look like that will let us avoid the scary hacks in
> >e.g. the ide layer and other generic layers that need to do DMA,
> >without either binding us to pci, adding more complexity with
> >callbacks, or losing type safety with casts and void*.
> >
> >Basically we have DMADevice that we can use container_of on
> >to get a PCIDevice from, and DMAMmu that will get instantiated
> >in a specific MMU.
> >
> >This is not complete code - just a header - I might complete
> >this later if/when there's interest or hopefully someone interested
> >in iommu emulation will.
> >
> >Notes:
> >the IOMMU_PERM_RW code seems unused, so I replaced
> >this with plain is_write. Is it ever useful?
> >
> >It seems that invalidate callback should be able to
> >get away with just a device, so I switched to that
> >from a void pointer for type safety.
> >Seems enough for the users I saw.
> >
> >I saw devices do stl_le_phys and such, these
> >might need to be wrapped as well.
> >
> >Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>
> One of the troubles with an interface like this is that I'm not sure
> a generic model universally works.
>
> For instance, I know some PCI busses do transparent byte swapping.
> For this to work, there has to be a notion of generic memory
> reads/writes vs. reads of a 32-bit, 16-bit, and 8-bit value.
>
> With a generic API, we lose the flexibility to do this type of bus
> interface.
>
> Regards,
>
> Anthony Liguori
Surely only PCI root can do such tricks.
Anyway, I suspect what you refer to is byte swapping of config cycles
and similar I/O done by the driver. If a bus byteswapped a DMA transaction,
that would basically break DMA, as the driver would have to go and fix up
all the data before passing it up to the OS. Right?
We'd have to add more wrappers to emulate such insanity,
as MMU intentionally only handles translation.
> >---
> >
> >diff --git a/hw/dma_rw.h b/hw/dma_rw.h
> >new file mode 100644
> >index 0000000..d63fd17
> >--- /dev/null
> >+++ b/hw/dma_rw.h
> >@@ -0,0 +1,122 @@
> >+#ifndef DMA_RW_H
> >+#define DMA_RW_H
> >+
> >+#include "qemu-common.h"
> >+
> >+/* We currently only have pci mmus, but using
> >+ a generic type makes it possible to use this
> >+ e.g. from the generic ide code without callbacks. */
> >+typedef uint64_t dma_addr_t;
> >+
> >+typedef struct DMAMmu DMAMmu;
> >+typedef struct DMADevice DMADevice;
> >+
> >+typedef int DMATranslateFunc(DMAMmu *mmu,
> >+ DMADevice *dev,
> >+ dma_addr_t addr,
> >+ dma_addr_t *paddr,
> >+ dma_addr_t *len,
> >+ int is_write);
> >+
> >+typedef int DMAInvalidateMapFunc(DMADevice *);
> >+struct DMAMmu {
> >+ /* invalidate, etc. */
> >+ DmaTranslateFunc *translate;
> >+};
> >+
> >+struct DMADevice {
> >+ DMAMmu *mmu;
> >+ DMAInvalidateMapFunc *invalidate;
> >+};
> >+
> >+void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *);
> >+
> >+static inline void dma_memory_rw(DMADevice *dev,
> >+ dma_addr_t addr,
> >+ void *buf,
> >+ uint32_t len,
> >+ int is_write)
> >+{
> >+ uint32_t plen;
> >+ /* Fast-path non-iommu.
> >+ * More importantly, makes it obvious what this function does. */
> >+ if (!dev->mmu) {
> >+ cpu_physical_memory_rw(paddr, buf, plen, is_write);
> >+ return;
> >+ }
> >+ while (len) {
> >+ err = dev->mmu->translate(iommu, dev, addr,&paddr,&plen, is_write);
> >+ if (err) {
> >+ return;
> >+ }
> >+
> >+ /* The translation might be valid for larger regions. */
> >+ if (plen> len) {
> >+ plen = len;
> >+ }
> >+
> >+ cpu_physical_memory_rw(paddr, buf, plen, is_write);
> >+
> >+ len -= plen;
> >+ addr += plen;
> >+ buf += plen;
> >+ }
> >+}
> >+
> >+void *dma_memory_map(DMADevice *dev,
> >+ dma_addr_t addr,
> >+ uint32_t *len,
> >+ int is_write);
> >+void dma_memory_unmap(DMADevice *dev,
> >+ void *buffer,
> >+ uint32_t len,
> >+ int is_write,
> >+ uint32_t access_len);
> >+
> >+
> >++#define DEFINE_DMA_LD(suffix, size) \
> >++uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr) \
> >++{ \
> >++ int err; \
> >++ target_phys_addr_t paddr, plen; \
> >++ if (!dev->mmu) { \
> >++ return ld##suffix##_phys(addr, val); \
> >++ } \
> >++ \
> >++ err = dev->mmu->translate(dev->bus->iommu, dev, \
> >++ addr,&paddr,&plen, IOMMU_PERM_READ); \
> >++ if (err || (plen< size / 8)) \
> >++ return 0; \
> >++ \
> >++ return ld##suffix##_phys(paddr); \
> >++}
> >++
> >++#define DEFINE_DMA_ST(suffix, size) \
> >++void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val) \
> >++{ \
> >++ int err; \
> >++ target_phys_addr_t paddr, plen; \
> >++ \
> >++ if (!dev->mmu) { \
> >++ st##suffix##_phys(addr, val); \
> >++ return; \
> >++ } \
> >++ err = dev->mmu->translate(dev->bus->iommu, dev, \
> >++ addr,&paddr,&plen, IOMMU_PERM_WRITE); \
> >++ if (err || (plen< size / 8)) \
> >++ return; \
> >++ \
> >++ st##suffix##_phys(paddr, val); \
> >++}
> >+
> >+DEFINE_DMA_LD(ub, 8)
> >+DEFINE_DMA_LD(uw, 16)
> >+DEFINE_DMA_LD(l, 32)
> >+DEFINE_DMA_LD(q, 64)
> >+
> >+DEFINE_DMA_ST(b, 8)
> >+DEFINE_DMA_ST(w, 16)
> >+DEFINE_DMA_ST(l, 32)
> >+DEFINE_DMA_ST(q, 64)
> >+
> >+#endif
> >diff --git a/hw/pci.h b/hw/pci.h
> >index 1c6075e..9737f0e 100644
> >--- a/hw/pci.h
> >+++ b/hw/pci.h
> >@@ -5,6 +5,7 @@
> > #include "qobject.h"
> >
> > #include "qdev.h"
> >+#include "dma_rw.h"
> >
> > /* PCI includes legacy ISA access. */
> > #include "isa.h"
> >@@ -119,6 +120,10 @@ enum {
> >
> > struct PCIDevice {
> > DeviceState qdev;
> >+
> >+ /* For devices that do DMA. */
> >+ DMADevice dma;
> >+
> > /* PCI config space */
> > uint8_t *config;
> >
> >
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [Qemu-devel] [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
@ 2010-09-16 9:35 ` Michael S. Tsirkin
0 siblings, 0 replies; 97+ messages in thread
From: Michael S. Tsirkin @ 2010-09-16 9:35 UTC (permalink / raw)
To: Anthony Liguori
Cc: kvm, joro, qemu-devel, blauwirbel, yamahata, paul,
Eduard - Gabriel Munteanu, avi
On Mon, Sep 13, 2010 at 03:45:34PM -0500, Anthony Liguori wrote:
> On 09/13/2010 03:01 PM, Michael S. Tsirkin wrote:
> >So I think the following will give the idea of what an API
> >might look like that will let us avoid the scary hacks in
> >e.g. the ide layer and other generic layers that need to do DMA,
> >without either binding us to pci, adding more complexity with
> >callbacks, or losing type safety with casts and void*.
> >
> >Basically we have DMADevice that we can use container_of on
> >to get a PCIDevice from, and DMAMmu that will get instantiated
> >in a specific MMU.
> >
> >This is not complete code - just a header - I might complete
> >this later if/when there's interest or hopefully someone interested
> >in iommu emulation will.
> >
> >Notes:
> >the IOMMU_PERM_RW code seems unused, so I replaced
> >this with plain is_write. Is it ever useful?
> >
> >It seems that invalidate callback should be able to
> >get away with just a device, so I switched to that
> >from a void pointer for type safety.
> >Seems enough for the users I saw.
> >
> >I saw devices do stl_le_phys and such, these
> >might need to be wrapped as well.
> >
> >Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> One of the troubles with an interface like this is that I'm not sure
> a generic model universally works.
>
> For instance, I know some PCI busses do transparent byte swapping.
> For this to work, there has to be a notion of generic memory
> reads/writes vs. reads of a 32-bit, 16-bit, and 8-bit value.
>
> With a generic API, we lose the flexibility to do this type of bus
> interface.
>
> Regards,
>
> Anthony Liguori
Surely only the PCI root can do such tricks.
Anyway, I suspect what you refer to is byte swapping of config cycles
and similar I/O done by the driver. If a bus byteswapped a DMA
transaction, that would basically break DMA, as the driver would have
to go and fix up all the data before passing it up to the OS. Right?
We'd have to add more wrappers to emulate such insanity,
as the MMU intentionally only handles translation.
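To make the objection concrete: if a bus really byteswapped DMA beats, a byte-stream buffer would come out permuted and the driver would have to undo it. A standalone sketch (a hypothetical bus model for illustration, not QEMU code) of a bus that swaps every 32-bit beat:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model a bus that byteswaps every 32-bit beat of a DMA burst. */
static void bus_dma_copy_swapped(uint8_t *dst, const uint8_t *src, int words)
{
    for (int i = 0; i < words; i++) {
        uint32_t w;
        memcpy(&w, src + 4 * i, 4);
        /* Swap the four bytes of this beat. */
        w = ((w & 0x000000ffu) << 24) | ((w & 0x0000ff00u) << 8) |
            ((w & 0x00ff0000u) >> 8)  | ((w & 0xff000000u) >> 24);
        memcpy(dst + 4 * i, &w, 4);
    }
}
```

A byte stream "abcd" arrives as "dcba": the data is no longer what the device wrote, which is why such a bus would need explicit fixup wrappers rather than translation alone.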
> >---
> >
> >diff --git a/hw/dma_rw.h b/hw/dma_rw.h
> >new file mode 100644
> >index 0000000..d63fd17
> >--- /dev/null
> >+++ b/hw/dma_rw.h
> >@@ -0,0 +1,122 @@
> >+#ifndef DMA_RW_H
> >+#define DMA_RW_H
> >+
> >+#include "qemu-common.h"
> >+
> >+/* We currently only have pci mmus, but using
> >+ a generic type makes it possible to use this
> >+ e.g. from the generic ide code without callbacks. */
> >+typedef uint64_t dma_addr_t;
> >+
> >+typedef struct DMAMmu DMAMmu;
> >+typedef struct DMADevice DMADevice;
> >+
> >+typedef int DMATranslateFunc(DMAMmu *mmu,
> >+ DMADevice *dev,
> >+ dma_addr_t addr,
> >+ dma_addr_t *paddr,
> >+ dma_addr_t *len,
> >+ int is_write);
> >+
> >+typedef int DMAInvalidateMapFunc(DMADevice *);
> >+struct DMAMmu {
> >+ /* invalidate, etc. */
> >+ DMATranslateFunc *translate;
> >+};
> >+
> >+struct DMADevice {
> >+ DMAMmu *mmu;
> >+ DMAInvalidateMapFunc *invalidate;
> >+};
> >+
> >+void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *);
> >+
> >+static inline void dma_memory_rw(DMADevice *dev,
> >+ dma_addr_t addr,
> >+ void *buf,
> >+ uint32_t len,
> >+ int is_write)
> >+{
> >+ int err;
> >+ dma_addr_t paddr, plen;
> >+ /* Fast-path non-iommu.
> >+ * More importantly, makes it obvious what this function does. */
> >+ if (!dev->mmu) {
> >+ cpu_physical_memory_rw(addr, buf, len, is_write);
> >+ return;
> >+ }
> >+ while (len) {
> >+ err = dev->mmu->translate(dev->mmu, dev, addr, &paddr, &plen, is_write);
> >+ if (err) {
> >+ return;
> >+ }
> >+
> >+ /* The translation might be valid for larger regions. */
> >+ if (plen > len) {
> >+ plen = len;
> >+ }
> >+
> >+ cpu_physical_memory_rw(paddr, buf, plen, is_write);
> >+
> >+ len -= plen;
> >+ addr += plen;
> >+ buf += plen;
> >+ }
> >+}
> >+
> >+void *dma_memory_map(DMADevice *dev,
> >+ dma_addr_t addr,
> >+ uint32_t *len,
> >+ int is_write);
> >+void dma_memory_unmap(DMADevice *dev,
> >+ void *buffer,
> >+ uint32_t len,
> >+ int is_write,
> >+ uint32_t access_len);
> >+
> >+
> >+#define DEFINE_DMA_LD(suffix, size) \
> >+uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr) \
> >+{ \
> >+ int err; \
> >+ dma_addr_t paddr, plen; \
> >+ \
> >+ if (!dev->mmu) { \
> >+ return ld##suffix##_phys(addr); \
> >+ } \
> >+ \
> >+ err = dev->mmu->translate(dev->mmu, dev, \
> >+ addr, &paddr, &plen, 0 /* is_write */); \
> >+ if (err || (plen < size / 8)) \
> >+ return 0; \
> >+ \
> >+ return ld##suffix##_phys(paddr); \
> >+}
> >+
> >+#define DEFINE_DMA_ST(suffix, size) \
> >+void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val) \
> >+{ \
> >+ int err; \
> >+ dma_addr_t paddr, plen; \
> >+ \
> >+ if (!dev->mmu) { \
> >+ st##suffix##_phys(addr, val); \
> >+ return; \
> >+ } \
> >+ err = dev->mmu->translate(dev->mmu, dev, \
> >+ addr, &paddr, &plen, 1 /* is_write */); \
> >+ if (err || (plen < size / 8)) \
> >+ return; \
> >+ \
> >+ st##suffix##_phys(paddr, val); \
> >+}
> >+
> >+DEFINE_DMA_LD(ub, 8)
> >+DEFINE_DMA_LD(uw, 16)
> >+DEFINE_DMA_LD(l, 32)
> >+DEFINE_DMA_LD(q, 64)
> >+
> >+DEFINE_DMA_ST(b, 8)
> >+DEFINE_DMA_ST(w, 16)
> >+DEFINE_DMA_ST(l, 32)
> >+DEFINE_DMA_ST(q, 64)
> >+
> >+#endif
> >diff --git a/hw/pci.h b/hw/pci.h
> >index 1c6075e..9737f0e 100644
> >--- a/hw/pci.h
> >+++ b/hw/pci.h
> >@@ -5,6 +5,7 @@
> > #include "qobject.h"
> >
> > #include "qdev.h"
> >+#include "dma_rw.h"
> >
> > /* PCI includes legacy ISA access. */
> > #include "isa.h"
> >@@ -119,6 +120,10 @@ enum {
> >
> > struct PCIDevice {
> > DeviceState qdev;
> >+
> >+ /* For devices that do DMA. */
> >+ DMADevice dma;
> >+
> > /* PCI config space */
> > uint8_t *config;
> >
> >
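For readers following along, the chunked translate-clamp-copy loop in dma_memory_rw() above can be exercised outside QEMU. This is a standalone model, not QEMU code: the fixed remapping and the 4 KiB page size are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Toy "physical memory" backing store. */
static uint8_t phys_mem[8 * PAGE_SIZE];

/* Translate a bus address to a physical offset; the translation is
 * only valid up to the end of the current page. */
static int toy_translate(uint64_t addr, uint64_t *paddr, uint64_t *plen)
{
    *paddr = addr + PAGE_SIZE;               /* fixed remapping for the demo */
    *plen  = PAGE_SIZE - (addr % PAGE_SIZE); /* valid to the page boundary */
    return 0;
}

/* Mirrors the structure of dma_memory_rw(): translate, clamp, copy, advance. */
static void toy_dma_write(uint64_t addr, const uint8_t *buf, uint32_t len)
{
    while (len) {
        uint64_t paddr, plen;
        if (toy_translate(addr, &paddr, &plen))
            return;                          /* translation fault: abort */
        if (plen > len)
            plen = len;                      /* clamp to what was requested */
        memcpy(&phys_mem[paddr], buf, plen);
        len  -= plen;
        addr += plen;
        buf  += plen;
    }
}
```

A transfer that crosses a page boundary is split into two translations, which is exactly why the loop re-translates after advancing addr rather than translating once up front.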
* Re: [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)
2010-09-16 9:20 ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-09-16 11:15 ` Eduard - Gabriel Munteanu
1 sibling, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-09-16 11:15 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: joro, blauwirbel, paul, avi, anthony, av1474, yamahata, kvm, qemu-devel
On Thu, Sep 16, 2010 at 11:20:43AM +0200, Michael S. Tsirkin wrote:
> On Thu, Sep 16, 2010 at 10:06:16AM +0300, Eduard - Gabriel Munteanu wrote:
[snip]
>
> No, DMADevice is a device that does DMA.
> So e.g. a PCI device would embed one.
> Remember, translations are per device, right?
> DMAMmu is part of the iommu object.
>
I agree, all I'm saying is a DMADevice is insufficient to invalidate one
of the many maps a given device holds, at least without resorting to
horrible tricks or destroying them all.
> > so doing this makes it really
> > hard to invalidate a specific map when there are more of them. It forces
> > device code to act as a bus, provide fake 'DMADevice's for each map and
> > dispatch translation to the real DMATranslateFunc. I see no other way.
> >
> > If you really want more type-safety (although I think this is a case of
> > a true opaque identifying something only device code understands), I
> > have another proposal: have a DMAMap embedded in the opaque. Example
> > from dma-helpers.c:
> >
> > typedef struct {
> > DMADevice *owner;
> > [...]
> > } DMAMap;
> >
> > typedef struct {
> > [...]
> > DMAMap map;
> > [...]
> > } DMAAIOCB;
> >
> > /* The callback. */
> > static void dma_bdrv_cancel(DMAMap *map)
> > {
> > DMAAIOCB *dbs = container_of(map, DMAAIOCB, map);
> >
> > [...]
> > }
> >
> > The upside is we only need to pass the DMAMap. That can also contain
> > details of the actual map in case the device wants to release only the
> > relevant range and remap the rest.
>
> Fine.
> Or maybe DMAAIOCB (just make some letters lower case: DMAIocb?).
> Everyone will use it anyway, right?
No, you misunderstood me. DMAAIOCB is already there in dma-helpers.c,
it's the opaque I used and it was already there before my patches.
The idea was to define DMAMap and embed it in DMAAIOCB, so we can
upcast from the former to the latter (which is what we actually need).
IDE DMA uses that, but other devices would use whatever they want, even
nothing except the DMAMap. The only requirement is that the container
struct allows device code to acknowledge the invalidation (in case of
AIO, we simply kill that thread and release resources).
Well, perhaps DMAMap isn't the best name, but you get the idea.
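The embed-and-upcast pattern Eduard describes is the usual container_of() trick; a minimal standalone illustration (the struct names here are stand-ins, not the ones in dma-helpers.c):

```c
#include <assert.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct DMAMapDemo {
    int owner_id;
} DMAMapDemo;

/* A device-private struct that embeds the map, as DMAAIOCB would. */
typedef struct AiocbDemo {
    int cancelled;
    DMAMapDemo map;
} AiocbDemo;

/* The invalidate callback only sees the embedded DMAMapDemo... */
static void demo_invalidate(DMAMapDemo *map)
{
    /* ...but can recover the container and act on device state. */
    AiocbDemo *acb = container_of(map, AiocbDemo, map);
    acb->cancelled = 1;
}
```

The callback receives only a DMAMapDemo pointer yet can reach the surrounding state, which is the type-safety win over a void pointer.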
> > > I saw devices do stl_le_phys and such, these
> > > might need to be wrapped as well.
> >
> > stl_le_phys() is defined and used only by hw/eepro100.c. That's already
> > dealt with by converting the device.
> >
>
> I see. Need to get around to adding some prefix to it to make this clear.
>
> > Thanks,
> > Eduard
> >
[snip]
Eduard
* [PATCH 2/7] pci: memory access API and IOMMU support
2010-08-15 19:27 [PATCH 0/7] AMD IOMMU emulation patches v3 Eduard - Gabriel Munteanu
@ 2010-08-15 19:27 ` Eduard - Gabriel Munteanu
0 siblings, 0 replies; 97+ messages in thread
From: Eduard - Gabriel Munteanu @ 2010-08-15 19:27 UTC (permalink / raw)
To: joro
Cc: paul, blauwirbel, anthony, avi, kvm, qemu-devel,
Eduard - Gabriel Munteanu
PCI devices should access memory through pci_memory_*() instead of
cpu_physical_memory_*(). This also provides support for translation and
access checking in case an IOMMU is emulated.
Memory maps are treated as remote IOTLBs (that is, translation caches
belonging to the IOMMU-aware device itself). Clients (devices) must
provide callbacks for map invalidation in case these maps are
persistent beyond the current I/O context, e.g. AIO DMA transfers.
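The "remote IOTLB" model in this description boils down to per-device bookkeeping: remember each live map with its invalidation callback, and walk the list when the IOMMU invalidates a range. A standalone sketch of that idea (a fixed-size array instead of the patch's QLIST; all names here are illustrative):

```c
#include <assert.h>
#include <stdint.h>

typedef void InvalidateFn(void *opaque);

typedef struct {
    uint64_t addr, len;
    InvalidateFn *invalidate;
    void *opaque;
    int live;
} MapEntry;

static MapEntry maps[16];

/* Device side: record a live map so it can be shot down later. */
static void map_register(uint64_t addr, uint64_t len,
                         InvalidateFn *fn, void *opaque)
{
    for (int i = 0; i < 16; i++) {
        if (!maps[i].live) {
            maps[i] = (MapEntry){ addr, len, fn, opaque, 1 };
            return;
        }
    }
}

/* Overlap test in the spirit of qemu's ranges_overlap(). */
static int overlaps(uint64_t a1, uint64_t l1, uint64_t a2, uint64_t l2)
{
    return a1 < a2 + l2 && a2 < a1 + l1;
}

/* IOMMU side: invalidate every live map touching [addr, addr+len). */
static void invalidate_range(uint64_t addr, uint64_t len)
{
    for (int i = 0; i < 16; i++) {
        if (maps[i].live && overlaps(addr, len, maps[i].addr, maps[i].len)) {
            maps[i].invalidate(maps[i].opaque);
            maps[i].live = 0;
        }
    }
}

/* Demo callback: count invalidations via the opaque. */
static int demo_hits;
static void demo_cb(void *opaque) { demo_hits += *(int *)opaque; }
```

Only maps overlapping the invalidated range get their callback fired and are dropped; unrelated maps stay live, matching the behavior of pci_memory_invalidate_range() below.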
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
hw/pci.c | 197 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
hw/pci.h | 74 +++++++++++++++++++++
qemu-common.h | 1 +
3 files changed, 271 insertions(+), 1 deletions(-)
diff --git a/hw/pci.c b/hw/pci.c
index 6871728..8668e06 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -58,6 +58,18 @@ struct PCIBus {
Keep a count of the number of devices with raised IRQs. */
int nirq;
int *irq_count;
+
+ PCIDevice *iommu;
+ PCITranslateFunc *translate;
+};
+
+struct PCIMemoryMap {
+ pcibus_t addr;
+ pcibus_t len;
+ target_phys_addr_t paddr;
+ PCIInvalidateMapFunc *invalidate;
+ void *invalidate_opaque;
+ QLIST_ENTRY(PCIMemoryMap) list;
};
static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent);
@@ -166,6 +178,19 @@ static void pci_device_reset(PCIDevice *dev)
pci_update_mappings(dev);
}
+static int pci_no_translate(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms)
+{
+ *paddr = addr;
+ *len = -1;
+
+ return 0;
+}
+
static void pci_bus_reset(void *opaque)
{
PCIBus *bus = opaque;
@@ -227,7 +252,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
const char *name, int devfn_min)
{
qbus_create_inplace(&bus->qbus, &pci_bus_info, parent, name);
- bus->devfn_min = devfn_min;
+
+ bus->devfn_min = devfn_min;
+ bus->iommu = NULL;
+ bus->translate = pci_no_translate;
/* host bridge */
QLIST_INIT(&bus->child);
@@ -2029,6 +2057,173 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
}
}
+void pci_register_iommu(PCIDevice *iommu,
+ PCITranslateFunc *translate)
+{
+ iommu->bus->iommu = iommu;
+ iommu->bus->translate = translate;
+}
+
+void pci_memory_rw(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len,
+ int is_write)
+{
+ int err;
+ unsigned perms;
+ PCIDevice *iommu = dev->bus->iommu;
+ target_phys_addr_t paddr, plen;
+
+ perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+ while (len) {
+ err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+ if (err)
+ return;
+
+ /* The translation might be valid for larger regions. */
+ if (plen > len)
+ plen = len;
+
+ cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+ len -= plen;
+ addr += plen;
+ buf += plen;
+ }
+}
+
+static void pci_memory_register_map(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len,
+ target_phys_addr_t paddr,
+ PCIInvalidateMapFunc *invalidate,
+ void *invalidate_opaque)
+{
+ PCIMemoryMap *map;
+
+ map = qemu_malloc(sizeof(PCIMemoryMap));
+ map->addr = addr;
+ map->len = len;
+ map->paddr = paddr;
+ map->invalidate = invalidate;
+ map->invalidate_opaque = invalidate_opaque;
+
+ QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
+}
+
+static void pci_memory_unregister_map(PCIDevice *dev,
+ target_phys_addr_t paddr,
+ target_phys_addr_t len)
+{
+ PCIMemoryMap *map;
+
+ QLIST_FOREACH(map, &dev->memory_maps, list) {
+ if (map->paddr == paddr && map->len == len) {
+ QLIST_REMOVE(map, list);
+ qemu_free(map);
+ }
+ }
+}
+
+void pci_memory_invalidate_range(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len)
+{
+ PCIMemoryMap *map;
+
+ QLIST_FOREACH(map, &dev->memory_maps, list) {
+ if (range_covers_range(addr, len, map->addr, map->len)) {
+ map->invalidate(map->invalidate_opaque);
+ QLIST_REMOVE(map, list);
+ qemu_free(map);
+ }
+ }
+}
+
+void *pci_memory_map(PCIDevice *dev,
+ PCIInvalidateMapFunc *cb,
+ void *opaque,
+ pcibus_t addr,
+ target_phys_addr_t *len,
+ int is_write)
+{
+ int err;
+ unsigned perms;
+ PCIDevice *iommu = dev->bus->iommu;
+ target_phys_addr_t paddr, plen;
+
+ perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+ plen = *len;
+ err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+ if (err)
+ return NULL;
+
+ /*
+ * If this is true, the virtual region is contiguous,
+ * but the translated physical region isn't. We just
+ * clamp *len, much like cpu_physical_memory_map() does.
+ */
+ if (plen < *len)
+ *len = plen;
+
+ /* We treat maps as remote TLBs to cope with stuff like AIO. */
+ if (cb)
+ pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
+
+ return cpu_physical_memory_map(paddr, len, is_write);
+}
+
+void pci_memory_unmap(PCIDevice *dev,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len)
+{
+ cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+ pci_memory_unregister_map(dev, (target_phys_addr_t) buffer, len);
+}
+
+#define DEFINE_PCI_LD(suffix, size) \
+uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr) \
+{ \
+ int err; \
+ target_phys_addr_t paddr, plen; \
+ \
+ err = dev->bus->translate(dev->bus->iommu, dev, \
+ addr, &paddr, &plen, IOMMU_PERM_READ); \
+ if (err || (plen < size / 8)) \
+ return 0; \
+ \
+ return ld##suffix##_phys(paddr); \
+}
+
+#define DEFINE_PCI_ST(suffix, size) \
+void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val) \
+{ \
+ int err; \
+ target_phys_addr_t paddr, plen; \
+ \
+ err = dev->bus->translate(dev->bus->iommu, dev, \
+ addr, &paddr, &plen, IOMMU_PERM_WRITE); \
+ if (err || (plen < size / 8)) \
+ return; \
+ \
+ st##suffix##_phys(paddr, val); \
+}
+
+DEFINE_PCI_LD(ub, 8)
+DEFINE_PCI_LD(uw, 16)
+DEFINE_PCI_LD(l, 32)
+DEFINE_PCI_LD(q, 64)
+
+DEFINE_PCI_ST(b, 8)
+DEFINE_PCI_ST(w, 16)
+DEFINE_PCI_ST(l, 32)
+DEFINE_PCI_ST(q, 64)
+
static PCIDeviceInfo bridge_info = {
.qdev.name = "pci-bridge",
.qdev.size = sizeof(PCIBridge),
diff --git a/hw/pci.h b/hw/pci.h
index 5a6cdb5..a62bc8e 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -203,6 +203,8 @@ struct PCIDevice {
PCICapConfigReadFunc *config_read;
PCICapConfigWriteFunc *config_write;
} cap;
+
+ QLIST_HEAD(, PCIMemoryMap) memory_maps;
};
PCIDevice *pci_register_device(PCIBus *bus, const char *name,
@@ -440,4 +442,76 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
return !(last2 < first1 || last1 < first2);
}
+/*
+ * Memory I/O and PCI IOMMU definitions.
+ */
+
+#define IOMMU_PERM_READ (1 << 0)
+#define IOMMU_PERM_WRITE (1 << 1)
+#define IOMMU_PERM_RW (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
+
+typedef int PCIInvalidateMapFunc(void *opaque);
+typedef int PCITranslateFunc(PCIDevice *iommu,
+ PCIDevice *dev,
+ pcibus_t addr,
+ target_phys_addr_t *paddr,
+ target_phys_addr_t *len,
+ unsigned perms);
+
+extern void pci_memory_rw(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len,
+ int is_write);
+extern void *pci_memory_map(PCIDevice *dev,
+ PCIInvalidateMapFunc *cb,
+ void *opaque,
+ pcibus_t addr,
+ target_phys_addr_t *len,
+ int is_write);
+extern void pci_memory_unmap(PCIDevice *dev,
+ void *buffer,
+ target_phys_addr_t len,
+ int is_write,
+ target_phys_addr_t access_len);
+extern void pci_register_iommu(PCIDevice *dev,
+ PCITranslateFunc *translate);
+extern void pci_memory_invalidate_range(PCIDevice *dev,
+ pcibus_t addr,
+ pcibus_t len);
+
+#define DECLARE_PCI_LD(suffix, size) \
+extern uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
+
+#define DECLARE_PCI_ST(suffix, size) \
+extern void pci_st##suffix(PCIDevice *dev, \
+ pcibus_t addr, \
+ uint##size##_t val);
+
+DECLARE_PCI_LD(ub, 8)
+DECLARE_PCI_LD(uw, 16)
+DECLARE_PCI_LD(l, 32)
+DECLARE_PCI_LD(q, 64)
+
+DECLARE_PCI_ST(b, 8)
+DECLARE_PCI_ST(w, 16)
+DECLARE_PCI_ST(l, 32)
+DECLARE_PCI_ST(q, 64)
+
+static inline void pci_memory_read(PCIDevice *dev,
+ pcibus_t addr,
+ uint8_t *buf,
+ pcibus_t len)
+{
+ pci_memory_rw(dev, addr, buf, len, 0);
+}
+
+static inline void pci_memory_write(PCIDevice *dev,
+ pcibus_t addr,
+ const uint8_t *buf,
+ pcibus_t len)
+{
+ pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
+}
+
#endif
diff --git a/qemu-common.h b/qemu-common.h
index 3fb2f0b..40c6d58 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
typedef struct PCIHostState PCIHostState;
typedef struct PCIExpressHost PCIExpressHost;
typedef struct PCIBus PCIBus;
+typedef struct PCIMemoryMap PCIMemoryMap;
typedef struct PCIDevice PCIDevice;
typedef struct SerialState SerialState;
typedef struct IRQState *qemu_irq;
--
1.7.1
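The IOMMU_PERM_* flags introduced in hw/pci.h compose as a plain bitmask that a translate callback checks before filling in the translation. A standalone stub illustrating the check (a toy read-only mapping, not the AMD IOMMU logic):

```c
#include <assert.h>
#include <stdint.h>

#define IOMMU_PERM_READ  (1 << 0)
#define IOMMU_PERM_WRITE (1 << 1)
#define IOMMU_PERM_RW    (IOMMU_PERM_READ | IOMMU_PERM_WRITE)

/* One toy mapping: identity translation, but the page is read-only. */
static int perm_translate(uint64_t addr, uint64_t *paddr, uint64_t *plen,
                          unsigned perms)
{
    unsigned allowed = IOMMU_PERM_READ;   /* mapping permits reads only */

    if (perms & ~allowed)
        return -1;                        /* requested access not permitted */
    *paddr = addr;
    *plen  = 4096;
    return 0;
}
```

Because IOMMU_PERM_RW is simply the OR of the two bits, a read-write request fails against a read-only mapping just as a plain write does.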
end of thread, other threads:[~2010-09-16 11:17 UTC | newest]
Thread overview: 97+ messages
2010-08-28 14:54 [PATCH 0/7] AMD IOMMU emulation patchset v4 Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h Eduard - Gabriel Munteanu
2010-08-31 20:29 ` Michael S. Tsirkin
2010-08-31 22:58 ` Eduard - Gabriel Munteanu
2010-09-01 10:39 ` Michael S. Tsirkin
2010-08-28 14:54 ` [PATCH 2/7] pci: memory access API and IOMMU support Eduard - Gabriel Munteanu
2010-09-02 5:28 ` Michael S. Tsirkin
2010-09-02 8:40 ` Eduard - Gabriel Munteanu
2010-09-02 9:49 ` Michael S. Tsirkin
2010-09-04 9:01 ` Blue Swirl
2010-09-05 7:10 ` Michael S. Tsirkin
2010-08-28 14:54 ` [PATCH 3/7] AMD IOMMU emulation Eduard - Gabriel Munteanu
2010-08-28 15:58 ` Blue Swirl
2010-08-28 21:53 ` Eduard - Gabriel Munteanu
2010-08-29 20:37 ` Blue Swirl
2010-08-30 3:07 ` Isaku Yamahata
2010-08-30 5:54 ` Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 4/7] ide: use the PCI memory access interface Eduard - Gabriel Munteanu
2010-09-02 5:19 ` Michael S. Tsirkin
2010-09-02 9:12 ` Eduard - Gabriel Munteanu
2010-09-02 9:58 ` Michael S. Tsirkin
2010-09-02 15:01 ` Eduard - Gabriel Munteanu
2010-09-02 15:24 ` Avi Kivity
2010-09-02 15:39 ` Michael S. Tsirkin
2010-09-02 16:07 ` Avi Kivity
2010-09-02 15:31 ` Michael S. Tsirkin
2010-08-28 14:54 ` [PATCH 5/7] rtl8139: " Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 6/7] eepro100: " Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 7/7] ac97: " Eduard - Gabriel Munteanu
2010-08-28 16:00 ` [PATCH 0/7] AMD IOMMU emulation patchset v4 Blue Swirl
2010-08-29 9:55 ` Joerg Roedel
2010-08-29 20:44 ` Blue Swirl
2010-08-29 22:08 ` [PATCH 2/7] pci: memory access API and IOMMU support Eduard - Gabriel Munteanu
2010-08-29 22:11 ` Eduard - Gabriel Munteanu
2010-09-01 20:10 ` Stefan Weil
2010-09-02 6:00 ` Michael S. Tsirkin
2010-09-02 9:08 ` Eduard - Gabriel Munteanu
2010-09-02 13:24 ` Anthony Liguori
2010-09-02 8:51 ` Eduard - Gabriel Munteanu
2010-09-02 16:05 ` Stefan Weil
2010-09-02 16:14 ` Eduard - Gabriel Munteanu
2010-09-13 20:01 ` [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4) Michael S. Tsirkin
2010-09-13 20:45 ` Anthony Liguori
2010-09-16 7:12 ` Eduard - Gabriel Munteanu
2010-09-16 9:35 ` Michael S. Tsirkin
2010-09-16 7:06 ` Eduard - Gabriel Munteanu
2010-09-16 9:20 ` Michael S. Tsirkin
2010-09-16 11:15 ` Eduard - Gabriel Munteanu
-- strict thread matches above, loose matches on Subject: below --
2010-08-15 19:27 [PATCH 0/7] AMD IOMMU emulation patches v3 Eduard - Gabriel Munteanu
2010-08-15 19:27 ` [PATCH 2/7] pci: memory access API and IOMMU support Eduard - Gabriel Munteanu