All of lore.kernel.org
 help / color / mirror / Atom feed
* RFC:  vfio API changes needed for powerpc (v2)
@ 2013-04-04 17:32 ` Yoder Stuart-B08248
  0 siblings, 0 replies; 10+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-04 17:32 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Wood Scott-B07421, kvm-u79uwXL29TY76Z2rM5mHXA,
	qemu-devel-qX2TKyscuCcdnm+yROfE0A, agraf-l3A5Bk7waGM,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Bhushan Bharat-R65777

Based on the email thread over the last couple of days, I have
below an more concrete proposal (v2) for new ioctls supporting vfio-pci
on SoCs with the Freescale PAMU.

Example usage is as described by Scott:

count = VFIO_IOMMU_GET_MSI_BANK_COUNT
VFIO_IOMMU_SET_ATTR(ATTR_GEOMETRY)
VFIO_IOMMU_SET_ATTR(ATTR_WINDOWS)
// do other DMA maps now, or later, or not at all, doesn't matter
for (i = 0; i < count; i++)
        VFIO_IOMMU_MAP_MSI_BANK(iova, i);
// The kernel now knows where each bank has been mapped, and can
//   update PCI config space appropriately.

Thanks,
Stuart

----------------------------------------------------------------------------

The Freescale PAMU is an aperture-based IOMMU with the following 
characteristics.  Each device has an entry in a table in memory
describing the iova->phys mapping. The mapping has:
   -an overall aperture that is power of 2 sized, and has a start iova that
    is naturally aligned
   -has 1 or more windows within the aperture
      -number of windows must be power of 2, max is 256
      -size of each window is determined by aperture size / # of windows
      -iova of each window is determined by aperture start iova / # of windows
      -the mapped region in each window can be different than
       the window size...mapping must power of 2 
      -physical address of the mapping must be naturally aligned
       with the mapping size

/*
 * VFIO_PAMU_GET_ATTR
 *
 * Gets the iommu attributes for the current vfio container.  This 
 * ioctl is applicable to an iommu type of VFIO_PAMU only.
 * Caller sets argsz and attribute.  The ioctl fills in
 * the provided struct vfio_pamu_attr based on the attribute
 * value that was set.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */
struct vfio_pamu_attr {
        __u32   argsz;
        __u32   attribute;
#define VFIO_ATTR_GEOMETRY     1
#define VFIO_ATTR_WINDOWS      2
#define VFIO_ATTR_PAMU_STASH   3

        /* fowlling fields are for VFIO_ATTR_GEOMETRY */
        __u64 aperture_start; /* first address that can be mapped    */
        __u64 aperture_end;   /* last address that can be mapped     */

        /* follwing fields are for VFIO_ATTR_WINDOWS */
        __u32 windows;        /* number of windows in the aperture */
                              /* initially this will be the max number
                               * of windows that can be set
                               */

        /* following fields are for VFIO_ATTR_PAMU_STASH */
        __u32 cpu;            /* CPU number for stashing */
        __u32 cache;          /* cache ID for stashing */
};
#define VFIO_PAMU_GET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
        struct vfio_pamu_attr)

/*
 * VFIO_PAMU_SET_ATTR
 *
 * Sets the iommu attributes for the current vfio container.  This 
 * ioctl is applicable to an iommu type of VFIO_PAMU only.
 * Caller sets struct vfio_pamu attr, including argsz and attribute and
 * setting any fields that are valid for the attribute.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */
#define VFIO_PAMU_SET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
        struct vfio_pamu_attr)

/*
 * VFIO_PAMU_GET_MSI_BANK_COUNT
 *
 * Returns the number of MSI banks for this platform.  This tells user space
 * how many aperture windows should be reserved for MSI banks when setting
 * the PAMU geometry and window count.
 * Fills in provided struct vfio_pamu_msi_banks. Caller sets argsz. 
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */
struct vfio_pamu_msi_banks {
        __u32   argsz;
        __u32   bank_count;  /* the number of MSI
};
#define VFIO_PAMU_GET_MSI_BANK_COUNT  _IO(VFIO_TYPE, VFIO_BASE + x,
        struct vfio_pamu_msi_banks)

/*
 * VFIO_PAMU_MAP_MSI_BANK
 *
 * Maps the MSI bank at the specified index and iova.  User space must
 * call this ioctl once for each MSI bank (count of banks is returned by
 * VFIO_IOMMU_GET_MSI_BANK_COUNT).
 * Caller provides struct vfio_pamu_msi_bank_map with all fields set.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */

struct vfio_pamu_msi_bank_map {
        __u32   argsz;
        __u32   msi_bank_index;  /* the index of the MSI bank */
        __u64   iova;      /* the iova the bank is to be mapped to */
};

/*
 * VFIO_PAMU_UNMAP_MSI_BANK
 *
 * Unmaps the MSI bank at the specified iova.
 * Caller provides struct vfio_pamu_msi_bank_unmap with all fields set.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */

struct vfio_pamu_msi_bank_unmap {
        __u32   argsz;
        __u64   iova;      /* the iova to be unmapped to */
};

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] RFC:  vfio API changes needed for powerpc (v2)
@ 2013-04-04 17:32 ` Yoder Stuart-B08248
  0 siblings, 0 replies; 10+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-04 17:32 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Wood Scott-B07421, kvm, qemu-devel, agraf, iommu,
	Bhushan Bharat-R65777, Sethi Varun-B16395

Based on the email thread over the last couple of days, I have
below an more concrete proposal (v2) for new ioctls supporting vfio-pci
on SoCs with the Freescale PAMU.

Example usage is as described by Scott:

count = VFIO_IOMMU_GET_MSI_BANK_COUNT
VFIO_IOMMU_SET_ATTR(ATTR_GEOMETRY)
VFIO_IOMMU_SET_ATTR(ATTR_WINDOWS)
// do other DMA maps now, or later, or not at all, doesn't matter
for (i = 0; i < count; i++)
        VFIO_IOMMU_MAP_MSI_BANK(iova, i);
// The kernel now knows where each bank has been mapped, and can
//   update PCI config space appropriately.

Thanks,
Stuart

----------------------------------------------------------------------------

The Freescale PAMU is an aperture-based IOMMU with the following 
characteristics.  Each device has an entry in a table in memory
describing the iova->phys mapping. The mapping has:
   -an overall aperture that is power of 2 sized, and has a start iova that
    is naturally aligned
   -has 1 or more windows within the aperture
      -number of windows must be power of 2, max is 256
      -size of each window is determined by aperture size / # of windows
      -iova of each window is determined by aperture start iova / # of windows
      -the mapped region in each window can be different than
       the window size...mapping must power of 2 
      -physical address of the mapping must be naturally aligned
       with the mapping size

/*
 * VFIO_PAMU_GET_ATTR
 *
 * Gets the iommu attributes for the current vfio container.  This 
 * ioctl is applicable to an iommu type of VFIO_PAMU only.
 * Caller sets argsz and attribute.  The ioctl fills in
 * the provided struct vfio_pamu_attr based on the attribute
 * value that was set.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */
struct vfio_pamu_attr {
        __u32   argsz;
        __u32   attribute;
#define VFIO_ATTR_GEOMETRY     1
#define VFIO_ATTR_WINDOWS      2
#define VFIO_ATTR_PAMU_STASH   3

        /* fowlling fields are for VFIO_ATTR_GEOMETRY */
        __u64 aperture_start; /* first address that can be mapped    */
        __u64 aperture_end;   /* last address that can be mapped     */

        /* follwing fields are for VFIO_ATTR_WINDOWS */
        __u32 windows;        /* number of windows in the aperture */
                              /* initially this will be the max number
                               * of windows that can be set
                               */

        /* following fields are for VFIO_ATTR_PAMU_STASH */
        __u32 cpu;            /* CPU number for stashing */
        __u32 cache;          /* cache ID for stashing */
};
#define VFIO_PAMU_GET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
        struct vfio_pamu_attr)

/*
 * VFIO_PAMU_SET_ATTR
 *
 * Sets the iommu attributes for the current vfio container.  This 
 * ioctl is applicable to an iommu type of VFIO_PAMU only.
 * Caller sets struct vfio_pamu attr, including argsz and attribute and
 * setting any fields that are valid for the attribute.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */
#define VFIO_PAMU_SET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
        struct vfio_pamu_attr)

/*
 * VFIO_PAMU_GET_MSI_BANK_COUNT
 *
 * Returns the number of MSI banks for this platform.  This tells user space
 * how many aperture windows should be reserved for MSI banks when setting
 * the PAMU geometry and window count.
 * Fills in provided struct vfio_pamu_msi_banks. Caller sets argsz. 
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */
struct vfio_pamu_msi_banks {
        __u32   argsz;
        __u32   bank_count;  /* the number of MSI
};
#define VFIO_PAMU_GET_MSI_BANK_COUNT  _IO(VFIO_TYPE, VFIO_BASE + x,
        struct vfio_pamu_msi_banks)

/*
 * VFIO_PAMU_MAP_MSI_BANK
 *
 * Maps the MSI bank at the specified index and iova.  User space must
 * call this ioctl once for each MSI bank (count of banks is returned by
 * VFIO_IOMMU_GET_MSI_BANK_COUNT).
 * Caller provides struct vfio_pamu_msi_bank_map with all fields set.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */

struct vfio_pamu_msi_bank_map {
        __u32   argsz;
        __u32   msi_bank_index;  /* the index of the MSI bank */
        __u64   iova;      /* the iova the bank is to be mapped to */
};

/*
 * VFIO_PAMU_UNMAP_MSI_BANK
 *
 * Unmaps the MSI bank at the specified iova.
 * Caller provides struct vfio_pamu_msi_bank_unmap with all fields set.
 * Operates on VFIO file descriptor (/dev/vfio/vfio).
 * Return: 0 on success, -errno on failure
 */

struct vfio_pamu_msi_bank_unmap {
        __u32   argsz;
        __u64   iova;      /* the iova to be unmapped to */
};

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC:  vfio API changes needed for powerpc (v2)
  2013-04-04 17:32 ` [Qemu-devel] " Yoder Stuart-B08248
@ 2013-04-04 18:42     ` Alex Williamson
  -1 siblings, 0 replies; 10+ messages in thread
From: Alex Williamson @ 2013-04-04 18:42 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: Wood Scott-B07421, kvm-u79uwXL29TY76Z2rM5mHXA,
	qemu-devel-qX2TKyscuCcdnm+yROfE0A, agraf-l3A5Bk7waGM,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Bhushan Bharat-R65777

On Thu, 2013-04-04 at 17:32 +0000, Yoder Stuart-B08248 wrote:
> Based on the email thread over the last couple of days, I have
> below an more concrete proposal (v2) for new ioctls supporting vfio-pci
> on SoCs with the Freescale PAMU.
> 
> Example usage is as described by Scott:
> 
> count = VFIO_IOMMU_GET_MSI_BANK_COUNT
> VFIO_IOMMU_SET_ATTR(ATTR_GEOMETRY)
> VFIO_IOMMU_SET_ATTR(ATTR_WINDOWS)
> // do other DMA maps now, or later, or not at all, doesn't matter
> for (i = 0; i < count; i++)
>         VFIO_IOMMU_MAP_MSI_BANK(iova, i);
> // The kernel now knows where each bank has been mapped, and can
> //   update PCI config space appropriately.
> 
> Thanks,
> Stuart
> 
> ----------------------------------------------------------------------------
> 
> The Freescale PAMU is an aperture-based IOMMU with the following 
> characteristics.  Each device has an entry in a table in memory
> describing the iova->phys mapping. The mapping has:
>    -an overall aperture that is power of 2 sized, and has a start iova that
>     is naturally aligned
>    -has 1 or more windows within the aperture
>       -number of windows must be power of 2, max is 256
>       -size of each window is determined by aperture size / # of windows
>       -iova of each window is determined by aperture start iova / # of windows
>       -the mapped region in each window can be different than
>        the window size...mapping must power of 2 
>       -physical address of the mapping must be naturally aligned
>        with the mapping size
> 
> /*
>  * VFIO_PAMU_GET_ATTR
>  *
>  * Gets the iommu attributes for the current vfio container.  This 
>  * ioctl is applicable to an iommu type of VFIO_PAMU only.
>  * Caller sets argsz and attribute.  The ioctl fills in
>  * the provided struct vfio_pamu_attr based on the attribute
>  * value that was set.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> struct vfio_pamu_attr {
>         __u32   argsz;
>         __u32   attribute;
> #define VFIO_ATTR_GEOMETRY     1
> #define VFIO_ATTR_WINDOWS      2
> #define VFIO_ATTR_PAMU_STASH   3
> 
>         /* fowlling fields are for VFIO_ATTR_GEOMETRY */
>         __u64 aperture_start; /* first address that can be mapped    */
>         __u64 aperture_end;   /* last address that can be mapped     */
> 
>         /* follwing fields are for VFIO_ATTR_WINDOWS */
>         __u32 windows;        /* number of windows in the aperture */
>                               /* initially this will be the max number
>                                * of windows that can be set
>                                */
> 
>         /* following fields are for VFIO_ATTR_PAMU_STASH */
>         __u32 cpu;            /* CPU number for stashing */
>         __u32 cache;          /* cache ID for stashing */

Can multiple attribute bits be set at once?  If not then the above could
be a union and the attribute could be an enum.  A flags field is
probably still a good idea.

> };
> #define VFIO_PAMU_GET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
>         struct vfio_pamu_attr)
> 
> /*
>  * VFIO_PAMU_SET_ATTR
>  *
>  * Sets the iommu attributes for the current vfio container.  This 
>  * ioctl is applicable to an iommu type of VFIO_PAMU only.
>  * Caller sets struct vfio_pamu attr, including argsz and attribute and
>  * setting any fields that are valid for the attribute.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> #define VFIO_PAMU_SET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
>         struct vfio_pamu_attr)
> 
> /*
>  * VFIO_PAMU_GET_MSI_BANK_COUNT
>  *
>  * Returns the number of MSI banks for this platform.  This tells user space
>  * how many aperture windows should be reserved for MSI banks when setting
>  * the PAMU geometry and window count.
>  * Fills in provided struct vfio_pamu_msi_banks. Caller sets argsz. 
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> struct vfio_pamu_msi_banks {
>         __u32   argsz;
>         __u32   bank_count;  /* the number of MSI
> };
> #define VFIO_PAMU_GET_MSI_BANK_COUNT  _IO(VFIO_TYPE, VFIO_BASE + x,
>         struct vfio_pamu_msi_banks)

argsz w/o flags doesn't really buy us much.

> 
> /*
>  * VFIO_PAMU_MAP_MSI_BANK
>  *
>  * Maps the MSI bank at the specified index and iova.  User space must
>  * call this ioctl once for each MSI bank (count of banks is returned by
>  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
>  * Caller provides struct vfio_pamu_msi_bank_map with all fields set.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> 
> struct vfio_pamu_msi_bank_map {
>         __u32   argsz;
>         __u32   msi_bank_index;  /* the index of the MSI bank */
>         __u64   iova;      /* the iova the bank is to be mapped to */
> };

Again, flags.  If we dynamically add or remove devices from a container
the bank count can change, right?  If bank count goes from 2 to 3, does
userspace know assume the new bank is [2]?  If bank count goes from 3 to
2, which index was that?  If removing a device removes an MSI bank then
vfio-pamu will automatically do the unmap, right?

> /*
>  * VFIO_PAMU_UNMAP_MSI_BANK
>  *
>  * Unmaps the MSI bank at the specified iova.
>  * Caller provides struct vfio_pamu_msi_bank_unmap with all fields set.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).

It would be more clear which fd these were for if they were named
VFIO_IOMMU_PAMU_...

>  * Return: 0 on success, -errno on failure
>  */
> 
> struct vfio_pamu_msi_bank_unmap {
>         __u32   argsz;
>         __u64   iova;      /* the iova to be unmapped to */
> };

Thanks,
Alex

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] RFC:  vfio API changes needed for powerpc (v2)
@ 2013-04-04 18:42     ` Alex Williamson
  0 siblings, 0 replies; 10+ messages in thread
From: Alex Williamson @ 2013-04-04 18:42 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: Wood Scott-B07421, kvm, qemu-devel, agraf, iommu,
	Bhushan Bharat-R65777, Sethi Varun-B16395

On Thu, 2013-04-04 at 17:32 +0000, Yoder Stuart-B08248 wrote:
> Based on the email thread over the last couple of days, I have
> below an more concrete proposal (v2) for new ioctls supporting vfio-pci
> on SoCs with the Freescale PAMU.
> 
> Example usage is as described by Scott:
> 
> count = VFIO_IOMMU_GET_MSI_BANK_COUNT
> VFIO_IOMMU_SET_ATTR(ATTR_GEOMETRY)
> VFIO_IOMMU_SET_ATTR(ATTR_WINDOWS)
> // do other DMA maps now, or later, or not at all, doesn't matter
> for (i = 0; i < count; i++)
>         VFIO_IOMMU_MAP_MSI_BANK(iova, i);
> // The kernel now knows where each bank has been mapped, and can
> //   update PCI config space appropriately.
> 
> Thanks,
> Stuart
> 
> ----------------------------------------------------------------------------
> 
> The Freescale PAMU is an aperture-based IOMMU with the following 
> characteristics.  Each device has an entry in a table in memory
> describing the iova->phys mapping. The mapping has:
>    -an overall aperture that is power of 2 sized, and has a start iova that
>     is naturally aligned
>    -has 1 or more windows within the aperture
>       -number of windows must be power of 2, max is 256
>       -size of each window is determined by aperture size / # of windows
>       -iova of each window is determined by aperture start iova / # of windows
>       -the mapped region in each window can be different than
>        the window size...mapping must power of 2 
>       -physical address of the mapping must be naturally aligned
>        with the mapping size
> 
> /*
>  * VFIO_PAMU_GET_ATTR
>  *
>  * Gets the iommu attributes for the current vfio container.  This 
>  * ioctl is applicable to an iommu type of VFIO_PAMU only.
>  * Caller sets argsz and attribute.  The ioctl fills in
>  * the provided struct vfio_pamu_attr based on the attribute
>  * value that was set.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> struct vfio_pamu_attr {
>         __u32   argsz;
>         __u32   attribute;
> #define VFIO_ATTR_GEOMETRY     1
> #define VFIO_ATTR_WINDOWS      2
> #define VFIO_ATTR_PAMU_STASH   3
> 
>         /* fowlling fields are for VFIO_ATTR_GEOMETRY */
>         __u64 aperture_start; /* first address that can be mapped    */
>         __u64 aperture_end;   /* last address that can be mapped     */
> 
>         /* follwing fields are for VFIO_ATTR_WINDOWS */
>         __u32 windows;        /* number of windows in the aperture */
>                               /* initially this will be the max number
>                                * of windows that can be set
>                                */
> 
>         /* following fields are for VFIO_ATTR_PAMU_STASH */
>         __u32 cpu;            /* CPU number for stashing */
>         __u32 cache;          /* cache ID for stashing */

Can multiple attribute bits be set at once?  If not then the above could
be a union and the attribute could be an enum.  A flags field is
probably still a good idea.

> };
> #define VFIO_PAMU_GET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
>         struct vfio_pamu_attr)
> 
> /*
>  * VFIO_PAMU_SET_ATTR
>  *
>  * Sets the iommu attributes for the current vfio container.  This 
>  * ioctl is applicable to an iommu type of VFIO_PAMU only.
>  * Caller sets struct vfio_pamu attr, including argsz and attribute and
>  * setting any fields that are valid for the attribute.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> #define VFIO_PAMU_SET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x,
>         struct vfio_pamu_attr)
> 
> /*
>  * VFIO_PAMU_GET_MSI_BANK_COUNT
>  *
>  * Returns the number of MSI banks for this platform.  This tells user space
>  * how many aperture windows should be reserved for MSI banks when setting
>  * the PAMU geometry and window count.
>  * Fills in provided struct vfio_pamu_msi_banks. Caller sets argsz. 
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> struct vfio_pamu_msi_banks {
>         __u32   argsz;
>         __u32   bank_count;  /* the number of MSI
> };
> #define VFIO_PAMU_GET_MSI_BANK_COUNT  _IO(VFIO_TYPE, VFIO_BASE + x,
>         struct vfio_pamu_msi_banks)

argsz w/o flags doesn't really buy us much.

> 
> /*
>  * VFIO_PAMU_MAP_MSI_BANK
>  *
>  * Maps the MSI bank at the specified index and iova.  User space must
>  * call this ioctl once for each MSI bank (count of banks is returned by
>  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
>  * Caller provides struct vfio_pamu_msi_bank_map with all fields set.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).
>  * Return: 0 on success, -errno on failure
>  */
> 
> struct vfio_pamu_msi_bank_map {
>         __u32   argsz;
>         __u32   msi_bank_index;  /* the index of the MSI bank */
>         __u64   iova;      /* the iova the bank is to be mapped to */
> };

Again, flags.  If we dynamically add or remove devices from a container
the bank count can change, right?  If bank count goes from 2 to 3, does
userspace know assume the new bank is [2]?  If bank count goes from 3 to
2, which index was that?  If removing a device removes an MSI bank then
vfio-pamu will automatically do the unmap, right?

> /*
>  * VFIO_PAMU_UNMAP_MSI_BANK
>  *
>  * Unmaps the MSI bank at the specified iova.
>  * Caller provides struct vfio_pamu_msi_bank_unmap with all fields set.
>  * Operates on VFIO file descriptor (/dev/vfio/vfio).

It would be more clear which fd these were for if they were named
VFIO_IOMMU_PAMU_...

>  * Return: 0 on success, -errno on failure
>  */
> 
> struct vfio_pamu_msi_bank_unmap {
>         __u32   argsz;
>         __u64   iova;      /* the iova to be unmapped to */
> };

Thanks,
Alex

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: RFC:  vfio API changes needed for powerpc (v2)
  2013-04-04 18:42     ` [Qemu-devel] " Alex Williamson
@ 2013-04-04 21:38         ` Yoder Stuart-B08248
  -1 siblings, 0 replies; 10+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-04 21:38 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Wood Scott-B07421, kvm-u79uwXL29TY76Z2rM5mHXA,
	qemu-devel-qX2TKyscuCcdnm+yROfE0A, agraf-l3A5Bk7waGM,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Bhushan Bharat-R65777


> > /*
> >  * VFIO_PAMU_MAP_MSI_BANK
> >  *
> >  * Maps the MSI bank at the specified index and iova.  User space must
> >  * call this ioctl once for each MSI bank (count of banks is returned by
> >  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
> >  * Caller provides struct vfio_pamu_msi_bank_map with all fields set.
> >  * Operates on VFIO file descriptor (/dev/vfio/vfio).
> >  * Return: 0 on success, -errno on failure
> >  */
> >
> > struct vfio_pamu_msi_bank_map {
> >         __u32   argsz;
> >         __u32   msi_bank_index;  /* the index of the MSI bank */
> >         __u64   iova;      /* the iova the bank is to be mapped to */
> > };
> 
> Again, flags.  If we dynamically add or remove devices from a container
> the bank count can change, right?  If bank count goes from 2 to 3, does
> userspace know assume the new bank is [2]?  If bank count goes from 3 to
> 2, which index was that?  If removing a device removes an MSI bank then
> vfio-pamu will automatically do the unmap, right?

My assumption is that the bank count returned by VFIO_IOMMU_GET_MSI_BANK_COUNT
is the max bank count for a platform.  (number will mostly likely always be
3 or 4).  So it won't change as devices are added or removed.

If devices are added or removed, the kernel side can enable or disable
the corresponding MSI windows.  But that is hidden from user space.

Stuart

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] RFC:  vfio API changes needed for powerpc (v2)
@ 2013-04-04 21:38         ` Yoder Stuart-B08248
  0 siblings, 0 replies; 10+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-04 21:38 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Wood Scott-B07421, kvm, qemu-devel, agraf, iommu,
	Bhushan Bharat-R65777, Sethi Varun-B16395


> > /*
> >  * VFIO_PAMU_MAP_MSI_BANK
> >  *
> >  * Maps the MSI bank at the specified index and iova.  User space must
> >  * call this ioctl once for each MSI bank (count of banks is returned by
> >  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
> >  * Caller provides struct vfio_pamu_msi_bank_map with all fields set.
> >  * Operates on VFIO file descriptor (/dev/vfio/vfio).
> >  * Return: 0 on success, -errno on failure
> >  */
> >
> > struct vfio_pamu_msi_bank_map {
> >         __u32   argsz;
> >         __u32   msi_bank_index;  /* the index of the MSI bank */
> >         __u64   iova;      /* the iova the bank is to be mapped to */
> > };
> 
> Again, flags.  If we dynamically add or remove devices from a container
> the bank count can change, right?  If bank count goes from 2 to 3, does
> userspace know assume the new bank is [2]?  If bank count goes from 3 to
> 2, which index was that?  If removing a device removes an MSI bank then
> vfio-pamu will automatically do the unmap, right?

My assumption is that the bank count returned by VFIO_IOMMU_GET_MSI_BANK_COUNT
is the max bank count for a platform.  (number will mostly likely always be
3 or 4).  So it won't change as devices are added or removed.

If devices are added or removed, the kernel side can enable or disable
the corresponding MSI windows.  But that is hidden from user space.

Stuart



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: vfio API changes needed for powerpc (v2)
  2013-04-04 21:38         ` [Qemu-devel] " Yoder Stuart-B08248
@ 2013-04-04 22:51           ` Scott Wood
  -1 siblings, 0 replies; 10+ messages in thread
From: Scott Wood @ 2013-04-04 22:51 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: Alex Williamson, Wood Scott-B07421, agraf, Bhushan Bharat-R65777,
	Sethi Varun-B16395, kvm, qemu-devel, iommu

On 04/04/2013 04:38:44 PM, Yoder Stuart-B08248 wrote:
> 
> > > /*
> > >  * VFIO_PAMU_MAP_MSI_BANK
> > >  *
> > >  * Maps the MSI bank at the specified index and iova.  User space  
> must
> > >  * call this ioctl once for each MSI bank (count of banks is  
> returned by
> > >  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
> > >  * Caller provides struct vfio_pamu_msi_bank_map with all fields  
> set.
> > >  * Operates on VFIO file descriptor (/dev/vfio/vfio).
> > >  * Return: 0 on success, -errno on failure
> > >  */
> > >
> > > struct vfio_pamu_msi_bank_map {
> > >         __u32   argsz;
> > >         __u32   msi_bank_index;  /* the index of the MSI bank */
> > >         __u64   iova;      /* the iova the bank is to be mapped  
> to */
> > > };
> >
> > Again, flags.  If we dynamically add or remove devices from a  
> container
> > the bank count can change, right?  If bank count goes from 2 to 3,  
> does
> > userspace know assume the new bank is [2]?  If bank count goes from  
> 3 to
> > 2, which index was that?  If removing a device removes an MSI bank  
> then
> > vfio-pamu will automatically do the unmap, right?
> 
> My assumption is that the bank count returned by  
> VFIO_IOMMU_GET_MSI_BANK_COUNT
> is the max bank count for a platform.  (number will mostly likely  
> always be
> 3 or 4).  So it won't change as devices are added or removed.

It should be the actual number of banks used.  This is required if  
we're going to have userspace do the iteration and specify the exact  
iovas to use -- and even if we aren't going to do that, it would be  
more restrictive on available iova-space than is necessary.  Usually  
there will be only one bank in the group.

Actually mapping all of the MSI banks, all the time, would completely  
negate the point of using the separate alias pages.

-Scott

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] RFC: vfio API changes needed for powerpc (v2)
@ 2013-04-04 22:51           ` Scott Wood
  0 siblings, 0 replies; 10+ messages in thread
From: Scott Wood @ 2013-04-04 22:51 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: Wood Scott-B07421, kvm, agraf, iommu, qemu-devel,
	Alex Williamson, Bhushan Bharat-R65777, Sethi Varun-B16395

On 04/04/2013 04:38:44 PM, Yoder Stuart-B08248 wrote:
> 
> > > /*
> > >  * VFIO_PAMU_MAP_MSI_BANK
> > >  *
> > >  * Maps the MSI bank at the specified index and iova.  User space  
> must
> > >  * call this ioctl once for each MSI bank (count of banks is  
> returned by
> > >  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
> > >  * Caller provides struct vfio_pamu_msi_bank_map with all fields  
> set.
> > >  * Operates on VFIO file descriptor (/dev/vfio/vfio).
> > >  * Return: 0 on success, -errno on failure
> > >  */
> > >
> > > struct vfio_pamu_msi_bank_map {
> > >         __u32   argsz;
> > >         __u32   msi_bank_index;  /* the index of the MSI bank */
> > >         __u64   iova;      /* the iova the bank is to be mapped  
> to */
> > > };
> >
> > Again, flags.  If we dynamically add or remove devices from a  
> container
> > the bank count can change, right?  If bank count goes from 2 to 3,  
> does
> > userspace know assume the new bank is [2]?  If bank count goes from  
> 3 to
> > 2, which index was that?  If removing a device removes an MSI bank  
> then
> > vfio-pamu will automatically do the unmap, right?
> 
> My assumption is that the bank count returned by  
> VFIO_IOMMU_GET_MSI_BANK_COUNT
> is the max bank count for a platform.  (number will mostly likely  
> always be
> 3 or 4).  So it won't change as devices are added or removed.

It should be the actual number of banks used.  This is required if  
we're going to have userspace do the iteration and specify the exact  
iovas to use -- and even if we aren't going to do that, it would be  
more restrictive on available iova-space than is necessary.  Usually  
there will be only one bank in the group.

Actually mapping all of the MSI banks, all the time, would completely  
negate the point of using the separate alias pages.

-Scott

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: RFC: vfio API changes needed for powerpc (v2)
  2013-04-04 22:51           ` [Qemu-devel] " Scott Wood
@ 2013-04-04 23:11             ` Yoder Stuart-B08248
  -1 siblings, 0 replies; 10+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-04 23:11 UTC (permalink / raw)
  To: Wood Scott-B07421
  Cc: kvm-u79uwXL29TY76Z2rM5mHXA, qemu-devel-qX2TKyscuCcdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	agraf-l3A5Bk7waGM, Bhushan Bharat-R65777



> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Thursday, April 04, 2013 5:52 PM
> To: Yoder Stuart-B08248
> Cc: Alex Williamson; Wood Scott-B07421; agraf-l3A5Bk7waGM@public.gmane.org; Bhushan Bharat-R65777; Sethi Varun-B16395;
> kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; qemu-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: Re: RFC: vfio API changes needed for powerpc (v2)
> 
> On 04/04/2013 04:38:44 PM, Yoder Stuart-B08248 wrote:
> >
> > > > /*
> > > >  * VFIO_PAMU_MAP_MSI_BANK
> > > >  *
> > > >  * Maps the MSI bank at the specified index and iova.  User space
> > must
> > > >  * call this ioctl once for each MSI bank (count of banks is
> > returned by
> > > >  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
> > > >  * Caller provides struct vfio_pamu_msi_bank_map with all fields
> > set.
> > > >  * Operates on VFIO file descriptor (/dev/vfio/vfio).
> > > >  * Return: 0 on success, -errno on failure
> > > >  */
> > > >
> > > > struct vfio_pamu_msi_bank_map {
> > > >         __u32   argsz;
> > > >         __u32   msi_bank_index;  /* the index of the MSI bank */
> > > >         __u64   iova;      /* the iova the bank is to be mapped
> > to */
> > > > };
> > >
> > > Again, flags.  If we dynamically add or remove devices from a
> > container
> > > the bank count can change, right?  If bank count goes from 2 to 3,
> > does
> > > userspace know assume the new bank is [2]?  If bank count goes from
> > 3 to
> > > 2, which index was that?  If removing a device removes an MSI bank
> > then
> > > vfio-pamu will automatically do the unmap, right?
> >
> > My assumption is that the bank count returned by
> > VFIO_IOMMU_GET_MSI_BANK_COUNT
> > is the max bank count for a platform.  (number will mostly likely
> > always be
> > 3 or 4).  So it won't change as devices are added or removed.
> 
> It should be the actual number of banks used.  This is required if
> we're going to have userspace do the iteration and specify the exact
> iovas to use -- and even if we aren't going to do that, it would be
> more restrictive on available iova-space than is necessary.  Usually
> there will be only one bank in the group.
> 
> Actually mapping all of the MSI banks, all the time, would completely
> negate the point of using the separate alias pages.

The geometry, windows, DMA mappings, etc is set on a 'container'
which may have multiple groups in it.  So user space needs to
determine the total number of MSI windows needed when setting
the geometry and window count.

In the flow you proposed:

count = VFIO_IOMMU_GET_MSI_BANK_COUNT
VFIO_IOMMU_SET_ATTR(ATTR_GEOMETRY)
VFIO_IOMMU_SET_ATTR(ATTR_WINDOWS)
// do other DMA maps now, or later, or not at all, doesn't matter
for (i = 0; i < count; i++)
        VFIO_IOMMU_MAP_MSI_BANK(iova, i);

...that "count" has to be the total, not the count for 1 of
N possible groups.   So the get count ioctl is not done on
a group.

However, like you pointed out we don't want to negate isolation 
of the separate alias pages.  All this API is doing is telling
the kernel which windows to use for which MSI banks.   It's up
to the kernel to actually enable them as needed.

Say 3 MSI banks exist.  If there are no groups added to the
vfio container and all 3 MAP_MSI_BANK calls occurred
the picture may look like this (based on my earlier example):

   win gphys/
    #   enabled iova        phys          size
    --- ------- ----        ----          ----
    5     N     0x10000000  0xf_fe044000  4KB    // msi bank 1
    6     N     0x14000000  0xf_fe045000  4KB    // msi bank 2
    7     N     0x18000000  0xf_fe046000  4KB    // msi bank 3

User space adds 2 groups that use bank 1:

   win          gphys/
    #   enabled iova        phys          size
    --- ------- ----        ----          ----
    5     Y     0x10000000  0xf_fe044000  4KB    // msi bank 1
    6     N     0x14000000  0xf_fe045000  4KB    // msi bank 2
    7     N     0x18000000  0xf_fe046000  4KB    // msi bank 3

User space adds another group that uses bank 3:

   win          gphys/
    #   enabled iova        phys          size
    --- ------- ----        ----          ----
    5     Y     0x10000000  0xf_fe044000  4KB    // msi bank 1
    6     N     0x14000000  0xf_fe045000  4KB    // msi bank 2
    7     Y     0x18000000  0xf_fe046000  4KB    // msi bank 3

User space doesn't need to care what is actually enabled,
it just needs to the the kernel which windows to use
and the kernel can take care of the rest.

Stuart

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] RFC: vfio API changes needed for powerpc (v2)
@ 2013-04-04 23:11             ` Yoder Stuart-B08248
  0 siblings, 0 replies; 10+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-04 23:11 UTC (permalink / raw)
  To: Wood Scott-B07421
  Cc: kvm, qemu-devel, iommu, agraf, Alex Williamson,
	Bhushan Bharat-R65777, Sethi Varun-B16395



> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Thursday, April 04, 2013 5:52 PM
> To: Yoder Stuart-B08248
> Cc: Alex Williamson; Wood Scott-B07421; agraf@suse.de; Bhushan Bharat-R65777; Sethi Varun-B16395;
> kvm@vger.kernel.org; qemu-devel@nongnu.org; iommu@lists.linux-foundation.org
> Subject: Re: RFC: vfio API changes needed for powerpc (v2)
> 
> On 04/04/2013 04:38:44 PM, Yoder Stuart-B08248 wrote:
> >
> > > > /*
> > > >  * VFIO_PAMU_MAP_MSI_BANK
> > > >  *
> > > >  * Maps the MSI bank at the specified index and iova.  User space
> > must
> > > >  * call this ioctl once for each MSI bank (count of banks is
> > returned by
> > > >  * VFIO_IOMMU_GET_MSI_BANK_COUNT).
> > > >  * Caller provides struct vfio_pamu_msi_bank_map with all fields
> > set.
> > > >  * Operates on VFIO file descriptor (/dev/vfio/vfio).
> > > >  * Return: 0 on success, -errno on failure
> > > >  */
> > > >
> > > > struct vfio_pamu_msi_bank_map {
> > > >         __u32   argsz;
> > > >         __u32   msi_bank_index;  /* the index of the MSI bank */
> > > >         __u64   iova;      /* the iova the bank is to be mapped
> > to */
> > > > };
> > >
> > > Again, flags.  If we dynamically add or remove devices from a
> > container
> > > the bank count can change, right?  If bank count goes from 2 to 3,
> > does
> > > userspace know assume the new bank is [2]?  If bank count goes from
> > 3 to
> > > 2, which index was that?  If removing a device removes an MSI bank
> > then
> > > vfio-pamu will automatically do the unmap, right?
> >
> > My assumption is that the bank count returned by
> > VFIO_IOMMU_GET_MSI_BANK_COUNT
> > is the max bank count for a platform.  (number will mostly likely
> > always be
> > 3 or 4).  So it won't change as devices are added or removed.
> 
> It should be the actual number of banks used.  This is required if
> we're going to have userspace do the iteration and specify the exact
> iovas to use -- and even if we aren't going to do that, it would be
> more restrictive on available iova-space than is necessary.  Usually
> there will be only one bank in the group.
> 
> Actually mapping all of the MSI banks, all the time, would completely
> negate the point of using the separate alias pages.

The geometry, windows, DMA mappings, etc is set on a 'container'
which may have multiple groups in it.  So user space needs to
determine the total number of MSI windows needed when setting
the geometry and window count.

In the flow you proposed:

count = VFIO_IOMMU_GET_MSI_BANK_COUNT
VFIO_IOMMU_SET_ATTR(ATTR_GEOMETRY)
VFIO_IOMMU_SET_ATTR(ATTR_WINDOWS)
// do other DMA maps now, or later, or not at all, doesn't matter
for (i = 0; i < count; i++)
        VFIO_IOMMU_MAP_MSI_BANK(iova, i);

...that "count" has to be the total, not the count for 1 of
N possible groups.   So the get count ioctl is not done on
a group.

However, like you pointed out we don't want to negate isolation 
of the separate alias pages.  All this API is doing is telling
the kernel which windows to use for which MSI banks.   It's up
to the kernel to actually enable them as needed.

Say 3 MSI banks exist.  If there are no groups added to the
vfio container and all 3 MAP_MSI_BANK calls occurred
the picture may look like this (based on my earlier example):

   win gphys/
    #   enabled iova        phys          size
    --- ------- ----        ----          ----
    5     N     0x10000000  0xf_fe044000  4KB    // msi bank 1
    6     N     0x14000000  0xf_fe045000  4KB    // msi bank 2
    7     N     0x18000000  0xf_fe046000  4KB    // msi bank 3

User space adds 2 groups that use bank 1:

   win          gphys/
    #   enabled iova        phys          size
    --- ------- ----        ----          ----
    5     Y     0x10000000  0xf_fe044000  4KB    // msi bank 1
    6     N     0x14000000  0xf_fe045000  4KB    // msi bank 2
    7     N     0x18000000  0xf_fe046000  4KB    // msi bank 3

User space adds another group that uses bank 3:

   win          gphys/
    #   enabled iova        phys          size
    --- ------- ----        ----          ----
    5     Y     0x10000000  0xf_fe044000  4KB    // msi bank 1
    6     N     0x14000000  0xf_fe045000  4KB    // msi bank 2
    7     Y     0x18000000  0xf_fe046000  4KB    // msi bank 3

User space doesn't need to care what is actually enabled,
it just needs to the the kernel which windows to use
and the kernel can take care of the rest.

Stuart

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-04-04 23:12 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-04 17:32 RFC: vfio API changes needed for powerpc (v2) Yoder Stuart-B08248
2013-04-04 17:32 ` [Qemu-devel] " Yoder Stuart-B08248
     [not found] ` <9F6FE96B71CF29479FF1CDC8046E15035A273C-TcFNo7jSaXOLgTCmFNXF2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
2013-04-04 18:42   ` Alex Williamson
2013-04-04 18:42     ` [Qemu-devel] " Alex Williamson
     [not found]     ` <1365100973.2882.315.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
2013-04-04 21:38       ` Yoder Stuart-B08248
2013-04-04 21:38         ` [Qemu-devel] " Yoder Stuart-B08248
2013-04-04 22:51         ` Scott Wood
2013-04-04 22:51           ` [Qemu-devel] " Scott Wood
2013-04-04 23:11           ` Yoder Stuart-B08248
2013-04-04 23:11             ` [Qemu-devel] " Yoder Stuart-B08248

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.