From: "Wang, Xiao W" <xiao.w.wang@intel.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	"Bie, Tiwei" <tiwei.bie@intel.com>,
	"Tan, Jianfeng" <jianfeng.tan@intel.com>,
	"Liang, Cunming" <cunming.liang@intel.com>,
	"Daly, Dan" <dan.daly@intel.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"gaetan.rivet@6wind.com" <gaetan.rivet@6wind.com>,
	"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	"Chen, Junjie J" <junjie.j.chen@intel.com>
Subject: Re: [PATCH v6 1/4] eal/vfio: add multiple container support
Date: Thu, 12 Apr 2018 16:07:41 +0000
Message-ID: <B7F2E978279D1D49A3034B7786DACF406F88D1F2@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <974c9cd0-87c4-6ab1-0787-9278a7379fda@intel.com>

Hi Anatoly,

> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Thursday, April 12, 2018 10:04 PM
> To: Wang, Xiao W <xiao.w.wang@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>
> Cc: dev@dpdk.org; maxime.coquelin@redhat.com; Wang, Zhihong
> <zhihong.wang@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; Liang, Cunming <cunming.liang@intel.com>; Daly,
> Dan <dan.daly@intel.com>; thomas@monjalon.net; gaetan.rivet@6wind.com;
> hemant.agrawal@nxp.com; Chen, Junjie J <junjie.j.chen@intel.com>
> Subject: Re: [PATCH v6 1/4] eal/vfio: add multiple container support
> 
> On 12-Apr-18 8:19 AM, Xiao Wang wrote:
> > Currently eal vfio framework binds vfio group fd to the default
> > container fd during rte_vfio_setup_device, while in some cases,
> > e.g. vDPA (vhost data path acceleration), we want to put vfio group
> > to a separate container and program IOMMU via this container.
> >
> > This patch adds some APIs to support container creating and device
> > binding with a container.
> >
> > A driver could use "rte_vfio_create_container" helper to create a
> > new container from eal, use "rte_vfio_bind_group" to bind a device
> > to the newly created container.
> >
> > During rte_vfio_setup_device, the container bound with the device
> > will be used for IOMMU setup.
> >
> > Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
> > Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
> > Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > ---
> 
> Apologies for late review. Some comments below.
> 
> <...>
> 
> >
> > +struct rte_memseg;
> > +
> >   /**
> >    * Setup vfio_cfg for the device identified by its address.
> >    * It discovers the configured I/O MMU groups or sets a new one for the device.
> > @@ -131,6 +133,117 @@ rte_vfio_clear_group(int vfio_group_fd);
> >   }
> >   #endif
> >
> 
> <...>
> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Perform dma mapping for devices in a container.
> > + *
> > + * @param container_fd
> > + *   the specified container fd
> > + *
> > + * @param dma_type
> > + *   the dma map type
> > + *
> > + * @param ms
> > + *   the dma address region to map
> > + *
> > + * @return
> > + *    0 if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_dma_map(int container_fd, int dma_type, const struct rte_memseg *ms);
> > +
> 
> First of all, why memseg, instead of va/iova/len? This seems like an
> unnecessary attachment to the internals of DPDK memory representation.
> Not all memory comes in memsegs, so this makes the API unnecessarily
> specific to DPDK memory.

Agreed, will use va/iova/len.
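
A minimal sketch of how a va/iova/len interface could look. Every sketch_
name below is a placeholder invented for illustration, not an existing EAL
symbol, and the body only records the request where real code would issue
the VFIO map ioctl:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_MAPS 8

/* One recorded mapping: plain addresses, no rte_memseg involved. */
struct sketch_dma_map {
	int container_fd;
	uint64_t vaddr;
	uint64_t iova;
	uint64_t len;
};

static struct sketch_dma_map sketch_maps[SKETCH_MAX_MAPS];
static int sketch_nb_maps;

/*
 * Hypothetical replacement for the memseg-based prototype. It validates the
 * arguments and stores them; the actual IOMMU programming is left out.
 */
int
sketch_container_dma_map(int container_fd, uint64_t vaddr, uint64_t iova,
		uint64_t len)
{
	if (container_fd < 0 || len == 0)
		return -EINVAL;
	if (sketch_nb_maps == SKETCH_MAX_MAPS)
		return -ENOSPC;

	assert(sketch_nb_maps >= 0 && sketch_nb_maps < SKETCH_MAX_MAPS);
	sketch_maps[sketch_nb_maps].container_fd = container_fd;
	sketch_maps[sketch_nb_maps].vaddr = vaddr;
	sketch_maps[sketch_nb_maps].iova = iova;
	sketch_maps[sketch_nb_maps].len = len;
	sketch_nb_maps++;
	return 0;
}
```

With this shape a caller that happens to hold a memseg passes its address,
IOVA and length, while a caller with externally allocated memory can use the
same entry point.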

> 
> Also, why provide the DMA type? There's already a VFIO type pointer in
> vfio_config - you can set this pointer for every newly created container,
> so the user wouldn't have to care about IOMMU type. Is it not possible
> to figure out DMA type from within EAL VFIO? If not, maybe provide an
> API to do so, e.g. rte_vfio_container_set_dma_type()?

It's possible; EAL VFIO should be able to figure out a container's DMA type.
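
A rough sketch of that idea, assuming each container keeps a pointer to its
IOMMU type descriptor as suggested above. The struct layout and all sketch_
names are hypothetical and do not reflect the actual vfio_config definition:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*sketch_dma_map_fn)(int container_fd, uint64_t vaddr,
		uint64_t iova, uint64_t len);

/* Hypothetical per-IOMMU-type descriptor. */
struct sketch_iommu_type {
	int type_id;
	const char *name;
	sketch_dma_map_fn dma_map;
};

/* Hypothetical per-container state; the type pointer is set once at setup. */
struct sketch_container {
	int fd;
	const struct sketch_iommu_type *iommu_type;
};

static int sketch_last_type_id = -1;

/* Stand-in for a Type 1 mapping routine; it only records that it ran. */
static int
sketch_type1_map(int container_fd, uint64_t vaddr, uint64_t iova,
		uint64_t len)
{
	(void)container_fd;
	(void)vaddr;
	(void)iova;
	(void)len;
	sketch_last_type_id = 1;
	return 0;
}

const struct sketch_iommu_type sketch_type1 = {
	.type_id = 1,
	.name = "Type 1",
	.dma_map = sketch_type1_map,
};

/* The caller passes no DMA type; it is looked up from the container. */
int
sketch_container_map(const struct sketch_container *c, uint64_t vaddr,
		uint64_t iova, uint64_t len)
{
	if (c == NULL || c->iommu_type == NULL)
		return -1; /* IOMMU type not set up yet */

	assert(c->iommu_type->dma_map != NULL);
	return c->iommu_type->dma_map(c->fd, vaddr, iova, len);
}
```

The dma_type argument disappears from the public prototype, and a container
whose IOMMU has not been set up fails early instead of being mapped with the
wrong type.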

> 
> This will also need to be rebased on top of latest HEAD because there
> already is a similar DMA map/unmap API added, only without the container
> parameter. Perhaps rename these new functions to
> rte_vfio_container_(create|destroy|dma_map|dma_unmap)?

OK, will check the latest HEAD and rebase on that.

> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Perform dma unmapping for devices in a container.
> > + *
> > + * @param container_fd
> > + *   the specified container fd
> > + *
> > + * @param dma_type
> > + *    the dma map type
> > + *
> > + * @param ms
> > + *   the dma address region to unmap
> > + *
> > + * @return
> > + *    0 if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_dma_unmap(int container_fd, int dma_type, const struct rte_memseg *ms);
> > +
> >   #endif /* VFIO_PRESENT */
> >
> 
> <...>
> 
> > @@ -75,8 +53,8 @@ vfio_get_group_fd(int iommu_group_no)
> >   		if (vfio_group_fd < 0) {
> >   			/* if file not found, it's not an error */
> >   			if (errno != ENOENT) {
> > -				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", filename,
> > -						strerror(errno));
> > +				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> > +					filename, strerror(errno));
> 
> This looks like an unintended change.
> 
> >   				return -1;
> >   			}
> >
> > @@ -86,8 +64,10 @@ vfio_get_group_fd(int iommu_group_no)
> >   			vfio_group_fd = open(filename, O_RDWR);
> >   			if (vfio_group_fd < 0) {
> >   				if (errno != ENOENT) {
> > -					RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", filename,
> > -							strerror(errno));
> > +					RTE_LOG(ERR, EAL,
> > +						"Cannot open %s: %s\n",
> > +						filename,
> > +						strerror(errno));
> 
> This looks like an unintended change.
> 
> >   					return -1;
> >   				}
> >   				return 0;
> > @@ -95,21 +75,19 @@ vfio_get_group_fd(int iommu_group_no)
> >   			/* noiommu group found */
> >   		}
> >
> > -		cur_grp->group_no = iommu_group_no;
> > -		cur_grp->fd = vfio_group_fd;
> > -		vfio_cfg.vfio_active_groups++;
> >   		return vfio_group_fd;
> >   	}
> > -	/* if we're in a secondary process, request group fd from the primary
> > +	/*
> > +	 * if we're in a secondary process, request group fd from the primary
> >   	 * process via our socket
> >   	 */
> 
> This looks like an unintended change.
> 
> >   	else {
> > -		int socket_fd, ret;
> > -
> > -		socket_fd = vfio_mp_sync_connect_to_primary();
> > +		int ret;
> > +		int socket_fd = vfio_mp_sync_connect_to_primary();
> >
> >   		if (socket_fd < 0) {
> > -			RTE_LOG(ERR, EAL, "  cannot connect to primary process!\n");
> > +			RTE_LOG(ERR, EAL,
> > +				"  cannot connect to primary process!\n");
> 
> This looks like an unintended change.
> 
> >   			return -1;
> >   		}
> >   		if (vfio_mp_sync_send_request(socket_fd, SOCKET_REQ_GROUP) < 0) {
> > @@ -122,6 +100,7 @@ vfio_get_group_fd(int iommu_group_no)
> >   			close(socket_fd);
> >   			return -1;
> >   		}
> > +
> >   		ret = vfio_mp_sync_receive_request(socket_fd);
> 
> This looks like an unintended change.
> 
> (hint: "git revert -n HEAD && git add -p" is your friend :) )

Thanks, will remove these diffs.

> 
> >   		switch (ret) {
> >   		case SOCKET_NO_FD:
> > @@ -132,9 +111,6 @@ vfio_get_group_fd(int iommu_group_no)
> >   			/* if we got the fd, store it and return it */
> >   			if (vfio_group_fd > 0) {
> >   				close(socket_fd);
> > -				cur_grp->group_no = iommu_group_no;
> > -				cur_grp->fd = vfio_group_fd;
> > -				vfio_cfg.vfio_active_groups++;
> >   				return vfio_group_fd;
> >   			}
> >   			/* fall-through on error */
> > @@ -147,70 +123,349 @@ vfio_get_group_fd(int iommu_group_no)
> >   	return -1;
> 
> <...>
> 
> > +int __rte_experimental
> > +rte_vfio_create_container(void)
> > +{
> > +	struct vfio_config *vfio_cfg;
> > +	int i;
> > +
> > +	/* Find an empty slot to store new vfio config */
> > +	for (i = 1; i < VFIO_MAX_CONTAINERS; i++) {
> > +		if (vfio_cfgs[i] == NULL)
> > +			break;
> > +	}
> > +
> > +	if (i == VFIO_MAX_CONTAINERS) {
> > +		RTE_LOG(ERR, EAL, "exceed max vfio container limit\n");
> > +		return -1;
> > +	}
> > +
> > +	vfio_cfgs[i] = rte_zmalloc("vfio_container", sizeof(struct vfio_config),
> > +		RTE_CACHE_LINE_SIZE);
> > +	if (vfio_cfgs[i] == NULL)
> > +		return -ENOMEM;
> 
> Is there a specific reason why 1) dynamic allocation is used (as opposed
> to just storing a static array), and 2) DPDK memory allocation is used?
> This seems like unnecessary complication.
> 
> Even if you were to decide to allocate memory instead of having a static
> array, you'll have to register for rte_eal_cleanup() to delete any
> allocated containers on DPDK exit. But, as I said, I think it would be
> better to keep it as a static array.
>

Thanks for the suggestion, a static array looks simpler and cleaner.
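
A minimal sketch of the static-array variant, assuming slot 0 stays reserved
for the default container (the loop in the patch also starts at 1). The
sketch_ names, the in_use flag and the fake fd source are all illustrative
and not taken from the patch:

```c
#include <assert.h>

#define SKETCH_MAX_CONTAINERS 4

/* Hypothetical, trimmed-down per-container config. */
struct sketch_vfio_config {
	int in_use;
	int container_fd;
	int active_groups;
};

/* Static storage: nothing to allocate and nothing to free on exit. */
static struct sketch_vfio_config sketch_cfgs[SKETCH_MAX_CONTAINERS];

/* Stand-in for opening a container; hands out fake fds starting at 100. */
static int sketch_next_fd = 100;

static int
sketch_open_container(void)
{
	return sketch_next_fd++;
}

/* Returns the new container fd, or -1 if no slot is free. */
int
sketch_container_create(void)
{
	int i, fd;

	for (i = 1; i < SKETCH_MAX_CONTAINERS; i++) {
		if (!sketch_cfgs[i].in_use)
			break;
	}
	if (i == SKETCH_MAX_CONTAINERS)
		return -1;

	fd = sketch_open_container();
	if (fd < 0)
		return -1;

	assert(i >= 1 && i < SKETCH_MAX_CONTAINERS);
	sketch_cfgs[i].in_use = 1;
	sketch_cfgs[i].container_fd = fd;
	sketch_cfgs[i].active_groups = 0;
	return fd;
}

/* Releases the slot that holds container_fd; returns -1 if not found. */
int
sketch_container_destroy(int container_fd)
{
	int i;

	for (i = 1; i < SKETCH_MAX_CONTAINERS; i++) {
		if (sketch_cfgs[i].in_use &&
				sketch_cfgs[i].container_fd == container_fd) {
			sketch_cfgs[i] = (struct sketch_vfio_config){0};
			return 0;
		}
	}
	return -1;
}
```

Destroying a container just clears its slot, so there is no rte_zmalloc()
failure path and no cleanup hook to register.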
 
> > +
> > +	RTE_LOG(INFO, EAL, "alloc container at slot %d\n", i);
> > +	vfio_cfg = vfio_cfgs[i];
> > +	vfio_cfg->vfio_active_groups = 0;
> > +	vfio_cfg->vfio_container_fd = vfio_get_container_fd();
> > +
> > +	if (vfio_cfg->vfio_container_fd < 0) {
> > +		rte_free(vfio_cfgs[i]);
> > +		vfio_cfgs[i] = NULL;
> > +		return -1;
> > +	}
> > +
> > +	for (i = 0; i < VFIO_MAX_GROUPS; i++) {
> > +		vfio_cfg->vfio_groups[i].group_no = -1;
> > +		vfio_cfg->vfio_groups[i].fd = -1;
> > +		vfio_cfg->vfio_groups[i].devices = 0;
> > +	}
> 
> <...>
> 
> > @@ -665,41 +931,80 @@ vfio_get_group_no(const char *sysfs_base,
> >   }
> >
> >   static int
> > -vfio_type1_dma_map(int vfio_container_fd)
> > +do_vfio_type1_dma_map(int vfio_container_fd, const struct rte_memseg *ms)
> 
> <...>
> 
> 
> > +static int
> > +do_vfio_type1_dma_unmap(int vfio_container_fd, const struct rte_memseg *ms)
> 
> APIs such as these two were recently added to DPDK.

Will check and rebase.

BRs,
Xiao

> 
> --
> Thanks,
> Anatoly
