* [PATCH 0/4] allow procinfo and pdump on eth vdev
@ 2018-03-04 15:30 Jianfeng Tan
  2018-03-04 15:30 ` [PATCH 1/4] eal: bring forward multi-process channel init Jianfeng Tan
                   ` (6 more replies)
  0 siblings, 7 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-03-04 15:30 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit, anatoly.burakov, Jianfeng Tan

As we know, vdevs currently have the following limitations:
  - dpdk-procinfo cannot get the stats of (most) vdevs in the primary process;
  - dpdk-pdump cannot dump the packets of (most) vdevs in the primary process;
  - a secondary process cannot use (most) vdevs in the primary process.

The root cause is that the secondary process does not know that those
vdevs exist: vdevs are chained on a linked list in the primary, and that
list is not shared with the secondary.

In this patch series, we would like to propose a vdev sharing model like this:
  - As a secondary process boots, all devices (including vdevs) in the
    primary will be automatically shared.
  - After both the primary and the secondary process have booted, device
    add/remove in the primary will be translated to device hotplug/unplug
    events in the secondary processes. (TODO)
  - Device add in secondary
    * If that kind of device supports multi-process, the secondary will
      request the primary to probe the device, and the primary will share
      it with the secondary. A secondary-private device is not necessary
      in this case. (TODO)
    * If that kind of device does not support multi-process, the secondary
      will probe the device by itself, and the port id is shared among
      all primary/secondary processes.
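
The "device add" rules above can be read as a small decision function. The
sketch below is only an illustration of that reading: the enum and function
names are hypothetical and do not exist in DPDK, and the "request the
primary" branch corresponds to the part still marked TODO.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these are DPDK APIs. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

enum add_action {
	ADD_PROBE_LOCALLY,	/* this process probes the device itself */
	ADD_REQUEST_PRIMARY,	/* secondary asks the primary to probe and share */
};

/*
 * Decide how a "device add" is handled under the proposed sharing model.
 * A primary always probes locally. A secondary asks the primary when the
 * driver supports multi-process, and probes by itself otherwise.
 */
static enum add_action
vdev_add_action(enum proc_type proc, bool drv_supports_mp)
{
	assert(proc == PROC_PRIMARY || proc == PROC_SECONDARY);

	if (proc == PROC_PRIMARY)
		return ADD_PROBE_LOCALLY;

	return drv_supports_mp ? ADD_REQUEST_PRIMARY : ADD_PROBE_LOCALLY;
}
```

For example, vdev_add_action(PROC_SECONDARY, false) yields ADD_PROBE_LOCALLY,
which matches the last bullet: the secondary probes the device itself and
only the port id is shared.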

This patch series does not:
  - provide secondary data path (Rx/Tx) support for each specific vdev.

How to test:

Step 0: start testpmd with a vhost port, and a VM connected to that vhost port.

Step 1: try using dpdk-procinfo to get the stats.
 $(dpdk-procinfo) --log-level=8 --no-pci -- --stats

Step 2: try using dpdk-pdump to dump the packets.
 $(dpdk-pdump) -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'

Jianfeng Tan (4):
  eal: bring forward multi-process channel init
  bus/vdev: bus scan by multi-process channel
  drivers/net: do not allocate rte_eth_dev_data privately
  drivers/net: share vdev data to secondary process

 drivers/bus/vdev/Makefile                 |   1 +
 drivers/bus/vdev/vdev.c                   | 110 ++++++++++++++++++++++++++++++
 drivers/net/af_packet/rte_eth_af_packet.c |  42 ++++++------
 drivers/net/bonding/rte_eth_bond_pmd.c    |  13 ++++
 drivers/net/failsafe/failsafe.c           |  14 ++++
 drivers/net/kni/rte_eth_kni.c             |  25 ++++---
 drivers/net/null/rte_eth_null.c           |  30 ++++----
 drivers/net/octeontx/octeontx_ethdev.c    |  28 ++++----
 drivers/net/pcap/rte_eth_pcap.c           |  31 +++++----
 drivers/net/softnic/rte_eth_softnic.c     |  19 +++++-
 drivers/net/tap/rte_eth_tap.c             |  22 +++---
 drivers/net/vhost/rte_eth_vhost.c         |  34 ++++-----
 lib/librte_eal/bsdapp/eal/eal.c           |  23 ++++---
 lib/librte_eal/linuxapp/eal/eal.c         |  23 ++++---
 14 files changed, 295 insertions(+), 120 deletions(-)

-- 
2.7.4


* [PATCH 1/4] eal: bring forward multi-process channel init
  2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
@ 2018-03-04 15:30 ` Jianfeng Tan
  2018-03-04 15:30 ` [PATCH 2/4] bus/vdev: bus scan by multi-process channel Jianfeng Tan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-03-04 15:30 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit, anatoly.burakov, Jianfeng Tan

Adjust the init sequence: put mp channel init before bus scan
so that we can init the vdev bus through mp channel in the
secondary process before the bus scan.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c   | 23 +++++++++++++----------
 lib/librte_eal/linuxapp/eal/eal.c | 23 +++++++++++++----------
 2 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..b469382 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -544,6 +544,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -583,16 +596,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
 		rte_errno = ENOMEM;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 4ca06f4..8914f91 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -769,6 +769,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -815,8 +828,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
 	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
 		rte_eal_init_alert("Cannot init logging.");
 		rte_errno = ENOMEM;
@@ -824,14 +835,6 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0) {
 		rte_eal_init_alert("Cannot init VFIO\n");
-- 
2.7.4


* [PATCH 2/4] bus/vdev: bus scan by multi-process channel
  2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-03-04 15:30 ` [PATCH 1/4] eal: bring forward multi-process channel init Jianfeng Tan
@ 2018-03-04 15:30 ` Jianfeng Tan
  2018-03-05  9:36   ` Burakov, Anatoly
  2018-03-07 14:00   ` Burakov, Anatoly
  2018-03-04 15:30 ` [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately Jianfeng Tan
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-03-04 15:30 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit, anatoly.burakov, Jianfeng Tan

To scan the vdevs in the primary, the secondary process sends a request
to the primary process to obtain the names of its vdevs.

Only the name is shared by the primary. In probe(), the device driver
is expected to locate (or request) the remaining details from the
primary.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 drivers/bus/vdev/Makefile |   1 +
 drivers/bus/vdev/vdev.c   | 110 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 111 insertions(+)
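
Note for readers: the message flow can be sketched without DPDK as follows.
This is a simplified model, not the code in the diff. The struct and
function names are hypothetical, a direct function call stands in for
rte_mp_sendmsg(), and the sketch assumes that unnamed devices are skipped
and that names are copied with guaranteed NUL termination.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 64	/* stand-in for RTE_DEV_NAME_MAX_LEN */
#define MAX_DEVS 8

enum { SCAN_REQ = 1, SCAN_ONE = 2, SCAN_REP = 3 };

struct scan_msg {
	int type;
	int num;
	char name[NAME_MAX_LEN];
};

struct dev_list {
	int count;
	char names[MAX_DEVS][NAME_MAX_LEN];
};

/* Secondary side: record one announced device. Returns 1 if it was added. */
static int
secondary_handle(struct dev_list *l, const struct scan_msg *m)
{
	int i;

	if (m->type != SCAN_ONE || l->count == MAX_DEVS)
		return 0;
	for (i = 0; i < l->count; i++)
		if (strcmp(l->names[i], m->name) == 0)
			return 0;	/* already known, ignore */
	snprintf(l->names[l->count], NAME_MAX_LEN, "%s", m->name);
	l->count++;
	return 1;
}

/*
 * Primary side: answer a scan request with one SCAN_ONE message per named
 * device, then return the final SCAN_REP carrying the number sent.
 */
static struct scan_msg
primary_handle_scan(const struct dev_list *primary, struct dev_list *secondary)
{
	struct scan_msg m;
	int i, sent = 0;

	assert(primary != NULL && secondary != NULL);
	for (i = 0; i < primary->count; i++) {
		if (primary->names[i][0] == '\0')
			continue;	/* assumption: unnamed vdevs are skipped */
		memset(&m, 0, sizeof(m));
		m.type = SCAN_ONE;
		m.num = 1;
		snprintf(m.name, sizeof(m.name), "%s", primary->names[i]);
		secondary_handle(secondary, &m);	/* models the send */
		sent++;
	}
	memset(&m, 0, sizeof(m));
	m.type = SCAN_REP;
	m.num = sent;
	return m;
}
```

In this model the secondary ends up with one list entry per named vdev in
the primary, and the reply count tells it how many announcements to expect.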

diff --git a/drivers/bus/vdev/Makefile b/drivers/bus/vdev/Makefile
index 24d424a..bd0bb89 100644
--- a/drivers/bus/vdev/Makefile
+++ b/drivers/bus/vdev/Makefile
@@ -10,6 +10,7 @@ LIB = librte_bus_vdev.a
 
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 
 # versioning export map
 EXPORT_MAP := rte_bus_vdev_version.map
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index e4bc724..0a3ea52 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -314,6 +314,88 @@ rte_vdev_uninit(const char *name)
 	return 0;
 }
 
+struct vdev_param {
+#define VDEV_SCAN_REQ	1
+#define VDEV_SCAN_ONE	2
+#define VDEV_SCAN_REP	3
+	int type;
+	int num;
+	char name[RTE_DEV_NAME_MAX_LEN];
+};
+
+static int vdev_plug(struct rte_device *dev);
+
+static int
+vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_vdev_device *dev;
+	struct rte_devargs *devargs;
+	struct rte_mp_msg mp_resp;
+	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
+	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
+	const char *devname;
+	int num;
+
+	strcpy(mp_resp.name, "vdev");
+	mp_resp.len_param = sizeof(*ou);
+	mp_resp.num_fds = 0;
+
+	switch (in->type) {
+	case VDEV_SCAN_REQ:
+		ou->type = VDEV_SCAN_ONE;
+		ou->num = 1;
+		num = 0;
+		TAILQ_FOREACH(dev, &vdev_device_list, next) {
+			devname = rte_vdev_device_name(dev);
+			if (strlen(devname) == 0)
+				VDEV_LOG(INFO, "vdev with no name is not sent");
+			VDEV_LOG(INFO, "send vdev, %s", devname);
+			strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
+			if (rte_mp_sendmsg(&mp_resp) < 0)
+				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
+					 devname, strerror(rte_errno));
+			num++;
+		}
+		ou->type = VDEV_SCAN_REP;
+		ou->num = num;
+		if (rte_mp_reply(&mp_resp, peer) < 0)
+			VDEV_LOG(ERR, "Failed to reply a scan request");
+		break;
+	case VDEV_SCAN_ONE:
+		VDEV_LOG(INFO, "receive vdev, %s", in->name);
+		dev = find_vdev(in->name);
+		if (dev) {
+			VDEV_LOG(ERR, "vdev already exists: %s", in->name);
+			break;
+		}
+
+		devargs = alloc_devargs(in->name, NULL);
+		if (!devargs) {
+			VDEV_LOG(ERR, "failed to allocate memory");
+			break;
+		}
+
+		dev = calloc(1, sizeof(*dev));
+		if (!dev) {
+			VDEV_LOG(ERR, "failed to allocate memory");
+			free(devargs);
+			break;
+		}
+
+		dev->device.devargs = devargs;
+		dev->device.numa_node = 0; /* to be corrected in probe() */
+		dev->device.name = devargs->name;
+
+		TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
+		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+		break;
+	default:
+		VDEV_LOG(ERR, "vdev cannot recognize this message");
+	}
+
+	return 0;
+}
+
 static int
 vdev_scan(void)
 {
@@ -321,6 +403,34 @@ vdev_scan(void)
 	struct rte_devargs *devargs;
 	struct vdev_custom_scan *custom_scan;
 
+	if (rte_mp_action_register("vdev", vdev_action) < 0 &&
+	    rte_errno != EEXIST) {
+		VDEV_LOG(ERR, "vdev fails to add action");
+		return -1;
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		struct rte_mp_msg mp_req, *mp_rep;
+		struct rte_mp_reply mp_reply;
+		struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+		struct vdev_param *req = (struct vdev_param *)mp_req.param;
+		struct vdev_param *resp;
+
+		strcpy(mp_req.name, "vdev");
+		mp_req.len_param = sizeof(*req);
+		mp_req.num_fds = 0;
+		req->type = VDEV_SCAN_REQ;
+		if (rte_mp_request(&mp_req, &mp_reply, &ts) == 0 &&
+		    mp_reply.nb_received == 1) {
+			mp_rep = &mp_reply.msgs[0];
+			resp = (struct vdev_param *)mp_rep->param;
+			VDEV_LOG(INFO, "Received %d vdevs", resp->num);
+		} else
+			VDEV_LOG(ERR, "Failed to request vdev from primary");
+
+		/* Fall through to allow private vdevs in secondary process */
+	}
+
 	/* call custom scan callbacks if any */
 	rte_spinlock_lock(&vdev_custom_scan_lock);
 	TAILQ_FOREACH(custom_scan, &vdev_custom_scans, next) {
-- 
2.7.4


* [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately
  2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-03-04 15:30 ` [PATCH 1/4] eal: bring forward multi-process channel init Jianfeng Tan
  2018-03-04 15:30 ` [PATCH 2/4] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-03-04 15:30 ` Jianfeng Tan
  2018-03-06  6:07   ` Matan Azrad
  2018-03-04 15:30 ` [PATCH 4/4] drivers/net: share vdev data to secondary process Jianfeng Tan
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 48+ messages in thread
From: Jianfeng Tan @ 2018-03-04 15:30 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit, anatoly.burakov, Jianfeng Tan

We introduced a private rte_eth_dev_data to allow a vdev to be created
in both the primary process and the secondary process(es). This does
not fit the multi-process model; for example, it leads to port id
contention if two processes both find the same data entry free.

Also, to get the stats of a primary vdev in a secondary, the data must
be allocated from the pre-defined array so that the secondary can find it.

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 drivers/net/af_packet/rte_eth_af_packet.c | 25 +++++++------------------
 drivers/net/kni/rte_eth_kni.c             | 13 ++-----------
 drivers/net/null/rte_eth_null.c           | 17 +++--------------
 drivers/net/octeontx/octeontx_ethdev.c    | 14 ++------------
 drivers/net/pcap/rte_eth_pcap.c           | 18 +++---------------
 drivers/net/tap/rte_eth_tap.c             |  9 +--------
 drivers/net/vhost/rte_eth_vhost.c         | 17 ++---------------
 7 files changed, 20 insertions(+), 93 deletions(-)
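
Note for readers: the idea of allocating from the pre-defined array can be
modelled in a few lines. This is a sketch under stated assumptions, not
ethdev code: a static array stands in for the shared rte_eth_dev_data
array, and port_allocate()/port_attach() are hypothetical stand-ins for the
allocate and attach-secondary steps.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4	/* stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN 64

struct port_data {
	char name[NAME_LEN];	/* an empty name marks a free slot */
	unsigned long rx_packets;
};

/* Models the array that lives in memory shared by all processes. */
static struct port_data shared_ports[MAX_PORTS];

/* Creator side: claim the first free slot and return its port id, or -1. */
static int
port_allocate(const char *name)
{
	int i;

	assert(name != NULL && name[0] != '\0');
	for (i = 0; i < MAX_PORTS; i++) {
		if (shared_ports[i].name[0] == '\0') {
			snprintf(shared_ports[i].name, NAME_LEN, "%s", name);
			return i;
		}
	}
	return -1;
}

/* Other process: find an existing port by name, or return -1. */
static int
port_attach(const char *name)
{
	int i;

	assert(name != NULL);
	if (name[0] == '\0')
		return -1;
	for (i = 0; i < MAX_PORTS; i++)
		if (strcmp(shared_ports[i].name, name) == 0)
			return i;
	return -1;
}
```

Because both sides index the same array, the port id seen by the attaching
process equals the one chosen at allocation, and counters written by the
creator are visible through that slot.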

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 57eccfd..2db692f 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: no interface specified for AF_PACKET ethdev\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 
 	RTE_LOG(INFO, PMD,
 		"%s: creating AF_PACKET-backed ethdev on numa socket %u\n",
 		name, numa_node);
 
-	/*
-	 * now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error_early;
-
 	*internals = rte_zmalloc_socket(name, sizeof(**internals),
 	                                0, numa_node);
 	if (*internals == NULL)
-		goto error_early;
+		return -1;
 
 	for (q = 0; q < nb_queues; q++) {
 		(*internals)->rx_queue[q].map = MAP_FAILED;
@@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: I/F name too long (%s)\n",
 			name, pair->value);
-		goto error_early;
+		return -1;
 	}
 	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFINDEX)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	(*internals)->if_name = strdup(pair->value);
 	if ((*internals)->if_name == NULL)
-		goto error_early;
+		return -1;
 	(*internals)->if_index = ifr.ifr_ifindex;
 
 	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFHWADDR)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
 
@@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 
 	(*internals)->nb_queues = nb_queues;
 
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->dev_private = *internals;
 	data->nb_rx_queues = (uint16_t)nb_queues;
 	data->nb_tx_queues = (uint16_t)nb_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &(*internals)->eth_addr;
 
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 	}
 	free((*internals)->if_name);
 	rte_free(*internals);
-error_early:
-	rte_free(data);
 	return -1;
 }
 
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index dc4e65f..1a07089 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -337,25 +337,17 @@ eth_kni_create(struct rte_vdev_device *vdev,
 	struct pmd_internals *internals;
 	struct rte_eth_dev_data *data;
 	struct rte_eth_dev *eth_dev;
-	const char *name;
 
 	RTE_LOG(INFO, PMD, "Creating kni ethdev on numa socket %u\n",
 			numa_node);
 
-	name = rte_vdev_device_name(vdev);
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return NULL;
-
 	/* reserve an ethdev entry */
 	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*internals));
-	if (eth_dev == NULL) {
-		rte_free(data);
+	if (eth_dev == NULL)
 		return NULL;
-	}
 
 	internals = eth_dev->data->dev_private;
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = 1;
 	data->nb_tx_queues = 1;
 	data->dev_link = pmd_link;
@@ -363,7 +355,6 @@ eth_kni_create(struct rte_vdev_device *vdev,
 
 	eth_random_addr(internals->eth_addr.addr_bytes);
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &eth_kni_ops;
 
 	internals->no_request_thread = args->no_request_thread;
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index d003b28..98fc60c 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -496,7 +496,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 {
 	const unsigned nb_rx_queues = 1;
 	const unsigned nb_tx_queues = 1;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internals *internals = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 
@@ -513,19 +513,9 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 	RTE_LOG(INFO, PMD, "Creating null ethdev on numa socket %u\n",
 		dev->device.numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(rte_vdev_device_name(dev), sizeof(*data), 0,
-		dev->device.numa_node);
-	if (!data)
-		return -ENOMEM;
-
 	eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internals));
-	if (!eth_dev) {
-		rte_free(data);
+	if (!eth_dev)
 		return -ENOMEM;
-	}
 
 	/* now put it all together
 	 * - store queue data in internals,
@@ -546,13 +536,12 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 
 	rte_memcpy(internals->rss_key, default_rss_key, 40);
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &eth_addr;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 
 	/* finally assign rx and tx ops */
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index b739c0b..f58f6af 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1039,7 +1039,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	char octtx_name[OCTEONTX_MAX_NAME_LEN];
 	struct octeontx_nic *nic = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	const char *name = rte_vdev_device_name(dev);
 
 	PMD_INIT_FUNC_TRACE();
@@ -1055,13 +1055,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		return 0;
 	}
 
-	data = rte_zmalloc_socket(octtx_name, sizeof(*data), 0, socket_id);
-	if (data == NULL) {
-		octeontx_log_err("failed to allocate devdata");
-		res = -ENOMEM;
-		goto err;
-	}
-
 	nic = rte_zmalloc_socket(octtx_name, sizeof(*nic), 0, socket_id);
 	if (nic == NULL) {
 		octeontx_log_err("failed to allocate nic structure");
@@ -1097,11 +1090,9 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	eth_dev->data->kdrv = RTE_KDRV_NONE;
 	eth_dev->data->numa_node = dev->device.numa_node;
 
-	rte_memcpy(data, (eth_dev)->data, sizeof(*data));
+	data = eth_dev->data;
 	data->dev_private = nic;
-
 	data->port_id = eth_dev->data->port_id;
-	snprintf(data->name, sizeof(data->name), "%s", eth_dev->data->name);
 
 	nic->ev_queues = 1;
 	nic->ev_ports = 1;
@@ -1120,7 +1111,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		goto err;
 	}
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &octeontx_dev_ops;
 
 	/* Finally save ethdev pointer to the NIC structure */
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index c1571e1..f9f53ff 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -773,27 +773,16 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 		struct pmd_internals **internals,
 		struct rte_eth_dev **eth_dev)
 {
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	unsigned int numa_node = vdev->device.numa_node;
-	const char *name;
 
-	name = rte_vdev_device_name(vdev);
 	RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket %d\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return -1;
-
 	/* reserve an ethdev entry */
 	*eth_dev = rte_eth_vdev_allocate(vdev, sizeof(**internals));
-	if (*eth_dev == NULL) {
-		rte_free(data);
+	if (*eth_dev == NULL)
 		return -1;
-	}
 
 	/* now put it all together
 	 * - store queue data in internals,
@@ -802,7 +791,7 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
 	*internals = (*eth_dev)->data->dev_private;
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
@@ -812,7 +801,6 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * NOTE: we'll replace the data element, of originally allocated
 	 * eth_dev so the rings are local per-process
 	 */
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index f09db0e..0fb8be5 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1348,12 +1348,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 
 	RTE_LOG(DEBUG, PMD, "  TAP device on numa %u\n", rte_socket_id());
 
-	data = rte_zmalloc_socket(tap_name, sizeof(*data), 0, numa_node);
-	if (!data) {
-		RTE_LOG(ERR, PMD, "TAP Failed to allocate data\n");
-		goto error_exit_nodev;
-	}
-
 	dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
 	if (!dev) {
 		RTE_LOG(ERR, PMD, "TAP Unable to allocate device struct\n");
@@ -1373,7 +1367,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	}
 
 	/* Setup some default values */
-	rte_memcpy(data, dev->data, sizeof(*data));
+	data = dev->data;
 	data->dev_private = pmd;
 	data->dev_flags = RTE_ETH_DEV_INTR_LSC;
 	data->numa_node = numa_node;
@@ -1384,7 +1378,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	data->nb_rx_queues = 0;
 	data->nb_tx_queues = 0;
 
-	dev->data = data;
 	dev->dev_ops = &ops;
 	dev->rx_pkt_burst = pmd_rx_burst;
 	dev->tx_pkt_burst = pmd_tx_burst;
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 3aae01c..aa06ab5 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1016,7 +1016,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	int16_t queues, const unsigned int numa_node, uint64_t flags)
 {
 	const char *name = rte_vdev_device_name(dev);
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internal *internal = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 	struct ether_addr *eth_addr = NULL;
@@ -1026,13 +1026,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	RTE_LOG(INFO, PMD, "Creating VHOST-USER backend on numa socket %u\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure and internal
-	 * (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error;
-
 	list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
 	if (list == NULL)
 		goto error;
@@ -1074,12 +1067,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	rte_spinlock_init(&vring_state->lock);
 	vring_states[eth_dev->data->port_id] = vring_state;
 
-	/* We'll replace the 'data' originally allocated by eth_dev. So the
-	 * vhost PMD resources won't be shared between multi processes.
-	 */
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
-	eth_dev->data = data;
-
+	data = eth_dev->data;
 	data->nb_rx_queues = queues;
 	data->nb_tx_queues = queues;
 	internal->max_queues = queues;
@@ -1120,7 +1108,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 		rte_eth_dev_release_port(eth_dev);
 	rte_free(internal);
 	rte_free(list);
-	rte_free(data);
 
 	return -1;
 }
-- 
2.7.4


* [PATCH 4/4] drivers/net: share vdev data to secondary process
  2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (2 preceding siblings ...)
  2018-03-04 15:30 ` [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately Jianfeng Tan
@ 2018-03-04 15:30 ` Jianfeng Tan
  2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-03-04 15:30 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit, anatoly.burakov, Jianfeng Tan

dpdk-procinfo, as a secondary process, cannot fetch stats for vdevs.

This patch enables that by attaching the port from the shared data.
We also fill in the eth dev ops, although only some ops work in a
secondary process, for example, stats_get().

Note that we still cannot Rx/Tx packets on ports which do not
support multi-process.

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 drivers/net/af_packet/rte_eth_af_packet.c | 17 +++++++++++++++--
 drivers/net/bonding/rte_eth_bond_pmd.c    | 13 +++++++++++++
 drivers/net/failsafe/failsafe.c           | 14 ++++++++++++++
 drivers/net/kni/rte_eth_kni.c             | 12 ++++++++++++
 drivers/net/null/rte_eth_null.c           | 13 +++++++++++++
 drivers/net/octeontx/octeontx_ethdev.c    | 14 ++++++++++++++
 drivers/net/pcap/rte_eth_pcap.c           | 13 +++++++++++++
 drivers/net/softnic/rte_eth_softnic.c     | 19 ++++++++++++++++---
 drivers/net/tap/rte_eth_tap.c             | 13 +++++++++++++
 drivers/net/vhost/rte_eth_vhost.c         | 17 +++++++++++++++--
 10 files changed, 138 insertions(+), 7 deletions(-)
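
Note for readers: every probe() change below applies the same test before
the normal creation path. A minimal sketch of that test follows; the enum
and function names are hypothetical, and treating a NULL argument string as
empty is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
enum probe_path {
	PROBE_ATTACH_SHARED,	/* attach to the port created by the primary */
	PROBE_CREATE,		/* parse devargs and create the port as usual */
};

/*
 * A secondary process that sees a vdev without arguments assumes the
 * device was announced by the primary and attaches to its shared data.
 * In every other case the driver follows its normal creation path.
 */
static enum probe_path
choose_probe_path(bool is_secondary, const char *args)
{
	if (args == NULL)
		args = "";	/* assumption: no devargs means empty args */

	if (is_secondary && args[0] == '\0')
		return PROBE_ATTACH_SHARED;

	return PROBE_CREATE;
}
```

With this rule, a secondary started with its own --vdev arguments still
creates a private device, while a name learned from the primary's scan
reply (which carries no arguments) takes the attach path.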

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 2db692f..970cf05 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -915,9 +915,22 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
 	int ret = 0;
 	struct rte_kvargs *kvlist;
 	int sockfd = -1;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL) {
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index c34c325..7d6dea2 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -2994,6 +2994,7 @@ bond_probe(struct rte_vdev_device *dev)
 	uint8_t bonding_mode, socket_id/*, agg_mode*/;
 	int  arg_count, port_id;
 	uint8_t agg_mode;
+	struct rte_eth_dev *eth_dev;
 
 	if (!dev)
 		return -EINVAL;
@@ -3001,6 +3002,18 @@ bond_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	RTE_LOG(INFO, EAL, "Initializing pmd_bond for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &default_dev_ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev),
 		pmd_bond_init_valid_arguments);
 	if (kvlist == NULL)
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index c499bfb..ea9fdc6 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -294,10 +294,24 @@ static int
 rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
 {
 	const char *name;
+	struct rte_eth_dev *eth_dev;
 
 	name = rte_vdev_device_name(vdev);
 	INFO("Initializing " FAILSAFE_DRIVER_NAME " for %s",
 			name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(vdev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &failsafe_ops;
+		return 0;
+	}
+
 	return fs_eth_dev_create(vdev);
 }
 
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index 1a07089..24909c7 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -405,6 +405,18 @@ eth_kni_probe(struct rte_vdev_device *vdev)
 	params = rte_vdev_device_args(vdev);
 	RTE_LOG(INFO, PMD, "Initializing eth_kni for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &eth_kni_ops;
+		return 0;
+	}
+
 	ret = eth_kni_kvargs_process(&args, params);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 98fc60c..53a4b3e 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -597,6 +597,7 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	unsigned packet_size = default_packet_size;
 	unsigned packet_copy = default_packet_copy;
 	struct rte_kvargs *kvlist = NULL;
+	struct rte_eth_dev *eth_dev;
 	int ret;
 
 	if (!dev)
@@ -606,6 +607,18 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	params = rte_vdev_device_args(dev);
 	RTE_LOG(INFO, PMD, "Initializing pmd_null for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	if (params != NULL) {
 		kvlist = rte_kvargs_parse(params, valid_arguments);
 		if (kvlist == NULL)
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index f58f6af..0c81d82 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1200,12 +1200,26 @@ octeontx_probe(struct rte_vdev_device *dev)
 	struct rte_event_dev_config dev_conf;
 	const char *eventdev_name = "event_octeontx";
 	struct rte_event_dev_info info;
+	struct rte_eth_dev *eth_dev;
 
 	struct octeontx_vdev_init_params init_params = {
 		OCTEONTX_VDEV_DEFAULT_MAX_NR_PORT
 	};
 
 	dev_name = rte_vdev_device_name(dev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(dev_name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", dev_name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &octeontx_dev_ops;
+		return 0;
+	}
+
 	res = octeontx_parse_vdev_init_params(&init_params, dev);
 	if (res < 0)
 		return -EINVAL;
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index f9f53ff..8850817 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -898,6 +898,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	struct rte_kvargs *kvlist;
 	struct pmd_devargs pcaps = {0};
 	struct pmd_devargs dumpers = {0};
+	struct rte_eth_dev *eth_dev;
 	int single_iface = 0;
 	int ret;
 
@@ -908,6 +909,18 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	start_cycles = rte_get_timer_cycles();
 	hz = rte_get_timer_hz();
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
 		return -1;
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index b0c1341..e324394 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -725,13 +725,26 @@ pmd_probe(struct rte_vdev_device *vdev)
 	uint16_t hard_port_id;
 	int numa_node;
 	void *dev_private;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(vdev);
 
-	RTE_LOG(INFO, PMD,
-		"Probing device \"%s\"\n",
-		rte_vdev_device_name(vdev));
+	RTE_LOG(INFO, PMD, "Probing device \"%s\"\n", name);
 
 	/* Parse input arguments */
 	params = rte_vdev_device_args(vdev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &pmd_ops;
+		return 0;
+	}
+
 	if (!params)
 		return -EINVAL;
 
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 0fb8be5..4dd8a8c 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1585,10 +1585,23 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	char tap_name[RTE_ETH_NAME_MAX_LEN];
 	char remote_iface[RTE_ETH_NAME_MAX_LEN];
 	int fixed_mac_type = 0;
+	struct rte_eth_dev *eth_dev;
 
 	name = rte_vdev_device_name(dev);
 	params = rte_vdev_device_args(dev);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	speed = ETH_SPEED_NUM_10G;
 	snprintf(tap_name, sizeof(tap_name), "%s%d",
 		 DEFAULT_TAP_NAME, tap_unit++);
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index aa06ab5..d8e1d7f 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1151,9 +1151,22 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 	int client_mode = 0;
 	int dequeue_zero_copy = 0;
 	int iommu_support = 0;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
-- 
2.7.4


* Re: [PATCH 2/4] bus/vdev: bus scan by multi-process channel
  2018-03-04 15:30 ` [PATCH 2/4] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-03-05  9:36   ` Burakov, Anatoly
  2018-03-06  0:50     ` Tan, Jianfeng
  2018-03-07 14:00   ` Burakov, Anatoly
  1 sibling, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-03-05  9:36 UTC (permalink / raw)
  To: Jianfeng Tan, dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit

On 04-Mar-18 3:30 PM, Jianfeng Tan wrote:
> To scan the vdevs in primary, we send request to primary process
> to obtain the names for vdevs.
> 
> Only the name is shared from the primary. In probe(), the device
> driver is supposed to locate (or request more) the detail
> information from the primary.
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> ---

Is there much point in having private vdevs? Granted, I'm not exactly a
heavy user of vdevs, but to me this seems like a way to introduce
more confusion. How do I tell which devices are shared between
processes, and which are private to one process? Can I control which one
I get? To me, it would seem better to just switch all vdevs to being
shared.

-- 
Thanks,
Anatoly


* Re: [PATCH 2/4] bus/vdev: bus scan by multi-process channel
  2018-03-05  9:36   ` Burakov, Anatoly
@ 2018-03-06  0:50     ` Tan, Jianfeng
  0 siblings, 0 replies; 48+ messages in thread
From: Tan, Jianfeng @ 2018-03-06  0:50 UTC (permalink / raw)
  To: Burakov, Anatoly, dev
  Cc: Richardson, Bruce, Ananyev, Konstantin, thomas, maxime.coquelin,
	Yigit, Ferruh

Hi Anatoly,

> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Monday, March 5, 2018 5:37 PM
> To: Tan, Jianfeng; dev@dpdk.org
> Cc: Richardson, Bruce; Ananyev, Konstantin; thomas@monjalon.net;
> maxime.coquelin@redhat.com; Yigit, Ferruh
> Subject: Re: [PATCH 2/4] bus/vdev: bus scan by multi-process channel
> 
> On 04-Mar-18 3:30 PM, Jianfeng Tan wrote:
> > To scan the vdevs in primary, we send request to primary process
> > to obtain the names for vdevs.
> >
> > Only the name is shared from the primary. In probe(), the device
> > driver is supposed to locate (or request more) the detail
> > information from the primary.
> >
> > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > ---
> 
> Is there much point in having private vdevs? Granted, i'm not exactly a
> heavy user of vdev's, but to me this would seem like a way to introduce
> more confusion. How do i tell which devices are shared between
> processes, and which are private to one process? Can i control which one
> do i get? To me it would seem like it would be better to just switch all
> vdevs to being shared.

Yes, that is the final target: to make every vdev shared between the primary and secondary processes.

However, most kinds of vdevs do not support multi-process yet. For those devices:

- If they are first probed in the primary, we share the rte_eth_dev_data with the secondary, so that the secondary can get stats or pdump the port.
- If they are first probed in a secondary, they are mostly used by that secondary process, so we allocate the "port id" exclusively and keep the device private to that secondary process.
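
To make the two cases concrete, here is a minimal, self-contained sketch in plain C. It is not DPDK code: `proc_type`, `probe_action` and `choose_probe_action()` are hypothetical names used only for illustration. The empty-args check mirrors the `strlen(params) == 0` test that the probe functions in this series use to recognise a device learned from the primary by the bus scan.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the EAL process types. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* What a probe function does for a given process type and devargs. */
enum probe_action {
	ACTION_CREATE,            /* parse the args and create a new port */
	ACTION_ATTACH_TO_PRIMARY  /* reuse the port the primary already created */
};

/*
 * A secondary that sees a device with no args learned it from the
 * primary (only the name is shared), so it attaches to the existing
 * port. In every other case the process creates the port itself.
 */
enum probe_action
choose_probe_action(enum proc_type type, const char *args)
{
	assert(args != NULL);

	if (type == PROC_SECONDARY && strlen(args) == 0)
		return ACTION_ATTACH_TO_PRIMARY;
	return ACTION_CREATE;
}
```

A secondary started with explicit vdev arguments therefore still takes the create path, which corresponds to the second case above.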

Thanks,
Jianfeng


* Re: [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately
  2018-03-04 15:30 ` [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately Jianfeng Tan
@ 2018-03-06  6:07   ` Matan Azrad
  2018-03-06  8:55     ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Matan Azrad @ 2018-03-06  6:07 UTC (permalink / raw)
  To: Jianfeng Tan, ferruh.yigit
  Cc: bruce.richardson, konstantin.ananyev, Thomas Monjalon,
	maxime.coquelin, anatoly.burakov, dev

Hi Jianfeng

Please see a comment below.

> From: Jianfeng Tan, Sent: Sunday, March 4, 2018 5:30 PM
> We introduced private rte_eth_dev_data to allow vdev to be created both in
> primary process and secondary process(es). This is not friendly to multi-
> process model, for example, it leads to port id contention issue if two
> processes both find the data entry is free.
> 
> And to get stats of primary vdev in secondary, we must allocate from the
> pre-defined array so that we can find it.
> 
> Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> ---
>  drivers/net/af_packet/rte_eth_af_packet.c | 25 +++++++------------------
>  drivers/net/kni/rte_eth_kni.c             | 13 ++-----------
>  drivers/net/null/rte_eth_null.c           | 17 +++--------------
>  drivers/net/octeontx/octeontx_ethdev.c    | 14 ++------------
>  drivers/net/pcap/rte_eth_pcap.c           | 18 +++---------------
>  drivers/net/tap/rte_eth_tap.c             |  9 +--------
>  drivers/net/vhost/rte_eth_vhost.c         | 17 ++---------------
>  7 files changed, 20 insertions(+), 93 deletions(-)
> 
> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> b/drivers/net/af_packet/rte_eth_af_packet.c
> index 57eccfd..2db692f 100644
> --- a/drivers/net/af_packet/rte_eth_af_packet.c
> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> @@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device
> *dev,
>  		RTE_LOG(ERR, PMD,
>  			"%s: no interface specified for AF_PACKET
> ethdev\n",
>  		        name);
> -		goto error_early;
> +		return -1;
>  	}
> 
>  	RTE_LOG(INFO, PMD,
>  		"%s: creating AF_PACKET-backed ethdev on numa socket
> %u\n",
>  		name, numa_node);
> 
> -	/*
> -	 * now do all data allocation - for eth_dev structure, dummy pci
> driver
> -	 * and internal (private) data
> -	 */
> -	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> -	if (data == NULL)
> -		goto error_early;
> -
>  	*internals = rte_zmalloc_socket(name, sizeof(**internals),
>  	                                0, numa_node);
>  	if (*internals == NULL)
> -		goto error_early;
> +		return -1;
> 
>  	for (q = 0; q < nb_queues; q++) {
>  		(*internals)->rx_queue[q].map = MAP_FAILED; @@ -604,24
> +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
>  		RTE_LOG(ERR, PMD,
>  			"%s: I/F name too long (%s)\n",
>  			name, pair->value);
> -		goto error_early;
> +		return -1;
>  	}
>  	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
>  		RTE_LOG(ERR, PMD,
>  			"%s: ioctl failed (SIOCGIFINDEX)\n",
>  		        name);
> -		goto error_early;
> +		return -1;
>  	}
>  	(*internals)->if_name = strdup(pair->value);
>  	if ((*internals)->if_name == NULL)
> -		goto error_early;
> +		return -1;
>  	(*internals)->if_index = ifr.ifr_ifindex;
> 
>  	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
>  		RTE_LOG(ERR, PMD,
>  			"%s: ioctl failed (SIOCGIFHWADDR)\n",
>  		        name);
> -		goto error_early;
> +		return -1;
>  	}
>  	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data,
> ETH_ALEN);
> 
> @@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device
> *dev,
> 
>  	(*internals)->nb_queues = nb_queues;
> 
> -	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> +	data = (*eth_dev)->data;
>  	data->dev_private = *internals;
>  	data->nb_rx_queues = (uint16_t)nb_queues;
>  	data->nb_tx_queues = (uint16_t)nb_queues;
>  	data->dev_link = pmd_link;
>  	data->mac_addrs = &(*internals)->eth_addr;
> 
> -	(*eth_dev)->data = data;
>  	(*eth_dev)->dev_ops = &ops;
> 
>  	return 0;
> @@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
>  	}
>  	free((*internals)->if_name);
>  	rte_free(*internals);
> -error_early:
> -	rte_free(data);
>  	return -1;
>  }
> 

I think you should remove the freeing of the private rte_eth_dev_data in rte_pmd_af_packet_remove().
This applies to all the vdevs here.
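
To illustrate the concern, here is a small self-contained model in plain C (not DPDK code). All names (`dev_data`, `shared_data`, `dev_allocate()`, `dev_remove()`) are hypothetical. It assumes the per-port data lives in a fixed array rather than in a per-device heap allocation, so the remove path must free only what the driver itself allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PORTS 4

/* Hypothetical model of the per-port data. */
struct dev_data {
	char name[32];
	void *dev_private;	/* driver-owned, heap allocated */
};

/* Stand-in for the pre-defined array that every process can index. */
struct dev_data shared_data[MAX_PORTS];

struct eth_dev {
	struct dev_data *data;	/* points into shared_data[] */
};

/* Take the first free slot; no private copy of the data is made. */
int dev_allocate(struct eth_dev *dev, const char *name, size_t priv_size)
{
	size_t len = strlen(name);

	if (len == 0 || len >= sizeof(shared_data[0].name))
		return -1;

	for (int i = 0; i < MAX_PORTS; i++) {
		if (shared_data[i].name[0] != '\0')
			continue;	/* slot in use */

		void *priv = calloc(1, priv_size);
		if (priv == NULL)
			return -1;

		memcpy(shared_data[i].name, name, len + 1);
		shared_data[i].dev_private = priv;
		dev->data = &shared_data[i];
		return i;
	}
	return -1;
}

/*
 * Free only the driver's own allocation, then release the slot.
 * Calling free(dev->data) here would be a bug: the slot is part of
 * the array, not a heap block.
 */
void dev_remove(struct eth_dev *dev)
{
	assert(dev != NULL && dev->data != NULL);

	free(dev->data->dev_private);
	memset(dev->data, 0, sizeof(*dev->data));
	dev->data = NULL;
}
```

In this model a released slot can be reused by the next allocation, which would not be safe if the remove path had tried to free the slot itself.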

Question:
Does the patch cover all the vdevs that allocate a private rte_eth_dev_data?
If so, it may also solve part of the issue discussed here:
https://dpdk.org/dev/patchwork/patch/34047/


Matan.

> diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
> index dc4e65f..1a07089 100644
> --- a/drivers/net/kni/rte_eth_kni.c
> +++ b/drivers/net/kni/rte_eth_kni.c
> @@ -337,25 +337,17 @@ eth_kni_create(struct rte_vdev_device *vdev,
>  	struct pmd_internals *internals;
>  	struct rte_eth_dev_data *data;
>  	struct rte_eth_dev *eth_dev;
> -	const char *name;
> 
>  	RTE_LOG(INFO, PMD, "Creating kni ethdev on numa socket %u\n",
>  			numa_node);
> 
> -	name = rte_vdev_device_name(vdev);
> -	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> -	if (data == NULL)
> -		return NULL;
> -
>  	/* reserve an ethdev entry */
>  	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*internals));
> -	if (eth_dev == NULL) {
> -		rte_free(data);
> +	if (eth_dev == NULL)
>  		return NULL;
> -	}
> 
>  	internals = eth_dev->data->dev_private;
> -	rte_memcpy(data, eth_dev->data, sizeof(*data));
> +	data = eth_dev->data;
>  	data->nb_rx_queues = 1;
>  	data->nb_tx_queues = 1;
>  	data->dev_link = pmd_link;
> @@ -363,7 +355,6 @@ eth_kni_create(struct rte_vdev_device *vdev,
> 
>  	eth_random_addr(internals->eth_addr.addr_bytes);
> 
> -	eth_dev->data = data;
>  	eth_dev->dev_ops = &eth_kni_ops;
> 
>  	internals->no_request_thread = args->no_request_thread; diff --git
> a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c index
> d003b28..98fc60c 100644
> --- a/drivers/net/null/rte_eth_null.c
> +++ b/drivers/net/null/rte_eth_null.c
> @@ -496,7 +496,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,  {
>  	const unsigned nb_rx_queues = 1;
>  	const unsigned nb_tx_queues = 1;
> -	struct rte_eth_dev_data *data = NULL;
> +	struct rte_eth_dev_data *data;
>  	struct pmd_internals *internals = NULL;
>  	struct rte_eth_dev *eth_dev = NULL;
> 
> @@ -513,19 +513,9 @@ eth_dev_null_create(struct rte_vdev_device *dev,
>  	RTE_LOG(INFO, PMD, "Creating null ethdev on numa socket %u\n",
>  		dev->device.numa_node);
> 
> -	/* now do all data allocation - for eth_dev structure, dummy pci
> driver
> -	 * and internal (private) data
> -	 */
> -	data = rte_zmalloc_socket(rte_vdev_device_name(dev),
> sizeof(*data), 0,
> -		dev->device.numa_node);
> -	if (!data)
> -		return -ENOMEM;
> -
>  	eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internals));
> -	if (!eth_dev) {
> -		rte_free(data);
> +	if (!eth_dev)
>  		return -ENOMEM;
> -	}
> 
>  	/* now put it all together
>  	 * - store queue data in internals,
> @@ -546,13 +536,12 @@ eth_dev_null_create(struct rte_vdev_device *dev,
> 
>  	rte_memcpy(internals->rss_key, default_rss_key, 40);
> 
> -	rte_memcpy(data, eth_dev->data, sizeof(*data));
> +	data = eth_dev->data;
>  	data->nb_rx_queues = (uint16_t)nb_rx_queues;
>  	data->nb_tx_queues = (uint16_t)nb_tx_queues;
>  	data->dev_link = pmd_link;
>  	data->mac_addrs = &eth_addr;
> 
> -	eth_dev->data = data;
>  	eth_dev->dev_ops = &ops;
> 
>  	/* finally assign rx and tx ops */
> diff --git a/drivers/net/octeontx/octeontx_ethdev.c
> b/drivers/net/octeontx/octeontx_ethdev.c
> index b739c0b..f58f6af 100644
> --- a/drivers/net/octeontx/octeontx_ethdev.c
> +++ b/drivers/net/octeontx/octeontx_ethdev.c
> @@ -1039,7 +1039,7 @@ octeontx_create(struct rte_vdev_device *dev, int
> port, uint8_t evdev,
>  	char octtx_name[OCTEONTX_MAX_NAME_LEN];
>  	struct octeontx_nic *nic = NULL;
>  	struct rte_eth_dev *eth_dev = NULL;
> -	struct rte_eth_dev_data *data = NULL;
> +	struct rte_eth_dev_data *data;
>  	const char *name = rte_vdev_device_name(dev);
> 
>  	PMD_INIT_FUNC_TRACE();
> @@ -1055,13 +1055,6 @@ octeontx_create(struct rte_vdev_device *dev, int
> port, uint8_t evdev,
>  		return 0;
>  	}
> 
> -	data = rte_zmalloc_socket(octtx_name, sizeof(*data), 0, socket_id);
> -	if (data == NULL) {
> -		octeontx_log_err("failed to allocate devdata");
> -		res = -ENOMEM;
> -		goto err;
> -	}
> -
>  	nic = rte_zmalloc_socket(octtx_name, sizeof(*nic), 0, socket_id);
>  	if (nic == NULL) {
>  		octeontx_log_err("failed to allocate nic structure"); @@ -
> 1097,11 +1090,9 @@ octeontx_create(struct rte_vdev_device *dev, int port,
> uint8_t evdev,
>  	eth_dev->data->kdrv = RTE_KDRV_NONE;
>  	eth_dev->data->numa_node = dev->device.numa_node;
> 
> -	rte_memcpy(data, (eth_dev)->data, sizeof(*data));
> +	data = eth_dev->data;
>  	data->dev_private = nic;
> -
>  	data->port_id = eth_dev->data->port_id;
> -	snprintf(data->name, sizeof(data->name), "%s", eth_dev->data-
> >name);
> 
>  	nic->ev_queues = 1;
>  	nic->ev_ports = 1;
> @@ -1120,7 +1111,6 @@ octeontx_create(struct rte_vdev_device *dev, int
> port, uint8_t evdev,
>  		goto err;
>  	}
> 
> -	eth_dev->data = data;
>  	eth_dev->dev_ops = &octeontx_dev_ops;
> 
>  	/* Finally save ethdev pointer to the NIC structure */ diff --git
> a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
> index c1571e1..f9f53ff 100644
> --- a/drivers/net/pcap/rte_eth_pcap.c
> +++ b/drivers/net/pcap/rte_eth_pcap.c
> @@ -773,27 +773,16 @@ pmd_init_internals(struct rte_vdev_device *vdev,
>  		struct pmd_internals **internals,
>  		struct rte_eth_dev **eth_dev)
>  {
> -	struct rte_eth_dev_data *data = NULL;
> +	struct rte_eth_dev_data *data;
>  	unsigned int numa_node = vdev->device.numa_node;
> -	const char *name;
> 
> -	name = rte_vdev_device_name(vdev);
>  	RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket
> %d\n",
>  		numa_node);
> 
> -	/* now do all data allocation - for eth_dev structure
> -	 * and internal (private) data
> -	 */
> -	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> -	if (data == NULL)
> -		return -1;
> -
>  	/* reserve an ethdev entry */
>  	*eth_dev = rte_eth_vdev_allocate(vdev, sizeof(**internals));
> -	if (*eth_dev == NULL) {
> -		rte_free(data);
> +	if (*eth_dev == NULL)
>  		return -1;
> -	}
> 
>  	/* now put it all together
>  	 * - store queue data in internals,
> @@ -802,7 +791,7 @@ pmd_init_internals(struct rte_vdev_device *vdev,
>  	 * - and point eth_dev structure to new eth_dev_data structure
>  	 */
>  	*internals = (*eth_dev)->data->dev_private;
> -	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> +	data = (*eth_dev)->data;
>  	data->nb_rx_queues = (uint16_t)nb_rx_queues;
>  	data->nb_tx_queues = (uint16_t)nb_tx_queues;
>  	data->dev_link = pmd_link;
> @@ -812,7 +801,6 @@ pmd_init_internals(struct rte_vdev_device *vdev,
>  	 * NOTE: we'll replace the data element, of originally allocated
>  	 * eth_dev so the rings are local per-process
>  	 */
> -	(*eth_dev)->data = data;
>  	(*eth_dev)->dev_ops = &ops;
> 
>  	return 0;
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index f09db0e..0fb8be5 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -1348,12 +1348,6 @@ eth_dev_tap_create(struct rte_vdev_device
> *vdev, char *tap_name,
> 
>  	RTE_LOG(DEBUG, PMD, "  TAP device on numa %u\n",
> rte_socket_id());
> 
> -	data = rte_zmalloc_socket(tap_name, sizeof(*data), 0, numa_node);
> -	if (!data) {
> -		RTE_LOG(ERR, PMD, "TAP Failed to allocate data\n");
> -		goto error_exit_nodev;
> -	}
> -
>  	dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
>  	if (!dev) {
>  		RTE_LOG(ERR, PMD, "TAP Unable to allocate device
> struct\n"); @@ -1373,7 +1367,7 @@ eth_dev_tap_create(struct
> rte_vdev_device *vdev, char *tap_name,
>  	}
> 
>  	/* Setup some default values */
> -	rte_memcpy(data, dev->data, sizeof(*data));
> +	data = dev->data;
>  	data->dev_private = pmd;
>  	data->dev_flags = RTE_ETH_DEV_INTR_LSC;
>  	data->numa_node = numa_node;
> @@ -1384,7 +1378,6 @@ eth_dev_tap_create(struct rte_vdev_device
> *vdev, char *tap_name,
>  	data->nb_rx_queues = 0;
>  	data->nb_tx_queues = 0;
> 
> -	dev->data = data;
>  	dev->dev_ops = &ops;
>  	dev->rx_pkt_burst = pmd_rx_burst;
>  	dev->tx_pkt_burst = pmd_tx_burst;
> diff --git a/drivers/net/vhost/rte_eth_vhost.c
> b/drivers/net/vhost/rte_eth_vhost.c
> index 3aae01c..aa06ab5 100644
> --- a/drivers/net/vhost/rte_eth_vhost.c
> +++ b/drivers/net/vhost/rte_eth_vhost.c
> @@ -1016,7 +1016,7 @@ eth_dev_vhost_create(struct rte_vdev_device
> *dev, char *iface_name,
>  	int16_t queues, const unsigned int numa_node, uint64_t flags)  {
>  	const char *name = rte_vdev_device_name(dev);
> -	struct rte_eth_dev_data *data = NULL;
> +	struct rte_eth_dev_data *data;
>  	struct pmd_internal *internal = NULL;
>  	struct rte_eth_dev *eth_dev = NULL;
>  	struct ether_addr *eth_addr = NULL;
> @@ -1026,13 +1026,6 @@ eth_dev_vhost_create(struct rte_vdev_device
> *dev, char *iface_name,
>  	RTE_LOG(INFO, PMD, "Creating VHOST-USER backend on numa
> socket %u\n",
>  		numa_node);
> 
> -	/* now do all data allocation - for eth_dev structure and internal
> -	 * (private) data
> -	 */
> -	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> -	if (data == NULL)
> -		goto error;
> -
>  	list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
>  	if (list == NULL)
>  		goto error;
> @@ -1074,12 +1067,7 @@ eth_dev_vhost_create(struct rte_vdev_device
> *dev, char *iface_name,
>  	rte_spinlock_init(&vring_state->lock);
>  	vring_states[eth_dev->data->port_id] = vring_state;
> 
> -	/* We'll replace the 'data' originally allocated by eth_dev. So the
> -	 * vhost PMD resources won't be shared between multi processes.
> -	 */
> -	rte_memcpy(data, eth_dev->data, sizeof(*data));
> -	eth_dev->data = data;
> -
> +	data = eth_dev->data;
>  	data->nb_rx_queues = queues;
>  	data->nb_tx_queues = queues;
>  	internal->max_queues = queues;
> @@ -1120,7 +1108,6 @@ eth_dev_vhost_create(struct rte_vdev_device
> *dev, char *iface_name,
>  		rte_eth_dev_release_port(eth_dev);
>  	rte_free(internal);
>  	rte_free(list);
> -	rte_free(data);
> 
>  	return -1;
>  }
> --
> 2.7.4


* Re: [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately
  2018-03-06  6:07   ` Matan Azrad
@ 2018-03-06  8:55     ` Tan, Jianfeng
  2018-03-07  6:00       ` Matan Azrad
  0 siblings, 1 reply; 48+ messages in thread
From: Tan, Jianfeng @ 2018-03-06  8:55 UTC (permalink / raw)
  To: Matan Azrad, Yigit, Ferruh
  Cc: Richardson, Bruce, Ananyev, Konstantin, Thomas Monjalon,
	maxime.coquelin, Burakov, Anatoly, dev



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Tuesday, March 6, 2018 2:08 PM
> To: Tan, Jianfeng; Yigit, Ferruh
> Cc: Richardson, Bruce; Ananyev, Konstantin; Thomas Monjalon;
> maxime.coquelin@redhat.com; Burakov, Anatoly; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate
> rte_eth_dev_data privately
> 
> Hi Jianfeng
> 
> Please see a comment below.
> 
> > From: Jianfeng Tan, Sent: Sunday, March 4, 2018 5:30 PM
> > We introduced private rte_eth_dev_data to allow vdev to be created both
> in
> > primary process and secondary process(es). This is not friendly to multi-
> > process model, for example, it leads to port id contention issue if two
> > processes both find the data entry is free.
> >
> > And to get stats of primary vdev in secondary, we must allocate from the
> > pre-defined array so that we can find it.
> >
> > Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
> > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > ---
> >  drivers/net/af_packet/rte_eth_af_packet.c | 25 +++++++------------------
> >  drivers/net/kni/rte_eth_kni.c             | 13 ++-----------
> >  drivers/net/null/rte_eth_null.c           | 17 +++--------------
> >  drivers/net/octeontx/octeontx_ethdev.c    | 14 ++------------
> >  drivers/net/pcap/rte_eth_pcap.c           | 18 +++---------------
> >  drivers/net/tap/rte_eth_tap.c             |  9 +--------
> >  drivers/net/vhost/rte_eth_vhost.c         | 17 ++---------------
> >  7 files changed, 20 insertions(+), 93 deletions(-)
> >
> > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> > b/drivers/net/af_packet/rte_eth_af_packet.c
> > index 57eccfd..2db692f 100644
> > --- a/drivers/net/af_packet/rte_eth_af_packet.c
> > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> > @@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device
> > *dev,
> >  		RTE_LOG(ERR, PMD,
> >  			"%s: no interface specified for AF_PACKET
> > ethdev\n",
> >  		        name);
> > -		goto error_early;
> > +		return -1;
> >  	}
> >
> >  	RTE_LOG(INFO, PMD,
> >  		"%s: creating AF_PACKET-backed ethdev on numa socket
> > %u\n",
> >  		name, numa_node);
> >
> > -	/*
> > -	 * now do all data allocation - for eth_dev structure, dummy pci
> > driver
> > -	 * and internal (private) data
> > -	 */
> > -	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> > -	if (data == NULL)
> > -		goto error_early;
> > -
> >  	*internals = rte_zmalloc_socket(name, sizeof(**internals),
> >  	                                0, numa_node);
> >  	if (*internals == NULL)
> > -		goto error_early;
> > +		return -1;
> >
> >  	for (q = 0; q < nb_queues; q++) {
> >  		(*internals)->rx_queue[q].map = MAP_FAILED; @@ -604,24
> > +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
> >  		RTE_LOG(ERR, PMD,
> >  			"%s: I/F name too long (%s)\n",
> >  			name, pair->value);
> > -		goto error_early;
> > +		return -1;
> >  	}
> >  	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
> >  		RTE_LOG(ERR, PMD,
> >  			"%s: ioctl failed (SIOCGIFINDEX)\n",
> >  		        name);
> > -		goto error_early;
> > +		return -1;
> >  	}
> >  	(*internals)->if_name = strdup(pair->value);
> >  	if ((*internals)->if_name == NULL)
> > -		goto error_early;
> > +		return -1;
> >  	(*internals)->if_index = ifr.ifr_ifindex;
> >
> >  	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
> >  		RTE_LOG(ERR, PMD,
> >  			"%s: ioctl failed (SIOCGIFHWADDR)\n",
> >  		        name);
> > -		goto error_early;
> > +		return -1;
> >  	}
> >  	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data,
> > ETH_ALEN);
> >
> > @@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device
> > *dev,
> >
> >  	(*internals)->nb_queues = nb_queues;
> >
> > -	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> > +	data = (*eth_dev)->data;
> >  	data->dev_private = *internals;
> >  	data->nb_rx_queues = (uint16_t)nb_queues;
> >  	data->nb_tx_queues = (uint16_t)nb_queues;
> >  	data->dev_link = pmd_link;
> >  	data->mac_addrs = &(*internals)->eth_addr;
> >
> > -	(*eth_dev)->data = data;
> >  	(*eth_dev)->dev_ops = &ops;
> >
> >  	return 0;
> > @@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device
> *dev,
> >  	}
> >  	free((*internals)->if_name);
> >  	rte_free(*internals);
> > -error_early:
> > -	rte_free(data);
> >  	return -1;
> >  }
> >
> 
> I think you should remove the private rte_eth_dev_data freeing in
> rte_pmd_af_packet_remove().
> This is relevant to all the vdevs here.

Ah, yes, you are correct. I will fix that in v2.

> 
> Question:
> Does the patch include all the vdevs which allocated private
> rte_eth_dev_data?

Yes, we are removing all private rte_eth_dev_data. If I missed a device, please point it out.

> If so, it may solve also part of the issue discussed here:
> https://dpdk.org/dev/patchwork/patch/34047/

Yes, it is related. We now allocate rte_eth_dev_data from the array that all primary and secondary processes can index.
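
As a rough illustration of why this matters for dpdk-procinfo and dpdk-pdump, here is a single-process model in plain C. It is a sketch only: `port_table`, `port_register()` and `port_lookup()` are hypothetical names, the table stands in for memory visible to every process, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PORTS 4
#define NAME_LEN 32

struct port_data {
	char name[NAME_LEN];
	unsigned long rx_packets;
};

/* Stand-in for the array placed in memory visible to every process. */
struct port_data port_table[MAX_PORTS];

/* Creating side: claim the first free slot; the index is the port id. */
int port_register(const char *name)
{
	assert(name != NULL);

	if (name[0] == '\0' || strlen(name) >= NAME_LEN)
		return -1;

	for (int i = 0; i < MAX_PORTS; i++) {
		if (port_table[i].name[0] == '\0') {
			strcpy(port_table[i].name, name);
			return i;
		}
	}
	return -1;
}

/* Attaching side: find the port by name; no private copy is involved. */
struct port_data *port_lookup(const char *name)
{
	assert(name != NULL);

	for (int i = 0; i < MAX_PORTS; i++) {
		if (port_table[i].name[0] != '\0' &&
		    strcmp(port_table[i].name, name) == 0)
			return &port_table[i];
	}
	return NULL;
}
```

With a private copy of the data, the lookup by name would find nothing, or a stale entry, because the counters would be updated somewhere the other process cannot see.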

Thanks,
Jianfeng


* Re: [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately
  2018-03-06  8:55     ` Tan, Jianfeng
@ 2018-03-07  6:00       ` Matan Azrad
  2018-03-07  6:10         ` Matan Azrad
  0 siblings, 1 reply; 48+ messages in thread
From: Matan Azrad @ 2018-03-07  6:00 UTC (permalink / raw)
  To: Tan, Jianfeng, Yigit, Ferruh
  Cc: Richardson, Bruce, Ananyev, Konstantin, Thomas Monjalon,
	maxime.coquelin, Burakov, Anatoly, dev

Hi Jianfeng

From: Tan, Jianfeng, Sent: Tuesday, March 6, 2018 10:56 AM
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Tuesday, March 6, 2018 2:08 PM
> > To: Tan, Jianfeng; Yigit, Ferruh
> > Cc: Richardson, Bruce; Ananyev, Konstantin; Thomas Monjalon;
> > maxime.coquelin@redhat.com; Burakov, Anatoly; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate
> > rte_eth_dev_data privately
> >
> > Hi Jianfeng
> >
> > Please see a comment below.
> >
> > > From: Jianfeng Tan, Sent: Sunday, March 4, 2018 5:30 PM We
> > > introduced private rte_eth_dev_data to allow vdev to be created both
> > in
> > > primary process and secondary process(es). This is not friendly to
> > > multi- process model, for example, it leads to port id contention
> > > issue if two processes both find the data entry is free.
> > >
> > > And to get stats of primary vdev in secondary, we must allocate from
> > > the pre-defined array so that we can find it.
> > >
> > > Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
> > > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > > ---
> > >  drivers/net/af_packet/rte_eth_af_packet.c | 25 +++++++------------------
> > >  drivers/net/kni/rte_eth_kni.c             | 13 ++-----------
> > >  drivers/net/null/rte_eth_null.c           | 17 +++--------------
> > >  drivers/net/octeontx/octeontx_ethdev.c    | 14 ++------------
> > >  drivers/net/pcap/rte_eth_pcap.c           | 18 +++---------------
> > >  drivers/net/tap/rte_eth_tap.c             |  9 +--------
> > >  drivers/net/vhost/rte_eth_vhost.c         | 17 ++---------------
> > >  7 files changed, 20 insertions(+), 93 deletions(-)
> > >
> > > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> > > b/drivers/net/af_packet/rte_eth_af_packet.c
> > > index 57eccfd..2db692f 100644
> > > --- a/drivers/net/af_packet/rte_eth_af_packet.c
> > > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> > > @@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device
> > > *dev,
> > >  		RTE_LOG(ERR, PMD,
> > >  			"%s: no interface specified for AF_PACKET
> ethdev\n",
> > >  		        name);
> > > -		goto error_early;
> > > +		return -1;
> > >  	}
> > >
> > >  	RTE_LOG(INFO, PMD,
> > >  		"%s: creating AF_PACKET-backed ethdev on numa socket
> %u\n",
> > >  		name, numa_node);
> > >
> > > -	/*
> > > -	 * now do all data allocation - for eth_dev structure, dummy pci
> > > driver
> > > -	 * and internal (private) data
> > > -	 */
> > > -	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> > > -	if (data == NULL)
> > > -		goto error_early;
> > > -
> > >  	*internals = rte_zmalloc_socket(name, sizeof(**internals),
> > >  	                                0, numa_node);
> > >  	if (*internals == NULL)
> > > -		goto error_early;
> > > +		return -1;
> > >
> > >  	for (q = 0; q < nb_queues; q++) {
> > >  		(*internals)->rx_queue[q].map = MAP_FAILED; @@ -604,24
> > > +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
> > >  		RTE_LOG(ERR, PMD,
> > >  			"%s: I/F name too long (%s)\n",
> > >  			name, pair->value);
> > > -		goto error_early;
> > > +		return -1;
> > >  	}
> > >  	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
> > >  		RTE_LOG(ERR, PMD,
> > >  			"%s: ioctl failed (SIOCGIFINDEX)\n",
> > >  		        name);
> > > -		goto error_early;
> > > +		return -1;
> > >  	}
> > >  	(*internals)->if_name = strdup(pair->value);
> > >  	if ((*internals)->if_name == NULL)
> > > -		goto error_early;
> > > +		return -1;
> > >  	(*internals)->if_index = ifr.ifr_ifindex;
> > >
> > >  	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
> > >  		RTE_LOG(ERR, PMD,
> > >  			"%s: ioctl failed (SIOCGIFHWADDR)\n",
> > >  		        name);
> > > -		goto error_early;
> > > +		return -1;
> > >  	}
> > >  	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data,
> ETH_ALEN);
> > >
> > > @@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device
> > > *dev,
> > >
> > >  	(*internals)->nb_queues = nb_queues;
> > >
> > > -	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> > > +	data = (*eth_dev)->data;
> > >  	data->dev_private = *internals;
> > >  	data->nb_rx_queues = (uint16_t)nb_queues;
> > >  	data->nb_tx_queues = (uint16_t)nb_queues;
> > >  	data->dev_link = pmd_link;
> > >  	data->mac_addrs = &(*internals)->eth_addr;
> > >
> > > -	(*eth_dev)->data = data;
> > >  	(*eth_dev)->dev_ops = &ops;
> > >
> > >  	return 0;
> > > @@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device
> > *dev,
> > >  	}
> > >  	free((*internals)->if_name);
> > >  	rte_free(*internals);
> > > -error_early:
> > > -	rte_free(data);
> > >  	return -1;
> > >  }
> > >
> >
> > I think you should remove the private rte_eth_dev_data freeing in
> > rte_pmd_af_packet_remove().
> > This is relevant to all the vdevs here.
> 
> Ah, yes, you are correct. I will fix that in v2.
> 
> >
> > Question:
> > Does the patch include all the vdevs which allocated private
> > rte_eth_dev_data?
> 
> Yes, we are removing all private rte_eth_dev_data. If I missed some device,
> welcome to point out.

net/ring

> > If so, it may solve also part of the issue discussed here:
> > https://dpdk.org/dev/patchwork/patch/34047/
> 
> Yes, related. We now allocate rte_eth_dev_data which can be indexed by all
> primary/secondary processes.
> 
> Thanks,
> Jianfeng

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately
  2018-03-07  6:00       ` Matan Azrad
@ 2018-03-07  6:10         ` Matan Azrad
  2018-03-12  3:40           ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Matan Azrad @ 2018-03-07  6:10 UTC (permalink / raw)
  To: Tan, Jianfeng, Yigit, Ferruh
  Cc: Richardson, Bruce, Ananyev, Konstantin, Thomas Monjalon,
	maxime.coquelin, Burakov, Anatoly, dev

Hi Jianfeng


From: Matan Azrad, Wednesday, March 7, 2018 8:01 AM
> Hi Jianfeng
> 
> From: Tan, Jianfeng, Sent: Tuesday, March 6, 2018 10:56 AM
> > > -----Original Message-----
> > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > Sent: Tuesday, March 6, 2018 2:08 PM
> > > To: Tan, Jianfeng; Yigit, Ferruh
> > > Cc: Richardson, Bruce; Ananyev, Konstantin; Thomas Monjalon;
> > > maxime.coquelin@redhat.com; Burakov, Anatoly; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate
> > > rte_eth_dev_data privately
> > >
> > > Hi Jianfeng
> > >
> > > Please see a comment below.
> > >
> > > > From: Jianfeng Tan, Sent: Sunday, March 4, 2018 5:30 PM
> > > > We introduced private rte_eth_dev_data to allow vdev to be created
> > > > both in primary process and secondary process(es). This is not
> > > > friendly to multi-process model, for example, it leads to port id
> > > > contention issue if two processes both find the data entry is free.
> > > >
> > > > And to get stats of primary vdev in secondary, we must allocate
> > > > from the pre-defined array so that we can find it.
> > > >
> > > > Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > > > ---
> > > >  drivers/net/af_packet/rte_eth_af_packet.c | 25 +++++++----------------
> --
> > > >  drivers/net/kni/rte_eth_kni.c             | 13 ++-----------
> > > >  drivers/net/null/rte_eth_null.c           | 17 +++--------------
> > > >  drivers/net/octeontx/octeontx_ethdev.c    | 14 ++------------
> > > >  drivers/net/pcap/rte_eth_pcap.c           | 18 +++---------------
> > > >  drivers/net/tap/rte_eth_tap.c             |  9 +--------
> > > >  drivers/net/vhost/rte_eth_vhost.c         | 17 ++---------------
> > > >  7 files changed, 20 insertions(+), 93 deletions(-)
> > > >
> > > > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> > > > b/drivers/net/af_packet/rte_eth_af_packet.c
> > > > index 57eccfd..2db692f 100644
> > > > --- a/drivers/net/af_packet/rte_eth_af_packet.c
> > > > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> > > > @@ -564,25 +564,17 @@ rte_pmd_init_internals(struct
> > > > rte_vdev_device *dev,
> > > >  		RTE_LOG(ERR, PMD,
> > > >  			"%s: no interface specified for AF_PACKET
> > ethdev\n",
> > > >  		        name);
> > > > -		goto error_early;
> > > > +		return -1;
> > > >  	}
> > > >
> > > >  	RTE_LOG(INFO, PMD,
> > > >  		"%s: creating AF_PACKET-backed ethdev on numa socket
> > %u\n",
> > > >  		name, numa_node);
> > > >
> > > > -	/*
> > > > -	 * now do all data allocation - for eth_dev structure, dummy pci
> > > > driver
> > > > -	 * and internal (private) data
> > > > -	 */
> > > > -	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> > > > -	if (data == NULL)
> > > > -		goto error_early;
> > > > -
> > > >  	*internals = rte_zmalloc_socket(name, sizeof(**internals),
> > > >  	                                0, numa_node);
> > > >  	if (*internals == NULL)
> > > > -		goto error_early;
> > > > +		return -1;
> > > >
> > > >  	for (q = 0; q < nb_queues; q++) {
> > > >  		(*internals)->rx_queue[q].map = MAP_FAILED;
> > > > @@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
> > > >  		RTE_LOG(ERR, PMD,
> > > >  			"%s: I/F name too long (%s)\n",
> > > >  			name, pair->value);
> > > > -		goto error_early;
> > > > +		return -1;
> > > >  	}
> > > >  	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
> > > >  		RTE_LOG(ERR, PMD,
> > > >  			"%s: ioctl failed (SIOCGIFINDEX)\n",
> > > >  		        name);
> > > > -		goto error_early;
> > > > +		return -1;
> > > >  	}
> > > >  	(*internals)->if_name = strdup(pair->value);
> > > >  	if ((*internals)->if_name == NULL)
> > > > -		goto error_early;
> > > > +		return -1;
> > > >  	(*internals)->if_index = ifr.ifr_ifindex;
> > > >
> > > >  	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
> > > >  		RTE_LOG(ERR, PMD,
> > > >  			"%s: ioctl failed (SIOCGIFHWADDR)\n",
> > > >  		        name);
> > > > -		goto error_early;
> > > > +		return -1;
> > > >  	}
> > > >  	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data,
> > ETH_ALEN);
> > > >
> > > > @@ -775,14 +767,13 @@ rte_pmd_init_internals(struct
> > > > rte_vdev_device *dev,
> > > >
> > > >  	(*internals)->nb_queues = nb_queues;
> > > >
> > > > -	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> > > > +	data = (*eth_dev)->data;
> > > >  	data->dev_private = *internals;
> > > >  	data->nb_rx_queues = (uint16_t)nb_queues;
> > > >  	data->nb_tx_queues = (uint16_t)nb_queues;
> > > >  	data->dev_link = pmd_link;
> > > >  	data->mac_addrs = &(*internals)->eth_addr;
> > > >
> > > > -	(*eth_dev)->data = data;
> > > >  	(*eth_dev)->dev_ops = &ops;
> > > >
> > > >  	return 0;
> > > > @@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device
> > > *dev,
> > > >  	}
> > > >  	free((*internals)->if_name);
> > > >  	rte_free(*internals);
> > > > -error_early:
> > > > -	rte_free(data);
> > > >  	return -1;
> > > >  }
> > > >
> > >
> > > I think you should remove the private rte_eth_dev_data freeing in
> > > rte_pmd_af_packet_remove().
> > > This is relevant to all the vdevs here.
> >
> > Ah, yes, you are correct. I will fix that in v2.
> >
> > >
> > > Question:
> > > Does the patch include all the vdevs which allocated private
> > > rte_eth_dev_data?
> >
> > Yes, we are removing all private rte_eth_dev_data. If I missed some
> > device, welcome to point out.
> 
> net/ring
> 

What about the following PCI device?

net/cxgbe

> > > If so, it may solve also part of the issue discussed here:
> > > https://dpdk.org/dev/patchwork/patch/34047/
> >
> > Yes, related. We now allocate rte_eth_dev_data which can be indexed by
> > all primary/secondary processes.
> >
> > Thanks,
> > Jianfeng

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/4] bus/vdev: bus scan by multi-process channel
  2018-03-04 15:30 ` [PATCH 2/4] bus/vdev: bus scan by multi-process channel Jianfeng Tan
  2018-03-05  9:36   ` Burakov, Anatoly
@ 2018-03-07 14:00   ` Burakov, Anatoly
  2018-03-12  3:22     ` Tan, Jianfeng
  1 sibling, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-03-07 14:00 UTC (permalink / raw)
  To: Jianfeng Tan, dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit

On 04-Mar-18 3:30 PM, Jianfeng Tan wrote:
> To scan the vdevs in primary, we send request to primary process
> to obtain the names for vdevs.
> 
> Only the name is shared from the primary. In probe(), the device
> driver is supposed to locate (or request more) the detail
> information from the primary.
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> ---

General note - you probably want to synchronize access to the tailq.
Multiple secondaries may initialize, a vdev hotplug event may be in
progress, etc.


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/4] bus/vdev: bus scan by multi-process channel
  2018-03-07 14:00   ` Burakov, Anatoly
@ 2018-03-12  3:22     ` Tan, Jianfeng
  0 siblings, 0 replies; 48+ messages in thread
From: Tan, Jianfeng @ 2018-03-12  3:22 UTC (permalink / raw)
  To: Burakov, Anatoly, dev
  Cc: bruce.richardson, konstantin.ananyev, thomas, maxime.coquelin,
	ferruh.yigit



On 3/7/2018 10:00 PM, Burakov, Anatoly wrote:
> On 04-Mar-18 3:30 PM, Jianfeng Tan wrote:
>> To scan the vdevs in primary, we send request to primary process
>> to obtain the names for vdevs.
>>
>> Only the name is shared from the primary. In probe(), the device
>> driver is supposed to locate (or request more) the detail
>> information from the primary.
>>
>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>> ---
>
> General note - you probably want to synchronize access to the tailq.
> Multiple secondaries may initialize, a vdev hotplug event may be in
> progress, etc.
>

Makes sense, I will change it in the next version.

Thanks,
Jianfeng

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately
  2018-03-07  6:10         ` Matan Azrad
@ 2018-03-12  3:40           ` Tan, Jianfeng
  0 siblings, 0 replies; 48+ messages in thread
From: Tan, Jianfeng @ 2018-03-12  3:40 UTC (permalink / raw)
  To: Matan Azrad, Yigit, Ferruh
  Cc: Richardson, Bruce, Ananyev, Konstantin, Thomas Monjalon,
	maxime.coquelin, Burakov, Anatoly, dev



On 3/7/2018 2:10 PM, Matan Azrad wrote:
>
>>> Yes, we are removing all private rte_eth_dev_data. If I missed some
>>> device, welcome to point out.
>> net/ring

>> What is about next PCI device?
>>
>> net/cxgbe

I will change these two in the next version. Thank you, Matan.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v3 0/5] allow procinfo and pdump on eth vdev
  2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (3 preceding siblings ...)
  2018-03-04 15:30 ` [PATCH 4/4] drivers/net: share vdev data to secondary process Jianfeng Tan
@ 2018-04-19 16:50 ` Jianfeng Tan
  2018-04-19 16:50   ` [PATCH v3 1/5] eal: bring forward multi-process channel init Jianfeng Tan
                     ` (4 more replies)
  2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  6 siblings, 5 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-19 16:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

v3:
  - Update doc.
  - Rebase on master.

v2:
  - Add spinlock for vdev device list as suggested by Anatoly.
  - Add ring, cxgbe and remove the free in each PMDs as suggested by Matan.
  - Rebase on master.

As we know, vdev currently has the following limitations:
  - dpdk-procinfo cannot get the stats of (most) vdevs in the primary process;
  - dpdk-pdump cannot dump the packets of (most) vdevs in the primary process;
  - a secondary process cannot use (most) vdevs in the primary process.

The root cause is that the secondary process does not know that those vdevs
exist: vdevs are chained on a linked list that is not shared with the
secondary.

In this patchset, we would like to propose a vdev sharing model like this:
  - As a secondary process boots, all devices (including vdevs) in the
    primary are automatically shared.
  - After both the primary and secondary processes have booted, a device
    add/remove in the primary will be translated to a device hot
    plug/unplug event in the secondary processes. (TODO)
  - Device add in secondary:
    * If that kind of device supports multi-process, the secondary will
      request the primary to probe the device, and the primary will share
      it with the secondary. A secondary-private device is not necessary
      in this case. (TODO)
    * If that kind of device does not support multi-process, the secondary
      will probe the device by itself, and the port id is shared among
      all primary/secondary processes.

This patchset doesn't:
  - provide secondary data path (Rx/Tx) support for each specific vdev.

How to test:

Step 0: start testpmd with a vhost port and a VM connected to it.

Step 1: try using dpdk-procinfo to get the stats.
 $(dpdk-procinfo) --log-level=8 --no-pci -- --stats

Step 2: try using dpdk-pdump to dump the packets.
 $(dpdk-pdump) -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'


Jianfeng Tan (5):
  eal: bring forward multi-process channel init
  bus/vdev: add lock on vdev device list
  bus/vdev: bus scan by multi-process channel
  drivers/net: not use private eth dev data
  drivers/net: share vdev data to secondary process

 doc/guides/rel_notes/release_18_05.rst    |   5 +
 drivers/bus/vdev/Makefile                 |   1 +
 drivers/bus/vdev/vdev.c                   | 187 ++++++++++++++++++++++++++----
 drivers/net/af_packet/rte_eth_af_packet.c |  43 +++----
 drivers/net/bonding/rte_eth_bond_pmd.c    |  13 +++
 drivers/net/cxgbe/cxgbe_main.c            |   1 -
 drivers/net/failsafe/failsafe.c           |  14 +++
 drivers/net/kni/rte_eth_kni.c             |  26 +++--
 drivers/net/null/rte_eth_null.c           |  32 ++---
 drivers/net/octeontx/octeontx_ethdev.c    |  29 ++---
 drivers/net/pcap/rte_eth_pcap.c           |  32 ++---
 drivers/net/ring/rte_eth_ring.c           |  17 +--
 drivers/net/softnic/rte_eth_softnic.c     |  19 ++-
 drivers/net/tap/rte_eth_tap.c             |  24 ++--
 drivers/net/vhost/rte_eth_vhost.c         |  36 +++---
 lib/librte_eal/bsdapp/eal/eal.c           |  23 ++--
 lib/librte_eal/linuxapp/eal/eal.c         |  23 ++--
 17 files changed, 359 insertions(+), 166 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v3 1/5] eal: bring forward multi-process channel init
  2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
@ 2018-04-19 16:50   ` Jianfeng Tan
  2018-04-20  8:16     ` Burakov, Anatoly
  2018-04-19 16:50   ` [PATCH v3 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-19 16:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

Adjust the init sequence: put mp channel init before bus scan
so that we can init the vdev bus through mp channel in the
secondary process before the bus scan.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c   | 23 +++++++++++++----------
 lib/librte_eal/linuxapp/eal/eal.c | 23 +++++++++++++----------
 2 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d996190..d315cde 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -552,6 +552,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -595,16 +608,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 	/* in secondary processes, memory init may allocate additional fbarrays
 	 * not present in primary processes, so to avoid any potential issues,
 	 * initialize memzones first.
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 21afa73..5b23bf0 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -770,6 +770,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -820,8 +833,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
 	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
 		rte_eal_init_alert("Cannot init logging.");
 		rte_errno = ENOMEM;
@@ -829,14 +840,6 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0) {
 		rte_eal_init_alert("Cannot init VFIO\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 2/5] bus/vdev: add lock on vdev device list
  2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-19 16:50   ` [PATCH v3 1/5] eal: bring forward multi-process channel init Jianfeng Tan
@ 2018-04-19 16:50   ` Jianfeng Tan
  2018-04-20  8:26     ` Burakov, Anatoly
  2018-04-19 16:50   ` [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-19 16:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

As we could add virtual devices from different threads now, we
add a spin lock to protect the vdev device list.
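
The check-then-insert pattern that this patch applies in rte_vdev_init()
can be sketched as below. This is only an illustration under stated
assumptions: it uses a C11 atomic_flag as the spin lock and a hypothetical
singly linked `struct node` list, not rte_spinlock_t or the TAILQ used in
vdev.c. The point is that the duplicate lookup and the insertion happen
under the same lock, so two threads cannot both pass the lookup.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the vdev device list. */
struct node {
	struct node *next;
	char name[32];
};

static struct node *head;
static atomic_flag list_lock = ATOMIC_FLAG_INIT;

static void
lock_list(void)
{
	while (atomic_flag_test_and_set_explicit(&list_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void
unlock_list(void)
{
	atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/* The caller must hold list_lock. */
static struct node *
find_node(const char *name)
{
	struct node *n;

	for (n = head; n != NULL; n = n->next)
		if (strcmp(n->name, name) == 0)
			return n;
	return NULL;
}

/* Returns 0 on success, -EEXIST on a duplicate, -ENOMEM on failure. */
static int
insert_unique(const char *name)
{
	/* Allocate outside the lock to keep the critical section short. */
	struct node *n = calloc(1, sizeof(*n));

	if (n == NULL)
		return -ENOMEM;
	strncpy(n->name, name, sizeof(n->name) - 1);

	lock_list();
	if (find_node(n->name) != NULL) {
		unlock_list();
		free(n);
		return -EEXIST;
	}
	n->next = head;
	head = n;
	unlock_list();
	return 0;
}
```

Calling insert_unique() twice with the same name returns 0 and then
-EEXIST, which mirrors the behaviour rte_vdev_init() keeps after the
lookup is moved under the lock.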

Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 drivers/bus/vdev/vdev.c | 61 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index f8dd1f5..181a15a 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -33,6 +33,8 @@ TAILQ_HEAD(vdev_device_list, rte_vdev_device);
 
 static struct vdev_device_list vdev_device_list =
 	TAILQ_HEAD_INITIALIZER(vdev_device_list);
+static rte_spinlock_t vdev_device_list_lock = RTE_SPINLOCK_INITIALIZER;
+
 struct vdev_driver_list vdev_driver_list =
 	TAILQ_HEAD_INITIALIZER(vdev_driver_list);
 
@@ -149,6 +151,7 @@ vdev_probe_all_drivers(struct rte_vdev_device *dev)
 	return ret;
 }
 
+/* The caller is responsible for thread safety */
 static struct rte_vdev_device *
 find_vdev(const char *name)
 {
@@ -203,10 +206,6 @@ rte_vdev_init(const char *name, const char *args)
 	if (name == NULL)
 		return -EINVAL;
 
-	dev = find_vdev(name);
-	if (dev)
-		return -EEXIST;
-
 	devargs = alloc_devargs(name, args);
 	if (!devargs)
 		return -ENOMEM;
@@ -221,16 +220,28 @@ rte_vdev_init(const char *name, const char *args)
 	dev->device.numa_node = SOCKET_ID_ANY;
 	dev->device.name = devargs->name;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
+	if (find_vdev(name)) {
+		rte_spinlock_unlock(&vdev_device_list_lock);
+		ret = -EEXIST;
+		goto fail;
+	}
+	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+	rte_spinlock_unlock(&vdev_device_list_lock);
+
 	ret = vdev_probe_all_drivers(dev);
 	if (ret) {
 		if (ret > 0)
 			VDEV_LOG(ERR, "no driver found for %s\n", name);
+		/* If fails, remove it from vdev list */
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_REMOVE(&vdev_device_list, dev, next);
+		rte_spinlock_unlock(&vdev_device_list_lock);
 		goto fail;
 	}
 
 	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
 
-	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
 	return 0;
 
 fail:
@@ -266,17 +277,25 @@ rte_vdev_uninit(const char *name)
 	if (name == NULL)
 		return -EINVAL;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
 	dev = find_vdev(name);
-	if (!dev)
+	if (!dev) {
+		rte_spinlock_unlock(&vdev_device_list_lock);
 		return -ENOENT;
+	}
+	TAILQ_REMOVE(&vdev_device_list, dev, next);
+	rte_spinlock_unlock(&vdev_device_list_lock);
 
 	devargs = dev->device.devargs;
 
 	ret = vdev_remove_driver(dev);
-	if (ret)
+	if (ret) {
+		/* If fails, add back to vdev list */
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+		rte_spinlock_unlock(&vdev_device_list_lock);
 		return ret;
-
-	TAILQ_REMOVE(&vdev_device_list, dev, next);
+	}
 
 	TAILQ_REMOVE(&devargs_list, devargs, next);
 
@@ -314,19 +333,25 @@ vdev_scan(void)
 		if (devargs->bus != &rte_vdev_bus)
 			continue;
 
-		dev = find_vdev(devargs->name);
-		if (dev)
-			continue;
-
 		dev = calloc(1, sizeof(*dev));
 		if (!dev)
 			return -1;
 
+		rte_spinlock_lock(&vdev_device_list_lock);
+
+		if (find_vdev(devargs->name)) {
+			rte_spinlock_unlock(&vdev_device_list_lock);
+			free(dev);
+			continue;
+		}
+
 		dev->device.devargs = devargs;
 		dev->device.numa_node = SOCKET_ID_ANY;
 		dev->device.name = devargs->name;
 
 		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+
+		rte_spinlock_unlock(&vdev_device_list_lock);
 	}
 
 	return 0;
@@ -340,6 +365,10 @@ vdev_probe(void)
 
 	/* call the init function for each virtual device */
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
+		/* we don't use the vdev lock here, as it's only used in DPDK
+		 * initialization; and we don't want to hold such a lock when
+		 * we call each driver probe.
+		 */
 
 		if (dev->device.driver)
 			continue;
@@ -360,14 +389,18 @@ vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 {
 	struct rte_vdev_device *dev;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
 		if (start && &dev->device == start) {
 			start = NULL;
 			continue;
 		}
-		if (cmp(&dev->device, data) == 0)
+		if (cmp(&dev->device, data) == 0) {
+			rte_spinlock_unlock(&vdev_device_list_lock);
 			return &dev->device;
+		}
 	}
+	rte_spinlock_unlock(&vdev_device_list_lock);
 	return NULL;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-19 16:50   ` [PATCH v3 1/5] eal: bring forward multi-process channel init Jianfeng Tan
  2018-04-19 16:50   ` [PATCH v3 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
@ 2018-04-19 16:50   ` Jianfeng Tan
  2018-04-20  8:41     ` Burakov, Anatoly
  2018-04-19 16:50   ` [PATCH v3 4/5] drivers/net: not use private eth dev data Jianfeng Tan
  2018-04-19 16:50   ` [PATCH v3 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
  4 siblings, 1 reply; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-19 16:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

To scan the vdevs in the primary, the secondary sends a request to the
primary process to obtain the vdev names.

Only the names are shared by the primary. In probe(), the device
driver is expected to locate (or request) the detailed
information from the primary.
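
The message sequence of one scan can be sketched as below: the primary
emits one VDEV_SCAN_ONE message per device name and then a final
VDEV_SCAN_REP that carries the count. This is a stand-alone illustration,
not the code in this patch: it does not call the rte_mp API, the helper
build_scan_messages() and the NAME_MAX_LEN value are hypothetical, and it
assumes that unnamed devices are meant to be skipped, as the log message
in vdev_action() suggests.

```c
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 64 /* assumed; the patch uses RTE_DEV_NAME_MAX_LEN */

enum { SCAN_REQ = 1, SCAN_ONE = 2, SCAN_REP = 3 };

/* Same layout as struct vdev_param in the patch. */
struct vdev_param {
	int type;
	int num;
	char name[NAME_MAX_LEN];
};

/*
 * Build the messages a primary would emit for one scan request:
 * one SCAN_ONE per named device, then a final SCAN_REP with the count.
 * Returns the number of messages written, or -1 if out[] is too small.
 */
static int
build_scan_messages(const char *const names[], int n_names,
		    struct vdev_param out[], int cap)
{
	int i, sent = 0;

	for (i = 0; i < n_names; i++) {
		if (names[i][0] == '\0')
			continue; /* assumed intent: skip unnamed devices */
		if (sent + 1 >= cap)
			return -1; /* keep room for the final reply */
		out[sent].type = SCAN_ONE;
		out[sent].num = 1;
		/* snprintf always NUL-terminates, unlike strncpy */
		snprintf(out[sent].name, NAME_MAX_LEN, "%s", names[i]);
		sent++;
	}
	if (sent >= cap)
		return -1;
	out[sent].type = SCAN_REP;
	out[sent].num = sent;
	out[sent].name[0] = '\0';
	return sent + 1;
}
```

On the receiving side, a secondary would insert one device per SCAN_ONE
message and could compare the number it received against the count in the
SCAN_REP reply.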

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 drivers/bus/vdev/Makefile |   1 +
 drivers/bus/vdev/vdev.c   | 134 ++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 125 insertions(+), 10 deletions(-)

diff --git a/drivers/bus/vdev/Makefile b/drivers/bus/vdev/Makefile
index 24d424a..bd0bb89 100644
--- a/drivers/bus/vdev/Makefile
+++ b/drivers/bus/vdev/Makefile
@@ -10,6 +10,7 @@ LIB = librte_bus_vdev.a
 
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 
 # versioning export map
 EXPORT_MAP := rte_bus_vdev_version.map
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 181a15a..2074802 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -196,8 +196,8 @@ alloc_devargs(const char *name, const char *args)
 	return devargs;
 }
 
-int
-rte_vdev_init(const char *name, const char *args)
+static int
+insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
 {
 	struct rte_vdev_device *dev;
 	struct rte_devargs *devargs;
@@ -229,6 +229,33 @@ rte_vdev_init(const char *name, const char *args)
 	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
 	rte_spinlock_unlock(&vdev_device_list_lock);
 
+	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
+
+	if (p_dev)
+		*p_dev = dev;
+
+	return 0;
+
+fail:
+	free(devargs->args);
+	free(devargs);
+	free(dev);
+	return ret;
+}
+
+int
+rte_vdev_init(const char *name, const char *args)
+{
+	struct rte_vdev_device *dev;
+	struct rte_devargs *devargs;
+	int ret;
+
+	ret = insert_vdev(name, args, &dev);
+	if (ret < 0)
+		return ret;
+
+	devargs = dev->device.devargs;
+
 	ret = vdev_probe_all_drivers(dev);
 	if (ret) {
 		if (ret > 0)
@@ -237,17 +264,14 @@ rte_vdev_init(const char *name, const char *args)
 		rte_spinlock_lock(&vdev_device_list_lock);
 		TAILQ_REMOVE(&vdev_device_list, dev, next);
 		rte_spinlock_unlock(&vdev_device_list_lock);
-		goto fail;
-	}
 
-	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
+		TAILQ_REMOVE(&devargs_list, devargs, next);
 
-	return 0;
+		free(devargs->args);
+		free(devargs);
+		free(dev);
+	}
 
-fail:
-	free(devargs->args);
-	free(devargs);
-	free(dev);
 	return ret;
 }
 
@@ -305,6 +329,68 @@ rte_vdev_uninit(const char *name)
 	return 0;
 }
 
+struct vdev_param {
+#define VDEV_SCAN_REQ	1
+#define VDEV_SCAN_ONE	2
+#define VDEV_SCAN_REP	3
+	int type;
+	int num;
+	char name[RTE_DEV_NAME_MAX_LEN];
+};
+
+static int vdev_plug(struct rte_device *dev);
+
+static int
+vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_vdev_device *dev;
+	struct rte_mp_msg mp_resp;
+	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
+	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
+	const char *devname;
+	int num;
+
+	strcpy(mp_resp.name, "vdev");
+	mp_resp.len_param = sizeof(*ou);
+	mp_resp.num_fds = 0;
+
+	switch (in->type) {
+	case VDEV_SCAN_REQ:
+		ou->type = VDEV_SCAN_ONE;
+		ou->num = 1;
+		num = 0;
+
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_FOREACH(dev, &vdev_device_list, next) {
+			devname = rte_vdev_device_name(dev);
+			if (strlen(devname) == 0)
+				VDEV_LOG(INFO, "vdev with no name is not sent");
+			VDEV_LOG(INFO, "send vdev, %s", devname);
+			strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
+			if (rte_mp_sendmsg(&mp_resp) < 0)
+				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
+					 devname, strerror(rte_errno));
+			num++;
+		}
+		rte_spinlock_unlock(&vdev_device_list_lock);
+
+		ou->type = VDEV_SCAN_REP;
+		ou->num = num;
+		if (rte_mp_reply(&mp_resp, peer) < 0)
+			VDEV_LOG(ERR, "Failed to reply a scan request");
+		break;
+	case VDEV_SCAN_ONE:
+		VDEV_LOG(INFO, "receive vdev, %s", in->name);
+		if (insert_vdev(in->name, NULL, NULL) < 0)
+			VDEV_LOG(ERR, "failed to add vdev, %s", in->name);
+		break;
+	default:
+		VDEV_LOG(ERR, "vdev cannot recognize this message");
+	}
+
+	return 0;
+}
+
 static int
 vdev_scan(void)
 {
@@ -312,6 +398,34 @@ vdev_scan(void)
 	struct rte_devargs *devargs;
 	struct vdev_custom_scan *custom_scan;
 
+	if (rte_mp_action_register("vdev", vdev_action) < 0 &&
+	    rte_errno != EEXIST) {
+		VDEV_LOG(ERR, "vdev fails to add action");
+		return -1;
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		struct rte_mp_msg mp_req, *mp_rep;
+		struct rte_mp_reply mp_reply;
+		struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+		struct vdev_param *req = (struct vdev_param *)mp_req.param;
+		struct vdev_param *resp;
+
+		strcpy(mp_req.name, "vdev");
+		mp_req.len_param = sizeof(*req);
+		mp_req.num_fds = 0;
+		req->type = VDEV_SCAN_REQ;
+		if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0 &&
+		    mp_reply.nb_received == 1) {
+			mp_rep = &mp_reply.msgs[0];
+			resp = (struct vdev_param *)mp_rep->param;
+			VDEV_LOG(INFO, "Received %d vdevs", resp->num);
+		} else
+			VDEV_LOG(ERR, "Failed to request vdev from primary");
+
+		/* Fall through to allow private vdevs in secondary process */
+	}
+
 	/* call custom scan callbacks if any */
 	rte_spinlock_lock(&vdev_custom_scan_lock);
 	TAILQ_FOREACH(custom_scan, &vdev_custom_scans, next) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 4/5] drivers/net: not use private eth dev data
  2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                     ` (2 preceding siblings ...)
  2018-04-19 16:50   ` [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-04-19 16:50   ` Jianfeng Tan
  2018-04-19 16:50   ` [PATCH v3 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
  4 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-19 16:50 UTC (permalink / raw)
  To: dev
  Cc: thomas, Jianfeng Tan, John W . Linville, Ferruh Yigit,
	Tetsuya Mukawa, Santosh Shukla, Jerin Jacob, Pascal Mazon,
	Maxime Coquelin, Bruce Richardson, Rahul Lakkireddy

We introduced private rte_eth_dev_data to allow a vdev to be created
in both the primary process and secondary process(es). This is not
friendly to the multi-process model; for example, it leads to port id
contention if two processes both find that the same data entry is free.

Also, to get the stats of a primary vdev in a secondary, the data must be
allocated from the pre-defined array so that the secondary can find it.
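
Why allocating from one pre-defined array helps can be sketched as below.
This is a simplified model, not the ethdev implementation: port_table,
port_alloc() and port_lookup() are hypothetical names, the array is a
plain static rather than shared memory, and the cross-process lock that a
real allocator needs around port_alloc() is omitted. Because the port id
is simply the index into the one array, any process that can see the
array can find a device by name.

```c
#include <string.h>

#define MAX_PORTS 32 /* assumed size for the sketch */
#define NAME_LEN  64

struct port_data {
	char name[NAME_LEN]; /* an empty name marks a free slot */
};

/* In a real multi-process setup this array would live in shared memory. */
static struct port_data port_table[MAX_PORTS];

/* Return the port id that holds `name`, or -1 if it is not present. */
static int
port_lookup(const char *name)
{
	int i;

	for (i = 0; i < MAX_PORTS; i++)
		if (port_table[i].name[0] != '\0' &&
		    strcmp(port_table[i].name, name) == 0)
			return i;
	return -1;
}

/*
 * Claim the first free slot for `name`. Returns the port id, or -1 if
 * the name is invalid, already present, or the table is full.
 */
static int
port_alloc(const char *name)
{
	int i;

	if (name[0] == '\0' || strlen(name) >= NAME_LEN)
		return -1;
	if (port_lookup(name) >= 0)
		return -1;
	for (i = 0; i < MAX_PORTS; i++) {
		if (port_table[i].name[0] == '\0') {
			strcpy(port_table[i].name, name); /* length checked */
			return i;
		}
	}
	return -1;
}
```

With privately allocated data, a lookup such as port_lookup() in a
secondary process would never see the entry created by the primary, which
is the limitation this patch removes.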

Cc: John W. Linville <linville@tuxdriver.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Tetsuya Mukawa <mtetsuyah@gmail.com>
Cc: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Cc: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Cc: Pascal Mazon <pascal.mazon@6wind.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>
Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 drivers/net/af_packet/rte_eth_af_packet.c | 26 +++++++-------------------
 drivers/net/cxgbe/cxgbe_main.c            |  1 -
 drivers/net/kni/rte_eth_kni.c             | 14 ++------------
 drivers/net/null/rte_eth_null.c           | 19 ++++---------------
 drivers/net/octeontx/octeontx_ethdev.c    | 15 ++-------------
 drivers/net/pcap/rte_eth_pcap.c           | 19 +++----------------
 drivers/net/ring/rte_eth_ring.c           | 17 +----------------
 drivers/net/tap/rte_eth_tap.c             | 11 +----------
 drivers/net/vhost/rte_eth_vhost.c         | 19 ++-----------------
 9 files changed, 22 insertions(+), 119 deletions(-)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 57eccfd..110e8a5 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: no interface specified for AF_PACKET ethdev\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 
 	RTE_LOG(INFO, PMD,
 		"%s: creating AF_PACKET-backed ethdev on numa socket %u\n",
 		name, numa_node);
 
-	/*
-	 * now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error_early;
-
 	*internals = rte_zmalloc_socket(name, sizeof(**internals),
 	                                0, numa_node);
 	if (*internals == NULL)
-		goto error_early;
+		return -1;
 
 	for (q = 0; q < nb_queues; q++) {
 		(*internals)->rx_queue[q].map = MAP_FAILED;
@@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: I/F name too long (%s)\n",
 			name, pair->value);
-		goto error_early;
+		return -1;
 	}
 	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFINDEX)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	(*internals)->if_name = strdup(pair->value);
 	if ((*internals)->if_name == NULL)
-		goto error_early;
+		return -1;
 	(*internals)->if_index = ifr.ifr_ifindex;
 
 	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFHWADDR)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
 
@@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 
 	(*internals)->nb_queues = nb_queues;
 
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->dev_private = *internals;
 	data->nb_rx_queues = (uint16_t)nb_queues;
 	data->nb_tx_queues = (uint16_t)nb_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &(*internals)->eth_addr;
 
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 	}
 	free((*internals)->if_name);
 	rte_free(*internals);
-error_early:
-	rte_free(data);
 	return -1;
 }
 
@@ -985,7 +974,6 @@ rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
 	free(internals->if_name);
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index c786a1a..74bccd5 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -29,7 +29,6 @@
 #include <rte_ether.h>
 #include <rte_ethdev_driver.h>
 #include <rte_ethdev_pci.h>
-#include <rte_malloc.h>
 #include <rte_random.h>
 #include <rte_dev.h>
 #include <rte_kvargs.h>
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index c10e970..b7897b6 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -336,25 +336,17 @@ eth_kni_create(struct rte_vdev_device *vdev,
 	struct pmd_internals *internals;
 	struct rte_eth_dev_data *data;
 	struct rte_eth_dev *eth_dev;
-	const char *name;
 
 	RTE_LOG(INFO, PMD, "Creating kni ethdev on numa socket %u\n",
 			numa_node);
 
-	name = rte_vdev_device_name(vdev);
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return NULL;
-
 	/* reserve an ethdev entry */
 	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*internals));
-	if (eth_dev == NULL) {
-		rte_free(data);
+	if (!eth_dev)
 		return NULL;
-	}
 
 	internals = eth_dev->data->dev_private;
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = 1;
 	data->nb_tx_queues = 1;
 	data->dev_link = pmd_link;
@@ -362,7 +354,6 @@ eth_kni_create(struct rte_vdev_device *vdev,
 
 	eth_random_addr(internals->eth_addr.addr_bytes);
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &eth_kni_ops;
 
 	internals->no_request_thread = args->no_request_thread;
@@ -458,7 +449,6 @@ eth_kni_remove(struct rte_vdev_device *vdev)
 	rte_kni_release(internals->kni);
 
 	rte_free(internals);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 74dde95..7d89a32 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -495,7 +495,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 {
 	const unsigned nb_rx_queues = 1;
 	const unsigned nb_tx_queues = 1;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internals *internals = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 
@@ -512,19 +512,10 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 	RTE_LOG(INFO, PMD, "Creating null ethdev on numa socket %u\n",
 		dev->device.numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(rte_vdev_device_name(dev), sizeof(*data), 0,
-		dev->device.numa_node);
-	if (!data)
-		return -ENOMEM;
-
 	eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internals));
-	if (!eth_dev) {
-		rte_free(data);
+	if (!eth_dev)
 		return -ENOMEM;
-	}
+
 	/* now put it all together
 	 * - store queue data in internals,
 	 * - store numa_node info in ethdev data
@@ -545,13 +536,12 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 
 	rte_memcpy(internals->rss_key, default_rss_key, 40);
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->eth_addr;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 
 	/* finally assign rx and tx ops */
@@ -669,7 +659,6 @@ rte_pmd_null_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index 6d67d25..ee06cd3 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1068,7 +1068,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	char octtx_name[OCTEONTX_MAX_NAME_LEN];
 	struct octeontx_nic *nic = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	const char *name = rte_vdev_device_name(dev);
 
 	PMD_INIT_FUNC_TRACE();
@@ -1084,13 +1084,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		return 0;
 	}
 
-	data = rte_zmalloc_socket(octtx_name, sizeof(*data), 0, socket_id);
-	if (data == NULL) {
-		octeontx_log_err("failed to allocate devdata");
-		res = -ENOMEM;
-		goto err;
-	}
-
 	nic = rte_zmalloc_socket(octtx_name, sizeof(*nic), 0, socket_id);
 	if (nic == NULL) {
 		octeontx_log_err("failed to allocate nic structure");
@@ -1126,11 +1119,9 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	eth_dev->data->kdrv = RTE_KDRV_NONE;
 	eth_dev->data->numa_node = dev->device.numa_node;
 
-	rte_memcpy(data, (eth_dev)->data, sizeof(*data));
+	data = eth_dev->data;
 	data->dev_private = nic;
-
 	data->port_id = eth_dev->data->port_id;
-	snprintf(data->name, sizeof(data->name), "%s", eth_dev->data->name);
 
 	nic->ev_queues = 1;
 	nic->ev_ports = 1;
@@ -1149,7 +1140,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		goto err;
 	}
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &octeontx_dev_ops;
 
 	/* Finally save ethdev pointer to the NIC structure */
@@ -1217,7 +1207,6 @@ octeontx_remove(struct rte_vdev_device *dev)
 
 		rte_free(eth_dev->data->mac_addrs);
 		rte_free(eth_dev->data->dev_private);
-		rte_free(eth_dev->data);
 		rte_eth_dev_release_port(eth_dev);
 		rte_event_dev_close(nic->evdev);
 	}
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index c1571e1..8740d52 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -773,27 +773,16 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 		struct pmd_internals **internals,
 		struct rte_eth_dev **eth_dev)
 {
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	unsigned int numa_node = vdev->device.numa_node;
-	const char *name;
 
-	name = rte_vdev_device_name(vdev);
 	RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket %d\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return -1;
-
 	/* reserve an ethdev entry */
 	*eth_dev = rte_eth_vdev_allocate(vdev, sizeof(**internals));
-	if (*eth_dev == NULL) {
-		rte_free(data);
+	if (!(*eth_dev))
 		return -1;
-	}
 
 	/* now put it all together
 	 * - store queue data in internals,
@@ -802,7 +791,7 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
 	*internals = (*eth_dev)->data->dev_private;
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
@@ -812,7 +801,6 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * NOTE: we'll replace the data element, of originally allocated
 	 * eth_dev so the rings are local per-process
 	 */
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -1020,7 +1008,6 @@ pmd_pcap_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index df13c44..e53823a 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -259,15 +259,6 @@ do_eth_dev_ring_create(const char *name,
 	RTE_LOG(INFO, PMD, "Creating rings-backed ethdev on numa socket %u\n",
 			numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL) {
-		rte_errno = ENOMEM;
-		goto error;
-	}
-
 	rx_queues_local = rte_zmalloc_socket(name,
 			sizeof(void *) * nb_rx_queues, 0, numa_node);
 	if (rx_queues_local == NULL) {
@@ -301,10 +292,8 @@ do_eth_dev_ring_create(const char *name,
 	 * - point eth_dev_data to internals
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
-	/* NOTE: we'll replace the data element, of originally allocated eth_dev
-	 * so the rings are local per-process */
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->rx_queues = rx_queues_local;
 	data->tx_queues = tx_queues_local;
 
@@ -326,7 +315,6 @@ do_eth_dev_ring_create(const char *name,
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->address;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 	data->kdrv = RTE_KDRV_NONE;
 	data->numa_node = numa_node;
@@ -342,7 +330,6 @@ do_eth_dev_ring_create(const char *name,
 error:
 	rte_free(rx_queues_local);
 	rte_free(tx_queues_local);
-	rte_free(data);
 	rte_free(internals);
 
 	return -1;
@@ -675,8 +662,6 @@ rte_pmd_ring_remove(struct rte_vdev_device *dev)
 	rte_free(eth_dev->data->tx_queues);
 	rte_free(eth_dev->data->dev_private);
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 	return 0;
 }
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 915d937..b18efd8 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1386,12 +1386,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	RTE_LOG(DEBUG, PMD, "%s device on numa %u\n",
 			tuntap_name, rte_socket_id());
 
-	data = rte_zmalloc_socket(tap_name, sizeof(*data), 0, numa_node);
-	if (!data) {
-		RTE_LOG(ERR, PMD, "%s Failed to allocate data\n", tuntap_name);
-		goto error_exit_nodev;
-	}
-
 	dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
 	if (!dev) {
 		RTE_LOG(ERR, PMD, "%s Unable to allocate device struct\n",
@@ -1412,7 +1406,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	}
 
 	/* Setup some default values */
-	rte_memcpy(data, dev->data, sizeof(*data));
+	data = dev->data;
 	data->dev_private = pmd;
 	data->dev_flags = RTE_ETH_DEV_INTR_LSC;
 	data->numa_node = numa_node;
@@ -1423,7 +1417,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	data->nb_rx_queues = 0;
 	data->nb_tx_queues = 0;
 
-	dev->data = data;
 	dev->dev_ops = &ops;
 	dev->rx_pkt_burst = pmd_rx_burst;
 	dev->tx_pkt_burst = pmd_tx_burst;
@@ -1574,7 +1567,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	RTE_LOG(ERR, PMD, "%s Unable to initialize %s\n",
 		tuntap_name, rte_vdev_device_name(vdev));
 
-	rte_free(data);
 	return -EINVAL;
 }
 
@@ -1828,7 +1820,6 @@ rte_pmd_tap_remove(struct rte_vdev_device *dev)
 
 	close(internals->ioctl_sock);
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index d7d44a0..fea13eb 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1227,7 +1227,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	int16_t queues, const unsigned int numa_node, uint64_t flags)
 {
 	const char *name = rte_vdev_device_name(dev);
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internal *internal = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 	struct ether_addr *eth_addr = NULL;
@@ -1237,13 +1237,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	RTE_LOG(INFO, PMD, "Creating VHOST-USER backend on numa socket %u\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure and internal
-	 * (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error;
-
 	list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
 	if (list == NULL)
 		goto error;
@@ -1285,12 +1278,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	rte_spinlock_init(&vring_state->lock);
 	vring_states[eth_dev->data->port_id] = vring_state;
 
-	/* We'll replace the 'data' originally allocated by eth_dev. So the
-	 * vhost PMD resources won't be shared between multi processes.
-	 */
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
-	eth_dev->data = data;
-
+	data = eth_dev->data;
 	data->nb_rx_queues = queues;
 	data->nb_tx_queues = queues;
 	internal->max_queues = queues;
@@ -1331,7 +1319,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 		rte_eth_dev_release_port(eth_dev);
 	rte_free(internal);
 	rte_free(list);
-	rte_free(data);
 
 	return -1;
 }
@@ -1462,8 +1449,6 @@ rte_pmd_vhost_remove(struct rte_vdev_device *dev)
 	rte_free(vring_states[eth_dev->data->port_id]);
 	vring_states[eth_dev->data->port_id] = NULL;
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 
 	return 0;
-- 
2.7.4

* [PATCH v3 5/5] drivers/net: share vdev data to secondary process
  2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                     ` (3 preceding siblings ...)
  2018-04-19 16:50   ` [PATCH v3 4/5] drivers/net: not use private eth dev data Jianfeng Tan
@ 2018-04-19 16:50   ` Jianfeng Tan
  4 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-19 16:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

dpdk-procinfo, as a secondary process, cannot fetch stats for a vdev.

This patch enables that by attaching the port from the shared data.
We also fill in the eth dev ops, although only some ops, for example
stats_get(), work in a secondary process.

Note that we still cannot Rx/Tx packets on ports which do not
support multi-process.

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 doc/guides/rel_notes/release_18_05.rst    |  5 +++++
 drivers/net/af_packet/rte_eth_af_packet.c | 17 +++++++++++++++--
 drivers/net/bonding/rte_eth_bond_pmd.c    | 13 +++++++++++++
 drivers/net/failsafe/failsafe.c           | 14 ++++++++++++++
 drivers/net/kni/rte_eth_kni.c             | 12 ++++++++++++
 drivers/net/null/rte_eth_null.c           | 13 +++++++++++++
 drivers/net/octeontx/octeontx_ethdev.c    | 14 ++++++++++++++
 drivers/net/pcap/rte_eth_pcap.c           | 13 +++++++++++++
 drivers/net/softnic/rte_eth_softnic.c     | 19 ++++++++++++++++---
 drivers/net/tap/rte_eth_tap.c             | 13 +++++++++++++
 drivers/net/vhost/rte_eth_vhost.c         | 17 +++++++++++++++--
 11 files changed, 143 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index 290fa09..854efeb 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -115,6 +115,11 @@ New Features
 
   Linux uevent is supported as backend of this device event notification framework.
 
+* **Added support for procinfo and pdump on eth vdev.**
+
+  For Ethernet virtual devices (like tap, pcap, etc.), this feature lets a secondary
+  process get stats/xstats from shared memory, and also pdump packets on
+  those virtual devices.
 
 API Changes
 -----------
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 110e8a5..b394d3c 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -915,9 +915,22 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
 	int ret = 0;
 	struct rte_kvargs *kvlist;
 	int sockfd = -1;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL) {
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 2805c71..09696ea 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -3021,6 +3021,7 @@ bond_probe(struct rte_vdev_device *dev)
 	uint8_t bonding_mode, socket_id/*, agg_mode*/;
 	int  arg_count, port_id;
 	uint8_t agg_mode;
+	struct rte_eth_dev *eth_dev;
 
 	if (!dev)
 		return -EINVAL;
@@ -3028,6 +3029,18 @@ bond_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	RTE_LOG(INFO, EAL, "Initializing pmd_bond for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &default_dev_ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev),
 		pmd_bond_init_valid_arguments);
 	if (kvlist == NULL)
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index fa279cb..dc9b0d0 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -294,10 +294,24 @@ static int
 rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
 {
 	const char *name;
+	struct rte_eth_dev *eth_dev;
 
 	name = rte_vdev_device_name(vdev);
 	INFO("Initializing " FAILSAFE_DRIVER_NAME " for %s",
 			name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(vdev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &failsafe_ops;
+		return 0;
+	}
+
 	return fs_eth_dev_create(vdev);
 }
 
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index b7897b6..08fc6a3 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -404,6 +404,18 @@ eth_kni_probe(struct rte_vdev_device *vdev)
 	params = rte_vdev_device_args(vdev);
 	RTE_LOG(INFO, PMD, "Initializing eth_kni for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &eth_kni_ops;
+		return 0;
+	}
+
 	ret = eth_kni_kvargs_process(&args, params);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 7d89a32..6413a90 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -597,6 +597,7 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	unsigned packet_size = default_packet_size;
 	unsigned packet_copy = default_packet_copy;
 	struct rte_kvargs *kvlist = NULL;
+	struct rte_eth_dev *eth_dev;
 	int ret;
 
 	if (!dev)
@@ -606,6 +607,18 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	params = rte_vdev_device_args(dev);
 	RTE_LOG(INFO, PMD, "Initializing pmd_null for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	if (params != NULL) {
 		kvlist = rte_kvargs_parse(params, valid_arguments);
 		if (kvlist == NULL)
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index ee06cd3..04120f5 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1228,12 +1228,26 @@ octeontx_probe(struct rte_vdev_device *dev)
 	struct rte_event_dev_config dev_conf;
 	const char *eventdev_name = "event_octeontx";
 	struct rte_event_dev_info info;
+	struct rte_eth_dev *eth_dev;
 
 	struct octeontx_vdev_init_params init_params = {
 		OCTEONTX_VDEV_DEFAULT_MAX_NR_PORT
 	};
 
 	dev_name = rte_vdev_device_name(dev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(dev_name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", dev_name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &octeontx_dev_ops;
+		return 0;
+	}
+
 	res = octeontx_parse_vdev_init_params(&init_params, dev);
 	if (res < 0)
 		return -EINVAL;
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 8740d52..570c9e9 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -898,6 +898,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	struct rte_kvargs *kvlist;
 	struct pmd_devargs pcaps = {0};
 	struct pmd_devargs dumpers = {0};
+	struct rte_eth_dev *eth_dev;
 	int single_iface = 0;
 	int ret;
 
@@ -908,6 +909,18 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	start_cycles = rte_get_timer_cycles();
 	hz = rte_get_timer_hz();
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
 		return -1;
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index b0c1341..e324394 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -725,13 +725,26 @@ pmd_probe(struct rte_vdev_device *vdev)
 	uint16_t hard_port_id;
 	int numa_node;
 	void *dev_private;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(vdev);
 
-	RTE_LOG(INFO, PMD,
-		"Probing device \"%s\"\n",
-		rte_vdev_device_name(vdev));
+	RTE_LOG(INFO, PMD, "Probing device \"%s\"\n", name);
 
 	/* Parse input arguments */
 	params = rte_vdev_device_args(vdev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &pmd_ops;
+		return 0;
+	}
+
 	if (!params)
 		return -EINVAL;
 
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index b18efd8..cca5852 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1721,6 +1721,7 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	char tap_name[RTE_ETH_NAME_MAX_LEN];
 	char remote_iface[RTE_ETH_NAME_MAX_LEN];
 	struct ether_addr user_mac = { .addr_bytes = {0} };
+	struct rte_eth_dev *eth_dev;
 
 	tap_type = 1;
 	strcpy(tuntap_name, "TAP");
@@ -1728,6 +1729,18 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	params = rte_vdev_device_args(dev);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	speed = ETH_SPEED_NUM_10G;
 	snprintf(tap_name, sizeof(tap_name), "%s%d",
 		 DEFAULT_TAP_NAME, tap_unit++);
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index fea13eb..99a7727 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1362,9 +1362,22 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 	int client_mode = 0;
 	int dequeue_zero_copy = 0;
 	int iommu_support = 0;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
-- 
2.7.4

* Re: [PATCH v3 1/5] eal: bring forward multi-process channel init
  2018-04-19 16:50   ` [PATCH v3 1/5] eal: bring forward multi-process channel init Jianfeng Tan
@ 2018-04-20  8:16     ` Burakov, Anatoly
  2018-04-20 14:08       ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-20  8:16 UTC (permalink / raw)
  To: Jianfeng Tan, dev; +Cc: thomas

On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
> Adjust the init sequence: put mp channel init before bus scan
> so that we can init the vdev bus through mp channel in the
> secondary process before the bus scan.
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> ---

Hi Jianfeng,

Just a general question. I can't recall if we've discussed this 
internally, but does this new IPC-based vdev bus scan trigger any memory 
allocations? So far bus scans were well-behaved and didn't do that.

-- 
Thanks,
Anatoly

* Re: [PATCH v3 2/5] bus/vdev: add lock on vdev device list
  2018-04-19 16:50   ` [PATCH v3 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
@ 2018-04-20  8:26     ` Burakov, Anatoly
  2018-04-20 14:19       ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-20  8:26 UTC (permalink / raw)
  To: Jianfeng Tan, dev; +Cc: thomas

On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
> As we could add virtual devices from different threads now, we
> add a spin lock to protect the vdev device list.
> 
> Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> ---

<...>

> +/* The caller shall be responsible for thread-safe */
>   static struct rte_vdev_device *
>   find_vdev(const char *name)
>   {
> @@ -203,10 +206,6 @@ rte_vdev_init(const char *name, const char *args)
>   	if (name == NULL)
>   		return -EINVAL;
>   
> -	dev = find_vdev(name);
> -	if (dev)
> -		return -EEXIST;
> -
>   	devargs = alloc_devargs(name, args);
>   	if (!devargs)
>   		return -ENOMEM;
> @@ -221,16 +220,28 @@ rte_vdev_init(const char *name, const char *args)
>   	dev->device.numa_node = SOCKET_ID_ANY;
>   	dev->device.name = devargs->name;
>   
> +	rte_spinlock_lock(&vdev_device_list_lock);
> +	if (find_vdev(name)) {
> +		rte_spinlock_unlock(&vdev_device_list_lock);
> +		ret = -EEXIST;
> +		goto fail;
> +	}
> +	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
> +	rte_spinlock_unlock(&vdev_device_list_lock);
> +

I wonder if it is possible to just leave the tailq locked until you either 
insert the device into the tailq, or figure out that it's not possible? 
Taking the lock twice here seems unnecessary, unless 
vdev_probe_all_drivers needs this tailq unlocked...

>   	ret = vdev_probe_all_drivers(dev);
>   	if (ret) {
>   		if (ret > 0)
>   			VDEV_LOG(ERR, "no driver found for %s\n", name);
> +		/* If fails, remove it from vdev list */
> +		rte_spinlock_lock(&vdev_device_list_lock);
> +		TAILQ_REMOVE(&vdev_device_list, dev, next);
> +		rte_spinlock_unlock(&vdev_device_list_lock);
>   		goto fail;
>   	}
>   
>   	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
>   
> -	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>   	return 0;
>   
>   fail:
> @@ -266,17 +277,25 @@ rte_vdev_uninit(const char *name)
>   	if (name == NULL)
>   		return -EINVAL;
>   
> +	rte_spinlock_lock(&vdev_device_list_lock);
>   	dev = find_vdev(name);
> -	if (!dev)
> +	if (!dev) {
> +		rte_spinlock_unlock(&vdev_device_list_lock);
>   		return -ENOENT;
> +	}
> +	TAILQ_REMOVE(&vdev_device_list, dev, next);
> +	rte_spinlock_unlock(&vdev_device_list_lock);
>   
>   	devargs = dev->device.devargs;
>   
>   	ret = vdev_remove_driver(dev);
> -	if (ret)
> +	if (ret) {
> +		/* If fails, add back to vdev list */
> +		rte_spinlock_lock(&vdev_device_list_lock);
> +		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
> +		rte_spinlock_unlock(&vdev_device_list_lock);
>   		return ret;
> -
> -	TAILQ_REMOVE(&vdev_device_list, dev, next);
> +	}

Same comment here - perhaps keep the lock locked all the way? Maybe a 
good way to ensure you don't miss anything is put most of it in a static 
function, and do

static int vdev_uninit() {
	...
}

static int rte_vdev_uninit() {
	int ret;
	lock();
	ret = vdev_uninit();
	unlock();
	return ret;
}

? In general, it is better to do lock/unlock in one place and not 
disperse lock/unlock calls across various branches.
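
For reference, a minimal self-contained sketch of that wrapper pattern. It is 
not the vdev code: it uses a pthread mutex and a toy list in place of 
rte_spinlock and the real vdev structures, and every name in it (toy_dev, 
toy_find, toy_uninit_locked, toy_uninit, toy_init) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>
#include <sys/queue.h>

/* Toy stand-in for a device on a locked list (hypothetical type). */
struct toy_dev {
	TAILQ_ENTRY(toy_dev) next;
	const char *name;
};

TAILQ_HEAD(toy_dev_list, toy_dev);
static struct toy_dev_list toy_list = TAILQ_HEAD_INITIALIZER(toy_list);
static pthread_mutex_t toy_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold toy_list_lock. */
static struct toy_dev *
toy_find(const char *name)
{
	struct toy_dev *dev;

	TAILQ_FOREACH(dev, &toy_list, next)
		if (strcmp(dev->name, name) == 0)
			return dev;
	return NULL;
}

/* All branching lives here; no lock calls inside. */
static int
toy_uninit_locked(const char *name)
{
	struct toy_dev *dev = toy_find(name);

	if (dev == NULL)
		return -ENOENT;
	TAILQ_REMOVE(&toy_list, dev, next);
	return 0;
}

/* Public entry point: one lock/unlock pair around the whole operation. */
static int
toy_uninit(const char *name)
{
	int ret;

	if (name == NULL)
		return -EINVAL;
	pthread_mutex_lock(&toy_list_lock);
	ret = toy_uninit_locked(name);
	pthread_mutex_unlock(&toy_list_lock);
	return ret;
}

/* Insert path, also with a single lock/unlock pair. */
static int
toy_init(struct toy_dev *dev)
{
	int ret = 0;

	assert(dev != NULL && dev->name != NULL);
	pthread_mutex_lock(&toy_list_lock);
	if (toy_find(dev->name) != NULL)
		ret = -EEXIST;
	else
		TAILQ_INSERT_TAIL(&toy_list, dev, next);
	pthread_mutex_unlock(&toy_list_lock);
	return ret;
}
```

Every early return sits in the inner function, so no error branch can leave 
the lock held. The cost is that anything called from the inner function runs 
with the lock taken.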

>   
>   	TAILQ_REMOVE(&devargs_list, devargs, next);
>   
> @@ -314,19 +333,25 @@ vdev_scan(void)
>   		if (devargs->bus != &rte_vdev_bus)
>   			continue;
>   
> -		dev = find_vdev(devargs->name);
> -		if (dev)
> -			continue;
> -
>   		dev = calloc(1, sizeof(*dev));
>   		if (!dev)
>   			return -1;
>   
> +		rte_spinlock_lock(&vdev_device_list_lock);
> +
> +		if (find_vdev(devargs->name)) {
> +			rte_spinlock_unlock(&vdev_device_list_lock);
> +			free(dev);
> +			continue;
> +		}
> +
>   		dev->device.devargs = devargs;
>   		dev->device.numa_node = SOCKET_ID_ANY;
>   		dev->device.name = devargs->name;
>   
>   		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
> +
> +		rte_spinlock_unlock(&vdev_device_list_lock);
>   	}
>   
>   	return 0;
> @@ -340,6 +365,10 @@ vdev_probe(void)
>   
>   	/* call the init function for each virtual device */
>   	TAILQ_FOREACH(dev, &vdev_device_list, next) {
> +		/* we don't use the vdev lock here, as it's only used in DPDK
> +		 * initialization; and we don't want to hold such a lock when
> +		 * we call each driver probe.
> +		 */
>   
>   		if (dev->device.driver)
>   			continue;
> @@ -360,14 +389,18 @@ vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
>   {
>   	struct rte_vdev_device *dev;
>   
> +	rte_spinlock_lock(&vdev_device_list_lock);
>   	TAILQ_FOREACH(dev, &vdev_device_list, next) {
>   		if (start && &dev->device == start) {
>   			start = NULL;
>   			continue;
>   		}
> -		if (cmp(&dev->device, data) == 0)
> +		if (cmp(&dev->device, data) == 0) {
> +			rte_spinlock_unlock(&vdev_device_list_lock);
>   			return &dev->device;
> +		}
>   	}
> +	rte_spinlock_unlock(&vdev_device_list_lock);
>   	return NULL;

How about

break;
}
unlock();
return dev ? &dev->device : NULL;

? Seems clearer to me.
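
A compilable sketch of that single-exit shape, assuming the usual 
<sys/queue.h> behaviour where TAILQ_FOREACH leaves the loop variable NULL 
once the list is exhausted. The types and names (toy_device, toy_vdev, 
toy_find_device) are hypothetical, and a pthread mutex stands in for the 
spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Toy equivalents of a generic device embedded in a bus-specific one. */
struct toy_device {
	const char *name;
};

struct toy_vdev {
	TAILQ_ENTRY(toy_vdev) next;
	struct toy_device device;
};

TAILQ_HEAD(toy_vdev_list, toy_vdev);
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;

static struct toy_device *
toy_find_device(struct toy_vdev_list *list, const char *name)
{
	struct toy_vdev *dev;

	assert(list != NULL && name != NULL);
	pthread_mutex_lock(&toy_lock);
	TAILQ_FOREACH(dev, list, next) {
		if (strcmp(dev->device.name, name) == 0)
			break;
	}
	/* One unlock for both the "found" and "not found" outcomes. */
	pthread_mutex_unlock(&toy_lock);
	/* dev is NULL here if the loop ran to the end without a match. */
	return dev ? &dev->device : NULL;
}
```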

>   }
>   
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-19 16:50   ` [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-04-20  8:41     ` Burakov, Anatoly
  2018-04-20 14:28       ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-20  8:41 UTC (permalink / raw)
  To: Jianfeng Tan, dev; +Cc: thomas

On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
> To scan the vdevs in primary, we send request to primary process
> to obtain the names for vdevs.
> 
> Only the name is shared from the primary. In probe(), the device
> driver is supposed to locate (or request more) the detail
> information from the primary.
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> ---

<...>

> +static int
> +vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
> +{
> +	struct rte_vdev_device *dev;
> +	struct rte_mp_msg mp_resp;
> +	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
> +	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
> +	const char *devname;
> +	int num;
> +
> +	strcpy(mp_resp.name, "vdev");
> +	mp_resp.len_param = sizeof(*ou);
> +	mp_resp.num_fds = 0;
> +
> +	switch (in->type) {
> +	case VDEV_SCAN_REQ:
> +		ou->type = VDEV_SCAN_ONE;
> +		ou->num = 1;
> +		num = 0;
> +
> +		rte_spinlock_lock(&vdev_device_list_lock);
> +		TAILQ_FOREACH(dev, &vdev_device_list, next) {
> +			devname = rte_vdev_device_name(dev);
> +			if (strlen(devname) == 0)
> +				VDEV_LOG(INFO, "vdev with no name is not sent");
> +			VDEV_LOG(INFO, "send vdev, %s", devname);
> +			strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);

Probably better to use strlcpy, as it always null-terminates.
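
To illustrate the difference, a small sketch follows. Since strlcpy is not 
available in every libc, it uses a local helper with the same semantics; 
toy_strlcpy and toy_is_terminated are hypothetical names, not DPDK functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * strlcpy-style copy: writes at most size - 1 bytes plus a terminating NUL
 * (when size > 0) and returns strlen(src), so truncation can be detected by
 * checking whether the return value is >= size.
 */
static size_t
toy_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst != NULL && src != NULL);
	len = strlen(src);
	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Returns 1 if buf[0..size) contains a NUL byte, 0 otherwise. */
static int
toy_is_terminated(const char *buf, size_t size)
{
	return memchr(buf, '\0', size) != NULL;
}
```

With an 8-byte buffer and the 10-character name "net_vhost0", 
strncpy(buf, name, 8) fills all 8 bytes and writes no terminator, while 
toy_strlcpy(buf, name, 8) stores "net_vho" plus a NUL and returns 10.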

> +			if (rte_mp_sendmsg(&mp_resp) < 0)
> +				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
> +					 devname, strerror(rte_errno));
> +			num++;

Some comments on what is going on here (why are we sending messages in 
response? why multiple? who will receive these messages?) would be nice. 
I have a sneaking suspicion that you could've packed the response into 
one single message, but i'm not completely sure what is going on here, 
so maybe what you have here makes sense...

> +		}
> +		rte_spinlock_unlock(&vdev_device_list_lock);
> +
> +		ou->type = VDEV_SCAN_REP;
> +		ou->num = num;
> +		if (rte_mp_reply(&mp_resp, peer) < 0)
> +			VDEV_LOG(ERR, "Failed to reply a scan request");
> +		break;

<...>

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 1/5] eal: bring forward multi-process channel init
  2018-04-20  8:16     ` Burakov, Anatoly
@ 2018-04-20 14:08       ` Tan, Jianfeng
  0 siblings, 0 replies; 48+ messages in thread
From: Tan, Jianfeng @ 2018-04-20 14:08 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: thomas



On 4/20/2018 4:16 PM, Burakov, Anatoly wrote:
> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>> Adjust the init sequence: put mp channel init before bus scan
>> so that we can init the vdev bus through mp channel in the
>> secondary process before the bus scan.
>>
>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>> ---
>
> Hi Jianfeng,
>
> Just a general question. I can't recall if we've discussed this 
> internally, 

I don't think we ever discussed this.

> but does this new IPC-based vdev bus scan trigger any memory allocations?

No, it doesn't.

> So far bus scans were well-behaved and didn't do that.
>

I think that's because, even in the master branch implementation, bus 
scan runs before memory init, so we don't use any rte_malloc in bus scan.

Thanks,
Jianfeng

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/5] bus/vdev: add lock on vdev device list
  2018-04-20  8:26     ` Burakov, Anatoly
@ 2018-04-20 14:19       ` Tan, Jianfeng
  2018-04-20 15:16         ` Burakov, Anatoly
  0 siblings, 1 reply; 48+ messages in thread
From: Tan, Jianfeng @ 2018-04-20 14:19 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: thomas



On 4/20/2018 4:26 PM, Burakov, Anatoly wrote:
> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>> As we could add virtual devices from different threads now, we
>> add a spin lock to protect the vdev device list.
>>
>> Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>> ---
>
> <...>
>
>> +/* The caller shall be responsible for thread-safe */
>>   static struct rte_vdev_device *
>>   find_vdev(const char *name)
>>   {
>> @@ -203,10 +206,6 @@ rte_vdev_init(const char *name, const char *args)
>>       if (name == NULL)
>>           return -EINVAL;
>>   -    dev = find_vdev(name);
>> -    if (dev)
>> -        return -EEXIST;
>> -
>>       devargs = alloc_devargs(name, args);
>>       if (!devargs)
>>           return -ENOMEM;
>> @@ -221,16 +220,28 @@ rte_vdev_init(const char *name, const char *args)
>>       dev->device.numa_node = SOCKET_ID_ANY;
>>       dev->device.name = devargs->name;
>>   +    rte_spinlock_lock(&vdev_device_list_lock);
>> +    if (find_vdev(name)) {
>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>> +        ret = -EEXIST;
>> +        goto fail;
>> +    }
>> +    TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>> +    rte_spinlock_unlock(&vdev_device_list_lock);
>> +
>
> I wonder if is possible to just leave the tailq locked until you 
> either insert the device into tailq, or figure out that it's not 
> possible? Seems like doing two locks here is unnecessary, unless 
> vdev_probe_all_drivers needs this tailq unlocked...

My opinion is that we don't know what could be done in driver probe(). 
It could possibly insert a new vdev (it does not happen now, but could 
happen in the future?). So here, we call it with the tailq unlocked. Or 
should we keep it as simple as possible, as you say?

>
>>       ret = vdev_probe_all_drivers(dev);
>>       if (ret) {
>>           if (ret > 0)
>>               VDEV_LOG(ERR, "no driver found for %s\n", name);
>> +        /* If fails, remove it from vdev list */
>> +        rte_spinlock_lock(&vdev_device_list_lock);
>> +        TAILQ_REMOVE(&vdev_device_list, dev, next);
>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>>           goto fail;
>>       }
>>         TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
>>   -    TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>>       return 0;
>>     fail:
>> @@ -266,17 +277,25 @@ rte_vdev_uninit(const char *name)
>>       if (name == NULL)
>>           return -EINVAL;
>>   +    rte_spinlock_lock(&vdev_device_list_lock);
>>       dev = find_vdev(name);
>> -    if (!dev)
>> +    if (!dev) {
>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>>           return -ENOENT;
>> +    }
>> +    TAILQ_REMOVE(&vdev_device_list, dev, next);
>> +    rte_spinlock_unlock(&vdev_device_list_lock);
>>         devargs = dev->device.devargs;
>>         ret = vdev_remove_driver(dev);
>> -    if (ret)
>> +    if (ret) {
>> +        /* If fails, add back to vdev list */
>> +        rte_spinlock_lock(&vdev_device_list_lock);
>> +        TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>>           return ret;
>> -
>> -    TAILQ_REMOVE(&vdev_device_list, dev, next);
>> +    }
>
> Same comment here - perhaps keep the lock locked all the way? Maybe a 
> good way to ensure you don't miss anything is put most of it in a 
> static function, and do
>
> static int vdev_uninit() {
>     ...
> }
>
> static int rte_vdev_uninit() {
>     int ret;
>     lock();
>     ret = vdev_uninit();
>     unlock();
>     return ret;
> }
>
> ? In general, it is better to do lock/unlock in one place and not 
> disperse lock/unlock calls across various branches.

Makes sense. Will change code accordingly once the above decision is made.

>
>>         TAILQ_REMOVE(&devargs_list, devargs, next);
>>   @@ -314,19 +333,25 @@ vdev_scan(void)
>>           if (devargs->bus != &rte_vdev_bus)
>>               continue;
>>   -        dev = find_vdev(devargs->name);
>> -        if (dev)
>> -            continue;
>> -
>>           dev = calloc(1, sizeof(*dev));
>>           if (!dev)
>>               return -1;
>>   +        rte_spinlock_lock(&vdev_device_list_lock);
>> +
>> +        if (find_vdev(devargs->name)) {
>> +            rte_spinlock_unlock(&vdev_device_list_lock);
>> +            free(dev);
>> +            continue;
>> +        }
>> +
>>           dev->device.devargs = devargs;
>>           dev->device.numa_node = SOCKET_ID_ANY;
>>           dev->device.name = devargs->name;
>>             TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>> +
>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>>       }
>>         return 0;
>> @@ -340,6 +365,10 @@ vdev_probe(void)
>>         /* call the init function for each virtual device */
>>       TAILQ_FOREACH(dev, &vdev_device_list, next) {
>> +        /* we don't use the vdev lock here, as it's only used in DPDK
>> +         * initialization; and we don't want to hold such a lock when
>> +         * we call each driver probe.
>> +         */
>>             if (dev->device.driver)
>>               continue;
>> @@ -360,14 +389,18 @@ vdev_find_device(const struct rte_device 
>> *start, rte_dev_cmp_t cmp,
>>   {
>>       struct rte_vdev_device *dev;
>>   +    rte_spinlock_lock(&vdev_device_list_lock);
>>       TAILQ_FOREACH(dev, &vdev_device_list, next) {
>>           if (start && &dev->device == start) {
>>               start = NULL;
>>               continue;
>>           }
>> -        if (cmp(&dev->device, data) == 0)
>> +        if (cmp(&dev->device, data) == 0) {
>> +            rte_spinlock_unlock(&vdev_device_list_lock);
>>               return &dev->device;
>> +        }
>>       }
>> +    rte_spinlock_unlock(&vdev_device_list_lock);
>>       return NULL;
>
> How about
>
> break;
> }
> unlock();
> return dev ? &dev->device : NULL;
>
> ? Seems clearer to me.

Yep, will change that.

Thanks,
Jianfeng

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-20  8:41     ` Burakov, Anatoly
@ 2018-04-20 14:28       ` Tan, Jianfeng
  2018-04-20 15:19         ` Burakov, Anatoly
  0 siblings, 1 reply; 48+ messages in thread
From: Tan, Jianfeng @ 2018-04-20 14:28 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: thomas



On 4/20/2018 4:41 PM, Burakov, Anatoly wrote:
> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>> To scan the vdevs in primary, we send request to primary process
>> to obtain the names for vdevs.
>>
>> Only the name is shared from the primary. In probe(), the device
>> driver is supposed to locate (or request more) the detail
>> information from the primary.
>>
>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>> ---
>
> <...>
>
>> +static int
>> +vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
>> +{
>> +    struct rte_vdev_device *dev;
>> +    struct rte_mp_msg mp_resp;
>> +    struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
>> +    const struct vdev_param *in = (const struct vdev_param 
>> *)mp_msg->param;
>> +    const char *devname;
>> +    int num;
>> +
>> +    strcpy(mp_resp.name, "vdev");
>> +    mp_resp.len_param = sizeof(*ou);
>> +    mp_resp.num_fds = 0;
>> +
>> +    switch (in->type) {
>> +    case VDEV_SCAN_REQ:
>> +        ou->type = VDEV_SCAN_ONE;
>> +        ou->num = 1;
>> +        num = 0;
>> +
>> +        rte_spinlock_lock(&vdev_device_list_lock);
>> +        TAILQ_FOREACH(dev, &vdev_device_list, next) {
>> +            devname = rte_vdev_device_name(dev);
>> +            if (strlen(devname) == 0)
>> +                VDEV_LOG(INFO, "vdev with no name is not sent");
>> +            VDEV_LOG(INFO, "send vdev, %s", devname);
>> +            strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
>
> Probably better use strlcpy as it always null-terminates.

Yep.

>
>> +            if (rte_mp_sendmsg(&mp_resp) < 0)
>> +                VDEV_LOG(ERR, "send vdev, %s, failed, %s",
>> +                     devname, strerror(rte_errno));
>> +            num++;
>
> Some comments on what is going on here (why are we sending messages in 
> response? why multiple? who will receive these messages?) would be nice.

Yep, will explain that below.

> I have a sneaking suspicion that you could've packed the response into 
> one single message, but i'm not completely sure what is going on here, 
> so maybe what you have here makes sense...

What's happening here is that:

a. Secondary process sends a sync request to ask for the vdevs in primary.
b. Primary process receives the request, and sends the vdevs one by one.
c. Primary process sends back a reply, which indicates how many vdevs 
were sent.

The reason we don't pack all vdevs into the reply message is that the 
message parameter is limited to RTE_MP_MAX_PARAM_LEN (256) bytes, so we 
may not be able to fit all vdevs into a single reply message.
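
As a rough worked example of that limit: the sketch below computes how many 
fixed-size names would fit in one message if they were batched. The 256-byte 
figure is the one quoted above; the 64-byte name buffer and the two-int 
header are assumptions for illustration, and all toy_* names are 
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Parameter size quoted in this thread. */
#define TOY_MP_MAX_PARAM_LEN 256
/* Assumption: name buffer size; check RTE_DEV_NAME_MAX_LEN in your tree. */
#define TOY_DEV_NAME_MAX_LEN 64

/* Hypothetical header of a batched payload: message type plus entry count. */
struct toy_batch_hdr {
	int type;
	int num;
};

/* How many fixed-size names fit after the header in one message. */
static size_t
toy_names_per_msg(size_t param_len, size_t hdr_len, size_t name_len)
{
	assert(name_len > 0);
	if (param_len < hdr_len)
		return 0;
	return (param_len - hdr_len) / name_len;
}

/* Messages needed to carry ndev names; 0 if not even one name fits. */
static size_t
toy_msgs_needed(size_t ndev, size_t per_msg)
{
	if (per_msg == 0)
		return 0;
	return (ndev + per_msg - 1) / per_msg;
}
```

With a 4-byte int the header takes 8 bytes, so (256 - 8) / 64 = 3 names fit 
per message, and 10 vdevs would still need 4 messages. Batching would cut 
the message count but not remove the need for multiple messages.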

Thanks,
Jianfeng

>> +        }
>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>> +
>> +        ou->type = VDEV_SCAN_REP;
>> +        ou->num = num;
>> +        if (rte_mp_reply(&mp_resp, peer) < 0)
>> +            VDEV_LOG(ERR, "Failed to reply a scan request");
>> +        break;
>
> <...>
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/5] bus/vdev: add lock on vdev device list
  2018-04-20 14:19       ` Tan, Jianfeng
@ 2018-04-20 15:16         ` Burakov, Anatoly
  2018-04-20 15:23           ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-20 15:16 UTC (permalink / raw)
  To: Tan, Jianfeng, dev; +Cc: thomas

On 20-Apr-18 3:19 PM, Tan, Jianfeng wrote:
> 
> 
> On 4/20/2018 4:26 PM, Burakov, Anatoly wrote:
>> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>>> As we could add virtual devices from different threads now, we
>>> add a spin lock to protect the vdev device list.
>>>
>>> Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>>> ---
>>
>> <...>
>>
>>> +/* The caller shall be responsible for thread-safe */
>>>   static struct rte_vdev_device *
>>>   find_vdev(const char *name)
>>>   {
>>> @@ -203,10 +206,6 @@ rte_vdev_init(const char *name, const char *args)
>>>       if (name == NULL)
>>>           return -EINVAL;
>>>   -    dev = find_vdev(name);
>>> -    if (dev)
>>> -        return -EEXIST;
>>> -
>>>       devargs = alloc_devargs(name, args);
>>>       if (!devargs)
>>>           return -ENOMEM;
>>> @@ -221,16 +220,28 @@ rte_vdev_init(const char *name, const char *args)
>>>       dev->device.numa_node = SOCKET_ID_ANY;
>>>       dev->device.name = devargs->name;
>>>   +    rte_spinlock_lock(&vdev_device_list_lock);
>>> +    if (find_vdev(name)) {
>>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>>> +        ret = -EEXIST;
>>> +        goto fail;
>>> +    }
>>> +    TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>>> +    rte_spinlock_unlock(&vdev_device_list_lock);
>>> +
>>
>> I wonder if is possible to just leave the tailq locked until you 
>> either insert the device into tailq, or figure out that it's not 
>> possible? Seems like doing two locks here is unnecessary, unless 
>> vdev_probe_all_drivers needs this tailq unlocked...
> 
> My opinion is that we don't know what could be done in driver probe(). 
> It could possibly insert a new vdev (it does not happen now, but could 
> happen in future?). So here, we call this with tailq unlocked. Or we 
> keep it as simple as possible as you say?

I thought this code was responsible for inserting vdevs? I think it 
would be generally bad design to insert vdev while inserting vdev :)

That said, it's a fair point, and i don't have a strong opinion on this, 
so you can leave it as is if you want.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-20 14:28       ` Tan, Jianfeng
@ 2018-04-20 15:19         ` Burakov, Anatoly
  2018-04-20 15:32           ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-20 15:19 UTC (permalink / raw)
  To: Tan, Jianfeng, dev; +Cc: thomas

On 20-Apr-18 3:28 PM, Tan, Jianfeng wrote:
> 
> 
> On 4/20/2018 4:41 PM, Burakov, Anatoly wrote:
>> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>>> To scan the vdevs in primary, we send request to primary process
>>> to obtain the names for vdevs.
>>>
>>> Only the name is shared from the primary. In probe(), the device
>>> driver is supposed to locate (or request more) the detail
>>> information from the primary.
>>>
>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>>> ---
>>
>> <...>
>>
>>> +static int
>>> +vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
>>> +{
>>> +    struct rte_vdev_device *dev;
>>> +    struct rte_mp_msg mp_resp;
>>> +    struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
>>> +    const struct vdev_param *in = (const struct vdev_param 
>>> *)mp_msg->param;
>>> +    const char *devname;
>>> +    int num;
>>> +
>>> +    strcpy(mp_resp.name, "vdev");
>>> +    mp_resp.len_param = sizeof(*ou);
>>> +    mp_resp.num_fds = 0;
>>> +
>>> +    switch (in->type) {
>>> +    case VDEV_SCAN_REQ:
>>> +        ou->type = VDEV_SCAN_ONE;
>>> +        ou->num = 1;
>>> +        num = 0;
>>> +
>>> +        rte_spinlock_lock(&vdev_device_list_lock);
>>> +        TAILQ_FOREACH(dev, &vdev_device_list, next) {
>>> +            devname = rte_vdev_device_name(dev);
>>> +            if (strlen(devname) == 0)
>>> +                VDEV_LOG(INFO, "vdev with no name is not sent");
>>> +            VDEV_LOG(INFO, "send vdev, %s", devname);
>>> +            strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
>>
>> Probably better use strlcpy as it always null-terminates.
> 
> Yep.
> 
>>
>>> +            if (rte_mp_sendmsg(&mp_resp) < 0)
>>> +                VDEV_LOG(ERR, "send vdev, %s, failed, %s",
>>> +                     devname, strerror(rte_errno));
>>> +            num++;
>>
>> Some comments on what is going on here (why are we sending messages in 
>> response? why multiple? who will receive these messages?) would be nice.
> 
> Yep, will explain that below.
> 
>> I have a sneaking suspicion that you could've packed the response into 
>> one single message, but i'm not completely sure what is going on here, 
>> so maybe what you have here makes sense...
> 
> What's happening here is that:
> 
> a. Secondary process sends a sync request to ask for vdev in primary.
> b. Primary process receives the request, and send vdevs one by one.
> c. Primary process sends back reply, which indicates how many vdevs are 
> sent.
> 
> The reason we don't pack all vdevs in the reply message is that, the 
> message length is RTE_MP_MAX_PARAM_LEN (256) in length. It's possible 
> that we cannot pack all vdevs in the single reply message.
> 

OK. How does secondary know which vdevs are new and which aren't? Does 
it even matter how many vdevs primary has sent? Correct me if i'm wrong, 
but it seems that you're only using the sync request as a kind of 
synchronization mechanism, and are not actually expecting any useful 
data in the reply. Which is OK, but in that case just don't bother 
sending any data in the reply in the first place :)

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 2/5] bus/vdev: add lock on vdev device list
  2018-04-20 15:16         ` Burakov, Anatoly
@ 2018-04-20 15:23           ` Tan, Jianfeng
  0 siblings, 0 replies; 48+ messages in thread
From: Tan, Jianfeng @ 2018-04-20 15:23 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: thomas



On 4/20/2018 11:16 PM, Burakov, Anatoly wrote:
> On 20-Apr-18 3:19 PM, Tan, Jianfeng wrote:
>>
>>
>> On 4/20/2018 4:26 PM, Burakov, Anatoly wrote:
>>> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>>>> As we could add virtual devices from different threads now, we
>>>> add a spin lock to protect the vdev device list.
>>>>
>>>> Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>>>> ---
>>>
>>> <...>
>>>
>>>> +/* The caller shall be responsible for thread-safe */
>>>>   static struct rte_vdev_device *
>>>>   find_vdev(const char *name)
>>>>   {
>>>> @@ -203,10 +206,6 @@ rte_vdev_init(const char *name, const char *args)
>>>>       if (name == NULL)
>>>>           return -EINVAL;
>>>>   -    dev = find_vdev(name);
>>>> -    if (dev)
>>>> -        return -EEXIST;
>>>> -
>>>>       devargs = alloc_devargs(name, args);
>>>>       if (!devargs)
>>>>           return -ENOMEM;
>>>> @@ -221,16 +220,28 @@ rte_vdev_init(const char *name, const char 
>>>> *args)
>>>>       dev->device.numa_node = SOCKET_ID_ANY;
>>>>       dev->device.name = devargs->name;
>>>>   +    rte_spinlock_lock(&vdev_device_list_lock);
>>>> +    if (find_vdev(name)) {
>>>> +        rte_spinlock_unlock(&vdev_device_list_lock);
>>>> +        ret = -EEXIST;
>>>> +        goto fail;
>>>> +    }
>>>> +    TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>>>> +    rte_spinlock_unlock(&vdev_device_list_lock);
>>>> +
>>>
>>> I wonder if is possible to just leave the tailq locked until you 
>>> either insert the device into tailq, or figure out that it's not 
>>> possible? Seems like doing two locks here is unnecessary, unless 
>>> vdev_probe_all_drivers needs this tailq unlocked...
>>
>> My opinion is that we don't know what could be done in driver 
>> probe(). It could possibly insert a new vdev (it does not happen now, 
>> but could happen in future?). So here, we call this with tailq 
>> unlocked. Or we keep it as simple as possible as you say?
>
> I thought this code was responsible for inserting vdevs? I think it 
> would be generally bad design to insert vdev while inserting vdev :)

I might have mixed this with another case. I think it's a fair point.

>
> That said, it's a fair point, and i don't have a strong opinion on 
> this, so you can leave it as is if you want.

I'll change the implementation.

Thanks,
Jianfeng

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-20 15:19         ` Burakov, Anatoly
@ 2018-04-20 15:32           ` Tan, Jianfeng
  2018-04-20 15:39             ` Burakov, Anatoly
  0 siblings, 1 reply; 48+ messages in thread
From: Tan, Jianfeng @ 2018-04-20 15:32 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: thomas



On 4/20/2018 11:19 PM, Burakov, Anatoly wrote:
> On 20-Apr-18 3:28 PM, Tan, Jianfeng wrote:
>>
>>
>> On 4/20/2018 4:41 PM, Burakov, Anatoly wrote:
>>> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>>>> To scan the vdevs in primary, we send request to primary process
>>>> to obtain the names for vdevs.
>>>>
>>>> Only the name is shared from the primary. In probe(), the device
>>>> driver is supposed to locate (or request more) the detail
>>>> information from the primary.
>>>>
>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>>>> ---
>>>
>>> <...>
>>>
>>>> +static int
>>>> +vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
>>>> +{
>>>> +    struct rte_vdev_device *dev;
>>>> +    struct rte_mp_msg mp_resp;
>>>> +    struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
>>>> +    const struct vdev_param *in = (const struct vdev_param 
>>>> *)mp_msg->param;
>>>> +    const char *devname;
>>>> +    int num;
>>>> +
>>>> +    strcpy(mp_resp.name, "vdev");
>>>> +    mp_resp.len_param = sizeof(*ou);
>>>> +    mp_resp.num_fds = 0;
>>>> +
>>>> +    switch (in->type) {
>>>> +    case VDEV_SCAN_REQ:
>>>> +        ou->type = VDEV_SCAN_ONE;
>>>> +        ou->num = 1;
>>>> +        num = 0;
>>>> +
>>>> +        rte_spinlock_lock(&vdev_device_list_lock);
>>>> +        TAILQ_FOREACH(dev, &vdev_device_list, next) {
>>>> +            devname = rte_vdev_device_name(dev);
>>>> +            if (strlen(devname) == 0)
>>>> +                VDEV_LOG(INFO, "vdev with no name is not sent");
>>>> +            VDEV_LOG(INFO, "send vdev, %s", devname);
>>>> +            strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
>>>
>>> Probably better use strlcpy as it always null-terminates.
>>
>> Yep.
>>
>>>
>>>> +            if (rte_mp_sendmsg(&mp_resp) < 0)
>>>> +                VDEV_LOG(ERR, "send vdev, %s, failed, %s",
>>>> +                     devname, strerror(rte_errno));
>>>> +            num++;
>>>
>>> Some comments on what is going on here (why are we sending messages 
>>> in response? why multiple? who will receive these messages?) would 
>>> be nice.
>>
>> Yep, will explain that below.
>>
>>> I have a sneaking suspicion that you could've packed the response 
>>> into one single message, but i'm not completely sure what is going 
>>> on here, so maybe what you have here makes sense...
>>
>> What's happening here is that:
>>
>> a. Secondary process sends a sync request to ask for vdev in primary.
>> b. Primary process receives the request, and send vdevs one by one.
>> c. Primary process sends back reply, which indicates how many vdevs 
>> are sent.
>>
>> The reason we don't pack all vdevs in the reply message is that, the 
>> message length is RTE_MP_MAX_PARAM_LEN (256) in length. It's possible 
>> that we cannot pack all vdevs in the single reply message.
>>
>
> OK. How does secondary know which vdevs are new and which aren't?

This auto discovery is designed to let a secondary process learn, at boot, 
which vdevs are used in primary. So they are all new to the secondary 
process. For runtime vdev add in primary, we are going to rely on the 
hotplug framework to notify secondary processes.

> Does it even matter how many vdevs primary has sent? Correct me if i'm 
> wrong, but it seems that you're only using sync request as kind of 
> synchronization mechanism, and are not actually expecting any useful 
> data in the reply. Which is OK, but in that case just don't bother 
> sending any data in the reply in the first place :)

I would like to keep this information, so that the secondary process can 
tell how many vdevs come from the primary process (the secondary process can 
certainly iterate the vdev list to find out, but that's not as straightforward).
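
A sketch of how a receiver could use that count, under the assumption that 
the per-vdev messages are seen before the final reply (the real IPC ordering 
is not established in this thread). The message layout and every name here 
(toy_msg, TOY_SCAN_ONE, TOY_SCAN_REP, toy_check_scan) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds: one per vdev, then a final reply. */
enum toy_msg_type {
	TOY_SCAN_ONE = 2,
	TOY_SCAN_REP = 3
};

struct toy_msg {
	enum toy_msg_type type;
	int num; /* 1 for TOY_SCAN_ONE, total sent for TOY_SCAN_REP */
};

/*
 * Walk the received messages in order. Returns 0 if the number of vdevs
 * seen before the reply equals the count carried by the reply, and -1 if
 * the counts differ or no reply was received.
 */
static int
toy_check_scan(const struct toy_msg *msgs, size_t n)
{
	int seen = 0;
	size_t i;

	assert(msgs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (msgs[i].type == TOY_SCAN_ONE)
			seen += msgs[i].num;
		else if (msgs[i].type == TOY_SCAN_REP)
			return seen == msgs[i].num ? 0 : -1;
	}
	return -1;
}
```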

Thanks,
Jianfeng

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-20 15:32           ` Tan, Jianfeng
@ 2018-04-20 15:39             ` Burakov, Anatoly
  0 siblings, 0 replies; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-20 15:39 UTC (permalink / raw)
  To: Tan, Jianfeng, dev; +Cc: thomas

On 20-Apr-18 4:32 PM, Tan, Jianfeng wrote:
> 
> 
> On 4/20/2018 11:19 PM, Burakov, Anatoly wrote:
>> On 20-Apr-18 3:28 PM, Tan, Jianfeng wrote:
>>>
>>>
>>> On 4/20/2018 4:41 PM, Burakov, Anatoly wrote:
>>>> On 19-Apr-18 5:50 PM, Jianfeng Tan wrote:
>>>>> To scan the vdevs in primary, we send request to primary process
>>>>> to obtain the names for vdevs.
>>>>>
>>>>> Only the name is shared from the primary. In probe(), the device
>>>>> driver is supposed to locate (or request more) the detail
>>>>> information from the primary.
>>>>>
>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>>> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
>>>>> ---
>>>>
>>>> <...>
>>>>
>>>>> +static int
>>>>> +vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
>>>>> +{
>>>>> +    struct rte_vdev_device *dev;
>>>>> +    struct rte_mp_msg mp_resp;
>>>>> +    struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
>>>>> +    const struct vdev_param *in = (const struct vdev_param 
>>>>> *)mp_msg->param;
>>>>> +    const char *devname;
>>>>> +    int num;
>>>>> +
>>>>> +    strcpy(mp_resp.name, "vdev");
>>>>> +    mp_resp.len_param = sizeof(*ou);
>>>>> +    mp_resp.num_fds = 0;
>>>>> +
>>>>> +    switch (in->type) {
>>>>> +    case VDEV_SCAN_REQ:
>>>>> +        ou->type = VDEV_SCAN_ONE;
>>>>> +        ou->num = 1;
>>>>> +        num = 0;
>>>>> +
>>>>> +        rte_spinlock_lock(&vdev_device_list_lock);
>>>>> +        TAILQ_FOREACH(dev, &vdev_device_list, next) {
>>>>> +            devname = rte_vdev_device_name(dev);
>>>>> +            if (strlen(devname) == 0)
>>>>> +                VDEV_LOG(INFO, "vdev with no name is not sent");
>>>>> +            VDEV_LOG(INFO, "send vdev, %s", devname);
>>>>> +            strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
>>>>
>>>> Probably better use strlcpy as it always null-terminates.
>>>
>>> Yep.
>>>
>>>>
>>>>> +            if (rte_mp_sendmsg(&mp_resp) < 0)
>>>>> +                VDEV_LOG(ERR, "send vdev, %s, failed, %s",
>>>>> +                     devname, strerror(rte_errno));
>>>>> +            num++;
>>>>
>>>> Some comments on what is going on here (why are we sending messages 
>>>> in response? why multiple? who will receive these messages?) would 
>>>> be nice.
>>>
>>> Yep, will explain that below.
>>>
>>>> I have a sneaking suspicion that you could've packed the response 
>>>> into one single message, but i'm not completely sure what is going 
>>>> on here, so maybe what you have here makes sense...
>>>
>>> What's happening here is that:
>>>
>>> a. Secondary process sends a sync request to ask for vdev in primary.
>>> b. Primary process receives the request, and sends vdevs one by one.
>>> c. Primary process sends back reply, which indicates how many vdevs 
>>> are sent.
>>>
>>> The reason we don't pack all vdevs in the reply message is that the 
>>> message parameter is limited to RTE_MP_MAX_PARAM_LEN (256) bytes. It's 
>>> possible that we cannot pack all vdevs in a single reply message.
>>>
>>
>> OK. How does secondary know which vdevs are new and which aren't?
> 
> This auto discovery is designed for secondary boot to know which vdevs 
> are used in primary. So they are all new to the secondary process. For 
> runtime vdev add in primary, we are going to rely on hotplug framework 
> to tell the news to secondary processes.
> 
>> Does it even matter how many vdevs primary has sent? Correct me if i'm 
>> wrong, but it seems that you're only using sync request as kind of 
>> synchronization mechanism, and are not actually expecting any useful 
>> data in the reply. Which is OK, but in that case just don't bother 
>> sending any data in the reply in the first place :)
> 
> I would like to keep this information, so that secondary process can 
> tell how many vdevs come from primary process (secondary process can 
> definitely iterate the vdev list to know, but it's not that straightforward).
> 

OK, no strong objections here :)

-- 
Thanks,
Anatoly

* [PATCH v4 0/5] allow procinfo and pdump on eth vdev
  2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (4 preceding siblings ...)
  2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
@ 2018-04-20 16:57 ` Jianfeng Tan
  2018-04-20 16:57   ` [PATCH v4 1/5] eal: bring forward multi-process channel init Jianfeng Tan
                     ` (4 more replies)
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  6 siblings, 5 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-20 16:57 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

v4:
  - Change the lock code style as suggested by Anatoly.
  - Add function note as suggested by Anatoly.

v3:
  - Update doc.
  - Rebase on master.

v2:
  - Add spinlock for vdev device list as suggested by Anatoly.
  - Add ring, cxgbe and remove the free in each PMDs as suggested by Matan.
  - Rebase on master.

As we know, vdev currently has the following limitations:
  - dpdk-procinfo cannot get the stats of (most) vdev in primary process;
  - dpdk-pdump cannot dump the packets for (most) vdev in primary process;
  - secondary process cannot use (most) vdev in primary process.

The root cause is that the secondary process does not know that those
vdevs exist: vdevs are chained on a linked list that is not shared with
the secondary.

In this patchset, we would like to propose a vdev sharing model like this:
  - As a secondary process boots, all devices (including vdevs) in the
    primary will be automatically shared. The remaining items apply after
    both primary and secondary processes have booted.
  - Device add/remove in primary will be translated to device hot
    plug/unplug events in secondary processes. (TODO)
  - Device add in secondary:
    * If that kind of device supports multi-process, the secondary will
      request the primary to probe the device and to share it with the
      secondary. A secondary-private device is not necessary in this
      case. (TODO)
    * If that kind of device does not support multi-process, the secondary
      will probe the device by itself, and the port id is shared among
      all primary/secondary processes.

This patchset doesn't:
  - provide secondary data path (Rx/Tx) support for each specific vdev.

How to test:

Step 0: start testpmd with a vhost port and a VM connected to it.

Step 1: try using dpdk-procinfo to get the stats.
 $(dpdk-procinfo) --log-level=8 --no-pci -- --stats

Step 2: try using dpdk-pdump to dump the packets.
 $(dpdk-pdump) -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'


Jianfeng Tan (5):
  eal: bring forward multi-process channel init
  bus/vdev: add lock on vdev device list
  bus/vdev: bus scan by multi-process channel
  drivers/net: not use private eth dev data
  drivers/net: share vdev data to secondary process

 doc/guides/rel_notes/release_18_05.rst    |   5 +
 drivers/bus/vdev/Makefile                 |   1 +
 drivers/bus/vdev/vdev.c                   | 193 ++++++++++++++++++++++++++----
 drivers/net/af_packet/rte_eth_af_packet.c |  43 +++----
 drivers/net/bonding/rte_eth_bond_pmd.c    |  13 ++
 drivers/net/cxgbe/cxgbe_main.c            |   1 -
 drivers/net/failsafe/failsafe.c           |  14 +++
 drivers/net/kni/rte_eth_kni.c             |  26 ++--
 drivers/net/null/rte_eth_null.c           |  32 ++---
 drivers/net/octeontx/octeontx_ethdev.c    |  29 +++--
 drivers/net/pcap/rte_eth_pcap.c           |  32 ++---
 drivers/net/ring/rte_eth_ring.c           |  17 +--
 drivers/net/softnic/rte_eth_softnic.c     |  19 ++-
 drivers/net/tap/rte_eth_tap.c             |  24 ++--
 drivers/net/vhost/rte_eth_vhost.c         |  36 +++---
 lib/librte_eal/bsdapp/eal/eal.c           |  23 ++--
 lib/librte_eal/linuxapp/eal/eal.c         |  23 ++--
 17 files changed, 360 insertions(+), 171 deletions(-)

-- 
2.7.4

* [PATCH v4 1/5] eal: bring forward multi-process channel init
  2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
@ 2018-04-20 16:57   ` Jianfeng Tan
  2018-04-20 16:57   ` [PATCH v4 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-20 16:57 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

Adjust the init sequence: put mp channel init before bus scan
so that we can init the vdev bus through mp channel in the
secondary process before the bus scan.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c   | 23 +++++++++++++----------
 lib/librte_eal/linuxapp/eal/eal.c | 23 +++++++++++++----------
 2 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d996190..d315cde 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -552,6 +552,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -595,16 +608,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 	/* in secondary processes, memory init may allocate additional fbarrays
 	 * not present in primary processes, so to avoid any potential issues,
 	 * initialize memzones first.
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 21afa73..5b23bf0 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -770,6 +770,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -820,8 +833,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
 	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
 		rte_eal_init_alert("Cannot init logging.");
 		rte_errno = ENOMEM;
@@ -829,14 +840,6 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0) {
 		rte_eal_init_alert("Cannot init VFIO\n");
-- 
2.7.4

* [PATCH v4 2/5] bus/vdev: add lock on vdev device list
  2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-20 16:57   ` [PATCH v4 1/5] eal: bring forward multi-process channel init Jianfeng Tan
@ 2018-04-20 16:57   ` Jianfeng Tan
  2018-04-23  9:47     ` Burakov, Anatoly
  2018-04-20 16:57   ` [PATCH v4 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-20 16:57 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

As virtual devices can now be added from different threads, we
add a spin lock to protect the vdev device list.

Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 drivers/bus/vdev/vdev.c | 95 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 69 insertions(+), 26 deletions(-)

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index f8dd1f5..70964f5 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -33,6 +33,8 @@ TAILQ_HEAD(vdev_device_list, rte_vdev_device);
 
 static struct vdev_device_list vdev_device_list =
 	TAILQ_HEAD_INITIALIZER(vdev_device_list);
+static rte_spinlock_t vdev_device_list_lock = RTE_SPINLOCK_INITIALIZER;
+
 struct vdev_driver_list vdev_driver_list =
 	TAILQ_HEAD_INITIALIZER(vdev_driver_list);
 
@@ -149,6 +151,7 @@ vdev_probe_all_drivers(struct rte_vdev_device *dev)
 	return ret;
 }
 
+/* The caller is responsible for thread safety */
 static struct rte_vdev_device *
 find_vdev(const char *name)
 {
@@ -193,8 +196,8 @@ alloc_devargs(const char *name, const char *args)
 	return devargs;
 }
 
-int
-rte_vdev_init(const char *name, const char *args)
+static int
+insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
 {
 	struct rte_vdev_device *dev;
 	struct rte_devargs *devargs;
@@ -203,10 +206,6 @@ rte_vdev_init(const char *name, const char *args)
 	if (name == NULL)
 		return -EINVAL;
 
-	dev = find_vdev(name);
-	if (dev)
-		return -EEXIST;
-
 	devargs = alloc_devargs(name, args);
 	if (!devargs)
 		return -ENOMEM;
@@ -221,18 +220,18 @@ rte_vdev_init(const char *name, const char *args)
 	dev->device.numa_node = SOCKET_ID_ANY;
 	dev->device.name = devargs->name;
 
-	ret = vdev_probe_all_drivers(dev);
-	if (ret) {
-		if (ret > 0)
-			VDEV_LOG(ERR, "no driver found for %s\n", name);
+	if (find_vdev(name)) {
+		ret = -EEXIST;
 		goto fail;
 	}
 
+	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
 	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
 
-	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
-	return 0;
+	if (p_dev)
+		*p_dev = dev;
 
+	return 0;
 fail:
 	free(devargs->args);
 	free(devargs);
@@ -240,6 +239,33 @@ rte_vdev_init(const char *name, const char *args)
 	return ret;
 }
 
+int
+rte_vdev_init(const char *name, const char *args)
+{
+	struct rte_vdev_device *dev;
+	struct rte_devargs *devargs;
+	int ret;
+
+	rte_spinlock_lock(&vdev_device_list_lock);
+	ret = insert_vdev(name, args, &dev);
+	if (ret == 0) {
+		ret = vdev_probe_all_drivers(dev);
+		if (ret) {
+			if (ret > 0)
+				VDEV_LOG(ERR, "no driver found for %s\n", name);
+			/* If fails, remove it from vdev list */
+			devargs = dev->device.devargs;
+			TAILQ_REMOVE(&vdev_device_list, dev, next);
+			TAILQ_REMOVE(&devargs_list, devargs, next);
+			free(devargs->args);
+			free(devargs);
+			free(dev);
+		}
+	}
+	rte_spinlock_unlock(&vdev_device_list_lock);
+	return ret;
+}
+
 static int
 vdev_remove_driver(struct rte_vdev_device *dev)
 {
@@ -266,24 +292,28 @@ rte_vdev_uninit(const char *name)
 	if (name == NULL)
 		return -EINVAL;
 
-	dev = find_vdev(name);
-	if (!dev)
-		return -ENOENT;
+	rte_spinlock_lock(&vdev_device_list_lock);
 
-	devargs = dev->device.devargs;
+	dev = find_vdev(name);
+	if (!dev) {
+		ret = -ENOENT;
+		goto unlock;
+	}
 
 	ret = vdev_remove_driver(dev);
 	if (ret)
-		return ret;
+		goto unlock;
 
 	TAILQ_REMOVE(&vdev_device_list, dev, next);
-
+	devargs = dev->device.devargs;
 	TAILQ_REMOVE(&devargs_list, devargs, next);
-
 	free(devargs->args);
 	free(devargs);
 	free(dev);
-	return 0;
+
+unlock:
+	rte_spinlock_unlock(&vdev_device_list_lock);
+	return ret;
 }
 
 static int
@@ -314,19 +344,25 @@ vdev_scan(void)
 		if (devargs->bus != &rte_vdev_bus)
 			continue;
 
-		dev = find_vdev(devargs->name);
-		if (dev)
-			continue;
-
 		dev = calloc(1, sizeof(*dev));
 		if (!dev)
 			return -1;
 
+		rte_spinlock_lock(&vdev_device_list_lock);
+
+		if (find_vdev(devargs->name)) {
+			rte_spinlock_unlock(&vdev_device_list_lock);
+			free(dev);
+			continue;
+		}
+
 		dev->device.devargs = devargs;
 		dev->device.numa_node = SOCKET_ID_ANY;
 		dev->device.name = devargs->name;
 
 		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+
+		rte_spinlock_unlock(&vdev_device_list_lock);
 	}
 
 	return 0;
@@ -340,6 +376,10 @@ vdev_probe(void)
 
 	/* call the init function for each virtual device */
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
+		/* we don't use the vdev lock here, as it's only used in DPDK
+		 * initialization; and we don't want to hold such a lock when
+		 * we call each driver probe.
+		 */
 
 		if (dev->device.driver)
 			continue;
@@ -360,15 +400,18 @@ vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 {
 	struct rte_vdev_device *dev;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
 		if (start && &dev->device == start) {
 			start = NULL;
 			continue;
 		}
 		if (cmp(&dev->device, data) == 0)
-			return &dev->device;
+			break;
 	}
-	return NULL;
+	rte_spinlock_unlock(&vdev_device_list_lock);
+
+	return dev ? &dev->device : NULL;
 }
 
 static int
-- 
2.7.4

* [PATCH v4 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-20 16:57   ` [PATCH v4 1/5] eal: bring forward multi-process channel init Jianfeng Tan
  2018-04-20 16:57   ` [PATCH v4 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
@ 2018-04-20 16:57   ` Jianfeng Tan
  2018-04-23  9:54     ` Burakov, Anatoly
  2018-04-20 16:57   ` [PATCH v4 4/5] drivers/net: not use private eth dev data Jianfeng Tan
  2018-04-20 16:57   ` [PATCH v4 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
  4 siblings, 1 reply; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-20 16:57 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

To scan the vdevs in primary, the secondary process sends a request to
the primary process to obtain the names of its vdevs.

Only the name is shared from the primary. In probe(), the device
driver is supposed to locate (or request) the detailed
information from the primary.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 drivers/bus/vdev/Makefile |   1 +
 drivers/bus/vdev/vdev.c   | 100 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/drivers/bus/vdev/Makefile b/drivers/bus/vdev/Makefile
index 24d424a..bd0bb89 100644
--- a/drivers/bus/vdev/Makefile
+++ b/drivers/bus/vdev/Makefile
@@ -10,6 +10,7 @@ LIB = librte_bus_vdev.a
 
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 
 # versioning export map
 EXPORT_MAP := rte_bus_vdev_version.map
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 70964f5..ce5332a 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -18,6 +18,7 @@
 #include <rte_memory.h>
 #include <rte_tailq.h>
 #include <rte_spinlock.h>
+#include <rte_string_fns.h>
 #include <rte_errno.h>
 
 #include "rte_bus_vdev.h"
@@ -316,6 +317,77 @@ rte_vdev_uninit(const char *name)
 	return ret;
 }
 
+struct vdev_param {
+#define VDEV_SCAN_REQ	1
+#define VDEV_SCAN_ONE	2
+#define VDEV_SCAN_REP	3
+	int type;
+	int num;
+	char name[RTE_DEV_NAME_MAX_LEN];
+};
+
+static int vdev_plug(struct rte_device *dev);
+
+/**
+ * This function works as the action for both primary and secondary process
+ * for static vdev discovery when a secondary process is booting.
+ *
+ * step 1, secondary process sends a sync request to ask for vdev in primary;
+ * step 2, primary process receives the request, and send vdevs one by one;
+ * step 3, primary process sends back reply, which indicates how many vdevs
+ * are sent.
+ */
+static int
+vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_vdev_device *dev;
+	struct rte_mp_msg mp_resp;
+	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
+	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
+	const char *devname;
+	int num;
+
+	strcpy(mp_resp.name, "vdev");
+	mp_resp.len_param = sizeof(*ou);
+	mp_resp.num_fds = 0;
+
+	switch (in->type) {
+	case VDEV_SCAN_REQ:
+		ou->type = VDEV_SCAN_ONE;
+		ou->num = 1;
+		num = 0;
+
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_FOREACH(dev, &vdev_device_list, next) {
+			devname = rte_vdev_device_name(dev);
+			if (strlen(devname) == 0)
+				VDEV_LOG(INFO, "vdev with no name is not sent");
+			VDEV_LOG(INFO, "send vdev, %s", devname);
+			strlcpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
+			if (rte_mp_sendmsg(&mp_resp) < 0)
+				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
+					 devname, strerror(rte_errno));
+			num++;
+		}
+		rte_spinlock_unlock(&vdev_device_list_lock);
+
+		ou->type = VDEV_SCAN_REP;
+		ou->num = num;
+		if (rte_mp_reply(&mp_resp, peer) < 0)
+			VDEV_LOG(ERR, "Failed to reply a scan request");
+		break;
+	case VDEV_SCAN_ONE:
+		VDEV_LOG(INFO, "receive vdev, %s", in->name);
+		if (insert_vdev(in->name, NULL, NULL) < 0)
+			VDEV_LOG(ERR, "failed to add vdev, %s", in->name);
+		break;
+	default:
+		VDEV_LOG(ERR, "vdev cannot recognize this message");
+	}
+
+	return 0;
+}
+
 static int
 vdev_scan(void)
 {
@@ -323,6 +395,34 @@ vdev_scan(void)
 	struct rte_devargs *devargs;
 	struct vdev_custom_scan *custom_scan;
 
+	if (rte_mp_action_register("vdev", vdev_action) < 0 &&
+	    rte_errno != EEXIST) {
+		VDEV_LOG(ERR, "vdev fails to add action");
+		return -1;
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		struct rte_mp_msg mp_req, *mp_rep;
+		struct rte_mp_reply mp_reply;
+		struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+		struct vdev_param *req = (struct vdev_param *)mp_req.param;
+		struct vdev_param *resp;
+
+		strcpy(mp_req.name, "vdev");
+		mp_req.len_param = sizeof(*req);
+		mp_req.num_fds = 0;
+		req->type = VDEV_SCAN_REQ;
+		if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0 &&
+		    mp_reply.nb_received == 1) {
+			mp_rep = &mp_reply.msgs[0];
+			resp = (struct vdev_param *)mp_rep->param;
+			VDEV_LOG(INFO, "Received %d vdevs", resp->num);
+		} else
+			VDEV_LOG(ERR, "Failed to request vdev from primary");
+
+		/* Fall through to allow private vdevs in secondary process */
+	}
+
 	/* call custom scan callbacks if any */
 	rte_spinlock_lock(&vdev_custom_scan_lock);
 	TAILQ_FOREACH(custom_scan, &vdev_custom_scans, next) {
-- 
2.7.4

* [PATCH v4 4/5] drivers/net: not use private eth dev data
  2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                     ` (2 preceding siblings ...)
  2018-04-20 16:57   ` [PATCH v4 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-04-20 16:57   ` Jianfeng Tan
  2018-04-20 16:57   ` [PATCH v4 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
  4 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-20 16:57 UTC (permalink / raw)
  To: dev
  Cc: thomas, Jianfeng Tan, John W . Linville, Ferruh Yigit,
	Tetsuya Mukawa, Santosh Shukla, Jerin Jacob, Pascal Mazon,
	Maxime Coquelin, Bruce Richardson, Rahul Lakkireddy

We introduced private rte_eth_dev_data to allow vdevs to be created
both in the primary process and in secondary process(es). This is not
friendly to the multi-process model; for example, it leads to port id
contention if two processes both find the same data entry free.

Also, to get the stats of a primary vdev in a secondary process, the
data must be allocated from the pre-defined array so that the secondary
can find it.

Cc: John W. Linville <linville@tuxdriver.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Tetsuya Mukawa <mtetsuyah@gmail.com>
Cc: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Cc: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Cc: Pascal Mazon <pascal.mazon@6wind.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>
Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 drivers/net/af_packet/rte_eth_af_packet.c | 26 +++++++-------------------
 drivers/net/cxgbe/cxgbe_main.c            |  1 -
 drivers/net/kni/rte_eth_kni.c             | 14 ++------------
 drivers/net/null/rte_eth_null.c           | 19 ++++---------------
 drivers/net/octeontx/octeontx_ethdev.c    | 15 ++-------------
 drivers/net/pcap/rte_eth_pcap.c           | 19 +++----------------
 drivers/net/ring/rte_eth_ring.c           | 17 +----------------
 drivers/net/tap/rte_eth_tap.c             | 11 +----------
 drivers/net/vhost/rte_eth_vhost.c         | 19 ++-----------------
 9 files changed, 22 insertions(+), 119 deletions(-)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 57eccfd..110e8a5 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: no interface specified for AF_PACKET ethdev\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 
 	RTE_LOG(INFO, PMD,
 		"%s: creating AF_PACKET-backed ethdev on numa socket %u\n",
 		name, numa_node);
 
-	/*
-	 * now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error_early;
-
 	*internals = rte_zmalloc_socket(name, sizeof(**internals),
 	                                0, numa_node);
 	if (*internals == NULL)
-		goto error_early;
+		return -1;
 
 	for (q = 0; q < nb_queues; q++) {
 		(*internals)->rx_queue[q].map = MAP_FAILED;
@@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: I/F name too long (%s)\n",
 			name, pair->value);
-		goto error_early;
+		return -1;
 	}
 	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFINDEX)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	(*internals)->if_name = strdup(pair->value);
 	if ((*internals)->if_name == NULL)
-		goto error_early;
+		return -1;
 	(*internals)->if_index = ifr.ifr_ifindex;
 
 	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFHWADDR)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
 
@@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 
 	(*internals)->nb_queues = nb_queues;
 
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->dev_private = *internals;
 	data->nb_rx_queues = (uint16_t)nb_queues;
 	data->nb_tx_queues = (uint16_t)nb_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &(*internals)->eth_addr;
 
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 	}
 	free((*internals)->if_name);
 	rte_free(*internals);
-error_early:
-	rte_free(data);
 	return -1;
 }
 
@@ -985,7 +974,6 @@ rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
 	free(internals->if_name);
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index c786a1a..74bccd5 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -29,7 +29,6 @@
 #include <rte_ether.h>
 #include <rte_ethdev_driver.h>
 #include <rte_ethdev_pci.h>
-#include <rte_malloc.h>
 #include <rte_random.h>
 #include <rte_dev.h>
 #include <rte_kvargs.h>
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index c10e970..b7897b6 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -336,25 +336,17 @@ eth_kni_create(struct rte_vdev_device *vdev,
 	struct pmd_internals *internals;
 	struct rte_eth_dev_data *data;
 	struct rte_eth_dev *eth_dev;
-	const char *name;
 
 	RTE_LOG(INFO, PMD, "Creating kni ethdev on numa socket %u\n",
 			numa_node);
 
-	name = rte_vdev_device_name(vdev);
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return NULL;
-
 	/* reserve an ethdev entry */
 	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*internals));
-	if (eth_dev == NULL) {
-		rte_free(data);
+	if (!eth_dev)
 		return NULL;
-	}
 
 	internals = eth_dev->data->dev_private;
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = 1;
 	data->nb_tx_queues = 1;
 	data->dev_link = pmd_link;
@@ -362,7 +354,6 @@ eth_kni_create(struct rte_vdev_device *vdev,
 
 	eth_random_addr(internals->eth_addr.addr_bytes);
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &eth_kni_ops;
 
 	internals->no_request_thread = args->no_request_thread;
@@ -458,7 +449,6 @@ eth_kni_remove(struct rte_vdev_device *vdev)
 	rte_kni_release(internals->kni);
 
 	rte_free(internals);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 74dde95..7d89a32 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -495,7 +495,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 {
 	const unsigned nb_rx_queues = 1;
 	const unsigned nb_tx_queues = 1;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internals *internals = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 
@@ -512,19 +512,10 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 	RTE_LOG(INFO, PMD, "Creating null ethdev on numa socket %u\n",
 		dev->device.numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(rte_vdev_device_name(dev), sizeof(*data), 0,
-		dev->device.numa_node);
-	if (!data)
-		return -ENOMEM;
-
 	eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internals));
-	if (!eth_dev) {
-		rte_free(data);
+	if (!eth_dev)
 		return -ENOMEM;
-	}
+
 	/* now put it all together
 	 * - store queue data in internals,
 	 * - store numa_node info in ethdev data
@@ -545,13 +536,12 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 
 	rte_memcpy(internals->rss_key, default_rss_key, 40);
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->eth_addr;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 
 	/* finally assign rx and tx ops */
@@ -669,7 +659,6 @@ rte_pmd_null_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index 6d67d25..ee06cd3 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1068,7 +1068,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	char octtx_name[OCTEONTX_MAX_NAME_LEN];
 	struct octeontx_nic *nic = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	const char *name = rte_vdev_device_name(dev);
 
 	PMD_INIT_FUNC_TRACE();
@@ -1084,13 +1084,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		return 0;
 	}
 
-	data = rte_zmalloc_socket(octtx_name, sizeof(*data), 0, socket_id);
-	if (data == NULL) {
-		octeontx_log_err("failed to allocate devdata");
-		res = -ENOMEM;
-		goto err;
-	}
-
 	nic = rte_zmalloc_socket(octtx_name, sizeof(*nic), 0, socket_id);
 	if (nic == NULL) {
 		octeontx_log_err("failed to allocate nic structure");
@@ -1126,11 +1119,9 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	eth_dev->data->kdrv = RTE_KDRV_NONE;
 	eth_dev->data->numa_node = dev->device.numa_node;
 
-	rte_memcpy(data, (eth_dev)->data, sizeof(*data));
+	data = eth_dev->data;
 	data->dev_private = nic;
-
 	data->port_id = eth_dev->data->port_id;
-	snprintf(data->name, sizeof(data->name), "%s", eth_dev->data->name);
 
 	nic->ev_queues = 1;
 	nic->ev_ports = 1;
@@ -1149,7 +1140,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		goto err;
 	}
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &octeontx_dev_ops;
 
 	/* Finally save ethdev pointer to the NIC structure */
@@ -1217,7 +1207,6 @@ octeontx_remove(struct rte_vdev_device *dev)
 
 		rte_free(eth_dev->data->mac_addrs);
 		rte_free(eth_dev->data->dev_private);
-		rte_free(eth_dev->data);
 		rte_eth_dev_release_port(eth_dev);
 		rte_event_dev_close(nic->evdev);
 	}
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index c1571e1..8740d52 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -773,27 +773,16 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 		struct pmd_internals **internals,
 		struct rte_eth_dev **eth_dev)
 {
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	unsigned int numa_node = vdev->device.numa_node;
-	const char *name;
 
-	name = rte_vdev_device_name(vdev);
 	RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket %d\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return -1;
-
 	/* reserve an ethdev entry */
 	*eth_dev = rte_eth_vdev_allocate(vdev, sizeof(**internals));
-	if (*eth_dev == NULL) {
-		rte_free(data);
+	if (!(*eth_dev))
 		return -1;
-	}
 
 	/* now put it all together
 	 * - store queue data in internals,
@@ -802,7 +791,7 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
 	*internals = (*eth_dev)->data->dev_private;
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
@@ -812,7 +801,6 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * NOTE: we'll replace the data element, of originally allocated
 	 * eth_dev so the rings are local per-process
 	 */
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -1020,7 +1008,6 @@ pmd_pcap_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index df13c44..e53823a 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -259,15 +259,6 @@ do_eth_dev_ring_create(const char *name,
 	RTE_LOG(INFO, PMD, "Creating rings-backed ethdev on numa socket %u\n",
 			numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL) {
-		rte_errno = ENOMEM;
-		goto error;
-	}
-
 	rx_queues_local = rte_zmalloc_socket(name,
 			sizeof(void *) * nb_rx_queues, 0, numa_node);
 	if (rx_queues_local == NULL) {
@@ -301,10 +292,8 @@ do_eth_dev_ring_create(const char *name,
 	 * - point eth_dev_data to internals
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
-	/* NOTE: we'll replace the data element, of originally allocated eth_dev
-	 * so the rings are local per-process */
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->rx_queues = rx_queues_local;
 	data->tx_queues = tx_queues_local;
 
@@ -326,7 +315,6 @@ do_eth_dev_ring_create(const char *name,
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->address;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 	data->kdrv = RTE_KDRV_NONE;
 	data->numa_node = numa_node;
@@ -342,7 +330,6 @@ do_eth_dev_ring_create(const char *name,
 error:
 	rte_free(rx_queues_local);
 	rte_free(tx_queues_local);
-	rte_free(data);
 	rte_free(internals);
 
 	return -1;
@@ -675,8 +662,6 @@ rte_pmd_ring_remove(struct rte_vdev_device *dev)
 	rte_free(eth_dev->data->tx_queues);
 	rte_free(eth_dev->data->dev_private);
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 	return 0;
 }
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 915d937..b18efd8 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1386,12 +1386,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	RTE_LOG(DEBUG, PMD, "%s device on numa %u\n",
 			tuntap_name, rte_socket_id());
 
-	data = rte_zmalloc_socket(tap_name, sizeof(*data), 0, numa_node);
-	if (!data) {
-		RTE_LOG(ERR, PMD, "%s Failed to allocate data\n", tuntap_name);
-		goto error_exit_nodev;
-	}
-
 	dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
 	if (!dev) {
 		RTE_LOG(ERR, PMD, "%s Unable to allocate device struct\n",
@@ -1412,7 +1406,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	}
 
 	/* Setup some default values */
-	rte_memcpy(data, dev->data, sizeof(*data));
+	data = dev->data;
 	data->dev_private = pmd;
 	data->dev_flags = RTE_ETH_DEV_INTR_LSC;
 	data->numa_node = numa_node;
@@ -1423,7 +1417,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	data->nb_rx_queues = 0;
 	data->nb_tx_queues = 0;
 
-	dev->data = data;
 	dev->dev_ops = &ops;
 	dev->rx_pkt_burst = pmd_rx_burst;
 	dev->tx_pkt_burst = pmd_tx_burst;
@@ -1574,7 +1567,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	RTE_LOG(ERR, PMD, "%s Unable to initialize %s\n",
 		tuntap_name, rte_vdev_device_name(vdev));
 
-	rte_free(data);
 	return -EINVAL;
 }
 
@@ -1828,7 +1820,6 @@ rte_pmd_tap_remove(struct rte_vdev_device *dev)
 
 	close(internals->ioctl_sock);
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index d7d44a0..fea13eb 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1227,7 +1227,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	int16_t queues, const unsigned int numa_node, uint64_t flags)
 {
 	const char *name = rte_vdev_device_name(dev);
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internal *internal = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 	struct ether_addr *eth_addr = NULL;
@@ -1237,13 +1237,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	RTE_LOG(INFO, PMD, "Creating VHOST-USER backend on numa socket %u\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure and internal
-	 * (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error;
-
 	list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
 	if (list == NULL)
 		goto error;
@@ -1285,12 +1278,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	rte_spinlock_init(&vring_state->lock);
 	vring_states[eth_dev->data->port_id] = vring_state;
 
-	/* We'll replace the 'data' originally allocated by eth_dev. So the
-	 * vhost PMD resources won't be shared between multi processes.
-	 */
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
-	eth_dev->data = data;
-
+	data = eth_dev->data;
 	data->nb_rx_queues = queues;
 	data->nb_tx_queues = queues;
 	internal->max_queues = queues;
@@ -1331,7 +1319,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 		rte_eth_dev_release_port(eth_dev);
 	rte_free(internal);
 	rte_free(list);
-	rte_free(data);
 
 	return -1;
 }
@@ -1462,8 +1449,6 @@ rte_pmd_vhost_remove(struct rte_vdev_device *dev)
 	rte_free(vring_states[eth_dev->data->port_id]);
 	vring_states[eth_dev->data->port_id] = NULL;
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 
 	return 0;
-- 
2.7.4


* [PATCH v4 5/5] drivers/net: share vdev data to secondary process
  2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                     ` (3 preceding siblings ...)
  2018-04-20 16:57   ` [PATCH v4 4/5] drivers/net: not use private eth dev data Jianfeng Tan
@ 2018-04-20 16:57   ` Jianfeng Tan
  4 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-20 16:57 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

dpdk-procinfo, as a secondary process, cannot fetch stats for vdev.

This patch enables that by attaching the port from the shared data.
We also fill in the eth dev ops, although only some of them, for example
stats_get(), work in a secondary process.

Note that, we still cannot Rx/Tx packets on the ports which do not
support multi-process.
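
The attach pattern described above can be sketched outside DPDK as
follows. This is only a minimal sketch under stated assumptions, not the
driver code: enum proc_type, struct eth_dev, attach_secondary() and
probe_sketch() are hypothetical stand-ins for rte_eal_process_type(),
struct rte_eth_dev and rte_eth_dev_attach_secondary(), and the port table
is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the DPDK types; not the real definitions. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

struct eth_dev_ops {
	int (*stats_get)(void);
};

struct eth_dev {
	const char *name;                  /* shared data in the real ethdev */
	const struct eth_dev_ops *dev_ops; /* per-process pointer */
};

/* Made-up table of ports that the primary is assumed to have created. */
static struct eth_dev shared_ports[] = {
	{ "net_tap0", NULL },
	{ "net_vhost0", NULL },
};

static const struct eth_dev_ops sketch_ops = { NULL };

/* Mock of the attach step: find an already created port by name. */
static struct eth_dev *
attach_secondary(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(shared_ports) / sizeof(shared_ports[0]); i++)
		if (strcmp(shared_ports[i].name, name) == 0)
			return &shared_ports[i];
	return NULL;
}

/*
 * Returns 0 when the secondary attached to an existing port, -1 when the
 * attach failed, and 1 when the caller should continue with a normal probe.
 */
static int
probe_sketch(enum proc_type type, const char *name, const char *args,
	     const struct eth_dev_ops *ops)
{
	struct eth_dev *dev;

	assert(name != NULL && args != NULL);

	if (type == PROC_SECONDARY && strlen(args) == 0) {
		dev = attach_secondary(name);
		if (dev == NULL)
			return -1;
		/* Function pointers are per-process, so set them locally. */
		dev->dev_ops = ops;
		return 0;
	}
	return 1;
}
```

The sketch asserts that args is non-NULL before calling strlen(); whether
the real argument string can be NULL at this point is not stated here, so
treat that check as an assumption.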

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 doc/guides/rel_notes/release_18_05.rst    |  5 +++++
 drivers/net/af_packet/rte_eth_af_packet.c | 17 +++++++++++++++--
 drivers/net/bonding/rte_eth_bond_pmd.c    | 13 +++++++++++++
 drivers/net/failsafe/failsafe.c           | 14 ++++++++++++++
 drivers/net/kni/rte_eth_kni.c             | 12 ++++++++++++
 drivers/net/null/rte_eth_null.c           | 13 +++++++++++++
 drivers/net/octeontx/octeontx_ethdev.c    | 14 ++++++++++++++
 drivers/net/pcap/rte_eth_pcap.c           | 13 +++++++++++++
 drivers/net/softnic/rte_eth_softnic.c     | 19 ++++++++++++++++---
 drivers/net/tap/rte_eth_tap.c             | 13 +++++++++++++
 drivers/net/vhost/rte_eth_vhost.c         | 17 +++++++++++++++--
 11 files changed, 143 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index 290fa09..854efeb 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -115,6 +115,11 @@ New Features
 
   Linux uevent is supported as backend of this device event notification framework.
 
+* **Added support for procinfo and pdump on eth vdev.**
+
+  For ethernet virtual devices (like tap, pcap, etc), with this feature, we can get
+  stats/xstats on shared memory from secondary process, and also pdump packets on
+  those virtual devices.
 
 API Changes
 -----------
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 110e8a5..b394d3c 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -915,9 +915,22 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
 	int ret = 0;
 	struct rte_kvargs *kvlist;
 	int sockfd = -1;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL) {
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 2805c71..09696ea 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -3021,6 +3021,7 @@ bond_probe(struct rte_vdev_device *dev)
 	uint8_t bonding_mode, socket_id/*, agg_mode*/;
 	int  arg_count, port_id;
 	uint8_t agg_mode;
+	struct rte_eth_dev *eth_dev;
 
 	if (!dev)
 		return -EINVAL;
@@ -3028,6 +3029,18 @@ bond_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	RTE_LOG(INFO, EAL, "Initializing pmd_bond for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &default_dev_ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev),
 		pmd_bond_init_valid_arguments);
 	if (kvlist == NULL)
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index fa279cb..dc9b0d0 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -294,10 +294,24 @@ static int
 rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
 {
 	const char *name;
+	struct rte_eth_dev *eth_dev;
 
 	name = rte_vdev_device_name(vdev);
 	INFO("Initializing " FAILSAFE_DRIVER_NAME " for %s",
 			name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(vdev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &failsafe_ops;
+		return 0;
+	}
+
 	return fs_eth_dev_create(vdev);
 }
 
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index b7897b6..08fc6a3 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -404,6 +404,18 @@ eth_kni_probe(struct rte_vdev_device *vdev)
 	params = rte_vdev_device_args(vdev);
 	RTE_LOG(INFO, PMD, "Initializing eth_kni for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &eth_kni_ops;
+		return 0;
+	}
+
 	ret = eth_kni_kvargs_process(&args, params);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 7d89a32..6413a90 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -597,6 +597,7 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	unsigned packet_size = default_packet_size;
 	unsigned packet_copy = default_packet_copy;
 	struct rte_kvargs *kvlist = NULL;
+	struct rte_eth_dev *eth_dev;
 	int ret;
 
 	if (!dev)
@@ -606,6 +607,18 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	params = rte_vdev_device_args(dev);
 	RTE_LOG(INFO, PMD, "Initializing pmd_null for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	if (params != NULL) {
 		kvlist = rte_kvargs_parse(params, valid_arguments);
 		if (kvlist == NULL)
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index ee06cd3..04120f5 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1228,12 +1228,26 @@ octeontx_probe(struct rte_vdev_device *dev)
 	struct rte_event_dev_config dev_conf;
 	const char *eventdev_name = "event_octeontx";
 	struct rte_event_dev_info info;
+	struct rte_eth_dev *eth_dev;
 
 	struct octeontx_vdev_init_params init_params = {
 		OCTEONTX_VDEV_DEFAULT_MAX_NR_PORT
 	};
 
 	dev_name = rte_vdev_device_name(dev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(dev_name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", dev_name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &octeontx_dev_ops;
+		return 0;
+	}
+
 	res = octeontx_parse_vdev_init_params(&init_params, dev);
 	if (res < 0)
 		return -EINVAL;
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 8740d52..570c9e9 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -898,6 +898,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	struct rte_kvargs *kvlist;
 	struct pmd_devargs pcaps = {0};
 	struct pmd_devargs dumpers = {0};
+	struct rte_eth_dev *eth_dev;
 	int single_iface = 0;
 	int ret;
 
@@ -908,6 +909,18 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	start_cycles = rte_get_timer_cycles();
 	hz = rte_get_timer_hz();
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
 		return -1;
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index b0c1341..e324394 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -725,13 +725,26 @@ pmd_probe(struct rte_vdev_device *vdev)
 	uint16_t hard_port_id;
 	int numa_node;
 	void *dev_private;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(vdev);
 
-	RTE_LOG(INFO, PMD,
-		"Probing device \"%s\"\n",
-		rte_vdev_device_name(vdev));
+	RTE_LOG(INFO, PMD, "Probing device \"%s\"\n", name);
 
 	/* Parse input arguments */
 	params = rte_vdev_device_args(vdev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &pmd_ops;
+		return 0;
+	}
+
 	if (!params)
 		return -EINVAL;
 
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index b18efd8..cca5852 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1721,6 +1721,7 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	char tap_name[RTE_ETH_NAME_MAX_LEN];
 	char remote_iface[RTE_ETH_NAME_MAX_LEN];
 	struct ether_addr user_mac = { .addr_bytes = {0} };
+	struct rte_eth_dev *eth_dev;
 
 	tap_type = 1;
 	strcpy(tuntap_name, "TAP");
@@ -1728,6 +1729,18 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	params = rte_vdev_device_args(dev);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	speed = ETH_SPEED_NUM_10G;
 	snprintf(tap_name, sizeof(tap_name), "%s%d",
 		 DEFAULT_TAP_NAME, tap_unit++);
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index fea13eb..99a7727 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1362,9 +1362,22 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 	int client_mode = 0;
 	int dequeue_zero_copy = 0;
 	int iommu_support = 0;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
-- 
2.7.4


* Re: [PATCH v4 2/5] bus/vdev: add lock on vdev device list
  2018-04-20 16:57   ` [PATCH v4 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
@ 2018-04-23  9:47     ` Burakov, Anatoly
  0 siblings, 0 replies; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-23  9:47 UTC (permalink / raw)
  To: Jianfeng Tan, dev; +Cc: thomas

On 20-Apr-18 5:57 PM, Jianfeng Tan wrote:
> As we could add virtual devices from different threads now, we
> add a spin lock to protect the vdev device list.
> 
> Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> ---

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

-- 
Thanks,
Anatoly


* Re: [PATCH v4 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-20 16:57   ` [PATCH v4 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-04-23  9:54     ` Burakov, Anatoly
  2018-04-24  5:22       ` Tan, Jianfeng
  0 siblings, 1 reply; 48+ messages in thread
From: Burakov, Anatoly @ 2018-04-23  9:54 UTC (permalink / raw)
  To: Jianfeng Tan, dev; +Cc: thomas

On 20-Apr-18 5:57 PM, Jianfeng Tan wrote:
> To scan the vdevs in primary, we send request to primary process
> to obtain the names for vdevs.
> 
> Only the name is shared from the primary. In probe(), the device
> driver is supposed to locate (or request more) the detail
> information from the primary.
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> ---

<...>

> +static int
> +vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
> +{
> +	struct rte_vdev_device *dev;
> +	struct rte_mp_msg mp_resp;
> +	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
> +	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
> +	const char *devname;
> +	int num;
> +
> +	strcpy(mp_resp.name, "vdev");

This string is used in a lot of places, so... #define? also, i think 
action name is a bit too short. maybe make it more descriptive, like 
"bus_vdev" or something to that effect?

Also, i think Coverity will complain about not checking string length, 
so... strlcpy()?
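
Both suggestions can be sketched together as below. This is a rough
sketch, not the patch: VDEV_ACTION_NAME, MP_NAME_LEN, struct mp_msg_sketch
and set_action_name() are hypothetical names, and snprintf() stands in for
strlcpy() so the example builds on C libraries that lack it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names; the action string follows the suggestion above. */
#define VDEV_ACTION_NAME "bus_vdev"
#define MP_NAME_LEN 64

struct mp_msg_sketch {
	char name[MP_NAME_LEN];
};

/*
 * Bounded copy of the action name. The destination is always
 * NUL-terminated; returns 0 on success and -1 if the name was truncated.
 */
static int
set_action_name(struct mp_msg_sketch *msg, const char *name)
{
	int n;

	assert(msg != NULL && name != NULL);

	n = snprintf(msg->name, sizeof(msg->name), "%s", name);
	if (n < 0 || (size_t)n >= sizeof(msg->name))
		return -1;
	return 0;
}
```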

> +	mp_resp.len_param = sizeof(*ou);
> +	mp_resp.num_fds = 0;
> +
> +	switch (in->type) {
> +	case VDEV_SCAN_REQ:
> +		ou->type = VDEV_SCAN_ONE;
> +		ou->num = 1;
> +		num = 0;
> +
> +		rte_spinlock_lock(&vdev_device_list_lock);
> +		TAILQ_FOREACH(dev, &vdev_device_list, next) {
> +			devname = rte_vdev_device_name(dev);
> +			if (strlen(devname) == 0)
> +				VDEV_LOG(INFO, "vdev with no name is not sent");

The comment says it's "not sent" but code doesn't seem to indicate that 
this will happen. Forgot "continue"?
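
To illustrate the point, here is a stripped-down version of the loop with
the skip made explicit. It is only a sketch: count_sendable() is a
hypothetical helper that walks a plain array instead of the vdev list, and
the log and send calls are reduced to comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the names that would be sent; unnamed entries are skipped. */
static int
count_sendable(const char *const names[], size_t n)
{
	size_t i;
	int num = 0;

	assert(names != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (strlen(names[i]) == 0) {
			/* log "vdev with no name is not sent" here ... */
			continue; /* ... and really do not send it */
		}
		/* the send call would go here */
		num++;
	}
	return num;
}
```

Without the continue, the unnamed entry would fall through to the send and
be counted like any other device.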

> +			VDEV_LOG(INFO, "send vdev, %s", devname);
> +			strlcpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
> +			if (rte_mp_sendmsg(&mp_resp) < 0)
> +				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
> +					 devname, strerror(rte_errno));
> +			num++;
> +		}

Once all of that is addressed,

Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>

-- 
Thanks,
Anatoly


* Re: [PATCH v4 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-23  9:54     ` Burakov, Anatoly
@ 2018-04-24  5:22       ` Tan, Jianfeng
  0 siblings, 0 replies; 48+ messages in thread
From: Tan, Jianfeng @ 2018-04-24  5:22 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: thomas



> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Monday, April 23, 2018 5:55 PM
> To: Tan, Jianfeng; dev@dpdk.org
> Cc: thomas@monjalon.net
> Subject: Re: [dpdk-dev] [PATCH v4 3/5] bus/vdev: bus scan by multi-process
> channel
> 
> On 20-Apr-18 5:57 PM, Jianfeng Tan wrote:
> > To scan the vdevs in primary, we send request to primary process
> > to obtain the names for vdevs.
> >
> > Only the name is shared from the primary. In probe(), the device
> > driver is supposed to locate (or request more) the detail
> > information from the primary.
> >
> > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> > ---
> 
> <...>
> 
> > +static int
> > +vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
> > +{
> > +	struct rte_vdev_device *dev;
> > +	struct rte_mp_msg mp_resp;
> > +	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
> > +	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
> > +	const char *devname;
> > +	int num;
> > +
> > +	strcpy(mp_resp.name, "vdev");
> 
> This string is used in a lot of places, so... #define? also, i think
> action name is a bit too short. maybe make it more descriptive, like
> "bus_vdev" or something to that effect?
> 
> Also, i think Coverity will complain about not checking string length,
> so... strlcpy()?

Will do in next version.

> 
> > +	mp_resp.len_param = sizeof(*ou);
> > +	mp_resp.num_fds = 0;
> > +
> > +	switch (in->type) {
> > +	case VDEV_SCAN_REQ:
> > +		ou->type = VDEV_SCAN_ONE;
> > +		ou->num = 1;
> > +		num = 0;
> > +
> > +		rte_spinlock_lock(&vdev_device_list_lock);
> > +		TAILQ_FOREACH(dev, &vdev_device_list, next) {
> > +			devname = rte_vdev_device_name(dev);
> > +			if (strlen(devname) == 0)
> > +				VDEV_LOG(INFO, "vdev with no name is not sent");
> 
> The comment says it's "not sent" but code doesn't seem to indicate that
> this will happen. Forgot "continue"?

Nice catch, will fix.

Thanks,
Jianfeng

> 
> > +			VDEV_LOG(INFO, "send vdev, %s", devname);
> > +			strlcpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
> > +			if (rte_mp_sendmsg(&mp_resp) < 0)
> > +				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
> > +					 devname, strerror(rte_errno));
> > +			num++;
> > +		}
> 
> Once all of that is addressed,
> 
> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> --
> Thanks,
> Anatoly


* [PATCH v5 0/5] allow procinfo and pdump on eth vdev
  2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (5 preceding siblings ...)
  2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
@ 2018-04-24  5:51 ` Jianfeng Tan
  2018-04-24  5:51   ` [PATCH v5 1/5] eal: bring forward multi-process channel init Jianfeng Tan
                     ` (5 more replies)
  6 siblings, 6 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-24  5:51 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

v5:
  - Address a code style issue and an implementation bug as suggested
    by Anatoly.

v4:
  - Change the lock code style as suggested by Anatoly.
  - Add function note as suggested by Anatoly.

v3:
  - Update doc.
  - Rebase on master.

v2:
  - Add spinlock for vdev device list as suggested by Anatoly.
  - Add ring, cxgbe and remove the free in each PMDs as suggested by Matan.
  - Rebase on master.

As we know, vdevs currently have the following limitations:
  - dpdk-procinfo cannot get the stats of (most) vdev in primary process;
  - dpdk-pdump cannot dump the packets for (most) vdev in primary process;
  - secondary process cannot use (most) vdev in primary process.

The root cause is that the secondary process does not know that those
vdevs exist: vdevs are chained on a linked list that is not shared with
the secondary.

In this patchset, we would like to propose a vdev sharing model like this:
  - As a secondary process boots, all devices (including vdevs) in the
    primary will be automatically shared. After both the primary and the
    secondary process have booted:
  - Device add/remove in the primary will be translated to device hot
    plug/unplug events in the secondary processes. (TODO)
  - Device add in secondary
    * If that kind of device supports multi-process, the secondary will
      request the primary to probe the device and share it with the
      secondary. A secondary-private device is not necessary in this
      case. (TODO)
    * If that kind of device does not support multi-process, the secondary
      will probe the device by itself, and the port id is shared among
      all primary/secondary processes.

This patchset doesn't:
  - provide secondary data path (Rx/Tx) support for each specific vdev.

How to test:

Step 0: start testpmd with a vhost port and a VM connected to it.

Step 1: try using dpdk-procinfo to get the stats.
 $(dpdk-procinfo) --log-level=8 --no-pci -- --stats

Step 2: try using dpdk-pdump to dump the packets.
 $(dpdk-pdump) -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'


Jianfeng Tan (5):
  eal: bring forward multi-process channel init
  bus/vdev: add lock on vdev device list
  bus/vdev: bus scan by multi-process channel
  drivers/net: not use private eth dev data
  drivers/net: share vdev data to secondary process

 doc/guides/rel_notes/release_18_05.rst    |   5 +
 drivers/bus/vdev/Makefile                 |   1 +
 drivers/bus/vdev/vdev.c                   | 197 ++++++++++++++++++++++++++----
 drivers/net/af_packet/rte_eth_af_packet.c |  43 +++----
 drivers/net/bonding/rte_eth_bond_pmd.c    |  13 ++
 drivers/net/cxgbe/cxgbe_main.c            |   1 -
 drivers/net/failsafe/failsafe.c           |  14 +++
 drivers/net/kni/rte_eth_kni.c             |  26 ++--
 drivers/net/null/rte_eth_null.c           |  32 ++---
 drivers/net/octeontx/octeontx_ethdev.c    |  29 +++--
 drivers/net/pcap/rte_eth_pcap.c           |  32 ++---
 drivers/net/ring/rte_eth_ring.c           |  17 +--
 drivers/net/softnic/rte_eth_softnic.c     |  19 ++-
 drivers/net/tap/rte_eth_tap.c             |  24 ++--
 drivers/net/vhost/rte_eth_vhost.c         |  36 +++---
 lib/librte_eal/bsdapp/eal/eal.c           |  23 ++--
 lib/librte_eal/linuxapp/eal/eal.c         |  23 ++--
 17 files changed, 364 insertions(+), 171 deletions(-)

-- 
2.7.4


* [PATCH v5 1/5] eal: bring forward multi-process channel init
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
@ 2018-04-24  5:51   ` Jianfeng Tan
  2018-04-24  5:51   ` [PATCH v5 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-24  5:51 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

Adjust the init sequence: put mp channel init before bus scan
so that we can init the vdev bus through mp channel in the
secondary process before the bus scan.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c   | 23 +++++++++++++----------
 lib/librte_eal/linuxapp/eal/eal.c | 23 +++++++++++++----------
 2 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d996190..d315cde 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -552,6 +552,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -595,16 +608,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 	/* in secondary processes, memory init may allocate additional fbarrays
 	 * not present in primary processes, so to avoid any potential issues,
 	 * initialize memzones first.
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 21afa73..5b23bf0 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -770,6 +770,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -820,8 +833,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
 	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
 		rte_eal_init_alert("Cannot init logging.");
 		rte_errno = ENOMEM;
@@ -829,14 +840,6 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0) {
 		rte_eal_init_alert("Cannot init VFIO\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v5 2/5] bus/vdev: add lock on vdev device list
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-24  5:51   ` [PATCH v5 1/5] eal: bring forward multi-process channel init Jianfeng Tan
@ 2018-04-24  5:51   ` Jianfeng Tan
  2018-04-24  5:51   ` [PATCH v5 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-24  5:51 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

Virtual devices can now be added from different threads, so add a
spinlock to protect the vdev device list.

Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/vdev/vdev.c | 95 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 69 insertions(+), 26 deletions(-)
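
The locking rule described above can be sketched outside of DPDK. The
block below is a minimal illustration, not the patch itself: demo_dev,
demo_list, demo_lock and the demo_* helpers are hypothetical names, and a
C11 atomic_flag stands in for rte_spinlock_t. It shows the duplicate check
and the list insert sharing one critical section, so two threads cannot
both pass the check for the same name.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct rte_vdev_device. */
struct demo_dev {
	TAILQ_ENTRY(demo_dev) next;
	char name[64];
};

TAILQ_HEAD(demo_dev_list, demo_dev);

static struct demo_dev_list demo_list = TAILQ_HEAD_INITIALIZER(demo_list);
/* Minimal C11 spinlock standing in for rte_spinlock_t (assumption). */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void demo_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&demo_lock,
						 memory_order_acquire))
		; /* busy-wait until the flag is released */
}

static void demo_lock_release(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* The caller must hold demo_lock while calling this. */
static struct demo_dev *demo_find(const char *name)
{
	struct demo_dev *dev;

	assert(name != NULL);
	TAILQ_FOREACH(dev, &demo_list, next)
		if (strcmp(dev->name, name) == 0)
			return dev;
	return NULL;
}

/* Allocate outside the lock; check for duplicates and insert inside it. */
static int demo_insert(const char *name)
{
	struct demo_dev *dev;
	int ret = 0;

	if (name == NULL || strlen(name) >= sizeof(dev->name))
		return -EINVAL;

	dev = calloc(1, sizeof(*dev));
	if (dev == NULL)
		return -ENOMEM;
	strcpy(dev->name, name);

	demo_lock_acquire();
	if (demo_find(name) != NULL)
		ret = -EEXIST;
	else
		TAILQ_INSERT_TAIL(&demo_list, dev, next);
	demo_lock_release();

	if (ret != 0)
		free(dev);
	return ret;
}

/* Find and unlink under the lock; free after the lock is dropped. */
static int demo_remove(const char *name)
{
	struct demo_dev *dev;

	if (name == NULL)
		return -EINVAL;

	demo_lock_acquire();
	dev = demo_find(name);
	if (dev != NULL)
		TAILQ_REMOVE(&demo_list, dev, next);
	demo_lock_release();

	if (dev == NULL)
		return -ENOENT;
	free(dev);
	return 0;
}
```

Inserting the same name twice returns -EEXIST on the second call, and
removing a name that is not on the list returns -ENOENT. Memory allocation
is kept outside the critical section in this sketch so that the lock is
held only for the list walk and the link update.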

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index f8dd1f5..70964f5 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -33,6 +33,8 @@ TAILQ_HEAD(vdev_device_list, rte_vdev_device);
 
 static struct vdev_device_list vdev_device_list =
 	TAILQ_HEAD_INITIALIZER(vdev_device_list);
+static rte_spinlock_t vdev_device_list_lock = RTE_SPINLOCK_INITIALIZER;
+
 struct vdev_driver_list vdev_driver_list =
 	TAILQ_HEAD_INITIALIZER(vdev_driver_list);
 
@@ -149,6 +151,7 @@ vdev_probe_all_drivers(struct rte_vdev_device *dev)
 	return ret;
 }
 
+/* The caller is responsible for thread safety. */
 static struct rte_vdev_device *
 find_vdev(const char *name)
 {
@@ -193,8 +196,8 @@ alloc_devargs(const char *name, const char *args)
 	return devargs;
 }
 
-int
-rte_vdev_init(const char *name, const char *args)
+static int
+insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
 {
 	struct rte_vdev_device *dev;
 	struct rte_devargs *devargs;
@@ -203,10 +206,6 @@ rte_vdev_init(const char *name, const char *args)
 	if (name == NULL)
 		return -EINVAL;
 
-	dev = find_vdev(name);
-	if (dev)
-		return -EEXIST;
-
 	devargs = alloc_devargs(name, args);
 	if (!devargs)
 		return -ENOMEM;
@@ -221,18 +220,18 @@ rte_vdev_init(const char *name, const char *args)
 	dev->device.numa_node = SOCKET_ID_ANY;
 	dev->device.name = devargs->name;
 
-	ret = vdev_probe_all_drivers(dev);
-	if (ret) {
-		if (ret > 0)
-			VDEV_LOG(ERR, "no driver found for %s\n", name);
+	if (find_vdev(name)) {
+		ret = -EEXIST;
 		goto fail;
 	}
 
+	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
 	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
 
-	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
-	return 0;
+	if (p_dev)
+		*p_dev = dev;
 
+	return 0;
 fail:
 	free(devargs->args);
 	free(devargs);
@@ -240,6 +239,33 @@ rte_vdev_init(const char *name, const char *args)
 	return ret;
 }
 
+int
+rte_vdev_init(const char *name, const char *args)
+{
+	struct rte_vdev_device *dev;
+	struct rte_devargs *devargs;
+	int ret;
+
+	rte_spinlock_lock(&vdev_device_list_lock);
+	ret = insert_vdev(name, args, &dev);
+	if (ret == 0) {
+		ret = vdev_probe_all_drivers(dev);
+		if (ret) {
+			if (ret > 0)
+				VDEV_LOG(ERR, "no driver found for %s\n", name);
+			/* If fails, remove it from vdev list */
+			devargs = dev->device.devargs;
+			TAILQ_REMOVE(&vdev_device_list, dev, next);
+			TAILQ_REMOVE(&devargs_list, devargs, next);
+			free(devargs->args);
+			free(devargs);
+			free(dev);
+		}
+	}
+	rte_spinlock_unlock(&vdev_device_list_lock);
+	return ret;
+}
+
 static int
 vdev_remove_driver(struct rte_vdev_device *dev)
 {
@@ -266,24 +292,28 @@ rte_vdev_uninit(const char *name)
 	if (name == NULL)
 		return -EINVAL;
 
-	dev = find_vdev(name);
-	if (!dev)
-		return -ENOENT;
+	rte_spinlock_lock(&vdev_device_list_lock);
 
-	devargs = dev->device.devargs;
+	dev = find_vdev(name);
+	if (!dev) {
+		ret = -ENOENT;
+		goto unlock;
+	}
 
 	ret = vdev_remove_driver(dev);
 	if (ret)
-		return ret;
+		goto unlock;
 
 	TAILQ_REMOVE(&vdev_device_list, dev, next);
-
+	devargs = dev->device.devargs;
 	TAILQ_REMOVE(&devargs_list, devargs, next);
-
 	free(devargs->args);
 	free(devargs);
 	free(dev);
-	return 0;
+
+unlock:
+	rte_spinlock_unlock(&vdev_device_list_lock);
+	return ret;
 }
 
 static int
@@ -314,19 +344,25 @@ vdev_scan(void)
 		if (devargs->bus != &rte_vdev_bus)
 			continue;
 
-		dev = find_vdev(devargs->name);
-		if (dev)
-			continue;
-
 		dev = calloc(1, sizeof(*dev));
 		if (!dev)
 			return -1;
 
+		rte_spinlock_lock(&vdev_device_list_lock);
+
+		if (find_vdev(devargs->name)) {
+			rte_spinlock_unlock(&vdev_device_list_lock);
+			free(dev);
+			continue;
+		}
+
 		dev->device.devargs = devargs;
 		dev->device.numa_node = SOCKET_ID_ANY;
 		dev->device.name = devargs->name;
 
 		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+
+		rte_spinlock_unlock(&vdev_device_list_lock);
 	}
 
 	return 0;
@@ -340,6 +376,10 @@ vdev_probe(void)
 
 	/* call the init function for each virtual device */
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
+		/* we don't use the vdev lock here, as it's only used in DPDK
+		 * initialization; and we don't want to hold such a lock when
+		 * we call each driver probe.
+		 */
 
 		if (dev->device.driver)
 			continue;
@@ -360,15 +400,18 @@ vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 {
 	struct rte_vdev_device *dev;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
 		if (start && &dev->device == start) {
 			start = NULL;
 			continue;
 		}
 		if (cmp(&dev->device, data) == 0)
-			return &dev->device;
+			break;
 	}
-	return NULL;
+	rte_spinlock_unlock(&vdev_device_list_lock);
+
+	return dev ? &dev->device : NULL;
 }
 
 static int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v5 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-24  5:51   ` [PATCH v5 1/5] eal: bring forward multi-process channel init Jianfeng Tan
  2018-04-24  5:51   ` [PATCH v5 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
@ 2018-04-24  5:51   ` Jianfeng Tan
  2018-04-24 10:09     ` Thomas Monjalon
  2018-04-24  5:51   ` [PATCH v5 4/5] drivers/net: not use private eth dev data Jianfeng Tan
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-24  5:51 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

To scan the vdevs in the primary process, a secondary process sends a
request to the primary to obtain the vdev names.

Only the names are shared by the primary. In probe(), the device
driver is expected to locate (or request) any further details from
the primary.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/vdev/Makefile |   1 +
 drivers/bus/vdev/vdev.c   | 104 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+)
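
The request/reply exchange described above can be modelled in a single
process. The block below is only a sketch under stated assumptions:
demo_param mirrors the shape of the message payload, while demo_proc and
the two demo_* handlers are hypothetical and replace the real
multi-process channel with direct function calls. Unlike the patch, the
sketch counts a name only when delivery succeeds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 64
#define DEMO_MAX_DEVS 8

enum demo_msg_type {
	DEMO_SCAN_REQ = 1, /* secondary asks for the vdev names */
	DEMO_SCAN_ONE = 2, /* primary sends one name */
	DEMO_SCAN_REP = 3  /* primary reports how many names were sent */
};

struct demo_param {
	int type;
	int num;
	char name[DEMO_NAME_LEN];
};

/* Hypothetical per-process state: the device names each side knows. */
struct demo_proc {
	char names[DEMO_MAX_DEVS][DEMO_NAME_LEN];
	int count;
};

/* Secondary side: record the name carried by one SCAN_ONE message. */
static int
demo_secondary_on_msg(struct demo_proc *sec, const struct demo_param *in)
{
	if (in->type != DEMO_SCAN_ONE || sec->count >= DEMO_MAX_DEVS)
		return -1;
	snprintf(sec->names[sec->count], DEMO_NAME_LEN, "%s", in->name);
	sec->count++;
	return 0;
}

/*
 * Primary side: answer a scan request by sending every named device,
 * then build the final reply carrying the number of names delivered.
 */
static struct demo_param
demo_primary_on_req(const struct demo_proc *pri, struct demo_proc *sec)
{
	struct demo_param out;
	int i, sent = 0;

	memset(&out, 0, sizeof(out));
	out.type = DEMO_SCAN_ONE;
	out.num = 1;
	for (i = 0; i < pri->count; i++) {
		if (pri->names[i][0] == '\0')
			continue; /* devices without a name are not sent */
		snprintf(out.name, sizeof(out.name), "%s", pri->names[i]);
		if (demo_secondary_on_msg(sec, &out) == 0)
			sent++;
	}

	out.type = DEMO_SCAN_REP;
	out.num = sent;
	out.name[0] = '\0';
	return out;
}
```

With a primary that holds "net_tap0", an unnamed device and "net_null0",
the secondary ends up with two names and the reply reports num == 2.
Only names cross the boundary; anything else a driver needs would have to
be fetched during its own probe, as the commit message says.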

diff --git a/drivers/bus/vdev/Makefile b/drivers/bus/vdev/Makefile
index 24d424a..bd0bb89 100644
--- a/drivers/bus/vdev/Makefile
+++ b/drivers/bus/vdev/Makefile
@@ -10,6 +10,7 @@ LIB = librte_bus_vdev.a
 
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 
 # versioning export map
 EXPORT_MAP := rte_bus_vdev_version.map
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 70964f5..38ed70a 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -18,11 +18,14 @@
 #include <rte_memory.h>
 #include <rte_tailq.h>
 #include <rte_spinlock.h>
+#include <rte_string_fns.h>
 #include <rte_errno.h>
 
 #include "rte_bus_vdev.h"
 #include "vdev_logs.h"
 
+#define VDEV_MP_KEY	"bus_vdev_mp"
+
 int vdev_logtype_bus;
 
 /* Forward declare to access virtual bus name */
@@ -316,6 +319,79 @@ rte_vdev_uninit(const char *name)
 	return ret;
 }
 
+struct vdev_param {
+#define VDEV_SCAN_REQ	1
+#define VDEV_SCAN_ONE	2
+#define VDEV_SCAN_REP	3
+	int type;
+	int num;
+	char name[RTE_DEV_NAME_MAX_LEN];
+};
+
+static int vdev_plug(struct rte_device *dev);
+
+/**
+ * This function works as the action for both primary and secondary process
+ * for static vdev discovery when a secondary process is booting.
+ *
+ * step 1, secondary process sends a sync request to ask for vdevs in primary;
+ * step 2, primary process receives the request, and sends vdevs one by one;
+ * step 3, primary process sends back a reply, which indicates how many vdevs
+ * were sent.
+ */
+static int
+vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_vdev_device *dev;
+	struct rte_mp_msg mp_resp;
+	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
+	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
+	const char *devname;
+	int num;
+
+	strlcpy(mp_resp.name, VDEV_MP_KEY, sizeof(mp_resp.name));
+	mp_resp.len_param = sizeof(*ou);
+	mp_resp.num_fds = 0;
+
+	switch (in->type) {
+	case VDEV_SCAN_REQ:
+		ou->type = VDEV_SCAN_ONE;
+		ou->num = 1;
+		num = 0;
+
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_FOREACH(dev, &vdev_device_list, next) {
+			devname = rte_vdev_device_name(dev);
+			if (strlen(devname) == 0) {
+				VDEV_LOG(INFO, "vdev with no name is not sent");
+				continue;
+			}
+			VDEV_LOG(INFO, "send vdev, %s", devname);
+			strlcpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
+			if (rte_mp_sendmsg(&mp_resp) < 0)
+				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
+					 devname, strerror(rte_errno));
+			num++;
+		}
+		rte_spinlock_unlock(&vdev_device_list_lock);
+
+		ou->type = VDEV_SCAN_REP;
+		ou->num = num;
+		if (rte_mp_reply(&mp_resp, peer) < 0)
+			VDEV_LOG(ERR, "Failed to reply a scan request");
+		break;
+	case VDEV_SCAN_ONE:
+		VDEV_LOG(INFO, "receive vdev, %s", in->name);
+		if (insert_vdev(in->name, NULL, NULL) < 0)
+			VDEV_LOG(ERR, "failed to add vdev, %s", in->name);
+		break;
+	default:
+		VDEV_LOG(ERR, "vdev cannot recognize this message");
+	}
+
+	return 0;
+}
+
 static int
 vdev_scan(void)
 {
@@ -323,6 +399,34 @@ vdev_scan(void)
 	struct rte_devargs *devargs;
 	struct vdev_custom_scan *custom_scan;
 
+	if (rte_mp_action_register(VDEV_MP_KEY, vdev_action) < 0 &&
+	    rte_errno != EEXIST) {
+		VDEV_LOG(ERR, "Failed to add vdev mp action");
+		return -1;
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		struct rte_mp_msg mp_req, *mp_rep;
+		struct rte_mp_reply mp_reply;
+		struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+		struct vdev_param *req = (struct vdev_param *)mp_req.param;
+		struct vdev_param *resp;
+
+		strlcpy(mp_req.name, VDEV_MP_KEY, sizeof(mp_req.name));
+		mp_req.len_param = sizeof(*req);
+		mp_req.num_fds = 0;
+		req->type = VDEV_SCAN_REQ;
+		if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0 &&
+		    mp_reply.nb_received == 1) {
+			mp_rep = &mp_reply.msgs[0];
+			resp = (struct vdev_param *)mp_rep->param;
+			VDEV_LOG(INFO, "Received %d vdevs", resp->num);
+		} else
+			VDEV_LOG(ERR, "Failed to request vdev from primary");
+
+		/* Fall through to allow private vdevs in secondary process */
+	}
+
 	/* call custom scan callbacks if any */
 	rte_spinlock_lock(&vdev_custom_scan_lock);
 	TAILQ_FOREACH(custom_scan, &vdev_custom_scans, next) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v5 4/5] drivers/net: not use private eth dev data
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                     ` (2 preceding siblings ...)
  2018-04-24  5:51   ` [PATCH v5 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-04-24  5:51   ` Jianfeng Tan
  2018-04-24  5:51   ` [PATCH v5 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
  2018-04-24 10:32   ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Thomas Monjalon
  5 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-24  5:51 UTC (permalink / raw)
  To: dev
  Cc: thomas, Jianfeng Tan, John W . Linville, Ferruh Yigit,
	Tetsuya Mukawa, Santosh Shukla, Jerin Jacob, Pascal Mazon,
	Maxime Coquelin, Bruce Richardson, Rahul Lakkireddy

We introduced a private rte_eth_dev_data to allow a vdev to be created
in both the primary process and secondary process(es). This does not
fit the multi-process model: for example, it leads to port id
contention if two processes both find the same data entry free.

Also, to get the stats of a primary vdev in a secondary process, the
data must be allocated from the pre-defined array so that the
secondary can find it.

Cc: John W. Linville <linville@tuxdriver.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Tetsuya Mukawa <mtetsuyah@gmail.com>
Cc: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Cc: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Cc: Pascal Mazon <pascal.mazon@6wind.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>
Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 drivers/net/af_packet/rte_eth_af_packet.c | 26 +++++++-------------------
 drivers/net/cxgbe/cxgbe_main.c            |  1 -
 drivers/net/kni/rte_eth_kni.c             | 14 ++------------
 drivers/net/null/rte_eth_null.c           | 19 ++++---------------
 drivers/net/octeontx/octeontx_ethdev.c    | 15 ++-------------
 drivers/net/pcap/rte_eth_pcap.c           | 19 +++----------------
 drivers/net/ring/rte_eth_ring.c           | 17 +----------------
 drivers/net/tap/rte_eth_tap.c             | 11 +----------
 drivers/net/vhost/rte_eth_vhost.c         | 19 ++-----------------
 9 files changed, 22 insertions(+), 119 deletions(-)
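
The reasoning above, that device data must live in one pre-defined array
so that every process sees the same port ids, can be sketched as follows.
This is an illustration only: demo_port_data, demo_shared and the demo_*
functions are hypothetical, the array is an ordinary static rather than
real shared memory, and the cross-process lock that a real allocator
would need around the search-and-claim step is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NAME_LEN 64

/* Hypothetical stand-in for one entry of the shared device data array. */
struct demo_port_data {
	char name[DEMO_NAME_LEN]; /* an empty name marks a free slot */
	int port_id;
};

/* Stands in for the pre-defined array placed in shared memory. */
static struct demo_port_data demo_shared[DEMO_MAX_PORTS];

/* Find an entry by name, as a secondary process would when attaching. */
static struct demo_port_data *demo_port_lookup(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < DEMO_MAX_PORTS; i++)
		if (demo_shared[i].name[0] != '\0' &&
		    strcmp(demo_shared[i].name, name) == 0)
			return &demo_shared[i];
	return NULL;
}

/*
 * Claim the first free slot for a new device and return its port id,
 * or -1 if the name is invalid, already in use, or the array is full.
 */
static int demo_port_allocate(const char *name)
{
	int i;

	if (name == NULL || name[0] == '\0' ||
	    strlen(name) >= DEMO_NAME_LEN)
		return -1;
	if (demo_port_lookup(name) != NULL)
		return -1;

	for (i = 0; i < DEMO_MAX_PORTS; i++) {
		if (demo_shared[i].name[0] == '\0') {
			snprintf(demo_shared[i].name, DEMO_NAME_LEN,
				 "%s", name);
			demo_shared[i].port_id = i;
			return i;
		}
	}
	return -1;
}
```

Because the slot index doubles as the port id, a device allocated here
has a single id that any process mapping the array can look up by name.
A private copy of the entry would break that link, which is the
contention and visibility problem the commit message describes.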

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 57eccfd..110e8a5 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: no interface specified for AF_PACKET ethdev\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 
 	RTE_LOG(INFO, PMD,
 		"%s: creating AF_PACKET-backed ethdev on numa socket %u\n",
 		name, numa_node);
 
-	/*
-	 * now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error_early;
-
 	*internals = rte_zmalloc_socket(name, sizeof(**internals),
 	                                0, numa_node);
 	if (*internals == NULL)
-		goto error_early;
+		return -1;
 
 	for (q = 0; q < nb_queues; q++) {
 		(*internals)->rx_queue[q].map = MAP_FAILED;
@@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: I/F name too long (%s)\n",
 			name, pair->value);
-		goto error_early;
+		return -1;
 	}
 	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFINDEX)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	(*internals)->if_name = strdup(pair->value);
 	if ((*internals)->if_name == NULL)
-		goto error_early;
+		return -1;
 	(*internals)->if_index = ifr.ifr_ifindex;
 
 	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFHWADDR)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
 
@@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 
 	(*internals)->nb_queues = nb_queues;
 
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->dev_private = *internals;
 	data->nb_rx_queues = (uint16_t)nb_queues;
 	data->nb_tx_queues = (uint16_t)nb_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &(*internals)->eth_addr;
 
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 	}
 	free((*internals)->if_name);
 	rte_free(*internals);
-error_early:
-	rte_free(data);
 	return -1;
 }
 
@@ -985,7 +974,6 @@ rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
 	free(internals->if_name);
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index c786a1a..74bccd5 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -29,7 +29,6 @@
 #include <rte_ether.h>
 #include <rte_ethdev_driver.h>
 #include <rte_ethdev_pci.h>
-#include <rte_malloc.h>
 #include <rte_random.h>
 #include <rte_dev.h>
 #include <rte_kvargs.h>
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index c10e970..b7897b6 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -336,25 +336,17 @@ eth_kni_create(struct rte_vdev_device *vdev,
 	struct pmd_internals *internals;
 	struct rte_eth_dev_data *data;
 	struct rte_eth_dev *eth_dev;
-	const char *name;
 
 	RTE_LOG(INFO, PMD, "Creating kni ethdev on numa socket %u\n",
 			numa_node);
 
-	name = rte_vdev_device_name(vdev);
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return NULL;
-
 	/* reserve an ethdev entry */
 	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*internals));
-	if (eth_dev == NULL) {
-		rte_free(data);
+	if (!eth_dev)
 		return NULL;
-	}
 
 	internals = eth_dev->data->dev_private;
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = 1;
 	data->nb_tx_queues = 1;
 	data->dev_link = pmd_link;
@@ -362,7 +354,6 @@ eth_kni_create(struct rte_vdev_device *vdev,
 
 	eth_random_addr(internals->eth_addr.addr_bytes);
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &eth_kni_ops;
 
 	internals->no_request_thread = args->no_request_thread;
@@ -458,7 +449,6 @@ eth_kni_remove(struct rte_vdev_device *vdev)
 	rte_kni_release(internals->kni);
 
 	rte_free(internals);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 74dde95..7d89a32 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -495,7 +495,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 {
 	const unsigned nb_rx_queues = 1;
 	const unsigned nb_tx_queues = 1;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internals *internals = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 
@@ -512,19 +512,10 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 	RTE_LOG(INFO, PMD, "Creating null ethdev on numa socket %u\n",
 		dev->device.numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(rte_vdev_device_name(dev), sizeof(*data), 0,
-		dev->device.numa_node);
-	if (!data)
-		return -ENOMEM;
-
 	eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internals));
-	if (!eth_dev) {
-		rte_free(data);
+	if (!eth_dev)
 		return -ENOMEM;
-	}
+
 	/* now put it all together
 	 * - store queue data in internals,
 	 * - store numa_node info in ethdev data
@@ -545,13 +536,12 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 
 	rte_memcpy(internals->rss_key, default_rss_key, 40);
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->eth_addr;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 
 	/* finally assign rx and tx ops */
@@ -669,7 +659,6 @@ rte_pmd_null_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index 6d67d25..ee06cd3 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1068,7 +1068,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	char octtx_name[OCTEONTX_MAX_NAME_LEN];
 	struct octeontx_nic *nic = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	const char *name = rte_vdev_device_name(dev);
 
 	PMD_INIT_FUNC_TRACE();
@@ -1084,13 +1084,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		return 0;
 	}
 
-	data = rte_zmalloc_socket(octtx_name, sizeof(*data), 0, socket_id);
-	if (data == NULL) {
-		octeontx_log_err("failed to allocate devdata");
-		res = -ENOMEM;
-		goto err;
-	}
-
 	nic = rte_zmalloc_socket(octtx_name, sizeof(*nic), 0, socket_id);
 	if (nic == NULL) {
 		octeontx_log_err("failed to allocate nic structure");
@@ -1126,11 +1119,9 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	eth_dev->data->kdrv = RTE_KDRV_NONE;
 	eth_dev->data->numa_node = dev->device.numa_node;
 
-	rte_memcpy(data, (eth_dev)->data, sizeof(*data));
+	data = eth_dev->data;
 	data->dev_private = nic;
-
 	data->port_id = eth_dev->data->port_id;
-	snprintf(data->name, sizeof(data->name), "%s", eth_dev->data->name);
 
 	nic->ev_queues = 1;
 	nic->ev_ports = 1;
@@ -1149,7 +1140,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		goto err;
 	}
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &octeontx_dev_ops;
 
 	/* Finally save ethdev pointer to the NIC structure */
@@ -1217,7 +1207,6 @@ octeontx_remove(struct rte_vdev_device *dev)
 
 		rte_free(eth_dev->data->mac_addrs);
 		rte_free(eth_dev->data->dev_private);
-		rte_free(eth_dev->data);
 		rte_eth_dev_release_port(eth_dev);
 		rte_event_dev_close(nic->evdev);
 	}
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index c1571e1..8740d52 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -773,27 +773,16 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 		struct pmd_internals **internals,
 		struct rte_eth_dev **eth_dev)
 {
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	unsigned int numa_node = vdev->device.numa_node;
-	const char *name;
 
-	name = rte_vdev_device_name(vdev);
 	RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket %d\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return -1;
-
 	/* reserve an ethdev entry */
 	*eth_dev = rte_eth_vdev_allocate(vdev, sizeof(**internals));
-	if (*eth_dev == NULL) {
-		rte_free(data);
+	if (!(*eth_dev))
 		return -1;
-	}
 
 	/* now put it all together
 	 * - store queue data in internals,
@@ -802,7 +791,7 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
 	*internals = (*eth_dev)->data->dev_private;
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
@@ -812,7 +801,6 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * NOTE: we'll replace the data element, of originally allocated
 	 * eth_dev so the rings are local per-process
 	 */
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -1020,7 +1008,6 @@ pmd_pcap_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index df13c44..e53823a 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -259,15 +259,6 @@ do_eth_dev_ring_create(const char *name,
 	RTE_LOG(INFO, PMD, "Creating rings-backed ethdev on numa socket %u\n",
 			numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL) {
-		rte_errno = ENOMEM;
-		goto error;
-	}
-
 	rx_queues_local = rte_zmalloc_socket(name,
 			sizeof(void *) * nb_rx_queues, 0, numa_node);
 	if (rx_queues_local == NULL) {
@@ -301,10 +292,8 @@ do_eth_dev_ring_create(const char *name,
 	 * - point eth_dev_data to internals
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
-	/* NOTE: we'll replace the data element, of originally allocated eth_dev
-	 * so the rings are local per-process */
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->rx_queues = rx_queues_local;
 	data->tx_queues = tx_queues_local;
 
@@ -326,7 +315,6 @@ do_eth_dev_ring_create(const char *name,
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->address;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 	data->kdrv = RTE_KDRV_NONE;
 	data->numa_node = numa_node;
@@ -342,7 +330,6 @@ do_eth_dev_ring_create(const char *name,
 error:
 	rte_free(rx_queues_local);
 	rte_free(tx_queues_local);
-	rte_free(data);
 	rte_free(internals);
 
 	return -1;
@@ -675,8 +662,6 @@ rte_pmd_ring_remove(struct rte_vdev_device *dev)
 	rte_free(eth_dev->data->tx_queues);
 	rte_free(eth_dev->data->dev_private);
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 	return 0;
 }
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 915d937..b18efd8 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1386,12 +1386,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	RTE_LOG(DEBUG, PMD, "%s device on numa %u\n",
 			tuntap_name, rte_socket_id());
 
-	data = rte_zmalloc_socket(tap_name, sizeof(*data), 0, numa_node);
-	if (!data) {
-		RTE_LOG(ERR, PMD, "%s Failed to allocate data\n", tuntap_name);
-		goto error_exit_nodev;
-	}
-
 	dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
 	if (!dev) {
 		RTE_LOG(ERR, PMD, "%s Unable to allocate device struct\n",
@@ -1412,7 +1406,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	}
 
 	/* Setup some default values */
-	rte_memcpy(data, dev->data, sizeof(*data));
+	data = dev->data;
 	data->dev_private = pmd;
 	data->dev_flags = RTE_ETH_DEV_INTR_LSC;
 	data->numa_node = numa_node;
@@ -1423,7 +1417,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	data->nb_rx_queues = 0;
 	data->nb_tx_queues = 0;
 
-	dev->data = data;
 	dev->dev_ops = &ops;
 	dev->rx_pkt_burst = pmd_rx_burst;
 	dev->tx_pkt_burst = pmd_tx_burst;
@@ -1574,7 +1567,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	RTE_LOG(ERR, PMD, "%s Unable to initialize %s\n",
 		tuntap_name, rte_vdev_device_name(vdev));
 
-	rte_free(data);
 	return -EINVAL;
 }
 
@@ -1828,7 +1820,6 @@ rte_pmd_tap_remove(struct rte_vdev_device *dev)
 
 	close(internals->ioctl_sock);
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index d7d44a0..fea13eb 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1227,7 +1227,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	int16_t queues, const unsigned int numa_node, uint64_t flags)
 {
 	const char *name = rte_vdev_device_name(dev);
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internal *internal = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 	struct ether_addr *eth_addr = NULL;
@@ -1237,13 +1237,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	RTE_LOG(INFO, PMD, "Creating VHOST-USER backend on numa socket %u\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure and internal
-	 * (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error;
-
 	list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
 	if (list == NULL)
 		goto error;
@@ -1285,12 +1278,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	rte_spinlock_init(&vring_state->lock);
 	vring_states[eth_dev->data->port_id] = vring_state;
 
-	/* We'll replace the 'data' originally allocated by eth_dev. So the
-	 * vhost PMD resources won't be shared between multi processes.
-	 */
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
-	eth_dev->data = data;
-
+	data = eth_dev->data;
 	data->nb_rx_queues = queues;
 	data->nb_tx_queues = queues;
 	internal->max_queues = queues;
@@ -1331,7 +1319,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 		rte_eth_dev_release_port(eth_dev);
 	rte_free(internal);
 	rte_free(list);
-	rte_free(data);
 
 	return -1;
 }
@@ -1462,8 +1449,6 @@ rte_pmd_vhost_remove(struct rte_vdev_device *dev)
 	rte_free(vring_states[eth_dev->data->port_id]);
 	vring_states[eth_dev->data->port_id] = NULL;
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v5 5/5] drivers/net: share vdev data to secondary process
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                     ` (3 preceding siblings ...)
  2018-04-24  5:51   ` [PATCH v5 4/5] drivers/net: not use private eth dev data Jianfeng Tan
@ 2018-04-24  5:51   ` Jianfeng Tan
  2018-04-24 10:32   ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Thomas Monjalon
  5 siblings, 0 replies; 48+ messages in thread
From: Jianfeng Tan @ 2018-04-24  5:51 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

dpdk-procinfo, as a secondary process, cannot fetch stats for a vdev.

This patch enables that by attaching the port from the shared data.
We also fill in the eth dev ops, although only some ops work in a
secondary process, for example stats_get().

Note that we still cannot Rx/Tx packets on ports which do not
support multi-process.

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
---
 doc/guides/rel_notes/release_18_05.rst    |  5 +++++
 drivers/net/af_packet/rte_eth_af_packet.c | 17 +++++++++++++++--
 drivers/net/bonding/rte_eth_bond_pmd.c    | 13 +++++++++++++
 drivers/net/failsafe/failsafe.c           | 14 ++++++++++++++
 drivers/net/kni/rte_eth_kni.c             | 12 ++++++++++++
 drivers/net/null/rte_eth_null.c           | 13 +++++++++++++
 drivers/net/octeontx/octeontx_ethdev.c    | 14 ++++++++++++++
 drivers/net/pcap/rte_eth_pcap.c           | 13 +++++++++++++
 drivers/net/softnic/rte_eth_softnic.c     | 19 ++++++++++++++++---
 drivers/net/tap/rte_eth_tap.c             | 13 +++++++++++++
 drivers/net/vhost/rte_eth_vhost.c         | 17 +++++++++++++++--
 11 files changed, 143 insertions(+), 7 deletions(-)
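
The probe-time decision described above can be reduced to a small
predicate. The block below is a sketch with hypothetical names
(demo_proc_type, demo_probe_path, demo_choose_path); it assumes, as the
drivers in this patch do, that a device discovered through the bus scan
in a secondary process carries an empty argument string. Treating a NULL
argument pointer as empty is an addition made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the EAL process type and the probe outcome. */
enum demo_proc_type { DEMO_PROC_PRIMARY, DEMO_PROC_SECONDARY };
enum demo_probe_path { DEMO_PROBE_CREATE, DEMO_PROBE_ATTACH };

/*
 * Decide how a driver probe should proceed:
 *  - a secondary process probing a device with no arguments attaches to
 *    the port that the primary already created in shared data;
 *  - every other case goes through normal device creation.
 */
static enum demo_probe_path
demo_choose_path(enum demo_proc_type type, const char *args)
{
	if (type == DEMO_PROC_SECONDARY &&
	    (args == NULL || strlen(args) == 0))
		return DEMO_PROBE_ATTACH;
	return DEMO_PROBE_CREATE;
}
```

A secondary process that passes its own arguments still takes the create
path, which matches the cover letter's intent of allowing private vdevs
in a secondary process. After an attach, only ops that read shared data,
such as stats retrieval, can be expected to work.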

diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index 5276882..983c1c8 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -134,6 +134,11 @@ New Features
 
   Linux uevent is supported as backend of this device event notification framework.
 
+* **Added support for procinfo and pdump on eth vdev.**
+
+  For Ethernet virtual devices (such as tap and pcap), a secondary process can
+  now get stats/xstats from shared memory, and dpdk-pdump can capture packets
+  on those virtual devices.
 
 API Changes
 -----------
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 110e8a5..b394d3c 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -915,9 +915,22 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
 	int ret = 0;
 	struct rte_kvargs *kvlist;
 	int sockfd = -1;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL) {
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 2805c71..09696ea 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -3021,6 +3021,7 @@ bond_probe(struct rte_vdev_device *dev)
 	uint8_t bonding_mode, socket_id/*, agg_mode*/;
 	int  arg_count, port_id;
 	uint8_t agg_mode;
+	struct rte_eth_dev *eth_dev;
 
 	if (!dev)
 		return -EINVAL;
@@ -3028,6 +3029,18 @@ bond_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	RTE_LOG(INFO, EAL, "Initializing pmd_bond for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &default_dev_ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev),
 		pmd_bond_init_valid_arguments);
 	if (kvlist == NULL)
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index fa279cb..dc9b0d0 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -294,10 +294,24 @@ static int
 rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
 {
 	const char *name;
+	struct rte_eth_dev *eth_dev;
 
 	name = rte_vdev_device_name(vdev);
 	INFO("Initializing " FAILSAFE_DRIVER_NAME " for %s",
 			name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(vdev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &failsafe_ops;
+		return 0;
+	}
+
 	return fs_eth_dev_create(vdev);
 }
 
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index b7897b6..08fc6a3 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -404,6 +404,18 @@ eth_kni_probe(struct rte_vdev_device *vdev)
 	params = rte_vdev_device_args(vdev);
 	RTE_LOG(INFO, PMD, "Initializing eth_kni for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &eth_kni_ops;
+		return 0;
+	}
+
 	ret = eth_kni_kvargs_process(&args, params);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 7d89a32..6413a90 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -597,6 +597,7 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	unsigned packet_size = default_packet_size;
 	unsigned packet_copy = default_packet_copy;
 	struct rte_kvargs *kvlist = NULL;
+	struct rte_eth_dev *eth_dev;
 	int ret;
 
 	if (!dev)
@@ -606,6 +607,18 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	params = rte_vdev_device_args(dev);
 	RTE_LOG(INFO, PMD, "Initializing pmd_null for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	if (params != NULL) {
 		kvlist = rte_kvargs_parse(params, valid_arguments);
 		if (kvlist == NULL)
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index ee06cd3..04120f5 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1228,12 +1228,26 @@ octeontx_probe(struct rte_vdev_device *dev)
 	struct rte_event_dev_config dev_conf;
 	const char *eventdev_name = "event_octeontx";
 	struct rte_event_dev_info info;
+	struct rte_eth_dev *eth_dev;
 
 	struct octeontx_vdev_init_params init_params = {
 		OCTEONTX_VDEV_DEFAULT_MAX_NR_PORT
 	};
 
 	dev_name = rte_vdev_device_name(dev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(dev_name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", dev_name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &octeontx_dev_ops;
+		return 0;
+	}
+
 	res = octeontx_parse_vdev_init_params(&init_params, dev);
 	if (res < 0)
 		return -EINVAL;
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 8740d52..570c9e9 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -898,6 +898,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	struct rte_kvargs *kvlist;
 	struct pmd_devargs pcaps = {0};
 	struct pmd_devargs dumpers = {0};
+	struct rte_eth_dev *eth_dev;
 	int single_iface = 0;
 	int ret;
 
@@ -908,6 +909,18 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	start_cycles = rte_get_timer_cycles();
 	hz = rte_get_timer_hz();
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
 		return -1;
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index b0c1341..e324394 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -725,13 +725,26 @@ pmd_probe(struct rte_vdev_device *vdev)
 	uint16_t hard_port_id;
 	int numa_node;
 	void *dev_private;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(vdev);
 
-	RTE_LOG(INFO, PMD,
-		"Probing device \"%s\"\n",
-		rte_vdev_device_name(vdev));
+	RTE_LOG(INFO, PMD, "Probing device \"%s\"\n", name);
 
 	/* Parse input arguments */
 	params = rte_vdev_device_args(vdev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &pmd_ops;
+		return 0;
+	}
+
 	if (!params)
 		return -EINVAL;
 
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index b18efd8..cca5852 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1721,6 +1721,7 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	char tap_name[RTE_ETH_NAME_MAX_LEN];
 	char remote_iface[RTE_ETH_NAME_MAX_LEN];
 	struct ether_addr user_mac = { .addr_bytes = {0} };
+	struct rte_eth_dev *eth_dev;
 
 	tap_type = 1;
 	strcpy(tuntap_name, "TAP");
@@ -1728,6 +1729,18 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	params = rte_vdev_device_args(dev);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	speed = ETH_SPEED_NUM_10G;
 	snprintf(tap_name, sizeof(tap_name), "%s%d",
 		 DEFAULT_TAP_NAME, tap_unit++);
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index fea13eb..99a7727 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1362,9 +1362,22 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 	int client_mode = 0;
 	int dequeue_zero_copy = 0;
 	int iommu_support = 0;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
-- 
2.7.4


* Re: [PATCH v5 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-24  5:51   ` [PATCH v5 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-04-24 10:09     ` Thomas Monjalon
  0 siblings, 0 replies; 48+ messages in thread
From: Thomas Monjalon @ 2018-04-24 10:09 UTC (permalink / raw)
  To: Jianfeng Tan; +Cc: dev, bruce.richardson

24/04/2018 07:51, Jianfeng Tan:
> --- a/drivers/bus/vdev/Makefile
> +++ b/drivers/bus/vdev/Makefile
> +CFLAGS += -DALLOW_EXPERIMENTAL_API

You need to do the same for meson.build.


* Re: [PATCH v5 0/5] allow procinfo and pdump on eth vdev
  2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                     ` (4 preceding siblings ...)
  2018-04-24  5:51   ` [PATCH v5 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
@ 2018-04-24 10:32   ` Thomas Monjalon
  5 siblings, 0 replies; 48+ messages in thread
From: Thomas Monjalon @ 2018-04-24 10:32 UTC (permalink / raw)
  To: Jianfeng Tan; +Cc: dev

24/04/2018 07:51, Jianfeng Tan:
> Jianfeng Tan (5):
>   eal: bring forward multi-process channel init
>   bus/vdev: add lock on vdev device list
>   bus/vdev: bus scan by multi-process channel
>   drivers/net: not use private eth dev data
>   drivers/net: share vdev data to secondary process

Applied with suggested meson fix, thanks



Thread overview: 48+ messages
2018-03-04 15:30 [PATCH 0/4] allow procinfo and pdump on eth vdev Jianfeng Tan
2018-03-04 15:30 ` [PATCH 1/4] eal: bring forward multi-process channel init Jianfeng Tan
2018-03-04 15:30 ` [PATCH 2/4] bus/vdev: bus scan by multi-process channel Jianfeng Tan
2018-03-05  9:36   ` Burakov, Anatoly
2018-03-06  0:50     ` Tan, Jianfeng
2018-03-07 14:00   ` Burakov, Anatoly
2018-03-12  3:22     ` Tan, Jianfeng
2018-03-04 15:30 ` [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately Jianfeng Tan
2018-03-06  6:07   ` Matan Azrad
2018-03-06  8:55     ` Tan, Jianfeng
2018-03-07  6:00       ` Matan Azrad
2018-03-07  6:10         ` Matan Azrad
2018-03-12  3:40           ` Tan, Jianfeng
2018-03-04 15:30 ` [PATCH 4/4] drivers/net: share vdev data to secondary process Jianfeng Tan
2018-04-19 16:50 ` [PATCH v3 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
2018-04-19 16:50   ` [PATCH v3 1/5] eal: bring forward multi-process channel init Jianfeng Tan
2018-04-20  8:16     ` Burakov, Anatoly
2018-04-20 14:08       ` Tan, Jianfeng
2018-04-19 16:50   ` [PATCH v3 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
2018-04-20  8:26     ` Burakov, Anatoly
2018-04-20 14:19       ` Tan, Jianfeng
2018-04-20 15:16         ` Burakov, Anatoly
2018-04-20 15:23           ` Tan, Jianfeng
2018-04-19 16:50   ` [PATCH v3 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
2018-04-20  8:41     ` Burakov, Anatoly
2018-04-20 14:28       ` Tan, Jianfeng
2018-04-20 15:19         ` Burakov, Anatoly
2018-04-20 15:32           ` Tan, Jianfeng
2018-04-20 15:39             ` Burakov, Anatoly
2018-04-19 16:50   ` [PATCH v3 4/5] drivers/net: not use private eth dev data Jianfeng Tan
2018-04-19 16:50   ` [PATCH v3 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
2018-04-20 16:57 ` [PATCH v4 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
2018-04-20 16:57   ` [PATCH v4 1/5] eal: bring forward multi-process channel init Jianfeng Tan
2018-04-20 16:57   ` [PATCH v4 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
2018-04-23  9:47     ` Burakov, Anatoly
2018-04-20 16:57   ` [PATCH v4 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
2018-04-23  9:54     ` Burakov, Anatoly
2018-04-24  5:22       ` Tan, Jianfeng
2018-04-20 16:57   ` [PATCH v4 4/5] drivers/net: not use private eth dev data Jianfeng Tan
2018-04-20 16:57   ` [PATCH v4 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
2018-04-24  5:51 ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
2018-04-24  5:51   ` [PATCH v5 1/5] eal: bring forward multi-process channel init Jianfeng Tan
2018-04-24  5:51   ` [PATCH v5 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
2018-04-24  5:51   ` [PATCH v5 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
2018-04-24 10:09     ` Thomas Monjalon
2018-04-24  5:51   ` [PATCH v5 4/5] drivers/net: not use private eth dev data Jianfeng Tan
2018-04-24  5:51   ` [PATCH v5 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
2018-04-24 10:32   ` [PATCH v5 0/5] allow procinfo and pdump on eth vdev Thomas Monjalon
