* [PATCH v8 0/8] add packet capture framework
       [not found] <1465487895-5870-1-git-send-email-reshma.pattan@intel.com>
@ 2016-06-10 16:18 ` Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 1/8] librte_ether: protect add/remove of rxtx callbacks with spinlocks Reshma Pattan
                     ` (9 more replies)
  0 siblings, 10 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev

This patch set includes the following changes:

1) Changes to librte_ether.
2) A new library, librte_pdump, added for the packet capture framework.
3) A new app/pdump tool added for packet capturing.
4) test-pmd changes to initialize the packet capture framework.
5) Documentation updates.

1)librte_pdump
==============
To support packet capturing on DPDK Ethernet devices, a new library, librte_pdump,
is added. Users can develop their own packet capturing applications using the new library APIs.

Operation:
----------
librte_pdump provides APIs to support packet capturing on DPDK Ethernet devices.
The library provides APIs to initialize the packet capture framework, to enable and disable
packet capture, and to uninitialize the framework.

The librte_pdump library works on a client/server model. The server is responsible for enabling or
disabling the packet capture and the clients are responsible for requesting the enabling or disabling of
the packet capture.

The packet capture framework, as part of its initialization, creates a pthread and, within that
pthread, the server socket. The application that calls the framework initialization has the server
socket created either under the path that the application passed or under the default path, i.e.
''/var/run'' for the root user or ''$HOME'' for a non-root user.

Applications that request enabling or disabling of the packet capture have a client socket created
under ''/var/run/'' for the root user or ''$HOME'' for a non-root user, and use it to send requests to the server.
The server socket listens for client requests to enable or disable the packet capture.
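
The default-path rule described above can be sketched in C as follows. This is a
minimal illustration, not the library's code: pdump_socket_dir() is a hypothetical
helper, and it takes the uid and the HOME value as arguments (an assumption made
here) so that the rule can be exercised without running as root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of librte_pdump): choose the directory
 * in which the pdump sockets are created.
 *   - a caller-supplied path wins when it is non-NULL and non-empty;
 *   - otherwise root (uid 0) uses /var/run;
 *   - otherwise the value of $HOME is used.
 * Returns 0 on success, -1 if no directory can be chosen or the result
 * does not fit into buf.
 */
int
pdump_socket_dir(const char *user_path, uid_t uid, const char *home,
		 char *buf, size_t len)
{
	const char *dir;
	int n;

	assert(buf != NULL && len > 0);

	if (user_path != NULL && user_path[0] != '\0')
		dir = user_path;
	else if (uid == 0)
		dir = "/var/run";
	else if (home != NULL && home[0] != '\0')
		dir = home;
	else
		return -1;

	/* snprintf reports the length it wanted; treat truncation as failure. */
	n = snprintf(buf, len, "%s", dir);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would typically pass getuid() and getenv("HOME"); a client that has no
user-supplied path would pass NULL as the first argument.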

Applications using the APIs below need to pass port/device_id, queue, mempool and
ring parameters. The library uses the user-provided ring and mempool to mirror the rx/tx
packets of the port. Users need to dequeue the packets from the ring and write them
to a vdev (pcap/tuntap) to view them with any standard tool.

Note:
The mempool and ring should support multi-consumer/multi-producer (mc/mp) access.
The mempool mbuf size should be large enough to hold the rx/tx packets of the port.

APIs:
-----
rte_pdump_init()
rte_pdump_enable()
rte_pdump_enable_by_deviceid()
rte_pdump_disable()
rte_pdump_disable_by_deviceid()
rte_pdump_uninit()
rte_pdump_set_socket_dir()

2)app/pdump tool
================
The app/pdump tool is built on librte_pdump for packet capturing in DPDK.
By default the tool runs as a secondary process and provides command line
options for packet capture.

./build/app/dpdk_pdump --
                       --pdump '(port=<port id> | device_id=<pci id or vdev name>),
                                (queue=<queue id>),
                                (rx-dev=<iface or pcap file> |
                                 tx-dev=<iface or pcap file>),
                                [ring-size=<ring size>],
                                [mbuf-size=<mbuf data size>],
                                [total-num-mbufs=<number of mbufs>]'

Parameters inside parentheses are mandatory.
Parameters inside square brackets are optional.
The user passes the packet capture parameters under the --pdump option; multiple
--pdump options can be passed to capture packets on different port and queue combinations.
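
As an illustration of the option syntax above, here is a minimal sketch of parsing
one --pdump argument in C. It is not the tool's parser: the struct, its field names
and parse_pdump_arg() are hypothetical. It assumes that exactly one of port and
device_id must be present, that at least one of rx-dev and tx-dev must be present
(the example command line passes both), and it records "queue=*" as a wildcard
without interpreting it.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative result type; the field names are assumptions. */
struct pdump_opts {
	int has_port;                  /* port=<n> was given */
	int has_queue;                 /* queue=<n> or queue=* was given */
	int queue_wildcard;            /* queue=* as in the example command */
	unsigned long port;
	unsigned long queue;
	unsigned long ring_size;       /* 0: not given */
	unsigned long mbuf_size;       /* 0: not given */
	unsigned long total_num_mbufs; /* 0: not given */
	char device_id[64];            /* empty: not given */
	char rx_dev[128];              /* empty: not given */
	char tx_dev[128];              /* empty: not given */
};

/* Copy src into dst; fail instead of truncating. */
static int
copy_str(char *dst, size_t len, const char *src)
{
	if (strlen(src) >= len)
		return -1;
	strcpy(dst, src);
	return 0;
}

/* Parse an unsigned decimal; reject empty input and trailing garbage. */
static int
parse_num(const char *s, unsigned long *out)
{
	char *end;

	if (!isdigit((unsigned char)s[0]))
		return -1;
	*out = strtoul(s, &end, 10);
	return *end == '\0' ? 0 : -1;
}

/* Parse one --pdump argument. Returns 0 on success, -1 on any error. */
int
parse_pdump_arg(const char *arg, struct pdump_opts *o)
{
	char buf[512];
	char *key, *val, *next;
	int rc;

	assert(arg != NULL && o != NULL);
	memset(o, 0, sizeof(*o));
	if (copy_str(buf, sizeof(buf), arg) != 0)
		return -1;

	/* Split on ',' and then on the first '=' of each piece. */
	for (key = buf; key != NULL && *key != '\0'; key = next) {
		next = strchr(key, ',');
		if (next != NULL)
			*next++ = '\0';
		val = strchr(key, '=');
		if (val == NULL)
			return -1;
		*val++ = '\0';

		if (strcmp(key, "port") == 0) {
			rc = parse_num(val, &o->port);
			o->has_port = 1;
		} else if (strcmp(key, "device_id") == 0) {
			rc = copy_str(o->device_id, sizeof(o->device_id), val);
		} else if (strcmp(key, "queue") == 0) {
			o->has_queue = 1;
			o->queue_wildcard = (strcmp(val, "*") == 0);
			rc = o->queue_wildcard ? 0 : parse_num(val, &o->queue);
		} else if (strcmp(key, "rx-dev") == 0) {
			rc = copy_str(o->rx_dev, sizeof(o->rx_dev), val);
		} else if (strcmp(key, "tx-dev") == 0) {
			rc = copy_str(o->tx_dev, sizeof(o->tx_dev), val);
		} else if (strcmp(key, "ring-size") == 0) {
			rc = parse_num(val, &o->ring_size);
		} else if (strcmp(key, "mbuf-size") == 0) {
			rc = parse_num(val, &o->mbuf_size);
		} else if (strcmp(key, "total-num-mbufs") == 0) {
			rc = parse_num(val, &o->total_num_mbufs);
		} else {
			rc = -1; /* unknown key */
		}
		if (rc != 0)
			return -1;
	}

	/* Mandatory: exactly one of port/device_id, a queue, and a device. */
	if (o->has_port == (o->device_id[0] != '\0'))
		return -1;
	if (!o->has_queue)
		return -1;
	if (o->rx_dev[0] == '\0' && o->tx_dev[0] == '\0')
		return -1;
	return 0;
}
```

Splitting by hand with strchr() keeps the sketch free of non-ISO helpers; a real
tool might prefer an existing key/value parsing facility.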

Operation:
----------
* The tool parses the user command line arguments and
creates the mempool, the ring and the PCAP PMD vdev, with 'tx_stream' set to
whichever device is passed in the rx-dev|tx-dev parameters.

* It then calls the librte_pdump APIs, i.e. rte_pdump_enable()/rte_pdump_enable_by_deviceid(),
to enable packet capturing on a specific port/device_id and queue, passing the
port|device_id, queue, mempool and ring info.

* The tool runs in a while loop, dequeuing packets from the ring and writing them to the pcap device.

* The tool can be stopped using SIGINT, upon which it calls
rte_pdump_disable()/rte_pdump_disable_by_deviceid() and frees the allocated resources.
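
The loop-until-SIGINT structure in the steps above can be sketched as follows. This
is an illustration under stated assumptions, not the tool's main.c: capture_loop(),
the drain callback type and fake_drain() are hypothetical names, and the ring is
replaced by a callback that reports how many packets it handled. The handler only
sets a flag, so cleanup runs in the normal control flow after the loop rather than
inside the signal handler.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set from the signal handler; sig_atomic_t is safe to write there. */
volatile sig_atomic_t quit_requested;

/* Signal handler: record the request and return immediately. */
void
on_sigint(int signum)
{
	(void)signum;
	quit_requested = 1;
}

/* Hypothetical stand-in for "dequeue a burst from the ring and write it
 * to the pcap device"; returns the number of packets handled. */
typedef unsigned int (*drain_fn)(void *ctx);

/* Drain until SIGINT has been seen; returns the total packets handled.
 * The caller disables capture and frees resources after this returns. */
unsigned long
capture_loop(drain_fn drain, void *ctx)
{
	unsigned long total = 0;

	assert(drain != NULL);
	while (!quit_requested)
		total += drain(ctx);
	return total;
}

/* Example callback: pretends to move 4 packets per call and raises
 * SIGINT on the third call, standing in for the user pressing Ctrl-C. */
unsigned int
fake_drain(void *ctx)
{
	unsigned int *calls = ctx;

	if (++(*calls) == 3)
		raise(SIGINT);
	return 4;
}
```

After installing the handler with signal(SIGINT, on_sigint), a call such as
capture_loop(fake_drain, &calls) returns once the flag is set; the disable calls
and resource release would follow that return.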

Note:
CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run the pdump tool.

3)Test-pmd changes
==================
The test-pmd application is changed to initialize/uninitialize the packet capture framework,
so the app/pdump tool can be run to see packets of the DPDK ports used by test-pmd.

Similarly, any application that needs packet capture should call the initialize/uninitialize APIs of
librte_pdump and use the pdump tool to start the capture.

4)Packet capture flow between pdump tool and librte_pdump
=========================================================
* The pdump tool (secondary process) requests packet capture
for specific port|device_id and queue combinations.

* The library, in the secondary process context, creates the client socket and communicates
the port|device_id, queue, ring and mempool to the server.

* The library initializes the server in the primary process ('test-pmd') context, and the server serves
the client request to enable Ethernet rxtx callbacks for the given port|device_id and queue.

* The callbacks copy the rx/tx packets to the passed mempool and enqueue them to the ring for the secondary process.

* The pdump tool dequeues the packets from the ring and writes them to the PCAP PMD vdev,
so ultimately the packets are seen on the device that is passed in rx-dev|tx-dev.

* Once the pdump tool is terminated with SIGINT, it disables the packet capturing.

* The library receives the disable packet capture request and communicates the info to the server,
which removes the Ethernet rxtx callbacks.

* The packet capture can be viewed using the tcpdump command:
"tcpdump -ni <iface>" (or) "tcpdump -nr <pcapfile>"
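
The copy-and-enqueue step of this flow can be modelled with a small single-threaded
sketch. It uses a plain array as the buffer pool and a fixed-size circular buffer
as the ring, both hypothetical stand-ins for the DPDK mempool and ring, which are
shared between processes and safe for multiple producers and consumers. Dropping
the copy when the packet is too large or the ring is full is an assumption of this
sketch; the cover letter does not say how the library handles those cases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE  64  /* bytes one copy buffer can hold */
#define POOL_SIZE 8   /* copy buffers in the pool */
#define RING_SIZE 4   /* slots in the ring */

struct copy_buf {
	size_t len;
	int in_use;
	unsigned char data[BUF_SIZE];
};

/* A zero-filled struct mirror is a valid empty state. */
struct mirror {
	struct copy_buf pool[POOL_SIZE];
	struct copy_buf *ring[RING_SIZE];
	unsigned int head;   /* next slot to fill */
	unsigned int tail;   /* next slot to drain */
	unsigned int count;  /* occupied slots */
};

/* Take a free buffer from the pool, or NULL if none is left. */
static struct copy_buf *
pool_get(struct mirror *m)
{
	size_t i;

	for (i = 0; i < POOL_SIZE; i++) {
		if (!m->pool[i].in_use) {
			m->pool[i].in_use = 1;
			return &m->pool[i];
		}
	}
	return NULL;
}

/* Producer side: copy one packet and enqueue the copy.
 * Returns 0 on success, -1 if the copy was dropped. */
int
mirror_packet(struct mirror *m, const void *pkt, size_t len)
{
	struct copy_buf *b;

	assert(m != NULL && pkt != NULL);
	if (len > BUF_SIZE || m->count == RING_SIZE)
		return -1;
	b = pool_get(m);
	if (b == NULL)
		return -1;
	memcpy(b->data, pkt, len);
	b->len = len;
	m->ring[m->head] = b;
	m->head = (m->head + 1) % RING_SIZE;
	m->count++;
	return 0;
}

/* Consumer side: copy the oldest packet into out and return the buffer
 * to the pool. Returns the packet length, or -1 if the ring is empty
 * or out is too small. */
long
mirror_dequeue(struct mirror *m, void *out, size_t out_len)
{
	struct copy_buf *b;

	assert(m != NULL && out != NULL);
	if (m->count == 0)
		return -1;
	b = m->ring[m->tail];
	if (b->len > out_len)
		return -1;
	memcpy(out, b->data, b->len);
	m->tail = (m->tail + 1) % RING_SIZE;
	m->count--;
	b->in_use = 0;
	return (long)b->len;
}
```

The original packet is never touched by the mirror path, so a dropped copy only
affects what the capture sees, not the traffic itself.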

5)Example command line
======================
./build/app/dpdk_pdump -- --pdump 'device_id=0000:02:0.0,queue=*,tx-dev=/tmp/dt-file.pcap,rx-dev=/tmp/dr-file.pcap,ring-size=8192,mbuf-size=2176,total-num-mbufs=32768' --pdump 'device_id=0000:01:00.0,queue=*,rx-dev=/tmp/d-file.pcap,tx-dev=/tmp/d-file.pcap,ring-size=16384,mbuf-size=2176,total-num-mbufs=32768'

v8:
added server socket argument to rte_pdump_init() API ==> http://dpdk.org/dev/patchwork/patch/13402/
added rte_pdump_set_socket_dir() API.
updated documentation for new changes.

v7:
fixed lines over 90 characters.

v6:
removed below deprecation notice patch from patch set.
http://dpdk.org/dev/patchwork/patch/13372/

v5:
addressed code review comments for below patches
http://dpdk.org/dev/patchwork/patch/12955/
http://dpdk.org/dev/patchwork/patch/12951/

v4:
added missing deprecation notice for ABI changes of rte_eth_dev_info structure.
made doc changes as per doc guidelines.
replaced rte_eal_vdev_init with rte_eth_dev_attach in pdump tool.
removed rxtx-dev parameter from pdump tool command line.

v3:
app/pdump: Moved cleanup code from signal handler to main.
divided librte_ether changes into multiple patches.
example command changed in app/pdump application guide

v2:
fix compilation issues for 4.8.3
fix unnecessary #includes


Reshma Pattan (8):
  librte_ether: protect add/remove of rxtx callbacks with spinlocks
  librte_ether: add new api rte_eth_add_first_rx_callback
  librte_ether: add new fields to rte_eth_dev_info struct
  librte_ether: make rte_eth_dev_get_port_by_name
    rte_eth_dev_get_name_by_port public
  lib/librte_pdump: add new library for packet capturing support
  app/pdump: add pdump tool for packet capturing
  app/test-pmd: add pdump initialization uninitialization
  doc: update doc for packet capture framework

 MAINTAINERS                             |   8 +
 app/Makefile                            |   1 +
 app/pdump/Makefile                      |  45 ++
 app/pdump/main.c                        | 844 +++++++++++++++++++++++++++++
 app/test-pmd/testpmd.c                  |   6 +
 config/common_base                      |   5 +
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/pdump_library.rst | 117 +++++
 doc/guides/rel_notes/release_16_07.rst  |  13 +
 doc/guides/sample_app_ug/index.rst      |   1 +
 doc/guides/sample_app_ug/pdump.rst      | 122 +++++
 lib/Makefile                            |   1 +
 lib/librte_ether/rte_ethdev.c           | 123 +++--
 lib/librte_ether/rte_ethdev.h           |  60 +++
 lib/librte_ether/rte_ether_version.map  |   9 +
 lib/librte_pdump/Makefile               |  55 ++
 lib/librte_pdump/rte_pdump.c            | 904 ++++++++++++++++++++++++++++++++
 lib/librte_pdump/rte_pdump.h            | 208 ++++++++
 lib/librte_pdump/rte_pdump_version.map  |  13 +
 mk/rte.app.mk                           |   1 +
 20 files changed, 2493 insertions(+), 44 deletions(-)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 doc/guides/sample_app_ug/pdump.rst
 create mode 100644 lib/librte_pdump/Makefile
 create mode 100644 lib/librte_pdump/rte_pdump.c
 create mode 100644 lib/librte_pdump/rte_pdump.h
 create mode 100644 lib/librte_pdump/rte_pdump_version.map

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
-- 
2.5.0


* [PATCH v8 1/8] librte_ether: protect add/remove of rxtx callbacks with spinlocks
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 2/8] librte_ether: add new api rte_eth_add_first_rx_callback Reshma Pattan
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added spinlocks around add/remove logic of rxtx callbacks to
avoid corruption of callback lists in multithreaded context.
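
The locking and the pointer-to-pointer unlink that this patch uses in the remove
functions can be illustrated with a minimal standalone sketch. It swaps in a C11
atomic_flag spinlock for rte_spinlock_t and a bare node type for
struct rte_eth_rxtx_callback; cb_node, cb_append() and cb_remove() are hypothetical
names used only here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct cb_node {
	struct cb_node *next;
	int id;   /* stands in for the callback function and its parameter */
};

/* Minimal spinlock built on atomic_flag, standing in for rte_spinlock_t. */
static atomic_flag cb_lock = ATOMIC_FLAG_INIT;

static void
cb_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&cb_lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void
cb_lock_release(void)
{
	atomic_flag_clear_explicit(&cb_lock, memory_order_release);
}

/* Append node at the tail so callbacks keep FIFO order. */
void
cb_append(struct cb_node **head, struct cb_node *node)
{
	struct cb_node **link;

	assert(head != NULL && node != NULL);
	node->next = NULL;
	cb_lock_acquire();
	for (link = head; *link != NULL; link = &(*link)->next)
		;
	*link = node;
	cb_lock_release();
}

/* Unlink node from the list. Returns 0 on success, -1 if not found.
 * 'link' always points at the pointer that refers to the current node,
 * so the head and interior cases are handled by the same assignment. */
int
cb_remove(struct cb_node **head, struct cb_node *node)
{
	struct cb_node **link;
	int ret = -1;

	assert(head != NULL && node != NULL);
	cb_lock_acquire();
	for (link = head; *link != NULL; link = &(*link)->next) {
		if (*link == node) {
			*link = node->next;
			ret = 0;
			break;
		}
	}
	cb_lock_release();
	return ret;
}
```

Because the loop tests *link before dereferencing it, the same code also copes with
an empty list without a separate check.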

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 82 +++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 42 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..ce70d58 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -77,6 +77,12 @@ static uint8_t nb_ports;
 /* spinlock for eth device callbacks */
 static rte_spinlock_t rte_eth_dev_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for add/remove rx callbacks */
+static rte_spinlock_t rte_eth_rx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* spinlock for add/remove tx callbacks */
+static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -1634,7 +1640,6 @@ rte_eth_dev_set_rx_queue_stats_mapping(uint8_t port_id, uint16_t rx_queue_id,
 			STAT_QMAP_RX);
 }
 
-
 void
 rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 {
@@ -2905,7 +2910,6 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_errno = EINVAL;
 		return NULL;
 	}
-
 	struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
 
 	if (cb == NULL) {
@@ -2916,6 +2920,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.rx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
@@ -2928,6 +2933,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
 	return cb;
 }
@@ -2957,6 +2963,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.tx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id];
@@ -2969,6 +2976,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
 	return cb;
 }
@@ -2987,29 +2995,24 @@ rte_eth_remove_rx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->post_rx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+	int ret = -EINVAL;
+
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
+	prev_cb = &dev->post_rx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int
@@ -3026,29 +3029,24 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->pre_tx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->pre_tx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	int ret = -EINVAL;
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
+	prev_cb = &dev->pre_tx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int
-- 
2.5.0


* [PATCH v8 2/8] librte_ether: add new api rte_eth_add_first_rx_callback
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 1/8] librte_ether: protect add/remove of rxtx callbacks with spinlocks Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 3/8] librte_ether: add new fields to rte_eth_dev_info struct Reshma Pattan
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added a new public API, rte_eth_add_first_rx_callback, to add the given
callback at the head of the list.
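
The head insertion can be sketched outside DPDK as follows. The patch sets
cb->next, issues rte_smp_wmb(), and only then publishes the new head; this sketch
expresses the same ordering with a C11 release store instead of an explicit
barrier. The names rx_cb, rx_cb_add_first() and rx_cb_walk() are hypothetical, and
writers are assumed to be serialized by a lock that is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct rx_cb {
	struct rx_cb *next;
	int id;   /* stands in for the callback function and its parameter */
};

/* Head pointer that readers load without taking any lock. */
typedef _Atomic(struct rx_cb *) rx_cb_head;

/* Insert node before every existing callback. Writers must already be
 * serialized with each other (the patch uses a spinlock for that). */
void
rx_cb_add_first(rx_cb_head *head, struct rx_cb *node)
{
	assert(head != NULL && node != NULL);
	node->next = atomic_load_explicit(head, memory_order_relaxed);
	/* Release: node->next is written before the node becomes reachable. */
	atomic_store_explicit(head, node, memory_order_release);
}

/* Reader: walk the list and record the ids in call order.
 * Returns the number of ids written (at most max). */
size_t
rx_cb_walk(rx_cb_head *head, int *ids, size_t max)
{
	struct rx_cb *cb = atomic_load_explicit(head, memory_order_acquire);
	size_t n = 0;

	assert(ids != NULL);
	for (; cb != NULL && n < max; cb = cb->next)
		ids[n++] = cb->id;
	return n;
}
```

With two nodes added in the order a, b, a walk visits b first and a second, which
is the point of adding at the head.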

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 35 ++++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 28 +++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |  6 ++++++
 3 files changed, 69 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ce70d58..97d167e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2939,6 +2939,41 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 }
 
 void *
+rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+		rte_rx_callback_fn fn, void *user_param)
+{
+#ifndef RTE_ETHDEV_RXTX_CALLBACKS
+	rte_errno = ENOTSUP;
+	return NULL;
+#endif
+	/* check input parameters */
+	if (!rte_eth_dev_is_valid_port(port_id) || fn == NULL ||
+		queue_id >= rte_eth_devices[port_id].data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
+
+	if (cb == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	cb->fn.rx = fn;
+	cb->param = user_param;
+
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
+	/* Add the callback at the first position. */
+	cb->next = rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
+	rte_smp_wmb();
+	rte_eth_devices[port_id].post_rx_burst_cbs[queue_id] = cb;
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
+
+	return cb;
+}
+
+void *
 rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_tx_callback_fn fn, void *user_param)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..237e6ef 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3825,6 +3825,34 @@ int rte_eth_dev_get_dcb_info(uint8_t port_id,
 void *rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_rx_callback_fn fn, void *user_param);
 
+/*
+* Add a callback that must be called first on packet RX on a given port
+* and queue.
+*
+* This API configures a first function to be called for each burst of
+* packets received on a given NIC port queue. The return value is a pointer
+* that can be used to later remove the callback using
+* rte_eth_remove_rx_callback().
+*
+* Multiple functions are called in the order that they are added.
+*
+* @param port_id
+*   The port identifier of the Ethernet device.
+* @param queue_id
+*   The queue on the Ethernet device on which the callback is to be added.
+* @param fn
+*   The callback function
+* @param user_param
+*   A generic pointer parameter which will be passed to each invocation of the
+*   callback function on this port and queue.
+*
+* @return
+*   NULL on error.
+*   On success, a pointer value which can later be used to remove the callback.
+*/
+void *rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+		rte_rx_callback_fn fn, void *user_param);
+
 /**
  * Add a callback to be called on packet TX on a given port and queue.
  *
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 214ecc7..c990b04 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -132,3 +132,9 @@ DPDK_16.04 {
 	rte_eth_tx_buffer_set_err_callback;
 
 } DPDK_2.2;
+
+DPDK_16.07 {
+	global:
+
+	rte_eth_add_first_rx_callback;
+} DPDK_16.04;
-- 
2.5.0


* [PATCH v8 3/8] librte_ether: add new fields to rte_eth_dev_info struct
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 1/8] librte_ether: protect add/remove of rxtx callbacks with spinlocks Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 2/8] librte_ether: add new api rte_eth_add_first_rx_callback Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 4/8] librte_ether: make rte_eth_dev_get_port_by_name rte_eth_dev_get_name_by_port public Reshma Pattan
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

New fields nb_rx_queues and nb_tx_queues are added to the
rte_eth_dev_info structure.
rte_eth_dev_info_get() is changed to fill in these new fields
in the rte_eth_dev_info object.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 2 ++
 lib/librte_ether/rte_ethdev.h          | 3 +++
 lib/librte_ether/rte_ether_version.map | 1 +
 3 files changed, 6 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 97d167e..1f634c9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1661,6 +1661,8 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 	(*dev->dev_ops->dev_infos_get)(dev, dev_info);
 	dev_info->pci_dev = dev->pci_dev;
 	dev_info->driver_name = dev->data->drv_name;
+	dev_info->nb_rx_queues = dev->data->nb_rx_queues;
+	dev_info->nb_tx_queues = dev->data->nb_tx_queues;
 }
 
 int
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 237e6ef..8ad7c01 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -882,6 +882,9 @@ struct rte_eth_dev_info {
 	struct rte_eth_desc_lim rx_desc_lim;  /**< RX descriptors limits */
 	struct rte_eth_desc_lim tx_desc_lim;  /**< TX descriptors limits */
 	uint32_t speed_capa;  /**< Supported speeds bitmap (ETH_LINK_SPEED_). */
+	/** Configured number of rx/tx queues */
+	uint16_t nb_rx_queues; /**< Number of RX queues. */
+	uint16_t nb_tx_queues; /**< Number of TX queues. */
 };
 
 /**
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index c990b04..d06d648 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,4 +137,5 @@ DPDK_16.07 {
 	global:
 
 	rte_eth_add_first_rx_callback;
+	rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0


* [PATCH v8 4/8] librte_ether: make rte_eth_dev_get_port_by_name rte_eth_dev_get_name_by_port public
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
                     ` (2 preceding siblings ...)
  2016-06-10 16:18   ` [PATCH v8 3/8] librte_ether: add new fields to rte_eth_dev_info struct Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support Reshma Pattan
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Converted rte_eth_dev_get_port_by_name to a public API.
Converted rte_eth_dev_get_name_by_port to a public API.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c          |  4 ++--
 lib/librte_ether/rte_ethdev.h          | 29 +++++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |  2 ++
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 1f634c9..0b19569 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -406,7 +406,7 @@ rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
 	return 0;
 }
 
-static int
+int
 rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 {
 	char *tmp;
@@ -425,7 +425,7 @@ rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 	return 0;
 }
 
-static int
+int
 rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id)
 {
 	int i;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 8ad7c01..fab281e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4284,6 +4284,35 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
 				  uint32_t mask,
 				  uint8_t en);
 
+/**
+* Get the port id from pci address or device name
+* Ex: 0000:2:00.0 or vdev name eth_pcap0
+*
+* @param name
+*  pci address or name of the device
+* @param port_id
+*   pointer to port identifier of the device
+* @return
+*   - (0) if successful.
+*   - (-ENODEV or -EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id);
+
+/**
+* Get the device name from port id
+*
+* @param port_id
+*   port identifier of the device
+* @param name
+*  pci address or name of the device
+* @return
+*   - (0) if successful.
+*   - (-EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index d06d648..73e730d 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,5 +137,7 @@ DPDK_16.07 {
 	global:
 
 	rte_eth_add_first_rx_callback;
+	rte_eth_dev_get_name_by_port;
+	rte_eth_dev_get_port_by_name;
 	rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0


* [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
                     ` (3 preceding siblings ...)
  2016-06-10 16:18   ` [PATCH v8 4/8] librte_ether: make rte_eth_dev_get_port_by_name rte_eth_dev_get_name_by_port public Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 18:48     ` Aaron Conole
  2016-06-10 16:18   ` [PATCH v8 6/8] app/pdump: add pdump tool for packet capturing Reshma Pattan
                     ` (4 subsequent siblings)
  9 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added new library for packet capturing support.

Added public API rte_pdump_init; applications should call
it as part of their setup to have the packet
capturing framework ready.

Added public api rte_pdump_uninit to uninitialize the packet
capturing framework.

Added public apis rte_pdump_enable and rte_pdump_disable to
enable and disable packet capturing on specific port and queue.

Added public apis rte_pdump_enable_by_deviceid and
rte_pdump_disable_by_deviceid to enable and disable packet
capturing on a specific device (pci address or name) and queue.

Added public api rte_pdump_set_socket_dir to set the
server socket path.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 MAINTAINERS                            |   4 +
 config/common_base                     |   5 +
 lib/Makefile                           |   1 +
 lib/librte_pdump/Makefile              |  55 ++
 lib/librte_pdump/rte_pdump.c           | 904 +++++++++++++++++++++++++++++++++
 lib/librte_pdump/rte_pdump.h           | 208 ++++++++
 lib/librte_pdump/rte_pdump_version.map |  13 +
 mk/rte.app.mk                          |   1 +
 8 files changed, 1191 insertions(+)
 create mode 100644 lib/librte_pdump/Makefile
 create mode 100644 lib/librte_pdump/rte_pdump.c
 create mode 100644 lib/librte_pdump/rte_pdump.h
 create mode 100644 lib/librte_pdump/rte_pdump_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 3e8558f..cc3ffdb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -432,6 +432,10 @@ F: app/test/test_reorder*
 F: examples/packet_ordering/
 F: doc/guides/sample_app_ug/packet_ordering.rst
 
+Pdump
+M: Reshma Pattan <reshma.pattan@intel.com>
+F: lib/librte_pdump/
+
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
 F: lib/librte_sched/
diff --git a/config/common_base b/config/common_base
index 47c26f6..a2d5d72 100644
--- a/config/common_base
+++ b/config/common_base
@@ -484,6 +484,11 @@ CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
 CONFIG_RTE_LIBRTE_REORDER=y
 
 #
+# Compile the pdump library
+#
+CONFIG_RTE_LIBRTE_PDUMP=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/lib/Makefile b/lib/Makefile
index f254dba..ca7c02f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
 DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
+DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
new file mode 100644
index 0000000..af81a28
--- /dev/null
+++ b/lib/librte_pdump/Makefile
@@ -0,0 +1,55 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pdump.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+CFLAGS += -D_GNU_SOURCE
+
+EXPORT_MAP := rte_pdump_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_PDUMP) := rte_pdump.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_PDUMP)-include := rte_pdump.h
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
new file mode 100644
index 0000000..c4233cb
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump.c
@@ -0,0 +1,904 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_memcpy.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_pci.h>
+
+#include "rte_pdump.h"
+
+#define SOCKET_PATH_VAR_RUN "/var/run/pdump_sockets"
+#define SOCKET_PATH_HOME "HOME"
+#define SERVER_SOCKET "%s/pdump_server_socket"
+#define CLIENT_SOCKET "%s/pdump_client_socket_%d_%u"
+#define DEVICE_ID_SIZE 64
+/* Macros for printing using RTE_LOG */
+#define RTE_LOGTYPE_PDUMP RTE_LOGTYPE_USER1
+
+enum pdump_operation {
+	DISABLE = 1,
+	ENABLE = 2
+};
+
+enum pdump_socktype {
+	SERVER = 1,
+	CLIENT = 2
+};
+
+enum pdump_version {
+	V1 = 1
+};
+
+static pthread_t pdump_thread;
+static int pdump_socket_fd;
+static char socket_dir[PATH_MAX];
+
+struct pdump_request {
+	uint16_t ver;
+	uint16_t op;
+	uint32_t flags;
+	union pdump_data {
+		struct enable_v1 {
+			char device[DEVICE_ID_SIZE];
+			uint16_t queue;
+			struct rte_ring *ring;
+			struct rte_mempool *mp;
+			void *filter;
+		} en_v1;
+		struct disable_v1 {
+			char device[DEVICE_ID_SIZE];
+			uint16_t queue;
+			struct rte_ring *ring;
+			struct rte_mempool *mp;
+			void *filter;
+		} dis_v1;
+	} data;
+};
+
+struct pdump_response {
+	uint16_t ver;
+	uint16_t res_op;
+	int32_t err_value;
+};
+
+static struct pdump_rxtx_cbs {
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+	struct rte_eth_rxtx_callback *cb;
+	void *filter;
+} rx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT],
+tx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];
+
+static inline int
+pdump_pktmbuf_copy_data(struct rte_mbuf *seg, const struct rte_mbuf *m)
+{
+	if (rte_pktmbuf_tailroom(seg) < m->data_len) {
+		RTE_LOG(ERR, PDUMP,
+			"User mempool: insufficient data_len of mbuf\n");
+		return -EINVAL;
+	}
+
+	seg->port = m->port;
+	seg->vlan_tci = m->vlan_tci;
+	seg->hash = m->hash;
+	seg->tx_offload = m->tx_offload;
+	seg->ol_flags = m->ol_flags;
+	seg->packet_type = m->packet_type;
+	seg->vlan_tci_outer = m->vlan_tci_outer;
+	seg->data_len = m->data_len;
+	seg->pkt_len = seg->data_len;
+	rte_memcpy(rte_pktmbuf_mtod(seg, void *),
+			rte_pktmbuf_mtod(m, void *),
+			rte_pktmbuf_data_len(seg));
+
+	return 0;
+}
+
+static inline struct rte_mbuf *
+pdump_pktmbuf_copy(struct rte_mbuf *m, struct rte_mempool *mp)
+{
+	struct rte_mbuf *m_dup, *seg, **prev;
+	uint32_t pktlen;
+	uint8_t nseg;
+
+	m_dup = rte_pktmbuf_alloc(mp);
+	if (unlikely(m_dup == NULL))
+		return NULL;
+
+	seg = m_dup;
+	prev = &seg->next;
+	pktlen = m->pkt_len;
+	nseg = 0;
+
+	do {
+		nseg++;
+		if (pdump_pktmbuf_copy_data(seg, m) < 0) {
+			rte_pktmbuf_free(m_dup);
+			return NULL;
+		}
+		*prev = seg;
+		prev = &seg->next;
+	} while ((m = m->next) != NULL &&
+			(seg = rte_pktmbuf_alloc(mp)) != NULL);
+
+	*prev = NULL;
+	m_dup->nb_segs = nseg;
+	m_dup->pkt_len = pktlen;
+
+	/* Allocation of a new segment failed */
+	if (unlikely(seg == NULL)) {
+		rte_pktmbuf_free(m_dup);
+		return NULL;
+	}
+
+	__rte_mbuf_sanity_check(m_dup, 1);
+	return m_dup;
+}
+
+static inline void
+pdump_copy(struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
+{
+	unsigned i;
+	int ring_enq;
+	uint16_t d_pkts = 0;
+	struct rte_mbuf *dup_bufs[nb_pkts];
+	struct pdump_rxtx_cbs *cbs;
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+	struct rte_mbuf *p;
+
+	cbs  = user_params;
+	ring = cbs->ring;
+	mp = cbs->mp;
+	for (i = 0; i < nb_pkts; i++) {
+		p = pdump_pktmbuf_copy(pkts[i], mp);
+		if (p)
+			dup_bufs[d_pkts++] = p;
+	}
+
+	ring_enq = rte_ring_enqueue_burst(ring, (void *)dup_bufs, d_pkts);
+	if (unlikely(ring_enq < d_pkts)) {
+		RTE_LOG(DEBUG, PDUMP,
+			"only %d packets enqueued to ring\n", ring_enq);
+		do {
+			rte_pktmbuf_free(dup_bufs[ring_enq]);
+		} while (++ring_enq < d_pkts);
+	}
+}
+
+static uint16_t
+pdump_rx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+	struct rte_mbuf **pkts, uint16_t nb_pkts,
+	uint16_t max_pkts __rte_unused,
+	void *user_params)
+{
+	pdump_copy(pkts, nb_pkts, user_params);
+	return nb_pkts;
+}
+
+static uint16_t
+pdump_tx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+		struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
+{
+	pdump_copy(pkts, nb_pkts, user_params);
+	return nb_pkts;
+}
+
+static int
+pdump_get_dombdf(char *device_id, char *domBDF, size_t len)
+{
+	int ret;
+	struct rte_pci_addr dev_addr = {0};
+
+	/* identify if device_id is pci address or name */
+	ret = eal_parse_pci_DomBDF(device_id, &dev_addr);
+	if (ret < 0)
+		return -1;
+
+	if (dev_addr.domain)
+		ret = snprintf(domBDF, len, "%u:%u:%u.%u", dev_addr.domain,
+				dev_addr.bus, dev_addr.devid,
+				dev_addr.function);
+	else
+		ret = snprintf(domBDF, len, "%u:%u.%u", dev_addr.bus,
+				dev_addr.devid,
+				dev_addr.function);
+
+	return ret;
+}
+
+static int
+pdump_regitser_rx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
+				struct rte_ring *ring, struct rte_mempool *mp,
+				uint16_t operation)
+{
+	uint16_t qid;
+	struct pdump_rxtx_cbs *cbs = NULL;
+
+	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
+	for (; qid < end_q; qid++) {
+		cbs = &rx_cbs[port][qid];
+		if (cbs && operation == ENABLE) {
+			if (cbs->cb) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add rx callback for port=%d "
+					"and queue=%d, callback already exists\n",
+					port, qid);
+				return -EEXIST;
+			}
+			cbs->ring = ring;
+			cbs->mp = mp;
+			cbs->cb = rte_eth_add_first_rx_callback(port, qid,
+								pdump_rx, cbs);
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add rx callback, errno=%d\n",
+					rte_errno);
+				return -rte_errno;
+			}
+		}
+		if (cbs && operation == DISABLE) {
+			int ret;
+
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to delete non existing rx "
+					"callback for port=%d and queue=%d\n",
+					port, qid);
+				return -EINVAL;
+			}
+			ret = rte_eth_remove_rx_callback(port, qid, cbs->cb);
+			if (ret < 0) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to remove rx callback, errno=%d\n",
+					rte_errno);
+				return ret;
+			}
+			cbs->cb = NULL;
+		}
+	}
+
+	return 0;
+}
+
+static int
+pdump_regitser_tx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
+				struct rte_ring *ring, struct rte_mempool *mp,
+				uint16_t operation)
+{
+
+	uint16_t qid;
+	struct pdump_rxtx_cbs *cbs = NULL;
+
+	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
+	for (; qid < end_q; qid++) {
+		cbs = &tx_cbs[port][qid];
+		if (cbs && operation == ENABLE) {
+			if (cbs->cb) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add tx callback for port=%d "
+					"and queue=%d, callback already exists\n",
+					port, qid);
+				return -EEXIST;
+			}
+			cbs->ring = ring;
+			cbs->mp = mp;
+			cbs->cb = rte_eth_add_tx_callback(port, qid, pdump_tx,
+								cbs);
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add tx callback, errno=%d\n",
+					rte_errno);
+				return -rte_errno;
+			}
+		}
+		if (cbs && operation == DISABLE) {
+			int ret;
+
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to delete non existing tx "
+					"callback for port=%d and queue=%d\n",
+					port, qid);
+				return -EINVAL;
+			}
+			ret = rte_eth_remove_tx_callback(port, qid, cbs->cb);
+			if (ret < 0) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to remove tx callback, errno=%d\n",
+					rte_errno);
+				return ret;
+			}
+			cbs->cb = NULL;
+		}
+	}
+
+	return 0;
+}
+
+static int
+set_pdump_rxtx_cbs(struct pdump_request *p)
+{
+	uint16_t nb_rx_q, nb_tx_q = 0, end_q, queue;
+	uint8_t port;
+	int ret = 0;
+	uint32_t flags;
+	uint16_t operation;
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+
+	flags = p->flags;
+	operation = p->op;
+	if (operation == ENABLE) {
+		ret = rte_eth_dev_get_port_by_name(p->data.en_v1.device,
+				&port);
+		if (ret < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to get port id for device id=%s\n",
+				p->data.en_v1.device);
+			return -EINVAL;
+		}
+		queue = p->data.en_v1.queue;
+		ring = p->data.en_v1.ring;
+		mp = p->data.en_v1.mp;
+	} else {
+		ret = rte_eth_dev_get_port_by_name(p->data.dis_v1.device,
+				&port);
+		if (ret < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to get port id for device id=%s\n",
+				p->data.dis_v1.device);
+			return -EINVAL;
+		}
+		queue = p->data.dis_v1.queue;
+		ring = p->data.dis_v1.ring;
+		mp = p->data.dis_v1.mp;
+	}
+
+	/* validation if packet capture is for all queues */
+	if (queue == RTE_PDUMP_ALL_QUEUES) {
+		struct rte_eth_dev_info dev_info;
+
+		rte_eth_dev_info_get(port, &dev_info);
+		nb_rx_q = dev_info.nb_rx_queues;
+		nb_tx_q = dev_info.nb_tx_queues;
+		if (nb_rx_q == 0 && flags & RTE_PDUMP_FLAG_RX) {
+			RTE_LOG(ERR, PDUMP,
+				"number of rx queues cannot be 0\n");
+			return -EINVAL;
+		}
+		if (nb_tx_q == 0 && flags & RTE_PDUMP_FLAG_TX) {
+			RTE_LOG(ERR, PDUMP,
+				"number of tx queues cannot be 0\n");
+			return -EINVAL;
+		}
+		if ((nb_tx_q == 0 || nb_rx_q == 0) &&
+			flags == RTE_PDUMP_FLAG_RXTX) {
+			RTE_LOG(ERR, PDUMP,
+				"both tx&rx queues must be non zero\n");
+			return -EINVAL;
+		}
+	}
+
+	/* register RX callback */
+	if (flags & RTE_PDUMP_FLAG_RX) {
+		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_rx_q : queue + 1;
+		ret = pdump_regitser_rx_callbacks(end_q, port, queue, ring, mp,
+							operation);
+		if (ret < 0)
+			return ret;
+	}
+
+	/* register TX callback */
+	if (flags & RTE_PDUMP_FLAG_TX) {
+		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_tx_q : queue + 1;
+		ret = pdump_regitser_tx_callbacks(end_q, port, queue, ring, mp,
+							operation);
+		if (ret < 0)
+			return ret;
+	}
+
+	return ret;
+}
+
+/* get socket path (/var/run if root, $HOME otherwise) */
+static void
+pdump_get_socket_path(char *buffer, int bufsz, enum pdump_socktype type)
+{
+	const char *dir = NULL;
+
+	if (type == SERVER && socket_dir[0] != 0)
+		dir = socket_dir;
+	else {
+
+		if (getuid() != 0)
+			dir = getenv(SOCKET_PATH_HOME);
+		else
+			dir = SOCKET_PATH_VAR_RUN;
+	}
+
+	mkdir(dir, 0700);
+	if (type == SERVER)
+		snprintf(buffer, bufsz, SERVER_SOCKET, dir);
+	else
+		snprintf(buffer, bufsz, CLIENT_SOCKET, dir, getpid(),
+				rte_sys_gettid());
+}
+
+static int
+pdump_create_server_socket(void)
+{
+	int ret, socket_fd;
+	struct sockaddr_un addr;
+	socklen_t addr_len;
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path), SERVER);
+	addr.sun_family = AF_UNIX;
+
+	/* remove if file already exists */
+	unlink(addr.sun_path);
+
+	/* set up a server socket */
+	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
+	if (socket_fd < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	addr_len = sizeof(struct sockaddr_un);
+	ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
+	if (ret) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to bind to server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		close(socket_fd);
+		return -1;
+	}
+
+	/* save the socket in local configuration */
+	pdump_socket_fd = socket_fd;
+
+	return 0;
+}
+
+static __attribute__((noreturn)) void *
+pdump_thread_main(__rte_unused void *arg)
+{
+	struct sockaddr_un cli_addr;
+	socklen_t cli_len;
+	struct pdump_request cli_req;
+	struct pdump_response resp;
+	int n;
+	int ret = 0;
+
+	/* host thread, never break out */
+	for (;;) {
+		/* recv client requests */
+		cli_len = sizeof(cli_addr);
+		n = recvfrom(pdump_socket_fd, &cli_req,
+				sizeof(struct pdump_request), 0,
+				(struct sockaddr *)&cli_addr, &cli_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to recv from client:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			continue;
+		}
+
+		ret = set_pdump_rxtx_cbs(&cli_req);
+
+		resp.ver = cli_req.ver;
+		resp.res_op = cli_req.op;
+		resp.err_value = ret;
+		n = sendto(pdump_socket_fd, &resp,
+				sizeof(struct pdump_response),
+				0, (struct sockaddr *)&cli_addr, cli_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to send to client:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+		}
+	}
+}
+
+int
+rte_pdump_init(const char *path)
+{
+	int ret = 0;
+	char thread_name[RTE_MAX_THREAD_NAME_LEN];
+
+	ret = rte_pdump_set_socket_dir(path);
+	if (ret != 0)
+		return -1;
+
+	ret = pdump_create_server_socket();
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create server socket:%s:%d\n",
+			__func__, __LINE__);
+		return -1;
+	}
+
+	/* create the host thread to wait/handle pdump requests */
+	ret = pthread_create(&pdump_thread, NULL, pdump_thread_main, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create the pdump thread:%s, %s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+	/* Set thread_name for aid in debugging. */
+	snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN, "pdump-thread");
+	ret = rte_thread_setname(pdump_thread, thread_name);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, PDUMP,
+			"Failed to set thread name for pdump handling\n");
+	}
+
+	return 0;
+}
+
+int
+rte_pdump_uninit(void)
+{
+	int ret;
+
+	ret = pthread_cancel(pdump_thread);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to cancel the pdump thread:%s, %s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+
+	ret = close(pdump_socket_fd);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to close server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	struct sockaddr_un addr;
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path), SERVER);
+	ret = unlink(addr.sun_path);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to remove server socket addr: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_create_client_socket(struct pdump_request *p)
+{
+	int ret, socket_fd;
+	int pid;
+	int n;
+	struct pdump_response server_resp;
+	struct sockaddr_un addr, serv_addr, from;
+	socklen_t addr_len, serv_len;
+
+	pid = getpid();
+
+	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
+	if (socket_fd < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"client socket(): %s:pid(%d):tid(%u), %s:%d\n",
+			strerror(errno), pid, rte_sys_gettid(),
+			__func__, __LINE__);
+		ret = -errno;
+		return ret;
+	}
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path), CLIENT);
+	addr.sun_family = AF_UNIX;
+	addr_len = sizeof(struct sockaddr_un);
+
+	do {
+		ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
+		if (ret) {
+			RTE_LOG(ERR, PDUMP,
+				"client bind(): %s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = -errno;
+			break;
+		}
+
+		serv_len = sizeof(struct sockaddr_un);
+		memset(&serv_addr, 0, sizeof(serv_addr));
+		pdump_get_socket_path(serv_addr.sun_path,
+					sizeof(serv_addr.sun_path),
+					SERVER);
+		serv_addr.sun_family = AF_UNIX;
+
+		n =  sendto(socket_fd, p, sizeof(struct pdump_request), 0,
+				(struct sockaddr *)&serv_addr, serv_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to send to server:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = -errno;
+			break;
+		}
+
+		n = recvfrom(socket_fd, &server_resp,
+				sizeof(struct pdump_response), 0,
+				(struct sockaddr *)&from, &serv_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to recv from server:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = -errno;
+			break;
+		}
+		ret = server_resp.err_value;
+	} while (0);
+
+	close(socket_fd);
+	unlink(addr.sun_path);
+	return ret;
+}
+
+static int
+pdump_validate_ring_mp(struct rte_ring *ring, struct rte_mempool *mp)
+{
+	if (ring == NULL || mp == NULL) {
+		RTE_LOG(ERR, PDUMP, "NULL ring or mempool are passed %s:%d\n",
+			__func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+	if (mp->flags & MEMPOOL_F_SP_PUT || mp->flags & MEMPOOL_F_SC_GET) {
+		RTE_LOG(ERR, PDUMP, "mempool with either SP or SC settings"
+		" is not valid for pdump, should have MP and MC settings\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+	if (ring->prod.sp_enqueue || ring->cons.sc_dequeue) {
+		RTE_LOG(ERR, PDUMP, "ring with either SP or SC settings"
+		" is not valid for pdump, should have MP and MC settings\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_validate_flags(uint32_t flags)
+{
+	if (flags != RTE_PDUMP_FLAG_RX && flags != RTE_PDUMP_FLAG_TX &&
+		flags != RTE_PDUMP_FLAG_RXTX) {
+		RTE_LOG(ERR, PDUMP,
+			"invalid flags, should be either rx/tx/rxtx\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_validate_port(uint8_t port, char *name)
+{
+	int ret = 0;
+
+	if (port >= RTE_MAX_ETHPORTS) {
+		RTE_LOG(ERR, PDUMP, "Invalid port id %u, %s:%d\n", port,
+			__func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = rte_eth_dev_get_name_by_port(port, name);
+	if (ret < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"port id to name mapping failed for port id=%u, %s:%d\n",
+			port, __func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_prepare_client_request(char *device, uint16_t queue,
+				uint32_t flags,
+				uint16_t operation,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter)
+{
+	int ret;
+	struct pdump_request req = {.ver = 1,};
+
+	req.flags = flags;
+	req.op =  operation;
+	if (operation == ENABLE) {
+		strncpy(req.data.en_v1.device, device, DEVICE_ID_SIZE - 1);
+		req.data.en_v1.queue = queue;
+		req.data.en_v1.ring = ring;
+		req.data.en_v1.mp = mp;
+		req.data.en_v1.filter = filter;
+	} else {
+		strncpy(req.data.dis_v1.device, device, DEVICE_ID_SIZE - 1);
+		req.data.dis_v1.queue = queue;
+		req.data.dis_v1.ring = NULL;
+		req.data.dis_v1.mp = NULL;
+		req.data.dis_v1.filter = NULL;
+	}
+
+	ret = pdump_create_client_socket(&req);
+	if (ret < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"client request for pdump enable/disable failed\n");
+		rte_errno = -ret;
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
+			struct rte_ring *ring,
+			struct rte_mempool *mp,
+			void *filter)
+{
+
+	int ret = 0;
+	char name[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_port(port, name);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_ring_mp(ring, mp);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	ret = pdump_prepare_client_request(name, queue, flags,
+						ENABLE, ring, mp, filter);
+
+	return ret;
+}
+
+int
+rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter)
+{
+	int ret = 0;
+	char domBDF[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_ring_mp(ring, mp);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
+		ret = pdump_prepare_client_request(domBDF, queue, flags,
+						ENABLE, ring, mp, filter);
+	else
+		ret = pdump_prepare_client_request(device_id, queue, flags,
+						ENABLE, ring, mp, filter);
+
+	return ret;
+}
+
+int
+rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags)
+{
+	int ret = 0;
+	char name[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_port(port, name);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	ret = pdump_prepare_client_request(name, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+
+	return ret;
+}
+
+int
+rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags)
+{
+	int ret = 0;
+	char domBDF[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
+		ret = pdump_prepare_client_request(domBDF, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+	else
+		ret = pdump_prepare_client_request(device_id, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+
+	return ret;
+}
+
+int
+rte_pdump_set_socket_dir(const char *path)
+{
+	int ret, count;
+
+	if (path != NULL) {
+		count = sizeof(socket_dir);
+		ret = snprintf(socket_dir, count, "%s", path);
+		if (ret < 0  || ret >= count) {
+			RTE_LOG(ERR, PDUMP,
+					"Invalid server socket path:%s:%d\n",
+					__func__, __LINE__);
+			socket_dir[0] = 0;
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
diff --git a/lib/librte_pdump/rte_pdump.h b/lib/librte_pdump/rte_pdump.h
new file mode 100644
index 0000000..63e8ac3
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump.h
@@ -0,0 +1,208 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PDUMP_H_
+#define _RTE_PDUMP_H_
+
+/**
+ * @file
+ * RTE pdump
+ *
+ * packet dump library to provide packet capturing support on dpdk.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_PDUMP_ALL_QUEUES UINT16_MAX
+
+enum {
+	RTE_PDUMP_FLAG_RX = 1,  /* receive direction */
+	RTE_PDUMP_FLAG_TX = 2,  /* transmit direction */
+	/* both receive and transmit directions */
+	RTE_PDUMP_FLAG_RXTX = (RTE_PDUMP_FLAG_RX|RTE_PDUMP_FLAG_TX)
+};
+
+/**
+ * Initialize packet capturing handling
+ *
+ * Creates a pthread and a server socket to handle client
+ * requests to enable/disable rxtx callbacks.
+ *
+ * @param path
+ * directory path for server socket.
+ *
+ * @return
+ *    0 on success, -1 on error
+ */
+int
+rte_pdump_init(const char *path);
+
+/**
+ * Uninitialize packet capturing handling
+ *
+ * Cancels the pthread, closes the server socket, removes the socket address.
+ *
+ * @return
+ *    0 on success, -1 on error
+ */
+int
+rte_pdump_uninit(void);
+
+/**
+ * Enables packet capturing on given port and queue.
+ *
+ * @param port
+ *  port on which packet capturing should be enabled.
+ * @param queue
+ *  queue of a given port on which packet capturing should be enabled.
+ *  users should pass on value UINT16_MAX to enable packet capturing on all
+ *  queues of a given port.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be enabled for a given port and queue.
+ * @param ring
+ *  ring on which captured packets will be enqueued for user.
+ * @param mp
+ *  mempool on to which original packets will be mirrored or duplicated.
+ * @param filter
+ *  placeholder for packet filtering.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
+		struct rte_ring *ring,
+		struct rte_mempool *mp,
+		void *filter);
+
+/**
+ * Disables packet capturing on given port and queue.
+ *
+ * @param port
+ *  port on which packet capturing should be disabled.
+ * @param queue
+ *  queue of a given port on which packet capturing should be disabled.
+ *  users should pass on value UINT16_MAX to disable packet capturing on all
+ *  queues of a given port.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be disabled for a given port and queue.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags);
+
+/**
+ * Enables packet capturing on given device id and queue.
+ * device_id can be name or pci address of device.
+ *
+ * @param device_id
+ *  device id on which packet capturing should be enabled.
+ * @param queue
+ *  queue of a given device id on which packet capturing should be enabled.
+ *  users should pass on value UINT16_MAX to enable packet capturing on all
+ *  queues of a given device id.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be enabled for a given port and queue.
+ * @param ring
+ *  ring on which captured packets will be enqueued for user.
+ * @param mp
+ *  mempool on to which original packets will be mirrored or duplicated.
+ * @param filter
+ *  placeholder for packet filtering.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter);
+
+/**
+ * Disables packet capturing on given device_id and queue.
+ * device_id can be name or pci address of device.
+ *
+ * @param device_id
+ *  pci address or name of the device on which packet capturing
+ *  should be disabled.
+ * @param queue
+ *  queue of a given device on which packet capturing should be disabled.
+ *  users should pass on value UINT16_MAX to disable packet capturing on all
+ *  queues of a given device id.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be disabled for a given port and queue.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+int
+rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags);
+
+/**
+ * Allows applications to set server socket path.
+ * If the specified path is NULL, the default path is selected, i.e.
+ * "/var/run/" for root user and "$HOME" for non-root user.
+ * Clients need to call this API only when their server path is non default
+ * path. This path will be used internally to send pdump enable or disable
+ * requests to the server.
+ * This API is not thread-safe.
+ *
+ * @param path
+ * directory path for server socket.
+ *
+ * @return
+ * 0 on success, -EINVAL on error
+ *
+ */
+int
+rte_pdump_set_socket_dir(const char *path);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PDUMP_H_ */
diff --git a/lib/librte_pdump/rte_pdump_version.map b/lib/librte_pdump/rte_pdump_version.map
new file mode 100644
index 0000000..edec99a
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump_version.map
@@ -0,0 +1,13 @@
+DPDK_16.07 {
+	global:
+
+	rte_pdump_disable;
+	rte_pdump_disable_by_deviceid;
+	rte_pdump_enable;
+	rte_pdump_enable_by_deviceid;
+	rte_pdump_init;
+	rte_pdump_set_socket_dir;
+	rte_pdump_uninit;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index b84b56d..f792f2a 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -61,6 +61,7 @@ _LDLIBS-y += --whole-archive
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni
-- 
2.5.0


* [PATCH v8 6/8] app/pdump: add pdump tool for packet capturing
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
                     ` (4 preceding siblings ...)
  2016-06-10 16:18   ` [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 7/8] app/test-pmd: add pdump initialization uninitialization Reshma Pattan
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Add a new tool for capturing packets on DPDK ports.
The tool is configured through command line options and
runs as a secondary process by default.

The command line accepts several parameters that select
what to capture.

Users should identify the capture source by passing one of
a) port and queue, b) PCI address and queue, or c) device
name and queue.

Users also need to pass either a pcap file name or a Linux
interface name, to which the packets captured from the DPDK
ports are sent so that they can be viewed with tcpdump.

Users can capture packets a) in the RX direction, b) in the
TX direction, or c) in both directions.
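
As an illustration of the direction choice, here is a minimal
standalone sketch of the direction bitmask. The numeric values mirror
the RTE_PDUMP_FLAG_* enum in rte_pdump.h from this series; the names
PDUMP_FLAG_*, direction_is_valid(), wants_rx() and wants_tx() are
hypothetical and are not part of the library or the tool.

```c
#include <stdint.h>

/* Values mirror the RTE_PDUMP_FLAG_* enum in rte_pdump.h (assumed here). */
enum {
	PDUMP_FLAG_RX = 1,	/* receive direction */
	PDUMP_FLAG_TX = 2,	/* transmit direction */
	PDUMP_FLAG_RXTX = (PDUMP_FLAG_RX | PDUMP_FLAG_TX)
};

/* Hypothetical helper: 0 if flags names a supported direction, else -1. */
int
direction_is_valid(uint32_t flags)
{
	if (flags != PDUMP_FLAG_RX && flags != PDUMP_FLAG_TX &&
		flags != PDUMP_FLAG_RXTX)
		return -1;

	return 0;
}

/* Hypothetical helper: non-zero when RX capture is requested. */
int
wants_rx(uint32_t flags)
{
	return (flags & PDUMP_FLAG_RX) != 0;
}

/* Hypothetical helper: non-zero when TX capture is requested. */
int
wants_tx(uint32_t flags)
{
	return (flags & PDUMP_FLAG_TX) != 0;
}
```

Because "both directions" is the bitwise OR of the two single
directions, one request can be tested for RX and for TX independently.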

Users can pass the ring_size and mempool parameters on the
command line, but these are optional. They are used to create
the ring and mempool objects that mirror packets from the
primary application to the tool. If the user does not provide
values, default values are used internally to create the ring
and mempool.
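
The ring size matters because mirrored copies that do not fit in the
ring are dropped rather than queued. The sketch below shows that
behaviour with a toy fixed-size ring, loosely modelled on how the
library's capture callback frees copies that the burst enqueue did not
accept. It does not use the DPDK ring API; struct toy_ring,
TOY_RING_SIZE and the toy_* functions are hypothetical names, and the
capacity of 4 is an arbitrary value, not a default of the tool.

```c
#include <stdlib.h>

#define TOY_RING_SIZE 4	/* hypothetical capacity, not a DPDK default */

struct toy_ring {
	void *slots[TOY_RING_SIZE];
	unsigned int count;
};

/* Enqueue up to n objects; returns how many were accepted. */
unsigned int
toy_ring_enqueue_burst(struct toy_ring *r, void **objs, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n && r->count < TOY_RING_SIZE; i++)
		r->slots[r->count++] = objs[i];

	return i;
}

/* Release one mirrored copy (plain heap memory in this sketch). */
void
toy_release(void *obj)
{
	free(obj);
}

/*
 * Hand the copies to the ring and release the ones that did not fit.
 * Returns the number of copies dropped.
 */
unsigned int
toy_mirror(struct toy_ring *r, void **copies, unsigned int n,
		void (*release)(void *))
{
	unsigned int enq = toy_ring_enqueue_burst(r, copies, n);
	unsigned int dropped = 0;

	for (; enq < n; enq++) {
		release(copies[enq]);
		dropped++;
	}

	return dropped;
}
```

With a capacity of 4, mirroring a burst of 6 copies keeps the first 4
and releases the remaining 2, so a consumer that drains the ring slowly
sees gaps in the capture rather than stalling the primary application.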

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 MAINTAINERS        |   1 +
 app/Makefile       |   1 +
 app/pdump/Makefile |  45 +++
 app/pdump/main.c   | 844 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 891 insertions(+)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index cc3ffdb..a48c8de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -435,6 +435,7 @@ F: doc/guides/sample_app_ug/packet_ordering.rst
 Pdump
 M: Reshma Pattan <reshma.pattan@intel.com>
 F: lib/librte_pdump/
+F: app/pdump/
 
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
diff --git a/app/Makefile b/app/Makefile
index 1151e09..c593efa 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -37,5 +37,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += proc_info
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += pdump
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/pdump/Makefile b/app/pdump/Makefile
new file mode 100644
index 0000000..96bb4af
--- /dev/null
+++ b/app/pdump/Makefile
@@ -0,0 +1,45 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+APP = dpdk_pdump
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+
+SRCS-y := main.c
+
+# this application needs libraries first
+DEPDIRS-y += lib
+
+include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/pdump/main.c b/app/pdump/main.c
new file mode 100644
index 0000000..f8923b9
--- /dev/null
+++ b/app/pdump/main.c
@@ -0,0 +1,844 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdlib.h>
+#include <getopt.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <net/if.h>
+
+#include <rte_eal.h>
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_ethdev.h>
+#include <rte_memory.h>
+#include <rte_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_errno.h>
+#include <rte_dev.h>
+#include <rte_kvargs.h>
+#include <rte_mempool.h>
+#include <rte_ring.h>
+#include <rte_pdump.h>
+
+#define PDUMP_PORT_ARG "port"
+#define PDUMP_PCI_ARG "device_id"
+#define PDUMP_QUEUE_ARG "queue"
+#define PDUMP_DIR_ARG "dir"
+#define PDUMP_RX_DEV_ARG "rx-dev"
+#define PDUMP_TX_DEV_ARG "tx-dev"
+#define PDUMP_RING_SIZE_ARG "ring-size"
+#define PDUMP_MSIZE_ARG "mbuf-size"
+#define PDUMP_NUM_MBUFS_ARG "total-num-mbufs"
+
+#define VDEV_PCAP "eth_pcap_%s_%d,tx_pcap=%s"
+#define VDEV_IFACE "eth_pcap_%s_%d,tx_iface=%s"
+#define TX_STREAM_SIZE 64
+
+#define MP_NAME "pdump_pool_%d"
+
+#define RX_RING "rx_ring_%d"
+#define TX_RING "tx_ring_%d"
+
+#define RX_STR "rx"
+#define TX_STR "tx"
+
+/* Maximum number of pdump tuples, followed by default sizes. */
+#define APP_ARG_TCPDUMP_MAX_TUPLES 54
+#define MBUF_POOL_CACHE_SIZE 250
+#define TX_DESC_PER_QUEUE 512
+#define RX_DESC_PER_QUEUE 128
+#define MBUFS_PER_POOL 65535
+#define MAX_LONG_OPT_SZ 64
+#define RING_SIZE 16384
+#define SIZE 256
+#define BURST_SIZE 32
+#define NUM_VDEVS 2
+
+#define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
+/* true if x is a power of 2 */
+#define POWEROF2(x) ((((x)-1) & (x)) == 0)
+
+enum pdump_en_dis {
+	DISABLE = 1,
+	ENABLE = 2
+};
+
+enum pcap_stream {
+	IFACE = 1,
+	PCAP = 2
+};
+
+enum pdump_by {
+	PORT_ID = 1,
+	DEVICE_ID = 2
+};
+
+const char *valid_pdump_arguments[] = {
+	PDUMP_PORT_ARG,
+	PDUMP_PCI_ARG,
+	PDUMP_QUEUE_ARG,
+	PDUMP_DIR_ARG,
+	PDUMP_RX_DEV_ARG,
+	PDUMP_TX_DEV_ARG,
+	PDUMP_RING_SIZE_ARG,
+	PDUMP_MSIZE_ARG,
+	PDUMP_NUM_MBUFS_ARG,
+	NULL
+};
+
+struct pdump_stats {
+	uint64_t dequeue_pkts;
+	uint64_t tx_pkts;
+	uint64_t freed_pkts;
+};
+
+struct pdump_tuples {
+	/* cli params */
+	uint8_t port;
+	char *device_id;
+	uint16_t queue;
+	char rx_dev[TX_STREAM_SIZE];
+	char tx_dev[TX_STREAM_SIZE];
+	uint32_t ring_size;
+	uint16_t mbuf_data_size;
+	uint32_t total_num_mbufs;
+
+	/* params for library API call */
+	uint32_t dir;
+	struct rte_mempool *mp;
+	struct rte_ring *rx_ring;
+	struct rte_ring *tx_ring;
+
+	/* params for packet dumping */
+	enum pdump_by dump_by_type;
+	int rx_vdev_id;
+	int tx_vdev_id;
+	enum pcap_stream rx_vdev_stream_type;
+	enum pcap_stream tx_vdev_stream_type;
+	bool single_pdump_dev;
+
+	/* stats */
+	struct pdump_stats stats;
+} __rte_cache_aligned;
+static struct pdump_tuples pdump_t[APP_ARG_TCPDUMP_MAX_TUPLES];
+
+struct parse_val {
+	uint64_t min;
+	uint64_t max;
+	uint64_t val;
+};
+
+int num_tuples;
+static struct rte_eth_conf port_conf_default;
+volatile uint8_t quit_signal;
+
+/**< display usage */
+static void
+pdump_usage(const char *prgname)
+{
+	printf("usage: %s [EAL options] -- --pdump "
+			"'(port=<port id> | device_id=<pci id or vdev name>),"
+			"(queue=<queue_id>),"
+			"(rx-dev=<iface or pcap file> |"
+			" tx-dev=<iface or pcap file>),"
+			"[ring-size=<ring size>default:16384],"
+			"[mbuf-size=<mbuf data size>default:2176],"
+			"[total-num-mbufs=<number of mbufs>default:65535]"
+			"'\n",
+			prgname);
+}
+
+static int
+parse_device_id(const char *key __rte_unused, const char *value,
+		void *extra_args)
+{
+	struct pdump_tuples *pt = extra_args;
+
+	pt->device_id = strdup(value);
+	pt->dump_by_type = DEVICE_ID;
+
+	return 0;
+}
+
+static int
+parse_queue(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long n;
+	struct pdump_tuples *pt = extra_args;
+
+	if (!strcmp(value, "*"))
+		pt->queue = RTE_PDUMP_ALL_QUEUES;
+	else {
+		n = strtoul(value, NULL, 10);
+		pt->queue = (uint16_t) n;
+	}
+	return 0;
+}
+
+static int
+parse_rxtxdev(const char *key, const char *value, void *extra_args)
+{
+
+	struct pdump_tuples *pt = extra_args;
+
+	if (!strcmp(key, PDUMP_RX_DEV_ARG)) {
+		snprintf(pt->rx_dev, sizeof(pt->rx_dev), "%s", value);
+		/* identify the rx stream type for pcap vdev */
+		if (if_nametoindex(pt->rx_dev))
+			pt->rx_vdev_stream_type = IFACE;
+	} else if (!strcmp(key, PDUMP_TX_DEV_ARG)) {
+		snprintf(pt->tx_dev, sizeof(pt->tx_dev), "%s", value);
+		/* identify the tx stream type for pcap vdev */
+		if (if_nametoindex(pt->tx_dev))
+			pt->tx_vdev_stream_type = IFACE;
+	} else {
+		printf("invalid dev type %s, must be rx or tx\n", value);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+parse_uint_value(const char *key, const char *value, void *extra_args)
+{
+	struct parse_val *v;
+	unsigned long t;
+	char *end;
+	int ret = 0;
+
+	errno = 0;
+	v = extra_args;
+	t = strtoul(value, &end, 10);
+
+	if (errno != 0 || end[0] != 0 || t < v->min || t > v->max) {
+		printf("invalid value:\"%s\" for key:\"%s\", "
+			"value must be >= %"PRIu64" and <= %"PRIu64"\n",
+			value, key, v->min, v->max);
+		ret = -EINVAL;
+	}
+	if (!strcmp(key, PDUMP_RING_SIZE_ARG) && !POWEROF2(t)) {
+		printf("invalid value:\"%s\" for key:\"%s\", "
+			"value must be power of 2\n", value, key);
+		ret = -EINVAL;
+	}
+
+	if (ret != 0)
+		return ret;
+
+	v->val = t;
+	return 0;
+}
+
+static int
+parse_pdump(const char *optarg)
+{
+	struct rte_kvargs *kvlist;
+	int ret = 0, cnt1, cnt2;
+	struct pdump_tuples *pt;
+	struct parse_val v = {0};
+
+	pt = &pdump_t[num_tuples];
+
+	/* initial check for invalid arguments */
+	kvlist = rte_kvargs_parse(optarg, valid_pdump_arguments);
+	if (kvlist == NULL) {
+		printf("--pdump=\"%s\": invalid argument passed\n", optarg);
+		return -1;
+	}
+
+	/* port/device_id parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_PORT_ARG);
+	cnt2 = rte_kvargs_count(kvlist, PDUMP_PCI_ARG);
+	if (!((cnt1 == 1 && cnt2 == 0) || (cnt1 == 0 && cnt2 == 1))) {
+		printf("--pdump=\"%s\": must have either port or "
+			"device_id argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	} else if (cnt1 == 1) {
+		v.min = 0;
+		v.max = RTE_MAX_ETHPORTS-1;
+		ret = rte_kvargs_process(kvlist, PDUMP_PORT_ARG,
+				&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->port = (uint8_t) v.val;
+		pt->dump_by_type = PORT_ID;
+	} else if (cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_PCI_ARG,
+				&parse_device_id, pt);
+		if (ret < 0)
+			goto free_kvlist;
+	}
+
+	/* queue parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_QUEUE_ARG);
+	if (cnt1 != 1) {
+		printf("--pdump=\"%s\": must have queue argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	}
+	ret = rte_kvargs_process(kvlist, PDUMP_QUEUE_ARG, &parse_queue, pt);
+	if (ret < 0)
+		goto free_kvlist;
+
+	/* rx-dev and tx-dev parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_RX_DEV_ARG);
+	cnt2 = rte_kvargs_count(kvlist, PDUMP_TX_DEV_ARG);
+	if (cnt1 == 0 && cnt2 == 0) {
+		printf("--pdump=\"%s\": must have either rx-dev or "
+			"tx-dev argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	} else if (cnt1 == 1 && cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_RX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		ret = rte_kvargs_process(kvlist, PDUMP_TX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		/* if captured packets must be sent to the same vdev */
+		if (!strcmp(pt->rx_dev, pt->tx_dev))
+			pt->single_pdump_dev = true;
+		pt->dir = RTE_PDUMP_FLAG_RXTX;
+	} else if (cnt1 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_RX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->dir = RTE_PDUMP_FLAG_RX;
+	} else if (cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_TX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->dir = RTE_PDUMP_FLAG_TX;
+	}
+
+	/* optional */
+	/* ring_size parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_RING_SIZE_ARG);
+	if (cnt1 == 1) {
+		v.min = 2;
+		v.max = RTE_RING_SZ_MASK-1;
+		ret = rte_kvargs_process(kvlist, PDUMP_RING_SIZE_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->ring_size = (uint32_t) v.val;
+	} else
+		pt->ring_size = RING_SIZE;
+
+	/* mbuf_data_size parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_MSIZE_ARG);
+	if (cnt1 == 1) {
+		v.min = 1;
+		v.max = UINT16_MAX;
+		ret = rte_kvargs_process(kvlist, PDUMP_MSIZE_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->mbuf_data_size = (uint16_t) v.val;
+	} else
+		pt->mbuf_data_size = RTE_MBUF_DEFAULT_BUF_SIZE;
+
+	/* total_num_mbufs parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_NUM_MBUFS_ARG);
+	if (cnt1 == 1) {
+		v.min = 1025;
+		v.max = UINT16_MAX;
+		ret = rte_kvargs_process(kvlist, PDUMP_NUM_MBUFS_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->total_num_mbufs = (uint16_t) v.val;
+	} else
+		pt->total_num_mbufs = MBUFS_PER_POOL;
+
+	num_tuples++;
+
+free_kvlist:
+	rte_kvargs_free(kvlist);
+	return ret;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+launch_args_parse(int argc, char **argv, char *prgname)
+{
+	int opt, ret;
+	int option_index;
+	static struct option long_option[] = {
+		{"pdump", 1, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+
+	if (argc == 1)
+		pdump_usage(prgname);
+
+	/* Parse command line */
+	while ((opt = getopt_long(argc, argv, " ",
+			long_option, &option_index)) != EOF) {
+		switch (opt) {
+		case 0:
+			if (!strncmp(long_option[option_index].name, "pdump",
+					MAX_LONG_OPT_SZ)) {
+				ret = parse_pdump(optarg);
+				if (ret) {
+					pdump_usage(prgname);
+					return -1;
+				}
+			}
+			break;
+		default:
+			pdump_usage(prgname);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+print_pdump_stats(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	for (i = 0; i < num_tuples; i++) {
+		printf("##### PDUMP DEBUG STATS #####\n");
+		pt = &pdump_t[i];
+		printf(" -packets dequeued:			%"PRIu64"\n",
+							pt->stats.dequeue_pkts);
+		printf(" -packets transmitted to vdev:		%"PRIu64"\n",
+							pt->stats.tx_pkts);
+		printf(" -packets freed:			%"PRIu64"\n",
+							pt->stats.freed_pkts);
+	}
+}
+
+static inline void
+disable_pdump(struct pdump_tuples *pt)
+{
+	if (pt->dump_by_type == DEVICE_ID)
+		rte_pdump_disable_by_deviceid(pt->device_id, pt->queue,
+						pt->dir);
+	else if (pt->dump_by_type == PORT_ID)
+		rte_pdump_disable(pt->port, pt->queue, pt->dir);
+}
+
+static inline void
+pdump_rxtx(struct rte_ring *ring, uint8_t vdev_id, struct pdump_stats *stats)
+{
+	/* write input packets of port to vdev for pdump */
+	struct rte_mbuf *rxtx_bufs[BURST_SIZE];
+
+	/* first dequeue packets from ring of primary process */
+	const uint16_t nb_in_deq = rte_ring_dequeue_burst(ring,
+			(void *)rxtx_bufs, BURST_SIZE);
+	stats->dequeue_pkts += nb_in_deq;
+
+	if (nb_in_deq) {
+		/* then send on vdev */
+		uint16_t nb_in_txd = rte_eth_tx_burst(
+				vdev_id,
+				0, rxtx_bufs, nb_in_deq);
+		stats->tx_pkts += nb_in_txd;
+
+		if (unlikely(nb_in_txd < nb_in_deq)) {
+			do {
+				rte_pktmbuf_free(rxtx_bufs[nb_in_txd]);
+				stats->freed_pkts++;
+			} while (++nb_in_txd < nb_in_deq);
+		}
+	}
+}
+
+static void
+free_ring_data(struct rte_ring *ring, uint8_t vdev_id,
+		struct pdump_stats *stats)
+{
+	while (rte_ring_count(ring))
+		pdump_rxtx(ring, vdev_id, stats);
+}
+
+static void
+cleanup_pdump_resources(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	/* disable pdump and free the pdump_tuple resources */
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+
+		/* remove callbacks */
+		disable_pdump(pt);
+
+		/*
+		 * transmit rest of the enqueued packets of the rings on to
+		 * the vdev, in order to release mbufs to the mempool.
+		 */
+		if (pt->dir & RTE_PDUMP_FLAG_RX)
+			free_ring_data(pt->rx_ring, pt->rx_vdev_id, &pt->stats);
+		if (pt->dir & RTE_PDUMP_FLAG_TX)
+			free_ring_data(pt->tx_ring, pt->tx_vdev_id, &pt->stats);
+
+		if (pt->device_id)
+			free(pt->device_id);
+
+		/* free the rings */
+		if (pt->rx_ring)
+			rte_ring_free(pt->rx_ring);
+		if (pt->tx_ring)
+			rte_ring_free(pt->tx_ring);
+	}
+}
+
+static void
+signal_handler(int sig_num)
+{
+	if (sig_num == SIGINT) {
+		printf("\n\nSignal %d received, preparing to exit...\n",
+				sig_num);
+		quit_signal = 1;
+	}
+}
+
+static inline int
+configure_vdev(uint8_t port_id)
+{
+	struct ether_addr addr;
+	const uint16_t rxRings = 0, txRings = 1;
+	const uint8_t nb_ports = rte_eth_dev_count();
+	int ret;
+	uint16_t q;
+
+	if (port_id >= nb_ports)
+		return -1;
+
+	ret = rte_eth_dev_configure(port_id, rxRings, txRings,
+					&port_conf_default);
+	if (ret != 0)
+		rte_exit(EXIT_FAILURE, "dev config failed\n");
+
+	for (q = 0; q < txRings; q++) {
+		ret = rte_eth_tx_queue_setup(port_id, q, TX_DESC_PER_QUEUE,
+				rte_eth_dev_socket_id(port_id), NULL);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "queue setup failed\n");
+	}
+
+	ret = rte_eth_dev_start(port_id);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "dev start failed\n");
+
+	rte_eth_macaddr_get(port_id, &addr);
+	printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
+			" %02"PRIx8" %02"PRIx8" %02"PRIx8"\n",
+			(unsigned)port_id,
+			addr.addr_bytes[0], addr.addr_bytes[1],
+			addr.addr_bytes[2], addr.addr_bytes[3],
+			addr.addr_bytes[4], addr.addr_bytes[5]);
+
+	rte_eth_promiscuous_enable(port_id);
+
+	return 0;
+}
+
+static void
+create_mp_ring_vdev(void)
+{
+	int i;
+	uint8_t portid;
+	struct pdump_tuples *pt = NULL;
+	struct rte_mempool *mbuf_pool = NULL;
+	char vdev_args[SIZE];
+	char ring_name[SIZE];
+	char mempool_name[SIZE];
+
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+		snprintf(mempool_name, SIZE, MP_NAME, i);
+		mbuf_pool = rte_mempool_lookup(mempool_name);
+		if (mbuf_pool == NULL) {
+			/* create mempool */
+			mbuf_pool = rte_pktmbuf_pool_create(mempool_name,
+					pt->total_num_mbufs,
+					MBUF_POOL_CACHE_SIZE, 0,
+					pt->mbuf_data_size,
+					rte_socket_id());
+			if (mbuf_pool == NULL)
+				rte_exit(EXIT_FAILURE,
+					"Mempool creation failed: %s\n",
+					rte_strerror(rte_errno));
+		}
+		pt->mp = mbuf_pool;
+
+		if (pt->dir == RTE_PDUMP_FLAG_RXTX) {
+			/* if captured packets must be sent to the same vdev */
+			/* create rx_ring */
+			snprintf(ring_name, SIZE, RX_RING, i);
+			pt->rx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->rx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s:%s:%d\n",
+						rte_strerror(rte_errno),
+						__func__, __LINE__);
+
+			/* create tx_ring */
+			snprintf(ring_name, SIZE, TX_RING, i);
+			pt->tx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->tx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s:%s:%d\n",
+						rte_strerror(rte_errno),
+						__func__, __LINE__);
+
+			/* create vdevs */
+			(pt->rx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, RX_STR, i,
+			pt->rx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, RX_STR, i,
+			pt->rx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed:%s:%d\n",
+					__func__, __LINE__);
+			pt->rx_vdev_id = portid;
+
+			/* configure vdev */
+			configure_vdev(pt->rx_vdev_id);
+
+			if (pt->single_pdump_dev)
+				pt->tx_vdev_id = portid;
+			else {
+				(pt->tx_vdev_stream_type == IFACE) ?
+				snprintf(vdev_args, SIZE, VDEV_IFACE, TX_STR, i,
+				pt->tx_dev) :
+				snprintf(vdev_args, SIZE, VDEV_PCAP, TX_STR, i,
+				pt->tx_dev);
+				if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+					rte_exit(EXIT_FAILURE,
+						"vdev creation failed:"
+						"%s:%d\n", __func__, __LINE__);
+				pt->tx_vdev_id = portid;
+
+				/* configure vdev */
+				configure_vdev(pt->tx_vdev_id);
+			}
+		} else if (pt->dir == RTE_PDUMP_FLAG_RX) {
+
+			/* create rx_ring */
+			snprintf(ring_name, SIZE, RX_RING, i);
+			pt->rx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->rx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s\n",
+					rte_strerror(rte_errno));
+
+			(pt->rx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, RX_STR, i,
+				pt->rx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, RX_STR, i,
+				pt->rx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed:%s:%d\n",
+					__func__, __LINE__);
+			pt->rx_vdev_id = portid;
+			/* configure vdev */
+			configure_vdev(pt->rx_vdev_id);
+		} else if (pt->dir == RTE_PDUMP_FLAG_TX) {
+
+			/* create tx_ring */
+			snprintf(ring_name, SIZE, TX_RING, i);
+			pt->tx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->tx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s\n",
+					rte_strerror(rte_errno));
+
+			(pt->tx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, TX_STR, i,
+				pt->tx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, TX_STR, i,
+				pt->tx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed\n");
+			pt->tx_vdev_id = portid;
+
+			/* configure vdev */
+			configure_vdev(pt->tx_vdev_id);
+		}
+	}
+}
+
+static void
+enable_pdump(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+	int ret = 0, ret1 = 0;
+
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+		if (pt->dir == RTE_PDUMP_FLAG_RXTX) {
+			if (pt->dump_by_type == DEVICE_ID) {
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						RTE_PDUMP_FLAG_RX,
+						pt->rx_ring,
+						pt->mp, NULL);
+				ret1 = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						RTE_PDUMP_FLAG_TX,
+						pt->tx_ring,
+						pt->mp, NULL);
+			} else if (pt->dump_by_type == PORT_ID) {
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						RTE_PDUMP_FLAG_RX,
+						pt->rx_ring, pt->mp, NULL);
+				ret1 = rte_pdump_enable(pt->port, pt->queue,
+						RTE_PDUMP_FLAG_TX,
+						pt->tx_ring, pt->mp, NULL);
+			}
+		} else if (pt->dir == RTE_PDUMP_FLAG_RX) {
+			if (pt->dump_by_type == DEVICE_ID)
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						pt->dir, pt->rx_ring,
+						pt->mp, NULL);
+			else if (pt->dump_by_type == PORT_ID)
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						pt->dir,
+						pt->rx_ring, pt->mp, NULL);
+		} else if (pt->dir == RTE_PDUMP_FLAG_TX) {
+			if (pt->dump_by_type == DEVICE_ID)
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						pt->dir,
+						pt->tx_ring, pt->mp, NULL);
+			else if (pt->dump_by_type == PORT_ID)
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						pt->dir,
+						pt->tx_ring, pt->mp, NULL);
+		}
+		if (ret < 0 || ret1 < 0) {
+			cleanup_pdump_resources();
+			rte_exit(EXIT_FAILURE, "%s\n", rte_strerror(rte_errno));
+		}
+	}
+}
+
+static inline void
+dump_packets(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	while (!quit_signal) {
+		for (i = 0; i < num_tuples; i++) {
+			pt = &pdump_t[i];
+			if (pt->dir & RTE_PDUMP_FLAG_RX)
+				pdump_rxtx(pt->rx_ring, pt->rx_vdev_id,
+					&pt->stats);
+			if (pt->dir & RTE_PDUMP_FLAG_TX)
+				pdump_rxtx(pt->tx_ring, pt->tx_vdev_id,
+					&pt->stats);
+		}
+	}
+}
+
+int
+main(int argc, char **argv)
+{
+	int diag;
+	int ret;
+	int i;
+
+	char c_flag[] = "-c1";
+	char n_flag[] = "-n4";
+	char mp_flag[] = "--proc-type=secondary";
+	char *argp[argc + 3];
+
+	/* catch ctrl-c so we can print on exit */
+	signal(SIGINT, signal_handler);
+
+	argp[0] = argv[0];
+	argp[1] = c_flag;
+	argp[2] = n_flag;
+	argp[3] = mp_flag;
+
+	for (i = 1; i < argc; i++)
+		argp[i + 3] = argv[i];
+
+	argc += 3;
+
+	diag = rte_eal_init(argc, argp);
+	if (diag < 0)
+		rte_panic("Cannot init EAL\n");
+
+	argc -= diag;
+	argv += (diag - 3);
+
+	/* parse app arguments */
+	if (argc > 1) {
+		ret = launch_args_parse(argc, argv, argp[0]);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "Invalid argument\n");
+	}
+
+	/* create mempool, ring and vdevs info */
+	create_mp_ring_vdev();
+	enable_pdump();
+	dump_packets();
+
+	cleanup_pdump_resources();
+	/* dump debug stats */
+	print_pdump_stats();
+
+	return 0;
+}
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v8 7/8] app/test-pmd: add pdump initialization uninitialization
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
                     ` (5 preceding siblings ...)
  2016-06-10 16:18   ` [PATCH v8 6/8] app/pdump: add pdump tool for packet capturing Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 16:18   ` [PATCH v8 8/8] doc: update doc for packet capture framework Reshma Pattan
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Call rte_pdump_init and rte_pdump_uninit for packet
capturing initialization and uninitialization.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 app/test-pmd/testpmd.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index dd6b046..9707cfc 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_XENVIRT
 #include <rte_eth_xenvirt.h>
 #endif
+#include <rte_pdump.h>
 
 #include "testpmd.h"
 
@@ -2029,6 +2030,8 @@ signal_handler(int signum)
 	if (signum == SIGINT || signum == SIGTERM) {
 		printf("\nSignal %d received, preparing to exit...\n",
 				signum);
+		/* uninitialize packet capture framework */
+		rte_pdump_uninit();
 		force_quit();
 		/* exit with the expected status */
 		signal(signum, SIG_DFL);
@@ -2049,6 +2052,9 @@ main(int argc, char** argv)
 	if (diag < 0)
 		rte_panic("Cannot init EAL\n");
 
+	/* initialize packet capture framework */
+	rte_pdump_init(NULL);
+
 	nb_ports = (portid_t) rte_eth_dev_count();
 	if (nb_ports == 0)
 		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v8 8/8] doc: update doc for packet capture framework
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
                     ` (6 preceding siblings ...)
  2016-06-10 16:18   ` [PATCH v8 7/8] app/test-pmd: add pdump initialization uninitialization Reshma Pattan
@ 2016-06-10 16:18   ` Reshma Pattan
  2016-06-10 23:23   ` [PATCH v8 0/8] add " Neil Horman
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
  9 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-10 16:18 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added programmer's guide for librte_pdump.
Added sample application guide for app/pdump application.
Updated release notes for packet capture framework changes.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
---
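Note on the socket directory rule documented in pdump_library.rst: a
path passed by the application takes precedence, otherwise /var/run is
used for the root user and $HOME for a non root user. A minimal sketch
of that selection follows. It is an illustration only: pick_socket_dir()
is a hypothetical helper and not a librte_pdump API, and treating an
empty string like a missing path is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a librte_pdump API).
 * Order of preference: caller-supplied path, then /var/run for root
 * (uid 0), then the supplied home directory for other users.
 * Returns 0 on success, -1 if no directory can be chosen or if it
 * does not fit in buf.
 */
static int
pick_socket_dir(const char *user_path, uid_t uid, const char *home,
		char *buf, size_t len)
{
	const char *dir;
	int n;

	if (user_path != NULL && user_path[0] != '\0')
		dir = user_path;
	else if (uid == 0)
		dir = "/var/run";
	else
		dir = home;

	if (dir == NULL)
		return -1;

	/* snprintf always terminates buf; detect truncation explicitly */
	n = snprintf(buf, len, "%s", dir);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would typically pass getuid() and getenv("HOME") for the uid
and home arguments.
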
 MAINTAINERS                             |   3 +
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/pdump_library.rst | 117 ++++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_16_07.rst  |  13 ++++
 doc/guides/sample_app_ug/index.rst      |   1 +
 doc/guides/sample_app_ug/pdump.rst      | 122 ++++++++++++++++++++++++++++++++
 6 files changed, 257 insertions(+)
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 doc/guides/sample_app_ug/pdump.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index a48c8de..ce7c941 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -436,6 +436,9 @@ Pdump
 M: Reshma Pattan <reshma.pattan@intel.com>
 F: lib/librte_pdump/
 F: app/pdump/
+F: doc/guides/prog_guide/pdump_library.rst
+F: doc/guides/sample_app_ug/pdump.rst
+
 
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index b862d0c..4caf969 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -71,6 +71,7 @@ Programmer's Guide
     writing_efficient_code
     profile_app
     glossary
+    pdump_library
 
 
 **Figures**
diff --git a/doc/guides/prog_guide/pdump_library.rst b/doc/guides/prog_guide/pdump_library.rst
new file mode 100644
index 0000000..3088063
--- /dev/null
+++ b/doc/guides/prog_guide/pdump_library.rst
@@ -0,0 +1,117 @@
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _pdump_library:
+
+The librte_pdump Library
+========================
+
+The ``librte_pdump`` library provides a framework for packet capturing in DPDK.
+The library provides the following APIs to initialize the packet capture framework, to enable
+or disable the packet capture, and to uninitialize it:
+
+* ``rte_pdump_init()``:
+  This API initializes the packet capture framework.
+
+* ``rte_pdump_enable()``:
+  This API enables the packet capture on a given port and queue.
+  Note: The filter option in the API is a placeholder for future enhancements.
+
+* ``rte_pdump_enable_by_deviceid()``:
+  This API enables the packet capture on a given device id (``vdev name or pci address``) and queue.
+  Note: The filter option in the API is a placeholder for future enhancements.
+
+* ``rte_pdump_disable()``:
+  This API disables the packet capture on a given port and queue.
+
+* ``rte_pdump_disable_by_deviceid()``:
+  This API disables the packet capture on a given device id (``vdev name or pci address``) and queue.
+
+* ``rte_pdump_uninit()``:
+  This API uninitializes the packet capture framework.
+
+* ``rte_pdump_set_socket_dir()``:
+  This API sets the server socket path.
+  Note: This API is not thread-safe.
+
+
+Operation
+---------
+
+The ``librte_pdump`` library works on a client/server model. The server is responsible for enabling or
+disabling the packet capture and the clients are responsible for requesting the enabling or disabling of
+the packet capture.
+
+The packet capture framework, as part of its initialization, creates a pthread and, in that pthread, the server
+socket. The application that calls the framework initialization will have the server socket created,
+either under the path that the application has passed or under the default path, i.e. either ``/var/run`` for
+the root user or ``$HOME`` for a non root user.
+
+Applications that request enabling or disabling of the packet capture will have the client socket created either under
+``/var/run/`` for root users or ``$HOME`` for non root users to send the requests to the server.
+The server socket will listen for client requests for enabling or disabling the packet capture.
+
+
+Implementation Details
+----------------------
+
+The library API ``rte_pdump_init()`` initializes the packet capture framework by creating the pthread and the server
+socket. The server socket in the pthread context listens for client requests to enable or disable the
+packet capture.
+
+The library APIs ``rte_pdump_enable()`` and ``rte_pdump_enable_by_deviceid()`` enable the packet capture.
+On each call to these APIs, the library creates a separate client socket, creates the "pdump enable" request and sends
+the request to the server. The server that is listening on the socket will take the request and enable the packet capture
+by registering the Ethernet RX and TX callbacks for the given port or device_id and queue combinations.
+Then the server will mirror the packets to the new mempool and enqueue them to the rte_ring that clients have passed
+to these APIs. The server also sends the response back to the client about the status of the request that was processed.
+After the response is received from the server, the client socket is closed.
+
+The library APIs ``rte_pdump_disable()`` and ``rte_pdump_disable_by_deviceid()`` disable the packet capture.
+On each call to these APIs, the library creates a separate client socket, creates the "pdump disable" request and sends
+the request to the server. The server that is listening on the socket will take the request and disable the packet
+capture by removing the Ethernet RX and TX callbacks for the given port or device_id and queue combinations. The server
+also sends the response back to the client about the status of the request that was processed. After the response is
+received from the server, the client socket is closed.
+
+The library API ``rte_pdump_uninit()`` uninitializes the packet capture framework by cancelling the pthread and closing the
+server socket.
+
+The library API ``rte_pdump_set_socket_dir()`` sets the given path as the server socket path.
+If the given path is ``NULL``, the default path will be selected, i.e. either ``/var/run/`` for root users or ``$HOME``
+for non-root users. Clients need to call this API only when their server socket path is not the default path.
+The given server socket path will be used by clients to send the pdump enable and disable requests to the server.
+
+
+Use Case: Packet Capturing
+--------------------------
+
+The DPDK ``app/pdump`` tool is developed based on this library to capture packets in DPDK.
+Users can use this as an example to develop their own packet capturing application.
diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index c0f6b02..a4de2a2 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -66,6 +66,11 @@ New Features
   * Enable RSS per network interface through the configuration file.
   * Streamline the CLI code.
 
+* **Added packet capture framework.**
+
+  * A new library ``librte_pdump`` is added to provide packet capture APIs.
+  * A new ``app/pdump`` tool is added to capture packets in DPDK.
+
 
 Resolved Issues
 ---------------
@@ -135,6 +140,11 @@ API Changes
   ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff.
 
+* Function ``rte_eth_dev_get_port_by_name`` changed to a public API.
+
+* Function ``rte_eth_dev_info_get`` updated to return new fields ``nb_rx_queues`` and ``nb_tx_queues``
+  in the ``rte_eth_dev_info`` object.
+
 
 ABI Changes
 -----------
@@ -146,6 +156,9 @@ ABI Changes
 * The ``rte_port_source_params`` structure has new fields to support PCAP file.
   It was already in release 16.04 with ``RTE_NEXT_ABI`` flag.
 
+* The ``rte_eth_dev_info`` structure has new fields ``nb_rx_queues`` and ``nb_tx_queues``
+  to report the number of queues configured by software.
+
 
 Shared Library Versions
 -----------------------
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 930f68c..96bb317 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -76,6 +76,7 @@ Sample Applications User Guide
     ptpclient
     performance_thread
     ipsec_secgw
+    pdump
 
 **Figures**
 
diff --git a/doc/guides/sample_app_ug/pdump.rst b/doc/guides/sample_app_ug/pdump.rst
new file mode 100644
index 0000000..96c8709
--- /dev/null
+++ b/doc/guides/sample_app_ug/pdump.rst
@@ -0,0 +1,122 @@
+
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+dpdk_pdump Application
+======================
+
+The ``dpdk_pdump`` application is a Data Plane Development Kit (DPDK) application that runs as a DPDK secondary process and
+is capable of enabling packet capture on DPDK ports.
+
+
+Running the Application
+-----------------------
+
+The application has a ``--pdump`` command line option with various sub arguments:
+
+.. code-block:: console
+
+   ./build/app/dpdk_pdump --
+                          --pdump '(port=<port id> | device_id=<pci id or vdev name>),
+                                   (queue=<queue_id>),
+                                   (rx-dev=<iface or pcap file> |
+                                    tx-dev=<iface or pcap file>),
+                                   [ring-size=<ring size>],
+                                   [mbuf-size=<mbuf data size>],
+                                   [total-num-mbufs=<number of mbufs>]'
+
+Note:
+
+* Parameters inside the parentheses represent mandatory parameters.
+
+* Parameters inside the square brackets represent optional parameters.
+
+Multiple instances of ``--pdump`` can be passed to capture packets on different port and queue combinations.
+
+
+Parameters
+~~~~~~~~~~
+
+``port``:
+Port id of the eth device on which packets should be captured.
+
+``device_id``:
+PCI address or name of the eth device on which packets should be captured.
+
+   .. Note::
+
+      * As of now the ``dpdk_pdump`` tool cannot capture the packets of virtual devices
+        in the primary process due to a bug in the ethdev library. Due to this bug, in a multi process context,
+        when the primary and secondary have different ports set, then the secondary process
+        (here the ``dpdk_pdump`` tool) overwrites the ``rte_eth_devices[]`` entries of the primary process.
+
+``queue``:
+Queue id of the eth device on which packets should be captured. The user can pass a queue value of ``*`` to enable
+packet capture on all queues of the eth device.
+
+``rx-dev``:
+Can be either a pcap file name or any Linux iface.
+
+``tx-dev``:
+Can be either a pcap file name or any Linux iface.
+
+   .. Note::
+
+      * To receive ingress packets only, ``rx-dev`` should be passed.
+
+      * To receive egress packets only, ``tx-dev`` should be passed.
+
+      * To receive ingress and egress packets separately, ``rx-dev`` and ``tx-dev``
+        should both be passed with different file names or Linux iface names.
+
+      * To receive ingress and egress packets together, ``rx-dev`` and ``tx-dev``
+        should both be passed with the same file name or the same Linux iface name.
+
+``ring-size``:
+Size of the ring. This value is used internally for ring creation. The ring will be used to enqueue the packets from
+the primary application to the secondary. This is an optional parameter with default size 16384.
+
+``mbuf-size``:
+Size of the mbuf data. This is used internally for mempool creation. Ideally this value should be the same as
+the primary application's mempool's mbuf data size which is used for packet RX. This is an optional parameter with
+default size 2176.
+
+``total-num-mbufs``:
+Total number of mbufs in the mempool. This is used internally for mempool creation. This is an optional parameter with default
+value 65535.
+
+
+Example
+-------
+
+.. code-block:: console
+
+   $ sudo ./build/app/dpdk_pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'
-- 
2.5.0

* Re: [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support
  2016-06-10 16:18   ` [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support Reshma Pattan
@ 2016-06-10 18:48     ` Aaron Conole
  2016-06-10 22:14       ` Pattan, Reshma
  0 siblings, 1 reply; 67+ messages in thread
From: Aaron Conole @ 2016-06-10 18:48 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

Hi Reshma,

Reshma Pattan <reshma.pattan@intel.com> writes:

> Added new library for packet capturing support.
>
> Added public api rte_pdump_init, applications should call
> this as part of their application setup to have packet
> capturing framework ready.
>
> Added public api rte_pdump_uninit to uninitialize the packet
> capturing framework.
>
> Added public apis rte_pdump_enable and rte_pdump_disable to
> enable and disable packet capturing on specific port and queue.
>
> Added public apis rte_pdump_enable_by_deviceid and
> rte_pdump_disable_by_deviceid to enable and disable packet
> capturing on a specific device (pci address or name) and queue.
>
> Added public api rte_pdump_set_socket_dir to set the
> server socket path.

Thanks for this, it is quite useful!  I am wondering, should the same
API work for a client socket as well?  The code becomes a bit easier to
maintain, and the API behaves whether executed from client or server.
Thoughts?

Thanks,
Aaron

> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> ---
>  MAINTAINERS                            |   4 +
>  config/common_base                     |   5 +
>  lib/Makefile                           |   1 +
>  lib/librte_pdump/Makefile              |  55 ++
>  lib/librte_pdump/rte_pdump.c           | 904 +++++++++++++++++++++++++++++++++
>  lib/librte_pdump/rte_pdump.h           | 208 ++++++++
>  lib/librte_pdump/rte_pdump_version.map |  13 +
>  mk/rte.app.mk                          |   1 +
>  8 files changed, 1191 insertions(+)
>  create mode 100644 lib/librte_pdump/Makefile
>  create mode 100644 lib/librte_pdump/rte_pdump.c
>  create mode 100644 lib/librte_pdump/rte_pdump.h
>  create mode 100644 lib/librte_pdump/rte_pdump_version.map
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3e8558f..cc3ffdb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -432,6 +432,10 @@ F: app/test/test_reorder*
>  F: examples/packet_ordering/
>  F: doc/guides/sample_app_ug/packet_ordering.rst
>  
> +Pdump
> +M: Reshma Pattan <reshma.pattan@intel.com>
> +F: lib/librte_pdump/
> +
>  Hierarchical scheduler
>  M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
>  F: lib/librte_sched/
> diff --git a/config/common_base b/config/common_base
> index 47c26f6..a2d5d72 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -484,6 +484,11 @@ CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
>  CONFIG_RTE_LIBRTE_REORDER=y
>  
>  #
> +# Compile the pdump library
> +#
> +CONFIG_RTE_LIBRTE_PDUMP=y
> +
> +#
>  # Compile librte_port
>  #
>  CONFIG_RTE_LIBRTE_PORT=y
> diff --git a/lib/Makefile b/lib/Makefile
> index f254dba..ca7c02f 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
>  DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
>  DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
>  DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
> +DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>  
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
> new file mode 100644
> index 0000000..af81a28
> --- /dev/null
> +++ b/lib/librte_pdump/Makefile
> @@ -0,0 +1,55 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2016 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_pdump.a
> +
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
> +CFLAGS += -D_GNU_SOURCE
> +
> +EXPORT_MAP := rte_pdump_version.map
> +
> +LIBABIVER := 1
> +
> +# all source are stored in SRCS-y
> +SRCS-$(CONFIG_RTE_LIBRTE_PDUMP) := rte_pdump.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_PDUMP)-include := rte_pdump.h
> +
> +# this lib depends upon:
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_mbuf
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_eal
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_ether
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
> new file mode 100644
> index 0000000..c4233cb
> --- /dev/null
> +++ b/lib/librte_pdump/rte_pdump.c
> @@ -0,0 +1,904 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <sys/socket.h>
> +#include <sys/un.h>
> +#include <sys/stat.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <pthread.h>
> +#include <stdbool.h>
> +#include <stdio.h>
> +
> +#include <rte_memcpy.h>
> +#include <rte_mbuf.h>
> +#include <rte_ethdev.h>
> +#include <rte_lcore.h>
> +#include <rte_log.h>
> +#include <rte_errno.h>
> +#include <rte_pci.h>
> +
> +#include "rte_pdump.h"
> +
> +#define SOCKET_PATH_VAR_RUN "/var/run/pdump_sockets"
> +#define SOCKET_PATH_HOME "HOME/pdump_sockets"
> +#define SERVER_SOCKET "%s/pdump_server_socket"
> +#define CLIENT_SOCKET "%s/pdump_client_socket_%d_%u"
> +#define DEVICE_ID_SIZE 64
> +/* Macros for printing using RTE_LOG */
> +#define RTE_LOGTYPE_PDUMP RTE_LOGTYPE_USER1
> +
> +enum pdump_operation {
> +	DISABLE = 1,
> +	ENABLE = 2
> +};
> +
> +enum pdump_socktype {
> +	SERVER = 1,
> +	CLIENT = 2
> +};
> +
> +enum pdump_version {
> +	V1 = 1
> +};
> +
> +static pthread_t pdump_thread;
> +static int pdump_socket_fd;
> +static char socket_dir[PATH_MAX];
> +
> +struct pdump_request {
> +	uint16_t ver;
> +	uint16_t op;
> +	uint32_t flags;
> +	union pdump_data {
> +		struct enable_v1 {
> +			char device[DEVICE_ID_SIZE];
> +			uint16_t queue;
> +			struct rte_ring *ring;
> +			struct rte_mempool *mp;
> +			void *filter;
> +		} en_v1;
> +		struct disable_v1 {
> +			char device[DEVICE_ID_SIZE];
> +			uint16_t queue;
> +			struct rte_ring *ring;
> +			struct rte_mempool *mp;
> +			void *filter;
> +		} dis_v1;
> +	} data;
> +};
> +
> +struct pdump_response {
> +	uint16_t ver;
> +	uint16_t res_op;
> +	int32_t err_value;
> +};
> +
> +static struct pdump_rxtx_cbs {
> +	struct rte_ring *ring;
> +	struct rte_mempool *mp;
> +	struct rte_eth_rxtx_callback *cb;
> +	void *filter;
> +} rx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT],
> +tx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];
> +
> +static inline int
> +pdump_pktmbuf_copy_data(struct rte_mbuf *seg, const struct rte_mbuf *m)
> +{
> +	if (rte_pktmbuf_tailroom(seg) < m->data_len) {
> +		RTE_LOG(ERR, PDUMP,
> +			"User mempool: insufficient data_len of mbuf\n");
> +		return -EINVAL;
> +	}
> +
> +	seg->port = m->port;
> +	seg->vlan_tci = m->vlan_tci;
> +	seg->hash = m->hash;
> +	seg->tx_offload = m->tx_offload;
> +	seg->ol_flags = m->ol_flags;
> +	seg->packet_type = m->packet_type;
> +	seg->vlan_tci_outer = m->vlan_tci_outer;
> +	seg->data_len = m->data_len;
> +	seg->pkt_len = seg->data_len;
> +	rte_memcpy(rte_pktmbuf_mtod(seg, void *),
> +			rte_pktmbuf_mtod(m, void *),
> +			rte_pktmbuf_data_len(seg));
> +
> +	return 0;
> +}
> +
> +static inline struct rte_mbuf *
> +pdump_pktmbuf_copy(struct rte_mbuf *m, struct rte_mempool *mp)
> +{
> +	struct rte_mbuf *m_dup, *seg, **prev;
> +	uint32_t pktlen;
> +	uint8_t nseg;
> +
> +	m_dup = rte_pktmbuf_alloc(mp);
> +	if (unlikely(m_dup == NULL))
> +		return NULL;
> +
> +	seg = m_dup;
> +	prev = &seg->next;
> +	pktlen = m->pkt_len;
> +	nseg = 0;
> +
> +	do {
> +		nseg++;
> +		if (pdump_pktmbuf_copy_data(seg, m) < 0) {
> +			rte_pktmbuf_free(m_dup);
> +			return NULL;
> +		}
> +		*prev = seg;
> +		prev = &seg->next;
> +	} while ((m = m->next) != NULL &&
> +			(seg = rte_pktmbuf_alloc(mp)) != NULL);
> +
> +	*prev = NULL;
> +	m_dup->nb_segs = nseg;
> +	m_dup->pkt_len = pktlen;
> +
> +	/* Allocation of new indirect segment failed */
> +	if (unlikely(seg == NULL)) {
> +		rte_pktmbuf_free(m_dup);
> +		return NULL;
> +	}
> +
> +	__rte_mbuf_sanity_check(m_dup, 1);
> +	return m_dup;
> +}
> +
> +static inline void
> +pdump_copy(struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
> +{
> +	unsigned i;
> +	int ring_enq;
> +	uint16_t d_pkts = 0;
> +	struct rte_mbuf *dup_bufs[nb_pkts];
> +	struct pdump_rxtx_cbs *cbs;
> +	struct rte_ring *ring;
> +	struct rte_mempool *mp;
> +	struct rte_mbuf *p;
> +
> +	cbs  = user_params;
> +	ring = cbs->ring;
> +	mp = cbs->mp;
> +	for (i = 0; i < nb_pkts; i++) {
> +		p = pdump_pktmbuf_copy(pkts[i], mp);
> +		if (p)
> +			dup_bufs[d_pkts++] = p;
> +	}
> +
> +	ring_enq = rte_ring_enqueue_burst(ring, (void *)dup_bufs, d_pkts);
> +	if (unlikely(ring_enq < d_pkts)) {
> +		RTE_LOG(DEBUG, PDUMP,
> +			"only %d of packets enqueued to ring\n", ring_enq);
> +		do {
> +			rte_pktmbuf_free(dup_bufs[ring_enq]);
> +		} while (++ring_enq < d_pkts);
> +	}
> +}
> +
> +static uint16_t
> +pdump_rx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
> +	struct rte_mbuf **pkts, uint16_t nb_pkts,
> +	uint16_t max_pkts __rte_unused,
> +	void *user_params)
> +{
> +	pdump_copy(pkts, nb_pkts, user_params);
> +	return nb_pkts;
> +}
> +
> +static uint16_t
> +pdump_tx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
> +		struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
> +{
> +	pdump_copy(pkts, nb_pkts, user_params);
> +	return nb_pkts;
> +}
> +
> +static int
> +pdump_get_dombdf(char *device_id, char *domBDF, size_t len)
> +{
> +	int ret;
> +	struct rte_pci_addr dev_addr = {0};
> +
> +	/* identify if device_id is pci address or name */
> +	ret = eal_parse_pci_DomBDF(device_id, &dev_addr);
> +	if (ret < 0)
> +		return -1;
> +
> +	if (dev_addr.domain)
> +		ret = snprintf(domBDF, len, "%u:%u:%u.%u", dev_addr.domain,
> +				dev_addr.bus, dev_addr.devid,
> +				dev_addr.function);
> +	else
> +		ret = snprintf(domBDF, len, "%u:%u.%u", dev_addr.bus,
> +				dev_addr.devid,
> +				dev_addr.function);
> +
> +	return ret;
> +}
> +
> +static int
> +pdump_regitser_rx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
> +				struct rte_ring *ring, struct rte_mempool *mp,
> +				uint16_t operation)
> +{
> +	uint16_t qid;
> +	struct pdump_rxtx_cbs *cbs = NULL;
> +
> +	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
> +	for (; qid < end_q; qid++) {
> +		cbs = &rx_cbs[port][qid];
> +		if (cbs && operation == ENABLE) {
> +			if (cbs->cb) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to add rx callback for port=%d "
> +					"and queue=%d, callback already exists\n",
> +					port, qid);
> +				return -EEXIST;
> +			}
> +			cbs->ring = ring;
> +			cbs->mp = mp;
> +			cbs->cb = rte_eth_add_first_rx_callback(port, qid,
> +								pdump_rx, cbs);
> +			if (cbs->cb == NULL) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to add rx callback, errno=%d\n",
> +					rte_errno);
> +				return rte_errno;
> +			}
> +		}
> +		if (cbs && operation == DISABLE) {
> +			int ret;
> +
> +			if (cbs->cb == NULL) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to delete non existing rx "
> +					"callback for port=%d and queue=%d\n",
> +					port, qid);
> +				return -EINVAL;
> +			}
> +			ret = rte_eth_remove_rx_callback(port, qid, cbs->cb);
> +			if (ret < 0) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to remove rx callback, errno=%d\n",
> +					rte_errno);
> +				return ret;
> +			}
> +			cbs->cb = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +pdump_regitser_tx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
> +				struct rte_ring *ring, struct rte_mempool *mp,
> +				uint16_t operation)
> +{
> +
> +	uint16_t qid;
> +	struct pdump_rxtx_cbs *cbs = NULL;
> +
> +	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
> +	for (; qid < end_q; qid++) {
> +		cbs = &tx_cbs[port][qid];
> +		if (cbs && operation == ENABLE) {
> +			if (cbs->cb) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to add tx callback for port=%d "
> +					"and queue=%d, callback already exists\n",
> +					port, qid);
> +				return -EEXIST;
> +			}
> +			cbs->ring = ring;
> +			cbs->mp = mp;
> +			cbs->cb = rte_eth_add_tx_callback(port, qid, pdump_tx,
> +								cbs);
> +			if (cbs->cb == NULL) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to add tx callback, errno=%d\n",
> +					rte_errno);
> +				return rte_errno;
> +			}
> +		}
> +		if (cbs && operation == DISABLE) {
> +			int ret;
> +
> +			if (cbs->cb == NULL) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to delete non existing tx "
> +					"callback for port=%d and queue=%d\n",
> +					port, qid);
> +				return -EINVAL;
> +			}
> +			ret = rte_eth_remove_tx_callback(port, qid, cbs->cb);
> +			if (ret < 0) {
> +				RTE_LOG(ERR, PDUMP,
> +					"failed to remove tx callback, errno=%d\n",
> +					rte_errno);
> +				return ret;
> +			}
> +			cbs->cb = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +set_pdump_rxtx_cbs(struct pdump_request *p)
> +{
> +	uint16_t nb_rx_q, nb_tx_q = 0, end_q, queue;
> +	uint8_t port;
> +	int ret = 0;
> +	uint32_t flags;
> +	uint16_t operation;
> +	struct rte_ring *ring;
> +	struct rte_mempool *mp;
> +
> +	flags = p->flags;
> +	operation = p->op;
> +	if (operation == ENABLE) {
> +		ret = rte_eth_dev_get_port_by_name(p->data.en_v1.device,
> +				&port);
> +		if (ret < 0) {
> +			RTE_LOG(ERR, PDUMP,
> +				"failed to get port id for device id=%s\n",
> +				p->data.en_v1.device);
> +			return -EINVAL;
> +		}
> +		queue = p->data.en_v1.queue;
> +		ring = p->data.en_v1.ring;
> +		mp = p->data.en_v1.mp;
> +	} else {
> +		ret = rte_eth_dev_get_port_by_name(p->data.dis_v1.device,
> +				&port);
> +		if (ret < 0) {
> +			RTE_LOG(ERR, PDUMP,
> +				"failed to get port id for device id=%s\n",
> +				p->data.dis_v1.device);
> +			return -EINVAL;
> +		}
> +		queue = p->data.dis_v1.queue;
> +		ring = p->data.dis_v1.ring;
> +		mp = p->data.dis_v1.mp;
> +	}
> +
> +	/* validation if packet capture is for all queues */
> +	if (queue == RTE_PDUMP_ALL_QUEUES) {
> +		struct rte_eth_dev_info dev_info;
> +
> +		rte_eth_dev_info_get(port, &dev_info);
> +		nb_rx_q = dev_info.nb_rx_queues;
> +		nb_tx_q = dev_info.nb_tx_queues;
> +		if (nb_rx_q == 0 && flags & RTE_PDUMP_FLAG_RX) {
> +			RTE_LOG(ERR, PDUMP,
> +				"number of rx queues cannot be 0\n");
> +			return -EINVAL;
> +		}
> +		if (nb_tx_q == 0 && flags & RTE_PDUMP_FLAG_TX) {
> +			RTE_LOG(ERR, PDUMP,
> +				"number of tx queues cannot be 0\n");
> +			return -EINVAL;
> +		}
> +		if ((nb_tx_q == 0 || nb_rx_q == 0) &&
> +			flags == RTE_PDUMP_FLAG_RXTX) {
> +			RTE_LOG(ERR, PDUMP,
> +				"both tx&rx queues must be non zero\n");
> +			return -EINVAL;
> +		}
> +	}
> +
> +	/* register RX callback */
> +	if (flags & RTE_PDUMP_FLAG_RX) {
> +		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_rx_q : queue + 1;
> +		ret = pdump_regitser_rx_callbacks(end_q, port, queue, ring, mp,
> +							operation);
> +		if (ret < 0)
> +			return ret;
> +	}
> +
> +	/* register TX callback */
> +	if (flags & RTE_PDUMP_FLAG_TX) {
> +		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_tx_q : queue + 1;
> +		ret = pdump_regitser_tx_callbacks(end_q, port, queue, ring, mp,
> +							operation);
> +		if (ret < 0)
> +			return ret;
> +	}
> +
> +	return ret;
> +}
> +
> +/* get socket path (/var/run if root, $HOME otherwise) */
> +static void
> +pdump_get_socket_path(char *buffer, int bufsz, enum pdump_socktype type)
> +{
> +	const char *dir = NULL;
> +
> +	if (type == SERVER && socket_dir[0] != 0)
> +		dir = socket_dir;
> +	else {
> +
> +		if (getuid() != 0)
> +			dir = getenv(SOCKET_PATH_HOME);
> +		else
> +			dir = SOCKET_PATH_VAR_RUN;
> +	}
> +
> +	mkdir(dir, 0700);
> +	if (type == SERVER)
> +		snprintf(buffer, bufsz, SERVER_SOCKET, dir);
> +	else
> +		snprintf(buffer, bufsz, CLIENT_SOCKET, dir, getpid(),
> +				rte_sys_gettid());
> +}
> +
> +static int
> +pdump_create_server_socket(void)
> +{
> +	int ret, socket_fd;
> +	struct sockaddr_un addr;
> +	socklen_t addr_len;
> +
> +	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path), SERVER);
> +	addr.sun_family = AF_UNIX;
> +
> +	/* remove if file already exists */
> +	unlink(addr.sun_path);
> +
> +	/* set up a server socket */
> +	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
> +	if (socket_fd < 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"Failed to create server socket: %s, %s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		return -1;
> +	}
> +
> +	addr_len = sizeof(struct sockaddr_un);
> +	ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
> +	if (ret) {
> +		RTE_LOG(ERR, PDUMP,
> +			"Failed to bind to server socket: %s, %s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		close(socket_fd);
> +		return -1;
> +	}
> +
> +	/* save the socket in local configuration */
> +	pdump_socket_fd = socket_fd;
> +
> +	return 0;
> +}
> +
> +static __attribute__((noreturn)) void *
> +pdump_thread_main(__rte_unused void *arg)
> +{
> +	struct sockaddr_un cli_addr;
> +	socklen_t cli_len;
> +	struct pdump_request cli_req;
> +	struct pdump_response resp;
> +	int n;
> +	int ret = 0;
> +
> +	/* host thread, never break out */
> +	for (;;) {
> +		/* recv client requests */
> +		cli_len = sizeof(cli_addr);
> +		n = recvfrom(pdump_socket_fd, &cli_req,
> +				sizeof(struct pdump_request), 0,
> +				(struct sockaddr *)&cli_addr, &cli_len);
> +		if (n < 0) {
> +			RTE_LOG(ERR, PDUMP,
> +				"failed to recv from client:%s, %s:%d\n",
> +				strerror(errno), __func__, __LINE__);
> +			continue;
> +		}
> +
> +		ret = set_pdump_rxtx_cbs(&cli_req);
> +
> +		resp.ver = cli_req.ver;
> +		resp.res_op = cli_req.op;
> +		resp.err_value = ret;
> +		n = sendto(pdump_socket_fd, &resp,
> +				sizeof(struct pdump_response),
> +				0, (struct sockaddr *)&cli_addr, cli_len);
> +		if (n < 0) {
> +			RTE_LOG(ERR, PDUMP,
> +				"failed to send to client:%s, %s:%d\n",
> +				strerror(errno), __func__, __LINE__);
> +		}
> +	}
> +}
> +
> +int
> +rte_pdump_init(const char *path)
> +{
> +	int ret = 0;
> +	char thread_name[RTE_MAX_THREAD_NAME_LEN];
> +
> +	ret = rte_pdump_set_socket_dir(path);
> +	if (ret != 0)
> +		return -1;
> +
> +	ret = pdump_create_server_socket();
> +	if (ret != 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"Failed to create server socket:%s:%d\n",
> +			__func__, __LINE__);
> +		return -1;
> +	}
> +
> +	/* create the host thread to wait/handle pdump requests */
> +	ret = pthread_create(&pdump_thread, NULL, pdump_thread_main, NULL);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"Failed to create the pdump thread:%s, %s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		return -1;
> +	}
> +	/* Set thread_name for aid in debugging. */
> +	snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN, "pdump-thread");
> +	ret = rte_thread_setname(pdump_thread, thread_name);
> +	if (ret != 0) {
> +		RTE_LOG(DEBUG, PDUMP,
> +			"Failed to set thread name for pdump handling\n");
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +rte_pdump_uninit(void)
> +{
> +	int ret;
> +
> +	ret = pthread_cancel(pdump_thread);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"Failed to cancel the pdump thread:%s, %s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		return -1;
> +	}
> +
> +	ret = close(pdump_socket_fd);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"Failed to close server socket: %s, %s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		return -1;
> +	}
> +
> +	struct sockaddr_un addr;
> +
> +	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path), SERVER);
> +	ret = unlink(addr.sun_path);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"Failed to remove server socket addr: %s, %s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +pdump_create_client_socket(struct pdump_request *p)
> +{
> +	int ret, socket_fd;
> +	int pid;
> +	int n;
> +	struct pdump_response server_resp;
> +	struct sockaddr_un addr, serv_addr, from;
> +	socklen_t addr_len, serv_len;
> +
> +	pid = getpid();
> +
> +	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
> +	if (socket_fd < 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"client socket(): %s:pid(%d):tid(%u), %s:%d\n",
> +			strerror(errno), pid, rte_sys_gettid(),
> +			__func__, __LINE__);
> +		ret = errno;
> +		return ret;
> +	}
> +
> +	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path), CLIENT);
> +	addr.sun_family = AF_UNIX;
> +	addr_len = sizeof(struct sockaddr_un);
> +
> +	do {
> +		ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
> +		if (ret) {
> +			RTE_LOG(ERR, PDUMP,
> +				"client bind(): %s, %s:%d\n",
> +				strerror(errno), __func__, __LINE__);
> +			ret = errno;
> +			break;
> +		}
> +
> +		serv_len = sizeof(struct sockaddr_un);
> +		memset(&serv_addr, 0, sizeof(serv_addr));
> +		pdump_get_socket_path(serv_addr.sun_path,
> +					sizeof(serv_addr.sun_path),
> +					SERVER);
> +		serv_addr.sun_family = AF_UNIX;
> +
> +		n =  sendto(socket_fd, p, sizeof(struct pdump_request), 0,
> +				(struct sockaddr *)&serv_addr, serv_len);
> +		if (n < 0) {
> +			RTE_LOG(ERR, PDUMP,
> +				"failed to send to server:%s, %s:%d\n",
> +				strerror(errno), __func__, __LINE__);
> +			ret =  errno;
> +			break;
> +		}
> +
> +		n = recvfrom(socket_fd, &server_resp,
> +				sizeof(struct pdump_response), 0,
> +				(struct sockaddr *)&from, &serv_len);
> +		if (n < 0) {
> +			RTE_LOG(ERR, PDUMP,
> +				"failed to recv from server:%s, %s:%d\n",
> +				strerror(errno), __func__, __LINE__);
> +			ret = errno;
> +			break;
> +		}
> +		ret = server_resp.err_value;
> +	} while (0);
> +
> +	close(socket_fd);
> +	unlink(addr.sun_path);
> +	return ret;
> +}
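
One observation on the return convention: the function above stores errno in ret only after RTE_LOG() has run, and it returns that value as a positive number, while pdump_prepare_client_request() further down tests "ret < 0". A minimal sketch of the same bind/sendto/recvfrom exchange that saves errno immediately and returns it negated is below. The names demo_fill_addr() and demo_dgram_request() are hypothetical and not part of the patch, and logging is left out to keep it short.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un from a path; fails if the path does not fit. */
static int
demo_fill_addr(struct sockaddr_un *addr, const char *path)
{
	int n;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	n = snprintf(addr->sun_path, sizeof(addr->sun_path), "%s", path);
	if (n < 0 || (size_t)n >= sizeof(addr->sun_path))
		return -ENAMETOOLONG;
	return 0;
}

/*
 * Hypothetical helper: send one datagram request and wait for one reply.
 * Returns 0 on success or a negative errno value, so a "ret < 0" test in
 * the caller also catches local socket failures.
 */
int
demo_dgram_request(const char *cli_path, const char *srv_path,
		const void *req, size_t req_len,
		void *resp, size_t resp_len)
{
	struct sockaddr_un cli, srv;
	int fd, ret;

	ret = demo_fill_addr(&cli, cli_path);
	if (ret < 0)
		return ret;
	ret = demo_fill_addr(&srv, srv_path);
	if (ret < 0)
		return ret;

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0)
		return -errno;

	/* The client binds its own path so the server can address the reply. */
	if (bind(fd, (struct sockaddr *)&cli, sizeof(cli)) < 0) {
		ret = -errno;	/* save errno before any other call */
		close(fd);
		return ret;
	}

	ret = 0;
	if (sendto(fd, req, req_len, 0,
			(struct sockaddr *)&srv, sizeof(srv)) < 0)
		ret = -errno;
	else if (recvfrom(fd, resp, resp_len, 0, NULL, NULL) < 0)
		ret = -errno;

	close(fd);
	unlink(cli.sun_path);
	return ret;
}
```

With this shape a missing server socket shows up as a negative return value instead of being reported as success.
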
> +
> +static int
> +pdump_validate_ring_mp(struct rte_ring *ring, struct rte_mempool *mp)
> +{
> +	if (ring == NULL || mp == NULL) {
> +		RTE_LOG(ERR, PDUMP, "NULL ring or mempool are passed %s:%d\n",
> +			__func__, __LINE__);
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +	if (mp->flags & MEMPOOL_F_SP_PUT || mp->flags & MEMPOOL_F_SC_GET) {
> +		RTE_LOG(ERR, PDUMP, "mempool with either SP or SC settings"
> +		" is not valid for pdump, should have MP and MC settings\n");
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +	if (ring->prod.sp_enqueue || ring->cons.sc_dequeue) {
> +		RTE_LOG(ERR, PDUMP, "ring with either SP or SC settings"
> +		" is not valid for pdump, should have MP and MC settings\n");
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +pdump_validate_flags(uint32_t flags)
> +{
> +	if (flags != RTE_PDUMP_FLAG_RX && flags != RTE_PDUMP_FLAG_TX &&
> +		flags != RTE_PDUMP_FLAG_RXTX) {
> +		RTE_LOG(ERR, PDUMP,
> +			"invalid flags, should be either rx/tx/rxtx\n");
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +
> +	return 0;
> +}
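
The three accepted values are exactly the non-empty subsets of RX|TX, so the same check can be written as a mask test, which keeps working unchanged if the comparison list is ever reordered. A small sketch, with the enum copied from rte_pdump.h in this patch and demo_validate_flags() as a hypothetical stand-in for the static helper (logging and rte_errno omitted):

```c
#include <stdint.h>

enum {
	RTE_PDUMP_FLAG_RX = 1,  /* receive direction */
	RTE_PDUMP_FLAG_TX = 2,  /* transmit direction */
	/* both receive and transmit directions */
	RTE_PDUMP_FLAG_RXTX = (RTE_PDUMP_FLAG_RX | RTE_PDUMP_FLAG_TX)
};

/* Returns 0 when flags is RX, TX or RXTX, -1 for anything else. */
int
demo_validate_flags(uint32_t flags)
{
	/* Reject "no direction" and any bit outside RX|TX. */
	if (flags == 0 || (flags & ~(uint32_t)RTE_PDUMP_FLAG_RXTX) != 0)
		return -1;
	return 0;
}
```
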
> +
> +static int
> +pdump_validate_port(uint8_t port, char *name)
> +{
> +	int ret = 0;
> +
> +	if (port >= RTE_MAX_ETHPORTS) {
> +		RTE_LOG(ERR, PDUMP, "Invalid port id %u, %s:%d\n", port,
> +			__func__, __LINE__);
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +
> +	ret = rte_eth_dev_get_name_by_port(port, name);
> +	if (ret < 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"port id to name mapping failed for port id=%u, %s:%d\n",
> +			port, __func__, __LINE__);
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +pdump_prepare_client_request(char *device, uint16_t queue,
> +				uint32_t flags,
> +				uint16_t operation,
> +				struct rte_ring *ring,
> +				struct rte_mempool *mp,
> +				void *filter)
> +{
> +	int ret;
> +	struct pdump_request req = {.ver = 1,};
> +
> +	req.flags = flags;
> +	req.op =  operation;
> +	if ((operation & ENABLE) != 0) {
> +		strncpy(req.data.en_v1.device, device, strlen(device));
> +		req.data.en_v1.queue = queue;
> +		req.data.en_v1.ring = ring;
> +		req.data.en_v1.mp = mp;
> +		req.data.en_v1.filter = filter;
> +	} else {
> +		strncpy(req.data.dis_v1.device, device, strlen(device));
> +		req.data.dis_v1.queue = queue;
> +		req.data.dis_v1.ring = NULL;
> +		req.data.dis_v1.mp = NULL;
> +		req.data.dis_v1.filter = NULL;
> +	}
> +
> +	ret = pdump_create_client_socket(&req);
> +	if (ret < 0) {
> +		RTE_LOG(ERR, PDUMP,
> +			"client request for pdump enable/disable failed\n");
> +		rte_errno = ret;
> +		return -1;
> +	}
> +
> +	return 0;
> +}
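
A note on the two strncpy() calls above: using strlen(device) as the bound limits the copy by the source length, not by the destination size, so a long device_id passed to the *_by_deviceid() APIs is not truncated. A minimal sketch of a destination-bounded copy follows. It assumes the device field is a fixed-size char array; DEMO_DEVICE_ID_SIZE, struct demo_request and demo_set_device() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_DEVICE_ID_SIZE 64	/* hypothetical; stands in for DEVICE_ID_SIZE */

struct demo_request {
	char device[DEMO_DEVICE_ID_SIZE];
};

/* Copy a device name into the request, rejecting names that do not fit. */
int
demo_set_device(struct demo_request *req, const char *device)
{
	int n;

	if (req == NULL || device == NULL)
		return -EINVAL;

	/* snprintf() is bounded by the destination and always terminates. */
	n = snprintf(req->device, sizeof(req->device), "%s", device);
	if (n < 0 || (size_t)n >= sizeof(req->device)) {
		req->device[0] = '\0';
		return -ENAMETOOLONG;
	}
	return 0;
}
```

Rejecting an over-long name, rather than silently truncating it, avoids sending the server a device string that names a different device.
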
> +
> +int
> +rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
> +			struct rte_ring *ring,
> +			struct rte_mempool *mp,
> +			void *filter)
> +{
> +
> +	int ret = 0;
> +	char name[DEVICE_ID_SIZE];
> +
> +	ret = pdump_validate_port(port, name);
> +	if (ret < 0)
> +		return ret;
> +	ret = pdump_validate_ring_mp(ring, mp);
> +	if (ret < 0)
> +		return ret;
> +	ret = pdump_validate_flags(flags);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = pdump_prepare_client_request(name, queue, flags,
> +						ENABLE, ring, mp, filter);
> +
> +	return ret;
> +}
> +
> +int
> +rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
> +				uint32_t flags,
> +				struct rte_ring *ring,
> +				struct rte_mempool *mp,
> +				void *filter)
> +{
> +	int ret = 0;
> +	char domBDF[DEVICE_ID_SIZE];
> +
> +	ret = pdump_validate_ring_mp(ring, mp);
> +	if (ret < 0)
> +		return ret;
> +	ret = pdump_validate_flags(flags);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
> +		ret = pdump_prepare_client_request(domBDF, queue, flags,
> +						ENABLE, ring, mp, filter);
> +	else
> +		ret = pdump_prepare_client_request(device_id, queue, flags,
> +						ENABLE, ring, mp, filter);
> +
> +	return ret;
> +}
> +
> +int
> +rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags)
> +{
> +	int ret = 0;
> +	char name[DEVICE_ID_SIZE];
> +
> +	ret = pdump_validate_port(port, name);
> +	if (ret < 0)
> +		return ret;
> +	ret = pdump_validate_flags(flags);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = pdump_prepare_client_request(name, queue, flags,
> +						DISABLE, NULL, NULL, NULL);
> +
> +	return ret;
> +}
> +
> +int
> +rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
> +				uint32_t flags)
> +{
> +	int ret = 0;
> +	char domBDF[DEVICE_ID_SIZE];
> +
> +	ret = pdump_validate_flags(flags);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
> +		ret = pdump_prepare_client_request(domBDF, queue, flags,
> +						DISABLE, NULL, NULL, NULL);
> +	else
> +		ret = pdump_prepare_client_request(device_id, queue, flags,
> +						DISABLE, NULL, NULL, NULL);
> +
> +	return ret;
> +}
> +
> +int
> +rte_pdump_set_socket_dir(const char *path)
> +{
> +	int ret, count;
> +
> +	if (path != NULL) {
> +		count = sizeof(socket_dir);
> +		ret = snprintf(socket_dir, count, "%s", path);
> +		if (ret < 0  || ret >= count) {
> +			RTE_LOG(ERR, PDUMP,
> +					"Invalid server socket path:%s:%d\n",
> +					__func__, __LINE__);
> +			socket_dir[0] = 0;
> +			return -EINVAL;
> +		}
> +	}
> +
> +	return 0;
> +}
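
For readers following the path handling: the header comment in this patch says that an unset path falls back to "/var/run" for root and "$HOME" for non-root users. The lookup helper itself (pdump_get_socket_path()) is not shown in this part of the patch, so the following is only a sketch of that documented rule under my own assumptions: root is detected as uid 0, and the home directory is handed in by the caller. demo_pick_socket_dir() is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Pick the directory for the pdump sockets.
 * configured: directory set by the application, or NULL/"" when unset.
 * uid, home:  identity of the caller, used only for the default.
 */
int
demo_pick_socket_dir(const char *configured, uid_t uid, const char *home,
		char *out, size_t out_len)
{
	const char *dir;
	int n;

	if (configured != NULL && configured[0] != '\0')
		dir = configured;	/* explicit path wins */
	else if (uid == 0)
		dir = "/var/run";	/* documented default for root */
	else
		dir = home;		/* documented default for non-root */

	if (dir == NULL)
		return -EINVAL;

	n = snprintf(out, out_len, "%s", dir);
	if (n < 0 || (size_t)n >= out_len)
		return -EINVAL;
	return 0;
}
```

A caller would typically pass getuid() and getenv("HOME"); keeping them as parameters makes the rule easy to exercise without changing the process identity.
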
> diff --git a/lib/librte_pdump/rte_pdump.h b/lib/librte_pdump/rte_pdump.h
> new file mode 100644
> index 0000000..63e8ac3
> --- /dev/null
> +++ b/lib/librte_pdump/rte_pdump.h
> @@ -0,0 +1,208 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_PDUMP_H_
> +#define _RTE_PDUMP_H_
> +
> +/**
> + * @file
> + * RTE pdump
> + *
> + * packet dump library to provide packet capturing support on dpdk.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#define RTE_PDUMP_ALL_QUEUES UINT16_MAX
> +
> +enum {
> +	RTE_PDUMP_FLAG_RX = 1,  /* receive direction */
> +	RTE_PDUMP_FLAG_TX = 2,  /* transmit direction */
> +	/* both receive and transmit directions */
> +	RTE_PDUMP_FLAG_RXTX = (RTE_PDUMP_FLAG_RX|RTE_PDUMP_FLAG_TX)
> +};
> +
> +/**
> + * Initialize packet capturing handling
> + *
> + * Creates pthread and server socket for handling clients
> + * requests to enable/disable rxtx callbacks.
> + *
> + * @param path
> + * directory path for server socket.
> + *
> + * @return
> + *    0 on success, -1 on error
> + */
> +int
> +rte_pdump_init(const char *path);
> +
> +/**
> + * Uninitialize packet capturing handling
> + *
> + * Cancels the pthread, closes the server socket and removes the server socket address.
> + *
> + * @return
> + *    0 on success, -1 on error
> + */
> +int
> +rte_pdump_uninit(void);
> +
> +/**
> + * Enables packet capturing on given port and queue.
> + *
> + * @param port
> + *  port on which packet capturing should be enabled.
> + * @param queue
> + *  queue of a given port on which packet capturing should be enabled.
> + *  users should pass on value UINT16_MAX to enable packet capturing on all
> + *  queues of a given port.
> + * @param flags
> + *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
> + *  on which packet capturing should be enabled for a given port and queue.
> + * @param ring
> + *  ring on which captured packets will be enqueued for user.
> + * @param mp
> + *  mempool on to which original packets will be mirrored or duplicated.
> + * @param filter
> + *  place holder for packet filtering.
> + *
> + * @return
> + *    0 on success, -1 on error, rte_errno is set accordingly.
> + */
> +
> +int
> +rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
> +		struct rte_ring *ring,
> +		struct rte_mempool *mp,
> +		void *filter);
> +
> +/**
> + * Disables packet capturing on given port and queue.
> + *
> + * @param port
> + *  port on which packet capturing should be disabled.
> + * @param queue
> + *  queue of a given port on which packet capturing should be disabled.
> + *  users should pass on value UINT16_MAX to disable packet capturing on all
> + *  queues of a given port.
> + * @param flags
> + *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
> + *  on which packet capturing should be disabled for a given port and queue.
> + *
> + * @return
> + *    0 on success, -1 on error, rte_errno is set accordingly.
> + */
> +
> +int
> +rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags);
> +
> +/**
> + * Enables packet capturing on given device id and queue.
> + * device_id can be name or pci address of device.
> + *
> + * @param device_id
> + *  device id on which packet capturing should be enabled.
> + * @param queue
> + *  queue of a given device id on which packet capturing should be enabled.
> + *  users should pass on value UINT16_MAX to enable packet capturing on all
> + *  queues of a given device id.
> + * @param flags
> + *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
> + *  on which packet capturing should be enabled for a given port and queue.
> + * @param ring
> + *  ring on which captured packets will be enqueued for user.
> + * @param mp
> + *  mempool on to which original packets will be mirrored or duplicated.
> + * @param filter
> + *  place holder for packet filtering.
> + *
> + * @return
> + *    0 on success, -1 on error, rte_errno is set accordingly.
> + */
> +
> +int
> +rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
> +				uint32_t flags,
> +				struct rte_ring *ring,
> +				struct rte_mempool *mp,
> +				void *filter);
> +
> +/**
> + * Disables packet capturing on given device_id and queue.
> + * device_id can be name or pci address of device.
> + *
> + * @param device_id
> + *  pci address or name of the device on which packet capturing
> + *  should be disabled.
> + * @param queue
> + *  queue of a given device on which packet capturing should be disabled.
> + *  users should pass on value UINT16_MAX to disable packet capturing on all
> + *  queues of a given device id.
> + * @param flags
> + *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
> + *  on which packet capturing should be disabled for a given device and queue.
> + *
> + * @return
> + *    0 on success, -1 on error, rte_errno is set accordingly.
> + */
> +int
> +rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
> +				uint32_t flags);
> +
> +/**
> + * Allows applications to set server socket path.
> + * If the specified path is NULL, the default path will be selected, i.e.
> + * "/var/run/" for root user and "$HOME" for non root user.
> + * Clients need to call this API only when their server path is non default
> + * path. This path will be used internally to send pdump enable or disable
> + * requests to the server.
> + * This API is not thread-safe.
> + *
> + * @param path
> + * directory path for server socket.
> + *
> + * @return
> + * 0 on success, -EINVAL on error
> + *
> + */
> +int
> +rte_pdump_set_socket_dir(const char *path);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_PDUMP_H_ */
> diff --git a/lib/librte_pdump/rte_pdump_version.map b/lib/librte_pdump/rte_pdump_version.map
> new file mode 100644
> index 0000000..edec99a
> --- /dev/null
> +++ b/lib/librte_pdump/rte_pdump_version.map
> @@ -0,0 +1,13 @@
> +DPDK_16.07 {
> +	global:
> +
> +	rte_pdump_disable;
> +	rte_pdump_disable_by_deviceid;
> +	rte_pdump_enable;
> +	rte_pdump_enable_by_deviceid;
> +	rte_pdump_init;
> +	rte_pdump_set_socket_dir;
> +	rte_pdump_uninit;
> +
> +	local: *;
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index b84b56d..f792f2a 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -61,6 +61,7 @@ _LDLIBS-y += --whole-archive
>  
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
>  
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support
  2016-06-10 18:48     ` Aaron Conole
@ 2016-06-10 22:14       ` Pattan, Reshma
  2016-06-13 13:28         ` Aaron Conole
  0 siblings, 1 reply; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-10 22:14 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev

Hi,

> -----Original Message-----
> From: Aaron Conole [mailto:aconole@redhat.com]
> Sent: Friday, June 10, 2016 7:48 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v8 5/8] lib/librte_pdump: add new library for
> packet capturing support
> 
> Hi Reshma,
> 
> Reshma Pattan <reshma.pattan@intel.com> writes:
> 
> > Added new library for packet capturing support.
> >
> > Added public api rte_pdump_init, applications should call this as part
> > of their application setup to have packet capturing framework ready.
> >
> > Added public api rte_pdump_uninit to uninitialize the packet capturing
> > framework.
> >
> > Added public apis rte_pdump_enable and rte_pdump_disable to enable and
> > disable packet capturing on specific port and queue.
> >
> > Added public apis rte_pdump_enable_by_deviceid and
> > rte_pdump_disable_by_deviceid to enable and disable packet capturing
> > on a specific device (pci address or name) and queue.
> >
> > Added public api rte_pdump_set_socket_dir to set the server socket
> > path.
> 
> Thanks for this, it is quite useful!  I am wondering, should the same API work for
> a client socket as well?  The code becomes a bit easier to maintain, and the API
> behaves whether executed from client or server.
> Thoughts?

In this patch, the server socket path is an argument to rte_pdump_init(), so it must be passed when calling that API.
rte_pdump_set_socket_dir() is added for clients: a client needs to know the server socket path to contact the server, so the application should pass the server socket path to clients using this API.

Could you please clarify which of the options below you are looking for?
a) If you want the client and server sockets under the same non-default path, this can be done using the same API. It needs only a tiny change in the code.

b) If you want the server and client sockets under different paths, this can be done using either of the approaches below.
b1) Use the same rte_pdump_set_socket_dir() API, but add a new argument to specify whether the path is for the server or the client socket.
 	(or)
b2) Have two separate APIs to set the client and server socket paths.

Which one do you prefer? 
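
To make option b1 concrete, here is a rough sketch of what a single setter with a socket-type argument could look like. Nothing here is in the v8 patch: the enum, the demo_ names, the buffer size and the choice to treat a NULL path as "reset to default" are all assumptions for discussion only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_PATH_LEN 256	/* hypothetical buffer size */

enum demo_socket_type {
	DEMO_SOCKET_SERVER = 0,
	DEMO_SOCKET_CLIENT = 1
};

/* One directory per socket type; an empty string means "use the default". */
static char demo_dirs[2][DEMO_PATH_LEN];

/* Set the directory for the server or the client socket. */
int
demo_set_socket_dir(const char *path, enum demo_socket_type type)
{
	int n;

	if (type != DEMO_SOCKET_SERVER && type != DEMO_SOCKET_CLIENT)
		return -EINVAL;
	if (path == NULL) {
		demo_dirs[type][0] = '\0';	/* back to the default path */
		return 0;
	}
	n = snprintf(demo_dirs[type], sizeof(demo_dirs[type]), "%s", path);
	if (n < 0 || (size_t)n >= sizeof(demo_dirs[type])) {
		demo_dirs[type][0] = '\0';
		return -EINVAL;
	}
	return 0;
}

/* Return the configured directory, or NULL when the default applies. */
const char *
demo_get_socket_dir(enum demo_socket_type type)
{
	if (type != DEMO_SOCKET_SERVER && type != DEMO_SOCKET_CLIENT)
		return NULL;
	return demo_dirs[type][0] != '\0' ? demo_dirs[type] : NULL;
}
```

Option b2 would be the same storage behind two setters without the type argument; option a) would collapse the two slots into one.
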

Konstantin, please add any comments from your side.

Thanks,
Reshma

> 
> Thanks,
> Aaron
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v8 0/8] add packet capture framework
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
                     ` (7 preceding siblings ...)
  2016-06-10 16:18   ` [PATCH v8 8/8] doc: update doc for packet capture framework Reshma Pattan
@ 2016-06-10 23:23   ` Neil Horman
  2016-06-13  8:47     ` Pattan, Reshma
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
  9 siblings, 1 reply; 67+ messages in thread
From: Neil Horman @ 2016-06-10 23:23 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

On Fri, Jun 10, 2016 at 05:18:46PM +0100, Reshma Pattan wrote:
> This patch set include below changes
> 
> 1)Changes to librte_ether.
> 2)A new library librte_pdump added for packet capture framework.
> 3)A new app/pdump tool added for packet capturing.
> 4)Test pmd changes done to initialize packet capture framework.
> 5)Documentation update.
> 
> 1)librte_pdump
> ==============
> To support packet capturing on dpdk Ethernet devices, a new library librte_pdump
> is added.Users can develop their own packet capturing application using new library APIs.
> 
> Operation:
> ----------
> The librte_pdump provides APIs to support packet capturing on dpdk Ethernet devices.
> Library provides APIs to initialize the packet capture framework, enable/disable
> the packet capture and uninitialize the packet capture framework.
> 
> The librte_pdump library works on a client/server model. The server is responsible for enabling or
> disabling the packet capture and the clients are responsible for requesting the enabling or disabling of
> the packet capture.
> 
> The packet capture framework, as part of its initialization, creates the pthread and the server socket in
> the pthread. The application that calls the framework initialization will have the server socket created,
> either under the path that the application has passed or under the default path i.e. either ''/var/run'' for
> root user or ''$HOME'' for non root user.
> 
> Applications that request enabling or disabling of the packet capture will have the client socket created either under
> the ''/var/run/'' for root users or ''$HOME'' for not root users to send the requests to the server.
> The server socket will listen for client requests for enabling or disabling the packet capture.
> 
> Applications using below APIs need to pass port/device_id, queue, mempool and
> ring parameters. Library uses user provided ring and mempool to mirror the rx/tx
> packets of the port for users. Users need to dequeue the rings and write the packets
> to vdev(pcap/tuntap) to view the packets using any standard tools.
> 
> Note:
> Mempool and Ring should be mc/mp supportable.
> Mempool mbuf size should be big enough to handle the rx/tx packets of a port.
> 
> APIs:
> -----
> rte_pdump_init()
> rte_pdump_enable()
> rte_pdump_enable_by_deviceid()
> rte_pdump_disable()
> rte_pdump_disable_by_deviceid()
> rte_pdump_uninit()
> rte_pdump_set_socket_dir()
> 
> 2)app/pdump tool
> ================
> Tool app/pdump is designed based on librte_pdump for packet capturing in DPDK.
> This tool by default runs as secondary process, and provides the support for
> the command line options for packet capture.
> 
> ./build/app/dpdk_pdump --
>                        --pdump '(port=<port id> | device_id=<pci id or vdev name>),
>                                 (queue=<queue id>),
>                                 (rx-dev=<iface or pcap file> |
>                                  tx-dev=<iface or pcap file>),
>                                 [ring-size=<ring size>],
>                                 [mbuf-size=<mbuf data size>],
>                                 [total-num-mbufs=<number of mbufs>]'
> 
> Parameters inside the parenthesis represents the mandatory parameters.
> Parameters inside the square brackets represents optional parameters.
> User has to pass on packet capture parameters under --pdump parameters, multiples of
> --pdump can be passed to capture packets on different port and queue combinations
> 
> Operation:
> ----------
> *Tool parse the user command line arguments,
> creates the mempool, ring and the PCAP PMD vdev with 'tx_stream' as either
> of the device passed in rx-dev|tx-dev parameters.
> 
> *Then calls the APIs of librte_pdump i.e. rte_pdump_enable()/rte_pdump_enable_by_deviceid()
> to enable packet capturing on a specific port/device_id and queue by passing on
> port|device_id, queue, mempool and ring info.
> 
> *Tool runs in while loop to dequeue the packets from the ring and write them to pcap device.
> 
> *Tool can be stopped using SIGINT, upon which tool calls
> rte_pdump_disable()/rte_pdump_disable_by_deviceid() and free the allocated resources.
> 
> Note:
> CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run the pdump tool.
> 
> 3)Test-pmd changes
> ==================
> Changes are done to test-pmd application to initialize/uninitialize the packet capture framework.
> So app/pdump tool can be run to see packets of dpdk ports that are used by test-pmd.
> 
> Similarly any application which needs packet capture should call initialize/uninitialize APIs of
> librte_pdump and use pdump tool to start the capture.
> 
> 4)Packet capture flow between pdump tool and librte_pdump
> =========================================================
> * Pdump tool (Secondary process) requests packet capture
> for specific port|device_id and queue combinations.
> 
> *Library in secondary process context creates client socket and communicates
> the port|device_id, queue, ring and mempool to server.
> 
> *Library initializes server in primary process 'test-pmd' context and server serves
> the client request to enable Ethernet rxtx call-backs for a given port|device_id and queue.
> 
> *Copy the rx/tx packets to passed mempool and enqueue the packets to ring for secondary process.
> 
> *Pdump tool dequeues the packets from the ring and writes them to the PCAP PMD vdev,
> so ultimately packets will be seen on the device that is passed in rx-dev|tx-dev.
> 
> *Once the pdump tool is terminated with SIGINT it will disable the packet capturing.
> 
> *Library receives the disable packet capture request, communicate the info to server,
> server will remove the Ethernet rxtx call-backs.
> 
> *Packet capture can be seen using tcpdump command
> "tcpdump -ni <iface>" (or) "tcpdump -nr <pcapfile>"
> 
> 5)Example command line
> ======================
> ./build/app/dpdk_pdump -- --pdump 'device_id=0000:02:0.0,queue=*,tx-dev=/tmp/dt-file.pcap,rx-dev=/tmp/dr-file.pcap,ring-size=8192,mbuf-size=2176,total-num-mbufs=32768' --pdump 'device_id=0000:01:00.0,queue=*,rx-dev=/tmp/d-file.pcap,tx-dev=/tmp/d-file.pcap,ring-size=16384,mbuf-size=2176,total-num-mbufs=32768'
> 
> v8:
> added server socket argument to rte_pdump_init() API ==> http://dpdk.org/dev/patchwork/patch/13402/
> added rte_pdump_set_socket_dir() API.
> updated documentation for new changes.
> 
> v7:
> fixed lines over 90 characters.
> 
> v6:
> removed below deprecation notice patch from patch set.
> http://dpdk.org/dev/patchwork/patch/13372/
> 
> v5:
> addressed code review comments for below patches
> http://dpdk.org/dev/patchwork/patch/12955/
> http://dpdk.org/dev/patchwork/patch/12951/
> 
> v4:
> added missing deprecation notice for ABI changes of rte_eth_dev_info structure.
> made doc changes as per doc guidelines.
> replaced rte_eal_vdev_init with rte_eth_dev_attach in pdump tool.
> removed rxtx-dev parameter from pdump tool command line.
> 
> v3:
> app/pdump: Moved cleanup code from signal handler to main.
> divided librte_ether changes into multiple patches.
> example command changed in app/pdump application guide
> 
> v2:
> fix compilation issues for 4.8.3
> fix unnecessary #includes
> 
> 
> Reshma Pattan (8):
>   librte_ether: protect add/remove of rxtx callbacks with spinlocks
>   librte_ether: add new api rte_eth_add_first_rx_callback
>   librte_ether: add new fields to rte_eth_dev_info struct
>   librte_ether: make rte_eth_dev_get_port_by_name
>     rte_eth_dev_get_name_by_port public
>   lib/librte_pdump: add new library for packet capturing support
>   app/pdump: add pdump tool for packet capturing
>   app/test-pmd: add pdump initialization uninitialization
>   doc: update doc for packet capture framework
> 
>  MAINTAINERS                             |   8 +
>  app/Makefile                            |   1 +
>  app/pdump/Makefile                      |  45 ++
>  app/pdump/main.c                        | 844 +++++++++++++++++++++++++++++
>  app/test-pmd/testpmd.c                  |   6 +
>  config/common_base                      |   5 +
>  doc/guides/prog_guide/index.rst         |   1 +
>  doc/guides/prog_guide/pdump_library.rst | 117 +++++
>  doc/guides/rel_notes/release_16_07.rst  |  13 +
>  doc/guides/sample_app_ug/index.rst      |   1 +
>  doc/guides/sample_app_ug/pdump.rst      | 122 +++++
>  lib/Makefile                            |   1 +
>  lib/librte_ether/rte_ethdev.c           | 123 +++--
>  lib/librte_ether/rte_ethdev.h           |  60 +++
>  lib/librte_ether/rte_ether_version.map  |   9 +
>  lib/librte_pdump/Makefile               |  55 ++
>  lib/librte_pdump/rte_pdump.c            | 904 ++++++++++++++++++++++++++++++++
>  lib/librte_pdump/rte_pdump.h            | 208 ++++++++
>  lib/librte_pdump/rte_pdump_version.map  |  13 +
>  mk/rte.app.mk                           |   1 +
>  20 files changed, 2493 insertions(+), 44 deletions(-)
>  create mode 100644 app/pdump/Makefile
>  create mode 100644 app/pdump/main.c
>  create mode 100644 doc/guides/prog_guide/pdump_library.rst
>  create mode 100644 doc/guides/sample_app_ug/pdump.rst
>  create mode 100644 lib/librte_pdump/Makefile
>  create mode 100644 lib/librte_pdump/rte_pdump.c
>  create mode 100644 lib/librte_pdump/rte_pdump.h
>  create mode 100644 lib/librte_pdump/rte_pdump_version.map
> 
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> -- 
> 2.5.0
> 
> 
This seems useful, but the pcap pmd already accepts pcap formatted files for
input to send using the pcap library.  Shouldn't this functionality be
integrated with that pmd instead of breaking it out to its own library?

Neil

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v8 0/8] add packet capture framework
  2016-06-10 23:23   ` [PATCH v8 0/8] add " Neil Horman
@ 2016-06-13  8:47     ` Pattan, Reshma
  0 siblings, 0 replies; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-13  8:47 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

Hi,

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Saturday, June 11, 2016 12:23 AM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v8 0/8] add packet capture framework
> 
> On Fri, Jun 10, 2016 at 05:18:46PM +0100, Reshma Pattan wrote:
> > This patch set include below changes
> >
> > 1)Changes to librte_ether.
> > 2)A new library librte_pdump added for packet capture framework.
> > 3)A new app/pdump tool added for packet capturing.
> > 4)Test pmd changes done to initialize packet capture framework.
> > 5)Documentation update.
> >
> > 1)librte_pdump
> > ==============
> > To support packet capturing on dpdk Ethernet devices, a new library
> > librte_pdump is added.Users can develop their own packet capturing
> application using new library APIs.
> >
> > Operation:
> > ----------
> > The librte_pdump provides APIs to support packet capturing on dpdk Ethernet
> devices.
> > Library provides APIs to initialize the packet capture framework,
> > enable/disable the packet capture and uninitialize the packet capture
> framework.
> >
> > The librte_pdump library works on a client/server model. The server is
> > responsible for enabling or disabling the packet capture and the
> > clients are responsible for requesting the enabling or disabling of the packet
> capture.
> >
> > The packet capture framework, as part of its initialization, creates
> > the pthread and the server socket in the pthread. The application that
> > calls the framework initialization will have the server socket
> > created, either under the path that the application has passed or under the
> default path i.e. either ''/var/run'' for root user or ''$HOME'' for non root user.
> >
> > Applications that request enabling or disabling of the packet capture
> > will have the client socket created either under the ''/var/run/'' for root users
> or ''$HOME'' for not root users to send the requests to the server.
> > The server socket will listen for client requests for enabling or disabling the
> packet capture.
> >
> > Applications using below APIs need to pass port/device_id, queue,
> > mempool and ring parameters. Library uses user provided ring and
> > mempool to mirror the rx/tx packets of the port for users. Users need
> > to dequeue the rings and write the packets to vdev(pcap/tuntap) to view the
> packets using any standard tools.
> >
> > Note:
> > Mempool and Ring should be mc/mp supportable.
> > Mempool mbuf size should be big enough to handle the rx/tx packets of a
> port.
> >
> > APIs:
> > -----
> > rte_pdump_init()
> > rte_pdump_enable()
> > rte_pdump_enable_by_deviceid()
> > rte_pdump_disable()
> > rte_pdump_disable_by_deviceid()
> > rte_pdump_uninit()
> > rte_pdump_set_socket_dir()
> >
> > 2)app/pdump tool
> > ================
> > Tool app/pdump is designed based on librte_pdump for packet capturing in
> DPDK.
> > This tool by default runs as secondary process, and provides the
> > support for the command line options for packet capture.
> >
> > ./build/app/dpdk_pdump --
> >                        --pdump '(port=<port id> | device_id=<pci id or vdev name>),
> >                                 (queue=<queue id>),
> >                                 (rx-dev=<iface or pcap file> |
> >                                  tx-dev=<iface or pcap file>),
> >                                 [ring-size=<ring size>],
> >                                 [mbuf-size=<mbuf data size>],
> >                                 [total-num-mbufs=<number of mbufs>]'
> >
> > Parameters inside the parenthesis represents the mandatory parameters.
> > Parameters inside the square brackets represents optional parameters.
> > User has to pass on packet capture parameters under --pdump
> > parameters, multiples of --pdump can be passed to capture packets on
> > different port and queue combinations
> >
> > Operation:
> > ----------
> > *Tool parse the user command line arguments, creates the mempool, ring
> > and the PCAP PMD vdev with 'tx_stream' as either of the device passed
> > in rx-dev|tx-dev parameters.
> >
> > *Then calls the APIs of librte_pdump i.e.
> > rte_pdump_enable()/rte_pdump_enable_by_deviceid()
> > to enable packet capturing on a specific port/device_id and queue by
> > passing on
> > port|device_id, queue, mempool and ring info.
> >
> > *Tool runs in while loop to dequeue the packets from the ring and write them
> to pcap device.
> >
> > *Tool can be stopped using SIGINT, upon which tool calls
> > rte_pdump_disable()/rte_pdump_disable_by_deviceid() and free the allocated
> resources.
> >
> > Note:
> > CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run
> the pdump tool.
> >
> > 3)Test-pmd changes
> > ==================
> > Changes are done to test-pmd application to initialize/uninitialize the packet
> capture framework.
> > So app/pdump tool can be run to see packets of dpdk ports that are used by
> test-pmd.
> >
> > Similarly any application which needs packet capture should call
> > initialize/uninitialize APIs of librte_pdump and use pdump tool to start the
> capture.
> >
> > 4)Packet capture flow between pdump tool and librte_pdump
> > =========================================================
> > * Pdump tool (Secondary process) requests packet capture for specific
> > port|device_id and queue combinations.
> >
> > *Library in secondary process context creates client socket and
> > communicates the port|device_id, queue, ring and mempool to server.
> >
> > *Library initializes server in primary process 'test-pmd' context and
> > server serves the client request to enable Ethernet rxtx call-backs for a given
> port|device_id and queue.
> >
> > *Copy the rx/tx packets to passed mempool and enqueue the packets to ring
> for secondary process.
> >
> > *Pdump tool will dequeue the packets from ring and writes them to
> > PCAPMD vdev, so ultimately packets will be seen on the device that is passed
> in rx-dev|tx-dev.
> >
> > *Once the pdump tool is terminated with SIGINT it will disable the packet
> capturing.
> >
> > *Library receives the disable packet capture request, communicate the
> > info to server, server will remove the Ethernet rxtx call-backs.
> >
> > *Packet capture can be seen using tcpdump command "tcpdump -ni
> > <iface>" (or) "tcpdump –nr <pcapfile>"
> >
> > 5)Example command line
> > ======================
> > ./build/app/dpdk_pdump -- --pdump 'device_id=0000:02:0.0,queue=*,tx-
> dev=/tmp/dt-file.pcap,rx-dev=/tmp/dr-file.pcap,ring-size=8192,mbuf-
> size=2176,total-num-mbufs=32768' --pdump
> 'device_id=0000:01:00.0,queue=*,rx-dev=/tmp/d-file.pcap,tx-dev=/tmp/d-
> file.pcap,ring-size=16384,mbuf-size=2176,total-num-mbufs=32768'
> >
> > v8:
> > added server socket argument to rte_pdump_init() API ==>
> > http://dpdk.org/dev/patchwork/patch/13402/
> > added rte_pdump_set_socket_dir() API.
> > updated documentation for new changes.
> >
> > v7:
> > fixed lines over 90 characters.
> >
> > v6:
> > removed below deprecation notice patch from patch set.
> > http://dpdk.org/dev/patchwork/patch/13372/
> >
> > v5:
> > addressed code review comments for below patches
> > http://dpdk.org/dev/patchwork/patch/12955/
> > http://dpdk.org/dev/patchwork/patch/12951/
> >
> > v4:
> > added missing deprecation notice for ABI changes of rte_eth_dev_info
> structure.
> > made doc changes as per doc guidelines.
> > replaced rte_eal_vdev_init with rte_eth_dev_attach in pdump tool.
> > removed rxtx-dev parameter from pdump tool command line.
> >
> > v3:
> > app/pdump: Moved cleanup code from signal handler to main.
> > divided librte_ether changes into multiple patches.
> > example command changed in app/pdump application guide
> >
> > v2:
> > fix compilation issues for 4.8.3
> > fix unnecessary #includes
> >
> >
> > Reshma Pattan (8):
> >   librte_ether: protect add/remove of rxtx callbacks with spinlocks
> >   librte_ether: add new api rte_eth_add_first_rx_callback
> >   librte_ether: add new fields to rte_eth_dev_info struct
> >   librte_ether: make rte_eth_dev_get_port_by_name
> >     rte_eth_dev_get_name_by_port public
> >   lib/librte_pdump: add new library for packet capturing support
> >   app/pdump: add pdump tool for packet capturing
> >   app/test-pmd: add pdump initialization uninitialization
> >   doc: update doc for packet capture framework
> >
> >  MAINTAINERS                             |   8 +
> >  app/Makefile                            |   1 +
> >  app/pdump/Makefile                      |  45 ++
> >  app/pdump/main.c                        | 844 +++++++++++++++++++++++++++++
> >  app/test-pmd/testpmd.c                  |   6 +
> >  config/common_base                      |   5 +
> >  doc/guides/prog_guide/index.rst         |   1 +
> >  doc/guides/prog_guide/pdump_library.rst | 117 +++++
> > doc/guides/rel_notes/release_16_07.rst  |  13 +
> >  doc/guides/sample_app_ug/index.rst      |   1 +
> >  doc/guides/sample_app_ug/pdump.rst      | 122 +++++
> >  lib/Makefile                            |   1 +
> >  lib/librte_ether/rte_ethdev.c           | 123 +++--
> >  lib/librte_ether/rte_ethdev.h           |  60 +++
> >  lib/librte_ether/rte_ether_version.map  |   9 +
> >  lib/librte_pdump/Makefile               |  55 ++
> >  lib/librte_pdump/rte_pdump.c            | 904
> ++++++++++++++++++++++++++++++++
> >  lib/librte_pdump/rte_pdump.h            | 208 ++++++++
> >  lib/librte_pdump/rte_pdump_version.map  |  13 +
> >  mk/rte.app.mk                           |   1 +
> >  20 files changed, 2493 insertions(+), 44 deletions(-)  create mode
> > 100644 app/pdump/Makefile  create mode 100644 app/pdump/main.c  create
> > mode 100644 doc/guides/prog_guide/pdump_library.rst
> >  create mode 100644 doc/guides/sample_app_ug/pdump.rst
> >  create mode 100644 lib/librte_pdump/Makefile  create mode 100644
> > lib/librte_pdump/rte_pdump.c  create mode 100644
> > lib/librte_pdump/rte_pdump.h  create mode 100644
> > lib/librte_pdump/rte_pdump_version.map
> >
> > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > --
> > 2.5.0
> >
> >
> This seems useful, but the pcap pmd already accepts pcap formatted files for
> input to send using the pcap library.  Shouldn't this functionality be integrated
> with that pmd instead of breaking it out to its own library?
> 

The librte_pdump library doesn't deal with any PCAP functionality; it solely sends packets of DPDK ports to applications over rings.
It is up to the application to decide which vdev PMD it would like to use to send the packets on to Linux devices.
In this patch set, the pdump tool (an application based on librte_pdump) uses the pcap PMD vdev. In future, the pcap PMD vdev usage could also be replaced with other virtual device PMD types.

> Neil


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support
  2016-06-10 22:14       ` Pattan, Reshma
@ 2016-06-13 13:28         ` Aaron Conole
  0 siblings, 0 replies; 67+ messages in thread
From: Aaron Conole @ 2016-06-13 13:28 UTC (permalink / raw)
  To: Pattan, Reshma; +Cc: dev

"Pattan, Reshma" <reshma.pattan@intel.com> writes:

> Hi,
>
>> -----Original Message-----
>> From: Aaron Conole [mailto:aconole@redhat.com]
>> Sent: Friday, June 10, 2016 7:48 PM
>> To: Pattan, Reshma <reshma.pattan@intel.com>
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v8 5/8] lib/librte_pdump: add new library for
>> packet capturing support
>> 
>> Hi Reshma,
>> 
>> Reshma Pattan <reshma.pattan@intel.com> writes:
>> 
>> > Added new library for packet capturing support.
>> >
>> > Added public api rte_pdump_init, applications should call this as part
>> > of their application setup to have packet capturing framework ready.
>> >
>> > Added public api rte_pdump_uninit to uninitialize the packet capturing
>> > framework.
>> >
>> > Added public apis rte_pdump_enable and rte_pdump_disable to enable and
>> > disable packet capturing on specific port and queue.
>> >
>> > Added public apis rte_pdump_enable_by_deviceid and
>> > rte_pdump_disable_by_deviceid to enable and disable packet capturing
>> > on a specific device (pci address or name) and queue.
>> >
>> > Added public api rte_pdump_set_socket_dir to set the server socket
>> > path.
>> 
>> Thanks for this, it is quite useful!  I am wondering, should the
>> same API work for
>> a client socket as well?  The code becomes a bit easier to maintain,
>> and the API
>> behaves whether executed from client or server.
>> Thoughts?
>
> In this patch, server socket path is added as argument to
> rte_pdump_init() , so server socket path must be passed while calling
> rte_pdump_init() API.
> And rte_pdump_set_socket_dir() is added for clients , as client need
> to know server socket path for contacting server, so application
> should pass server socket path for clients using this API.
>
> Could you please clarify which of the below option you are looking to have?
> a)If you want to have client and server sockets under same non default
> path this can be done using same API. This just needs a tiny change in
> the code.
>
> b)But if you want to have aserver and client sockets under different
> paths, this can done using either of the below approaches.
> b1)use same rte_pdump_set_socket_dir() API, but add a new argument to
> specify if the path is for server or client socket.

This is probably the better option.  I think it would result in the
least surprise to an end developer, anyway.

Thanks,
Aaron

>  	(or) 
> b2)have two separate APIs to set client and server socket paths. 
>
> Which one do you prefer? 
>
> Konstantin, any comments from your side, please add. 
>
> Thanks,
> Reshma
>
>> 
>> Thanks,
>> Aaron
>> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v9 0/8] add packet capture framework
  2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
                     ` (8 preceding siblings ...)
  2016-06-10 23:23   ` [PATCH v8 0/8] add " Neil Horman
@ 2016-06-14  9:38   ` Reshma Pattan
  2016-06-14  9:38     ` [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
                       ` (8 more replies)
  9 siblings, 9 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev

This patch set includes the following changes:

1)Changes to librte_ether.
2)A new library librte_pdump added for packet capture framework.
3)A new app/pdump tool added for packet capturing.
4)Test pmd changes done to initialize packet capture framework.
5)Documentation update.

1)librte_pdump
==============
To support packet capturing on DPDK Ethernet devices, a new library librte_pdump
is added. Users can develop their own packet capturing applications using the new library APIs.

Operation:
----------
The librte_pdump provides APIs to support packet capturing on dpdk Ethernet devices.
Library provides APIs to initialize the packet capture framework, enable/disable
the packet capture and uninitialize the packet capture framework.

The librte_pdump library works on a client/server model. The server is responsible for enabling or
disabling the packet capture and the clients are responsible for requesting the enabling or disabling of
the packet capture.

The packet capture framework, as part of its initialization, creates a pthread and the server socket in
that pthread. The application that calls the framework initialization will have the server socket created
either under the path that the application has passed or under the default path, i.e. ''/var/run'' for
the root user or ''$HOME'' for a non-root user.

Applications that request enabling or disabling of the packet capture will have the client socket created
either under the path that the application has passed or under the default path, i.e. ''/var/run/''
for the root user or ''$HOME'' for a non-root user, to send the requests to the server.
The server socket will listen for client requests for enabling or disabling the packet capture.
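
The default-path rule above can be sketched in plain C. This is only an
illustration of the rule as described in this cover letter, not the code in
rte_pdump.c; the helper name pdump_resolve_socket_dir() is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a librte_pdump API): pick the directory in
 * which a pdump socket would be created. A non-empty user path wins;
 * otherwise fall back to /var/run for root or $HOME for other users.
 * Returns 0 on success, -1 if no directory is available or it does
 * not fit in buf.
 */
static int
pdump_resolve_socket_dir(const char *user_path, char *buf, size_t len)
{
	const char *dir = user_path;
	int n;

	if (dir == NULL || dir[0] == '\0') {
		if (getuid() == 0)
			dir = "/var/run";
		else
			dir = getenv("HOME");
	}
	if (dir == NULL)
		return -1;

	n = snprintf(buf, len, "%s", dir);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Calling pdump_resolve_socket_dir(NULL, buf, sizeof(buf)) as a non-root user
would yield the value of $HOME, while passing an explicit path returns that
path unchanged.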

Applications using below APIs need to pass port/device_id, queue, mempool and
ring parameters. Library uses user provided ring and mempool to mirror the rx/tx
packets of the port for users. Users need to dequeue the rings and write the packets
to vdev(pcap/tuntap) to view the packets using any standard tools.

Note:
The mempool and ring should be multi-consumer/multi-producer (mc/mp) capable.
The mempool mbuf size should be big enough to handle the rx/tx packets of a port.

APIs:
-----
rte_pdump_init()
rte_pdump_enable()
rte_pdump_enable_by_deviceid()
rte_pdump_disable()
rte_pdump_disable_by_deviceid()
rte_pdump_uninit()
rte_pdump_set_socket_dir()

2)app/pdump tool
================
The app/pdump tool is built on librte_pdump for packet capturing in DPDK.
By default this tool runs as a secondary process and provides
command line options for packet capture.

./build/app/dpdk_pdump --
                       --pdump '(port=<port id> | device_id=<pci id or vdev name>),
                                (queue=<queue id>),
                                (rx-dev=<iface or pcap file> |
                                 tx-dev=<iface or pcap file>),
                                [ring-size=<ring size>],
                                [mbuf-size=<mbuf data size>],
                                [total-num-mbufs=<number of mbufs>]'

Parameters inside parentheses are mandatory.
Parameters inside square brackets are optional.
The user has to pass the packet capture parameters under the --pdump option; multiple
--pdump options can be passed to capture packets on different port and queue combinations.
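
As an illustration of the comma-separated key=value format shown above, here
is a small self-contained lookup routine. It is a sketch only: the helper name
pdump_arg_get() is hypothetical, and the actual tool may rely on DPDK's own
argument parsing helpers instead.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find "key=value" in a comma-separated argument
 * string and copy the value into val. Returns 0 if the key was found
 * and the value fits in val, -1 otherwise.
 */
static int
pdump_arg_get(const char *args, const char *key, char *val, size_t len)
{
	size_t klen = strlen(key);
	const char *p = args;

	while (p != NULL && *p != '\0') {
		const char *end = strchr(p, ',');
		size_t n = end ? (size_t)(end - p) : strlen(p);

		/* Token must start with "key=" to match. */
		if (n > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
			size_t vlen = n - klen - 1;

			if (vlen >= len)
				return -1;
			memcpy(val, p + klen + 1, vlen);
			val[vlen] = '\0';
			return 0;
		}
		p = end ? end + 1 : NULL;
	}
	return -1;
}
```

A caller could enforce the mandatory parameters by checking that either port
or device_id, queue, and at least one of rx-dev or tx-dev are found, and treat
ring-size, mbuf-size and total-num-mbufs as optional.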

Operation:
----------
*The tool parses the user command line arguments and
creates the mempool, the ring and the PCAP PMD vdev with 'tx_stream' set to
the device passed in the rx-dev|tx-dev parameters.

*It then calls the librte_pdump APIs, i.e. rte_pdump_enable()/rte_pdump_enable_by_deviceid(),
to enable packet capturing on a specific port/device_id and queue by passing on the
port|device_id, queue, mempool and ring info.

*The tool runs in a while loop to dequeue the packets from the ring and write them to the pcap device.

*The tool can be stopped using SIGINT, upon which it calls
rte_pdump_disable()/rte_pdump_disable_by_deviceid() and frees the allocated resources.

Note:
CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run the pdump tool.

3)Test-pmd changes
==================
The test-pmd application is changed to initialize/uninitialize the packet capture framework,
so the app/pdump tool can be run to see packets of the DPDK ports that are used by test-pmd.

Similarly, any application which needs packet capture should call the initialize/uninitialize APIs of
librte_pdump and use the pdump tool to start the capture.

4)Packet capture flow between pdump tool and librte_pdump
=========================================================
*The pdump tool (secondary process) requests packet capture
for specific port|device_id and queue combinations.

*The library, in the secondary process context, creates the client socket and communicates
the port|device_id, queue, ring and mempool to the server.

*The library initializes the server in the primary process ('test-pmd') context, and the server serves
the client request to enable Ethernet rxtx callbacks for a given port|device_id and queue.

*The callbacks copy the rx/tx packets to the passed mempool and enqueue them to the ring for the secondary process.

*The pdump tool dequeues the packets from the ring and writes them to the PCAP PMD vdev,
so ultimately the packets will be seen on the device that is passed in rx-dev|tx-dev.

*Once the pdump tool is terminated with SIGINT, it will disable the packet capturing.

*The library receives the disable packet capture request and communicates the info to the server;
the server will remove the Ethernet rxtx callbacks.

*The packet capture can be seen using the tcpdump command
"tcpdump -ni <iface>" (or) "tcpdump -nr <pcapfile>"

5)Example command line
======================
./build/app/dpdk_pdump -- --pdump 'device_id=0000:02:0.0,queue=*,tx-dev=/tmp/dt-file.pcap,rx-dev=/tmp/dr-file.pcap,ring-size=8192,mbuf-size=2176,total-num-mbufs=32768' --pdump 'device_id=0000:01:00.0,queue=*,rx-dev=/tmp/d-file.pcap,tx-dev=/tmp/d-file.pcap,ring-size=16384,mbuf-size=2176,total-num-mbufs=32768'

v9:
added support in rte_pdump_set_socket_dir() to set server and client socket paths
==> http://dpdk.org/dev/patchwork/patch/13450/
updated the documentation for the new changes.
updated the commit messages.

v8:
added server socket argument to rte_pdump_init() API ==> http://dpdk.org/dev/patchwork/patch/13402/
added rte_pdump_set_socket_dir() API.
updated documentation for new changes.

v7:
fixed lines over 90 characters.

v6:
removed below deprecation notice patch from patch set.
http://dpdk.org/dev/patchwork/patch/13372/

v5:
addressed code review comments for below patches
http://dpdk.org/dev/patchwork/patch/12955/
http://dpdk.org/dev/patchwork/patch/12951/

v4:
added missing deprecation notice for ABI changes of rte_eth_dev_info structure.
made doc changes as per doc guidelines.
replaced rte_eal_vdev_init with rte_eth_dev_attach in pdump tool.
removed rxtx-dev parameter from pdump tool command line.

v3:
app/pdump: Moved cleanup code from signal handler to main.
divided librte_ether changes into multiple patches.
example command changed in app/pdump application guide

v2:
fix compilation issues for 4.8.3
fix unnecessary #includes


Reshma Pattan (8):
  ethdev: use locks to protect Rx/Tx callback lists
  ethdev: add new api to add Rx callback as head of the list
  ethdev: add new fields to ethdev info struct
  ethdev: make get port by name and get name by port public
  pdump: add new library for packet capturing support
  app/pdump: add pdump tool for packet capturing
  app/testpmd: add pdump initialization uninitialization
  doc: update doc for packet capture framework

 MAINTAINERS                             |   8 +
 app/Makefile                            |   1 +
 app/pdump/Makefile                      |  45 ++
 app/pdump/main.c                        | 844 +++++++++++++++++++++++++++++
 app/test-pmd/testpmd.c                  |   6 +
 config/common_base                      |   5 +
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/pdump_library.rst | 119 +++++
 doc/guides/rel_notes/release_16_07.rst  |  13 +
 doc/guides/sample_app_ug/index.rst      |   1 +
 doc/guides/sample_app_ug/pdump.rst      | 122 +++++
 lib/Makefile                            |   1 +
 lib/librte_ether/rte_ethdev.c           | 123 +++--
 lib/librte_ether/rte_ethdev.h           |  60 +++
 lib/librte_ether/rte_ether_version.map  |   9 +
 lib/librte_pdump/Makefile               |  55 ++
 lib/librte_pdump/rte_pdump.c            | 913 ++++++++++++++++++++++++++++++++
 lib/librte_pdump/rte_pdump.h            | 216 ++++++++
 lib/librte_pdump/rte_pdump_version.map  |  13 +
 mk/rte.app.mk                           |   1 +
 20 files changed, 2512 insertions(+), 44 deletions(-)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 doc/guides/sample_app_ug/pdump.rst
 create mode 100644 lib/librte_pdump/Makefile
 create mode 100644 lib/librte_pdump/rte_pdump.c
 create mode 100644 lib/librte_pdump/rte_pdump.h
 create mode 100644 lib/librte_pdump/rte_pdump_version.map

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
-- 
2.5.0

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14 19:59       ` Thomas Monjalon
  2016-06-14  9:38     ` [PATCH v9 2/8] ethdev: add new api to add Rx callback as head of the list Reshma Pattan
                       ` (7 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added spinlocks around add/remove logic of Rx and Tx callbacks
to avoid corruption of callback lists in multithreaded context.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 82 +++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 42 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..ce70d58 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -77,6 +77,12 @@ static uint8_t nb_ports;
 /* spinlock for eth device callbacks */
 static rte_spinlock_t rte_eth_dev_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for add/remove rx callbacks */
+static rte_spinlock_t rte_eth_rx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* spinlock for add/remove tx callbacks */
+static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -1634,7 +1640,6 @@ rte_eth_dev_set_rx_queue_stats_mapping(uint8_t port_id, uint16_t rx_queue_id,
 			STAT_QMAP_RX);
 }
 
-
 void
 rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 {
@@ -2905,7 +2910,6 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_errno = EINVAL;
 		return NULL;
 	}
-
 	struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
 
 	if (cb == NULL) {
@@ -2916,6 +2920,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.rx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
@@ -2928,6 +2933,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
 	return cb;
 }
@@ -2957,6 +2963,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.tx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id];
@@ -2969,6 +2976,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
 	return cb;
 }
@@ -2987,29 +2995,24 @@ rte_eth_remove_rx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->post_rx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+	int ret = -EINVAL;
+
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
+	prev_cb = &dev->post_rx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int
@@ -3026,29 +3029,24 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->pre_tx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->pre_tx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	int ret = -EINVAL;
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
+	prev_cb = &dev->pre_tx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int
-- 
2.5.0
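
For readers following the removal rewrite in this patch: the new loop walks a
pointer to the link (head pointer or a node's next field) rather than a pointer
to the previous node, so removing the head needs no special case. Below is a
minimal standalone sketch of that technique. It is not the ethdev code: the
names cb_node, cb_list_append() and cb_list_remove() are hypothetical, and a
C11 atomic_flag spinlock stands in for rte_spinlock_t. As in the patch, the
lock only serializes add/remove against each other.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_eth_rxtx_callback. */
struct cb_node {
	struct cb_node *next;
	int id;
};

/* C11 spinlock used here in place of rte_spinlock_t. */
static atomic_flag cb_lock = ATOMIC_FLAG_INIT;

static void
cb_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&cb_lock,
			memory_order_acquire))
		; /* spin until the flag is cleared */
}

static void
cb_lock_release(void)
{
	atomic_flag_clear_explicit(&cb_lock, memory_order_release);
}

/* Append node at the tail (FIFO order), under the lock. */
static void
cb_list_append(struct cb_node **head, struct cb_node *node)
{
	struct cb_node **link;

	node->next = NULL;
	cb_lock_acquire();
	for (link = head; *link != NULL; link = &(*link)->next)
		;
	*link = node;
	cb_lock_release();
}

/* Unlink target; returns 0 if found, -1 otherwise. */
static int
cb_list_remove(struct cb_node **head, struct cb_node *target)
{
	struct cb_node **link;
	int ret = -1;

	cb_lock_acquire();
	for (link = head; *link != NULL; link = &(*link)->next) {
		if (*link == target) {
			/* Works for head and interior nodes alike. */
			*link = target->next;
			ret = 0;
			break;
		}
	}
	cb_lock_release();
	return ret;
}
```

Because the loop condition tests *link before dereferencing, removing from an
empty list or asking for a callback that is not present simply returns -1.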

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v9 2/8] ethdev: add new api to add Rx callback as head of the list
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
  2016-06-14  9:38     ` [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14 20:01       ` Thomas Monjalon
  2016-06-14  9:38     ` [PATCH v9 3/8] ethdev: add new fields to ethdev info struct Reshma Pattan
                       ` (6 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added new public api rte_eth_add_first_rx_callback to add given
callback as head of the list.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 35 ++++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 28 +++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |  6 ++++++
 3 files changed, 69 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ce70d58..97d167e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2939,6 +2939,41 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 }
 
 void *
+rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+		rte_rx_callback_fn fn, void *user_param)
+{
+#ifndef RTE_ETHDEV_RXTX_CALLBACKS
+	rte_errno = ENOTSUP;
+	return NULL;
+#endif
+	/* check input parameters */
+	if (!rte_eth_dev_is_valid_port(port_id) || fn == NULL ||
+		queue_id >= rte_eth_devices[port_id].data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
+
+	if (cb == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	cb->fn.rx = fn;
+	cb->param = user_param;
+
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
+	/* Add the callback at the first position. */
+	cb->next = rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
+	rte_smp_wmb();
+	rte_eth_devices[port_id].post_rx_burst_cbs[queue_id] = cb;
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
+
+	return cb;
+}
+
+void *
 rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_tx_callback_fn fn, void *user_param)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..237e6ef 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3825,6 +3825,34 @@ int rte_eth_dev_get_dcb_info(uint8_t port_id,
 void *rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_rx_callback_fn fn, void *user_param);
 
+/*
+* Add a callback that must be called first on packet RX on a given port
+* and queue.
+*
+* This API configures a first function to be called for each burst of
+* packets received on a given NIC port queue. The return value is a pointer
+* that can be used to later remove the callback using
+* rte_eth_remove_rx_callback().
+*
+* Multiple functions are called in the order that they are added.
+*
+* @param port_id
+*   The port identifier of the Ethernet device.
+* @param queue_id
+*   The queue on the Ethernet device on which the callback is to be added.
+* @param fn
+*   The callback function
+* @param user_param
+*   A generic pointer parameter which will be passed to each invocation of the
+*   callback function on this port and queue.
+*
+* @return
+*   NULL on error.
+*   On success, a pointer value which can later be used to remove the callback.
+*/
+void *rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+		rte_rx_callback_fn fn, void *user_param);
+
 /**
  * Add a callback to be called on packet TX on a given port and queue.
  *
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 214ecc7..c990b04 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -132,3 +132,9 @@ DPDK_16.04 {
 	rte_eth_tx_buffer_set_err_callback;
 
 } DPDK_2.2;
+
+DPDK_16.07 {
+	global:
+
+	rte_eth_add_first_rx_callback;
+} DPDK_16.04;
-- 
2.5.0
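
The head insertion in this patch sets cb->next first, issues a write barrier,
and only then publishes the new head, so a reader walking the list never sees
a head whose next pointer is not yet set. A minimal sketch of the same ordering
using C11 atomics follows. The names rx_cb and rx_cb_add_first() are
hypothetical, a release store is used in place of rte_smp_wmb(), and writers
are assumed to be serialized externally (the patch uses a spinlock for that).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_eth_rxtx_callback. */
struct rx_cb {
	struct rx_cb *next;
	int id;
};

/*
 * Insert cb at the head of the list. The release store guarantees that
 * the write to cb->next is visible before the new head pointer is.
 * Assumes the caller already holds whatever lock serializes writers.
 */
static void
rx_cb_add_first(struct rx_cb *_Atomic *head, struct rx_cb *cb)
{
	cb->next = atomic_load_explicit(head, memory_order_relaxed);
	atomic_store_explicit(head, cb, memory_order_release);
}

/* Reader side: count the callbacks currently linked. */
static unsigned int
rx_cb_count(struct rx_cb *_Atomic *head)
{
	unsigned int n = 0;
	struct rx_cb *cb;

	for (cb = atomic_load_explicit(head, memory_order_acquire);
			cb != NULL; cb = cb->next)
		n++;
	return n;
}
```

With this ordering, a callback added last through rx_cb_add_first() is the
first one a reader encounters, which is the behaviour the new API is meant to
provide for packet capture.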

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v9 3/8] ethdev: add new fields to ethdev info struct
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
  2016-06-14  9:38     ` [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
  2016-06-14  9:38     ` [PATCH v9 2/8] ethdev: add new api to add Rx callback as head of the list Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14 20:10       ` Thomas Monjalon
  2016-06-14  9:38     ` [PATCH v9 4/8] ethdev: make get port by name and get name by port public Reshma Pattan
                       ` (5 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

New fields nb_rx_queues and nb_tx_queues are added to
rte_eth_dev_info structure.
Changes to API rte_eth_dev_info_get() are done to update
these new fields to rte_eth_dev_info object.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 2 ++
 lib/librte_ether/rte_ethdev.h          | 3 +++
 lib/librte_ether/rte_ether_version.map | 1 +
 3 files changed, 6 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 97d167e..1f634c9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1661,6 +1661,8 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 	(*dev->dev_ops->dev_infos_get)(dev, dev_info);
 	dev_info->pci_dev = dev->pci_dev;
 	dev_info->driver_name = dev->data->drv_name;
+	dev_info->nb_rx_queues = dev->data->nb_rx_queues;
+	dev_info->nb_tx_queues = dev->data->nb_tx_queues;
 }
 
 int
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 237e6ef..8ad7c01 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -882,6 +882,9 @@ struct rte_eth_dev_info {
 	struct rte_eth_desc_lim rx_desc_lim;  /**< RX descriptors limits */
 	struct rte_eth_desc_lim tx_desc_lim;  /**< TX descriptors limits */
 	uint32_t speed_capa;  /**< Supported speeds bitmap (ETH_LINK_SPEED_). */
+	/** Configured number of rx/tx queues */
+	uint16_t nb_rx_queues; /**< Number of RX queues. */
+	uint16_t nb_tx_queues; /**< Number of TX queues. */
 };
 
 /**
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index c990b04..d06d648 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,4 +137,5 @@ DPDK_16.07 {
 	global:
 
 	rte_eth_add_first_rx_callback;
+	rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v9 4/8] ethdev: make get port by name and get name by port public
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
                       ` (2 preceding siblings ...)
  2016-06-14  9:38     ` [PATCH v9 3/8] ethdev: add new fields to ethdev info struct Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14 20:23       ` Thomas Monjalon
  2016-06-14  9:38     ` [PATCH v9 5/8] pdump: add new library for packet capturing support Reshma Pattan
                       ` (4 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Converted rte_eth_dev_get_port_by_name to a public API.
Converted rte_eth_dev_get_name_by_port to a public API.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
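For illustration only, a minimal sketch of the calling convention the
two public functions document: 0 on success and a negative errno value
on failure. The device table and the sketch_* names are hypothetical
stand-ins so that the example builds without DPDK; they are not the
ethdev implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NAME_LEN 64

/* Hypothetical device table standing in for the ethdev port array. */
static const char *sketch_ports[SKETCH_MAX_PORTS] = {
	"0000:02:00.0", "eth_pcap0", NULL, NULL
};

/* Look up a port id by PCI address or vdev name. */
static int
sketch_get_port_by_name(const char *name, uint8_t *port_id)
{
	uint8_t i;

	if (name == NULL || port_id == NULL)
		return -EINVAL;

	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (sketch_ports[i] != NULL &&
		    strcmp(sketch_ports[i], name) == 0) {
			*port_id = i;
			return 0;
		}
	}
	return -ENODEV;
}

/* Copy the device name of a port into a caller-provided buffer. */
static int
sketch_get_name_by_port(uint8_t port_id, char *name)
{
	if (name == NULL || port_id >= SKETCH_MAX_PORTS ||
	    sketch_ports[port_id] == NULL)
		return -EINVAL;

	/* the caller is assumed to pass at least SKETCH_NAME_LEN bytes */
	snprintf(name, SKETCH_NAME_LEN, "%s", sketch_ports[port_id]);
	return 0;
}
```

A caller would check the return value before using the output argument,
since the output is left untouched on failure.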
 lib/librte_ether/rte_ethdev.c          |  4 ++--
 lib/librte_ether/rte_ethdev.h          | 29 +++++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |  2 ++
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 1f634c9..0b19569 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -406,7 +406,7 @@ rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
 	return 0;
 }
 
-static int
+int
 rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 {
 	char *tmp;
@@ -425,7 +425,7 @@ rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 	return 0;
 }
 
-static int
+int
 rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id)
 {
 	int i;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 8ad7c01..fab281e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4284,6 +4284,35 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
 				  uint32_t mask,
 				  uint8_t en);
 
+/**
+* Get the port id from PCI address or device name
+* Ex: 0000:2:00.0 or vdev name eth_pcap0
+*
+* @param name
+*  pci address or name of the device
+* @param port_id
+*   pointer to port identifier of the device
+* @return
+*   - (0) if successful.
+*   - (-ENODEV or -EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id);
+
+/**
+* Get the device name from port id
+*
+* @param port_id
+*   port identifier of the device
+* @param name
+*  buffer to store the PCI address or name of the device
+* @return
+*   - (0) if successful.
+*   - (-EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index d06d648..73e730d 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,5 +137,7 @@ DPDK_16.07 {
 	global:
 
 	rte_eth_add_first_rx_callback;
+	rte_eth_dev_get_name_by_port;
+	rte_eth_dev_get_port_by_name;
 	rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v9 5/8] pdump: add new library for packet capturing support
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
                       ` (3 preceding siblings ...)
  2016-06-14  9:38     ` [PATCH v9 4/8] ethdev: make get port by name and get name by port public Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14 20:28       ` Thomas Monjalon
  2016-06-14  9:38     ` [PATCH v9 6/8] app/pdump: add pdump tool for packet capturing Reshma Pattan
                       ` (3 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

The new librte_pdump library is added for packet capturing
support.

Added public API rte_pdump_init. Applications should call
it as part of their setup to have the packet capturing
framework ready.

Added public API rte_pdump_uninit to uninitialize the packet
capturing framework.

Added public APIs rte_pdump_enable and rte_pdump_disable to
enable and disable packet capturing on a specific port and queue.

Added public APIs rte_pdump_enable_by_deviceid and
rte_pdump_disable_by_deviceid to enable and disable packet
capturing on a specific device (PCI address or name) and queue.

Added public API rte_pdump_set_socket_dir to set the
server and client socket paths.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
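For illustration only, a minimal sketch of the request/response exchange
between a client and the server described above. To stay
self-contained it swaps in socketpair() for the bound AF_UNIX datagram
sockets the library uses, serves the request in the same thread, and
uses trimmed-down, hypothetical message layouts (sketch_request,
sketch_response). It is not the library code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, trimmed-down message layouts for this sketch only. */
struct sketch_request {
	uint16_t ver;
	uint16_t op;		/* 1 = disable, 2 = enable, as in the patch */
	char device[64];
};

struct sketch_response {
	uint16_t ver;
	uint16_t res_op;
	int32_t err_value;
};

/* "Server" side: read one request and answer with a status. */
static int
sketch_serve_one(int fd)
{
	struct sketch_request req;
	struct sketch_response resp;

	if (recv(fd, &req, sizeof(req), 0) != (ssize_t)sizeof(req))
		return -1;

	resp.ver = req.ver;
	resp.res_op = req.op;
	resp.err_value = (req.op == 1 || req.op == 2) ? 0 : -EINVAL;

	if (send(fd, &resp, sizeof(resp), 0) != (ssize_t)sizeof(resp))
		return -1;
	return 0;
}

/* "Client" side: send a request, let it be served, read the reply. */
static int
sketch_round_trip(uint16_t op, const char *device, int32_t *err_value)
{
	int fds[2];
	int ret = -1;
	struct sketch_request req;
	struct sketch_response resp;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) < 0)
		return -1;

	memset(&req, 0, sizeof(req));
	req.ver = 1;
	req.op = op;
	/* bounded copy; the zeroed struct keeps the string terminated */
	strncpy(req.device, device, sizeof(req.device) - 1);

	if (send(fds[0], &req, sizeof(req), 0) == (ssize_t)sizeof(req) &&
	    sketch_serve_one(fds[1]) == 0 &&
	    recv(fds[0], &resp, sizeof(resp), 0) == (ssize_t)sizeof(resp)) {
		*err_value = resp.err_value;
		ret = 0;
	}

	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

The return value reports transport failures, while err_value carries the
status computed by the serving side, which mirrors how the response
structure in this patch separates the two.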
 MAINTAINERS                            |   4 +
 config/common_base                     |   5 +
 lib/Makefile                           |   1 +
 lib/librte_pdump/Makefile              |  55 ++
 lib/librte_pdump/rte_pdump.c           | 913 +++++++++++++++++++++++++++++++++
 lib/librte_pdump/rte_pdump.h           | 216 ++++++++
 lib/librte_pdump/rte_pdump_version.map |  13 +
 mk/rte.app.mk                          |   1 +
 8 files changed, 1208 insertions(+)
 create mode 100644 lib/librte_pdump/Makefile
 create mode 100644 lib/librte_pdump/rte_pdump.c
 create mode 100644 lib/librte_pdump/rte_pdump.h
 create mode 100644 lib/librte_pdump/rte_pdump_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 3e6b70c..a500cf4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -433,6 +433,10 @@ F: app/test/test_reorder*
 F: examples/packet_ordering/
 F: doc/guides/sample_app_ug/packet_ordering.rst
 
+Pdump
+M: Reshma Pattan <reshma.pattan@intel.com>
+F: lib/librte_pdump/
+
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
 F: lib/librte_sched/
diff --git a/config/common_base b/config/common_base
index 47c26f6..a2d5d72 100644
--- a/config/common_base
+++ b/config/common_base
@@ -484,6 +484,11 @@ CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
 CONFIG_RTE_LIBRTE_REORDER=y
 
 #
+# Compile the pdump library
+#
+CONFIG_RTE_LIBRTE_PDUMP=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/lib/Makefile b/lib/Makefile
index f254dba..ca7c02f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
 DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
+DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
new file mode 100644
index 0000000..af81a28
--- /dev/null
+++ b/lib/librte_pdump/Makefile
@@ -0,0 +1,55 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pdump.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+CFLAGS += -D_GNU_SOURCE
+
+EXPORT_MAP := rte_pdump_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_PDUMP) := rte_pdump.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_PDUMP)-include := rte_pdump.h
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
new file mode 100644
index 0000000..c921f51
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump.c
@@ -0,0 +1,913 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_memcpy.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_pci.h>
+
+#include "rte_pdump.h"
+
+#define SOCKET_PATH_VAR_RUN "/var/run/pdump_sockets"
+#define SOCKET_PATH_HOME "HOME"
+#define SERVER_SOCKET "%s/pdump_server_socket"
+#define CLIENT_SOCKET "%s/pdump_client_socket_%d_%u"
+#define DEVICE_ID_SIZE 64
+/* Macros for printing using RTE_LOG */
+#define RTE_LOGTYPE_PDUMP RTE_LOGTYPE_USER1
+
+enum pdump_operation {
+	DISABLE = 1,
+	ENABLE = 2
+};
+
+enum pdump_version {
+	V1 = 1
+};
+
+static pthread_t pdump_thread;
+static int pdump_socket_fd;
+static char server_socket_dir[PATH_MAX];
+static char client_socket_dir[PATH_MAX];
+
+struct pdump_request {
+	uint16_t ver;
+	uint16_t op;
+	uint32_t flags;
+	union pdump_data {
+		struct enable_v1 {
+			char device[DEVICE_ID_SIZE];
+			uint16_t queue;
+			struct rte_ring *ring;
+			struct rte_mempool *mp;
+			void *filter;
+		} en_v1;
+		struct disable_v1 {
+			char device[DEVICE_ID_SIZE];
+			uint16_t queue;
+			struct rte_ring *ring;
+			struct rte_mempool *mp;
+			void *filter;
+		} dis_v1;
+	} data;
+};
+
+struct pdump_response {
+	uint16_t ver;
+	uint16_t res_op;
+	int32_t err_value;
+};
+
+static struct pdump_rxtx_cbs {
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+	struct rte_eth_rxtx_callback *cb;
+	void *filter;
+} rx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT],
+tx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];
+
+static inline int
+pdump_pktmbuf_copy_data(struct rte_mbuf *seg, const struct rte_mbuf *m)
+{
+	if (rte_pktmbuf_tailroom(seg) < m->data_len) {
+		RTE_LOG(ERR, PDUMP,
+			"User mempool: insufficient data_len of mbuf\n");
+		return -EINVAL;
+	}
+
+	seg->port = m->port;
+	seg->vlan_tci = m->vlan_tci;
+	seg->hash = m->hash;
+	seg->tx_offload = m->tx_offload;
+	seg->ol_flags = m->ol_flags;
+	seg->packet_type = m->packet_type;
+	seg->vlan_tci_outer = m->vlan_tci_outer;
+	seg->data_len = m->data_len;
+	seg->pkt_len = seg->data_len;
+	rte_memcpy(rte_pktmbuf_mtod(seg, void *),
+			rte_pktmbuf_mtod(m, void *),
+			rte_pktmbuf_data_len(seg));
+
+	return 0;
+}
+
+static inline struct rte_mbuf *
+pdump_pktmbuf_copy(struct rte_mbuf *m, struct rte_mempool *mp)
+{
+	struct rte_mbuf *m_dup, *seg, **prev;
+	uint32_t pktlen;
+	uint8_t nseg;
+
+	m_dup = rte_pktmbuf_alloc(mp);
+	if (unlikely(m_dup == NULL))
+		return NULL;
+
+	seg = m_dup;
+	prev = &seg->next;
+	pktlen = m->pkt_len;
+	nseg = 0;
+
+	do {
+		nseg++;
+		if (pdump_pktmbuf_copy_data(seg, m) < 0) {
+			rte_pktmbuf_free(m_dup);
+			return NULL;
+		}
+		*prev = seg;
+		prev = &seg->next;
+	} while ((m = m->next) != NULL &&
+			(seg = rte_pktmbuf_alloc(mp)) != NULL);
+
+	*prev = NULL;
+	m_dup->nb_segs = nseg;
+	m_dup->pkt_len = pktlen;
+
+	/* Allocation of a new segment failed */
+	if (unlikely(seg == NULL)) {
+		rte_pktmbuf_free(m_dup);
+		return NULL;
+	}
+
+	__rte_mbuf_sanity_check(m_dup, 1);
+	return m_dup;
+}
+
+static inline void
+pdump_copy(struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
+{
+	unsigned i;
+	int ring_enq;
+	uint16_t d_pkts = 0;
+	struct rte_mbuf *dup_bufs[nb_pkts];
+	struct pdump_rxtx_cbs *cbs;
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+	struct rte_mbuf *p;
+
+	cbs  = user_params;
+	ring = cbs->ring;
+	mp = cbs->mp;
+	for (i = 0; i < nb_pkts; i++) {
+		p = pdump_pktmbuf_copy(pkts[i], mp);
+		if (p)
+			dup_bufs[d_pkts++] = p;
+	}
+
+	ring_enq = rte_ring_enqueue_burst(ring, (void *)dup_bufs, d_pkts);
+	if (unlikely(ring_enq < d_pkts)) {
+		RTE_LOG(DEBUG, PDUMP,
+			"only %d of packets enqueued to ring\n", ring_enq);
+		do {
+			rte_pktmbuf_free(dup_bufs[ring_enq]);
+		} while (++ring_enq < d_pkts);
+	}
+}
+
+static uint16_t
+pdump_rx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+	struct rte_mbuf **pkts, uint16_t nb_pkts,
+	uint16_t max_pkts __rte_unused,
+	void *user_params)
+{
+	pdump_copy(pkts, nb_pkts, user_params);
+	return nb_pkts;
+}
+
+static uint16_t
+pdump_tx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+		struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
+{
+	pdump_copy(pkts, nb_pkts, user_params);
+	return nb_pkts;
+}
+
+static int
+pdump_get_dombdf(char *device_id, char *domBDF, size_t len)
+{
+	int ret;
+	struct rte_pci_addr dev_addr = {0};
+
+	/* identify if device_id is pci address or name */
+	ret = eal_parse_pci_DomBDF(device_id, &dev_addr);
+	if (ret < 0)
+		return -1;
+
+	if (dev_addr.domain)
+		ret = snprintf(domBDF, len, "%u:%u:%u.%u", dev_addr.domain,
+				dev_addr.bus, dev_addr.devid,
+				dev_addr.function);
+	else
+		ret = snprintf(domBDF, len, "%u:%u.%u", dev_addr.bus,
+				dev_addr.devid,
+				dev_addr.function);
+
+	return ret;
+}
+
+static int
+pdump_regitser_rx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
+				struct rte_ring *ring, struct rte_mempool *mp,
+				uint16_t operation)
+{
+	uint16_t qid;
+	struct pdump_rxtx_cbs *cbs = NULL;
+
+	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
+	for (; qid < end_q; qid++) {
+		cbs = &rx_cbs[port][qid];
+		if (cbs && operation == ENABLE) {
+			if (cbs->cb) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add rx callback for port=%d "
+					"and queue=%d, callback already exists\n",
+					port, qid);
+				return -EEXIST;
+			}
+			cbs->ring = ring;
+			cbs->mp = mp;
+			cbs->cb = rte_eth_add_first_rx_callback(port, qid,
+								pdump_rx, cbs);
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add rx callback, errno=%d\n",
+					rte_errno);
+				return -rte_errno;
+			}
+		}
+		if (cbs && operation == DISABLE) {
+			int ret;
+
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to delete non existing rx "
+					"callback for port=%d and queue=%d\n",
+					port, qid);
+				return -EINVAL;
+			}
+			ret = rte_eth_remove_rx_callback(port, qid, cbs->cb);
+			if (ret < 0) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to remove rx callback, errno=%d\n",
+					rte_errno);
+				return ret;
+			}
+			cbs->cb = NULL;
+		}
+	}
+
+	return 0;
+}
+
+static int
+pdump_regitser_tx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
+				struct rte_ring *ring, struct rte_mempool *mp,
+				uint16_t operation)
+{
+
+	uint16_t qid;
+	struct pdump_rxtx_cbs *cbs = NULL;
+
+	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
+	for (; qid < end_q; qid++) {
+		cbs = &tx_cbs[port][qid];
+		if (cbs && operation == ENABLE) {
+			if (cbs->cb) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add tx callback for port=%d "
+					"and queue=%d, callback already exists\n",
+					port, qid);
+				return -EEXIST;
+			}
+			cbs->ring = ring;
+			cbs->mp = mp;
+			cbs->cb = rte_eth_add_tx_callback(port, qid, pdump_tx,
+								cbs);
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add tx callback, errno=%d\n",
+					rte_errno);
+				return -rte_errno;
+			}
+		}
+		if (cbs && operation == DISABLE) {
+			int ret;
+
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to delete non existing tx "
+					"callback for port=%d and queue=%d\n",
+					port, qid);
+				return -EINVAL;
+			}
+			ret = rte_eth_remove_tx_callback(port, qid, cbs->cb);
+			if (ret < 0) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to remove tx callback, errno=%d\n",
+					rte_errno);
+				return ret;
+			}
+			cbs->cb = NULL;
+		}
+	}
+
+	return 0;
+}
+
+static int
+set_pdump_rxtx_cbs(struct pdump_request *p)
+{
+	uint16_t nb_rx_q = 0, nb_tx_q = 0, end_q, queue;
+	uint8_t port;
+	int ret = 0;
+	uint32_t flags;
+	uint16_t operation;
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+
+	flags = p->flags;
+	operation = p->op;
+	if (operation == ENABLE) {
+		ret = rte_eth_dev_get_port_by_name(p->data.en_v1.device,
+				&port);
+		if (ret < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to get port id for device id=%s\n",
+				p->data.en_v1.device);
+			return -EINVAL;
+		}
+		queue = p->data.en_v1.queue;
+		ring = p->data.en_v1.ring;
+		mp = p->data.en_v1.mp;
+	} else {
+		ret = rte_eth_dev_get_port_by_name(p->data.dis_v1.device,
+				&port);
+		if (ret < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to get port id for device id=%s\n",
+				p->data.dis_v1.device);
+			return -EINVAL;
+		}
+		queue = p->data.dis_v1.queue;
+		ring = p->data.dis_v1.ring;
+		mp = p->data.dis_v1.mp;
+	}
+
+	/* validation if packet capture is for all queues */
+	if (queue == RTE_PDUMP_ALL_QUEUES) {
+		struct rte_eth_dev_info dev_info;
+
+		rte_eth_dev_info_get(port, &dev_info);
+		nb_rx_q = dev_info.nb_rx_queues;
+		nb_tx_q = dev_info.nb_tx_queues;
+		if (nb_rx_q == 0 && flags & RTE_PDUMP_FLAG_RX) {
+			RTE_LOG(ERR, PDUMP,
+				"number of rx queues cannot be 0\n");
+			return -EINVAL;
+		}
+		if (nb_tx_q == 0 && flags & RTE_PDUMP_FLAG_TX) {
+			RTE_LOG(ERR, PDUMP,
+				"number of tx queues cannot be 0\n");
+			return -EINVAL;
+		}
+		if ((nb_tx_q == 0 || nb_rx_q == 0) &&
+			flags == RTE_PDUMP_FLAG_RXTX) {
+			RTE_LOG(ERR, PDUMP,
+				"both tx&rx queues must be non zero\n");
+			return -EINVAL;
+		}
+	}
+
+	/* register RX callback */
+	if (flags & RTE_PDUMP_FLAG_RX) {
+		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_rx_q : queue + 1;
+		ret = pdump_regitser_rx_callbacks(end_q, port, queue, ring, mp,
+							operation);
+		if (ret < 0)
+			return ret;
+	}
+
+	/* register TX callback */
+	if (flags & RTE_PDUMP_FLAG_TX) {
+		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_tx_q : queue + 1;
+		ret = pdump_regitser_tx_callbacks(end_q, port, queue, ring, mp,
+							operation);
+		if (ret < 0)
+			return ret;
+	}
+
+	return ret;
+}
+
+/* get socket path (/var/run if root, $HOME otherwise) */
+static void
+pdump_get_socket_path(char *buffer, int bufsz, enum rte_pdump_socktype type)
+{
+	const char *dir = NULL;
+
+	if (type == RTE_PDUMP_SOCKET_SERVER && server_socket_dir[0] != 0)
+		dir = server_socket_dir;
+	else if (type == RTE_PDUMP_SOCKET_CLIENT && client_socket_dir[0] != 0)
+		dir = client_socket_dir;
+	else {
+		if (getuid() != 0)
+			dir = getenv(SOCKET_PATH_HOME);
+		else
+			dir = SOCKET_PATH_VAR_RUN;
+	}
+
+	mkdir(dir, 0700);
+	if (type == RTE_PDUMP_SOCKET_SERVER)
+		snprintf(buffer, bufsz, SERVER_SOCKET, dir);
+	else
+		snprintf(buffer, bufsz, CLIENT_SOCKET, dir, getpid(),
+				rte_sys_gettid());
+}
+
+static int
+pdump_create_server_socket(void)
+{
+	int ret, socket_fd;
+	struct sockaddr_un addr;
+	socklen_t addr_len;
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+				RTE_PDUMP_SOCKET_SERVER);
+	addr.sun_family = AF_UNIX;
+
+	/* remove if file already exists */
+	unlink(addr.sun_path);
+
+	/* set up a server socket */
+	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
+	if (socket_fd < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	addr_len = sizeof(struct sockaddr_un);
+	ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
+	if (ret) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to bind to server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		close(socket_fd);
+		return -1;
+	}
+
+	/* save the socket in local configuration */
+	pdump_socket_fd = socket_fd;
+
+	return 0;
+}
+
+static __attribute__((noreturn)) void *
+pdump_thread_main(__rte_unused void *arg)
+{
+	struct sockaddr_un cli_addr;
+	socklen_t cli_len;
+	struct pdump_request cli_req;
+	struct pdump_response resp;
+	int n;
+	int ret = 0;
+
+	/* host thread, never break out */
+	for (;;) {
+		/* recv client requests */
+		cli_len = sizeof(cli_addr);
+		n = recvfrom(pdump_socket_fd, &cli_req,
+				sizeof(struct pdump_request), 0,
+				(struct sockaddr *)&cli_addr, &cli_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to recv from client:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			continue;
+		}
+
+		ret = set_pdump_rxtx_cbs(&cli_req);
+
+		resp.ver = cli_req.ver;
+		resp.res_op = cli_req.op;
+		resp.err_value = ret;
+		n = sendto(pdump_socket_fd, &resp,
+				sizeof(struct pdump_response),
+				0, (struct sockaddr *)&cli_addr, cli_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to send to client:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+		}
+	}
+}
+
+int
+rte_pdump_init(const char *path)
+{
+	int ret = 0;
+	char thread_name[RTE_MAX_THREAD_NAME_LEN];
+
+	ret = rte_pdump_set_socket_dir(path, RTE_PDUMP_SOCKET_SERVER);
+	if (ret != 0)
+		return -1;
+
+	ret = pdump_create_server_socket();
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create server socket:%s:%d\n",
+			__func__, __LINE__);
+		return -1;
+	}
+
+	/* create the host thread to wait/handle pdump requests */
+	ret = pthread_create(&pdump_thread, NULL, pdump_thread_main, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create the pdump thread:%s, %s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+	/* Set thread_name for aid in debugging. */
+	snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN, "pdump-thread");
+	ret = rte_thread_setname(pdump_thread, thread_name);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, PDUMP,
+			"Failed to set thread name for pdump handling\n");
+	}
+
+	return 0;
+}
+
+int
+rte_pdump_uninit(void)
+{
+	int ret;
+
+	ret = pthread_cancel(pdump_thread);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to cancel the pdump thread:%s, %s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+
+	ret = close(pdump_socket_fd);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to close server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	struct sockaddr_un addr;
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+				RTE_PDUMP_SOCKET_SERVER);
+	ret = unlink(addr.sun_path);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to remove server socket addr: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_create_client_socket(struct pdump_request *p)
+{
+	int ret, socket_fd;
+	int pid;
+	int n;
+	struct pdump_response server_resp;
+	struct sockaddr_un addr, serv_addr, from;
+	socklen_t addr_len, serv_len;
+
+	pid = getpid();
+
+	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
+	if (socket_fd < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"client socket(): %s:pid(%d):tid(%u), %s:%d\n",
+			strerror(errno), pid, rte_sys_gettid(),
+			__func__, __LINE__);
+		ret = -errno;
+		return ret;
+	}
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+				RTE_PDUMP_SOCKET_CLIENT);
+	addr.sun_family = AF_UNIX;
+	addr_len = sizeof(struct sockaddr_un);
+
+	do {
+		ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
+		if (ret) {
+			RTE_LOG(ERR, PDUMP,
+				"client bind(): %s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = -errno;
+			break;
+		}
+
+		serv_len = sizeof(struct sockaddr_un);
+		memset(&serv_addr, 0, sizeof(serv_addr));
+		pdump_get_socket_path(serv_addr.sun_path,
+					sizeof(serv_addr.sun_path),
+					RTE_PDUMP_SOCKET_SERVER);
+		serv_addr.sun_family = AF_UNIX;
+
+		n =  sendto(socket_fd, p, sizeof(struct pdump_request), 0,
+				(struct sockaddr *)&serv_addr, serv_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to send to server:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = -errno;
+			break;
+		}
+
+		n = recvfrom(socket_fd, &server_resp,
+				sizeof(struct pdump_response), 0,
+				(struct sockaddr *)&from, &serv_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to recv from server:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = -errno;
+			break;
+		}
+		ret = server_resp.err_value;
+	} while (0);
+
+	close(socket_fd);
+	unlink(addr.sun_path);
+	return ret;
+}
+
+static int
+pdump_validate_ring_mp(struct rte_ring *ring, struct rte_mempool *mp)
+{
+	if (ring == NULL || mp == NULL) {
+		RTE_LOG(ERR, PDUMP, "NULL ring or mempool are passed %s:%d\n",
+			__func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+	if (mp->flags & MEMPOOL_F_SP_PUT || mp->flags & MEMPOOL_F_SC_GET) {
+		RTE_LOG(ERR, PDUMP, "mempool with either SP or SC settings"
+		" is not valid for pdump, should have MP and MC settings\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+	if (ring->prod.sp_enqueue || ring->cons.sc_dequeue) {
+		RTE_LOG(ERR, PDUMP, "ring with either SP or SC settings"
+		" is not valid for pdump, should have MP and MC settings\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_validate_flags(uint32_t flags)
+{
+	if (flags != RTE_PDUMP_FLAG_RX && flags != RTE_PDUMP_FLAG_TX &&
+		flags != RTE_PDUMP_FLAG_RXTX) {
+		RTE_LOG(ERR, PDUMP,
+			"invalid flags, should be either rx/tx/rxtx\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_validate_port(uint8_t port, char *name)
+{
+	int ret = 0;
+
+	if (port >= RTE_MAX_ETHPORTS) {
+		RTE_LOG(ERR, PDUMP, "Invalid port id %u, %s:%d\n", port,
+			__func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = rte_eth_dev_get_name_by_port(port, name);
+	if (ret < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"port id to name mapping failed for port id=%u, %s:%d\n",
+			port, __func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_prepare_client_request(char *device, uint16_t queue,
+				uint32_t flags,
+				uint16_t operation,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter)
+{
+	int ret;
+	struct pdump_request req = {.ver = 1,};
+
+	req.flags = flags;
+	req.op =  operation;
+	if ((operation & ENABLE) != 0) {
+		snprintf(req.data.en_v1.device, DEVICE_ID_SIZE, "%s", device);
+		req.data.en_v1.queue = queue;
+		req.data.en_v1.ring = ring;
+		req.data.en_v1.mp = mp;
+		req.data.en_v1.filter = filter;
+	} else {
+		snprintf(req.data.dis_v1.device, DEVICE_ID_SIZE, "%s", device);
+		req.data.dis_v1.queue = queue;
+		req.data.dis_v1.ring = NULL;
+		req.data.dis_v1.mp = NULL;
+		req.data.dis_v1.filter = NULL;
+	}
+
+	ret = pdump_create_client_socket(&req);
+	if (ret < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"client request for pdump enable/disable failed\n");
+		rte_errno = -ret;
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
+			struct rte_ring *ring,
+			struct rte_mempool *mp,
+			void *filter)
+{
+
+	int ret = 0;
+	char name[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_port(port, name);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_ring_mp(ring, mp);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	ret = pdump_prepare_client_request(name, queue, flags,
+						ENABLE, ring, mp, filter);
+
+	return ret;
+}
+
+int
+rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter)
+{
+	int ret = 0;
+	char domBDF[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_ring_mp(ring, mp);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
+		ret = pdump_prepare_client_request(domBDF, queue, flags,
+						ENABLE, ring, mp, filter);
+	else
+		ret = pdump_prepare_client_request(device_id, queue, flags,
+						ENABLE, ring, mp, filter);
+
+	return ret;
+}
+
+int
+rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags)
+{
+	int ret = 0;
+	char name[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_port(port, name);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	ret = pdump_prepare_client_request(name, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+
+	return ret;
+}
+
+int
+rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags)
+{
+	int ret = 0;
+	char domBDF[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
+		ret = pdump_prepare_client_request(domBDF, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+	else
+		ret = pdump_prepare_client_request(device_id, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+
+	return ret;
+}
+
+int
+rte_pdump_set_socket_dir(const char *path, enum rte_pdump_socktype type)
+{
+	int ret, count;
+
+	if (path != NULL) {
+		if (type == RTE_PDUMP_SOCKET_SERVER) {
+			count = sizeof(server_socket_dir);
+			ret = snprintf(server_socket_dir, count, "%s", path);
+		} else {
+			count = sizeof(client_socket_dir);
+			ret = snprintf(client_socket_dir, count, "%s", path);
+		}
+
+		if (ret < 0  || ret >= count) {
+			RTE_LOG(ERR, PDUMP,
+					"Invalid socket path:%s:%d\n",
+					__func__, __LINE__);
+			if (type == RTE_PDUMP_SOCKET_SERVER)
+				server_socket_dir[0] = 0;
+			else
+				client_socket_dir[0] = 0;
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
diff --git a/lib/librte_pdump/rte_pdump.h b/lib/librte_pdump/rte_pdump.h
new file mode 100644
index 0000000..b5f4e2f
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump.h
@@ -0,0 +1,216 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PDUMP_H_
+#define _RTE_PDUMP_H_
+
+/**
+ * @file
+ * RTE pdump
+ *
+ * packet dump library to provide packet capturing support on dpdk.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_PDUMP_ALL_QUEUES UINT16_MAX
+
+enum {
+	RTE_PDUMP_FLAG_RX = 1,  /* receive direction */
+	RTE_PDUMP_FLAG_TX = 2,  /* transmit direction */
+	/* both receive and transmit directions */
+	RTE_PDUMP_FLAG_RXTX = (RTE_PDUMP_FLAG_RX|RTE_PDUMP_FLAG_TX)
+};
+
+enum rte_pdump_socktype {
+	RTE_PDUMP_SOCKET_SERVER = 1,
+	RTE_PDUMP_SOCKET_CLIENT = 2
+};
+
+/**
+ * Initialize packet capturing handling
+ *
+ * Creates pthread and server socket for handling clients
+ * requests to enable/disable rxtx callbacks.
+ *
+ * @param path
+ * directory path for server socket.
+ *
+ * @return
+ *    0 on success, -1 on error
+ */
+int
+rte_pdump_init(const char *path);
+
+/**
+ * Uninitialize packet capturing handling
+ *
+ * Cancels pthread, closes server socket, removes server socket address.
+ *
+ * @return
+ *    0 on success, -1 on error
+ */
+int
+rte_pdump_uninit(void);
+
+/**
+ * Enables packet capturing on given port and queue.
+ *
+ * @param port
+ *  port on which packet capturing should be enabled.
+ * @param queue
+ *  queue of a given port on which packet capturing should be enabled.
+ *  users should pass on value UINT16_MAX to enable packet capturing on all
+ *  queues of a given port.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be enabled for a given port and queue.
+ * @param ring
+ *  ring on which captured packets will be enqueued for user.
+ * @param mp
+ *  mempool on to which original packets will be mirrored or duplicated.
+ * @param filter
+ *  place holder for packet filtering.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
+		struct rte_ring *ring,
+		struct rte_mempool *mp,
+		void *filter);
+
+/**
+ * Disables packet capturing on given port and queue.
+ *
+ * @param port
+ *  port on which packet capturing should be disabled.
+ * @param queue
+ *  queue of a given port on which packet capturing should be disabled.
+ *  users should pass on value UINT16_MAX to disable packet capturing on all
+ *  queues of a given port.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be disabled for a given port and queue.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags);
+
+/**
+ * Enables packet capturing on given device id and queue.
+ * device_id can be name or pci address of device.
+ *
+ * @param device_id
+ *  device id on which packet capturing should be enabled.
+ * @param queue
+ *  queue of a given device id on which packet capturing should be enabled.
+ *  users should pass on value UINT16_MAX to enable packet capturing on all
+ *  queues of a given device id.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be enabled for a given port and queue.
+ * @param ring
+ *  ring on which captured packets will be enqueued for user.
+ * @param mp
+ *  mempool on to which original packets will be mirrored or duplicated.
+ * @param filter
+ *  place holder for packet filtering.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter);
+
+/**
+ * Disables packet capturing on given device_id and queue.
+ * device_id can be name or pci address of device.
+ *
+ * @param device_id
+ *  pci address or name of the device on which packet capturing
+ *  should be disabled.
+ * @param queue
+ *  queue of a given device on which packet capturing should be disabled.
+ *  users should pass on value UINT16_MAX to disable packet capturing on all
+ *  queues of a given device id.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be disabled for a given device and queue.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+int
+rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags);
+
+/**
+ * Allows applications to set server and client socket paths.
+ * If the specified path is NULL, the default path will be selected, i.e.
+ * "/var/run/" for root user and "$HOME" for non root user.
+ * Clients also need to call this API to set their server path if the
+ * server path is different from default path.
+ * This API is not thread-safe.
+ *
+ * @param path
+ * directory path for server or client socket.
+ * @param type
+ * specifies RTE_PDUMP_SOCKET_SERVER if socket path is for server.
+ * (or)
+ * specifies RTE_PDUMP_SOCKET_CLIENT if socket path is for client.
+ *
+ * @return
+ * 0 on success, -EINVAL on error
+ *
+ */
+int
+rte_pdump_set_socket_dir(const char *path, enum rte_pdump_socktype type);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PDUMP_H_ */
diff --git a/lib/librte_pdump/rte_pdump_version.map b/lib/librte_pdump/rte_pdump_version.map
new file mode 100644
index 0000000..edec99a
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump_version.map
@@ -0,0 +1,13 @@
+DPDK_16.07 {
+	global:
+
+	rte_pdump_disable;
+	rte_pdump_disable_by_deviceid;
+	rte_pdump_enable;
+	rte_pdump_enable_by_deviceid;
+	rte_pdump_init;
+	rte_pdump_set_socket_dir;
+	rte_pdump_uninit;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index e9969fc..f894669 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -78,6 +78,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ACL)            += -lrte_acl
 _LDLIBS-$(CONFIG_RTE_LIBRTE_JOBSTATS)       += -lrte_jobstats
 _LDLIBS-$(CONFIG_RTE_LIBRTE_POWER)          += -lrte_power
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 
 _LDLIBS-y += --whole-archive
 
-- 
2.5.0


* [PATCH v9 6/8] app/pdump: add pdump tool for packet capturing
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
                       ` (4 preceding siblings ...)
  2016-06-14  9:38     ` [PATCH v9 5/8] pdump: add new library for packet capturing support Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14 19:56       ` Thomas Monjalon
  2016-06-14  9:38     ` [PATCH v9 7/8] app/testpmd: add pdump initialization uninitialization Reshma Pattan
                       ` (2 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Add a new tool for packet capturing on DPDK.
The tool is configured through command line options
and runs as a secondary process by default.

Command line supports various parameters to capture
the packets.

Users should pass a) port and queue, or b) PCI address
and queue, or c) device name and queue to capture
the packets.

Users also need to pass either a pcap file name or
a Linux interface name, to which packets captured from DPDK
ports will be sent so that they can be viewed using tcpdump.

Users can capture packets a) in the Rx direction,
b) in the Tx direction, or c) in both
directions.
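
As a rough illustration of the direction selection described above, here
is a standalone sketch (not code from this patch). The flag values mirror
the RTE_PDUMP_FLAG_* enum in rte_pdump.h; select_dir() and its arguments
are hypothetical names used only for this example.

```c
#include <stdint.h>

/* Values mirror the RTE_PDUMP_FLAG_* enum in rte_pdump.h. */
enum {
	FLAG_RX = 1,	/* receive direction */
	FLAG_TX = 2,	/* transmit direction */
	FLAG_RXTX = (FLAG_RX | FLAG_TX)
};

/*
 * Hypothetical helper: derive the capture direction from which of the
 * rx-dev/tx-dev arguments were given. Returns 0 if neither was given,
 * which the caller should treat as an error.
 */
static uint32_t
select_dir(int have_rx_dev, int have_tx_dev)
{
	uint32_t dir = 0;

	if (have_rx_dev)
		dir |= FLAG_RX;
	if (have_tx_dev)
		dir |= FLAG_TX;

	return dir;
}
```

Because the flags are bit values, a later "dir & FLAG_RX" test is true for
both the Rx-only and the Rx+Tx case, which is how the tool decides which
rings to poll.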

Users can pass ring size and mempool parameters on the
command line, but these are optional.
They are used to create the ring and mempool objects for packet
mirroring from the primary application to the tool. If the user
doesn't provide any values, default values are used internally
for the creation of the ring and mempool.
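
The handling of an optional ring size could look roughly like the sketch
below. It is a standalone illustration under stated assumptions, not the
code in this patch: parse_ring_size(), is_pow2() and the macro names are
hypothetical, while the default (16384), the lower bound (2), the upper
bound (0x0fffffff - 1) and the power-of-2 requirement are taken from the
tool's source.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_RING_SIZE 16384u		/* default used by the tool */
#define RING_SIZE_MIN 2u
#define RING_SIZE_MAX (0x0fffffffu - 1)		/* ring size mask - 1 */

/* true if x is a non-zero power of 2 */
static int
is_pow2(unsigned long x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical helper: parse an optional ring-size string.
 * A NULL value means the option was not given and selects the default.
 * Returns 0 and stores the size in *out, or -EINVAL on bad input.
 */
static int
parse_ring_size(const char *value, uint32_t *out)
{
	char *end;
	unsigned long t;

	if (value == NULL) {
		*out = DEFAULT_RING_SIZE;
		return 0;
	}

	errno = 0;
	t = strtoul(value, &end, 10);
	/* reject overflow, empty input and trailing garbage */
	if (errno != 0 || end == value || *end != '\0')
		return -EINVAL;
	if (t < RING_SIZE_MIN || t > RING_SIZE_MAX || !is_pow2(t))
		return -EINVAL;

	*out = (uint32_t)t;
	return 0;
}
```

Storing the result in a 32-bit variable matters here, since the allowed
maximum does not fit in 16 bits.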

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 MAINTAINERS        |   1 +
 app/Makefile       |   1 +
 app/pdump/Makefile |  45 +++
 app/pdump/main.c   | 844 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 891 insertions(+)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index a500cf4..c46cf86 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -436,6 +436,7 @@ F: doc/guides/sample_app_ug/packet_ordering.rst
 Pdump
 M: Reshma Pattan <reshma.pattan@intel.com>
 F: lib/librte_pdump/
+F: app/pdump/
 
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
diff --git a/app/Makefile b/app/Makefile
index 1151e09..c593efa 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -37,5 +37,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += proc_info
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += pdump
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/pdump/Makefile b/app/pdump/Makefile
new file mode 100644
index 0000000..96bb4af
--- /dev/null
+++ b/app/pdump/Makefile
@@ -0,0 +1,45 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+APP = dpdk_pdump
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+
+SRCS-y := main.c
+
+# this application needs libraries first
+DEPDIRS-y += lib
+
+include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/pdump/main.c b/app/pdump/main.c
new file mode 100644
index 0000000..f8923b9
--- /dev/null
+++ b/app/pdump/main.c
@@ -0,0 +1,844 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdlib.h>
+#include <getopt.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <net/if.h>
+
+#include <rte_eal.h>
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_ethdev.h>
+#include <rte_memory.h>
+#include <rte_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_errno.h>
+#include <rte_dev.h>
+#include <rte_kvargs.h>
+#include <rte_mempool.h>
+#include <rte_ring.h>
+#include <rte_pdump.h>
+
+#define PDUMP_PORT_ARG "port"
+#define PDUMP_PCI_ARG "device_id"
+#define PDUMP_QUEUE_ARG "queue"
+#define PDUMP_DIR_ARG "dir"
+#define PDUMP_RX_DEV_ARG "rx-dev"
+#define PDUMP_TX_DEV_ARG "tx-dev"
+#define PDUMP_RING_SIZE_ARG "ring-size"
+#define PDUMP_MSIZE_ARG "mbuf-size"
+#define PDUMP_NUM_MBUFS_ARG "total-num-mbufs"
+
+#define VDEV_PCAP "eth_pcap_%s_%d,tx_pcap=%s"
+#define VDEV_IFACE "eth_pcap_%s_%d,tx_iface=%s"
+#define TX_STREAM_SIZE 64
+
+#define MP_NAME "pdump_pool_%d"
+
+#define RX_RING "rx_ring_%d"
+#define TX_RING "tx_ring_%d"
+
+#define RX_STR "rx"
+#define TX_STR "tx"
+
+/* Maximum number of pdump tuples (size of the pdump_t table). */
+#define APP_ARG_TCPDUMP_MAX_TUPLES 54
+#define MBUF_POOL_CACHE_SIZE 250
+#define TX_DESC_PER_QUEUE 512
+#define RX_DESC_PER_QUEUE 128
+#define MBUFS_PER_POOL 65535
+#define MAX_LONG_OPT_SZ 64
+#define RING_SIZE 16384
+#define SIZE 256
+#define BURST_SIZE 32
+#define NUM_VDEVS 2
+
+#define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
+/* true if x is a power of 2 */
+#define POWEROF2(x) ((((x)-1) & (x)) == 0)
+
+enum pdump_en_dis {
+	DISABLE = 1,
+	ENABLE = 2
+};
+
+enum pcap_stream {
+	IFACE = 1,
+	PCAP = 2
+};
+
+enum pdump_by {
+	PORT_ID = 1,
+	DEVICE_ID = 2
+};
+
+const char *valid_pdump_arguments[] = {
+	PDUMP_PORT_ARG,
+	PDUMP_PCI_ARG,
+	PDUMP_QUEUE_ARG,
+	PDUMP_DIR_ARG,
+	PDUMP_RX_DEV_ARG,
+	PDUMP_TX_DEV_ARG,
+	PDUMP_RING_SIZE_ARG,
+	PDUMP_MSIZE_ARG,
+	PDUMP_NUM_MBUFS_ARG,
+	NULL
+};
+
+struct pdump_stats {
+	uint64_t dequeue_pkts;
+	uint64_t tx_pkts;
+	uint64_t freed_pkts;
+};
+
+struct pdump_tuples {
+	/* cli params */
+	uint8_t port;
+	char *device_id;
+	uint16_t queue;
+	char rx_dev[TX_STREAM_SIZE];
+	char tx_dev[TX_STREAM_SIZE];
+	uint32_t ring_size;
+	uint16_t mbuf_data_size;
+	uint32_t total_num_mbufs;
+
+	/* params for library API call */
+	uint32_t dir;
+	struct rte_mempool *mp;
+	struct rte_ring *rx_ring;
+	struct rte_ring *tx_ring;
+
+	/* params for packet dumping */
+	enum pdump_by dump_by_type;
+	int rx_vdev_id;
+	int tx_vdev_id;
+	enum pcap_stream rx_vdev_stream_type;
+	enum pcap_stream tx_vdev_stream_type;
+	bool single_pdump_dev;
+
+	/* stats */
+	struct pdump_stats stats;
+} __rte_cache_aligned;
+static struct pdump_tuples pdump_t[APP_ARG_TCPDUMP_MAX_TUPLES];
+
+struct parse_val {
+	uint64_t min;
+	uint64_t max;
+	uint64_t val;
+};
+
+int num_tuples;
+static struct rte_eth_conf port_conf_default;
+volatile uint8_t quit_signal;
+
+/* display usage */
+static void
+pdump_usage(const char *prgname)
+{
+	printf("usage: %s [EAL options] -- --pdump "
+			"'(port=<port id> | device_id=<pci id or vdev name>),"
+			"(queue=<queue_id>),"
+			"(rx-dev=<iface or pcap file> |"
+			" tx-dev=<iface or pcap file>),"
+			"[ring-size=<ring size>default:16384],"
+			"[mbuf-size=<mbuf data size>default:2176],"
+			"[total-num-mbufs=<number of mbufs>default:65535]"
+			"'\n",
+			prgname);
+}
+
+static int
+parse_device_id(const char *key __rte_unused, const char *value,
+		void *extra_args)
+{
+	struct pdump_tuples *pt = extra_args;
+
+	pt->device_id = strdup(value);
+	pt->dump_by_type = DEVICE_ID;
+
+	return 0;
+}
+
+static int
+parse_queue(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long n;
+	struct pdump_tuples *pt = extra_args;
+
+	if (!strcmp(value, "*"))
+		pt->queue = RTE_PDUMP_ALL_QUEUES;
+	else {
+		n = strtoul(value, NULL, 10);
+		pt->queue = (uint16_t) n;
+	}
+	return 0;
+}
+
+static int
+parse_rxtxdev(const char *key, const char *value, void *extra_args)
+{
+
+	struct pdump_tuples *pt = extra_args;
+
+	if (!strcmp(key, PDUMP_RX_DEV_ARG)) {
+		snprintf(pt->rx_dev, sizeof(pt->rx_dev), "%s", value);
+		/* identify the rx stream type for pcap vdev */
+		if (if_nametoindex(pt->rx_dev))
+			pt->rx_vdev_stream_type = IFACE;
+	} else if (!strcmp(key, PDUMP_TX_DEV_ARG)) {
+		snprintf(pt->tx_dev, sizeof(pt->tx_dev), "%s", value);
+		/* identify the tx stream type for pcap vdev */
+		if (if_nametoindex(pt->tx_dev))
+			pt->tx_vdev_stream_type = IFACE;
+	} else {
+		printf("invalid dev type %s, must be rx-dev or tx-dev\n", key);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+parse_uint_value(const char *key, const char *value, void *extra_args)
+{
+	struct parse_val *v;
+	unsigned long t;
+	char *end;
+	int ret = 0;
+
+	errno = 0;
+	v = extra_args;
+	t = strtoul(value, &end, 10);
+
+	if (errno != 0 || end[0] != 0 || t < v->min || t > v->max) {
+		printf("invalid value:\"%s\" for key:\"%s\", "
+			"value must be >= %"PRIu64" and <= %"PRIu64"\n",
+			value, key, v->min, v->max);
+		ret = -EINVAL;
+	}
+	if (!strcmp(key, PDUMP_RING_SIZE_ARG) && !POWEROF2(t)) {
+		printf("invalid value:\"%s\" for key:\"%s\", "
+			"value must be power of 2\n", value, key);
+		ret = -EINVAL;
+	}
+
+	if (ret != 0)
+		return ret;
+
+	v->val = t;
+	return 0;
+}
+
+static int
+parse_pdump(const char *optarg)
+{
+	struct rte_kvargs *kvlist;
+	int ret = 0, cnt1, cnt2;
+	struct pdump_tuples *pt;
+	struct parse_val v = {0};
+
+	pt = &pdump_t[num_tuples];
+
+	/* initial check for invalid arguments */
+	kvlist = rte_kvargs_parse(optarg, valid_pdump_arguments);
+	if (kvlist == NULL) {
+		printf("--pdump=\"%s\": invalid argument passed\n", optarg);
+		return -1;
+	}
+
+	/* port/device_id parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_PORT_ARG);
+	cnt2 = rte_kvargs_count(kvlist, PDUMP_PCI_ARG);
+	if (!((cnt1 == 1 && cnt2 == 0) || (cnt1 == 0 && cnt2 == 1))) {
+		printf("--pdump=\"%s\": must have either port or "
+			"device_id argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	} else if (cnt1 == 1) {
+		v.min = 0;
+		v.max = RTE_MAX_ETHPORTS-1;
+		ret = rte_kvargs_process(kvlist, PDUMP_PORT_ARG,
+				&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->port = (uint8_t) v.val;
+		pt->dump_by_type = PORT_ID;
+	} else if (cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_PCI_ARG,
+				&parse_device_id, pt);
+		if (ret < 0)
+			goto free_kvlist;
+	}
+
+	/* queue parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_QUEUE_ARG);
+	if (cnt1 != 1) {
+		printf("--pdump=\"%s\": must have queue argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	}
+	ret = rte_kvargs_process(kvlist, PDUMP_QUEUE_ARG, &parse_queue, pt);
+	if (ret < 0)
+		goto free_kvlist;
+
+	/* rx-dev and tx-dev parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_RX_DEV_ARG);
+	cnt2 = rte_kvargs_count(kvlist, PDUMP_TX_DEV_ARG);
+	if (cnt1 == 0 && cnt2 == 0) {
+		printf("--pdump=\"%s\": must have either rx-dev or "
+			"tx-dev argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	} else if (cnt1 == 1 && cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_RX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		ret = rte_kvargs_process(kvlist, PDUMP_TX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		/* if captured packets have to be sent to the same vdev */
+		if (!strcmp(pt->rx_dev, pt->tx_dev))
+			pt->single_pdump_dev = true;
+		pt->dir = RTE_PDUMP_FLAG_RXTX;
+	} else if (cnt1 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_RX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->dir = RTE_PDUMP_FLAG_RX;
+	} else if (cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_TX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->dir = RTE_PDUMP_FLAG_TX;
+	}
+
+	/* optional */
+	/* ring_size parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_RING_SIZE_ARG);
+	if (cnt1 == 1) {
+		v.min = 2;
+		v.max = RTE_RING_SZ_MASK-1;
+		ret = rte_kvargs_process(kvlist, PDUMP_RING_SIZE_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->ring_size = (uint32_t) v.val;
+	} else
+		pt->ring_size = RING_SIZE;
+
+	/* mbuf_data_size parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_MSIZE_ARG);
+	if (cnt1 == 1) {
+		v.min = 1;
+		v.max = UINT16_MAX;
+		ret = rte_kvargs_process(kvlist, PDUMP_MSIZE_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->mbuf_data_size = (uint16_t) v.val;
+	} else
+		pt->mbuf_data_size = RTE_MBUF_DEFAULT_BUF_SIZE;
+
+	/* total_num_mbufs parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_NUM_MBUFS_ARG);
+	if (cnt1 == 1) {
+		v.min = 1025;
+		v.max = UINT16_MAX;
+		ret = rte_kvargs_process(kvlist, PDUMP_NUM_MBUFS_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->total_num_mbufs = (uint16_t) v.val;
+	} else
+		pt->total_num_mbufs = MBUFS_PER_POOL;
+
+	num_tuples++;
+
+free_kvlist:
+	rte_kvargs_free(kvlist);
+	return ret;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+launch_args_parse(int argc, char **argv, char *prgname)
+{
+	int opt, ret;
+	int option_index;
+	static struct option long_option[] = {
+		{"pdump", 1, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+
+	if (argc == 1)
+		pdump_usage(prgname);
+
+	/* Parse command line */
+	while ((opt = getopt_long(argc, argv, " ",
+			long_option, &option_index)) != EOF) {
+		switch (opt) {
+		case 0:
+			if (!strncmp(long_option[option_index].name, "pdump",
+					MAX_LONG_OPT_SZ)) {
+				ret = parse_pdump(optarg);
+				if (ret) {
+					pdump_usage(prgname);
+					return -1;
+				}
+			}
+			break;
+		default:
+			pdump_usage(prgname);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+print_pdump_stats(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	for (i = 0; i < num_tuples; i++) {
+		printf("##### PDUMP DEBUG STATS #####\n");
+		pt = &pdump_t[i];
+		printf(" -packets dequeued:			%"PRIu64"\n",
+							pt->stats.dequeue_pkts);
+		printf(" -packets transmitted to vdev:		%"PRIu64"\n",
+							pt->stats.tx_pkts);
+		printf(" -packets freed:			%"PRIu64"\n",
+							pt->stats.freed_pkts);
+	}
+}
+
+static inline void
+disable_pdump(struct pdump_tuples *pt)
+{
+	if (pt->dump_by_type == DEVICE_ID)
+		rte_pdump_disable_by_deviceid(pt->device_id, pt->queue,
+						pt->dir);
+	else if (pt->dump_by_type == PORT_ID)
+		rte_pdump_disable(pt->port, pt->queue, pt->dir);
+}
+
+static inline void
+pdump_rxtx(struct rte_ring *ring, uint8_t vdev_id, struct pdump_stats *stats)
+{
+	/* write input packets of port to vdev for pdump */
+	struct rte_mbuf *rxtx_bufs[BURST_SIZE];
+
+	/* first dequeue packets from ring of primary process */
+	const uint16_t nb_in_deq = rte_ring_dequeue_burst(ring,
+			(void *)rxtx_bufs, BURST_SIZE);
+	stats->dequeue_pkts += nb_in_deq;
+
+	if (nb_in_deq) {
+		/* then send on vdev */
+		uint16_t nb_in_txd = rte_eth_tx_burst(
+				vdev_id,
+				0, rxtx_bufs, nb_in_deq);
+		stats->tx_pkts += nb_in_txd;
+
+		if (unlikely(nb_in_txd < nb_in_deq)) {
+			do {
+				rte_pktmbuf_free(rxtx_bufs[nb_in_txd]);
+				stats->freed_pkts++;
+			} while (++nb_in_txd < nb_in_deq);
+		}
+	}
+}
+
+static void
+free_ring_data(struct rte_ring *ring, uint8_t vdev_id,
+		struct pdump_stats *stats)
+{
+	while (rte_ring_count(ring))
+		pdump_rxtx(ring, vdev_id, stats);
+}
+
+static void
+cleanup_pdump_resources(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	/* disable pdump and free the pdump_tuple resources */
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+
+		/* remove callbacks */
+		disable_pdump(pt);
+
+		/*
+		 * transmit rest of the enqueued packets of the rings on to
+		 * the vdev, in order to release mbufs to the mempool.
+		 */
+		if (pt->dir & RTE_PDUMP_FLAG_RX)
+			free_ring_data(pt->rx_ring, pt->rx_vdev_id, &pt->stats);
+		if (pt->dir & RTE_PDUMP_FLAG_TX)
+			free_ring_data(pt->tx_ring, pt->tx_vdev_id, &pt->stats);
+
+		if (pt->device_id)
+			free(pt->device_id);
+
+		/* free the rings */
+		if (pt->rx_ring)
+			rte_ring_free(pt->rx_ring);
+		if (pt->tx_ring)
+			rte_ring_free(pt->tx_ring);
+	}
+}
+
+static void
+signal_handler(int sig_num)
+{
+	if (sig_num == SIGINT) {
+		printf("\n\nSignal %d received, preparing to exit...\n",
+				sig_num);
+		quit_signal = 1;
+	}
+}
+
+static inline int
+configure_vdev(uint8_t port_id)
+{
+	struct ether_addr addr;
+	const uint16_t rxRings = 0, txRings = 1;
+	const uint8_t nb_ports = rte_eth_dev_count();
+	int ret;
+	uint16_t q;
+
+	if (port_id >= nb_ports)
+		return -1;
+
+	ret = rte_eth_dev_configure(port_id, rxRings, txRings,
+					&port_conf_default);
+	if (ret != 0)
+		rte_exit(EXIT_FAILURE, "dev config failed\n");
+
+	for (q = 0; q < txRings; q++) {
+		ret = rte_eth_tx_queue_setup(port_id, q, TX_DESC_PER_QUEUE,
+				rte_eth_dev_socket_id(port_id), NULL);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "queue setup failed\n");
+	}
+
+	ret = rte_eth_dev_start(port_id);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "dev start failed\n");
+
+	rte_eth_macaddr_get(port_id, &addr);
+	printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
+			" %02"PRIx8" %02"PRIx8" %02"PRIx8"\n",
+			(unsigned)port_id,
+			addr.addr_bytes[0], addr.addr_bytes[1],
+			addr.addr_bytes[2], addr.addr_bytes[3],
+			addr.addr_bytes[4], addr.addr_bytes[5]);
+
+	rte_eth_promiscuous_enable(port_id);
+
+	return 0;
+}
+
+static void
+create_mp_ring_vdev(void)
+{
+	int i;
+	uint8_t portid;
+	struct pdump_tuples *pt = NULL;
+	struct rte_mempool *mbuf_pool = NULL;
+	char vdev_args[SIZE];
+	char ring_name[SIZE];
+	char mempool_name[SIZE];
+
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+		snprintf(mempool_name, SIZE, MP_NAME, i);
+		mbuf_pool = rte_mempool_lookup(mempool_name);
+		if (mbuf_pool == NULL) {
+			/* create mempool */
+			mbuf_pool = rte_pktmbuf_pool_create(mempool_name,
+					pt->total_num_mbufs,
+					MBUF_POOL_CACHE_SIZE, 0,
+					pt->mbuf_data_size,
+					rte_socket_id());
+			if (mbuf_pool == NULL)
+				rte_exit(EXIT_FAILURE,
+					"Mempool creation failed: %s\n",
+					rte_strerror(rte_errno));
+		}
+		pt->mp = mbuf_pool;
+
+		if (pt->dir == RTE_PDUMP_FLAG_RXTX) {
+			/* both directions: create rx and tx rings and vdevs */
+			/* create rx_ring */
+			snprintf(ring_name, SIZE, RX_RING, i);
+			pt->rx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->rx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s:%s:%d\n",
+						rte_strerror(rte_errno),
+						__func__, __LINE__);
+
+			/* create tx_ring */
+			snprintf(ring_name, SIZE, TX_RING, i);
+			pt->tx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->tx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s:%s:%d\n",
+						rte_strerror(rte_errno),
+						__func__, __LINE__);
+
+			/* create vdevs */
+			(pt->rx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, RX_STR, i,
+			pt->rx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, RX_STR, i,
+			pt->rx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed:%s:%d\n",
+					__func__, __LINE__);
+			pt->rx_vdev_id = portid;
+
+			/* configure vdev */
+			configure_vdev(pt->rx_vdev_id);
+
+			if (pt->single_pdump_dev)
+				pt->tx_vdev_id = portid;
+			else {
+				(pt->tx_vdev_stream_type == IFACE) ?
+				snprintf(vdev_args, SIZE, VDEV_IFACE, TX_STR, i,
+				pt->tx_dev) :
+				snprintf(vdev_args, SIZE, VDEV_PCAP, TX_STR, i,
+				pt->tx_dev);
+				if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+					rte_exit(EXIT_FAILURE,
+						"vdev creation failed:"
+						"%s:%d\n", __func__, __LINE__);
+				pt->tx_vdev_id = portid;
+
+				/* configure vdev */
+				configure_vdev(pt->tx_vdev_id);
+			}
+		} else if (pt->dir == RTE_PDUMP_FLAG_RX) {
+
+			/* create rx_ring */
+			snprintf(ring_name, SIZE, RX_RING, i);
+			pt->rx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->rx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s\n",
+					rte_strerror(rte_errno));
+
+			(pt->rx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, RX_STR, i,
+				pt->rx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, RX_STR, i,
+				pt->rx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed:%s:%d\n",
+					__func__, __LINE__);
+			pt->rx_vdev_id = portid;
+			/* configure vdev */
+			configure_vdev(pt->rx_vdev_id);
+		} else if (pt->dir == RTE_PDUMP_FLAG_TX) {
+
+			/* create tx_ring */
+			snprintf(ring_name, SIZE, TX_RING, i);
+			pt->tx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->tx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s\n",
+					rte_strerror(rte_errno));
+
+			(pt->tx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, TX_STR, i,
+				pt->tx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, TX_STR, i,
+				pt->tx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed\n");
+			pt->tx_vdev_id = portid;
+
+			/* configure vdev */
+			configure_vdev(pt->tx_vdev_id);
+		}
+	}
+}
+
+static void
+enable_pdump(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+	int ret = 0, ret1 = 0;
+
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+		if (pt->dir == RTE_PDUMP_FLAG_RXTX) {
+			if (pt->dump_by_type == DEVICE_ID) {
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						RTE_PDUMP_FLAG_RX,
+						pt->rx_ring,
+						pt->mp, NULL);
+				ret1 = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						RTE_PDUMP_FLAG_TX,
+						pt->tx_ring,
+						pt->mp, NULL);
+			} else if (pt->dump_by_type == PORT_ID) {
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						RTE_PDUMP_FLAG_RX,
+						pt->rx_ring, pt->mp, NULL);
+				ret1 = rte_pdump_enable(pt->port, pt->queue,
+						RTE_PDUMP_FLAG_TX,
+						pt->tx_ring, pt->mp, NULL);
+			}
+		} else if (pt->dir == RTE_PDUMP_FLAG_RX) {
+			if (pt->dump_by_type == DEVICE_ID)
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						pt->dir, pt->rx_ring,
+						pt->mp, NULL);
+			else if (pt->dump_by_type == PORT_ID)
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						pt->dir,
+						pt->rx_ring, pt->mp, NULL);
+		} else if (pt->dir == RTE_PDUMP_FLAG_TX) {
+			if (pt->dump_by_type == DEVICE_ID)
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						pt->dir,
+						pt->tx_ring, pt->mp, NULL);
+			else if (pt->dump_by_type == PORT_ID)
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						pt->dir,
+						pt->tx_ring, pt->mp, NULL);
+		}
+		if (ret < 0 || ret1 < 0) {
+			cleanup_pdump_resources();
+			rte_exit(EXIT_FAILURE, "%s\n", rte_strerror(rte_errno));
+		}
+	}
+}
+
+static inline void
+dump_packets(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	while (!quit_signal) {
+		for (i = 0; i < num_tuples; i++) {
+			pt = &pdump_t[i];
+			if (pt->dir & RTE_PDUMP_FLAG_RX)
+				pdump_rxtx(pt->rx_ring, pt->rx_vdev_id,
+					&pt->stats);
+			if (pt->dir & RTE_PDUMP_FLAG_TX)
+				pdump_rxtx(pt->tx_ring, pt->tx_vdev_id,
+					&pt->stats);
+		}
+	}
+}
+
+int
+main(int argc, char **argv)
+{
+	int diag;
+	int ret;
+	int i;
+
+	char c_flag[] = "-c1";
+	char n_flag[] = "-n4";
+	char mp_flag[] = "--proc-type=secondary";
+	char *argp[argc + 3];
+
+	/* catch ctrl-c so we can print on exit */
+	signal(SIGINT, signal_handler);
+
+	argp[0] = argv[0];
+	argp[1] = c_flag;
+	argp[2] = n_flag;
+	argp[3] = mp_flag;
+
+	for (i = 1; i < argc; i++)
+		argp[i + 3] = argv[i];
+
+	argc += 3;
+
+	diag = rte_eal_init(argc, argp);
+	if (diag < 0)
+		rte_panic("Cannot init EAL\n");
+
+	argc -= diag;
+	argv += (diag - 3);
+
+	/* parse app arguments */
+	if (argc > 1) {
+		ret = launch_args_parse(argc, argv, argp[0]);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "Invalid argument\n");
+	}
+
+	/* create mempool, ring and vdevs info */
+	create_mp_ring_vdev();
+	enable_pdump();
+	dump_packets();
+
+	cleanup_pdump_resources();
+	/* dump debug stats */
+	print_pdump_stats();
+
+	return 0;
+}
-- 
2.5.0


* [PATCH v9 7/8] app/testpmd: add pdump initialization uninitialization
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
                       ` (5 preceding siblings ...)
  2016-06-14  9:38     ` [PATCH v9 6/8] app/pdump: add pdump tool for packet capturing Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14  9:38     ` [PATCH v9 8/8] doc: update doc for packet capture framework Reshma Pattan
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
  8 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Call rte_pdump_init and rte_pdump_uninit for packet
capturing initialization and uninitialization.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 app/test-pmd/testpmd.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index dd6b046..9707cfc 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_XENVIRT
 #include <rte_eth_xenvirt.h>
 #endif
+#include <rte_pdump.h>
 
 #include "testpmd.h"
 
@@ -2029,6 +2030,8 @@ signal_handler(int signum)
 	if (signum == SIGINT || signum == SIGTERM) {
 		printf("\nSignal %d received, preparing to exit...\n",
 				signum);
+		/* uninitialize packet capture framework */
+		rte_pdump_uninit();
 		force_quit();
 		/* exit with the expected status */
 		signal(signum, SIG_DFL);
@@ -2049,6 +2052,9 @@ main(int argc, char** argv)
 	if (diag < 0)
 		rte_panic("Cannot init EAL\n");
 
+	/* initialize packet capture framework */
+	rte_pdump_init(NULL);
+
 	nb_ports = (portid_t) rte_eth_dev_count();
 	if (nb_ports == 0)
 		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
-- 
2.5.0


* [PATCH v9 8/8] doc: update doc for packet capture framework
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
                       ` (6 preceding siblings ...)
  2016-06-14  9:38     ` [PATCH v9 7/8] app/testpmd: add pdump initialization uninitialization Reshma Pattan
@ 2016-06-14  9:38     ` Reshma Pattan
  2016-06-14 20:41       ` Thomas Monjalon
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
  8 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-14  9:38 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added programmers guide for librte_pdump.
Added sample application guide for app/pdump application.
Updated release note for packet capture framework changes.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
---
 MAINTAINERS                             |   3 +
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/pdump_library.rst | 119 +++++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_16_07.rst  |  13 ++++
 doc/guides/sample_app_ug/index.rst      |   1 +
 doc/guides/sample_app_ug/pdump.rst      | 122 ++++++++++++++++++++++++++++++++
 6 files changed, 259 insertions(+)
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 doc/guides/sample_app_ug/pdump.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index c46cf86..9a84f59 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -437,6 +437,9 @@ Pdump
 M: Reshma Pattan <reshma.pattan@intel.com>
 F: lib/librte_pdump/
 F: app/pdump/
+F: doc/guides/prog_guide/pdump_library.rst
+F: doc/guides/sample_app_ug/pdump.rst
+
 
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index b862d0c..4caf969 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -71,6 +71,7 @@ Programmer's Guide
     writing_efficient_code
     profile_app
     glossary
+    pdump_library
 
 
 **Figures**
diff --git a/doc/guides/prog_guide/pdump_library.rst b/doc/guides/prog_guide/pdump_library.rst
new file mode 100644
index 0000000..8781ffb
--- /dev/null
+++ b/doc/guides/prog_guide/pdump_library.rst
@@ -0,0 +1,119 @@
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _pdump_library:
+
+The librte_pdump Library
+========================
+
+The ``librte_pdump`` library provides a framework for packet capturing in DPDK.
+The library provides the following APIs to initialize the packet capture framework, to enable
+or disable the packet capture, and to uninitialize it:
+
+* ``rte_pdump_init()``:
+  This API initializes the packet capture framework.
+
+* ``rte_pdump_enable()``:
+  This API enables the packet capture on a given port and queue.
+  Note: The filter option in the API is a placeholder for future enhancements.
+
+* ``rte_pdump_enable_by_deviceid()``:
+  This API enables the packet capture on a given device id (``vdev name or pci address``) and queue.
+  Note: The filter option in the API is a placeholder for future enhancements.
+
+* ``rte_pdump_disable()``:
+  This API disables the packet capture on a given port and queue.
+
+* ``rte_pdump_disable_by_deviceid()``:
+  This API disables the packet capture on a given device id (``vdev name or pci address``) and queue.
+
+* ``rte_pdump_uninit()``:
+  This API uninitializes the packet capture framework.
+
+* ``rte_pdump_set_socket_dir()``:
+  This API sets the server and client socket paths.
+  Note: This API is not thread-safe.
+
+
+Operation
+---------
+
+The ``librte_pdump`` library works on a client/server model. The server is responsible for enabling or
+disabling the packet capture and the clients are responsible for requesting the enabling or disabling of
+the packet capture.
+
+The packet capture framework, as part of its initialization, creates the pthread and the server socket in
+the pthread. The application that calls the framework initialization will have the server socket created,
+either under the path that the application has passed or under the default path i.e. either ``/var/run`` for
+root user or ``$HOME`` for non root user.
+
+Applications that request enabling or disabling of the packet capture will have the client socket created either under
+the path that the application has passed or under the default path i.e. either ``/var/run/`` for root user or ``$HOME``
+for non root user to send the requests to the server.
+The server socket will listen for client requests for enabling or disabling the packet capture.
+
+
+Implementation Details
+----------------------
+
+The library API ``rte_pdump_init()`` initializes the packet capture framework by creating the pthread and the server
+socket. The server socket in the pthread context will be listening to the client requests to enable or disable the
+packet capture.
+
+The library APIs ``rte_pdump_enable()`` and ``rte_pdump_enable_by_deviceid()`` enable the packet capture.
+On each call to these APIs, the library creates a separate client socket, creates the "pdump enable" request and sends
+the request to the server. The server that is listening on the socket will take the request and enable the packet capture
+by registering the Ethernet RX and TX callbacks for the given port or device_id and queue combinations.
+Then the server will mirror the packets to the new mempool and enqueue them to the rte_ring that clients have passed
+to these APIs. The server also sends the response back to the client about the status of the request that was processed.
+After the response is received from the server, the client socket is closed.
+
+The library APIs ``rte_pdump_disable()`` and ``rte_pdump_disable_by_deviceid()`` disable the packet capture.
+On each call to these APIs, the library creates a separate client socket, creates the "pdump disable" request and sends
+the request to the server. The server that is listening on the socket will take the request and disable the packet
+capture by removing the Ethernet RX and TX callbacks for the given port or device_id and queue combinations. The server
+also sends the response back to the client about the status of the request that was processed. After the response is
+received from the server, the client socket is closed.
+
+The library API ``rte_pdump_uninit()`` uninitializes the packet capture framework by closing the pthread and the
+server socket.
+
+The library API ``rte_pdump_set_socket_dir()`` sets the given path as either server socket path
+or client socket path based on the ``type`` argument of the API.
+If the given path is ``NULL``, the default path will be selected, i.e. either ``/var/run/`` for root user or ``$HOME``
+for non root user. Clients also need to call this API to set their server socket path if the server socket
+path is different from the default path.
+
+
+Use Case: Packet Capturing
+--------------------------
+
+The DPDK ``app/pdump`` tool is developed based on this library to capture packets in DPDK.
+Users can use this as an example to develop their own packet capturing application.
diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index c0f6b02..a4de2a2 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -66,6 +66,11 @@ New Features
   * Enable RSS per network interface through the configuration file.
   * Streamline the CLI code.
 
+* **Added packet capture framework.**
+
+  * A new library ``librte_pdump`` is added to provide packet capture APIs.
+  * A new ``app/pdump`` tool is added to capture packets in DPDK.
+
 
 Resolved Issues
 ---------------
@@ -135,6 +140,11 @@ API Changes
   ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff.
 
+* Function ``rte_eth_dev_get_port_by_name`` changed to a public API.
+
+* Function ``rte_eth_dev_info_get`` updated to return new fields ``nb_rx_queues`` and ``nb_tx_queues``
+  in the ``rte_eth_dev_info`` object.
+
 
 ABI Changes
 -----------
@@ -146,6 +156,9 @@ ABI Changes
 * The ``rte_port_source_params`` structure has new fields to support PCAP file.
   It was already in release 16.04 with ``RTE_NEXT_ABI`` flag.
 
+* The ``rte_eth_dev_info`` structure has new fields ``nb_rx_queues`` and ``nb_tx_queues``
+  to support number of queues configured by software.
+
 
 Shared Library Versions
 -----------------------
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 930f68c..96bb317 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -76,6 +76,7 @@ Sample Applications User Guide
     ptpclient
     performance_thread
     ipsec_secgw
+    pdump
 
 **Figures**
 
diff --git a/doc/guides/sample_app_ug/pdump.rst b/doc/guides/sample_app_ug/pdump.rst
new file mode 100644
index 0000000..96c8709
--- /dev/null
+++ b/doc/guides/sample_app_ug/pdump.rst
@@ -0,0 +1,122 @@
+
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+dpdk_pdump Application
+======================
+
+The ``dpdk_pdump`` application is a Data Plane Development Kit (DPDK) application that runs as a DPDK secondary process and
+is capable of enabling packet capture on DPDK ports.
+
+
+Running the Application
+-----------------------
+
+The application has a ``--pdump`` command line option with various sub arguments:
+
+.. code-block:: console
+
+   ./build/app/dpdk_pdump --
+                          --pdump '(port=<port id> | device_id=<pci id or vdev name>),
+                                   (queue=<queue_id>),
+                                   (rx-dev=<iface or pcap file> |
+                                    tx-dev=<iface or pcap file>),
+                                   [ring-size=<ring size>],
+                                   [mbuf-size=<mbuf data size>],
+                                   [total-num-mbufs=<number of mbufs>]'
+
+Note:
+
+* Parameters inside the parentheses represent mandatory parameters.
+
+* Parameters inside the square brackets represent optional parameters.
+
+Multiple instances of ``--pdump`` can be passed to capture packets on different port and queue combinations.
+
+
+Parameters
+~~~~~~~~~~
+
+``port``:
+Port id of the eth device on which packets should be captured.
+
+``device_id``:
+PCI address (or) name of the eth device on which packets should be captured.
+
+   .. Note::
+
+      * As of now the ``dpdk_pdump`` tool cannot capture the packets of virtual devices
+        in the primary process due to a bug in the ethdev library. Due to this bug, in a multi process context,
+        when the primary and secondary have different ports set, then the secondary process
+        (here the ``dpdk_pdump`` tool) overwrites the ``rte_eth_devices[]`` entries of the primary process.
+
+``queue``:
+Queue id of the eth device on which packets should be captured. The user can pass a queue value of ``*`` to enable
+packet capture on all queues of the eth device.
+
+``rx-dev``:
+Can be either a pcap file name or any Linux iface.
+
+``tx-dev``:
+Can be either a pcap file name or any Linux iface.
+
+   .. Note::
+
+      * To receive ingress packets only, ``rx-dev`` should be passed.
+
+      * To receive egress packets only, ``tx-dev`` should be passed.
+
+      * To receive ingress and egress packets separately, ``rx-dev`` and ``tx-dev``
+        should both be passed with different file names or Linux iface names.
+
+      * To receive ingress and egress packets together, ``rx-dev`` and ``tx-dev``
+        should both be passed with the same file name or Linux iface name.
+
+``ring-size``:
+Size of the ring. This value is used internally for ring creation. The ring will be used to enqueue the packets from
+the primary application to the secondary. This is an optional parameter with default size 16384.
+
+``mbuf-size``:
+Size of the mbuf data. This is used internally for mempool creation. Ideally this value should be the same as
+the primary application's mempool's mbuf data size which is used for packet RX. This is an optional parameter with
+default size 2176.
+
+``total-num-mbufs``:
+Total number of mbufs in the mempool. This is used internally for mempool creation. This is an optional parameter with default
+value 65535.
+
+
+Example
+-------
+
+.. code-block:: console
+
+   $ sudo ./build/app/dpdk_pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'
-- 
2.5.0


* Re: [PATCH v9 6/8] app/pdump: add pdump tool for packet capturing
  2016-06-14  9:38     ` [PATCH v9 6/8] app/pdump: add pdump tool for packet capturing Reshma Pattan
@ 2016-06-14 19:56       ` Thomas Monjalon
  0 siblings, 0 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-14 19:56 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

2016-06-14 10:38, Reshma Pattan:
> New tool added for packet capturing on dpdk.
> This tool supports command line options.
> This tool runs as secondary process by default.
> 
> Command line supports various parameters to capture
> the packets.
> 
> User should pass on a)port and queue (or) b)pci address
> and queue (or) c)device name and queue to capture
> the packets.
> 
> Users also need to pass on either pcap file name or
> any linux iface, on to which packets captured from dpdk
> ports will be sent on for the users to view using tcpdump.
> 
> Users have option to capture packets either a) in Rx
> direction, b)(or) in Tx direction c)(or) from both the
> directions.
> 
> User can pass on ring_size and mempool parameters using
> command line, but these are optional parameters.
> These are used to create ring and mempool objects for packet
> mirroring from primary application to tool. If user doesn't
> provide any values, default values will be used internally
> for the creation of the ring and mempool.

The explanations should be in a user guide.

> --- a/app/Makefile
> +++ b/app/Makefile
> @@ -37,5 +37,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
>  DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
>  DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
>  DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += proc_info
> +DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += pdump

Why Linux only?

There is a build error if CONFIG_RTE_LIBRTE_PDUMP is disabled.


* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-14  9:38     ` [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
@ 2016-06-14 19:59       ` Thomas Monjalon
  2016-06-15  5:30         ` Pattan, Reshma
  0 siblings, 1 reply; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-14 19:59 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

2016-06-14 10:38, Reshma Pattan:
> Added spinlocks around add/remove logic of Rx and Tx callbacks
> to avoid corruption of callback lists in multithreaded context.
> 
> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>

Why cb->next is not locked in burst functions?
Just protecting add/remove but not its usage seems useless.


* Re: [PATCH v9 2/8] ethdev: add new api to add Rx callback as head of the list
  2016-06-14  9:38     ` [PATCH v9 2/8] ethdev: add new api to add Rx callback as head of the list Reshma Pattan
@ 2016-06-14 20:01       ` Thomas Monjalon
  2016-06-14 21:43         ` Pattan, Reshma
  0 siblings, 1 reply; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-14 20:01 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

2016-06-14 10:38, Reshma Pattan:
> Added new public api rte_eth_add_first_rx_callback to add given
> callback as head of the list.
> 
> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> ---
> +/*
> +* Add a callback that must be called first on packet RX on a given port
> +* and queue.
> +*
> +* This API configures a first function to be called for each burst of
> +* packets received on a given NIC port queue. The return value is a pointer
> +* that can be used to later remove the callback using
> +* rte_eth_remove_rx_callback().
> +*
> +* Multiple functions are called in the order that they are added.
> +*
> +* @param port_id
> +*   The port identifier of the Ethernet device.
> +* @param queue_id
> +*   The queue on the Ethernet device on which the callback is to be added.
> +* @param fn
> +*   The callback function
> +* @param user_param
> +*   A generic pointer parameter which will be passed to each invocation of the
> +*   callback function on this port and queue.
> +*
> +* @return
> +*   NULL on error.
> +*   On success, a pointer value which can later be used to remove the callback.
> +*/
> +void *rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
> +		rte_rx_callback_fn fn, void *user_param);

Sorry I fail to understand why this function is needed.
What cannot be done in rte_eth_add_rx_callback?


* Re: [PATCH v9 3/8] ethdev: add new fields to ethdev info struct
  2016-06-14  9:38     ` [PATCH v9 3/8] ethdev: add new fields to ethdev info struct Reshma Pattan
@ 2016-06-14 20:10       ` Thomas Monjalon
  2016-06-14 21:57         ` Pattan, Reshma
  0 siblings, 1 reply; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-14 20:10 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

2016-06-14 10:38, Reshma Pattan:
> New fields nb_rx_queues and nb_tx_queues are added to
> rte_eth_dev_info structure.

Please add a justification of why these fields are needed.


* Re: [PATCH v9 4/8] ethdev: make get port by name and get name by port public
  2016-06-14  9:38     ` [PATCH v9 4/8] ethdev: make get port by name and get name by port public Reshma Pattan
@ 2016-06-14 20:23       ` Thomas Monjalon
  2016-06-14 21:55         ` Pattan, Reshma
  0 siblings, 1 reply; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-14 20:23 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev, david.marchand

2016-06-14 10:38, Reshma Pattan:
> Converted rte_eth_dev_get_port_by_name to a public API.
> Converted rte_eth_dev_get_name_by_port to a public API.

No justification?

It was planned to remove these functions.
The unique naming of the device interfaces could be improved
in EAL.


* Re: [PATCH v9 5/8] pdump: add new library for packet capturing support
  2016-06-14  9:38     ` [PATCH v9 5/8] pdump: add new library for packet capturing support Reshma Pattan
@ 2016-06-14 20:28       ` Thomas Monjalon
  2016-06-14 21:59         ` Pattan, Reshma
  2016-06-15  9:05         ` Mcnamara, John
  0 siblings, 2 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-14 20:28 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

2016-06-14 10:38, Reshma Pattan:
> The new librte_pdump library is added for packet capturing
> support.
> 
> Added public api rte_pdump_init, applications should call
> this as part of their application setup to have packet
> capturing framework ready.
> 
> Added public api rte_pdump_uninit to uninitialize the packet
> capturing framework.
> 
> Added public apis rte_pdump_enable and rte_pdump_disable to
> enable and disable packet capturing on specific port and queue.
> 
> Added public apis rte_pdump_enable_by_deviceid and
> rte_pdump_disable_by_deviceid to enable and disable packet
> capturing on a specific device (pci address or name) and queue.
> 
> Added public api rte_pdump_set_socket_dir to set the
> server and client socket paths.

Reshma, it is not the right info to put in a commit log.
The description of each function is in the doxygen comments.
We need to have the overview, what is the design architecture
(rings, mempool, etc) and why you do that?

>  MAINTAINERS                            |   4 +
>  config/common_base                     |   5 +
>  lib/Makefile                           |   1 +
>  lib/librte_pdump/Makefile              |  55 ++
>  lib/librte_pdump/rte_pdump.c           | 913 +++++++++++++++++++++++++++++++++
>  lib/librte_pdump/rte_pdump.h           | 216 ++++++++
>  lib/librte_pdump/rte_pdump_version.map |  13 +
>  mk/rte.app.mk                          |   1 +

And more importantly, we need a doc in the prog guide.


* Re: [PATCH v9 8/8] doc: update doc for packet capture framework
  2016-06-14  9:38     ` [PATCH v9 8/8] doc: update doc for packet capture framework Reshma Pattan
@ 2016-06-14 20:41       ` Thomas Monjalon
  2016-06-15  5:44         ` Pattan, Reshma
  0 siblings, 1 reply; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-14 20:41 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev, john.mcnamara

When commenting previous patches, I missed these docs.
Please move them in the appropriate patches.

2016-06-14 10:38, Reshma Pattan:
> --- a/doc/guides/prog_guide/index.rst
> +++ b/doc/guides/prog_guide/index.rst
> @@ -71,6 +71,7 @@ Programmer's Guide
>      writing_efficient_code
>      profile_app
>      glossary
> +    pdump_library

There is probably a better place than after the glossary.

[...]
> +The librte_pdump Library
> +========================
> +
> +The ``librte_pdump`` library provides a framework for packet capturing in DPDK.

Here you need to explain what you mean by "packet capturing".
Doing a copy?
Slowing down the normal processing?
Which usage do you target? debugging? fast mirroring?

> +Use Case: Packet Capturing
> +--------------------------
> +
> +The DPDK ``app/pdump`` tool is developed based on this library to capture packets in DPDK.
> +Users can use this as an example to develop their own packet capturing application.

Is it an example or a debugging tool?
If it is an example, it should be in the examples/ directory.

>  ABI Changes
>  -----------
> @@ -146,6 +156,9 @@ ABI Changes
>  * The ``rte_port_source_params`` structure has new fields to support PCAP file.
>    It was already in release 16.04 with ``RTE_NEXT_ABI`` flag.
>  
> +* The ``rte_eth_dev_info`` structure has new fields ``nb_rx_queues`` and ``nb_tx_queues``
> +  to support number of queues configured by software.

There was no deprecation notice in 16.04 for this ABI change.


* Re: [PATCH v9 2/8] ethdev: add new api to add Rx callback as head of the list
  2016-06-14 20:01       ` Thomas Monjalon
@ 2016-06-14 21:43         ` Pattan, Reshma
  0 siblings, 0 replies; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-14 21:43 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, June 14, 2016 9:02 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 2/8] ethdev: add new api to add Rx callback
> as head of the list
> 
> 2016-06-14 10:38, Reshma Pattan:
> > Added new public api rte_eth_add_first_rx_callback to add given
> > callback as head of the list.
> >
> > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > ---
> > +/*
> > +* Add a callback that must be called first on packet RX on a given
> > +port
> > +* and queue.
> > +*
> > +* This API configures a first function to be called for each burst of
> > +* packets received on a given NIC port queue. The return value is a
> > +pointer
> > +* that can be used to later remove the callback using
> > +* rte_eth_remove_rx_callback().
> > +*
> > +* Multiple functions are called in the order that they are added.
> > +*
> > +* @param port_id
> > +*   The port identifier of the Ethernet device.
> > +* @param queue_id
> > +*   The queue on the Ethernet device on which the callback is to be added.
> > +* @param fn
> > +*   The callback function
> > +* @param user_param
> > +*   A generic pointer parameter which will be passed to each invocation of
> the
> > +*   callback function on this port and queue.
> > +*
> > +* @return
> > +*   NULL on error.
> > +*   On success, a pointer value which can later be used to remove the
> callback.
> > +*/
> > +void *rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
> > +		rte_rx_callback_fn fn, void *user_param);
> 
> Sorry I fail to understand why this function is needed.
> What cannot be done in rte_eth_add_rx_callback?

The packet capturing framework should display the Rx packets of a NIC before they are processed by the application's
other callbacks, because those callbacks may change the packet data as part of their processing.
So the framework needs to register its callback at the head of the Rx callback list, so that it is always
called before any of the application's callbacks. Hence this API is introduced.
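
As a rough illustration of the ordering argument above, here is a minimal sketch of tail insertion (callbacks run in the order they were added) versus head insertion (the capture callback runs first). It is not the ethdev implementation; `cb_node`, `cb_add_tail`, `cb_add_first` and `cb_run_order` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an Rx callback list node; not the ethdev struct. */
struct cb_node {
    struct cb_node *next;
    int id;               /* identifies the callback in this sketch */
};

/* Append at the tail: callbacks are then called in the order they were added. */
void cb_add_tail(struct cb_node **head, struct cb_node *cb)
{
    assert(head != NULL && cb != NULL);
    cb->next = NULL;
    while (*head != NULL)
        head = &(*head)->next;
    *head = cb;
}

/* Insert at the head: the new callback runs before all existing ones. */
void cb_add_first(struct cb_node **head, struct cb_node *cb)
{
    assert(head != NULL && cb != NULL);
    cb->next = *head;
    *head = cb;
}

/* Walk the list as a burst would, recording the invocation order. */
size_t cb_run_order(const struct cb_node *head, int *out, size_t max)
{
    size_t n = 0;

    for (; head != NULL && n < max; head = head->next)
        out[n++] = head->id;
    return n;
}
```

With two application callbacks added at the tail and a capture callback added afterwards with `cb_add_first()`, the capture callback is still visited first, so it sees the packets before the application callbacks can modify them.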

Thanks,
Reshma

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 4/8] ethdev: make get port by name and get name by port public
  2016-06-14 20:23       ` Thomas Monjalon
@ 2016-06-14 21:55         ` Pattan, Reshma
  0 siblings, 0 replies; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-14 21:55 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, david.marchand



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, June 14, 2016 9:24 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org; david.marchand@6wind.com
> Subject: Re: [dpdk-dev] [PATCH v9 4/8] ethdev: make get port by name and get
> name by port public
> 
> 2016-06-14 10:38, Reshma Pattan:
> > Converted rte_eth_dev_get_port_by_name to a public API.
> > Converted rte_eth_dev_get_name_by_port to a public API.
> 
> No justification?

I will add the justification to the commit message.

> 
> It was planned to remove these functions.
> The unique naming of the device interfaces could be improved in EAL.

The packet capture framework provides APIs to enable or disable packet capture using either a port id or a PCI address/device name.
These APIs were made public so that the packet capture library can convert PCI addresses and device names to port ids internally
and use the port information to register the Rx/Tx callbacks. Without these APIs the library will not work.
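
To make the dependency concrete, here is a minimal sketch of the "resolve the device id to a port id, then act on the port" flow. It deliberately uses a mock device table instead of ethdev; `sketch_dev_names`, `sketch_port_by_name` and `sketch_enable_by_deviceid` are hypothetical names, and the device strings are made-up examples.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4

/* Hypothetical device table; a real application would query ethdev instead. */
static const char *const sketch_dev_names[SKETCH_MAX_PORTS] = {
    "0000:02:00.0", "0000:02:00.1", "eth_pcap0", NULL
};

/* Resolve a PCI address or device name to a port id.
 * Returns 0 on success, a negative errno value otherwise. */
int sketch_port_by_name(const char *name, uint8_t *port_id)
{
    if (name == NULL || port_id == NULL)
        return -EINVAL;
    for (uint8_t i = 0; i < SKETCH_MAX_PORTS; i++) {
        if (sketch_dev_names[i] != NULL &&
            strcmp(sketch_dev_names[i], name) == 0) {
            *port_id = i;
            return 0;
        }
    }
    return -ENODEV;
}

/* Enable capture by device id: translate to a port id first. */
int sketch_enable_by_deviceid(const char *device_id, uint8_t *resolved_port)
{
    int ret = sketch_port_by_name(device_id, resolved_port);

    if (ret < 0)
        return ret;
    /* Here the library would register its Rx/Tx callbacks on *resolved_port. */
    return 0;
}
```

The point of the sketch is only that the by-deviceid entry points cannot do anything useful until the name has been translated, which is why the lookup has to be reachable from the library.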

Thanks,
Reshma

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 3/8] ethdev: add new fields to ethdev info struct
  2016-06-14 20:10       ` Thomas Monjalon
@ 2016-06-14 21:57         ` Pattan, Reshma
  0 siblings, 0 replies; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-14 21:57 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, June 14, 2016 9:10 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 3/8] ethdev: add new fields to ethdev info
> struct
> 
> 2016-06-14 10:38, Reshma Pattan:
> > New fields nb_rx_queues and nb_tx_queues are added to rte_eth_dev_info
> > structure.
> 
> Please add a justification of why these fields are needed.

OK, I will update the commit message.

Thanks,
Reshma

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 5/8] pdump: add new library for packet capturing support
  2016-06-14 20:28       ` Thomas Monjalon
@ 2016-06-14 21:59         ` Pattan, Reshma
  2016-06-15  9:05         ` Mcnamara, John
  1 sibling, 0 replies; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-14 21:59 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, June 14, 2016 9:28 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 5/8] pdump: add new library for packet
> capturing support
> 
> 2016-06-14 10:38, Reshma Pattan:
> > The new librte_pdump library is added for packet capturing support.
> >
> > Added public api rte_pdump_init, applications should call this as part
> > of their application setup to have packet capturing framework ready.
> >
> > Added public api rte_pdump_uninit to uninitialize the packet capturing
> > framework.
> >
> > Added public apis rte_pdump_enable and rte_pdump_disable to enable and
> > disable packet capturing on specific port and queue.
> >
> > Added public apis rte_pdump_enable_by_deviceid and
> > rte_pdump_disable_by_deviceid to enable and disable packet capturing
> > on a specific device (pci address or name) and queue.
> >
> > Added public api rte_pdump_set_socket_dir to set the server and client
> > socket paths.
> 
> Reshma, it is not the right info to put in a commit log.
> The description of each function is in the doxygen comments.
> We need to have the overview, what is the design architecture (rings, mempool,
> etc) and why you do that?

Yes, I will update the commit message.

> 
> >  MAINTAINERS                            |   4 +
> >  config/common_base                     |   5 +
> >  lib/Makefile                           |   1 +
> >  lib/librte_pdump/Makefile              |  55 ++
> >  lib/librte_pdump/rte_pdump.c           | 913
> +++++++++++++++++++++++++++++++++
> >  lib/librte_pdump/rte_pdump.h           | 216 ++++++++
> >  lib/librte_pdump/rte_pdump_version.map |  13 +
> >  mk/rte.app.mk                          |   1 +
> 
> And more importantly, we need a doc in the prog guide.
> 

Yes, the document is added. I will move the documentation change into this patch.

Thanks,
Reshma

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-14 19:59       ` Thomas Monjalon
@ 2016-06-15  5:30         ` Pattan, Reshma
  2016-06-15  8:19           ` Thomas Monjalon
  0 siblings, 1 reply; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-15  5:30 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Ananyev, Konstantin



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, June 14, 2016 9:00 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx
> callback lists
> 
> 2016-06-14 10:38, Reshma Pattan:
> > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > avoid corruption of callback lists in multithreaded context.
> >
> > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> 
> Why cb->next is not locked in burst functions?
Read access is safe here and does not require any locking, as Rx/Tx burst is initiated only by the local user (control plane) thread.

> Just protecting add/remove but not its usage seems useless.
Here locks are required around add/remove to protect write access, because writes to the callback list are now done from 2 threads:
one is the local user thread (control plane) and the other is the pdump control thread (initiated by a remote pdump request).
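
A minimal sketch of the two-writer case described above, assuming POSIX threads. A `pthread_mutex_t` stands in for the ethdev spinlock, and `wr_list`, `wr_list_add`, `wr_writer` and `wr_run_two_writers` are hypothetical names; this only models the write side and is not the patch itself.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define ADDS_PER_WRITER 1000

struct wr_node { struct wr_node *next; };

/* Hypothetical callback list; the mutex stands in for the ethdev spinlock. */
struct wr_list {
    struct wr_node *head;
    pthread_mutex_t lock;
};

/* Writer path: the lock serializes concurrent add operations. */
int wr_list_add(struct wr_list *l)
{
    struct wr_node *n = malloc(sizeof(*n));

    if (n == NULL)
        return -1;
    pthread_mutex_lock(&l->lock);
    n->next = l->head;
    l->head = n;
    pthread_mutex_unlock(&l->lock);
    return 0;
}

/* Thread body: models either the application or the capture control thread. */
void *wr_writer(void *arg)
{
    for (int i = 0; i < ADDS_PER_WRITER; i++)
        wr_list_add(arg);
    return NULL;
}

/* Run two concurrent writers and return the resulting list length. */
size_t wr_run_two_writers(void)
{
    struct wr_list l = { .head = NULL };
    pthread_t a, b;
    size_t count = 0;

    pthread_mutex_init(&l.lock, NULL);
    int a_ok = pthread_create(&a, NULL, wr_writer, &l) == 0;
    int b_ok = pthread_create(&b, NULL, wr_writer, &l) == 0;
    if (a_ok)
        pthread_join(a, NULL);
    if (b_ok)
        pthread_join(b, NULL);

    while (l.head != NULL) {        /* count and free every node */
        struct wr_node *n = l.head;
        l.head = n->next;
        free(n);
        count++;
    }
    pthread_mutex_destroy(&l.lock);
    return count;
}
```

Without the lock, two writers can both read the same `head` and one insertion is lost; with it, every node added by either thread ends up on the list.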

Thanks,
Reshma

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 8/8] doc: update doc for packet capture framework
  2016-06-14 20:41       ` Thomas Monjalon
@ 2016-06-15  5:44         ` Pattan, Reshma
  2016-06-15  8:24           ` Thomas Monjalon
  0 siblings, 1 reply; 67+ messages in thread
From: Pattan, Reshma @ 2016-06-15  5:44 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Mcnamara, John



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, June 14, 2016 9:41 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org; Mcnamara, John <john.mcnamara@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v9 8/8] doc: update doc for packet capture
> framework
> 
> When commenting previous patches, I missed these docs.
> Please move them in the appropriate patches.
> 
> > +Use Case: Packet Capturing
> > +--------------------------
> > +
> > +The DPDK ``app/pdump`` tool is developed based on this library to capture
> packets in DPDK.
> > +Users can use this as an example to develop their own packet capturing
> application.
> 
> Is it an example or a debugging tool?

It is a debugging tool.

> If it is an example, it should be in the examples/ directory.
> 
> >  ABI Changes
> >  -----------
> > @@ -146,6 +156,9 @@ ABI Changes
> >  * The ``rte_port_source_params`` structure has new fields to support PCAP
> file.
> >    It was already in release 16.04 with ``RTE_NEXT_ABI`` flag.
> >
> > +* The ``rte_eth_dev_info`` structure has new fields ``nb_rx_queues`` and
> ``nb_tx_queues``
> > +  to support number of queues configured by software.
> 
> There was no deprecation notice in 16.04 for this ABI change.

The deprecation notice and the relevant planned changes were sent as an RFC at the start of 16.07; please find the links below.
http://dpdk.org/dev/patchwork/patch/12033/
http://dpdk.org/dev/patchwork/patch/12034/

Thanks,
Reshma

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15  5:30         ` Pattan, Reshma
@ 2016-06-15  8:19           ` Thomas Monjalon
  2016-06-15  8:37             ` Ananyev, Konstantin
  0 siblings, 1 reply; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-15  8:19 UTC (permalink / raw)
  To: Pattan, Reshma; +Cc: dev, Ananyev, Konstantin

2016-06-15 05:30, Pattan, Reshma:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-06-14 10:38, Reshma Pattan:
> > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > avoid corruption of callback lists in multithreaded context.
> > >
> > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > 
> > Why cb->next is not locked in burst functions?
> It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane) thread.
> 
> > Just protecting add/remove but not its usage seems useless.
> Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2 threads 
> i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request). 

So read and write can be done by different threads.
I think the read access would need locking but we do not want it
in fast path.
Are you sure there is no issue in this design?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 8/8] doc: update doc for packet capture framework
  2016-06-15  5:44         ` Pattan, Reshma
@ 2016-06-15  8:24           ` Thomas Monjalon
  0 siblings, 0 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-15  8:24 UTC (permalink / raw)
  To: Pattan, Reshma; +Cc: dev, Mcnamara, John

2016-06-15 05:44, Pattan, Reshma:
> > >  ABI Changes
> > >  -----------
> > > @@ -146,6 +156,9 @@ ABI Changes
> > >  * The ``rte_port_source_params`` structure has new fields to support PCAP
> > file.
> > >    It was already in release 16.04 with ``RTE_NEXT_ABI`` flag.
> > >
> > > +* The ``rte_eth_dev_info`` structure has new fields ``nb_rx_queues`` and
> > ``nb_tx_queues``
> > > +  to support number of queues configured by software.
> > 
> > There was no deprecation notice in 16.04 for this ABI change.
> 
> Deprecation notice and relevant planed changes were sent as RFC during the start of 16.07 , please find the below is the link for the same.
> http://dpdk.org/dev/patchwork/patch/12033/
> http://dpdk.org/dev/patchwork/patch/12034/

Yes there was no notice in 16.04.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15  8:19           ` Thomas Monjalon
@ 2016-06-15  8:37             ` Ananyev, Konstantin
  2016-06-15  8:48               ` Thomas Monjalon
  0 siblings, 1 reply; 67+ messages in thread
From: Ananyev, Konstantin @ 2016-06-15  8:37 UTC (permalink / raw)
  To: Thomas Monjalon, Pattan, Reshma; +Cc: dev


Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, June 15, 2016 9:19 AM
> To: Pattan, Reshma
> Cc: dev@dpdk.org; Ananyev, Konstantin
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> 2016-06-15 05:30, Pattan, Reshma:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2016-06-14 10:38, Reshma Pattan:
> > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > avoid corruption of callback lists in multithreaded context.
> > > >
> > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > >
> > > Why cb->next is not locked in burst functions?
> > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane) thread.
> >
> > > Just protecting add/remove but not its usage seems useless.
> > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2 threads
> > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> 
> So read and write can be done by different threads.

Yes, and this is possible even in current DPDK version (16.04).
What Reshma's patch adds is that it is now possible to have concurrent writes
from 2 different threads to that list.

> I think the read access would need locking but we do not want it
> in fast path.

I don't think it would be needed.
As I said - read/write interaction didn't change from what we have right now.
But if you have some particular scenario in mind that you believe would cause
a race condition - please speak up.  
Konstantin

> Are you sure there is no issue in this design?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15  8:37             ` Ananyev, Konstantin
@ 2016-06-15  8:48               ` Thomas Monjalon
  2016-06-15  9:54                 ` Ananyev, Konstantin
  2016-06-15 12:15                 ` Ivan Boule
  0 siblings, 2 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-15  8:48 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: Pattan, Reshma, dev

2016-06-15 08:37, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-06-15 05:30, Pattan, Reshma:
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > 2016-06-14 10:38, Reshma Pattan:
> > > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > > avoid corruption of callback lists in multithreaded context.
> > > > >
> > > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > > >
> > > > Why cb->next is not locked in burst functions?
> > > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane) thread.
> > >
> > > > Just protecting add/remove but not its usage seems useless.
> > > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2 threads
> > > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> > 
> > So read and write can be done by different threads.
> 
> Yes, and this is possible even in current DPDK version (16.04).
> What is added by Reshma's patch - now it is possible to have concurrent write
> from 2 different thread to that list.  
> 
> > I think the read access would need locking but we do not want it
> > in fast path.
> 
> I don't think it would be needed.
> As I said - read/write interaction didn't change from what we have right now.
> But if you have some particular scenario in mind that you believe would cause
> a race condition - please speak up.  

If we add/remove a callback during a burst? Is it possible that the next
pointer would have a wrong value leading to a crash?
Maybe we need a comment to state that we should not alter burst
callbacks while running burst functions.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 5/8] pdump: add new library for packet capturing support
  2016-06-14 20:28       ` Thomas Monjalon
  2016-06-14 21:59         ` Pattan, Reshma
@ 2016-06-15  9:05         ` Mcnamara, John
  2016-06-15  9:32           ` Thomas Monjalon
  1 sibling, 1 reply; 67+ messages in thread
From: Mcnamara, John @ 2016-06-15  9:05 UTC (permalink / raw)
  To: Thomas Monjalon, Pattan, Reshma; +Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, June 14, 2016 9:28 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 5/8] pdump: add new library for packet
> capturing support
> 
> 2016-06-14 10:38, Reshma Pattan:
> > The new librte_pdump library is added for packet capturing support.
> >
> 
> And more importantly, we need a doc in the prog guide.
> 

Hi Thomas,

The Programmer's Guide update is in another part of the patchset. Can we get some clarification on the requirements for documentation within a patchset?

Should all documentation related to a feature be in the patch for the feature? From your recent comments on patches it looks like that is the way you prefer it. That is fine but there is some confusion because it seems that wasn't always a requirement in the past so it would be best to clarify, and preferably document this.

Also, it makes it a bit harder for the documentation maintainer (me in this case) to see doc changes within patches and to ack just the doc part. From a documentation maintainer's point of view it would be best to have any non-trivial doc changes in a separate patch.

John

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 5/8] pdump: add new library for packet capturing support
  2016-06-15  9:05         ` Mcnamara, John
@ 2016-06-15  9:32           ` Thomas Monjalon
  2016-06-15  9:43             ` Bruce Richardson
  2016-06-15 15:44             ` Mcnamara, John
  0 siblings, 2 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-15  9:32 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: Pattan, Reshma, dev

2016-06-15 09:05, Mcnamara, John:
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> > 2016-06-14 10:38, Reshma Pattan:
> > > The new librte_pdump library is added for packet capturing support.
> > >
> > 
> > And more importantly, we need a doc in the prog guide.
> > 
> 
> Hi Thomas,
> 
> The Programmers Guide update is in another part of the patchset. Can we get some clarification on the requirements for documentation within patchset?
> 
> Should all documentation related to a feature be in the patch for the feature? From your recent comments on patches it looks like that is the way you prefer it. That is fine but there is some confusion because it seems that wasn't always a requirement in the past so it would be best to clarify, and preferably document this.

When reading a patch (including after integration in the git tree),
it is easier to understand when having the related doc with the code changes.

> Also, it makes it a bit harder for the documentation maintainer (me in this case) to see doc changes within patches and to ack just the doc part. From a documentation maintainer point of view it would be best to have any, non-trivial, doc changes in a separate patch.

I understand your concern.
But you cannot assume every doc change will be properly highlighted in
the headline. I think you need to filter patches based on a content pattern:
	+++ b/doc/guides/
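
For what it is worth, such a content filter is only a few lines. The sketch below checks whether any line of a patch starts with that pattern; `patch_touches_doc_guides` is a hypothetical helper name, not an existing tool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if any line of the patch text starts with "+++ b/doc/guides/". */
bool patch_touches_doc_guides(const char *patch)
{
    static const char pattern[] = "+++ b/doc/guides/";
    const size_t len = sizeof(pattern) - 1;

    while (patch != NULL && *patch != '\0') {
        if (strncmp(patch, pattern, len) == 0)
            return true;
        patch = strchr(patch, '\n');   /* advance to the next line, if any */
        if (patch != NULL)
            patch++;
    }
    return false;
}
```

The match is anchored at the start of a line, so a mention of the path in the commit message body does not count as a doc change.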

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 5/8] pdump: add new library for packet capturing support
  2016-06-15  9:32           ` Thomas Monjalon
@ 2016-06-15  9:43             ` Bruce Richardson
  2016-06-15 15:44             ` Mcnamara, John
  1 sibling, 0 replies; 67+ messages in thread
From: Bruce Richardson @ 2016-06-15  9:43 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Mcnamara, John, Pattan, Reshma, dev

On Wed, Jun 15, 2016 at 11:32:39AM +0200, Thomas Monjalon wrote:
> 2016-06-15 09:05, Mcnamara, John:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> > > 2016-06-14 10:38, Reshma Pattan:
> > > > The new librte_pdump library is added for packet capturing support.
> > > >
> > > 
> > > And more importantly, we need a doc in the prog guide.
> > > 
> > 
> > Hi Thomas,
> > 
> > The Programmers Guide update is in another part of the patchset. Can we get some clarification on the requirements for documentation within patchset?
> > 
> > Should all documentation related to a feature be in the patch for the feature? From your recent comments on patches it looks like that is the way you prefer it. That is fine but there is some confusion because it seems that wasn't always a requirement in the past so it would be best to clarify, and preferably document this.
> 
> When reading a patch (including after integration in the git tree),
> it is easier to understand when having the related doc with the code changes.
> 
> > Also, it makes it a bit harder for the documentation maintainer (me in this case) to see doc changes within patches and to ack just the doc part. From a documentation maintainer point of view it would be best to have any, non-trivial, doc changes in a separate patch.
> 
> I understand your concern.
> But you cannot assume every doc changes will be properly highlighted in
> the headline. I think you need to filter patches based on a content pattern:
> 	+++ b/doc/guides/

My 2c on this is that I think that non-trivial doc changes should be in separate
patches and reviewed separately. I think that changes to add a new feature to
the release notes, or to add a new tick-mark in the NIC feature matrix should
be part of the patches adding the new features. However, a multi-paragraph doc
addition I think is better as a separate doc patch.

/Bruce

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15  8:48               ` Thomas Monjalon
@ 2016-06-15  9:54                 ` Ananyev, Konstantin
  2016-06-15 11:17                   ` Thomas Monjalon
  2016-06-15 13:49                   ` Thomas Monjalon
  2016-06-15 12:15                 ` Ivan Boule
  1 sibling, 2 replies; 67+ messages in thread
From: Ananyev, Konstantin @ 2016-06-15  9:54 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Pattan, Reshma, dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, June 15, 2016 9:49 AM
> To: Ananyev, Konstantin
> Cc: Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> 2016-06-15 08:37, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2016-06-15 05:30, Pattan, Reshma:
> > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > > 2016-06-14 10:38, Reshma Pattan:
> > > > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > > > avoid corruption of callback lists in multithreaded context.
> > > > > >
> > > > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > > > >
> > > > > Why cb->next is not locked in burst functions?
> > > > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane)
> thread.
> > > >
> > > > > Just protecting add/remove but not its usage seems useless.
> > > > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2
> threads
> > > > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> > >
> > > So read and write can be done by different threads.
> >
> > Yes, and this is possible even in current DPDK version (16.04).
> > What is added by Reshma's patch - now it is possible to have concurrent write
> > from 2 different thread to that list.
> >
> > > I think the read access would need locking but we do not want it
> > > in fast path.
> >
> > I don't think it would be needed.
> > As I said - read/write interaction didn't change from what we have right now.
> > But if you have some particular scenario in mind that you believe would cause
> > a race condition - please speak up.
> 
> If we add/remove a callback during a burst? Is it possible that the next
> pointer would have a wrong value leading to a crash?
> Maybe we need a comment to state that we should not alter burst
> callbacks while running burst functions.

Current status (16.04):
It is safe to add/remove RX/TX callbacks while
another thread is simultaneously doing RX/TX burst over the same queue.
I.e. it is supposed to be safe to invoke
rte_eth_add(/remove)_rx(/tx)_callback() and rte_eth_rx_burst()/rte_eth_tx_burst()
from different threads simultaneously.
However, it is not safe to free/modify that rte_eth_rxtx_callback while the current
rte_eth_rx_burst()/rte_eth_tx_burst() are still active.
That is exactly what the comments for rte_eth_remove_rx_callback() say:

* Note: the callback is removed from the callback list but it isn't freed
 * since the it may still be in use. The memory for the callback can be
 * subsequently freed back by the application by calling rte_free():
 *
 * - Immediately - if the port is stopped, or the user knows that no
 *   callbacks are in flight e.g. if called from the thread doing RX/TX
 *   on that queue.
 *
 * - After a short delay - where the delay is sufficient to allow any
 *   in-flight callbacks to complete.

In other words, right now the only way to know for sure that it is safe
to free the removed callback is to stop the port.

Does it need to be changed, so that when rte_eth_remove_rx_callback() returns
the user can safely free the callback (or even better, rte_eth_remove_rx_callback() frees the callback for us)?
In my opinion - yes.
Though I think it has nothing to do with the pdump patches, and should be a matter
for a separate patch/discussion.

Now, with the introduction of the pdump library, there is a possibility that 2 different threads
can try to add/remove callbacks for the same queue simultaneously:
the first is the thread executing control requests from the local user,
the second is the pdump control thread executing pdump requests from the pdump client.
The lock is introduced to avoid a race condition between these 2 threads,
i.e. to prevent multiple threads from modifying the same list simultaneously.
It is not intended to synchronise read/write accesses to the list, see above.
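
To illustrate the "removed but not freed" rule, here is a minimal single-threaded sketch of why an in-flight reader stays safe as long as the unlinked node is left intact. It assumes C11 atomics for the `next` pointers, which is an assumption of this example and not what ethdev does in 16.04; `rm_node`, `rm_unlink` and `rm_next` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical callback node; next is atomic so readers need no lock. */
struct rm_node {
    _Atomic(struct rm_node *) next;
    int id;
};

/* Writer side: unlink cb from the list. The node itself is left intact
 * (its next pointer is not cleared and it is not freed), so a reader
 * that already holds a pointer to it can finish its traversal. */
int rm_unlink(_Atomic(struct rm_node *) *head, struct rm_node *cb)
{
    assert(head != NULL && cb != NULL);
    for (struct rm_node *cur = atomic_load(head); cur != NULL; ) {
        if (cur == cb) {
            atomic_store(head, atomic_load(&cb->next));
            return 0;
        }
        head = &cur->next;
        cur = atomic_load(head);
    }
    return -1;   /* cb was not on the list */
}

/* Reader side: one step of the burst loop, taken without any lock. */
struct rm_node *rm_next(struct rm_node *cur)
{
    return atomic_load(&cur->next);
}
```

If a reader is paused on node b of the list a, b, c when b is unlinked, `rm_next(b)` still yields c because b was not modified or freed. Freeing b at that moment is exactly the case the rte_eth_remove_rx_callback() comment warns about.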

Konstantin

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15  9:54                 ` Ananyev, Konstantin
@ 2016-06-15 11:17                   ` Thomas Monjalon
  2016-06-15 13:49                   ` Thomas Monjalon
  1 sibling, 0 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-15 11:17 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: Pattan, Reshma, dev

2016-06-15 09:54, Ananyev, Konstantin:
> 
> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > Sent: Wednesday, June 15, 2016 9:49 AM
> > To: Ananyev, Konstantin
> > Cc: Pattan, Reshma; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > 
> > 2016-06-15 08:37, Ananyev, Konstantin:
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > 2016-06-15 05:30, Pattan, Reshma:
> > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > > > 2016-06-14 10:38, Reshma Pattan:
> > > > > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > > > > avoid corruption of callback lists in multithreaded context.
> > > > > > >
> > > > > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > > > > >
> > > > > > Why cb->next is not locked in burst functions?
> > > > > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane)
> > thread.
> > > > >
> > > > > > Just protecting add/remove but not its usage seems useless.
> > > > > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2
> > threads
> > > > > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> > > >
> > > > So read and write can be done by different threads.
> > >
> > > Yes, and this is possible even in current DPDK version (16.04).
> > > What is added by Reshma's patch - now it is possible to have concurrent write
> > > from 2 different thread to that list.
> > >
> > > > I think the read access would need locking but we do not want it
> > > > in fast path.
> > >
> > > I don't think it would be needed.
> > > As I said - read/write interaction didn't change from what we have right now.
> > > But if you have some particular scenario in mind that you believe would cause
> > > a race condition - please speak up.
> > 
> > If we add/remove a callback during a burst? Is it possible that the next
> > pointer would have a wrong value leading to a crash?
> > Maybe we need a comment to state that we should not alter burst
> > callbacks while running burst functions.
> 
> Current status (16.04):
> It is safe to add/remove RX/TX callbacks while 
> another thread is doing simultaneously RX/TX burst over same queue.
> I.E: it is supposed to be safe to invoke
> rte_eth_add(/remove)_rx(/tx)_callback() and rte_eth_rx_burst()/rte_eth_tx_burst()
> from different threads simultaneously.
> Though it is not safe to free/modify that rte_eth_rxtx_callback while current
> rte_eth_rx_burst()/rte_eth_tx_burst() are still active.
> That exactly what comments for rte_eth_remove_rx_callback() say:
> 
> * Note: the callback is removed from the callback list but it isn't freed
>  * since the it may still be in use. The memory for the callback can be
>  * subsequently freed back by the application by calling rte_free():
>  *
>  * - Immediately - if the port is stopped, or the user knows that no
>  *   callbacks are in flight e.g. if called from the thread doing RX/TX
>  *   on that queue.
>  *
>  * - After a short delay - where the delay is sufficient to allow any
>  *   in-flight callbacks to complete.
> 
> In other words, right now there only way to know for sure that it is safe
> to free the removed callback - is to stop the port.
> 
> Does it need to be changed, so when rte_eth_remove_rx_callback() returns
> user can safely free the callback (or even better rte_eth_remove_rx_callback free the callback for us)?
> In my opinion - yes.
> Though, I think, it has nothing to do with pdump patches, and I think should be a matter
> for separate a patch/discussion.
> 
> Now with pdump library introduction - there is possibility that 2 different threads
> can try to  add/remove callbacks for the same queue simultaneously.
> First one - thread executing control requests from local user,
> second  one - pdump control thread executing pdump requests from pdump client.
> That lock is introduced to avoid race condition between such 2 threads:
> i.e. to prevent multiple threads to modify same list simultaneously.   
> It is not intended to synchronise read/write accesses to the list, see above. 

OK thanks for the explanations

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15  8:48               ` Thomas Monjalon
  2016-06-15  9:54                 ` Ananyev, Konstantin
@ 2016-06-15 12:15                 ` Ivan Boule
  2016-06-15 12:40                   ` Ananyev, Konstantin
  1 sibling, 1 reply; 67+ messages in thread
From: Ivan Boule @ 2016-06-15 12:15 UTC (permalink / raw)
  To: Thomas Monjalon, Ananyev, Konstantin; +Cc: Pattan, Reshma, dev

On 06/15/2016 10:48 AM, Thomas Monjalon wrote:

>>
>>> I think the read access would need locking but we do not want it
>>> in fast path.
>>
>> I don't think it would be needed.
>> As I said - read/write interaction didn't change from what we have right now.
>> But if you have some particular scenario in mind that you believe would cause
>> a race condition - please speak up.
>
> If we add/remove a callback during a burst? Is it possible that the next
> pointer would have a wrong value leading to a crash?
> Maybe we need a comment to state that we should not alter burst
> callbacks while running burst functions.
>

Hi Reshma,

You claim that the "rte_eth_rx_cb_lock" does not need to be held in the
function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
of RX callbacks associated with the polled RX queue to safely remove RX
callback(s) in parallel.
The problem is not [only] with the setting and the loading of "cb->next" 
that you assume to be atomic operations, which is certainly true on most 
CPUs.
I see the 2 important following issues:

1) the "rte_eth_rxtx_callback" data structure associated with a removed 
RX callback could still be accessed in the callback parsing loop of the 
function "rte_eth_rx_burst()" after having been freed in parallel.

BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE 
MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that 
does not free the "rte_eth_rxtx_callback" data structure associated with 
the removed callback !

2) As a consequence of 1), RX callbacks can be invoked/executed 
while/after being removed.
If the application must free resources that it dynamically allocated to 
be used by the RX callback being removed, how to guarantee that the last 
invocation of that RX callback has been completed and that such a 
callback will never be invoked again, so that the resources can safely 
be freed?

This is an example of a well-known more generic object deletion problem 
which must arrange to guarantee that a deleted object is not used and 
not accessible for use anymore before being actually deleted (freed, for 
instance).

Note that a lock cannot be used in the execution path of the 
rte_eth_rx_burst() function to address this issue, as locks MUST NEVER 
be introduced in the RX/TX path of the DPDK framework.

Of course, the same issues stand for TX callbacks.

Regards,
Ivan



-- 
Ivan Boule
6WIND Development Engineer

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 12:15                 ` Ivan Boule
@ 2016-06-15 12:40                   ` Ananyev, Konstantin
  2016-06-15 13:29                     ` Bruce Richardson
  0 siblings, 1 reply; 67+ messages in thread
From: Ananyev, Konstantin @ 2016-06-15 12:40 UTC (permalink / raw)
  To: Ivan Boule, Thomas Monjalon; +Cc: Pattan, Reshma, dev

Hi Ivan,

> -----Original Message-----
> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> Sent: Wednesday, June 15, 2016 1:15 PM
> To: Thomas Monjalon; Ananyev, Konstantin
> Cc: Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> 
> >>
> >>> I think the read access would need locking but we do not want it
> >>> in fast path.
> >>
> >> I don't think it would be needed.
> >> As I said - read/write interaction didn't change from what we have right now.
> >> But if you have some particular scenario in mind that you believe would cause
> >> a race condition - please speak up.
> >
> > If we add/remove a callback during a burst? Is it possible that the next
> > pointer would have a wrong value leading to a crash?
> > Maybe we need a comment to state that we should not alter burst
> > callbacks while running burst functions.
> >
> 
> Hi Reshma,
> 
> You claim that the "rte_eth_rx_cb_lock" does not need to be held in the
> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> of RX callbacks associated with the polled RX queue to safely remove RX
> callback(s) in parallel.
> The problem is not [only] with the setting and the loading of "cb->next"
> that you assume to be atomic operations, which is certainly true on most
> CPUs.
> I see the 2 important following issues:
> 
> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> RX callback could still be accessed in the callback parsing loop of the
> function "rte_eth_rx_burst()" after having been freed in parallel.
> 
> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> does not free the "rte_eth_rxtx_callback" data structure associated with
> the removed callback !

Yes, though it is documented behaviour, someone can probably
refer it as a feature, not a bug ;)

> 
> 2) As a consequence of 1), RX callbacks can be invoked/executed
> while/after being removed.
> If the application must free resources that it dynamically allocated to
> be used by the RX callback being removed, how to guarantee that the last
> invocation of that RX callback has been completed and that such a
> callback will never be invoked again, so that the resources can safely
> be freed?
> 
> This is an example of a well-known more generic object deletion problem
> which must arrange to guarantee that a deleted object is not used and
> not accessible for use anymore before being actually deleted (freed, for
> instance).

Yes, and as I wrote in other mail, IMO it needs to be addressed.
But again it is already existing problem in rte_ethdev,
and I think it shouldn't stop pdump integration.
Konstantin

> 
> Note that a lock cannot be used in the execution path of the
> rte_eth_rx_burst() function to address this issue, as locks MUST NEVER
> be introduced in the RX/TX path of the DPDK framework.
> 
> Of course, the same issues stand for TX callbacks.
> 
> Regards,
> Ivan
> 
> 
> 
> --
> Ivan Boule
> 6WIND Development Engineer

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 12:40                   ` Ananyev, Konstantin
@ 2016-06-15 13:29                     ` Bruce Richardson
  2016-06-15 14:07                       ` Ivan Boule
  0 siblings, 1 reply; 67+ messages in thread
From: Bruce Richardson @ 2016-06-15 13:29 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: Ivan Boule, Thomas Monjalon, Pattan, Reshma, dev

On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> Hi Ivan,
> 
> > -----Original Message-----
> > From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > Sent: Wednesday, June 15, 2016 1:15 PM
> > To: Thomas Monjalon; Ananyev, Konstantin
> > Cc: Pattan, Reshma; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > 
> > On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> > 
> > >>
> > >>> I think the read access would need locking but we do not want it
> > >>> in fast path.
> > >>
> > >> I don't think it would be needed.
> > >> As I said - read/write interaction didn't change from what we have right now.
> > >> But if you have some particular scenario in mind that you believe would cause
> > >> a race condition - please speak up.
> > >
> > > If we add/remove a callback during a burst? Is it possible that the next
> > > pointer would have a wrong value leading to a crash?
> > > Maybe we need a comment to state that we should not alter burst
> > > callbacks while running burst functions.
> > >
> > 
> > Hi Reshma,
> > 
> > You claim that the "rte_eth_rx_cb_lock" does not need to be held in the
> > function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> > of RX callbacks associated with the polled RX queue to safely remove RX
> > callback(s) in parallel.
> > The problem is not [only] with the setting and the loading of "cb->next"
> > that you assume to be atomic operations, which is certainly true on most
> > CPUs.
> > I see the 2 important following issues:
> > 
> > 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> > RX callback could still be accessed in the callback parsing loop of the
> > function "rte_eth_rx_burst()" after having been freed in parallel.
> > 
> > BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> > MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> > does not free the "rte_eth_rxtx_callback" data structure associated with
> > the removed callback !
> 
> Yes, though it is documented behaviour, someone can probably
> refer it as a feature, not a bug ;)
> 

+1
This is definitely not a bug, this is absolutely by design. One may argue with
the design, but it was done for a definite reason, so as to avoid paying the
penalty of having locks. It pushes more responsibility onto the app, but it
does allow the app to choose the best solution for managing the freeing of
memory for its situation. The alternative is to force all apps to pay the cost
of having locks, even if better options for freeing the memory are available.

/Bruce
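
As an illustration of "the app chooses how to free", here is a minimal sketch of one possible scheme. It is not something ethdev provides. All names (cb_node, queue_state, poll_once, retire_begin, retire_done, count_calls) are hypothetical. It assumes exactly one polling thread per queue, assumes that thread keeps polling, and uses sequentially consistent C11 atomics for simplicity. The polling thread bumps a counter after every burst; the control thread unlinks the callback, snapshots the counter, and frees the node only after the counter has advanced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rte_eth_rxtx_callback. */
struct cb_node {
	void (*fn)(void *param);
	void *param;
	struct cb_node *_Atomic next;
};

/* Hypothetical per-queue state owned by the application, not by ethdev. */
struct queue_state {
	struct cb_node *_Atomic head;     /* callback list for this queue */
	atomic_uint_fast64_t bursts_done; /* written only by the polling thread */
};

/* Example callback: counts invocations through its parameter. */
static void
count_calls(void *param)
{
	++*(int *)param;
}

/* Polling thread: walk the list without locks, then mark the burst done. */
static void
poll_once(struct queue_state *q)
{
	struct cb_node *cb = atomic_load(&q->head);

	while (cb != NULL) {
		cb->fn(cb->param);
		cb = atomic_load(&cb->next);
	}
	atomic_fetch_add(&q->bursts_done, 1);
}

/* Control thread: call right after unlinking a node and keep the result. */
static uint64_t
retire_begin(struct queue_state *q)
{
	return atomic_load(&q->bursts_done);
}

/* Control thread: true once the burst in flight at unlink time has ended. */
static bool
retire_done(struct queue_state *q, uint64_t snapshot)
{
	return atomic_load(&q->bursts_done) > snapshot;
}
```

The intended sequence on the control thread is: unlink the node, call retire_begin(), wait until retire_done() returns true, then free the node and anything its param points to. If the queue is not being polled the counter never advances, which corresponds to the "port is stopped" case where the node can be freed immediately.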

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15  9:54                 ` Ananyev, Konstantin
  2016-06-15 11:17                   ` Thomas Monjalon
@ 2016-06-15 13:49                   ` Thomas Monjalon
  1 sibling, 0 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-15 13:49 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: Pattan, Reshma, dev

I agree this patch does not bring a new issue.
But the current status deserves to be discussed.

2016-06-15 09:54, Ananyev, Konstantin:
> It is safe to add/remove RX/TX callbacks while 
> another thread is doing simultaneously RX/TX burst over same queue.

You are probably right, but I don't see why it is safe. On which CPU?
How can we be sure that read and write of the "next" pointer are atomic?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v10 0/7] add packet capture framework
  2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
                       ` (7 preceding siblings ...)
  2016-06-14  9:38     ` [PATCH v9 8/8] doc: update doc for packet capture framework Reshma Pattan
@ 2016-06-15 14:06     ` Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 1/7] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
                         ` (7 more replies)
  8 siblings, 8 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev

This patch set includes the changes below:

1) Changes to librte_ether.
2) A new library, librte_pdump, added for the packet capture framework.
3) A new app/pdump tool added for packet capturing.
4) Test-pmd changes to initialize the packet capture framework.
5) Documentation update.

1)librte_pdump
==============
To support packet capturing on DPDK Ethernet devices, a new library, librte_pdump,
is added. Users can develop their own packet capturing applications using the new library APIs.

Operation:
----------
The librte_pdump library provides APIs to support packet capturing on DPDK Ethernet devices.
The library provides APIs to initialize the packet capture framework, enable/disable
packet capture, and uninitialize the packet capture framework.

The librte_pdump library works on a client/server model. The server is responsible for enabling or
disabling the packet capture and the clients are responsible for requesting the enabling or disabling of
the packet capture.

The packet capture framework, as part of its initialization, creates a pthread and creates the server
socket in that pthread. The application that calls the framework initialization will have the server socket
created either under the path that the application has passed or under the default path, i.e. ''/var/run''
for the root user or ''$HOME'' for a non-root user.

Applications that request enabling or disabling of the packet capture will have the client socket created
either under the path that the application has passed or under the default path, i.e. ''/var/run/''
for the root user or ''$HOME'' for a non-root user, to send the requests to the server.
The server socket will listen for client requests for enabling or disabling the packet capture.

Applications using the APIs below need to pass port/device_id, queue, mempool and
ring parameters. The library uses the user-provided ring and mempool to mirror the rx/tx
packets of the port for users. Users need to dequeue the rings and write the packets
to a vdev (pcap/tuntap) to view the packets using any standard tools.

Note:
The mempool and ring should be multi-consumer/multi-producer (mc/mp) capable.
The mempool mbuf size should be big enough to handle the rx/tx packets of a port.

APIs:
-----
rte_pdump_init()
rte_pdump_enable()
rte_pdump_enable_by_deviceid()
rte_pdump_disable()
rte_pdump_disable_by_deviceid()
rte_pdump_uninit()
rte_pdump_set_socket_dir()

2)app/pdump tool
================
The app/pdump tool is built on librte_pdump for packet capturing in DPDK.
By default the tool runs as a secondary process and provides
command line options for packet capture.

./build/app/dpdk_pdump --
                       --pdump '(port=<port id> | device_id=<pci id or vdev name>),
                                (queue=<queue id>),
                                (rx-dev=<iface or pcap file> |
                                 tx-dev=<iface or pcap file>),
                                [ring-size=<ring size>],
                                [mbuf-size=<mbuf data size>],
                                [total-num-mbufs=<number of mbufs>]'

Parameters inside parentheses are mandatory.
Parameters inside square brackets are optional.
The user has to pass the packet capture parameters under the --pdump parameter; multiple
--pdump parameters can be passed to capture packets on different port and queue combinations.
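
The option syntax above can be illustrated with a small standalone parser. This is only a sketch and is not taken from app/pdump/main.c: the struct, the function names and the ALL_QUEUES marker are hypothetical, and only a subset of the keys (port, queue, rx-dev, tx-dev, ring-size) is handled. device_id, mbuf-size and total-num-mbufs are left out for brevity, and numeric values are not validated.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ALL_QUEUES (-1) /* hypothetical marker for queue=* */
#define NO_QUEUE   (-2) /* hypothetical marker for "queue not given" */

struct pdump_opts {
	int port;                /* -1 if not given */
	int queue;               /* ALL_QUEUES for '*', NO_QUEUE if not given */
	char rx_dev[64];
	char tx_dev[64];
	unsigned long ring_size; /* 0 means "use the tool's default" */
};

/* Handle one key/value pair; returns 0 on success, -1 on an unknown key. */
static int
parse_pair(struct pdump_opts *o, const char *key, const char *val)
{
	if (strcmp(key, "port") == 0)
		o->port = atoi(val);
	else if (strcmp(key, "queue") == 0)
		o->queue = strcmp(val, "*") == 0 ? ALL_QUEUES : atoi(val);
	else if (strcmp(key, "rx-dev") == 0)
		snprintf(o->rx_dev, sizeof(o->rx_dev), "%s", val);
	else if (strcmp(key, "tx-dev") == 0)
		snprintf(o->tx_dev, sizeof(o->tx_dev), "%s", val);
	else if (strcmp(key, "ring-size") == 0)
		o->ring_size = strtoul(val, NULL, 10);
	else
		return -1; /* key not covered by this sketch */
	return 0;
}

/* Parse "k=v,k=v,..."; returns 0 if the mandatory parameters are present. */
static int
parse_pdump_arg(const char *arg, struct pdump_opts *o)
{
	char buf[256];
	char *tok, *next, *eq;

	memset(o, 0, sizeof(*o));
	o->port = -1;
	o->queue = NO_QUEUE;
	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);

	for (tok = buf; tok != NULL; tok = next) {
		next = strchr(tok, ',');
		if (next != NULL)
			*next++ = '\0'; /* terminate this token, step past ',' */
		eq = strchr(tok, '=');
		if (eq == NULL)
			return -1;
		*eq = '\0';
		if (parse_pair(o, tok, eq + 1) != 0)
			return -1;
	}
	/* port and queue are mandatory, as is at least one of rx-dev/tx-dev. */
	if (o->port < 0 || o->queue == NO_QUEUE)
		return -1;
	if (o->rx_dev[0] == '\0' && o->tx_dev[0] == '\0')
		return -1;
	return 0;
}
```

With this sketch, "port=0,queue=*,rx-dev=/tmp/rx.pcap" is accepted, while a string without rx-dev or tx-dev, or without a port, is rejected.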

Operation:
----------
* The tool parses the user command line arguments and
creates the mempool, the ring and the PCAP PMD vdev with 'tx_stream' set to
the device passed in the rx-dev|tx-dev parameters.

* It then calls the librte_pdump APIs, i.e. rte_pdump_enable()/rte_pdump_enable_by_deviceid(),
to enable packet capturing on a specific port/device_id and queue by passing on
the port|device_id, queue, mempool and ring info.

* The tool runs in a while loop to dequeue the packets from the ring and write them to the pcap device.

* The tool can be stopped using SIGINT, upon which it calls
rte_pdump_disable()/rte_pdump_disable_by_deviceid() and frees the allocated resources.

Note:
CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run the pdump tool.

3)Test-pmd changes
==================
Changes are made to the test-pmd application to initialize/uninitialize the packet capture framework,
so the app/pdump tool can be run to see packets of the DPDK ports that are used by test-pmd.

Similarly, any application which needs packet capture should call the initialize/uninitialize APIs of
librte_pdump and use the pdump tool to start the capture.

4)Packet capture flow between pdump tool and librte_pdump
=========================================================
* The pdump tool (secondary process) requests packet capture
for specific port|device_id and queue combinations.

* The library, in the secondary process context, creates the client socket and communicates
the port|device_id, queue, ring and mempool to the server.

* The library initializes the server in the primary process ('test-pmd') context, and the server serves
the client request to enable Ethernet rxtx callbacks for a given port|device_id and queue.

* The callbacks copy the rx/tx packets to the passed mempool and enqueue the copies to the ring for the secondary process.

* The pdump tool dequeues the packets from the ring and writes them to the PCAP PMD vdev,
so ultimately the packets will be seen on the device that is passed in rx-dev|tx-dev.

* Once the pdump tool is terminated with SIGINT, it disables the packet capturing.

* The library receives the disable packet capture request and communicates the info to the server;
the server then removes the Ethernet rxtx callbacks.

* The packet capture can be seen using the tcpdump command
"tcpdump -ni <iface>" (or) "tcpdump -nr <pcapfile>"

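The copy-and-enqueue step of the flow above can be pictured with a toy model. This is a sketch only: it does not use rte_ring, rte_mempool or rte_mbuf, it is single-threaded, and all names and sizes (capture_ring, capture_burst, capture_dequeue, RING_SLOTS, SLOT_BYTES) are hypothetical. It also assumes that copies which do not fit are dropped and counted rather than blocking the datapath.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4  /* hypothetical ring capacity, power of two */
#define SLOT_BYTES 64 /* hypothetical per-slot buffer size */

struct pkt {
	uint16_t len;
	uint8_t data[SLOT_BYTES];
};

/* Toy single-threaded ring; rte_ring and rte_mempool are not modelled. */
struct capture_ring {
	struct pkt slot[RING_SLOTS];
	unsigned int head;    /* next slot to write */
	unsigned int tail;    /* next slot to read */
	unsigned int dropped; /* copies that could not be stored */
};

/* Callback-side step: copy each packet of a burst, leave the original alone. */
static unsigned int
capture_burst(struct capture_ring *r, const struct pkt *burst, unsigned int n)
{
	unsigned int i, copied = 0;

	for (i = 0; i < n; i++) {
		if (r->head - r->tail == RING_SLOTS || burst[i].len > SLOT_BYTES) {
			r->dropped++; /* capture must not stall the datapath */
			continue;
		}
		struct pkt *dst = &r->slot[r->head % RING_SLOTS];
		dst->len = burst[i].len;
		memcpy(dst->data, burst[i].data, burst[i].len);
		r->head++;
		copied++;
	}
	return copied;
}

/* Tool-side step: dequeue one copy; returns 0 when the ring is empty. */
static int
capture_dequeue(struct capture_ring *r, struct pkt *out)
{
	if (r->head == r->tail)
		return 0;
	*out = r->slot[r->tail % RING_SLOTS];
	r->tail++;
	return 1;
}
```

In the real framework the two sides run in different processes and share the ring and mempool; the model only shows why the ring and mbuf sizes given on the command line bound how much traffic can be captured without drops.
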
5)Example command line
======================
./build/app/dpdk_pdump -- --pdump 'device_id=0000:02:0.0,queue=*,tx-dev=/tmp/dt-file.pcap,rx-dev=/tmp/dr-file.pcap,ring-size=8192,mbuf-size=2176,total-num-mbufs=32768' --pdump 'device_id=0000:01:00.0,queue=*,rx-dev=/tmp/d-file.pcap,tx-dev=/tmp/d-file.pcap,ring-size=16384,mbuf-size=2176,total-num-mbufs=32768'

v10:
fixed commit messages description.
fixed compilation issue when CONFIG_RTE_LIBRTE_PDUMP is disabled.
removed wrong config option CONFIG_RTE_EXEC_ENV_LINUXAPP inside app/Makefile
for pdump tool.
moved document changes to appropriate patches.

v9:
added a support in rte_pdump_set_socket_dir() to set server and client socket paths
==> http://dpdk.org/dev/patchwork/patch/13450/
updated the documentation for the new changes.
updated the commit messages.

v8:
added server socket argument to rte_pdump_init() API ==> http://dpdk.org/dev/patchwork/patch/13402/
added rte_pdump_set_socket_dir() API.
updated documentation for new changes.

v7:
fixed lines over 90 characters.

v6:
removed below deprecation notice patch from patch set.
http://dpdk.org/dev/patchwork/patch/13372/

v5:
addressed code review comments for below patches
http://dpdk.org/dev/patchwork/patch/12955/
http://dpdk.org/dev/patchwork/patch/12951/

v4:
added missing deprecation notice for ABI changes of rte_eth_dev_info structure.
made doc changes as per doc guidelines.
replaced rte_eal_vdev_init with rte_eth_dev_attach in pdump tool.
removed rxtx-dev parameter from pdump tool command line.

v3:
app/pdump: Moved cleanup code from signal handler to main.
divided librte_ether changes into multiple patches.
example command changed in app/pdump application guide

v2:
fix compilation issues for 4.8.3
fix unnecessary #includes


Reshma Pattan (7):
  ethdev: use locks to protect Rx/Tx callback lists
  ethdev: add new api to add Rx callback as head of the list
  ethdev: add new fields to ethdev info struct
  ethdev: make get port by name and get name by port public
  pdump: add new library for packet capturing support
  app/pdump: add pdump tool for packet capturing
  app/testpmd: add pdump initialization uninitialization

 MAINTAINERS                             |   7 +
 app/Makefile                            |   1 +
 app/pdump/Makefile                      |  49 ++
 app/pdump/main.c                        | 844 +++++++++++++++++++++++++++++
 app/test-pmd/testpmd.c                  |  12 +
 config/common_base                      |   5 +
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/pdump_library.rst | 123 +++++
 doc/guides/rel_notes/release_16_07.rst  |  14 +
 doc/guides/sample_app_ug/index.rst      |   1 +
 doc/guides/sample_app_ug/pdump.rst      | 122 +++++
 lib/Makefile                            |   1 +
 lib/librte_ether/rte_ethdev.c           | 123 +++--
 lib/librte_ether/rte_ethdev.h           |  60 +++
 lib/librte_ether/rte_ether_version.map  |   9 +
 lib/librte_pdump/Makefile               |  55 ++
 lib/librte_pdump/rte_pdump.c            | 913 ++++++++++++++++++++++++++++++++
 lib/librte_pdump/rte_pdump.h            | 216 ++++++++
 lib/librte_pdump/rte_pdump_version.map  |  13 +
 mk/rte.app.mk                           |   1 +
 20 files changed, 2526 insertions(+), 44 deletions(-)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 doc/guides/sample_app_ug/pdump.rst
 create mode 100644 lib/librte_pdump/Makefile
 create mode 100644 lib/librte_pdump/rte_pdump.c
 create mode 100644 lib/librte_pdump/rte_pdump.h
 create mode 100644 lib/librte_pdump/rte_pdump_version.map

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
-- 
2.5.0

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v10 1/7] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
@ 2016-06-15 14:06       ` Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 2/7] ethdev: add new api to add Rx callback as head of the list Reshma Pattan
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added spinlocks around add/remove logic of Rx and Tx callbacks
to avoid corruption of callback lists in multithreaded context.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 82 +++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 42 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..ce70d58 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -77,6 +77,12 @@ static uint8_t nb_ports;
 /* spinlock for eth device callbacks */
 static rte_spinlock_t rte_eth_dev_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for add/remove rx callbacks */
+static rte_spinlock_t rte_eth_rx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* spinlock for add/remove tx callbacks */
+static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -1634,7 +1640,6 @@ rte_eth_dev_set_rx_queue_stats_mapping(uint8_t port_id, uint16_t rx_queue_id,
 			STAT_QMAP_RX);
 }
 
-
 void
 rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 {
@@ -2905,7 +2910,6 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_errno = EINVAL;
 		return NULL;
 	}
-
 	struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
 
 	if (cb == NULL) {
@@ -2916,6 +2920,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.rx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
@@ -2928,6 +2933,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
 	return cb;
 }
@@ -2957,6 +2963,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.tx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id];
@@ -2969,6 +2976,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
 	return cb;
 }
@@ -2987,29 +2995,24 @@ rte_eth_remove_rx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->post_rx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+	int ret = -EINVAL;
+
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
+	prev_cb = &dev->post_rx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int
@@ -3026,29 +3029,24 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->pre_tx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->pre_tx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	int ret = -EINVAL;
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
+	prev_cb = &dev->pre_tx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int
-- 
2.5.0
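
For readers following the removal logic in this patch, the pointer-to-pointer walk can be tried outside DPDK with the sketch below. It is a simplified model, not the ethdev code: cb_node, list_add_tail, list_remove and the atomic_flag spinlock are hypothetical stand-ins for struct rte_eth_rxtx_callback, the add/remove functions and rte_spinlock_t, and the error value is -1 instead of -EINVAL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_eth_rxtx_callback. */
struct cb_node {
	int id;
	struct cb_node *next;
};

/* Minimal test-and-set spinlock standing in for rte_spinlock_t. */
static atomic_flag list_lock = ATOMIC_FLAG_INIT;

static void
cb_list_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&list_lock,
			memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void
cb_list_unlock(void)
{
	atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/* Append at the tail, so callbacks keep FIFO order. */
static void
list_add_tail(struct cb_node **head, struct cb_node *cb)
{
	struct cb_node **link;

	cb->next = NULL;
	cb_list_lock();
	for (link = head; *link != NULL; link = &(*link)->next)
		; /* advance to the NULL link at the end of the list */
	*link = cb;
	cb_list_unlock();
}

/* Unlink cb; returns 0 on success, -1 if cb is not on the list. */
static int
list_remove(struct cb_node **head, struct cb_node *cb)
{
	struct cb_node **link;
	int ret = -1;

	cb_list_lock();
	for (link = head; *link != NULL; link = &(*link)->next) {
		if (*link == cb) {
			/* Same statement handles head and interior nodes. */
			*link = cb->next;
			ret = 0;
			break;
		}
	}
	cb_list_unlock();
	return ret;
}
```

Because the walk tracks the address of the link rather than the previous node, the special case for "first in the list" in the old code disappears. The lock only serializes writers; it does nothing for a thread that is reading the list in a burst function.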

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v10 2/7] ethdev: add new api to add Rx callback as head of the list
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 1/7] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
@ 2016-06-15 14:06       ` Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 3/7] ethdev: add new fields to ethdev info struct Reshma Pattan
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Added new public api rte_eth_add_first_rx_callback to add given
callback as head of the list.

The librte_pdump library should display Rx packets of the
NIC even before they are being processed by other callbacks
of the application (because other callbacks of the application
may change the packet data as part of the processing).
So packet capturing framework should register a callback at the
head of the Rx callback list so that callback always gets called
first before any other callbacks of the applications. Hence this API
is introduced.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 35 ++++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 28 +++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |  6 ++++++
 3 files changed, 69 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ce70d58..97d167e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2939,6 +2939,41 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 }
 
 void *
+rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+		rte_rx_callback_fn fn, void *user_param)
+{
+#ifndef RTE_ETHDEV_RXTX_CALLBACKS
+	rte_errno = ENOTSUP;
+	return NULL;
+#endif
+	/* check input parameters */
+	if (!rte_eth_dev_is_valid_port(port_id) || fn == NULL ||
+		queue_id >= rte_eth_devices[port_id].data->nb_rx_queues) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
+
+	if (cb == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	cb->fn.rx = fn;
+	cb->param = user_param;
+
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
+	/* Add the callback at first position */
+	cb->next = rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
+	rte_smp_wmb();
+	rte_eth_devices[port_id].post_rx_burst_cbs[queue_id] = cb;
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
+
+	return cb;
+}
+
+void *
 rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_tx_callback_fn fn, void *user_param)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..237e6ef 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3825,6 +3825,34 @@ int rte_eth_dev_get_dcb_info(uint8_t port_id,
 void *rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_rx_callback_fn fn, void *user_param);
 
+/*
+* Add a callback that must be called first on packet RX on a given port
+* and queue.
+*
+* This API configures a first function to be called for each burst of
+* packets received on a given NIC port queue. The return value is a pointer
+* that can be used to later remove the callback using
+* rte_eth_remove_rx_callback().
+*
+* Multiple functions are called in the order that they are added.
+*
+* @param port_id
+*   The port identifier of the Ethernet device.
+* @param queue_id
+*   The queue on the Ethernet device on which the callback is to be added.
+* @param fn
+*   The callback function
+* @param user_param
+*   A generic pointer parameter which will be passed to each invocation of the
+*   callback function on this port and queue.
+*
+* @return
+*   NULL on error.
+*   On success, a pointer value which can later be used to remove the callback.
+*/
+void *rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+		rte_rx_callback_fn fn, void *user_param);
+
 /**
  * Add a callback to be called on packet TX on a given port and queue.
  *
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 214ecc7..c990b04 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -132,3 +132,9 @@ DPDK_16.04 {
 	rte_eth_tx_buffer_set_err_callback;
 
 } DPDK_2.2;
+
+DPDK_16.07 {
+	global:
+
+	rte_eth_add_first_rx_callback;
+} DPDK_16.04;
-- 
2.5.0
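
The ordering used when inserting at the head (fill in cb->next, write barrier, then update the head pointer) can be expressed with C11 atomics as in the sketch below. This is an analogy rather than the ethdev implementation: cb_node, rx_head, add_first and first_id are hypothetical names, the release store plays the role that rte_smp_wmb() plays in the patch, and writers are assumed to be serialized by a lock that is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_eth_rxtx_callback. */
struct cb_node {
	int id;
	struct cb_node *next;
};

/* Head pointer that a lock-free polling thread would read. */
static struct cb_node *_Atomic rx_head;

/* Writer side; assumed to be serialized by a lock that is not shown here. */
static void
add_first(struct cb_node *cb)
{
	/* 1. Fully initialise the node while it is still private. */
	cb->next = atomic_load_explicit(&rx_head, memory_order_relaxed);
	/* 2. Publish it; the release store orders step 1 before step 2. */
	atomic_store_explicit(&rx_head, cb, memory_order_release);
}

/* Reader side: id of the callback that would run first, or -1 if none. */
static int
first_id(void)
{
	struct cb_node *cb =
		atomic_load_explicit(&rx_head, memory_order_acquire);

	return cb != NULL ? cb->id : -1;
}
```

A reader that observes the new head is then guaranteed to also observe its next pointer, so it either sees the list without the new node or the complete list with it. The callback registered last with add_first() is the one that runs first.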

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v10 3/7] ethdev: add new fields to ethdev info struct
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 1/7] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 2/7] ethdev: add new api to add Rx callback as head of the list Reshma Pattan
@ 2016-06-15 14:06       ` Reshma Pattan
  2016-06-16 19:14         ` Thomas Monjalon
  2016-06-15 14:06       ` [PATCH v10 4/7] ethdev: make get port by name and get name by port public Reshma Pattan
                         ` (4 subsequent siblings)
  7 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

The new fields nb_rx_queues and nb_tx_queues are added to the
rte_eth_dev_info structure.
Changes to API rte_eth_dev_info_get() are done to update these new fields
to the rte_eth_dev_info object.
Release notes are updated with the changes.

The librte_pdump library needs to register Rx and Tx callbacks for all
the nb_rx_queues and nb_tx_queues, when application wants to capture the
packets on all the software configured number of Rx and Tx queues of the
device. So far there is no support to get nb_rx_queues and nb_tx_queues
information from the ethdev library. Hence these changes are introduced.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 doc/guides/rel_notes/release_16_07.rst | 6 ++++++
 lib/librte_ether/rte_ethdev.c          | 2 ++
 lib/librte_ether/rte_ethdev.h          | 3 +++
 lib/librte_ether/rte_ether_version.map | 1 +
 4 files changed, 12 insertions(+)

diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index c0f6b02..004ecee 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -135,6 +135,9 @@ API Changes
   ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff.
 
+* Function ``rte_eth_dev_info_get`` updated to return new fields ``nb_rx_queues`` and ``nb_tx_queues``
+  in the ``rte_eth_dev_info`` object.
+
 
 ABI Changes
 -----------
@@ -146,6 +149,9 @@ ABI Changes
 * The ``rte_port_source_params`` structure has new fields to support PCAP file.
   It was already in release 16.04 with ``RTE_NEXT_ABI`` flag.
 
+* The ``rte_eth_dev_info`` structure has new fields ``nb_rx_queues`` and ``nb_tx_queues``
+  to support number of queues configured by software.
+
 
 Shared Library Versions
 -----------------------
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 97d167e..1f634c9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1661,6 +1661,8 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 	(*dev->dev_ops->dev_infos_get)(dev, dev_info);
 	dev_info->pci_dev = dev->pci_dev;
 	dev_info->driver_name = dev->data->drv_name;
+	dev_info->nb_rx_queues = dev->data->nb_rx_queues;
+	dev_info->nb_tx_queues = dev->data->nb_tx_queues;
 }
 
 int
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 237e6ef..8ad7c01 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -882,6 +882,9 @@ struct rte_eth_dev_info {
 	struct rte_eth_desc_lim rx_desc_lim;  /**< RX descriptors limits */
 	struct rte_eth_desc_lim tx_desc_lim;  /**< TX descriptors limits */
 	uint32_t speed_capa;  /**< Supported speeds bitmap (ETH_LINK_SPEED_). */
+	/** Configured number of rx/tx queues */
+	uint16_t nb_rx_queues; /**< Number of RX queues. */
+	uint16_t nb_tx_queues; /**< Number of TX queues. */
 };
 
 /**
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index c990b04..d06d648 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,4 +137,5 @@ DPDK_16.07 {
 	global:
 
 	rte_eth_add_first_rx_callback;
+	rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0


* [PATCH v10 4/7] ethdev: make get port by name and get name by port public
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
                         ` (2 preceding siblings ...)
  2016-06-15 14:06       ` [PATCH v10 3/7] ethdev: add new fields to ethdev info struct Reshma Pattan
@ 2016-06-15 14:06       ` Reshma Pattan
  2016-06-16 20:27         ` Thomas Monjalon
  2016-06-15 14:06       ` [PATCH v10 5/7] pdump: add new library for packet capturing support Reshma Pattan
                         ` (3 subsequent siblings)
  7 siblings, 1 reply; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Converted rte_eth_dev_get_port_by_name to a public API.
Converted rte_eth_dev_get_name_by_port to a public API.
Updated the release notes with the changes.

The librte_pdump library provides the APIs to enable or disable the
packet capture either using the port id or pci address or device name.
So the pdump library needs to map from name to port and port to name
internally to validate the device name and register the Rx and Tx
callbacks for the mapped ports. So these APIs are made public for the
pdump library for doing the mentioned mappings.
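
As a rough sketch of the two mappings, assuming a small fixed table of
port names: the table and the sketch_* helpers below are hypothetical
and do not reproduce the ethdev implementation. The return codes follow
the ones documented in this patch (0, -ENODEV, -EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NAME_LEN 64

/* Hypothetical port table; the real ethdev layer keeps its own. */
static const char sketch_port_names[SKETCH_MAX_PORTS][SKETCH_NAME_LEN] = {
	"0000:02:00.0",
	"0000:02:00.1",
	"eth_pcap0",
	"",		/* unused slot */
};

/* Map a device name to a port id: 0 on success, -EINVAL for a NULL
 * argument, -ENODEV when no port carries that name. */
static int
sketch_port_by_name(const char *name, uint8_t *port_id)
{
	uint8_t i;

	if (name == NULL || port_id == NULL)
		return -EINVAL;

	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (sketch_port_names[i][0] != '\0' &&
				strcmp(name, sketch_port_names[i]) == 0) {
			*port_id = i;
			return 0;
		}
	}
	return -ENODEV;
}

/* Map a port id back to its name: 0 on success, -EINVAL otherwise.
 * The caller supplies a buffer of at least SKETCH_NAME_LEN bytes. */
static int
sketch_name_by_port(uint8_t port_id, char *name)
{
	if (name == NULL || port_id >= SKETCH_MAX_PORTS ||
			sketch_port_names[port_id][0] == '\0')
		return -EINVAL;

	/* table entries always fit in the caller's buffer */
	assert(strlen(sketch_port_names[port_id]) < SKETCH_NAME_LEN);
	strcpy(name, sketch_port_names[port_id]);
	return 0;
}
```

A caller would validate a user-supplied device id with the first helper
and use the second one when it only has a port id.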

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 doc/guides/rel_notes/release_16_07.rst |  3 +++
 lib/librte_ether/rte_ethdev.c          |  4 ++--
 lib/librte_ether/rte_ethdev.h          | 29 +++++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |  2 ++
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index 004ecee..c6222f8 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -138,6 +138,9 @@ API Changes
 * Function ``rte_eth_dev_info_get`` updated to return new fields ``nb_rx_queues`` and ``nb_tx_queues``
   in the ``rte_eth_dev_info`` object.
 
+* Functions ``rte_eth_dev_get_port_by_name`` and ``rte_eth_dev_get_name_by_port``
+  are changed to public APIs.
+
 
 ABI Changes
 -----------
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 1f634c9..0b19569 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -406,7 +406,7 @@ rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
 	return 0;
 }
 
-static int
+int
 rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 {
 	char *tmp;
@@ -425,7 +425,7 @@ rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 	return 0;
 }
 
-static int
+int
 rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id)
 {
 	int i;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 8ad7c01..fab281e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4284,6 +4284,35 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
 				  uint32_t mask,
 				  uint8_t en);
 
+/**
+* Get the port id from pci address or device name
+* Ex: 0000:2:00.0 or vdev name eth_pcap0
+*
+* @param name
+*  pci address or name of the device
+* @param port_id
+*   pointer to port identifier of the device
+* @return
+*   - (0) if successful.
+*   - (-ENODEV or -EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id);
+
+/**
+* Get the device name from port id
+*
+* @param port_id
+*   port identifier of the device
+* @param name
+*  buffer to hold the pci address or name of the device
+* @return
+*   - (0) if successful.
+*   - (-EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index d06d648..73e730d 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,5 +137,7 @@ DPDK_16.07 {
 	global:
 
 	rte_eth_add_first_rx_callback;
+	rte_eth_dev_get_name_by_port;
+	rte_eth_dev_get_port_by_name;
 	rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0


* [PATCH v10 5/7] pdump: add new library for packet capturing support
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
                         ` (3 preceding siblings ...)
  2016-06-15 14:06       ` [PATCH v10 4/7] ethdev: make get port by name and get name by port public Reshma Pattan
@ 2016-06-15 14:06       ` Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 6/7] app/pdump: add pdump tool for packet capturing Reshma Pattan
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

The librte_pdump library provides a framework for
packet capturing in dpdk. The library provides a set of
APIs to initialize the packet capture framework, to
enable or disable the packet capture, and to uninitialize
it.

The librte_pdump library works on a client/server model.
The server is responsible for enabling or disabling the
packet capture and the clients are responsible
for requesting the enabling or disabling of the packet
capture.

Enabling APIs are supported with port, queue, ring and
mempool parameters. Applications should pass on this information
to get the packets from the dpdk ports.

For enabling requests from applications, library creates the client
request containing the mempool, ring, port and queue information and
sends the request to the server. After receiving the request, server
registers the Rx and Tx callbacks for the given port and queues.
After the callbacks registration, registered callbacks will get the
Rx and Tx packets. Packets then will be copied to the new mbufs that
are allocated from the user passed mempool. These new mbufs then will
be enqueued to the application passed ring. Applications need to dequeue
the mbufs from the rings and direct them to the devices like
pcap vdev for viewing the packets outside of the dpdk
using the packet capture tools.
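
The copy-and-enqueue step described above can be modelled in plain C as
follows. This is only a sketch under stated assumptions: struct
sketch_pkt and struct sketch_ring are hypothetical stand-ins for
rte_mbuf and rte_ring, and malloc() stands in for allocation from the
user passed mempool. Copies that do not fit in the ring are freed, and
the original packets are never modified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_RING_SIZE 4
#define SKETCH_MAX_BURST 32

/* Hypothetical packet and ring types standing in for rte_mbuf and
 * rte_ring; they only model what the description above needs. */
struct sketch_pkt {
	size_t len;
	unsigned char data[64];
};

struct sketch_ring {
	struct sketch_pkt *slots[SKETCH_RING_SIZE];
	unsigned int count;
};

/* Enqueue up to n packets; returns how many fit. */
static unsigned int
sketch_ring_enqueue_burst(struct sketch_ring *r, struct sketch_pkt **pkts,
		unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n && r->count < SKETCH_RING_SIZE; i++)
		r->slots[r->count++] = pkts[i];
	return i;
}

/* Copy each original packet into a freshly allocated buffer, offer the
 * copies to the ring, and free whatever the ring could not take.
 * Returns the number of copies queued. */
static unsigned int
sketch_mirror(struct sketch_pkt **pkts, unsigned int nb_pkts,
		struct sketch_ring *ring)
{
	struct sketch_pkt *dup[SKETCH_MAX_BURST];
	unsigned int i, nb_dup = 0, queued;

	assert(pkts != NULL && ring != NULL && nb_pkts <= SKETCH_MAX_BURST);

	for (i = 0; i < nb_pkts; i++) {
		struct sketch_pkt *p = malloc(sizeof(*p));

		if (p == NULL)
			continue;	/* skip this packet, keep going */
		memcpy(p, pkts[i], sizeof(*p));
		dup[nb_dup++] = p;
	}

	queued = sketch_ring_enqueue_burst(ring, dup, nb_dup);
	for (i = queued; i < nb_dup; i++)
		free(dup[i]);
	return queued;
}
```

The consumer side would dequeue the copies from the ring and free them
once they have been written out.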

For disabling requests, library creates the client request containing
the port and queue information and sends the request to the server.
After receiving the request, server removes the Rx and Tx callbacks
for the given port and queues.
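
A minimal model of how a server might apply enable and disable requests
to per-queue callback slots is sketched below. All names (sketch_apply,
SKETCH_ALL_QUEUES, sketch_rx_active) are hypothetical, and an integer
flag stands in for the registered callback pointer. The error codes are
an assumption modelled on this patch: -EEXIST for a repeated enable,
-EINVAL for a disable without a prior enable.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES 8
#define SKETCH_ALL_QUEUES UINT16_MAX

enum sketch_op { SKETCH_DISABLE = 1, SKETCH_ENABLE = 2 };

/* One slot per queue; non-zero models "a callback is registered". */
static int sketch_rx_active[SKETCH_MAX_QUEUES];

/* Apply an enable or disable request to one queue, or to queues
 * 0..nb_queues-1 when queue == SKETCH_ALL_QUEUES.
 * Returns 0 on success, -EEXIST when enabling an already enabled
 * queue, -EINVAL when disabling a queue that was never enabled or
 * when the queue index is out of range. */
static int
sketch_apply(enum sketch_op op, uint16_t queue, uint16_t nb_queues)
{
	uint16_t qid, end_q;

	assert(nb_queues <= SKETCH_MAX_QUEUES);

	if (queue == SKETCH_ALL_QUEUES) {
		qid = 0;
		end_q = nb_queues;
	} else {
		if (queue >= nb_queues)
			return -EINVAL;
		qid = queue;
		end_q = (uint16_t)(queue + 1);
	}

	for (; qid < end_q; qid++) {
		if (op == SKETCH_ENABLE) {
			if (sketch_rx_active[qid])
				return -EEXIST;
			sketch_rx_active[qid] = 1;
		} else {
			if (!sketch_rx_active[qid])
				return -EINVAL;
			sketch_rx_active[qid] = 0;
		}
	}
	return 0;
}
```

Note that this model, like any loop that returns on the first error,
can leave earlier queues changed when a later queue fails.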

Updated the release notes.
Added programmer's guide for librte_pdump.
Updated the MAINTAINERS file.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 MAINTAINERS                             |   5 +
 config/common_base                      |   5 +
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/pdump_library.rst | 123 +++++
 doc/guides/rel_notes/release_16_07.rst  |   4 +
 lib/Makefile                            |   1 +
 lib/librte_pdump/Makefile               |  55 ++
 lib/librte_pdump/rte_pdump.c            | 913 ++++++++++++++++++++++++++++++++
 lib/librte_pdump/rte_pdump.h            | 216 ++++++++
 lib/librte_pdump/rte_pdump_version.map  |  13 +
 mk/rte.app.mk                           |   1 +
 11 files changed, 1337 insertions(+)
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 lib/librte_pdump/Makefile
 create mode 100644 lib/librte_pdump/rte_pdump.c
 create mode 100644 lib/librte_pdump/rte_pdump.h
 create mode 100644 lib/librte_pdump/rte_pdump_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 3e6b70c..afb2d0c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -433,6 +433,11 @@ F: app/test/test_reorder*
 F: examples/packet_ordering/
 F: doc/guides/sample_app_ug/packet_ordering.rst
 
+Pdump
+M: Reshma Pattan <reshma.pattan@intel.com>
+F: lib/librte_pdump/
+F: doc/guides/prog_guide/pdump_library.rst
+
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
 F: lib/librte_sched/
diff --git a/config/common_base b/config/common_base
index b9ba405..943730b 100644
--- a/config/common_base
+++ b/config/common_base
@@ -484,6 +484,11 @@ CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
 CONFIG_RTE_LIBRTE_REORDER=y
 
 #
+# Compile the pdump library
+#
+CONFIG_RTE_LIBRTE_PDUMP=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index b862d0c..47030c2 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -52,6 +52,7 @@ Programmer's Guide
     packet_distrib_lib
     reorder_lib
     ip_fragment_reassembly_lib
+    pdump_library
     multi_proc_support
     kernel_nic_interface
     thread_safety_dpdk_functions
diff --git a/doc/guides/prog_guide/pdump_library.rst b/doc/guides/prog_guide/pdump_library.rst
new file mode 100644
index 0000000..580ffcb
--- /dev/null
+++ b/doc/guides/prog_guide/pdump_library.rst
@@ -0,0 +1,123 @@
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _pdump_library:
+
+The librte_pdump Library
+========================
+
+The ``librte_pdump`` library provides a framework for packet capturing in DPDK.
+The library makes a complete copy of the Rx and Tx mbufs into a new mempool,
+which slows down the performance of the applications, so it is recommended
+to use this library for debugging purposes.
+
+The library provides the following APIs to initialize the packet capture framework, to enable
+or disable the packet capture, and to uninitialize it:
+
+* ``rte_pdump_init()``:
+  This API initializes the packet capture framework.
+
+* ``rte_pdump_enable()``:
+  This API enables the packet capture on a given port and queue.
+  Note: The filter option in the API is a placeholder for future enhancements.
+
+* ``rte_pdump_enable_by_deviceid()``:
+  This API enables the packet capture on a given device id (``vdev name or pci address``) and queue.
+  Note: The filter option in the API is a placeholder for future enhancements.
+
+* ``rte_pdump_disable()``:
+  This API disables the packet capture on a given port and queue.
+
+* ``rte_pdump_disable_by_deviceid()``:
+  This API disables the packet capture on a given device id (``vdev name or pci address``) and queue.
+
+* ``rte_pdump_uninit()``:
+  This API uninitializes the packet capture framework.
+
+* ``rte_pdump_set_socket_dir()``:
+  This API sets the server and client socket paths.
+  Note: This API is not thread-safe.
+
+
+Operation
+---------
+
+The ``librte_pdump`` library works on a client/server model. The server is responsible for enabling or
+disabling the packet capture and the clients are responsible for requesting the enabling or disabling of
+the packet capture.
+
+The packet capture framework, as part of its initialization, creates a pthread and a server socket in
+that pthread. The application that calls the framework initialization will have the server socket created
+either under the path that the application has passed or under the default path, i.e. ``/var/run`` for
+the root user or ``$HOME`` for a non-root user.
+
+Applications that request enabling or disabling of the packet capture will have the client socket created either under
+the path that the application has passed or under the default path, i.e. ``/var/run/`` for the root user or ``$HOME``
+for a non-root user, to send the requests to the server.
+The server socket will listen for client requests for enabling or disabling the packet capture.
+
+
+Implementation Details
+----------------------
+
+The library API ``rte_pdump_init()`` initializes the packet capture framework by creating the pthread and the server
+socket. The server socket in the pthread context will be listening to the client requests to enable or disable the
+packet capture.
+
+The library APIs ``rte_pdump_enable()`` and ``rte_pdump_enable_by_deviceid()`` enable the packet capture.
+On each call to these APIs, the library creates a separate client socket, creates the "pdump enable" request and sends
+the request to the server. The server that is listening on the socket will take the request and enable the packet capture
+by registering the Ethernet RX and TX callbacks for the given port or device_id and queue combinations.
+Then the server will mirror the packets to the new mempool and enqueue them to the rte_ring that clients have passed
+to these APIs. The server also sends the response back to the client about the status of the request that was processed.
+After the response is received from the server, the client socket is closed.
+
+The library APIs ``rte_pdump_disable()`` and ``rte_pdump_disable_by_deviceid()`` disable the packet capture.
+On each call to these APIs, the library creates a separate client socket, creates the "pdump disable" request and sends
+the request to the server. The server that is listening on the socket will take the request and disable the packet
+capture by removing the Ethernet RX and TX callbacks for the given port or device_id and queue combinations. The server
+also sends the response back to the client about the status of the request that was processed. After the response is
+received from the server, the client socket is closed.
+
+The library API ``rte_pdump_uninit()`` uninitializes the packet capture framework by closing the pthread and the
+server socket.
+
+The library API ``rte_pdump_set_socket_dir()`` sets the given path as either the server socket path
+or the client socket path based on the ``type`` argument of the API.
+If the given path is ``NULL``, the default path will be selected, i.e. ``/var/run/`` for the root user or ``$HOME``
+for a non-root user. Clients also need to call this API to set their server socket path if the server socket
+path is different from the default path.
+
+
+Use Case: Packet Capturing
+--------------------------
+
+The DPDK ``app/pdump`` tool is developed based on this library to capture packets in DPDK.
+Users can use this as an example to develop their own packet capturing tools.
diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index c6222f8..2137779 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -66,6 +66,10 @@ New Features
   * Enable RSS per network interface through the configuration file.
   * Streamline the CLI code.
 
+* **Added packet capture framework.**
+
+  * A new library ``librte_pdump`` is added to provide packet capture APIs.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/Makefile b/lib/Makefile
index f254dba..ca7c02f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
 DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
+DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
new file mode 100644
index 0000000..af81a28
--- /dev/null
+++ b/lib/librte_pdump/Makefile
@@ -0,0 +1,55 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pdump.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+CFLAGS += -D_GNU_SOURCE
+
+EXPORT_MAP := rte_pdump_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_PDUMP) := rte_pdump.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_PDUMP)-include := rte_pdump.h
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
new file mode 100644
index 0000000..c921f51
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump.c
@@ -0,0 +1,913 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_memcpy.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_pci.h>
+
+#include "rte_pdump.h"
+
+#define SOCKET_PATH_VAR_RUN "/var/run/pdump_sockets"
+#define SOCKET_PATH_HOME "HOME"
+#define SERVER_SOCKET "%s/pdump_server_socket"
+#define CLIENT_SOCKET "%s/pdump_client_socket_%d_%u"
+#define DEVICE_ID_SIZE 64
+/* Macros for printing using RTE_LOG */
+#define RTE_LOGTYPE_PDUMP RTE_LOGTYPE_USER1
+
+enum pdump_operation {
+	DISABLE = 1,
+	ENABLE = 2
+};
+
+enum pdump_version {
+	V1 = 1
+};
+
+static pthread_t pdump_thread;
+static int pdump_socket_fd;
+static char server_socket_dir[PATH_MAX];
+static char client_socket_dir[PATH_MAX];
+
+struct pdump_request {
+	uint16_t ver;
+	uint16_t op;
+	uint32_t flags;
+	union pdump_data {
+		struct enable_v1 {
+			char device[DEVICE_ID_SIZE];
+			uint16_t queue;
+			struct rte_ring *ring;
+			struct rte_mempool *mp;
+			void *filter;
+		} en_v1;
+		struct disable_v1 {
+			char device[DEVICE_ID_SIZE];
+			uint16_t queue;
+			struct rte_ring *ring;
+			struct rte_mempool *mp;
+			void *filter;
+		} dis_v1;
+	} data;
+};
+
+struct pdump_response {
+	uint16_t ver;
+	uint16_t res_op;
+	int32_t err_value;
+};
+
+static struct pdump_rxtx_cbs {
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+	struct rte_eth_rxtx_callback *cb;
+	void *filter;
+} rx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT],
+tx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];
+
+static inline int
+pdump_pktmbuf_copy_data(struct rte_mbuf *seg, const struct rte_mbuf *m)
+{
+	if (rte_pktmbuf_tailroom(seg) < m->data_len) {
+		RTE_LOG(ERR, PDUMP,
+			"User mempool: insufficient data_len of mbuf\n");
+		return -EINVAL;
+	}
+
+	seg->port = m->port;
+	seg->vlan_tci = m->vlan_tci;
+	seg->hash = m->hash;
+	seg->tx_offload = m->tx_offload;
+	seg->ol_flags = m->ol_flags;
+	seg->packet_type = m->packet_type;
+	seg->vlan_tci_outer = m->vlan_tci_outer;
+	seg->data_len = m->data_len;
+	seg->pkt_len = seg->data_len;
+	rte_memcpy(rte_pktmbuf_mtod(seg, void *),
+			rte_pktmbuf_mtod(m, void *),
+			rte_pktmbuf_data_len(seg));
+
+	return 0;
+}
+
+static inline struct rte_mbuf *
+pdump_pktmbuf_copy(struct rte_mbuf *m, struct rte_mempool *mp)
+{
+	struct rte_mbuf *m_dup, *seg, **prev;
+	uint32_t pktlen;
+	uint8_t nseg;
+
+	m_dup = rte_pktmbuf_alloc(mp);
+	if (unlikely(m_dup == NULL))
+		return NULL;
+
+	seg = m_dup;
+	prev = &seg->next;
+	pktlen = m->pkt_len;
+	nseg = 0;
+
+	do {
+		nseg++;
+		if (pdump_pktmbuf_copy_data(seg, m) < 0) {
+			rte_pktmbuf_free(m_dup);
+			return NULL;
+		}
+		*prev = seg;
+		prev = &seg->next;
+	} while ((m = m->next) != NULL &&
+			(seg = rte_pktmbuf_alloc(mp)) != NULL);
+
+	*prev = NULL;
+	m_dup->nb_segs = nseg;
+	m_dup->pkt_len = pktlen;
+
+	/* Allocation of a new segment failed */
+	if (unlikely(seg == NULL)) {
+		rte_pktmbuf_free(m_dup);
+		return NULL;
+	}
+
+	__rte_mbuf_sanity_check(m_dup, 1);
+	return m_dup;
+}
+
+static inline void
+pdump_copy(struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
+{
+	unsigned i;
+	int ring_enq;
+	uint16_t d_pkts = 0;
+	struct rte_mbuf *dup_bufs[nb_pkts];
+	struct pdump_rxtx_cbs *cbs;
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+	struct rte_mbuf *p;
+
+	cbs  = user_params;
+	ring = cbs->ring;
+	mp = cbs->mp;
+	for (i = 0; i < nb_pkts; i++) {
+		p = pdump_pktmbuf_copy(pkts[i], mp);
+		if (p)
+			dup_bufs[d_pkts++] = p;
+	}
+
+	ring_enq = rte_ring_enqueue_burst(ring, (void *)dup_bufs, d_pkts);
+	if (unlikely(ring_enq < d_pkts)) {
+		RTE_LOG(DEBUG, PDUMP,
+			"only %d of packets enqueued to ring\n", ring_enq);
+		do {
+			rte_pktmbuf_free(dup_bufs[ring_enq]);
+		} while (++ring_enq < d_pkts);
+	}
+}
+
+static uint16_t
+pdump_rx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+	struct rte_mbuf **pkts, uint16_t nb_pkts,
+	uint16_t max_pkts __rte_unused,
+	void *user_params)
+{
+	pdump_copy(pkts, nb_pkts, user_params);
+	return nb_pkts;
+}
+
+static uint16_t
+pdump_tx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+		struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
+{
+	pdump_copy(pkts, nb_pkts, user_params);
+	return nb_pkts;
+}
+
+static int
+pdump_get_dombdf(char *device_id, char *domBDF, size_t len)
+{
+	int ret;
+	struct rte_pci_addr dev_addr = {0};
+
+	/* identify if device_id is pci address or name */
+	ret = eal_parse_pci_DomBDF(device_id, &dev_addr);
+	if (ret < 0)
+		return -1;
+
+	if (dev_addr.domain)
+		ret = snprintf(domBDF, len, "%u:%u:%u.%u", dev_addr.domain,
+				dev_addr.bus, dev_addr.devid,
+				dev_addr.function);
+	else
+		ret = snprintf(domBDF, len, "%u:%u.%u", dev_addr.bus,
+				dev_addr.devid,
+				dev_addr.function);
+
+	return ret;
+}
+
+static int
+pdump_regitser_rx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
+				struct rte_ring *ring, struct rte_mempool *mp,
+				uint16_t operation)
+{
+	uint16_t qid;
+	struct pdump_rxtx_cbs *cbs = NULL;
+
+	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
+	for (; qid < end_q; qid++) {
+		cbs = &rx_cbs[port][qid];
+		if (cbs && operation == ENABLE) {
+			if (cbs->cb) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add rx callback for port=%d "
+					"and queue=%d, callback already exists\n",
+					port, qid);
+				return -EEXIST;
+			}
+			cbs->ring = ring;
+			cbs->mp = mp;
+			cbs->cb = rte_eth_add_first_rx_callback(port, qid,
+								pdump_rx, cbs);
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add rx callback, errno=%d\n",
+					rte_errno);
+				return -rte_errno;
+			}
+		}
+		if (cbs && operation == DISABLE) {
+			int ret;
+
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to delete non existing rx "
+					"callback for port=%d and queue=%d\n",
+					port, qid);
+				return -EINVAL;
+			}
+			ret = rte_eth_remove_rx_callback(port, qid, cbs->cb);
+			if (ret < 0) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to remove rx callback, errno=%d\n",
+					-ret);
+				return ret;
+			}
+			cbs->cb = NULL;
+		}
+	}
+
+	return 0;
+}
+
+static int
+pdump_regitser_tx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
+				struct rte_ring *ring, struct rte_mempool *mp,
+				uint16_t operation)
+{
+
+	uint16_t qid;
+	struct pdump_rxtx_cbs *cbs = NULL;
+
+	qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue;
+	for (; qid < end_q; qid++) {
+		cbs = &tx_cbs[port][qid];
+		if (cbs && operation == ENABLE) {
+			if (cbs->cb) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add tx callback for port=%d "
+					"and queue=%d, callback already exists\n",
+					port, qid);
+				return -EEXIST;
+			}
+			cbs->ring = ring;
+			cbs->mp = mp;
+			cbs->cb = rte_eth_add_tx_callback(port, qid, pdump_tx,
+								cbs);
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to add tx callback, errno=%d\n",
+					rte_errno);
+				return -rte_errno;
+			}
+		}
+		if (cbs && operation == DISABLE) {
+			int ret;
+
+			if (cbs->cb == NULL) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to delete non existing tx "
+					"callback for port=%d and queue=%d\n",
+					port, qid);
+				return -EINVAL;
+			}
+			ret = rte_eth_remove_tx_callback(port, qid, cbs->cb);
+			if (ret < 0) {
+				RTE_LOG(ERR, PDUMP,
+					"failed to remove tx callback, errno=%d\n",
+					-ret);
+				return ret;
+			}
+			cbs->cb = NULL;
+		}
+	}
+
+	return 0;
+}
+
+static int
+set_pdump_rxtx_cbs(struct pdump_request *p)
+{
+	uint16_t nb_rx_q = 0, nb_tx_q = 0, end_q, queue;
+	uint8_t port;
+	int ret = 0;
+	uint32_t flags;
+	uint16_t operation;
+	struct rte_ring *ring;
+	struct rte_mempool *mp;
+
+	flags = p->flags;
+	operation = p->op;
+	if (operation == ENABLE) {
+		ret = rte_eth_dev_get_port_by_name(p->data.en_v1.device,
+				&port);
+		if (ret < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to get portid for device id=%s\n",
+				p->data.en_v1.device);
+			return -EINVAL;
+		}
+		queue = p->data.en_v1.queue;
+		ring = p->data.en_v1.ring;
+		mp = p->data.en_v1.mp;
+	} else {
+		ret = rte_eth_dev_get_port_by_name(p->data.dis_v1.device,
+				&port);
+		if (ret < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to get portid for device id=%s\n",
+				p->data.dis_v1.device);
+			return -EINVAL;
+		}
+		queue = p->data.dis_v1.queue;
+		ring = p->data.dis_v1.ring;
+		mp = p->data.dis_v1.mp;
+	}
+
+	/* validation if packet capture is for all queues */
+	if (queue == RTE_PDUMP_ALL_QUEUES) {
+		struct rte_eth_dev_info dev_info;
+
+		rte_eth_dev_info_get(port, &dev_info);
+		nb_rx_q = dev_info.nb_rx_queues;
+		nb_tx_q = dev_info.nb_tx_queues;
+		if (nb_rx_q == 0 && flags & RTE_PDUMP_FLAG_RX) {
+			RTE_LOG(ERR, PDUMP,
+				"number of rx queues cannot be 0\n");
+			return -EINVAL;
+		}
+		if (nb_tx_q == 0 && flags & RTE_PDUMP_FLAG_TX) {
+			RTE_LOG(ERR, PDUMP,
+				"number of tx queues cannot be 0\n");
+			return -EINVAL;
+		}
+		if ((nb_tx_q == 0 || nb_rx_q == 0) &&
+			flags == RTE_PDUMP_FLAG_RXTX) {
+			RTE_LOG(ERR, PDUMP,
+				"both tx&rx queues must be non zero\n");
+			return -EINVAL;
+		}
+	}
+
+	/* register RX callback */
+	if (flags & RTE_PDUMP_FLAG_RX) {
+		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_rx_q : queue + 1;
+		ret = pdump_regitser_rx_callbacks(end_q, port, queue, ring, mp,
+							operation);
+		if (ret < 0)
+			return ret;
+	}
+
+	/* register TX callback */
+	if (flags & RTE_PDUMP_FLAG_TX) {
+		end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_tx_q : queue + 1;
+		ret = pdump_regitser_tx_callbacks(end_q, port, queue, ring, mp,
+							operation);
+		if (ret < 0)
+			return ret;
+	}
+
+	return ret;
+}
+
+/* get socket path (/var/run if root, $HOME otherwise) */
+static void
+pdump_get_socket_path(char *buffer, int bufsz, enum rte_pdump_socktype type)
+{
+	const char *dir = NULL;
+
+	if (type == RTE_PDUMP_SOCKET_SERVER && server_socket_dir[0] != 0)
+		dir = server_socket_dir;
+	else if (type == RTE_PDUMP_SOCKET_CLIENT && client_socket_dir[0] != 0)
+		dir = client_socket_dir;
+	else {
+		if (getuid() != 0)
+			dir = getenv(SOCKET_PATH_HOME);
+		else
+			dir = SOCKET_PATH_VAR_RUN;
+	}
+
+	mkdir(dir, 0700);
+	if (type == RTE_PDUMP_SOCKET_SERVER)
+		snprintf(buffer, bufsz, SERVER_SOCKET, dir);
+	else
+		snprintf(buffer, bufsz, CLIENT_SOCKET, dir, getpid(),
+				rte_sys_gettid());
+}
+
+static int
+pdump_create_server_socket(void)
+{
+	int ret, socket_fd;
+	struct sockaddr_un addr;
+	socklen_t addr_len;
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+				RTE_PDUMP_SOCKET_SERVER);
+	addr.sun_family = AF_UNIX;
+
+	/* remove if file already exists */
+	unlink(addr.sun_path);
+
+	/* set up a server socket */
+	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
+	if (socket_fd < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	addr_len = sizeof(struct sockaddr_un);
+	ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
+	if (ret) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to bind to server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		close(socket_fd);
+		return -1;
+	}
+
+	/* save the socket in local configuration */
+	pdump_socket_fd = socket_fd;
+
+	return 0;
+}
+
+static __attribute__((noreturn)) void *
+pdump_thread_main(__rte_unused void *arg)
+{
+	struct sockaddr_un cli_addr;
+	socklen_t cli_len;
+	struct pdump_request cli_req;
+	struct pdump_response resp;
+	int n;
+	int ret = 0;
+
+	/* host thread, never break out */
+	for (;;) {
+		/* recv client requests */
+		cli_len = sizeof(cli_addr);
+		n = recvfrom(pdump_socket_fd, &cli_req,
+				sizeof(struct pdump_request), 0,
+				(struct sockaddr *)&cli_addr, &cli_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to recv from client:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			continue;
+		}
+
+		ret = set_pdump_rxtx_cbs(&cli_req);
+
+		resp.ver = cli_req.ver;
+		resp.res_op = cli_req.op;
+		resp.err_value = ret;
+		n = sendto(pdump_socket_fd, &resp,
+				sizeof(struct pdump_response),
+				0, (struct sockaddr *)&cli_addr, cli_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to send to client:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+		}
+	}
+}
+
+int
+rte_pdump_init(const char *path)
+{
+	int ret = 0;
+	char thread_name[RTE_MAX_THREAD_NAME_LEN];
+
+	ret = rte_pdump_set_socket_dir(path, RTE_PDUMP_SOCKET_SERVER);
+	if (ret != 0)
+		return -1;
+
+	ret = pdump_create_server_socket();
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create server socket:%s:%d\n",
+			__func__, __LINE__);
+		return -1;
+	}
+
+	/* create the host thread to wait/handle pdump requests */
+	ret = pthread_create(&pdump_thread, NULL, pdump_thread_main, NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to create the pdump thread:%s, %s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+	/* Set thread_name for aid in debugging. */
+	snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN, "pdump-thread");
+	ret = rte_thread_setname(pdump_thread, thread_name);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, PDUMP,
+			"Failed to set thread name for pdump handling\n");
+	}
+
+	return 0;
+}
+
+int
+rte_pdump_uninit(void)
+{
+	int ret;
+
+	ret = pthread_cancel(pdump_thread);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to cancel the pdump thread:%s, %s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+
+	ret = close(pdump_socket_fd);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to close server socket: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	struct sockaddr_un addr;
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+				RTE_PDUMP_SOCKET_SERVER);
+	ret = unlink(addr.sun_path);
+	if (ret != 0) {
+		RTE_LOG(ERR, PDUMP,
+			"Failed to remove server socket addr: %s, %s:%d\n",
+			strerror(errno), __func__, __LINE__);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_create_client_socket(struct pdump_request *p)
+{
+	int ret, socket_fd;
+	int pid;
+	int n;
+	struct pdump_response server_resp;
+	struct sockaddr_un addr, serv_addr, from;
+	socklen_t addr_len, serv_len;
+
+	pid = getpid();
+
+	socket_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
+	if (socket_fd < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"client socket(): %s:pid(%d):tid(%u), %s:%d\n",
+			strerror(errno), pid, rte_sys_gettid(),
+			__func__, __LINE__);
+		ret = errno;
+		return ret;
+	}
+
+	pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+				RTE_PDUMP_SOCKET_CLIENT);
+	addr.sun_family = AF_UNIX;
+	addr_len = sizeof(struct sockaddr_un);
+
+	do {
+		ret = bind(socket_fd, (struct sockaddr *) &addr, addr_len);
+		if (ret) {
+			RTE_LOG(ERR, PDUMP,
+				"client bind(): %s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = errno;
+			break;
+		}
+
+		serv_len = sizeof(struct sockaddr_un);
+		memset(&serv_addr, 0, sizeof(serv_addr));
+		pdump_get_socket_path(serv_addr.sun_path,
+					sizeof(serv_addr.sun_path),
+					RTE_PDUMP_SOCKET_SERVER);
+		serv_addr.sun_family = AF_UNIX;
+
+		n =  sendto(socket_fd, p, sizeof(struct pdump_request), 0,
+				(struct sockaddr *)&serv_addr, serv_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to send to server:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret =  errno;
+			break;
+		}
+
+		n = recvfrom(socket_fd, &server_resp,
+				sizeof(struct pdump_response), 0,
+				(struct sockaddr *)&from, &serv_len);
+		if (n < 0) {
+			RTE_LOG(ERR, PDUMP,
+				"failed to recv from server:%s, %s:%d\n",
+				strerror(errno), __func__, __LINE__);
+			ret = errno;
+			break;
+		}
+		ret = server_resp.err_value;
+	} while (0);
+
+	close(socket_fd);
+	unlink(addr.sun_path);
+	return ret;
+}
+
+static int
+pdump_validate_ring_mp(struct rte_ring *ring, struct rte_mempool *mp)
+{
+	if (ring == NULL || mp == NULL) {
+		RTE_LOG(ERR, PDUMP, "NULL ring or mempool are passed %s:%d\n",
+			__func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+	if (mp->flags & MEMPOOL_F_SP_PUT || mp->flags & MEMPOOL_F_SC_GET) {
+		RTE_LOG(ERR, PDUMP, "mempool with either SP or SC settings"
+		" is not valid for pdump, should have MP and MC settings\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+	if (ring->prod.sp_enqueue || ring->cons.sc_dequeue) {
+		RTE_LOG(ERR, PDUMP, "ring with either SP or SC settings"
+		" is not valid for pdump, should have MP and MC settings\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_validate_flags(uint32_t flags)
+{
+	if (flags != RTE_PDUMP_FLAG_RX && flags != RTE_PDUMP_FLAG_TX &&
+		flags != RTE_PDUMP_FLAG_RXTX) {
+		RTE_LOG(ERR, PDUMP,
+			"invalid flags, should be either rx/tx/rxtx\n");
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_validate_port(uint8_t port, char *name)
+{
+	int ret = 0;
+
+	if (port >= RTE_MAX_ETHPORTS) {
+		RTE_LOG(ERR, PDUMP, "Invalid port id %u, %s:%d\n", port,
+			__func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = rte_eth_dev_get_name_by_port(port, name);
+	if (ret < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"port id to name mapping failed for port id=%u, %s:%d\n",
+			port, __func__, __LINE__);
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+pdump_prepare_client_request(char *device, uint16_t queue,
+				uint32_t flags,
+				uint16_t operation,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter)
+{
+	int ret;
+	struct pdump_request req = {.ver = 1,};
+
+	req.flags = flags;
+	req.op =  operation;
+	if ((operation & ENABLE) != 0) {
+		strncpy(req.data.en_v1.device, device, strlen(device));
+		req.data.en_v1.queue = queue;
+		req.data.en_v1.ring = ring;
+		req.data.en_v1.mp = mp;
+		req.data.en_v1.filter = filter;
+	} else {
+		strncpy(req.data.dis_v1.device, device, strlen(device));
+		req.data.dis_v1.queue = queue;
+		req.data.dis_v1.ring = NULL;
+		req.data.dis_v1.mp = NULL;
+		req.data.dis_v1.filter = NULL;
+	}
+
+	ret = pdump_create_client_socket(&req);
+	if (ret < 0) {
+		RTE_LOG(ERR, PDUMP,
+			"client request for pdump enable/disable failed\n");
+		rte_errno = ret;
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
+			struct rte_ring *ring,
+			struct rte_mempool *mp,
+			void *filter)
+{
+
+	int ret = 0;
+	char name[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_port(port, name);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_ring_mp(ring, mp);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	ret = pdump_prepare_client_request(name, queue, flags,
+						ENABLE, ring, mp, filter);
+
+	return ret;
+}
+
+int
+rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter)
+{
+	int ret = 0;
+	char domBDF[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_ring_mp(ring, mp);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
+		ret = pdump_prepare_client_request(domBDF, queue, flags,
+						ENABLE, ring, mp, filter);
+	else
+		ret = pdump_prepare_client_request(device_id, queue, flags,
+						ENABLE, ring, mp, filter);
+
+	return ret;
+}
+
+int
+rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags)
+{
+	int ret = 0;
+	char name[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_port(port, name);
+	if (ret < 0)
+		return ret;
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	ret = pdump_prepare_client_request(name, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+
+	return ret;
+}
+
+int
+rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags)
+{
+	int ret = 0;
+	char domBDF[DEVICE_ID_SIZE];
+
+	ret = pdump_validate_flags(flags);
+	if (ret < 0)
+		return ret;
+
+	if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
+		ret = pdump_prepare_client_request(domBDF, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+	else
+		ret = pdump_prepare_client_request(device_id, queue, flags,
+						DISABLE, NULL, NULL, NULL);
+
+	return ret;
+}
+
+int
+rte_pdump_set_socket_dir(const char *path, enum rte_pdump_socktype type)
+{
+	int ret, count;
+
+	if (path != NULL) {
+		if (type == RTE_PDUMP_SOCKET_SERVER) {
+			count = sizeof(server_socket_dir);
+			ret = snprintf(server_socket_dir, count, "%s", path);
+		} else {
+			count = sizeof(client_socket_dir);
+			ret = snprintf(client_socket_dir, count, "%s", path);
+		}
+
+		if (ret < 0  || ret >= count) {
+			RTE_LOG(ERR, PDUMP,
+					"Invalid socket path:%s:%d\n",
+					__func__, __LINE__);
+			if (type == RTE_PDUMP_SOCKET_SERVER)
+				server_socket_dir[0] = 0;
+			else
+				client_socket_dir[0] = 0;
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
diff --git a/lib/librte_pdump/rte_pdump.h b/lib/librte_pdump/rte_pdump.h
new file mode 100644
index 0000000..b5f4e2f
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump.h
@@ -0,0 +1,216 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PDUMP_H_
+#define _RTE_PDUMP_H_
+
+/**
+ * @file
+ * RTE pdump
+ *
+ * packet dump library to provide packet capturing support on dpdk.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_PDUMP_ALL_QUEUES UINT16_MAX
+
+enum {
+	RTE_PDUMP_FLAG_RX = 1,  /* receive direction */
+	RTE_PDUMP_FLAG_TX = 2,  /* transmit direction */
+	/* both receive and transmit directions */
+	RTE_PDUMP_FLAG_RXTX = (RTE_PDUMP_FLAG_RX|RTE_PDUMP_FLAG_TX)
+};
+
+enum rte_pdump_socktype {
+	RTE_PDUMP_SOCKET_SERVER = 1,
+	RTE_PDUMP_SOCKET_CLIENT = 2
+};
+
+/**
+ * Initialize packet capturing handling
+ *
+ * Creates a pthread and a server socket for handling client
+ * requests to enable/disable rxtx callbacks.
+ *
+ * @param path
+ * directory path for server socket.
+ *
+ * @return
+ *    0 on success, -1 on error
+ */
+int
+rte_pdump_init(const char *path);
+
+/**
+ * Uninitialize packet capturing handling
+ *
+ * Cancels the pthread, closes the server socket, removes the server socket address.
+ *
+ * @return
+ *    0 on success, -1 on error
+ */
+int
+rte_pdump_uninit(void);
+
+/**
+ * Enables packet capturing on given port and queue.
+ *
+ * @param port
+ *  port on which packet capturing should be enabled.
+ * @param queue
+ *  queue of a given port on which packet capturing should be enabled.
+ *  users should pass on value UINT16_MAX to enable packet capturing on all
+ *  queues of a given port.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be enabled for a given port and queue.
+ * @param ring
+ *  ring on which captured packets will be enqueued for user.
+ * @param mp
+ *  mempool on to which original packets will be mirrored or duplicated.
+ * @param filter
+ *  place holder for packet filtering.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_enable(uint8_t port, uint16_t queue, uint32_t flags,
+		struct rte_ring *ring,
+		struct rte_mempool *mp,
+		void *filter);
+
+/**
+ * Disables packet capturing on given port and queue.
+ *
+ * @param port
+ *  port on which packet capturing should be disabled.
+ * @param queue
+ *  queue of a given port on which packet capturing should be disabled.
+ *  users should pass on value UINT16_MAX to disable packet capturing on all
+ *  queues of a given port.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be disabled for a given port and queue.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_disable(uint8_t port, uint16_t queue, uint32_t flags);
+
+/**
+ * Enables packet capturing on given device id and queue.
+ * device_id can be name or pci address of device.
+ *
+ * @param device_id
+ *  device id on which packet capturing should be enabled.
+ * @param queue
+ *  queue of a given device id on which packet capturing should be enabled.
+ *  users should pass on value UINT16_MAX to enable packet capturing on all
+ *  queues of a given device id.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be enabled for a given port and queue.
+ * @param ring
+ *  ring on which captured packets will be enqueued for user.
+ * @param mp
+ *  mempool on to which original packets will be mirrored or duplicated.
+ * @param filter
+ *  place holder for packet filtering.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+
+int
+rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags,
+				struct rte_ring *ring,
+				struct rte_mempool *mp,
+				void *filter);
+
+/**
+ * Disables packet capturing on given device_id and queue.
+ * device_id can be name or pci address of device.
+ *
+ * @param device_id
+ *  pci address or name of the device on which packet capturing
+ *  should be disabled.
+ * @param queue
+ *  queue of a given device on which packet capturing should be disabled.
+ *  users should pass on value UINT16_MAX to disable packet capturing on all
+ *  queues of a given device id.
+ * @param flags
+ *  flags specifies RTE_PDUMP_FLAG_RX/RTE_PDUMP_FLAG_TX/RTE_PDUMP_FLAG_RXTX
+ *  on which packet capturing should be disabled for a given device and queue.
+ *
+ * @return
+ *    0 on success, -1 on error, rte_errno is set accordingly.
+ */
+int
+rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
+				uint32_t flags);
+
+/**
+ * Allows applications to set server and client socket paths.
+ * If the specified path is NULL, the default path is selected, i.e.
+ * "/var/run/" for the root user and "$HOME" for a non-root user.
+ * Clients also need to call this API to set their server path if the
+ * server path is different from default path.
+ * This API is not thread-safe.
+ *
+ * @param path
+ * directory path for server or client socket.
+ * @param type
+ * specifies RTE_PDUMP_SOCKET_SERVER if socket path is for server.
+ * (or)
+ * specifies RTE_PDUMP_SOCKET_CLIENT if socket path is for client.
+ *
+ * @return
+ * 0 on success, -EINVAL on error
+ *
+ */
+int
+rte_pdump_set_socket_dir(const char *path, enum rte_pdump_socktype type);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PDUMP_H_ */
diff --git a/lib/librte_pdump/rte_pdump_version.map b/lib/librte_pdump/rte_pdump_version.map
new file mode 100644
index 0000000..edec99a
--- /dev/null
+++ b/lib/librte_pdump/rte_pdump_version.map
@@ -0,0 +1,13 @@
+DPDK_16.07 {
+	global:
+
+	rte_pdump_disable;
+	rte_pdump_disable_by_deviceid;
+	rte_pdump_enable;
+	rte_pdump_enable_by_deviceid;
+	rte_pdump_init;
+	rte_pdump_set_socket_dir;
+	rte_pdump_uninit;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index e9969fc..f894669 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -78,6 +78,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ACL)            += -lrte_acl
 _LDLIBS-$(CONFIG_RTE_LIBRTE_JOBSTATS)       += -lrte_jobstats
 _LDLIBS-$(CONFIG_RTE_LIBRTE_POWER)          += -lrte_power
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 
 _LDLIBS-y += --whole-archive
 
-- 
2.5.0


* [PATCH v10 6/7] app/pdump: add pdump tool for packet capturing
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
                         ` (4 preceding siblings ...)
  2016-06-15 14:06       ` [PATCH v10 5/7] pdump: add new library for packet capturing support Reshma Pattan
@ 2016-06-15 14:06       ` Reshma Pattan
  2016-06-15 14:06       ` [PATCH v10 7/7] app/testpmd: add pdump initialization uninitialization Reshma Pattan
  2016-06-16 21:55       ` [PATCH v10 0/7] add packet capture framework Thomas Monjalon
  7 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Add a new pdump tool for packet capturing on DPDK.
The tool runs as a secondary process by default.
It provides command line options such as port, device_id
and queue, which the user passes to the tool to request
packet capture on those devices.
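
As a rough illustration of the queue option described above, here is a
minimal, self-contained sketch of parsing its value. It assumes that "*"
selects all queues, matching RTE_PDUMP_ALL_QUEUES (UINT16_MAX) in
rte_pdump.h. The helper name parse_queue_value and the ALL_QUEUES
constant are hypothetical and not part of this patch, and the strict
range checking is an assumption about what a caller might want rather
than what app/pdump/main.c does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for RTE_PDUMP_ALL_QUEUES (UINT16_MAX in rte_pdump.h). */
#define ALL_QUEUES UINT16_MAX

/*
 * Hypothetical helper: parse a "queue=" value.
 * "*" selects all queues; otherwise the value must be a decimal
 * number below ALL_QUEUES. Returns 0 on success, -EINVAL on error.
 */
static int
parse_queue_value(const char *value, uint16_t *queue)
{
	char *end = NULL;
	unsigned long n;

	assert(value != NULL && queue != NULL);

	if (strcmp(value, "*") == 0) {
		*queue = ALL_QUEUES;
		return 0;
	}

	errno = 0;
	n = strtoul(value, &end, 10);
	/* reject empty input, trailing junk, overflow and the reserved value */
	if (errno != 0 || end == value || *end != '\0' || n >= ALL_QUEUES)
		return -EINVAL;

	*queue = (uint16_t)n;
	return 0;
}
```

Rejecting 65535 as an explicit queue number keeps it unambiguous as the
"all queues" marker; whether that restriction is desirable is a design
choice for the caller.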

The tool creates the rte ring, the mempool and the pcap vdev, and
calls the enable API of the pdump library with port/device_id,
queue, ring and mempool as arguments to enable packet capture
on specific devices. It then receives the packets from the
primary process over the ring. Once the packets are
received, they are sent to the pcap vdev.

The tool can be terminated with ctrl+c (SIGINT), upon which it
calls the disable API of the pdump library to disable the packet capture,
dequeues the remaining packets from the ring, sends them on
to the pcap vdev, and then releases all allocated resources.
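
The forward-then-drain sequence above can be modelled without DPDK. The
sketch below is only an illustration under stated assumptions: a plain
array-backed FIFO stands in for the rte ring, and a counter stands in
for the pcap vdev. All names (sim_ring, sim_enqueue, sim_dequeue_burst,
forward_burst, capture_loop) are hypothetical and do not appear in
app/pdump/main.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BURST 32

/* Toy FIFO standing in for the rte ring (single-threaded model). */
struct sim_ring {
	int pkts[256];
	size_t head;	/* next slot to dequeue */
	size_t tail;	/* next slot to enqueue */
};

struct sim_stats {
	uint64_t dequeued;
	uint64_t sent;
};

static int
sim_enqueue(struct sim_ring *r, int pkt)
{
	if (r->tail == sizeof(r->pkts) / sizeof(r->pkts[0]))
		return -1;	/* full: this toy ring does not wrap */
	r->pkts[r->tail++] = pkt;
	return 0;
}

/* Dequeue up to max packets; returns the number actually dequeued. */
static size_t
sim_dequeue_burst(struct sim_ring *r, int *out, size_t max)
{
	size_t n = 0;

	while (n < max && r->head < r->tail)
		out[n++] = r->pkts[r->head++];
	return n;
}

/* Forward one burst from the ring to the (simulated) pcap vdev. */
static size_t
forward_burst(struct sim_ring *r, struct sim_stats *st)
{
	int burst[BURST];
	size_t n = sim_dequeue_burst(r, burst, BURST);

	st->dequeued += n;
	st->sent += n;	/* the model assumes the vdev accepts every packet */
	return n;
}

/*
 * Run until *quit is set, then drain whatever is still queued.
 * In the tool the flag would be set from a SIGINT handler and the
 * disable API would be called before draining.
 */
static void
capture_loop(struct sim_ring *r, struct sim_stats *st,
	     const volatile int *quit)
{
	assert(r != NULL && st != NULL && quit != NULL);

	while (!*quit)
		forward_burst(r, st);

	/* capture is disabled at this point in the real flow; drain the ring */
	while (forward_burst(r, st) > 0)
		;
}
```

In this single-threaded model nothing sets the quit flag while the loop
runs, so a caller would set it before invoking capture_loop to exercise
only the drain path. Draining after the disable call is what keeps
packets already enqueued by the primary process from being lost.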

Updated the release notes.
Updated the MAINTAINERS file.
Added a sample application guide for the app/pdump application.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 MAINTAINERS                            |   2 +
 app/Makefile                           |   1 +
 app/pdump/Makefile                     |  49 ++
 app/pdump/main.c                       | 844 +++++++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_16_07.rst |   1 +
 doc/guides/sample_app_ug/index.rst     |   1 +
 doc/guides/sample_app_ug/pdump.rst     | 122 +++++
 7 files changed, 1020 insertions(+)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c
 create mode 100644 doc/guides/sample_app_ug/pdump.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index afb2d0c..630f2c8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -437,6 +437,8 @@ Pdump
 M: Reshma Pattan <reshma.pattan@intel.com>
 F: lib/librte_pdump/
 F: doc/guides/prog_guide/pdump_library.rst
+F: app/pdump/
+F: doc/guides/sample_app_ug/pdump.rst
 
 Hierarchical scheduler
 M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
diff --git a/app/Makefile b/app/Makefile
index 1151e09..30ec292 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -37,5 +37,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += proc_info
+DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += pdump
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/pdump/Makefile b/app/pdump/Makefile
new file mode 100644
index 0000000..d85bb08
--- /dev/null
+++ b/app/pdump/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+ifeq ($(CONFIG_RTE_LIBRTE_PDUMP),y)
+
+APP = dpdk_pdump
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+
+SRCS-y := main.c
+
+# this application needs libraries first
+DEPDIRS-y += lib
+
+include $(RTE_SDK)/mk/rte.app.mk
+
+endif
diff --git a/app/pdump/main.c b/app/pdump/main.c
new file mode 100644
index 0000000..f8923b9
--- /dev/null
+++ b/app/pdump/main.c
@@ -0,0 +1,844 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <stdlib.h>
+#include <getopt.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <net/if.h>
+
+#include <rte_eal.h>
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_ethdev.h>
+#include <rte_memory.h>
+#include <rte_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_errno.h>
+#include <rte_dev.h>
+#include <rte_kvargs.h>
+#include <rte_mempool.h>
+#include <rte_ring.h>
+#include <rte_pdump.h>
+
+#define PDUMP_PORT_ARG "port"
+#define PDUMP_PCI_ARG "device_id"
+#define PDUMP_QUEUE_ARG "queue"
+#define PDUMP_DIR_ARG "dir"
+#define PDUMP_RX_DEV_ARG "rx-dev"
+#define PDUMP_TX_DEV_ARG "tx-dev"
+#define PDUMP_RING_SIZE_ARG "ring-size"
+#define PDUMP_MSIZE_ARG "mbuf-size"
+#define PDUMP_NUM_MBUFS_ARG "total-num-mbufs"
+
+#define VDEV_PCAP "eth_pcap_%s_%d,tx_pcap=%s"
+#define VDEV_IFACE "eth_pcap_%s_%d,tx_iface=%s"
+#define TX_STREAM_SIZE 64
+
+#define MP_NAME "pdump_pool_%d"
+
+#define RX_RING "rx_ring_%d"
+#define TX_RING "tx_ring_%d"
+
+#define RX_STR "rx"
+#define TX_STR "tx"
+
+/* Limits and default values used for option parsing. */
+#define APP_ARG_TCPDUMP_MAX_TUPLES 54
+#define MBUF_POOL_CACHE_SIZE 250
+#define TX_DESC_PER_QUEUE 512
+#define RX_DESC_PER_QUEUE 128
+#define MBUFS_PER_POOL 65535
+#define MAX_LONG_OPT_SZ 64
+#define RING_SIZE 16384
+#define SIZE 256
+#define BURST_SIZE 32
+#define NUM_VDEVS 2
+
+#define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
+/* true if x is a power of 2 */
+#define POWEROF2(x) ((((x)-1) & (x)) == 0)
+
+enum pdump_en_dis {
+	DISABLE = 1,
+	ENABLE = 2
+};
+
+enum pcap_stream {
+	IFACE = 1,
+	PCAP = 2
+};
+
+enum pdump_by {
+	PORT_ID = 1,
+	DEVICE_ID = 2
+};
+
+const char *valid_pdump_arguments[] = {
+	PDUMP_PORT_ARG,
+	PDUMP_PCI_ARG,
+	PDUMP_QUEUE_ARG,
+	PDUMP_DIR_ARG,
+	PDUMP_RX_DEV_ARG,
+	PDUMP_TX_DEV_ARG,
+	PDUMP_RING_SIZE_ARG,
+	PDUMP_MSIZE_ARG,
+	PDUMP_NUM_MBUFS_ARG,
+	NULL
+};
+
+struct pdump_stats {
+	uint64_t dequeue_pkts;
+	uint64_t tx_pkts;
+	uint64_t freed_pkts;
+};
+
+struct pdump_tuples {
+	/* cli params */
+	uint8_t port;
+	char *device_id;
+	uint16_t queue;
+	char rx_dev[TX_STREAM_SIZE];
+	char tx_dev[TX_STREAM_SIZE];
+	uint32_t ring_size;
+	uint16_t mbuf_data_size;
+	uint32_t total_num_mbufs;
+
+	/* params for library API call */
+	uint32_t dir;
+	struct rte_mempool *mp;
+	struct rte_ring *rx_ring;
+	struct rte_ring *tx_ring;
+
+	/* params for packet dumping */
+	enum pdump_by dump_by_type;
+	int rx_vdev_id;
+	int tx_vdev_id;
+	enum pcap_stream rx_vdev_stream_type;
+	enum pcap_stream tx_vdev_stream_type;
+	bool single_pdump_dev;
+
+	/* stats */
+	struct pdump_stats stats;
+} __rte_cache_aligned;
+static struct pdump_tuples pdump_t[APP_ARG_TCPDUMP_MAX_TUPLES];
+
+struct parse_val {
+	uint64_t min;
+	uint64_t max;
+	uint64_t val;
+};
+
+int num_tuples;
+static struct rte_eth_conf port_conf_default;
+volatile uint8_t quit_signal;
+
+/* display usage */
+static void
+pdump_usage(const char *prgname)
+{
+	printf("usage: %s [EAL options] -- --pdump "
+			"'(port=<port id> | device_id=<pci id or vdev name>),"
+			"(queue=<queue_id>),"
+			"(rx-dev=<iface or pcap file> |"
+			" tx-dev=<iface or pcap file>),"
+			"[ring-size=<ring size>default:16384],"
+			"[mbuf-size=<mbuf data size>default:2176],"
+			"[total-num-mbufs=<number of mbufs>default:65535]"
+			"'\n",
+			prgname);
+}
+
+static int
+parse_device_id(const char *key __rte_unused, const char *value,
+		void *extra_args)
+{
+	struct pdump_tuples *pt = extra_args;
+
+	pt->device_id = strdup(value);
+	pt->dump_by_type = DEVICE_ID;
+
+	return 0;
+}
+
+static int
+parse_queue(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long n;
+	struct pdump_tuples *pt = extra_args;
+
+	if (!strcmp(value, "*"))
+		pt->queue = RTE_PDUMP_ALL_QUEUES;
+	else {
+		n = strtoul(value, NULL, 10);
+		pt->queue = (uint16_t) n;
+	}
+	return 0;
+}
+
+static int
+parse_rxtxdev(const char *key, const char *value, void *extra_args)
+{
+
+	struct pdump_tuples *pt = extra_args;
+
+	if (!strcmp(key, PDUMP_RX_DEV_ARG)) {
+		strncpy(pt->rx_dev, value, strlen(value));
+		/* identify the rx stream type for pcap vdev */
+		if (if_nametoindex(pt->rx_dev))
+			pt->rx_vdev_stream_type = IFACE;
+	} else if (!strcmp(key, PDUMP_TX_DEV_ARG)) {
+		strncpy(pt->tx_dev, value, strlen(value));
+		/* identify the tx stream type for pcap vdev */
+		if (if_nametoindex(pt->tx_dev))
+			pt->tx_vdev_stream_type = IFACE;
+	} else {
+		printf("invalid dev type %s, must be rx or tx\n", value);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+parse_uint_value(const char *key, const char *value, void *extra_args)
+{
+	struct parse_val *v;
+	unsigned long t;
+	char *end;
+	int ret = 0;
+
+	errno = 0;
+	v = extra_args;
+	t = strtoul(value, &end, 10);
+
+	if (errno != 0 || end[0] != 0 || t < v->min || t > v->max) {
+		printf("invalid value:\"%s\" for key:\"%s\", "
+			"value must be >= %"PRIu64" and <= %"PRIu64"\n",
+			value, key, v->min, v->max);
+		ret = -EINVAL;
+	}
+	if (!strcmp(key, PDUMP_RING_SIZE_ARG) && !POWEROF2(t)) {
+		printf("invalid value:\"%s\" for key:\"%s\", "
+			"value must be power of 2\n", value, key);
+		ret = -EINVAL;
+	}
+
+	if (ret != 0)
+		return ret;
+
+	v->val = t;
+	return 0;
+}
+
+static int
+parse_pdump(const char *optarg)
+{
+	struct rte_kvargs *kvlist;
+	int ret = 0, cnt1, cnt2;
+	struct pdump_tuples *pt;
+	struct parse_val v = {0};
+
+	pt = &pdump_t[num_tuples];
+
+	/* initial check for invalid arguments */
+	kvlist = rte_kvargs_parse(optarg, valid_pdump_arguments);
+	if (kvlist == NULL) {
+		printf("--pdump=\"%s\": invalid argument passed\n", optarg);
+		return -1;
+	}
+
+	/* port/device_id parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_PORT_ARG);
+	cnt2 = rte_kvargs_count(kvlist, PDUMP_PCI_ARG);
+	if (!((cnt1 == 1 && cnt2 == 0) || (cnt1 == 0 && cnt2 == 1))) {
+		printf("--pdump=\"%s\": must have either port or "
+			"device_id argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	} else if (cnt1 == 1) {
+		v.min = 0;
+		v.max = RTE_MAX_ETHPORTS-1;
+		ret = rte_kvargs_process(kvlist, PDUMP_PORT_ARG,
+				&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->port = (uint8_t) v.val;
+		pt->dump_by_type = PORT_ID;
+	} else if (cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_PCI_ARG,
+				&parse_device_id, pt);
+		if (ret < 0)
+			goto free_kvlist;
+	}
+
+	/* queue parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_QUEUE_ARG);
+	if (cnt1 != 1) {
+		printf("--pdump=\"%s\": must have queue argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	}
+	ret = rte_kvargs_process(kvlist, PDUMP_QUEUE_ARG, &parse_queue, pt);
+	if (ret < 0)
+		goto free_kvlist;
+
+	/* rx-dev and tx-dev parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_RX_DEV_ARG);
+	cnt2 = rte_kvargs_count(kvlist, PDUMP_TX_DEV_ARG);
+	if (cnt1 == 0 && cnt2 == 0) {
+		printf("--pdump=\"%s\": must have either rx-dev or "
+			"tx-dev argument\n", optarg);
+		ret = -1;
+		goto free_kvlist;
+	} else if (cnt1 == 1 && cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_RX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		ret = rte_kvargs_process(kvlist, PDUMP_TX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		/* if captured packets have to be sent to the same vdev */
+		if (!strcmp(pt->rx_dev, pt->tx_dev))
+			pt->single_pdump_dev = true;
+		pt->dir = RTE_PDUMP_FLAG_RXTX;
+	} else if (cnt1 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_RX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->dir = RTE_PDUMP_FLAG_RX;
+	} else if (cnt2 == 1) {
+		ret = rte_kvargs_process(kvlist, PDUMP_TX_DEV_ARG,
+					&parse_rxtxdev, pt);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->dir = RTE_PDUMP_FLAG_TX;
+	}
+
+	/* optional */
+	/* ring_size parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_RING_SIZE_ARG);
+	if (cnt1 == 1) {
+		v.min = 2;
+		v.max = RTE_RING_SZ_MASK-1;
+		ret = rte_kvargs_process(kvlist, PDUMP_RING_SIZE_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->ring_size = (uint16_t) v.val;
+	} else
+		pt->ring_size = RING_SIZE;
+
+	/* mbuf_data_size parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_MSIZE_ARG);
+	if (cnt1 == 1) {
+		v.min = 1;
+		v.max = UINT16_MAX;
+		ret = rte_kvargs_process(kvlist, PDUMP_MSIZE_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->mbuf_data_size = (uint16_t) v.val;
+	} else
+		pt->mbuf_data_size = RTE_MBUF_DEFAULT_BUF_SIZE;
+
+	/* total_num_mbufs parsing and validation */
+	cnt1 = rte_kvargs_count(kvlist, PDUMP_NUM_MBUFS_ARG);
+	if (cnt1 == 1) {
+		v.min = 1025;
+		v.max = UINT16_MAX;
+		ret = rte_kvargs_process(kvlist, PDUMP_NUM_MBUFS_ARG,
+						&parse_uint_value, &v);
+		if (ret < 0)
+			goto free_kvlist;
+		pt->total_num_mbufs = (uint16_t) v.val;
+	} else
+		pt->total_num_mbufs = MBUFS_PER_POOL;
+
+	num_tuples++;
+
+free_kvlist:
+	rte_kvargs_free(kvlist);
+	return ret;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+launch_args_parse(int argc, char **argv, char *prgname)
+{
+	int opt, ret;
+	int option_index;
+	static struct option long_option[] = {
+		{"pdump", 1, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+
+	if (argc == 1)
+		pdump_usage(prgname);
+
+	/* Parse command line */
+	while ((opt = getopt_long(argc, argv, " ",
+			long_option, &option_index)) != EOF) {
+		switch (opt) {
+		case 0:
+			if (!strncmp(long_option[option_index].name, "pdump",
+					MAX_LONG_OPT_SZ)) {
+				ret = parse_pdump(optarg);
+				if (ret) {
+					pdump_usage(prgname);
+					return -1;
+				}
+			}
+			break;
+		default:
+			pdump_usage(prgname);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+print_pdump_stats(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	for (i = 0; i < num_tuples; i++) {
+		printf("##### PDUMP DEBUG STATS #####\n");
+		pt = &pdump_t[i];
+		printf(" -packets dequeued:			%"PRIu64"\n",
+							pt->stats.dequeue_pkts);
+		printf(" -packets transmitted to vdev:		%"PRIu64"\n",
+							pt->stats.tx_pkts);
+		printf(" -packets freed:			%"PRIu64"\n",
+							pt->stats.freed_pkts);
+	}
+}
+
+static inline void
+disable_pdump(struct pdump_tuples *pt)
+{
+	if (pt->dump_by_type == DEVICE_ID)
+		rte_pdump_disable_by_deviceid(pt->device_id, pt->queue,
+						pt->dir);
+	else if (pt->dump_by_type == PORT_ID)
+		rte_pdump_disable(pt->port, pt->queue, pt->dir);
+}
+
+static inline void
+pdump_rxtx(struct rte_ring *ring, uint8_t vdev_id, struct pdump_stats *stats)
+{
+	/* write input packets of port to vdev for pdump */
+	struct rte_mbuf *rxtx_bufs[BURST_SIZE];
+
+	/* first dequeue packets from ring of primary process */
+	const uint16_t nb_in_deq = rte_ring_dequeue_burst(ring,
+			(void *)rxtx_bufs, BURST_SIZE);
+	stats->dequeue_pkts += nb_in_deq;
+
+	if (nb_in_deq) {
+		/* then send them on the vdev */
+		uint16_t nb_in_txd = rte_eth_tx_burst(
+				vdev_id,
+				0, rxtx_bufs, nb_in_deq);
+		stats->tx_pkts += nb_in_txd;
+
+		if (unlikely(nb_in_txd < nb_in_deq)) {
+			do {
+				rte_pktmbuf_free(rxtx_bufs[nb_in_txd]);
+				stats->freed_pkts++;
+			} while (++nb_in_txd < nb_in_deq);
+		}
+	}
+}
+
+static void
+free_ring_data(struct rte_ring *ring, uint8_t vdev_id,
+		struct pdump_stats *stats)
+{
+	while (rte_ring_count(ring))
+		pdump_rxtx(ring, vdev_id, stats);
+}
+
+static void
+cleanup_pdump_resources(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	/* disable pdump and free the pdump_tuple resources */
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+
+		/* remove callbacks */
+		disable_pdump(pt);
+
+		/*
+		 * transmit the rest of the enqueued packets of the rings on to
+		 * the vdev, in order to release mbufs to the mempool.
+		 */
+		if (pt->dir & RTE_PDUMP_FLAG_RX)
+			free_ring_data(pt->rx_ring, pt->rx_vdev_id, &pt->stats);
+		if (pt->dir & RTE_PDUMP_FLAG_TX)
+			free_ring_data(pt->tx_ring, pt->tx_vdev_id, &pt->stats);
+
+		if (pt->device_id)
+			free(pt->device_id);
+
+		/* free the rings */
+		if (pt->rx_ring)
+			rte_ring_free(pt->rx_ring);
+		if (pt->tx_ring)
+			rte_ring_free(pt->tx_ring);
+	}
+}
+
+static void
+signal_handler(int sig_num)
+{
+	if (sig_num == SIGINT) {
+		printf("\n\nSignal %d received, preparing to exit...\n",
+				sig_num);
+		quit_signal = 1;
+	}
+}
+
+static inline int
+configure_vdev(uint8_t port_id)
+{
+	struct ether_addr addr;
+	const uint16_t rxRings = 0, txRings = 1;
+	const uint8_t nb_ports = rte_eth_dev_count();
+	int ret;
+	uint16_t q;
+
+	if (port_id >= nb_ports)
+		return -1;
+
+	ret = rte_eth_dev_configure(port_id, rxRings, txRings,
+					&port_conf_default);
+	if (ret != 0)
+		rte_exit(EXIT_FAILURE, "dev config failed\n");
+
+	for (q = 0; q < txRings; q++) {
+		ret = rte_eth_tx_queue_setup(port_id, q, TX_DESC_PER_QUEUE,
+				rte_eth_dev_socket_id(port_id), NULL);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "queue setup failed\n");
+	}
+
+	ret = rte_eth_dev_start(port_id);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "dev start failed\n");
+
+	rte_eth_macaddr_get(port_id, &addr);
+	printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
+			" %02"PRIx8" %02"PRIx8" %02"PRIx8"\n",
+			(unsigned)port_id,
+			addr.addr_bytes[0], addr.addr_bytes[1],
+			addr.addr_bytes[2], addr.addr_bytes[3],
+			addr.addr_bytes[4], addr.addr_bytes[5]);
+
+	rte_eth_promiscuous_enable(port_id);
+
+	return 0;
+}
+
+static void
+create_mp_ring_vdev(void)
+{
+	int i;
+	uint8_t portid;
+	struct pdump_tuples *pt = NULL;
+	struct rte_mempool *mbuf_pool = NULL;
+	char vdev_args[SIZE];
+	char ring_name[SIZE];
+	char mempool_name[SIZE];
+
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+		snprintf(mempool_name, SIZE, MP_NAME, i);
+		mbuf_pool = rte_mempool_lookup(mempool_name);
+		if (mbuf_pool == NULL) {
+			/* create mempool */
+			mbuf_pool = rte_pktmbuf_pool_create(mempool_name,
+					pt->total_num_mbufs,
+					MBUF_POOL_CACHE_SIZE, 0,
+					pt->mbuf_data_size,
+					rte_socket_id());
+			if (mbuf_pool == NULL)
+				rte_exit(EXIT_FAILURE,
+					"Mempool creation failed: %s\n",
+					rte_strerror(rte_errno));
+		}
+		pt->mp = mbuf_pool;
+
+		if (pt->dir == RTE_PDUMP_FLAG_RXTX) {
+			/* if captured packets have to be sent to the same vdev */
+			/* create rx_ring */
+			snprintf(ring_name, SIZE, RX_RING, i);
+			pt->rx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->rx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s:%s:%d\n",
+						rte_strerror(rte_errno),
+						__func__, __LINE__);
+
+			/* create tx_ring */
+			snprintf(ring_name, SIZE, TX_RING, i);
+			pt->tx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->tx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s:%s:%d\n",
+						rte_strerror(rte_errno),
+						__func__, __LINE__);
+
+			/* create vdevs */
+			(pt->rx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, RX_STR, i,
+			pt->rx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, RX_STR, i,
+			pt->rx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed:%s:%d\n",
+					__func__, __LINE__);
+			pt->rx_vdev_id = portid;
+
+			/* configure vdev */
+			configure_vdev(pt->rx_vdev_id);
+
+			if (pt->single_pdump_dev)
+				pt->tx_vdev_id = portid;
+			else {
+				(pt->tx_vdev_stream_type == IFACE) ?
+				snprintf(vdev_args, SIZE, VDEV_IFACE, TX_STR, i,
+				pt->tx_dev) :
+				snprintf(vdev_args, SIZE, VDEV_PCAP, TX_STR, i,
+				pt->tx_dev);
+				if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+					rte_exit(EXIT_FAILURE,
+						"vdev creation failed:"
+						"%s:%d\n", __func__, __LINE__);
+				pt->tx_vdev_id = portid;
+
+				/* configure vdev */
+				configure_vdev(pt->tx_vdev_id);
+			}
+		} else if (pt->dir == RTE_PDUMP_FLAG_RX) {
+
+			/* create rx_ring */
+			snprintf(ring_name, SIZE, RX_RING, i);
+			pt->rx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->rx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s\n",
+					rte_strerror(rte_errno));
+
+			(pt->rx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, RX_STR, i,
+				pt->rx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, RX_STR, i,
+				pt->rx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed:%s:%d\n",
+					__func__, __LINE__);
+			pt->rx_vdev_id = portid;
+			/* configure vdev */
+			configure_vdev(pt->rx_vdev_id);
+		} else if (pt->dir == RTE_PDUMP_FLAG_TX) {
+
+			/* create tx_ring */
+			snprintf(ring_name, SIZE, TX_RING, i);
+			pt->tx_ring = rte_ring_create(ring_name, pt->ring_size,
+					rte_socket_id(), 0);
+			if (pt->tx_ring == NULL)
+				rte_exit(EXIT_FAILURE, "%s\n",
+					rte_strerror(rte_errno));
+
+			(pt->tx_vdev_stream_type == IFACE) ?
+			snprintf(vdev_args, SIZE, VDEV_IFACE, TX_STR, i,
+				pt->tx_dev) :
+			snprintf(vdev_args, SIZE, VDEV_PCAP, TX_STR, i,
+				pt->tx_dev);
+			if (rte_eth_dev_attach(vdev_args, &portid) < 0)
+				rte_exit(EXIT_FAILURE,
+					"vdev creation failed\n");
+			pt->tx_vdev_id = portid;
+
+			/* configure vdev */
+			configure_vdev(pt->tx_vdev_id);
+		}
+	}
+}
+
+static void
+enable_pdump(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+	int ret = 0, ret1 = 0;
+
+	for (i = 0; i < num_tuples; i++) {
+		pt = &pdump_t[i];
+		if (pt->dir == RTE_PDUMP_FLAG_RXTX) {
+			if (pt->dump_by_type == DEVICE_ID) {
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						RTE_PDUMP_FLAG_RX,
+						pt->rx_ring,
+						pt->mp, NULL);
+				ret1 = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						RTE_PDUMP_FLAG_TX,
+						pt->tx_ring,
+						pt->mp, NULL);
+			} else if (pt->dump_by_type == PORT_ID) {
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						RTE_PDUMP_FLAG_RX,
+						pt->rx_ring, pt->mp, NULL);
+				ret1 = rte_pdump_enable(pt->port, pt->queue,
+						RTE_PDUMP_FLAG_TX,
+						pt->tx_ring, pt->mp, NULL);
+			}
+		} else if (pt->dir == RTE_PDUMP_FLAG_RX) {
+			if (pt->dump_by_type == DEVICE_ID)
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						pt->dir, pt->rx_ring,
+						pt->mp, NULL);
+			else if (pt->dump_by_type == PORT_ID)
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						pt->dir,
+						pt->rx_ring, pt->mp, NULL);
+		} else if (pt->dir == RTE_PDUMP_FLAG_TX) {
+			if (pt->dump_by_type == DEVICE_ID)
+				ret = rte_pdump_enable_by_deviceid(
+						pt->device_id,
+						pt->queue,
+						pt->dir,
+						pt->tx_ring, pt->mp, NULL);
+			else if (pt->dump_by_type == PORT_ID)
+				ret = rte_pdump_enable(pt->port, pt->queue,
+						pt->dir,
+						pt->tx_ring, pt->mp, NULL);
+		}
+		if (ret < 0 || ret1 < 0) {
+			cleanup_pdump_resources();
+			rte_exit(EXIT_FAILURE, "%s\n", rte_strerror(rte_errno));
+		}
+	}
+}
+
+static inline void
+dump_packets(void)
+{
+	int i;
+	struct pdump_tuples *pt;
+
+	while (!quit_signal) {
+		for (i = 0; i < num_tuples; i++) {
+			pt = &pdump_t[i];
+			if (pt->dir & RTE_PDUMP_FLAG_RX)
+				pdump_rxtx(pt->rx_ring, pt->rx_vdev_id,
+					&pt->stats);
+			if (pt->dir & RTE_PDUMP_FLAG_TX)
+				pdump_rxtx(pt->tx_ring, pt->tx_vdev_id,
+					&pt->stats);
+		}
+	}
+}
+
+int
+main(int argc, char **argv)
+{
+	int diag;
+	int ret;
+	int i;
+
+	char c_flag[] = "-c1";
+	char n_flag[] = "-n4";
+	char mp_flag[] = "--proc-type=secondary";
+	char *argp[argc + 3];
+
+	/* catch ctrl-c so we can print on exit */
+	signal(SIGINT, signal_handler);
+
+	argp[0] = argv[0];
+	argp[1] = c_flag;
+	argp[2] = n_flag;
+	argp[3] = mp_flag;
+
+	for (i = 1; i < argc; i++)
+		argp[i + 3] = argv[i];
+
+	argc += 3;
+
+	diag = rte_eal_init(argc, argp);
+	if (diag < 0)
+		rte_panic("Cannot init EAL\n");
+
+	argc -= diag;
+	argv += (diag - 3);
+
+	/* parse app arguments */
+	if (argc > 1) {
+		ret = launch_args_parse(argc, argv, argp[0]);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "Invalid argument\n");
+	}
+
+	/* create mempool, ring and vdevs info */
+	create_mp_ring_vdev();
+	enable_pdump();
+	dump_packets();
+
+	cleanup_pdump_resources();
+	/* dump debug stats */
+	print_pdump_stats();
+
+	return 0;
+}
diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index 2137779..9e08e9f 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -69,6 +69,7 @@ New Features
 * **Added packet capture framework.**
 
   * A new library ``librte_pdump`` is added to provide packet capture APIs.
+  * A new ``app/pdump`` tool is added to capture packets in DPDK.
 
 
 Resolved Issues
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 930f68c..96bb317 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -76,6 +76,7 @@ Sample Applications User Guide
     ptpclient
     performance_thread
     ipsec_secgw
+    pdump
 
 **Figures**
 
diff --git a/doc/guides/sample_app_ug/pdump.rst b/doc/guides/sample_app_ug/pdump.rst
new file mode 100644
index 0000000..96c8709
--- /dev/null
+++ b/doc/guides/sample_app_ug/pdump.rst
@@ -0,0 +1,122 @@
+
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+dpdk_pdump Application
+======================
+
+The ``dpdk_pdump`` application is a Data Plane Development Kit (DPDK) application that runs as a DPDK secondary process and
+is capable of enabling packet capture on DPDK ports.
+
+
+Running the Application
+-----------------------
+
+The application has a ``--pdump`` command line option with various sub arguments:
+
+.. code-block:: console
+
+   ./build/app/dpdk_pdump --
+                          --pdump '(port=<port id> | device_id=<pci id or vdev name>),
+                                   (queue=<queue_id>),
+                                   (rx-dev=<iface or pcap file> |
+                                    tx-dev=<iface or pcap file>),
+                                   [ring-size=<ring size>],
+                                   [mbuf-size=<mbuf data size>],
+                                   [total-num-mbufs=<number of mbufs>]'
+
+Note:
+
+* Parameters inside the parentheses represent mandatory parameters.
+
+* Parameters inside the square brackets represent optional parameters.
+
+Multiple instances of ``--pdump`` can be passed to capture packets on different port and queue combinations.
+
+
+Parameters
+~~~~~~~~~~
+
+``port``:
+Port id of the eth device on which packets should be captured.
+
+``device_id``:
+PCI address or name of the eth device on which packets should be captured.
+
+   .. Note::
+
+      * As of now the ``dpdk_pdump`` tool cannot capture the packets of virtual devices
+        in the primary process due to a bug in the ethdev library. Due to this bug, in a multi process context,
+        when the primary and secondary have different ports set, then the secondary process
+        (here the ``dpdk_pdump`` tool) overwrites the ``rte_eth_devices[]`` entries of the primary process.
+
+``queue``:
+Queue id of the eth device on which packets should be captured. The user can pass a queue value of ``*`` to enable
+packet capture on all queues of the eth device.
+
+``rx-dev``:
+Can be either a pcap file name or any Linux iface.
+
+``tx-dev``:
+Can be either a pcap file name or any Linux iface.
+
+   .. Note::
+
+      * To receive ingress packets only, ``rx-dev`` should be passed.
+
+      * To receive egress packets only, ``tx-dev`` should be passed.
+
+      * To receive ingress and egress packets separately, ``rx-dev`` and ``tx-dev``
+        should both be passed with different file names or Linux iface names.
+
+      * To receive ingress and egress packets together, ``rx-dev`` and ``tx-dev``
+        should both be passed with the same file name or the same Linux iface name.
+
+``ring-size``:
+Size of the ring. This value is used internally for ring creation. The ring will be used to enqueue the packets from
+the primary application to the secondary. This is an optional parameter with default size 16384.
+
+``mbuf-size``:
+Size of the mbuf data. This is used internally for mempool creation. Ideally this value should be the same as
+the primary application's mempool's mbuf data size which is used for packet RX. This is an optional parameter with
+default size 2176.
+
+``total-num-mbufs``:
+Total number of mbufs in the mempool. This is used internally for mempool creation. This is an optional parameter with default
+value 65535.
+
+
+Example
+-------
+
+.. code-block:: console
+
+   $ sudo ./build/app/dpdk_pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'
-- 
2.5.0
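
For anyone who wants to try the range and power-of-two checks that parse_uint_value() applies to ring-size, mbuf-size and total-num-mbufs without building DPDK, below is a minimal standalone sketch. It is not code from the patch: parse_bounded_uint() and is_power_of_2() are hypothetical names, and is_power_of_2() only stands in for the tool's POWEROF2() macro. Rejecting an empty string and a leading minus sign is an assumption of this sketch rather than behaviour taken from the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tool's POWEROF2() macro. */
static int
is_power_of_2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Parse a decimal string into *out. The value is accepted only if the
 * whole string is consumed, the result lies in [min, max] and, when
 * need_pow2 is non-zero, the result is a power of two.
 * Returns 0 on success and -EINVAL otherwise; *out is left untouched
 * on failure.
 */
static int
parse_bounded_uint(const char *s, uint64_t min, uint64_t max,
		int need_pow2, uint64_t *out)
{
	char *end;
	unsigned long long t;

	/* assumption of this sketch: reject "" and negative input early */
	if (s == NULL || *s == '\0' || *s == '-')
		return -EINVAL;

	errno = 0;
	t = strtoull(s, &end, 10);
	if (errno != 0 || *end != '\0' || t < min || t > max)
		return -EINVAL;

	if (need_pow2 && !is_power_of_2(t))
		return -EINVAL;

	*out = t;
	return 0;
}
```

With this helper a ring-size of "16384" would be accepted and "1000" rejected when need_pow2 is set, while total-num-mbufs style values only need to fall inside the given range.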


* [PATCH v10 7/7] app/testpmd: add pdump initialization uninitialization
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
                         ` (5 preceding siblings ...)
  2016-06-15 14:06       ` [PATCH v10 6/7] app/pdump: add pdump tool for packet capturing Reshma Pattan
@ 2016-06-15 14:06       ` Reshma Pattan
  2016-06-16 21:55       ` [PATCH v10 0/7] add packet capture framework Thomas Monjalon
  7 siblings, 0 replies; 67+ messages in thread
From: Reshma Pattan @ 2016-06-15 14:06 UTC (permalink / raw)
  To: dev; +Cc: Reshma Pattan

Call rte_pdump_init and rte_pdump_uninit for packet
capturing initialization and uninitialization.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 app/test-pmd/testpmd.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index dd6b046..b26f5be 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -76,6 +76,9 @@
 #ifdef RTE_LIBRTE_PMD_XENVIRT
 #include <rte_eth_xenvirt.h>
 #endif
+#ifdef RTE_LIBRTE_PDUMP
+#include <rte_pdump.h>
+#endif
 
 #include "testpmd.h"
 
@@ -2029,6 +2032,10 @@ signal_handler(int signum)
 	if (signum == SIGINT || signum == SIGTERM) {
 		printf("\nSignal %d received, preparing to exit...\n",
 				signum);
+#ifdef RTE_LIBRTE_PDUMP
+		/* uninitialize packet capture framework */
+		rte_pdump_uninit();
+#endif
 		force_quit();
 		/* exit with the expected status */
 		signal(signum, SIG_DFL);
@@ -2049,6 +2056,11 @@ main(int argc, char** argv)
 	if (diag < 0)
 		rte_panic("Cannot init EAL\n");
 
+#ifdef RTE_LIBRTE_PDUMP
+	/* initialize packet capture framework */
+	rte_pdump_init(NULL);
+#endif
+
 	nb_ports = (portid_t) rte_eth_dev_count();
 	if (nb_ports == 0)
 		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
-- 
2.5.0


* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 13:29                     ` Bruce Richardson
@ 2016-06-15 14:07                       ` Ivan Boule
  2016-06-15 14:19                         ` Bruce Richardson
  2016-06-15 14:20                         ` Ananyev, Konstantin
  0 siblings, 2 replies; 67+ messages in thread
From: Ivan Boule @ 2016-06-15 14:07 UTC (permalink / raw)
  To: Bruce Richardson, Ananyev, Konstantin
  Cc: Thomas Monjalon, Pattan, Reshma, dev

On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
>> Hi Ivan,
>>
>>> -----Original Message-----
>>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
>>> Sent: Wednesday, June 15, 2016 1:15 PM
>>> To: Thomas Monjalon; Ananyev, Konstantin
>>> Cc: Pattan, Reshma; dev@dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>
>>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
>>>
>>>>>
>>>>>> I think the read access would need locking but we do not want it
>>>>>> in fast path.
>>>>>
>>>>> I don't think it would be needed.
>>>>> As I said - read/write interaction didn't change from what we have right now.
>>>>> But if you have some particular scenario in mind that you believe would cause
>>>>> a race condition - please speak up.
>>>>
>>>> If we add/remove a callback during a burst? Is it possible that the next
>>>> pointer would have a wrong value leading to a crash?
>>>> Maybe we need a comment to state that we should not alter burst
>>>> callbacks while running burst functions.
>>>>
>>>
>>> Hi Reshma,
>>>
>>> You claim that the "rte_eth_rx_cb_lock" does not need to be held in the
>>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
>>> of RX callbacks associated with the polled RX queue to safely remove RX
>>> callback(s) in parallel.
>>> The problem is not [only] with the setting and the loading of "cb->next"
>>> that you assume to be atomic operations, which is certainly true on most
>>> CPUs.
>>> I see the 2 important following issues:
>>>
>>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
>>> RX callback could still be accessed in the callback parsing loop of the
>>> function "rte_eth_rx_burst()" after having been freed in parallel.
>>>
>>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
>>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
>>> does not free the "rte_eth_rxtx_callback" data structure associated with
>>> the removed callback !
>>
>> Yes, though it is documented behaviour, someone can probably
>> refer it as a feature, not a bug ;)
>>
>
> +1
> This is definitely not a bug, this is absolutely by design. One may argue with
> the design, but it was done for a definite reason, so as to avoid paying the
> penalty of having locks. It pushes more responsibility onto the app, but it
> does allow the app to choose the best solution for managing the freeing of
> memory for its situation. The alternative is to force all apps to pay the cost
> of having locks, even if better options for freeing the memory are available.
>
> /Bruce
>

-1 (not to say 0xFFFFFFFF)

This is definitely an API design bug!
I would say that if you don't know how to free a resource that you
allocate, it is very likely that you are wrong to allocate it.
And this is exactly what happens here with RX/TX callback data structures.
This problem can easily be addressed by just changing the API as follows:

Change
     void *
     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
                             rte_rx_callback_fn fn, void *user_param)

to
     int
     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
                             struct rte_eth_rxtx_callback *cb)

In addition to solving the problem, this approach makes the API
consistent and lets the application allocate "rte_eth_rxtx_callback" data
structures through any appropriate means.
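
To make the caller-allocated variant concrete, here is a minimal standalone sketch of the idea. It is not ethdev code: the struct layout and the names add_rx_callback(), remove_rx_callback(), run_rx_callbacks() and count_cb() are hypothetical, "struct pkt" only stands in for the mbuf type, and the sketch is single-threaded with no synchronization at all. The only point it tries to show is that the library never allocates or frees the node, so the application decides when the storage may be reused after removal.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt;	/* opaque packet type, stands in for the mbuf type */

typedef uint16_t (*rx_cb_fn)(struct pkt **pkts, uint16_t nb, void *param);

/* Caller-owned callback node (hypothetical layout). */
struct rx_callback {
	struct rx_callback *next;
	rx_cb_fn fn;
	void *param;
};

/* Append cb to the list; nothing is allocated here. */
static int
add_rx_callback(struct rx_callback **head, struct rx_callback *cb)
{
	struct rx_callback **tail = head;

	if (head == NULL || cb == NULL || cb->fn == NULL)
		return -1;
	while (*tail != NULL) {
		if (*tail == cb)
			return -1;	/* already registered */
		tail = &(*tail)->next;
	}
	cb->next = NULL;
	*tail = cb;
	return 0;
}

/* Unlink cb from the list; its storage stays with the caller. */
static int
remove_rx_callback(struct rx_callback **head, struct rx_callback *cb)
{
	if (head == NULL || cb == NULL)
		return -1;
	for (struct rx_callback **p = head; *p != NULL; p = &(*p)->next) {
		if (*p == cb) {
			*p = cb->next;
			return 0;
		}
	}
	return -1;	/* not found */
}

/* Run every registered callback in order, as a burst function would. */
static uint16_t
run_rx_callbacks(struct rx_callback *head, struct pkt **pkts, uint16_t nb)
{
	for (struct rx_callback *cb = head; cb != NULL; cb = cb->next)
		nb = cb->fn(pkts, nb, cb->param);
	return nb;
}

/* Example callback: counts its invocations through param. */
static uint16_t
count_cb(struct pkt **pkts, uint16_t nb, void *param)
{
	(void)pkts;
	(*(unsigned int *)param)++;
	return nb;
}
```

In a real multi-threaded datapath the application would still have to make sure no polling thread is walking the list before it reuses or frees a removed node; the sketch deliberately leaves that policy to the caller.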

Regards,
Ivan

-- 
Ivan Boule
6WIND Development Engineer


* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 14:07                       ` Ivan Boule
@ 2016-06-15 14:19                         ` Bruce Richardson
  2016-06-15 14:20                         ` Ananyev, Konstantin
  1 sibling, 0 replies; 67+ messages in thread
From: Bruce Richardson @ 2016-06-15 14:19 UTC (permalink / raw)
  To: Ivan Boule; +Cc: Ananyev, Konstantin, Thomas Monjalon, Pattan, Reshma, dev

On Wed, Jun 15, 2016 at 04:07:20PM +0200, Ivan Boule wrote:
> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> >On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> >>Hi Ivan,
> >>
> >>>-----Original Message-----
> >>>From: Ivan Boule [mailto:ivan.boule@6wind.com]
> >>>Sent: Wednesday, June 15, 2016 1:15 PM
> >>>To: Thomas Monjalon; Ananyev, Konstantin
> >>>Cc: Pattan, Reshma; dev@dpdk.org
> >>>Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> >>>
> >>>On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> >>>
> >>>>>
> >>>>>>I think the read access would need locking but we do not want it
> >>>>>>in fast path.
> >>>>>
> >>>>>I don't think it would be needed.
> >>>>>As I said - read/write interaction didn't change from what we have right now.
> >>>>>But if you have some particular scenario in mind that you believe would cause
> >>>>>a race condition - please speak up.
> >>>>
> >>>>If we add/remove a callback during a burst? Is it possible that the next
> >>>>pointer would have a wrong value leading to a crash?
> >>>>Maybe we need a comment to state that we should not alter burst
> >>>>callbacks while running burst functions.
> >>>>
> >>>
> >>>Hi Reshma,
> >>>
> >>>You claim that the "rte_eth_rx_cb_lock" does not need to be held in the
> >>>function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> >>>of RX callbacks associated with the polled RX queue to safely remove RX
> >>>callback(s) in parallel.
> >>>The problem is not [only] with the setting and the loading of "cb->next"
> >>>that you assume to be atomic operations, which is certainly true on most
> >>>CPUs.
> >>>I see the 2 important following issues:
> >>>
> >>>1) the "rte_eth_rxtx_callback" data structure associated with a removed
> >>>RX callback could still be accessed in the callback parsing loop of the
> >>>function "rte_eth_rx_burst()" after having been freed in parallel.
> >>>
> >>>BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> >>>MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> >>>does not free the "rte_eth_rxtx_callback" data structure associated with
> >>>the removed callback !
> >>
> >>Yes, though it is documented behaviour, someone can probably
> >>refer it as a feature, not a bug ;)
> >>
> >
> >+1
> >This is definitely not a bug, this is absolutely by design. One may argue with
> >the design, but it was done for a definite reason, so as to avoid paying the
> >penalty of having locks. It pushes more responsibility onto the app, but it
> >does allow the app to choose the best solution for managing the freeing of
> >memory for its situation. The alternative is to force all apps to pay the cost
> >of having locks, even if better options for freeing the memory are available.
> >
> >/Bruce
> >
> 
> -1 (not to say 0xFFFFFFFF)
> 
> This is definitely an API design bug!
> I would say that if you don't know how to free a resource that you allocate,
> it is very likely that you are wrong to allocate it.
> And this is exactly what happens here with RX/TX callback data structures.
> This problem can easily be addressed by just changing the API as follows:
> 
> Change
>     void *
>     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                             rte_rx_callback_fn fn, void *user_param)
> 
> to
>     int
>     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                             struct rte_eth_rxtx_callback *cb)
> 
> In addition to solving the problem, this approach makes the API consistent
> and lets the application allocate "rte_eth_rxtx_callback" data structures
> through any appropriate means.
> 

That looks like a reasonable change to me. It keeps the important part of the
existing API behaviour, while making the API more consistent.

/Bruce


* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 14:07                       ` Ivan Boule
  2016-06-15 14:19                         ` Bruce Richardson
@ 2016-06-15 14:20                         ` Ananyev, Konstantin
  2016-06-15 14:22                           ` Bruce Richardson
  1 sibling, 1 reply; 67+ messages in thread
From: Ananyev, Konstantin @ 2016-06-15 14:20 UTC (permalink / raw)
  To: Ivan Boule, Richardson, Bruce; +Cc: Thomas Monjalon, Pattan, Reshma, dev



> -----Original Message-----
> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> Sent: Wednesday, June 15, 2016 3:07 PM
> To: Richardson, Bruce; Ananyev, Konstantin
> Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> >> Hi Ivan,
> >>
> >>> -----Original Message-----
> >>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> >>> Sent: Wednesday, June 15, 2016 1:15 PM
> >>> To: Thomas Monjalon; Ananyev, Konstantin
> >>> Cc: Pattan, Reshma; dev@dpdk.org
> >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> >>>
> >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> >>>
> >>>>>
> >>>>>> I think the read access would need locking but we do not want it
> >>>>>> in fast path.
> >>>>>
> >>>>> I don't think it would be needed.
> >>>>> As I said - read/write interaction didn't change from what we have right now.
> >>>>> But if you have some particular scenario in mind that you believe would cause
> >>>>> a race condition - please speak up.
> >>>>
> >>>> If we add/remove a callback during a burst? Is it possible that the next
> >>>> pointer would have a wrong value leading to a crash?
> >>>> Maybe we need a comment to state that we should not alter burst
> >>>> callbacks while running burst functions.
> >>>>
> >>>
> >>> Hi Reshma,
> >>>
> >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> >>> of RX callbacks associated with the polled RX queue to safely remove RX
> >>> callback(s) in parallel.
> >>> The problem is not [only] with the setting and the loading of "cb->next"
> >>> that you assume to be atomic operations, which is certainly true on most
> >>> CPUs.
> >>> I see the 2 important following issues:
> >>>
> >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> >>> RX callback could still be accessed in the callback parsing loop of the
> >>> function "rte_eth_rx_burst()" after having been freed in parallel.
> >>>
> >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> >>> does not free the "rte_eth_rxtx_callback" data structure associated with
> >>> the removed callback !
> >>
> >> Yes, though it is documented behaviour, someone can probably
> >> refer it as a feature, not a bug ;)
> >>
> >
> > +1
> > This is definitely not a bug, this is absolutely by design. One may argue with
> > the design, but it was done for a definite reason, so as to avoid paying the
> > penalty of having locks. It pushes more responsibility onto the app, but it
> > does allow the app to choose the best solution for managing the freeing of
> > memory for its situation. The alternative is to force all apps to pay the cost
> > of having locks, even if better options for freeing the memory are available.
> >
> > /Bruce
> >
> 
> -1 (not to say 0xFFFFFFFF)
> 
> This is definitely an API design bug !
> I would say that if you don't know how to free a resource that you
> allocate, it is very likely that you are wrong allocating it.
> And this is exactly what happens here with RX/TX callback data structures.
> This problem can easily be addressed by just changing the API as follows:
> 
> Change
>      void *
>      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                              rte_rx_callback_fn fn, void *user_param)
> 
> to
>      int
>      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                              struct rte_eth_rxtx_callback *cb)
> 
> In addition of solving the problem, this approach makes the API
> consistent and let the application allocate "rte_eth_rxtx_callback" data
> structures through any appropriate mean.

That might make the API a bit cleaner, but I don't see how it fixes the generic problem:
the upper layer still has no way to know when it is safe to free/re-use a
removed callback, other than making sure that all I/O on that queue is stopped
(i.e. some external synchronisation around the queue).
As you said in your previous mail:
> This is an example of a well-known more generic object deletion problem
> which must arrange to guarantee that a deleted object is not used and
> not accessible for use anymore before being actually deleted (freed, for
> instance).
Konstantin

> 
> Regards,
> Ivan
> 
> --
> Ivan Boule
> 6WIND Development Engineer


* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 14:20                         ` Ananyev, Konstantin
@ 2016-06-15 14:22                           ` Bruce Richardson
  2016-06-15 14:27                             ` Ananyev, Konstantin
  0 siblings, 1 reply; 67+ messages in thread
From: Bruce Richardson @ 2016-06-15 14:22 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: Ivan Boule, Thomas Monjalon, Pattan, Reshma, dev

On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > Sent: Wednesday, June 15, 2016 3:07 PM
> > To: Richardson, Bruce; Ananyev, Konstantin
> > Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > 
> > On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> > > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> > >> Hi Ivan,
> > >>
> > >>> -----Original Message-----
> > >>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > >>> Sent: Wednesday, June 15, 2016 1:15 PM
> > >>> To: Thomas Monjalon; Ananyev, Konstantin
> > >>> Cc: Pattan, Reshma; dev@dpdk.org
> > >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > >>>
> > >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> > >>>
> > >>>>>
> > >>>>>> I think the read access would need locking but we do not want it
> > >>>>>> in fast path.
> > >>>>>
> > >>>>> I don't think it would be needed.
> > >>>>> As I said - read/write interaction didn't change from what we have right now.
> > >>>>> But if you have some particular scenario in mind that you believe would cause
> > >>>>> a race condition - please speak up.
> > >>>>
> > >>>> If we add/remove a callback during a burst? Is it possible that the next
> > >>>> pointer would have a wrong value leading to a crash?
> > >>>> Maybe we need a comment to state that we should not alter burst
> > >>>> callbacks while running burst functions.
> > >>>>
> > >>>
> > >>> Hi Reshma,
> > >>>
> > >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> > >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> > >>> of RX callbacks associated with the polled RX queue to safely remove RX
> > >>> callback(s) in parallel.
> > >>> The problem is not [only] with the setting and the loading of "cb->next"
> > >>> that you assume to be atomic operations, which is certainly true on most
> > >>> CPUs.
> > >>> I see the 2 important following issues:
> > >>>
> > >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> > >>> RX callback could still be accessed in the callback parsing loop of the
> > >>> function "rte_eth_rx_burst()" after having been freed in parallel.
> > >>>
> > >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> > >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> > >>> does not free the "rte_eth_rxtx_callback" data structure associated with
> > >>> the removed callback !
> > >>
> > >> Yes, though it is documented behaviour, someone can probably
> > >> refer it as a feature, not a bug ;)
> > >>
> > >
> > > +1
> > > This is definitely not a bug, this is absolutely by design. One may argue with
> > > the design, but it was done for a definite reason, so as to avoid paying the
> > > penalty of having locks. It pushes more responsibility onto the app, but it
> > > does allow the app to choose the best solution for managing the freeing of
> > > memory for its situation. The alternative is to force all apps to pay the cost
> > > of having locks, even if better options for freeing the memory are available.
> > >
> > > /Bruce
> > >
> > 
> > -1 (not to say 0xFFFFFFFF)
> > 
> > This is definitely an API design bug !
> > I would say that if you don't know how to free a resource that you
> > allocate, it is very likely that you are wrong allocating it.
> > And this is exactly what happens here with RX/TX callback data structures.
> > This problem can easily be addressed by just changing the API as follows:
> > 
> > Change
> >      void *
> >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> >                              rte_rx_callback_fn fn, void *user_param)
> > 
> > to
> >      int
> >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> >                              struct rte_eth_rxtx_callback *cb)
> > 
> > In addition of solving the problem, this approach makes the API
> > consistent and let the application allocate "rte_eth_rxtx_callback" data
> > structures through any appropriate mean.
> 
> That might make API a bit cleaner, but I don't see how it fixes the generic problem:
> there is still no way to know by the upper layer when it is safe to free/re-use
> removed callback, but to make sure that all IO on that queue is stopped
> (I.E. some external synchronisation around the queue).   

Actually, it allows other, more creative solutions, like an app having a
statically allocated set/pool of callback structures that it doesn't need to
allocate or free at all.
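
A minimal sketch of such a statically allocated pool, under stated
assumptions: the names (pool_cb, pool_get, pool_put, DEMO_POOL_SIZE) are
hypothetical, the pool is managed from a single control thread, and nothing
here is DPDK API. It shows that no allocation or free is needed, but also that
returning a slot to the pool is the point where the application must already
know the datapath no longer references it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 4

/* Hypothetical callback record living in static storage. */
struct pool_cb {
    bool in_use;
    void (*fn)(void *param);
    void *param;
};

static struct pool_cb demo_pool[DEMO_POOL_SIZE];

/* Hand out the first free slot, or NULL if the pool is exhausted. */
struct pool_cb *pool_get(void)
{
    size_t i;

    for (i = 0; i < DEMO_POOL_SIZE; i++) {
        if (!demo_pool[i].in_use) {
            demo_pool[i].in_use = true;
            return &demo_pool[i];
        }
    }
    return NULL;
}

/*
 * Return a slot to the pool. Assumption: the caller only does this once it
 * knows that no burst in progress can still dereference the slot, because
 * the next pool_get() may hand it out and overwrite its contents.
 */
void pool_put(struct pool_cb *cb)
{
    cb->fn = NULL;
    cb->param = NULL;
    cb->in_use = false;
}
```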

/Bruce


* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 14:22                           ` Bruce Richardson
@ 2016-06-15 14:27                             ` Ananyev, Konstantin
  2016-06-15 15:33                               ` Ivan Boule
  0 siblings, 1 reply; 67+ messages in thread
From: Ananyev, Konstantin @ 2016-06-15 14:27 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: Ivan Boule, Thomas Monjalon, Pattan, Reshma, dev



> -----Original Message-----
> From: Richardson, Bruce
> Sent: Wednesday, June 15, 2016 3:22 PM
> To: Ananyev, Konstantin
> Cc: Ivan Boule; Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > > Sent: Wednesday, June 15, 2016 3:07 PM
> > > To: Richardson, Bruce; Ananyev, Konstantin
> > > Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > >
> > > On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> > > > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> > > >> Hi Ivan,
> > > >>
> > > >>> -----Original Message-----
> > > >>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > > >>> Sent: Wednesday, June 15, 2016 1:15 PM
> > > >>> To: Thomas Monjalon; Ananyev, Konstantin
> > > >>> Cc: Pattan, Reshma; dev@dpdk.org
> > > >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > > >>>
> > > >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> > > >>>
> > > >>>>>
> > > >>>>>> I think the read access would need locking but we do not want it
> > > >>>>>> in fast path.
> > > >>>>>
> > > >>>>> I don't think it would be needed.
> > > >>>>> As I said - read/write interaction didn't change from what we have right now.
> > > >>>>> But if you have some particular scenario in mind that you believe would cause
> > > >>>>> a race condition - please speak up.
> > > >>>>
> > > >>>> If we add/remove a callback during a burst? Is it possible that the next
> > > >>>> pointer would have a wrong value leading to a crash?
> > > >>>> Maybe we need a comment to state that we should not alter burst
> > > >>>> callbacks while running burst functions.
> > > >>>>
> > > >>>
> > > >>> Hi Reshma,
> > > >>>
> > > >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> > > >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> > > >>> of RX callbacks associated with the polled RX queue to safely remove RX
> > > >>> callback(s) in parallel.
> > > >>> The problem is not [only] with the setting and the loading of "cb->next"
> > > >>> that you assume to be atomic operations, which is certainly true on most
> > > >>> CPUs.
> > > >>> I see the 2 important following issues:
> > > >>>
> > > >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> > > >>> RX callback could still be accessed in the callback parsing loop of the
> > > >>> function "rte_eth_rx_burst()" after having been freed in parallel.
> > > >>>
> > > >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> > > >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> > > >>> does not free the "rte_eth_rxtx_callback" data structure associated with
> > > >>> the removed callback !
> > > >>
> > > >> Yes, though it is documented behaviour, someone can probably
> > > >> refer it as a feature, not a bug ;)
> > > >>
> > > >
> > > > +1
> > > > This is definitely not a bug, this is absolutely by design. One may argue with
> > > > the design, but it was done for a definite reason, so as to avoid paying the
> > > > penalty of having locks. It pushes more responsibility onto the app, but it
> > > > does allow the app to choose the best solution for managing the freeing of
> > > > memory for its situation. The alternative is to force all apps to pay the cost
> > > > of having locks, even if better options for freeing the memory are available.
> > > >
> > > > /Bruce
> > > >
> > >
> > > -1 (not to say 0xFFFFFFFF)
> > >
> > > This is definitely an API design bug !
> > > I would say that if you don't know how to free a resource that you
> > > allocate, it is very likely that you are wrong allocating it.
> > > And this is exactly what happens here with RX/TX callback data structures.
> > > This problem can easily be addressed by just changing the API as follows:
> > >
> > > Change
> > >      void *
> > >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> > >                              rte_rx_callback_fn fn, void *user_param)
> > >
> > > to
> > >      int
> > >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> > >                              struct rte_eth_rxtx_callback *cb)
> > >
> > > In addition of solving the problem, this approach makes the API
> > > consistent and let the application allocate "rte_eth_rxtx_callback" data
> > > structures through any appropriate mean.
> >
> > That might make API a bit cleaner, but I don't see how it fixes the generic problem:
> > there is still no way to know by the upper layer when it is safe to free/re-use
> > removed callback, but to make sure that all IO on that queue is stopped
> > (I.E. some external synchronisation around the queue).
> 
> Actually, it allows other, more creative solutions, like an app having a
> statically allocated set/pool of callback structures that it doesn't need to
> allocate or free at all.

I understand that, and as I said I am not against that change.
But it doesn't solve the problem in general.
Ok, if you have a static object, you wouldn't need to call free() for it,
but you still want to re-use it after remove_cb() has finished, right?
So there is no actual difference - the main question is:
at what point after remove_cb() is it safe to modify its contents?
Konstantin

> 
> /Bruce


* Re: [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
  2016-06-15 14:27                             ` Ananyev, Konstantin
@ 2016-06-15 15:33                               ` Ivan Boule
  0 siblings, 0 replies; 67+ messages in thread
From: Ivan Boule @ 2016-06-15 15:33 UTC (permalink / raw)
  To: Ananyev, Konstantin, Richardson, Bruce
  Cc: Thomas Monjalon, Pattan, Reshma, dev

On 06/15/2016 04:27 PM, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Richardson, Bruce
>> Sent: Wednesday, June 15, 2016 3:22 PM
>> To: Ananyev, Konstantin
>> Cc: Ivan Boule; Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>
>> On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
>>>> Sent: Wednesday, June 15, 2016 3:07 PM
>>>> To: Richardson, Bruce; Ananyev, Konstantin
>>>> Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>>
>>>> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
>>>>> On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
>>>>>> Hi Ivan,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
>>>>>>> Sent: Wednesday, June 15, 2016 1:15 PM
>>>>>>> To: Thomas Monjalon; Ananyev, Konstantin
>>>>>>> Cc: Pattan, Reshma; dev@dpdk.org
>>>>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>>>>>
>>>>>>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>>> I think the read access would need locking but we do not want it
>>>>>>>>>> in fast path.
>>>>>>>>>
>>>>>>>>> I don't think it would be needed.
>>>>>>>>> As I said - read/write interaction didn't change from what we have right now.
>>>>>>>>> But if you have some particular scenario in mind that you believe would cause
>>>>>>>>> a race condition - please speak up.
>>>>>>>>
>>>>>>>> If we add/remove a callback during a burst? Is it possible that the next
>>>>>>>> pointer would have a wrong value leading to a crash?
>>>>>>>> Maybe we need a comment to state that we should not alter burst
>>>>>>>> callbacks while running burst functions.
>>>>>>>>
>>>>>>>
>>>>>>> Hi Reshma,
>>>>>>>
>>>>>>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
>>>>>>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
>>>>>>> of RX callbacks associated with the polled RX queue to safely remove RX
>>>>>>> callback(s) in parallel.
>>>>>>> The problem is not [only] with the setting and the loading of "cb->next"
>>>>>>> that you assume to be atomic operations, which is certainly true on most
>>>>>>> CPUs.
>>>>>>> I see the 2 important following issues:
>>>>>>>
>>>>>>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
>>>>>>> RX callback could still be accessed in the callback parsing loop of the
>>>>>>> function "rte_eth_rx_burst()" after having been freed in parallel.
>>>>>>>
>>>>>>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
>>>>>>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
>>>>>>> does not free the "rte_eth_rxtx_callback" data structure associated with
>>>>>>> the removed callback !
>>>>>>
>>>>>> Yes, though it is documented behaviour, someone can probably
>>>>>> refer it as a feature, not a bug ;)
>>>>>>
>>>>>
>>>>> +1
>>>>> This is definitely not a bug, this is absolutely by design. One may argue with
>>>>> the design, but it was done for a definite reason, so as to avoid paying the
>>>>> penalty of having locks. It pushes more responsibility onto the app, but it
>>>>> does allow the app to choose the best solution for managing the freeing of
>>>>> memory for its situation. The alternative is to force all apps to pay the cost
>>>>> of having locks, even if better options for freeing the memory are available.
>>>>>
>>>>> /Bruce
>>>>>
>>>>
>>>> -1 (not to say 0xFFFFFFFF)
>>>>
>>>> This is definitely an API design bug !
>>>> I would say that if you don't know how to free a resource that you
>>>> allocate, it is very likely that you are wrong allocating it.
>>>> And this is exactly what happens here with RX/TX callback data structures.
>>>> This problem can easily be addressed by just changing the API as follows:
>>>>
>>>> Change
>>>>       void *
>>>>       rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>>>>                               rte_rx_callback_fn fn, void *user_param)
>>>>
>>>> to
>>>>       int
>>>>       rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>>>>                               struct rte_eth_rxtx_callback *cb)
>>>>
>>>> In addition of solving the problem, this approach makes the API
>>>> consistent and let the application allocate "rte_eth_rxtx_callback" data
>>>> structures through any appropriate mean.
>>>
>>> That might make API a bit cleaner, but I don't see how it fixes the generic problem:
>>> there is still no way to know by the upper layer when it is safe to free/re-use
>>> removed callback, but to make sure that all IO on that queue is stopped
>>> (I.E. some external synchronisation around the queue).
>>
>> Actually, it allows other, more creative solutions, like an app having a
>> statically allocated set/pool of callback structures that it doesn't need to
>> allocate or free at all.
>
> I understand that, and as I said I am not against that change.
> But it doesn't solve the problem in general.
> Ok, if you have a static object, you wouldn't need to call free() for it,
> but you still want to re-use it after remove_cb() has finished, right?
> So there is no actual difference - the main question is:
> at what point after remove_cb() it is safe to modify it's contents.
> Konstantin
>
>>
>> /Bruce
>
Hi Bruce/Konstantin

There is no need to use locks to ensure that an RX callback being removed
is not being executed/invoked and will never be invoked again.
This issue can easily be addressed at application level through the
common synchronization mechanism in which the control thread of the
application that manages the adding/removal of RX callbacks does the
following:
1) Ask the packet processing function of the application to stop
invoking the function rte_eth_rx_burst() on the target RX queue.
2) Once the packet processing function has acknowledged that it stopped
polling the target RX queue, safely remove the RX callback and free
whatever resource needs to be freed.
3) Enable the packet processing function of the application to invoke
the function rte_eth_rx_burst() again on the target RX queue.
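
The three steps above can be sketched as follows, using C11 atomics and
pthreads. This is only an illustration under stated assumptions: one polling
thread, one control thread, and hypothetical names throughout (pause_req,
paused_ack, quit_req, cb_counter, demo_worker, demo_quiesce). The increment of
*cb_counter stands in for rte_eth_rx_burst() running the callback; no DPDK
function is called.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static atomic_bool pause_req;   /* control -> worker: stop polling */
static atomic_bool paused_ack;  /* worker -> control: polling has stopped */
static atomic_bool quit_req;    /* control -> worker: exit the loop */
static int *cb_counter;         /* resource used by the "callback" */

/* Packet processing loop: polls unless asked to pause. */
static void *demo_worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&quit_req)) {
        if (atomic_load(&pause_req)) {
            atomic_store(&paused_ack, true);   /* ack: no longer polling */
            while (atomic_load(&pause_req))
                sched_yield();
            atomic_store(&paused_ack, false);
            continue;                          /* re-check flags before polling */
        }
        /* Stand-in for rte_eth_rx_burst() invoking the callback. */
        if (cb_counter != NULL)
            (*cb_counter)++;
    }
    return NULL;
}

/* Control thread: pause, remove and free, resume. Returns 0 on success. */
int demo_quiesce(void)
{
    pthread_t tid;
    int *counter = calloc(1, sizeof(*counter));

    if (counter == NULL)
        return -1;
    cb_counter = counter;
    if (pthread_create(&tid, NULL, demo_worker, NULL) != 0) {
        cb_counter = NULL;
        free(counter);
        return -1;
    }

    /* 1) Ask the worker to stop polling the queue. */
    atomic_store(&pause_req, true);
    /* 2) Wait for the acknowledgement, then remove and free. */
    while (!atomic_load(&paused_ack))
        sched_yield();
    cb_counter = NULL;
    free(counter);
    /* 3) Let the worker poll again. */
    atomic_store(&pause_req, false);

    atomic_store(&quit_req, true);
    pthread_join(tid, NULL);
    return 0;
}
```

The fast path pays for one atomic load per iteration instead of taking a lock
per burst; the cost of waiting is moved to the control thread.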

Enjoy :-)

Ivan



-- 
Ivan Boule
6WIND Development Engineer


* Re: [PATCH v9 5/8] pdump: add new library for packet capturing support
  2016-06-15  9:32           ` Thomas Monjalon
  2016-06-15  9:43             ` Bruce Richardson
@ 2016-06-15 15:44             ` Mcnamara, John
  1 sibling, 0 replies; 67+ messages in thread
From: Mcnamara, John @ 2016-06-15 15:44 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Pattan, Reshma, dev


> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, June 15, 2016 10:33 AM
> To: Mcnamara, John <john.mcnamara@intel.com>
> Cc: Pattan, Reshma <reshma.pattan@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 5/8] pdump: add new library for packet
> capturing support
>
> > Also, it makes it a bit harder for the documentation maintainer (me in
> this case) to see doc changes within patches and to ack just the doc part.
> From a documentation maintainer point of view it would be best to have
> any, non-trivial, doc changes in a separate patch.
> 
> I understand your concern.
> But you cannot assume every doc changes will be properly highlighted in
> the headline. I think you need to filter patches based on a content
> pattern:
> 	+++ b/doc/guides/

Hi Thomas,

That still leaves the issue of not being able to ack the doc part of the patch
separately from the rest of the patch.

John


* Re: [PATCH v10 3/7] ethdev: add new fields to ethdev info struct
  2016-06-15 14:06       ` [PATCH v10 3/7] ethdev: add new fields to ethdev info struct Reshma Pattan
@ 2016-06-16 19:14         ` Thomas Monjalon
  0 siblings, 0 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-16 19:14 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

2016-06-15 15:06, Reshma Pattan:
> The new fields nb_rx_queues and nb_tx_queues are added to the
> rte_eth_dev_info structure.
> Changes to API rte_eth_dev_info_get() are done to update these new fields
> to the rte_eth_dev_info object.

The ABI is changed, not the API.

> Release notes is updated with the changes.
[...]
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -137,4 +137,5 @@ DPDK_16.07 {
>  	global:
>  
>  	rte_eth_add_first_rx_callback;
> +	rte_eth_dev_info_get;
>  } DPDK_16.04;

Why duplicate this symbol in 16.07?
The ABI is broken anyway.


* Re: [PATCH v10 4/7] ethdev: make get port by name and get name by port public
  2016-06-15 14:06       ` [PATCH v10 4/7] ethdev: make get port by name and get name by port public Reshma Pattan
@ 2016-06-16 20:27         ` Thomas Monjalon
  0 siblings, 0 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-16 20:27 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

2016-06-15 15:06, Reshma Pattan:
> Converted rte_eth_dev_get_port_by_name to a public API.
> Converted rte_eth_dev_get_name_by_port to a public API.
> Updated the release notes with the changes.

It is not an API change, just a new API, so no need to reference
it in the release notes.


* Re: [PATCH v10 0/7] add packet capture framework
  2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
                         ` (6 preceding siblings ...)
  2016-06-15 14:06       ` [PATCH v10 7/7] app/testpmd: add pdump initialization uninitialization Reshma Pattan
@ 2016-06-16 21:55       ` Thomas Monjalon
  7 siblings, 0 replies; 67+ messages in thread
From: Thomas Monjalon @ 2016-06-16 21:55 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

> Reshma Pattan (7):
>   ethdev: use locks to protect Rx/Tx callback lists
>   ethdev: add new api to add Rx callback as head of the list
>   ethdev: add new fields to ethdev info struct
>   ethdev: make get port by name and get name by port public
>   pdump: add new library for packet capturing support
>   app/pdump: add pdump tool for packet capturing
>   app/testpmd: add pdump initialization uninitialization

Applied with small changes, thanks


Thread overview: 67+ messages
     [not found] <1465487895-5870-1-git-send-email-reshma.pattan@intel.com>
2016-06-10 16:18 ` [PATCH v8 0/8] add packet capture framework Reshma Pattan
2016-06-10 16:18   ` [PATCH v8 1/8] librte_ether: protect add/remove of rxtx callbacks with spinlocks Reshma Pattan
2016-06-10 16:18   ` [PATCH v8 2/8] librte_ether: add new api rte_eth_add_first_rx_callback Reshma Pattan
2016-06-10 16:18   ` [PATCH v8 3/8] librte_ether: add new fields to rte_eth_dev_info struct Reshma Pattan
2016-06-10 16:18   ` [PATCH v8 4/8] librte_ether: make rte_eth_dev_get_port_by_name rte_eth_dev_get_name_by_port public Reshma Pattan
2016-06-10 16:18   ` [PATCH v8 5/8] lib/librte_pdump: add new library for packet capturing support Reshma Pattan
2016-06-10 18:48     ` Aaron Conole
2016-06-10 22:14       ` Pattan, Reshma
2016-06-13 13:28         ` Aaron Conole
2016-06-10 16:18   ` [PATCH v8 6/8] app/pdump: add pdump tool for packet capturing Reshma Pattan
2016-06-10 16:18   ` [PATCH v8 7/8] app/test-pmd: add pdump initialization uninitialization Reshma Pattan
2016-06-10 16:18   ` [PATCH v8 8/8] doc: update doc for packet capture framework Reshma Pattan
2016-06-10 23:23   ` [PATCH v8 0/8] add " Neil Horman
2016-06-13  8:47     ` Pattan, Reshma
2016-06-14  9:38   ` [PATCH v9 " Reshma Pattan
2016-06-14  9:38     ` [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
2016-06-14 19:59       ` Thomas Monjalon
2016-06-15  5:30         ` Pattan, Reshma
2016-06-15  8:19           ` Thomas Monjalon
2016-06-15  8:37             ` Ananyev, Konstantin
2016-06-15  8:48               ` Thomas Monjalon
2016-06-15  9:54                 ` Ananyev, Konstantin
2016-06-15 11:17                   ` Thomas Monjalon
2016-06-15 13:49                   ` Thomas Monjalon
2016-06-15 12:15                 ` Ivan Boule
2016-06-15 12:40                   ` Ananyev, Konstantin
2016-06-15 13:29                     ` Bruce Richardson
2016-06-15 14:07                       ` Ivan Boule
2016-06-15 14:19                         ` Bruce Richardson
2016-06-15 14:20                         ` Ananyev, Konstantin
2016-06-15 14:22                           ` Bruce Richardson
2016-06-15 14:27                             ` Ananyev, Konstantin
2016-06-15 15:33                               ` Ivan Boule
2016-06-14  9:38     ` [PATCH v9 2/8] ethdev: add new api to add Rx callback as head of the list Reshma Pattan
2016-06-14 20:01       ` Thomas Monjalon
2016-06-14 21:43         ` Pattan, Reshma
2016-06-14  9:38     ` [PATCH v9 3/8] ethdev: add new fields to ethdev info struct Reshma Pattan
2016-06-14 20:10       ` Thomas Monjalon
2016-06-14 21:57         ` Pattan, Reshma
2016-06-14  9:38     ` [PATCH v9 4/8] ethdev: make get port by name and get name by port public Reshma Pattan
2016-06-14 20:23       ` Thomas Monjalon
2016-06-14 21:55         ` Pattan, Reshma
2016-06-14  9:38     ` [PATCH v9 5/8] pdump: add new library for packet capturing support Reshma Pattan
2016-06-14 20:28       ` Thomas Monjalon
2016-06-14 21:59         ` Pattan, Reshma
2016-06-15  9:05         ` Mcnamara, John
2016-06-15  9:32           ` Thomas Monjalon
2016-06-15  9:43             ` Bruce Richardson
2016-06-15 15:44             ` Mcnamara, John
2016-06-14  9:38     ` [PATCH v9 6/8] app/pdump: add pdump tool for packet capturing Reshma Pattan
2016-06-14 19:56       ` Thomas Monjalon
2016-06-14  9:38     ` [PATCH v9 7/8] app/testpmd: add pdump initialization uninitialization Reshma Pattan
2016-06-14  9:38     ` [PATCH v9 8/8] doc: update doc for packet capture framework Reshma Pattan
2016-06-14 20:41       ` Thomas Monjalon
2016-06-15  5:44         ` Pattan, Reshma
2016-06-15  8:24           ` Thomas Monjalon
2016-06-15 14:06     ` [PATCH v10 0/7] add " Reshma Pattan
2016-06-15 14:06       ` [PATCH v10 1/7] ethdev: use locks to protect Rx/Tx callback lists Reshma Pattan
2016-06-15 14:06       ` [PATCH v10 2/7] ethdev: add new api to add Rx callback as head of the list Reshma Pattan
2016-06-15 14:06       ` [PATCH v10 3/7] ethdev: add new fields to ethdev info struct Reshma Pattan
2016-06-16 19:14         ` Thomas Monjalon
2016-06-15 14:06       ` [PATCH v10 4/7] ethdev: make get port by name and get name by port public Reshma Pattan
2016-06-16 20:27         ` Thomas Monjalon
2016-06-15 14:06       ` [PATCH v10 5/7] pdump: add new library for packet capturing support Reshma Pattan
2016-06-15 14:06       ` [PATCH v10 6/7] app/pdump: add pdump tool for packet capturing Reshma Pattan
2016-06-15 14:06       ` [PATCH v10 7/7] app/testpmd: add pdump initialization uninitialization Reshma Pattan
2016-06-16 21:55       ` [PATCH v10 0/7] add packet capture framework Thomas Monjalon
