* [dpdk-dev] [PATCH v1 0/3] add more features for AF_XDP pmd @ 2019-05-15 8:38 Xiaolong Ye 2019-05-15 8:38 ` [dpdk-dev] [PATCH v1 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye ` (3 more replies) 0 siblings, 4 replies; 16+ messages in thread
From: Xiaolong Ye @ 2019-05-15 8:38 UTC (permalink / raw)
Cc: Qi Zhang, Karlsson Magnus, Topel Bjorn, dev, Xiaolong Ye

Hi,

This patch series mainly includes 3 new features for the AF_XDP pmd. They are separate, independent features; the reason I put them in one patchset is that they have code dependencies.

1. zero copy

This patch enables `zero copy` between the af_xdp umem and mbufs by using the external mbuf mechanism.

2. multi-queue

With multi-queue support, one AF_XDP pmd instance can use multiple netdev queues.

3. busy-poll

With busy-poll, all processing occurs on a single core, so performance is better from a per-core perspective. This patch depends on busy-poll support on the kernel side, which is currently in RFC stage [1].

[1] https://www.spinics.net/lists/netdev/msg568337.html

Xiaolong Ye (3):
  net/af_xdp: enable zero copy by extbuf
  net/af_xdp: add multi-queue support
  net/af_xdp: add busy poll support

 doc/guides/nics/af_xdp.rst          |   5 +-
 drivers/net/af_xdp/rte_eth_af_xdp.c | 210 +++++++++++++++++++++-------
 2 files changed, 160 insertions(+), 55 deletions(-)

--
2.17.1

^ permalink raw reply	[flat|nested] 16+ messages in thread
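The zero-copy feature in this series relies on one property: every umem frame lives inside a single contiguous memzone, so a frame's virtual address and the umem-relative offset carried in AF_XDP rx/tx descriptors are interconvertible by adding or subtracting the memzone base. The toy sketch below illustrates that translation in plain C; it is not DPDK code, and the helper names are hypothetical (loosely mirroring libbpf's `xsk_umem__get_data()`).

```c
#include <assert.h>
#include <stdint.h>

/* An AF_XDP descriptor carries a umem-relative offset; the PMD turns it
 * into a virtual address against the memzone base address. */
static inline void *umem_get_data(void *mz_base, uint64_t offset)
{
	return (char *)mz_base + offset;
}

/* Inverse mapping: recover the umem offset from a packet pointer, as the
 * zero-copy free callback must do when recycling a buffer. */
static inline uint64_t umem_get_offset(void *mz_base, void *pkt)
{
	return (uint64_t)((char *)pkt - (char *)mz_base);
}
```

Because the mapping is a pure offset, the same frame can be handed to the application as an external mbuf buffer and later translated back without any per-packet bookkeeping.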
* [dpdk-dev] [PATCH v1 1/3] net/af_xdp: enable zero copy by extbuf 2019-05-15 8:38 [dpdk-dev] [PATCH v1 0/3] add more features for AF_XDP pmd Xiaolong Ye @ 2019-05-15 8:38 ` Xiaolong Ye 2019-05-15 8:38 ` [dpdk-dev] [PATCH v1 2/3] net/af_xdp: add multi-queue support Xiaolong Ye ` (2 subsequent siblings) 3 siblings, 0 replies; 16+ messages in thread From: Xiaolong Ye @ 2019-05-15 8:38 UTC (permalink / raw) To: Xiaolong Ye, Qi Zhang, John McNamara, Marko Kovacevic Cc: Karlsson Magnus, Topel Bjorn, dev Implement zero copy of af_xdp pmd through mbuf's external memory mechanism to achieve high performance. This patch also provides a new parameter "pmd_zero_copy" for user, so they can choose to enable zero copy of af_xdp pmd or not. To be clear, "zero copy" here is different from the "zero copy mode" of AF_XDP, it is about zero copy between af_xdp umem and mbuf used in dpdk application. Suggested-by: Varghese Vipin <vipin.varghese@intel.com> Suggested-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> --- doc/guides/nics/af_xdp.rst | 1 + drivers/net/af_xdp/rte_eth_af_xdp.c | 96 +++++++++++++++++++++++------ 2 files changed, 77 insertions(+), 20 deletions(-) diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst index 175038153..0bd4239fe 100644 --- a/doc/guides/nics/af_xdp.rst +++ b/doc/guides/nics/af_xdp.rst @@ -28,6 +28,7 @@ The following options can be provided to set up an af_xdp port in DPDK. 
* ``iface`` - name of the Kernel interface to attach to (required); * ``queue`` - netdev queue id (optional, default 0); +* ``pmd_zero_copy`` - enable zero copy or not (optional, default 0); Prerequisites ------------- diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 35c72272c..ebef7bf34 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -70,6 +70,7 @@ struct xsk_umem_info { struct xsk_umem *umem; struct rte_ring *buf_ring; const struct rte_memzone *mz; + int pmd_zc; }; struct rx_stats { @@ -109,6 +110,7 @@ struct pmd_internals { int if_index; char if_name[IFNAMSIZ]; uint16_t queue_idx; + int pmd_zc; struct ether_addr eth_addr; struct xsk_umem_info *umem; struct rte_mempool *mb_pool_share; @@ -119,10 +121,12 @@ struct pmd_internals { #define ETH_AF_XDP_IFACE_ARG "iface" #define ETH_AF_XDP_QUEUE_IDX_ARG "queue" +#define ETH_AF_XDP_PMD_ZC_ARG "pmd_zero_copy" static const char * const valid_arguments[] = { ETH_AF_XDP_IFACE_ARG, ETH_AF_XDP_QUEUE_IDX_ARG, + ETH_AF_XDP_PMD_ZC_ARG, NULL }; @@ -166,6 +170,15 @@ reserve_fill_queue(struct xsk_umem_info *umem, uint16_t reserve_size) return 0; } +static void +umem_buf_release_to_fq(void *addr, void *opaque) +{ + struct xsk_umem_info *umem = (struct xsk_umem_info *)opaque; + uint64_t umem_addr = (uint64_t)addr - umem->mz->addr_64; + + rte_ring_enqueue(umem->buf_ring, (void *)umem_addr); +} + static uint16_t eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -175,6 +188,7 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) struct xsk_ring_prod *fq = &umem->fq; uint32_t idx_rx = 0; uint32_t free_thresh = fq->size >> 1; + int pmd_zc = umem->pmd_zc; struct rte_mbuf *mbufs[ETH_AF_XDP_RX_BATCH_SIZE]; unsigned long dropped = 0; unsigned long rx_bytes = 0; @@ -197,19 +211,29 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) uint64_t addr; uint32_t len; void *pkt; + uint16_t buf_len = 
ETH_AF_XDP_FRAME_SIZE; + struct rte_mbuf_ext_shared_info *shinfo; desc = xsk_ring_cons__rx_desc(rx, idx_rx++); addr = desc->addr; len = desc->len; pkt = xsk_umem__get_data(rxq->umem->mz->addr, addr); - rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), pkt, len); + if (pmd_zc) { + shinfo = rte_pktmbuf_ext_shinfo_init_helper(pkt, + &buf_len, umem_buf_release_to_fq, umem); + + rte_pktmbuf_attach_extbuf(mbufs[i], pkt, 0, buf_len, + shinfo); + } else { + rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), + pkt, len); + rte_ring_enqueue(umem->buf_ring, (void *)addr); + } rte_pktmbuf_pkt_len(mbufs[i]) = len; rte_pktmbuf_data_len(mbufs[i]) = len; rx_bytes += len; bufs[i] = mbufs[i]; - - rte_ring_enqueue(umem->buf_ring, (void *)addr); } xsk_ring_cons__release(rx, rcvd); @@ -262,12 +286,21 @@ kick_tx(struct pkt_tx_queue *txq) pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE); } +static inline bool +in_umem_range(struct xsk_umem_info *umem, uint64_t addr) +{ + uint64_t mz_base_addr = umem->mz->addr_64; + + return addr >= mz_base_addr && addr < mz_base_addr + umem->mz->len; +} + static uint16_t eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { struct pkt_tx_queue *txq = queue; struct xsk_umem_info *umem = txq->pair->umem; struct rte_mbuf *mbuf; + int pmd_zc = umem->pmd_zc; void *addrs[ETH_AF_XDP_TX_BATCH_SIZE]; unsigned long tx_bytes = 0; int i; @@ -294,16 +327,26 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i); mbuf = bufs[i]; - - desc->addr = (uint64_t)addrs[i]; desc->len = mbuf->pkt_len; - pkt = xsk_umem__get_data(umem->mz->addr, - desc->addr); - rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), - desc->len); - tx_bytes += mbuf->pkt_len; - rte_pktmbuf_free(mbuf); + /* + * We need to make sure the external mbuf address is within + * current port's umem memzone range + */ + if (pmd_zc && RTE_MBUF_HAS_EXTBUF(mbuf) && + in_umem_range(umem, (uint64_t)mbuf->buf_addr)) { + desc->addr = 
(uint64_t)mbuf->buf_addr - + umem->mz->addr_64; + mbuf->buf_addr = xsk_umem__get_data(umem->mz->addr, + (uint64_t)addrs[i]); + } else { + desc->addr = (uint64_t)addrs[i]; + pkt = xsk_umem__get_data(umem->mz->addr, + desc->addr); + rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), + desc->len); + } + tx_bytes += mbuf->pkt_len; } xsk_ring_prod__submit(&txq->tx, nb_pkts); @@ -313,6 +356,9 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) txq->stats.tx_pkts += nb_pkts; txq->stats.tx_bytes += tx_bytes; + for (i = 0; i < nb_pkts; i++) + rte_pktmbuf_free(bufs[i]); + return nb_pkts; } @@ -640,6 +686,10 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, } internals->umem = rxq->umem; + internals->umem->pmd_zc = internals->pmd_zc; + + if (internals->umem->pmd_zc) + AF_XDP_LOG(INFO, "Zero copy between umem and mbuf enabled.\n"); dev->data->rx_queues[rx_queue_id] = rxq; return 0; @@ -775,9 +825,8 @@ parse_name_arg(const char *key __rte_unused, } static int -parse_parameters(struct rte_kvargs *kvlist, - char *if_name, - int *queue_idx) +parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *queue_idx, + int *pmd_zc) { int ret; @@ -791,6 +840,11 @@ parse_parameters(struct rte_kvargs *kvlist, if (ret < 0) goto free_kvlist; + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PMD_ZC_ARG, + &parse_integer_arg, pmd_zc); + if (ret < 0) + goto free_kvlist; + free_kvlist: rte_kvargs_free(kvlist); return ret; @@ -827,9 +881,8 @@ get_iface_info(const char *if_name, } static struct rte_eth_dev * -init_internals(struct rte_vdev_device *dev, - const char *if_name, - int queue_idx) +init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx, + int pmd_zc) { const char *name = rte_vdev_device_name(dev); const unsigned int numa_node = dev->device.numa_node; @@ -843,6 +896,7 @@ init_internals(struct rte_vdev_device *dev, return NULL; internals->queue_idx = queue_idx; + internals->pmd_zc = pmd_zc; strlcpy(internals->if_name, if_name, IFNAMSIZ); for (i = 0; i 
< ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { @@ -883,6 +937,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) int xsk_queue_idx = ETH_AF_XDP_DFLT_QUEUE_IDX; struct rte_eth_dev *eth_dev = NULL; const char *name; + int pmd_zc = 0; AF_XDP_LOG(INFO, "Initializing pmd_af_xdp for %s\n", rte_vdev_device_name(dev)); @@ -909,7 +964,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) if (dev->device.numa_node == SOCKET_ID_ANY) dev->device.numa_node = rte_socket_id(); - if (parse_parameters(kvlist, if_name, &xsk_queue_idx) < 0) { + if (parse_parameters(kvlist, if_name, &xsk_queue_idx, &pmd_zc) < 0) { AF_XDP_LOG(ERR, "Invalid kvargs value\n"); return -EINVAL; } @@ -919,7 +974,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) return -EINVAL; } - eth_dev = init_internals(dev, if_name, xsk_queue_idx); + eth_dev = init_internals(dev, if_name, xsk_queue_idx, pmd_zc); if (eth_dev == NULL) { AF_XDP_LOG(ERR, "Failed to init internals\n"); return -1; @@ -961,7 +1016,8 @@ static struct rte_vdev_driver pmd_af_xdp_drv = { RTE_PMD_REGISTER_VDEV(net_af_xdp, pmd_af_xdp_drv); RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, "iface=<string> " - "queue=<int> "); + "queue=<int> " + "pmd_zero_copy=<0|1>"); RTE_INIT(af_xdp_init_log) { -- 2.17.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
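The core of the zero-copy patch above is the `umem_buf_release_to_fq()` callback: when the last reference to a zero-copy mbuf is dropped, the callback translates the buffer's virtual address back into a umem offset and enqueues it on `buf_ring` so it can be reposted to the kernel fill queue. The following standalone sketch mimics that flow with a trivial array standing in for `rte_ring`; all names here are illustrative, not the driver's actual types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the PMD's buf_ring: freed umem offsets are stashed
 * here until they are reposted to the kernel fill queue. */
#define RING_CAP 4096
struct buf_ring {
	uint64_t slots[RING_CAP];
	size_t count;
};

struct umem_info {
	char *mz_base;          /* memzone holding every umem frame */
	struct buf_ring ring;   /* recycled frame offsets */
};

/* Mirrors umem_buf_release_to_fq() from the patch: called when a
 * zero-copy mbuf's external buffer is released; converts the virtual
 * address back to a umem-relative offset and recycles it. */
static void buf_release_to_fq(void *addr, void *opaque)
{
	struct umem_info *umem = opaque;
	uint64_t offset = (uint64_t)((char *)addr - umem->mz_base);

	if (umem->ring.count < RING_CAP)
		umem->ring.slots[umem->ring.count++] = offset;
}
```

In the real driver this callback is registered per packet via `rte_pktmbuf_ext_shinfo_init_helper()` and fires from `rte_pktmbuf_free()`, which is why the rx path no longer enqueues the address itself in the zero-copy branch.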
* [dpdk-dev] [PATCH v1 2/3] net/af_xdp: add multi-queue support 2019-05-15 8:38 [dpdk-dev] [PATCH v1 0/3] add more features for AF_XDP pmd Xiaolong Ye 2019-05-15 8:38 ` [dpdk-dev] [PATCH v1 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye @ 2019-05-15 8:38 ` Xiaolong Ye 2019-05-15 8:38 ` [dpdk-dev] [PATCH v1 3/3] net/af_xdp: add busy poll support Xiaolong Ye 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Xiaolong Ye 3 siblings, 0 replies; 16+ messages in thread From: Xiaolong Ye @ 2019-05-15 8:38 UTC (permalink / raw) To: Xiaolong Ye, Qi Zhang, John McNamara, Marko Kovacevic Cc: Karlsson Magnus, Topel Bjorn, dev This patch adds two parameters `start_queue` and `queue_count` to specify the range of netdev queues used by AF_XDP pmd. Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> --- doc/guides/nics/af_xdp.rst | 3 +- drivers/net/af_xdp/rte_eth_af_xdp.c | 88 +++++++++++++++-------------- 2 files changed, 49 insertions(+), 42 deletions(-) diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst index 0bd4239fe..18defcda3 100644 --- a/doc/guides/nics/af_xdp.rst +++ b/doc/guides/nics/af_xdp.rst @@ -27,7 +27,8 @@ Options The following options can be provided to set up an af_xdp port in DPDK. 
* ``iface`` - name of the Kernel interface to attach to (required); -* ``queue`` - netdev queue id (optional, default 0); +* ``start_queue`` - starting netdev queue id (optional, default 0); +* ``queue_count`` - total netdev queue number (optional, default 1); * ``pmd_zero_copy`` - enable zero copy or not (optional, default 0); Prerequisites diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index ebef7bf34..9a4510701 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -57,7 +57,8 @@ static int af_xdp_logtype; #define ETH_AF_XDP_NUM_BUFFERS 4096 #define ETH_AF_XDP_DATA_HEADROOM 0 #define ETH_AF_XDP_DFLT_NUM_DESCS XSK_RING_CONS__DEFAULT_NUM_DESCS -#define ETH_AF_XDP_DFLT_QUEUE_IDX 0 +#define ETH_AF_XDP_DFLT_START_QUEUE_IDX 0 +#define ETH_AF_XDP_DFLT_QUEUE_COUNT 1 #define ETH_AF_XDP_RX_BATCH_SIZE 32 #define ETH_AF_XDP_TX_BATCH_SIZE 32 @@ -88,7 +89,7 @@ struct pkt_rx_queue { struct rx_stats stats; struct pkt_tx_queue *pair; - uint16_t queue_idx; + int xsk_queue_idx; }; struct tx_stats { @@ -103,13 +104,15 @@ struct pkt_tx_queue { struct tx_stats stats; struct pkt_rx_queue *pair; - uint16_t queue_idx; + int xsk_queue_idx; }; struct pmd_internals { int if_index; char if_name[IFNAMSIZ]; - uint16_t queue_idx; + int start_queue_idx; + int queue_cnt; + int pmd_zc; struct ether_addr eth_addr; struct xsk_umem_info *umem; @@ -120,12 +123,14 @@ struct pmd_internals { }; #define ETH_AF_XDP_IFACE_ARG "iface" -#define ETH_AF_XDP_QUEUE_IDX_ARG "queue" +#define ETH_AF_XDP_START_QUEUE_ARG "start_queue" +#define ETH_AF_XDP_QUEUE_COUNT_ARG "queue_count" #define ETH_AF_XDP_PMD_ZC_ARG "pmd_zero_copy" static const char * const valid_arguments[] = { ETH_AF_XDP_IFACE_ARG, - ETH_AF_XDP_QUEUE_IDX_ARG, + ETH_AF_XDP_START_QUEUE_ARG, + ETH_AF_XDP_QUEUE_COUNT_ARG, ETH_AF_XDP_PMD_ZC_ARG, NULL }; @@ -395,8 +400,8 @@ eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) dev_info->if_index = 
internals->if_index; dev_info->max_mac_addrs = 1; dev_info->max_rx_pktlen = ETH_FRAME_LEN; - dev_info->max_rx_queues = 1; - dev_info->max_tx_queues = 1; + dev_info->max_rx_queues = internals->queue_cnt; + dev_info->max_tx_queues = internals->queue_cnt; dev_info->min_mtu = ETHER_MIN_MTU; dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM; @@ -528,7 +533,8 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused, } static struct -xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals) +xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals, + struct pkt_rx_queue *rxq) { struct xsk_umem_info *umem; const struct rte_memzone *mz; @@ -549,7 +555,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals) } snprintf(ring_name, sizeof(ring_name), "af_xdp_ring_%s_%u", - internals->if_name, internals->queue_idx); + internals->if_name, rxq->xsk_queue_idx); umem->buf_ring = rte_ring_create(ring_name, ETH_AF_XDP_NUM_BUFFERS, rte_socket_id(), @@ -565,7 +571,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals) ETH_AF_XDP_DATA_HEADROOM)); snprintf(mz_name, sizeof(mz_name), "af_xdp_umem_%s_%u", - internals->if_name, internals->queue_idx); + internals->if_name, rxq->xsk_queue_idx); mz = rte_memzone_reserve_aligned(mz_name, ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE, rte_socket_id(), RTE_MEMZONE_IOVA_CONTIG, @@ -602,7 +608,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, int ret = 0; int reserve_size; - rxq->umem = xdp_umem_configure(internals); + rxq->umem = xdp_umem_configure(internals, rxq); if (rxq->umem == NULL) return -ENOMEM; @@ -612,7 +618,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; cfg.bind_flags = 0; ret = xsk_socket__create(&rxq->xsk, internals->if_name, - internals->queue_idx, rxq->umem->umem, &rxq->rx, + rxq->xsk_queue_idx, rxq->umem->umem, &rxq->rx, &txq->tx, &cfg); if (ret) { 
AF_XDP_LOG(ERR, "Failed to create xsk socket.\n"); @@ -635,20 +641,6 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, return ret; } -static void -queue_reset(struct pmd_internals *internals, uint16_t queue_idx) -{ - struct pkt_rx_queue *rxq = &internals->rx_queues[queue_idx]; - struct pkt_tx_queue *txq = rxq->pair; - - memset(rxq, 0, sizeof(*rxq)); - memset(txq, 0, sizeof(*txq)); - rxq->pair = txq; - txq->pair = rxq; - rxq->queue_idx = queue_idx; - txq->queue_idx = queue_idx; -} - static int eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id, @@ -663,8 +655,9 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, int ret; rxq = &internals->rx_queues[rx_queue_id]; - queue_reset(internals, rx_queue_id); + AF_XDP_LOG(INFO, "Set up rx queue, rx queue id: %d, xsk queue id: %d\n", + rx_queue_id, rxq->xsk_queue_idx); /* Now get the space available for data in the mbuf */ buf_size = rte_pktmbuf_data_room_size(mb_pool) - RTE_PKTMBUF_HEADROOM; @@ -695,7 +688,6 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, return 0; err: - queue_reset(internals, rx_queue_id); return ret; } @@ -825,8 +817,8 @@ parse_name_arg(const char *key __rte_unused, } static int -parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *queue_idx, - int *pmd_zc) +parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, + int *queue_cnt, int *pmd_zc) { int ret; @@ -835,11 +827,18 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *queue_idx, if (ret < 0) goto free_kvlist; - ret = rte_kvargs_process(kvlist, ETH_AF_XDP_QUEUE_IDX_ARG, - &parse_integer_arg, queue_idx); + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_START_QUEUE_ARG, + &parse_integer_arg, start_queue); if (ret < 0) goto free_kvlist; + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_QUEUE_COUNT_ARG, + &parse_integer_arg, queue_cnt); + if (ret < 0 || *queue_cnt > ETH_AF_XDP_MAX_QUEUE_PAIRS) { + ret = -EINVAL; + goto free_kvlist; + } + ret = rte_kvargs_process(kvlist, 
ETH_AF_XDP_PMD_ZC_ARG, &parse_integer_arg, pmd_zc); if (ret < 0) @@ -881,8 +880,8 @@ get_iface_info(const char *if_name, } static struct rte_eth_dev * -init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx, - int pmd_zc) +init_internals(struct rte_vdev_device *dev, const char *if_name, + int start_queue_idx, int queue_cnt, int pmd_zc) { const char *name = rte_vdev_device_name(dev); const unsigned int numa_node = dev->device.numa_node; @@ -895,13 +894,16 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx, if (internals == NULL) return NULL; - internals->queue_idx = queue_idx; + internals->start_queue_idx = start_queue_idx; + internals->queue_cnt = queue_cnt; internals->pmd_zc = pmd_zc; strlcpy(internals->if_name, if_name, IFNAMSIZ); - for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { + for (i = 0; i < queue_cnt; i++) { internals->tx_queues[i].pair = &internals->rx_queues[i]; internals->rx_queues[i].pair = &internals->tx_queues[i]; + internals->rx_queues[i].xsk_queue_idx = start_queue_idx + i; + internals->tx_queues[i].xsk_queue_idx = start_queue_idx + i; } ret = get_iface_info(if_name, &internals->eth_addr, @@ -934,7 +936,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) { struct rte_kvargs *kvlist; char if_name[IFNAMSIZ] = {'\0'}; - int xsk_queue_idx = ETH_AF_XDP_DFLT_QUEUE_IDX; + int xsk_start_queue_idx = ETH_AF_XDP_DFLT_START_QUEUE_IDX; + int xsk_queue_cnt = ETH_AF_XDP_DFLT_QUEUE_COUNT; struct rte_eth_dev *eth_dev = NULL; const char *name; int pmd_zc = 0; @@ -964,7 +967,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) if (dev->device.numa_node == SOCKET_ID_ANY) dev->device.numa_node = rte_socket_id(); - if (parse_parameters(kvlist, if_name, &xsk_queue_idx, &pmd_zc) < 0) { + if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx, + &xsk_queue_cnt, &pmd_zc) < 0) { AF_XDP_LOG(ERR, "Invalid kvargs value\n"); return -EINVAL; } @@ -974,7 +978,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) 
return -EINVAL; } - eth_dev = init_internals(dev, if_name, xsk_queue_idx, pmd_zc); + eth_dev = init_internals(dev, if_name, xsk_start_queue_idx, + xsk_queue_cnt, pmd_zc); if (eth_dev == NULL) { AF_XDP_LOG(ERR, "Failed to init internals\n"); return -1; @@ -1016,7 +1021,8 @@ static struct rte_vdev_driver pmd_af_xdp_drv = { RTE_PMD_REGISTER_VDEV(net_af_xdp, pmd_af_xdp_drv); RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, "iface=<string> " - "queue=<int> " + "start_queue=<int> " + "queue_count=<int> " "pmd_zero_copy=<0|1>"); RTE_INIT(af_xdp_init_log) -- 2.17.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
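The multi-queue patch maps ethdev queue `i` onto netdev (xsk) queue `start_queue_idx + i` and cross-links each rx/tx pair, as done in the `init_internals()` loop above. A self-contained sketch of that pairing logic (simplified structs, not the driver's real definitions):

```c
#include <assert.h>

#define MAX_QUEUE_PAIRS 16

struct txq;
struct rxq { int xsk_queue_idx; struct txq *pair; };
struct txq { int xsk_queue_idx; struct rxq *pair; };

struct internals {
	int start_queue_idx;
	int queue_cnt;
	struct rxq rx_queues[MAX_QUEUE_PAIRS];
	struct txq tx_queues[MAX_QUEUE_PAIRS];
};

/* Mirrors the loop in init_internals(): ethdev queue i is bound to
 * netdev queue start + i, and rx/tx queues are cross-linked so either
 * side can reach its partner (and the shared umem) via ->pair. */
static void setup_queues(struct internals *in, int start, int cnt)
{
	int i;

	in->start_queue_idx = start;
	in->queue_cnt = cnt;
	for (i = 0; i < cnt; i++) {
		in->rx_queues[i].pair = &in->tx_queues[i];
		in->tx_queues[i].pair = &in->rx_queues[i];
		in->rx_queues[i].xsk_queue_idx = start + i;
		in->tx_queues[i].xsk_queue_idx = start + i;
	}
}
```

With this scheme a port created with `start_queue=4,queue_count=3` would drive netdev queues 4, 5, and 6 while exposing them to applications as ethdev queues 0..2.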
* [dpdk-dev] [PATCH v1 3/3] net/af_xdp: add busy poll support 2019-05-15 8:38 [dpdk-dev] [PATCH v1 0/3] add more features for AF_XDP pmd Xiaolong Ye 2019-05-15 8:38 ` [dpdk-dev] [PATCH v1 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye 2019-05-15 8:38 ` [dpdk-dev] [PATCH v1 2/3] net/af_xdp: add multi-queue support Xiaolong Ye @ 2019-05-15 8:38 ` Xiaolong Ye 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Xiaolong Ye 3 siblings, 0 replies; 16+ messages in thread From: Xiaolong Ye @ 2019-05-15 8:38 UTC (permalink / raw) To: Xiaolong Ye, Qi Zhang, John McNamara, Marko Kovacevic Cc: Karlsson Magnus, Topel Bjorn, dev This patch enables busy-poll support for the AF_XDP pmd. With busy-poll, the kernel driver is executed in process context by calling the poll() syscall. The main advantage of the busy-poll feature is that all processing occurs on a single core. This eliminates the core-to-core cache transfers that otherwise occur between the application and the softirqd processing on another core. The drawback of busy-poll is that it reduces the maximum throughput due to the syscall overhead, but from a per-core perspective the performance is better, as normal mode runs on two cores while busy-poll runs on only a single core. Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> --- doc/guides/nics/af_xdp.rst | 1 + drivers/net/af_xdp/rte_eth_af_xdp.c | 48 ++++++++++++++++++++++++++--- 2 files changed, 45 insertions(+), 4 deletions(-) diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst index 18defcda3..e42065170 100644 --- a/doc/guides/nics/af_xdp.rst +++ b/doc/guides/nics/af_xdp.rst @@ -29,6 +29,7 @@ The following options can be provided to set up an af_xdp port in DPDK. 
* ``iface`` - name of the Kernel interface to attach to (required); * ``start_queue`` - starting netdev queue id (optional, default 0); * ``queue_count`` - total netdev queue number (optional, default 1); +* ``busy_poll_size`` - busy poll batch size (optional, default 0); * ``pmd_zero_copy`` - enable zero copy or not (optional, default 0); Prerequisites diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 9a4510701..1e46a4ef4 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -6,6 +6,7 @@ #include <stdlib.h> #include <string.h> #include <netinet/in.h> +#include <poll.h> #include <net/if.h> #include <sys/socket.h> #include <sys/ioctl.h> @@ -72,6 +73,7 @@ struct xsk_umem_info { struct rte_ring *buf_ring; const struct rte_memzone *mz; int pmd_zc; + int busy_poll; }; struct rx_stats { @@ -114,6 +116,7 @@ struct pmd_internals { int queue_cnt; int pmd_zc; + int busy_poll_size; struct ether_addr eth_addr; struct xsk_umem_info *umem; struct rte_mempool *mb_pool_share; @@ -126,12 +129,14 @@ struct pmd_internals { #define ETH_AF_XDP_START_QUEUE_ARG "start_queue" #define ETH_AF_XDP_QUEUE_COUNT_ARG "queue_count" #define ETH_AF_XDP_PMD_ZC_ARG "pmd_zero_copy" +#define ETH_AF_XDP_BUSY_POLL_SIZE_ARG "busy_poll_size" static const char * const valid_arguments[] = { ETH_AF_XDP_IFACE_ARG, ETH_AF_XDP_START_QUEUE_ARG, ETH_AF_XDP_QUEUE_COUNT_ARG, ETH_AF_XDP_PMD_ZC_ARG, + ETH_AF_XDP_BUSY_POLL_SIZE_ARG, NULL }; @@ -191,6 +196,7 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) struct xsk_ring_cons *rx = &rxq->rx; struct xsk_umem_info *umem = rxq->umem; struct xsk_ring_prod *fq = &umem->fq; + struct pollfd pfds[1]; uint32_t idx_rx = 0; uint32_t free_thresh = fq->size >> 1; int pmd_zc = umem->pmd_zc; @@ -199,6 +205,15 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) unsigned long rx_bytes = 0; int rcvd, i; + if (umem->busy_poll) { + memset(pfds, 0, sizeof(pfds)); + 
pfds[0].fd = xsk_socket__fd(rxq->xsk); + pfds[0].events = POLLIN; + + if (poll(pfds, 1, 0) <= 0) + return 0; + } + nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_RX_BATCH_SIZE); if (unlikely(rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, nb_pkts) != 0)) @@ -305,12 +320,23 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) struct pkt_tx_queue *txq = queue; struct xsk_umem_info *umem = txq->pair->umem; struct rte_mbuf *mbuf; + struct pollfd pfds[1]; int pmd_zc = umem->pmd_zc; void *addrs[ETH_AF_XDP_TX_BATCH_SIZE]; unsigned long tx_bytes = 0; int i; uint32_t idx_tx; + if (umem->busy_poll) { + memset(pfds, 0, sizeof(pfds)); + pfds[0].fd = xsk_socket__fd(txq->pair->xsk); + pfds[0].events = POLLOUT; + if (poll(pfds, 1, 0) <= 0) + return 0; + if (!(pfds[0].revents & POLLOUT)) + return 0; + } + nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE); pull_umem_cq(umem, nb_pkts); @@ -615,6 +641,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, cfg.rx_size = ring_size; cfg.tx_size = ring_size; cfg.libbpf_flags = 0; + cfg.busy_poll = internals->busy_poll_size; cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; cfg.bind_flags = 0; ret = xsk_socket__create(&rxq->xsk, internals->if_name, @@ -680,10 +707,14 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, internals->umem = rxq->umem; internals->umem->pmd_zc = internals->pmd_zc; + internals->umem->busy_poll = internals->busy_poll_size ? 
1 : 0; if (internals->umem->pmd_zc) AF_XDP_LOG(INFO, "Zero copy between umem and mbuf enabled.\n"); + if (internals->umem->busy_poll) + AF_XDP_LOG(INFO, "Busy poll enabled.\n"); + dev->data->rx_queues[rx_queue_id] = rxq; return 0; @@ -818,7 +849,7 @@ parse_name_arg(const char *key __rte_unused, static int parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, - int *queue_cnt, int *pmd_zc) + int *queue_cnt, int *pmd_zc, int *busy_poll_size) { int ret; @@ -844,6 +875,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue, if (ret < 0) goto free_kvlist; + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_BUSY_POLL_SIZE_ARG, + &parse_integer_arg, busy_poll_size); + if (ret < 0) + goto free_kvlist; + free_kvlist: rte_kvargs_free(kvlist); return ret; @@ -881,7 +917,8 @@ get_iface_info(const char *if_name, static struct rte_eth_dev * init_internals(struct rte_vdev_device *dev, const char *if_name, - int start_queue_idx, int queue_cnt, int pmd_zc) + int start_queue_idx, int queue_cnt, int pmd_zc, + int busy_poll_size) { const char *name = rte_vdev_device_name(dev); const unsigned int numa_node = dev->device.numa_node; @@ -897,6 +934,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, internals->start_queue_idx = start_queue_idx; internals->queue_cnt = queue_cnt; internals->pmd_zc = pmd_zc; + internals->busy_poll_size = busy_poll_size; strlcpy(internals->if_name, if_name, IFNAMSIZ); for (i = 0; i < queue_cnt; i++) { @@ -941,6 +979,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) struct rte_eth_dev *eth_dev = NULL; const char *name; int pmd_zc = 0; + int busy_poll_size = 0; AF_XDP_LOG(INFO, "Initializing pmd_af_xdp for %s\n", rte_vdev_device_name(dev)); @@ -968,7 +1007,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) dev->device.numa_node = rte_socket_id(); if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx, - &xsk_queue_cnt, &pmd_zc) < 0) { + &xsk_queue_cnt, &pmd_zc, &busy_poll_size) < 0) { 
AF_XDP_LOG(ERR, "Invalid kvargs value\n"); return -EINVAL; } @@ -979,7 +1018,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) } eth_dev = init_internals(dev, if_name, xsk_start_queue_idx, - xsk_queue_cnt, pmd_zc); + xsk_queue_cnt, pmd_zc, busy_poll_size); if (eth_dev == NULL) { AF_XDP_LOG(ERR, "Failed to init internals\n"); return -1; @@ -1023,6 +1062,7 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, "iface=<string> " "start_queue=<int> " "queue_count=<int> " + "busy_poll_size=<int> " "pmd_zero_copy=<0|1>"); RTE_INIT(af_xdp_init_log) -- 2.17.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
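The busy-poll patch gates both `eth_af_xdp_rx()` and `eth_af_xdp_tx()` with a zero-timeout `poll()` on the xsk socket fd: if the fd is not ready, the burst function returns immediately instead of touching the rings, and the syscall itself is what drives the kernel-side processing in process context. A minimal sketch of that gate, using a plain fd so it is testable without an AF_XDP socket:

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Mirrors the busy-poll check added to the rx path: a non-blocking
 * poll() on the socket fd; nothing ready means nothing to process. */
static int rx_ready(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	if (poll(&pfd, 1, 0) <= 0)
		return 0;
	return (pfd.revents & POLLIN) != 0;
}
```

Note that the `cfg.busy_poll` socket option set in `xsk_configure()` depends on the RFC-stage kernel/libbpf support referenced in the cover letter and is not part of the upstream `xsk_socket_config` at the time of this series.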
* [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd 2019-05-15 8:38 [dpdk-dev] [PATCH v1 0/3] add more features for AF_XDP pmd Xiaolong Ye ` (2 preceding siblings ...) 2019-05-15 8:38 ` [dpdk-dev] [PATCH v1 3/3] net/af_xdp: add busy poll support Xiaolong Ye @ 2019-05-30 9:07 ` Xiaolong Ye 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye ` (3 more replies) 3 siblings, 4 replies; 16+ messages in thread
From: Xiaolong Ye @ 2019-05-30 9:07 UTC (permalink / raw)
Cc: Qi Zhang, Karlsson Magnus, Topel Bjorn, dev, Xiaolong Ye

This patch series mainly includes 3 new features for the AF_XDP pmd. They are separate, independent features; the reason I put them in one patchset is that they have code dependencies.

1. zero copy

This patch enables `zero copy` between the af_xdp umem and mbufs by using the external mbuf mechanism.

2. multi-queue

With multi-queue support, one AF_XDP pmd instance can use multiple netdev queues.

3. busy-poll

With busy-poll, all processing occurs on a single core, so performance is better from a per-core perspective. This patch depends on busy-poll support on the kernel side, which is currently in RFC stage [1].

[1] https://www.spinics.net/lists/netdev/msg568337.html

V2 changes:

1. improve multi-queue support by getting the ethtool channel info, so the driver is able to derive a reasonable maximum queue number.
2. add a tiny cleanup patch to get rid of an unused struct member.
3. remove the busy-poll patch since its kernel counterpart has changed; will update the patch later.

Xiaolong Ye (3):
  net/af_xdp: enable zero copy by extbuf
  net/af_xdp: add multi-queue support
  net/af_xdp: remove unused struct member

 doc/guides/nics/af_xdp.rst          |   4 +-
 drivers/net/af_xdp/rte_eth_af_xdp.c | 285 ++++++++++++++++++++--------
 2 files changed, 213 insertions(+), 76 deletions(-)

--
2.17.1

^ permalink raw reply	[flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Xiaolong Ye @ 2019-05-30 9:07 ` Xiaolong Ye 2019-05-30 15:31 ` Stephen Hemminger 2019-06-11 16:16 ` William Tu 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: add multi-queue support Xiaolong Ye ` (2 subsequent siblings) 3 siblings, 2 replies; 16+ messages in thread From: Xiaolong Ye @ 2019-05-30 9:07 UTC (permalink / raw) To: Xiaolong Ye, Qi Zhang, John McNamara, Marko Kovacevic Cc: Karlsson Magnus, Topel Bjorn, dev Implement zero copy of af_xdp pmd through mbuf's external memory mechanism to achieve high performance. This patch also provides a new parameter "pmd_zero_copy" for user, so they can choose to enable zero copy of af_xdp pmd or not. To be clear, "zero copy" here is different from the "zero copy mode" of AF_XDP, it is about zero copy between af_xdp umem and mbuf used in dpdk application. Suggested-by: Varghese Vipin <vipin.varghese@intel.com> Suggested-by: Tummala Sivaprasad <sivaprasad.tummala@intel.com> Suggested-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> --- doc/guides/nics/af_xdp.rst | 1 + drivers/net/af_xdp/rte_eth_af_xdp.c | 104 +++++++++++++++++++++------- 2 files changed, 79 insertions(+), 26 deletions(-) diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst index 175038153..0bd4239fe 100644 --- a/doc/guides/nics/af_xdp.rst +++ b/doc/guides/nics/af_xdp.rst @@ -28,6 +28,7 @@ The following options can be provided to set up an af_xdp port in DPDK. 
* ``iface`` - name of the Kernel interface to attach to (required); * ``queue`` - netdev queue id (optional, default 0); +* ``pmd_zero_copy`` - enable zero copy or not (optional, default 0); Prerequisites ------------- diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 35c72272c..014cd5691 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -70,6 +70,7 @@ struct xsk_umem_info { struct xsk_umem *umem; struct rte_ring *buf_ring; const struct rte_memzone *mz; + int pmd_zc; }; struct rx_stats { @@ -109,8 +110,8 @@ struct pmd_internals { int if_index; char if_name[IFNAMSIZ]; uint16_t queue_idx; + int pmd_zc; struct ether_addr eth_addr; - struct xsk_umem_info *umem; struct rte_mempool *mb_pool_share; struct pkt_rx_queue rx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS]; @@ -119,10 +120,12 @@ struct pmd_internals { #define ETH_AF_XDP_IFACE_ARG "iface" #define ETH_AF_XDP_QUEUE_IDX_ARG "queue" +#define ETH_AF_XDP_PMD_ZC_ARG "pmd_zero_copy" static const char * const valid_arguments[] = { ETH_AF_XDP_IFACE_ARG, ETH_AF_XDP_QUEUE_IDX_ARG, + ETH_AF_XDP_PMD_ZC_ARG, NULL }; @@ -166,6 +169,15 @@ reserve_fill_queue(struct xsk_umem_info *umem, uint16_t reserve_size) return 0; } +static void +umem_buf_release_to_fq(void *addr, void *opaque) +{ + struct xsk_umem_info *umem = (struct xsk_umem_info *)opaque; + uint64_t umem_addr = (uint64_t)addr - umem->mz->addr_64; + + rte_ring_enqueue(umem->buf_ring, (void *)umem_addr); +} + static uint16_t eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -175,6 +187,7 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) struct xsk_ring_prod *fq = &umem->fq; uint32_t idx_rx = 0; uint32_t free_thresh = fq->size >> 1; + int pmd_zc = umem->pmd_zc; struct rte_mbuf *mbufs[ETH_AF_XDP_RX_BATCH_SIZE]; unsigned long dropped = 0; unsigned long rx_bytes = 0; @@ -197,19 +210,29 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) 
uint64_t addr; uint32_t len; void *pkt; + uint16_t buf_len = ETH_AF_XDP_FRAME_SIZE; + struct rte_mbuf_ext_shared_info *shinfo; desc = xsk_ring_cons__rx_desc(rx, idx_rx++); addr = desc->addr; len = desc->len; pkt = xsk_umem__get_data(rxq->umem->mz->addr, addr); - rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), pkt, len); + if (pmd_zc) { + shinfo = rte_pktmbuf_ext_shinfo_init_helper(pkt, + &buf_len, umem_buf_release_to_fq, umem); + + rte_pktmbuf_attach_extbuf(mbufs[i], pkt, 0, buf_len, + shinfo); + } else { + rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), + pkt, len); + rte_ring_enqueue(umem->buf_ring, (void *)addr); + } rte_pktmbuf_pkt_len(mbufs[i]) = len; rte_pktmbuf_data_len(mbufs[i]) = len; rx_bytes += len; bufs[i] = mbufs[i]; - - rte_ring_enqueue(umem->buf_ring, (void *)addr); } xsk_ring_cons__release(rx, rcvd); @@ -262,12 +285,21 @@ kick_tx(struct pkt_tx_queue *txq) pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE); } +static inline bool +in_umem_range(struct xsk_umem_info *umem, uint64_t addr) +{ + uint64_t mz_base_addr = umem->mz->addr_64; + + return addr >= mz_base_addr && addr < mz_base_addr + umem->mz->len; +} + static uint16_t eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { struct pkt_tx_queue *txq = queue; struct xsk_umem_info *umem = txq->pair->umem; struct rte_mbuf *mbuf; + int pmd_zc = umem->pmd_zc; void *addrs[ETH_AF_XDP_TX_BATCH_SIZE]; unsigned long tx_bytes = 0; int i; @@ -294,16 +326,26 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i); mbuf = bufs[i]; - - desc->addr = (uint64_t)addrs[i]; desc->len = mbuf->pkt_len; - pkt = xsk_umem__get_data(umem->mz->addr, - desc->addr); - rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), - desc->len); - tx_bytes += mbuf->pkt_len; - rte_pktmbuf_free(mbuf); + /* + * We need to make sure the external mbuf address is within + * current port's umem memzone range + */ + if (pmd_zc && RTE_MBUF_HAS_EXTBUF(mbuf) && + 
in_umem_range(umem, (uint64_t)mbuf->buf_addr)) { + desc->addr = (uint64_t)mbuf->buf_addr - + umem->mz->addr_64; + mbuf->buf_addr = xsk_umem__get_data(umem->mz->addr, + (uint64_t)addrs[i]); + } else { + desc->addr = (uint64_t)addrs[i]; + pkt = xsk_umem__get_data(umem->mz->addr, + desc->addr); + rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), + desc->len); + } + tx_bytes += mbuf->pkt_len; } xsk_ring_prod__submit(&txq->tx, nb_pkts); @@ -313,6 +355,9 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) txq->stats.tx_pkts += nb_pkts; txq->stats.tx_bytes += tx_bytes; + for (i = 0; i < nb_pkts; i++) + rte_pktmbuf_free(bufs[i]); + return nb_pkts; } @@ -454,18 +499,16 @@ eth_dev_close(struct rte_eth_dev *dev) if (rxq->umem == NULL) break; xsk_socket__delete(rxq->xsk); + (void)xsk_umem__delete(rxq->umem->umem); + xdp_umem_destroy(rxq->umem); } - (void)xsk_umem__delete(internals->umem->umem); - /* * MAC is not allocated dynamically, setting it to NULL would prevent * from releasing it in rte_eth_dev_release_port. 
*/ dev->data->mac_addrs = NULL; - xdp_umem_destroy(internals->umem); - remove_xdp_program(internals); } @@ -639,7 +682,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, goto err; } - internals->umem = rxq->umem; + rxq->umem->pmd_zc = internals->pmd_zc; dev->data->rx_queues[rx_queue_id] = rxq; return 0; @@ -775,9 +818,8 @@ parse_name_arg(const char *key __rte_unused, } static int -parse_parameters(struct rte_kvargs *kvlist, - char *if_name, - int *queue_idx) +parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *queue_idx, + int *pmd_zc) { int ret; @@ -791,6 +833,11 @@ parse_parameters(struct rte_kvargs *kvlist, if (ret < 0) goto free_kvlist; + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PMD_ZC_ARG, + &parse_integer_arg, pmd_zc); + if (ret < 0) + goto free_kvlist; + free_kvlist: rte_kvargs_free(kvlist); return ret; @@ -827,9 +874,8 @@ get_iface_info(const char *if_name, } static struct rte_eth_dev * -init_internals(struct rte_vdev_device *dev, - const char *if_name, - int queue_idx) +init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx, + int pmd_zc) { const char *name = rte_vdev_device_name(dev); const unsigned int numa_node = dev->device.numa_node; @@ -843,6 +889,7 @@ init_internals(struct rte_vdev_device *dev, return NULL; internals->queue_idx = queue_idx; + internals->pmd_zc = pmd_zc; strlcpy(internals->if_name, if_name, IFNAMSIZ); for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { @@ -868,6 +915,9 @@ init_internals(struct rte_vdev_device *dev, /* Let rte_eth_dev_close() release the port resources. 
*/ eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE; + if (internals->pmd_zc) + AF_XDP_LOG(INFO, "Zero copy between umem and mbuf enabled.\n"); + return eth_dev; err: @@ -883,6 +933,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) int xsk_queue_idx = ETH_AF_XDP_DFLT_QUEUE_IDX; struct rte_eth_dev *eth_dev = NULL; const char *name; + int pmd_zc = 0; AF_XDP_LOG(INFO, "Initializing pmd_af_xdp for %s\n", rte_vdev_device_name(dev)); @@ -909,7 +960,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) if (dev->device.numa_node == SOCKET_ID_ANY) dev->device.numa_node = rte_socket_id(); - if (parse_parameters(kvlist, if_name, &xsk_queue_idx) < 0) { + if (parse_parameters(kvlist, if_name, &xsk_queue_idx, &pmd_zc) < 0) { AF_XDP_LOG(ERR, "Invalid kvargs value\n"); return -EINVAL; } @@ -919,7 +970,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) return -EINVAL; } - eth_dev = init_internals(dev, if_name, xsk_queue_idx); + eth_dev = init_internals(dev, if_name, xsk_queue_idx, pmd_zc); if (eth_dev == NULL) { AF_XDP_LOG(ERR, "Failed to init internals\n"); return -1; @@ -961,7 +1012,8 @@ static struct rte_vdev_driver pmd_af_xdp_drv = { RTE_PMD_REGISTER_VDEV(net_af_xdp, pmd_af_xdp_drv); RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, "iface=<string> " - "queue=<int> "); + "queue=<int> " + "pmd_zero_copy=<0|1>"); RTE_INIT(af_xdp_init_log) { -- 2.17.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye @ 2019-05-30 15:31 ` Stephen Hemminger 2019-05-31 1:49 ` Ye Xiaolong 2019-06-11 16:16 ` William Tu 1 sibling, 1 reply; 16+ messages in thread From: Stephen Hemminger @ 2019-05-30 15:31 UTC (permalink / raw) To: Xiaolong Ye Cc: Qi Zhang, John McNamara, Marko Kovacevic, Karlsson Magnus, Topel Bjorn, dev On Thu, 30 May 2019 17:07:05 +0800 Xiaolong Ye <xiaolong.ye@intel.com> wrote: > Implement zero copy of af_xdp pmd through mbuf's external memory > mechanism to achieve high performance. > > This patch also provides a new parameter "pmd_zero_copy" for user, so they > can choose to enable zero copy of af_xdp pmd or not. > > To be clear, "zero copy" here is different from the "zero copy mode" of > AF_XDP, it is about zero copy between af_xdp umem and mbuf used in dpdk > application. > > Suggested-by: Varghese Vipin <vipin.varghese@intel.com> > Suggested-by: Tummala Sivaprasad <sivaprasad.tummala@intel.com> > Suggested-by: Olivier Matz <olivier.matz@6wind.com> > Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Why is this a parameter? Can it just be auto detected. Remember configuration is evil, it hurts usability, code coverage and increases complexity. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf 2019-05-30 15:31 ` Stephen Hemminger @ 2019-05-31 1:49 ` Ye Xiaolong 0 siblings, 0 replies; 16+ messages in thread From: Ye Xiaolong @ 2019-05-31 1:49 UTC (permalink / raw) To: Stephen Hemminger Cc: Qi Zhang, John McNamara, Marko Kovacevic, Karlsson Magnus, Topel Bjorn, Ananyev Konstantin, dev On 05/30, Stephen Hemminger wrote: >On Thu, 30 May 2019 17:07:05 +0800 >Xiaolong Ye <xiaolong.ye@intel.com> wrote: > >> Implement zero copy of af_xdp pmd through mbuf's external memory >> mechanism to achieve high performance. >> >> This patch also provides a new parameter "pmd_zero_copy" for user, so they >> can choose to enable zero copy of af_xdp pmd or not. >> >> To be clear, "zero copy" here is different from the "zero copy mode" of >> AF_XDP, it is about zero copy between af_xdp umem and mbuf used in dpdk >> application. >> >> Suggested-by: Varghese Vipin <vipin.varghese@intel.com> >> Suggested-by: Tummala Sivaprasad <sivaprasad.tummala@intel.com> >> Suggested-by: Olivier Matz <olivier.matz@6wind.com> >> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> > >Why is this a parameter? Can it just be auto detected. There might be situations when some users would prefer a slower copy mode, so we can just let the user decide. Thanks, Xiaolong >Remember configuration is evil, it hurts usability, code coverage >and increases complexity. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye 2019-05-30 15:31 ` Stephen Hemminger @ 2019-06-11 16:16 ` William Tu 2019-06-12 10:03 ` Ye Xiaolong 1 sibling, 1 reply; 16+ messages in thread From: William Tu @ 2019-06-11 16:16 UTC (permalink / raw) To: Xiaolong Ye Cc: Qi Zhang, John McNamara, Marko Kovacevic, Karlsson Magnus, Topel Bjorn, dev On Thu, May 30, 2019 at 2:16 AM Xiaolong Ye <xiaolong.ye@intel.com> wrote: > > Implement zero copy of af_xdp pmd through mbuf's external memory > mechanism to achieve high performance. > > This patch also provides a new parameter "pmd_zero_copy" for user, so they > can choose to enable zero copy of af_xdp pmd or not. > > To be clear, "zero copy" here is different from the "zero copy mode" of > AF_XDP, it is about zero copy between af_xdp umem and mbuf used in dpdk > application. > > Suggested-by: Varghese Vipin <vipin.varghese@intel.com> > Suggested-by: Tummala Sivaprasad <sivaprasad.tummala@intel.com> > Suggested-by: Olivier Matz <olivier.matz@6wind.com> > Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> > --- > doc/guides/nics/af_xdp.rst | 1 + > drivers/net/af_xdp/rte_eth_af_xdp.c | 104 +++++++++++++++++++++------- > 2 files changed, 79 insertions(+), 26 deletions(-) > > diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst > index 175038153..0bd4239fe 100644 > --- a/doc/guides/nics/af_xdp.rst > +++ b/doc/guides/nics/af_xdp.rst > @@ -28,6 +28,7 @@ The following options can be provided to set up an af_xdp port in DPDK. 
> > * ``iface`` - name of the Kernel interface to attach to (required); > * ``queue`` - netdev queue id (optional, default 0); > +* ``pmd_zero_copy`` - enable zero copy or not (optional, default 0); > > Prerequisites > ------------- > diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c > index 35c72272c..014cd5691 100644 > --- a/drivers/net/af_xdp/rte_eth_af_xdp.c > +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c > @@ -70,6 +70,7 @@ struct xsk_umem_info { > struct xsk_umem *umem; > struct rte_ring *buf_ring; > const struct rte_memzone *mz; > + int pmd_zc; > }; > > struct rx_stats { > @@ -109,8 +110,8 @@ struct pmd_internals { > int if_index; > char if_name[IFNAMSIZ]; > uint16_t queue_idx; > + int pmd_zc; > struct ether_addr eth_addr; > - struct xsk_umem_info *umem; > struct rte_mempool *mb_pool_share; > > struct pkt_rx_queue rx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS]; > @@ -119,10 +120,12 @@ struct pmd_internals { > > #define ETH_AF_XDP_IFACE_ARG "iface" > #define ETH_AF_XDP_QUEUE_IDX_ARG "queue" > +#define ETH_AF_XDP_PMD_ZC_ARG "pmd_zero_copy" > > static const char * const valid_arguments[] = { > ETH_AF_XDP_IFACE_ARG, > ETH_AF_XDP_QUEUE_IDX_ARG, > + ETH_AF_XDP_PMD_ZC_ARG, > NULL > }; > > @@ -166,6 +169,15 @@ reserve_fill_queue(struct xsk_umem_info *umem, uint16_t reserve_size) > return 0; > } > > +static void > +umem_buf_release_to_fq(void *addr, void *opaque) > +{ > + struct xsk_umem_info *umem = (struct xsk_umem_info *)opaque; > + uint64_t umem_addr = (uint64_t)addr - umem->mz->addr_64; > + > + rte_ring_enqueue(umem->buf_ring, (void *)umem_addr); > +} > + > static uint16_t > eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > { > @@ -175,6 +187,7 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > struct xsk_ring_prod *fq = &umem->fq; > uint32_t idx_rx = 0; > uint32_t free_thresh = fq->size >> 1; > + int pmd_zc = umem->pmd_zc; > struct rte_mbuf *mbufs[ETH_AF_XDP_RX_BATCH_SIZE]; > unsigned long 
dropped = 0; > unsigned long rx_bytes = 0; > @@ -197,19 +210,29 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > uint64_t addr; > uint32_t len; > void *pkt; > + uint16_t buf_len = ETH_AF_XDP_FRAME_SIZE; > + struct rte_mbuf_ext_shared_info *shinfo; > > desc = xsk_ring_cons__rx_desc(rx, idx_rx++); > addr = desc->addr; > len = desc->len; > pkt = xsk_umem__get_data(rxq->umem->mz->addr, addr); > > - rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), pkt, len); > + if (pmd_zc) { > + shinfo = rte_pktmbuf_ext_shinfo_init_helper(pkt, > + &buf_len, umem_buf_release_to_fq, umem); > + > + rte_pktmbuf_attach_extbuf(mbufs[i], pkt, 0, buf_len, > + shinfo); > + } else { > + rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), > + pkt, len); > + rte_ring_enqueue(umem->buf_ring, (void *)addr); > + } > rte_pktmbuf_pkt_len(mbufs[i]) = len; > rte_pktmbuf_data_len(mbufs[i]) = len; > rx_bytes += len; > bufs[i] = mbufs[i]; > - > - rte_ring_enqueue(umem->buf_ring, (void *)addr); > } > > xsk_ring_cons__release(rx, rcvd); > @@ -262,12 +285,21 @@ kick_tx(struct pkt_tx_queue *txq) > pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE); > } > > +static inline bool > +in_umem_range(struct xsk_umem_info *umem, uint64_t addr) > +{ > + uint64_t mz_base_addr = umem->mz->addr_64; > + > + return addr >= mz_base_addr && addr < mz_base_addr + umem->mz->len; > +} > + > static uint16_t > eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > { > struct pkt_tx_queue *txq = queue; > struct xsk_umem_info *umem = txq->pair->umem; > struct rte_mbuf *mbuf; > + int pmd_zc = umem->pmd_zc; > void *addrs[ETH_AF_XDP_TX_BATCH_SIZE]; > unsigned long tx_bytes = 0; > int i; > @@ -294,16 +326,26 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > > desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i); > mbuf = bufs[i]; > - > - desc->addr = (uint64_t)addrs[i]; > desc->len = mbuf->pkt_len; > - pkt = xsk_umem__get_data(umem->mz->addr, > - desc->addr); > - rte_memcpy(pkt, 
rte_pktmbuf_mtod(mbuf, void *), > - desc->len); > - tx_bytes += mbuf->pkt_len; > > - rte_pktmbuf_free(mbuf); > + /* > + * We need to make sure the external mbuf address is within > + * current port's umem memzone range > + */ > + if (pmd_zc && RTE_MBUF_HAS_EXTBUF(mbuf) && > + in_umem_range(umem, (uint64_t)mbuf->buf_addr)) { > + desc->addr = (uint64_t)mbuf->buf_addr - > + umem->mz->addr_64; > + mbuf->buf_addr = xsk_umem__get_data(umem->mz->addr, > + (uint64_t)addrs[i]); > + } else { > + desc->addr = (uint64_t)addrs[i]; > + pkt = xsk_umem__get_data(umem->mz->addr, > + desc->addr); > + rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), > + desc->len); > + } > + tx_bytes += mbuf->pkt_len; > } > > xsk_ring_prod__submit(&txq->tx, nb_pkts); > @@ -313,6 +355,9 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > txq->stats.tx_pkts += nb_pkts; > txq->stats.tx_bytes += tx_bytes; > > + for (i = 0; i < nb_pkts; i++) > + rte_pktmbuf_free(bufs[i]); > + Is it ok to free the mbuf here? If the AF_XDP is running pmd_zc=true, the packet mbuf is still in the tx ring and might not be sent out yet. Regards, William > return nb_pkts; > } > > @@ -454,18 +499,16 @@ eth_dev_close(struct rte_eth_dev *dev) > if (rxq->umem == NULL) > break; > xsk_socket__delete(rxq->xsk); > + (void)xsk_umem__delete(rxq->umem->umem); > + xdp_umem_destroy(rxq->umem); > } > > - (void)xsk_umem__delete(internals->umem->umem); > - > /* > * MAC is not allocated dynamically, setting it to NULL would prevent > * from releasing it in rte_eth_dev_release_port. 
> */ > dev->data->mac_addrs = NULL; > > - xdp_umem_destroy(internals->umem); > - > remove_xdp_program(internals); > } > > @@ -639,7 +682,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, > goto err; > } > > - internals->umem = rxq->umem; > + rxq->umem->pmd_zc = internals->pmd_zc; > > dev->data->rx_queues[rx_queue_id] = rxq; > return 0; > @@ -775,9 +818,8 @@ parse_name_arg(const char *key __rte_unused, > } > > static int > -parse_parameters(struct rte_kvargs *kvlist, > - char *if_name, > - int *queue_idx) > +parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *queue_idx, > + int *pmd_zc) > { > int ret; > > @@ -791,6 +833,11 @@ parse_parameters(struct rte_kvargs *kvlist, > if (ret < 0) > goto free_kvlist; > > + ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PMD_ZC_ARG, > + &parse_integer_arg, pmd_zc); > + if (ret < 0) > + goto free_kvlist; > + > free_kvlist: > rte_kvargs_free(kvlist); > return ret; > @@ -827,9 +874,8 @@ get_iface_info(const char *if_name, > } > > static struct rte_eth_dev * > -init_internals(struct rte_vdev_device *dev, > - const char *if_name, > - int queue_idx) > +init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx, > + int pmd_zc) > { > const char *name = rte_vdev_device_name(dev); > const unsigned int numa_node = dev->device.numa_node; > @@ -843,6 +889,7 @@ init_internals(struct rte_vdev_device *dev, > return NULL; > > internals->queue_idx = queue_idx; > + internals->pmd_zc = pmd_zc; > strlcpy(internals->if_name, if_name, IFNAMSIZ); > > for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { > @@ -868,6 +915,9 @@ init_internals(struct rte_vdev_device *dev, > /* Let rte_eth_dev_close() release the port resources. 
*/ > eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE; > > + if (internals->pmd_zc) > + AF_XDP_LOG(INFO, "Zero copy between umem and mbuf enabled.\n"); > + > return eth_dev; > > err: > @@ -883,6 +933,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) > int xsk_queue_idx = ETH_AF_XDP_DFLT_QUEUE_IDX; > struct rte_eth_dev *eth_dev = NULL; > const char *name; > + int pmd_zc = 0; > > AF_XDP_LOG(INFO, "Initializing pmd_af_xdp for %s\n", > rte_vdev_device_name(dev)); > @@ -909,7 +960,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) > if (dev->device.numa_node == SOCKET_ID_ANY) > dev->device.numa_node = rte_socket_id(); > > - if (parse_parameters(kvlist, if_name, &xsk_queue_idx) < 0) { > + if (parse_parameters(kvlist, if_name, &xsk_queue_idx, &pmd_zc) < 0) { > AF_XDP_LOG(ERR, "Invalid kvargs value\n"); > return -EINVAL; > } > @@ -919,7 +970,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) > return -EINVAL; > } > > - eth_dev = init_internals(dev, if_name, xsk_queue_idx); > + eth_dev = init_internals(dev, if_name, xsk_queue_idx, pmd_zc); > if (eth_dev == NULL) { > AF_XDP_LOG(ERR, "Failed to init internals\n"); > return -1; > @@ -961,7 +1012,8 @@ static struct rte_vdev_driver pmd_af_xdp_drv = { > RTE_PMD_REGISTER_VDEV(net_af_xdp, pmd_af_xdp_drv); > RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp, > "iface=<string> " > - "queue=<int> "); > + "queue=<int> " > + "pmd_zero_copy=<0|1>"); > > RTE_INIT(af_xdp_init_log) > { > -- > 2.17.1 > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf 2019-06-11 16:16 ` William Tu @ 2019-06-12 10:03 ` Ye Xiaolong 2019-06-13 0:32 ` William Tu 0 siblings, 1 reply; 16+ messages in thread From: Ye Xiaolong @ 2019-06-12 10:03 UTC (permalink / raw) To: William Tu Cc: Qi Zhang, John McNamara, Marko Kovacevic, Karlsson Magnus, Topel Bjorn, dev Hi, On 06/11, William Tu wrote: [snip] >> @@ -294,16 +326,26 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) >> >> desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i); >> mbuf = bufs[i]; >> - >> - desc->addr = (uint64_t)addrs[i]; >> desc->len = mbuf->pkt_len; >> - pkt = xsk_umem__get_data(umem->mz->addr, >> - desc->addr); >> - rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), >> - desc->len); >> - tx_bytes += mbuf->pkt_len; >> >> - rte_pktmbuf_free(mbuf); >> + /* >> + * We need to make sure the external mbuf address is within >> + * current port's umem memzone range >> + */ >> + if (pmd_zc && RTE_MBUF_HAS_EXTBUF(mbuf) && >> + in_umem_range(umem, (uint64_t)mbuf->buf_addr)) { >> + desc->addr = (uint64_t)mbuf->buf_addr - >> + umem->mz->addr_64; >> + mbuf->buf_addr = xsk_umem__get_data(umem->mz->addr, >> + (uint64_t)addrs[i]); >> + } else { >> + desc->addr = (uint64_t)addrs[i]; >> + pkt = xsk_umem__get_data(umem->mz->addr, >> + desc->addr); >> + rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), >> + desc->len); >> + } >> + tx_bytes += mbuf->pkt_len; >> } >> >> xsk_ring_prod__submit(&txq->tx, nb_pkts); >> @@ -313,6 +355,9 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) >> txq->stats.tx_pkts += nb_pkts; >> txq->stats.tx_bytes += tx_bytes; >> >> + for (i = 0; i < nb_pkts; i++) >> + rte_pktmbuf_free(bufs[i]); >> + > >Is it ok to free the mbuf here? >If the AF_XDP is running pmd_zc=true, the packet mbuf is still in the tx >ring and might not be sent out yet. 
For the pmd_zc=true case, mbuf->buf_addr has already been exchanged with an available addr dequeued from umem->buf_ring, so rte_pktmbuf_free would just call the free callback umem_buf_release_to_fq to enqueue that addr to the buf_ring. Thanks, Xiaolong > >Regards, >William > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf 2019-06-12 10:03 ` Ye Xiaolong @ 2019-06-13 0:32 ` William Tu 0 siblings, 0 replies; 16+ messages in thread From: William Tu @ 2019-06-13 0:32 UTC (permalink / raw) To: Ye Xiaolong Cc: Qi Zhang, John McNamara, Marko Kovacevic, Karlsson Magnus, Topel Bjorn, dev On Tue, Jun 11, 2019 at 8:21 PM Ye Xiaolong <xiaolong.ye@intel.com> wrote: > > Hi, > > On 06/11, William Tu wrote: > [snip] > >> @@ -294,16 +326,26 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > >> > >> desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i); > >> mbuf = bufs[i]; > >> - > >> - desc->addr = (uint64_t)addrs[i]; > >> desc->len = mbuf->pkt_len; > >> - pkt = xsk_umem__get_data(umem->mz->addr, > >> - desc->addr); > >> - rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), > >> - desc->len); > >> - tx_bytes += mbuf->pkt_len; > >> > >> - rte_pktmbuf_free(mbuf); > >> + /* > >> + * We need to make sure the external mbuf address is within > >> + * current port's umem memzone range > >> + */ > >> + if (pmd_zc && RTE_MBUF_HAS_EXTBUF(mbuf) && > >> + in_umem_range(umem, (uint64_t)mbuf->buf_addr)) { > >> + desc->addr = (uint64_t)mbuf->buf_addr - > >> + umem->mz->addr_64; > >> + mbuf->buf_addr = xsk_umem__get_data(umem->mz->addr, > >> + (uint64_t)addrs[i]); > >> + } else { > >> + desc->addr = (uint64_t)addrs[i]; > >> + pkt = xsk_umem__get_data(umem->mz->addr, > >> + desc->addr); > >> + rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), > >> + desc->len); > >> + } > >> + tx_bytes += mbuf->pkt_len; > >> } > >> > >> xsk_ring_prod__submit(&txq->tx, nb_pkts); > >> @@ -313,6 +355,9 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) > >> txq->stats.tx_pkts += nb_pkts; > >> txq->stats.tx_bytes += tx_bytes; > >> > >> + for (i = 0; i < nb_pkts; i++) > >> + rte_pktmbuf_free(bufs[i]); > >> + > > > >Is it ok to free the mbuf here? 
> >If the AF_XDP is running pmd_zc=true, the packet mbuf is still in the tx > >ring and might not be sent out yet. > > For pmd_zc=true case, here mbuf->buf_addr has been exchanged to available addr > dequeued from umem->buf_ring, rte_pktmbuf_free would just call the free callback > umem_buf_release_to_fq to enqueue the addr to the buf_ring. > I see, thanks for the explanation. William > Thanks, > Xiaolong > > > > >Regards, > >William > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 2/3] net/af_xdp: add multi-queue support 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Xiaolong Ye 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye @ 2019-05-30 9:07 ` Xiaolong Ye 2019-05-30 15:32 ` Stephen Hemminger 2019-05-30 9:07 ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: remove unused struct member Xiaolong Ye 2019-06-10 16:54 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Ferruh Yigit 3 siblings, 1 reply; 16+ messages in thread From: Xiaolong Ye @ 2019-05-30 9:07 UTC (permalink / raw) To: Xiaolong Ye, Qi Zhang, John McNamara, Marko Kovacevic Cc: Karlsson Magnus, Topel Bjorn, dev This patch adds two parameters `start_queue` and `queue_count` to specify the range of netdev queues used by AF_XDP pmd. Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> --- doc/guides/nics/af_xdp.rst | 3 +- drivers/net/af_xdp/rte_eth_af_xdp.c | 194 ++++++++++++++++++++-------- 2 files changed, 141 insertions(+), 56 deletions(-) diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst index 0bd4239fe..18defcda3 100644 --- a/doc/guides/nics/af_xdp.rst +++ b/doc/guides/nics/af_xdp.rst @@ -27,7 +27,8 @@ Options The following options can be provided to set up an af_xdp port in DPDK. 
* ``iface`` - name of the Kernel interface to attach to (required); -* ``queue`` - netdev queue id (optional, default 0); +* ``start_queue`` - starting netdev queue id (optional, default 0); +* ``queue_count`` - total netdev queue number (optional, default 1); * ``pmd_zero_copy`` - enable zero copy or not (optional, default 0); Prerequisites diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 014cd5691..f56aabcae 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -12,6 +12,8 @@ #include <linux/if_ether.h> #include <linux/if_xdp.h> #include <linux/if_link.h> +#include <linux/ethtool.h> +#include <linux/sockios.h> #include "af_xdp_deps.h" #include <bpf/xsk.h> @@ -57,12 +59,12 @@ static int af_xdp_logtype; #define ETH_AF_XDP_NUM_BUFFERS 4096 #define ETH_AF_XDP_DATA_HEADROOM 0 #define ETH_AF_XDP_DFLT_NUM_DESCS XSK_RING_CONS__DEFAULT_NUM_DESCS -#define ETH_AF_XDP_DFLT_QUEUE_IDX 0 +#define ETH_AF_XDP_DFLT_START_QUEUE_IDX 0 +#define ETH_AF_XDP_DFLT_QUEUE_COUNT 1 #define ETH_AF_XDP_RX_BATCH_SIZE 32 #define ETH_AF_XDP_TX_BATCH_SIZE 32 -#define ETH_AF_XDP_MAX_QUEUE_PAIRS 16 struct xsk_umem_info { struct xsk_ring_prod fq; @@ -88,7 +90,7 @@ struct pkt_rx_queue { struct rx_stats stats; struct pkt_tx_queue *pair; - uint16_t queue_idx; + int xsk_queue_idx; }; struct tx_stats { @@ -103,28 +105,34 @@ struct pkt_tx_queue { struct tx_stats stats; struct pkt_rx_queue *pair; - uint16_t queue_idx; + int xsk_queue_idx; }; struct pmd_internals { int if_index; char if_name[IFNAMSIZ]; - uint16_t queue_idx; + int start_queue_idx; + int queue_cnt; + int max_queue_cnt; + int combined_queue_cnt; + int pmd_zc; struct ether_addr eth_addr; struct rte_mempool *mb_pool_share; - struct pkt_rx_queue rx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS]; - struct pkt_tx_queue tx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS]; + struct pkt_rx_queue *rx_queues; + struct pkt_tx_queue *tx_queues; }; #define ETH_AF_XDP_IFACE_ARG "iface" -#define 
ETH_AF_XDP_QUEUE_IDX_ARG "queue" +#define ETH_AF_XDP_START_QUEUE_ARG "start_queue" +#define ETH_AF_XDP_QUEUE_COUNT_ARG "queue_count" #define ETH_AF_XDP_PMD_ZC_ARG "pmd_zero_copy" static const char * const valid_arguments[] = { ETH_AF_XDP_IFACE_ARG, - ETH_AF_XDP_QUEUE_IDX_ARG, + ETH_AF_XDP_START_QUEUE_ARG, + ETH_AF_XDP_QUEUE_COUNT_ARG, ETH_AF_XDP_PMD_ZC_ARG, NULL }; @@ -394,8 +402,8 @@ eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) dev_info->if_index = internals->if_index; dev_info->max_mac_addrs = 1; dev_info->max_rx_pktlen = ETH_FRAME_LEN; - dev_info->max_rx_queues = 1; - dev_info->max_tx_queues = 1; + dev_info->max_rx_queues = internals->queue_cnt; + dev_info->max_tx_queues = internals->queue_cnt; dev_info->min_mtu = ETHER_MIN_MTU; dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM; @@ -412,21 +420,23 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) struct pmd_internals *internals = dev->data->dev_private; struct xdp_statistics xdp_stats; struct pkt_rx_queue *rxq; + struct pkt_tx_queue *txq; socklen_t optlen; int i, ret; for (i = 0; i < dev->data->nb_rx_queues; i++) { optlen = sizeof(struct xdp_statistics); rxq = &internals->rx_queues[i]; - stats->q_ipackets[i] = internals->rx_queues[i].stats.rx_pkts; - stats->q_ibytes[i] = internals->rx_queues[i].stats.rx_bytes; + txq = rxq->pair; + stats->q_ipackets[i] = rxq->stats.rx_pkts; + stats->q_ibytes[i] = rxq->stats.rx_bytes; - stats->q_opackets[i] = internals->tx_queues[i].stats.tx_pkts; - stats->q_obytes[i] = internals->tx_queues[i].stats.tx_bytes; + stats->q_opackets[i] = txq->stats.tx_pkts; + stats->q_obytes[i] = txq->stats.tx_bytes; stats->ipackets += stats->q_ipackets[i]; stats->ibytes += stats->q_ibytes[i]; - stats->imissed += internals->rx_queues[i].stats.rx_dropped; + stats->imissed += rxq->stats.rx_dropped; ret = getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, XDP_STATISTICS, &xdp_stats, &optlen); if (ret != 0) { @@ -436,7 +446,7 @@ 
eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) stats->imissed += xdp_stats.rx_dropped; stats->opackets += stats->q_opackets[i]; - stats->oerrors += internals->tx_queues[i].stats.err_pkts; + stats->oerrors += txq->stats.err_pkts; stats->obytes += stats->q_obytes[i]; } @@ -449,7 +459,7 @@ eth_stats_reset(struct rte_eth_dev *dev) struct pmd_internals *internals = dev->data->dev_private; int i; - for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { + for (i = 0; i < internals->queue_cnt; i++) { memset(&internals->rx_queues[i].stats, 0, sizeof(struct rx_stats)); memset(&internals->tx_queues[i].stats, 0, @@ -494,13 +504,17 @@ eth_dev_close(struct rte_eth_dev *dev) AF_XDP_LOG(INFO, "Closing AF_XDP ethdev on numa socket %u\n", rte_socket_id()); - for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { + for (i = 0; i < internals->queue_cnt; i++) { rxq = &internals->rx_queues[i]; if (rxq->umem == NULL) break; xsk_socket__delete(rxq->xsk); (void)xsk_umem__delete(rxq->umem->umem); xdp_umem_destroy(rxq->umem); + + /* free pkt_tx_queue */ + rte_free(rxq->pair); + rte_free(rxq); } /* @@ -525,7 +539,8 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused, } static struct -xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals) +xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals, + struct pkt_rx_queue *rxq) { struct xsk_umem_info *umem; const struct rte_memzone *mz; @@ -546,7 +561,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals) } snprintf(ring_name, sizeof(ring_name), "af_xdp_ring_%s_%u", - internals->if_name, internals->queue_idx); + internals->if_name, rxq->xsk_queue_idx); umem->buf_ring = rte_ring_create(ring_name, ETH_AF_XDP_NUM_BUFFERS, rte_socket_id(), @@ -562,7 +577,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals) ETH_AF_XDP_DATA_HEADROOM)); snprintf(mz_name, sizeof(mz_name), "af_xdp_umem_%s_%u", - internals->if_name, internals->queue_idx); + internals->if_name, rxq->xsk_queue_idx); mz 
	= rte_memzone_reserve_aligned(mz_name,
			ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE,
			rte_socket_id(), RTE_MEMZONE_IOVA_CONTIG,
@@ -599,7 +614,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	int ret = 0;
 	int reserve_size;
 
-	rxq->umem = xdp_umem_configure(internals);
+	rxq->umem = xdp_umem_configure(internals, rxq);
 	if (rxq->umem == NULL)
 		return -ENOMEM;
 
@@ -609,7 +624,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
 	cfg.bind_flags = 0;
 	ret = xsk_socket__create(&rxq->xsk, internals->if_name,
-			internals->queue_idx, rxq->umem->umem, &rxq->rx,
+			rxq->xsk_queue_idx, rxq->umem->umem, &rxq->rx,
 			&txq->tx, &cfg);
 	if (ret) {
 		AF_XDP_LOG(ERR, "Failed to create xsk socket.\n");
@@ -632,20 +647,6 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	return ret;
 }
 
-static void
-queue_reset(struct pmd_internals *internals, uint16_t queue_idx)
-{
-	struct pkt_rx_queue *rxq = &internals->rx_queues[queue_idx];
-	struct pkt_tx_queue *txq = rxq->pair;
-
-	memset(rxq, 0, sizeof(*rxq));
-	memset(txq, 0, sizeof(*txq));
-	rxq->pair = txq;
-	txq->pair = rxq;
-	rxq->queue_idx = queue_idx;
-	txq->queue_idx = queue_idx;
-}
-
 static int
 eth_rx_queue_setup(struct rte_eth_dev *dev,
 		   uint16_t rx_queue_id,
@@ -660,8 +661,9 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 	int ret;
 
 	rxq = &internals->rx_queues[rx_queue_id];
-	queue_reset(internals, rx_queue_id);
 
+	AF_XDP_LOG(INFO, "Set up rx queue, rx queue id: %d, xsk queue id: %d\n",
+		   rx_queue_id, rxq->xsk_queue_idx);
 	/* Now get the space available for data in the mbuf */
 	buf_size = rte_pktmbuf_data_room_size(mb_pool) -
 		RTE_PKTMBUF_HEADROOM;
@@ -688,7 +690,6 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 	return 0;
 
 err:
-	queue_reset(internals, rx_queue_id);
 	return ret;
 }
 
@@ -818,8 +819,45 @@ parse_name_arg(const char *key __rte_unused,
 }
 
 static int
-parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *queue_idx,
-		 int *pmd_zc)
+xdp_get_channels_info(const char *if_name, int *max_queues,
+		      int *combined_queues)
+{
+	struct ethtool_channels channels;
+	struct ifreq ifr;
+	int fd, ret;
+
+	fd = socket(AF_INET, SOCK_DGRAM, 0);
+	if (fd < 0)
+		return -1;
+
+	channels.cmd = ETHTOOL_GCHANNELS;
+	ifr.ifr_data = (void *)&channels;
+	strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
+	ret = ioctl(fd, SIOCETHTOOL, &ifr);
+	if (ret && errno != EOPNOTSUPP) {
+		ret = -errno;
+		goto out;
+	}
+
+	if (channels.max_combined == 0 || errno == EOPNOTSUPP) {
+		/* If the device says it has no channels, then all traffic
+		 * is sent to a single stream, so max queues = 1.
+		 */
+		*max_queues = 1;
+		*combined_queues = 1;
+	} else {
+		*max_queues = channels.max_combined;
+		*combined_queues = channels.combined_count;
+	}
+
+ out:
+	close(fd);
+	return ret;
+}
+
+static int
+parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
+		 int *queue_cnt, int *pmd_zc)
 {
 	int ret;
 
@@ -828,11 +866,18 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *queue_idx,
 	if (ret < 0)
 		goto free_kvlist;
 
-	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_QUEUE_IDX_ARG,
-				 &parse_integer_arg, queue_idx);
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_START_QUEUE_ARG,
+				 &parse_integer_arg, start_queue);
 	if (ret < 0)
 		goto free_kvlist;
 
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_QUEUE_COUNT_ARG,
+				 &parse_integer_arg, queue_cnt);
+	if (ret < 0 || *queue_cnt <= 0) {
+		ret = -EINVAL;
+		goto free_kvlist;
+	}
+
 	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PMD_ZC_ARG,
				 &parse_integer_arg, pmd_zc);
 	if (ret < 0)
@@ -874,8 +919,8 @@ get_iface_info(const char *if_name,
 }
 
 static struct rte_eth_dev *
-init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx,
-	       int pmd_zc)
+init_internals(struct rte_vdev_device *dev, const char *if_name,
+	       int start_queue_idx, int queue_cnt, int pmd_zc)
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -888,23 +933,54 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx,
 	if (internals == NULL)
 		return NULL;
 
-	internals->queue_idx = queue_idx;
+	internals->start_queue_idx = start_queue_idx;
+	internals->queue_cnt = queue_cnt;
 	internals->pmd_zc = pmd_zc;
 	strlcpy(internals->if_name, if_name, IFNAMSIZ);
 
-	for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) {
+	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
+				  &internals->combined_queue_cnt)) {
+		AF_XDP_LOG(ERR, "Failed to get channel info of interface: %s\n",
+			   if_name);
+		goto err_free_internals;
+	}
+
+	if (queue_cnt > internals->combined_queue_cnt) {
+		AF_XDP_LOG(ERR, "Specified queue count %d is larger than combined queue count %d.\n",
+			   queue_cnt, internals->combined_queue_cnt);
+		goto err_free_internals;
+	}
+
+	internals->rx_queues = rte_zmalloc_socket(NULL,
+					sizeof(struct pkt_rx_queue) * queue_cnt,
+					0, numa_node);
+	if (internals->rx_queues == NULL) {
+		AF_XDP_LOG(ERR, "Failed to allocate memory for rx queues.\n");
+		goto err_free_internals;
+	}
+
+	internals->tx_queues = rte_zmalloc_socket(NULL,
+					sizeof(struct pkt_tx_queue) * queue_cnt,
+					0, numa_node);
+	if (internals->tx_queues == NULL) {
+		AF_XDP_LOG(ERR, "Failed to allocate memory for tx queues.\n");
+		goto err_free_rx;
+	}
+
+	for (i = 0; i < queue_cnt; i++) {
 		internals->tx_queues[i].pair = &internals->rx_queues[i];
 		internals->rx_queues[i].pair = &internals->tx_queues[i];
+		internals->rx_queues[i].xsk_queue_idx = start_queue_idx + i;
+		internals->tx_queues[i].xsk_queue_idx = start_queue_idx + i;
 	}
 
 	ret = get_iface_info(if_name, &internals->eth_addr,
 			     &internals->if_index);
 	if (ret)
-		goto err;
+		goto err_free_tx;
 
 	eth_dev = rte_eth_vdev_allocate(dev, 0);
 	if (eth_dev == NULL)
-		goto err;
+		goto err_free_tx;
 
 	eth_dev->data->dev_private = internals;
 	eth_dev->data->dev_link = pmd_link;
@@ -920,7 +996,11 @@ init_internals(struct rte_vdev_device *dev, const char *if_name, int queue_idx,
 
 	return eth_dev;
 
-err:
+err_free_tx:
+	rte_free(internals->tx_queues);
+err_free_rx:
+	rte_free(internals->rx_queues);
+err_free_internals:
 	rte_free(internals);
 	return NULL;
 }
@@ -930,7 +1010,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 {
 	struct rte_kvargs *kvlist;
 	char if_name[IFNAMSIZ] = {'\0'};
-	int xsk_queue_idx = ETH_AF_XDP_DFLT_QUEUE_IDX;
+	int xsk_start_queue_idx = ETH_AF_XDP_DFLT_START_QUEUE_IDX;
+	int xsk_queue_cnt = ETH_AF_XDP_DFLT_QUEUE_COUNT;
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name;
 	int pmd_zc = 0;
@@ -960,7 +1041,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	if (dev->device.numa_node == SOCKET_ID_ANY)
 		dev->device.numa_node = rte_socket_id();
 
-	if (parse_parameters(kvlist, if_name, &xsk_queue_idx, &pmd_zc) < 0) {
+	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
+			     &xsk_queue_cnt, &pmd_zc) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
@@ -970,7 +1052,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 		return -EINVAL;
 	}
 
-	eth_dev = init_internals(dev, if_name, xsk_queue_idx, pmd_zc);
+	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
+				 xsk_queue_cnt, pmd_zc);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
	}
@@ -1012,7 +1095,8 @@ static struct rte_vdev_driver pmd_af_xdp_drv = {
 
 RTE_PMD_REGISTER_VDEV(net_af_xdp, pmd_af_xdp_drv);
 RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "iface=<string> "
-			      "queue=<int> "
+			      "start_queue=<int> "
+			      "queue_count=<int> "
 			      "pmd_zero_copy=<0|1>");
 
 RTE_INIT(af_xdp_init_log)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: add multi-queue support
  2019-05-30  9:07   ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: add multi-queue support Xiaolong Ye
@ 2019-05-30 15:32     ` Stephen Hemminger
  2019-05-31  1:53       ` Ye Xiaolong
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2019-05-30 15:32 UTC (permalink / raw)
  To: Xiaolong Ye
  Cc: Qi Zhang, John McNamara, Marko Kovacevic, Karlsson Magnus,
	Topel Bjorn, dev

On Thu, 30 May 2019 17:07:06 +0800
Xiaolong Ye <xiaolong.ye@intel.com> wrote:

> This patch adds two parameters `start_queue` and `queue_count` to
> specify the range of netdev queues used by AF_XDP pmd.
>
> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>

Why does this have to be a config option? We already have max queues
and number of queues in the DPDK configuration.

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: add multi-queue support
  2019-05-30 15:32     ` Stephen Hemminger
@ 2019-05-31  1:53       ` Ye Xiaolong
  0 siblings, 0 replies; 16+ messages in thread
From: Ye Xiaolong @ 2019-05-31 1:53 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Qi Zhang, John McNamara, Marko Kovacevic, Karlsson Magnus,
	Topel Bjorn, dev

On 05/30, Stephen Hemminger wrote:
>On Thu, 30 May 2019 17:07:06 +0800
>Xiaolong Ye <xiaolong.ye@intel.com> wrote:
>
>> This patch adds two parameters `start_queue` and `queue_count` to
>> specify the range of netdev queues used by AF_XDP pmd.
>>
>> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
>
>Why does this have to be a config option, we already have max queues
>and number of queues in DPDK configuration.

Here multi-queue refers to the nic's ethdev queues. AF_XDP currently has
a 1:1 mapping between an af_xdp socket and a netdev queue, so this patch
is to support one vdev with multiple af_xdp sockets (netdev queues).

Thanks,
Xiaolong

^ permalink raw reply	[flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 3/3] net/af_xdp: remove unused struct member
  2019-05-30  9:07 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Xiaolong Ye
  2019-05-30  9:07   ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye
  2019-05-30  9:07   ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: add multi-queue support Xiaolong Ye
@ 2019-05-30  9:07   ` Xiaolong Ye
  2019-06-10 16:54 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Ferruh Yigit
  3 siblings, 0 replies; 16+ messages in thread
From: Xiaolong Ye @ 2019-05-30 9:07 UTC (permalink / raw)
  To: Xiaolong Ye, Qi Zhang; +Cc: Karlsson Magnus, Topel Bjorn, dev, stable

Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index f56aabcae..fc25d245b 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -118,7 +118,6 @@ struct pmd_internals {
 	int pmd_zc;
 	struct ether_addr eth_addr;
-	struct rte_mempool *mb_pool_share;
 
 	struct pkt_rx_queue *rx_queues;
 	struct pkt_tx_queue *tx_queues;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd
  2019-05-30  9:07 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Xiaolong Ye
                     ` (2 preceding siblings ...)
  2019-05-30  9:07   ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: remove unused struct member Xiaolong Ye
@ 2019-06-10 16:54 ` Ferruh Yigit
  3 siblings, 0 replies; 16+ messages in thread
From: Ferruh Yigit @ 2019-06-10 16:54 UTC (permalink / raw)
  To: Xiaolong Ye; +Cc: Qi Zhang, Karlsson Magnus, Topel Bjorn, dev

On 5/30/2019 10:07 AM, Xiaolong Ye wrote:
> This patch series mainly includes 3 new features for the AF_XDP pmd.
> They are independent features; the reason I put them in one patchset
> is that they have code dependencies.
>
> 1. zero copy
>
> This patch enables `zero copy` between the af_xdp umem and the mbuf by
> using the external mbuf mechanism.
>
> 2. multi-queue
>
> With multi-queue support, one AF_XDP pmd instance can use multiple
> netdev queues.
>
> 3. busy-poll
>
> With busy-poll, all processing occurs on a single core, so performance
> is better from a per-core perspective.
>
> This patch depends on busy-poll support on the kernel side, which is
> currently at the RFC stage [1].
>
> [1] https://www.spinics.net/lists/netdev/msg568337.html
>
> V2 changes:
>
> 1. improve multi-queue support by getting the ethtool channel info, so
>    the driver is able to get a reasonable maximum queue number.
> 2. add a tiny cleanup patch to get rid of an unused struct member.
> 3. remove the busy-poll patch as its kernel counterpart has changed;
>    will update the patch later.
>
> Xiaolong Ye (3):
>   net/af_xdp: enable zero copy by extbuf
>   net/af_xdp: add multi-queue support
>   net/af_xdp: remove unused struct member

Series applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread
end of thread, other threads:[~2019-06-13  0:33 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-15  8:38 [dpdk-dev] [PATCH v1 0/3] add more features for AF_XDP pmd Xiaolong Ye
2019-05-15  8:38 ` [dpdk-dev] [PATCH v1 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye
2019-05-15  8:38 ` [dpdk-dev] [PATCH v1 2/3] net/af_xdp: add multi-queue support Xiaolong Ye
2019-05-15  8:38 ` [dpdk-dev] [PATCH v1 3/3] net/af_xdp: add busy poll support Xiaolong Ye
2019-05-30  9:07 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Xiaolong Ye
2019-05-30  9:07   ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: enable zero copy by extbuf Xiaolong Ye
2019-05-30 15:31     ` Stephen Hemminger
2019-05-31  1:49       ` Ye Xiaolong
2019-06-11 16:16     ` William Tu
2019-06-12 10:03       ` Ye Xiaolong
2019-06-13  0:32         ` William Tu
2019-05-30  9:07   ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: add multi-queue support Xiaolong Ye
2019-05-30 15:32     ` Stephen Hemminger
2019-05-31  1:53       ` Ye Xiaolong
2019-05-30  9:07   ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: remove unused struct member Xiaolong Ye
2019-06-10 16:54 ` [dpdk-dev] [PATCH v2 0/3] add more features for AF_XDP pmd Ferruh Yigit