[PATCH v10 0/8] support for netronome nfp-6xxx card
From: Alejandro Lucero @ 2015-11-30 10:25 UTC
  To: dev

This patchset adds a new PMD for the Netronome nfp-6xxx card.
Only PCI Virtual Functions are supported.
Using this PMD requires a prior installation of the Netronome BSP.
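
To build the PMD, enable it in the DPDK build configuration
(illustrative; this matches the build option added by this series):

    CONFIG_RTE_LIBRTE_NFP_PMD=y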

v10:
 - Getting rid of __u8 usage
 - Squashing last two patches in one

v9:
 - Adding flag RTE_PCI_DRV_INTR_LSC
 - Makefile changes for compilation as a shared library
 - Adding map file for linker version script info

v8:
 - Removing remaining unnecessary flags from the PMD Makefile

v7:
 - Adding support for link status changes interrupts.
 - Removing unnecessary flags when compiling the PMD.

v6:
 - Making each patch compilable.

v5:
 - Splitting up patches per functionality.

v4:
 - Getting rid of nfp_uio. Just submitting PMD.
 - Removing LSC interrupt support.

v3:
 - Making all patches independent for applying and building.
 - Changing commit messages to follow the standard.

v2:
 - Code style changes based on checkpatch.pl and DPDK style guide.
 - Documentation changes using the right rst format.
 - Moving the documentation files to a new patch file.
 - Adding info to MAINTAINERS and release files.

Alejandro Lucero (8):
  nfp: basic initialization
  nfp: adding rx/tx functionality
  nfp: adding rss
  nfp: adding stats
  nfp: adding link functionality
  nfp: adding extra functionality
  nfp: link status change interrupt support
  nfp: adding nic guide

 MAINTAINERS                             |    4 +
 config/common_linuxapp                  |    6 +
 doc/guides/nics/index.rst               |    1 +
 doc/guides/nics/nfp.rst                 |  265 ++++
 doc/guides/rel_notes/release_2_2.rst    |    3 +
 drivers/net/Makefile                    |    1 +
 drivers/net/nfp/Makefile                |   56 +
 drivers/net/nfp/nfp_net.c               | 2499 +++++++++++++++++++++++++++++++
 drivers/net/nfp/nfp_net_ctrl.h          |  324 ++++
 drivers/net/nfp/nfp_net_logs.h          |   75 +
 drivers/net/nfp/nfp_net_pmd.h           |  453 ++++++
 drivers/net/nfp/rte_pmd_nfp_version.map |    3 +
 mk/rte.app.mk                           |    1 +
 13 files changed, 3691 insertions(+)
 create mode 100644 doc/guides/nics/nfp.rst
 create mode 100644 drivers/net/nfp/Makefile
 create mode 100644 drivers/net/nfp/nfp_net.c
 create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/nfp/nfp_net_logs.h
 create mode 100644 drivers/net/nfp/nfp_net_pmd.h
 create mode 100644 drivers/net/nfp/rte_pmd_nfp_version.map

-- 
1.7.9.5


[PATCH v10 1/8] nfp: basic initialization
From: Alejandro Lucero @ 2015-11-30 10:25 UTC
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 MAINTAINERS                             |    3 +
 config/common_linuxapp                  |    6 +
 doc/guides/rel_notes/release_2_2.rst    |    3 +
 drivers/net/Makefile                    |    1 +
 drivers/net/nfp/Makefile                |   56 +++
 drivers/net/nfp/nfp_net.c               |  699 +++++++++++++++++++++++++++++++
 drivers/net/nfp/nfp_net_ctrl.h          |  324 ++++++++++++++
 drivers/net/nfp/nfp_net_logs.h          |   75 ++++
 drivers/net/nfp/nfp_net_pmd.h           |  453 ++++++++++++++++++++
 drivers/net/nfp/rte_pmd_nfp_version.map |    3 +
 mk/rte.app.mk                           |    1 +
 11 files changed, 1624 insertions(+)
 create mode 100644 drivers/net/nfp/Makefile
 create mode 100644 drivers/net/nfp/nfp_net.c
 create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/nfp/nfp_net_logs.h
 create mode 100644 drivers/net/nfp/nfp_net_pmd.h
 create mode 100644 drivers/net/nfp/rte_pmd_nfp_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 4478862..a23de04 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -335,6 +335,9 @@ F: drivers/crypto/aesni_mb/
 Intel QuickAssist
 F: drivers/crypto/qat/
 
+Netronome nfp
+M: Alejandro Lucero <alejandro.lucero@netronome.com>
+F: drivers/net/nfp/
 
 Packet processing
 -----------------
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2866986..82f68c7 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -279,6 +279,12 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_DRIVER=n
 
 #
+# Compile burst-oriented Netronome NFP PMD driver
+#
+CONFIG_RTE_LIBRTE_NFP_PMD=n
+CONFIG_RTE_LIBRTE_NFP_DEBUG=n
+
+#
 # Compile example software rings based PMD
 #
 CONFIG_RTE_LIBRTE_PMD_RING=y
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 511d7a0..0a7c217 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -230,6 +230,9 @@ Libraries
   hardware transactional memory support, thread scaling did not work,
   due to the global ring that is shared by all cores.
 
+* **nfp: adding new PMD for Netronome nfp-6xxx card.**
+
+  Support for using Netronome nfp-6xxx with PCI VFs.
 
 Examples
 ~~~~~~~~
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index cddcd57..6e4497e 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -43,6 +43,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += mpipe
+DIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += null
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += pcap
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += ring
diff --git a/drivers/net/nfp/Makefile b/drivers/net/nfp/Makefile
new file mode 100644
index 0000000..ef7a13d
--- /dev/null
+++ b/drivers/net/nfp/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_nfp.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_nfp_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp_net.c
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_eal lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_mempool lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_net lib/librte_malloc
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
new file mode 100644
index 0000000..b9240db
--- /dev/null
+++ b/drivers/net/nfp/nfp_net.c
@@ -0,0 +1,699 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Small portions derived from code Copyright(c) 2010-2015 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *  this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ *  contributors may be used to endorse or promote products derived from this
+ *  software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * vim:shiftwidth=8:noexpandtab
+ *
+ * @file dpdk/pmd/nfp_net.c
+ *
+ * Netronome vNIC DPDK Poll-Mode Driver: Main entry point
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/socket.h>
+#include <sys/io.h>
+#include <assert.h>
+#include <time.h>
+#include <math.h>
+#include <inttypes.h>
+
+#include <rte_byteorder.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_ethdev.h>
+#include <rte_dev.h>
+#include <rte_ether.h>
+#include <rte_malloc.h>
+#include <rte_memzone.h>
+#include <rte_mempool.h>
+#include <rte_version.h>
+#include <rte_string_fns.h>
+#include <rte_alarm.h>
+
+#include "nfp_net_pmd.h"
+#include "nfp_net_logs.h"
+#include "nfp_net_ctrl.h"
+
+/* Prototypes */
+static void nfp_net_close(struct rte_eth_dev *dev);
+static int nfp_net_configure(struct rte_eth_dev *dev);
+static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_start(struct rte_eth_dev *dev);
+static void nfp_net_stop(struct rte_eth_dev *dev);
+
+/*
+ * The offset of the queue controller queues in the PCIe Target. These
+ * happen to be at the same offset on the NFP6000 and the NFP3200 so
+ * we use a single macro here.
+ */
+#define NFP_PCIE_QUEUE(_q)	(0x80000 + (0x800 * ((_q) & 0xff)))
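+/* For example, queue 1 maps to PCIe offset 0x80000 + 0x800 = 0x80800 */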
+
+/* Maximum value which can be added to a queue with one transaction */
+#define NFP_QCP_MAX_ADD	0x7f
+
+#define RTE_MBUF_DMA_ADDR_DEFAULT(mb) \
+	(uint64_t)((mb)->buf_physaddr + RTE_PKTMBUF_HEADROOM)
+
+/* nfp_qcp_ptr - Read or Write Pointer of a queue */
+enum nfp_qcp_ptr {
+	NFP_QCP_READ_PTR = 0,
+	NFP_QCP_WRITE_PTR
+};
+
+/*
+ * nfp_qcp_ptr_add - Add the value to the selected pointer of a queue
+ * @q: Base address for queue structure
+ * @ptr: Add to the Read or Write pointer
+ * @val: Value to add to the queue pointer
+ *
+ * If @val is greater than @NFP_QCP_MAX_ADD, multiple writes are performed.
+ */
+static inline void
+nfp_qcp_ptr_add(uint8_t *q, enum nfp_qcp_ptr ptr, uint32_t val)
+{
+	uint32_t off;
+
+	if (ptr == NFP_QCP_READ_PTR)
+		off = NFP_QCP_QUEUE_ADD_RPTR;
+	else
+		off = NFP_QCP_QUEUE_ADD_WPTR;
+
+	while (val > NFP_QCP_MAX_ADD) {
+		nn_writel(rte_cpu_to_le_32(NFP_QCP_MAX_ADD), q + off);
+		val -= NFP_QCP_MAX_ADD;
+	}
+
+	nn_writel(rte_cpu_to_le_32(val), q + off);
+}
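+
+/*
+ * Worked example (illustrative): adding 200 to a queue pointer issues
+ * two writes, NFP_QCP_MAX_ADD (0x7f = 127) followed by 73, since a
+ * single transaction can add at most NFP_QCP_MAX_ADD.
+ */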
+
+/*
+ * nfp_qcp_read - Read the current Read/Write pointer value for a queue
+ * @q:  Base address for queue structure
+ * @ptr: Read or Write pointer
+ */
+static inline uint32_t
+nfp_qcp_read(uint8_t *q, enum nfp_qcp_ptr ptr)
+{
+	uint32_t off;
+	uint32_t val;
+
+	if (ptr == NFP_QCP_READ_PTR)
+		off = NFP_QCP_QUEUE_STS_LO;
+	else
+		off = NFP_QCP_QUEUE_STS_HI;
+
+	val = rte_cpu_to_le_32(nn_readl(q + off));
+
+	if (ptr == NFP_QCP_READ_PTR)
+		return val & NFP_QCP_QUEUE_STS_LO_READPTR_mask;
+	else
+		return val & NFP_QCP_QUEUE_STS_HI_WRITEPTR_mask;
+}
+
+/*
+ * Functions to read/write from/to Config BAR
+ * Performs any endian conversion necessary.
+ */
+static inline uint8_t
+nn_cfg_readb(struct nfp_net_hw *hw, int off)
+{
+	return nn_readb(hw->ctrl_bar + off);
+}
+
+static inline void
+nn_cfg_writeb(struct nfp_net_hw *hw, int off, uint8_t val)
+{
+	nn_writeb(val, hw->ctrl_bar + off);
+}
+
+static inline uint32_t
+nn_cfg_readl(struct nfp_net_hw *hw, int off)
+{
+	return rte_le_to_cpu_32(nn_readl(hw->ctrl_bar + off));
+}
+
+static inline void
+nn_cfg_writel(struct nfp_net_hw *hw, int off, uint32_t val)
+{
+	nn_writel(rte_cpu_to_le_32(val), hw->ctrl_bar + off);
+}
+
+static inline uint64_t
+nn_cfg_readq(struct nfp_net_hw *hw, int off)
+{
+	return rte_le_to_cpu_64(nn_readq(hw->ctrl_bar + off));
+}
+
+static inline void
+nn_cfg_writeq(struct nfp_net_hw *hw, int off, uint64_t val)
+{
+	nn_writeq(rte_cpu_to_le_64(val), hw->ctrl_bar + off);
+}
+
+static int
+__nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t update)
+{
+	int cnt;
+	uint32_t new;
+	struct timespec wait;
+
+	PMD_DRV_LOG(DEBUG, "Writing to the configuration queue (%p)...\n",
+		    hw->qcp_cfg);
+
+	if (hw->qcp_cfg == NULL)
+		rte_panic("Bad configuration queue pointer\n");
+
+	nfp_qcp_ptr_add(hw->qcp_cfg, NFP_QCP_WRITE_PTR, 1);
+
+	wait.tv_sec = 0;
+	wait.tv_nsec = 1000000;
+
+	PMD_DRV_LOG(DEBUG, "Polling for update ack...\n");
+
+	/* Poll update field, waiting for NFP to ack the config */
+	for (cnt = 0; ; cnt++) {
+		new = nn_cfg_readl(hw, NFP_NET_CFG_UPDATE);
+		if (new == 0)
+			break;
+		if (new & NFP_NET_CFG_UPDATE_ERR) {
+			PMD_INIT_LOG(ERR, "Reconfig error: 0x%08x\n", new);
+			return -1;
+		}
+		if (cnt >= NFP_NET_POLL_TIMEOUT) {
+			PMD_INIT_LOG(ERR, "Reconfig timeout for 0x%08x after"
+					  " %dms\n", update, cnt);
+			rte_panic("Exiting\n");
+		}
+		nanosleep(&wait, 0); /* wait for 1 ms */
+	}
+	PMD_DRV_LOG(DEBUG, "Ack DONE\n");
+	return 0;
+}
+
+/*
+ * Reconfigure the NIC
+ * @nn:    device to reconfigure
+ * @ctrl:    The value for the ctrl field in the BAR config
+ * @update:  The value for the update field in the BAR config
+ *
+ * Write the update word to the BAR and ping the reconfig queue. Then poll
+ * until the firmware has acknowledged the update by zeroing the update word.
+ */
+static int
+nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, uint32_t update)
+{
+	uint32_t err;
+
+	PMD_DRV_LOG(DEBUG, "nfp_net_reconfig: ctrl=%08x update=%08x\n",
+		    ctrl, update);
+
+	nn_cfg_writel(hw, NFP_NET_CFG_CTRL, ctrl);
+	nn_cfg_writel(hw, NFP_NET_CFG_UPDATE, update);
+
+	rte_wmb();
+
+	err = __nfp_net_reconfig(hw, update);
+
+	if (!err)
+		return 0;
+
+	/*
+	 * Reconfig errors returned here imply situations the caller can
+	 * handle; unrecoverable errors trigger rte_panic inside
+	 * __nfp_net_reconfig
+	 */
+	PMD_INIT_LOG(ERR, "Error nfp_net reconfig for ctrl: %x update: %x\n",
+		     ctrl, update);
+	return -EIO;
+}
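+
+/*
+ * Typical caller usage (sketch): to enable a capability such as RSS,
+ * pass the new ctrl word and name the change in update, e.g.
+ *
+ *	nfp_net_reconfig(hw, hw->ctrl | NFP_NET_CFG_CTRL_RSS,
+ *			 NFP_NET_CFG_UPDATE_RSS | NFP_NET_CFG_UPDATE_GEN);
+ */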
+
+/*
+ * Configure an Ethernet device. This function must be invoked first
+ * before any other function in the Ethernet API. This function can
+ * also be re-invoked when a device is in the stopped state.
+ */
+static int
+nfp_net_configure(struct rte_eth_dev *dev)
+{
+	struct rte_eth_conf *dev_conf;
+	struct rte_eth_rxmode *rxmode;
+	struct rte_eth_txmode *txmode;
+	uint32_t new_ctrl = 0;
+	uint32_t update = 0;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/*
+	 * A DPDK app sends info about how many queues to use and how
+	 * those queues need to be configured. This is used by the
+	 * DPDK core and it makes sure no more queues than those
+	 * advertised by the driver are requested. This function is
+	 * called after that internal process
+	 */
+
+	PMD_INIT_LOG(DEBUG, "Configure\n");
+
+	dev_conf = &dev->data->dev_conf;
+	rxmode = &dev_conf->rxmode;
+	txmode = &dev_conf->txmode;
+
+	/* Checking TX mode */
+	if (txmode->mq_mode) {
+		PMD_INIT_LOG(INFO, "TX mq_mode DCB and VMDq not supported\n");
+		return -EINVAL;
+	}
+
+	/* Checking RX mode */
+	if (rxmode->mq_mode & ETH_MQ_RX_RSS) {
+		if (hw->cap & NFP_NET_CFG_CTRL_RSS) {
+			update = NFP_NET_CFG_UPDATE_RSS;
+			new_ctrl = NFP_NET_CFG_CTRL_RSS;
+		} else {
+			PMD_INIT_LOG(INFO, "RSS not supported\n");
+			return -EINVAL;
+		}
+	}
+
+	if (rxmode->split_hdr_size) {
+		PMD_INIT_LOG(INFO, "rxmode does not support split header\n");
+		return -EINVAL;
+	}
+
+	if (rxmode->hw_ip_checksum) {
+		if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM) {
+			new_ctrl |= NFP_NET_CFG_CTRL_RXCSUM;
+		} else {
+			PMD_INIT_LOG(INFO, "RXCSUM not supported\n");
+			return -EINVAL;
+		}
+	}
+
+	if (rxmode->hw_vlan_filter) {
+		PMD_INIT_LOG(INFO, "VLAN filter not supported\n");
+		return -EINVAL;
+	}
+
+	if (rxmode->hw_vlan_strip) {
+		if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN) {
+			new_ctrl |= NFP_NET_CFG_CTRL_RXVLAN;
+		} else {
+			PMD_INIT_LOG(INFO, "hw vlan strip not supported\n");
+			return -EINVAL;
+		}
+	}
+
+	if (rxmode->hw_vlan_extend) {
+		PMD_INIT_LOG(INFO, "VLAN extended not supported\n");
+		return -EINVAL;
+	}
+
+	/* Supporting VLAN insertion by default */
+	if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)
+		new_ctrl |= NFP_NET_CFG_CTRL_TXVLAN;
+
+	/* Jumbo frames are handled by DPDK in rte_eth_dev_configure */
+
+	if (rxmode->hw_strip_crc) {
+		PMD_INIT_LOG(INFO, "strip CRC not supported\n");
+		return -EINVAL;
+	}
+
+	if (rxmode->enable_scatter) {
+		PMD_INIT_LOG(INFO, "Scatter not supported\n");
+		return -EINVAL;
+	}
+
+	if (!new_ctrl)
+		return 0;
+
+	update |= NFP_NET_CFG_UPDATE_GEN;
+
+	nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
+	if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+		return -EIO;
+
+	hw->ctrl = new_ctrl;
+
+	return 0;
+}
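+
+/*
+ * For reference (illustrative, application side): the callback above
+ * runs from rte_eth_dev_configure(), e.g.
+ *
+ *	struct rte_eth_conf conf = {
+ *		.rxmode = { .mq_mode = ETH_MQ_RX_RSS },
+ *	};
+ *	rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf);
+ */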
+
+static void
+nfp_net_enable_queues(struct rte_eth_dev *dev)
+{
+	struct nfp_net_hw *hw;
+	uint64_t enabled_queues = 0;
+	int i;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* Enabling the required TX queues in the device */
+	for (i = 0; i < dev->data->nb_tx_queues; i++)
+		enabled_queues |= (1 << i);
+
+	nn_cfg_writeq(hw, NFP_NET_CFG_TXRS_ENABLE, enabled_queues);
+
+	enabled_queues = 0;
+
+	/* Enabling the required RX queues in the device */
+	for (i = 0; i < dev->data->nb_rx_queues; i++)
+		enabled_queues |= (1 << i);
+
+	nn_cfg_writeq(hw, NFP_NET_CFG_RXRS_ENABLE, enabled_queues);
+}
+
+static void
+nfp_net_disable_queues(struct rte_eth_dev *dev)
+{
+	struct nfp_net_hw *hw;
+	uint32_t new_ctrl, update = 0;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	nn_cfg_writeq(hw, NFP_NET_CFG_TXRS_ENABLE, 0);
+	nn_cfg_writeq(hw, NFP_NET_CFG_RXRS_ENABLE, 0);
+
+	new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_ENABLE;
+	update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
+		 NFP_NET_CFG_UPDATE_MSIX;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
+		new_ctrl &= ~NFP_NET_CFG_CTRL_RINGCFG;
+
+	/* If the reconfig fails, avoid changing the cached hw state */
+	if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+		return;
+
+	hw->ctrl = new_ctrl;
+}
+
+static void
+nfp_net_params_setup(struct nfp_net_hw *hw)
+{
+	uint32_t *mac_address;
+
+	nn_cfg_writel(hw, NFP_NET_CFG_MTU, hw->mtu);
+	nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, hw->flbufsz);
+
+	/* A MAC address is 6 bytes long; write it as two 32-bit words */
+	mac_address = (uint32_t *)(hw->mac_addr);
+
+	nn_cfg_writel(hw, NFP_NET_CFG_MACADDR,
+		      rte_cpu_to_be_32(*mac_address));
+	nn_cfg_writel(hw, NFP_NET_CFG_MACADDR + 4,
+		      rte_cpu_to_be_32(*(mac_address + 1)));
+}
+
+static void
+nfp_net_cfg_queue_setup(struct nfp_net_hw *hw)
+{
+	hw->qcp_cfg = hw->tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
+}
+
+static int
+nfp_net_start(struct rte_eth_dev *dev)
+{
+	uint32_t new_ctrl, update = 0;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	PMD_INIT_LOG(DEBUG, "Start\n");
+
+	/* Disabling queues just in case... */
+	nfp_net_disable_queues(dev);
+
+	/* Writing configuration parameters in the device */
+	nfp_net_params_setup(hw);
+
+	/* Enabling the required queues in the device */
+	nfp_net_enable_queues(dev);
+
+	/* Enable device */
+	new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_ENABLE;
+	update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
+		 NFP_NET_CFG_UPDATE_MSIX;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
+		new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
+
+	nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
+	if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+		return -EIO;
+
+	hw->ctrl = new_ctrl;
+
+	return 0;
+}
+
+/* Stop device: disable rx and tx functions to allow for reconfiguring. */
+static void
+nfp_net_stop(struct rte_eth_dev *dev)
+{
+	PMD_INIT_LOG(DEBUG, "Stop\n");
+
+	nfp_net_disable_queues(dev);
+}
+
+/* Reset and stop device. The device can not be restarted. */
+static void
+nfp_net_close(struct rte_eth_dev *dev)
+{
+	struct nfp_net_hw *hw;
+
+	PMD_INIT_LOG(DEBUG, "Close\n");
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/*
+	 * We assume that the DPDK application is stopping all the
+	 * threads/queues before calling the device close function.
+	 */
+
+	nfp_net_stop(dev);
+
+	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
+
+	/*
+	 * The ixgbe PMD driver disables the pcie master on the
+	 * device. The i40e does not...
+	 */
+}
+
+/* Initialise and register driver with DPDK Application */
+static struct eth_dev_ops nfp_net_eth_dev_ops = {
+	.dev_configure		= nfp_net_configure,
+	.dev_start		= nfp_net_start,
+	.dev_stop		= nfp_net_stop,
+	.dev_close		= nfp_net_close,
+};
+
+static int
+nfp_net_init(struct rte_eth_dev *eth_dev)
+{
+	struct rte_pci_device *pci_dev;
+	struct nfp_net_hw *hw;
+
+	uint32_t tx_bar_off, rx_bar_off;
+	uint32_t start_q;
+	int stride = 4;
+
+	PMD_INIT_FUNC_TRACE();
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
+
+	eth_dev->dev_ops = &nfp_net_eth_dev_ops;
+
+	/* For secondary processes, the primary has done all the work */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	pci_dev = eth_dev->pci_dev;
+	hw->device_id = pci_dev->id.device_id;
+	hw->vendor_id = pci_dev->id.vendor_id;
+	hw->subsystem_device_id = pci_dev->id.subsystem_device_id;
+	hw->subsystem_vendor_id = pci_dev->id.subsystem_vendor_id;
+
+	PMD_INIT_LOG(DEBUG, "nfp_net: device (%u:%u) %u:%u:%u:%u\n",
+		     pci_dev->id.vendor_id, pci_dev->id.device_id,
+		     pci_dev->addr.domain, pci_dev->addr.bus,
+		     pci_dev->addr.devid, pci_dev->addr.function);
+
+	hw->ctrl_bar = (uint8_t *)pci_dev->mem_resource[0].addr;
+	if (hw->ctrl_bar == NULL) {
+		RTE_LOG(ERR, PMD,
+			"hw->ctrl_bar is NULL. BAR0 not configured\n");
+		return -ENODEV;
+	}
+	hw->max_rx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_RXRINGS);
+	hw->max_tx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_TXRINGS);
+
+	/* Work out where in the BAR the queues start. */
+	switch (pci_dev->id.device_id) {
+	case PCI_DEVICE_ID_NFP6000_VF_NIC:
+		start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
+		tx_bar_off = NFP_PCIE_QUEUE(start_q);
+		start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_RXQ);
+		rx_bar_off = NFP_PCIE_QUEUE(start_q);
+		break;
+	default:
+		RTE_LOG(ERR, PMD, "nfp_net: no device ID matching\n");
+		return -ENODEV;
+	}
+
+	PMD_INIT_LOG(DEBUG, "tx_bar_off: 0x%08x\n", tx_bar_off);
+	PMD_INIT_LOG(DEBUG, "rx_bar_off: 0x%08x\n", rx_bar_off);
+
+	hw->tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off;
+	hw->rx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + rx_bar_off;
+
+	PMD_INIT_LOG(DEBUG, "ctrl_bar: %p, tx_bar: %p, rx_bar: %p\n",
+		     hw->ctrl_bar, hw->tx_bar, hw->rx_bar);
+
+	nfp_net_cfg_queue_setup(hw);
+
+	/* Get some of the read-only fields from the config BAR */
+	hw->ver = nn_cfg_readl(hw, NFP_NET_CFG_VERSION);
+	hw->cap = nn_cfg_readl(hw, NFP_NET_CFG_CAP);
+	hw->max_mtu = nn_cfg_readl(hw, NFP_NET_CFG_MAX_MTU);
+	hw->mtu = hw->max_mtu;
+
+	if (NFD_CFG_MAJOR_VERSION_of(hw->ver) < 2)
+		hw->rx_offset = NFP_NET_RX_OFFSET;
+	else
+		hw->rx_offset = nn_cfg_readl(hw, NFP_NET_CFG_RX_OFFSET_ADDR);
+
+	PMD_INIT_LOG(INFO, "VER: %#x, Maximum supported MTU: %d\n",
+		     hw->ver, hw->max_mtu);
+	PMD_INIT_LOG(INFO, "CAP: %#x, %s%s%s%s%s%s%s%s%s\n", hw->cap,
+		     hw->cap & NFP_NET_CFG_CTRL_PROMISC ? "PROMISC " : "",
+		     hw->cap & NFP_NET_CFG_CTRL_RXCSUM  ? "RXCSUM "  : "",
+		     hw->cap & NFP_NET_CFG_CTRL_TXCSUM  ? "TXCSUM "  : "",
+		     hw->cap & NFP_NET_CFG_CTRL_RXVLAN  ? "RXVLAN "  : "",
+		     hw->cap & NFP_NET_CFG_CTRL_TXVLAN  ? "TXVLAN "  : "",
+		     hw->cap & NFP_NET_CFG_CTRL_SCATTER ? "SCATTER " : "",
+		     hw->cap & NFP_NET_CFG_CTRL_GATHER  ? "GATHER "  : "",
+		     hw->cap & NFP_NET_CFG_CTRL_LSO     ? "TSO "     : "",
+		     hw->cap & NFP_NET_CFG_CTRL_RSS     ? "RSS "     : "");
+
+	pci_dev = eth_dev->pci_dev;
+	hw->ctrl = 0;
+
+	hw->stride_rx = stride;
+	hw->stride_tx = stride;
+
+	PMD_INIT_LOG(INFO, "max_rx_queues: %u, max_tx_queues: %u\n",
+		     hw->max_rx_queues, hw->max_tx_queues);
+
+	/* Allocating memory for mac addr */
+	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", ETHER_ADDR_LEN, 0);
+	if (eth_dev->data->mac_addrs == NULL) {
+		PMD_INIT_LOG(ERR, "Failed to allocate space for MAC address");
+		return -ENOMEM;
+	}
+
+	/* Using random mac addresses for VFs */
+	eth_random_addr(&hw->mac_addr[0]);
+
+	/* Copying mac address to DPDK eth_dev struct */
+	ether_addr_copy((struct ether_addr *)hw->mac_addr,
+			&eth_dev->data->mac_addrs[0]);
+
+	PMD_INIT_LOG(INFO, "port %d VendorID=0x%x DeviceID=0x%x "
+		     "mac=%02x:%02x:%02x:%02x:%02x:%02x",
+		     eth_dev->data->port_id, pci_dev->id.vendor_id,
+		     pci_dev->id.device_id,
+		     hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
+		     hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
+
+	return 0;
+}
+
+static struct rte_pci_id pci_id_nfp_net_map[] = {
+	{
+		.vendor_id = PCI_VENDOR_ID_NETRONOME,
+		.device_id = PCI_DEVICE_ID_NFP6000_PF_NIC,
+		.subsystem_vendor_id = PCI_ANY_ID,
+		.subsystem_device_id = PCI_ANY_ID,
+	},
+	{
+		.vendor_id = PCI_VENDOR_ID_NETRONOME,
+		.device_id = PCI_DEVICE_ID_NFP6000_VF_NIC,
+		.subsystem_vendor_id = PCI_ANY_ID,
+		.subsystem_device_id = PCI_ANY_ID,
+	},
+	{
+		.vendor_id = 0,
+	},
+};
+
+static struct eth_driver rte_nfp_net_pmd = {
+	{
+		.name = "rte_nfp_net_pmd",
+		.id_table = pci_id_nfp_net_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+	},
+	.eth_dev_init = nfp_net_init,
+	.dev_private_size = sizeof(struct nfp_net_adapter),
+};
+
+static int
+nfp_net_pmd_init(const char *name __rte_unused,
+		 const char *params __rte_unused)
+{
+	PMD_INIT_FUNC_TRACE();
+	PMD_INIT_LOG(INFO, "librte_pmd_nfp_net version %s\n",
+		     NFP_NET_PMD_VERSION);
+
+	rte_eth_driver_register(&rte_nfp_net_pmd);
+	return 0;
+}
+
+static struct rte_driver rte_nfp_net_driver = {
+	.type = PMD_PDEV,
+	.init = nfp_net_pmd_init,
+};
+
+PMD_REGISTER_DRIVER(rte_nfp_net_driver);
+
+/*
+ * Local variables:
+ * c-file-style: "Linux"
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/drivers/net/nfp/nfp_net_ctrl.h b/drivers/net/nfp/nfp_net_ctrl.h
new file mode 100644
index 0000000..fce8251
--- /dev/null
+++ b/drivers/net/nfp/nfp_net_ctrl.h
@@ -0,0 +1,324 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *  this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ *  contributors may be used to endorse or promote products derived from this
+ *  software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * vim:shiftwidth=8:noexpandtab
+ *
+ * Netronome network device driver: Control BAR layout
+ */
+#ifndef _NFP_NET_CTRL_H_
+#define _NFP_NET_CTRL_H_
+
+/*
+ * Configuration BAR size.
+ *
+ * The configuration BAR is 8K in size, but on the NFP6000, due to
+ * THB-350, 32k needs to be reserved.
+ */
+#ifdef __NFP_IS_6000
+#define NFP_NET_CFG_BAR_SZ              (32 * 1024)
+#else
+#define NFP_NET_CFG_BAR_SZ              (8 * 1024)
+#endif
+
+/* Offset in Freelist buffer where packet starts on RX */
+#define NFP_NET_RX_OFFSET               32
+
+/* Hash type pre-pended when a RSS hash was computed */
+#define NFP_NET_RSS_NONE                0
+#define NFP_NET_RSS_IPV4                1
+#define NFP_NET_RSS_IPV6                2
+#define NFP_NET_RSS_IPV6_EX             3
+#define NFP_NET_RSS_IPV4_TCP            4
+#define NFP_NET_RSS_IPV6_TCP            5
+#define NFP_NET_RSS_IPV6_EX_TCP         6
+#define NFP_NET_RSS_IPV4_UDP            7
+#define NFP_NET_RSS_IPV6_UDP            8
+#define NFP_NET_RSS_IPV6_EX_UDP         9
+
+/*
+ * @NFP_NET_TXR_MAX:         Maximum number of TX rings
+ * @NFP_NET_TXR_MASK:        Mask for TX rings
+ * @NFP_NET_RXR_MAX:         Maximum number of RX rings
+ * @NFP_NET_RXR_MASK:        Mask for RX rings
+ */
+#define NFP_NET_TXR_MAX                 64
+#define NFP_NET_TXR_MASK                (NFP_NET_TXR_MAX - 1)
+#define NFP_NET_RXR_MAX                 64
+#define NFP_NET_RXR_MASK                (NFP_NET_RXR_MAX - 1)
+
+/*
+ * Read/Write config words (0x0000 - 0x002c)
+ * @NFP_NET_CFG_CTRL:        Global control
+ * @NFP_NET_CFG_UPDATE:      Indicate which fields are updated
+ * @NFP_NET_CFG_TXRS_ENABLE: Bitmask of enabled TX rings
+ * @NFP_NET_CFG_RXRS_ENABLE: Bitmask of enabled RX rings
+ * @NFP_NET_CFG_MTU:         Set MTU size
+ * @NFP_NET_CFG_FLBUFSZ:     Set freelist buffer size (must be larger than MTU)
+ * @NFP_NET_CFG_EXN:         MSI-X table entry for exceptions
+ * @NFP_NET_CFG_LSC:         MSI-X table entry for link state changes
+ * @NFP_NET_CFG_MACADDR:     MAC address
+ *
+ * TODO:
+ * - define Error details in UPDATE
+ */
+#define NFP_NET_CFG_CTRL                0x0000
+#define   NFP_NET_CFG_CTRL_ENABLE         (0x1 <<  0) /* Global enable */
+#define   NFP_NET_CFG_CTRL_PROMISC        (0x1 <<  1) /* Enable Promisc mode */
+#define   NFP_NET_CFG_CTRL_L2BC           (0x1 <<  2) /* Allow L2 Broadcast */
+#define   NFP_NET_CFG_CTRL_L2MC           (0x1 <<  3) /* Allow L2 Multicast */
+#define   NFP_NET_CFG_CTRL_RXCSUM         (0x1 <<  4) /* Enable RX Checksum */
+#define   NFP_NET_CFG_CTRL_TXCSUM         (0x1 <<  5) /* Enable TX Checksum */
+#define   NFP_NET_CFG_CTRL_RXVLAN         (0x1 <<  6) /* Enable VLAN strip */
+#define   NFP_NET_CFG_CTRL_TXVLAN         (0x1 <<  7) /* Enable VLAN insert */
+#define   NFP_NET_CFG_CTRL_SCATTER        (0x1 <<  8) /* Scatter DMA */
+#define   NFP_NET_CFG_CTRL_GATHER         (0x1 <<  9) /* Gather DMA */
+#define   NFP_NET_CFG_CTRL_LSO            (0x1 << 10) /* LSO/TSO */
+#define   NFP_NET_CFG_CTRL_RINGCFG        (0x1 << 16) /* Ring runtime changes */
+#define   NFP_NET_CFG_CTRL_RSS            (0x1 << 17) /* RSS */
+#define   NFP_NET_CFG_CTRL_IRQMOD         (0x1 << 18) /* Interrupt moderation */
+#define   NFP_NET_CFG_CTRL_RINGPRIO       (0x1 << 19) /* Ring priorities */
+#define   NFP_NET_CFG_CTRL_MSIXAUTO       (0x1 << 20) /* MSI-X auto-masking */
+#define   NFP_NET_CFG_CTRL_TXRWB          (0x1 << 21) /* Write-back of TX ring*/
+#define   NFP_NET_CFG_CTRL_L2SWITCH       (0x1 << 22) /* L2 Switch */
+#define   NFP_NET_CFG_CTRL_L2SWITCH_LOCAL (0x1 << 23) /* Switch to local */
+#define   NFP_NET_CFG_CTRL_VXLAN          (0x1 << 24) /* Enable VXLAN */
+#define   NFP_NET_CFG_CTRL_NVGRE          (0x1 << 25) /* Enable NVGRE */
+#define NFP_NET_CFG_UPDATE              0x0004
+#define   NFP_NET_CFG_UPDATE_GEN          (0x1 <<  0) /* General update */
+#define   NFP_NET_CFG_UPDATE_RING         (0x1 <<  1) /* Ring config change */
+#define   NFP_NET_CFG_UPDATE_RSS          (0x1 <<  2) /* RSS config change */
+#define   NFP_NET_CFG_UPDATE_TXRPRIO      (0x1 <<  3) /* TX Ring prio change */
+#define   NFP_NET_CFG_UPDATE_RXRPRIO      (0x1 <<  4) /* RX Ring prio change */
+#define   NFP_NET_CFG_UPDATE_MSIX         (0x1 <<  5) /* MSI-X change */
+#define   NFP_NET_CFG_UPDATE_L2SWITCH     (0x1 <<  6) /* Switch changes */
+#define   NFP_NET_CFG_UPDATE_RESET        (0x1 <<  7) /* Update due to FLR */
+#define   NFP_NET_CFG_UPDATE_IRQMOD       (0x1 <<  8) /* IRQ mod change */
+#define   NFP_NET_CFG_UPDATE_VXLAN        (0x1 <<  9) /* VXLAN port change */
+#define   NFP_NET_CFG_UPDATE_ERR          (0x1 << 31) /* An error occurred */
+#define NFP_NET_CFG_TXRS_ENABLE         0x0008
+#define NFP_NET_CFG_RXRS_ENABLE         0x0010
+#define NFP_NET_CFG_MTU                 0x0018
+#define NFP_NET_CFG_FLBUFSZ             0x001c
+#define NFP_NET_CFG_EXN                 0x001f
+#define NFP_NET_CFG_LSC                 0x0020
+#define NFP_NET_CFG_MACADDR             0x0024
+
+/*
+ * Read-only words (0x0030 - 0x0050):
+ * @NFP_NET_CFG_VERSION:     Firmware version number
+ * @NFP_NET_CFG_STS:         Status
+ * @NFP_NET_CFG_CAP:         Capabilities (same bits as @NFP_NET_CFG_CTRL)
+ * @NFP_NET_CFG_MAX_TXRINGS: Maximum number of TX rings
+ * @NFP_NET_CFG_MAX_RXRINGS: Maximum number of RX rings
+ * @NFP_NET_CFG_MAX_MTU:     Maximum supported MTU
+ * @NFP_NET_CFG_START_TXQ:   Start Queue Control Queue to use for TX (PF only)
+ * @NFP_NET_CFG_START_RXQ:   Start Queue Control Queue to use for RX (PF only)
+ *
+ * TODO:
+ * - define more STS bits
+ */
+#define NFP_NET_CFG_VERSION             0x0030
+#define   NFP_NET_CFG_VERSION_RESERVED_MASK	(0xff << 24)
+#define   NFP_NET_CFG_VERSION_CLASS_MASK  (0xff << 16)
+#define   NFP_NET_CFG_VERSION_CLASS(x)    (((x) & 0xff) << 16)
+#define   NFP_NET_CFG_VERSION_CLASS_GENERIC	0
+#define   NFP_NET_CFG_VERSION_MAJOR_MASK  (0xff <<  8)
+#define   NFP_NET_CFG_VERSION_MAJOR(x)    (((x) & 0xff) <<  8)
+#define   NFP_NET_CFG_VERSION_MINOR_MASK  (0xff <<  0)
+#define   NFP_NET_CFG_VERSION_MINOR(x)    (((x) & 0xff) <<  0)
+#define NFP_NET_CFG_STS                 0x0034
+#define   NFP_NET_CFG_STS_LINK            (0x1 << 0) /* Link up or down */
+#define NFP_NET_CFG_CAP                 0x0038
+#define NFP_NET_CFG_MAX_TXRINGS         0x003c
+#define NFP_NET_CFG_MAX_RXRINGS         0x0040
+#define NFP_NET_CFG_MAX_MTU             0x0044
+/* The next two words are used by VFs for solving the THB-350 issue */
+#define NFP_NET_CFG_START_TXQ           0x0048
+#define NFP_NET_CFG_START_RXQ           0x004c
+
+/*
+ * NFP-3200 workaround (0x0050 - 0x0058)
+ * @NFP_NET_CFG_SPARE_ADDR:  DMA address for ME code to use (e.g. YDS-155 fix)
+ */
+#define NFP_NET_CFG_SPARE_ADDR          0x0050
+/**
+ * NFP6000/NFP4000 - Prepend configuration
+ */
+#define NFP_NET_CFG_RX_OFFSET		0x0050
+#define NFP_NET_CFG_RX_OFFSET_DYNAMIC		0	/* Prepend mode */
+
+/**
+ * Reuse spare address to contain the offset from the start of
+ * the host buffer where the first byte of the received frame
+ * will land.  Any metadata will come prior to that offset.  If the
+ * value in this field is 0, it means that the metadata will
+ * always land starting at the first byte of the host buffer and
+ * packet data will immediately follow the metadata.  As always,
+ * the RX descriptor indicates the presence or absence of metadata
+ * along with the length thereof.
+ */
+#define NFP_NET_CFG_RX_OFFSET_ADDR      0x0050
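+
+/*
+ * Illustration: with the static NFP_NET_RX_OFFSET (32), packet data
+ * starts 32 bytes into the host buffer and any metadata sits in front
+ * of it; with NFP_NET_CFG_RX_OFFSET_DYNAMIC (0), metadata starts at
+ * byte 0 and packet data immediately follows it.
+ */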
+
+#define NFP_NET_CFG_VXLAN_PORT          0x0060
+#define NFP_NET_CFG_VXLAN_SZ            0x0008
+
+/* Offload definitions */
+#define NFP_NET_N_VXLAN_PORTS  (NFP_NET_CFG_VXLAN_SZ / sizeof(uint16_t))
+
+/**
+ * 64B reserved for future use (0x0080 - 0x00c0)
+ */
+#define NFP_NET_CFG_RESERVED            0x0080
+#define NFP_NET_CFG_RESERVED_SZ         0x0040
+
+/*
+ * RSS configuration (0x0100 - 0x01ac):
+ * Used only when NFP_NET_CFG_CTRL_RSS is enabled
+ * @NFP_NET_CFG_RSS_CFG:     RSS configuration word
+ * @NFP_NET_CFG_RSS_KEY:     RSS "secret" key
+ * @NFP_NET_CFG_RSS_ITBL:    RSS indirection table
+ */
+#define NFP_NET_CFG_RSS_BASE            0x0100
+#define NFP_NET_CFG_RSS_CTRL            NFP_NET_CFG_RSS_BASE
+#define   NFP_NET_CFG_RSS_MASK            (0x7f)
+#define   NFP_NET_CFG_RSS_MASK_of(_x)     ((_x) & 0x7f)
+#define   NFP_NET_CFG_RSS_IPV4            (1 <<  8) /* RSS for IPv4 */
+#define   NFP_NET_CFG_RSS_IPV6            (1 <<  9) /* RSS for IPv6 */
+#define   NFP_NET_CFG_RSS_IPV4_TCP        (1 << 10) /* RSS for IPv4/TCP */
+#define   NFP_NET_CFG_RSS_IPV4_UDP        (1 << 11) /* RSS for IPv4/UDP */
+#define   NFP_NET_CFG_RSS_IPV6_TCP        (1 << 12) /* RSS for IPv6/TCP */
+#define   NFP_NET_CFG_RSS_IPV6_UDP        (1 << 13) /* RSS for IPv6/UDP */
+#define   NFP_NET_CFG_RSS_TOEPLITZ        (1 << 24) /* Use Toeplitz hash */
+#define NFP_NET_CFG_RSS_KEY             (NFP_NET_CFG_RSS_BASE + 0x4)
+#define NFP_NET_CFG_RSS_KEY_SZ          0x28
+#define NFP_NET_CFG_RSS_ITBL            (NFP_NET_CFG_RSS_BASE + 0x4 + \
+					 NFP_NET_CFG_RSS_KEY_SZ)
+#define NFP_NET_CFG_RSS_ITBL_SZ         0x80
+
+/*
+ * TX ring configuration (0x200 - 0x800)
+ * @NFP_NET_CFG_TXR_BASE:    Base offset for TX ring configuration
+ * @NFP_NET_CFG_TXR_ADDR:    Per TX ring DMA address (8B entries)
+ * @NFP_NET_CFG_TXR_WB_ADDR: Per TX ring write back DMA address (8B entries)
+ * @NFP_NET_CFG_TXR_SZ:      Per TX ring size (1B entries)
+ * @NFP_NET_CFG_TXR_VEC:     Per TX ring MSI-X table entry (1B entries)
+ * @NFP_NET_CFG_TXR_PRIO:    Per TX ring priority (1B entries)
+ * @NFP_NET_CFG_TXR_IRQ_MOD: Per TX ring interrupt moderation (4B entries)
+ */
+#define NFP_NET_CFG_TXR_BASE            0x0200
+#define NFP_NET_CFG_TXR_ADDR(_x)        (NFP_NET_CFG_TXR_BASE + ((_x) * 0x8))
+#define NFP_NET_CFG_TXR_WB_ADDR(_x)     (NFP_NET_CFG_TXR_BASE + 0x200 + \
+					 ((_x) * 0x8))
+#define NFP_NET_CFG_TXR_SZ(_x)          (NFP_NET_CFG_TXR_BASE + 0x400 + (_x))
+#define NFP_NET_CFG_TXR_VEC(_x)         (NFP_NET_CFG_TXR_BASE + 0x440 + (_x))
+#define NFP_NET_CFG_TXR_PRIO(_x)        (NFP_NET_CFG_TXR_BASE + 0x480 + (_x))
+#define NFP_NET_CFG_TXR_IRQ_MOD(_x)     (NFP_NET_CFG_TXR_BASE + 0x500 + \
+					 ((_x) * 0x4))
+
+/*
+ * RX ring configuration (0x0800 - 0x0c00)
+ * @NFP_NET_CFG_RXR_BASE:    Base offset for RX ring configuration
+ * @NFP_NET_CFG_RXR_ADDR:    Per RX ring DMA address (8B entries)
+ * @NFP_NET_CFG_RXR_SZ:      Per RX ring size (1B entries)
+ * @NFP_NET_CFG_RXR_VEC:     Per RX ring MSI-X table entry (1B entries)
+ * @NFP_NET_CFG_RXR_PRIO:    Per RX ring priority (1B entries)
+ * @NFP_NET_CFG_RXR_IRQ_MOD: Per RX ring interrupt moderation (4B entries)
+ */
+#define NFP_NET_CFG_RXR_BASE            0x0800
+#define NFP_NET_CFG_RXR_ADDR(_x)        (NFP_NET_CFG_RXR_BASE + ((_x) * 0x8))
+#define NFP_NET_CFG_RXR_SZ(_x)          (NFP_NET_CFG_RXR_BASE + 0x200 + (_x))
+#define NFP_NET_CFG_RXR_VEC(_x)         (NFP_NET_CFG_RXR_BASE + 0x240 + (_x))
+#define NFP_NET_CFG_RXR_PRIO(_x)        (NFP_NET_CFG_RXR_BASE + 0x280 + (_x))
+#define NFP_NET_CFG_RXR_IRQ_MOD(_x)     (NFP_NET_CFG_RXR_BASE + 0x300 + \
+					 ((_x) * 0x4))
+
+/*
+ * Interrupt Control/Cause registers (0x0c00 - 0x0d00)
+ * These registers are only used when MSI-X auto-masking is not
+ * enabled (@NFP_NET_CFG_CTRL_MSIXAUTO not set).  The array is indexed
+ * by MSI-X entry; entries are 1B in size.  If an entry is zero, the
+ * corresponding interrupt is enabled.  If the FW generates an interrupt,
+ * it writes a cause into the corresponding field.  This also masks
+ * the MSI-X entry and the host driver must clear the register to
+ * re-enable the interrupt.
+ */
+#define NFP_NET_CFG_ICR_BASE            0x0c00
+#define NFP_NET_CFG_ICR(_x)             (NFP_NET_CFG_ICR_BASE + (_x))
+#define   NFP_NET_CFG_ICR_UNMASKED      0x0
+#define   NFP_NET_CFG_ICR_RXTX          0x1
+#define   NFP_NET_CFG_ICR_LSC           0x2
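+
+/*
+ * Usage sketch: with auto-masking disabled, the host re-enables an
+ * MSI-X entry after servicing it by clearing its cause field, e.g.
+ *
+ *	nn_cfg_writeb(hw, NFP_NET_CFG_ICR(entry), NFP_NET_CFG_ICR_UNMASKED);
+ */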
+
+/*
+ * General device stats (0x0d00 - 0x0d90)
+ * all counters are 64bit.
+ */
+#define NFP_NET_CFG_STATS_BASE          0x0d00
+#define NFP_NET_CFG_STATS_RX_DISCARDS   (NFP_NET_CFG_STATS_BASE + 0x00)
+#define NFP_NET_CFG_STATS_RX_ERRORS     (NFP_NET_CFG_STATS_BASE + 0x08)
+#define NFP_NET_CFG_STATS_RX_OCTETS     (NFP_NET_CFG_STATS_BASE + 0x10)
+#define NFP_NET_CFG_STATS_RX_UC_OCTETS  (NFP_NET_CFG_STATS_BASE + 0x18)
+#define NFP_NET_CFG_STATS_RX_MC_OCTETS  (NFP_NET_CFG_STATS_BASE + 0x20)
+#define NFP_NET_CFG_STATS_RX_BC_OCTETS  (NFP_NET_CFG_STATS_BASE + 0x28)
+#define NFP_NET_CFG_STATS_RX_FRAMES     (NFP_NET_CFG_STATS_BASE + 0x30)
+#define NFP_NET_CFG_STATS_RX_MC_FRAMES  (NFP_NET_CFG_STATS_BASE + 0x38)
+#define NFP_NET_CFG_STATS_RX_BC_FRAMES  (NFP_NET_CFG_STATS_BASE + 0x40)
+
+#define NFP_NET_CFG_STATS_TX_DISCARDS   (NFP_NET_CFG_STATS_BASE + 0x48)
+#define NFP_NET_CFG_STATS_TX_ERRORS     (NFP_NET_CFG_STATS_BASE + 0x50)
+#define NFP_NET_CFG_STATS_TX_OCTETS     (NFP_NET_CFG_STATS_BASE + 0x58)
+#define NFP_NET_CFG_STATS_TX_UC_OCTETS  (NFP_NET_CFG_STATS_BASE + 0x60)
+#define NFP_NET_CFG_STATS_TX_MC_OCTETS  (NFP_NET_CFG_STATS_BASE + 0x68)
+#define NFP_NET_CFG_STATS_TX_BC_OCTETS  (NFP_NET_CFG_STATS_BASE + 0x70)
+#define NFP_NET_CFG_STATS_TX_FRAMES     (NFP_NET_CFG_STATS_BASE + 0x78)
+#define NFP_NET_CFG_STATS_TX_MC_FRAMES  (NFP_NET_CFG_STATS_BASE + 0x80)
+#define NFP_NET_CFG_STATS_TX_BC_FRAMES  (NFP_NET_CFG_STATS_BASE + 0x88)
+
+/*
+ * Per ring stats (0x1000 - 0x1800)
+ * options, 64bit per entry
+ * @NFP_NET_CFG_TXR_STATS:   TX ring statistics (Packet and Byte count)
+ * @NFP_NET_CFG_RXR_STATS:   RX ring statistics (Packet and Byte count)
+ */
+#define NFP_NET_CFG_TXR_STATS_BASE      0x1000
+#define NFP_NET_CFG_TXR_STATS(_x)       (NFP_NET_CFG_TXR_STATS_BASE + \
+					 ((_x) * 0x10))
+#define NFP_NET_CFG_RXR_STATS_BASE      0x1400
+#define NFP_NET_CFG_RXR_STATS(_x)       (NFP_NET_CFG_RXR_STATS_BASE + \
+					 ((_x) * 0x10))
+
+#endif /* _NFP_NET_CTRL_H_ */
+/*
+ * Local variables:
+ * c-file-style: "Linux"
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/drivers/net/nfp/nfp_net_logs.h b/drivers/net/nfp/nfp_net_logs.h
new file mode 100644
index 0000000..0b966e4
--- /dev/null
+++ b/drivers/net/nfp/nfp_net_logs.h
@@ -0,0 +1,75 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *  this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ *  contributors may be used to endorse or promote products derived from this
+ *  software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _NFP_NET_LOGS_H_
+#define _NFP_NET_LOGS_H_
+
+#include <rte_log.h>
+
+#define RTE_LIBRTE_NFP_NET_DEBUG_INIT 1
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_INIT
+#define PMD_INIT_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
+#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
+#else
+#define PMD_INIT_LOG(level, fmt, args...) do { } while (0)
+#define PMD_INIT_FUNC_TRACE() do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_RX
+#define PMD_RX_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s() rx: " fmt, __func__, ## args)
+#else
+#define PMD_RX_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_TX
+#define PMD_TX_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s() tx: " fmt, __func__, ## args)
+#else
+#define PMD_TX_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_DRIVER
+#define PMD_DRV_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
+#else
+#define PMD_DRV_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_INIT
+#define ASSERT(x) if (!(x)) rte_panic("NFP_NET: " #x "\n")
+#else
+#define ASSERT(x) do { } while (0)
+#endif
+
+#endif /* _NFP_NET_LOGS_H_ */
diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
new file mode 100644
index 0000000..a7f9386
--- /dev/null
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -0,0 +1,453 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *  this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ *  contributors may be used to endorse or promote products derived from this
+ *  software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * vim:shiftwidth=8:noexpandtab
+ *
+ * @file dpdk/pmd/nfp_net_pmd.h
+ *
+ * Netronome NFP_NET PMD driver
+ */
+
+#ifndef _NFP_NET_PMD_H_
+#define _NFP_NET_PMD_H_
+
+#define NFP_NET_PMD_VERSION "0.1"
+#define PCI_VENDOR_ID_NETRONOME         0x19ee
+#define PCI_DEVICE_ID_NFP6000_PF_NIC    0x6000
+#define PCI_DEVICE_ID_NFP6000_VF_NIC    0x6003
+
+/* Forward declaration */
+struct nfp_net_adapter;
+
+/*
+ * The maximum number of descriptors is limited by design as
+ * DPDK uses uint16_t variables for these values
+ */
+#define NFP_NET_MAX_TX_DESC (32 * 1024)
+#define NFP_NET_MIN_TX_DESC 64
+
+#define NFP_NET_MAX_RX_DESC (32 * 1024)
+#define NFP_NET_MIN_RX_DESC 64
+
+/* Bar allocation */
+#define NFP_NET_CRTL_BAR        0
+#define NFP_NET_TX_BAR          2
+#define NFP_NET_RX_BAR          2
+
+/* Macros for accessing the Queue Controller Peripheral 'CSRs' */
+#define NFP_QCP_QUEUE_OFF(_x)                 ((_x) * 0x800)
+#define NFP_QCP_QUEUE_ADD_RPTR                  0x0000
+#define NFP_QCP_QUEUE_ADD_WPTR                  0x0004
+#define NFP_QCP_QUEUE_STS_LO                    0x0008
+#define NFP_QCP_QUEUE_STS_LO_READPTR_mask     (0x3ffff)
+#define NFP_QCP_QUEUE_STS_HI                    0x000c
+#define NFP_QCP_QUEUE_STS_HI_WRITEPTR_mask    (0x3ffff)
+
+/* Interrupt definitions */
+#define NFP_NET_IRQ_LSC_IDX             0
+
+#define RTE_MBUF_DATA_DMA_ADDR(mb) \
+	((uint64_t)((mb)->buf_physaddr + (mb)->data_off))
+
+/* Default values for RX/TX configuration */
+#define DEFAULT_RX_FREE_THRESH  32
+#define DEFAULT_RX_PTHRESH      8
+#define DEFAULT_RX_HTHRESH      8
+#define DEFAULT_RX_WTHRESH      0
+
+#define DEFAULT_TX_RS_THRESH	32
+#define DEFAULT_TX_FREE_THRESH  32
+#define DEFAULT_TX_PTHRESH      32
+#define DEFAULT_TX_HTHRESH      0
+#define DEFAULT_TX_WTHRESH      0
+#define DEFAULT_TX_RSBIT_THRESH 32
+
+/* Alignment for dma zones */
+#define NFP_MEMZONE_ALIGN	128
+
+/*
+ * This is used by the reconfig protocol. It sets the maximum time waiting in
+ * milliseconds before a reconfig timeout happens.
+ */
+#define NFP_NET_POLL_TIMEOUT    5000
+
+#define NFP_QCP_QUEUE_ADDR_SZ   (0x800)
+
+#define NFP_NET_LINK_DOWN_CHECK_TIMEOUT 4000 /* ms */
+#define NFP_NET_LINK_UP_CHECK_TIMEOUT   1000 /* ms */
+
+/* Version number helper defines */
+#define NFD_CFG_CLASS_VER_msk       0xff
+#define NFD_CFG_CLASS_VER_shf       24
+#define NFD_CFG_CLASS_VER(x)        (((x) & 0xff) << 24)
+#define NFD_CFG_CLASS_VER_of(x)     (((x) >> 24) & 0xff)
+#define NFD_CFG_CLASS_TYPE_msk      0xff
+#define NFD_CFG_CLASS_TYPE_shf      16
+#define NFD_CFG_CLASS_TYPE(x)       (((x) & 0xff) << 16)
+#define NFD_CFG_CLASS_TYPE_of(x)    (((x) >> 16) & 0xff)
+#define NFD_CFG_MAJOR_VERSION_msk   0xff
+#define NFD_CFG_MAJOR_VERSION_shf   8
+#define NFD_CFG_MAJOR_VERSION(x)    (((x) & 0xff) << 8)
+#define NFD_CFG_MAJOR_VERSION_of(x) (((x) >> 8) & 0xff)
+#define NFD_CFG_MINOR_VERSION_msk   0xff
+#define NFD_CFG_MINOR_VERSION_shf   0
+#define NFD_CFG_MINOR_VERSION(x)    (((x) & 0xff) << 0)
+#define NFD_CFG_MINOR_VERSION_of(x) (((x) >> 0) & 0xff)
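+
+/*
+ * Worked example: for hw->ver == 0x00000203, NFD_CFG_MAJOR_VERSION_of()
+ * yields 2 and NFD_CFG_MINOR_VERSION_of() yields 3.
+ */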
+
+#include <linux/types.h>
+
+static inline uint8_t nn_readb(volatile const void *addr)
+{
+	return *((volatile const uint8_t *)(addr));
+}
+
+static inline void nn_writeb(uint8_t val, volatile void *addr)
+{
+	*((volatile uint8_t *)(addr)) = val;
+}
+
+static inline uint32_t nn_readl(volatile const void *addr)
+{
+	return *((volatile const uint32_t *)(addr));
+}
+
+static inline void nn_writel(uint32_t val, volatile void *addr)
+{
+	*((volatile uint32_t *)(addr)) = val;
+}
+
+static inline uint64_t nn_readq(volatile void *addr)
+{
+	const volatile uint32_t *p = addr;
+	uint32_t low, high;
+
+	high = nn_readl((volatile const void *)(p + 1));
+	low = nn_readl((volatile const void *)p);
+
+	return low + ((uint64_t)high << 32);
+}
+
+static inline void nn_writeq(uint64_t val, volatile void *addr)
+{
+	nn_writel(val >> 32, (volatile char *)addr + 4);
+	nn_writel(val, addr);
+}
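+
+/*
+ * Note: the 64-bit accessors above are composed of two 32-bit accesses
+ * and are therefore not atomic; nn_readq() reads the high word before
+ * the low word, and nn_writeq() writes the high word before the low.
+ */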
+
+/* TX descriptor format */
+#define PCIE_DESC_TX_EOP                (1 << 7)
+#define PCIE_DESC_TX_OFFSET_MASK        (0x7f)
+
+/* Flags in the host TX descriptor */
+#define PCIE_DESC_TX_CSUM               (1 << 7)
+#define PCIE_DESC_TX_IP4_CSUM           (1 << 6)
+#define PCIE_DESC_TX_TCP_CSUM           (1 << 5)
+#define PCIE_DESC_TX_UDP_CSUM           (1 << 4)
+#define PCIE_DESC_TX_VLAN               (1 << 3)
+#define PCIE_DESC_TX_LSO                (1 << 2)
+#define PCIE_DESC_TX_ENCAP_NONE         (0)
+#define PCIE_DESC_TX_ENCAP_VXLAN        (1 << 1)
+#define PCIE_DESC_TX_ENCAP_GRE          (1 << 0)
+
+struct nfp_net_tx_desc {
+	union {
+		struct {
+			uint8_t dma_addr_hi;   /* High bits of host buf address */
+			__le16 dma_len;     /* Length to DMA for this desc */
+			uint8_t offset_eop;    /* Offset in buf where pkt starts +
+					     * highest bit is eop flag.
+					     */
+			__le32 dma_addr_lo; /* Low 32bit of host buf addr */
+
+			__le16 lso;         /* MSS to be used for LSO */
+			uint8_t l4_offset;     /* LSO, where the L4 data starts */
+			uint8_t flags;         /* TX Flags, see @PCIE_DESC_TX_* */
+
+			__le16 vlan;        /* VLAN tag to add if indicated */
+			__le16 data_len;    /* Length of frame + meta data */
+		} __attribute__((__packed__));
+		__le32 vals[4];
+	};
+};
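+
+/*
+ * For a packet carried in a single buffer (sketch), offset_eop has
+ * PCIE_DESC_TX_EOP set in its highest bit and the low bits
+ * (PCIE_DESC_TX_OFFSET_MASK) give the offset within the buffer where
+ * the packet data starts.
+ */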
+
+struct nfp_net_txq {
+	struct nfp_net_hw *hw; /* Backpointer to nfp_net structure */
+
+	/*
+	 * Queue information: @qidx is the queue index from Linux's
+	 * perspective.  @tx_qcidx is the index of the Queue
+	 * Controller Peripheral queue relative to the TX queue BAR.
+	 * @tx_count is the size of the queue in number of
+	 * descriptors. @qcp_q is a pointer to the base of the queue
+	 * structure on the NFP
+	 */
+	uint8_t *qcp_q;
+
+	/*
+	 * Read and Write pointers.  @wr_p and @rd_p are host side pointers,
+	 * they are free running and have little relation to the QCP pointers.
+	 * @qcp_rd_p is a local copy of the queue controller peripheral read pointer
+	 */
+
+	uint32_t wr_p;
+	uint32_t rd_p;
+	uint32_t qcp_rd_p;
+
+	uint32_t tx_count;
+
+	uint32_t tx_free_thresh;
+	uint32_t tail;
+
+	/*
+	 * For each descriptor keep a reference to the mbuf and
+	 * DMA address used until completion is signalled.
+	 */
+	struct {
+		struct rte_mbuf *mbuf;
+	} *txbufs;
+
+	/*
+	 * Information about the host side queue location. @txds is
+	 * the virtual address for the queue, @dma is the DMA address
+	 * of the queue and @size is the size in bytes for the queue
+	 * (needed for free)
+	 */
+	struct nfp_net_tx_desc *txds;
+
+	/*
+	 * At this point 56 bytes have been used for all the fields in the
+	 * TX critical path. We have room for 8 more bytes while still
+	 * keeping everything in a single cache line. We are not using
+	 * the threshold values below nor the txq_flags, but if needed
+	 * we can add the most used ones in the remaining bytes.
+	 */
+	uint32_t tx_rs_thresh; /* not used for now. Future? */
+	uint32_t tx_pthresh;   /* not used for now. Future? */
+	uint32_t tx_hthresh;   /* not used for now. Future? */
+	uint32_t tx_wthresh;   /* not used for now. Future? */
+	uint32_t txq_flags;    /* not used for now. Future? */
+	uint8_t  port_id;
+	int qidx;
+	int tx_qcidx;
+	__le64 dma;
+} __attribute__ ((__aligned__(64)));
+
+/* RX and freelist descriptor format */
+#define PCIE_DESC_RX_DD                 (1 << 7)
+#define PCIE_DESC_RX_META_LEN_MASK      (0x7f)
+
+/* Flags in the RX descriptor */
+#define PCIE_DESC_RX_RSS                (1 << 15)
+#define PCIE_DESC_RX_I_IP4_CSUM         (1 << 14)
+#define PCIE_DESC_RX_I_IP4_CSUM_OK      (1 << 13)
+#define PCIE_DESC_RX_I_TCP_CSUM         (1 << 12)
+#define PCIE_DESC_RX_I_TCP_CSUM_OK      (1 << 11)
+#define PCIE_DESC_RX_I_UDP_CSUM         (1 << 10)
+#define PCIE_DESC_RX_I_UDP_CSUM_OK      (1 <<  9)
+#define PCIE_DESC_RX_INGRESS_PORT       (1 <<  8)
+#define PCIE_DESC_RX_EOP                (1 <<  7)
+#define PCIE_DESC_RX_IP4_CSUM           (1 <<  6)
+#define PCIE_DESC_RX_IP4_CSUM_OK        (1 <<  5)
+#define PCIE_DESC_RX_TCP_CSUM           (1 <<  4)
+#define PCIE_DESC_RX_TCP_CSUM_OK        (1 <<  3)
+#define PCIE_DESC_RX_UDP_CSUM           (1 <<  2)
+#define PCIE_DESC_RX_UDP_CSUM_OK        (1 <<  1)
+#define PCIE_DESC_RX_VLAN               (1 <<  0)
+
+struct nfp_net_rx_desc {
+	union {
+		/* Freelist descriptor */
+		struct {
+			uint8_t dma_addr_hi;
+			__le16 spare;
+			uint8_t dd;
+
+			__le32 dma_addr_lo;
+		} __attribute__((__packed__)) fld;
+
+		/* RX descriptor */
+		struct {
+			__le16 data_len;
+			uint8_t reserved;
+			uint8_t meta_len_dd;
+
+			__le16 flags;
+			__le16 vlan;
+		} __attribute__((__packed__)) rxd;
+
+		__le32 vals[2];
+	};
+};
+
+struct nfp_net_rx_buff {
+	struct rte_mbuf *mbuf;
+};
+
+struct nfp_net_rxq {
+	struct nfp_net_hw *hw;	/* Backpointer to nfp_net structure */
+
+	 /*
+	  * @qcp_fl and @qcp_rx are pointers to the base addresses of the
+	  * freelist and RX queue controller peripheral queue structures on the
+	  * NFP
+	  */
+	uint8_t *qcp_fl;
+	uint8_t *qcp_rx;
+
+	/*
+	 * Read and Write pointers.  @wr_p and @rd_p are host side
+	 * pointers, they are free running and have little relation to
+	 * the QCP pointers. @wr_p is where the driver adds new
+	 * freelist descriptors and @rd_p is where the driver starts
+	 * reading descriptors for newly arrived packets from.
+	 */
+	uint32_t wr_p;
+	uint32_t rd_p;
+
+	/*
+	 * For each buffer placed on the freelist, record the
+	 * associated mbuf.
+	 */
+	struct nfp_net_rx_buff *rxbufs;
+
+	/*
+	 * Information about the host side queue location.  @rxds is
+	 * the virtual address for the queue
+	 */
+	struct nfp_net_rx_desc *rxds;
+
+	/*
+	 * The mempool is created by the user specifying a mbuf size.
+	 * We save here the reference to the mempool needed in the RX
+	 * path and the mbuf size for checking that received packets,
+	 * plus the NFP_NET_RX_OFFSET, fit safely in the mbuf.
+	 */
+	struct rte_mempool *mem_pool;
+	uint16_t mbuf_size;
+
+	/*
+	 * Next two fields are used for giving more free descriptors
+	 * to the NFP
+	 */
+	uint16_t rx_free_thresh;
+	uint16_t nb_rx_hold;
+
+	 /* the size of the queue in number of descriptors */
+	uint16_t rx_count;
+
+	/*
+	 * Fields above this point fit in a single cache line and are all used
+	 * in the RX critical path. Fields below this point are just used
+	 * during queue configuration or not used at all (yet)
+	 */
+
+	/* referencing dev->data->port_id */
+	uint16_t port_id;
+
+	uint8_t  crc_len; /* Not used for now */
+	uint8_t  drop_en; /* Not used for now */
+
+	/* DMA address of the queue */
+	__le64 dma;
+
+	/*
+	 * Queue information: @qidx is the queue index from Linux's
+	 * perspective.  @fl_qcidx is the index of the Queue
+	 * Controller peripheral queue relative to the RX queue BAR
+	 * used for the freelist and @rx_qcidx is the Queue Controller
+	 * Peripheral index for the RX queue.
+	 */
+	int qidx;
+	int fl_qcidx;
+	int rx_qcidx;
+} __attribute__ ((__aligned__(64)));
+
+struct nfp_net_hw {
+	/* Info from the firmware */
+	uint32_t ver;
+	uint32_t cap;
+	uint32_t max_mtu;
+	uint32_t mtu;
+	uint32_t rx_offset;
+
+	/* Current values for control */
+	uint32_t ctrl;
+
+	uint8_t *ctrl_bar;
+	uint8_t *tx_bar;
+	uint8_t *rx_bar;
+
+	int stride_rx;
+	int stride_tx;
+
+	uint8_t *qcp_cfg;
+
+	uint32_t max_tx_queues;
+	uint32_t max_rx_queues;
+	uint16_t flbufsz;
+	uint16_t device_id;
+	uint16_t vendor_id;
+	uint16_t subsystem_device_id;
+	uint16_t subsystem_vendor_id;
+#if defined(DSTQ_SELECTION)
+#if DSTQ_SELECTION
+	uint16_t device_function;
+#endif
+#endif
+
+	uint8_t mac_addr[ETHER_ADDR_LEN];
+
+	/* Records starting point for counters */
+	struct rte_eth_stats eth_stats_base;
+
+#ifdef NFP_NET_LIBNFP
+	struct nfp_cpp *cpp;
+	struct nfp_cpp_area *ctrl_area;
+	struct nfp_cpp_area *tx_area;
+	struct nfp_cpp_area *rx_area;
+	struct nfp_cpp_area *msix_area;
+#endif
+};
+
+struct nfp_net_adapter {
+	struct nfp_net_hw hw;
+};
+
+#define NFP_NET_DEV_PRIVATE_TO_HW(adapter)\
+	(&((struct nfp_net_adapter *)adapter)->hw)
+
+#endif /* _NFP_NET_PMD_H_ */
+/*
+ * Local variables:
+ * c-file-style: "Linux"
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/drivers/net/nfp/rte_pmd_nfp_version.map b/drivers/net/nfp/rte_pmd_nfp_version.map
new file mode 100644
index 0000000..ad607bb
--- /dev/null
+++ b/drivers/net/nfp/rte_pmd_nfp_version.map
@@ -0,0 +1,3 @@
+DPDK_2.2 {
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 85a680d..3a95367 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -146,6 +146,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD)      += -lrte_pmd_ixgbe
 _LDLIBS-$(CONFIG_RTE_LIBRTE_E1000_PMD)      += -lrte_pmd_e1000
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5
+_LDLIBS-$(CONFIG_RTE_LIBRTE_NFP_PMD)        += -lrte_pmd_nfp
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2)   += -lrte_pmd_szedata2
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD)      += -lrte_pmd_mpipe -lgxio
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING)       += -lrte_pmd_ring
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v10 2/8] nfp: adding rx/tx functionality
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 1/8] nfp: basic initialization Alejandro Lucero
@ 2015-11-30 10:25 ` Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 3/8] nfp: adding rss Alejandro Lucero
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero @ 2015-11-30 10:25 UTC (permalink / raw)
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/nfp/nfp_net.c |  993 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 993 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index b9240db..0d85fa4 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -74,8 +74,25 @@
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
+static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
+				       uint16_t queue_idx);
+static uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+				  uint16_t nb_pkts);
+static void nfp_net_rx_queue_release(void *rxq);
+static int nfp_net_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+				  uint16_t nb_desc, unsigned int socket_id,
+				  const struct rte_eth_rxconf *rx_conf,
+				  struct rte_mempool *mp);
+static int nfp_net_tx_free_bufs(struct nfp_net_txq *txq);
+static void nfp_net_tx_queue_release(void *txq);
+static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+				  uint16_t nb_desc, unsigned int socket_id,
+				  const struct rte_eth_txconf *tx_conf);
 static int nfp_net_start(struct rte_eth_dev *dev);
 static void nfp_net_stop(struct rte_eth_dev *dev);
+static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+				  uint16_t nb_pkts);
 
 /*
  * The offset of the queue controller queues in the PCIe Target. These
@@ -186,6 +203,100 @@ nn_cfg_writeq(struct nfp_net_hw *hw, int off, uint64_t val)
 	nn_writeq(rte_cpu_to_le_64(val), hw->ctrl_bar + off);
 }
 
+/* Creating memzone for hardware rings. */
+static const struct rte_memzone *
+ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
+		      uint16_t queue_id, uint32_t ring_size, int socket_id)
+{
+	char z_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+
+	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
+		 dev->driver->pci_drv.name,
+		 ring_name, dev->data->port_id, queue_id);
+
+	mz = rte_memzone_lookup(z_name);
+	if (mz)
+		return mz;
+
+	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0,
+					   NFP_MEMZONE_ALIGN);
+}
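+
+/*
+ * Note on the lookup-then-reserve pattern above: DPDK memzones are never
+ * freed, so a second setup call for the same ring, e.g. after nfp_net_stop,
+ * finds the zone by name and reuses it instead of failing the reservation.
+ * Illustrative name for port 0, RX queue 1 (the prefix is whatever
+ * pci_drv.name holds): "<pci_drv.name>_rx_ring_0_1".
+ */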
+
+static void
+nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq)
+{
+	unsigned i;
+
+	if (rxq->rxbufs == NULL)
+		return;
+
+	for (i = 0; i < rxq->rx_count; i++) {
+		if (rxq->rxbufs[i].mbuf) {
+			rte_pktmbuf_free_seg(rxq->rxbufs[i].mbuf);
+			rxq->rxbufs[i].mbuf = NULL;
+		}
+	}
+}
+
+static void
+nfp_net_rx_queue_release(void *rx_queue)
+{
+	struct nfp_net_rxq *rxq = rx_queue;
+
+	if (rxq) {
+		nfp_net_rx_queue_release_mbufs(rxq);
+		rte_free(rxq->rxbufs);
+		rte_free(rxq);
+	}
+}
+
+static void
+nfp_net_reset_rx_queue(struct nfp_net_rxq *rxq)
+{
+	nfp_net_rx_queue_release_mbufs(rxq);
+	rxq->wr_p = 0;
+	rxq->rd_p = 0;
+	rxq->nb_rx_hold = 0;
+}
+
+static void
+nfp_net_tx_queue_release_mbufs(struct nfp_net_txq *txq)
+{
+	unsigned i;
+
+	if (txq->txbufs == NULL)
+		return;
+
+	for (i = 0; i < txq->tx_count; i++) {
+		if (txq->txbufs[i].mbuf) {
+			rte_pktmbuf_free_seg(txq->txbufs[i].mbuf);
+			txq->txbufs[i].mbuf = NULL;
+		}
+	}
+}
+
+static void
+nfp_net_tx_queue_release(void *tx_queue)
+{
+	struct nfp_net_txq *txq = tx_queue;
+
+	if (txq) {
+		nfp_net_tx_queue_release_mbufs(txq);
+		rte_free(txq->txbufs);
+		rte_free(txq);
+	}
+}
+
+static void
+nfp_net_reset_tx_queue(struct nfp_net_txq *txq)
+{
+	nfp_net_tx_queue_release_mbufs(txq);
+	txq->wr_p = 0;
+	txq->rd_p = 0;
+	txq->tail = 0;
+}
+
 static int
 __nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t update)
 {
@@ -423,6 +534,18 @@ nfp_net_disable_queues(struct rte_eth_dev *dev)
 	hw->ctrl = new_ctrl;
 }
 
+static int
+nfp_net_rx_freelist_setup(struct rte_eth_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		if (nfp_net_rx_fill_freelist(dev->data->rx_queues[i]) < 0)
+			return -1;
+	}
+	return 0;
+}
+
 static void
 nfp_net_params_setup(struct nfp_net_hw *hw)
 {
@@ -451,6 +574,7 @@ nfp_net_start(struct rte_eth_dev *dev)
 {
 	uint32_t new_ctrl, update = 0;
 	struct nfp_net_hw *hw;
+	int ret;
 
 	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
@@ -476,18 +600,58 @@ nfp_net_start(struct rte_eth_dev *dev)
 	if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
 		return -EIO;
 
+	/*
+	 * Allocating rte mbufs for configured rx queues.
+	 * This requires queues being enabled first.
+	 */
+	if (nfp_net_rx_freelist_setup(dev) < 0) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
 	hw->ctrl = new_ctrl;
 
 	return 0;
+
+error:
+	/*
+	 * An error returned by this function should lead to the app
+	 * exiting and then to the system releasing all the allocated
+	 * memory, even memory coming from hugepages.
+	 *
+	 * The device could be enabled at this point with some queues
+	 * ready for getting packets. This is true if the call to
+	 * nfp_net_rx_freelist_setup() succeeds for some queues but
+	 * fails for subsequent queues.
+	 *
+	 * This should make the app exit, but it is better if we tell
+	 * the device first.
+	 */
+	nfp_net_disable_queues(dev);
+
+	return ret;
 }
 
 /* Stop device: disable rx and tx functions to allow for reconfiguring. */
 static void
 nfp_net_stop(struct rte_eth_dev *dev)
 {
+	int i;
+
 	PMD_INIT_LOG(DEBUG, "Stop\n");
 
 	nfp_net_disable_queues(dev);
+
+	/* Clear queues */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		nfp_net_reset_tx_queue(
+			(struct nfp_net_txq *)dev->data->tx_queues[i]);
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		nfp_net_reset_rx_queue(
+			(struct nfp_net_rxq *)dev->data->rx_queues[i]);
+	}
 }
 
 /* Reset and stop device. The device can not be restarted. */
@@ -515,12 +679,839 @@ nfp_net_close(struct rte_eth_dev *dev)
 	 */
 }
 
+static uint32_t
+nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
+{
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_rx_desc *rxds;
+	uint32_t idx;
+	uint32_t count;
+
+	rxq = (struct nfp_net_rxq *)dev->data->rx_queues[queue_idx];
+
+	if (rxq == NULL) {
+		PMD_INIT_LOG(ERR, "Bad queue: %u\n", queue_idx);
+		return 0;
+	}
+
+	idx = rxq->rd_p % rxq->rx_count;
+	rxds = &rxq->rxds[idx];
+
+	count = 0;
+
+	/*
+	 * Other PMDs are just checking the DD bit in intervals of 4
+	 * descriptors and counting all four if the first has the DD
+	 * bit on. Of course, this is not accurate but can be good for
+	 * performance. But ideally that should be done in descriptor
+	 * chunks belonging to the same cache line.
+	 */
+
+	while (count < rxq->rx_count) {
+		rxds = &rxq->rxds[idx];
+		if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+			break;
+
+		count++;
+		idx++;
+
+		/* Wrapping? */
+		if ((idx) == rxq->rx_count)
+			idx = 0;
+	}
+
+	return count;
+}
+
+static int
+nfp_net_rx_queue_setup(struct rte_eth_dev *dev,
+		       uint16_t queue_idx, uint16_t nb_desc,
+		       unsigned int socket_id,
+		       const struct rte_eth_rxconf *rx_conf,
+		       struct rte_mempool *mp)
+{
+	const struct rte_memzone *tz;
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	PMD_INIT_FUNC_TRACE();
+
+	/* Validating number of descriptors */
+	if (((nb_desc * sizeof(struct nfp_net_rx_desc)) % 128) != 0 ||
+	    (nb_desc > NFP_NET_MAX_RX_DESC) ||
+	    (nb_desc < NFP_NET_MIN_RX_DESC)) {
+		RTE_LOG(ERR, PMD, "Wrong nb_desc value\n");
+		return (-EINVAL);
+	}
+
+	/*
+	 * Free memory prior to re-allocation if needed. This is the case after
+	 * calling nfp_net_stop
+	 */
+	if (dev->data->rx_queues[queue_idx]) {
+		nfp_net_rx_queue_release(dev->data->rx_queues[queue_idx]);
+		dev->data->rx_queues[queue_idx] = NULL;
+	}
+
+	/* Allocating rx queue data structure */
+	rxq = rte_zmalloc_socket("ethdev RX queue", sizeof(struct nfp_net_rxq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		return (-ENOMEM);
+
+	/* Hw queues mapping based on firmware configuration */
+	rxq->qidx = queue_idx;
+	rxq->fl_qcidx = queue_idx * hw->stride_rx;
+	rxq->rx_qcidx = rxq->fl_qcidx + (hw->stride_rx - 1);
+	rxq->qcp_fl = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->fl_qcidx);
+	rxq->qcp_rx = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->rx_qcidx);
+
+	/*
+	 * Tracking mbuf size for detecting a potential mbuf overflow due to
+	 * RX offset
+	 */
+	rxq->mem_pool = mp;
+	rxq->mbuf_size = rxq->mem_pool->elt_size;
+	rxq->mbuf_size -= (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
+	hw->flbufsz = rxq->mbuf_size;
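+
+	/*
+	 * Illustrative numbers, assuming a typical pool with 2176-byte
+	 * elements, a 128-byte struct rte_mbuf and 128 bytes of headroom:
+	 * mbuf_size = 2176 - 128 - 128 = 1920 bytes available for packet
+	 * data plus the RX offset checked in the receive path.
+	 */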
+
+	rxq->rx_count = nb_desc;
+	rxq->port_id = dev->data->port_id;
+	rxq->rx_free_thresh = rx_conf->rx_free_thresh;
+	rxq->crc_len = (uint8_t) ((dev->data->dev_conf.rxmode.hw_strip_crc) ? 0
+				  : ETHER_CRC_LEN);
+	rxq->drop_en = rx_conf->rx_drop_en;
+
+	/*
+	 * Allocate RX ring hardware descriptors. A memzone large enough to
+	 * handle the maximum ring size is allocated in order to allow for
+	 * resizing in later calls to the queue setup function.
+	 */
+	tz = ring_dma_zone_reserve(dev, "rx_ring", queue_idx,
+				   sizeof(struct nfp_net_rx_desc) *
+				   NFP_NET_MAX_RX_DESC, socket_id);
+
+	if (tz == NULL) {
+		RTE_LOG(ERR, PMD, "Error allocating rx dma\n");
+		nfp_net_rx_queue_release(rxq);
+		return (-ENOMEM);
+	}
+
+	/* Saving physical and virtual addresses for the RX ring */
+	rxq->dma = (uint64_t)tz->phys_addr;
+	rxq->rxds = (struct nfp_net_rx_desc *)tz->addr;
+
+	/* mbuf pointers array for referencing mbufs linked to RX descriptors */
+	rxq->rxbufs = rte_zmalloc_socket("rxq->rxbufs",
+					 sizeof(*rxq->rxbufs) * nb_desc,
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq->rxbufs == NULL) {
+		nfp_net_rx_queue_release(rxq);
+		return (-ENOMEM);
+	}
+
+	PMD_RX_LOG(DEBUG, "rxbufs=%p hw_ring=%p dma_addr=0x%" PRIx64 "\n",
+		   rxq->rxbufs, rxq->rxds, (unsigned long int)rxq->dma);
+
+	nfp_net_reset_rx_queue(rxq);
+
+	dev->data->rx_queues[queue_idx] = rxq;
+	rxq->hw = hw;
+
+	/*
+	 * Telling the HW about the physical address of the RX ring and number
+	 * of descriptors in log2 format
+	 */
+	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(queue_idx), rxq->dma);
+	nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(queue_idx), log2(nb_desc));
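+	/*
+	 * For example, nb_desc = 512 writes log2(512) = 9: the hardware
+	 * takes ring sizes as a power-of-two exponent, not as a raw
+	 * descriptor count.
+	 */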
+
+	return 0;
+}
+
+static int
+nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq)
+{
+	struct nfp_net_rx_buff *rxe = rxq->rxbufs;
+	uint64_t dma_addr;
+	unsigned i;
+
+	PMD_RX_LOG(DEBUG, "nfp_net_rx_fill_freelist for %u descriptors\n",
+		   rxq->rx_count);
+
+	for (i = 0; i < rxq->rx_count; i++) {
+		struct nfp_net_rx_desc *rxd;
+		struct rte_mbuf *mbuf = rte_pktmbuf_alloc(rxq->mem_pool);
+
+		if (mbuf == NULL) {
+			RTE_LOG(ERR, PMD, "RX mbuf alloc failed queue_id=%u\n",
+				(unsigned)rxq->qidx);
+			return (-ENOMEM);
+		}
+
+		dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(mbuf));
+
+		rxd = &rxq->rxds[i];
+		rxd->fld.dd = 0;
+		rxd->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+		rxd->fld.dma_addr_lo = dma_addr & 0xffffffff;
+		rxe[i].mbuf = mbuf;
+		PMD_RX_LOG(DEBUG, "[%d]: %" PRIx64 "\n", i, dma_addr);
+
+		rxq->wr_p++;
+	}
+
+	/* Make sure all writes are flushed before telling the hardware */
+	rte_wmb();
+
+	/* Not advertising the whole ring as the firmware gets confused if so */
+	PMD_RX_LOG(DEBUG, "Increment FL write pointer in %u\n",
+		   rxq->rx_count - 1);
+
+	nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, rxq->rx_count - 1);
+
+	return 0;
+}
+
+static int
+nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+		       uint16_t nb_desc, unsigned int socket_id,
+		       const struct rte_eth_txconf *tx_conf)
+{
+	const struct rte_memzone *tz;
+	struct nfp_net_txq *txq;
+	uint16_t tx_free_thresh;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	PMD_INIT_FUNC_TRACE();
+
+	/* Validating number of descriptors */
+	if (((nb_desc * sizeof(struct nfp_net_tx_desc)) % 128) != 0 ||
+	    (nb_desc > NFP_NET_MAX_TX_DESC) ||
+	    (nb_desc < NFP_NET_MIN_TX_DESC)) {
+		RTE_LOG(ERR, PMD, "Wrong nb_desc value\n");
+		return -EINVAL;
+	}
+
+	tx_free_thresh = (uint16_t)((tx_conf->tx_free_thresh) ?
+				    tx_conf->tx_free_thresh :
+				    DEFAULT_TX_FREE_THRESH);
+
+	if (tx_free_thresh > (nb_desc)) {
+		RTE_LOG(ERR, PMD,
+			"tx_free_thresh must be less than the number of TX "
+			"descriptors. (tx_free_thresh=%u port=%d "
+			"queue=%d)\n", (unsigned int)tx_free_thresh,
+			(int)dev->data->port_id, (int)queue_idx);
+		return -(EINVAL);
+	}
+
+	/*
+	 * Free memory prior to re-allocation if needed. This is the case after
+	 * calling nfp_net_stop
+	 */
+	if (dev->data->tx_queues[queue_idx]) {
+		PMD_TX_LOG(DEBUG, "Freeing memory prior to re-allocation %d\n",
+			   queue_idx);
+		nfp_net_tx_queue_release(dev->data->tx_queues[queue_idx]);
+		dev->data->tx_queues[queue_idx] = NULL;
+	}
+
+	/* Allocating tx queue data structure */
+	txq = rte_zmalloc_socket("ethdev TX queue", sizeof(struct nfp_net_txq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq == NULL) {
+		RTE_LOG(ERR, PMD, "Error allocating tx dma\n");
+		return (-ENOMEM);
+	}
+
+	/*
+	 * Allocate TX ring hardware descriptors. A memzone large enough to
+	 * handle the maximum ring size is allocated in order to allow for
+	 * resizing in later calls to the queue setup function.
+	 */
+	tz = ring_dma_zone_reserve(dev, "tx_ring", queue_idx,
+				   sizeof(struct nfp_net_tx_desc) *
+				   NFP_NET_MAX_TX_DESC, socket_id);
+	if (tz == NULL) {
+		RTE_LOG(ERR, PMD, "Error allocating tx dma\n");
+		nfp_net_tx_queue_release(txq);
+		return (-ENOMEM);
+	}
+
+	txq->tx_count = nb_desc;
+	txq->tail = 0;
+	txq->tx_free_thresh = tx_free_thresh;
+	txq->tx_pthresh = tx_conf->tx_thresh.pthresh;
+	txq->tx_hthresh = tx_conf->tx_thresh.hthresh;
+	txq->tx_wthresh = tx_conf->tx_thresh.wthresh;
+
+	/* queue mapping based on firmware configuration */
+	txq->qidx = queue_idx;
+	txq->tx_qcidx = queue_idx * hw->stride_tx;
+	txq->qcp_q = hw->tx_bar + NFP_QCP_QUEUE_OFF(txq->tx_qcidx);
+
+	txq->port_id = dev->data->port_id;
+	txq->txq_flags = tx_conf->txq_flags;
+
+	/* Saving physical and virtual addresses for the TX ring */
+	txq->dma = (uint64_t)tz->phys_addr;
+	txq->txds = (struct nfp_net_tx_desc *)tz->addr;
+
+	/* mbuf pointers array for referencing mbufs linked to TX descriptors */
+	txq->txbufs = rte_zmalloc_socket("txq->txbufs",
+					 sizeof(*txq->txbufs) * nb_desc,
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq->txbufs == NULL) {
+		nfp_net_tx_queue_release(txq);
+		return (-ENOMEM);
+	}
+	PMD_TX_LOG(DEBUG, "txbufs=%p hw_ring=%p dma_addr=0x%" PRIx64 "\n",
+		   txq->txbufs, txq->txds, (unsigned long int)txq->dma);
+
+	nfp_net_reset_tx_queue(txq);
+
+	dev->data->tx_queues[queue_idx] = txq;
+	txq->hw = hw;
+
+	/*
+	 * Telling the HW about the physical address of the TX ring and number
+	 * of descriptors in log2 format
+	 */
+	nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(queue_idx), txq->dma);
+	nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(queue_idx), log2(nb_desc));
+
+	return 0;
+}
+
+/* nfp_net_tx_cksum - Set TX CSUM offload flags in TX descriptor */
+static inline void
+nfp_net_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_tx_desc *txd,
+		 struct rte_mbuf *mb)
+{
+	uint16_t ol_flags;
+	struct nfp_net_hw *hw = txq->hw;
+
+	if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
+		return;
+
+	ol_flags = mb->ol_flags;
+
+	/* Set the IPv4 csum flag; IPv6 has no header checksum */
+	if (ol_flags & PKT_TX_IP_CKSUM)
+		txd->flags |= PCIE_DESC_TX_IP4_CSUM;
+
+	switch (ol_flags & PKT_TX_L4_MASK) {
+	case PKT_TX_UDP_CKSUM:
+		txd->flags |= PCIE_DESC_TX_UDP_CSUM;
+		break;
+	case PKT_TX_TCP_CKSUM:
+		txd->flags |= PCIE_DESC_TX_TCP_CSUM;
+		break;
+	}
+
+	txd->flags |= PCIE_DESC_TX_CSUM;
+}
+
+/* nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor flags */
+static inline void
+nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
+		 struct rte_mbuf *mb)
+{
+	struct nfp_net_hw *hw = rxq->hw;
+
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM))
+		return;
+
+	/* If IPv4 and IP checksum error, fail */
+	if ((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) &&
+	    !(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK))
+		mb->ol_flags |= PKT_RX_IP_CKSUM_BAD;
+
+	/* If neither UDP nor TCP return */
+	if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
+	    !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM))
+		return;
+
+	if ((rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
+	    !(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM_OK))
+		mb->ol_flags |= PKT_RX_L4_CKSUM_BAD;
+
+	if ((rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM) &&
+	    !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM_OK))
+		mb->ol_flags |= PKT_RX_L4_CKSUM_BAD;
+}
+
+#define NFP_HASH_OFFSET      ((uint8_t *)mbuf->buf_addr + mbuf->data_off - 4)
+#define NFP_HASH_TYPE_OFFSET ((uint8_t *)mbuf->buf_addr + mbuf->data_off - 8)
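+
+/*
+ * Prepended metadata layout assumed by the two macros above (illustrative):
+ *
+ *                      mbuf->buf_addr + mbuf->data_off
+ *                                      |
+ *   ... | hash_type/port (4B) | hash (4B) | packet data ...
+ *      -8                    -4           0
+ */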
+
+/*
+ * nfp_net_set_hash - Set mbuf hash data
+ *
+ * The RSS hash and hash-type are pre-pended to the packet data.
+ * Extract and decode them and set the mbuf fields.
+ */
+static inline void
+nfp_net_set_hash(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
+		 struct rte_mbuf *mbuf)
+{
+	uint32_t hash;
+	uint32_t hash_type;
+	struct nfp_net_hw *hw = rxq->hw;
+
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+		return;
+
+	if (!(rxd->rxd.flags & PCIE_DESC_RX_RSS))
+		return;
+
+	hash = rte_be_to_cpu_32(*(uint32_t *)NFP_HASH_OFFSET);
+	hash_type = rte_be_to_cpu_32(*(uint32_t *)NFP_HASH_TYPE_OFFSET);
+
+	/*
+	 * The hash type shares a word with the input port info:
+	 *   bits 31-8: input port
+	 *   bits  7-0: hash type
+	 */
+	hash_type &= 0xff;
+	mbuf->hash.rss = hash;
+	mbuf->ol_flags |= PKT_RX_RSS_HASH;
+
+	switch (hash_type) {
+	case NFP_NET_RSS_IPV4:
+		mbuf->packet_type |= RTE_PTYPE_INNER_L3_IPV4;
+		break;
+	case NFP_NET_RSS_IPV6:
+		mbuf->packet_type |= RTE_PTYPE_INNER_L3_IPV6;
+		break;
+	case NFP_NET_RSS_IPV6_EX:
+		mbuf->packet_type |= RTE_PTYPE_INNER_L3_IPV6_EXT;
+		break;
+	default:
+		mbuf->packet_type |= RTE_PTYPE_INNER_L4_MASK;
+	}
+}
+
+/* nfp_net_check_port - Set mbuf port field */
+static void
+nfp_net_check_port(struct nfp_net_rx_desc *rxd, struct rte_mbuf *mbuf)
+{
+	uint32_t port;
+
+	if (!(rxd->rxd.flags & PCIE_DESC_RX_INGRESS_PORT)) {
+		mbuf->port = 0;
+		return;
+	}
+
+	port = rte_be_to_cpu_32(*(uint32_t *)((uint8_t *)mbuf->buf_addr +
+					      mbuf->data_off - 8));
+
+	/*
+	 * The hash type shares a word with the input port info:
+	 *   bits 31-8: input port
+	 *   bits  7-0: hash type
+	 */
+	port = (uint8_t)(port >> 8);
+	mbuf->port = port;
+}
+
+static inline void
+nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
+{
+	rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
+}
+
+#define NFP_DESC_META_LEN(d) (d->rxd.meta_len_dd & PCIE_DESC_RX_META_LEN_MASK)
+
+/*
+ * RX path design:
+ *
+ * There are some decisions to take:
+ * 1) How to check the DD bit on RX descriptors
+ * 2) How and when to allocate new mbufs
+ *
+ * Current implementation checks just one single DD bit each loop. As each
+ * descriptor is 8 bytes, it is likely a good idea to check descriptors in
+ * a single cache line instead. Tests with this change have not shown any
+ * performance improvement but it requires further investigation. For example,
+ * depending on which descriptor is next, the number of descriptors could be
+ * less than 8 for just checking those in the same cache line. This implies
+ * extra work which could be counterproductive by itself. Indeed, the latest
+ * firmware changes are doing exactly this: writing several descriptors with
+ * the DD bit set in one go, to save PCIe bandwidth and DMA operations from
+ * the NFP.
+ *
+ * Mbuf allocation is done when a new packet is received. Then the descriptor
+ * is automatically linked with the new mbuf and the old one is given to the
+ * user. The main drawback with this design is that mbuf allocation is heavier
+ * than using bulk allocations allowed by DPDK with rte_mempool_get_bulk.
+ * From the cache point of view, allocating the mbuf early, as we do now, does
+ * not seem to have any benefit at all. Again, tests with this change have not
+ * shown any improvement. Also, rte_mempool_get_bulk returns all or nothing,
+ * so the implications of this type of allocation should be studied more
+ * deeply.
+ */
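+
+/*
+ * For reference, a bulk-refill variant would look roughly like this sketch
+ * (illustrative only, with a hypothetical NFP_RX_BULK batch size); note
+ * that rte_mempool_get_bulk() is all-or-nothing, which is exactly the
+ * trade-off discussed above:
+ *
+ *	void *mbufs[NFP_RX_BULK];
+ *
+ *	if (rte_mempool_get_bulk(rxq->mem_pool, mbufs, NFP_RX_BULK) == 0) {
+ *		// refill NFP_RX_BULK descriptors from mbufs[]
+ *	} else {
+ *		// nothing was allocated: retry with a smaller batch
+ *	}
+ */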
+
+static uint16_t
+nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_rx_desc *rxds;
+	struct nfp_net_rx_buff *rxb;
+	struct nfp_net_hw *hw;
+	struct rte_mbuf *mb;
+	struct rte_mbuf *new_mb;
+	int idx;
+	uint16_t nb_hold;
+	uint64_t dma_addr;
+	int avail;
+
+	rxq = rx_queue;
+	if (unlikely(rxq == NULL)) {
+		/*
+		 * DPDK just checks that the queue index is lower than the
+		 * maximum number of enabled queues, but the queue also
+		 * needs to be configured.
+		 */
+		RTE_LOG(ERR, PMD, "RX Bad queue\n");
+		return -EINVAL;
+	}
+
+	hw = rxq->hw;
+	avail = 0;
+	nb_hold = 0;
+
+	while (avail < nb_pkts) {
+		idx = rxq->rd_p % rxq->rx_count;
+
+		rxb = &rxq->rxbufs[idx];
+		if (unlikely(rxb == NULL)) {
+			RTE_LOG(ERR, PMD, "rxb does not exist!\n");
+			break;
+		}
+
+		/*
+		 * Memory barrier to ensure that we won't do other
+		 * reads before the DD bit.
+		 */
+		rte_rmb();
+
+		rxds = &rxq->rxds[idx];
+		if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+			break;
+
+		/*
+		 * We got a packet. Let's alloc a new mbuf for refilling the
+		 * free descriptor ring as soon as possible
+		 */
+		new_mb = rte_pktmbuf_alloc(rxq->mem_pool);
+		if (unlikely(new_mb == NULL)) {
+			RTE_LOG(DEBUG, PMD, "RX mbuf alloc failed port_id=%u "
+				"queue_id=%u\n", (unsigned)rxq->port_id,
+				(unsigned)rxq->qidx);
+			nfp_net_mbuf_alloc_failed(rxq);
+			break;
+		}
+
+		nb_hold++;
+
+		/*
+		 * Grab the mbuf and refill the descriptor with the
+		 * previously allocated mbuf
+		 */
+		mb = rxb->mbuf;
+		rxb->mbuf = new_mb;
+
+		PMD_RX_LOG(DEBUG, "Packet len: %u, mbuf_size: %u\n",
+			   rxds->rxd.data_len, rxq->mbuf_size);
+
+		/* Size of this segment */
+		mb->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+		/* Size of the whole packet. We just support 1 segment */
+		mb->pkt_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+
+		if (unlikely((mb->data_len + hw->rx_offset) >
+			     rxq->mbuf_size)) {
+			/*
+			 * This should not happen and the user has the
+			 * responsibility of avoiding it. But we have
+			 * to give some info about the error
+			 */
+			RTE_LOG(ERR, PMD,
+				"mbuf overflow likely due to the RX offset.\n"
+				"\t\tYour mbuf size should have extra space for"
+				" RX offset=%u bytes.\n"
+				"\t\tCurrently you just have %u bytes available"
+				" but the received packet is %u bytes long",
+				hw->rx_offset,
+				rxq->mbuf_size - hw->rx_offset,
+				mb->data_len);
+			return -EINVAL;
+		}
+
+		/* Filling the received mbuf with packet info */
+		if (hw->rx_offset)
+			mb->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset;
+		else
+			mb->data_off = RTE_PKTMBUF_HEADROOM +
+				       NFP_DESC_META_LEN(rxds);
+
+		/* No scatter mode supported */
+		mb->nb_segs = 1;
+		mb->next = NULL;
+
+		/* Checking the RSS flag */
+		nfp_net_set_hash(rxq, rxds, mb);
+
+		/* Checking the checksum flag */
+		nfp_net_rx_cksum(rxq, rxds, mb);
+
+		/* Checking the port flag */
+		nfp_net_check_port(rxds, mb);
+
+		if ((rxds->rxd.flags & PCIE_DESC_RX_VLAN) &&
+		    (hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN)) {
+			mb->vlan_tci = rte_cpu_to_le_32(rxds->rxd.vlan);
+			mb->ol_flags |= PKT_RX_VLAN_PKT;
+		}
+
+		/* Adding the mbuf to the mbuf array passed by the app */
+		rx_pkts[avail++] = mb;
+
+		/* Now resetting and updating the descriptor */
+		rxds->vals[0] = 0;
+		rxds->vals[1] = 0;
+		dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(new_mb));
+		rxds->fld.dd = 0;
+		rxds->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+		rxds->fld.dma_addr_lo = dma_addr & 0xffffffff;
+
+		rxq->rd_p++;
+	}
+
+	if (nb_hold == 0)
+		return nb_hold;
+
+	PMD_RX_LOG(DEBUG, "RX  port_id=%u queue_id=%u, %d packets received\n",
+		   (unsigned)rxq->port_id, (unsigned)rxq->qidx, nb_hold);
+
+	nb_hold += rxq->nb_rx_hold;
+
+	/*
+	 * FL descriptors needs to be written before incrementing the
+	 * FL queue WR pointer
+	 */
+	rte_wmb();
+	if (nb_hold > rxq->rx_free_thresh) {
+		PMD_RX_LOG(DEBUG, "port=%u queue=%u nb_hold=%u avail=%u\n",
+			   (unsigned)rxq->port_id, (unsigned)rxq->qidx,
+			   (unsigned)nb_hold, (unsigned)avail);
+		nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+		nb_hold = 0;
+	}
+	rxq->nb_rx_hold = nb_hold;
+
+	return avail;
+}
+
+/*
+ * nfp_net_tx_free_bufs - Check for descriptors with a complete
+ * status
+ * @txq: TX queue to work with
+ * Returns number of descriptors freed
+ */
+int
+nfp_net_tx_free_bufs(struct nfp_net_txq *txq)
+{
+	uint32_t qcp_rd_p;
+	int todo;
+
+	PMD_TX_LOG(DEBUG, "queue %u. Check for descriptor with a complete"
+		   " status\n", txq->qidx);
+
+	/* Work out how many packets have been sent */
+	qcp_rd_p = nfp_qcp_read(txq->qcp_q, NFP_QCP_READ_PTR);
+
+	if (qcp_rd_p == txq->qcp_rd_p) {
+		PMD_TX_LOG(DEBUG, "queue %u: It seems the hardware is not "
+			   "sending packets (%u, %u)\n", txq->qidx,
+			   qcp_rd_p, txq->qcp_rd_p);
+		return 0;
+	}
+
+	if (qcp_rd_p > txq->qcp_rd_p)
+		todo = qcp_rd_p - txq->qcp_rd_p;
+	else
+		todo = qcp_rd_p + txq->tx_count - txq->qcp_rd_p;
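+
+	/*
+	 * Illustrative wrap-around case: with tx_count = 256, a hardware
+	 * read pointer qcp_rd_p = 10 and a stale copy txq->qcp_rd_p = 250
+	 * give todo = 10 + 256 - 250 = 16 completed descriptors.
+	 */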
+
+	PMD_TX_LOG(DEBUG, "qcp_rd_p %u, txq->qcp_rd_p: %u, qcp->rd_p: %u\n",
+		   qcp_rd_p, txq->qcp_rd_p, txq->rd_p);
+
+	if (todo == 0)
+		return todo;
+
+	txq->qcp_rd_p += todo;
+	txq->qcp_rd_p %= txq->tx_count;
+	txq->rd_p += todo;
+
+	return todo;
+}
+
+/* Always leave some free descriptors, to avoid wrap-around confusion */
+#define NFP_FREE_TX_DESC(t) (t->tx_count - (t->wr_p - t->rd_p) - 8)
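+
+/*
+ * Example (illustrative): with t->tx_count = 256, t->wr_p = 1000 and
+ * t->rd_p = 900, NFP_FREE_TX_DESC() is 256 - 100 - 8 = 148. The pointers
+ * are free running, so their difference is the in-flight count regardless
+ * of ring wrap.
+ */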
+
+/*
+ * nfp_net_txq_full - Check if the number of free TX queue descriptors
+ * is below tx_free_thresh
+ *
+ * @txq: TX queue to check
+ *
+ * This function uses the host copy of the read/write pointers.
+ */
+static inline
+int nfp_net_txq_full(struct nfp_net_txq *txq)
+{
+	return NFP_FREE_TX_DESC(txq) < txq->tx_free_thresh;
+}
+
+static uint16_t
+nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct nfp_net_txq *txq;
+	struct nfp_net_hw *hw;
+	struct nfp_net_tx_desc *txds;
+	struct rte_mbuf *pkt;
+	uint64_t dma_addr;
+	int pkt_size, dma_size;
+	uint16_t free_descs, issued_descs;
+	struct rte_mbuf **lmbuf;
+	int i;
+
+	txq = tx_queue;
+	hw = txq->hw;
+	txds = &txq->txds[txq->tail];
+
+	PMD_TX_LOG(DEBUG, "working for queue %u at pos %d and %u packets\n",
+		   txq->qidx, txq->tail, nb_pkts);
+
+	if ((NFP_FREE_TX_DESC(txq) < nb_pkts) || (nfp_net_txq_full(txq)))
+		nfp_net_tx_free_bufs(txq);
+
+	free_descs = (uint16_t)NFP_FREE_TX_DESC(txq);
+	if (unlikely(free_descs == 0))
+		return 0;
+
+	pkt = *tx_pkts;
+
+	i = 0;
+	issued_descs = 0;
+	PMD_TX_LOG(DEBUG, "queue: %u. Sending %u packets\n",
+		   txq->qidx, nb_pkts);
+	/* Sending packets */
+	while ((i < nb_pkts) && free_descs) {
+		/* Grabbing the mbuf linked to the current descriptor */
+		lmbuf = &txq->txbufs[txq->tail].mbuf;
+		/* Warming the cache for releasing the mbuf later on */
+		RTE_MBUF_PREFETCH_TO_FREE(*lmbuf);
+
+		pkt = *(tx_pkts + i);
+
+		if (unlikely((pkt->nb_segs > 1) &&
+			     !(hw->cap & NFP_NET_CFG_CTRL_GATHER))) {
+			PMD_INIT_LOG(INFO, "NFP_NET_CFG_CTRL_GATHER not set\n");
+			rte_panic("Multisegment packet unsupported\n");
+		}
+
+		/* Checking if we have enough descriptors */
+		if (unlikely(pkt->nb_segs > free_descs))
+			goto xmit_end;
+
+		/*
+		 * Checksum and VLAN flags just in the first descriptor for a
+		 * multisegment packet
+		 */
+		nfp_net_tx_cksum(txq, txds, pkt);
+
+		if ((pkt->ol_flags & PKT_TX_VLAN_PKT) &&
+		    (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)) {
+			txds->flags |= PCIE_DESC_TX_VLAN;
+			txds->vlan = pkt->vlan_tci;
+		}
+
+		if (pkt->ol_flags & PKT_TX_TCP_SEG)
+			rte_panic("TSO is not supported\n");
+
+		/*
+		 * mbuf data_len is the data in one segment and pkt_len data
+		 * in the whole packet. When the packet is just one segment,
+		 * then data_len = pkt_len
+		 */
+		pkt_size = pkt->pkt_len;
+
+		while (pkt_size) {
+			/* Releasing mbuf which was prefetched above */
+			if (*lmbuf)
+				rte_pktmbuf_free_seg(*lmbuf);
+
+			dma_size = pkt->data_len;
+			dma_addr = RTE_MBUF_DATA_DMA_ADDR(pkt);
+			PMD_TX_LOG(DEBUG, "Working with mbuf at dma address:"
+				   "%" PRIx64 "\n", dma_addr);
+
+			/* Filling descriptors fields */
+			txds->dma_len = dma_size;
+			txds->data_len = pkt->pkt_len;
+			txds->dma_addr_hi = (dma_addr >> 32) & 0xff;
+			txds->dma_addr_lo = (dma_addr & 0xffffffff);
+			ASSERT(free_descs > 0);
+			free_descs--;
+
+			/*
+			 * Linking mbuf with descriptor for being released
+			 * next time descriptor is used
+			 */
+			*lmbuf = pkt;
+
+			txq->wr_p++;
+			txq->tail++;
+			if (unlikely(txq->tail == txq->tx_count)) /* wrapping? */
+				txq->tail = 0;
+
+			pkt_size -= dma_size;
+			if (!pkt_size) {
+				/* End of packet */
+				txds->offset_eop |= PCIE_DESC_TX_EOP;
+			} else {
+				txds->offset_eop &= PCIE_DESC_TX_OFFSET_MASK;
+				pkt = pkt->next;
+			}
+			/* Referencing next free TX descriptor */
+			txds = &txq->txds[txq->tail];
+			issued_descs++;
+		}
+		i++;
+	}
+
+xmit_end:
+	/* Increment write pointers. Force memory write before we let HW know */
+	rte_wmb();
+	nfp_qcp_ptr_add(txq->qcp_q, NFP_QCP_WRITE_PTR, issued_descs);
+
+	return i;
+}
+
 /* Initialise and register driver with DPDK Application */
 static struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.dev_configure		= nfp_net_configure,
 	.dev_start		= nfp_net_start,
 	.dev_stop		= nfp_net_stop,
 	.dev_close		= nfp_net_close,
+	.rx_queue_setup		= nfp_net_rx_queue_setup,
+	.rx_queue_release	= nfp_net_rx_queue_release,
+	.rx_queue_count		= nfp_net_rx_queue_count,
+	.tx_queue_setup		= nfp_net_tx_queue_setup,
+	.tx_queue_release	= nfp_net_tx_queue_release,
 };
 
 static int
@@ -538,6 +1529,8 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
 	hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
 
 	eth_dev->dev_ops = &nfp_net_eth_dev_ops;
+	eth_dev->rx_pkt_burst = &nfp_net_recv_pkts;
+	eth_dev->tx_pkt_burst = &nfp_net_xmit_pkts;
 
 	/* For secondary processes, the primary has done all the work */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v10 3/8] nfp: adding rss
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 1/8] nfp: basic initialization Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 2/8] nfp: adding rx/tx functionality Alejandro Lucero
@ 2015-11-30 10:25 ` Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 4/8] nfp: adding stats Alejandro Lucero
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero @ 2015-11-30 10:25 UTC (permalink / raw)
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/nfp/nfp_net.c |  218 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 218 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 0d85fa4..a9be403 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1501,12 +1501,230 @@ xmit_end:
 	return i;
 }
 
+/* Update Redirection Table (RETA) of Receive Side Scaling of Ethernet device */
+static int
+nfp_net_reta_update(struct rte_eth_dev *dev,
+		    struct rte_eth_rss_reta_entry64 *reta_conf,
+		    uint16_t reta_size)
+{
+	uint32_t reta, mask;
+	int i, j;
+	int idx, shift;
+	uint32_t update;
+	struct nfp_net_hw *hw =
+		NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+		return -EINVAL;
+
+	if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+		RTE_LOG(ERR, PMD, "The size of the hash lookup table "
+			"configured (%d) doesn't match the one the hardware "
+			"can support (%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+		return -EINVAL;
+	}
+
+	/*
+	 * Update Redirection Table. There are 128 8bit-entries which can
+	 * be managed as 32 32bit-entries.
+	 */
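+	/*
+	 * Illustrative iteration, assuming RTE_RETA_GROUP_SIZE = 64: at
+	 * i = 68 this gives idx = 1 and shift = 4, so the loop looks at
+	 * mask bits 4..7 of reta_conf[1] and packs entries 68..71 into a
+	 * single 32-bit ITBL word.
+	 */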
+	for (i = 0; i < reta_size; i += 4) {
+		/* Handling 4 RSS entries per loop */
+		idx = i / RTE_RETA_GROUP_SIZE;
+		shift = i % RTE_RETA_GROUP_SIZE;
+		mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+		if (!mask)
+			continue;
+
+		reta = 0;
+		/* If all 4 entries were set, don't need read RETA register */
+		if (mask != 0xF)
+			reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i);
+
+		for (j = 0; j < 4; j++) {
+			if (!(mask & (0x1 << j)))
+				continue;
+			if (mask != 0xF)
+				/* Clearing the entry bits */
+				reta &= ~(0xFF << (8 * j));
+			reta |= reta_conf[idx].reta[shift + j] << (8 * j);
+		}
+		nn_cfg_writel(hw, NFP_NET_CFG_RSS_ITBL + i, reta);
+	}
+
+	update = NFP_NET_CFG_UPDATE_RSS;
+
+	if (nfp_net_reconfig(hw, hw->ctrl, update) < 0)
+		return -EIO;
+
+	return 0;
+}
+
+/* Query Redirection Table (RETA) of Receive Side Scaling of Ethernet device. */
+static int
+nfp_net_reta_query(struct rte_eth_dev *dev,
+		   struct rte_eth_rss_reta_entry64 *reta_conf,
+		   uint16_t reta_size)
+{
+	uint8_t i, j, mask;
+	int idx, shift;
+	uint32_t reta;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+		return -EINVAL;
+
+	if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+		RTE_LOG(ERR, PMD, "The size of the hash lookup table "
+			"configured (%d) doesn't match the one the hardware "
+			"can support (%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+		return -EINVAL;
+	}
+
+	/*
+	 * Reading Redirection Table. There are 128 8bit-entries which can
+	 * be managed as 32 32bit-entries.
+	 */
+	for (i = 0; i < reta_size; i += 4) {
+		/* Handling 4 RSS entries per loop */
+		idx = i / RTE_RETA_GROUP_SIZE;
+		shift = i % RTE_RETA_GROUP_SIZE;
+		mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+		if (!mask)
+			continue;
+
+		reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i);
+		for (j = 0; j < 4; j++) {
+			if (!(mask & (0x1 << j)))
+				continue;
+			reta_conf[idx].reta[shift + j] =
+				(uint8_t)((reta >> (8 * j)) & 0xFF);
+		}
+	}
+	return 0;
+}
+
+static int
+nfp_net_rss_hash_update(struct rte_eth_dev *dev,
+			struct rte_eth_rss_conf *rss_conf)
+{
+	uint32_t update;
+	uint32_t cfg_rss_ctrl = 0;
+	uint8_t key;
+	uint64_t rss_hf;
+	int i;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	rss_hf = rss_conf->rss_hf;
+
+	/* Checking if RSS is enabled */
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) {
+		if (rss_hf != 0) { /* Enable RSS? */
+			RTE_LOG(ERR, PMD, "RSS unsupported\n");
+			return -EINVAL;
+		}
+		return 0; /* Nothing to do */
+	}
+
+	if (rss_conf->rss_key_len > NFP_NET_CFG_RSS_KEY_SZ) {
+		RTE_LOG(ERR, PMD, "hash key too long\n");
+		return -EINVAL;
+	}
+
+	if (rss_hf & ETH_RSS_IPV4)
+		cfg_rss_ctrl |= NFP_NET_CFG_RSS_IPV4 |
+				NFP_NET_CFG_RSS_IPV4_TCP |
+				NFP_NET_CFG_RSS_IPV4_UDP;
+
+	if (rss_hf & ETH_RSS_IPV6)
+		cfg_rss_ctrl |= NFP_NET_CFG_RSS_IPV6 |
+				NFP_NET_CFG_RSS_IPV6_TCP |
+				NFP_NET_CFG_RSS_IPV6_UDP;
+
+	/* configuring where to apply the RSS hash */
+	nn_cfg_writel(hw, NFP_NET_CFG_RSS_CTRL, cfg_rss_ctrl);
+
+	/* Writing the key byte by byte */
+	for (i = 0; i < rss_conf->rss_key_len; i++) {
+		memcpy(&key, &rss_conf->rss_key[i], 1);
+		nn_cfg_writeb(hw, NFP_NET_CFG_RSS_KEY + i, key);
+	}
+
+	/* Writing the key size */
+	nn_cfg_writeb(hw, NFP_NET_CFG_RSS_KEY_SZ, rss_conf->rss_key_len);
+
+	update = NFP_NET_CFG_UPDATE_RSS;
+
+	if (nfp_net_reconfig(hw, hw->ctrl, update) < 0)
+		return -EIO;
+
+	return 0;
+}
+
+static int
+nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
+			  struct rte_eth_rss_conf *rss_conf)
+{
+	uint64_t rss_hf;
+	uint32_t cfg_rss_ctrl;
+	uint8_t key;
+	int i;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+		return -EINVAL;
+
+	rss_hf = rss_conf->rss_hf;
+	cfg_rss_ctrl = nn_cfg_readl(hw, NFP_NET_CFG_RSS_CTRL);
+
+	if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4)
+		rss_hf |= ETH_RSS_NONFRAG_IPV4_TCP | ETH_RSS_NONFRAG_IPV4_UDP;
+
+	if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4_TCP)
+		rss_hf |= ETH_RSS_NONFRAG_IPV4_TCP;
+
+	if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6_TCP)
+		rss_hf |= ETH_RSS_NONFRAG_IPV6_TCP;
+
+	if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4_UDP)
+		rss_hf |= ETH_RSS_NONFRAG_IPV4_UDP;
+
+	if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6_UDP)
+		rss_hf |= ETH_RSS_NONFRAG_IPV6_UDP;
+
+	if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6)
+		rss_hf |= ETH_RSS_NONFRAG_IPV6_TCP | ETH_RSS_NONFRAG_IPV6_UDP;
+
+	/* Reading the key size */
+	rss_conf->rss_key_len = nn_cfg_readl(hw, NFP_NET_CFG_RSS_KEY_SZ);
+
+	/* Reading the key byte by byte */
+	for (i = 0; i < rss_conf->rss_key_len; i++) {
+		key = nn_cfg_readb(hw, NFP_NET_CFG_RSS_KEY + i);
+		memcpy(&rss_conf->rss_key[i], &key, 1);
+	}
+
+	return 0;
+}
+
 /* Initialise and register driver with DPDK Application */
 static struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.dev_configure		= nfp_net_configure,
 	.dev_start		= nfp_net_start,
 	.dev_stop		= nfp_net_stop,
 	.dev_close		= nfp_net_close,
+	.reta_update		= nfp_net_reta_update,
+	.reta_query		= nfp_net_reta_query,
+	.rss_hash_update	= nfp_net_rss_hash_update,
+	.rss_hash_conf_get	= nfp_net_rss_hash_conf_get,
 	.rx_queue_setup		= nfp_net_rx_queue_setup,
 	.rx_queue_release	= nfp_net_rx_queue_release,
 	.rx_queue_count		= nfp_net_rx_queue_count,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v10 4/8] nfp: adding stats
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
                   ` (2 preceding siblings ...)
  2015-11-30 10:25 ` [PATCH v10 3/8] nfp: adding rss Alejandro Lucero
@ 2015-11-30 10:25 ` Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 5/8] nfp: adding link functionality Alejandro Lucero
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero @ 2015-11-30 10:25 UTC (permalink / raw)
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/nfp/nfp_net.c |  179 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 179 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index a9be403..0912064 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -90,6 +90,9 @@ static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 				  uint16_t nb_desc, unsigned int socket_id,
 				  const struct rte_eth_txconf *tx_conf);
 static int nfp_net_start(struct rte_eth_dev *dev);
+static void nfp_net_stats_get(struct rte_eth_dev *dev,
+			      struct rte_eth_stats *stats);
+static void nfp_net_stats_reset(struct rte_eth_dev *dev);
 static void nfp_net_stop(struct rte_eth_dev *dev);
 static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 				  uint16_t nb_pkts);
@@ -679,6 +682,177 @@ nfp_net_close(struct rte_eth_dev *dev)
 	 */
 }
 
+static void
+nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	int i;
+	struct nfp_net_hw *hw;
+	struct rte_eth_stats nfp_dev_stats;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* RTE_ETHDEV_QUEUE_STAT_CNTRS default value is 16 */
+
+	/* reading per RX ring stats */
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+			break;
+
+		nfp_dev_stats.q_ipackets[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i));
+
+		nfp_dev_stats.q_ipackets[i] -=
+			hw->eth_stats_base.q_ipackets[i];
+
+		nfp_dev_stats.q_ibytes[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8);
+
+		nfp_dev_stats.q_ibytes[i] -=
+			hw->eth_stats_base.q_ibytes[i];
+	}
+
+	/* reading per TX ring stats */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+			break;
+
+		nfp_dev_stats.q_opackets[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i));
+
+		nfp_dev_stats.q_opackets[i] -=
+			hw->eth_stats_base.q_opackets[i];
+
+		nfp_dev_stats.q_obytes[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8);
+
+		nfp_dev_stats.q_obytes[i] -=
+			hw->eth_stats_base.q_obytes[i];
+	}
+
+	nfp_dev_stats.ipackets =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES);
+
+	nfp_dev_stats.ipackets -= hw->eth_stats_base.ipackets;
+
+	nfp_dev_stats.ibytes =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS);
+
+	nfp_dev_stats.ibytes -= hw->eth_stats_base.ibytes;
+
+	nfp_dev_stats.opackets =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES);
+
+	nfp_dev_stats.opackets -= hw->eth_stats_base.opackets;
+
+	nfp_dev_stats.obytes =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS);
+
+	nfp_dev_stats.obytes -= hw->eth_stats_base.obytes;
+
+	/* reading general device stats */
+	nfp_dev_stats.ierrors =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS);
+
+	nfp_dev_stats.ierrors -= hw->eth_stats_base.ierrors;
+
+	nfp_dev_stats.oerrors =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS);
+
+	nfp_dev_stats.oerrors -= hw->eth_stats_base.oerrors;
+
+	/* Multicast frames received */
+	nfp_dev_stats.imcasts =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+	nfp_dev_stats.imcasts -= hw->eth_stats_base.imcasts;
+
+	/* RX ring mbuf allocation failures */
+	nfp_dev_stats.rx_nombuf = dev->data->rx_mbuf_alloc_failed;
+
+	nfp_dev_stats.imissed =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
+
+	nfp_dev_stats.imissed -= hw->eth_stats_base.imissed;
+
+	if (stats)
+		memcpy(stats, &nfp_dev_stats, sizeof(*stats));
+}
+
+static void
+nfp_net_stats_reset(struct rte_eth_dev *dev)
+{
+	int i;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/*
+	 * hw->eth_stats_base records the per counter starting point.
+	 * Let's update it now.
+	 */
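+	/*
+	 * Illustrative effect: if the hardware RX frame counter reads 1000
+	 * at reset time, eth_stats_base.ipackets becomes 1000 and a later
+	 * nfp_net_stats_get() reports counter - base = 0 until new packets
+	 * arrive; the hardware counters themselves are never cleared.
+	 */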
+
+	/* reading per RX ring stats */
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+			break;
+
+		hw->eth_stats_base.q_ipackets[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i));
+
+		hw->eth_stats_base.q_ibytes[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8);
+	}
+
+	/* reading per TX ring stats */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+			break;
+
+		hw->eth_stats_base.q_opackets[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i));
+
+		hw->eth_stats_base.q_obytes[i] =
+			nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8);
+	}
+
+	hw->eth_stats_base.ipackets =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES);
+
+	hw->eth_stats_base.ibytes =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS);
+
+	hw->eth_stats_base.opackets =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES);
+
+	hw->eth_stats_base.obytes =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS);
+
+	/* reading general device stats */
+	hw->eth_stats_base.ierrors =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS);
+
+	hw->eth_stats_base.oerrors =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS);
+
+	/* Multicast frames received */
+	hw->eth_stats_base.imcasts =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+	/* RX ring mbuf allocation failures */
+	dev->data->rx_mbuf_alloc_failed = 0;
+
+	hw->eth_stats_base.imissed =
+		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
+}
+
 static uint32_t
 nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
 {
@@ -1721,6 +1895,8 @@ static struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.dev_start		= nfp_net_start,
 	.dev_stop		= nfp_net_stop,
 	.dev_close		= nfp_net_close,
+	.stats_get		= nfp_net_stats_get,
+	.stats_reset		= nfp_net_stats_reset,
 	.reta_update		= nfp_net_reta_update,
 	.reta_query		= nfp_net_reta_query,
 	.rss_hash_update	= nfp_net_rss_hash_update,
@@ -1852,6 +2028,9 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
 		     hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
 		     hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
 
+	/* Recording current stats counters values */
+	nfp_net_stats_reset(eth_dev);
+
 	return 0;
 }
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v10 5/8] nfp: adding link functionality
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
                   ` (3 preceding siblings ...)
  2015-11-30 10:25 ` [PATCH v10 4/8] nfp: adding stats Alejandro Lucero
@ 2015-11-30 10:25 ` Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 6/8] nfp: adding extra functionality Alejandro Lucero
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero @ 2015-11-30 10:25 UTC (permalink / raw)
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/nfp/nfp_net.c |   96 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 0912064..7c82e96 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -74,6 +74,7 @@
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_complete);
 static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
 static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
 				       uint16_t queue_idx);
@@ -226,6 +227,57 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 					   NFP_MEMZONE_ALIGN);
 }
 
+/*
+ * Atomically reads link status information from global structure rte_eth_dev.
+ *
+ * @param dev
+ *   - Pointer to the structure rte_eth_dev to read from.
+ * @param link
+ *   - Pointer to the buffer to be saved with the link status.
+ *
+ * @return
+ *   - On success, zero.
+ *   - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_read_link_status(struct rte_eth_dev *dev,
+				    struct rte_eth_link *link)
+{
+	struct rte_eth_link *dst = link;
+	struct rte_eth_link *src = &dev->data->dev_link;
+
+	if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+				*(uint64_t *)src) == 0)
+		return -1;
+
+	return 0;
+}
+
+/*
+ * Atomically writes the link status information into global
+ * structure rte_eth_dev.
+ *
+ * @param dev
+ *   - Pointer to the structure rte_eth_dev to write to.
+ * @param link
+ *   - Pointer to the buffer holding the link status to be written.
+ *
+ * @return
+ *   - On success, zero.
+ *   - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_write_link_status(struct rte_eth_dev *dev,
+				     struct rte_eth_link *link)
+{
+	struct rte_eth_link *dst = &dev->data->dev_link;
+	struct rte_eth_link *src = link;
+
+	if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+				*(uint64_t *)src) == 0)
+		return -1;
+
+	return 0;
+}
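+
+/*
+ * A note on the rte_atomic64_cmpset() pattern above: struct rte_eth_link
+ * is assumed to fit in 64 bits, so compare-and-setting the whole structure
+ * gives an atomic read/update even on 32-bit targets where a plain 8-byte
+ * copy could tear. The call only fails, returning 0, if another thread
+ * changed the link data between the read of *dst and the swap.
+ */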
+
 static void
 nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq)
 {
@@ -682,6 +734,49 @@ nfp_net_close(struct rte_eth_dev *dev)
 	 */
 }
 
+/*
+ * return 0 means link status changed, -1 means not changed
+ *
+ * Wait to complete is needed as it can take up to 9 seconds to get the Link
+ * status.
+ */
+static int
+nfp_net_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complete)
+{
+	struct nfp_net_hw *hw;
+	struct rte_eth_link link, old;
+	uint32_t nn_link_status;
+
+	PMD_DRV_LOG(DEBUG, "Link update\n");
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	memset(&old, 0, sizeof(old));
+	nfp_net_dev_atomic_read_link_status(dev, &old);
+
+	nn_link_status = nn_cfg_readl(hw, NFP_NET_CFG_STS);
+
+	memset(&link, 0, sizeof(struct rte_eth_link));
+
+	if (nn_link_status & NFP_NET_CFG_STS_LINK)
+		link.link_status = 1;
+
+	link.link_duplex = ETH_LINK_FULL_DUPLEX;
+	/* Other cards can limit the tx and rx rate per VF */
+	link.link_speed = ETH_LINK_SPEED_40G;
+
+	if (old.link_status != link.link_status) {
+		nfp_net_dev_atomic_write_link_status(dev, &link);
+		if (link.link_status)
+			PMD_DRV_LOG(INFO, "NIC Link is Up\n");
+		else
+			PMD_DRV_LOG(INFO, "NIC Link is Down\n");
+		return 0;
+	}
+
+	return -1;
+}
+
 static void
 nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 {
@@ -1895,6 +1990,7 @@ static struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.dev_start		= nfp_net_start,
 	.dev_stop		= nfp_net_stop,
 	.dev_close		= nfp_net_close,
+	.link_update		= nfp_net_link_update,
 	.stats_get		= nfp_net_stats_get,
 	.stats_reset		= nfp_net_stats_reset,
 	.reta_update		= nfp_net_reta_update,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v10 6/8] nfp: adding extra functionality
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
                   ` (4 preceding siblings ...)
  2015-11-30 10:25 ` [PATCH v10 5/8] nfp: adding link functionality Alejandro Lucero
@ 2015-11-30 10:25 ` Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 7/8] nfp: link status change interrupt support Alejandro Lucero
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero @ 2015-11-30 10:25 UTC (permalink / raw)
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/nfp/nfp_net.c |  191 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 7c82e96..ff9a8d6 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -73,8 +73,13 @@
 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
+static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
+static void nfp_net_infos_get(struct rte_eth_dev *dev,
+			      struct rte_eth_dev_info *dev_info);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
 static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_complete);
+static void nfp_net_promisc_enable(struct rte_eth_dev *dev);
+static void nfp_net_promisc_disable(struct rte_eth_dev *dev);
 static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
 static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
 				       uint16_t queue_idx);
@@ -734,6 +739,65 @@ nfp_net_close(struct rte_eth_dev *dev)
 	 */
 }
 
+static void
+nfp_net_promisc_enable(struct rte_eth_dev *dev)
+{
+	uint32_t new_ctrl, update = 0;
+	struct nfp_net_hw *hw;
+
+	PMD_DRV_LOG(DEBUG, "Promiscuous mode enable\n");
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (!(hw->cap & NFP_NET_CFG_CTRL_PROMISC)) {
+		PMD_INIT_LOG(INFO, "Promiscuous mode not supported\n");
+		return;
+	}
+
+	if (hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) {
+		PMD_DRV_LOG(INFO, "Promiscuous mode already enabled\n");
+		return;
+	}
+
+	new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_PROMISC;
+	update = NFP_NET_CFG_UPDATE_GEN;
+
+	/*
+	 * DPDK sets promiscuous mode on just after this call assuming
+	 * it can not fail ...
+	 */
+	if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+		return;
+
+	hw->ctrl = new_ctrl;
+}
+
+static void
+nfp_net_promisc_disable(struct rte_eth_dev *dev)
+{
+	uint32_t new_ctrl, update = 0;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if ((hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) == 0) {
+		PMD_DRV_LOG(INFO, "Promiscuous mode already disabled\n");
+		return;
+	}
+
+	new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_PROMISC;
+	update = NFP_NET_CFG_UPDATE_GEN;
+
+	/*
+	 * DPDK sets promiscuous mode off just before this call,
+	 * assuming it cannot fail ...
+	 */
+	if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+		return;
+
+	hw->ctrl = new_ctrl;
+}
+
 /*
  * return 0 means link status changed, -1 means not changed
  *
@@ -948,6 +1012,65 @@ nfp_net_stats_reset(struct rte_eth_dev *dev)
 		nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
 }
 
+static void
+nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	dev_info->driver_name = dev->driver->pci_drv.name;
+	dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
+	dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
+	dev_info->min_rx_bufsize = ETHER_MIN_MTU;
+	dev_info->max_rx_pktlen = hw->mtu;
+	/* Next should change when PF support is implemented */
+	dev_info->max_mac_addrs = 1;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN)
+		dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM)
+		dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_IPV4_CKSUM |
+					     DEV_RX_OFFLOAD_UDP_CKSUM |
+					     DEV_RX_OFFLOAD_TCP_CKSUM;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)
+		dev_info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_TXCSUM)
+		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_IPV4_CKSUM |
+					     DEV_TX_OFFLOAD_UDP_CKSUM |
+					     DEV_TX_OFFLOAD_TCP_CKSUM;
+
+	dev_info->default_rxconf = (struct rte_eth_rxconf) {
+		.rx_thresh = {
+			.pthresh = DEFAULT_RX_PTHRESH,
+			.hthresh = DEFAULT_RX_HTHRESH,
+			.wthresh = DEFAULT_RX_WTHRESH,
+		},
+		.rx_free_thresh = DEFAULT_RX_FREE_THRESH,
+		.rx_drop_en = 0,
+	};
+
+	dev_info->default_txconf = (struct rte_eth_txconf) {
+		.tx_thresh = {
+			.pthresh = DEFAULT_TX_PTHRESH,
+			.hthresh = DEFAULT_TX_HTHRESH,
+			.wthresh = DEFAULT_TX_WTHRESH,
+		},
+		.tx_free_thresh = DEFAULT_TX_FREE_THRESH,
+		.tx_rs_thresh = DEFAULT_TX_RSBIT_THRESH,
+		.txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
+			     ETH_TXQ_FLAGS_NOOFFLOADS,
+	};
+
+	dev_info->reta_size = NFP_NET_CFG_RSS_ITBL_SZ;
+#if RTE_VER_MAJOR == 2 && RTE_VER_MINOR >= 1
+	dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ;
+#endif
+}
+
 static uint32_t
 nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
 {
@@ -993,6 +1116,34 @@ nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
 }
 
 static int
+nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* check that mtu is within the allowed range */
+	if ((mtu < ETHER_MIN_MTU) || ((uint32_t)mtu > hw->max_mtu))
+		return -EINVAL;
+
+	/* switch to jumbo mode if needed */
+	if ((uint32_t)mtu > ETHER_MAX_LEN)
+		dev->data->dev_conf.rxmode.jumbo_frame = 1;
+	else
+		dev->data->dev_conf.rxmode.jumbo_frame = 0;
+
+	/* update max frame size */
+	dev->data->dev_conf.rxmode.max_rx_pkt_len = (uint32_t)mtu;
+
+	/* writing to configuration space */
+	nn_cfg_writel(hw, NFP_NET_CFG_MTU, (uint32_t)mtu);
+
+	hw->mtu = mtu;
+
+	return 0;
+}
+
+static int
 nfp_net_rx_queue_setup(struct rte_eth_dev *dev,
 		       uint16_t queue_idx, uint16_t nb_desc,
 		       unsigned int socket_id,
@@ -1770,6 +1921,41 @@ xmit_end:
 	return i;
 }
 
+static void
+nfp_net_vlan_offload_set(struct rte_eth_dev *dev, int mask)
+{
+	uint32_t new_ctrl, update;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	new_ctrl = 0;
+
+	if ((mask & ETH_VLAN_FILTER_OFFLOAD) ||
+	    (mask & ETH_VLAN_EXTEND_OFFLOAD))
+		RTE_LOG(INFO, PMD, "No support for ETH_VLAN_FILTER_OFFLOAD or"
+			" ETH_VLAN_EXTEND_OFFLOAD\n");
+
+	/* Enable vlan strip if it is not configured yet */
+	if ((mask & ETH_VLAN_STRIP_OFFLOAD) &&
+	    !(hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN))
+		new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_RXVLAN;
+
+	/* Disable vlan strip only if it is configured */
+	if (!(mask & ETH_VLAN_STRIP_OFFLOAD) &&
+	    (hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN))
+		new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_RXVLAN;
+
+	if (new_ctrl == 0)
+		return;
+
+	update = NFP_NET_CFG_UPDATE_GEN;
+
+	if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+		return;
+
+	hw->ctrl = new_ctrl;
+}
+
 /* Update Redirection Table(RETA) of Receive Side Scaling of Ethernet device */
 static int
 nfp_net_reta_update(struct rte_eth_dev *dev,
@@ -1990,9 +2176,14 @@ static struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.dev_start		= nfp_net_start,
 	.dev_stop		= nfp_net_stop,
 	.dev_close		= nfp_net_close,
+	.promiscuous_enable	= nfp_net_promisc_enable,
+	.promiscuous_disable	= nfp_net_promisc_disable,
 	.link_update		= nfp_net_link_update,
 	.stats_get		= nfp_net_stats_get,
 	.stats_reset		= nfp_net_stats_reset,
+	.dev_infos_get		= nfp_net_infos_get,
+	.mtu_set		= nfp_net_dev_mtu_set,
+	.vlan_offload_set	= nfp_net_vlan_offload_set,
 	.reta_update		= nfp_net_reta_update,
 	.reta_query		= nfp_net_reta_query,
 	.rss_hash_update	= nfp_net_rss_hash_update,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v10 7/8] nfp: link status change interrupt support
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
                   ` (5 preceding siblings ...)
  2015-11-30 10:25 ` [PATCH v10 6/8] nfp: adding extra functionality Alejandro Lucero
@ 2015-11-30 10:25 ` Alejandro Lucero
  2015-11-30 10:25 ` [PATCH v10 8/8] nfp: adding nic guide Alejandro Lucero
  2015-12-07 23:47 ` [PATCH v10 0/8] support for netronome nfp-6xxx card Thomas Monjalon
  8 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero @ 2015-11-30 10:25 UTC (permalink / raw)
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/nfp/nfp_net.c |  123 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 123 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index ff9a8d6..bc2089f 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -73,6 +73,9 @@
 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
+static void nfp_net_dev_interrupt_handler(struct rte_intr_handle *handle,
+					  void *param);
+static void nfp_net_dev_interrupt_delayed_handler(void *param);
 static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
 static void nfp_net_infos_get(struct rte_eth_dev *dev,
 			      struct rte_eth_dev_info *dev_info);
@@ -731,6 +734,7 @@ nfp_net_close(struct rte_eth_dev *dev)
 
 	nfp_net_stop(dev);
 
+	rte_intr_disable(&dev->pci_dev->intr_handle);
 	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
 
 	/*
@@ -1115,6 +1119,114 @@ nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
 	return count;
 }
 
+static void
+nfp_net_dev_link_status_print(struct rte_eth_dev *dev)
+{
+	struct rte_eth_link link;
+
+	memset(&link, 0, sizeof(link));
+	nfp_net_dev_atomic_read_link_status(dev, &link);
+	if (link.link_status)
+		RTE_LOG(INFO, PMD, "Port %d: Link Up - speed %u Mbps - %s\n",
+			(int)(dev->data->port_id), (unsigned)link.link_speed,
+			link.link_duplex == ETH_LINK_FULL_DUPLEX
+			? "full-duplex" : "half-duplex");
+	else
+		RTE_LOG(INFO, PMD, "Port %d: Link Down\n",
+			(int)(dev->data->port_id));
+
+	RTE_LOG(INFO, PMD, "PCI Address: %04x:%02x:%02x.%d\n",
+		dev->pci_dev->addr.domain, dev->pci_dev->addr.bus,
+		dev->pci_dev->addr.devid, dev->pci_dev->addr.function);
+}
+
+/* Interrupt configuration and handling */
+
+/*
+ * nfp_net_irq_unmask - Unmask an interrupt
+ *
+ * If MSI-X auto-masking is enabled clear the mask bit, otherwise
+ * clear the ICR for the entry.
+ */
+static void
+nfp_net_irq_unmask(struct rte_eth_dev *dev)
+{
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
+		/* If MSI-X auto-masking is used, clear the entry */
+		rte_wmb();
+		rte_intr_enable(&dev->pci_dev->intr_handle);
+	} else {
+		/* Make sure all updates are written before un-masking */
+		rte_wmb();
+		nn_cfg_writeb(hw, NFP_NET_CFG_ICR(NFP_NET_IRQ_LSC_IDX),
+			      NFP_NET_CFG_ICR_UNMASKED);
+	}
+}
+
+static void
+nfp_net_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+			      void *param)
+{
+	int64_t timeout;
+	struct rte_eth_link link;
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+	PMD_DRV_LOG(DEBUG, "LSC interrupt received\n");
+
+	/* get the link status */
+	memset(&link, 0, sizeof(link));
+	nfp_net_dev_atomic_read_link_status(dev, &link);
+
+	nfp_net_link_update(dev, 0);
+
+	if (!link.link_status) {
+		/* Link likely coming up; handle it 1 sec later, once stable */
+		timeout = NFP_NET_LINK_UP_CHECK_TIMEOUT;
+	} else {
+		/* Link likely going down; handle it 4 sec later, once stable */
+		timeout = NFP_NET_LINK_DOWN_CHECK_TIMEOUT;
+	}
+
+	if (rte_eal_alarm_set(timeout * 1000,
+			      nfp_net_dev_interrupt_delayed_handler,
+			      (void *)dev) < 0) {
+		RTE_LOG(ERR, PMD, "Error setting alarm\n");
+		/* Unmasking */
+		nfp_net_irq_unmask(dev);
+	}
+}
+
+/*
+ * Interrupt handler registered as an alarm callback for delayed handling of
+ * a specific interrupt, waiting for the NIC state to become stable. As the
+ * NFP interrupt state is not stable just after the link goes down, the
+ * handler waits 4 seconds before reading a stable status.
+ *
+ * @param param    The address of parameter (struct rte_eth_dev *)
+ *
+ * @return  void
+ */
+static void
+nfp_net_dev_interrupt_delayed_handler(void *param)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+	nfp_net_link_update(dev, 0);
+	_rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC);
+
+	nfp_net_dev_link_status_print(dev);
+
+	/* Unmasking */
+	nfp_net_irq_unmask(dev);
+}
+
 static int
 nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
 {
@@ -2315,6 +2427,17 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
 		     hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
 		     hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
 
+	/* Registering LSC interrupt handler */
+	rte_intr_callback_register(&pci_dev->intr_handle,
+				   nfp_net_dev_interrupt_handler,
+				   (void *)eth_dev);
+
+	/* enable uio intr after callback register */
+	rte_intr_enable(&pci_dev->intr_handle);
+
+	/* Telling the firmware about the LSC interrupt entry */
+	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
+
 	/* Recording current stats counters values */
 	nfp_net_stats_reset(eth_dev);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v10 8/8] nfp: adding nic guide
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
                   ` (6 preceding siblings ...)
  2015-11-30 10:25 ` [PATCH v10 7/8] nfp: link status change interrupt support Alejandro Lucero
@ 2015-11-30 10:25 ` Alejandro Lucero
  2015-12-07 23:47 ` [PATCH v10 0/8] support for netronome nfp-6xxx card Thomas Monjalon
  8 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero @ 2015-11-30 10:25 UTC (permalink / raw)
  To: dev

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 MAINTAINERS               |    1 +
 doc/guides/nics/index.rst |    1 +
 doc/guides/nics/nfp.rst   |  265 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 267 insertions(+)
 create mode 100644 doc/guides/nics/nfp.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index a23de04..b5db75f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -338,6 +338,7 @@ F: drivers/crypto/qat/
 Netronome nfp
 M: Alejandro Lucero <alejandro.lucero@netronome.com>
 F: drivers/net/nfp/
+F: doc/guides/nics/nfp.rst
 
 Packet processing
 -----------------
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 0a0b724..7bf2938 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -46,6 +46,7 @@ Network Interface Controller Drivers
     intel_vf
     mlx4
     mlx5
+    nfp
     szedata2
     virtio
     vmxnet3
diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
new file mode 100644
index 0000000..55ba64d
--- /dev/null
+++ b/doc/guides/nics/nfp.rst
@@ -0,0 +1,265 @@
+..  BSD LICENSE
+    Copyright(c) 2015 Netronome Systems, Inc. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Netronome Systems, Inc. nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+NFP poll mode driver library
+============================
+
+Netronome's sixth generation of flow processors packs 216 programmable
+cores and over 100 hardware accelerators that uniquely combine packet,
+flow, security and content processing in a single device that scales
+up to 400 Gbps.
+
+This document explains how to use DPDK with the Netronome Poll Mode
+Driver (PMD) supporting Netronome's Network Flow Processor 6xxx
+(NFP-6xxx).
+
+Currently the driver supports virtual functions (VFs) only.
+
+Dependencies
+------------
+
+Before using Netronome's DPDK PMD some NFP-6xxx configuration,
+which is not related to DPDK, is required. The system requires
+installation of **Netronome's BSP (Board Support Package)**, which
+includes Linux drivers, programs and libraries.
+
+If you have an NFP-6xxx device you should already have the code and
+documentation for doing this configuration. Contact
+**support@netronome.com** to obtain the latest available firmware.
+
+The NFP Linux kernel drivers (including the required PF driver for the
+NFP) are available on Github at
+**https://github.com/Netronome/nfp-drv-kmods** along with build
+instructions.
+
+DPDK runs in userspace and PMDs use the Linux kernel UIO interface to
+allow access to physical devices from userspace. The NFP PMD requires
+a separate UIO driver, **nfp_uio**, to perform correct
+initialization. This driver is part of Netronome's BSP and it is
+equivalent to Intel's igb_uio driver.
+
+Building the software
+---------------------
+
+Netronome's PMD code is provided in the **drivers/net/nfp** directory.
+Because of the Netronome BSP dependencies, the driver is disabled by
+default in the DPDK build via the **common_linuxapp** configuration file.
+To enable the driver, or if you use another configuration file and want
+NFP support, this variable is needed:
+
+- **CONFIG_RTE_LIBRTE_NFP_PMD=y**
+
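+For example, the option may be flipped in a local tree like this (a sketch,
+assuming the option is present in the file and currently set to ``n``;
+editing the file by hand works equally well):
+
+.. code-block:: console
+
+   sed -i 's/CONFIG_RTE_LIBRTE_NFP_PMD=n/CONFIG_RTE_LIBRTE_NFP_PMD=y/' config/common_linuxapp
+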
+Once DPDK is built all the DPDK apps and examples include support for
+the NFP PMD.
+
+
+System configuration
+--------------------
+
+Using the NFP PMD is no different from using other PMDs. The usual steps are:
+
+#. **Configure hugepages:** All major Linux distributions have the hugepages
+   functionality enabled by default, but this only covers transparent
+   hugepages. Some hugepages need to be created/reserved for use with DPDK
+   through the hugetlbfs file system. First the virtual file system needs to
+   be mounted:
+
+   .. code-block:: console
+
+      mount -t hugetlbfs none /mnt/hugetlbfs
+
+   The command uses the common mount point for this file system, which needs
+   to be created first if it does not exist.
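+
+   A minimal sketch, assuming the conventional mount point used above:
+
+   .. code-block:: console
+
+      mkdir -p /mnt/hugetlbfs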
+
+   Configuring hugepages is performed via sysfs:
+
+   .. code-block:: console
+
+      /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+   This sysfs file is used to specify the number of hugepages to reserve.
+   For example:
+
+   .. code-block:: console
+
+      echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+   This will reserve 2GB of memory using 1024 2MB hugepages. The file may be
+   read to see if the operation was performed correctly:
+
+   .. code-block:: console
+
+      cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+   The number of unused hugepages may also be inspected; before executing
+   the DPDK app it should match the value of nr_hugepages:
+
+   .. code-block:: console
+
+      cat /sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages
+
+   The hugepages reservation should be performed at system initialisation and
+   it is usual to use a kernel parameter for configuration. If the reservation
+   is attempted on a busy system it will likely fail. Reserving memory for
+   hugepages may be done by adding the following to the grub kernel command
+   line:
+
+   .. code-block:: console
+
+      default_hugepagesz=2M hugepagesz=2M hugepages=1024
+
+   This will reserve 2 GB of memory using 2 MB hugepages.
+
+   Finally, for a NUMA system the allocation needs to be made on the correct
+   NUMA node. In a DPDK app there is a master core which will (usually) perform
+   memory allocation. It is important that some of the hugepages are reserved
+   on the NUMA memory node where the network device is attached. This is
+   because of a restriction in DPDK by which TX and RX descriptor rings must
+   be created by the master core.
+
+   Per-node allocation of hugepages may be inspected and controlled using sysfs.
+   For example:
+
+   .. code-block:: console
+
+      cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+
+   For a NUMA system there will be a specific hugepage directory per node
+   allowing control of hugepage reservation. A common problem may occur when
+   hugepages reservation is performed after the system has been working for
+   some time. Configuration using the global sysfs hugepage interface will
+   succeed but the per-node allocations may be unsatisfactory.
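+
+   Per-node reservation uses the same sysfs attribute shown above; for
+   example (the count here is illustrative):
+
+   .. code-block:: console
+
+      echo 512 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages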
+
+   The number of hugepages that need to be reserved depends on how the app
+   uses TX and RX descriptors, and packet mbufs.
+
+#. **Enable SR-IOV on the NFP-6xxx device:** The current NFP PMD works with
+   Virtual Functions (VFs) on an NFP device. Make sure that one of the Physical
+   Function (PF) drivers from the above Github repository is installed and
+   loaded.
+
+   Virtual Functions need to be enabled before they can be used with the PMD.
+   Before enabling the VFs it is useful to obtain information about the
+   current NFP PCI device detected by the system:
+
+   .. code-block:: console
+
+      lspci -d19ee:
+
+   Now, for example, configure two virtual functions on an NFP-6xxx device
+   whose PCI system identity is "0000:03:00.0":
+
+   .. code-block:: console
+
+      echo 2 > /sys/bus/pci/devices/0000:03:00.0/sriov_numvfs
+
+   The result of this command may be shown using lspci again:
+
+   .. code-block:: console
+
+      lspci -d19ee: -k
+
+   Two new PCI devices should appear in the output of the above command. The
+   -k option shows the device driver, if any, that devices are bound to.
+   Depending on the modules loaded at this point, the new PCI devices may be
+   bound to the nfp_netvf driver.
+
+#. **To install the uio kernel module (manually):** All major Linux
+   distributions have support for this kernel module so it is straightforward
+   to install it:
+
+   .. code-block:: console
+
+      modprobe uio
+
+   The module should now be listed by the lsmod command.
+
+#. **To install the nfp_uio kernel module (manually):** This module supports
+   NFP-6xxx devices through the UIO interface.
+
+   This module is part of Netronome's BSP and it should be available when the
+   BSP is installed.
+
+   .. code-block:: console
+
+      modprobe nfp_uio
+
+   The module should now be listed by the lsmod command.
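+
+   For example, filtering the module listing as a quick sanity check:
+
+   .. code-block:: console
+
+      lsmod | grep nfp_uio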
+
+   Depending on which NFP modules are loaded, nfp_uio may be automatically
+   bound to the NFP PCI devices by the system. Otherwise the binding needs
+   to be done explicitly. This is the case when nfp_netvf, the Linux kernel
+   driver for NFP VFs, was loaded when VFs were created. As described later
+   in this document, this configuration may also be performed using scripts
+   provided by Netronome's BSP.
+
+   First the device needs to be unbound, for example from the nfp_netvf
+   driver:
+
+   .. code-block:: console
+
+      echo 0000:03:08.0 > /sys/bus/pci/devices/0000:03:08.0/driver/unbind
+
+      lspci -d19ee: -k
+
+   The output of lspci should now show that 0000:03:08.0 is not bound to
+   any driver.
+
+   The next step is to add the NFP PCI ID to the NFP UIO driver:
+
+   .. code-block:: console
+
+      echo 19ee 6003 > /sys/bus/pci/drivers/nfp_uio/new_id
+
+   And then to bind the device to the nfp_uio driver:
+
+   .. code-block:: console
+
+      echo 0000:03:08.0 > /sys/bus/pci/drivers/nfp_uio/bind
+
+      lspci -d19ee: -k
+
+   lspci should now show the device bound to the nfp_uio driver.
+
+#. **Using tools from Netronome's BSP to install and bind modules:** DPDK
+   provides scripts which are useful for installing the UIO modules and for
+   binding the right devices to those modules, avoiding the need to do so
+   manually. However, these scripts do not support Netronome's UIO driver.
+   Along with the drivers, the BSP installs slightly modified versions of
+   those DPDK scripts with support for Netronome's UIO driver.
+
+   These scripts can be found in Netronome's BSP installation directory.
+   Refer to the BSP documentation for more information.
+
+   * **setup.sh**
+   * **dpdk_nic_bind.py**
+
+   Configuration may be performed by running setup.sh which invokes
+   dpdk_nic_bind.py as needed. Executing setup.sh will display a menu of
+   configuration options.
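+
+   The binding script may also be invoked directly. A sketch, assuming the
+   BSP-modified script keeps the stock DPDK script's options and that the
+   nfp_uio module is already loaded:
+
+   .. code-block:: console
+
+      ./dpdk_nic_bind.py --status
+      ./dpdk_nic_bind.py --bind=nfp_uio 0000:03:08.0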
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v10 0/8] support for netronome nfp-6xxx card
  2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
                   ` (7 preceding siblings ...)
  2015-11-30 10:25 ` [PATCH v10 8/8] nfp: adding nic guide Alejandro Lucero
@ 2015-12-07 23:47 ` Thomas Monjalon
  8 siblings, 0 replies; 10+ messages in thread
From: Thomas Monjalon @ 2015-12-07 23:47 UTC (permalink / raw)
  To: Alejandro Lucero; +Cc: dev

2015-11-30 10:25, Alejandro Lucero:
> This patchset adds a new PMD for Netronome nfp-6xxx card.
> Just PCI Virtual Functions support.
> Using this PMD requires previous Netronome BSP installation.

There is no risk to merge this driver disabled by default.

Applied, thanks

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-12-07 23:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-30 10:25 [PATCH v10 0/8] support for netronome nfp-6xxx card Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 1/8] nfp: basic initialization Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 2/8] nfp: adding rx/tx functionality Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 3/8] nfp: adding rss Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 4/8] nfp: adding stats Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 5/8] nfp: adding link functionality Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 6/8] nfp: adding extra functionality Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 7/8] nfp: link status change interrupt support Alejandro Lucero
2015-11-30 10:25 ` [PATCH v10 8/8] nfp: adding nic guide Alejandro Lucero
2015-12-07 23:47 ` [PATCH v10 0/8] support for netronome nfp-6xxx card Thomas Monjalon
