* [PATCH 0/2] slow data path communication between DPDK port and Linux
@ 2016-01-27 16:32 Ferruh Yigit
  2016-01-27 16:32 ` [PATCH 1/2] kdp: add kernel data path kernel module Ferruh Yigit
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-01-27 16:32 UTC (permalink / raw)
  To: dev

This is a slow data path communication implementation based on the existing KNI.

The differences are: librte_kni is converted into a PMD, and the kdp kernel
module is almost the same as the KNI module, except that all control path
functionality is removed and some simplification is done.

The motivation is to simplify slow path data communication.
Now any application can use this new PMD to send data to and receive data
from the Linux kernel.

PMD supports two communication methods:

1) KDP kernel module
The PMD initialization functions handle creating the virtual interfaces (with
the help of the kdp kernel module) and creating the FIFOs. The FIFOs are used
to share data between userspace and kernelspace. This is the default method.

2) tun/tap module
When the KDP module is not inserted, the PMD creates a tap interface and
transfers packets over that tap interface.
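
The choice between the two methods can be pictured roughly as below. This is
only a sketch under assumptions, not the code from the patch: the enum and
the function name pick_backend() are hypothetical, and the selection rule
(try the KDP character device, fall back to the tun/tap clone device) is
inferred from the description above. The /dev/kdp path matches the device
name used by the kernel module; /dev/net/tun is the standard Linux tun/tap
clone device.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical names, for illustration only; not taken from the patch. */
enum kdp_backend { KDP_BACKEND_KMOD, KDP_BACKEND_TAP, KDP_BACKEND_NONE };

/*
 * Try the KDP character device first. If it cannot be opened (for example
 * because the module is not inserted), fall back to the tun/tap clone device.
 * On return *fd holds the open descriptor, or -1 if neither could be opened.
 */
static enum kdp_backend
pick_backend(const char *kdp_path, const char *tun_path, int *fd)
{
	assert(fd != NULL);

	*fd = open(kdp_path, O_RDWR);
	if (*fd >= 0)
		return KDP_BACKEND_KMOD;

	*fd = open(tun_path, O_RDWR);
	if (*fd >= 0)
		return KDP_BACKEND_TAP;

	return KDP_BACKEND_NONE;
}
```

A caller would pass "/dev/kdp" and "/dev/net/tun" and close the returned
descriptor when done. Taking the paths as parameters keeps the sketch easy to
exercise without the module or root privileges.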

In the long term this patch intends to replace KNI, and KNI will be
deprecated.

Sample usage:
1) Transfer any packet received from a NIC bound to DPDK to the Linux kernel

a) insert kdp kernel module
insmod build/kmod/rte_kdp.ko

b) bind NIC to the DPDK using dpdk_nic_bind.py

c) ./testpmd --vdev eth_kdp0

c1) testpmd shows two ports, one physical and the other virtual
...
Configuring Port 0 (socket 0)
Port 0: 00:00:00:00:00:00
Configuring Port 1 (socket 0)
...
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
Done

c2) This will create the "kdp0" Linux interface
$ ip l show kdp0
21: kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff

d) The Linux interface can now be used for data traffic

d1)
$ ifconfig kdp0 1.0.0.2
$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
64 bytes from 1.0.0.1: icmp_seq=1 ttl=64 time=0.789 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=64 time=0.881 ms

d2)
$ tcpdump -nn -i kdp0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on kdp0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:01:22.407506 IP 1.0.0.1 > 1.0.0.2: ICMP echo request, id 40016, seq 18, length 64
15:01:22.408521 IP 1.0.0.2 > 1.0.0.1: ICMP echo reply, id 40016, seq 18, length 64



2) Data traveling between virtual Linux interfaces passes through the DPDK
application, and the application can alter the data

a) insert kdp kernel module
insmod build/kmod/rte_kdp.ko

b) No physical NIC involved

c) ./testpmd --vdev eth_kdp0 --vdev eth_kdp1

c1) testpmd shows two ports, both of them virtual
...
Configuring Port 0 (socket 0)
Port 0: 00:00:00:00:00:00
Configuring Port 1 (socket 0)
Port 1: 00:00:00:00:00:00
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
Done

c2) This will create the "kdp0" and "kdp1" Linux interfaces
$ ip l show kdp0; ip l show kdp1
22: kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
23: kdp1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff

d) Data traveling between the virtual ports passes through the DPDK application
$ ifconfig kdp0 1.0.0.1
$ ifconfig kdp1 1.0.0.2

d1)
$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
64 bytes from 1.0.0.1: icmp_seq=1 ttl=64 time=3.57 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=64 time=1.85 ms
64 bytes from 1.0.0.1: icmp_seq=3 ttl=64 time=1.89 ms

d2)
$ tcpdump -nn -i kdp0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on kdp0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:20:51.908543 IP 1.0.0.2 > 1.0.0.1: ICMP echo request, id 41234, seq 1, length 64
15:20:51.909570 IP 1.0.0.1 > 1.0.0.2: ICMP echo reply, id 41234, seq 1, length 64
15:20:52.909551 IP 1.0.0.2 > 1.0.0.1: ICMP echo request, id 41234, seq 2, length 64
15:20:52.910577 IP 1.0.0.1 > 1.0.0.2: ICMP echo reply, id 41234, seq 2, length 64



3) tun/tap interface usage

a) No external module is required; only tun/tap support in the kernel is needed

b) ./testpmd --vdev eth_kdp0 --vdev eth_kdp1

b1) This will create the "tap_kdp0" and "tap_kdp1" Linux interfaces
$ ip l show tap_kdp0; ip l show tap_kdp1
25: tap_kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
    link/ether 56:47:97:9c:03:8e brd ff:ff:ff:ff:ff:ff
26: tap_kdp1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
    link/ether 5e:15:22:b0:52:42 brd ff:ff:ff:ff:ff:ff

Ferruh Yigit (2):
  kdp: add kernel data path kernel module
  kdp: add virtual PMD for kernel slow data path communication

 config/common_linuxapp                             |   9 +-
 doc/guides/nics/pcap_ring.rst                      | 125 ++++-
 doc/guides/rel_notes/release_2_3.rst               |   6 +
 drivers/net/Makefile                               |   3 +-
 drivers/net/kdp/Makefile                           |  61 +++
 drivers/net/kdp/rte_eth_kdp.c                      | 481 +++++++++++++++++
 drivers/net/kdp/rte_kdp.c                          | 365 +++++++++++++
 drivers/net/kdp/rte_kdp.h                          | 126 +++++
 drivers/net/kdp/rte_kdp_fifo.h                     |  91 ++++
 drivers/net/kdp/rte_kdp_tap.c                      |  96 ++++
 drivers/net/kdp/rte_pmd_kdp_version.map            |   4 +
 lib/librte_eal/common/include/rte_log.h            |   3 +-
 lib/librte_eal/linuxapp/Makefile                   |   5 +-
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 .../linuxapp/eal/include/exec-env/rte_kdp_common.h | 143 +++++
 lib/librte_eal/linuxapp/kdp/Makefile               |  56 ++
 lib/librte_eal/linuxapp/kdp/kdp_dev.h              |  82 +++
 lib/librte_eal/linuxapp/kdp/kdp_fifo.h             |  91 ++++
 lib/librte_eal/linuxapp/kdp/kdp_misc.c             | 463 +++++++++++++++++
 lib/librte_eal/linuxapp/kdp/kdp_net.c              | 573 +++++++++++++++++++++
 mk/rte.app.mk                                      |   3 +-
 21 files changed, 2780 insertions(+), 9 deletions(-)
 create mode 100644 drivers/net/kdp/Makefile
 create mode 100644 drivers/net/kdp/rte_eth_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.h
 create mode 100644 drivers/net/kdp/rte_kdp_fifo.h
 create mode 100644 drivers/net/kdp/rte_kdp_tap.c
 create mode 100644 drivers/net/kdp/rte_pmd_kdp_version.map
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_fifo.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_misc.c
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_net.c

-- 
2.5.0


* [PATCH 1/2] kdp: add kernel data path kernel module
  2016-01-27 16:32 [PATCH 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
@ 2016-01-27 16:32 ` Ferruh Yigit
  2016-02-08 17:14   ` Reshma Pattan
  2016-01-27 16:32 ` [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
  2016-02-19  5:05 ` [PATCH v2 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2 siblings, 1 reply; 29+ messages in thread
From: Ferruh Yigit @ 2016-01-27 16:32 UTC (permalink / raw)
  To: dev

This kernel module is based on the KNI module, but it is a stripped-down
version that handles only data messages; no control functionality is
provided.

The FIFO implementation of KNI is kept exactly the same, but the
ethtool-related code is removed and the virtual network management code is
simplified.

This module contains kernel support for creating network devices, plus a
simple driver for the virtual network device. The driver simply puts/gets
packets to/from a FIFO instead of real hardware.

The FIFO is created and owned by the userspace application, which in this
case is the KDP PMD.
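
The FIFO semantics can be illustrated with a small userspace model. This is
a sketch only: the struct and function names are hypothetical, the field
names follow rte_kdp_fifo from this patch, and the memory ordering that a
real ring shared between a kernel thread and a userspace thread needs is
deliberately left out. The length must be a power of two so that the index
can wrap with a mask, and one slot always stays empty so that
"write == read" unambiguously means empty.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model of the shared ring (illustrative, not the patch code). */
struct fifo {
	volatile unsigned write; /* next slot to be written */
	volatile unsigned read;  /* next slot to be read */
	unsigned len;            /* number of slots, a power of two */
	void *buffer[];          /* slots hold mbuf pointers */
};

static struct fifo *
fifo_create(unsigned len)
{
	struct fifo *f;

	/* The index mask (len - 1) only works for powers of two. */
	assert(len >= 2 && (len & (len - 1)) == 0);

	f = calloc(1, sizeof(*f) + len * sizeof(void *));
	if (f != NULL)
		f->len = len;
	return f;
}

/* Add up to num elements; return how many were actually written. */
static unsigned
fifo_put(struct fifo *f, void **data, unsigned num)
{
	unsigned i, next, w = f->write;

	for (i = 0; i < num; i++) {
		next = (w + 1) & (f->len - 1);
		if (next == f->read) /* full: one slot is always left empty */
			break;
		f->buffer[w] = data[i];
		w = next;
	}
	f->write = w;
	return i;
}

/* Remove up to num elements; return how many were actually read. */
static unsigned
fifo_get(struct fifo *f, void **data, unsigned num)
{
	unsigned i, r = f->read;

	for (i = 0; i < num; i++) {
		if (r == f->write) /* empty */
			break;
		data[i] = f->buffer[r];
		r = (r + 1) & (f->len - 1);
	}
	f->read = r;
	return i;
}
```

With a ring of 4 slots, a put of 4 pointers stores only 3, because the
usable capacity is len - 1.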

In the long term this patch intends to replace KNI, and KNI will be
deprecated.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 config/common_linuxapp                             |   8 +-
 lib/librte_eal/linuxapp/Makefile                   |   5 +-
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 .../linuxapp/eal/include/exec-env/rte_kdp_common.h | 143 +++++
 lib/librte_eal/linuxapp/kdp/Makefile               |  56 ++
 lib/librte_eal/linuxapp/kdp/kdp_dev.h              |  82 +++
 lib/librte_eal/linuxapp/kdp/kdp_fifo.h             |  91 ++++
 lib/librte_eal/linuxapp/kdp/kdp_misc.c             | 463 +++++++++++++++++
 lib/librte_eal/linuxapp/kdp/kdp_net.c              | 573 +++++++++++++++++++++
 9 files changed, 1421 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_fifo.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_misc.c
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_net.c

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..73c91d8 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -320,6 +320,12 @@ CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
 CONFIG_RTE_LIBRTE_PMD_NULL=y
 
 #
+# Compile KDP PMD
+#
+CONFIG_RTE_KDP_KMOD=y
+CONFIG_RTE_KDP_PREEMPT_DEFAULT=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index d9c5233..e3f91a7 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -38,6 +38,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
 ifeq ($(CONFIG_RTE_KNI_KMOD),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
 endif
+ifeq ($(CONFIG_RTE_KDP_KMOD),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kdp
+endif
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0
 endif
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 26eced5..ac72aea 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -116,6 +116,7 @@ CFLAGS_eal_thread.o += -Wno-return-type
 endif
 
 INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h
+INC += rte_kdp_common.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP)-include/exec-env := \
 	$(addprefix include/exec-env/,$(INC))
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
new file mode 100644
index 0000000..0c77f58
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
@@ -0,0 +1,143 @@
+/*-
+ *   This file is provided under a dual BSD/LGPLv2 license.  When using or
+ *   redistributing this file, you may do so under either license.
+ *
+ *   GNU LESSER GENERAL PUBLIC LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2.1 of the GNU Lesser General Public License
+ *   as published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   Lesser General Public License for more details.
+ *
+ *   You should have received a copy of the GNU Lesser General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ *
+ *
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in
+ *     the documentation and/or other materials provided with the
+ *     distribution.
+ *   * Neither the name of Intel Corporation nor the names of its
+ *     contributors may be used to endorse or promote products derived
+ *     from this software without specific prior written permission.
+ *
+ *    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#ifndef _RTE_KDP_COMMON_H_
+#define _RTE_KDP_COMMON_H_
+
+#ifdef __KERNEL__
+#include <linux/if.h>
+#endif
+
+/**
+ * KDP name is part of memzone name.
+ */
+#define RTE_KDP_NAMESIZE 32
+
+#ifndef RTE_CACHE_LINE_SIZE
+#define RTE_CACHE_LINE_SIZE 64       /**< Cache line size. */
+#endif
+
+/*
+ * Fifo struct mapped in a shared memory. It describes a circular buffer FIFO
+ * Write and read should wrap around. Fifo is empty when write == read
+ * Writing should never overwrite the read position
+ */
+struct rte_kdp_fifo {
+	volatile unsigned write;     /**< Next position to be written */
+	volatile unsigned read;      /**< Next position to be read */
+	unsigned len;                /**< Circular buffer length */
+	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
+	void * volatile buffer[0];   /**< The buffer contains mbuf pointers */
+};
+
+/*
+ * The kernel image of the rte_mbuf struct, with only the relevant fields.
+ * Padding is necessary to assure the offsets of these fields
+ */
+struct rte_kdp_mbuf {
+	void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
+	char pad0[10];
+
+	/**< Start address of data in segment buffer. */
+	uint16_t data_off;
+	char pad1[4];
+	uint64_t ol_flags;      /**< Offload features. */
+	char pad2[4];
+
+	/**< Total pkt len: sum of all segment data_len. */
+	uint32_t pkt_len;
+
+	/**< Amount of data in segment buffer. */
+	uint16_t data_len;
+
+	/* fields on second cache line */
+	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
+	void *pool;
+	void *next;
+};
+
+/*
+ * Struct used to create a KDP device. Passed to the kernel in IOCTL call
+ */
+struct rte_kdp_device_info {
+	char name[RTE_KDP_NAMESIZE];  /**< Network device name for KDP */
+
+	phys_addr_t tx_phys;
+	phys_addr_t rx_phys;
+	phys_addr_t alloc_phys;
+	phys_addr_t free_phys;
+
+	/* mbuf mempool */
+	void *mbuf_va;
+	phys_addr_t mbuf_phys;
+
+	uint16_t group_id;            /**< Group ID */
+	uint32_t core_id;             /**< core ID to bind for kernel thread */
+
+	uint8_t force_bind : 1;       /**< Flag for kernel thread binding */
+
+	/* mbuf size */
+	unsigned mbuf_size;
+};
+
+#define KDP_DEVICE "kdp"
+
+#define RTE_KDP_IOCTL_TEST    _IOWR(0, 1, int)
+#define RTE_KDP_IOCTL_CREATE  _IOWR(0, 2, struct rte_kdp_device_info)
+#define RTE_KDP_IOCTL_RELEASE _IOWR(0, 3, struct rte_kdp_device_info)
+
+#endif /* _RTE_KDP_COMMON_H_ */
diff --git a/lib/librte_eal/linuxapp/kdp/Makefile b/lib/librte_eal/linuxapp/kdp/Makefile
new file mode 100644
index 0000000..764f6a8
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = rte_kdp
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR) --param max-inline-insns-single=50
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+MODULE_CFLAGS += -Wall -Werror
+
+# this lib needs main eal
+DEPDIRS-y += lib/librte_eal/linuxapp/eal
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y += kdp_misc.c
+SRCS-y += kdp_net.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_dev.h b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
new file mode 100644
index 0000000..52952b4
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
@@ -0,0 +1,82 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KDP_DEV_H_
+#define _KDP_DEV_H_
+
+#include <exec-env/rte_kdp_common.h>
+
+/**
+ * A structure describing the private information for a kdp device.
+ */
+struct kdp_dev {
+	/* kdp list */
+	struct list_head list;
+
+	struct net_device_stats stats;
+	uint16_t group_id;           /* Group ID of a group of KDP devices */
+	unsigned core_id;            /* Core ID to bind */
+	char name[RTE_KDP_NAMESIZE]; /* Network device name */
+	struct task_struct *pthread;
+
+	/* wait queue for req/resp */
+	wait_queue_head_t wq;
+	struct mutex sync_lock;
+
+	/* kdp device */
+	struct net_device *net_dev;
+
+	/* queue for packets to be sent out */
+	void *tx_q;
+
+	/* queue for the packets received */
+	void *rx_q;
+
+	/* queue for the allocated mbufs those can be used to save sk buffs */
+	void *alloc_q;
+
+	/* free queue for the mbufs to be freed */
+	void *free_q;
+
+	void *sync_kva;
+	void *sync_va;
+
+	void *mbuf_kva;
+	void *mbuf_va;
+
+	/* mbuf size */
+	unsigned mbuf_size;
+};
+
+void kdp_net_rx(struct kdp_dev *kdp);
+void kdp_net_init(struct net_device *dev);
+void kdp_net_config_lo_mode(char *lo_str);
+
+#define KDP_ERR(args...) printk(KERN_DEBUG "KDP: Error: " args)
+#define KDP_PRINT(args...) printk(KERN_DEBUG "KDP: " args)
+
+#ifdef RTE_KDP_KO_DEBUG
+#define KDP_DBG(args...) printk(KERN_DEBUG "KDP: " args)
+#else
+#define KDP_DBG(args...)
+#endif
+
+#endif
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_fifo.h b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
new file mode 100644
index 0000000..a5fe080
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
@@ -0,0 +1,91 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KDP_FIFO_H_
+#define _KDP_FIFO_H_
+
+#include <exec-env/rte_kdp_common.h>
+
+/**
+ * Adds num elements into the fifo. Return the number actually written
+ */
+static inline unsigned
+kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned fifo_write = fifo->write;
+	unsigned fifo_read = fifo->read;
+	unsigned new_write = fifo_write;
+
+	for (i = 0; i < num; i++) {
+		new_write = (new_write + 1) & (fifo->len - 1);
+
+		if (new_write == fifo_read)
+			break;
+		fifo->buffer[fifo_write] = data[i];
+		fifo_write = new_write;
+	}
+	fifo->write = fifo_write;
+
+	return i;
+}
+
+/**
+ * Get up to num elements from the fifo. Return the number actually read
+ */
+static inline unsigned
+kdp_fifo_get(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned new_read = fifo->read;
+	unsigned fifo_write = fifo->write;
+
+	for (i = 0; i < num; i++) {
+		if (new_read == fifo_write)
+			break;
+
+		data[i] = fifo->buffer[new_read];
+		new_read = (new_read + 1) & (fifo->len - 1);
+	}
+	fifo->read = new_read;
+
+	return i;
+}
+
+/**
+ * Get the num of elements in the fifo
+ */
+static inline unsigned
+kdp_fifo_count(struct rte_kdp_fifo *fifo)
+{
+	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+}
+
+/**
+ * Get the num of available elements in the fifo
+ */
+static inline unsigned
+kdp_fifo_free_count(struct rte_kdp_fifo *fifo)
+{
+	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+}
+
+#endif /* _KDP_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_misc.c b/lib/librte_eal/linuxapp/kdp/kdp_misc.c
new file mode 100644
index 0000000..d97d1c0
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_misc.c
@@ -0,0 +1,463 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include <linux/version.h>
+#include <linux/miscdevice.h>
+#include <linux/netdevice.h>
+#include <linux/pci.h>
+#include <linux/kthread.h>
+#include <net/netns/generic.h>
+
+#include "kdp_dev.h"
+
+#define KDP_RX_LOOP_NUM 1000
+#define KDP_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
+#define KDP_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
+
+static unsigned long device_in_use; /* device in use flag */
+static struct task_struct *kdp_kthread;
+static struct rw_semaphore kdp_list_lock;
+static struct list_head kdp_list_head;
+
+/* loopback mode */
+static char *lo_mode;
+
+/* Kernel thread mode */
+static char *kthread_mode;
+static unsigned multiple_kthread_on;
+
+static int
+kdp_thread_single(void *data)
+{
+	struct kdp_dev *dev;
+	int j;
+
+	while (!kthread_should_stop()) {
+		down_read(&kdp_list_lock);
+		for (j = 0; j < KDP_RX_LOOP_NUM; j++) {
+			list_for_each_entry(dev, &kdp_list_head, list) {
+				kdp_net_rx(dev);
+			}
+		}
+		up_read(&kdp_list_lock);
+#ifdef RTE_KDP_PREEMPT_DEFAULT
+		/* reschedule out for a while */
+		schedule_timeout_interruptible(
+			usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL));
+#endif
+	}
+
+	return 0;
+}
+
+static int
+kdp_thread_multiple(void *param)
+{
+	int j;
+	struct kdp_dev *dev = (struct kdp_dev *)param;
+
+	while (!kthread_should_stop()) {
+		for (j = 0; j < KDP_RX_LOOP_NUM; j++)
+			kdp_net_rx(dev);
+
+#ifdef RTE_KDP_PREEMPT_DEFAULT
+		schedule_timeout_interruptible(
+			usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL));
+#endif
+	}
+
+	return 0;
+}
+
+static int
+kdp_dev_remove(struct kdp_dev *dev)
+{
+	if (!dev)
+		return -ENODEV;
+
+	if (dev->net_dev) {
+		unregister_netdev(dev->net_dev);
+		free_netdev(dev->net_dev);
+	}
+
+	return 0;
+}
+
+static int
+kdp_check_param(struct kdp_dev *kdp, struct rte_kdp_device_info *dev)
+{
+	if (!kdp || !dev)
+		return -1;
+
+	/* Check if network name has been used */
+	if (!strncmp(kdp->name, dev->name, RTE_KDP_NAMESIZE)) {
+		KDP_ERR("KDP name %s duplicated\n", dev->name);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+kdp_ioctl_create(unsigned int ioctl_num, unsigned long ioctl_param)
+{
+	int ret;
+	struct rte_kdp_device_info dev_info;
+	struct net_device *net_dev = NULL;
+	struct kdp_dev *kdp, *dev, *n;
+
+	printk(KERN_INFO "KDP: Creating kdp...\n");
+	/* Check the buffer size, to avoid warning */
+	if (_IOC_SIZE(ioctl_num) > sizeof(dev_info))
+		return -EINVAL;
+
+	/* Copy kdp info from user space */
+	ret = copy_from_user(&dev_info, (void *)ioctl_param, sizeof(dev_info));
+	if (ret) {
+		KDP_ERR("copy_from_user in kdp_ioctl_create");
+		return -EIO;
+	}
+
+	/**
+	 * Check if the cpu core id is valid for binding,
+	 * for multiple kernel thread mode.
+	 */
+	if (multiple_kthread_on && dev_info.force_bind &&
+				!cpu_online(dev_info.core_id)) {
+		KDP_ERR("cpu %u is not online\n", dev_info.core_id);
+		return -EINVAL;
+	}
+
+	/* Check if it has been created */
+	down_read(&kdp_list_lock);
+	list_for_each_entry_safe(dev, n, &kdp_list_head, list) {
+		if (kdp_check_param(dev, &dev_info) < 0) {
+			up_read(&kdp_list_lock);
+			return -EINVAL;
+		}
+	}
+	up_read(&kdp_list_lock);
+
+	net_dev = alloc_netdev(sizeof(struct kdp_dev), dev_info.name,
+#ifdef NET_NAME_UNKNOWN
+							NET_NAME_UNKNOWN,
+#endif
+							kdp_net_init);
+	if (net_dev == NULL) {
+		KDP_ERR("error allocating device \"%s\"\n", dev_info.name);
+		return -EBUSY;
+	}
+
+	kdp = netdev_priv(net_dev);
+
+	kdp->net_dev = net_dev;
+	kdp->group_id = dev_info.group_id;
+	kdp->core_id = dev_info.core_id;
+	strncpy(kdp->name, dev_info.name, RTE_KDP_NAMESIZE);
+
+	/* Translate user space info into kernel space info */
+	kdp->tx_q = phys_to_virt(dev_info.tx_phys);
+	kdp->rx_q = phys_to_virt(dev_info.rx_phys);
+	kdp->alloc_q = phys_to_virt(dev_info.alloc_phys);
+	kdp->free_q = phys_to_virt(dev_info.free_phys);
+
+	kdp->mbuf_kva = phys_to_virt(dev_info.mbuf_phys);
+	kdp->mbuf_va = dev_info.mbuf_va;
+
+	kdp->mbuf_size = dev_info.mbuf_size;
+
+	KDP_PRINT("tx_phys:      0x%016llx, tx_q addr:      0x%p\n",
+		(unsigned long long) dev_info.tx_phys, kdp->tx_q);
+	KDP_PRINT("rx_phys:      0x%016llx, rx_q addr:      0x%p\n",
+		(unsigned long long) dev_info.rx_phys, kdp->rx_q);
+	KDP_PRINT("alloc_phys:   0x%016llx, alloc_q addr:   0x%p\n",
+		(unsigned long long) dev_info.alloc_phys, kdp->alloc_q);
+	KDP_PRINT("free_phys:    0x%016llx, free_q addr:    0x%p\n",
+		(unsigned long long) dev_info.free_phys, kdp->free_q);
+	KDP_PRINT("mbuf_phys:    0x%016llx, mbuf_kva:       0x%p\n",
+		(unsigned long long) dev_info.mbuf_phys, kdp->mbuf_kva);
+	KDP_PRINT("mbuf_va:      0x%p\n", dev_info.mbuf_va);
+	KDP_PRINT("mbuf_size:    %u\n", kdp->mbuf_size);
+
+	ret = register_netdev(net_dev);
+	if (ret) {
+		KDP_ERR("error %i registering device \"%s\"\n",
+					ret, dev_info.name);
+		kdp_dev_remove(kdp);
+		return -ENODEV;
+	}
+
+	/**
+	 * Create a new kernel thread for multiple mode, set its core affinity,
+	 * and finally wake it up.
+	 */
+	if (multiple_kthread_on) {
+		kdp->pthread = kthread_create(kdp_thread_multiple,
+					      (void *)kdp,
+					      "kdp_%s", kdp->name);
+		if (IS_ERR(kdp->pthread)) {
+			kdp_dev_remove(kdp);
+			return -ECANCELED;
+		}
+		if (dev_info.force_bind)
+			kthread_bind(kdp->pthread, kdp->core_id);
+		wake_up_process(kdp->pthread);
+	}
+
+	down_write(&kdp_list_lock);
+	list_add(&kdp->list, &kdp_list_head);
+	up_write(&kdp_list_lock);
+
+	return 0;
+}
+
+static int
+kdp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
+{
+	int ret = -EINVAL;
+	struct kdp_dev *dev, *n;
+	struct rte_kdp_device_info dev_info;
+
+	if (_IOC_SIZE(ioctl_num) > sizeof(dev_info))
+		return -EINVAL;
+
+	ret = copy_from_user(&dev_info, (void *)ioctl_param, sizeof(dev_info));
+	if (ret) {
+		KDP_ERR("copy_from_user in kdp_ioctl_release");
+		return -EIO;
+	}
+
+	/* Release the network device according to its name */
+	if (strlen(dev_info.name) == 0)
+		return ret;
+
+	down_write(&kdp_list_lock);
+	list_for_each_entry_safe(dev, n, &kdp_list_head, list) {
+		if (strncmp(dev->name, dev_info.name, RTE_KDP_NAMESIZE) != 0)
+			continue;
+
+		if (multiple_kthread_on && dev->pthread != NULL) {
+			kthread_stop(dev->pthread);
+			dev->pthread = NULL;
+		}
+
+		kdp_dev_remove(dev);
+		list_del(&dev->list);
+		ret = 0;
+		break;
+	}
+	up_write(&kdp_list_lock);
+	printk(KERN_INFO "KDP: %s release kdp named %s\n",
+		(ret == 0 ? "Successfully" : "Unsuccessfully"), dev_info.name);
+
+	return ret;
+}
+
+static int
+kdp_ioctl(struct inode *inode, unsigned int ioctl_num,
+	unsigned long ioctl_param)
+{
+	int ret = -EINVAL;
+
+	KDP_DBG("IOCTL num=0x%0x param=0x%0lx\n", ioctl_num, ioctl_param);
+
+	/*
+	 * Switch according to the ioctl called
+	 */
+	switch (_IOC_NR(ioctl_num)) {
+	case _IOC_NR(RTE_KDP_IOCTL_TEST):
+		/* For test only, not used */
+		break;
+	case _IOC_NR(RTE_KDP_IOCTL_CREATE):
+		ret = kdp_ioctl_create(ioctl_num, ioctl_param);
+		break;
+	case _IOC_NR(RTE_KDP_IOCTL_RELEASE):
+		ret = kdp_ioctl_release(ioctl_num, ioctl_param);
+		break;
+	default:
+		KDP_DBG("IOCTL default\n");
+		break;
+	}
+
+	return ret;
+}
+
+static int
+kdp_open(struct inode *inode, struct file *file)
+{
+	/* kdp device can be opened by one user only per netns */
+	if (test_and_set_bit(KDP_DEV_IN_USE_BIT_NUM, &device_in_use))
+		return -EBUSY;
+
+	/* Create kernel thread for single mode */
+	if (multiple_kthread_on == 0) {
+		KDP_PRINT("Single kernel thread for all KDP devices\n");
+		/* Create kernel thread for RX */
+		kdp_kthread = kthread_run(kdp_thread_single, NULL,
+						"kdp_single");
+		if (IS_ERR(kdp_kthread)) {
+			KDP_ERR("Unable to create kernel thread\n");
+			return PTR_ERR(kdp_kthread);
+		}
+	} else
+		KDP_PRINT("Multiple kernel thread mode enabled\n");
+
+	KDP_PRINT("/dev/kdp opened\n");
+
+	return 0;
+}
+
+static int
+kdp_release(struct inode *inode, struct file *file)
+{
+	struct kdp_dev *dev, *n;
+
+	/* Stop kernel thread for single mode */
+	if (multiple_kthread_on == 0) {
+		/* Stop kernel thread */
+		kthread_stop(kdp_kthread);
+		kdp_kthread = NULL;
+	}
+
+	down_write(&kdp_list_lock);
+	list_for_each_entry_safe(dev, n, &kdp_list_head, list) {
+		/* Stop kernel thread for multiple mode */
+		if (multiple_kthread_on && dev->pthread != NULL) {
+			kthread_stop(dev->pthread);
+			dev->pthread = NULL;
+		}
+
+		kdp_dev_remove(dev);
+		list_del(&dev->list);
+	}
+	up_write(&kdp_list_lock);
+
+	/* Clear the bit of device in use */
+	clear_bit(KDP_DEV_IN_USE_BIT_NUM, &device_in_use);
+
+	KDP_PRINT("/dev/kdp closed\n");
+
+	return 0;
+}
+
+static int
+kdp_compat_ioctl(struct inode *inode, unsigned int ioctl_num,
+		unsigned long ioctl_param)
+{
+	/* 32 bits app on 64 bits OS to be supported later */
+	KDP_PRINT("Not implemented.\n");
+
+	return -EINVAL;
+}
+
+static const struct file_operations kdp_fops = {
+	.owner = THIS_MODULE,
+	.open = kdp_open,
+	.release = kdp_release,
+	.unlocked_ioctl = (void *)kdp_ioctl,
+	.compat_ioctl = (void *)kdp_compat_ioctl,
+};
+
+static struct miscdevice kdp_misc = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = KDP_DEVICE,
+	.fops = &kdp_fops,
+};
+
+static int __init
+kdp_parse_kthread_mode(void)
+{
+	if (!kthread_mode)
+		return 0;
+
+	if (strcmp(kthread_mode, "single") == 0)
+		return 0;
+	else if (strcmp(kthread_mode, "multiple") == 0)
+		multiple_kthread_on = 1;
+	else
+		return -1;
+
+	return 0;
+}
+
+static int __init
+kdp_init(void)
+{
+	int rc;
+
+	KDP_PRINT("######## DPDK kdp module loading ########\n");
+
+	if (kdp_parse_kthread_mode() < 0) {
+		KDP_ERR("Invalid parameter for kthread_mode\n");
+		return -EINVAL;
+	}
+
+	rc = misc_register(&kdp_misc);
+	if (rc != 0) {
+		KDP_ERR("Misc registration failed\n");
+		return rc;
+	}
+
+	/* Configure the lo mode according to the input parameter */
+	kdp_net_config_lo_mode(lo_mode);
+
+	/* Clear the bit of device in use */
+	clear_bit(KDP_DEV_IN_USE_BIT_NUM, &device_in_use);
+	init_rwsem(&kdp_list_lock);
+	INIT_LIST_HEAD(&kdp_list_head);
+
+	KDP_PRINT("######## DPDK kdp module loaded  ########\n");
+
+	return 0;
+}
+module_init(kdp_init);
+
+static void __exit
+kdp_exit(void)
+{
+	misc_deregister(&kdp_misc);
+	KDP_PRINT("####### DPDK kdp module unloaded  #######\n");
+}
+module_exit(kdp_exit);
+
+module_param(lo_mode, charp, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(lo_mode,
+"KDP loopback mode (default=lo_mode_none):\n"
+"    lo_mode_none        Kernel loopback disabled\n"
+"    lo_mode_fifo        Enable kernel loopback with fifo\n"
+"    lo_mode_fifo_skb    Enable kernel loopback with fifo and skb buffer\n"
+"\n"
+);
+
+module_param(kthread_mode, charp, S_IRUGO);
+MODULE_PARM_DESC(kthread_mode,
+"Kernel thread mode (default=single):\n"
+"    single    Single kernel thread mode enabled.\n"
+"    multiple  Multiple kernel thread mode enabled.\n"
+"\n"
+);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Kernel Module for managing kdp devices");
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_net.c b/lib/librte_eal/linuxapp/kdp/kdp_net.c
new file mode 100644
index 0000000..5c669f5
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_net.c
@@ -0,0 +1,573 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+/*
+ * This code is inspired by the book "Linux Device Drivers" by
+ * Alessandro Rubini and Jonathan Corbet, published by O'Reilly & Associates
+ */
+
+#include <linux/version.h>
+#include <linux/etherdevice.h> /* eth_type_trans */
+
+#include "kdp_fifo.h"
+#include "kdp_dev.h"
+
+#define WD_TIMEOUT 5 /* jiffies */
+
+#define MBUF_BURST_SZ 32
+
+/* typedef for rx function */
+typedef void (*kdp_net_rx_t)(struct kdp_dev *kdp);
+
+/*
+ * Open and close
+ */
+static int
+kdp_net_open(struct net_device *dev)
+{
+	random_ether_addr(dev->dev_addr);
+	netif_start_queue(dev);
+
+	return 0;
+}
+
+static int
+kdp_net_release(struct net_device *dev)
+{
+	netif_stop_queue(dev); /* can't transmit any more */
+
+	return 0;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+static int
+kdp_net_config(struct net_device *dev, struct ifmap *map)
+{
+	if (dev->flags & IFF_UP) /* can't act on a running interface */
+		return -EBUSY;
+
+	/* ignore other fields */
+	return 0;
+}
+
+/*
+ * Transmit a packet (called by the kernel)
+ */
+static int
+kdp_net_tx(struct sk_buff *skb, struct net_device *dev)
+{
+	int len = 0;
+	unsigned ret;
+	struct kdp_dev *kdp = netdev_priv(dev);
+	struct rte_kdp_mbuf *pkt_kva = NULL;
+	struct rte_kdp_mbuf *pkt_va = NULL;
+
+	dev->trans_start = jiffies; /* save the timestamp */
+
+	/* Drop the skb if it is larger than the mbuf size */
+	if (skb->len > kdp->mbuf_size)
+		goto drop;
+
+	/**
+	 * Check if it has at least one free entry in tx_q and
+	 * one entry in alloc_q.
+	 */
+	if (kdp_fifo_free_count(kdp->tx_q) == 0 ||
+			kdp_fifo_count(kdp->alloc_q) == 0) {
+		/**
+		 * If no free entry in tx_q or no entry in alloc_q,
+		 * drops skb and goes out.
+		 */
+		goto drop;
+	}
+
+	/* dequeue a mbuf from alloc_q */
+	ret = kdp_fifo_get(kdp->alloc_q, (void **)&pkt_va, 1);
+	if (likely(ret == 1)) {
+		void *data_kva;
+
+		pkt_kva = (void *)pkt_va - kdp->mbuf_va + kdp->mbuf_kva;
+		data_kva = pkt_kva->buf_addr + pkt_kva->data_off - kdp->mbuf_va
+				+ kdp->mbuf_kva;
+
+		len = skb->len;
+		memcpy(data_kva, skb->data, len);
+		if (unlikely(len < ETH_ZLEN)) {
+			memset(data_kva + len, 0, ETH_ZLEN - len);
+			len = ETH_ZLEN;
+		}
+		pkt_kva->pkt_len = len;
+		pkt_kva->data_len = len;
+
+		/* enqueue mbuf into tx_q */
+		ret = kdp_fifo_put(kdp->tx_q, (void **)&pkt_va, 1);
+		if (unlikely(ret != 1)) {
+			/* Failing should not happen */
+			KDP_ERR("Fail to enqueue mbuf into tx_q\n");
+			goto drop;
+		}
+	} else {
+		/* Failing should not happen */
+		KDP_ERR("Fail to dequeue mbuf from alloc_q\n");
+		goto drop;
+	}
+
+	/* Free skb and update statistics */
+	dev_kfree_skb(skb);
+	kdp->stats.tx_bytes += len;
+	kdp->stats.tx_packets++;
+
+	return NETDEV_TX_OK;
+
+drop:
+	/* Free skb and update statistics */
+	dev_kfree_skb(skb);
+	kdp->stats.tx_dropped++;
+
+	return NETDEV_TX_OK;
+}
+
+static int
+kdp_net_change_mtu(struct net_device *dev, int new_mtu)
+{
+	KDP_DBG("kdp_net_change_mtu new mtu %d to be set\n", new_mtu);
+
+	dev->mtu = new_mtu;
+
+	return 0;
+}
+
+/*
+ * Ioctl commands
+ */
+static int
+kdp_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+	KDP_DBG("kdp_net_ioctl %d\n",
+		((struct kdp_dev *)netdev_priv(dev))->group_id);
+
+	return 0;
+}
+
+static void
+kdp_net_set_rx_mode(struct net_device *dev)
+{
+}
+
+/*
+ * Return statistics to the caller
+ */
+static struct net_device_stats *
+kdp_net_stats(struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+	return &kdp->stats;
+}
+
+/*
+ * Deal with a transmit timeout.
+ */
+static void
+kdp_net_tx_timeout(struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+
+	KDP_DBG("Transmit timeout at %ld, latency %ld\n", jiffies,
+			jiffies - dev->trans_start);
+
+	kdp->stats.tx_errors++;
+	netif_wake_queue(dev);
+}
+
+/**
+ * kdp_net_set_mac - Change the Ethernet Address of the KDP NIC
+ * @netdev: network interface device structure
+ * @p: pointer to an address structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int kdp_net_set_mac(struct net_device *netdev, void *p)
+{
+	struct sockaddr *addr = p;
+	if (!is_valid_ether_addr((unsigned char *)(addr->sa_data)))
+		return -EADDRNOTAVAIL;
+	memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len);
+	return 0;
+}
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0))
+static int kdp_net_change_carrier(struct net_device *dev, bool new_carrier)
+{
+	if (new_carrier)
+		netif_carrier_on(dev);
+	else
+		netif_carrier_off(dev);
+	return 0;
+}
+#endif
+
+static const struct net_device_ops kdp_net_netdev_ops = {
+	.ndo_open = kdp_net_open,
+	.ndo_stop = kdp_net_release,
+	.ndo_set_config = kdp_net_config,
+	.ndo_start_xmit = kdp_net_tx,
+	.ndo_change_mtu = kdp_net_change_mtu,
+	.ndo_do_ioctl = kdp_net_ioctl,
+	.ndo_set_rx_mode = kdp_net_set_rx_mode,
+	.ndo_get_stats = kdp_net_stats,
+	.ndo_tx_timeout = kdp_net_tx_timeout,
+	.ndo_set_mac_address = kdp_net_set_mac,
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0))
+	.ndo_change_carrier = kdp_net_change_carrier,
+#endif
+};
+
+/*
+ *  Fill the eth header
+ */
+static int
+kdp_net_header(struct sk_buff *skb, struct net_device *dev,
+		unsigned short type, const void *daddr,
+		const void *saddr, unsigned int len)
+{
+	struct ethhdr *eth = (struct ethhdr *) skb_push(skb, ETH_HLEN);
+
+	memcpy(eth->h_source, saddr ? saddr : dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest,   daddr ? daddr : dev->dev_addr, dev->addr_len);
+	eth->h_proto = htons(type);
+
+	return dev->hard_header_len;
+}
+
+/*
+ * Re-fill the eth header
+ */
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0))
+static int
+kdp_net_rebuild_header(struct sk_buff *skb)
+{
+	struct net_device *dev = skb->dev;
+	struct ethhdr *eth = (struct ethhdr *) skb->data;
+
+	memcpy(eth->h_source, dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest, dev->dev_addr, dev->addr_len);
+
+	return 0;
+}
+#endif /* < 4.1.0  */
+
+static const struct header_ops kdp_net_header_ops = {
+	.create  = kdp_net_header,
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0))
+	.rebuild = kdp_net_rebuild_header,
+#endif /* < 4.1.0  */
+	.cache   = NULL,  /* disable caching */
+};
+
+void
+kdp_net_init(struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+
+	KDP_DBG("kdp_net_init\n");
+
+	init_waitqueue_head(&kdp->wq);
+	mutex_init(&kdp->sync_lock);
+
+	ether_setup(dev); /* assign some of the fields */
+	dev->netdev_ops      = &kdp_net_netdev_ops;
+	dev->header_ops      = &kdp_net_header_ops;
+	dev->watchdog_timeo = WD_TIMEOUT;
+}
+
+/*
+ * RX: normal working mode
+ */
+static void
+kdp_net_rx_normal(struct kdp_dev *kdp)
+{
+	unsigned ret;
+	uint32_t len;
+	unsigned i, num_rx, num_fq;
+	struct rte_kdp_mbuf *kva;
+	struct rte_kdp_mbuf *va[MBUF_BURST_SZ];
+	void *data_kva;
+	unsigned mbuf_burst_size = MBUF_BURST_SZ;
+
+	struct sk_buff *skb;
+	struct net_device *dev = kdp->net_dev;
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+	if (num_fq == 0) {
+		/* No room on the free_q, bail out */
+		return;
+	}
+
+	/* Calculate the number of entries to dequeue from rx_q */
+	num_rx = min(num_fq, mbuf_burst_size);
+
+	/* Burst dequeue from rx_q */
+	num_rx = kdp_fifo_get(kdp->rx_q, (void **)va, num_rx);
+	if (num_rx == 0)
+		return;
+
+	/* Transfer received packets to netif */
+	for (i = 0; i < num_rx; i++) {
+		kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva;
+		len = kva->data_len;
+		data_kva = kva->buf_addr + kva->data_off - kdp->mbuf_va
+				+ kdp->mbuf_kva;
+
+		skb = dev_alloc_skb(len + 2);
+		if (!skb) {
+			KDP_ERR("Out of mem, dropping pkts\n");
+			/* Update statistics */
+			kdp->stats.rx_dropped++;
+		} else {
+			/* Align IP on 16B boundary */
+			skb_reserve(skb, 2);
+			memcpy(skb_put(skb, len), data_kva, len);
+			skb->dev = dev;
+			skb->protocol = eth_type_trans(skb, dev);
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+			/* Call netif interface */
+			netif_rx(skb);
+
+			/* Update statistics */
+			kdp->stats.rx_bytes += len;
+			kdp->stats.rx_packets++;
+		}
+	}
+
+	/* Burst enqueue mbufs into free_q */
+	ret = kdp_fifo_put(kdp->free_q, (void **)va, num_rx);
+	if (ret != num_rx)
+		/* Failing should not happen */
+		KDP_ERR("Fail to enqueue entries into free_q\n");
+}
+
+/*
+ * RX: loopback with enqueue/dequeue fifos.
+ */
+static void
+kdp_net_rx_lo_fifo(struct kdp_dev *kdp)
+{
+	unsigned ret;
+	uint32_t len;
+	unsigned i, num, num_rq, num_tq, num_aq, num_fq;
+	struct rte_kdp_mbuf *kva;
+	struct rte_kdp_mbuf *va[MBUF_BURST_SZ];
+	void *data_kva;
+	struct rte_kdp_mbuf *alloc_kva;
+	struct rte_kdp_mbuf *alloc_va[MBUF_BURST_SZ];
+	void *alloc_data_kva;
+	unsigned mbuf_burst_size = MBUF_BURST_SZ;
+
+	/* Get the number of entries in rx_q */
+	num_rq = kdp_fifo_count(kdp->rx_q);
+
+	/* Get the number of free entries in tx_q */
+	num_tq = kdp_fifo_free_count(kdp->tx_q);
+
+	/* Get the number of entries in alloc_q */
+	num_aq = kdp_fifo_count(kdp->alloc_q);
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+
+	/* Calculate the number of entries to be dequeued from rx_q */
+	num = min(num_rq, num_tq);
+	num = min(num, num_aq);
+	num = min(num, num_fq);
+	num = min(num, mbuf_burst_size);
+
+	/* Return if no entry to dequeue from rx_q */
+	if (num == 0)
+		return;
+
+	/* Burst dequeue from rx_q */
+	ret = kdp_fifo_get(kdp->rx_q, (void **)va, num);
+	if (ret == 0)
+		return; /* Failing should not happen */
+
+	/* Dequeue entries from alloc_q */
+	ret = kdp_fifo_get(kdp->alloc_q, (void **)alloc_va, num);
+	if (ret) {
+		num = ret;
+		/* Copy mbufs */
+		for (i = 0; i < num; i++) {
+			kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva;
+			len = kva->pkt_len;
+			data_kva = kva->buf_addr + kva->data_off -
+					kdp->mbuf_va + kdp->mbuf_kva;
+
+			alloc_kva = (void *)alloc_va[i] - kdp->mbuf_va +
+							kdp->mbuf_kva;
+			alloc_data_kva = alloc_kva->buf_addr +
+					alloc_kva->data_off - kdp->mbuf_va +
+							kdp->mbuf_kva;
+			memcpy(alloc_data_kva, data_kva, len);
+			alloc_kva->pkt_len = len;
+			alloc_kva->data_len = len;
+
+			kdp->stats.tx_bytes += len;
+			kdp->stats.rx_bytes += len;
+		}
+
+		/* Burst enqueue mbufs into tx_q */
+		ret = kdp_fifo_put(kdp->tx_q, (void **)alloc_va, num);
+		if (ret != num)
+			/* Failing should not happen */
+			KDP_ERR("Fail to enqueue mbufs into tx_q\n");
+	}
+
+	/* Burst enqueue mbufs into free_q */
+	ret = kdp_fifo_put(kdp->free_q, (void **)va, num);
+	if (ret != num)
+		/* Failing should not happen */
+		KDP_ERR("Fail to enqueue mbufs into free_q\n");
+
+	/**
+	 * Update statistics. Enqueue/dequeue failures are not expected,
+	 * because all queues are checked first.
+	 */
+	kdp->stats.tx_packets += num;
+	kdp->stats.rx_packets += num;
+}
+
+/*
+ * RX: loopback with enqueue/dequeue fifos and sk buffer copies.
+ */
+static void
+kdp_net_rx_lo_fifo_skb(struct kdp_dev *kdp)
+{
+	unsigned ret;
+	uint32_t len;
+	unsigned i, num_rq, num_fq, num;
+	struct rte_kdp_mbuf *kva;
+	struct rte_kdp_mbuf *va[MBUF_BURST_SZ];
+	void *data_kva;
+	struct sk_buff *skb;
+	struct net_device *dev = kdp->net_dev;
+	unsigned mbuf_burst_size = MBUF_BURST_SZ;
+
+	/* Get the number of entries in rx_q */
+	num_rq = kdp_fifo_count(kdp->rx_q);
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+
+	/* Calculate the number of entries to dequeue from rx_q */
+	num = min(num_rq, num_fq);
+	num = min(num, mbuf_burst_size);
+
+	/* Return if no entry to dequeue from rx_q */
+	if (num == 0)
+		return;
+
+	/* Burst dequeue mbufs from rx_q */
+	ret = kdp_fifo_get(kdp->rx_q, (void **)va, num);
+	if (ret == 0)
+		return;
+
+	/* Copy mbufs to sk buffer and then call tx interface */
+	for (i = 0; i < num; i++) {
+		kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva;
+		len = kva->data_len;
+		data_kva = kva->buf_addr + kva->data_off - kdp->mbuf_va +
+				kdp->mbuf_kva;
+
+		skb = dev_alloc_skb(len + 2);
+		if (skb == NULL)
+			KDP_ERR("Out of mem, dropping pkts\n");
+		else {
+			/* Align IP on 16B boundary */
+			skb_reserve(skb, 2);
+			memcpy(skb_put(skb, len), data_kva, len);
+			skb->dev = dev;
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+			dev_kfree_skb(skb);
+		}
+
+		/* Simulate real usage, allocate/copy skb twice */
+		skb = dev_alloc_skb(len + 2);
+		if (skb == NULL) {
+			KDP_ERR("Out of mem, dropping pkts\n");
+			kdp->stats.rx_dropped++;
+		} else {
+			/* Align IP on 16B boundary */
+			skb_reserve(skb, 2);
+			memcpy(skb_put(skb, len), data_kva, len);
+			skb->dev = dev;
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+			kdp->stats.rx_bytes += len;
+			kdp->stats.rx_packets++;
+
+			/* call tx interface */
+			kdp_net_tx(skb, dev);
+		}
+	}
+
+	/* enqueue all the mbufs from rx_q into free_q */
+	ret = kdp_fifo_put(kdp->free_q, (void **)va, num);
+	if (ret != num)
+		/* Failing should not happen */
+		KDP_ERR("Fail to enqueue mbufs into free_q\n");
+}
+
+/* kdp rx function pointer, with default to normal rx */
+static kdp_net_rx_t kdp_net_rx_func = kdp_net_rx_normal;
+
+void
+kdp_net_config_lo_mode(char *lo_str)
+{
+	if (!lo_str) {
+		KDP_PRINT("loopback disabled");
+		return;
+	}
+
+	if (!strcmp(lo_str, "lo_mode_none"))
+		KDP_PRINT("loopback disabled");
+	else if (!strcmp(lo_str, "lo_mode_fifo")) {
+		KDP_PRINT("loopback mode=lo_mode_fifo enabled");
+		kdp_net_rx_func = kdp_net_rx_lo_fifo;
+	} else if (!strcmp(lo_str, "lo_mode_fifo_skb")) {
+		KDP_PRINT("loopback mode=lo_mode_fifo_skb enabled");
+		kdp_net_rx_func = kdp_net_rx_lo_fifo_skb;
+	} else
+		KDP_PRINT("Unrecognized parameter, loopback disabled");
+}
+
+/* rx interface */
+void
+kdp_net_rx(struct kdp_dev *kdp)
+{
+	/**
+	 * No NULL check is needed here,
+	 * as the pointer always has a default value
+	 */
+	(*kdp_net_rx_func)(kdp);
+}
-- 
2.5.0
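
A note on the pointer arithmetic in kdp_net_tx() and the kdp_net_rx_*()
functions above: the FIFOs carry mbuf addresses as the userspace process
sees them, and the module rebases each one onto its own mapping of the same
memory by subtracting kdp->mbuf_va and adding kdp->mbuf_kva. The sketch below
shows only that rebasing step in plain C. It assumes the mbuf memory is one
contiguous region visible at both bases. struct region and rebase() are
hypothetical names used for illustration and are not part of this patch. The
bounds check is also an addition for the sketch; the functions shown above do
not perform one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the mbuf_va / mbuf_kva pair kept in struct kdp_dev. */
struct region {
	uintptr_t user_base;   /* base of the shared memory as seen by userspace */
	uintptr_t kernel_base; /* base of the same memory as seen by the kernel */
	size_t len;            /* size of the shared region in bytes */
};

/*
 * Rebase a userspace address onto the kernel mapping:
 *     kva = va - user_base + kernel_base
 * Returns 0 when va lies outside the shared region. This check is an
 * assumption added for the sketch, not something the patch does.
 */
static uintptr_t
rebase(const struct region *r, uintptr_t va)
{
	if (va < r->user_base || va - r->user_base >= r->len)
		return 0;

	return va - r->user_base + r->kernel_base;
}
```

The same offset is applied twice per packet in the code above: once to the
mbuf pointer itself and once to buf_addr + data_off, because buf_addr is
also stored as a userspace address.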

* [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication
  2016-01-27 16:32 [PATCH 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2016-01-27 16:32 ` [PATCH 1/2] kdp: add kernel data path kernel module Ferruh Yigit
@ 2016-01-27 16:32 ` Ferruh Yigit
  2016-01-28  8:16   ` Xu, Qian Q
  2016-02-09 17:33   ` Reshma Pattan
  2016-02-19  5:05 ` [PATCH v2 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2 siblings, 2 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-01-27 16:32 UTC (permalink / raw)
  To: dev

This patch provides slow data path communication to the Linux kernel.
The patch is based on librte_kni and reuses much of its code.

The main difference is that the librte_kni library is converted into a
PMD, which makes it easier for applications to use.

Any application can now use slow path communication without being
modified, because the EAL already supports virtual PMDs.

The PMD supports two methods of sending packets to Linux. The first is a
custom FIFO implementation that relies on the KDP kernel module; the
second is the in-kernel tun/tap support. The PMD first checks for the KDP
kernel module and, if that fails, tries to create and use a tap interface.

With the FIFO method, the PMD's rx_pkt_burst() gets packets from the FIFO
and tx_pkt_burst() puts packets into the FIFO.
The corresponding Linux virtual network device driver also gets packets
from and puts packets into the FIFO, as if they were coming from hardware.

With the tun/tap method, no external kernel module is required; the PMD
reads packets from and writes packets to the tap interface file
descriptor. The tap interface carries a performance penalty compared with
the FIFO implementation.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
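
For readers unfamiliar with the FIFO method described in the commit message,
the following is a minimal sketch of a single-producer/single-consumer
pointer ring with the same four operations the kernel code calls (put, get,
count, free count). It is not the implementation from rte_kdp_fifo.h or
kdp_fifo.h, which is not reproduced here. All sketch_fifo_* names and the
fixed slot count are assumptions for illustration, and the memory barriers
that a real userspace/kernel ring would need are left out.

```c
#include <stddef.h>

#define SKETCH_FIFO_SLOTS 8 /* hypothetical size, must be a power of two */

/* Hypothetical single-producer/single-consumer ring of pointers. */
struct sketch_fifo {
	volatile unsigned write; /* next slot the producer fills */
	volatile unsigned read;  /* next slot the consumer drains */
	void *slots[SKETCH_FIFO_SLOTS];
};

static void
sketch_fifo_init(struct sketch_fifo *f)
{
	f->write = 0;
	f->read = 0;
}

/* Number of entries waiting to be dequeued. */
static unsigned
sketch_fifo_count(const struct sketch_fifo *f)
{
	return (f->write - f->read) & (SKETCH_FIFO_SLOTS - 1);
}

/* Number of entries that can still be enqueued; one slot always stays empty. */
static unsigned
sketch_fifo_free_count(const struct sketch_fifo *f)
{
	return (f->read - f->write - 1) & (SKETCH_FIFO_SLOTS - 1);
}

/* Enqueue up to num pointers; returns how many were actually stored. */
static unsigned
sketch_fifo_put(struct sketch_fifo *f, void **data, unsigned num)
{
	unsigned i, w = f->write;

	for (i = 0; i < num; i++) {
		unsigned next = (w + 1) & (SKETCH_FIFO_SLOTS - 1);

		if (next == f->read)
			break; /* ring is full */
		f->slots[w] = data[i];
		w = next;
	}
	f->write = w; /* publish the new entries (no barrier in this sketch) */

	return i;
}

/* Dequeue up to num pointers; returns how many were actually read. */
static unsigned
sketch_fifo_get(struct sketch_fifo *f, void **data, unsigned num)
{
	unsigned i, r = f->read;

	for (i = 0; i < num; i++) {
		if (r == f->write)
			break; /* ring is empty */
		data[i] = f->slots[r];
		r = (r + 1) & (SKETCH_FIFO_SLOTS - 1);
	}
	f->read = r;

	return i;
}
```

Because each index is written by one side only, a ring of this shape needs
no lock between the producer and the consumer, which is one plausible reason
for preferring it over the tap file descriptor on the slow path.
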
 config/common_linuxapp                  |   1 +
 doc/guides/nics/pcap_ring.rst           | 125 ++++++++-
 doc/guides/rel_notes/release_2_3.rst    |   6 +
 drivers/net/Makefile                    |   3 +-
 drivers/net/kdp/Makefile                |  61 ++++
 drivers/net/kdp/rte_eth_kdp.c           | 481 ++++++++++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.c               | 365 ++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.h               | 126 +++++++++
 drivers/net/kdp/rte_kdp_fifo.h          |  91 ++++++
 drivers/net/kdp/rte_kdp_tap.c           |  96 +++++++
 drivers/net/kdp/rte_pmd_kdp_version.map |   4 +
 lib/librte_eal/common/include/rte_log.h |   3 +-
 mk/rte.app.mk                           |   3 +-
 13 files changed, 1359 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/kdp/Makefile
 create mode 100644 drivers/net/kdp/rte_eth_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.h
 create mode 100644 drivers/net/kdp/rte_kdp_fifo.h
 create mode 100644 drivers/net/kdp/rte_kdp_tap.c
 create mode 100644 drivers/net/kdp/rte_pmd_kdp_version.map

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 73c91d8..b9dec0c 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -322,6 +322,7 @@ CONFIG_RTE_LIBRTE_PMD_NULL=y
 #
 # Compile KDP PMD
 #
+CONFIG_RTE_LIBRTE_PMD_KDP=y
 CONFIG_RTE_KDP_KMOD=y
 CONFIG_RTE_KDP_PREEMPT_DEFAULT=y
 
diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst
index 46aa3ac..78b7b61 100644
--- a/doc/guides/nics/pcap_ring.rst
+++ b/doc/guides/nics/pcap_ring.rst
@@ -28,11 +28,11 @@
     (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-Libpcap and Ring Based Poll Mode Drivers
-========================================
+Software Poll Mode Drivers
+==========================
 
 In addition to Poll Mode Drivers (PMDs) for physical and virtual hardware,
-the DPDK also includes two pure-software PMDs. These two drivers are:
+the DPDK also includes pure-software PMDs. These drivers are:
 
 *   A libpcap -based PMD (librte_pmd_pcap) that reads and writes packets using libpcap,
     - both from files on disk, as well as from physical NIC devices using standard Linux kernel drivers.
@@ -40,6 +40,10 @@ the DPDK also includes two pure-software PMDs. These two drivers are:
 *   A ring-based PMD (librte_pmd_ring) that allows a set of software FIFOs (that is, rte_ring)
     to be accessed using the PMD APIs, as though they were physical NICs.
 
+*   A slow data path PMD (librte_pmd_kdp) that sends packets to and receives packets from
+    the OS network stack as though it were a physical NIC.
+
+
 .. note::
 
     The libpcap -based PMD is disabled by default in the build configuration files,
@@ -211,6 +215,121 @@ Multiple devices may be specified, separated by commas.
     Done.
 
 
+Kernel Data Path PMD
+~~~~~~~~~~~~~~~~~~~~
+
+The Kernel Data Path (KDP) PMD gives an application a simple way to communicate with the OS network stack.
+
+.. code-block:: console
+
+        ./testpmd --vdev eth_kdp0 --vdev eth_kdp1 -- -i
+        ...
+        Configuring Port 0 (socket 0)
+        Port 0: 00:00:00:00:00:00
+        Configuring Port 1 (socket 0)
+        Port 1: 00:00:00:00:00:00
+        Checking link statuses...
+        Port 0 Link Up - speed 10000 Mbps - full-duplex
+        Port 1 Link Up - speed 10000 Mbps - full-duplex
+        Done
+
+The KDP PMD supports two types of communication:
+
+* Custom FIFO implementation
+* tun/tap implementation
+
+The custom FIFO implementation gives better performance but requires the KDP kernel module (rte_kdp.ko) to be inserted.
+
+By default, FIFO communication has priority; if the KDP kernel module is not inserted, tun/tap communication is used.
+
+If the KDP kernel module is inserted, the testpmd command above creates the following virtual interfaces, which can be used like any other interface.
+
+.. code-block:: console
+
+        # ifconfig kdp0; ifconfig kdp1
+        kdp0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+                ether 00:00:00:00:00:00  txqueuelen 1000  (Ethernet)
+                RX packets 0  bytes 0 (0.0 B)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 0  bytes 0 (0.0 B)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+        kdp1: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+                ether 00:00:00:00:00:00  txqueuelen 1000  (Ethernet)
+                RX packets 0  bytes 0 (0.0 B)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 0  bytes 0 (0.0 B)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+
+With the tun/tap communication method, the following interfaces are created:
+
+.. code-block:: console
+
+        # ifconfig tap_kdp0; ifconfig tap_kdp1
+        tap_kdp0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
+                inet6 fe80::341f:afff:feb7:23db  prefixlen 64  scopeid 0x20<link>
+                ether 36:1f:af:b7:23:db  txqueuelen 500  (Ethernet)
+                RX packets 126624864  bytes 6184828655 (5.7 GiB)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 126236898  bytes 6150306636 (5.7 GiB)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+        tap_kdp1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
+                inet6 fe80::f030:b4ff:fe94:b720  prefixlen 64  scopeid 0x20<link>
+                ether f2:30:b4:94:b7:20  txqueuelen 500  (Ethernet)
+                RX packets 126237370  bytes 6150329717 (5.7 GiB)
+                RX errors 0  dropped 9  overruns 0  frame 0
+                TX packets 126624896  bytes 6184826874 (5.7 GiB)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+A DPDK application can be used to forward packets between these interfaces:
+
+.. code-block:: console
+
+        In Linux:
+        ip l add br0 type bridge
+        ip l set tap_kdp0 master br0
+        ip l set tap_kdp1 master br0
+        ip l set br0 up
+        ip l set tap_kdp0 up
+        ip l set tap_kdp1 up
+
+
+        In testpmd:
+        testpmd> start
+          io packet forwarding - CRC stripping disabled - packets/burst=32
+          nb forwarding cores=1 - nb forwarding ports=2
+          RX queues=1 - RX desc=128 - RX free threshold=0
+          RX threshold registers: pthresh=0 hthresh=0 wthresh=0
+          TX queues=1 - TX desc=512 - TX free threshold=0
+          TX threshold registers: pthresh=0 hthresh=0 wthresh=0
+          TX RS bit threshold=0 - TXQ flags=0x0
+        testpmd> stop
+        Telling cores to stop...
+        Waiting for lcores to finish...
+
+          ---------------------- Forward statistics for port 0  ----------------------
+          RX-packets: 973900         RX-dropped: 0             RX-total: 973900
+          TX-packets: 973903         TX-dropped: 0             TX-total: 973903
+          ----------------------------------------------------------------------------
+
+          ---------------------- Forward statistics for port 1  ----------------------
+          RX-packets: 973903         RX-dropped: 0             RX-total: 973903
+          TX-packets: 973900         TX-dropped: 0             TX-total: 973900
+          ----------------------------------------------------------------------------
+
+          +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
+          RX-packets: 1947803        RX-dropped: 0             RX-total: 1947803
+          TX-packets: 1947803        TX-dropped: 0             TX-total: 1947803
+          ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+        Done.
+
+
+
+
+
 Using the Poll Mode Driver from an Application
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..faf6a17 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,12 @@ DPDK Release 2.3
 New Features
 ------------
 
+* **Added Slow Data Path support.**
+
+  * This is based on the KNI work and is intended to replace it in the long term.
+  * Added Kernel Data Path (KDP) kernel module.
+  * Added KDP virtual PMD.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 6e4497e..0be06f5 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -51,6 +51,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += szedata2
 DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio
 DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += kdp
 
 include $(RTE_SDK)/mk/rte.sharelib.mk
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/kdp/Makefile b/drivers/net/kdp/Makefile
new file mode 100644
index 0000000..035056e
--- /dev/null
+++ b/drivers/net/kdp/Makefile
@@ -0,0 +1,61 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_kdp.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_kdp_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_eth_kdp.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_kdp.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_kdp_tap.c
+
+#
+# Export include files
+#
+SYMLINK-y-include +=
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/kdp/rte_eth_kdp.c b/drivers/net/kdp/rte_eth_kdp.c
new file mode 100644
index 0000000..ac650d7
--- /dev/null
+++ b/drivers/net/kdp/rte_eth_kdp.c
@@ -0,0 +1,481 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ethdev.h>
+#include <rte_dev.h>
+#include <rte_kvargs.h>
+
+#include "rte_kdp.h"
+
+#define MAX_PACKET_SZ 2048
+
+struct kdp_queue {
+	struct pmd_internals *internals;
+	struct rte_mempool *mb_pool;
+
+	uint64_t rx_pkts;
+	uint64_t rx_bytes;
+	uint64_t rx_err_pkts;
+	uint64_t tx_pkts;
+	uint64_t tx_bytes;
+	uint64_t tx_err_pkts;
+};
+
+struct pmd_internals {
+	struct rte_kdp *kdp;
+	struct rte_kdp_tap *kdp_tap;
+
+	struct kdp_queue rx_queues[RTE_MAX_QUEUES_PER_PORT];
+	struct kdp_queue tx_queues[RTE_MAX_QUEUES_PER_PORT];
+};
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+static const char *drivername = "KDP PMD";
+static struct rte_eth_link pmd_link = {
+		.link_speed = 10000,
+		.link_duplex = ETH_LINK_FULL_DUPLEX,
+		.link_status = 0
+};
+
+static uint16_t
+eth_kdp_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct kdp_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	uint16_t nb_pkts;
+
+	nb_pkts = rte_kdp_rx_burst(internals->kdp, bufs, nb_bufs);
+
+	kdp_q->rx_pkts += nb_pkts;
+	kdp_q->rx_err_pkts += nb_bufs - nb_pkts;
+
+	return nb_pkts;
+}
+
+static uint16_t
+eth_kdp_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct kdp_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	uint16_t nb_pkts;
+
+	nb_pkts = rte_kdp_tx_burst(internals->kdp, bufs, nb_bufs);
+
+	kdp_q->tx_pkts += nb_pkts;
+	kdp_q->tx_err_pkts += nb_bufs - nb_pkts;
+
+	return nb_pkts;
+}
+
+static uint16_t
+eth_kdp_tap_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct kdp_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	struct rte_kdp_tap *kdp_tap = internals->kdp_tap;
+	struct rte_mbuf *m;
+	int ret;
+	unsigned i;
+
+	for (i = 0; i < nb_bufs; i++) {
+		m = rte_pktmbuf_alloc(kdp_q->mb_pool);
+		bufs[i] = m;
+		ret = read(kdp_tap->tap_fd, rte_pktmbuf_mtod(m, void *),
+				MAX_PACKET_SZ);
+		if (ret < 0) {
+			rte_pktmbuf_free(m);
+			break;
+		}
+
+		m->nb_segs = 1;
+		m->next = NULL;
+		m->pkt_len = (uint16_t)ret;
+		m->data_len = (uint16_t)ret;
+	}
+
+	kdp_q->rx_pkts += i;
+	kdp_q->rx_err_pkts += nb_bufs - i;
+
+	return i;
+}
+
+static uint16_t
+eth_kdp_tap_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct kdp_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	struct rte_kdp_tap *kdp_tap = internals->kdp_tap;
+	struct rte_mbuf *m;
+	unsigned i;
+
+	for (i = 0; i < nb_bufs; i++) {
+		m = bufs[i];
+		write(kdp_tap->tap_fd, rte_pktmbuf_mtod(m, void *),
+				rte_pktmbuf_data_len(m));
+		rte_pktmbuf_free(m);
+	}
+
+	kdp_q->tx_pkts += i;
+	kdp_q->tx_err_pkts += nb_bufs - i;
+
+	return i;
+}
+
+static int
+kdp_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct rte_kdp_conf conf;
+	uint16_t port_id = dev->data->port_id;
+	int ret = 0;
+
+	if (internals->kdp) {
+		snprintf(conf.name, RTE_KDP_NAMESIZE, "kdp%u", port_id);
+		conf.force_bind = 0;
+		conf.group_id = port_id;
+		conf.mbuf_size = MAX_PACKET_SZ;
+
+		ret = rte_kdp_start(internals->kdp,
+				internals->rx_queues[0].mb_pool,
+				&conf);
+		if (ret)
+			RTE_LOG(ERR, KDP, "Fail to create kdp for port: %d\n",
+					port_id);
+	}
+
+	return ret;
+}
+
+static int
+eth_dev_start(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = kdp_start(dev);
+	if (ret)
+		return -1;
+
+	dev->data->dev_link.link_status = 1;
+	return 0;
+}
+
+static void
+eth_dev_stop(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+
+	rte_kdp_release(internals->kdp);
+	dev->data->dev_link.link_status = 0;
+}
+
+static void
+eth_dev_close(struct rte_eth_dev *dev __rte_unused)
+{
+	rte_kdp_close();
+}
+
+static int
+eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	return 0;
+}
+
+static void
+eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+	struct rte_eth_dev_data *data = dev->data;
+
+	dev_info->driver_name = data->drv_name;
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)-1;
+	dev_info->max_rx_queues = data->nb_rx_queues;
+	dev_info->max_tx_queues = data->nb_tx_queues;
+	dev_info->min_rx_bufsize = 0;
+	dev_info->pci_dev = NULL;
+}
+
+static int
+eth_rx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t rx_queue_id,
+		uint16_t nb_rx_desc __rte_unused,
+		unsigned int socket_id __rte_unused,
+		const struct rte_eth_rxconf *rx_conf __rte_unused,
+		struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct kdp_queue *q;
+
+	q = &internals->rx_queues[rx_queue_id];
+	q->internals = internals;
+	q->mb_pool = mb_pool;
+
+	dev->data->rx_queues[rx_queue_id] = q;
+
+	return 0;
+}
+
+static int
+eth_tx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t tx_queue_id,
+		uint16_t nb_tx_desc __rte_unused,
+		unsigned int socket_id __rte_unused,
+		const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct kdp_queue *q;
+
+	q = &internals->tx_queues[tx_queue_id];
+	q->internals = internals;
+
+	dev->data->tx_queues[tx_queue_id] = q;
+
+	return 0;
+}
+
+static void
+eth_queue_release(void *q __rte_unused)
+{
+}
+
+static int
+eth_link_update(struct rte_eth_dev *dev __rte_unused,
+		int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static void
+eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	unsigned i, num_stats;
+	unsigned long rx_packets_total = 0, rx_bytes_total = 0;
+	unsigned long tx_packets_total = 0, tx_bytes_total = 0;
+	unsigned long tx_packets_err_total = 0;
+	struct rte_eth_dev_data *data = dev->data;
+	struct kdp_queue *q;
+
+	num_stats = RTE_MIN((unsigned)RTE_ETHDEV_QUEUE_STAT_CNTRS,
+			data->nb_rx_queues);
+	for (i = 0; i < num_stats; i++) {
+		q = data->rx_queues[i];
+		stats->q_ipackets[i] = q->rx_pkts;
+		stats->q_ibytes[i] = q->rx_bytes;
+		rx_packets_total += stats->q_ipackets[i];
+		rx_bytes_total += stats->q_ibytes[i];
+	}
+
+	num_stats = RTE_MIN((unsigned)RTE_ETHDEV_QUEUE_STAT_CNTRS,
+			data->nb_tx_queues);
+	for (i = 0; i < num_stats; i++) {
+		q = data->tx_queues[i];
+		stats->q_opackets[i] = q->tx_pkts;
+		stats->q_obytes[i] = q->tx_bytes;
+		stats->q_errors[i] = q->tx_err_pkts;
+		tx_packets_total += stats->q_opackets[i];
+		tx_bytes_total += stats->q_obytes[i];
+		tx_packets_err_total += stats->q_errors[i];
+	}
+
+	stats->ipackets = rx_packets_total;
+	stats->ibytes = rx_bytes_total;
+	stats->opackets = tx_packets_total;
+	stats->obytes = tx_bytes_total;
+	stats->oerrors = tx_packets_err_total;
+}
+
+static void
+eth_stats_reset(struct rte_eth_dev *dev)
+{
+	unsigned i;
+	struct rte_eth_dev_data *data = dev->data;
+	struct kdp_queue *q;
+
+	for (i = 0; i < data->nb_rx_queues; i++) {
+		q = data->rx_queues[i];
+		q->rx_pkts = 0;
+		q->rx_bytes = 0;
+	}
+	for (i = 0; i < data->nb_tx_queues; i++) {
+		q = data->tx_queues[i];
+		q->tx_pkts = 0;
+		q->tx_bytes = 0;
+		q->tx_err_pkts = 0;
+	}
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = eth_dev_start,
+	.dev_stop = eth_dev_stop,
+	.dev_close = eth_dev_close,
+	.dev_configure = eth_dev_configure,
+	.dev_infos_get = eth_dev_info,
+	.rx_queue_setup = eth_rx_queue_setup,
+	.tx_queue_setup = eth_tx_queue_setup,
+	.rx_queue_release = eth_queue_release,
+	.tx_queue_release = eth_queue_release,
+	.link_update = eth_link_update,
+	.stats_get = eth_stats_get,
+	.stats_reset = eth_stats_reset,
+};
+
+static struct rte_eth_dev *
+eth_dev_kdp_create(const char *name, unsigned numa_node)
+{
+	uint16_t nb_rx_queues = 1;
+	uint16_t nb_tx_queues = 1;
+	struct rte_eth_dev_data *data = NULL;
+	struct pmd_internals *internals = NULL;
+	struct rte_eth_dev *eth_dev = NULL;
+
+	if (name == NULL)
+		return NULL;
+
+	RTE_LOG(INFO, PMD, "Creating kdp ethdev on numa socket %u\n",
+			numa_node);
+
+	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
+	if (data == NULL)
+		goto error;
+
+	internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
+	if (internals == NULL)
+		goto error;
+
+	/* reserve an ethdev entry */
+	eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
+	if (eth_dev == NULL)
+		goto error;
+
+	data->dev_private = internals;
+	data->port_id = eth_dev->data->port_id;
+	memmove(data->name, eth_dev->data->name, sizeof(data->name));
+	data->nb_rx_queues = nb_rx_queues;
+	data->nb_tx_queues = nb_tx_queues;
+	data->dev_link = pmd_link;
+	data->mac_addrs = &eth_addr;
+
+	eth_dev->data = data;
+	eth_dev->dev_ops = &ops;
+	eth_dev->driver = NULL;
+
+	data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+	data->kdrv = RTE_KDRV_NONE;
+	data->drv_name = drivername;
+	data->numa_node = numa_node;
+
+	return eth_dev;
+
+error:
+	rte_free(data);
+	rte_free(internals);
+
+	return NULL;
+}
+
+static int
+rte_pmd_kdp_devinit(const char *name, const char *params __rte_unused)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+	struct pmd_internals *internals;
+	struct rte_kdp *kdp;
+	struct rte_kdp_tap *kdp_tap = NULL;
+	uint16_t port_id;
+
+	RTE_LOG(INFO, PMD, "Initializing eth_kdp for %s\n", name);
+
+	eth_dev = eth_dev_kdp_create(name, rte_socket_id());
+	if (eth_dev == NULL)
+		return -1;
+
+	internals = eth_dev->data->dev_private;
+	port_id = eth_dev->data->port_id;
+
+	kdp = rte_kdp_init(port_id);
+	if (kdp == NULL)
+		kdp_tap = rte_kdp_tap_init(port_id);
+
+	if (kdp == NULL && kdp_tap == NULL) {
+		rte_eth_dev_release_port(eth_dev);
+		rte_free(internals);
+
+		/* Do not return an error, to avoid a panic in rte_eal_init() */
+		return 0;
+	}
+
+	internals->kdp = kdp;
+	internals->kdp_tap = kdp_tap;
+
+	if (kdp == NULL) {
+		eth_dev->rx_pkt_burst = eth_kdp_tap_rx;
+		eth_dev->tx_pkt_burst = eth_kdp_tap_tx;
+	} else {
+		eth_dev->rx_pkt_burst = eth_kdp_rx;
+		eth_dev->tx_pkt_burst = eth_kdp_tx;
+	}
+
+	return 0;
+}
+
+static int
+rte_pmd_kdp_devuninit(const char *name)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+
+	if (name == NULL)
+		return -EINVAL;
+
+	RTE_LOG(INFO, PMD, "Un-Initializing eth_kdp for %s\n", name);
+
+	/* find the ethdev entry */
+	eth_dev = rte_eth_dev_allocated(name);
+	if (eth_dev == NULL)
+		return -1;
+
+	eth_dev_stop(eth_dev);
+
+	if (eth_dev->data)
+		rte_free(eth_dev->data->dev_private);
+	rte_free(eth_dev->data);
+
+	rte_eth_dev_release_port(eth_dev);
+	return 0;
+}
+
+static struct rte_driver pmd_kdp_drv = {
+	.name = "eth_kdp",
+	.type = PMD_VDEV,
+	.init = rte_pmd_kdp_devinit,
+	.uninit = rte_pmd_kdp_devuninit,
+};
+
+PMD_REGISTER_DRIVER(pmd_kdp_drv);
diff --git a/drivers/net/kdp/rte_kdp.c b/drivers/net/kdp/rte_kdp.c
new file mode 100644
index 0000000..604f697
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp.c
@@ -0,0 +1,365 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_EXEC_ENV_LINUXAPP
+#error "KDP is not supported"
+#endif
+
+#include <rte_spinlock.h>
+#include <rte_ethdev.h>
+#include <rte_memzone.h>
+
+#include "rte_kdp.h"
+#include "rte_kdp_fifo.h"
+
+#define MAX_MBUF_BURST_NUM     32
+
+/* Maximum number of ring entries */
+#define KDP_FIFO_COUNT_MAX     1024
+#define KDP_FIFO_SIZE          (KDP_FIFO_COUNT_MAX * sizeof(void *) + \
+					sizeof(struct rte_kdp_fifo))
+
+static volatile int kdp_fd = -1;
+
+static const struct rte_memzone *
+kdp_memzone_reserve(const char *name, size_t len, int socket_id,
+		unsigned flags)
+{
+	const struct rte_memzone *mz = rte_memzone_lookup(name);
+
+	if (mz == NULL)
+		mz = rte_memzone_reserve(name, len, socket_id, flags);
+
+	return mz;
+}
+
+static int
+slot_init(struct rte_kdp_memzone_slot *slot)
+{
+#define OBJNAMSIZ 32
+	char obj_name[OBJNAMSIZ];
+	const struct rte_memzone *mz;
+
+	/* TX RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_tx_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_tx_q = mz;
+
+	/* RX RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_rx_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_rx_q = mz;
+
+	/* ALLOC RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_alloc_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_alloc_q = mz;
+
+	/* FREE RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_free_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_free_q = mz;
+
+	return 0;
+
+kdp_fail:
+	return -1;
+}
+
+static void
+ring_init(struct rte_kdp *kdp)
+{
+	struct rte_kdp_memzone_slot *slot = kdp->slot;
+	const struct rte_memzone *mz;
+
+	/* TX RING */
+	mz = slot->m_tx_q;
+	kdp->tx_q = mz->addr;
+	kdp_fifo_init(kdp->tx_q, KDP_FIFO_COUNT_MAX);
+
+	/* RX RING */
+	mz = slot->m_rx_q;
+	kdp->rx_q = mz->addr;
+	kdp_fifo_init(kdp->rx_q, KDP_FIFO_COUNT_MAX);
+
+	/* ALLOC RING */
+	mz = slot->m_alloc_q;
+	kdp->alloc_q = mz->addr;
+	kdp_fifo_init(kdp->alloc_q, KDP_FIFO_COUNT_MAX);
+
+	/* FREE RING */
+	mz = slot->m_free_q;
+	kdp->free_q = mz->addr;
+	kdp_fifo_init(kdp->free_q, KDP_FIFO_COUNT_MAX);
+}
+
+/* Shall be called before any allocation happens */
+struct rte_kdp *
+rte_kdp_init(uint16_t port_id)
+{
+	struct rte_kdp_memzone_slot *slot = NULL;
+	struct rte_kdp *kdp = NULL;
+	int ret;
+
+	/* Check FD and open */
+	if (kdp_fd < 0) {
+		kdp_fd = open("/dev/kdp", O_RDWR);
+		if (kdp_fd < 0) {
+			RTE_LOG(ERR, KDP, "Cannot open /dev/kdp\n");
+			return NULL;
+		}
+	}
+
+	slot = rte_malloc(NULL, sizeof(struct rte_kdp_memzone_slot), 0);
+	if (slot == NULL)
+		goto kdp_fail;
+	slot->id = port_id;
+
+	kdp = rte_malloc(NULL, sizeof(struct rte_kdp), 0);
+	if (kdp == NULL)
+		goto kdp_fail;
+	kdp->slot = slot;
+
+	ret = slot_init(slot);
+	if (ret < 0)
+		goto kdp_fail;
+
+	ring_init(kdp);
+
+	return kdp;
+
+kdp_fail:
+	rte_free(slot);
+	rte_free(kdp);
+	RTE_LOG(ERR, KDP, "Unable to allocate memory\n");
+	return NULL;
+}
+
+static void
+kdp_allocate_mbufs(struct rte_kdp *kdp)
+{
+	int i, ret;
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
+			 offsetof(struct rte_kdp_mbuf, pool));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_addr) !=
+			 offsetof(struct rte_kdp_mbuf, buf_addr));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, next) !=
+			 offsetof(struct rte_kdp_mbuf, next));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) !=
+			 offsetof(struct rte_kdp_mbuf, data_off));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) !=
+			 offsetof(struct rte_kdp_mbuf, data_len));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pkt_len) !=
+			 offsetof(struct rte_kdp_mbuf, pkt_len));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, ol_flags) !=
+			 offsetof(struct rte_kdp_mbuf, ol_flags));
+
+	/* Check if pktmbuf pool has been configured */
+	if (kdp->pktmbuf_pool == NULL) {
+		RTE_LOG(ERR, KDP, "No valid mempool for allocating mbufs\n");
+		return;
+	}
+
+	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
+		pkts[i] = rte_pktmbuf_alloc(kdp->pktmbuf_pool);
+		if (unlikely(pkts[i] == NULL)) {
+			/* Out of memory */
+			RTE_LOG(ERR, KDP, "Out of memory\n");
+			break;
+		}
+	}
+
+	/* No pkt mbuf allocated */
+	if (i <= 0)
+		return;
+
+	ret = kdp_fifo_put(kdp->alloc_q, (void **)pkts, i);
+
+	/* Check if any mbufs not put into alloc_q, and then free them */
+	if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
+		int j;
+
+		for (j = ret; j < i; j++)
+			rte_pktmbuf_free(pkts[j]);
+	}
+}
+
+int
+rte_kdp_start(struct rte_kdp *kdp, struct rte_mempool *pktmbuf_pool,
+	      const struct rte_kdp_conf *conf)
+{
+	struct rte_kdp_memzone_slot *slot;
+	struct rte_kdp_device_info dev_info;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+	int ret;
+
+	if (!kdp || !pktmbuf_pool || !conf || !conf->name[0])
+		return -1;
+	slot = kdp->slot; /* dereference only after the NULL check above */
+	snprintf(kdp->name, RTE_KDP_NAMESIZE, "%s", conf->name);
+	kdp->pktmbuf_pool = pktmbuf_pool;
+	kdp->group_id = conf->group_id;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	dev_info.core_id = conf->core_id;
+	dev_info.force_bind = conf->force_bind;
+	dev_info.group_id = conf->group_id;
+	dev_info.mbuf_size = conf->mbuf_size;
+	snprintf(dev_info.name, RTE_KDP_NAMESIZE, "%s", conf->name);
+
+	dev_info.tx_phys = slot->m_tx_q->phys_addr;
+	dev_info.rx_phys = slot->m_rx_q->phys_addr;
+	dev_info.alloc_phys = slot->m_alloc_q->phys_addr;
+	dev_info.free_phys = slot->m_free_q->phys_addr;
+
+	/* MBUF mempool */
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME,
+		pktmbuf_pool->name);
+	mz = rte_memzone_lookup(mz_name);
+	if (mz == NULL)
+		goto kdp_fail;
+	dev_info.mbuf_va = mz->addr;
+	dev_info.mbuf_phys = mz->phys_addr;
+
+	ret = ioctl(kdp_fd, RTE_KDP_IOCTL_CREATE, &dev_info);
+	if (ret < 0)
+		goto kdp_fail;
+
+	kdp->in_use = 1;
+
+	/* Allocate mbufs and then put them into alloc_q */
+	kdp_allocate_mbufs(kdp);
+
+	return 0;
+
+kdp_fail:
+	return -1;
+}
+
+static void
+kdp_free_mbufs(struct rte_kdp *kdp)
+{
+	int i, ret;
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+
+	ret = kdp_fifo_get(kdp->free_q, (void **)pkts, MAX_MBUF_BURST_NUM);
+	if (likely(ret > 0)) {
+		for (i = 0; i < ret; i++)
+			rte_pktmbuf_free(pkts[i]);
+	}
+}
+
+unsigned
+rte_kdp_tx_burst(struct rte_kdp *kdp, struct rte_mbuf **mbufs, unsigned num)
+{
+	unsigned ret = kdp_fifo_put(kdp->rx_q, (void **)mbufs, num);
+
+	/* Get mbufs from free_q and then free them */
+	kdp_free_mbufs(kdp);
+
+	return ret;
+}
+
+unsigned
+rte_kdp_rx_burst(struct rte_kdp *kdp, struct rte_mbuf **mbufs, unsigned num)
+{
+	unsigned ret = kdp_fifo_get(kdp->tx_q, (void **)mbufs, num);
+
+	/* If buffers removed, allocate mbufs and then put them into alloc_q */
+	if (ret)
+		kdp_allocate_mbufs(kdp);
+
+	return ret;
+}
+
+static void
+kdp_free_fifo(struct rte_kdp_fifo *fifo)
+{
+	int ret;
+	struct rte_mbuf *pkt;
+
+	do {
+		ret = kdp_fifo_get(fifo, (void **)&pkt, 1);
+		if (ret)
+			rte_pktmbuf_free(pkt);
+	} while (ret);
+}
+
+int
+rte_kdp_release(struct rte_kdp *kdp)
+{
+	struct rte_kdp_device_info dev_info;
+
+	if (!kdp || !kdp->in_use)
+		return -1;
+
+	snprintf(dev_info.name, sizeof(dev_info.name), "%s", kdp->name);
+	if (ioctl(kdp_fd, RTE_KDP_IOCTL_RELEASE, &dev_info) < 0) {
+		RTE_LOG(ERR, KDP, "Failed to release kdp device\n");
+		return -1;
+	}
+
+	/* mbufs in all fifos should be released */
+	kdp_free_fifo(kdp->tx_q);
+	kdp_free_fifo(kdp->rx_q);
+	kdp_free_fifo(kdp->alloc_q);
+	kdp_free_fifo(kdp->free_q);
+
+	rte_free(kdp->slot);
+
+	/* Memset the KDP struct */
+	memset(kdp, 0, sizeof(struct rte_kdp));
+
+	return 0;
+}
+
+void
+rte_kdp_close(void)
+{
+	if (kdp_fd < 0)
+		return;
+
+	close(kdp_fd);
+	kdp_fd = -1;
+}
diff --git a/drivers/net/kdp/rte_kdp.h b/drivers/net/kdp/rte_kdp.h
new file mode 100644
index 0000000..b9db048
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp.h
@@ -0,0 +1,126 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_KDP_H_
+#define _RTE_KDP_H_
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <sys/ioctl.h>
+
+#include <rte_malloc.h>
+#include <rte_mbuf.h>
+#include <rte_memcpy.h>
+#include <rte_memory.h>
+#include <rte_mempool.h>
+
+#include <exec-env/rte_kdp_common.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * KDP memzone pool slot
+ */
+struct rte_kdp_memzone_slot {
+	uint32_t id;
+
+	/* Memzones */
+	const struct rte_memzone *m_tx_q;      /**< TX queue */
+	const struct rte_memzone *m_rx_q;      /**< RX queue */
+	const struct rte_memzone *m_alloc_q;   /**< Allocated mbufs queue */
+	const struct rte_memzone *m_free_q;    /**< To be freed mbufs queue */
+};
+
+/**
+ * KDP context
+ */
+struct rte_kdp {
+	char name[RTE_KDP_NAMESIZE];        /**< KDP interface name */
+	struct rte_mempool *pktmbuf_pool;   /**< pkt mbuf mempool */
+	struct rte_kdp_memzone_slot *slot;
+	uint16_t group_id;                  /**< Group ID of KDP devices */
+
+	struct rte_kdp_fifo *tx_q;          /**< TX queue */
+	struct rte_kdp_fifo *rx_q;          /**< RX queue */
+	struct rte_kdp_fifo *alloc_q;       /**< Allocated mbufs queue */
+	struct rte_kdp_fifo *free_q;        /**< To be freed mbufs queue */
+
+	uint8_t in_use;                     /**< kdp in use */
+};
+
+struct rte_kdp_tap {
+	char name[RTE_KDP_NAMESIZE];
+	int tap_fd;
+};
+
+/**
+ * Structure for configuring KDP device.
+ */
+struct rte_kdp_conf {
+	/*
+	 * KDP name which will be used in relevant network device.
+	 * Keep the name as short as possible, as it will be part of
+	 * memzone name.
+	 */
+	char name[RTE_KDP_NAMESIZE];
+	uint32_t core_id;   /* Core ID to bind kernel thread on */
+	uint16_t group_id;
+	unsigned mbuf_size;
+
+	uint8_t force_bind; /* Flag to bind kernel thread */
+};
+
+struct rte_kdp_tap *rte_kdp_tap_init(uint16_t port_id);
+struct rte_kdp *rte_kdp_init(uint16_t port_id);
+
+int rte_kdp_start(struct rte_kdp *kdp, struct rte_mempool *pktmbuf_pool,
+	      const struct rte_kdp_conf *conf);
+
+unsigned rte_kdp_rx_burst(struct rte_kdp *kdp,
+		struct rte_mbuf **mbufs, unsigned num);
+
+unsigned rte_kdp_tx_burst(struct rte_kdp *kdp,
+		struct rte_mbuf **mbufs, unsigned num);
+
+int rte_kdp_release(struct rte_kdp *kdp);
+
+void rte_kdp_close(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_KDP_H_ */
diff --git a/drivers/net/kdp/rte_kdp_fifo.h b/drivers/net/kdp/rte_kdp_fifo.h
new file mode 100644
index 0000000..1a7e063
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp_fifo.h
@@ -0,0 +1,91 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * Initializes the kdp fifo structure
+ */
+static void
+kdp_fifo_init(struct rte_kdp_fifo *fifo, unsigned size)
+{
+	/* Ensure size is power of 2 */
+	if (size & (size - 1))
+		rte_panic("KDP fifo size must be power of 2\n");
+
+	fifo->write = 0;
+	fifo->read = 0;
+	fifo->len = size;
+	fifo->elem_size = sizeof(void *);
+}
+
+/**
+ * Adds num elements into the fifo. Return the number actually written
+ */
+static inline unsigned
+kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned fifo_write = fifo->write;
+	unsigned fifo_read = fifo->read;
+	unsigned new_write = fifo_write;
+
+	for (i = 0; i < num; i++) {
+		new_write = (new_write + 1) & (fifo->len - 1);
+
+		if (new_write == fifo_read)
+			break;
+		fifo->buffer[fifo_write] = data[i];
+		fifo_write = new_write;
+	}
+	fifo->write = fifo_write;
+	return i;
+}
+
+/**
+ * Get up to num elements from the fifo. Return the number actually read
+ */
+static inline unsigned
+kdp_fifo_get(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned new_read = fifo->read;
+	unsigned fifo_write = fifo->write;
+	for (i = 0; i < num; i++) {
+		if (new_read == fifo_write)
+			break;
+
+		data[i] = fifo->buffer[new_read];
+		new_read = (new_read + 1) & (fifo->len - 1);
+	}
+	fifo->read = new_read;
+	return i;
+}
diff --git a/drivers/net/kdp/rte_kdp_tap.c b/drivers/net/kdp/rte_kdp_tap.c
new file mode 100644
index 0000000..f07ba98
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp_tap.c
@@ -0,0 +1,96 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+
+#include <sys/socket.h>
+#include <linux/if.h>
+#include <linux/if_tun.h>
+
+#include "rte_kdp.h"
+
+static int
+tap_create(char *name)
+{
+	struct ifreq ifr;
+	int fd, ret;
+
+	fd = open("/dev/net/tun", O_RDWR);
+	if (fd < 0)
+		return fd;
+
+	memset(&ifr, 0, sizeof(ifr));
+
+	/* TAP device without packet information */
+	ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
+
+	if (name && *name)
+		snprintf(ifr.ifr_name, IFNAMSIZ, "%s", name);
+
+	ret = ioctl(fd, TUNSETIFF, (void *)&ifr);
+	if (ret < 0) {
+		close(fd);
+		return ret;
+	}
+
+	if (name)
+		snprintf(name, IFNAMSIZ, "%s", ifr.ifr_name);
+
+	return fd;
+}
+
+struct rte_kdp_tap *
+rte_kdp_tap_init(uint16_t port_id)
+{
+	struct rte_kdp_tap *kdp_tap = NULL;
+	int flags;
+
+	kdp_tap = rte_malloc(NULL, sizeof(struct rte_kdp_tap), 0);
+	if (kdp_tap == NULL)
+		goto error;
+
+	snprintf(kdp_tap->name, IFNAMSIZ, "tap_kdp%u", port_id);
+	kdp_tap->tap_fd = tap_create(kdp_tap->name);
+	if (kdp_tap->tap_fd < 0)
+		goto error;
+
+	flags = fcntl(kdp_tap->tap_fd, F_GETFL, 0);
+	fcntl(kdp_tap->tap_fd, F_SETFL, flags | O_NONBLOCK);
+
+	return kdp_tap;
+
+error:
+	rte_free(kdp_tap);
+	return NULL;
+}
+
diff --git a/drivers/net/kdp/rte_pmd_kdp_version.map b/drivers/net/kdp/rte_pmd_kdp_version.map
new file mode 100644
index 0000000..0812bb1
--- /dev/null
+++ b/drivers/net/kdp/rte_pmd_kdp_version.map
@@ -0,0 +1,4 @@
+DPDK_2.3 {
+
+	local: *;
+};
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index 2e47e7f..5a0048b 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -79,6 +79,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
 #define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
 #define RTE_LOGTYPE_CRYPTODEV 0x00020000 /**< Log related to cryptodev. */
+#define RTE_LOGTYPE_KDP     0x00080000 /**< Log related to KDP. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 8ecab41..eb18972 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   Copyright(c) 2014-2015 6WIND S.A.
 #   All rights reserved.
 #
@@ -154,6 +154,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP)       += -lrte_pmd_pcap
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL)       += -lrte_pmd_null
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)        += -lrte_pmd_qat
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KDP)        += -lrte_pmd_kdp
 
 # AESNI MULTI BUFFER is dependent on the IPSec_MB library
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)   += -lrte_pmd_aesni_mb
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication
  2016-01-27 16:32 ` [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
@ 2016-01-28  8:16   ` Xu, Qian Q
  2016-01-29 16:04     ` Yigit, Ferruh
  2016-02-09 17:33   ` Reshma Pattan
  1 sibling, 1 reply; 29+ messages in thread
From: Xu, Qian Q @ 2016-01-28  8:16 UTC (permalink / raw)
  To: Yigit, Ferruh, dev

Any dependencies with kernel versions? What kernel versions should it support? 

Thanks
Qian

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
Sent: Thursday, January 28, 2016 12:33 AM
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication

This patch provides slow data path communication to the Linux kernel.
Patch is based on librte_kni, and heavily re-uses it.

The main difference is librte_kni library converted into a PMD, to provide ease of use for applications.

Now any application can use slow path communication without any update in application, because of existing eal support for virtual PMD.

Also this PMD supports two methods to send packets to the Linux, first one is custom FIFO implementation with help of KDP kernel module, second one is Linux in-kernel tun/tap support. PMD first checks for KDP kernel module, if fails it tries to create and use a tap interface.

With FIFO method: PMD's rx_pkt_burst() get packets from FIFO, and tx_pkt_burst() puts packet to the FIFO.
The corresponding Linux virtual network device driver code also gets/puts packets from FIFO as they are coming from hardware.

With tun/tap method: no external kernel module required, PMD reads from and writes packets to the tap interface file descriptor. Tap interface has performance penalty against FIFO implementation.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 config/common_linuxapp                  |   1 +
 doc/guides/nics/pcap_ring.rst           | 125 ++++++++-
 doc/guides/rel_notes/release_2_3.rst    |   6 +
 drivers/net/Makefile                    |   3 +-
 drivers/net/kdp/Makefile                |  61 ++++
 drivers/net/kdp/rte_eth_kdp.c           | 481 ++++++++++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.c               | 365 ++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.h               | 126 +++++++++
 drivers/net/kdp/rte_kdp_fifo.h          |  91 ++++++
 drivers/net/kdp/rte_kdp_tap.c           |  96 +++++++
 drivers/net/kdp/rte_pmd_kdp_version.map |   4 +
 lib/librte_eal/common/include/rte_log.h |   3 +-
 mk/rte.app.mk                           |   3 +-
 13 files changed, 1359 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/kdp/Makefile
 create mode 100644 drivers/net/kdp/rte_eth_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.h
 create mode 100644 drivers/net/kdp/rte_kdp_fifo.h
 create mode 100644 drivers/net/kdp/rte_kdp_tap.c
 create mode 100644 drivers/net/kdp/rte_pmd_kdp_version.map

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication
  2016-01-28  8:16   ` Xu, Qian Q
@ 2016-01-29 16:04     ` Yigit, Ferruh
  0 siblings, 0 replies; 29+ messages in thread
From: Yigit, Ferruh @ 2016-01-29 16:04 UTC (permalink / raw)
  To: Xu, Qian Q; +Cc: dev

On Thu, Jan 28, 2016 at 08:16:09AM +0000, Xu, Qian Q wrote:
> Any dependencies with kernel versions? What kernel versions should it support? 
> 
Hi Qian,

The kernel module has the same dependencies as KNI. DPDK supports kernel versions >= 2.6.34, and this holds for KDP as well.

The PMD itself does not depend on the kernel module, but it uses the tun/tap interface, and tun/tap is also supported on kernel versions >= 2.6.34.

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] kdp: add kernel data path kernel module
  2016-01-27 16:32 ` [PATCH 1/2] kdp: add kernel data path kernel module Ferruh Yigit
@ 2016-02-08 17:14   ` Reshma Pattan
  2016-02-09 10:53     ` Ferruh Yigit
  0 siblings, 1 reply; 29+ messages in thread
From: Reshma Pattan @ 2016-02-08 17:14 UTC (permalink / raw)
  To: Ferruh Yigit, dev



On 1/27/2016 4:32 PM, Ferruh Yigit wrote:
> This kernel module is based on KNI module, but this one is stripped
> version of it and only for data messages, no control functionality
> provided.
>
> FIFO implementation of the KNI is kept exact same, but ethtool related
> code removed and virtual network management related code simplified.
>
> This module contains kernel support to create network devices and
> this module has a simple driver for virtual network device, the driver
> simply puts/gets packets to/from FIFO instead of real hardware.
>
> FIFO is created owned by userspace application, which is for this case
> KDP PMD.
>
> In long term this patch intends to replace the KNI and KNI will be
> depreciated.
>
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
>   
>
> diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
> new file mode 100644
> index 0000000..0c77f58
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
>
> +/**
> + * KDP name is part of memzone name.
> + */
> +#define RTE_KDP_NAMESIZE 32
> +
> +#ifndef RTE_CACHE_LINE_SIZE
> +#define RTE_CACHE_LINE_SIZE 64       /**< Cache line size. */
> +#endif

Jerin Jacob has a patch that cleans up the RTE_CACHE_LINE_SIZE macro and puts
CONFIG_RTE_CACHE_LINE_SIZE in the config file. You may need to remove this
once those changes are available in the code.

> +
> +/*
> + * The kernel image of the rte_mbuf struct, with only the relevant fields.
> + * Padding is necessary to assure the offsets of these fields
> + */
> +struct rte_kdp_mbuf {
> +	void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
> +	char pad0[10];
> +
> +	/**< Start address of data in segment buffer. */
> +	uint16_t data_off;
> +	char pad1[4];
> +	uint64_t ol_flags;      /**< Offload features. */

     You are not using ol_flags down in the code. Should this be removed?

> +	char pad2[4];
> +
> +	/**< Total pkt len: sum of all segment data_len. */
> +	uint32_t pkt_len;
> +
> +	/**< Amount of data in segment buffer. */
> +	uint16_t data_len;
> +
> +	/* fields on second cache line */
> +	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
> +	void *pool;
> +	void *next;
> +};
> +

Should all structures have "__rte_cache_aligned" in their
declarations, like other DPDK structs?


> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_dev.h b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
> new file mode 100644
> index 0000000..52952b4
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
>
> +
> +#define KDP_ERR(args...) printk(KERN_DEBUG "KDP: Error: " args)
> +#define KDP_PRINT(args...) printk(KERN_DEBUG "KDP: " args)
> +
> +#ifdef RTE_KDP_KO_DEBUG
> +#define KDP_DBG(args...) printk(KERN_DEBUG "KDP: " args)

     Is it good to have KERN_DEBUG "KDP: Debug: " here, like for errors?


> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_fifo.h b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
> new file mode 100644
> index 0000000..a5fe080
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
>
> +/**
> + * Adds num elements into the fifo. Return the number actually written
> + */
> +static inline unsigned
> +kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num)
> +{
> +	unsigned i = 0;
> +	unsigned fifo_write = fifo->write;
> +	unsigned fifo_read = fifo->read;
> +	unsigned new_write = fifo_write;
> +
> +	for (i = 0; i < num; i++) {
> +		new_write = (new_write + 1) & (fifo->len - 1);
> +
> +		if (new_write == fifo_read)
> +			break;
> +		fifo->buffer[fifo_write] = data[i];
> +		fifo_write = new_write;
> +	}
> +	fifo->write = fifo_write;
> +
> +	return i;
> +}

     You can add a header comment for every function declared in the header file,
in the format below. The same applies to the other header files and functions.

     *@Description

     *@params

     *@Return value


> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_misc.c b/lib/librte_eal/linuxapp/kdp/kdp_misc.c
> new file mode 100644
> index 0000000..d97d1c0
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kdp/kdp_misc.c
> +static int
> +kdp_compat_ioctl(struct inode *inode, unsigned int ioctl_num,
> +		unsigned long ioctl_param)
> +{
> +	/* 32 bits app on 64 bits OS to be supported later */
> +	KDP_PRINT("Not implemented.\n");

     Should this be warning/ERR instead of PRINT?

> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_net.c b/lib/librte_eal/linuxapp/kdp/kdp_net.c
> new file mode 100644
> index 0000000..5c669f5
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kdp/kdp_net.c
> +
> +static void
> +kdp_net_set_rx_mode(struct net_device *dev)
> +{
> +}

      Empty function body?

     Thanks,
     Reshma

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] kdp: add kernel data path kernel module
  2016-02-08 17:14   ` Reshma Pattan
@ 2016-02-09 10:53     ` Ferruh Yigit
  0 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-02-09 10:53 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

On Mon, Feb 08, 2016 at 05:14:54PM +0000, Reshma Pattan wrote:

Hi Reshma,

>
>
> On 1/27/2016 4:32 PM, Ferruh Yigit wrote:
>> This kernel module is based on KNI module, but this one is stripped
>> version of it and only for data messages, no control functionality
>> provided.
>>
>> FIFO implementation of the KNI is kept exact same, but ethtool related
>> code removed and virtual network management related code simplified.
>>
>> This module contains kernel support to create network devices and
>> this module has a simple driver for virtual network device, the driver
>> simply puts/gets packets to/from FIFO instead of real hardware.
>>
>> FIFO is created owned by userspace application, which is for this case
>> KDP PMD.
>>
>> In long term this patch intends to replace the KNI and KNI will be
>> depreciated.
>>
>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>> ---
>>   
>> diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
>> new file mode 100644
>> index 0000000..0c77f58
>> --- /dev/null
>> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
>>
>> +/**
>> + * KDP name is part of memzone name.
>> + */
>> +#define RTE_KDP_NAMESIZE 32
>> +
>> +#ifndef RTE_CACHE_LINE_SIZE
>> +#define RTE_CACHE_LINE_SIZE 64       /**< Cache line size. */
>> +#endif
>
> Jerin Jacob has a patch that cleans up the RTE_CACHE_LINE_SIZE macro and puts
> CONFIG_RTE_CACHE_LINE_SIZE in the config file. You may need to remove this
> once those changes are available in the code.
>
Thanks, once that patch is applied I can rebase the code.
>> +
>> +/*
>> + * The kernel image of the rte_mbuf struct, with only the relevant fields.
>> + * Padding is necessary to assure the offsets of these fields
>> + */
>> +struct rte_kdp_mbuf {
>> +	void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
>> +	char pad0[10];
>> +
>> +	/**< Start address of data in segment buffer. */
>> +	uint16_t data_off;
>> +	char pad1[4];
>> +	uint64_t ol_flags;      /**< Offload features. */
>
>     You are not using ol_flags down in the code. Should this be removed?
>
It can't be removed; this struct has to match rte_mbuf.

>> +	char pad2[4];
>> +
>> +	/**< Total pkt len: sum of all segment data_len. */
>> +	uint32_t pkt_len;
>> +
>> +	/**< Amount of data in segment buffer. */
>> +	uint16_t data_len;
>> +
>> +	/* fields on second cache line */
>> +	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
>> +	void *pool;
>> +	void *next;
>> +};
>> +
>
> Should all structures have "__rte_cache_aligned" in their declarations,
> like other DPDK structs?
>
This is a kernel module; it doesn't know about userspace library macros.
>
>> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_dev.h b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
>> new file mode 100644
>> index 0000000..52952b4
>> --- /dev/null
>> +++ b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
>>
>> +
>> +#define KDP_ERR(args...) printk(KERN_DEBUG "KDP: Error: " args)
>> +#define KDP_PRINT(args...) printk(KERN_DEBUG "KDP: " args)
>> +
>> +#ifdef RTE_KDP_KO_DEBUG
>> +#define KDP_DBG(args...) printk(KERN_DEBUG "KDP: " args)
>
>     Is it good to have KERN_DEBUG "KDP: Debug: " here, like for errors?
>
I think an extra "Debug" prefix is not required here.

>
>> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_fifo.h b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
>> new file mode 100644
>> index 0000000..a5fe080
>> --- /dev/null
>> +++ b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
>>
>> +/**
>> + * Adds num elements into the fifo. Return the number actually written
>> + */
>> +static inline unsigned
>> +kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num)
>> +{
>> +	unsigned i = 0;
>> +	unsigned fifo_write = fifo->write;
>> +	unsigned fifo_read = fifo->read;
>> +	unsigned new_write = fifo_write;
>> +
>> +	for (i = 0; i < num; i++) {
>> +		new_write = (new_write + 1) & (fifo->len - 1);
>> +
>> +		if (new_write == fifo_read)
>> +			break;
>> +		fifo->buffer[fifo_write] = data[i];
>> +		fifo_write = new_write;
>> +	}
>> +	fifo->write = fifo_write;
>> +
>> +	return i;
>> +}
>
>     You can add a header comment for every function declared in the header file,
> in the format below. The same applies to the other header files and functions.
>
>     *@Description
>
>     *@params
>
>     *@Return value
>
This is a private header.
>
>> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_misc.c b/lib/librte_eal/linuxapp/kdp/kdp_misc.c
>> new file mode 100644
>> index 0000000..d97d1c0
>> --- /dev/null
>> +++ b/lib/librte_eal/linuxapp/kdp/kdp_misc.c
>> +static int
>> +kdp_compat_ioctl(struct inode *inode, unsigned int ioctl_num,
>> +		unsigned long ioctl_param)
>> +{
>> +	/* 32 bits app on 64 bits OS to be supported later */
>> +	KDP_PRINT("Not implemented.\n");
>
>     Should this be warning/ERR instead of PRINT?
>
>> diff --git a/lib/librte_eal/linuxapp/kdp/kdp_net.c b/lib/librte_eal/linuxapp/kdp/kdp_net.c
>> new file mode 100644
>> index 0000000..5c669f5
>> --- /dev/null
>> +++ b/lib/librte_eal/linuxapp/kdp/kdp_net.c
>> +
>> +static void
>> +kdp_net_set_rx_mode(struct net_device *dev)
>> +{
>> +}
>
>      Empty function body?
>
Yes, this is part of net_device_ops and is required to fake multicast support.

Regards,
ferruh
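
Since the reply above says the struct has to match rte_mbuf, a small layout
check can make the effect of the padding visible. This is only a sketch: it
copies the field list quoted in this thread into a hypothetical
struct kdp_mbuf_copy, assumes an LP64 target (8-byte pointers) and a GCC/Clang
compatible compiler for the aligned attribute, and checks the offsets that the
padding produces. Whether those offsets equal the rte_mbuf offsets of a given
DPDK release is not verified here and would have to be compared separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef RTE_CACHE_LINE_SIZE
#define RTE_CACHE_LINE_SIZE 64	/* same fallback as the quoted header */
#endif

/* Copy of the quoted field list under a hypothetical name. */
struct kdp_mbuf_copy {
	void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
	char pad0[10];
	uint16_t data_off;	/* start of data in segment buffer */
	char pad1[4];
	uint64_t ol_flags;	/* offload features */
	char pad2[4];
	uint32_t pkt_len;	/* sum of all segment data_len */
	uint16_t data_len;	/* amount of data in segment buffer */

	/* fields on second cache line */
	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
	void *pool;
	void *next;
};

/*
 * Hypothetical helper: returns 1 when the offsets are the ones the padding
 * yields on an LP64 target, 0 otherwise.
 */
static int
kdp_mbuf_layout_ok(void)
{
	return offsetof(struct kdp_mbuf_copy, buf_addr) == 0 &&
	       offsetof(struct kdp_mbuf_copy, data_off) == 18 &&
	       offsetof(struct kdp_mbuf_copy, ol_flags) == 24 &&
	       offsetof(struct kdp_mbuf_copy, pkt_len) == 36 &&
	       offsetof(struct kdp_mbuf_copy, data_len) == 40 &&
	       offsetof(struct kdp_mbuf_copy, pad3) == RTE_CACHE_LINE_SIZE &&
	       offsetof(struct kdp_mbuf_copy, pool) == 72 &&
	       offsetof(struct kdp_mbuf_copy, next) == 80;
}
```

Dropping ol_flags without adding eight bytes of padding in its place would
move pkt_len and data_len, which is why an unused field still has to stay.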

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication
  2016-01-27 16:32 ` [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
  2016-01-28  8:16   ` Xu, Qian Q
@ 2016-02-09 17:33   ` Reshma Pattan
  2016-02-09 17:51     ` Ferruh Yigit
  1 sibling, 1 reply; 29+ messages in thread
From: Reshma Pattan @ 2016-02-09 17:33 UTC (permalink / raw)
  To: Ferruh Yigit, dev

Hi Ferruh,

On 1/27/2016 4:32 PM, Ferruh Yigit wrote:
> This patch provides slow data path communication to the Linux kernel.
> Patch is based on librte_kni, and heavily re-uses it.
>
> The main difference is librte_kni library converted into a PMD, to
> provide ease of use for applications.
>
> Now any application can use slow path communication without any update
> in application, because of existing eal support for virtual PMD.
>
> Also this PMD supports two methods to send packets to the Linux, first
> one is custom FIFO implementation with help of KDP kernel module, second
> one is Linux in-kernel tun/tap support. PMD first checks for KDP kernel
> module, if fails it tries to create and use a tap interface.
>
> With FIFO method: PMD's rx_pkt_burst() get packets from FIFO,
> and tx_pkt_burst() puts packet to the FIFO.
> The corresponding Linux virtual network device driver code
> also gets/puts packets from FIFO as they are coming from hardware.
>
> With tun/tap method: no external kernel module required, PMD reads from
> and writes packets to the tap interface file descriptor. Tap interface
> has performance penalty against FIFO implementation.
>
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
>   
> diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst
> index 46aa3ac..78b7b61 100644
> --- a/doc/guides/nics/pcap_ring.rst
> +++ b/doc/guides/nics/pcap_ring.rst
> @@ -28,11 +28,11 @@
> +
> +
> +DPDK application can be used to forward packages between these interfaces:
> +

     Packages ==> packets.?

> diff --git a/drivers/net/kdp/rte_eth_kdp.c b/drivers/net/kdp/rte_eth_kdp.c
> new file mode 100644
> index 0000000..ac650d7
> --- /dev/null
> +++ b/drivers/net/kdp/rte_eth_kdp.c
> @@ -0,0 +1,481 @@
>

     No public API to create KDP PMD device. We should have one right?

> diff --git a/drivers/net/kdp/rte_kdp.h b/drivers/net/kdp/rte_kdp.h
> new file mode 100644
> index 0000000..b9db048
> --- /dev/null
> +++ b/drivers/net/kdp/rte_kdp.h
> @@ -0,0 +1,126 @@
>
> +struct rte_kdp_tap *rte_kdp_tap_init(uint16_t port_id);
> +struct rte_kdp *rte_kdp_init(uint16_t port_id);
> +
> +int rte_kdp_start(struct rte_kdp *kdp, struct rte_mempool *pktmbuf_pool,
> +	      const struct rte_kdp_conf *conf);
> +
> +unsigned rte_kdp_rx_burst(struct rte_kdp *kdp,
> +		struct rte_mbuf **mbufs, unsigned num);
> +
> +unsigned rte_kdp_tx_burst(struct rte_kdp *kdp,
> +		struct rte_mbuf **mbufs, unsigned num);
> +
> +int rte_kdp_release(struct rte_kdp *kdp);
> +
> +void rte_kdp_close(void);
>

     These functions can be static.

     Thanks,
     Reshma

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication
  2016-02-09 17:33   ` Reshma Pattan
@ 2016-02-09 17:51     ` Ferruh Yigit
  0 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-02-09 17:51 UTC (permalink / raw)
  To: Reshma Pattan; +Cc: dev

On Tue, Feb 09, 2016 at 05:33:55PM +0000, Reshma Pattan wrote:
> Hi Ferruh,
>
Hi Reshma,

> On 1/27/2016 4:32 PM, Ferruh Yigit wrote:
>> This patch provides slow data path communication to the Linux kernel.
>> Patch is based on librte_kni, and heavily re-uses it.
>>
>> The main difference is librte_kni library converted into a PMD, to
>> provide ease of use for applications.
>>
>> Now any application can use slow path communication without any update
>> in application, because of existing eal support for virtual PMD.
>>
>> Also this PMD supports two methods to send packets to the Linux, first
>> one is custom FIFO implementation with help of KDP kernel module, second
>> one is Linux in-kernel tun/tap support. PMD first checks for KDP kernel
>> module, if fails it tries to create and use a tap interface.
>>
>> With FIFO method: PMD's rx_pkt_burst() get packets from FIFO,
>> and tx_pkt_burst() puts packet to the FIFO.
>> The corresponding Linux virtual network device driver code
>> also gets/puts packets from FIFO as they are coming from hardware.
>>
>> With tun/tap method: no external kernel module required, PMD reads from
>> and writes packets to the tap interface file descriptor. Tap interface
>> has performance penalty against FIFO implementation.
>>
>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>> ---
>>   
>> diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst
>> index 46aa3ac..78b7b61 100644
>> --- a/doc/guides/nics/pcap_ring.rst
>> +++ b/doc/guides/nics/pcap_ring.rst
>> @@ -28,11 +28,11 @@
>> +
>> +
>> +DPDK application can be used to forward packages between these interfaces:
>> +
>
>     Packages ==> packets.?
>
Right, I will fix, thanks.

>> diff --git a/drivers/net/kdp/rte_eth_kdp.c b/drivers/net/kdp/rte_eth_kdp.c
>> new file mode 100644
>> index 0000000..ac650d7
>> --- /dev/null
>> +++ b/drivers/net/kdp/rte_eth_kdp.c
>> @@ -0,0 +1,481 @@
>>
>
>     No public API to create KDP PMD device. We should have one right?
>
It doesn't have to have one; KDP has no requirement for a public API right now.
It is possible to create the PMD with the eal --vdev parameter...

>> diff --git a/drivers/net/kdp/rte_kdp.h b/drivers/net/kdp/rte_kdp.h
>> new file mode 100644
>> index 0000000..b9db048
>> --- /dev/null
>> +++ b/drivers/net/kdp/rte_kdp.h
>> @@ -0,0 +1,126 @@
>>
>> +struct rte_kdp_tap *rte_kdp_tap_init(uint16_t port_id);
>> +struct rte_kdp *rte_kdp_init(uint16_t port_id);
>> +
>> +int rte_kdp_start(struct rte_kdp *kdp, struct rte_mempool *pktmbuf_pool,
>> +	      const struct rte_kdp_conf *conf);
>> +
>> +unsigned rte_kdp_rx_burst(struct rte_kdp *kdp,
>> +		struct rte_mbuf **mbufs, unsigned num);
>> +
>> +unsigned rte_kdp_tx_burst(struct rte_kdp *kdp,
>> +		struct rte_mbuf **mbufs, unsigned num);
>> +
>> +int rte_kdp_release(struct rte_kdp *kdp);
>> +
>> +void rte_kdp_close(void);
>>
>
>     These functions can be static.
>
No, this header is used by multiple source files; the functions declared here are the ones called from other files.

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 0/2] slow data path communication between DPDK port and Linux
  2016-01-27 16:32 [PATCH 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2016-01-27 16:32 ` [PATCH 1/2] kdp: add kernel data path kernel module Ferruh Yigit
  2016-01-27 16:32 ` [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
@ 2016-02-19  5:05 ` Ferruh Yigit
  2016-02-19  5:05   ` [PATCH v2 1/2] kdp: add kernel data path kernel module Ferruh Yigit
                     ` (2 more replies)
  2 siblings, 3 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-02-19  5:05 UTC (permalink / raw)
  To: dev

This is a slow data path communication implementation based on the existing KNI.

The difference is that librte_kni is converted into a PMD; the kdp kernel module is
almost the same, except that all control path functionality is removed and some
simplification is done.

The motivation is to simplify slow path data communication.
Now any application can use this new PMD to send/get data to/from the Linux kernel.

The PMD supports two communication methods:

1) KDP kernel module
The PMD initialization functions handle creating the virtual interfaces (with the
help of the kdp kernel module) and the FIFO. The FIFO is used to share data between
userspace and kernelspace. This is the default method.

2) tun/tap module
When the KDP module is not inserted, the PMD creates a tap interface and transfers
packets using that tap interface.

In the long term this patch intends to replace KNI, and KNI will be
deprecated.

v2:
* Use rtnetlink to create interfaces
* include modules.h to prevent compile error in old kernels


Sample usage:
1) Transfer any packet received from a NIC that is bound to DPDK to the Linux kernel

a) insert kdp kernel module
insmod build/kmod/rte_kdp.ko

b) bind NIC to the DPDK using dpdk_nic_bind.py

c) ./testpmd --vdev eth_kdp0

c1) testpmd shows two ports, one of them physical, the other virtual
...
Configuring Port 0 (socket 0)
Port 0: 00:00:00:00:00:00
Configuring Port 1 (socket 0)
...
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
Done

c2) This will create "kdp0" Linux interface
$ ip l show kdp0
21: kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff

d) Linux port can be used for data

d1)
$ ifconfig kdp0 1.0.0.2
$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
64 bytes from 1.0.0.1: icmp_seq=1 ttl=64 time=0.789 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=64 time=0.881 ms

d2)
$ tcpdump -nn -i kdp0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on kdp0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:01:22.407506 IP 1.0.0.1 > 1.0.0.2: ICMP echo request, id 40016, seq 18, length 64
15:01:22.408521 IP 1.0.0.2 > 1.0.0.1: ICMP echo reply, id 40016, seq 18, length 64



2) Data travelling between virtual Linux interfaces passes through the DPDK
application, and the application can alter the data

a) insert kdp kernel module
insmod build/kmod/rte_kdp.ko

b) No physical NIC involved

c) ./testpmd --vdev eth_kdp0 --vdev eth_kdp1

c1) testpmd shows two ports, both of them virtual
...
Configuring Port 0 (socket 0)
Port 0: 00:00:00:00:00:00
Configuring Port 1 (socket 0)
Port 1: 00:00:00:00:00:00
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
Done

c2) This will create "kdp0"  and "kdp1" Linux interfaces
$ ip l show kdp0; ip l show kdp1
22: kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
23: kdp1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff

d) Data travelling between the virtual ports passes through the DPDK application
$ ifconfig kdp0 1.0.0.1
$ ifconfig kdp1 1.0.0.2

d1)
$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
64 bytes from 1.0.0.1: icmp_seq=1 ttl=64 time=3.57 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=64 time=1.85 ms
64 bytes from 1.0.0.1: icmp_seq=3 ttl=64 time=1.89 ms

d2)
$ tcpdump -nn -i kdp0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on kdp0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:20:51.908543 IP 1.0.0.2 > 1.0.0.1: ICMP echo request, id 41234, seq 1, length 64
15:20:51.909570 IP 1.0.0.1 > 1.0.0.2: ICMP echo reply, id 41234, seq 1, length 64
15:20:52.909551 IP 1.0.0.2 > 1.0.0.1: ICMP echo request, id 41234, seq 2, length 64
15:20:52.910577 IP 1.0.0.1 > 1.0.0.2: ICMP echo reply, id 41234, seq 2, length 64



3) tun/tap interface usage

a) No external module required, tun/tap support in kernel required

b) ./testpmd --vdev eth_kdp0 --vdev eth_kdp1

b1) This will create "tap_kdp0"  and "tap_kdp1" Linux interfaces
$ ip l show tap_kdp0; ip l show tap_kdp1
25: tap_kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
    link/ether 56:47:97:9c:03:8e brd ff:ff:ff:ff:ff:ff
26: tap_kdp1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
    link/ether 5e:15:22:b0:52:42 brd ff:ff:ff:ff:ff:ff
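
In the v1 posting of rte_kdp_tap.c, the tap file descriptor is switched to
non-blocking mode with two fcntl() calls whose return values are not checked.
The sketch below shows one way such a step could be written with error
handling. sketch_set_nonblock() and sketch_read_would_block() are hypothetical
helpers, not functions from the patch, and they work on any file descriptor,
so they can be tried on a pipe without tun/tap privileges.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd, return 0 on success, -1 on error. */
static int
sketch_set_nonblock(int fd)
{
	int flags;

	assert(fd >= 0);

	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0)
		return -1;

	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;

	return 0;
}

/*
 * Hypothetical helper: try to read one byte.
 * Returns 1 if nothing is pending on a non-blocking fd, 0 if the read
 * completed (data or end of file), -1 on any other error.
 */
static int
sketch_read_would_block(int fd)
{
	char byte;
	ssize_t n = read(fd, &byte, 1);

	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return 1;

	return n < 0 ? -1 : 0;
}
```

A polling receive routine depends on the "would block" case to return
immediately when no packet is pending, so a silent failure to set O_NONBLOCK
would show up as a stalled poll loop rather than as an error.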

Ferruh Yigit (2):
  kdp: add kernel data path kernel module
  kdp: add virtual PMD for kernel slow data path communication

 MAINTAINERS                                        |   5 +
 config/common_linuxapp                             |   9 +-
 doc/guides/nics/pcap_ring.rst                      | 125 ++-
 doc/guides/rel_notes/release_16_04.rst             |   6 +
 drivers/net/Makefile                               |   3 +-
 drivers/net/kdp/Makefile                           |  61 ++
 drivers/net/kdp/rte_eth_kdp.c                      | 501 ++++++++++++
 drivers/net/kdp/rte_kdp.c                          | 633 +++++++++++++++
 drivers/net/kdp/rte_kdp.h                          | 116 +++
 drivers/net/kdp/rte_kdp_fifo.h                     |  91 +++
 drivers/net/kdp/rte_kdp_tap.c                      | 101 +++
 drivers/net/kdp/rte_pmd_kdp_version.map            |   4 +
 lib/librte_eal/common/include/rte_log.h            |   3 +-
 lib/librte_eal/linuxapp/Makefile                   |   5 +-
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 .../linuxapp/eal/include/exec-env/rte_kdp_common.h | 139 ++++
 lib/librte_eal/linuxapp/kdp/Makefile               |  55 ++
 lib/librte_eal/linuxapp/kdp/kdp_dev.h              |  78 ++
 lib/librte_eal/linuxapp/kdp/kdp_fifo.h             |  91 +++
 lib/librte_eal/linuxapp/kdp/kdp_net.c              | 862 +++++++++++++++++++++
 mk/rte.app.mk                                      |   3 +-
 21 files changed, 2885 insertions(+), 9 deletions(-)
 create mode 100644 drivers/net/kdp/Makefile
 create mode 100644 drivers/net/kdp/rte_eth_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.h
 create mode 100644 drivers/net/kdp/rte_kdp_fifo.h
 create mode 100644 drivers/net/kdp/rte_kdp_tap.c
 create mode 100644 drivers/net/kdp/rte_pmd_kdp_version.map
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_fifo.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_net.c

-- 
2.5.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/2] kdp: add kernel data path kernel module
  2016-02-19  5:05 ` [PATCH v2 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
@ 2016-02-19  5:05   ` Ferruh Yigit
  2016-02-19  5:05   ` [PATCH v2 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
  2016-03-09 11:17   ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-02-19  5:05 UTC (permalink / raw)
  To: dev

This kernel module is based on the KNI module, but it is a stripped-down
version that handles only data messages; no control functionality is
provided.

The FIFO implementation of KNI is kept exactly the same, but the
ethtool-related code is removed and the virtual network management code
is simplified.

This module contains kernel support to create network devices, and it
has a simple driver for a virtual network device. The driver simply
puts/gets packets to/from a FIFO instead of real hardware.

The FIFO is created and owned by the userspace application, which in this
case is the KDP PMD.

In the long term this patch intends to replace KNI, and KNI will be
deprecated.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---

v2:
* Use rtnetlink to create interfaces
* Include module.h to prevent a compile error on old kernels
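
For readers new to the KNI-style FIFO mentioned in the commit message, the
following is a minimal standalone sketch of the same circular-buffer idea. It
is not the code in this patch: the names demo_fifo, demo_fifo_* and
DEMO_FIFO_LEN are hypothetical, the buffer is statically sized, and a single
producer and single consumer with no memory barriers are assumed.

```c
#include <assert.h>

#define DEMO_FIFO_LEN 8u /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for struct rte_kdp_fifo, sized statically. */
struct demo_fifo {
	volatile unsigned write; /* next position to be written */
	volatile unsigned read;  /* next position to be read */
	unsigned len;            /* number of slots, power of two */
	void *buffer[DEMO_FIFO_LEN];
};

static inline void demo_fifo_init(struct demo_fifo *f)
{
	f->write = 0;
	f->read = 0;
	f->len = DEMO_FIFO_LEN;
	/* index masking below only works for power-of-two lengths */
	assert((f->len & (f->len - 1)) == 0);
}

/* Add up to num elements; return how many were actually stored. */
static inline unsigned
demo_fifo_put(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned wr = f->write;
	unsigned rd = f->read;

	for (i = 0; i < num; i++) {
		unsigned next = (wr + 1) & (f->len - 1);

		if (next == rd)
			break; /* full: one slot always stays empty */
		f->buffer[wr] = data[i];
		wr = next;
	}
	f->write = wr;
	return i;
}

/* Remove up to num elements; return how many were actually read. */
static inline unsigned
demo_fifo_get(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned rd = f->read;
	unsigned wr = f->write;

	for (i = 0; i < num; i++) {
		if (rd == wr)
			break; /* empty */
		data[i] = f->buffer[rd];
		rd = (rd + 1) & (f->len - 1);
	}
	f->read = rd;
	return i;
}

/* Number of elements currently stored. */
static inline unsigned demo_fifo_count(struct demo_fifo *f)
{
	return (f->len + f->write - f->read) & (f->len - 1);
}

/* Number of elements that can still be stored. */
static inline unsigned demo_fifo_free_count(struct demo_fifo *f)
{
	return (f->read - f->write - 1) & (f->len - 1);
}
```

Because write == read is reserved to mean "empty", a FIFO with len slots
holds at most len - 1 entries; with the 8 slots assumed here, a burst put of
10 pointers stores only 7.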
---
 MAINTAINERS                                        |   4 +
 config/common_linuxapp                             |   8 +-
 lib/librte_eal/linuxapp/Makefile                   |   5 +-
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 .../linuxapp/eal/include/exec-env/rte_kdp_common.h | 139 ++++
 lib/librte_eal/linuxapp/kdp/Makefile               |  55 ++
 lib/librte_eal/linuxapp/kdp/kdp_dev.h              |  78 ++
 lib/librte_eal/linuxapp/kdp/kdp_fifo.h             |  91 +++
 lib/librte_eal/linuxapp/kdp/kdp_net.c              | 862 +++++++++++++++++++++
 9 files changed, 1242 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_fifo.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_net.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 628bc05..05ffe26 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -257,6 +257,10 @@ F: app/test/test_kni.c
 F: examples/kni/
 F: doc/guides/sample_app_ug/kernel_nic_interface.rst
 
+Linux KDP
+M: Ferruh Yigit <ferruh.yigit@gmail.com>
+F: lib/librte_eal/linuxapp/kdp/
+
 Linux AF_PACKET
 M: John W. Linville <linville@tuxdriver.com>
 F: drivers/net/af_packet/
diff --git a/config/common_linuxapp b/config/common_linuxapp
index f1638db..e1b5032 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -314,6 +314,12 @@ CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
 CONFIG_RTE_LIBRTE_PMD_NULL=y
 
 #
+# Compile KDP PMD
+#
+CONFIG_RTE_KDP_KMOD=y
+CONFIG_RTE_KDP_PREEMPT_DEFAULT=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index d9c5233..e3f91a7 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -38,6 +38,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
 ifeq ($(CONFIG_RTE_KNI_KMOD),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
 endif
+ifeq ($(CONFIG_RTE_KDP_KMOD),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kdp
+endif
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0
 endif
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 6e26250..a70b793 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -121,6 +121,7 @@ CFLAGS_eal_thread.o += -Wno-return-type
 endif
 
 INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h
+INC += rte_kdp_common.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP)-include/exec-env := \
 	$(addprefix include/exec-env/,$(INC))
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
new file mode 100644
index 0000000..0334876
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
@@ -0,0 +1,139 @@
+/*-
+ *   This file is provided under a dual BSD/LGPLv2 license.  When using or
+ *   redistributing this file, you may do so under either license.
+ *
+ *   GNU LESSER GENERAL PUBLIC LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2.1 of the GNU Lesser General Public License
+ *   as published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   Lesser General Public License for more details.
+ *
+ *   You should have received a copy of the GNU Lesser General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ *
+ *
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in
+ *     the documentation and/or other materials provided with the
+ *     distribution.
+ *   * Neither the name of Intel Corporation nor the names of its
+ *     contributors may be used to endorse or promote products derived
+ *     from this software without specific prior written permission.
+ *
+ *    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#ifndef _RTE_KDP_COMMON_H_
+#define _RTE_KDP_COMMON_H_
+
+/**
+ * KDP name
+ */
+#define RTE_KDP_NAMESIZE 32
+
+#define KDP_DEVICE "kdp"
+
+/*
+ * Fifo struct mapped in shared memory. It describes a circular buffer FIFO.
+ * Write and read should wrap around. Fifo is empty when write == read.
+ * Writing should never overwrite the read position.
+ */
+struct rte_kdp_fifo {
+	volatile unsigned write;     /**< Next position to be written */
+	volatile unsigned read;      /**< Next position to be read */
+	unsigned len;                /**< Circular buffer length */
+	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
+	void * volatile buffer[0];   /**< The buffer contains mbuf pointers */
+};
+
+/*
+ * The kernel image of the rte_mbuf struct, with only the relevant fields.
+ * Padding is necessary to assure the offsets of these fields
+ */
+struct rte_kdp_mbuf {
+	void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
+	char pad0[10];
+
+	/**< Start address of data in segment buffer. */
+	uint16_t data_off;
+	char pad1[4];
+	uint64_t ol_flags;      /**< Offload features. */
+	char pad2[4];
+
+	/**< Total pkt len: sum of all segment data_len. */
+	uint32_t pkt_len;
+
+	/**< Amount of data in segment buffer. */
+	uint16_t data_len;
+
+	/* fields on second cache line */
+	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
+	void *pool;
+	void *next;
+};
+
+/*
+ * Struct used to create a KDP device. Passed to the kernel in IOCTL call
+ */
+struct rte_kdp_device_info {
+	char name[RTE_KDP_NAMESIZE];  /**< Network device name for KDP */
+
+	phys_addr_t tx_phys;
+	phys_addr_t rx_phys;
+	phys_addr_t alloc_phys;
+	phys_addr_t free_phys;
+
+	/* mbuf mempool */
+	void *mbuf_va;
+	phys_addr_t mbuf_phys;
+
+	uint16_t port_id;            /**< Group ID */
+	uint32_t core_id;             /**< core ID to bind for kernel thread */
+
+	uint8_t force_bind : 1;       /**< Flag for kernel thread binding */
+
+	/* mbuf size */
+	unsigned mbuf_size;
+};
+
+enum {
+	IFLA_KDP_UNSPEC,
+	IFLA_KDP_PORTID,
+	IFLA_KDP_DEVINFO,
+	__IFLA_KDP_MAX,
+};
+#define IFLA_KDP_MAX (__IFLA_KDP_MAX - 1)
+
+#endif /* _RTE_KDP_COMMON_H_ */
diff --git a/lib/librte_eal/linuxapp/kdp/Makefile b/lib/librte_eal/linuxapp/kdp/Makefile
new file mode 100644
index 0000000..3897dc6
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/Makefile
@@ -0,0 +1,55 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = rte_kdp
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR) --param max-inline-insns-single=50
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+MODULE_CFLAGS += -Wall -Werror
+
+# this lib needs main eal
+DEPDIRS-y += lib/librte_eal/linuxapp/eal
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y += kdp_net.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_dev.h b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
new file mode 100644
index 0000000..61f4288
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
@@ -0,0 +1,78 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KDP_DEV_H_
+#define _KDP_DEV_H_
+
+#include <exec-env/rte_kdp_common.h>
+
+/**
+ * A structure describing the private information for a kdp device.
+ */
+struct kdp_dev {
+	/* kdp list */
+	struct list_head list;
+
+	struct net_device_stats stats;
+	uint16_t port_id;            /* Group ID of a group of KDP devices */
+	unsigned core_id;            /* Core ID to bind */
+	char name[RTE_KDP_NAMESIZE]; /* Network device name */
+	struct task_struct *pthread;
+
+	/* wait queue for req/resp */
+	wait_queue_head_t wq;
+	struct mutex sync_lock;
+
+	/* kdp device */
+	struct net_device *net_dev;
+
+	/* queue for packets to be sent out */
+	void *tx_q;
+
+	/* queue for the packets received */
+	void *rx_q;
+
+	/* queue of allocated mbufs that can be used to hold sk_buff data */
+	void *alloc_q;
+
+	/* free queue for the mbufs to be freed */
+	void *free_q;
+
+	void *sync_kva;
+	void *sync_va;
+
+	void *mbuf_kva;
+	void *mbuf_va;
+
+	/* mbuf size */
+	unsigned mbuf_size;
+};
+
+#define KDP_ERR(args...) printk(KERN_ERR "KDP: " args)
+#define KDP_PRINT(args...) printk(KERN_DEBUG "KDP: " args)
+
+#ifdef RTE_KDP_KO_DEBUG
+#define KDP_DBG(args...) printk(KERN_DEBUG "KDP: " args)
+#else
+#define KDP_DBG(args...)
+#endif
+
+#endif
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_fifo.h b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
new file mode 100644
index 0000000..a5fe080
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
@@ -0,0 +1,91 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KDP_FIFO_H_
+#define _KDP_FIFO_H_
+
+#include <exec-env/rte_kdp_common.h>
+
+/**
+ * Adds num elements into the fifo. Return the number actually written
+ */
+static inline unsigned
+kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned fifo_write = fifo->write;
+	unsigned fifo_read = fifo->read;
+	unsigned new_write = fifo_write;
+
+	for (i = 0; i < num; i++) {
+		new_write = (new_write + 1) & (fifo->len - 1);
+
+		if (new_write == fifo_read)
+			break;
+		fifo->buffer[fifo_write] = data[i];
+		fifo_write = new_write;
+	}
+	fifo->write = fifo_write;
+
+	return i;
+}
+
+/**
+ * Get up to num elements from the fifo. Return the number actually read
+ */
+static inline unsigned
+kdp_fifo_get(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned new_read = fifo->read;
+	unsigned fifo_write = fifo->write;
+
+	for (i = 0; i < num; i++) {
+		if (new_read == fifo_write)
+			break;
+
+		data[i] = fifo->buffer[new_read];
+		new_read = (new_read + 1) & (fifo->len - 1);
+	}
+	fifo->read = new_read;
+
+	return i;
+}
+
+/**
+ * Get the num of elements in the fifo
+ */
+static inline unsigned
+kdp_fifo_count(struct rte_kdp_fifo *fifo)
+{
+	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+}
+
+/**
+ * Get the num of available elements in the fifo
+ */
+static inline unsigned
+kdp_fifo_free_count(struct rte_kdp_fifo *fifo)
+{
+	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+}
+
+#endif /* _KDP_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_net.c b/lib/librte_eal/linuxapp/kdp/kdp_net.c
new file mode 100644
index 0000000..08229f1
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_net.c
@@ -0,0 +1,862 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+/*
+ * This code is inspired by the book "Linux Device Drivers" by
+ * Alessandro Rubini and Jonathan Corbet, published by O'Reilly & Associates
+ */
+
+#include <linux/version.h>
+#include <linux/module.h>
+#include <linux/etherdevice.h> /* eth_type_trans */
+#include <linux/kthread.h>
+#include <net/rtnetlink.h>
+
+#include "kdp_fifo.h"
+#include "kdp_dev.h"
+
+#define WD_TIMEOUT 5 /* jiffies */
+#define MBUF_BURST_SZ 32
+
+#define KDP_RX_LOOP_NUM 1000
+#define KDP_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
+
+static struct task_struct *kdp_kthread;
+static struct rw_semaphore kdp_list_lock;
+static struct list_head kdp_list_head;
+
+/* loopback mode */
+static char *lo_mode;
+
+/* Kernel thread mode */
+static char *kthread_mode;
+static unsigned multiple_kthread_on;
+
+/* typedef for rx function */
+typedef void (*kdp_net_rx_t)(struct kdp_dev *kdp);
+
+/*
+ * Open and close
+ */
+static int kdp_net_open(struct net_device *dev)
+{
+	random_ether_addr(dev->dev_addr);
+	netif_start_queue(dev);
+
+	return 0;
+}
+
+static int kdp_net_release(struct net_device *dev)
+{
+	netif_stop_queue(dev); /* can't transmit any more */
+
+	return 0;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+static int kdp_net_config(struct net_device *dev, struct ifmap *map)
+{
+	if (dev->flags & IFF_UP) /* can't act on a running interface */
+		return -EBUSY;
+
+	/* ignore other fields */
+	return 0;
+}
+
+/*
+ * Transmit a packet (called by the kernel)
+ */
+static int kdp_net_tx(struct sk_buff *skb, struct net_device *dev)
+{
+	int len = 0;
+	unsigned ret;
+	struct kdp_dev *kdp = netdev_priv(dev);
+	struct rte_kdp_mbuf *pkt_kva = NULL;
+	struct rte_kdp_mbuf *pkt_va = NULL;
+
+	dev->trans_start = jiffies; /* save the timestamp */
+
+	/* Drop the skb if it is longer than the mbuf size */
+	if (skb->len > kdp->mbuf_size)
+		goto drop;
+
+	/**
+	 * Check if it has at least one free entry in tx_q and
+	 * one entry in alloc_q.
+	 */
+	if (kdp_fifo_free_count(kdp->tx_q) == 0 ||
+			kdp_fifo_count(kdp->alloc_q) == 0) {
+		/**
+		 * If no free entry in tx_q or no entry in alloc_q,
+		 * drops skb and goes out.
+		 */
+		goto drop;
+	}
+
+	/* dequeue a mbuf from alloc_q */
+	ret = kdp_fifo_get(kdp->alloc_q, (void **)&pkt_va, 1);
+	if (likely(ret == 1)) {
+		void *data_kva;
+
+		pkt_kva = (void *)pkt_va - kdp->mbuf_va + kdp->mbuf_kva;
+		data_kva = pkt_kva->buf_addr + pkt_kva->data_off - kdp->mbuf_va
+				+ kdp->mbuf_kva;
+
+		len = skb->len;
+		memcpy(data_kva, skb->data, len);
+		if (unlikely(len < ETH_ZLEN)) {
+			memset(data_kva + len, 0, ETH_ZLEN - len);
+			len = ETH_ZLEN;
+		}
+		pkt_kva->pkt_len = len;
+		pkt_kva->data_len = len;
+
+		/* enqueue mbuf into tx_q */
+		ret = kdp_fifo_put(kdp->tx_q, (void **)&pkt_va, 1);
+		if (unlikely(ret != 1)) {
+			/* Failing should not happen */
+			KDP_ERR("Fail to enqueue mbuf into tx_q\n");
+			goto drop;
+		}
+	} else {
+		/* Failing should not happen */
+		KDP_ERR("Fail to dequeue mbuf from alloc_q\n");
+		goto drop;
+	}
+
+	/* Free skb and update statistics */
+	dev_kfree_skb(skb);
+	kdp->stats.tx_bytes += len;
+	kdp->stats.tx_packets++;
+
+	return NETDEV_TX_OK;
+
+drop:
+	/* Free skb and update statistics */
+	dev_kfree_skb(skb);
+	kdp->stats.tx_dropped++;
+
+	return NETDEV_TX_OK;
+}
+
+static int kdp_net_change_mtu(struct net_device *dev, int new_mtu)
+{
+	KDP_DBG("kdp_net_change_mtu new mtu %d to be set\n", new_mtu);
+
+	dev->mtu = new_mtu;
+
+	return 0;
+}
+
+/*
+ * Ioctl commands
+ */
+static int kdp_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+	KDP_DBG("kdp_net_ioctl %d\n",
+		((struct kdp_dev *)netdev_priv(dev))->port_id);
+
+	return 0;
+}
+
+static void kdp_net_set_rx_mode(struct net_device *dev)
+{
+}
+
+/*
+ * Return statistics to the caller
+ */
+static struct net_device_stats *kdp_net_stats(struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+
+	return &kdp->stats;
+}
+
+/*
+ * Deal with a transmit timeout.
+ */
+static void kdp_net_tx_timeout(struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+
+	KDP_DBG("Transmit timeout at %ld, latency %ld\n", jiffies,
+			jiffies - dev->trans_start);
+
+	kdp->stats.tx_errors++;
+	netif_wake_queue(dev);
+}
+
+/**
+ * kdp_net_set_mac - Change the Ethernet Address of the KDP NIC
+ * @netdev: network interface device structure
+ * @p: pointer to an address structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int kdp_net_set_mac(struct net_device *netdev, void *p)
+{
+	struct sockaddr *addr = p;
+	if (!is_valid_ether_addr((unsigned char *)(addr->sa_data)))
+		return -EADDRNOTAVAIL;
+	memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len);
+
+	return 0;
+}
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0))
+static int kdp_net_change_carrier(struct net_device *dev, bool new_carrier)
+{
+	if (new_carrier)
+		netif_carrier_on(dev);
+	else
+		netif_carrier_off(dev);
+
+	return 0;
+}
+#endif
+
+static const struct net_device_ops kdp_net_netdev_ops = {
+	.ndo_open = kdp_net_open,
+	.ndo_stop = kdp_net_release,
+	.ndo_set_config = kdp_net_config,
+	.ndo_start_xmit = kdp_net_tx,
+	.ndo_change_mtu = kdp_net_change_mtu,
+	.ndo_do_ioctl = kdp_net_ioctl,
+	.ndo_set_rx_mode = kdp_net_set_rx_mode,
+	.ndo_get_stats = kdp_net_stats,
+	.ndo_tx_timeout = kdp_net_tx_timeout,
+	.ndo_set_mac_address = kdp_net_set_mac,
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0))
+	.ndo_change_carrier = kdp_net_change_carrier,
+#endif
+};
+
+/*
+ *  Fill the eth header
+ */
+static int kdp_net_header(struct sk_buff *skb, struct net_device *dev,
+		unsigned short type, const void *daddr,
+		const void *saddr, unsigned int len)
+{
+	struct ethhdr *eth = (struct ethhdr *) skb_push(skb, ETH_HLEN);
+
+	memcpy(eth->h_source, saddr ? saddr : dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest,   daddr ? daddr : dev->dev_addr, dev->addr_len);
+	eth->h_proto = htons(type);
+
+	return dev->hard_header_len;
+}
+
+/*
+ * Re-fill the eth header
+ */
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0))
+static int kdp_net_rebuild_header(struct sk_buff *skb)
+{
+	struct net_device *dev = skb->dev;
+	struct ethhdr *eth = (struct ethhdr *) skb->data;
+
+	memcpy(eth->h_source, dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest, dev->dev_addr, dev->addr_len);
+
+	return 0;
+}
+#endif /* < 4.1.0  */
+
+static const struct header_ops kdp_net_header_ops = {
+	.create  = kdp_net_header,
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0))
+	.rebuild = kdp_net_rebuild_header,
+#endif /* < 4.1.0  */
+	.cache   = NULL,  /* disable caching */
+};
+
+static void kdp_net_setup(struct net_device *dev)
+{
+	struct kdp_dev *kdp;
+
+	ether_setup(dev);
+	dev->netdev_ops = &kdp_net_netdev_ops;
+	dev->header_ops = &kdp_net_header_ops;
+	dev->watchdog_timeo = WD_TIMEOUT;
+
+	kdp = netdev_priv(dev);
+	init_waitqueue_head(&kdp->wq);
+	mutex_init(&kdp->sync_lock);
+
+	dev->flags |= IFF_UP;
+}
+
+/*
+ * RX: normal working mode
+ */
+static void kdp_net_rx_normal(struct kdp_dev *kdp)
+{
+	unsigned ret;
+	uint32_t len;
+	unsigned i, num_rx, num_fq;
+	struct rte_kdp_mbuf *kva;
+	struct rte_kdp_mbuf *va[MBUF_BURST_SZ];
+	void *data_kva;
+	unsigned mbuf_burst_size = MBUF_BURST_SZ;
+
+	struct sk_buff *skb;
+	struct net_device *dev = kdp->net_dev;
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+	if (num_fq == 0) {
+		/* No room on the free_q, bail out */
+		return;
+	}
+
+	/* Calculate the number of entries to dequeue from rx_q */
+	num_rx = min(num_fq, mbuf_burst_size);
+
+	/* Burst dequeue from rx_q */
+	num_rx = kdp_fifo_get(kdp->rx_q, (void **)va, num_rx);
+	if (num_rx == 0)
+		return;
+
+	/* Transfer received packets to netif */
+	for (i = 0; i < num_rx; i++) {
+		kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva;
+		len = kva->data_len;
+		data_kva = kva->buf_addr + kva->data_off - kdp->mbuf_va
+				+ kdp->mbuf_kva;
+
+		skb = dev_alloc_skb(len + 2);
+		if (!skb) {
+			KDP_ERR("Out of mem, dropping pkts\n");
+			/* Update statistics */
+			kdp->stats.rx_dropped++;
+		} else {
+			/* Align IP on 16B boundary */
+			skb_reserve(skb, 2);
+			memcpy(skb_put(skb, len), data_kva, len);
+			skb->dev = dev;
+			skb->protocol = eth_type_trans(skb, dev);
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+			/* Call netif interface */
+			netif_rx(skb);
+
+			/* Update statistics */
+			kdp->stats.rx_bytes += len;
+			kdp->stats.rx_packets++;
+		}
+	}
+
+	/* Burst enqueue mbufs into free_q */
+	ret = kdp_fifo_put(kdp->free_q, (void **)va, num_rx);
+	if (ret != num_rx)
+		/* Failing should not happen */
+		KDP_ERR("Fail to enqueue entries into free_q\n");
+}
+
+/*
+ * RX: loopback with enqueue/dequeue fifos.
+ */
+static void kdp_net_rx_lo_fifo(struct kdp_dev *kdp)
+{
+	unsigned ret;
+	uint32_t len;
+	unsigned i, num, num_rq, num_tq, num_aq, num_fq;
+	struct rte_kdp_mbuf *kva;
+	struct rte_kdp_mbuf *va[MBUF_BURST_SZ];
+	void *data_kva;
+	struct rte_kdp_mbuf *alloc_kva;
+	struct rte_kdp_mbuf *alloc_va[MBUF_BURST_SZ];
+	void *alloc_data_kva;
+	unsigned mbuf_burst_size = MBUF_BURST_SZ;
+
+	/* Get the number of entries in rx_q */
+	num_rq = kdp_fifo_count(kdp->rx_q);
+
+	/* Get the number of free entries in tx_q */
+	num_tq = kdp_fifo_free_count(kdp->tx_q);
+
+	/* Get the number of entries in alloc_q */
+	num_aq = kdp_fifo_count(kdp->alloc_q);
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+
+	/* Calculate the number of entries to be dequeued from rx_q */
+	num = min(num_rq, num_tq);
+	num = min(num, num_aq);
+	num = min(num, num_fq);
+	num = min(num, mbuf_burst_size);
+
+	/* Return if no entry to dequeue from rx_q */
+	if (num == 0)
+		return;
+
+	/* Burst dequeue from rx_q */
+	ret = kdp_fifo_get(kdp->rx_q, (void **)va, num);
+	if (ret == 0)
+		return; /* Failing should not happen */
+
+	/* Dequeue entries from alloc_q */
+	ret = kdp_fifo_get(kdp->alloc_q, (void **)alloc_va, num);
+	if (ret) {
+		num = ret;
+		/* Copy mbufs */
+		for (i = 0; i < num; i++) {
+			kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva;
+			len = kva->pkt_len;
+			data_kva = kva->buf_addr + kva->data_off -
+					kdp->mbuf_va + kdp->mbuf_kva;
+
+			alloc_kva = (void *)alloc_va[i] - kdp->mbuf_va +
+							kdp->mbuf_kva;
+			alloc_data_kva = alloc_kva->buf_addr +
+					alloc_kva->data_off - kdp->mbuf_va +
+							kdp->mbuf_kva;
+			memcpy(alloc_data_kva, data_kva, len);
+			alloc_kva->pkt_len = len;
+			alloc_kva->data_len = len;
+
+			kdp->stats.tx_bytes += len;
+			kdp->stats.rx_bytes += len;
+		}
+
+		/* Burst enqueue mbufs into tx_q */
+		ret = kdp_fifo_put(kdp->tx_q, (void **)alloc_va, num);
+		if (ret != num)
+			/* Failing should not happen */
+			KDP_ERR("Fail to enqueue mbufs into tx_q\n");
+	}
+
+	/* Burst enqueue mbufs into free_q */
+	ret = kdp_fifo_put(kdp->free_q, (void **)va, num);
+	if (ret != num)
+		/* Failing should not happen */
+		KDP_ERR("Fail to enqueue mbufs into free_q\n");
+
+	/**
+	 * Update statistics. Enqueue/dequeue failures are not expected here,
+	 * as all queues were checked up front.
+	 */
+	kdp->stats.tx_packets += num;
+	kdp->stats.rx_packets += num;
+}
+
+/*
+ * RX: loopback with enqueue/dequeue fifos and sk buffer copies.
+ */
+static void kdp_net_rx_lo_fifo_skb(struct kdp_dev *kdp)
+{
+	unsigned ret;
+	uint32_t len;
+	unsigned i, num_rq, num_fq, num;
+	struct rte_kdp_mbuf *kva;
+	struct rte_kdp_mbuf *va[MBUF_BURST_SZ];
+	void *data_kva;
+	struct sk_buff *skb;
+	struct net_device *dev = kdp->net_dev;
+	unsigned mbuf_burst_size = MBUF_BURST_SZ;
+
+	/* Get the number of entries in rx_q */
+	num_rq = kdp_fifo_count(kdp->rx_q);
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+
+	/* Calculate the number of entries to dequeue from rx_q */
+	num = min(num_rq, num_fq);
+	num = min(num, mbuf_burst_size);
+
+	/* Return if no entry to dequeue from rx_q */
+	if (num == 0)
+		return;
+
+	/* Burst dequeue mbufs from rx_q */
+	ret = kdp_fifo_get(kdp->rx_q, (void **)va, num);
+	if (ret == 0)
+		return;
+
+	/* Copy mbufs to sk buffer and then call tx interface */
+	for (i = 0; i < num; i++) {
+		kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva;
+		len = kva->data_len;
+		data_kva = kva->buf_addr + kva->data_off - kdp->mbuf_va +
+				kdp->mbuf_kva;
+
+		skb = dev_alloc_skb(len + 2);
+		if (skb == NULL)
+			KDP_ERR("Out of mem, dropping pkts\n");
+		else {
+			/* Align IP on 16B boundary */
+			skb_reserve(skb, 2);
+			memcpy(skb_put(skb, len), data_kva, len);
+			skb->dev = dev;
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+			dev_kfree_skb(skb);
+		}
+
+		/* Simulate real usage, allocate/copy skb twice */
+		skb = dev_alloc_skb(len + 2);
+		if (skb == NULL) {
+			KDP_ERR("Out of mem, dropping pkts\n");
+			kdp->stats.rx_dropped++;
+		} else {
+			/* Align IP on 16B boundary */
+			skb_reserve(skb, 2);
+			memcpy(skb_put(skb, len), data_kva, len);
+			skb->dev = dev;
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+			kdp->stats.rx_bytes += len;
+			kdp->stats.rx_packets++;
+
+			/* call tx interface */
+			kdp_net_tx(skb, dev);
+		}
+	}
+
+	/* enqueue all the mbufs from rx_q into free_q */
+	ret = kdp_fifo_put(kdp->free_q, (void **)va, num);
+	if (ret != num)
+		/* Failing should not happen */
+		KDP_ERR("Fail to enqueue mbufs into free_q\n");
+}
+
+/* kdp rx function pointer, with default to normal rx */
+static kdp_net_rx_t kdp_net_rx_func = kdp_net_rx_normal;
+
+/* rx interface */
+static void kdp_net_rx(struct kdp_dev *kdp)
+{
+	/**
+	 * It doesn't need to check if it is NULL pointer,
+	 * as it has a default value
+	 */
+	(*kdp_net_rx_func)(kdp);
+}
+
+static int kdp_thread_single(void *data)
+{
+	struct kdp_dev *dev;
+	int j;
+
+	while (!kthread_should_stop()) {
+		down_read(&kdp_list_lock);
+		for (j = 0; j < KDP_RX_LOOP_NUM; j++) {
+			list_for_each_entry(dev, &kdp_list_head, list) {
+				kdp_net_rx(dev);
+			}
+		}
+		up_read(&kdp_list_lock);
+#ifdef RTE_KDP_PREEMPT_DEFAULT
+		/* reschedule out for a while */
+		schedule_timeout_interruptible(
+			usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL));
+#endif
+	}
+
+	return 0;
+}
+
+static int kdp_thread_multiple(void *param)
+{
+	int j;
+	struct kdp_dev *dev = (struct kdp_dev *)param;
+
+	while (!kthread_should_stop()) {
+		for (j = 0; j < KDP_RX_LOOP_NUM; j++)
+			kdp_net_rx(dev);
+
+#ifdef RTE_KDP_PREEMPT_DEFAULT
+		schedule_timeout_interruptible(
+			usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL));
+#endif
+	}
+
+	return 0;
+}
+
+static void kdp_setup(struct kdp_dev *kdp,
+		struct rte_kdp_device_info *info)
+{
+	kdp->port_id = info->port_id;
+	kdp->core_id = info->core_id;
+	strncpy(kdp->name, info->name, RTE_KDP_NAMESIZE);
+
+	/* Translate user space info into kernel space info */
+	kdp->tx_q = phys_to_virt(info->tx_phys);
+	kdp->rx_q = phys_to_virt(info->rx_phys);
+	kdp->alloc_q = phys_to_virt(info->alloc_phys);
+	kdp->free_q = phys_to_virt(info->free_phys);
+
+	kdp->mbuf_kva = phys_to_virt(info->mbuf_phys);
+	kdp->mbuf_va = info->mbuf_va;
+
+	kdp->mbuf_size = info->mbuf_size;
+
+	KDP_PRINT("tx_phys:      0x%016llx, tx_q addr:      0x%p\n",
+		(unsigned long long) info->tx_phys, kdp->tx_q);
+	KDP_PRINT("rx_phys:      0x%016llx, rx_q addr:      0x%p\n",
+		(unsigned long long) info->rx_phys, kdp->rx_q);
+	KDP_PRINT("alloc_phys:   0x%016llx, alloc_q addr:   0x%p\n",
+		(unsigned long long) info->alloc_phys, kdp->alloc_q);
+	KDP_PRINT("free_phys:    0x%016llx, free_q addr:    0x%p\n",
+		(unsigned long long) info->free_phys, kdp->free_q);
+	KDP_PRINT("mbuf_phys:    0x%016llx, mbuf_kva:       0x%p\n",
+		(unsigned long long) info->mbuf_phys, kdp->mbuf_kva);
+	KDP_PRINT("mbuf_va:      0x%p\n", info->mbuf_va);
+	KDP_PRINT("mbuf_size:    %u\n", kdp->mbuf_size);
+}
+
+static int create_kthread(struct kdp_dev *kdp,
+		struct rte_kdp_device_info *info)
+{
+	/**
+	 * Create a new kernel thread for multiple mode, set its core affinity,
+	 * and finally wake it up.
+	 */
+	if (multiple_kthread_on) {
+		kdp->pthread = kthread_create(kdp_thread_multiple,
+				(void *)kdp, "kdp_%s", kdp->name);
+		if (IS_ERR(kdp->pthread))
+			return -ECANCELED;
+
+		if (info->force_bind)
+			kthread_bind(kdp->pthread, kdp->core_id);
+
+		wake_up_process(kdp->pthread);
+
+		return 0;
+	}
+
+	/* single thread */
+	if (kdp_kthread == NULL) {
+		KDP_PRINT("Single kernel thread for all KDP devices\n");
+
+		/* Create kernel thread for RX */
+		kdp_kthread = kthread_run(kdp_thread_single, NULL,
+				"kdp_single");
+		if (IS_ERR(kdp_kthread)) {
+			KDP_ERR("Unable to create kernel thread\n");
+			return PTR_ERR(kdp_kthread);
+		}
+	}
+
+	return 0;
+}
+
+static int kdp_net_newlink(struct net *net, struct net_device *dev,
+		struct nlattr *tb[], struct nlattr *data[])
+{
+	struct rte_kdp_device_info dev_info;
+	struct kdp_dev *kdp;
+	int ret;
+
+	kdp = netdev_priv(dev);
+
+	if (data && data[IFLA_KDP_PORTID])
+		kdp->port_id = nla_get_u8(data[IFLA_KDP_PORTID]);
+	else
+		goto error_free;
+
+	if (data && data[IFLA_KDP_DEVINFO])
+		memcpy(&dev_info, nla_data(data[IFLA_KDP_DEVINFO]),
+				sizeof(struct rte_kdp_device_info));
+	else
+		goto error_free;
+
+	/**
+	 * Check if the cpu core id is valid for binding,
+	 * for multiple kernel thread mode.
+	 */
+	if (multiple_kthread_on && dev_info.force_bind &&
+			!cpu_online(dev_info.core_id)) {
+		KDP_ERR("cpu %u is not online\n", dev_info.core_id);
+		goto error_free;
+	}
+
+	kdp->net_dev = dev;
+	kdp_setup(kdp, &dev_info);
+
+	ret = register_netdevice(dev);
+	if (ret < 0)
+		goto error_free;
+
+	ret = create_kthread(kdp, &dev_info);
+	if (ret < 0)
+		goto error_unregister;
+
+	down_write(&kdp_list_lock);
+	list_add(&kdp->list, &kdp_list_head);
+	up_write(&kdp_list_lock);
+
+	return 0;
+
+error_unregister:
+	unregister_netdevice(dev);
+error_free:
+	free_netdev(dev);
+	return -EINVAL;
+}
+
+static void single_kthread_stop(void)
+{
+	/* Stop kernel thread for single mode */
+	if (multiple_kthread_on == 0 && kdp_kthread != NULL) {
+		kthread_stop(kdp_kthread);
+		kdp_kthread = NULL;
+	}
+}
+
+static void multiple_kthread_stop(struct kdp_dev *kdp)
+{
+	/* Stop kernel thread for multiple mode */
+	if (multiple_kthread_on && kdp->pthread != NULL) {
+		kthread_stop(kdp->pthread);
+		kdp->pthread = NULL;
+	}
+}
+
+static void kdp_net_dellink(struct net_device *dev, struct list_head *head)
+{
+	struct kdp_dev *kdp;
+
+	kdp = netdev_priv(dev);
+
+	down_write(&kdp_list_lock);
+	list_del(&kdp->list);
+	up_write(&kdp_list_lock);
+
+	multiple_kthread_stop(kdp);
+
+	down_write(&kdp_list_lock);
+	if (list_empty(&kdp_list_head))
+		single_kthread_stop();
+	up_write(&kdp_list_lock);
+
+	unregister_netdevice_queue(dev, head);
+}
+
+static struct rtnl_link_ops kdp_link_ops __read_mostly = {
+	.kind = KDP_DEVICE,
+	.priv_size = sizeof(struct kdp_dev),
+	.setup = kdp_net_setup,
+	.maxtype = IFLA_KDP_MAX,
+	.newlink = kdp_net_newlink,
+	.dellink = kdp_net_dellink,
+};
+
+static int __init
+kdp_parse_kthread_mode(void)
+{
+	if (!kthread_mode)
+		return 0;
+
+	if (strcmp(kthread_mode, "single") == 0)
+		return 0;
+	else if (strcmp(kthread_mode, "multiple") == 0)
+		multiple_kthread_on = 1;
+	else
+		return -1;
+
+	return 0;
+}
+
+static void kdp_net_config_lo_mode(char *lo_str)
+{
+	if (!lo_str) {
+		KDP_PRINT("loopback disabled\n");
+		return;
+	}
+
+	if (!strcmp(lo_str, "lo_mode_none"))
+		KDP_PRINT("loopback disabled\n");
+	else if (!strcmp(lo_str, "lo_mode_fifo")) {
+		KDP_PRINT("loopback mode=lo_mode_fifo enabled\n");
+		kdp_net_rx_func = kdp_net_rx_lo_fifo;
+	} else if (!strcmp(lo_str, "lo_mode_fifo_skb")) {
+		KDP_PRINT("loopback mode=lo_mode_fifo_skb enabled\n");
+		kdp_net_rx_func = kdp_net_rx_lo_fifo_skb;
+	} else
+		KDP_PRINT("Unrecognized parameter, loopback disabled\n");
+}
+
+static int __init kdp_init(void)
+{
+	if (kdp_parse_kthread_mode() < 0) {
+		KDP_ERR("Invalid parameter for kthread_mode\n");
+		return -EINVAL;
+	}
+
+	/* Configure the lo mode according to the input parameter */
+	kdp_net_config_lo_mode(lo_mode);
+
+	init_rwsem(&kdp_list_lock);
+	INIT_LIST_HEAD(&kdp_list_head);
+
+	return rtnl_link_register(&kdp_link_ops);
+}
+module_init(kdp_init);
+
+static void kdp_release(void)
+{
+	struct kdp_dev *kdp, *n;
+
+	single_kthread_stop();
+
+	down_write(&kdp_list_lock);
+	list_for_each_entry_safe(kdp, n, &kdp_list_head, list) {
+		multiple_kthread_stop(kdp);
+		list_del(&kdp->list);
+	}
+	up_write(&kdp_list_lock);
+}
+
+static void __exit kdp_exit(void)
+{
+	kdp_release();
+	rtnl_link_unregister(&kdp_link_ops);
+}
+module_exit(kdp_exit);
+
+module_param(lo_mode, charp, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(lo_mode,
+"KDP loopback mode (default=lo_mode_none):\n"
+"    lo_mode_none        Kernel loopback disabled\n"
+"    lo_mode_fifo        Enable kernel loopback with fifo\n"
+"    lo_mode_fifo_skb    Enable kernel loopback with fifo and skb buffer\n"
+"\n"
+);
+
+module_param(kthread_mode, charp, S_IRUGO);
+MODULE_PARM_DESC(kthread_mode,
+"Kernel thread mode (default=single):\n"
+"    single    Single kernel thread mode enabled.\n"
+"    multiple  Multiple kernel thread mode enabled.\n"
+"\n"
+);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Kernel Module for managing kdp devices");
-- 
2.5.0

* [PATCH v2 2/2] kdp: add virtual PMD for kernel slow data path communication
  2016-02-19  5:05 ` [PATCH v2 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2016-02-19  5:05   ` [PATCH v2 1/2] kdp: add kernel data path kernel module Ferruh Yigit
@ 2016-02-19  5:05   ` Ferruh Yigit
  2016-03-09 11:17   ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-02-19  5:05 UTC (permalink / raw)
  To: dev

This patch provides slow data path communication to the Linux kernel.
It is based on librte_kni and heavily reuses it.

The main difference is that the librte_kni library is converted into a
PMD, which makes it easier for applications to use.

Any application can now use slow path communication without being
modified, thanks to the existing EAL support for virtual PMDs.

This PMD also supports two methods of sending packets to Linux. The first
is a custom FIFO implementation backed by the KDP kernel module; the
second is the in-kernel Linux tun/tap support. The PMD first checks for
the KDP kernel module and, if that fails, tries to create and use a tap
interface.

With the FIFO method, the PMD's rx_pkt_burst() gets packets from the FIFO
and tx_pkt_burst() puts packets into the FIFO. The corresponding Linux
virtual network device driver code also gets packets from and puts
packets into the FIFO, as if they were coming from hardware.
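
To make the FIFO idea concrete, here is a minimal sketch of a
single-producer, single-consumer ring of pointers with burst put/get.
It is only an illustration: the names (sketch_fifo, sketch_fifo_put,
sketch_fifo_get) and the fixed length are hypothetical and are not taken
from rte_kdp_fifo.h, and a FIFO really shared between userspace and the
kernel would additionally need memory barriers, which are left out here.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FIFO_LEN 8	/* hypothetical size, must be a power of two */

struct sketch_fifo {
	unsigned write;		/* next slot the producer fills */
	unsigned read;		/* next slot the consumer drains */
	void *buffer[SKETCH_FIFO_LEN];
};

static void
sketch_fifo_init(struct sketch_fifo *f)
{
	/* Index wrapping below relies on a power-of-two length */
	assert((SKETCH_FIFO_LEN & (SKETCH_FIFO_LEN - 1)) == 0);
	f->write = 0;
	f->read = 0;
}

/* Enqueue up to num pointers; returns how many were actually stored */
static unsigned
sketch_fifo_put(struct sketch_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned w = f->write;
	unsigned r = f->read;

	for (i = 0; i < num; i++) {
		unsigned next = (w + 1) & (SKETCH_FIFO_LEN - 1);

		/* Full: one slot stays empty to tell full from empty */
		if (next == r)
			break;
		f->buffer[w] = data[i];
		w = next;
	}
	f->write = w;

	return i;
}

/* Dequeue up to num pointers; returns how many were actually fetched */
static unsigned
sketch_fifo_get(struct sketch_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned r = f->read;
	unsigned w = f->write;

	for (i = 0; i < num; i++) {
		/* Empty: consumer has caught up with the producer */
		if (r == w)
			break;
		data[i] = f->buffer[r];
		r = (r + 1) & (SKETCH_FIFO_LEN - 1);
	}
	f->read = r;

	return i;
}
```

In this sketch a burst call simply returns a short count when the ring is
full or empty, so the caller decides whether to retry or drop.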

With the tun/tap method, no external kernel module is required: the PMD
reads packets from and writes packets to the tap interface file
descriptor. The tap interface carries a performance penalty compared
with the FIFO implementation.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---

v2:
* Use rtnetlink to create interfaces
---
 MAINTAINERS                             |   1 +
 config/common_linuxapp                  |   1 +
 doc/guides/nics/pcap_ring.rst           | 125 ++++++-
 doc/guides/rel_notes/release_16_04.rst  |   6 +
 drivers/net/Makefile                    |   3 +-
 drivers/net/kdp/Makefile                |  61 +++
 drivers/net/kdp/rte_eth_kdp.c           | 501 +++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.c               | 633 ++++++++++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.h               | 116 ++++++
 drivers/net/kdp/rte_kdp_fifo.h          |  91 +++++
 drivers/net/kdp/rte_kdp_tap.c           | 101 +++++
 drivers/net/kdp/rte_pmd_kdp_version.map |   4 +
 lib/librte_eal/common/include/rte_log.h |   3 +-
 mk/rte.app.mk                           |   3 +-
 14 files changed, 1643 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/kdp/Makefile
 create mode 100644 drivers/net/kdp/rte_eth_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.h
 create mode 100644 drivers/net/kdp/rte_kdp_fifo.h
 create mode 100644 drivers/net/kdp/rte_kdp_tap.c
 create mode 100644 drivers/net/kdp/rte_pmd_kdp_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 05ffe26..deaeea3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -260,6 +260,7 @@ F: doc/guides/sample_app_ug/kernel_nic_interface.rst
 Linux KDP
 M: Ferruh Yigit <ferruh.yigit@gmail.com>
 F: lib/librte_eal/linuxapp/kdp/
+F: drivers/net/kdp/
 
 Linux AF_PACKET
 M: John W. Linville <linville@tuxdriver.com>
diff --git a/config/common_linuxapp b/config/common_linuxapp
index e1b5032..aa13719 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -316,6 +316,7 @@ CONFIG_RTE_LIBRTE_PMD_NULL=y
 #
 # Compile KDP PMD
 #
+CONFIG_RTE_LIBRTE_PMD_KDP=y
 CONFIG_RTE_KDP_KMOD=y
 CONFIG_RTE_KDP_PREEMPT_DEFAULT=y
 
diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst
index aa48d33..b602e65 100644
--- a/doc/guides/nics/pcap_ring.rst
+++ b/doc/guides/nics/pcap_ring.rst
@@ -28,11 +28,11 @@
     (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-Libpcap and Ring Based Poll Mode Drivers
-========================================
+Software Poll Mode Drivers
+==========================
 
 In addition to Poll Mode Drivers (PMDs) for physical and virtual hardware,
-the DPDK also includes two pure-software PMDs. These two drivers are:
+the DPDK also includes pure-software PMDs. These drivers are:
 
 *   A libpcap -based PMD (librte_pmd_pcap) that reads and writes packets using libpcap,
     - both from files on disk, as well as from physical NIC devices using standard Linux kernel drivers.
@@ -40,6 +40,10 @@ the DPDK also includes two pure-software PMDs. These two drivers are:
 *   A ring-based PMD (librte_pmd_ring) that allows a set of software FIFOs (that is, rte_ring)
     to be accessed using the PMD APIs, as though they were physical NICs.
 
+*   A slow data path PMD (librte_pmd_kdp) that sends packets to and receives packets from the OS network
+    stack as though it were a physical NIC.
+
+
 .. note::
 
     The libpcap -based PMD is disabled by default in the build configuration files,
@@ -211,6 +215,121 @@ Multiple devices may be specified, separated by commas.
     Done.
 
 
+Kernel Data Path PMD
+~~~~~~~~~~~~~~~~~~~~
+
+The Kernel Data Path (KDP) PMD lets an application communicate easily with the OS network stack.
+
+.. code-block:: console
+
+        ./testpmd --vdev eth_kdp0 --vdev eth_kdp1 -- -i
+        ...
+        Configuring Port 0 (socket 0)
+        Port 0: 00:00:00:00:00:00
+        Configuring Port 1 (socket 0)
+        Port 1: 00:00:00:00:00:00
+        Checking link statuses...
+        Port 0 Link Up - speed 10000 Mbps - full-duplex
+        Port 1 Link Up - speed 10000 Mbps - full-duplex
+        Done
+
+The KDP PMD supports two types of communication:
+
+* Custom FIFO implementation
+* tun/tap implementation
+
+The custom FIFO implementation performs better but requires the KDP kernel module (rte_kdp.ko) to be inserted.
+
+FIFO communication takes priority by default; if the KDP kernel module is not inserted, tun/tap communication is used.
+
+If the KDP kernel module is inserted, the testpmd command above creates the following virtual interfaces, which can be used like any other interface.
+
+.. code-block:: console
+
+        # ifconfig kdp0; ifconfig kdp1
+        kdp0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+                ether 00:00:00:00:00:00  txqueuelen 1000  (Ethernet)
+                RX packets 0  bytes 0 (0.0 B)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 0  bytes 0 (0.0 B)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+        kdp1: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+                ether 00:00:00:00:00:00  txqueuelen 1000  (Ethernet)
+                RX packets 0  bytes 0 (0.0 B)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 0  bytes 0 (0.0 B)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+
+With the tun/tap communication method, the following interfaces are created:
+
+.. code-block:: console
+
+        # ifconfig tap_kdp0; ifconfig tap_kdp1
+        tap_kdp0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
+                inet6 fe80::341f:afff:feb7:23db  prefixlen 64  scopeid 0x20<link>
+                ether 36:1f:af:b7:23:db  txqueuelen 500  (Ethernet)
+                RX packets 126624864  bytes 6184828655 (5.7 GiB)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 126236898  bytes 6150306636 (5.7 GiB)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+        tap_kdp1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
+                inet6 fe80::f030:b4ff:fe94:b720  prefixlen 64  scopeid 0x20<link>
+                ether f2:30:b4:94:b7:20  txqueuelen 500  (Ethernet)
+                RX packets 126237370  bytes 6150329717 (5.7 GiB)
+                RX errors 0  dropped 9  overruns 0  frame 0
+                TX packets 126624896  bytes 6184826874 (5.7 GiB)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+A DPDK application can be used to forward packets between these interfaces:
+
+.. code-block:: console
+
+        In Linux:
+        ip l add br0 type bridge
+        ip l set tap_kdp0 master br0
+        ip l set tap_kdp1 master br0
+        ip l set br0 up
+        ip l set tap_kdp0 up
+        ip l set tap_kdp1 up
+
+
+        In testpmd:
+        testpmd> start
+          io packet forwarding - CRC stripping disabled - packets/burst=32
+          nb forwarding cores=1 - nb forwarding ports=2
+          RX queues=1 - RX desc=128 - RX free threshold=0
+          RX threshold registers: pthresh=0 hthresh=0 wthresh=0
+          TX queues=1 - TX desc=512 - TX free threshold=0
+          TX threshold registers: pthresh=0 hthresh=0 wthresh=0
+          TX RS bit threshold=0 - TXQ flags=0x0
+        testpmd> stop
+        Telling cores to stop...
+        Waiting for lcores to finish...
+
+          ---------------------- Forward statistics for port 0  ----------------------
+          RX-packets: 973900         RX-dropped: 0             RX-total: 973900
+          TX-packets: 973903         TX-dropped: 0             TX-total: 973903
+          ----------------------------------------------------------------------------
+
+          ---------------------- Forward statistics for port 1  ----------------------
+          RX-packets: 973903         RX-dropped: 0             RX-total: 973903
+          TX-packets: 973900         TX-dropped: 0             TX-total: 973900
+          ----------------------------------------------------------------------------
+
+          +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
+          RX-packets: 1947803        RX-dropped: 0             RX-total: 1947803
+          TX-packets: 1947803        TX-dropped: 0             TX-total: 1947803
+          ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+        Done.
+
+
+
+
+
 Using the Poll Mode Driver from an Application
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index eb1b3b2..d17778c 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -44,6 +44,12 @@ This section should contain new features added in this release. Sample format:
   Add the offload and negotiation of checksum and TSO between vhost-user and
   vanilla Linux virtio guest.
 
+* **Added Slow Data Path support.**
+
+  * This is based on the KNI work and is intended to replace it in the long term.
+  * Added Kernel Data Path (KDP) kernel module.
+  * Added KDP virtual PMD.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 6e4497e..0be06f5 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -51,6 +51,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += szedata2
 DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio
 DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += kdp
 
 include $(RTE_SDK)/mk/rte.sharelib.mk
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/kdp/Makefile b/drivers/net/kdp/Makefile
new file mode 100644
index 0000000..035056e
--- /dev/null
+++ b/drivers/net/kdp/Makefile
@@ -0,0 +1,61 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_kdp.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_kdp_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_eth_kdp.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_kdp.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_kdp_tap.c
+
+#
+# Export include files
+#
+SYMLINK-y-include +=
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/kdp/rte_eth_kdp.c b/drivers/net/kdp/rte_eth_kdp.c
new file mode 100644
index 0000000..68dd734
--- /dev/null
+++ b/drivers/net/kdp/rte_eth_kdp.c
@@ -0,0 +1,501 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ethdev.h>
+
+#include "rte_kdp.h"
+
+#define MAX_PACKET_SZ 2048
+
+struct pmd_queue_stats {
+	uint64_t pkts;
+	uint64_t bytes;
+	uint64_t err_pkts;
+};
+
+struct pmd_queue {
+	struct pmd_internals *internals;
+	struct rte_mempool *mb_pool;
+
+	struct pmd_queue_stats rx;
+	struct pmd_queue_stats tx;
+};
+
+struct pmd_internals {
+	struct kdp_data *kdp;
+	struct kdp_tap_data *kdp_tap;
+
+	struct pmd_queue rx_queues[RTE_MAX_QUEUES_PER_PORT];
+	struct pmd_queue tx_queues[RTE_MAX_QUEUES_PER_PORT];
+};
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+static const char *drivername = "KDP PMD";
+static struct rte_eth_link pmd_link = {
+		.link_speed = 10000,
+		.link_duplex = ETH_LINK_FULL_DUPLEX,
+		.link_status = 0
+};
+
+static uint16_t
+eth_kdp_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct kdp_data *kdp = kdp_q->internals->kdp;
+	uint16_t nb_pkts;
+
+	nb_pkts = kdp_rx_burst(kdp, bufs, nb_bufs);
+
+	kdp_q->rx.pkts += nb_pkts;
+	kdp_q->rx.err_pkts += nb_bufs - nb_pkts;
+
+	return nb_pkts;
+}
+
+static uint16_t
+eth_kdp_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct kdp_data *kdp = kdp_q->internals->kdp;
+	uint16_t nb_pkts;
+
+	nb_pkts =  kdp_tx_burst(kdp, bufs, nb_bufs);
+
+	kdp_q->tx.pkts += nb_pkts;
+	kdp_q->tx.err_pkts += nb_bufs - nb_pkts;
+
+	return nb_pkts;
+}
+
+static uint16_t
+eth_kdp_tap_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	struct kdp_tap_data *kdp_tap = internals->kdp_tap;
+	struct rte_mbuf *m;
+	int ret;
+	unsigned i;
+
+	for (i = 0; i < nb_bufs; i++) {
+		m = rte_pktmbuf_alloc(kdp_q->mb_pool);
+		bufs[i] = m;
+		ret = read(kdp_tap->tap_fd, rte_pktmbuf_mtod(m, void *),
+				MAX_PACKET_SZ);
+		if (ret < 0) {
+			rte_pktmbuf_free(m);
+			break;
+		}
+
+		m->nb_segs = 1;
+		m->next = NULL;
+		m->pkt_len = (uint16_t)ret;
+		m->data_len = (uint16_t)ret;
+	}
+
+	kdp_q->rx.pkts += i;
+	kdp_q->rx.err_pkts += nb_bufs - i;
+
+	return i;
+}
+
+static uint16_t
+eth_kdp_tap_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	struct kdp_tap_data *kdp_tap = internals->kdp_tap;
+	struct rte_mbuf *m;
+	unsigned i;
+
+	for (i = 0; i < nb_bufs; i++) {
+		m = bufs[i];
+		write(kdp_tap->tap_fd, rte_pktmbuf_mtod(m, void*),
+				rte_pktmbuf_data_len(m));
+		rte_pktmbuf_free(m);
+	}
+
+	kdp_q->tx.pkts += i;
+	kdp_q->tx.err_pkts += nb_bufs - i;
+
+	return i;
+}
+
+static int
+eth_kdp_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct kdp_conf conf;
+	uint16_t port_id = dev->data->port_id;
+	int ret = 0;
+
+	snprintf(conf.name, RTE_KDP_NAMESIZE, KDP_DEVICE "%u",
+			port_id);
+	conf.force_bind = 0;
+	conf.port_id = port_id;
+	conf.mbuf_size = MAX_PACKET_SZ;
+
+	ret = kdp_start(internals->kdp,
+			internals->rx_queues[0].mb_pool,
+			&conf);
+	if (ret)
+		RTE_LOG(ERR, KDP, "Fail to create kdp for port: %d\n",
+				port_id);
+
+	return ret;
+}
+
+static int
+eth_kdp_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	int ret;
+
+	if (internals->kdp) {
+		ret = eth_kdp_start(dev);
+		if (ret)
+			return -1;
+	}
+
+	dev->data->dev_link.link_status = 1;
+	return 0;
+}
+
+static void
+eth_kdp_dev_stop(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+
+	if (internals->kdp)
+		kdp_stop(internals->kdp);
+
+	dev->data->dev_link.link_status = 0;
+}
+
+static void
+eth_kdp_dev_close(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct kdp_data *kdp = internals->kdp;
+	struct kdp_tap_data *kdp_tap = internals->kdp_tap;
+
+	if (kdp) {
+		kdp_close(kdp);
+
+		rte_free(kdp);
+		internals->kdp = NULL;
+	}
+
+	if (kdp_tap) {
+		kdp_tap_close(kdp_tap);
+
+		rte_free(kdp_tap);
+		internals->kdp_tap = NULL;
+	}
+
+	rte_free(dev->data->dev_private);
+	dev->data->dev_private = NULL;
+}
+
+static int
+eth_kdp_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	return 0;
+}
+
+static void
+eth_kdp_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+	struct rte_eth_dev_data *data = dev->data;
+
+	dev_info->driver_name = data->drv_name;
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)-1;
+	dev_info->max_rx_queues = data->nb_rx_queues;
+	dev_info->max_tx_queues = data->nb_tx_queues;
+	dev_info->min_rx_bufsize = 0;
+	dev_info->pci_dev = NULL;
+}
+
+static int
+eth_kdp_rx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t rx_queue_id __rte_unused,
+		uint16_t nb_rx_desc __rte_unused,
+		unsigned int socket_id __rte_unused,
+		const struct rte_eth_rxconf *rx_conf __rte_unused,
+		struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct pmd_queue *q;
+
+	q = &internals->rx_queues[rx_queue_id];
+	q->internals = internals;
+	q->mb_pool = mb_pool;
+
+	dev->data->rx_queues[rx_queue_id] = q;
+
+	return 0;
+}
+
+static int
+eth_kdp_tx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t tx_queue_id,
+		uint16_t nb_tx_desc __rte_unused,
+		unsigned int socket_id __rte_unused,
+		const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct pmd_queue *q;
+
+	q = &internals->tx_queues[tx_queue_id];
+	q->internals = internals;
+
+	dev->data->tx_queues[tx_queue_id] = q;
+
+	return 0;
+}
+
+static void
+eth_kdp_queue_release(void *q __rte_unused)
+{
+}
+
+static int
+eth_kdp_link_update(struct rte_eth_dev *dev __rte_unused,
+		int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static void
+eth_kdp_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	unsigned i, num_stats;
+	unsigned long rx_packets_total = 0, rx_bytes_total = 0;
+	unsigned long tx_packets_total = 0, tx_bytes_total = 0;
+	unsigned long tx_packets_err_total = 0;
+	struct rte_eth_dev_data *data = dev->data;
+	struct pmd_queue *q;
+
+	num_stats = RTE_MIN((unsigned)RTE_ETHDEV_QUEUE_STAT_CNTRS,
+			data->nb_rx_queues);
+	for (i = 0; i < num_stats; i++) {
+		q = data->rx_queues[i];
+		stats->q_ipackets[i] = q->rx.pkts;
+		stats->q_ibytes[i] = q->rx.bytes;
+		rx_packets_total += stats->q_ipackets[i];
+		rx_bytes_total += stats->q_ibytes[i];
+	}
+
+	num_stats = RTE_MIN((unsigned)RTE_ETHDEV_QUEUE_STAT_CNTRS,
+			data->nb_tx_queues);
+	for (i = 0; i < num_stats; i++) {
+		q = data->tx_queues[i];
+		stats->q_opackets[i] = q->tx.pkts;
+		stats->q_obytes[i] = q->tx.bytes;
+		stats->q_errors[i] = q->tx.err_pkts;
+		tx_packets_total += stats->q_opackets[i];
+		tx_bytes_total += stats->q_obytes[i];
+		tx_packets_err_total += stats->q_errors[i];
+	}
+
+	stats->ipackets = rx_packets_total;
+	stats->ibytes = rx_bytes_total;
+	stats->opackets = tx_packets_total;
+	stats->obytes = tx_bytes_total;
+	stats->oerrors = tx_packets_err_total;
+}
+
+static void
+eth_kdp_stats_reset(struct rte_eth_dev *dev)
+{
+	unsigned i;
+	struct rte_eth_dev_data *data = dev->data;
+	struct pmd_queue *q;
+
+	for (i = 0; i < data->nb_rx_queues; i++) {
+		q = data->rx_queues[i];
+		q->rx.pkts = 0;
+		q->rx.bytes = 0;
+	}
+	for (i = 0; i < data->nb_tx_queues; i++) {
+		q = data->tx_queues[i];
+		q->tx.pkts = 0;
+		q->tx.bytes = 0;
+		q->tx.err_pkts = 0;
+	}
+}
+
+static const struct eth_dev_ops eth_kdp_ops = {
+	.dev_start = eth_kdp_dev_start,
+	.dev_stop = eth_kdp_dev_stop,
+	.dev_close = eth_kdp_dev_close,
+	.dev_configure = eth_kdp_dev_configure,
+	.dev_infos_get = eth_kdp_dev_info,
+	.rx_queue_setup = eth_kdp_rx_queue_setup,
+	.tx_queue_setup = eth_kdp_tx_queue_setup,
+	.rx_queue_release = eth_kdp_queue_release,
+	.tx_queue_release = eth_kdp_queue_release,
+	.link_update = eth_kdp_link_update,
+	.stats_get = eth_kdp_stats_get,
+	.stats_reset = eth_kdp_stats_reset,
+};
+
+static struct rte_eth_dev *
+eth_kdp_create(const char *name, unsigned numa_node)
+{
+	uint16_t nb_rx_queues = 1;
+	uint16_t nb_tx_queues = 1;
+	struct rte_eth_dev_data *data = NULL;
+	struct pmd_internals *internals = NULL;
+	struct rte_eth_dev *eth_dev = NULL;
+
+	RTE_LOG(INFO, PMD, "Creating kdp ethdev on numa socket %u\n",
+			numa_node);
+
+	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
+	if (data == NULL)
+		goto error;
+
+	internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
+	if (internals == NULL)
+		goto error;
+
+	/* reserve an ethdev entry */
+	eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
+	if (eth_dev == NULL)
+		goto error;
+
+	data->dev_private = internals;
+	data->port_id = eth_dev->data->port_id;
+	memmove(data->name, eth_dev->data->name, sizeof(data->name));
+	data->nb_rx_queues = nb_rx_queues;
+	data->nb_tx_queues = nb_tx_queues;
+	data->dev_link = pmd_link;
+	data->mac_addrs = &eth_addr;
+
+	eth_dev->data = data;
+	eth_dev->dev_ops = &eth_kdp_ops;
+	eth_dev->driver = NULL;
+
+	data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+	data->kdrv = RTE_KDRV_NONE;
+	data->drv_name = drivername;
+	data->numa_node = numa_node;
+
+	return eth_dev;
+
+error:
+	rte_free(data);
+	rte_free(internals);
+
+	return NULL;
+}
+
+static int
+eth_kdp_devinit(const char *name, const char *params __rte_unused)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+	struct pmd_internals *internals;
+	struct kdp_data *kdp;
+	struct kdp_tap_data *kdp_tap = NULL;
+	uint16_t port_id;
+
+	RTE_LOG(INFO, PMD, "Initializing eth_kdp for %s\n", name);
+
+	eth_dev = eth_kdp_create(name, rte_socket_id());
+	if (eth_dev == NULL)
+		return -1;
+
+	internals = eth_dev->data->dev_private;
+	port_id = eth_dev->data->port_id;
+
+	kdp = kdp_init(port_id);
+	if (kdp == NULL)
+		kdp_tap = kdp_tap_init(port_id);
+
+	if (kdp == NULL && kdp_tap == NULL) {
+		rte_eth_dev_release_port(eth_dev);
+		rte_free(internals);
+
+		/* Do not return an error, to prevent a panic in rte_eal_init() */
+		return 0;
+	}
+
+	internals->kdp = kdp;
+	internals->kdp_tap = kdp_tap;
+
+	if (kdp == NULL) {
+		eth_dev->rx_pkt_burst = eth_kdp_tap_rx;
+		eth_dev->tx_pkt_burst = eth_kdp_tap_tx;
+	} else {
+		eth_dev->rx_pkt_burst = eth_kdp_rx;
+		eth_dev->tx_pkt_burst = eth_kdp_tx;
+	}
+
+	return 0;
+}
+
+static int
+eth_kdp_devuninit(const char *name)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+
+	RTE_LOG(INFO, PMD, "Un-Initializing eth_kdp for %s\n", name);
+
+	/* find the ethdev entry */
+	eth_dev = rte_eth_dev_allocated(name);
+	if (eth_dev == NULL)
+		return -1;
+
+	eth_kdp_dev_stop(eth_dev);
+
+	if (eth_dev->data)
+		rte_free(eth_dev->data->dev_private);
+	rte_free(eth_dev->data);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	kdp_uninit();
+
+	return 0;
+}
+
+static struct rte_driver eth_kdp_drv = {
+	.name = "eth_kdp",
+	.type = PMD_VDEV,
+	.init = eth_kdp_devinit,
+	.uninit = eth_kdp_devuninit,
+};
+
+PMD_REGISTER_DRIVER(eth_kdp_drv);
diff --git a/drivers/net/kdp/rte_kdp.c b/drivers/net/kdp/rte_kdp.c
new file mode 100644
index 0000000..ed50a0f
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp.c
@@ -0,0 +1,633 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_EXEC_ENV_LINUXAPP
+#error "KDP is not supported"
+#endif
+
+#include <sys/socket.h>
+#include <linux/netlink.h>
+#include <linux/rtnetlink.h>
+
+#include <rte_spinlock.h>
+#include <rte_ethdev.h>
+#include <rte_memzone.h>
+
+#include "rte_kdp.h"
+#include "rte_kdp_fifo.h"
+
+#define KDP_MODULE_NAME "rte_kdp"
+#define MAX_MBUF_BURST_NUM     32
+
+/* Maximum number of ring entries */
+#define KDP_FIFO_COUNT_MAX     1024
+#define KDP_FIFO_SIZE          (KDP_FIFO_COUNT_MAX * sizeof(void *) + \
+					sizeof(struct rte_kdp_fifo))
+
+#define BUFSZ 1024
+struct kdp_request {
+	struct nlmsghdr nlmsg;
+	char buf[BUFSZ];
+};
+
+static int kdp_fd = -1;
+static int kdp_ref_count;
+
+static const struct rte_memzone *
+kdp_memzone_reserve(const char *name, size_t len, int socket_id,
+		unsigned flags)
+{
+	const struct rte_memzone *mz = rte_memzone_lookup(name);
+
+	if (mz == NULL)
+		mz = rte_memzone_reserve(name, len, socket_id, flags);
+
+	return mz;
+}
+
+static int
+kdp_slot_init(struct kdp_memzone_slot *slot)
+{
+#define OBJNAMSIZ 32
+	char obj_name[OBJNAMSIZ];
+	const struct rte_memzone *mz;
+
+	/* TX RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_tx_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_tx_q = mz;
+
+	/* RX RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_rx_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_rx_q = mz;
+
+	/* ALLOC RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_alloc_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_alloc_q = mz;
+
+	/* FREE RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_free_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_free_q = mz;
+
+	return 0;
+
+kdp_fail:
+	return -1;
+}
+
+static void
+kdp_ring_init(struct kdp_data *kdp)
+{
+	struct kdp_memzone_slot *slot = kdp->slot;
+	const struct rte_memzone *mz;
+
+	/* TX RING */
+	mz = slot->m_tx_q;
+	kdp->tx_q = mz->addr;
+	kdp_fifo_init(kdp->tx_q, KDP_FIFO_COUNT_MAX);
+
+	/* RX RING */
+	mz = slot->m_rx_q;
+	kdp->rx_q = mz->addr;
+	kdp_fifo_init(kdp->rx_q, KDP_FIFO_COUNT_MAX);
+
+	/* ALLOC RING */
+	mz = slot->m_alloc_q;
+	kdp->alloc_q = mz->addr;
+	kdp_fifo_init(kdp->alloc_q, KDP_FIFO_COUNT_MAX);
+
+	/* FREE RING */
+	mz = slot->m_free_q;
+	kdp->free_q = mz->addr;
+	kdp_fifo_init(kdp->free_q, KDP_FIFO_COUNT_MAX);
+}
+
+static int
+kdp_module_check(void)
+{
+	int fd;
+
+	fd = open("/sys/module/" KDP_MODULE_NAME "/initstate", O_RDONLY);
+	if (fd < 0)
+		return -1;
+	close(fd);
+
+	return 0;
+}
+
+static int
+rtnl_socket_open(void)
+{
+	struct sockaddr_nl src;
+	int ret;
+
+	/* Check FD and open */
+	if (kdp_fd < 0) {
+		kdp_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
+		if (kdp_fd < 0) {
+			RTE_LOG(ERR, KDP, "socket for create failed.\n");
+			return -1;
+		}
+
+		memset(&src, 0, sizeof(struct sockaddr_nl));
+
+		src.nl_family = AF_NETLINK;
+		src.nl_pid = getpid();
+
+		ret = bind(kdp_fd, (struct sockaddr *)&src,
+				sizeof(struct sockaddr_nl));
+		if (ret < 0) {
+			RTE_LOG(ERR, KDP, "Bind for create failed.\n");
+			close(kdp_fd);
+			kdp_fd = -1;
+			return -1;
+		}
+	}
+
+	kdp_ref_count++;
+
+	return 0;
+}
+
+static void
+kdp_ref_put(void)
+{
+	/* not initialized? */
+	if (!kdp_ref_count)
+		return;
+
+	kdp_ref_count--;
+
+	/* not last one? */
+	if (kdp_ref_count)
+		return;
+
+	if (kdp_fd < 0)
+		return;
+
+	close(kdp_fd);
+	kdp_fd = -1;
+}
+
+struct kdp_data *
+kdp_init(uint16_t port_id)
+{
+	struct kdp_memzone_slot *slot = NULL;
+	struct kdp_data *kdp = NULL;
+	int ret;
+
+	ret = kdp_module_check();
+	if (ret)
+		return NULL;
+
+	ret = rtnl_socket_open();
+	if (ret)
+		return NULL;
+
+	slot = rte_malloc(NULL, sizeof(struct kdp_memzone_slot), 0);
+	if (slot == NULL)
+		goto kdp_fail;
+	slot->id = port_id;
+
+	kdp = rte_malloc(NULL, sizeof(struct kdp_data), 0);
+	if (kdp == NULL)
+		goto kdp_fail;
+	kdp->slot = slot;
+
+	ret = kdp_slot_init(slot);
+	if (ret < 0)
+		goto kdp_fail;
+
+	kdp_ring_init(kdp);
+
+	return kdp;
+
+kdp_fail:
+	kdp_ref_put();
+	rte_free(slot);
+	rte_free(kdp);
+	RTE_LOG(ERR, KDP, "Unable to allocate memory\n");
+	return NULL;
+}
+
+static void
+kdp_mbufs_allocate(struct kdp_data *kdp)
+{
+	int i, ret;
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
+			 offsetof(struct rte_kdp_mbuf, pool));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_addr) !=
+			 offsetof(struct rte_kdp_mbuf, buf_addr));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, next) !=
+			 offsetof(struct rte_kdp_mbuf, next));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) !=
+			 offsetof(struct rte_kdp_mbuf, data_off));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) !=
+			 offsetof(struct rte_kdp_mbuf, data_len));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pkt_len) !=
+			 offsetof(struct rte_kdp_mbuf, pkt_len));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, ol_flags) !=
+			 offsetof(struct rte_kdp_mbuf, ol_flags));
+
+	/* Check if pktmbuf pool has been configured */
+	if (kdp->pktmbuf_pool == NULL) {
+		RTE_LOG(ERR, KDP, "No valid mempool for allocating mbufs\n");
+		return;
+	}
+
+	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
+		pkts[i] = rte_pktmbuf_alloc(kdp->pktmbuf_pool);
+		if (unlikely(pkts[i] == NULL)) {
+			/* Out of memory */
+			RTE_LOG(ERR, KDP, "Out of memory\n");
+			break;
+		}
+	}
+
+	/* No pkt mbuf allocated */
+	if (i <= 0)
+		return;
+
+	ret = kdp_fifo_put(kdp->alloc_q, (void **)pkts, i);
+
+	/* Check if any mbufs not put into alloc_q, and then free them */
+	if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
+		int j;
+
+		for (j = ret; j < i; j++)
+			rte_pktmbuf_free(pkts[j]);
+	}
+}
+
+static int
+attr_add(struct kdp_request *req, unsigned short type, void *buf, size_t len)
+{
+	struct rtattr *rta;
+	int nlmsg_len;
+
+	nlmsg_len = NLMSG_ALIGN(req->nlmsg.nlmsg_len);
+	rta = (struct rtattr *)((char *)&req->nlmsg + nlmsg_len);
+	if (nlmsg_len + RTA_LENGTH(len) > sizeof(struct kdp_request))
+		return -1;
+	rta->rta_type = type;
+	rta->rta_len = RTA_LENGTH(len);
+	memcpy(RTA_DATA(rta), buf, len);
+	req->nlmsg.nlmsg_len = nlmsg_len + RTA_LENGTH(len);
+
+	return 0;
+}
+
+static struct
+rtattr *attr_nested_add(struct kdp_request *req, unsigned short type)
+{
+	struct rtattr *rta;
+	int nlmsg_len;
+
+	nlmsg_len = NLMSG_ALIGN(req->nlmsg.nlmsg_len);
+	rta = (struct rtattr *)((char *)&req->nlmsg + nlmsg_len);
+	if (nlmsg_len + RTA_LENGTH(0) > sizeof(struct kdp_request))
+		return NULL;
+	rta->rta_type = type;
+	rta->rta_len = nlmsg_len;
+	req->nlmsg.nlmsg_len = nlmsg_len + RTA_LENGTH(0);
+
+	return rta;
+}
+
+static void
+attr_nested_end(struct kdp_request *req, struct rtattr *rta)
+{
+	rta->rta_len = req->nlmsg.nlmsg_len - rta->rta_len;
+}
+
+static int
+rtnl_create(struct rte_kdp_device_info *dev_info)
+{
+	struct kdp_request req;
+	struct ifinfomsg *info;
+	struct rtattr *rta1;
+	struct rtattr *rta2;
+	char name[RTE_KDP_NAMESIZE];
+	char type[RTE_KDP_NAMESIZE];
+	struct iovec iov;
+	struct msghdr msg;
+	struct sockaddr_nl nladdr;
+	int ret;
+	char buf[BUFSZ];
+
+	memset(&req, 0, sizeof(struct kdp_request));
+
+	req.nlmsg.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	req.nlmsg.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
+	req.nlmsg.nlmsg_flags |= NLM_F_ACK;
+	req.nlmsg.nlmsg_type = RTM_NEWLINK;
+
+	info = NLMSG_DATA(&req.nlmsg);
+
+	info->ifi_family = AF_UNSPEC;
+	info->ifi_index = 0;
+
+	snprintf(name, RTE_KDP_NAMESIZE, "%s", dev_info->name);
+	ret = attr_add(&req, IFLA_IFNAME, name, strlen(name) + 1);
+	if (ret < 0)
+		return -1;
+
+	rta1 = attr_nested_add(&req, IFLA_LINKINFO);
+	if (rta1 == NULL)
+		return -1;
+
+	snprintf(type, RTE_KDP_NAMESIZE, KDP_DEVICE);
+	ret = attr_add(&req, IFLA_INFO_KIND, type, strlen(type) + 1);
+	if (ret < 0)
+		return -1;
+
+	rta2 = attr_nested_add(&req, IFLA_INFO_DATA);
+	if (rta2 == NULL)
+		return -1;
+
+	ret = attr_add(&req, IFLA_KDP_PORTID, &dev_info->port_id,
+			sizeof(uint8_t));
+	if (ret < 0)
+		return -1;
+
+	ret = attr_add(&req, IFLA_KDP_DEVINFO, dev_info,
+			sizeof(struct rte_kdp_device_info));
+	if (ret < 0)
+		return -1;
+
+	attr_nested_end(&req, rta2);
+	attr_nested_end(&req, rta1);
+
+	memset(&nladdr, 0, sizeof(nladdr));
+	nladdr.nl_family = AF_NETLINK;
+
+	iov.iov_base = (void *)&req.nlmsg;
+	iov.iov_len = req.nlmsg.nlmsg_len;
+
+	memset(&msg, 0, sizeof(struct msghdr));
+	msg.msg_name = &nladdr;
+	msg.msg_namelen = sizeof(nladdr);
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+
+	ret = sendmsg(kdp_fd, &msg, 0);
+	if (ret < 0) {
+		RTE_LOG(ERR, KDP, "Send for create failed %d.\n", errno);
+		return -1;
+	}
+
+	memset(buf, 0, sizeof(buf));
+	iov.iov_base = buf;
+	iov.iov_len = sizeof(buf);
+
+	ret = recvmsg(kdp_fd, &msg, 0);
+	if (ret < 0) {
+		RTE_LOG(ERR, KDP, "Recv for create failed.\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+kdp_start(struct kdp_data *kdp, struct rte_mempool *pktmbuf_pool,
+	      const struct kdp_conf *conf)
+{
+	struct kdp_memzone_slot *slot = kdp ? kdp->slot : NULL;
+	struct rte_kdp_device_info dev_info;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+	int ret;
+
+	if (!kdp || !pktmbuf_pool || !conf || !conf->name[0])
+		return -1;
+
+	snprintf(kdp->name, RTE_KDP_NAMESIZE, "%s", conf->name);
+	kdp->pktmbuf_pool = pktmbuf_pool;
+	kdp->port_id = conf->port_id;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	dev_info.core_id = conf->core_id;
+	dev_info.force_bind = conf->force_bind;
+	dev_info.port_id = conf->port_id;
+	dev_info.mbuf_size = conf->mbuf_size;
+	snprintf(dev_info.name, RTE_KDP_NAMESIZE, "%s", conf->name);
+
+	dev_info.tx_phys = slot->m_tx_q->phys_addr;
+	dev_info.rx_phys = slot->m_rx_q->phys_addr;
+	dev_info.alloc_phys = slot->m_alloc_q->phys_addr;
+	dev_info.free_phys = slot->m_free_q->phys_addr;
+
+	/* MBUF mempool */
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME,
+		pktmbuf_pool->name);
+	mz = rte_memzone_lookup(mz_name);
+	if (mz == NULL)
+		goto kdp_fail;
+	dev_info.mbuf_va = mz->addr;
+	dev_info.mbuf_phys = mz->phys_addr;
+
+	ret = rtnl_create(&dev_info);
+	if (ret < 0)
+		goto kdp_fail;
+
+	kdp->in_use = 1;
+
+	/* Allocate mbufs and then put them into alloc_q */
+	kdp_mbufs_allocate(kdp);
+
+	return 0;
+
+kdp_fail:
+	return -1;
+}
+
+static void
+kdp_mbufs_free(struct kdp_data *kdp)
+{
+	int i, ret;
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+
+	ret = kdp_fifo_get(kdp->free_q, (void **)pkts, MAX_MBUF_BURST_NUM);
+	if (likely(ret > 0)) {
+		for (i = 0; i < ret; i++)
+			rte_pktmbuf_free(pkts[i]);
+	}
+}
+
+unsigned
+kdp_tx_burst(struct kdp_data *kdp, struct rte_mbuf **mbufs, unsigned num)
+{
+	unsigned ret = kdp_fifo_put(kdp->rx_q, (void **)mbufs, num);
+
+	/* Get mbufs from free_q and then free them */
+	kdp_mbufs_free(kdp);
+
+	return ret;
+}
+
+unsigned
+kdp_rx_burst(struct kdp_data *kdp, struct rte_mbuf **mbufs, unsigned num)
+{
+	unsigned ret = kdp_fifo_get(kdp->tx_q, (void **)mbufs, num);
+
+	/* If buffers removed, allocate mbufs and then put them into alloc_q */
+	if (ret)
+		kdp_mbufs_allocate(kdp);
+
+	return ret;
+}
+
+static void
+kdp_fifo_free(struct rte_kdp_fifo *fifo)
+{
+	int ret;
+	struct rte_mbuf *pkt;
+
+	do {
+		ret = kdp_fifo_get(fifo, (void **)&pkt, 1);
+		if (ret)
+			rte_pktmbuf_free(pkt);
+	} while (ret);
+}
+
+static int
+rtnl_destroy(struct kdp_data *kdp)
+{
+	struct kdp_request req;
+	struct ifinfomsg *info;
+	struct iovec iov;
+	struct msghdr msg;
+	struct sockaddr_nl nladdr;
+	int ret;
+
+	memset(&req, 0, sizeof(struct kdp_request));
+
+	req.nlmsg.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	req.nlmsg.nlmsg_flags = NLM_F_REQUEST;
+	req.nlmsg.nlmsg_type = RTM_DELLINK;
+
+	info = NLMSG_DATA(&req.nlmsg);
+
+	info->ifi_family = AF_UNSPEC;
+	info->ifi_index = 0;
+
+	ret = attr_add(&req, IFLA_IFNAME, kdp->name, strlen(kdp->name) + 1);
+	if (ret < 0)
+		return -1;
+
+	memset(&nladdr, 0, sizeof(nladdr));
+	nladdr.nl_family = AF_NETLINK;
+
+	iov.iov_base = (void *)&req.nlmsg;
+	iov.iov_len = req.nlmsg.nlmsg_len;
+
+	memset(&msg, 0, sizeof(struct msghdr));
+	msg.msg_name = &nladdr;
+	msg.msg_namelen = sizeof(nladdr);
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+
+	ret = sendmsg(kdp_fd, &msg, 0);
+	if (ret < 0) {
+		RTE_LOG(ERR, KDP, "Send for destroy failed.\n");
+		return -1;
+	}
+	return 0;
+}
+
+int
+kdp_stop(struct kdp_data *kdp)
+{
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+	int ret;
+	int i;
+
+	if (!kdp || !kdp->in_use)
+		return -1;
+
+	rtnl_destroy(kdp);
+
+	do {
+		ret = kdp_fifo_get(kdp->free_q, (void **)pkts,
+				MAX_MBUF_BURST_NUM);
+		if (ret > 0) {
+			for (i = 0; i < ret; i++)
+				rte_pktmbuf_free(pkts[i]);
+		}
+	} while (ret > 0);
+
+	do {
+		ret = kdp_fifo_get(kdp->alloc_q, (void **)pkts,
+				MAX_MBUF_BURST_NUM);
+		if (ret > 0) {
+			for (i = 0; i < ret; i++)
+				rte_pktmbuf_free(pkts[i]);
+		}
+	} while (ret > 0);
+	return 0;
+}
+
+void
+kdp_close(struct kdp_data *kdp)
+{
+	/* mbufs in all fifos should be released */
+	kdp_fifo_free(kdp->tx_q);
+	kdp_fifo_free(kdp->rx_q);
+	kdp_fifo_free(kdp->alloc_q);
+	kdp_fifo_free(kdp->free_q);
+
+	rte_free(kdp->slot);
+
+	/* Memset the KDP struct */
+	memset(kdp, 0, sizeof(struct kdp_data));
+}
+
+void
+kdp_uninit(void)
+{
+	kdp_ref_put();
+}
diff --git a/drivers/net/kdp/rte_kdp.h b/drivers/net/kdp/rte_kdp.h
new file mode 100644
index 0000000..20ad93d
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp.h
@@ -0,0 +1,116 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_KDP_H_
+#define _RTE_KDP_H_
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <sys/ioctl.h>
+
+#include <rte_malloc.h>
+#include <rte_mbuf.h>
+
+#include <exec-env/rte_kdp_common.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * KDP memzone pool slot
+ */
+struct kdp_memzone_slot {
+	uint32_t id;
+
+	/* Memzones */
+	const struct rte_memzone *m_tx_q;      /**< TX queue */
+	const struct rte_memzone *m_rx_q;      /**< RX queue */
+	const struct rte_memzone *m_alloc_q;   /**< Allocated mbufs queue */
+	const struct rte_memzone *m_free_q;    /**< To be freed mbufs queue */
+};
+
+/**
+ * KDP context
+ */
+struct kdp_data {
+	char name[RTE_KDP_NAMESIZE];        /**< KDP interface name */
+	struct rte_mempool *pktmbuf_pool;   /**< pkt mbuf mempool */
+	struct kdp_memzone_slot *slot;
+	uint16_t port_id;                  /**< Port ID of the KDP device */
+
+	struct rte_kdp_fifo *tx_q;          /**< TX queue */
+	struct rte_kdp_fifo *rx_q;          /**< RX queue */
+	struct rte_kdp_fifo *alloc_q;       /**< Allocated mbufs queue */
+	struct rte_kdp_fifo *free_q;        /**< To be freed mbufs queue */
+
+	uint8_t in_use;                     /**< kdp in use */
+};
+
+struct kdp_tap_data {
+	char name[RTE_KDP_NAMESIZE];
+	int tap_fd;
+};
+
+/**
+ * Structure for configuring KDP device.
+ */
+struct kdp_conf {
+	char name[RTE_KDP_NAMESIZE];
+	uint32_t core_id;   /* Core ID to bind kernel thread on */
+	uint16_t port_id;
+	unsigned mbuf_size;
+
+	uint8_t force_bind; /* Flag to bind kernel thread */
+};
+
+struct kdp_data *kdp_init(uint16_t port_id);
+int kdp_start(struct kdp_data *kdp, struct rte_mempool *pktmbuf_pool,
+	      const struct kdp_conf *conf);
+unsigned kdp_rx_burst(struct kdp_data *kdp,
+		struct rte_mbuf **mbufs, unsigned num);
+unsigned kdp_tx_burst(struct kdp_data *kdp,
+		struct rte_mbuf **mbufs, unsigned num);
+int kdp_stop(struct kdp_data *kdp);
+void kdp_close(struct kdp_data *kdp);
+void kdp_uninit(void);
+
+struct kdp_tap_data *kdp_tap_init(uint16_t port_id);
+void kdp_tap_close(struct kdp_tap_data *kdp_tap);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_KDP_H_ */
diff --git a/drivers/net/kdp/rte_kdp_fifo.h b/drivers/net/kdp/rte_kdp_fifo.h
new file mode 100644
index 0000000..1a7e063
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp_fifo.h
@@ -0,0 +1,91 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * Initializes the kdp fifo structure
+ */
+static void
+kdp_fifo_init(struct rte_kdp_fifo *fifo, unsigned size)
+{
+	/* Ensure size is power of 2 */
+	if (size & (size - 1))
+		rte_panic("KDP fifo size must be power of 2\n");
+
+	fifo->write = 0;
+	fifo->read = 0;
+	fifo->len = size;
+	fifo->elem_size = sizeof(void *);
+}
+
+/**
+ * Adds num elements into the fifo. Return the number actually written
+ */
+static inline unsigned
+kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned fifo_write = fifo->write;
+	unsigned fifo_read = fifo->read;
+	unsigned new_write = fifo_write;
+
+	for (i = 0; i < num; i++) {
+		new_write = (new_write + 1) & (fifo->len - 1);
+
+		if (new_write == fifo_read)
+			break;
+		fifo->buffer[fifo_write] = data[i];
+		fifo_write = new_write;
+	}
+	fifo->write = fifo_write;
+	return i;
+}
+
+/**
+ * Get up to num elements from the fifo. Return the number actually read
+ */
+static inline unsigned
+kdp_fifo_get(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned new_read = fifo->read;
+	unsigned fifo_write = fifo->write;
+	for (i = 0; i < num; i++) {
+		if (new_read == fifo_write)
+			break;
+
+		data[i] = fifo->buffer[new_read];
+		new_read = (new_read + 1) & (fifo->len - 1);
+	}
+	fifo->read = new_read;
+	return i;
+}
diff --git a/drivers/net/kdp/rte_kdp_tap.c b/drivers/net/kdp/rte_kdp_tap.c
new file mode 100644
index 0000000..12f3ad2
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp_tap.c
@@ -0,0 +1,101 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+
+#include <sys/socket.h>
+#include <linux/if.h>
+#include <linux/if_tun.h>
+
+#include "rte_kdp.h"
+
+static int
+tap_create(char *name)
+{
+	struct ifreq ifr;
+	int fd, ret;
+
+	fd = open("/dev/net/tun", O_RDWR);
+	if (fd < 0)
+		return fd;
+
+	memset(&ifr, 0, sizeof(ifr));
+
+	/* TAP device without packet information */
+	ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
+
+	if (name && *name)
+		snprintf(ifr.ifr_name, IFNAMSIZ, "%s", name);
+
+	ret = ioctl(fd, TUNSETIFF, (void *)&ifr);
+	if (ret < 0) {
+		close(fd);
+		return ret;
+	}
+
+	if (name)
+		snprintf(name, IFNAMSIZ, "%s", ifr.ifr_name);
+
+	return fd;
+}
+
+struct kdp_tap_data *
+kdp_tap_init(uint16_t port_id)
+{
+	struct kdp_tap_data *kdp_tap = NULL;
+	int flags;
+
+	kdp_tap = rte_malloc(NULL, sizeof(struct kdp_tap_data), 0);
+	if (kdp_tap == NULL)
+		goto error;
+
+	snprintf(kdp_tap->name, IFNAMSIZ, "tap_kdp%u", port_id);
+	kdp_tap->tap_fd = tap_create(kdp_tap->name);
+	if (kdp_tap->tap_fd < 0)
+		goto error;
+
+	flags = fcntl(kdp_tap->tap_fd, F_GETFL, 0);
+	fcntl(kdp_tap->tap_fd, F_SETFL, flags | O_NONBLOCK);
+
+	return kdp_tap;
+
+error:
+	rte_free(kdp_tap);
+	return NULL;
+}
+
+void
+kdp_tap_close(struct kdp_tap_data *kdp_tap)
+{
+	close(kdp_tap->tap_fd);
+}
diff --git a/drivers/net/kdp/rte_pmd_kdp_version.map b/drivers/net/kdp/rte_pmd_kdp_version.map
new file mode 100644
index 0000000..0812bb1
--- /dev/null
+++ b/drivers/net/kdp/rte_pmd_kdp_version.map
@@ -0,0 +1,4 @@
+DPDK_2.3 {
+
+	local: *;
+};
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index 2e47e7f..5a0048b 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -79,6 +79,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
 #define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
 #define RTE_LOGTYPE_CRYPTODEV 0x00020000 /**< Log related to cryptodev. */
+#define RTE_LOGTYPE_KDP     0x00080000 /**< Log related to KDP. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 8ecab41..eb18972 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   Copyright(c) 2014-2015 6WIND S.A.
 #   All rights reserved.
 #
@@ -154,6 +154,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP)       += -lrte_pmd_pcap
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL)       += -lrte_pmd_null
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)        += -lrte_pmd_qat
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KDP)        += -lrte_pmd_kdp
 
 # AESNI MULTI BUFFER is dependent on the IPSec_MB library
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)   += -lrte_pmd_aesni_mb
-- 
2.5.0


* [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-02-19  5:05 ` [PATCH v2 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2016-02-19  5:05   ` [PATCH v2 1/2] kdp: add kernel data path kernel module Ferruh Yigit
  2016-02-19  5:05   ` [PATCH v2 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
@ 2016-03-09 11:17   ` Ferruh Yigit
  2016-03-09 11:17     ` [PATCH v3 1/2] kdp: add kernel data path kernel module Ferruh Yigit
                       ` (2 more replies)
  2 siblings, 3 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-03-09 11:17 UTC (permalink / raw)
  To: dev

This patch is sent to keep a record of the latest status of the work.


This is a slow data path communication implementation based on the existing KNI.

The difference is that librte_kni is converted into a PMD, and the kdp kernel
module is almost the same except that all control path functionality is removed
and some simplification is done.

The motivation is to simplify slow path data communication.
Now any application can use this new PMD to send data to and get data from the
Linux kernel.

The PMD supports two communication methods:

1) KDP kernel module
The PMD initialization functions create the virtual interfaces (with the help
of the kdp kernel module) and the FIFOs. The FIFOs are used to share data
between userspace and kernelspace. This is the default method.

2) tun/tap module
When the KDP module is not inserted, the PMD creates a tap interface and
transfers packets using the tap interface.
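
For method 1, the FIFO shared between userspace and kernelspace can be
pictured roughly as below. This is only a minimal sketch modelled on the
kdp_fifo_put()/kdp_fifo_get() helpers in the patch; the demo_* names and the
fixed 8-entry capacity are assumptions made for illustration and are not part
of the patch. It assumes one producer and one consumer per FIFO, and it leaves
out the memory ordering a real shared ring would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_kdp_fifo, fixed capacity for brevity */
#define DEMO_FIFO_LEN 8u	/* must be a power of two */

struct demo_fifo {
	volatile unsigned write;	/* next slot the producer fills */
	volatile unsigned read;		/* next slot the consumer drains */
	unsigned len;			/* number of slots */
	void *buffer[DEMO_FIFO_LEN];
};

static void
demo_fifo_init(struct demo_fifo *fifo)
{
	/* The index wrap below relies on the length being a power of two */
	assert((DEMO_FIFO_LEN & (DEMO_FIFO_LEN - 1)) == 0);

	fifo->write = 0;
	fifo->read = 0;
	fifo->len = DEMO_FIFO_LEN;
}

/* Store up to num pointers, return the number actually stored */
static unsigned
demo_fifo_put(struct demo_fifo *fifo, void **data, unsigned num)
{
	unsigned i;
	unsigned wr = fifo->write;
	unsigned rd = fifo->read;

	for (i = 0; i < num; i++) {
		unsigned next = (wr + 1) & (fifo->len - 1);

		/* Full: one slot always stays empty to tell full from empty */
		if (next == rd)
			break;
		fifo->buffer[wr] = data[i];
		wr = next;
	}
	fifo->write = wr;
	return i;
}

/* Fetch up to num pointers, return the number actually fetched */
static unsigned
demo_fifo_get(struct demo_fifo *fifo, void **data, unsigned num)
{
	unsigned i;
	unsigned rd = fifo->read;
	unsigned wr = fifo->write;

	for (i = 0; i < num; i++) {
		if (rd == wr)	/* empty */
			break;
		data[i] = fifo->buffer[rd];
		rd = (rd + 1) & (fifo->len - 1);
	}
	fifo->read = rd;
	return i;
}
```

Because one slot is kept empty, a FIFO of N slots holds at most N - 1
pointers. Each side writes only its own index, which is why the put and get
paths need no lock as long as there is a single producer and a single
consumer.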

In the long term this patch intends to replace KNI, and KNI will be
deprecated.

v3:
* Remove logging helper macros, use pr_fmt
* Replace rw_semaphore with mutex
* Devices are not up by default
* Use unsigned primitive types where possible
* Update module parameters
* Code cleanup, remove useless comments, reorder fields/code.

v2:
* Use rtnetlink to create interfaces
* Include modules.h to prevent compile error in old kernels
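
The rtnetlink change in v2 means the PMD builds an RTM_NEWLINK request and
appends attributes such as IFLA_IFNAME to it before sending it over a
NETLINK_ROUTE socket. The sketch below shows only the message building step,
along the lines of attr_add() in the patch. The demo_* names are hypothetical,
the flags mirror those used in rtnl_create(), and nothing is sent to the
kernel here.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

#define DEMO_BUFSZ 1024

/* Hypothetical request layout: netlink header followed by payload space */
struct demo_request {
	struct nlmsghdr nlmsg;
	char buf[DEMO_BUFSZ];
};

/* Start an RTM_NEWLINK request that carries an empty struct ifinfomsg */
static void
demo_newlink_init(struct demo_request *req)
{
	assert(req != NULL);

	memset(req, 0, sizeof(*req));
	req->nlmsg.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	req->nlmsg.nlmsg_type = RTM_NEWLINK;
	req->nlmsg.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE |
				 NLM_F_EXCL | NLM_F_ACK;
}

/* Append one attribute, return 0 on success or -1 if it does not fit */
static int
demo_attr_add(struct demo_request *req, unsigned short type,
		const void *data, size_t len)
{
	size_t off;
	struct rtattr *rta;

	assert(req != NULL && data != NULL);

	/* Attributes start on an aligned offset after the current payload */
	off = NLMSG_ALIGN(req->nlmsg.nlmsg_len);
	if (off + RTA_LENGTH(len) > sizeof(*req))
		return -1;

	rta = (struct rtattr *)((char *)&req->nlmsg + off);
	rta->rta_type = type;
	rta->rta_len = RTA_LENGTH(len);
	memcpy(RTA_DATA(rta), data, len);
	req->nlmsg.nlmsg_len = off + RTA_LENGTH(len);

	return 0;
}
```

A caller would run demo_newlink_init() once and then call demo_attr_add() for
each attribute, for example the interface name with its terminating NUL. The
bounds check is done before the attribute is written, so an oversized
attribute leaves the request unchanged.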


Sample usage:
1) Transfer any packet received from a NIC that is bound to DPDK to the Linux kernel

a) insert kdp kernel module
insmod build/kmod/rte_kdp.ko

b) bind NIC to the DPDK using dpdk_nic_bind.py

c) ./testpmd --vdev eth_kdp0

c1) testpmd shows two ports, one physical and the other virtual
...
Configuring Port 0 (socket 0)
Port 0: 00:00:00:00:00:00
Configuring Port 1 (socket 0)
...
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
Done

c2) This will create "kdp0" Linux interface
$ ip l show kdp0
21: kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff

d) Linux port can be used for data

d1)
$ ifconfig kdp0 1.0.0.2
$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
64 bytes from 1.0.0.1: icmp_seq=1 ttl=64 time=0.789 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=64 time=0.881 ms

d2)
$ tcpdump -nn -i kdp0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on kdp0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:01:22.407506 IP 1.0.0.1 > 1.0.0.2: ICMP echo request, id 40016, seq 18, length 64
15:01:22.408521 IP 1.0.0.2 > 1.0.0.1: ICMP echo reply, id 40016, seq 18, length 64



2) Data travelling between virtual Linux interfaces passes through the DPDK
application, which can alter the data

a) insert kdp kernel module
insmod build/kmod/rte_kdp.ko

b) No physical NIC involved

c) ./testpmd --vdev eth_kdp0 --vdev eth_kdp1

c1) testpmd shows two ports, both of them virtual
...
Configuring Port 0 (socket 0)
Port 0: 00:00:00:00:00:00
Configuring Port 1 (socket 0)
Port 1: 00:00:00:00:00:00
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
Done

c2) This will create "kdp0" and "kdp1" Linux interfaces
$ ip l show kdp0; ip l show kdp1
22: kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
23: kdp1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff

d) Data travelling between virtual ports passes through the DPDK application
$ ifconfig kdp0 1.0.0.1
$ ifconfig kdp1 1.0.0.2

d1)
$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
64 bytes from 1.0.0.1: icmp_seq=1 ttl=64 time=3.57 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=64 time=1.85 ms
64 bytes from 1.0.0.1: icmp_seq=3 ttl=64 time=1.89 ms

d2)
$ tcpdump -nn -i kdp0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on kdp0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:20:51.908543 IP 1.0.0.2 > 1.0.0.1: ICMP echo request, id 41234, seq 1, length 64
15:20:51.909570 IP 1.0.0.1 > 1.0.0.2: ICMP echo reply, id 41234, seq 1, length 64
15:20:52.909551 IP 1.0.0.2 > 1.0.0.1: ICMP echo request, id 41234, seq 2, length 64
15:20:52.910577 IP 1.0.0.1 > 1.0.0.2: ICMP echo reply, id 41234, seq 2, length 64



3) tun/tap interface usage

a) No external module required; tun/tap support in the kernel is required

b) ./testpmd --vdev eth_kdp0 --vdev eth_kdp1

b1) This will create "tap_kdp0" and "tap_kdp1" Linux interfaces
$ ip l show tap_kdp0; ip l show tap_kdp1
25: tap_kdp0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
    link/ether 56:47:97:9c:03:8e brd ff:ff:ff:ff:ff:ff
26: tap_kdp1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
    link/ether 5e:15:22:b0:52:42 brd ff:ff:ff:ff:ff:ff

Ferruh Yigit (2):
  kdp: add kernel data path kernel module
  kdp: add virtual PMD for kernel slow data path communication

 MAINTAINERS                                        |   5 +
 config/common_base                                 |   7 +
 config/common_linuxapp                             |   2 +
 doc/guides/nics/pcap_ring.rst                      | 125 +++-
 doc/guides/rel_notes/release_16_04.rst             |   5 +
 drivers/net/Makefile                               |   3 +-
 drivers/net/kdp/Makefile                           |  61 ++
 drivers/net/kdp/rte_eth_kdp.c                      | 501 ++++++++++++++
 drivers/net/kdp/rte_kdp.c                          | 633 ++++++++++++++++++
 drivers/net/kdp/rte_kdp.h                          | 116 ++++
 drivers/net/kdp/rte_kdp_fifo.h                     |  91 +++
 drivers/net/kdp/rte_kdp_tap.c                      | 101 +++
 drivers/net/kdp/rte_pmd_kdp_version.map            |   4 +
 lib/librte_eal/common/include/rte_log.h            |   3 +-
 lib/librte_eal/linuxapp/Makefile                   |   3 +-
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 .../linuxapp/eal/include/exec-env/rte_kdp_common.h | 134 ++++
 lib/librte_eal/linuxapp/kdp/Makefile               |  55 ++
 lib/librte_eal/linuxapp/kdp/kdp_dev.h              |  76 +++
 lib/librte_eal/linuxapp/kdp/kdp_fifo.h             |  91 +++
 lib/librte_eal/linuxapp/kdp/kdp_net.c              | 718 +++++++++++++++++++++
 mk/rte.app.mk                                      |   3 +-
 22 files changed, 2732 insertions(+), 8 deletions(-)
 create mode 100644 drivers/net/kdp/Makefile
 create mode 100644 drivers/net/kdp/rte_eth_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.h
 create mode 100644 drivers/net/kdp/rte_kdp_fifo.h
 create mode 100644 drivers/net/kdp/rte_kdp_tap.c
 create mode 100644 drivers/net/kdp/rte_pmd_kdp_version.map
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_fifo.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_net.c

-- 
2.5.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 1/2] kdp: add kernel data path kernel module
  2016-03-09 11:17   ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
@ 2016-03-09 11:17     ` Ferruh Yigit
  2016-03-09 11:17     ` [PATCH v3 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
  2016-03-14 15:32     ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-03-09 11:17 UTC (permalink / raw)
  To: dev

This kernel module is based on the KNI module, but it is a stripped-down
version that handles only data messages; no control functionality is
provided.

The FIFO implementation of KNI is kept exactly the same, but ethtool-related
code is removed and virtual network management code is simplified.

This module contains kernel support to create network devices, along with
a simple driver for the virtual network device. The driver
simply puts/gets packets to/from the FIFO instead of real hardware.

The FIFO is created and owned by the userspace application, which in this
case is the KDP PMD.
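
As a rough illustration of the copy the driver performs before putting a
buffer on the tx FIFO (following kdp_net_tx() in the diff below), here is a
userspace sketch. struct sketch_mbuf, SKETCH_BUF_SIZE and the function name
are hypothetical stand-ins, not part of the patch; 60 mirrors ETH_ZLEN in
Linux.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ZLEN 60    /* minimum Ethernet frame length without FCS */
#define SKETCH_BUF_SIZE 2048  /* hypothetical data room of one buffer */

/* Simplified stand-in for the mbuf fields the driver touches (assumption) */
struct sketch_mbuf {
	uint8_t data[SKETCH_BUF_SIZE];
	uint32_t pkt_len;   /* total packet length */
	uint16_t data_len;  /* data length in this segment */
};

/*
 * Copy a frame into the buffer and zero-pad short frames to the minimum
 * Ethernet length. Returns 0 on success, -1 if the frame does not fit,
 * in which case the caller would drop it.
 */
static inline int
sketch_fill_mbuf(struct sketch_mbuf *m, const void *frame, size_t len)
{
	assert(m != NULL && frame != NULL);

	if (len > SKETCH_BUF_SIZE)
		return -1;

	memcpy(m->data, frame, len);
	if (len < SKETCH_ETH_ZLEN) {
		memset(m->data + len, 0, SKETCH_ETH_ZLEN - len);
		len = SKETCH_ETH_ZLEN;
	}
	m->pkt_len = (uint32_t)len;
	m->data_len = (uint16_t)len;

	return 0;
}
```

In the module itself the buffer comes from alloc_q and is enqueued on tx_q
after this step; the sketch leaves the queue handling out.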

In the long term this patch intends to replace KNI, and KNI will be
deprecated.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---

v3:
* Remove logging helper macros, use pr_fmt
* Replace rw_semaphore with mutex
* Devices are not up by default
* Use unsigned primitive types where possible
* Update module parameters
* Code cleanup, remove useless comments, reorder fields/code.

v2:
* Use rtnetlink to create interfaces
* Include module.h to prevent compile error in old kernels
---
 MAINTAINERS                                        |   4 +
 config/common_base                                 |   6 +
 config/common_linuxapp                             |   1 +
 lib/librte_eal/linuxapp/Makefile                   |   3 +-
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 .../linuxapp/eal/include/exec-env/rte_kdp_common.h | 134 ++++
 lib/librte_eal/linuxapp/kdp/Makefile               |  55 ++
 lib/librte_eal/linuxapp/kdp/kdp_dev.h              |  76 +++
 lib/librte_eal/linuxapp/kdp/kdp_fifo.h             |  91 +++
 lib/librte_eal/linuxapp/kdp/kdp_net.c              | 718 +++++++++++++++++++++
 10 files changed, 1089 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_fifo.h
 create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_net.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e253bf7..edcc4cc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -258,6 +258,10 @@ F: app/test/test_kni.c
 F: examples/kni/
 F: doc/guides/sample_app_ug/kernel_nic_interface.rst
 
+Linux KDP
+M: Ferruh Yigit <ferruh.yigit@gmail.com>
+F: lib/librte_eal/linuxapp/kdp/
+
 Linux AF_PACKET
 M: John W. Linville <linville@tuxdriver.com>
 F: drivers/net/af_packet/
diff --git a/config/common_base b/config/common_base
index c73f71a..973baff 100644
--- a/config/common_base
+++ b/config/common_base
@@ -302,6 +302,12 @@ CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
 CONFIG_RTE_LIBRTE_PMD_NULL=y
 
 #
+# Compile KDP PMD
+#
+CONFIG_RTE_KDP_KMOD=n
+CONFIG_RTE_KDP_PREEMPT_DEFAULT=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index ffbe260..569a0fe 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -39,6 +39,7 @@ CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
 CONFIG_RTE_KNI_KMOD=y
 CONFIG_RTE_LIBRTE_KNI=y
+CONFIG_RTE_KDP_KMOD=y
 CONFIG_RTE_LIBRTE_VHOST=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
 CONFIG_RTE_LIBRTE_POWER=y
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index 20d2a91..26c70f4 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -34,6 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal
 DIRS-$(CONFIG_RTE_EAL_IGB_UIO) += igb_uio
 DIRS-$(CONFIG_RTE_KNI_KMOD) += kni
+DIRS-$(CONFIG_RTE_KDP_KMOD) += kdp
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += xen_dom0
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index c5490e4..e75662d 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -121,6 +121,7 @@ CFLAGS_eal_thread.o += -Wno-return-type
 endif
 
 INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h
+INC += rte_kdp_common.h
 
 SYMLINK-$(CONFIG_RTE_EXEC_ENV_LINUXAPP)-include/exec-env := \
 	$(addprefix include/exec-env/,$(INC))
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
new file mode 100644
index 0000000..b9db8ef
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h
@@ -0,0 +1,134 @@
+/*-
+ *   This file is provided under a dual BSD/LGPLv2 license.  When using or
+ *   redistributing this file, you may do so under either license.
+ *
+ *   GNU LESSER GENERAL PUBLIC LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2.1 of the GNU Lesser General Public License
+ *   as published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   Lesser General Public License for more details.
+ *
+ *   You should have received a copy of the GNU Lesser General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ *
+ *
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in
+ *     the documentation and/or other materials provided with the
+ *     distribution.
+ *   * Neither the name of Intel Corporation nor the names of its
+ *     contributors may be used to endorse or promote products derived
+ *     from this software without specific prior written permission.
+ *
+ *    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#ifndef _RTE_KDP_COMMON_H_
+#define _RTE_KDP_COMMON_H_
+
+/**
+ * KDP name
+ */
+#define RTE_KDP_NAMESIZE 32
+
+#define KDP_DEVICE "kdp"
+
+/*
+ * Fifo struct mapped in a shared memory. It describes a circular buffer FIFO
+ * Write and read should wrap around. Fifo is empty when write == read
+ * Writing should never overwrite the read position
+ */
+struct rte_kdp_fifo {
+	volatile unsigned write;     /**< Next position to be written*/
+	volatile unsigned read;      /**< Next position to be read */
+	unsigned len;                /**< Circular buffer length */
+	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
+	void * volatile buffer[0];   /**< The buffer contains mbuf pointers */
+};
+
+/*
+ * The kernel image of the rte_mbuf struct, with only the relevant fields.
+ * Padding is necessary to assure the offsets of these fields
+ */
+struct rte_kdp_mbuf {
+	void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
+	char pad0[10];
+
+	uint16_t data_off;  /**< Start address of data in segment buffer. */
+	char pad1[4];
+	uint64_t ol_flags;  /**< Offload features. */
+	char pad2[4];
+
+	uint32_t pkt_len;   /**< Total pkt len: sum of all segment data_len. */
+
+	uint16_t data_len;  /**< Amount of data in segment buffer. */
+
+	/* fields on second cache line */
+	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
+	void *pool;
+	void *next;
+};
+
+/*
+ * Struct used to create a KDP device. Passed to the kernel in IOCTL call
+ */
+struct rte_kdp_device_info {
+	char name[RTE_KDP_NAMESIZE];  /**< Network device name for KDP */
+	uint16_t port_id;
+
+	phys_addr_t tx_phys;
+	phys_addr_t rx_phys;
+	phys_addr_t alloc_phys;
+	phys_addr_t free_phys;
+
+	/* mbuf mempool */
+	void *mbuf_va;
+	phys_addr_t mbuf_phys;
+
+	unsigned mbuf_size;
+
+	uint8_t force_bind;  /**< Flag for kernel thread binding */
+	uint32_t core_id;    /**< core ID to bind for kernel thread */
+};
+
+enum {
+	IFLA_KDP_UNSPEC,
+	IFLA_KDP_PORTID,
+	IFLA_KDP_DEVINFO,
+	__IFLA_KDP_MAX,
+};
+#define IFLA_KDP_MAX (__IFLA_KDP_MAX - 1)
+
+#endif /* _RTE_KDP_COMMON_H_ */
diff --git a/lib/librte_eal/linuxapp/kdp/Makefile b/lib/librte_eal/linuxapp/kdp/Makefile
new file mode 100644
index 0000000..3897dc6
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/Makefile
@@ -0,0 +1,55 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = rte_kdp
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR) --param max-inline-insns-single=50
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+MODULE_CFLAGS += -Wall -Werror
+
+# this lib needs main eal
+DEPDIRS-y += lib/librte_eal/linuxapp/eal
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y += kdp_net.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_dev.h b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
new file mode 100644
index 0000000..0689e4f
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_dev.h
@@ -0,0 +1,76 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KDP_DEV_H_
+#define _KDP_DEV_H_
+
+#include <exec-env/rte_kdp_common.h>
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+/**
+ * A structure describing the private information for a kdp device.
+ */
+struct kdp_dev {
+	/* kdp list */
+	struct list_head list;
+
+	char name[RTE_KDP_NAMESIZE]; /* Network device name */
+
+	u8 port_id;
+	u32 core_id;                 /* Core ID to bind */
+
+	/* kdp device */
+	struct net_device *net_dev;
+
+	struct task_struct *pthread;
+	struct net_device_stats stats;
+
+	/* queue for packets to be sent out */
+	void *tx_q;
+
+	/* queue for the packets received */
+	void *rx_q;
+
+	/* queue for the allocated mbufs those can be used to save sk buffs */
+	void *alloc_q;
+
+	/* free queue for the mbufs to be freed */
+	void *free_q;
+
+	void *mbuf_kva;
+	void *mbuf_va;
+	ssize_t addr_diff;
+
+	/* mbuf size */
+	unsigned mbuf_size;
+};
+
+#ifdef RTE_KDP_KO_DEBUG
+#define KDP_DBG(args...) pr_debug(args)
+#else
+#define KDP_DBG(args...)
+#endif
+
+#endif /* _KDP_DEV_H_ */
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_fifo.h b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
new file mode 100644
index 0000000..b70ce25
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h
@@ -0,0 +1,91 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KDP_FIFO_H_
+#define _KDP_FIFO_H_
+
+#include <exec-env/rte_kdp_common.h>
+
+/**
+ * Adds num elements into the fifo. Return the number actually written
+ */
+static inline size_t
+kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, size_t num)
+{
+	size_t i;
+	u32 fifo_write = fifo->write;
+	u32 fifo_read = fifo->read;
+	u32 new_write = fifo_write;
+
+	for (i = 0; i < num; i++) {
+		new_write = (new_write + 1) & (fifo->len - 1);
+
+		if (new_write == fifo_read)
+			break;
+		fifo->buffer[fifo_write] = data[i];
+		fifo_write = new_write;
+	}
+	fifo->write = fifo_write;
+
+	return i;
+}
+
+/**
+ * Get up to num elements from the fifo. Return the number actually read
+ */
+static inline size_t
+kdp_fifo_get(struct rte_kdp_fifo *fifo, void **data, size_t num)
+{
+	size_t i = 0;
+	u32 new_read = fifo->read;
+	u32 fifo_write = fifo->write;
+
+	for (i = 0; i < num; i++) {
+		if (new_read == fifo_write)
+			break;
+
+		data[i] = fifo->buffer[new_read];
+		new_read = (new_read + 1) & (fifo->len - 1);
+	}
+	fifo->read = new_read;
+
+	return i;
+}
+
+/**
+ * Get the num of elements in the fifo
+ */
+static inline size_t
+kdp_fifo_count(struct rte_kdp_fifo *fifo)
+{
+	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+}
+
+/**
+ * Get the num of available elements in the fifo
+ */
+static inline size_t
+kdp_fifo_free_count(struct rte_kdp_fifo *fifo)
+{
+	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+}
+
+#endif /* _KDP_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/kdp/kdp_net.c b/lib/librte_eal/linuxapp/kdp/kdp_net.c
new file mode 100644
index 0000000..f089339
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kdp/kdp_net.c
@@ -0,0 +1,718 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/kthread.h>
+#include <linux/module.h>
+#include <linux/version.h>
+#include <net/rtnetlink.h>
+
+#include "kdp_dev.h"
+#include "kdp_fifo.h"
+
+#define WD_TIMEOUT 5 /*jiffies */
+#define MBUF_BURST_SZ 32
+
+#define KDP_RX_LOOP_NUM 1000
+#define KDP_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
+
+static struct task_struct *kdp_kthread;
+static struct mutex kdp_list_lock;
+static struct list_head kdp_list_head;
+
+/* loopback mode */
+static char *lo_mode;
+module_param(lo_mode, charp, S_IRUGO);
+MODULE_PARM_DESC(lo_mode, "Enable loopback mode: fifo or fifo_skb.");
+
+/* Kernel thread mode */
+static bool multiple_kthread;
+module_param(multiple_kthread, bool, S_IRUGO);
+MODULE_PARM_DESC(multiple_kthread, "Enable multiple kernel thread mode.");
+
+/* typedef for rx function */
+typedef void (*kdp_net_rx_t)(struct kdp_dev *kdp);
+
+static int kdp_net_open(struct net_device *dev)
+{
+	random_ether_addr(dev->dev_addr);
+	netif_start_queue(dev);
+
+	return 0;
+}
+
+static int kdp_net_close(struct net_device *dev)
+{
+	netif_stop_queue(dev);
+
+	return 0;
+}
+
+static inline void *va_to_kva(void *va, struct kdp_dev *kdp)
+{
+	return va + kdp->addr_diff;
+}
+
+static inline void *pkt_data(struct rte_kdp_mbuf *pkt, struct kdp_dev *kdp)
+{
+	return va_to_kva(pkt->buf_addr + pkt->data_off, kdp);
+}
+
+/*
+ * Transmit a packet (called by the kernel)
+ */
+static int kdp_net_tx(struct sk_buff *skb, struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+	struct rte_kdp_mbuf *pkt;
+	void *pkt_va;
+	void *data;
+	u32 len;
+	u32 ret;
+
+	dev->trans_start = jiffies; /* save the timestamp */
+
+	/* Check if the length of skb is less than mbuf size */
+	if (skb->len > kdp->mbuf_size)
+		goto drop;
+
+	/**
+	 * Check if it has at least one free entry in tx_q and
+	 * one entry in alloc_q.
+	 */
+	if (kdp_fifo_free_count(kdp->tx_q) == 0 ||
+			kdp_fifo_count(kdp->alloc_q) == 0) {
+		/**
+		 * If no free entry in tx_q or no entry in alloc_q,
+		 * drops skb and goes out.
+		 */
+		goto drop;
+	}
+
+	/* dequeue a mbuf from alloc_q */
+	ret = kdp_fifo_get(kdp->alloc_q, &pkt_va, 1);
+	if (likely(ret == 1)) {
+		pkt = va_to_kva(pkt_va, kdp);
+		data = pkt_data(pkt, kdp);
+
+		len = skb->len;
+		memcpy(data, skb->data, len);
+		if (unlikely(len < ETH_ZLEN)) {
+			memset(data + len, 0, ETH_ZLEN - len);
+			len = ETH_ZLEN;
+		}
+		pkt->pkt_len = len;
+		pkt->data_len = len;
+
+		/* enqueue mbuf into tx_q */
+		ret = kdp_fifo_put(kdp->tx_q, &pkt_va, 1);
+		if (unlikely(ret != 1)) {
+			/* Failing should not happen */
+			pr_err("Fail to enqueue mbuf into tx_q\n");
+			goto drop;
+		}
+	} else {
+		/* Failing should not happen */
+		pr_err("Fail to dequeue mbuf from alloc_q\n");
+		goto drop;
+	}
+
+	/* Free skb and update statistics */
+	dev_kfree_skb(skb);
+	kdp->stats.tx_bytes += len;
+	kdp->stats.tx_packets++;
+
+	return NETDEV_TX_OK;
+
+drop:
+	/* Free skb and update statistics */
+	dev_kfree_skb(skb);
+	kdp->stats.tx_dropped++;
+
+	return NETDEV_TX_OK;
+}
+
+static void kdp_net_set_rx_mode(struct net_device *dev)
+{
+}
+
+static int kdp_net_set_mac(struct net_device *dev, void *p)
+{
+	struct sockaddr *addr = p;
+
+	if (!is_valid_ether_addr(addr->sa_data))
+		return -EADDRNOTAVAIL;
+
+	memcpy(dev->dev_addr, addr->sa_data, dev->addr_len);
+
+	return 0;
+}
+
+static int kdp_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+	return -EOPNOTSUPP;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+static int kdp_net_config(struct net_device *dev, struct ifmap *map)
+{
+	if (dev->flags & IFF_UP)
+		return -EBUSY;
+
+	return -EOPNOTSUPP;
+}
+
+static int kdp_net_change_mtu(struct net_device *dev, int new_mtu)
+{
+	dev->mtu = new_mtu;
+
+	return 0;
+}
+
+/*
+ * Deal with a transmit timeout.
+ */
+static void kdp_net_tx_timeout(struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+
+	KDP_DBG("Transmit timeout at %ld, latency %ld\n", jiffies,
+			jiffies - dev->trans_start);
+
+	kdp->stats.tx_errors++;
+	netif_wake_queue(dev);
+}
+
+/*
+ * Return statistics to the caller
+ */
+static struct net_device_stats *kdp_net_stats(struct net_device *dev)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+
+	return &kdp->stats;
+}
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0))
+static int kdp_net_change_carrier(struct net_device *dev, bool new_carrier)
+{
+	if (new_carrier)
+		netif_carrier_on(dev);
+	else
+		netif_carrier_off(dev);
+	return 0;
+}
+#endif
+
+static const struct net_device_ops kdp_net_netdev_ops = {
+	.ndo_open = kdp_net_open,
+	.ndo_stop = kdp_net_close,
+	.ndo_start_xmit = kdp_net_tx,
+	.ndo_set_rx_mode = kdp_net_set_rx_mode,
+	.ndo_set_mac_address = kdp_net_set_mac,
+	.ndo_do_ioctl = kdp_net_ioctl,
+	.ndo_set_config = kdp_net_config,
+	.ndo_change_mtu = kdp_net_change_mtu,
+	.ndo_tx_timeout = kdp_net_tx_timeout,
+	.ndo_get_stats = kdp_net_stats,
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0))
+	.ndo_change_carrier = kdp_net_change_carrier,
+#endif
+};
+
+static void kdp_net_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+	dev->netdev_ops = &kdp_net_netdev_ops;
+	dev->watchdog_timeo = WD_TIMEOUT;
+}
+
+/*
+ * RX: normal working mode
+ */
+static void kdp_net_rx_normal(struct kdp_dev *kdp)
+{
+	struct net_device *dev = kdp->net_dev;
+	void *va[MBUF_BURST_SZ];
+	struct rte_kdp_mbuf *pkt;
+	void *data;
+	struct sk_buff *skb;
+	size_t num_rx, num_fq;
+	size_t len;
+	size_t ret;
+	u32 i;
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+	if (num_fq == 0)
+		return; /* No room on the free_q, bail out */
+
+	/* Calculate the number of entries to dequeue from rx_q */
+	num_rx = min_t(size_t, num_fq, MBUF_BURST_SZ);
+
+	/* Burst dequeue from rx_q */
+	num_rx = kdp_fifo_get(kdp->rx_q, va, num_rx);
+	if (num_rx == 0)
+		return;
+
+	/* Transfer received packets to netif */
+	for (i = 0; i < num_rx; i++) {
+		pkt = va_to_kva(va[i], kdp);
+		data = pkt_data(pkt, kdp);
+		len = pkt->data_len;
+
+		skb = dev_alloc_skb(len + 2);
+		if (!skb) {
+			kdp->stats.rx_dropped++;
+			continue;
+		}
+
+		/* Align IP on 16B boundary */
+		skb_reserve(skb, 2);
+		memcpy(skb_put(skb, len), data, len);
+		skb->dev = dev;
+		skb->protocol = eth_type_trans(skb, dev);
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+		/* Call netif interface */
+		netif_rx(skb);
+
+		/* Update statistics */
+		kdp->stats.rx_bytes += len;
+		kdp->stats.rx_packets++;
+	}
+
+	/* Burst enqueue mbufs into free_q */
+	ret = kdp_fifo_put(kdp->free_q, va, num_rx);
+	if (ret != num_rx)
+		/* Failing should not happen */
+		pr_err("Fail to enqueue entries into free_q\n");
+}
+
+/*
+ * RX: loopback with enqueue/dequeue fifos.
+ */
+static void kdp_net_rx_lo_fifo(struct kdp_dev *kdp)
+{
+	void *va[MBUF_BURST_SZ];
+	struct rte_kdp_mbuf *pkt;
+	void *data;
+	void *alloc_va[MBUF_BURST_SZ];
+	struct rte_kdp_mbuf *alloc_pkt;
+	void *alloc_data;
+	size_t num, num_q;
+	size_t ret;
+	size_t len;
+	u32 i;
+
+	/* Get the number of entries in rx_q */
+	num_q = kdp_fifo_count(kdp->rx_q);
+	num = min_t(size_t, num_q, MBUF_BURST_SZ);
+
+	/* Get the number of free entries in tx_q */
+	num_q = kdp_fifo_free_count(kdp->tx_q);
+	num = min_t(size_t, num, num_q);
+
+	/* Get the number of entries in alloc_q */
+	num_q = kdp_fifo_count(kdp->alloc_q);
+	num = min_t(size_t, num, num_q);
+
+	/* Get the number of free entries in free_q */
+	num_q = kdp_fifo_free_count(kdp->free_q);
+	num = min_t(size_t, num, num_q);
+
+	/* Return if no entry to dequeue from rx_q */
+	if (num == 0)
+		return;
+
+	/* Dequeue entries from alloc_q */
+	ret = kdp_fifo_get(kdp->alloc_q, alloc_va, num);
+	if (ret == 0)
+		return;
+
+	/* Burst dequeue from rx_q */
+	ret = kdp_fifo_get(kdp->rx_q, va, num);
+	if (ret == 0) {
+		/* recover entries from alloc_q before return */
+		ret = kdp_fifo_put(kdp->free_q, alloc_va, num);
+		if (ret != num)
+			pr_err("Fail to enqueue alloc mbufs into free_q\n");
+		return;
+	}
+
+	num = ret;
+	/* Copy mbufs */
+	for (i = 0; i < num; i++) {
+		pkt = va_to_kva(va[i], kdp);
+		data = pkt_data(pkt, kdp);
+
+		alloc_pkt = va_to_kva(alloc_va[i], kdp);
+		alloc_data = pkt_data(alloc_pkt, kdp);
+
+		len = pkt->pkt_len;
+		memcpy(alloc_data, data, len);
+
+		alloc_pkt->pkt_len = len;
+		alloc_pkt->data_len = len;
+
+		kdp->stats.tx_bytes += len;
+		kdp->stats.rx_bytes += len;
+	}
+
+	/* Burst enqueue mbufs into tx_q */
+	ret = kdp_fifo_put(kdp->tx_q, alloc_va, num);
+	if (ret != num)
+		/* Failing should not happen */
+		pr_err("Fail to enqueue mbufs into tx_q\n");
+
+	/* Burst enqueue mbufs into free_q */
+	ret = kdp_fifo_put(kdp->free_q, va, num);
+	if (ret != num)
+		/* Failing should not happen */
+		pr_err("Fail to enqueue mbufs into free_q\n");
+
+	/**
+	 * Update statistics. Enqueue/dequeue failure is not expected,
+	 * as all queues are checked first.
+	 */
+	kdp->stats.tx_packets += num;
+	kdp->stats.rx_packets += num;
+}
+
+/*
+ * RX: loopback with enqueue/dequeue fifos and sk buffer copies.
+ */
+static void kdp_net_rx_lo_fifo_skb(struct kdp_dev *kdp)
+{
+	struct net_device *dev = kdp->net_dev;
+	void *va[MBUF_BURST_SZ];
+	struct rte_kdp_mbuf *pkt;
+	void *data;
+	struct sk_buff *skb;
+	size_t num_rq, num_fq;
+	size_t ret;
+	size_t len;
+	size_t num;
+	u32 i;
+
+	/* Get the number of entries in rx_q */
+	num_rq = kdp_fifo_count(kdp->rx_q);
+
+	/* Get the number of free entries in free_q */
+	num_fq = kdp_fifo_free_count(kdp->free_q);
+
+	/* Calculate the number of entries to dequeue from rx_q */
+	num = min_t(size_t, num_rq, num_fq);
+	num = min_t(size_t, num, MBUF_BURST_SZ);
+
+	/* Return if no entry to dequeue from rx_q */
+	if (num == 0)
+		return;
+
+	/* Burst dequeue mbufs from rx_q */
+	ret = kdp_fifo_get(kdp->rx_q, va, num);
+	if (ret == 0)
+		return;
+
+	num = ret;
+	/* Copy mbufs to sk buffer and then call tx interface */
+	for (i = 0; i < num; i++) {
+		pkt = va_to_kva(va[i], kdp);
+		data = pkt_data(pkt, kdp);
+		len = pkt->data_len;
+
+		skb = dev_alloc_skb(len + 2);
+		if (!skb) {
+			kdp->stats.rx_dropped++;
+			continue;
+		}
+
+		/* Align IP on 16B boundary */
+		skb_reserve(skb, 2);
+		memcpy(skb_put(skb, len), data, len);
+		skb->dev = dev;
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+		kdp->stats.rx_bytes += len;
+		kdp->stats.rx_packets++;
+
+		/* call tx interface */
+		kdp_net_tx(skb, dev);
+	}
+
+	/* enqueue all the mbufs from rx_q into free_q */
+	ret = kdp_fifo_put(kdp->free_q, va, num);
+	if (ret != num)
+		/* Failing should not happen */
+		pr_err("Fail to enqueue mbufs into free_q\n");
+}
+
+/* kdp rx function pointer, with default to normal rx */
+static kdp_net_rx_t kdp_net_rx_func = kdp_net_rx_normal;
+
+static int kdp_thread_single(void *data)
+{
+	struct kdp_dev *kdp;
+	u32 i;
+
+	while (!kthread_should_stop()) {
+		mutex_lock(&kdp_list_lock);
+		for (i = 0; i < KDP_RX_LOOP_NUM; i++)
+			list_for_each_entry(kdp, &kdp_list_head, list)
+				(*kdp_net_rx_func)(kdp);
+		mutex_unlock(&kdp_list_lock);
+
+#ifdef RTE_KDP_PREEMPT_DEFAULT
+		/* reschedule out for a while */
+		schedule_timeout_interruptible(
+			usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL));
+#endif
+	}
+
+	return 0;
+}
+
+static int kdp_thread_multiple(void *param)
+{
+	struct kdp_dev *kdp = param;
+	u32 i;
+
+	while (!kthread_should_stop()) {
+		for (i = 0; i < KDP_RX_LOOP_NUM; i++)
+			(*kdp_net_rx_func)(kdp);
+
+#ifdef RTE_KDP_PREEMPT_DEFAULT
+		schedule_timeout_interruptible(
+			usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL));
+#endif
+	}
+
+	return 0;
+}
+
+static void kdp_setup(struct kdp_dev *kdp, struct rte_kdp_device_info *info)
+{
+	kdp->port_id = info->port_id;
+	kdp->core_id = info->core_id;
+	strncpy(kdp->name, info->name, RTE_KDP_NAMESIZE);
+
+	/* Translate user space info into kernel space info */
+	kdp->tx_q = phys_to_virt(info->tx_phys);
+	kdp->rx_q = phys_to_virt(info->rx_phys);
+	kdp->alloc_q = phys_to_virt(info->alloc_phys);
+	kdp->free_q = phys_to_virt(info->free_phys);
+
+	kdp->mbuf_kva = phys_to_virt(info->mbuf_phys);
+	kdp->mbuf_va = info->mbuf_va;
+	kdp->addr_diff = kdp->mbuf_kva - kdp->mbuf_va;
+
+	kdp->mbuf_size = info->mbuf_size;
+
+	pr_info("tx_phys:      0x%016llx, tx_q addr:      0x%p\n",
+		(unsigned long long) info->tx_phys, kdp->tx_q);
+	pr_info("rx_phys:      0x%016llx, rx_q addr:      0x%p\n",
+		(unsigned long long) info->rx_phys, kdp->rx_q);
+	pr_info("alloc_phys:   0x%016llx, alloc_q addr:   0x%p\n",
+		(unsigned long long) info->alloc_phys, kdp->alloc_q);
+	pr_info("free_phys:    0x%016llx, free_q addr:    0x%p\n",
+		(unsigned long long) info->free_phys, kdp->free_q);
+	pr_info("mbuf_phys:    0x%016llx, mbuf_kva:       0x%p\n",
+		(unsigned long long) info->mbuf_phys, kdp->mbuf_kva);
+	pr_info("mbuf_va:      0x%p\n", info->mbuf_va);
+	pr_info("mbuf_size:    %u\n", kdp->mbuf_size);
+}
+
+static int create_kthread(struct kdp_dev *kdp,
+		struct rte_kdp_device_info *info)
+{
+	/**
+	 * Create a new kernel thread for multiple mode, set its core affinity,
+	 * and finally wake it up.
+	 */
+	if (multiple_kthread) {
+		/**
+		 * Check if the cpu core id is valid for binding,
+		 * for multiple kernel thread mode.
+		 */
+		if (info->force_bind && !cpu_online(kdp->core_id)) {
+			pr_err("cpu %u is not online\n", kdp->core_id);
+			return -EINVAL;
+		}
+
+		kdp->pthread = kthread_create(kdp_thread_multiple,
+				(void *)kdp, "kdp_%s", kdp->name);
+		if (IS_ERR(kdp->pthread))
+			return -ECANCELED;
+
+		if (info->force_bind)
+			kthread_bind(kdp->pthread, kdp->core_id);
+
+		wake_up_process(kdp->pthread);
+
+		return 0;
+	}
+
+	/* single thread */
+	if (kdp_kthread == NULL) {
+		pr_info("Single kernel thread for all KDP devices\n");
+
+		/* Create kernel thread for RX */
+		kdp_kthread = kthread_run(kdp_thread_single, NULL,
+				"kdp_single");
+		if (IS_ERR(kdp_kthread)) {
+			pr_err("Unable to create kernel thread\n");
+			return -ECANCELED;
+		}
+	}
+
+	return 0;
+}
+
+static int kdp_net_newlink(struct net *net, struct net_device *dev,
+		struct nlattr *tb[], struct nlattr *data[])
+{
+	struct rte_kdp_device_info dev_info;
+	struct kdp_dev *kdp = netdev_priv(dev);
+	int ret;
+
+	if (data && data[IFLA_KDP_PORTID])
+		kdp->port_id = nla_get_u8(data[IFLA_KDP_PORTID]);
+	else
+		goto error_free;
+
+	if (data && data[IFLA_KDP_DEVINFO])
+		memcpy(&dev_info, nla_data(data[IFLA_KDP_DEVINFO]),
+				sizeof(struct rte_kdp_device_info));
+	else
+		goto error_free;
+
+	kdp->net_dev = dev;
+	kdp_setup(kdp, &dev_info);
+
+	ret = register_netdevice(dev);
+	if (ret < 0)
+		goto error_free;
+
+	ret = create_kthread(kdp, &dev_info);
+	if (ret < 0)
+		goto error_unregister;
+
+	mutex_lock(&kdp_list_lock);
+	list_add(&kdp->list, &kdp_list_head);
+	mutex_unlock(&kdp_list_lock);
+
+	return 0;
+
+error_unregister:
+	unregister_netdev(dev);
+error_free:
+	free_netdev(dev);
+	return -EINVAL;
+}
+
+static void single_kthread_stop(void)
+{
+	/* Stop kernel thread for single mode */
+	if (!multiple_kthread && kdp_kthread) {
+		kthread_stop(kdp_kthread);
+		kdp_kthread = NULL;
+	}
+}
+
+static void multiple_kthread_stop(struct kdp_dev *kdp)
+{
+	/* Stop kernel thread for multiple mode */
+	if (multiple_kthread && kdp->pthread) {
+		kthread_stop(kdp->pthread);
+		kdp->pthread = NULL;
+	}
+}
+
+static void kdp_kthread_stop_one(struct kdp_dev *kdp)
+{
+	multiple_kthread_stop(kdp);
+
+	mutex_lock(&kdp_list_lock);
+	if (list_empty(&kdp_list_head))
+		single_kthread_stop();
+	mutex_unlock(&kdp_list_lock);
+}
+
+static void kdp_net_dellink(struct net_device *dev, struct list_head *head)
+{
+	struct kdp_dev *kdp = netdev_priv(dev);
+
+	mutex_lock(&kdp_list_lock);
+	list_del(&kdp->list);
+	mutex_unlock(&kdp_list_lock);
+
+	kdp_kthread_stop_one(kdp);
+
+	unregister_netdevice_queue(dev, head);
+}
+
+static struct rtnl_link_ops kdp_link_ops __read_mostly = {
+	.kind = KDP_DEVICE,
+	.priv_size = sizeof(struct kdp_dev),
+	.setup = kdp_net_setup,
+	.maxtype = IFLA_KDP_MAX,
+	.newlink = kdp_net_newlink,
+	.dellink = kdp_net_dellink,
+};
+
+static void __init kdp_net_config_lo_mode(char *lo_str)
+{
+	if (!lo_str)
+		return;
+
+	if (!strcmp(lo_str, "fifo")) {
+		pr_info("loopback mode fifo enabled\n");
+		kdp_net_rx_func = kdp_net_rx_lo_fifo;
+	} else if (!strcmp(lo_str, "fifo_skb")) {
+		pr_info("loopback mode fifo_skb enabled\n");
+		kdp_net_rx_func = kdp_net_rx_lo_fifo_skb;
+	} else
+		pr_info("Unrecognized parameter, loopback disabled\n");
+}
+
+static int __init kdp_init(void)
+{
+	/* Configure the loopback mode according to the input parameter */
+	kdp_net_config_lo_mode(lo_mode);
+
+	mutex_init(&kdp_list_lock);
+	INIT_LIST_HEAD(&kdp_list_head);
+
+	return rtnl_link_register(&kdp_link_ops);
+}
+module_init(kdp_init);
+
+static void __exit kdp_exit(void)
+{
+	rtnl_link_unregister(&kdp_link_ops);
+}
+module_exit(kdp_exit);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Kernel Module for managing kdp devices");
-- 
2.5.0


* [PATCH v3 2/2] kdp: add virtual PMD for kernel slow data path communication
  2016-03-09 11:17   ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2016-03-09 11:17     ` [PATCH v3 1/2] kdp: add kernel data path kernel module Ferruh Yigit
@ 2016-03-09 11:17     ` Ferruh Yigit
  2016-03-14 15:32     ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-03-09 11:17 UTC (permalink / raw)
  To: dev

This patch provides slow data path communication with the Linux kernel.
It is based on librte_kni and heavily reuses its code.

The main difference is that the librte_kni library is converted into a
PMD, to make it easier for applications to use.

Any application can now use slow path communication without being
modified, because the EAL already supports virtual PMDs.

The PMD supports two methods to exchange packets with Linux. The first
is a custom FIFO implementation that relies on the KDP kernel module; the
second is the in-kernel tun/tap support. The PMD first checks for the KDP
kernel module; if that fails, it tries to create and use a tap interface.

With the FIFO method, the PMD's rx_pkt_burst() gets packets from the FIFO
and tx_pkt_burst() puts packets into the FIFO.
The corresponding Linux virtual network device driver code
also gets/puts packets from/to the FIFO, as if they were coming from hardware.
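
As an illustration of the FIFO method only, the sketch below shows a
single-producer/single-consumer pointer ring of the kind such a design
can use. All names (demo_fifo, DEMO_FIFO_LEN, ...) are hypothetical and
are not taken from rte_kdp_fifo.h. The sketch also omits the memory
barriers that a ring shared between userspace and the kernel would need.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FIFO_LEN 8u	/* hypothetical size; must be a power of two */

struct demo_fifo {
	volatile unsigned write;	/* next slot the producer fills */
	volatile unsigned read;		/* next slot the consumer drains */
	void *buffer[DEMO_FIFO_LEN];
};

static void demo_fifo_init(struct demo_fifo *f)
{
	/* Index wrap-around below relies on a power-of-two length. */
	assert((DEMO_FIFO_LEN & (DEMO_FIFO_LEN - 1)) == 0);
	f->write = 0;
	f->read = 0;
}

/* Enqueue up to num pointers; returns how many were actually stored. */
static unsigned demo_fifo_put(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned w = f->write;
	unsigned r = f->read;

	for (i = 0; i < num; i++) {
		unsigned next = (w + 1) & (DEMO_FIFO_LEN - 1);

		if (next == r)
			break;	/* full: one slot is always left empty */
		f->buffer[w] = data[i];
		w = next;
	}
	f->write = w;
	return i;
}

/* Dequeue up to num pointers; returns how many were actually fetched. */
static unsigned demo_fifo_get(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned r = f->read;
	unsigned w = f->write;

	for (i = 0; i < num; i++) {
		if (r == w)
			break;	/* empty */
		data[i] = f->buffer[r];
		r = (r + 1) & (DEMO_FIFO_LEN - 1);
	}
	f->read = r;
	return i;
}

/* Number of entries currently queued. */
static unsigned demo_fifo_count(const struct demo_fifo *f)
{
	return (DEMO_FIFO_LEN + f->write - f->read) & (DEMO_FIFO_LEN - 1);
}
```

In this sketch each index has a single writer, so one side can produce
while the other consumes without a lock. The cost is that a ring of
DEMO_FIFO_LEN slots holds at most DEMO_FIFO_LEN - 1 entries.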

With the tun/tap method, no external kernel module is required; the PMD
reads packets from and writes packets to the tap interface file descriptor.
The tap interface has a performance penalty compared to the FIFO implementation.
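
For reference, a tap file descriptor of this kind is usually obtained
through the standard Linux /dev/net/tun interface. The following is a
minimal sketch under that assumption; the helper names demo_tap_fill_ifreq
and demo_tap_open are hypothetical and are not the code in rte_kdp_tap.c.
Opening the device normally requires CAP_NET_ADMIN.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Prepare the request: Ethernet-level (tap) device, no packet-info header. */
static void demo_tap_fill_ifreq(struct ifreq *ifr, const char *name)
{
	assert(name != NULL);
	memset(ifr, 0, sizeof(*ifr));
	ifr->ifr_flags = IFF_TAP | IFF_NO_PI;
	/* memset above guarantees the name stays NUL-terminated. */
	strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}

/* Returns a file descriptor for the tap interface, or -1 on failure. */
static int demo_tap_open(const char *name)
{
	struct ifreq ifr;
	int fd;

	fd = open("/dev/net/tun", O_RDWR);
	if (fd < 0)
		return -1;

	demo_tap_fill_ifreq(&ifr, name);
	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		close(fd);
		return -1;
	}

	/* Each read() returns one frame; each write() sends one frame. */
	return fd;
}
```

With IFF_NO_PI set, every read() or write() on the descriptor carries
exactly one raw Ethernet frame, which is why a PMD can copy frames
directly between the descriptor and an mbuf data area.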

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---

v3:
* No update

v2:
* Use rtnetlink to create interfaces
---
 MAINTAINERS                             |   1 +
 config/common_base                      |   1 +
 config/common_linuxapp                  |   1 +
 doc/guides/nics/pcap_ring.rst           | 125 ++++++-
 doc/guides/rel_notes/release_16_04.rst  |   5 +
 drivers/net/Makefile                    |   3 +-
 drivers/net/kdp/Makefile                |  61 +++
 drivers/net/kdp/rte_eth_kdp.c           | 501 +++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.c               | 633 ++++++++++++++++++++++++++++++++
 drivers/net/kdp/rte_kdp.h               | 116 ++++++
 drivers/net/kdp/rte_kdp_fifo.h          |  91 +++++
 drivers/net/kdp/rte_kdp_tap.c           | 101 +++++
 drivers/net/kdp/rte_pmd_kdp_version.map |   4 +
 lib/librte_eal/common/include/rte_log.h |   3 +-
 mk/rte.app.mk                           |   3 +-
 15 files changed, 1643 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/kdp/Makefile
 create mode 100644 drivers/net/kdp/rte_eth_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.c
 create mode 100644 drivers/net/kdp/rte_kdp.h
 create mode 100644 drivers/net/kdp/rte_kdp_fifo.h
 create mode 100644 drivers/net/kdp/rte_kdp_tap.c
 create mode 100644 drivers/net/kdp/rte_pmd_kdp_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index edcc4cc..2174bac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -261,6 +261,7 @@ F: doc/guides/sample_app_ug/kernel_nic_interface.rst
 Linux KDP
 M: Ferruh Yigit <ferruh.yigit@gmail.com>
 F: lib/librte_eal/linuxapp/kdp/
+F: drivers/net/kdp/
 
 Linux AF_PACKET
 M: John W. Linville <linville@tuxdriver.com>
diff --git a/config/common_base b/config/common_base
index 973baff..767f391 100644
--- a/config/common_base
+++ b/config/common_base
@@ -306,6 +306,7 @@ CONFIG_RTE_LIBRTE_PMD_NULL=y
 #
 CONFIG_RTE_KDP_KMOD=n
 CONFIG_RTE_KDP_PREEMPT_DEFAULT=y
+CONFIG_RTE_LIBRTE_PMD_KDP=n
 
 #
 # Do prefetch of packet data within PMD driver receive function
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 569a0fe..fd25a38 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -40,6 +40,7 @@ CONFIG_RTE_EAL_VFIO=y
 CONFIG_RTE_KNI_KMOD=y
 CONFIG_RTE_LIBRTE_KNI=y
 CONFIG_RTE_KDP_KMOD=y
+CONFIG_RTE_LIBRTE_PMD_KDP=y
 CONFIG_RTE_LIBRTE_VHOST=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
 CONFIG_RTE_LIBRTE_POWER=y
diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst
index aa48d33..b602e65 100644
--- a/doc/guides/nics/pcap_ring.rst
+++ b/doc/guides/nics/pcap_ring.rst
@@ -28,11 +28,11 @@
     (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-Libpcap and Ring Based Poll Mode Drivers
-========================================
+Software Poll Mode Drivers
+==========================
 
 In addition to Poll Mode Drivers (PMDs) for physical and virtual hardware,
-the DPDK also includes two pure-software PMDs. These two drivers are:
+the DPDK also includes pure-software PMDs. These drivers are:
 
 *   A libpcap -based PMD (librte_pmd_pcap) that reads and writes packets using libpcap,
     - both from files on disk, as well as from physical NIC devices using standard Linux kernel drivers.
@@ -40,6 +40,10 @@ the DPDK also includes two pure-software PMDs. These two drivers are:
 *   A ring-based PMD (librte_pmd_ring) that allows a set of software FIFOs (that is, rte_ring)
     to be accessed using the PMD APIs, as though they were physical NICs.
 
+*   A slow data path PMD (librte_pmd_kdp) that allows sending/receiving packets to/from the OS network
+    stack as if it were a physical NIC.
+
+
 .. note::
 
     The libpcap -based PMD is disabled by default in the build configuration files,
@@ -211,6 +215,121 @@ Multiple devices may be specified, separated by commas.
     Done.
 
 
+Kernel Data Path PMD
+~~~~~~~~~~~~~~~~~~~~
+
+The Kernel Data Path (KDP) PMD lets an application communicate easily with the OS network stack.
+
+.. code-block:: console
+
+        ./testpmd --vdev eth_kdp0 --vdev eth_kdp1 -- -i
+        ...
+        Configuring Port 0 (socket 0)
+        Port 0: 00:00:00:00:00:00
+        Configuring Port 1 (socket 0)
+        Port 1: 00:00:00:00:00:00
+        Checking link statuses...
+        Port 0 Link Up - speed 10000 Mbps - full-duplex
+        Port 1 Link Up - speed 10000 Mbps - full-duplex
+        Done
+
+The KDP PMD supports two types of communication:
+
+* Custom FIFO implementation
+* tun/tap implementation
+
+The custom FIFO implementation performs better but requires the KDP kernel module (rte_kdp.ko) to be inserted.
+
+By default, FIFO communication has priority; if the KDP kernel module is not inserted, tun/tap communication is used.
+
+If the KDP kernel module is inserted, the above testpmd command creates the following virtual interfaces, which can be used like any other interface.
+
+.. code-block:: console
+
+        # ifconfig kdp0; ifconfig kdp1
+        kdp0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+                ether 00:00:00:00:00:00  txqueuelen 1000  (Ethernet)
+                RX packets 0  bytes 0 (0.0 B)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 0  bytes 0 (0.0 B)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+        kdp1: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+                ether 00:00:00:00:00:00  txqueuelen 1000  (Ethernet)
+                RX packets 0  bytes 0 (0.0 B)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 0  bytes 0 (0.0 B)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+
+With the tun/tap communication method, the following interfaces are created:
+
+.. code-block:: console
+
+        # ifconfig tap_kdp0; ifconfig tap_kdp1
+        tap_kdp0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
+                inet6 fe80::341f:afff:feb7:23db  prefixlen 64  scopeid 0x20<link>
+                ether 36:1f:af:b7:23:db  txqueuelen 500  (Ethernet)
+                RX packets 126624864  bytes 6184828655 (5.7 GiB)
+                RX errors 0  dropped 0  overruns 0  frame 0
+                TX packets 126236898  bytes 6150306636 (5.7 GiB)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+        tap_kdp1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
+                inet6 fe80::f030:b4ff:fe94:b720  prefixlen 64  scopeid 0x20<link>
+                ether f2:30:b4:94:b7:20  txqueuelen 500  (Ethernet)
+                RX packets 126237370  bytes 6150329717 (5.7 GiB)
+                RX errors 0  dropped 9  overruns 0  frame 0
+                TX packets 126624896  bytes 6184826874 (5.7 GiB)
+                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+A DPDK application can be used to forward packets between these interfaces:
+
+.. code-block:: console
+
+        In Linux:
+        ip l add br0 type bridge
+        ip l set tap_kdp0 master br0
+        ip l set tap_kdp1 master br0
+        ip l set br0 up
+        ip l set tap_kdp0 up
+        ip l set tap_kdp1 up
+
+
+        In testpmd:
+        testpmd> start
+          io packet forwarding - CRC stripping disabled - packets/burst=32
+          nb forwarding cores=1 - nb forwarding ports=2
+          RX queues=1 - RX desc=128 - RX free threshold=0
+          RX threshold registers: pthresh=0 hthresh=0 wthresh=0
+          TX queues=1 - TX desc=512 - TX free threshold=0
+          TX threshold registers: pthresh=0 hthresh=0 wthresh=0
+          TX RS bit threshold=0 - TXQ flags=0x0
+        testpmd> stop
+        Telling cores to stop...
+        Waiting for lcores to finish...
+
+          ---------------------- Forward statistics for port 0  ----------------------
+          RX-packets: 973900         RX-dropped: 0             RX-total: 973900
+          TX-packets: 973903         TX-dropped: 0             TX-total: 973903
+          ----------------------------------------------------------------------------
+
+          ---------------------- Forward statistics for port 1  ----------------------
+          RX-packets: 973903         RX-dropped: 0             RX-total: 973903
+          TX-packets: 973900         TX-dropped: 0             TX-total: 973900
+          ----------------------------------------------------------------------------
+
+          +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
+          RX-packets: 1947803        RX-dropped: 0             RX-total: 1947803
+          TX-packets: 1947803        TX-dropped: 0             TX-total: 1947803
+          ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+        Done.
+
+
+
+
+
 Using the Poll Mode Driver from an Application
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 96f144e..7f6b3aa 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -63,6 +63,11 @@ This section should contain new features added in this release. Sample format:
   space bytes, to boost the performance. In the meanwhile, it deprecated the
   legacy way via reading/writing sysfile supported by kernel module igb_uio.
 
+* **Added Slow Data Path support.**
+
+  * This is based on the KNI work and is intended to replace it in the long term.
+  * Added Kernel Data Path (KDP) kernel module.
+  * Added KDP virtual PMD.
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 0c3393f..78f923a 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -51,5 +51,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += szedata2
 DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio
 DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += kdp
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/kdp/Makefile b/drivers/net/kdp/Makefile
new file mode 100644
index 0000000..035056e
--- /dev/null
+++ b/drivers/net/kdp/Makefile
@@ -0,0 +1,61 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_kdp.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_kdp_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_eth_kdp.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_kdp.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += rte_kdp_tap.c
+
+#
+# Export include files
+#
+SYMLINK-y-include +=
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KDP) += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/kdp/rte_eth_kdp.c b/drivers/net/kdp/rte_eth_kdp.c
new file mode 100644
index 0000000..68dd734
--- /dev/null
+++ b/drivers/net/kdp/rte_eth_kdp.c
@@ -0,0 +1,501 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ethdev.h>
+
+#include "rte_kdp.h"
+
+#define MAX_PACKET_SZ 2048
+
+struct pmd_queue_stats {
+	uint64_t pkts;
+	uint64_t bytes;
+	uint64_t err_pkts;
+};
+
+struct pmd_queue {
+	struct pmd_internals *internals;
+	struct rte_mempool *mb_pool;
+
+	struct pmd_queue_stats rx;
+	struct pmd_queue_stats tx;
+};
+
+struct pmd_internals {
+	struct kdp_data *kdp;
+	struct kdp_tap_data *kdp_tap;
+
+	struct pmd_queue rx_queues[RTE_MAX_QUEUES_PER_PORT];
+	struct pmd_queue tx_queues[RTE_MAX_QUEUES_PER_PORT];
+};
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+static const char *drivername = "KDP PMD";
+static struct rte_eth_link pmd_link = {
+		.link_speed = 10000,
+		.link_duplex = ETH_LINK_FULL_DUPLEX,
+		.link_status = 0
+};
+
+static uint16_t
+eth_kdp_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct kdp_data *kdp = kdp_q->internals->kdp;
+	uint16_t nb_pkts;
+
+	nb_pkts = kdp_rx_burst(kdp, bufs, nb_bufs);
+
+	kdp_q->rx.pkts += nb_pkts;
+	kdp_q->rx.err_pkts += nb_bufs - nb_pkts;
+
+	return nb_pkts;
+}
+
+static uint16_t
+eth_kdp_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct kdp_data *kdp = kdp_q->internals->kdp;
+	uint16_t nb_pkts;
+
+	nb_pkts = kdp_tx_burst(kdp, bufs, nb_bufs);
+
+	kdp_q->tx.pkts += nb_pkts;
+	kdp_q->tx.err_pkts += nb_bufs - nb_pkts;
+
+	return nb_pkts;
+}
+
+static uint16_t
+eth_kdp_tap_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	struct kdp_tap_data *kdp_tap = internals->kdp_tap;
+	struct rte_mbuf *m;
+	int ret;
+	unsigned i;
+
+	for (i = 0; i < nb_bufs; i++) {
+		m = rte_pktmbuf_alloc(kdp_q->mb_pool);
+		bufs[i] = m;
+		ret = read(kdp_tap->tap_fd, rte_pktmbuf_mtod(m, void *),
+				MAX_PACKET_SZ);
+		if (ret < 0) {
+			rte_pktmbuf_free(m);
+			break;
+		}
+
+		m->nb_segs = 1;
+		m->next = NULL;
+		m->pkt_len = (uint16_t)ret;
+		m->data_len = (uint16_t)ret;
+	}
+
+	kdp_q->rx.pkts += i;
+	kdp_q->rx.err_pkts += nb_bufs - i;
+
+	return i;
+}
+
+static uint16_t
+eth_kdp_tap_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
+{
+	struct pmd_queue *kdp_q = q;
+	struct pmd_internals *internals = kdp_q->internals;
+	struct kdp_tap_data *kdp_tap = internals->kdp_tap;
+	struct rte_mbuf *m;
+	unsigned i;
+
+	for (i = 0; i < nb_bufs; i++) {
+		m = bufs[i];
+		write(kdp_tap->tap_fd, rte_pktmbuf_mtod(m, void*),
+				rte_pktmbuf_data_len(m));
+		rte_pktmbuf_free(m);
+	}
+
+	kdp_q->tx.pkts += i;
+	kdp_q->tx.err_pkts += nb_bufs - i;
+
+	return i;
+}
+
+static int
+eth_kdp_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct kdp_conf conf;
+	uint16_t port_id = dev->data->port_id;
+	int ret = 0;
+
+	snprintf(conf.name, RTE_KDP_NAMESIZE, KDP_DEVICE "%u",
+			port_id);
+	conf.force_bind = 0;
+	conf.port_id = port_id;
+	conf.mbuf_size = MAX_PACKET_SZ;
+
+	ret = kdp_start(internals->kdp,
+			internals->rx_queues[0].mb_pool,
+			&conf);
+	if (ret)
+		RTE_LOG(ERR, KDP, "Fail to create kdp for port: %d\n",
+				port_id);
+
+	return ret;
+}
+
+static int
+eth_kdp_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	int ret;
+
+	if (internals->kdp) {
+		ret = eth_kdp_start(dev);
+		if (ret)
+			return -1;
+	}
+
+	dev->data->dev_link.link_status = 1;
+	return 0;
+}
+
+static void
+eth_kdp_dev_stop(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+
+	if (internals->kdp)
+		kdp_stop(internals->kdp);
+
+	dev->data->dev_link.link_status = 0;
+}
+
+static void
+eth_kdp_dev_close(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct kdp_data *kdp = internals->kdp;
+	struct kdp_tap_data *kdp_tap = internals->kdp_tap;
+
+	if (kdp) {
+		kdp_close(kdp);
+
+		rte_free(kdp);
+		internals->kdp = NULL;
+	}
+
+	if (kdp_tap) {
+		kdp_tap_close(kdp_tap);
+
+		rte_free(kdp_tap);
+		internals->kdp_tap = NULL;
+	}
+
+	rte_free(dev->data->dev_private);
+	dev->data->dev_private = NULL;
+}
+
+static int
+eth_kdp_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	return 0;
+}
+
+static void
+eth_kdp_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+	struct rte_eth_dev_data *data = dev->data;
+
+	dev_info->driver_name = data->drv_name;
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)-1;
+	dev_info->max_rx_queues = data->nb_rx_queues;
+	dev_info->max_tx_queues = data->nb_tx_queues;
+	dev_info->min_rx_bufsize = 0;
+	dev_info->pci_dev = NULL;
+}
+
+static int
+eth_kdp_rx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t rx_queue_id __rte_unused,
+		uint16_t nb_rx_desc __rte_unused,
+		unsigned int socket_id __rte_unused,
+		const struct rte_eth_rxconf *rx_conf __rte_unused,
+		struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct pmd_queue *q;
+
+	q = &internals->rx_queues[rx_queue_id];
+	q->internals = internals;
+	q->mb_pool = mb_pool;
+
+	dev->data->rx_queues[rx_queue_id] = q;
+
+	return 0;
+}
+
+static int
+eth_kdp_tx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t tx_queue_id,
+		uint16_t nb_tx_desc __rte_unused,
+		unsigned int socket_id __rte_unused,
+		const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct pmd_queue *q;
+
+	q = &internals->tx_queues[tx_queue_id];
+	q->internals = internals;
+
+	dev->data->tx_queues[tx_queue_id] = q;
+
+	return 0;
+}
+
+static void
+eth_kdp_queue_release(void *q __rte_unused)
+{
+}
+
+static int
+eth_kdp_link_update(struct rte_eth_dev *dev __rte_unused,
+		int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static void
+eth_kdp_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	unsigned i, num_stats;
+	unsigned long rx_packets_total = 0, rx_bytes_total = 0;
+	unsigned long tx_packets_total = 0, tx_bytes_total = 0;
+	unsigned long tx_packets_err_total = 0;
+	struct rte_eth_dev_data *data = dev->data;
+	struct pmd_queue *q;
+
+	num_stats = RTE_MIN((unsigned)RTE_ETHDEV_QUEUE_STAT_CNTRS,
+			data->nb_rx_queues);
+	for (i = 0; i < num_stats; i++) {
+		q = data->rx_queues[i];
+		stats->q_ipackets[i] = q->rx.pkts;
+		stats->q_ibytes[i] = q->rx.bytes;
+		rx_packets_total += stats->q_ipackets[i];
+		rx_bytes_total += stats->q_ibytes[i];
+	}
+
+	num_stats = RTE_MIN((unsigned)RTE_ETHDEV_QUEUE_STAT_CNTRS,
+			data->nb_tx_queues);
+	for (i = 0; i < num_stats; i++) {
+		q = data->tx_queues[i];
+		stats->q_opackets[i] = q->tx.pkts;
+		stats->q_obytes[i] = q->tx.bytes;
+		stats->q_errors[i] = q->tx.err_pkts;
+		tx_packets_total += stats->q_opackets[i];
+		tx_bytes_total += stats->q_obytes[i];
+		tx_packets_err_total += stats->q_errors[i];
+	}
+
+	stats->ipackets = rx_packets_total;
+	stats->ibytes = rx_bytes_total;
+	stats->opackets = tx_packets_total;
+	stats->obytes = tx_bytes_total;
+	stats->oerrors = tx_packets_err_total;
+}
+
+static void
+eth_kdp_stats_reset(struct rte_eth_dev *dev)
+{
+	unsigned i;
+	struct rte_eth_dev_data *data = dev->data;
+	struct pmd_queue *q;
+
+	for (i = 0; i < data->nb_rx_queues; i++) {
+		q = data->rx_queues[i];
+		q->rx.pkts = 0;
+		q->rx.bytes = 0;
+	}
+	for (i = 0; i < data->nb_tx_queues; i++) {
+		q = data->tx_queues[i];
+		q->tx.pkts = 0;
+		q->tx.bytes = 0;
+		q->tx.err_pkts = 0;
+	}
+}
+
+static const struct eth_dev_ops eth_kdp_ops = {
+	.dev_start = eth_kdp_dev_start,
+	.dev_stop = eth_kdp_dev_stop,
+	.dev_close = eth_kdp_dev_close,
+	.dev_configure = eth_kdp_dev_configure,
+	.dev_infos_get = eth_kdp_dev_info,
+	.rx_queue_setup = eth_kdp_rx_queue_setup,
+	.tx_queue_setup = eth_kdp_tx_queue_setup,
+	.rx_queue_release = eth_kdp_queue_release,
+	.tx_queue_release = eth_kdp_queue_release,
+	.link_update = eth_kdp_link_update,
+	.stats_get = eth_kdp_stats_get,
+	.stats_reset = eth_kdp_stats_reset,
+};
+
+static struct rte_eth_dev *
+eth_kdp_create(const char *name, unsigned numa_node)
+{
+	uint16_t nb_rx_queues = 1;
+	uint16_t nb_tx_queues = 1;
+	struct rte_eth_dev_data *data = NULL;
+	struct pmd_internals *internals = NULL;
+	struct rte_eth_dev *eth_dev = NULL;
+
+	RTE_LOG(INFO, PMD, "Creating kdp ethdev on numa socket %u\n",
+			numa_node);
+
+	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
+	if (data == NULL)
+		goto error;
+
+	internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
+	if (internals == NULL)
+		goto error;
+
+	/* reserve an ethdev entry */
+	eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
+	if (eth_dev == NULL)
+		goto error;
+
+	data->dev_private = internals;
+	data->port_id = eth_dev->data->port_id;
+	memmove(data->name, eth_dev->data->name, sizeof(data->name));
+	data->nb_rx_queues = nb_rx_queues;
+	data->nb_tx_queues = nb_tx_queues;
+	data->dev_link = pmd_link;
+	data->mac_addrs = &eth_addr;
+
+	eth_dev->data = data;
+	eth_dev->dev_ops = &eth_kdp_ops;
+	eth_dev->driver = NULL;
+
+	data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+	data->kdrv = RTE_KDRV_NONE;
+	data->drv_name = drivername;
+	data->numa_node = numa_node;
+
+	return eth_dev;
+
+error:
+	rte_free(data);
+	rte_free(internals);
+
+	return NULL;
+}
+
+static int
+eth_kdp_devinit(const char *name, const char *params __rte_unused)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+	struct pmd_internals *internals;
+	struct kdp_data *kdp;
+	struct kdp_tap_data *kdp_tap = NULL;
+	uint16_t port_id;
+
+	RTE_LOG(INFO, PMD, "Initializing eth_kdp for %s\n", name);
+
+	eth_dev = eth_kdp_create(name, rte_socket_id());
+	if (eth_dev == NULL)
+		return -1;
+
+	internals = eth_dev->data->dev_private;
+	port_id = eth_dev->data->port_id;
+
+	kdp = kdp_init(port_id);
+	if (kdp == NULL)
+		kdp_tap = kdp_tap_init(port_id);
+
+	if (kdp == NULL && kdp_tap == NULL) {
+		rte_eth_dev_release_port(eth_dev);
+		rte_free(internals);
+
+		/* Do not return an error, to avoid a panic in rte_eal_init() */
+		return 0;
+	}
+
+	internals->kdp = kdp;
+	internals->kdp_tap = kdp_tap;
+
+	if (kdp == NULL) {
+		eth_dev->rx_pkt_burst = eth_kdp_tap_rx;
+		eth_dev->tx_pkt_burst = eth_kdp_tap_tx;
+	} else {
+		eth_dev->rx_pkt_burst = eth_kdp_rx;
+		eth_dev->tx_pkt_burst = eth_kdp_tx;
+	}
+
+	return 0;
+}
+
+static int
+eth_kdp_devuninit(const char *name)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+
+	RTE_LOG(INFO, PMD, "Un-Initializing eth_kdp for %s\n", name);
+
+	/* find the ethdev entry */
+	eth_dev = rte_eth_dev_allocated(name);
+	if (eth_dev == NULL)
+		return -1;
+
+	eth_kdp_dev_stop(eth_dev);
+
+	if (eth_dev->data)
+		rte_free(eth_dev->data->dev_private);
+	rte_free(eth_dev->data);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	kdp_uninit();
+
+	return 0;
+}
+
+static struct rte_driver eth_kdp_drv = {
+	.name = "eth_kdp",
+	.type = PMD_VDEV,
+	.init = eth_kdp_devinit,
+	.uninit = eth_kdp_devuninit,
+};
+
+PMD_REGISTER_DRIVER(eth_kdp_drv);
diff --git a/drivers/net/kdp/rte_kdp.c b/drivers/net/kdp/rte_kdp.c
new file mode 100644
index 0000000..ed50a0f
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp.c
@@ -0,0 +1,633 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_EXEC_ENV_LINUXAPP
+#error "KDP is not supported"
+#endif
+
+#include <sys/socket.h>
+#include <linux/netlink.h>
+#include <linux/rtnetlink.h>
+
+#include <rte_spinlock.h>
+#include <rte_ethdev.h>
+#include <rte_memzone.h>
+
+#include "rte_kdp.h"
+#include "rte_kdp_fifo.h"
+
+#define KDP_MODULE_NAME "rte_kdp"
+#define MAX_MBUF_BURST_NUM     32
+
+/* Maximum number of ring entries */
+#define KDP_FIFO_COUNT_MAX     1024
+#define KDP_FIFO_SIZE          (KDP_FIFO_COUNT_MAX * sizeof(void *) + \
+					sizeof(struct rte_kdp_fifo))
+
+#define BUFSZ 1024
+struct kdp_request {
+	struct nlmsghdr nlmsg;
+	char buf[BUFSZ];
+};
+
+static int kdp_fd = -1;
+static int kdp_ref_count;
+
+static const struct rte_memzone *
+kdp_memzone_reserve(const char *name, size_t len, int socket_id,
+		unsigned flags)
+{
+	const struct rte_memzone *mz = rte_memzone_lookup(name);
+
+	if (mz == NULL)
+		mz = rte_memzone_reserve(name, len, socket_id, flags);
+
+	return mz;
+}
+
+static int
+kdp_slot_init(struct kdp_memzone_slot *slot)
+{
+#define OBJNAMSIZ 32
+	char obj_name[OBJNAMSIZ];
+	const struct rte_memzone *mz;
+
+	/* TX RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_tx_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_tx_q = mz;
+
+	/* RX RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_rx_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_rx_q = mz;
+
+	/* ALLOC RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_alloc_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_alloc_q = mz;
+
+	/* FREE RING */
+	snprintf(obj_name, OBJNAMSIZ, "kdp_free_%d", slot->id);
+	mz = kdp_memzone_reserve(obj_name, KDP_FIFO_SIZE, SOCKET_ID_ANY, 0);
+	if (mz == NULL)
+		goto kdp_fail;
+	slot->m_free_q = mz;
+
+	return 0;
+
+kdp_fail:
+	return -1;
+}
+
+static void
+kdp_ring_init(struct kdp_data *kdp)
+{
+	struct kdp_memzone_slot *slot = kdp->slot;
+	const struct rte_memzone *mz;
+
+	/* TX RING */
+	mz = slot->m_tx_q;
+	kdp->tx_q = mz->addr;
+	kdp_fifo_init(kdp->tx_q, KDP_FIFO_COUNT_MAX);
+
+	/* RX RING */
+	mz = slot->m_rx_q;
+	kdp->rx_q = mz->addr;
+	kdp_fifo_init(kdp->rx_q, KDP_FIFO_COUNT_MAX);
+
+	/* ALLOC RING */
+	mz = slot->m_alloc_q;
+	kdp->alloc_q = mz->addr;
+	kdp_fifo_init(kdp->alloc_q, KDP_FIFO_COUNT_MAX);
+
+	/* FREE RING */
+	mz = slot->m_free_q;
+	kdp->free_q = mz->addr;
+	kdp_fifo_init(kdp->free_q, KDP_FIFO_COUNT_MAX);
+}
+
+static int
+kdp_module_check(void)
+{
+	int fd;
+
+	fd = open("/sys/module/" KDP_MODULE_NAME "/initstate", O_RDONLY);
+	if (fd < 0)
+		return -1;
+	close(fd);
+
+	return 0;
+}
+
+static int
+rtnl_socket_open(void)
+{
+	struct sockaddr_nl src;
+	int ret;
+
+	/* Check FD and open */
+	if (kdp_fd < 0) {
+		kdp_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
+		if (kdp_fd < 0) {
+			RTE_LOG(ERR, KDP, "socket for create failed.\n");
+			return -1;
+		}
+
+		memset(&src, 0, sizeof(struct sockaddr_nl));
+
+		src.nl_family = AF_NETLINK;
+		src.nl_pid = getpid();
+
+		ret = bind(kdp_fd, (struct sockaddr *)&src,
+				sizeof(struct sockaddr_nl));
+		if (ret < 0) {
+			RTE_LOG(ERR, KDP, "Bind for create failed.\n");
+			close(kdp_fd);
+			kdp_fd = -1;
+			return -1;
+		}
+	}
+
+	kdp_ref_count++;
+
+	return 0;
+}
+
+static void
+kdp_ref_put(void)
+{
+	/* not initialized? */
+	if (!kdp_ref_count)
+		return;
+
+	kdp_ref_count--;
+
+	/* not last one? */
+	if (kdp_ref_count)
+		return;
+
+	if (kdp_fd < 0)
+		return;
+
+	close(kdp_fd);
+	kdp_fd = -1;
+}
+
+struct kdp_data *
+kdp_init(uint16_t port_id)
+{
+	struct kdp_memzone_slot *slot = NULL;
+	struct kdp_data *kdp = NULL;
+	int ret;
+
+	ret = kdp_module_check();
+	if (ret)
+		return NULL;
+
+	ret = rtnl_socket_open();
+	if (ret)
+		return NULL;
+
+	slot = rte_malloc(NULL, sizeof(struct kdp_memzone_slot), 0);
+	if (slot == NULL)
+		goto kdp_fail;
+	slot->id = port_id;
+
+	kdp = rte_malloc(NULL, sizeof(struct kdp_data), 0);
+	if (kdp == NULL)
+		goto kdp_fail;
+	kdp->slot = slot;
+
+	ret = kdp_slot_init(slot);
+	if (ret < 0)
+		goto kdp_fail;
+
+	kdp_ring_init(kdp);
+
+	return kdp;
+
+kdp_fail:
+	kdp_ref_put();
+	rte_free(slot);
+	rte_free(kdp);
+	RTE_LOG(ERR, KDP, "Unable to allocate memory\n");
+	return NULL;
+}
+
+static void
+kdp_mbufs_allocate(struct kdp_data *kdp)
+{
+	int i, ret;
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
+			 offsetof(struct rte_kdp_mbuf, pool));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_addr) !=
+			 offsetof(struct rte_kdp_mbuf, buf_addr));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, next) !=
+			 offsetof(struct rte_kdp_mbuf, next));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) !=
+			 offsetof(struct rte_kdp_mbuf, data_off));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) !=
+			 offsetof(struct rte_kdp_mbuf, data_len));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pkt_len) !=
+			 offsetof(struct rte_kdp_mbuf, pkt_len));
+	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, ol_flags) !=
+			 offsetof(struct rte_kdp_mbuf, ol_flags));
+
+	/* Check if pktmbuf pool has been configured */
+	if (kdp->pktmbuf_pool == NULL) {
+		RTE_LOG(ERR, KDP, "No valid mempool for allocating mbufs\n");
+		return;
+	}
+
+	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
+		pkts[i] = rte_pktmbuf_alloc(kdp->pktmbuf_pool);
+		if (unlikely(pkts[i] == NULL)) {
+			/* Out of memory */
+			RTE_LOG(ERR, KDP, "Out of memory\n");
+			break;
+		}
+	}
+
+	/* No pkt mbuf allocated */
+	if (i <= 0)
+		return;
+
+	ret = kdp_fifo_put(kdp->alloc_q, (void **)pkts, i);
+
+	/* Free any mbufs that could not be put into alloc_q */
+	if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
+		int j;
+
+		for (j = ret; j < i; j++)
+			rte_pktmbuf_free(pkts[j]);
+	}
+}
+
+static int
+attr_add(struct kdp_request *req, unsigned short type, void *buf, size_t len)
+{
+	struct rtattr *rta;
+	int nlmsg_len;
+
+	nlmsg_len = NLMSG_ALIGN(req->nlmsg.nlmsg_len);
+	rta = (struct rtattr *)((char *)&req->nlmsg + nlmsg_len);
+	if (nlmsg_len + RTA_LENGTH(len) > sizeof(struct kdp_request))
+		return -1;
+	rta->rta_type = type;
+	rta->rta_len = RTA_LENGTH(len);
+	memcpy(RTA_DATA(rta), buf, len);
+	req->nlmsg.nlmsg_len = nlmsg_len + RTA_LENGTH(len);
+
+	return 0;
+}
+
+static struct
+rtattr *attr_nested_add(struct kdp_request *req, unsigned short type)
+{
+	struct rtattr *rta;
+	int nlmsg_len;
+
+	nlmsg_len = NLMSG_ALIGN(req->nlmsg.nlmsg_len);
+	rta = (struct rtattr *)((char *)&req->nlmsg + nlmsg_len);
+	if (nlmsg_len + RTA_LENGTH(0) > sizeof(struct kdp_request))
+		return NULL;
+	rta->rta_type = type;
+	rta->rta_len = nlmsg_len;
+	req->nlmsg.nlmsg_len = nlmsg_len + RTA_LENGTH(0);
+
+	return rta;
+}
+
+static void
+attr_nested_end(struct kdp_request *req, struct rtattr *rta)
+{
+	rta->rta_len = req->nlmsg.nlmsg_len - rta->rta_len;
+}
+
+static int
+rtnl_create(struct rte_kdp_device_info *dev_info)
+{
+	struct kdp_request req;
+	struct ifinfomsg *info;
+	struct rtattr *rta1;
+	struct rtattr *rta2;
+	char name[RTE_KDP_NAMESIZE];
+	char type[RTE_KDP_NAMESIZE];
+	struct iovec iov;
+	struct msghdr msg;
+	struct sockaddr_nl nladdr;
+	int ret;
+	char buf[BUFSZ];
+
+	memset(&req, 0, sizeof(struct kdp_request));
+
+	req.nlmsg.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	req.nlmsg.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
+	req.nlmsg.nlmsg_flags |= NLM_F_ACK;
+	req.nlmsg.nlmsg_type = RTM_NEWLINK;
+
+	info = NLMSG_DATA(&req.nlmsg);
+
+	info->ifi_family = AF_UNSPEC;
+	info->ifi_index = 0;
+
+	snprintf(name, RTE_KDP_NAMESIZE, "%s", dev_info->name);
+	ret = attr_add(&req, IFLA_IFNAME, name, strlen(name) + 1);
+	if (ret < 0)
+		return -1;
+
+	rta1 = attr_nested_add(&req, IFLA_LINKINFO);
+	if (rta1 == NULL)
+		return -1;
+
+	snprintf(type, RTE_KDP_NAMESIZE, KDP_DEVICE);
+	ret = attr_add(&req, IFLA_INFO_KIND, type, strlen(type) + 1);
+	if (ret < 0)
+		return -1;
+
+	rta2 = attr_nested_add(&req, IFLA_INFO_DATA);
+	if (rta2 == NULL)
+		return -1;
+
+	ret = attr_add(&req, IFLA_KDP_PORTID, &dev_info->port_id,
+			sizeof(uint8_t));
+	if (ret < 0)
+		return -1;
+
+	ret = attr_add(&req, IFLA_KDP_DEVINFO, dev_info,
+			sizeof(struct rte_kdp_device_info));
+	if (ret < 0)
+		return -1;
+
+	attr_nested_end(&req, rta2);
+	attr_nested_end(&req, rta1);
+
+	memset(&nladdr, 0, sizeof(nladdr));
+	nladdr.nl_family = AF_NETLINK;
+
+	iov.iov_base = (void *)&req.nlmsg;
+	iov.iov_len = req.nlmsg.nlmsg_len;
+
+	memset(&msg, 0, sizeof(struct msghdr));
+	msg.msg_name = &nladdr;
+	msg.msg_namelen = sizeof(nladdr);
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+
+	ret = sendmsg(kdp_fd, &msg, 0);
+	if (ret < 0) {
+		RTE_LOG(ERR, KDP, "Send for create failed %d.\n", errno);
+		return -1;
+	}
+
+	memset(buf, 0, sizeof(buf));
+	iov.iov_base = buf;
+	iov.iov_len = sizeof(buf);
+
+	ret = recvmsg(kdp_fd, &msg, 0);
+	if (ret < 0) {
+		RTE_LOG(ERR, KDP, "Recv for create failed.\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+kdp_start(struct kdp_data *kdp, struct rte_mempool *pktmbuf_pool,
+	      const struct kdp_conf *conf)
+{
+	struct kdp_memzone_slot *slot;
+	struct rte_kdp_device_info dev_info;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+	int ret;
+
+	if (!kdp || !pktmbuf_pool || !conf || !conf->name[0])
+		return -1;
+	slot = kdp->slot; /* dereference kdp only after the NULL check */
+	snprintf(kdp->name, RTE_KDP_NAMESIZE, "%s", conf->name);
+	kdp->pktmbuf_pool = pktmbuf_pool;
+	kdp->port_id = conf->port_id;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	dev_info.core_id = conf->core_id;
+	dev_info.force_bind = conf->force_bind;
+	dev_info.port_id = conf->port_id;
+	dev_info.mbuf_size = conf->mbuf_size;
+	snprintf(dev_info.name, RTE_KDP_NAMESIZE, "%s", conf->name);
+
+	dev_info.tx_phys = slot->m_tx_q->phys_addr;
+	dev_info.rx_phys = slot->m_rx_q->phys_addr;
+	dev_info.alloc_phys = slot->m_alloc_q->phys_addr;
+	dev_info.free_phys = slot->m_free_q->phys_addr;
+
+	/* MBUF mempool */
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME,
+		pktmbuf_pool->name);
+	mz = rte_memzone_lookup(mz_name);
+	if (mz == NULL)
+		goto kdp_fail;
+	dev_info.mbuf_va = mz->addr;
+	dev_info.mbuf_phys = mz->phys_addr;
+
+	ret = rtnl_create(&dev_info);
+	if (ret < 0)
+		goto kdp_fail;
+
+	kdp->in_use = 1;
+
+	/* Allocate mbufs and then put them into alloc_q */
+	kdp_mbufs_allocate(kdp);
+
+	return 0;
+
+kdp_fail:
+	return -1;
+}
+
+static void
+kdp_mbufs_free(struct kdp_data *kdp)
+{
+	int i, ret;
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+
+	ret = kdp_fifo_get(kdp->free_q, (void **)pkts, MAX_MBUF_BURST_NUM);
+	if (likely(ret > 0)) {
+		for (i = 0; i < ret; i++)
+			rte_pktmbuf_free(pkts[i]);
+	}
+}
+
+unsigned
+kdp_tx_burst(struct kdp_data *kdp, struct rte_mbuf **mbufs, unsigned num)
+{
+	unsigned ret = kdp_fifo_put(kdp->rx_q, (void **)mbufs, num);
+
+	/* Get mbufs from free_q and then free them */
+	kdp_mbufs_free(kdp);
+
+	return ret;
+}
+
+unsigned
+kdp_rx_burst(struct kdp_data *kdp, struct rte_mbuf **mbufs, unsigned num)
+{
+	unsigned ret = kdp_fifo_get(kdp->tx_q, (void **)mbufs, num);
+
+	/* If buffers removed, allocate mbufs and then put them into alloc_q */
+	if (ret)
+		kdp_mbufs_allocate(kdp);
+
+	return ret;
+}
+
+static void
+kdp_fifo_free(struct rte_kdp_fifo *fifo)
+{
+	int ret;
+	struct rte_mbuf *pkt;
+
+	do {
+		ret = kdp_fifo_get(fifo, (void **)&pkt, 1);
+		if (ret)
+			rte_pktmbuf_free(pkt);
+	} while (ret);
+}
+
+static int
+rtnl_destroy(struct kdp_data *kdp)
+{
+	struct kdp_request req;
+	struct ifinfomsg *info;
+	struct iovec iov;
+	struct msghdr msg;
+	struct sockaddr_nl nladdr;
+	int ret;
+
+	memset(&req, 0, sizeof(struct kdp_request));
+
+	req.nlmsg.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	req.nlmsg.nlmsg_flags = NLM_F_REQUEST;
+	req.nlmsg.nlmsg_type = RTM_DELLINK;
+
+	info = NLMSG_DATA(&req.nlmsg);
+
+	info->ifi_family = AF_UNSPEC;
+	info->ifi_index = 0;
+
+	ret = attr_add(&req, IFLA_IFNAME, kdp->name, strlen(kdp->name) + 1);
+	if (ret < 0)
+		return -1;
+
+	memset(&nladdr, 0, sizeof(nladdr));
+	nladdr.nl_family = AF_NETLINK;
+
+	iov.iov_base = (void *)&req.nlmsg;
+	iov.iov_len = req.nlmsg.nlmsg_len;
+
+	memset(&msg, 0, sizeof(struct msghdr));
+	msg.msg_name = &nladdr;
+	msg.msg_namelen = sizeof(nladdr);
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+
+	ret = sendmsg(kdp_fd, &msg, 0);
+	if (ret < 0) {
+		RTE_LOG(ERR, KDP, "Send for destroy failed.\n");
+		return -1;
+	}
+	return 0;
+}
+
+int
+kdp_stop(struct kdp_data *kdp)
+{
+	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
+	int ret;
+	int i;
+
+	if (!kdp || !kdp->in_use)
+		return -1;
+
+	rtnl_destroy(kdp);
+
+	do {
+		ret = kdp_fifo_get(kdp->free_q, (void **)pkts,
+				MAX_MBUF_BURST_NUM);
+		if (ret > 0) {
+			for (i = 0; i < ret; i++)
+				rte_pktmbuf_free(pkts[i]);
+		}
+	} while (ret > 0);
+
+	do {
+		ret = kdp_fifo_get(kdp->alloc_q, (void **)pkts,
+				MAX_MBUF_BURST_NUM);
+		if (ret > 0) {
+			for (i = 0; i < ret; i++)
+				rte_pktmbuf_free(pkts[i]);
+		}
+	} while (ret > 0);
+	return 0;
+}
+
+void
+kdp_close(struct kdp_data *kdp)
+{
+	/* Release the mbufs still held in all FIFOs */
+	kdp_fifo_free(kdp->tx_q);
+	kdp_fifo_free(kdp->rx_q);
+	kdp_fifo_free(kdp->alloc_q);
+	kdp_fifo_free(kdp->free_q);
+
+	rte_free(kdp->slot);
+
+	/* Memset the KDP struct */
+	memset(kdp, 0, sizeof(struct kdp_data));
+}
+
+void
+kdp_uninit(void)
+{
+	kdp_ref_put();
+}
diff --git a/drivers/net/kdp/rte_kdp.h b/drivers/net/kdp/rte_kdp.h
new file mode 100644
index 0000000..20ad93d
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp.h
@@ -0,0 +1,116 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_KDP_H_
+#define _RTE_KDP_H_
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <sys/ioctl.h>
+
+#include <rte_malloc.h>
+#include <rte_mbuf.h>
+
+#include <exec-env/rte_kdp_common.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * KDP memzone pool slot
+ */
+struct kdp_memzone_slot {
+	uint32_t id;
+
+	/* Memzones */
+	const struct rte_memzone *m_tx_q;      /**< TX queue */
+	const struct rte_memzone *m_rx_q;      /**< RX queue */
+	const struct rte_memzone *m_alloc_q;   /**< Allocated mbufs queue */
+	const struct rte_memzone *m_free_q;    /**< To be freed mbufs queue */
+};
+
+/**
+ * KDP context
+ */
+struct kdp_data {
+	char name[RTE_KDP_NAMESIZE];        /**< KDP interface name */
+	struct rte_mempool *pktmbuf_pool;   /**< pkt mbuf mempool */
+	struct kdp_memzone_slot *slot;
+	uint16_t port_id;                   /**< Port ID of the KDP device */
+
+	struct rte_kdp_fifo *tx_q;          /**< TX queue */
+	struct rte_kdp_fifo *rx_q;          /**< RX queue */
+	struct rte_kdp_fifo *alloc_q;       /**< Allocated mbufs queue */
+	struct rte_kdp_fifo *free_q;        /**< To be freed mbufs queue */
+
+	uint8_t in_use;                     /**< kdp in use */
+};
+
+struct kdp_tap_data {
+	char name[RTE_KDP_NAMESIZE];
+	int tap_fd;
+};
+
+/**
+ * Structure for configuring KDP device.
+ */
+struct kdp_conf {
+	char name[RTE_KDP_NAMESIZE];
+	uint32_t core_id;   /* Core ID to bind kernel thread on */
+	uint16_t port_id;
+	unsigned mbuf_size;
+
+	uint8_t force_bind; /* Flag to bind kernel thread */
+};
+
+struct kdp_data *kdp_init(uint16_t port_id);
+int kdp_start(struct kdp_data *kdp, struct rte_mempool *pktmbuf_pool,
+	      const struct kdp_conf *conf);
+unsigned kdp_rx_burst(struct kdp_data *kdp,
+		struct rte_mbuf **mbufs, unsigned num);
+unsigned kdp_tx_burst(struct kdp_data *kdp,
+		struct rte_mbuf **mbufs, unsigned num);
+int kdp_stop(struct kdp_data *kdp);
+void kdp_close(struct kdp_data *kdp);
+void kdp_uninit(void);
+
+struct kdp_tap_data *kdp_tap_init(uint16_t port_id);
+void kdp_tap_close(struct kdp_tap_data *kdp_tap);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_KDP_H_ */
diff --git a/drivers/net/kdp/rte_kdp_fifo.h b/drivers/net/kdp/rte_kdp_fifo.h
new file mode 100644
index 0000000..1a7e063
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp_fifo.h
@@ -0,0 +1,91 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * Initializes the kdp fifo structure
+ */
+static void
+kdp_fifo_init(struct rte_kdp_fifo *fifo, unsigned size)
+{
+	/* Ensure size is power of 2 */
+	if (size & (size - 1))
+		rte_panic("KDP fifo size must be power of 2\n");
+
+	fifo->write = 0;
+	fifo->read = 0;
+	fifo->len = size;
+	fifo->elem_size = sizeof(void *);
+}
+
+/**
+ * Adds num elements into the fifo. Return the number actually written
+ */
+static inline unsigned
+kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned fifo_write = fifo->write;
+	unsigned fifo_read = fifo->read;
+	unsigned new_write = fifo_write;
+
+	for (i = 0; i < num; i++) {
+		new_write = (new_write + 1) & (fifo->len - 1);
+
+		if (new_write == fifo_read)
+			break;
+		fifo->buffer[fifo_write] = data[i];
+		fifo_write = new_write;
+	}
+	fifo->write = fifo_write;
+	return i;
+}
+
+/**
+ * Get up to num elements from the fifo. Return the number actually read
+ */
+static inline unsigned
+kdp_fifo_get(struct rte_kdp_fifo *fifo, void **data, unsigned num)
+{
+	unsigned i = 0;
+	unsigned new_read = fifo->read;
+	unsigned fifo_write = fifo->write;
+	for (i = 0; i < num; i++) {
+		if (new_read == fifo_write)
+			break;
+
+		data[i] = fifo->buffer[new_read];
+		new_read = (new_read + 1) & (fifo->len - 1);
+	}
+	fifo->read = new_read;
+	return i;
+}
diff --git a/drivers/net/kdp/rte_kdp_tap.c b/drivers/net/kdp/rte_kdp_tap.c
new file mode 100644
index 0000000..12f3ad2
--- /dev/null
+++ b/drivers/net/kdp/rte_kdp_tap.c
@@ -0,0 +1,101 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+
+#include <sys/socket.h>
+#include <linux/if.h>
+#include <linux/if_tun.h>
+
+#include "rte_kdp.h"
+
+static int
+tap_create(char *name)
+{
+	struct ifreq ifr;
+	int fd, ret;
+
+	fd = open("/dev/net/tun", O_RDWR);
+	if (fd < 0)
+		return fd;
+
+	memset(&ifr, 0, sizeof(ifr));
+
+	/* TAP device without packet information */
+	ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
+
+	if (name && *name)
+		snprintf(ifr.ifr_name, IFNAMSIZ, "%s", name);
+
+	ret = ioctl(fd, TUNSETIFF, (void *)&ifr);
+	if (ret < 0) {
+		close(fd);
+		return ret;
+	}
+
+	if (name)
+		snprintf(name, IFNAMSIZ, "%s", ifr.ifr_name);
+
+	return fd;
+}
+
+struct kdp_tap_data *
+kdp_tap_init(uint16_t port_id)
+{
+	struct kdp_tap_data *kdp_tap = NULL;
+	int flags;
+
+	kdp_tap = rte_malloc(NULL, sizeof(struct kdp_tap_data), 0);
+	if (kdp_tap == NULL)
+		goto error;
+
+	snprintf(kdp_tap->name, IFNAMSIZ, "tap_kdp%u", port_id);
+	kdp_tap->tap_fd = tap_create(kdp_tap->name);
+	if (kdp_tap->tap_fd < 0)
+		goto error;
+
+	flags = fcntl(kdp_tap->tap_fd, F_GETFL, 0);
+	fcntl(kdp_tap->tap_fd, F_SETFL, flags | O_NONBLOCK);
+
+	return kdp_tap;
+
+error:
+	rte_free(kdp_tap);
+	return NULL;
+}
+
+void
+kdp_tap_close(struct kdp_tap_data *kdp_tap)
+{
+	close(kdp_tap->tap_fd);
+}
diff --git a/drivers/net/kdp/rte_pmd_kdp_version.map b/drivers/net/kdp/rte_pmd_kdp_version.map
new file mode 100644
index 0000000..349c6e1
--- /dev/null
+++ b/drivers/net/kdp/rte_pmd_kdp_version.map
@@ -0,0 +1,4 @@
+DPDK_16.04 {
+
+	local: *;
+};
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index 2e47e7f..5a0048b 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -79,6 +79,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
 #define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
 #define RTE_LOGTYPE_CRYPTODEV 0x00020000 /**< Log related to cryptodev. */
+#define RTE_LOGTYPE_KDP     0x00080000 /**< Log related to KDP. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index daac09f..cdce5e9 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   Copyright(c) 2014-2015 6WIND S.A.
 #   All rights reserved.
 #
@@ -145,6 +145,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP)       += -lrte_pmd_pcap
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL)       += -lrte_pmd_null
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)        += -lrte_pmd_qat
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KDP)        += -lrte_pmd_kdp
 
 # AESNI MULTI BUFFER is dependent on the IPSec_MB library
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB)   += -lrte_pmd_aesni_mb
-- 
2.5.0

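The FIFO shared between the PMD and the kernel module (rte_kdp_fifo.h in the
patch above) is a power-of-two ring that keeps one slot empty to tell "full"
from "empty", so a ring of KDP_FIFO_COUNT_MAX (1024) entries holds at most
1023 pointers. A minimal standalone sketch of that put/get logic follows. It
assumes a single producer and a single consumer and ignores the memory-ordering
concerns of a real ring shared with the kernel; the demo_* names and the fixed
8-slot size are illustrative assumptions, not part of the patch.

```c
#include <stddef.h>

#define DEMO_FIFO_LEN 8 /* hypothetical size; must be a power of two */

struct demo_fifo {
	unsigned write;              /* index of the next slot to write */
	unsigned read;               /* index of the next slot to read */
	unsigned len;                /* number of slots (power of two) */
	void *buffer[DEMO_FIFO_LEN]; /* stored pointers */
};

static void
demo_fifo_init(struct demo_fifo *fifo)
{
	fifo->write = 0;
	fifo->read = 0;
	fifo->len = DEMO_FIFO_LEN;
}

/* Add up to num pointers; return how many were actually stored. */
static unsigned
demo_fifo_put(struct demo_fifo *fifo, void **data, unsigned num)
{
	unsigned i;
	unsigned wr = fifo->write;

	for (i = 0; i < num; i++) {
		unsigned next = (wr + 1) & (fifo->len - 1);

		/* Full: advancing would make write catch up with read. */
		if (next == fifo->read)
			break;
		fifo->buffer[wr] = data[i];
		wr = next;
	}
	fifo->write = wr;
	return i;
}

/* Remove up to num pointers; return how many were actually read. */
static unsigned
demo_fifo_get(struct demo_fifo *fifo, void **data, unsigned num)
{
	unsigned i;
	unsigned rd = fifo->read;

	for (i = 0; i < num; i++) {
		/* Empty: read has caught up with write. */
		if (rd == fifo->write)
			break;
		data[i] = fifo->buffer[rd];
		rd = (rd + 1) & (fifo->len - 1);
	}
	fifo->read = rd;
	return i;
}
```

With 8 slots, a put of 10 pointers into an empty ring stores only 7; the
caller is expected to handle the remainder, as kdp_mbufs_allocate() does by
freeing the mbufs that did not fit.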

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-09 11:17   ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
  2016-03-09 11:17     ` [PATCH v3 1/2] kdp: add kernel data path kernel module Ferruh Yigit
  2016-03-09 11:17     ` [PATCH v3 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
@ 2016-03-14 15:32     ` Ferruh Yigit
  2016-03-16  7:26       ` Panu Matilainen
  2 siblings, 1 reply; 29+ messages in thread
From: Ferruh Yigit @ 2016-03-14 15:32 UTC (permalink / raw)
  To: dev; +Cc: David Marchand, Helin Zhang

On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
> This patch is sent to keep a record of the latest status of the work.
> 
> 
> This is slow data path communication implementation based on existing KNI.
> 
> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
> same except all control path functionality removed and some simplification done.
> 
> Motivation is to simplify slow path data communication.
> Now any application can use this new PMD to send/get data to Linux kernel.
> 
> PMD supports two communication methods:
> 
> 1) KDP kernel module
> PMD initialization functions handles creating virtual interfaces (with help of
> kdp kernel module) and created FIFO. FIFO is used to share data between
> userspace and kernelspace. This is default method.
> 
> 2) tun/tap module
> When KDP module is not inserted, PMD creates tap interface and transfers
> packets using tap interface.
> 
> In long term this patch intends to replace the KNI and KNI will be
> depreciated.
> 

Self-NACK: Will work on another option that does not introduce a new
kernel module.

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-14 15:32     ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
@ 2016-03-16  7:26       ` Panu Matilainen
  2016-03-16  8:19         ` Ferruh Yigit
  0 siblings, 1 reply; 29+ messages in thread
From: Panu Matilainen @ 2016-03-16  7:26 UTC (permalink / raw)
  To: Ferruh Yigit, dev; +Cc: David Marchand, Helin Zhang

On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
>> This patch sent to keep record of latest status of the work.
>>
>>
>> This is slow data path communication implementation based on existing KNI.
>>
>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
>> same except all control path functionality removed and some simplification done.
>>
>> Motivation is to simplify slow path data communication.
>> Now any application can use this new PMD to send/get data to Linux kernel.
>>
>> PMD supports two communication methods:
>>
>> 1) KDP kernel module
>> PMD initialization functions handles creating virtual interfaces (with help of
>> kdp kernel module) and created FIFO. FIFO is used to share data between
>> userspace and kernelspace. This is default method.
>>
>> 2) tun/tap module
>> When KDP module is not inserted, PMD creates tap interface and transfers
>> packets using tap interface.
>>
>> In long term this patch intends to replace the KNI and KNI will be
>> depreciated.
>>
>
> Self-NACK: Will work on another option that does not introduce new
> kernel module.
>

Hmm, care to elaborate a bit? The second mode of this PMD already was 
free of external kernel modules. Do you mean you'll be just removing 
mode 1) from the PMD or looking at something completely different?

Just thinking that tun/tap PMD sounds like a useful thing to have, I 
hope you're not abandoning that.

	- Panu -

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16  7:26       ` Panu Matilainen
@ 2016-03-16  8:19         ` Ferruh Yigit
  2016-03-16  8:22           ` Panu Matilainen
  0 siblings, 1 reply; 29+ messages in thread
From: Ferruh Yigit @ 2016-03-16  8:19 UTC (permalink / raw)
  To: Panu Matilainen, dev; +Cc: David Marchand, Helin Zhang

On 3/16/2016 7:26 AM, Panu Matilainen wrote:
> On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
>>> This patch sent to keep record of latest status of the work.
>>>
>>>
>>> This is slow data path communication implementation based on existing KNI.
>>>
>>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
>>> same except all control path functionality removed and some simplification done.
>>>
>>> Motivation is to simplify slow path data communication.
>>> Now any application can use this new PMD to send/get data to Linux kernel.
>>>
>>> PMD supports two communication methods:
>>>
>>> 1) KDP kernel module
>>> PMD initialization functions handles creating virtual interfaces (with help of
>>> kdp kernel module) and created FIFO. FIFO is used to share data between
>>> userspace and kernelspace. This is default method.
>>>
>>> 2) tun/tap module
>>> When KDP module is not inserted, PMD creates tap interface and transfers
>>> packets using tap interface.
>>>
>>> In long term this patch intends to replace the KNI and KNI will be
>>> depreciated.
>>>
>>
>> Self-NACK: Will work on another option that does not introduce new
>> kernel module.
>>
> 
> Hmm, care to elaborate a bit? The second mode of this PMD already was 
> free of external kernel modules. Do you mean you'll be just removing 
> mode 1) from the PMD or looking at something completely different?
> 
> Just thinking that tun/tap PMD sounds like a useful thing to have, I 
> hope you're not abandoning that.
> 

It will be a KNI PMD.
The plan is to have something like KDP, but with the existing KNI kernel module.
There will be tun/tap support as a fallback.

Regards,
ferruh

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16  8:19         ` Ferruh Yigit
@ 2016-03-16  8:22           ` Panu Matilainen
  2016-03-16 10:26             ` Ferruh Yigit
  2016-03-16 11:07             ` Bruce Richardson
  0 siblings, 2 replies; 29+ messages in thread
From: Panu Matilainen @ 2016-03-16  8:22 UTC (permalink / raw)
  To: Ferruh Yigit, dev; +Cc: David Marchand, Helin Zhang

On 03/16/2016 10:19 AM, Ferruh Yigit wrote:
> On 3/16/2016 7:26 AM, Panu Matilainen wrote:
>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
>>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
>>>> This patch sent to keep record of latest status of the work.
>>>>
>>>>
>>>> This is slow data path communication implementation based on existing KNI.
>>>>
>>>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
>>>> same except all control path functionality removed and some simplification done.
>>>>
>>>> Motivation is to simplify slow path data communication.
>>>> Now any application can use this new PMD to send/get data to Linux kernel.
>>>>
>>>> PMD supports two communication methods:
>>>>
>>>> 1) KDP kernel module
>>>> PMD initialization functions handles creating virtual interfaces (with help of
>>>> kdp kernel module) and created FIFO. FIFO is used to share data between
>>>> userspace and kernelspace. This is default method.
>>>>
>>>> 2) tun/tap module
>>>> When KDP module is not inserted, PMD creates tap interface and transfers
>>>> packets using tap interface.
>>>>
>>>> In long term this patch intends to replace the KNI and KNI will be
>>>> depreciated.
>>>>
>>>
>>> Self-NACK: Will work on another option that does not introduce new
>>> kernel module.
>>>
>>
>> Hmm, care to elaborate a bit? The second mode of this PMD already was
>> free of external kernel modules. Do you mean you'll be just removing
>> mode 1) from the PMD or looking at something completely different?
>>
>> Just thinking that tun/tap PMD sounds like a useful thing to have, I
>> hope you're not abandoning that.
>>
>
> It will be KNI PMD.
> Plan is to have something like KDP, but with existing KNI kernel module.
> There will be tun/tap support as fallback.

Hum, now I'm confused. I was under the impression everybody hated KNI 
and wanted to get rid of it, and certainly not build future solutions on 
top of it?

	- Panu -

>
> Regards,
> ferruh
>

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16  8:22           ` Panu Matilainen
@ 2016-03-16 10:26             ` Ferruh Yigit
  2016-03-16 10:45               ` Thomas Monjalon
  2016-03-16 13:15               ` Panu Matilainen
  2016-03-16 11:07             ` Bruce Richardson
  1 sibling, 2 replies; 29+ messages in thread
From: Ferruh Yigit @ 2016-03-16 10:26 UTC (permalink / raw)
  To: Panu Matilainen, dev; +Cc: David Marchand, Helin Zhang

On 3/16/2016 8:22 AM, Panu Matilainen wrote:
> On 03/16/2016 10:19 AM, Ferruh Yigit wrote:
>> On 3/16/2016 7:26 AM, Panu Matilainen wrote:
>>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
>>>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
>>>>> This patch sent to keep record of latest status of the work.
>>>>>
>>>>>
>>>>> This is slow data path communication implementation based on existing KNI.
>>>>>
>>>>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
>>>>> same except all control path functionality removed and some simplification done.
>>>>>
>>>>> Motivation is to simplify slow path data communication.
>>>>> Now any application can use this new PMD to send/get data to Linux kernel.
>>>>>
>>>>> PMD supports two communication methods:
>>>>>
>>>>> 1) KDP kernel module
>>>>> PMD initialization functions handles creating virtual interfaces (with help of
>>>>> kdp kernel module) and created FIFO. FIFO is used to share data between
>>>>> userspace and kernelspace. This is default method.
>>>>>
>>>>> 2) tun/tap module
>>>>> When KDP module is not inserted, PMD creates tap interface and transfers
>>>>> packets using tap interface.
>>>>>
>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>> depreciated.
>>>>>
>>>>
>>>> Self-NACK: Will work on another option that does not introduce new
>>>> kernel module.
>>>>
>>>
>>> Hmm, care to elaborate a bit? The second mode of this PMD already was
>>> free of external kernel modules. Do you mean you'll be just removing
>>> mode 1) from the PMD or looking at something completely different?
>>>
>>> Just thinking that tun/tap PMD sounds like a useful thing to have, I
>>> hope you're not abandoning that.
>>>
>>
>> It will be KNI PMD.
>> Plan is to have something like KDP, but with existing KNI kernel module.
>> There will be tun/tap support as fallback.
> 
> Hum, now I'm confused. I was under the impression everybody hated KNI 
> and wanted to get rid of it, and certainly not build future solutions on 
> top of it?
> 

We can't remove it.
We can't replace/improve it - you were one of the main opponents of this.
That leaves no option other than using it.

There won't be any update to the KNI kernel module; the library + sample app
will be converted into a PMD.

Regards,
ferruh

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 10:26             ` Ferruh Yigit
@ 2016-03-16 10:45               ` Thomas Monjalon
  2016-03-16 11:07                 ` Mcnamara, John
  2016-03-16 11:13                 ` Ferruh Yigit
  2016-03-16 13:15               ` Panu Matilainen
  1 sibling, 2 replies; 29+ messages in thread
From: Thomas Monjalon @ 2016-03-16 10:45 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, Panu Matilainen, David Marchand, Helin Zhang

2016-03-16 10:26, Ferruh Yigit:
> On 3/16/2016 8:22 AM, Panu Matilainen wrote:
> > On 03/16/2016 10:19 AM, Ferruh Yigit wrote:
> >> On 3/16/2016 7:26 AM, Panu Matilainen wrote:
> >>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
> >>>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
> >>>>> This patch sent to keep record of latest status of the work.
> >>>>>
> >>>>>
> >>>>> This is slow data path communication implementation based on existing KNI.
> >>>>>
> >>>>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
> >>>>> same except all control path functionality removed and some simplification done.
> >>>>>
> >>>>> Motivation is to simplify slow path data communication.
> >>>>> Now any application can use this new PMD to send/get data to Linux kernel.
> >>>>>
> >>>>> PMD supports two communication methods:
> >>>>>
> >>>>> 1) KDP kernel module
> >>>>> PMD initialization functions handles creating virtual interfaces (with help of
> >>>>> kdp kernel module) and created FIFO. FIFO is used to share data between
> >>>>> userspace and kernelspace. This is default method.
> >>>>>
> >>>>> 2) tun/tap module
> >>>>> When KDP module is not inserted, PMD creates tap interface and transfers
> >>>>> packets using tap interface.
> >>>>>
> >>>>> In long term this patch intends to replace the KNI and KNI will be
> >>>>> depreciated.
> >>>>>
> >>>>
> >>>> Self-NACK: Will work on another option that does not introduce new
> >>>> kernel module.
> >>>>
> >>>
> >>> Hmm, care to elaborate a bit? The second mode of this PMD already was
> >>> free of external kernel modules. Do you mean you'll be just removing
> >>> mode 1) from the PMD or looking at something completely different?
> >>>
> >>> Just thinking that tun/tap PMD sounds like a useful thing to have, I
> >>> hope you're not abandoning that.
> >>>
> >>
> >> It will be KNI PMD.
> >> Plan is to have something like KDP, but with existing KNI kernel module.
> >> There will be tun/tap support as fallback.
> > 
> > Hum, now I'm confused. I was under the impression everybody hated KNI 
> > and wanted to get rid of it, and certainly not build future solutions on 
> > top of it?
> 
> We can't remove it.

Why?

> We can't replace/improve it -you were one of the major opposition to this.
> This doesn't leave more option other than using it.

Why cannot we replace it by something upstream?

> There won't be any update in KNI kernel module, library + sample app
> will be converted into PMD.

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16  8:22           ` Panu Matilainen
  2016-03-16 10:26             ` Ferruh Yigit
@ 2016-03-16 11:07             ` Bruce Richardson
  1 sibling, 0 replies; 29+ messages in thread
From: Bruce Richardson @ 2016-03-16 11:07 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: Ferruh Yigit, dev, David Marchand, Helin Zhang

On Wed, Mar 16, 2016 at 10:22:05AM +0200, Panu Matilainen wrote:
> On 03/16/2016 10:19 AM, Ferruh Yigit wrote:
> >On 3/16/2016 7:26 AM, Panu Matilainen wrote:
> >>On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
> >>>On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
> >>>>This patch sent to keep record of latest status of the work.
> >>>>
> >>>>
> >>>>This is slow data path communication implementation based on existing KNI.
> >>>>
> >>>>Difference is: librte_kni converted into a PMD, kdp kernel module is almost
> >>>>same except all control path functionality removed and some simplification done.
> >>>>
> >>>>Motivation is to simplify slow path data communication.
> >>>>Now any application can use this new PMD to send/get data to Linux kernel.
> >>>>
> >>>>PMD supports two communication methods:
> >>>>
> >>>>1) KDP kernel module
> >>>>PMD initialization functions handles creating virtual interfaces (with help of
> >>>>kdp kernel module) and created FIFO. FIFO is used to share data between
> >>>>userspace and kernelspace. This is default method.
> >>>>
> >>>>2) tun/tap module
> >>>>When KDP module is not inserted, PMD creates tap interface and transfers
> >>>>packets using tap interface.
> >>>>
> >>>>In long term this patch intends to replace the KNI and KNI will be
> >>>>depreciated.
> >>>>
> >>>
> >>>Self-NACK: Will work on another option that does not introduce new
> >>>kernel module.
> >>>
> >>
> >>Hmm, care to elaborate a bit? The second mode of this PMD already was
> >>free of external kernel modules. Do you mean you'll be just removing
> >>mode 1) from the PMD or looking at something completely different?
> >>
> >>Just thinking that tun/tap PMD sounds like a useful thing to have, I
> >>hope you're not abandoning that.
> >>
> >
> >It will be KNI PMD.
> >Plan is to have something like KDP, but with existing KNI kernel module.
> >There will be tun/tap support as fallback.
> 
> Hum, now I'm confused. I was under the impression everybody hated KNI and
> wanted to get rid of it, and certainly not build future solutions on top of
> it?
> 
KNI has its issues - mainly: a) not being upstream and b) having large
amounts of code to do port management in it, that is best handled by other
means - but the code for transferring packets between kernel space and userspace
is more performant and scalable than TUN/TAP, so we need to keep that around
unless/until we can get TUN/TAP to reach the same performance levels.

Now, we are thinking of some ways in which that can be achieved, but any such
solution is going to be a bit out, so making any driver for transferring packets
from user->kernel and vice versa might as well take advantage of KNI as well as
TUN/TAP so as to allow those who want the extra performance to have it.

Regards,
/Bruce

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 10:45               ` Thomas Monjalon
@ 2016-03-16 11:07                 ` Mcnamara, John
  2016-03-16 11:13                 ` Ferruh Yigit
  1 sibling, 0 replies; 29+ messages in thread
From: Mcnamara, John @ 2016-03-16 11:07 UTC (permalink / raw)
  To: Thomas Monjalon, Yigit, Ferruh
  Cc: dev, Panu Matilainen, David Marchand, Zhang, Helin

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, March 16, 2016 10:46 AM
> To: Yigit, Ferruh <ferruh.yigit@intel.com>
> Cc: dev@dpdk.org; Panu Matilainen <pmatilai@redhat.com>; David
> Marchand <david.marchand@6wind.com>; Zhang, Helin
> <helin.zhang@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 0/2] slow data path communication
> between DPDK port and Linux
> >
> > We can't remove it.
> 
> Why?

There are a lot of people using KNI.


> > We can't replace/improve it -you were one of the major opposition to this.
> > This doesn't leave more option other than using it.
> 
> Why cannot we replace it by something upstream?

In theory it could be upstreamed. Let's see how we get on with upstreaming the KCP component first.

John

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 10:45               ` Thomas Monjalon
  2016-03-16 11:07                 ` Mcnamara, John
@ 2016-03-16 11:13                 ` Ferruh Yigit
  2016-03-16 13:23                   ` Panu Matilainen
  1 sibling, 1 reply; 29+ messages in thread
From: Ferruh Yigit @ 2016-03-16 11:13 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Panu Matilainen, David Marchand, Helin Zhang

On 3/16/2016 10:45 AM, Thomas Monjalon wrote:
> 2016-03-16 10:26, Ferruh Yigit:
>> On 3/16/2016 8:22 AM, Panu Matilainen wrote:
>>> On 03/16/2016 10:19 AM, Ferruh Yigit wrote:
>>>> On 3/16/2016 7:26 AM, Panu Matilainen wrote:
>>>>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
>>>>>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
>>>>>>> This patch sent to keep record of latest status of the work.
>>>>>>>
>>>>>>>
>>>>>>> This is slow data path communication implementation based on existing KNI.
>>>>>>>
>>>>>>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
>>>>>>> same except all control path functionality removed and some simplification done.
>>>>>>>
>>>>>>> Motivation is to simplify slow path data communication.
>>>>>>> Now any application can use this new PMD to send/get data to Linux kernel.
>>>>>>>
>>>>>>> PMD supports two communication methods:
>>>>>>>
>>>>>>> 1) KDP kernel module
>>>>>>> PMD initialization functions handles creating virtual interfaces (with help of
>>>>>>> kdp kernel module) and created FIFO. FIFO is used to share data between
>>>>>>> userspace and kernelspace. This is default method.
>>>>>>>
>>>>>>> 2) tun/tap module
>>>>>>> When KDP module is not inserted, PMD creates tap interface and transfers
>>>>>>> packets using tap interface.
>>>>>>>
>>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>>> depreciated.
>>>>>>>
>>>>>>
>>>>>> Self-NACK: Will work on another option that does not introduce new
>>>>>> kernel module.
>>>>>>
>>>>>
>>>>> Hmm, care to elaborate a bit? The second mode of this PMD already was
>>>>> free of external kernel modules. Do you mean you'll be just removing
>>>>> mode 1) from the PMD or looking at something completely different?
>>>>>
>>>>> Just thinking that tun/tap PMD sounds like a useful thing to have, I
>>>>> hope you're not abandoning that.
>>>>>
>>>>
>>>> It will be KNI PMD.
>>>> Plan is to have something like KDP, but with existing KNI kernel module.
>>>> There will be tun/tap support as fallback.
>>>
>>> Hum, now I'm confused. I was under the impression everybody hated KNI 
>>> and wanted to get rid of it, and certainly not build future solutions on 
>>> top of it?
>>
>> We can't remove it.
> 
> Why?
> 
>> We can't replace/improve it -you were one of the major opposition to this.
>> This doesn't leave more option other than using it.
> 
> Why cannot we replace it by something upstream?
> 
I doubt KDP is upstreamable to the Linux community. If somebody can do
it, that is great.

Even for KCP, the upstreaming task is still under discussion, and as a heads
up, it is likely to be dropped.

Regards,
ferruh

>> There won't be any update in KNI kernel module, library + sample app
>> will be converted into PMD.
> 
> 

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 10:26             ` Ferruh Yigit
  2016-03-16 10:45               ` Thomas Monjalon
@ 2016-03-16 13:15               ` Panu Matilainen
  2016-03-16 13:58                 ` Thomas Monjalon
  1 sibling, 1 reply; 29+ messages in thread
From: Panu Matilainen @ 2016-03-16 13:15 UTC (permalink / raw)
  To: Ferruh Yigit, dev; +Cc: David Marchand, Helin Zhang, Thomas Monjalon

On 03/16/2016 12:26 PM, Ferruh Yigit wrote:
> On 3/16/2016 8:22 AM, Panu Matilainen wrote:
>> On 03/16/2016 10:19 AM, Ferruh Yigit wrote:
>>> On 3/16/2016 7:26 AM, Panu Matilainen wrote:
>>>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
>>>>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
>>>>>> This patch sent to keep record of latest status of the work.
>>>>>>
>>>>>>
>>>>>> This is slow data path communication implementation based on existing KNI.
>>>>>>
>>>>>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
>>>>>> same except all control path functionality removed and some simplification done.
>>>>>>
>>>>>> Motivation is to simplify slow path data communication.
>>>>>> Now any application can use this new PMD to send/get data to Linux kernel.
>>>>>>
>>>>>> PMD supports two communication methods:
>>>>>>
>>>>>> 1) KDP kernel module
>>>>>> PMD initialization functions handles creating virtual interfaces (with help of
>>>>>> kdp kernel module) and created FIFO. FIFO is used to share data between
>>>>>> userspace and kernelspace. This is default method.
>>>>>>
>>>>>> 2) tun/tap module
>>>>>> When KDP module is not inserted, PMD creates tap interface and transfers
>>>>>> packets using tap interface.
>>>>>>
>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>> depreciated.
>>>>>>
>>>>>
>>>>> Self-NACK: Will work on another option that does not introduce new
>>>>> kernel module.
>>>>>
>>>>
>>>> Hmm, care to elaborate a bit? The second mode of this PMD already was
>>>> free of external kernel modules. Do you mean you'll be just removing
>>>> mode 1) from the PMD or looking at something completely different?
>>>>
>>>> Just thinking that tun/tap PMD sounds like a useful thing to have, I
>>>> hope you're not abandoning that.
>>>>
>>>
>>> It will be KNI PMD.
>>> Plan is to have something like KDP, but with existing KNI kernel module.
>>> There will be tun/tap support as fallback.
>>
>> Hum, now I'm confused. I was under the impression everybody hated KNI
>> and wanted to get rid of it, and certainly not build future solutions on
>> top of it?
>>
>
> We can't remove it.
> We can't replace/improve it -you were one of the major opposition to this.

No no no. There's a misunderstanding somewhere in there.

I understand the functionality provided by KNI is important. I'd LOVE to 
see it replaced. With something that does not require out-of-tree 
kernel modules.

As long as out-of-tree kernel modules are in the picture, the feature 
might as well not exist at all for the audience I'm dealing with. To 
that audience, replacing KNI with out-of-tree KCP/KDP or whatever is 
just irrelevant, there's no progress being made.

I also understand there are a lot of users to whom out-of-tree kernel 
modules are not a problem at all, and I'm in no position to tell them 
that's somehow wrong. If KCP/KDP is better than KNI for that audience 
then more power to them.

But I don't see why such modules would *have* to be within the dpdk 
source - as suggested several times around this thread/topic such work 
could live in a separate repository or such.

What I really would like to see is a clear policy regarding kernel 
modules in DPDK. I certainly am in no position to dictate one, and 
that's why I've been asking questions and throwing around crazy (or not) 
ideas around the topic.

	- Panu -

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 11:13                 ` Ferruh Yigit
@ 2016-03-16 13:23                   ` Panu Matilainen
  0 siblings, 0 replies; 29+ messages in thread
From: Panu Matilainen @ 2016-03-16 13:23 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon; +Cc: dev, David Marchand, Helin Zhang

On 03/16/2016 01:13 PM, Ferruh Yigit wrote:
> On 3/16/2016 10:45 AM, Thomas Monjalon wrote:
>> 2016-03-16 10:26, Ferruh Yigit:
>>> On 3/16/2016 8:22 AM, Panu Matilainen wrote:
>>>> On 03/16/2016 10:19 AM, Ferruh Yigit wrote:
>>>>> On 3/16/2016 7:26 AM, Panu Matilainen wrote:
>>>>>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote:
>>>>>>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote:
>>>>>>>> This patch sent to keep record of latest status of the work.
>>>>>>>>
>>>>>>>>
>>>>>>>> This is slow data path communication implementation based on existing KNI.
>>>>>>>>
>>>>>>>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost
>>>>>>>> same except all control path functionality removed and some simplification done.
>>>>>>>>
>>>>>>>> Motivation is to simplify slow path data communication.
>>>>>>>> Now any application can use this new PMD to send/get data to Linux kernel.
>>>>>>>>
>>>>>>>> PMD supports two communication methods:
>>>>>>>>
>>>>>>>> 1) KDP kernel module
>>>>>>>> PMD initialization functions handles creating virtual interfaces (with help of
>>>>>>>> kdp kernel module) and created FIFO. FIFO is used to share data between
>>>>>>>> userspace and kernelspace. This is default method.
>>>>>>>>
>>>>>>>> 2) tun/tap module
>>>>>>>> When KDP module is not inserted, PMD creates tap interface and transfers
>>>>>>>> packets using tap interface.
>>>>>>>>
>>>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>>>> depreciated.
>>>>>>>>
>>>>>>>
>>>>>>> Self-NACK: Will work on another option that does not introduce new
>>>>>>> kernel module.
>>>>>>>
>>>>>>
>>>>>> Hmm, care to elaborate a bit? The second mode of this PMD already was
>>>>>> free of external kernel modules. Do you mean you'll be just removing
>>>>>> mode 1) from the PMD or looking at something completely different?
>>>>>>
>>>>>> Just thinking that tun/tap PMD sounds like a useful thing to have, I
>>>>>> hope you're not abandoning that.
>>>>>>
>>>>>
>>>>> It will be KNI PMD.
>>>>> Plan is to have something like KDP, but with existing KNI kernel module.
>>>>> There will be tun/tap support as fallback.
>>>>
>>>> Hum, now I'm confused. I was under the impression everybody hated KNI
>>>> and wanted to get rid of it, and certainly not build future solutions on
>>>> top of it?
>>>
>>> We can't remove it.
>>
>> Why?
>>
>>> We can't replace/improve it -you were one of the major opposition to this.
>>> This doesn't leave more option other than using it.
>>
>> Why cannot we replace it by something upstream?
>>
> I doubt KDP is upstream-able to Linux community. If somebody can, that
> is great.
>
> Even for KCP, upstreaming task is still under discussion, and as a heads
> up, it is likely to be dropped.

If KCP/KDP are not upstreamable then the solution is to find another way 
that is.

Easier said than done, no doubt.

	- Panu -

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 13:15               ` Panu Matilainen
@ 2016-03-16 13:58                 ` Thomas Monjalon
  2016-03-16 15:03                   ` Panu Matilainen
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Monjalon @ 2016-03-16 13:58 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: Ferruh Yigit, dev, David Marchand, Helin Zhang

2016-03-16 15:15, Panu Matilainen:
> What I really would like to see is a clear policy regarding kernel 
> modules in DPDK. I certainly am in no position to dictate one, and 
> that's why I've been asking questions and throwing around crazy (or not) 
> ideas around the topic.

I think the consensus is to avoid new kernel modules,
but allow them in a staging directory while they are being discussed upstream.
About the existing out-of-tree kernel modules, we must continue trying
to obsolete them with upstream work.

If you feel the consensus must be clearly stated and acked,
please send a patch for doc/guides/contributing/design.rst.

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 13:58                 ` Thomas Monjalon
@ 2016-03-16 15:03                   ` Panu Matilainen
  2016-03-16 15:15                     ` Thomas Monjalon
  0 siblings, 1 reply; 29+ messages in thread
From: Panu Matilainen @ 2016-03-16 15:03 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Ferruh Yigit, dev, David Marchand, Helin Zhang

On 03/16/2016 03:58 PM, Thomas Monjalon wrote:
> 2016-03-16 15:15, Panu Matilainen:
>> What I really would like to see is a clear policy regarding kernel
>> modules in DPDK. I certainly am in no position to dictate one, and
>> that's why I've been asking questions and throwing around crazy (or not)
>> ideas around the topic.
>
> I think the consensus is to avoid new kernel module,
> but allow them in a staging directory while being discussed upstream.

To me the more interesting question is: what happens after that?
As in, if upstream says no, does it mean axe from dpdk, no ifs and buts? 
If accepted upstream, does a version of the module still live within 
dpdk codebase (for example to provide the version for older kernel 
versions, I don't see that as unreasonable at all)?


> About the existing out-of-tree kernel modules, we must continue trying
> to obsolete them with upstream work.

Agreed.

>
> If you feel the consensus must be clearly stated and acked,
> please send a patch for doc/guides/contributing/design.rst.

I'll be happy to, once we have a clear consensus on what the policy 
actually is.

	- Panu -

* Re: [PATCH v3 0/2] slow data path communication between DPDK port and Linux
  2016-03-16 15:03                   ` Panu Matilainen
@ 2016-03-16 15:15                     ` Thomas Monjalon
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Monjalon @ 2016-03-16 15:15 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: Ferruh Yigit, dev, David Marchand, Helin Zhang

2016-03-16 17:03, Panu Matilainen:
> On 03/16/2016 03:58 PM, Thomas Monjalon wrote:
> > 2016-03-16 15:15, Panu Matilainen:
> >> What I really would like to see is a clear policy regarding kernel
> >> modules in DPDK. I certainly am in no position to dictate one, and
> >> that's why I've been asking questions and throwing around crazy (or not)
> >> ideas around the topic.
> >
> > I think the consensus is to avoid new kernel module,
> > but allow them in a staging directory while being discussed upstream.
> 
> To me the more interesting question is: what happens after that?
> As in, if upstream says no, does it mean axe from dpdk, no ifs and buts? 
> If accepted upstream, does a version of the module still live within 
> dpdk codebase (for example to provide the version for older kernel 
> versions, I don't see that as unreasonable at all)?
> 
> 
> > About the existing out-of-tree kernel modules, we must continue trying
> > to obsolete them with upstream work.
> 
> Agreed.
> 
> >
> > If you feel the consensus must be clearly stated and acked,
> > please send a patch for doc/guides/contributing/design.rst.
> 
> I'll be happy to, once we have a clear consensus on what the policy 
> actually is.

Sending a patch is the most efficient way of getting the discussion
to happen with more contributors.
We, as a technical community, make patch-based decisions ;)

end of thread, other threads:[~2016-03-16 15:16 UTC | newest]

Thread overview: 29+ messages
2016-01-27 16:32 [PATCH 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
2016-01-27 16:32 ` [PATCH 1/2] kdp: add kernel data path kernel module Ferruh Yigit
2016-02-08 17:14   ` Reshma Pattan
2016-02-09 10:53     ` Ferruh Yigit
2016-01-27 16:32 ` [PATCH 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
2016-01-28  8:16   ` Xu, Qian Q
2016-01-29 16:04     ` Yigit, Ferruh
2016-02-09 17:33   ` Reshma Pattan
2016-02-09 17:51     ` Ferruh Yigit
2016-02-19  5:05 ` [PATCH v2 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
2016-02-19  5:05   ` [PATCH v2 1/2] kdp: add kernel data path kernel module Ferruh Yigit
2016-02-19  5:05   ` [PATCH v2 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
2016-03-09 11:17   ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
2016-03-09 11:17     ` [PATCH v3 1/2] kdp: add kernel data path kernel module Ferruh Yigit
2016-03-09 11:17     ` [PATCH v3 2/2] kdp: add virtual PMD for kernel slow data path communication Ferruh Yigit
2016-03-14 15:32     ` [PATCH v3 0/2] slow data path communication between DPDK port and Linux Ferruh Yigit
2016-03-16  7:26       ` Panu Matilainen
2016-03-16  8:19         ` Ferruh Yigit
2016-03-16  8:22           ` Panu Matilainen
2016-03-16 10:26             ` Ferruh Yigit
2016-03-16 10:45               ` Thomas Monjalon
2016-03-16 11:07                 ` Mcnamara, John
2016-03-16 11:13                 ` Ferruh Yigit
2016-03-16 13:23                   ` Panu Matilainen
2016-03-16 13:15               ` Panu Matilainen
2016-03-16 13:58                 ` Thomas Monjalon
2016-03-16 15:03                   ` Panu Matilainen
2016-03-16 15:15                     ` Thomas Monjalon
2016-03-16 11:07             ` Bruce Richardson
