All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Grajciar <jgrajcia@cisco.com>
To: <dev@dpdk.org>
Cc: Jakub Grajciar <jgrajcia@cisco.com>
Subject: [RFC v3] /net: memory interface (memif)
Date: Thu, 13 Dec 2018 14:30:51 +0100	[thread overview]
Message-ID: <20181213133051.18779-1-jgrajcia@cisco.com> (raw)

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 config/common_base                          |    5 +
 config/common_linuxapp                      |    1 +
 doc/guides/nics/memif.rst                   |   80 ++
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   29 +
 drivers/net/memif/memif.h                   |  143 +++
 drivers/net/memif/memif_socket.c            | 1093 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  104 ++
 drivers/net/memif/meson.build               |   10 +
 drivers/net/memif/rte_eth_memif.c           | 1189 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  207 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 14 files changed, 2868 insertions(+)
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

diff --git a/config/common_base b/config/common_base
index d12ae98bc..b8ed10ae5 100644
--- a/config/common_base
+++ b/config/common_base
@@ -403,6 +403,11 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 6c1c8d0f4..42cbde8f5 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -18,6 +18,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..55d477840
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,80 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018 Cisco Systems, Inc.
+
+Memif Poll Mode Driver
+======================
+
+Memif PMD allows for DPDK and any other client using memif (DPDK, VPP, libmemif) to
+communicate using shared memory.
+
+The created device, transmits packets in a raw format. It can be used with Ethernet
+mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is supported in
+DPDK memif implementation.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0, net_memif1,
+and so on.  Memif uses unix domain socket to transmit control messages.
+Each memif has a unique id per socket. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``, etc.
+Note that if you assign a socket to a master interface it becomes a listener socket.
+Listener socket can not be used by a slave interface on same client.
+
+Memif works in two roles: master and slave. Slave connects to master over an existing
+socket. It is also a producer of shared memory file and initializes the shared memory.
+Master creates the socket and listens for any slave connection requests. The socket may
+already exist on the system. Be sure to remove any such sockets, if you are
+creating a master interface, or you will see an "Address already in use" error. Function
+``rte_pmd_memif_remove()``, which removes memif interface, will also remove a listener socket,
+if  it is not beeing used by any other interface.
+
+.. csv-table:: Memif configuration options:
+   :header: "Option", "Description", "Default"
+
+   "id=0", "Each memif on same socket must be given a unique id", 0
+   "role=master", "Set memif role", slave
+   "bsize=1024", "Size of packet buffer", 2048
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", 10
+   "nrxq=2", "Number of RX queues", 1
+   "ntxq=2", "Number of TX queues", 1
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef"
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer",
+    "zero-copy=yes", "Enable/disable zero-copy slave mode", no
+
+Example: testpmd and VPP
+------------------------
+
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index c0386feb9..0feab5241 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k
 DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..a82448423
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += -I$(SRCDIR)
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -Wno-pointer-arith
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
+LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
+LDLIBS += -lrte_bus_vdev
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..8b759d74b
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,143 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#ifndef MEMIF_CACHELINE_SIZE
+#define MEMIF_CACHELINE_SIZE 64
+#endif
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE = 0,
+	MEMIF_MSG_TYPE_ACK = 1,
+	MEMIF_MSG_TYPE_HELLO = 2,
+	MEMIF_MSG_TYPE_INIT = 3,
+	MEMIF_MSG_TYPE_ADD_REGION = 4,
+	MEMIF_MSG_TYPE_ADD_RING = 5,
+	MEMIF_MSG_TYPE_CONNECT = 6,
+	MEMIF_MSG_TYPE_CONNECTED = 7,
+	MEMIF_MSG_TYPE_DISCONNECT = 8,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M = 0,
+	MEMIF_RING_M2S = 1
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET = 0,
+	MEMIF_INTERFACE_MODE_IP = 1,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT = 2,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+typedef struct __attribute__ ((packed)) {
+	uint8_t name[32];
+	memif_version_t min_version;
+	memif_version_t max_version;
+	memif_region_index_t max_region;
+	memif_ring_index_t max_m2s_ring;
+	memif_ring_index_t max_s2m_ring;
+	memif_log2_ring_size_t max_log2_ring_size;
+} memif_msg_hello_t;
+
+typedef struct __attribute__ ((packed)) {
+	memif_version_t version;
+	memif_interface_id_t id;
+	memif_interface_mode_t mode:8;
+	uint8_t secret[24];
+	uint8_t name[32];
+} memif_msg_init_t;
+
+typedef struct __attribute__ ((packed)) {
+	memif_region_index_t index;
+	memif_region_size_t size;
+} memif_msg_add_region_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint16_t flags;
+#define MEMIF_MSG_ADD_RING_FLAG_S2M	(1 << 0)
+	memif_ring_index_t index;
+	memif_region_index_t region;
+	memif_region_offset_t offset;
+	memif_log2_ring_size_t log2_ring_size;
+	uint16_t private_hdr_size;	/* used for private metadata */
+} memif_msg_add_ring_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint8_t if_name[32];
+} memif_msg_connect_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint8_t if_name[32];
+} memif_msg_connected_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint32_t code;
+	uint8_t string[96];
+} memif_msg_disconnect_t;
+
+typedef struct __attribute__ ((packed, aligned(128))) {
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+typedef struct __attribute__ ((packed)) {
+	uint16_t flags;
+#define MEMIF_DESC_FLAG_NEXT (1 << 0)
+	memif_region_index_t region;
+	uint32_t length;
+	memif_region_offset_t offset;
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __attribute__((aligned(MEMIF_CACHELINE_SIZE)))
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;
+	uint16_t flags;
+#define MEMIF_RING_FLAG_MASK_INT 1
+	volatile uint16_t head;
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..a66b5741a
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1093 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include <rte_eth_memif.h>
+#include <memif_socket.h>
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = (void *)msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		struct cmsghdr *cmsg;
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "%s: Sent msg type %u.",
+			(cc->pmd !=
+			 NULL) ? rte_vdev_device_name(cc->pmd->vdev) :
+				 "memif_driver",
+			e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = rte_zmalloc("memif_msg",
+						    sizeof(struct
+							   memif_msg_queue_elt),
+						    0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "%s: Failed to enqueue disconnect message.",
+			(cc->pmd !=
+			 NULL) ? rte_vdev_device_name(cc->pmd->vdev) :
+				 "memif_driver");
+		return;
+	}
+
+	memif_msg_disconnect_t *d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->pmd != NULL) {
+			strlcpy(cc->pmd->local_disc_string, reason,
+				sizeof(cc->pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_hello_t *h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_IDX;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version",
+					 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.buffer_size = pmd->cfg.buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_pmd_list_elt *elt;
+	struct pmd_internals *pmd;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->pmd_queue, next) {
+		pmd = elt->pmd;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->pmd = pmd;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags && (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected",
+							 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required",
+								 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret",
+								 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct pmd_internals *pmd, memif_msg_t *msg,
+			     int fd)
+{
+	memif_msg_add_region_t *ar = &msg->add_region;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	struct memif_region *mr;
+
+	if (ar->index > ETH_MEMIF_MAX_REGION_IDX) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	mr = rte_realloc(pmd->regions, sizeof(struct memif_region) *
+			 (ar->index + 1), 0);
+	if (mr == NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Device error", 0);
+		return -1;
+	}
+
+	pmd->regions = mr;
+	pmd->regions[ar->index].fd = fd;
+	pmd->regions[ar->index].region_size = ar->size;
+	pmd->regions[ar->index].addr = NULL;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct pmd_internals *pmd, memif_msg_t *msg, int fd)
+{
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	struct memif_queue *mq;
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index",
+						 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index",
+						 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    &pmd->rx_queues[ar->index] : &pmd->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(pmd);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(pmd);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_init_t *i = &e->msg.init;
+
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (pmd->secret)
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct pmd_internals *pmd, uint8_t idx)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_add_region_t *ar = &e->msg.add_region;
+	struct memif_region *mr = &pmd->regions[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct pmd_internals *pmd, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_add_ring_t *ar = &e->msg.add_ring;
+	struct memif_queue *mq;
+
+	mq = (type == MEMIF_RING_S2M) ? &pmd->tx_queues[idx] :
+	    &pmd->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_connect_t *c = &e->msg.connect;
+	const char *name = rte_vdev_device_name(pmd->vdev);
+
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_connected_t *c = &e->msg.connected;
+
+	const char *name = rte_vdev_device_name(pmd->vdev);
+
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		while ((elt = TAILQ_FIRST(&pmd->cc->msg_queue)) != NULL) {
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		if (pmd->cc->intr_handle.fd > 0) {
+			ret =
+			    rte_intr_callback_unregister(&pmd->cc->intr_handle,
+							 memif_intr_handler,
+							 pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret =
+				    rte_intr_callback_unregister_pending(&pmd->
+							cc->intr_handle,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(pmd->cc->intr_handle.fd);
+				rte_free(pmd->cc);
+			}
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	struct memif_queue *mq;
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->tx_queues[i] : &pmd->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->rx_queues[i] : &pmd->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->pmd == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->pmd, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		for (i = 0; i < cc->pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->pmd, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < cc->pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->pmd, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < cc->pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->pmd, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->pmd, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->pmd, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->pmd, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->pmd, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->pmd, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->pmd == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	dev = rte_eth_dev_allocated(rte_vdev_device_name(cc->pmd->vdev));
+	if (dev == NULL) {
+		MIF_LOG(WARNING, "%s: eth dev not allocated",
+			rte_vdev_device_name(cc->pmd->vdev));
+		return;
+	}
+	memif_disconnect(dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->pmd = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret =
+	    rte_intr_callback_register(&cc->intr_handle, memif_intr_handler,
+				       cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL) {
+		rte_free(cc);
+		cc = NULL;
+	}
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->pmd_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_pmd_list_elt *elt;
+	int ret;
+	char key[256];
+
+	struct rte_hash *hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->pmd_queue, next) {
+		if (elt->pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_pmd_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->pmd = pmd;
+	TAILQ_INSERT_TAIL(&socket->pmd_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct pmd_internals *pmd)
+{
+	struct memif_socket *socket = NULL;
+	struct memif_socket_pmd_list_elt *elt, *next;
+
+	struct rte_hash *hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->pmd_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->pmd == pmd) {
+			TAILQ_REMOVE(&socket->pmd_queue, elt, next);
+			free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->pmd_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	if (pmd->rx_queues == NULL || pmd->tx_queues == NULL ||
+	    pmd->socket_filename == NULL) {
+		MIF_LOG(ERR, "%s: Device not configured!",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	if (pmd->rx_queues == NULL || pmd->tx_queues == NULL ||
+	    pmd->socket_filename == NULL) {
+		MIF_LOG(ERR, "%s: Device not configured!",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->pmd = pmd;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for controll fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..e67009404
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct pmd_internals *pmd);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_pmd_list_elt {
+	TAILQ_ENTRY(memif_socket_pmd_list_elt) next;
+	struct pmd_internals *pmd;		/**< pointer to device internals */
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_pmd_list_elt) pmd_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	 TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct pmd_internals *pmd;		/**< pointer to device internals */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..a6f6ea8bc
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..5540b10cc
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1189 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include <rte_eth_memif.h>
+#include <memif_socket.h>
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_NRXQ_ARG		"nrxq"
+#define ETH_MEMIF_NTXQ_ARG		"ntxq"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_NRXQ_ARG,
+	ETH_MEMIF_NTXQ_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+static struct rte_vdev_driver pmd_memif_drv;
+
+const char *
+memif_version(void)
+{
+#define STR_HELP(s)	#s
+#define STR(s)		STR_HELP(s)
+	return ("memif-" STR(MEMIF_VERSION_MAJOR) "." STR(MEMIF_VERSION_MINOR));
+#undef STR
+#undef STR_HELP
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	dev_info->if_index = pmd->if_index;
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ?
+	    pmd->cfg.num_m2s_rings : pmd->cfg.num_s2m_rings;
+	dev_info->max_tx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ?
+	    pmd->cfg.num_s2m_rings : pmd->cfg.num_m2s_rings;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0].addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+	p += (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return (pmd->regions[d->region].addr + d->offset);
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	memif_ring_t *ring = mq->ring;
+	if (unlikely(ring == NULL))
+		return 0;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+	    RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head = NULL;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		uint64_t b;
+		ssize_t size __rte_unused;
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+	}
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+ next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
+
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				rte_pktmbuf_chain(mbuf_head, mbuf);
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_pkt_len(mbuf) =
+			    rte_pktmbuf_data_len(mbuf) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+ no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+ refill:
+	if (type == MEMIF_RING_M2S) {
+		uint16_t head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	memif_ring_t *ring = mq->ring;
+	if (unlikely(ring == NULL))
+		return 0;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_free && n_tx_pkts < nb_pkts) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len =
+		    (type ==
+		     MEMIF_RING_S2M) ? pmd->run.buffer_size : d0->length;
+
+ next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy(memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+ no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		uint64_t a = 1;
+		ssize_t size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt on qid %ld: %s",
+				rte_vdev_device_name(pmd->vdev),
+				mq - pmd->tx_queues, strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions + i;
+		if (r == NULL)
+			return;
+		if (r->addr == NULL)
+			return;
+		munmap(r->addr, r->region_size);
+		if (r->fd > 0) {
+			close(r->fd);
+			r->fd = -1;
+		}
+	}
+	rte_free(pmd->regions);
+}
+
+static int
+memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
+{
+	struct memif_region *r;
+	char shm_name[32];
+	int i;
+	int ret = 0;
+
+	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate regions.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	pmd->regions = r;
+	pmd->regions_num = brn + 1;
+
+	/*
+	 * Create shm for every region. Region 0 is reserved for descriptors.
+	 * Other regions contain buffers.
+	 */
+	for (i = 0; i < (brn + 1); i++) {
+		r = &pmd->regions[i];
+
+		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
+					       pmd->run.num_m2s_rings) *
+		    (sizeof(memif_ring_t) +
+		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;
+		r->region_size = (i == 0) ? r->buffer_offset :
+		    (uint32_t)(pmd->run.buffer_size *
+				(1 << pmd->run.log2_ring_size) *
+				(pmd->run.num_s2m_rings +
+				 pmd->run.num_m2s_rings));
+
+		memset(shm_name, 0, sizeof(char) * 32);
+		sprintf(shm_name, "memif region %d", i);
+
+		r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+		if (r->fd < 0) {
+			MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = ftruncate(r->fd, r->region_size);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		r->addr = mmap(NULL, r->region_size, PROT_READ |
+			       PROT_WRITE, MAP_SHARED, r->fd, 0);
+		if (r->addr == NULL) {
+			MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct pmd_internals *pmd)
+{
+	memif_ring_t *ring;
+	int i, j;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			uint16_t slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			uint16_t slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+}
+
+static void
+memif_init_queues(struct pmd_internals *pmd)
+{
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = &pmd->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (void *)mq->ring - (void *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = &pmd->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (void *)mq->ring - (void *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct pmd_internals *pmd)
+{
+	int ret;
+
+	ret = memif_alloc_regions(pmd, /* num of buffer regions */ 1);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(pmd);
+
+	memif_init_queues(pmd);
+
+	return 0;
+}
+
+int
+memif_connect(struct pmd_internals *pmd)
+{
+	struct rte_eth_dev *eth_dev =
+	    rte_eth_dev_allocated(rte_vdev_device_name(pmd->vdev));
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions + i;
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->tx_queues[i] : &pmd->rx_queues[i];
+		mq->ring = pmd->regions[mq->region].addr + mq->offset;
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* polling mode by default */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->rx_queues[i] : &pmd->tx_queues[i];
+		mq->ring = pmd->regions[mq->region].addr + mq->offset;
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* polling mode by default */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	eth_dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_realloc(pmd->tx_queues, sizeof(struct memif_queue) * (qid + 1),
+			 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc tx queue %u.",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	pmd->tx_queues = mq;
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_realloc(pmd->rx_queues, sizeof(struct memif_queue) * (qid + 1),
+			 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc rx queue %u.",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	pmd->rx_queues = mq;
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	uint8_t tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	uint8_t nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = &pmd->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = &pmd->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->tx_queues[i] :
+		    &pmd->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->rx_queues[i] :
+		    &pmd->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	/* Enable MEMIF interrupts. */
+	/* pmd->rx_queues[qid].ring->flags  &= ~MEMIF_RING_FLAG_MASK_INT; */
+
+	/*
+	 * TODO: Tell dpdk to use interrupt mode.
+	 *
+	 * return rte_intr_enable(&pmd->rx_queues[qid].intr_handle);
+	 */
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	/* Disable MEMIF interrupts. */
+	/* pmd->rx_queues[qid].ring->flags |= MEMIF_RING_FLAG_MASK_INT; */
+
+	/*
+	 * TODO: Tell dpdk to use polling mode.
+	 *
+	 * return rte_intr_disable(&pmd->rx_queues[qid].intr_handle);
+	 */
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size, uint8_t nrxq,
+	     uint8_t ntxq, uint16_t buffer_size, const char *secret,
+	     const char *eth_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->if_index = id;
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	if (log2_ring_size != 0)
+		pmd->cfg.log2_ring_size = log2_ring_size;
+	pmd->cfg.num_s2m_rings = ETH_MEMIF_DEFAULT_NRXQ;
+	pmd->cfg.num_m2s_rings = ETH_MEMIF_DEFAULT_NTXQ;
+
+	if (nrxq != 0) {
+		if (role == MEMIF_ROLE_SLAVE)
+			pmd->cfg.num_m2s_rings = nrxq;
+		else
+			pmd->cfg.num_s2m_rings = nrxq;
+	}
+	if (ntxq != 0) {
+		if (role == MEMIF_ROLE_SLAVE)
+			pmd->cfg.num_s2m_rings = ntxq;
+		else
+			pmd->cfg.num_m2s_rings = ntxq;
+	}
+
+	pmd->cfg.buffer_size = ETH_MEMIF_DEFAULT_BUFFER_SIZE;
+	if (buffer_size != 0)
+		pmd->cfg.buffer_size = buffer_size;
+
+	/* FIXME: generate mac? */
+	if (eth_addr == NULL)
+		eth_addr = ETH_MEMIF_DEFAULT_ETH_ADDR;
+
+	ret = sscanf(eth_addr, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &pmd->eth_addr.addr_bytes[0], &pmd->eth_addr.addr_bytes[1],
+	       &pmd->eth_addr.addr_bytes[2], &pmd->eth_addr.addr_bytes[3],
+	       &pmd->eth_addr.addr_bytes[4], &pmd->eth_addr.addr_bytes[5]);
+	if (ret != 6) {
+		MIF_LOG(WARNING, "%s: Failed to parse mac '%s'.",
+			rte_vdev_device_name(vdev), eth_addr);
+	}
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = &pmd->eth_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_nq(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *nq = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFF) {
+		MIF_LOG(ERR, "Invalid number of queues: %s.", value);
+		return -EINVAL;
+	}
+	*nq = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	unsigned int i;
+	struct rte_kvargs *kvlist;
+	const struct rte_kvargs_pair *pair;
+
+	const char *name = rte_vdev_device_name(vdev);
+
+	enum memif_role_t role;
+	memif_interface_id_t id;
+
+	uint16_t buffer_size;
+	memif_log2_ring_size_t log2_ring_size;
+	uint8_t nrxq, ntxq;
+	const char *socket_filename;
+	const char *eth_addr;
+	uint32_t flags;
+	const char *secret;
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* set default values */
+	role = MEMIF_ROLE_SLAVE;
+	flags = 0;
+	id = 0;
+	buffer_size = 2048;
+	log2_ring_size = 10;
+	nrxq = 1;
+	ntxq = 1;
+	socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	secret = NULL;
+	eth_addr = NULL;
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_ROLE_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+						 &memif_set_role, &role);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_ID_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+						 &memif_set_id, &id);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG) == 1) {
+			ret =
+			    rte_kvargs_process(kvlist,
+					       ETH_MEMIF_BUFFER_SIZE_ARG,
+					       &memif_set_bs, &buffer_size);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_RING_SIZE_ARG) == 1) {
+			ret =
+			    rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					       &memif_set_rs, &log2_ring_size);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_NRXQ_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_NRXQ_ARG,
+						 &memif_set_nq, &nrxq);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_NTXQ_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_NTXQ_ARG,
+						 &memif_set_nq, &ntxq);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_SOCKET_ARG) == 1) {
+			for (i = 0; i < kvlist->count; i++) {
+				pair = &kvlist->pairs[i];
+				if (strcmp(pair->key, ETH_MEMIF_SOCKET_ARG) == 0) {
+					socket_filename = pair->value;
+					ret = memif_check_socket_filename(
+						socket_filename);
+					if (ret < 0)
+						goto exit;
+				}
+			}
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_MAC_ARG) == 1) {
+			for (i = 0; i < kvlist->count; i++) {
+				pair = &kvlist->pairs[i];
+				if (strcmp(pair->key, ETH_MEMIF_MAC_ARG) == 0)
+					eth_addr = pair->value;
+			}
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_ZC_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+						 &memif_set_zc, &flags);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_SECRET_ARG) == 1) {
+			for (i = 0; i < kvlist->count; i++) {
+				pair = &kvlist->pairs[i];
+				if (strcmp(pair->key, ETH_MEMIF_SECRET_ARG) == 0)
+					secret = pair->value;
+			}
+		}
+	}
+
+	/* create interface */
+	ret =
+	    memif_create(vdev, role, id, flags, socket_filename, log2_ring_size,
+			 nrxq, ntxq, buffer_size, secret, eth_addr);
+
+ exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	struct pmd_internals *pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Invalid message size", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(pmd);
+
+	pmd->vdev = NULL;
+
+	rte_free(eth_dev->data->dev_private);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+RTE_PMD_REGISTER_ALIAS(net_memif, eth_memif);
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=<string>"
+			      ETH_MEMIF_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_NRXQ_ARG "=<int>"
+			      ETH_MEMIF_NTXQ_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=<string>"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..e122b2bb7
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include <memif.h>
+
+/* generate mac? */
+#define ETH_MEMIF_DEFAULT_ETH_ADDR		"01:ab:23:cd:45:ef"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_NRXQ			1
+#define ETH_MEMIF_DEFAULT_NTXQ			1
+#define ETH_MEMIF_DEFAULT_BUFFER_SIZE		2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		256
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_IDX		255
+
+int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER = 0,
+	MEMIF_ROLE_SLAVE = 1,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t buffer_offset;			/**< offset at which buffers start */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;		/**< offset at which the queue begins */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+	/*uint32_t *buffers;*/
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	int if_index;				/**< index */
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	struct ether_addr eth_addr;		/**< mac address */
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions;		/**< shared memory regions */
+	uint8_t regions_num;			/**< number of regions */
+
+	struct memif_queue *rx_queues;		/**< RX queues */
+	struct memif_queue *tx_queues;		/**< TX queues */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct pmd_internals *pmd);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct pmd_internals *pmd);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..97fd251df
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.02 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 980eec233..b0becbf31 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -21,6 +21,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 5699d979d..f236c5ebc 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -168,6 +168,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4 -ldl
 else
-- 
2.17.1

             reply	other threads:[~2018-12-13 13:31 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-13 13:30 Jakub Grajciar [this message]
2018-12-13 18:07 ` [RFC v3] /net: memory interface (memif) Stephen Hemminger
2018-12-14  9:39   ` Bruce Richardson
2018-12-14 16:12     ` Wiles, Keith
2019-01-04 17:16 ` Ferruh Yigit
2019-01-04 19:23 ` Stephen Hemminger
2019-01-04 19:27 ` Stephen Hemminger
2019-01-04 19:32 ` Stephen Hemminger
2019-02-20 11:52 ` [RFC v4] " Jakub Grajciar
2019-02-20 15:46   ` Stephen Hemminger
2019-02-20 16:17   ` Stephen Hemminger
2019-02-21 10:50   ` Rami Rosen
2019-02-27 17:04   ` Ferruh Yigit
2019-03-22 11:57   ` [RFC v5] " Jakub Grajciar
2019-03-25 20:58     ` Ferruh Yigit
2019-05-02 12:35       ` [dpdk-dev] " Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-03  4:27         ` Honnappa Nagarahalli
2019-05-06 11:00           ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-06 11:04             ` Damjan Marion (damarion)
2019-05-07 11:29               ` Honnappa Nagarahalli
2019-05-07 11:37                 ` Damjan Marion (damarion)
2019-05-08  7:53                   ` Honnappa Nagarahalli
2019-05-09  8:30     ` [dpdk-dev] [RFC v6] " Jakub Grajciar
2019-05-13 10:45       ` [dpdk-dev] [RFC v7] " Jakub Grajciar
2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
2019-05-16 15:18           ` Stephen Hemminger
2019-05-16 15:19           ` Stephen Hemminger
2019-05-16 15:21           ` Stephen Hemminger
2019-05-20  9:22             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-16 15:25           ` Stephen Hemminger
2019-05-16 15:28           ` Stephen Hemminger
2019-05-20 10:18           ` [dpdk-dev] [RFC v9] " Jakub Grajciar
2019-05-29 17:29             ` Ferruh Yigit
2019-05-30 12:38               ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
2019-05-31  7:43               ` Ye Xiaolong
2019-06-03 11:28                 ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-06-03 14:25                   ` Ye Xiaolong
2019-06-05 12:01                 ` Ferruh Yigit
2019-06-03 13:37               ` Aaron Conole
2019-06-05 11:55               ` Ferruh Yigit
2019-06-06  9:24                 ` Ferruh Yigit
2019-06-06 10:25                   ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-06-06 11:18                     ` Ferruh Yigit
2019-06-06  8:24               ` [dpdk-dev] [PATCH v11] " Jakub Grajciar
2019-06-06 11:38                 ` [dpdk-dev] [PATCH v12] " Jakub Grajciar
2019-06-06 14:07                   ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181213133051.18779-1-jgrajcia@cisco.com \
    --to=jgrajcia@cisco.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.