netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings
@ 2012-10-31 22:46 Sjur Brændeland
  2012-10-31 22:46 ` [RFC virtio-next 1/4] virtio: Move definitions to header file vring.h Sjur Brændeland
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Sjur Brændeland @ 2012-10-31 22:46 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, netdev, Linus Walleij, dmitry.tarnyagin,
	linux-kernel, virtualization, sjur

This patch-set introduces the CAIF Virtio Link layer. The purpose is to
communicate with a remote processor (a modem) over shared memory. Virtio
is used as the transport mechanism, and the Remoteproc framework provides
configuration and management of the Virtio rings and devices. The modem
and Linux host may be on the same SoC, or connected over a shared memory
interface such as LLI.

Zero-Copy data transport on the modem is primary goal for CAIF Virtio.
In order to achieve Zero-Copy the direction of the Virtio rings are
flipped in the RX direction. So we have implemented the Virtio
access-function similar to what is found in vhost.c.

The connected LTE-modem is a multi-processor system with an advanced
memory allocator on board. In order to provide zero-copy from the modem
to the connected Linux host, the direction of the Virtio rings are
reversed. This allows the modem to allocate data-buffers in RX
direction and pass them to the Linux host, and recycled buffers to be
sent back to the modem.

The option of providing pre-allocated buffers in RX direction has been
considered, but rejected. The allocation of data buffers happens deep
down in the network signaling stack on the LTE modem before it is known
what type of data is received. It may be data that is handled within the
modem and never sent to the Linux host, or IP traffic going to the host.
Pre-allocated Virtio buffers does not fit for this usage. Another issue
is that the network traffic pattern may vary, resulting in variation of
number and size of buffers allocated from the memory allocator. Dynamic
allocation is needed in order to utilize memory properly. Due to this,
we decided we had to implement "reversed" vrings. Reversed vrings allows
us to minimize the impact on the current memory allocator and buffer
handling on the modem. 

In order to implement reversed rings we have added functions for reading
descriptors from the available-ring and adding descriptors to the used-ring.
The internal data-structures in virtio_ring.c are moved into a new header
file so the data-structures can be accessed by caif_virtio.

The data buffers in TX direction are allocated using dma_alloc_coherent().
This allows memory to be allocated from the memory region shared between
the Host and modem.

In TX direction single linearized TX buffers are added to the vring. In
RX direction linearized frames are also used, but multiple descriptors may
be linked. This is done to allow maximum efficiency for the LTE modem.

This patch set is not yet fully tested and does not handle all negative
scenarios correctly. So at this stage we're primarily looking for review
comments related to the structure of the Virtio code. There are several
options on how to structure this, and feedback is welcomed.

Thanks,
Sjur

Sjur Brændeland (4):
  virtio: Move definitions to header file vring.h
  include/vring.h: Add support for reversed vritio rings.
  virtio_ring: Call callback function even when used ring is empty
  caif_virtio: Add CAIF over virtio

 drivers/net/caif/Kconfig               |    9 +
 drivers/net/caif/Makefile              |    3 +
 drivers/net/caif/caif_virtio.c         |  627 ++++++++++++++++++++++++++++++++
 drivers/remoteproc/remoteproc_virtio.c |    2 +-
 drivers/virtio/virtio_ring.c           |  102 +-----
 drivers/virtio/vring.h                 |  124 +++++++
 include/linux/virtio_ring.h            |    8 +-
 include/uapi/linux/virtio_ids.h        |    1 +
 8 files changed, 776 insertions(+), 100 deletions(-)
 create mode 100644 drivers/net/caif/caif_virtio.c
 create mode 100644 drivers/virtio/vring.h

-- 
1.7.9.5

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC virtio-next 1/4] virtio: Move definitions to header file vring.h
  2012-10-31 22:46 [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Sjur Brændeland
@ 2012-10-31 22:46 ` Sjur Brændeland
  2012-10-31 22:46 ` [RFC virtio-next 2/4] include/vring.h: Add support for reversed vritio rings Sjur Brændeland
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Sjur Brændeland @ 2012-10-31 22:46 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, netdev, Linus Walleij, dmitry.tarnyagin,
	linux-kernel, virtualization, sjur, Sjur Brændeland

From: Sjur Brændeland <sjur.brandeland@stericsson.com>

Move the vring_virtqueue structure, memory barrier and debug
macros out from virtio_ring.c to the new header file vring.h.
This is done in order to allow other kernel modules to access the
virtio internal data-structures.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
---
Tis patch triggers a couple of checkpatch warnings, but I've
chosen to do a clean copy and not do any corrections.

 drivers/virtio/virtio_ring.c |   96 +--------------------------------
 drivers/virtio/vring.h       |  121 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 122 insertions(+), 95 deletions(-)
 create mode 100644 drivers/virtio/vring.h

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ffd7e7d..9027af6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -23,101 +23,7 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/hrtimer.h>
-
-/* virtio guest is communicating with a virtual "device" that actually runs on
- * a host processor.  Memory barriers are used to control SMP effects. */
-#ifdef CONFIG_SMP
-/* Where possible, use SMP barriers which are more lightweight than mandatory
- * barriers, because mandatory barriers control MMIO effects on accesses
- * through relaxed memory I/O windows (which virtio-pci does not use). */
-#define virtio_mb(vq) \
-	do { if ((vq)->weak_barriers) smp_mb(); else mb(); } while(0)
-#define virtio_rmb(vq) \
-	do { if ((vq)->weak_barriers) smp_rmb(); else rmb(); } while(0)
-#define virtio_wmb(vq) \
-	do { if ((vq)->weak_barriers) smp_wmb(); else wmb(); } while(0)
-#else
-/* We must force memory ordering even if guest is UP since host could be
- * running on another CPU, but SMP barriers are defined to barrier() in that
- * configuration. So fall back to mandatory barriers instead. */
-#define virtio_mb(vq) mb()
-#define virtio_rmb(vq) rmb()
-#define virtio_wmb(vq) wmb()
-#endif
-
-#ifdef DEBUG
-/* For development, we want to crash whenever the ring is screwed. */
-#define BAD_RING(_vq, fmt, args...)				\
-	do {							\
-		dev_err(&(_vq)->vq.vdev->dev,			\
-			"%s:"fmt, (_vq)->vq.name, ##args);	\
-		BUG();						\
-	} while (0)
-/* Caller is supposed to guarantee no reentry. */
-#define START_USE(_vq)						\
-	do {							\
-		if ((_vq)->in_use)				\
-			panic("%s:in_use = %i\n",		\
-			      (_vq)->vq.name, (_vq)->in_use);	\
-		(_vq)->in_use = __LINE__;			\
-	} while (0)
-#define END_USE(_vq) \
-	do { BUG_ON(!(_vq)->in_use); (_vq)->in_use = 0; } while(0)
-#else
-#define BAD_RING(_vq, fmt, args...)				\
-	do {							\
-		dev_err(&_vq->vq.vdev->dev,			\
-			"%s:"fmt, (_vq)->vq.name, ##args);	\
-		(_vq)->broken = true;				\
-	} while (0)
-#define START_USE(vq)
-#define END_USE(vq)
-#endif
-
-struct vring_virtqueue
-{
-	struct virtqueue vq;
-
-	/* Actual memory layout for this queue */
-	struct vring vring;
-
-	/* Can we use weak barriers? */
-	bool weak_barriers;
-
-	/* Other side has made a mess, don't try any more. */
-	bool broken;
-
-	/* Host supports indirect buffers */
-	bool indirect;
-
-	/* Host publishes avail event idx */
-	bool event;
-
-	/* Head of free buffer list. */
-	unsigned int free_head;
-	/* Number we've added since last sync. */
-	unsigned int num_added;
-
-	/* Last used index we've seen. */
-	u16 last_used_idx;
-
-	/* How to notify other side. FIXME: commonalize hcalls! */
-	void (*notify)(struct virtqueue *vq);
-
-#ifdef DEBUG
-	/* They're supposed to lock for us. */
-	unsigned int in_use;
-
-	/* Figure out if their kicks are too delayed. */
-	bool last_add_time_valid;
-	ktime_t last_add_time;
-#endif
-
-	/* Tokens for callbacks. */
-	void *data[];
-};
-
-#define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
+#include "vring.h"
 
 /* Set up an indirect table of descriptors and add it to the queue. */
 static int vring_add_indirect(struct vring_virtqueue *vq,
diff --git a/drivers/virtio/vring.h b/drivers/virtio/vring.h
new file mode 100644
index 0000000..b997fc3
--- /dev/null
+++ b/drivers/virtio/vring.h
@@ -0,0 +1,121 @@
+/* Virtio ring implementation.
+ *
+ *  Copyright 2007 Rusty Russell IBM Corporation
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+#ifndef _LINUX_VIRTIO_RING_H_
+#define _LINUX_VIRTIO_RING_H_
+
+#include <linux/virtio_ring.h>
+#include <linux/virtio.h>
+
+struct vring_virtqueue
+{
+	struct virtqueue vq;
+
+	/* Actual memory layout for this queue */
+	struct vring vring;
+
+	/* Can we use weak barriers? */
+	bool weak_barriers;
+
+	/* Other side has made a mess, don't try any more. */
+	bool broken;
+
+	/* Host supports indirect buffers */
+	bool indirect;
+
+	/* Host publishes avail event idx */
+	bool event;
+
+	/* Number of free buffers */
+	unsigned int num_free;
+	/* Head of free buffer list. */
+	unsigned int free_head;
+	/* Number we've added since last sync. */
+	unsigned int num_added;
+
+	/* Last used index we've seen. */
+	u16 last_used_idx;
+
+	/* How to notify other side. FIXME: commonalize hcalls! */
+	void (*notify)(struct virtqueue *vq);
+
+#ifdef DEBUG
+	/* They're supposed to lock for us. */
+	unsigned int in_use;
+
+	/* Figure out if their kicks are too delayed. */
+	bool last_add_time_valid;
+	ktime_t last_add_time;
+#endif
+
+	/* Tokens for callbacks. */
+	void *data[];
+};
+#define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
+
+/* virtio guest is communicating with a virtual "device" that actually runs on
+ * a host processor.  Memory barriers are used to control SMP effects. */
+#ifdef CONFIG_SMP
+/* Where possible, use SMP barriers which are more lightweight than mandatory
+ * barriers, because mandatory barriers control MMIO effects on accesses
+ * through relaxed memory I/O windows (which virtio-pci does not use). */
+#define virtio_mb(vq) \
+	do { if ((vq)->weak_barriers) smp_mb(); else mb(); } while(0)
+#define virtio_rmb(vq) \
+	do { if ((vq)->weak_barriers) smp_rmb(); else rmb(); } while(0)
+#define virtio_wmb(vq) \
+	do { if ((vq)->weak_barriers) smp_wmb(); else wmb(); } while(0)
+#else
+/* We must force memory ordering even if guest is UP since host could be
+ * running on another CPU, but SMP barriers are defined to barrier() in that
+ * configuration. So fall back to mandatory barriers instead. */
+#define virtio_mb(vq) mb()
+#define virtio_rmb(vq) rmb()
+#define virtio_wmb(vq) wmb()
+#endif
+
+#ifdef DEBUG
+/* For development, we want to crash whenever the ring is screwed. */
+#define BAD_RING(_vq, fmt, args...)				\
+	do {							\
+		dev_err(&(_vq)->vq.vdev->dev,			\
+			"%s:"fmt, (_vq)->vq.name, ##args);	\
+		BUG();						\
+	} while (0)
+/* Caller is supposed to guarantee no reentry. */
+#define START_USE(_vq)						\
+	do {							\
+		if ((_vq)->in_use)				\
+			panic("%s:in_use = %i\n",		\
+			      (_vq)->vq.name, (_vq)->in_use);	\
+		(_vq)->in_use = __LINE__;			\
+	} while (0)
+#define END_USE(_vq) \
+	do { BUG_ON(!(_vq)->in_use); (_vq)->in_use = 0; } while(0)
+#else
+#define BAD_RING(_vq, fmt, args...)				\
+	do {							\
+		dev_err(&_vq->vq.vdev->dev,			\
+			"%s:"fmt, (_vq)->vq.name, ##args);	\
+		(_vq)->broken = true;				\
+	} while (0)
+#define START_USE(vq)
+#define END_USE(vq)
+#endif
+
+#endif
-- 
1.7.9.5

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC virtio-next 2/4] include/vring.h: Add support for reversed vritio rings.
  2012-10-31 22:46 [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Sjur Brændeland
  2012-10-31 22:46 ` [RFC virtio-next 1/4] virtio: Move definitions to header file vring.h Sjur Brændeland
@ 2012-10-31 22:46 ` Sjur Brændeland
  2012-10-31 22:46 ` [RFC virtio-next 3/4] virtio_ring: Call callback function even when used ring is empty Sjur Brændeland
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Sjur Brændeland @ 2012-10-31 22:46 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, netdev, Linus Walleij, dmitry.tarnyagin,
	linux-kernel, virtualization, sjur, Sjur Brændeland

From: Sjur Brændeland <sjur.brandeland@stericsson.com>

Add last avilable index to the vring_virtqueue structure,
this is done to prepare for implementation of the reversed vring.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
---
 drivers/virtio/vring.h |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/virtio/vring.h b/drivers/virtio/vring.h
index b997fc3..3b53961 100644
--- a/drivers/virtio/vring.h
+++ b/drivers/virtio/vring.h
@@ -51,6 +51,9 @@ struct vring_virtqueue
 	/* Last used index we've seen. */
 	u16 last_used_idx;
 
+	/* Last avail index seen. NOTE: Only used for reversed rings.*/
+	u16 last_avail_idx;
+
 	/* How to notify other side. FIXME: commonalize hcalls! */
 	void (*notify)(struct virtqueue *vq);
 
-- 
1.7.9.5

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC virtio-next 3/4] virtio_ring: Call callback function even when used ring is empty
  2012-10-31 22:46 [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Sjur Brændeland
  2012-10-31 22:46 ` [RFC virtio-next 1/4] virtio: Move definitions to header file vring.h Sjur Brændeland
  2012-10-31 22:46 ` [RFC virtio-next 2/4] include/vring.h: Add support for reversed vritio rings Sjur Brændeland
@ 2012-10-31 22:46 ` Sjur Brændeland
  2012-10-31 22:46 ` [RFC virtio-next 4/4] caif_virtio: Add CAIF over virtio Sjur Brændeland
  2012-11-01  7:41 ` [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Rusty Russell
  4 siblings, 0 replies; 12+ messages in thread
From: Sjur Brændeland @ 2012-10-31 22:46 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, netdev, Linus Walleij, dmitry.tarnyagin,
	linux-kernel, virtualization, sjur, Sjur Brændeland

From: Sjur Brændeland <sjur.brandeland@stericsson.com>

Enable option to force call of callback function even if
used ring is empty. This is needed for reversed vring.
Add a helper function  __vring_interrupt and add extra
boolean argument for forcing callback when interrupt is called.
The original vring_interrupt semantic and signature is
perserved.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
---
 drivers/remoteproc/remoteproc_virtio.c |    2 +-
 drivers/virtio/virtio_ring.c           |    6 +++---
 include/linux/virtio_ring.h            |    8 +++++++-
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c
index e7a4780..ddde863 100644
--- a/drivers/remoteproc/remoteproc_virtio.c
+++ b/drivers/remoteproc/remoteproc_virtio.c
@@ -63,7 +63,7 @@ irqreturn_t rproc_vq_interrupt(struct rproc *rproc, int notifyid)
 	if (!rvring || !rvring->vq)
 		return IRQ_NONE;
 
-	return vring_interrupt(0, rvring->vq);
+	return __vring_interrupt(0, rvring->vq, true);
 }
 EXPORT_SYMBOL(rproc_vq_interrupt);
 
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 9027af6..af85034 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -504,11 +504,11 @@ void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
 }
 EXPORT_SYMBOL_GPL(virtqueue_detach_unused_buf);
 
-irqreturn_t vring_interrupt(int irq, void *_vq)
+irqreturn_t __vring_interrupt(int irq, void *_vq, bool force)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
-	if (!more_used(vq)) {
+	if (!force && !more_used(vq)) {
 		pr_debug("virtqueue interrupt with no work for %p\n", vq);
 		return IRQ_NONE;
 	}
@@ -522,7 +522,7 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 
 	return IRQ_HANDLED;
 }
-EXPORT_SYMBOL_GPL(vring_interrupt);
+EXPORT_SYMBOL_GPL(__vring_interrupt);
 
 struct virtqueue *vring_new_virtqueue(unsigned int index,
 				      unsigned int num,
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index 63c6ea1..ccb7915 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -20,5 +20,11 @@ void vring_del_virtqueue(struct virtqueue *vq);
 /* Filter out transport-specific feature bits. */
 void vring_transport_features(struct virtio_device *vdev);
 
-irqreturn_t vring_interrupt(int irq, void *_vq);
+irqreturn_t __vring_interrupt(int irq, void *_vq, bool force);
+
+static inline irqreturn_t vring_interrupt(int irq, void *_vq)
+{
+	return __vring_interrupt(irq, _vq, false);
+}
+
 #endif /* _LINUX_VIRTIO_RING_H */
-- 
1.7.9.5

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC virtio-next 4/4] caif_virtio: Add CAIF over virtio
  2012-10-31 22:46 [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Sjur Brændeland
                   ` (2 preceding siblings ...)
  2012-10-31 22:46 ` [RFC virtio-next 3/4] virtio_ring: Call callback function even when used ring is empty Sjur Brændeland
@ 2012-10-31 22:46 ` Sjur Brændeland
  2012-11-01  7:41 ` [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Rusty Russell
  4 siblings, 0 replies; 12+ messages in thread
From: Sjur Brændeland @ 2012-10-31 22:46 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Vikram ARV, Michael S. Tsirkin, netdev, Linus Walleij,
	dmitry.tarnyagin, linux-kernel, virtualization, sjur,
	Sjur Brændeland

From: Sjur Brændeland <sjur.brandeland@stericsson.com>

Add the CAIF Virtio Link layer, used for communicating with a
modem over shared memory. Virtio is used as the transport mechanism.

In the TX direction the virtio rings are used in the normal fashion,
sending data in the available ring. But in the rx direction the
the we have flipped the direction of the virtio ring, and
implemented the virtio access-function similar to what is found
in vhost.c.

CAIF also uses the virtio configuration space for getting
configuration parameters such as headroom, tailroom etc.

Signed-off-by: Vikram ARV <vikram.arv@stericsson.com>
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
---
 drivers/net/caif/Kconfig        |    9 +
 drivers/net/caif/Makefile       |    3 +
 drivers/net/caif/caif_virtio.c  |  627 +++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/virtio_ids.h |    1 +
 4 files changed, 640 insertions(+)
 create mode 100644 drivers/net/caif/caif_virtio.c

diff --git a/drivers/net/caif/Kconfig b/drivers/net/caif/Kconfig
index abf4d7a..a01f617 100644
--- a/drivers/net/caif/Kconfig
+++ b/drivers/net/caif/Kconfig
@@ -47,3 +47,12 @@ config CAIF_HSI
        The caif low level driver for CAIF over HSI.
        Be aware that if you enable this then you also need to
        enable a low-level HSI driver.
+
+config CAIF_VIRTIO
+       tristate "CAIF virtio transport driver"
+       default n
+       depends on CAIF
+       depends on REMOTEPROC
+       select VIRTIO
+       ---help---
+       The caif driver for CAIF over Virtio.
diff --git a/drivers/net/caif/Makefile b/drivers/net/caif/Makefile
index 91dff86..d9ee26a 100644
--- a/drivers/net/caif/Makefile
+++ b/drivers/net/caif/Makefile
@@ -13,3 +13,6 @@ obj-$(CONFIG_CAIF_SHM) += caif_shm.o
 
 # HSI interface
 obj-$(CONFIG_CAIF_HSI) += caif_hsi.o
+
+# Virtio interface
+obj-$(CONFIG_CAIF_VIRTIO) += caif_virtio.o
diff --git a/drivers/net/caif/caif_virtio.c b/drivers/net/caif/caif_virtio.c
new file mode 100644
index 0000000..e50940f
--- /dev/null
+++ b/drivers/net/caif/caif_virtio.c
@@ -0,0 +1,627 @@
+/*
+ * Copyright (C) ST-Ericsson AB 2012
+ * Contact: Sjur Brendeland / sjur.brandeland@stericsson.com
+ * Authors: Vicram Arv / vikram.arv@stericsson.com,
+ *	    Dmitry Tarnyagin / dmitry.tarnyagin@stericsson.com
+ *	    Sjur Brendeland / sjur.brandeland@stericsson.com
+ * License terms: GNU General Public License (GPL) version 2
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ":" fmt
+#include <linux/module.h>
+#include <linux/virtio.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/dma-mapping.h>
+#include <linux/netdevice.h>
+#include <linux/if_arp.h>
+#include <linux/spinlock.h>
+#include <net/caif/caif_dev.h>
+#include <linux/virtio_caif.h>
+#include "../drivers/virtio/vring.h"
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Vicram Arv <vikram.arv@stericsson.com>");
+MODULE_DESCRIPTION("Virtio CAIF Driver");
+
+/*
+ * struct cfv_info - Caif Virtio control structure
+ * @cfdev:	caif common header
+ * @vdev:	Associated virtio device
+ * @vq_rx:	rx/downlink virtqueue
+ * @vq_tx:	tx/uplink virtqueue
+ * @ndev:	associated netdevice
+ * @queued_tx:	number of buffers queued in the tx virtqueue
+ * @watermark_tx: indicates number of buffers the tx queue
+ *		should shrink to to unblock datapath
+ * @tx_lock:	protects vq_tx to allow concurrent senders
+ * @tx_hr:	transmit headroom
+ * @rx_hr:	receive headroom
+ * @tx_tr:	transmit tailroom
+ * @rx_tr:	receive tailroom
+ * @mtu:	transmit max size
+ * @mru:	receive max size
+ */
+struct cfv_info {
+	struct caif_dev_common cfdev;
+	struct virtio_device *vdev;
+	struct virtqueue *vq_rx;
+	struct virtqueue *vq_tx;
+	struct net_device *ndev;
+	unsigned int queued_tx;
+	unsigned int watermark_tx;
+	/* Protect access to vq_tx */
+	spinlock_t tx_lock;
+	/* Copied from Virtio config space */
+	u16 tx_hr;
+	u16 rx_hr;
+	u16 tx_tr;
+	u16 rx_tr;
+	u32 mtu;
+	u32 mru;
+};
+
+/*
+ * struct token_info - maintains Transmit buffer data handle
+ * @size:	size of transmit buffer
+ * @dma_handle: handle to allocated dma device memory area
+ * @vaddr:	virtual address mapping to allocated memory area
+ */
+struct token_info {
+	size_t size;
+	u8 *vaddr;
+	dma_addr_t dma_handle;
+};
+
+/* Default if virtio config space is unavailable */
+#define CFV_DEF_MTU_SIZE 4096
+#define CFV_DEF_HEADROOM 16
+#define CFV_DEF_TAILROOM 16
+
+/* Require IP header to be 4-byte aligned. */
+#define IP_HDR_ALIGN 4
+
+/*
+ * virtqueue_next_avail_desc - get the next available descriptor
+ * @_vq: the struct virtqueue we're talking about
+ * @head: index of the descriptor in the ring
+ *
+ * Look for the next available descriptor in the available ring.
+ * Return NULL if nothing new in the available.
+ */
+static struct vring_desc *virtqueue_next_avail_desc(struct virtqueue *_vq,
+						    int *head)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+	u16 avail_idx, hd, last_avail_idx = vq->last_avail_idx;
+
+	START_USE(vq);
+
+	if (unlikely(vq->broken))
+		goto err;
+
+	/* The barier secures observability of avail->idx after interrupt */
+	virtio_rmb(vq);
+
+	if (vq->last_avail_idx == vq->vring.avail->idx)
+		goto err;
+
+	avail_idx = vq->vring.avail->idx;
+	if (unlikely((u16)(avail_idx - last_avail_idx) > vq->vring.num)) {
+		BAD_RING(vq, "Avail index moved from %u to %u",
+			 last_avail_idx, avail_idx);
+		goto err;
+	}
+
+	/*
+	 * The barier secures observability of the ring content
+	 * after avail->idx update
+	 */
+	virtio_rmb(vq);
+
+	hd =  vq->vring.avail->ring[last_avail_idx & (vq->vring.num - 1)];
+	/* If their number is silly, that's an error. */
+	if (unlikely(hd >= vq->vring.num)) {
+		BAD_RING(vq, "Remote says index %d > %u is available",
+			 *head, vq->vring.num);
+		goto err;
+	}
+
+	END_USE(vq);
+	*head = hd;
+	return &vq->vring.desc[hd];
+err:
+	END_USE(vq);
+	*head = -1;
+	return NULL;
+}
+
+/*
+ * virtqueue_next_linked_desc - get next linked descriptor from the ring
+ * @_vq: the struct virtqueue we're talking about
+ * @desc: "current" descriptor
+ *
+ * Each buffer in the virtqueues is a chain of descriptors. This
+ * function returns the next descriptor in the chain,* or NULL if we're at
+ * the end.
+ *
+ * Side effect: the function increments vq->last_avail_idx if a non-linked
+ * descriptor is passed as &desc argument.
+ */
+static struct vring_desc *virtqueue_next_linked_desc(struct virtqueue *_vq,
+						     struct vring_desc *desc)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+	unsigned int next;
+
+	START_USE(vq);
+
+	/* If this descriptor says it doesn't chain, we're done. */
+	if (!(desc->flags & VRING_DESC_F_NEXT))
+		goto no_next;
+
+	next = desc->next;
+	/* Make sure compiler knows to grab that: we don't want it changing! */
+	/* We will use the result as an index in an array, so most
+	 * architectures only need a compiler barrier here.
+	 */
+	read_barrier_depends();
+
+	if (unlikely(next >= vq->vring.num)) {
+		BAD_RING(vq, "Desc index is %u > %u\n", next, vq->vring.num);
+		goto err;
+	}
+
+	desc = &vq->vring.desc[next];
+
+	if (desc->flags & VRING_DESC_F_INDIRECT) {
+		pr_err("Indirect descriptor not supported\n");
+		goto err;
+	}
+
+	END_USE(vq);
+	return desc;
+no_next:
+	vq->last_avail_idx++;
+err:
+	END_USE(vq);
+	return NULL;
+}
+
+/*
+ * virtqueue_add_buf_to_used - release a used descriptor
+ * @_vq: the struct virtqueue we're talking about
+ * @head: index of the descriptor to be released
+ * @len: number of linked descriptors in a chain
+ *
+ * The function releases a used descriptor in a reversed ring
+ */
+static int virtqueue_add_buf_to_used(struct virtqueue *_vq,
+				     unsigned int head, int len)
+{
+	struct vring_virtqueue *vr_vq = to_vvq(_vq);
+	struct vring_used_elem	*used;
+	int used_idx, err = -EINVAL;
+
+	START_USE(vr_vq);
+
+	if (unlikely(vr_vq->broken))
+		goto err;
+
+	if (unlikely(head >= vr_vq->vring.num)) {
+		BAD_RING(vr_vq, "Invalid head index (%u) > max desc idx (%u) ",
+			 head, vr_vq->vring.num - 1);
+		goto err;
+	}
+
+	/*
+	 * The virtqueue contains a ring of used buffers.  Get a pointer to the
+	 * next entry in that used ring.
+	 */
+	used_idx = (vr_vq->vring.used->idx & (vr_vq->vring.num - 1));
+	used = &vr_vq->vring.used->ring[used_idx];
+	used->id = head;
+	used->len = len;
+
+	/* Make sure buffer is written before we update index. */
+	virtio_wmb(vr_vq);
+	++vr_vq->vring.used->idx;
+	err = 0;
+err:
+	END_USE(vr_vq);
+	return err;
+
+}
+
+/*
+ * virtqueue_next_desc - get next available or linked descriptor
+ * @_vq: the struct virtqueue we're talking about
+ * @desc: "current" descriptor.
+ * @head: on return it is filled by the descriptor index in case of
+ *	available descriptor was returned, or -1 in case of linked
+ *	descriptor.
+ *
+ * The function is to be used as an iterator through received descriptors.
+ */
+static struct vring_desc *virtqueue_next_desc(struct virtqueue *_vq,
+					      struct vring_desc *desc,
+					      int *head)
+{
+	struct vring_desc *next = virtqueue_next_linked_desc(_vq, desc);
+
+	if (next == NULL) {
+		virtqueue_add_buf_to_used(_vq, *head, 0);
+		/* tell the remote processor to recycle buffer */
+		virtqueue_kick(_vq);
+		next = virtqueue_next_avail_desc(_vq, head);
+	} 
+	return next;
+}
+
+/*
+ * This is invoked whenever the remote processor completed processing
+ * a TX msg we just sent it, and the buffer is put back to the used ring.
+ */
+static void cfv_release_used_buf(struct virtqueue *vq_tx)
+{
+	struct cfv_info *cfv = vq_tx->vdev->priv;
+
+	BUG_ON(vq_tx != cfv->vq_tx);
+
+	for (;;) {
+		unsigned int len;
+		struct token_info *buf_info;
+
+		/* Get used buffer from used ring to recycle used descriptors */
+		spin_lock_bh(&cfv->tx_lock);
+		buf_info = virtqueue_get_buf(vq_tx, &len);
+
+		if (!buf_info)
+			goto out;
+
+		BUG_ON(!cfv->queued_tx);
+		if (--cfv->queued_tx < cfv->watermark_tx) {
+			cfv->watermark_tx = 0;
+			netif_tx_wake_all_queues(cfv->ndev);
+		}
+		spin_unlock_bh(&cfv->tx_lock);
+
+		dma_free_coherent(vq_tx->vdev->dev.parent->parent,
+				  buf_info->size, buf_info->vaddr,
+				  buf_info->dma_handle);
+		kfree(buf_info);
+	}
+	return;
+out:
+	spin_unlock_bh(&cfv->tx_lock);
+}
+
+static int cfv_read_desc(struct vring_desc *d,
+			 void **buf, size_t *size)
+{
+	if (d->flags & VRING_DESC_F_INDIRECT) {
+		pr_warn("Indirect descriptor not supported by CAIF\n");
+		return -EINVAL;
+	}
+
+	if (!(d->flags & VRING_DESC_F_WRITE)) {
+		pr_warn("Write descriptor not supported by CAIF\n");
+		/* CAIF expects a input descriptor here */
+		return -EINVAL;
+	}
+	*buf = phys_to_virt(d->addr);
+	*size = d->len;
+	return 0;
+}
+
+static struct sk_buff *cfv_alloc_and_copy_skb(struct cfv_info *cfv,
+					      u8 *frm, u32 frm_len)
+{
+	struct sk_buff *skb;
+	u32 cfpkt_len, pad_len;
+
+	/* Verify that packet size with down-link header and mtu size */
+	if (frm_len > cfv->mru || frm_len <= cfv->rx_hr + cfv->rx_tr) {
+		netdev_err(cfv->ndev,
+			   "Invalid frmlen:%u  mtu:%u hr:%d tr:%d\n",
+			   frm_len, cfv->mru,  cfv->rx_hr,
+			   cfv->rx_tr);
+		return NULL;
+	}
+
+	cfpkt_len = frm_len - (cfv->rx_hr + cfv->rx_tr);
+
+	pad_len = (unsigned long)(frm + cfv->rx_hr) & (IP_HDR_ALIGN - 1);
+
+	skb = netdev_alloc_skb(cfv->ndev, frm_len + pad_len);
+	if (!skb)
+		return NULL;
+
+	/* Reserve space for headers. */
+	skb_reserve(skb, cfv->rx_hr + pad_len);
+
+	memcpy(skb_put(skb, cfpkt_len), frm + cfv->rx_hr, cfpkt_len);
+	return skb;
+}
+
+/*
+ * This is invoked whenever the remote processor has sent down-link data
+ * on the Rx VQ avail ring and it's time to digest a message.
+ *
+ * CAIF virtio passes a complete CAIF frame including head/tail room
+ * in each linked descriptor. So iterate over all available buffers
+ * in available-ring and the associated linked descriptors.
+ */
+static void cfv_recv(struct virtqueue *vq_rx)
+{
+	struct cfv_info *cfv = vq_rx->vdev->priv;
+	struct vring_desc *desc;
+	struct sk_buff *skb;
+	int head = -1;
+	void *buf;
+	size_t len;
+
+	for (desc = virtqueue_next_avail_desc(vq_rx, &head);
+	     desc != NULL && !cfv_read_desc(desc, &buf, &len);
+	     desc = virtqueue_next_desc(vq_rx, desc, &head)) {
+
+		skb = cfv_alloc_and_copy_skb(cfv, buf, len);
+
+		if (!skb)
+			goto err;
+
+		skb->protocol = htons(ETH_P_CAIF);
+		skb_reset_mac_header(skb);
+		skb->dev = cfv->ndev;
+
+		/* Push received packet up the stack. */
+		if (netif_receive_skb(skb))
+			goto err;
+
+		++cfv->ndev->stats.rx_packets;
+		cfv->ndev->stats.rx_bytes += skb->len;
+	}
+	return;
+err:
+	++cfv->ndev->stats.rx_dropped;
+	return;
+}
+
+static int cfv_netdev_open(struct net_device *netdev)
+{
+	netif_carrier_on(netdev);
+	return 0;
+}
+
+static int cfv_netdev_close(struct net_device *netdev)
+{
+	netif_carrier_off(netdev);
+	return 0;
+}
+
+static struct token_info *cfv_alloc_and_copy_to_dmabuf(struct cfv_info *cfv,
+						       struct sk_buff *skb,
+						       struct scatterlist *sg)
+{
+	struct caif_payload_info *info = (void *)&skb->cb;
+	struct token_info *buf_info = NULL;
+	u8 pad_len, hdr_ofs;
+
+	if (unlikely(cfv->tx_hr + skb->len + cfv->tx_tr > cfv->mtu)) {
+		netdev_warn(cfv->ndev, "Invalid packet len (%d > %d)\n",
+			    cfv->tx_hr + skb->len + cfv->tx_tr, cfv->mtu);
+		goto err;
+	}
+
+	buf_info = kmalloc(sizeof(struct token_info), GFP_ATOMIC);
+	if (unlikely(!buf_info))
+		goto err;
+
+	/* Make the IP header aligned in tbe buffer */
+	hdr_ofs = cfv->tx_hr + info->hdr_len;
+	pad_len = hdr_ofs & (IP_HDR_ALIGN - 1);
+	buf_info->size = cfv->tx_hr + skb->len + cfv->tx_tr + pad_len;
+
+	if (WARN_ON_ONCE(cfv->vdev->dev.parent))
+		goto err;
+
+	/* allocate coherent memory for the buffers */
+	buf_info->vaddr =
+		dma_alloc_coherent(cfv->vdev->dev.parent->parent,
+				   buf_info->size, &buf_info->dma_handle,
+				   GFP_ATOMIC);
+	if (unlikely(!buf_info->vaddr)) {
+		netdev_warn(cfv->ndev,
+			    "Out of DMA memory (alloc %zu bytes)\n",
+			    buf_info->size);
+		goto err;
+	}
+
+	/* copy skbuf contents to send buffer */
+	skb_copy_bits(skb, 0, buf_info->vaddr + cfv->tx_hr + pad_len, skb->len);
+	sg_init_one(sg, buf_info->vaddr + pad_len,
+		    skb->len + cfv->tx_hr + cfv->rx_hr);
+	return buf_info;
+err:
+	kfree(buf_info);
+	return NULL;
+}
+
+/*
+ * This is invoked whenever the host processor application has sent up-link data.
+ * Send it in the TX VQ avail ring.
+ *
+ * CAIF Virtio sends does not use linked descriptors in the tx direction.
+ */
+static int cfv_netdev_tx(struct sk_buff *skb, struct net_device *netdev)
+{
+	struct cfv_info *cfv = netdev_priv(netdev);
+	struct token_info *buf_info;
+	struct scatterlist sg;
+	bool flow_off = false;
+
+	buf_info = cfv_alloc_and_copy_to_dmabuf(cfv, skb, &sg);
+	spin_lock_bh(&cfv->tx_lock);
+
+	/*
+	 * Add buffer to avail ring.
+	 * Note: in spite of a space check at beginning of the function,
+	 * the add_buff call might fail in case of concurrent access on smp
+	 * systems.
+	 */
+	if (WARN_ON(virtqueue_add_buf(cfv->vq_tx, &sg, 0, 1,
+				      buf_info, GFP_ATOMIC) < 0)) {
+		/* It should not happen */
+		++cfv->ndev->stats.tx_dropped;
+		flow_off = true;
+	} else {
+		/* update netdev statistics */
+		cfv->queued_tx++;
+		cfv->ndev->stats.tx_packets++;
+		cfv->ndev->stats.tx_bytes += skb->len;
+	}
+
+	/*
+	 * Flow-off check takes into account number of cpus to make sure
+	 * virtqueue will not be overfilled in any possible smp conditions.
+	 */
+	flow_off = cfv->queued_tx + num_present_cpus() >=
+		virtqueue_get_vring_size(cfv->vq_tx);
+
+	/* tell the remote processor it has a pending message to read */
+	virtqueue_kick(cfv->vq_tx);
+
+	if (flow_off) {
+		cfv->watermark_tx = cfv->queued_tx >> 1;
+		netif_tx_stop_all_queues(netdev);
+	}
+
+	spin_unlock_bh(&cfv->tx_lock);
+
+	dev_kfree_skb(skb);
+
+	/* Try to speculatively free used buffers */
+	if (flow_off)
+		cfv_release_used_buf(cfv->vq_tx);
+
+	return NETDEV_TX_OK;
+}
+
+static const struct net_device_ops cfv_netdev_ops = {
+	.ndo_open = cfv_netdev_open,
+	.ndo_stop = cfv_netdev_close,
+	.ndo_start_xmit = cfv_netdev_tx,
+};
+
+static void cfv_netdev_setup(struct net_device *netdev)
+{
+	netdev->netdev_ops = &cfv_netdev_ops;
+	netdev->type = ARPHRD_CAIF;
+	netdev->tx_queue_len = 100;
+	netdev->flags = IFF_POINTOPOINT | IFF_NOARP;
+	netdev->mtu = CFV_DEF_MTU_SIZE;
+	netdev->destructor = free_netdev;
+}
+
+#define GET_VIRTIO_CONFIG_OPS(_v, _var, _f) \
+	((_v)->config->get(_v, offsetof(struct virtio_caif_transf_config, _f), \
+			   &_var, \
+			   FIELD_SIZEOF(struct virtio_caif_transf_config, _f)))
+
+static int __devinit cfv_probe(struct virtio_device *vdev)
+{
+	vq_callback_t *vq_cbs[] = { cfv_recv, cfv_release_used_buf };
+	const char *names[] = { "input", "output" };
+	const char *cfv_netdev_name = "cfvrt";
+	struct net_device *netdev;
+	struct virtqueue *vqs[2];
+	struct cfv_info *cfv;
+	int err = 0;
+
+	netdev = alloc_netdev(sizeof(struct cfv_info), cfv_netdev_name,
+			      cfv_netdev_setup);
+	if (!netdev)
+		return -ENOMEM;
+
+	cfv = netdev_priv(netdev);
+	cfv->vdev = vdev;
+	cfv->ndev = netdev;
+
+	spin_lock_init(&cfv->tx_lock);
+
+	/* Get two virtqueues, for tx/ul and rx/dl */
+	err = vdev->config->find_vqs(vdev, 2, vqs, vq_cbs, names);
+	if (err)
+		goto free_cfv;
+
+	cfv->vq_rx = vqs[0];
+	cfv->vq_tx = vqs[1];
+
+	if (vdev->config->get) {
+		GET_VIRTIO_CONFIG_OPS(vdev, cfv->tx_hr, headroom);
+		GET_VIRTIO_CONFIG_OPS(vdev, cfv->rx_hr, headroom);
+		GET_VIRTIO_CONFIG_OPS(vdev, cfv->tx_tr, tailroom);
+		GET_VIRTIO_CONFIG_OPS(vdev, cfv->rx_tr, tailroom);
+		GET_VIRTIO_CONFIG_OPS(vdev, cfv->mtu, mtu);
+		GET_VIRTIO_CONFIG_OPS(vdev, cfv->mru, mtu);
+	} else {
+		cfv->tx_hr = CFV_DEF_HEADROOM;
+		cfv->rx_hr = CFV_DEF_HEADROOM;
+		cfv->tx_tr = CFV_DEF_TAILROOM;
+		cfv->rx_tr = CFV_DEF_TAILROOM;
+		cfv->mtu = CFV_DEF_MTU_SIZE;
+		cfv->mru = CFV_DEF_MTU_SIZE;
+
+	}
+
+	vdev->priv = cfv;
+
+	netif_carrier_off(netdev);
+
+	/* register Netdev */
+	err = register_netdev(netdev);
+	if (err) {
+		dev_err(&vdev->dev, "Unable to register netdev (%d)\n", err);
+		goto vqs_del;
+	}
+
+	/* tell the remote processor it can start sending messages */
+	virtqueue_kick(cfv->vq_rx);
+	return 0;
+
+vqs_del:
+	vdev->config->del_vqs(cfv->vdev);
+free_cfv:
+	free_netdev(netdev);
+	return err;
+}
+
+static void __devexit cfv_remove(struct virtio_device *vdev)
+{
+	struct cfv_info *cfv = vdev->priv;
+	vdev->config->reset(vdev);
+	vdev->config->del_vqs(cfv->vdev);
+	unregister_netdev(cfv->ndev);
+}
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_CAIF, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+static unsigned int features[] = {
+};
+
+static struct virtio_driver caif_virtio_driver = {
+	.feature_table		= features,
+	.feature_table_size	= ARRAY_SIZE(features),
+	.driver.name		= KBUILD_MODNAME,
+	.driver.owner		= THIS_MODULE,
+	.id_table		= id_table,
+	.probe			= cfv_probe,
+	.remove			= cfv_remove,
+};
+
+module_driver(caif_virtio_driver, register_virtio_driver,
+	      unregister_virtio_driver);
+MODULE_DEVICE_TABLE(virtio, id_table);
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 270fb22..8ddad5a 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -37,5 +37,6 @@
 #define VIRTIO_ID_RPMSG		7 /* virtio remote processor messaging */
 #define VIRTIO_ID_SCSI		8 /* virtio scsi */
 #define VIRTIO_ID_9P		9 /* 9p virtio console */
+#define VIRTIO_ID_CAIF		12 /* virtio caif */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
-- 
1.7.9.5

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings
  2012-10-31 22:46 [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Sjur Brændeland
                   ` (3 preceding siblings ...)
  2012-10-31 22:46 ` [RFC virtio-next 4/4] caif_virtio: Add CAIF over virtio Sjur Brændeland
@ 2012-11-01  7:41 ` Rusty Russell
  2012-11-05 12:12   ` Sjur Brændeland
  4 siblings, 1 reply; 12+ messages in thread
From: Rusty Russell @ 2012-11-01  7:41 UTC (permalink / raw)
  To: Sjur Brændeland
  Cc: Michael S. Tsirkin, netdev, Linus Walleij, dmitry.tarnyagin,
	linux-kernel, virtualization, sjur

Sjur Brændeland <sjur@brendeland.net> writes:
> Zero-Copy data transport on the modem is primary goal for CAIF Virtio.
> In order to achieve Zero-Copy the direction of the Virtio rings are
> flipped in the RX direction. So we have implemented the Virtio
> access-function similar to what is found in vhost.c.

So, this adds another host-side virtqueue implementation.

Can we combine them together conveniently?  You pulled out more stuff
into vring.h which is a start, but it's a bit overloaded.

Perhaps we should separate the common fields into struct vring, and use
it to build:

        struct vring_guest {
                struct vring vr;
                u16 last_used_idx;
        };

        struct vring_host {
                struct vring vr;
                u16 last_avail_idx;
        };

I haven't looked closely at vhost to see what it wants, but I would
think we could share more code.

Cheers,
Rusty.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings
  2012-11-01  7:41 ` [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Rusty Russell
@ 2012-11-05 12:12   ` Sjur Brændeland
  2012-11-06  2:09     ` Rusty Russell
  0 siblings, 1 reply; 12+ messages in thread
From: Sjur Brændeland @ 2012-11-05 12:12 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, Linus Walleij, Ohad Ben-Cohen, linux-kernel,
	netdev, virtualization, dmitry.tarnyagin

Hi Rusty,

> So, this adds another host-side virtqueue implementation.
>
> Can we combine them together conveniently?  You pulled out more stuff
> into vring.h which is a start, but it's a bit overloaded.
> Perhaps we should separate the common fields into struct vring, and use
> it to build:
>
>         struct vring_guest {
>                 struct vring vr;
>                 u16 last_used_idx;
>         };
>
>         struct vring_host {
>                 struct vring vr;
>                 u16 last_avail_idx;
>         };
> I haven't looked closely at vhost to see what it wants, but I would
> think we could share more code.

I have played around with the code in vhost.c to explore your idea.
The main issue I run into is that vhost.c is accessing user data while my new
code does not. So I end up with some quirky code testing if the ring lives in
user memory or not.  Another issue is sparse warnings when
accessing user memory.

With your suggested changes I end up sharing about 100 lines of code.
So in sum, I feel this add more complexity than what we gain by sharing.

Below is an initial draft of the re-usable code. I added "is_uaccess" to struct
virtio_ring in order to know if the ring lives in user memory.

Let me know what you think.

[snip]
int virtqueue_add_used(struct vring_host *vr, unsigned int head, int len,
		    struct vring_used_elem  **used)
{
	/* The virtqueue contains a ring of used buffers.  Get a pointer to the
	 * next entry in that used ring. */
	*used = &vr->vring.used->ring[vr->last_used_idx % vr->vring.num];
	if (vr->is_uaccess) {
		if(unlikely(__put_user(head, &(*used)->id))) {
			pr_debug("Failed to write used id");
			return -EFAULT;
		}
		if (unlikely(__put_user(len, &(*used)->len))) {
			pr_debug("Failed to write used len");
			return -EFAULT;
		}
		smp_wmb();
		if (__put_user(vr->last_used_idx + 1,
			       &vr->vring.used->idx)) {
			pr_debug("Failed to increment used idx");
			return -EFAULT;
		}
	} else {
		(*used)->id = head;
		(*used)->len = len;
		smp_wmb();
		vr->vring.used->idx = vr->last_used_idx + 1;
	}
	vr->last_used_idx++;
	return 0;
}

/* Each buffer in the virtqueues is actually a chain of descriptors.  This
 * function returns the next descriptor in the chain,
 * or -1U if we're at the end. */
unsigned virtqueue_next_desc(struct vring_desc *desc)
{
	unsigned int next;

	/* If this descriptor says it doesn't chain, we're done. */
	if (!(desc->flags & VRING_DESC_F_NEXT))
		return -1U;

	/* Check they're not leading us off end of descriptors. */
	next = desc->next;
	/* Make sure compiler knows to grab that: we don't want it changing! */
	/* We will use the result as an index in an array, so most
	 * architectures only need a compiler barrier here. */
	read_barrier_depends();

	return next;
}

static int virtqueue_next_avail_desc(struct vring_host *vr)
{
	int head;
	u16 last_avail_idx;

	/* Check it isn't doing very strange things with descriptor numbers. */
	last_avail_idx = vr->last_avail_idx;
	if (vr->is_uaccess) {
		if (__get_user(vr->avail_idx, &vr->vring.avail->idx)) {
			pr_debug("Failed to access avail idx at %p\n",
				 &vr->vring.avail->idx);
			return -EFAULT;
		}
	} else
		vr->avail_idx = vr->vring.avail->idx;

	if (unlikely((u16)(vr->avail_idx - last_avail_idx) > vr->vring.num)) {
		pr_debug("Guest moved used index from %u to %u",
		       last_avail_idx, vr->avail_idx);
		return -EFAULT;
	}

	/* If there's nothing new since last we looked, return invalid. */
	if (vr->avail_idx == last_avail_idx)
		return vr->vring.num;

	/* Only get avail ring entries after they have been exposed by guest. */
	smp_rmb();

	/* Grab the next descriptor number they're advertising, and increment
	 * the index we've seen. */
	if (vr->is_uaccess) {
		if (unlikely(__get_user(head,
				&vr->vring.avail->ring[last_avail_idx
						       % vr->vring.num]))) {
			pr_debug("Failed to read head: idx %d address %p\n",
				 last_avail_idx,
				 &vr->vring.avail->ring[last_avail_idx %
							vr->vring.num]);
			return -EFAULT;
		}
	} else
		head = vr->vring.avail->ring[last_avail_idx % vr->vring.num];

	/* If their number is silly, that's an error. */
	if (unlikely(head >= vr->vring.num)) {
		pr_debug("Guest says index %u > %u is available",
		       head, vr->vring.num);
		return -EINVAL;
	}

	return head;
}

Thanks,
Sjur

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings
  2012-11-05 12:12   ` Sjur Brændeland
@ 2012-11-06  2:09     ` Rusty Russell
       [not found]       ` <1354718230-4486-1-git-send-email-sjur@brendeland.net>
  0 siblings, 1 reply; 12+ messages in thread
From: Rusty Russell @ 2012-11-06  2:09 UTC (permalink / raw)
  To: Sjur Brændeland
  Cc: Michael S. Tsirkin, netdev, Linus Walleij, dmitry.tarnyagin,
	linux-kernel, virtualization

Sjur Brændeland <sjurbr@gmail.com> writes:
> Hi Rusty,
>
>> So, this adds another host-side virtqueue implementation.
>>
>> Can we combine them together conveniently?  You pulled out more stuff
>> into vring.h which is a start, but it's a bit overloaded.
>> Perhaps we should separate the common fields into struct vring, and use
>> it to build:
>>
>>         struct vring_guest {
>>                 struct vring vr;
>>                 u16 last_used_idx;
>>         };
>>
>>         struct vring_host {
>>                 struct vring vr;
>>                 u16 last_avail_idx;
>>         };
>> I haven't looked closely at vhost to see what it wants, but I would
>> think we could share more code.
>
> I have played around with the code in vhost.c to explore your idea.
> The main issue I run into is that vhost.c is accessing user data while my new
> code does not. So I end up with some quirky code testing if the ring lives in
> user memory or not.  Another issue is sparse warnings when
> accessing user memory.

Sparse is a servant, not a master.  If that's the only thing stopping
us, we can ignore it (or hack around it).

> With your suggested changes I end up sharing about 100 lines of code.
> So in sum, I feel this add more complexity than what we gain by sharing.
>
> Below is an initial draft of the re-usable code. I added "is_uaccess" to struct
> virtio_ring in order to know if the ring lives in user memory.
>
> Let me know what you think.

Agreed, that's horrible...

Fortunately, recent GCCs will inline function pointers, so inlining this
and handing an accessor function gets optimized away.

I would really like this, because I'd love to have a config option to do
strict checking on the format of these things (similar to my recently
posted CONFIG_VIRTIO_DEVICE_TORTURE patch).

See below.

> int virtqueue_add_used(struct vring_host *vr, unsigned int head, int len,
> 		    struct vring_used_elem  **used)
> {
> 	/* The virtqueue contains a ring of used buffers.  Get a pointer to the
> 	 * next entry in that used ring. */
> 	*used = &vr->vring.used->ring[vr->last_used_idx % vr->vring.num];
> 	if (vr->is_uaccess) {
> 		if(unlikely(__put_user(head, &(*used)->id))) {
> 			pr_debug("Failed to write used id");
> 			return -EFAULT;
> 		}
> 		if (unlikely(__put_user(len, &(*used)->len))) {
> 			pr_debug("Failed to write used len");
> 			return -EFAULT;
> 		}
> 		smp_wmb();
> 		if (__put_user(vr->last_used_idx + 1,
> 			       &vr->vring.used->idx)) {
> 			pr_debug("Failed to increment used idx");
> 			return -EFAULT;
> 		}
> 	} else {
> 		(*used)->id = head;
> 		(*used)->len = len;
> 		smp_wmb();
> 		vr->vring.used->idx = vr->last_used_idx + 1;
> 	}
> 	vr->last_used_idx++;
> 	return 0;
> }

/* Untested! */
static inline bool in_kernel_put(u32 *dst, u32 v)
{
        *dst = v;
        return true;
}

static inline bool userspace_put(u32 *dst, u32 v)
{
	return __put_user(dst, v) == 0;
}

static inline struct vring_used_elem *vrh_add_used(struct vring_host *vr,
                                                   unsigned int head, u32 len,
                                                   bool (*put)(u32 *dst, u32 v))
{
        struct vring_used_elem *used;

	/* The virtqueue contains a ring of used buffers.  Get a pointer to the
	 * next entry in that used ring. */
	used = &vr->vring.used->ring[vr->last_used_idx % vr->vring.num];
        
	if (!put(&used->id, head) || !put(&used->len = len))
                return NULL;
	smp_wmb();
        if (!put(&vr->vring.used->idx, vr->last_used_idx + 1))
                return NULL;
	vr->last_used_idx++;
	return used;
}

Cheers,
Rusty.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.
       [not found]                   ` <20130110111117.GE13451@redhat.com>
@ 2013-01-10 22:48                     ` Rusty Russell
  2013-01-11  7:31                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Rusty Russell @ 2013-01-10 22:48 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, Linus Walleij, virtualization

"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Thu, Jan 10, 2013 at 09:00:55PM +1030, Rusty Russell wrote:
>> Not sure why vhost/net doesn't built a packet and feed it in
>> netif_rx_ni().  This is what tun seems to do, and with this code it
>> should be fairly optimal.
>
> Because we want to use NAPI.

Not quite what I was asking; it was more a question of why we're using a
raw socket, when we trivially have a complete skb already which we
should be able to feed to Linux like any network packet.

And that path is pretty well optimized...

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.
  2013-01-10 22:48                     ` [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio Rusty Russell
@ 2013-01-11  7:31                       ` Michael S. Tsirkin
  2013-01-12  0:20                         ` Rusty Russell
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2013-01-11  7:31 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Sjur Brændeland, Linus Walleij, virtualization, netdev

On Fri, Jan 11, 2013 at 09:18:33AM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Thu, Jan 10, 2013 at 09:00:55PM +1030, Rusty Russell wrote:
> >> Not sure why vhost/net doesn't built a packet and feed it in
> >> netif_rx_ni().  This is what tun seems to do, and with this code it
> >> should be fairly optimal.
> >
> > Because we want to use NAPI.
> 
> Not quite what I was asking; it was more a question of why we're using a
> raw socket, when we trivially have a complete skb already which we
> should be able to feed to Linux like any network packet.
> 
> And that path is pretty well optimized...
> 
> Cheers,
> Rusty.


Oh for some reason I thought you were talking about virtio.
I don't really understand what you are saying here - vhost
actually calls out to tun to build and submit the skb.

-- 
MST

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.
  2013-01-11  7:31                       ` Michael S. Tsirkin
@ 2013-01-12  0:20                         ` Rusty Russell
  2013-01-14 16:54                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Rusty Russell @ 2013-01-12  0:20 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, Linus Walleij, virtualization

"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Fri, Jan 11, 2013 at 09:18:33AM +1030, Rusty Russell wrote:
>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> > On Thu, Jan 10, 2013 at 09:00:55PM +1030, Rusty Russell wrote:
>> >> Not sure why vhost/net doesn't built a packet and feed it in
>> >> netif_rx_ni().  This is what tun seems to do, and with this code it
>> >> should be fairly optimal.
>> >
>> > Because we want to use NAPI.
>> 
>> Not quite what I was asking; it was more a question of why we're using a
>> raw socket, when we trivially have a complete skb already which we
>> should be able to feed to Linux like any network packet.
>
> Oh for some reason I thought you were talking about virtio.
> I don't really understand what you are saying here - vhost
> actually calls out to tun to build and submit the skb.

Ah, the fd is tun?  Seems a bit indirect; I wonder if there's room for
more optimization here...

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.
  2013-01-12  0:20                         ` Rusty Russell
@ 2013-01-14 16:54                           ` Michael S. Tsirkin
  0 siblings, 0 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2013-01-14 16:54 UTC (permalink / raw)
  To: Rusty Russell; +Cc: netdev, Linus Walleij, virtualization

On Sat, Jan 12, 2013 at 10:50:30AM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Fri, Jan 11, 2013 at 09:18:33AM +1030, Rusty Russell wrote:
> >> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >> > On Thu, Jan 10, 2013 at 09:00:55PM +1030, Rusty Russell wrote:
> >> >> Not sure why vhost/net doesn't built a packet and feed it in
> >> >> netif_rx_ni().  This is what tun seems to do, and with this code it
> >> >> should be fairly optimal.
> >> >
> >> > Because we want to use NAPI.
> >> 
> >> Not quite what I was asking; it was more a question of why we're using a
> >> raw socket, when we trivially have a complete skb already which we
> >> should be able to feed to Linux like any network packet.
> >
> > Oh for some reason I thought you were talking about virtio.
> > I don't really understand what you are saying here - vhost
> > actually calls out to tun to build and submit the skb.
> 
> Ah, the fd is tun?

It can be tun or macvtap. We also support a packet socket
backend though I don't know of any users, maybe this can
be dropped.

>  Seems a bit indirect; I wonder if there's room for
> more optimization here...
> 
> Cheers,
> Rusty.

Quite possibly. Using common data structures and code in tun and macvtap
would allow calling this code directly from vhost-net.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-01-14 16:54 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-31 22:46 [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Sjur Brændeland
2012-10-31 22:46 ` [RFC virtio-next 1/4] virtio: Move definitions to header file vring.h Sjur Brændeland
2012-10-31 22:46 ` [RFC virtio-next 2/4] include/vring.h: Add support for reversed vritio rings Sjur Brændeland
2012-10-31 22:46 ` [RFC virtio-next 3/4] virtio_ring: Call callback function even when used ring is empty Sjur Brændeland
2012-10-31 22:46 ` [RFC virtio-next 4/4] caif_virtio: Add CAIF over virtio Sjur Brændeland
2012-11-01  7:41 ` [RFC virtio-next 0/4] Introduce CAIF Virtio and reversed Vrings Rusty Russell
2012-11-05 12:12   ` Sjur Brændeland
2012-11-06  2:09     ` Rusty Russell
     [not found]       ` <1354718230-4486-1-git-send-email-sjur@brendeland.net>
     [not found]         ` <20121206102750.GF10837@redhat.com>
     [not found]           ` <877goc0wac.fsf@rustcorp.com.au>
     [not found]             ` <CAJK669bP41oBhJ=MB64NS21Ag7XO5WswuTiVKCFTb96nvmyBiw@mail.gmail.com>
     [not found]               ` <87pq1f2rj0.fsf@rustcorp.com.au>
     [not found]                 ` <87wqvl1g9s.fsf@rustcorp.com.au>
     [not found]                   ` <20130110111117.GE13451@redhat.com>
2013-01-10 22:48                     ` [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio Rusty Russell
2013-01-11  7:31                       ` Michael S. Tsirkin
2013-01-12  0:20                         ` Rusty Russell
2013-01-14 16:54                           ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).