All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFT PATCH v2 0/6] Asynchronous UVC
@ 2018-01-09 13:09 ` Kieran Bingham
  0 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

The Linux UVC driver has long provided adequate performance capabilities for
web-cams and low data rate video devices in Linux while resolutions were low.

Modern USB cameras are now capable of high data rates thanks to USB3 with
1080p, and even 4k capture resolutions supported.

Cameras such as the Stereolabs ZED or the Logitech Brio can generate more data
than an embedded ARM core is able to process on a single core, resulting in
frame loss.

A large part of this performance impact is from the requirement to
‘memcpy’ frames out from URB packets to destination frames. This unfortunate
requirement is due to the UVC protocol allowing a variable length header, and
thus it is not possible to provide the target frame buffers directly.

Extra throughput is possible by moving the actual memcpy actions to a work
queue, and moving the memcpy out of interrupt context and allowing work tasks
to be scheduled across multiple cores.

This series has been tested on both the ZED and Brio cameras on arm64
platforms, however due to the intrinsic changes in the driver I would like to
see it tested with other devices and other platforms, so I'd appreciate if
anyone can test this on a range of USB cameras.

v2:
 - Fix up comments and issues raised by Guennadi

Kieran Bingham (6):
  uvcvideo: Refactor URB descriptors
  uvcvideo: Convert decode functions to use new context structure
  uvcvideo: Protect queue internals with helper
  uvcvideo: queue: Simplify spin-lock usage
  uvcvideo: queue: Support asynchronous buffer handling
  uvcvideo: Move decode processing to process context

 drivers/media/usb/uvc/uvc_isight.c |   4 +-
 drivers/media/usb/uvc/uvc_queue.c  | 114 ++++++++++++++----
 drivers/media/usb/uvc/uvc_video.c  | 189 ++++++++++++++++++++++--------
 drivers/media/usb/uvc/uvcvideo.h   |  56 +++++++--
 4 files changed, 285 insertions(+), 78 deletions(-)

base-commit: 6f0e5fd39143a59c22d60e7befc4f33f22aeed2f
-- 
git-series 0.9.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFT PATCH v2 0/6] Asynchronous UVC
@ 2018-01-09 13:09 ` Kieran Bingham
  0 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

The Linux UVC driver has long provided adequate performance capabilities for
web-cams and low data rate video devices in Linux while resolutions were low.

Modern USB cameras are now capable of high data rates thanks to USB3 with
1080p, and even 4k capture resolutions supported.

Cameras such as the Stereolabs ZED or the Logitech Brio can generate more data
than an embedded ARM core is able to process on a single core, resulting in
frame loss.

A large part of this performance impact is from the requirement to
‘memcpy’ frames out from URB packets to destination frames. This unfortunate
requirement is due to the UVC protocol allowing a variable length header, and
thus it is not possible to provide the target frame buffers directly.

Extra throughput is possible by moving the actual memcpy actions to a work
queue, and moving the memcpy out of interrupt context and allowing work tasks
to be scheduled across multiple cores.

This series has been tested on both the ZED and Brio cameras on arm64
platforms, however due to the intrinsic changes in the driver I would like to
see it tested with other devices and other platforms, so I'd appreciate if
anyone can test this on a range of USB cameras.

v2:
 - Fix up comments and issues raised by Guennadi

Kieran Bingham (6):
  uvcvideo: Refactor URB descriptors
  uvcvideo: Convert decode functions to use new context structure
  uvcvideo: Protect queue internals with helper
  uvcvideo: queue: Simplify spin-lock usage
  uvcvideo: queue: Support asynchronous buffer handling
  uvcvideo: Move decode processing to process context

 drivers/media/usb/uvc/uvc_isight.c |   4 +-
 drivers/media/usb/uvc/uvc_queue.c  | 114 ++++++++++++++----
 drivers/media/usb/uvc/uvc_video.c  | 189 ++++++++++++++++++++++--------
 drivers/media/usb/uvc/uvcvideo.h   |  56 +++++++--
 4 files changed, 285 insertions(+), 78 deletions(-)

base-commit: 6f0e5fd39143a59c22d60e7befc4f33f22aeed2f
-- 
git-series 0.9.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFT PATCH v2 3/6] uvcvideo: Protect queue internals with helper
  2018-01-09 13:09 ` Kieran Bingham
                   ` (3 preceding siblings ...)
  (?)
@ 2018-01-09 13:09 ` Kieran Bingham
  -1 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

The URB completion operation obtains the current buffer by reading
directly into the queue internal interface.

Protect this queue abstraction by providing a helper
uvc_queue_get_current_buffer() which can be used by both the decode
task, and the uvc_queue_next_buffer() functions.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

---

v2:
 - Fix coding style of conditional statements

 drivers/media/usb/uvc/uvc_queue.c | 33 +++++++++++++++++++++++++++-----
 drivers/media/usb/uvc/uvc_video.c |  7 +------
 drivers/media/usb/uvc/uvcvideo.h  |  2 ++-
 3 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_queue.c b/drivers/media/usb/uvc/uvc_queue.c
index c8d78b2f3de4..4a581d631525 100644
--- a/drivers/media/usb/uvc/uvc_queue.c
+++ b/drivers/media/usb/uvc/uvc_queue.c
@@ -399,6 +399,33 @@ void uvc_queue_cancel(struct uvc_video_queue *queue, int disconnect)
 	spin_unlock_irqrestore(&queue->irqlock, flags);
 }
 
+/*
+ * uvc_queue_get_current_buffer: Obtain the current working output buffer
+ *
+ * Buffers may span multiple packets, and even URBs, therefore the active buffer
+ * remains on the queue until the EOF marker.
+ */
+static struct uvc_buffer *
+__uvc_queue_get_current_buffer(struct uvc_video_queue *queue)
+{
+	if (list_empty(&queue->irqqueue))
+		return NULL;
+
+	return list_first_entry(&queue->irqqueue, struct uvc_buffer, queue);
+}
+
+struct uvc_buffer *uvc_queue_get_current_buffer(struct uvc_video_queue *queue)
+{
+	struct uvc_buffer *nextbuf;
+	unsigned long flags;
+
+	spin_lock_irqsave(&queue->irqlock, flags);
+	nextbuf = __uvc_queue_get_current_buffer(queue);
+	spin_unlock_irqrestore(&queue->irqlock, flags);
+
+	return nextbuf;
+}
+
 struct uvc_buffer *uvc_queue_next_buffer(struct uvc_video_queue *queue,
 		struct uvc_buffer *buf)
 {
@@ -415,11 +442,7 @@ struct uvc_buffer *uvc_queue_next_buffer(struct uvc_video_queue *queue,
 
 	spin_lock_irqsave(&queue->irqlock, flags);
 	list_del(&buf->queue);
-	if (!list_empty(&queue->irqqueue))
-		nextbuf = list_first_entry(&queue->irqqueue, struct uvc_buffer,
-					   queue);
-	else
-		nextbuf = NULL;
+	nextbuf = __uvc_queue_get_current_buffer(queue);
 	spin_unlock_irqrestore(&queue->irqlock, flags);
 
 	buf->state = buf->error ? UVC_BUF_STATE_ERROR : UVC_BUF_STATE_DONE;
diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index 92bd0952a66e..3878bec3276e 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1322,7 +1322,6 @@ static void uvc_video_complete(struct urb *urb)
 	struct uvc_streaming *stream = uvc_urb->stream;
 	struct uvc_video_queue *queue = &stream->queue;
 	struct uvc_buffer *buf = NULL;
-	unsigned long flags;
 	int ret;
 
 	switch (urb->status) {
@@ -1343,11 +1342,7 @@ static void uvc_video_complete(struct urb *urb)
 		return;
 	}
 
-	spin_lock_irqsave(&queue->irqlock, flags);
-	if (!list_empty(&queue->irqqueue))
-		buf = list_first_entry(&queue->irqqueue, struct uvc_buffer,
-				       queue);
-	spin_unlock_irqrestore(&queue->irqlock, flags);
+	buf = uvc_queue_get_current_buffer(queue);
 
 	stream->decode(uvc_urb, buf);
 
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 81a9a419a423..5caa1f4de3cb 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -692,6 +692,8 @@ extern int uvc_queue_streamon(struct uvc_video_queue *queue,
 extern int uvc_queue_streamoff(struct uvc_video_queue *queue,
 			       enum v4l2_buf_type type);
 extern void uvc_queue_cancel(struct uvc_video_queue *queue, int disconnect);
+extern struct uvc_buffer *
+		uvc_queue_get_current_buffer(struct uvc_video_queue *queue);
 extern struct uvc_buffer *uvc_queue_next_buffer(struct uvc_video_queue *queue,
 		struct uvc_buffer *buf);
 extern int uvc_queue_mmap(struct uvc_video_queue *queue,
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFT PATCH v2 6/6] uvcvideo: Move decode processing to process context
  2018-01-09 13:09 ` Kieran Bingham
                   ` (2 preceding siblings ...)
  (?)
@ 2018-01-09 13:09 ` Kieran Bingham
  2018-01-11 17:54   ` Kieran Bingham
  -1 siblings, 1 reply; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

Newer high definition cameras, and cameras with multiple lenses such as
the range of stereo-vision cameras now available have ever increasing
data rates.

The inclusion of a variable length packet header in URB packets mean
that we must memcpy the frame data out to our destination 'manually'.
This can result in data rates of up to 2 gigabits per second being
processed.

To improve efficiency, and maximise throughput, handle the URB decode
processing through a work queue to move it from interrupt context, and
allow multiple processors to work on URBs in parallel.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>

---
v2:
 - Lock full critical section of usb_submit_urb()

 drivers/media/usb/uvc/uvc_queue.c |  12 +++-
 drivers/media/usb/uvc/uvc_video.c | 111 +++++++++++++++++++++++++------
 drivers/media/usb/uvc/uvcvideo.h  |  24 +++++++-
 3 files changed, 129 insertions(+), 18 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_queue.c b/drivers/media/usb/uvc/uvc_queue.c
index 5a9987e547d3..598087eeb5c2 100644
--- a/drivers/media/usb/uvc/uvc_queue.c
+++ b/drivers/media/usb/uvc/uvc_queue.c
@@ -179,10 +179,22 @@ static void uvc_stop_streaming(struct vb2_queue *vq)
 	struct uvc_video_queue *queue = vb2_get_drv_priv(vq);
 	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
 
+	/* Prevent new buffers coming in. */
+	spin_lock_irq(&queue->irqlock);
+	queue->flags |= UVC_QUEUE_STOPPING;
+	spin_unlock_irq(&queue->irqlock);
+
+	/*
+	 * All pending work should be completed before disabling the stream, as
+	 * all URBs will be free'd during uvc_video_enable(s, 0).
+	 */
+	flush_workqueue(stream->async_wq);
+
 	uvc_video_enable(stream, 0);
 
 	spin_lock_irq(&queue->irqlock);
 	uvc_queue_return_buffers(queue, UVC_BUF_STATE_ERROR);
+	queue->flags &= ~UVC_QUEUE_STOPPING;
 	spin_unlock_irq(&queue->irqlock);
 }
 
diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index 3878bec3276e..a9ddc4f27012 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1058,21 +1058,67 @@ static int uvc_video_decode_start(struct uvc_streaming *stream,
 	return data[0];
 }
 
-static void uvc_video_decode_data(struct uvc_streaming *stream,
-		struct uvc_buffer *buf, const __u8 *data, int len)
+/*
+ * uvc_video_decode_data_work: Asynchronous memcpy processing
+ *
+ * Perform memcpy tasks in process context, with completion handlers
+ * to return the URB, and buffer handles.
+ *
+ * The work submitter must pre-determine that the work is safe
+ */
+static void uvc_video_decode_data_work(struct work_struct *work)
 {
-	unsigned int maxlen, nbytes;
-	void *mem;
+	struct uvc_urb *uvc_urb = container_of(work, struct uvc_urb, work);
+	struct uvc_streaming *stream = uvc_urb->stream;
+	struct uvc_video_queue *queue = &stream->queue;
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < uvc_urb->packets; i++) {
+		struct uvc_decode_op *op = &uvc_urb->decodes[i];
+
+		memcpy(op->dst, op->src, op->len);
+
+		/* Release reference taken on this buffer */
+		uvc_queue_buffer_release(op->buf);
+	}
+
+	/*
+	 * Prevent resubmitting URBs when shutting down to ensure that no new
+	 * work item will be scheduled after uvc_stop_streaming() flushes the
+	 * work queue.
+	 */
+	spin_lock_irq(&queue->irqlock);
+	if (!(queue->flags & UVC_QUEUE_STOPPING)) {
+		ret = usb_submit_urb(uvc_urb->urb, GFP_ATOMIC);
+		if (ret < 0)
+			uvc_printk(KERN_ERR,
+				   "Failed to resubmit video URB (%d).\n",
+				   ret);
+	}
+	spin_unlock_irq(&queue->irqlock);
+}
+
+static void uvc_video_decode_data(struct uvc_decode_op *decode,
+		struct uvc_urb *uvc_urb, struct uvc_buffer *buf,
+		const __u8 *data, int len)
+{
+	unsigned int maxlen;
 
 	if (len <= 0)
 		return;
 
-	/* Copy the video data to the buffer. */
 	maxlen = buf->length - buf->bytesused;
-	mem = buf->mem + buf->bytesused;
-	nbytes = min((unsigned int)len, maxlen);
-	memcpy(mem, data, nbytes);
-	buf->bytesused += nbytes;
+
+	/* Take a buffer reference for async work */
+	kref_get(&buf->ref);
+
+	decode->buf = buf;
+	decode->src = data;
+	decode->dst = buf->mem + buf->bytesused;
+	decode->len = min_t(unsigned int, len, maxlen);
+
+	buf->bytesused += decode->len;
 
 	/* Complete the current frame if the buffer size was exceeded. */
 	if (len > maxlen) {
@@ -1080,6 +1126,8 @@ static void uvc_video_decode_data(struct uvc_streaming *stream,
 		buf->error = 1;
 		buf->state = UVC_BUF_STATE_READY;
 	}
+
+	uvc_urb->packets++;
 }
 
 static void uvc_video_decode_end(struct uvc_streaming *stream,
@@ -1162,6 +1210,8 @@ static void uvc_video_decode_isoc(struct uvc_urb *uvc_urb,
 	int ret, i;
 
 	for (i = 0; i < urb->number_of_packets; ++i) {
+		struct uvc_decode_op *op = &uvc_urb->decodes[uvc_urb->packets];
+
 		if (urb->iso_frame_desc[i].status < 0) {
 			uvc_trace(UVC_TRACE_FRAME, "USB isochronous frame "
 				"lost (%d).\n", urb->iso_frame_desc[i].status);
@@ -1187,7 +1237,7 @@ static void uvc_video_decode_isoc(struct uvc_urb *uvc_urb,
 			continue;
 
 		/* Decode the payload data. */
-		uvc_video_decode_data(stream, buf, mem + ret,
+		uvc_video_decode_data(op, uvc_urb, buf, mem + ret,
 			urb->iso_frame_desc[i].actual_length - ret);
 
 		/* Process the header again. */
@@ -1248,9 +1298,12 @@ static void uvc_video_decode_bulk(struct uvc_urb *uvc_urb,
 	 * sure buf is never dereferenced if NULL.
 	 */
 
-	/* Process video data. */
-	if (!stream->bulk.skip_payload && buf != NULL)
-		uvc_video_decode_data(stream, buf, mem, len);
+	/* Prepare video data for processing. */
+	if (!stream->bulk.skip_payload && buf != NULL) {
+		struct uvc_decode_op *op = &uvc_urb->decodes[0];
+
+		uvc_video_decode_data(op, uvc_urb, buf, mem, len);
+	}
 
 	/* Detect the payload end by a URB smaller than the maximum size (or
 	 * a payload size equal to the maximum) and process the header again.
@@ -1322,7 +1375,8 @@ static void uvc_video_complete(struct urb *urb)
 	struct uvc_streaming *stream = uvc_urb->stream;
 	struct uvc_video_queue *queue = &stream->queue;
 	struct uvc_buffer *buf = NULL;
-	int ret;
+	unsigned long flags;
+	bool stopping;
 
 	switch (urb->status) {
 	case 0:
@@ -1342,14 +1396,30 @@ static void uvc_video_complete(struct urb *urb)
 		return;
 	}
 
+	/*
+	 * Simply accept and discard completed URBs without processing when the
+	 * stream is being shutdown. URBs will be freed as part of the
+	 * uvc_video_enable(s, 0) action, so we must not queue asynchronous
+	 * work based upon them.
+	 */
+	spin_lock_irqsave(&queue->irqlock, flags);
+	stopping = queue->flags & UVC_QUEUE_STOPPING;
+	spin_unlock_irqrestore(&queue->irqlock, flags);
+
+	if (stopping)
+		return;
+
 	buf = uvc_queue_get_current_buffer(queue);
 
+	/* Re-initialise the URB packet work */
+	uvc_urb->packets = 0;
+
+	/* Process the URB headers, but work is deferred to a work queue */
 	stream->decode(uvc_urb, buf);
 
-	if ((ret = usb_submit_urb(urb, GFP_ATOMIC)) < 0) {
-		uvc_printk(KERN_ERR, "Failed to resubmit video URB (%d).\n",
-			ret);
-	}
+	/* Handle any heavy lifting required */
+	INIT_WORK(&uvc_urb->work, uvc_video_decode_data_work);
+	queue_work(stream->async_wq, &uvc_urb->work);
 }
 
 /*
@@ -1620,6 +1690,11 @@ static int uvc_init_video(struct uvc_streaming *stream, gfp_t gfp_flags)
 
 	uvc_video_stats_start(stream);
 
+	stream->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI,
+			0);
+	if (!stream->async_wq)
+		return -ENOMEM;
+
 	if (intf->num_altsetting > 1) {
 		struct usb_host_endpoint *best_ep = NULL;
 		unsigned int best_psize = UINT_MAX;
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 6a18dbfc3e5b..13dc0b2bb8b9 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -411,6 +411,7 @@ struct uvc_buffer {
 
 #define UVC_QUEUE_DISCONNECTED		(1 << 0)
 #define UVC_QUEUE_DROP_CORRUPTED	(1 << 1)
+#define UVC_QUEUE_STOPPING		(1 << 2)
 
 struct uvc_video_queue {
 	struct vb2_queue queue;
@@ -483,12 +484,30 @@ struct uvc_stats_stream {
 };
 
 /**
+ * struct uvc_decode_op: Context structure to schedule asynchronous memcpy
+ *
+ * @buf: active buf object for this decode
+ * @dst: copy destination address
+ * @src: copy source address
+ * @len: copy length
+ */
+struct uvc_decode_op {
+	struct uvc_buffer *buf;
+	void *dst;
+	const __u8 *src;
+	int len;
+};
+
+/**
  * struct uvc_urb - URB context management structure
  *
  * @urb: the URB described by this context structure
  * @stream: UVC streaming context
  * @buffer: memory storage for the URB
  * @dma: DMA coherent addressing for the urb_buffer
+ * @packets: counter to indicate the number of copy operations
+ * @decodes: work descriptors for asynchronous copy operations
+ * @work: work queue entry for asynchronous decode
  */
 struct uvc_urb {
 	struct urb *urb;
@@ -496,6 +515,10 @@ struct uvc_urb {
 
 	char *buffer;
 	dma_addr_t dma;
+
+	unsigned int packets;
+	struct uvc_decode_op decodes[UVC_MAX_PACKETS];
+	struct work_struct work;
 };
 
 struct uvc_streaming {
@@ -528,6 +551,7 @@ struct uvc_streaming {
 	/* Buffers queue. */
 	unsigned int frozen : 1;
 	struct uvc_video_queue queue;
+	struct workqueue_struct *async_wq;
 	void (*decode)(struct uvc_urb *uvc_urb, struct uvc_buffer *buf);
 
 	/* Context data used by the bulk completion handler. */
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFT PATCH v2 1/6] uvcvideo: Refactor URB descriptors
  2018-01-09 13:09 ` Kieran Bingham
  (?)
  (?)
@ 2018-01-09 13:09 ` Kieran Bingham
  -1 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

We currently store three separate arrays for each URB reference we hold.

Objectify the data needed to track URBs into a single uvc_urb structure,
allowing better object management and tracking of the URB.

All accesses to the data pointers through stream, are converted to use a
uvc_urb pointer for consistency.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

---
v2:
 - Re-describe URB context structure
 - Re-name uvc_urb->{urb_buffer,urb_dma}{buffer,dma}

 drivers/media/usb/uvc/uvc_video.c | 49 +++++++++++++++++++-------------
 drivers/media/usb/uvc/uvcvideo.h  | 18 ++++++++++--
 2 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index 73cd44e8bd81..e57c5f52c73b 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1357,14 +1357,16 @@ static void uvc_free_urb_buffers(struct uvc_streaming *stream)
 	unsigned int i;
 
 	for (i = 0; i < UVC_URBS; ++i) {
-		if (stream->urb_buffer[i]) {
+		struct uvc_urb *uvc_urb = &stream->uvc_urb[i];
+
+		if (uvc_urb->buffer) {
 #ifndef CONFIG_DMA_NONCOHERENT
 			usb_free_coherent(stream->dev->udev, stream->urb_size,
-				stream->urb_buffer[i], stream->urb_dma[i]);
+					uvc_urb->buffer, uvc_urb->dma);
 #else
-			kfree(stream->urb_buffer[i]);
+			kfree(uvc_urb->buffer);
 #endif
-			stream->urb_buffer[i] = NULL;
+			uvc_urb->buffer = NULL;
 		}
 	}
 
@@ -1402,16 +1404,18 @@ static int uvc_alloc_urb_buffers(struct uvc_streaming *stream,
 	/* Retry allocations until one succeed. */
 	for (; npackets > 1; npackets /= 2) {
 		for (i = 0; i < UVC_URBS; ++i) {
+			struct uvc_urb *uvc_urb = &stream->uvc_urb[i];
+
 			stream->urb_size = psize * npackets;
 #ifndef CONFIG_DMA_NONCOHERENT
-			stream->urb_buffer[i] = usb_alloc_coherent(
+			uvc_urb->buffer = usb_alloc_coherent(
 				stream->dev->udev, stream->urb_size,
-				gfp_flags | __GFP_NOWARN, &stream->urb_dma[i]);
+				gfp_flags | __GFP_NOWARN, &uvc_urb->dma);
 #else
-			stream->urb_buffer[i] =
+			uvc_urb->buffer =
 			    kmalloc(stream->urb_size, gfp_flags | __GFP_NOWARN);
 #endif
-			if (!stream->urb_buffer[i]) {
+			if (!uvc_urb->buffer) {
 				uvc_free_urb_buffers(stream);
 				break;
 			}
@@ -1441,13 +1445,15 @@ static void uvc_uninit_video(struct uvc_streaming *stream, int free_buffers)
 	uvc_video_stats_stop(stream);
 
 	for (i = 0; i < UVC_URBS; ++i) {
-		urb = stream->urb[i];
+		struct uvc_urb *uvc_urb = &stream->uvc_urb[i];
+
+		urb = uvc_urb->urb;
 		if (urb == NULL)
 			continue;
 
 		usb_kill_urb(urb);
 		usb_free_urb(urb);
-		stream->urb[i] = NULL;
+		uvc_urb->urb = NULL;
 	}
 
 	if (free_buffers)
@@ -1502,6 +1508,8 @@ static int uvc_init_video_isoc(struct uvc_streaming *stream,
 	size = npackets * psize;
 
 	for (i = 0; i < UVC_URBS; ++i) {
+		struct uvc_urb *uvc_urb = &stream->uvc_urb[i];
+
 		urb = usb_alloc_urb(npackets, gfp_flags);
 		if (urb == NULL) {
 			uvc_uninit_video(stream, 1);
@@ -1514,12 +1522,12 @@ static int uvc_init_video_isoc(struct uvc_streaming *stream,
 				ep->desc.bEndpointAddress);
 #ifndef CONFIG_DMA_NONCOHERENT
 		urb->transfer_flags = URB_ISO_ASAP | URB_NO_TRANSFER_DMA_MAP;
-		urb->transfer_dma = stream->urb_dma[i];
+		urb->transfer_dma = uvc_urb->dma;
 #else
 		urb->transfer_flags = URB_ISO_ASAP;
 #endif
 		urb->interval = ep->desc.bInterval;
-		urb->transfer_buffer = stream->urb_buffer[i];
+		urb->transfer_buffer = uvc_urb->buffer;
 		urb->complete = uvc_video_complete;
 		urb->number_of_packets = npackets;
 		urb->transfer_buffer_length = size;
@@ -1529,7 +1537,7 @@ static int uvc_init_video_isoc(struct uvc_streaming *stream,
 			urb->iso_frame_desc[j].length = psize;
 		}
 
-		stream->urb[i] = urb;
+		uvc_urb->urb = urb;
 	}
 
 	return 0;
@@ -1568,21 +1576,22 @@ static int uvc_init_video_bulk(struct uvc_streaming *stream,
 		size = 0;
 
 	for (i = 0; i < UVC_URBS; ++i) {
+		struct uvc_urb *uvc_urb = &stream->uvc_urb[i];
+
 		urb = usb_alloc_urb(0, gfp_flags);
 		if (urb == NULL) {
 			uvc_uninit_video(stream, 1);
 			return -ENOMEM;
 		}
 
-		usb_fill_bulk_urb(urb, stream->dev->udev, pipe,
-			stream->urb_buffer[i], size, uvc_video_complete,
-			stream);
+		usb_fill_bulk_urb(urb, stream->dev->udev, pipe, uvc_urb->buffer,
+				  size, uvc_video_complete, stream);
 #ifndef CONFIG_DMA_NONCOHERENT
 		urb->transfer_flags = URB_NO_TRANSFER_DMA_MAP;
-		urb->transfer_dma = stream->urb_dma[i];
+		urb->transfer_dma = uvc_urb->dma;
 #endif
 
-		stream->urb[i] = urb;
+		uvc_urb->urb = urb;
 	}
 
 	return 0;
@@ -1673,7 +1682,9 @@ static int uvc_init_video(struct uvc_streaming *stream, gfp_t gfp_flags)
 
 	/* Submit the URBs. */
 	for (i = 0; i < UVC_URBS; ++i) {
-		ret = usb_submit_urb(stream->urb[i], gfp_flags);
+		struct uvc_urb *uvc_urb = &stream->uvc_urb[i];
+
+		ret = usb_submit_urb(uvc_urb->urb, gfp_flags);
 		if (ret < 0) {
 			uvc_printk(KERN_ERR, "Failed to submit URB %u "
 					"(%d).\n", i, ret);
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 19e725e2bda5..2a5dc7f09463 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -479,6 +479,20 @@ struct uvc_stats_stream {
 	unsigned int max_sof;		/* Maximum STC.SOF value */
 };
 
+/**
+ * struct uvc_urb - URB context management structure
+ *
+ * @urb: the URB described by this context structure
+ * @buffer: memory storage for the URB
+ * @dma: DMA coherent addressing for the urb_buffer
+ */
+struct uvc_urb {
+	struct urb *urb;
+
+	char *buffer;
+	dma_addr_t dma;
+};
+
 struct uvc_streaming {
 	struct list_head list;
 	struct uvc_device *dev;
@@ -521,9 +535,7 @@ struct uvc_streaming {
 		__u32 max_payload_size;
 	} bulk;
 
-	struct urb *urb[UVC_URBS];
-	char *urb_buffer[UVC_URBS];
-	dma_addr_t urb_dma[UVC_URBS];
+	struct uvc_urb uvc_urb[UVC_URBS];
 	unsigned int urb_size;
 
 	__u32 sequence;
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFT PATCH v2 5/6] uvcvideo: queue: Support asynchronous buffer handling
  2018-01-09 13:09 ` Kieran Bingham
                   ` (5 preceding siblings ...)
  (?)
@ 2018-01-09 13:09 ` Kieran Bingham
  -1 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

The buffer queue interface currently operates sequentially, processing
buffers after they have fully completed.

In preparation for supporting parallel tasks operating on the buffers,
we will need to support buffers being processed on multiple CPUs.

Adapt the uvc_queue_next_buffer() such that a reference count tracks the
active use of the buffer, returning the buffer to the VB2 stack at
completion.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
---
 drivers/media/usb/uvc/uvc_queue.c | 61 ++++++++++++++++++++++++++------
 drivers/media/usb/uvc/uvcvideo.h  |  4 ++-
 2 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_queue.c b/drivers/media/usb/uvc/uvc_queue.c
index ddac4d89a291..5a9987e547d3 100644
--- a/drivers/media/usb/uvc/uvc_queue.c
+++ b/drivers/media/usb/uvc/uvc_queue.c
@@ -131,6 +131,7 @@ static void uvc_buffer_queue(struct vb2_buffer *vb)
 
 	spin_lock_irqsave(&queue->irqlock, flags);
 	if (likely(!(queue->flags & UVC_QUEUE_DISCONNECTED))) {
+		kref_init(&buf->ref);
 		list_add_tail(&buf->queue, &queue->irqqueue);
 	} else {
 		/* If the device is disconnected return the buffer to userspace
@@ -424,28 +425,66 @@ struct uvc_buffer *uvc_queue_get_current_buffer(struct uvc_video_queue *queue)
 	return nextbuf;
 }
 
-struct uvc_buffer *uvc_queue_next_buffer(struct uvc_video_queue *queue,
+/*
+ * uvc_queue_requeue: Requeue a buffer on our internal irqqueue
+ *
+ * Reuse a buffer through our internal queue without the need to 'prepare'
+ * The buffer will be returned to userspace through the uvc_buffer_queue call if
+ * the device has been disconnected
+ */
+static void uvc_queue_requeue(struct uvc_video_queue *queue,
 		struct uvc_buffer *buf)
 {
-	struct uvc_buffer *nextbuf;
-	unsigned long flags;
+	buf->error = 0;
+	buf->state = UVC_BUF_STATE_QUEUED;
+	buf->bytesused = 0;
+	vb2_set_plane_payload(&buf->buf.vb2_buf, 0, 0);
+
+	uvc_buffer_queue(&buf->buf.vb2_buf);
+}
+
+static void uvc_queue_buffer_complete(struct kref *ref)
+{
+	struct uvc_buffer *buf = container_of(ref, struct uvc_buffer, ref);
+	struct vb2_buffer *vb = &buf->buf.vb2_buf;
+	struct uvc_video_queue *queue = vb2_get_drv_priv(vb->vb2_queue);
 
 	if ((queue->flags & UVC_QUEUE_DROP_CORRUPTED) && buf->error) {
-		buf->error = 0;
-		buf->state = UVC_BUF_STATE_QUEUED;
-		buf->bytesused = 0;
-		vb2_set_plane_payload(&buf->buf.vb2_buf, 0, 0);
-		return buf;
+		uvc_queue_requeue(queue, buf);
+		return;
 	}
 
+	buf->state = buf->error ? UVC_BUF_STATE_ERROR : UVC_BUF_STATE_DONE;
+	vb2_set_plane_payload(&buf->buf.vb2_buf, 0, buf->bytesused);
+	vb2_buffer_done(&buf->buf.vb2_buf, VB2_BUF_STATE_DONE);
+}
+
+/*
+ * Release a reference on the buffer. Complete the buffer when the last
+ * reference is released
+ */
+void uvc_queue_buffer_release(struct uvc_buffer *buf)
+{
+	kref_put(&buf->ref, uvc_queue_buffer_complete);
+}
+
+/*
+ * Remove this buffer from the queue. Lifetime will persist while async actions
+ * are still running (if any), and uvc_queue_buffer_release will give the buffer
+ * back to VB2 when all users have completed.
+ */
+struct uvc_buffer *uvc_queue_next_buffer(struct uvc_video_queue *queue,
+		struct uvc_buffer *buf)
+{
+	struct uvc_buffer *nextbuf;
+	unsigned long flags;
+
 	spin_lock_irqsave(&queue->irqlock, flags);
 	list_del(&buf->queue);
 	nextbuf = __uvc_queue_get_current_buffer(queue);
 	spin_unlock_irqrestore(&queue->irqlock, flags);
 
-	buf->state = buf->error ? UVC_BUF_STATE_ERROR : UVC_BUF_STATE_DONE;
-	vb2_set_plane_payload(&buf->buf.vb2_buf, 0, buf->bytesused);
-	vb2_buffer_done(&buf->buf.vb2_buf, VB2_BUF_STATE_DONE);
+	uvc_queue_buffer_release(buf);
 
 	return nextbuf;
 }
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 5caa1f4de3cb..6a18dbfc3e5b 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -404,6 +404,9 @@ struct uvc_buffer {
 	unsigned int bytesused;
 
 	u32 pts;
+
+	/* asynchronous buffer handling */
+	struct kref ref;
 };
 
 #define UVC_QUEUE_DISCONNECTED		(1 << 0)
@@ -696,6 +699,7 @@ extern struct uvc_buffer *
 		uvc_queue_get_current_buffer(struct uvc_video_queue *queue);
 extern struct uvc_buffer *uvc_queue_next_buffer(struct uvc_video_queue *queue,
 		struct uvc_buffer *buf);
+extern void uvc_queue_buffer_release(struct uvc_buffer *buf);
 extern int uvc_queue_mmap(struct uvc_video_queue *queue,
 		struct vm_area_struct *vma);
 extern unsigned int uvc_queue_poll(struct uvc_video_queue *queue,
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFT PATCH v2 2/6] uvcvideo: Convert decode functions to use new context structure
  2018-01-09 13:09 ` Kieran Bingham
                   ` (4 preceding siblings ...)
  (?)
@ 2018-01-09 13:09 ` Kieran Bingham
  -1 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

The URB completion handlers currently reference the stream context.

Now that each URB has its own context structure, convert the decode (and
one encode) functions to utilise this context for URB management.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

---
v2:
 - fix checkpatch warning (pre-existing in code)

 drivers/media/usb/uvc/uvc_isight.c |  4 +++-
 drivers/media/usb/uvc/uvc_video.c  | 32 ++++++++++++++++++++-----------
 drivers/media/usb/uvc/uvcvideo.h   |  8 ++++----
 3 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_isight.c b/drivers/media/usb/uvc/uvc_isight.c
index 8510e7259e76..433b8b4f96e2 100644
--- a/drivers/media/usb/uvc/uvc_isight.c
+++ b/drivers/media/usb/uvc/uvc_isight.c
@@ -99,9 +99,11 @@ static int isight_decode(struct uvc_video_queue *queue, struct uvc_buffer *buf,
 	return 0;
 }
 
-void uvc_video_decode_isight(struct urb *urb, struct uvc_streaming *stream,
+void uvc_video_decode_isight(struct uvc_urb *uvc_urb,
 		struct uvc_buffer *buf)
 {
+	struct urb *urb = uvc_urb->urb;
+	struct uvc_streaming *stream = uvc_urb->stream;
 	int ret, i;
 
 	for (i = 0; i < urb->number_of_packets; ++i) {
diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index e57c5f52c73b..92bd0952a66e 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1153,9 +1153,11 @@ static void uvc_video_validate_buffer(const struct uvc_streaming *stream,
 /*
  * Completion handler for video URBs.
  */
-static void uvc_video_decode_isoc(struct urb *urb, struct uvc_streaming *stream,
-	struct uvc_buffer *buf)
+static void uvc_video_decode_isoc(struct uvc_urb *uvc_urb,
+		struct uvc_buffer *buf)
 {
+	struct urb *urb = uvc_urb->urb;
+	struct uvc_streaming *stream = uvc_urb->stream;
 	u8 *mem;
 	int ret, i;
 
@@ -1199,9 +1201,11 @@ static void uvc_video_decode_isoc(struct urb *urb, struct uvc_streaming *stream,
 	}
 }
 
-static void uvc_video_decode_bulk(struct urb *urb, struct uvc_streaming *stream,
-	struct uvc_buffer *buf)
+static void uvc_video_decode_bulk(struct uvc_urb *uvc_urb,
+		struct uvc_buffer *buf)
 {
+	struct urb *urb = uvc_urb->urb;
+	struct uvc_streaming *stream = uvc_urb->stream;
 	u8 *mem;
 	int len, ret;
 
@@ -1266,9 +1270,12 @@ static void uvc_video_decode_bulk(struct urb *urb, struct uvc_streaming *stream,
 	}
 }
 
-static void uvc_video_encode_bulk(struct urb *urb, struct uvc_streaming *stream,
-	struct uvc_buffer *buf)
+static void uvc_video_encode_bulk(struct uvc_urb *uvc_urb,
+		struct uvc_buffer *buf)
 {
+	struct urb *urb = uvc_urb->urb;
+	struct uvc_streaming *stream = uvc_urb->stream;
+
 	u8 *mem = urb->transfer_buffer;
 	int len = stream->urb_size, ret;
 
@@ -1311,7 +1318,8 @@ static void uvc_video_encode_bulk(struct urb *urb, struct uvc_streaming *stream,
 
 static void uvc_video_complete(struct urb *urb)
 {
-	struct uvc_streaming *stream = urb->context;
+	struct uvc_urb *uvc_urb = urb->context;
+	struct uvc_streaming *stream = uvc_urb->stream;
 	struct uvc_video_queue *queue = &stream->queue;
 	struct uvc_buffer *buf = NULL;
 	unsigned long flags;
@@ -1341,7 +1349,7 @@ static void uvc_video_complete(struct urb *urb)
 				       queue);
 	spin_unlock_irqrestore(&queue->irqlock, flags);
 
-	stream->decode(urb, stream, buf);
+	stream->decode(uvc_urb, buf);
 
 	if ((ret = usb_submit_urb(urb, GFP_ATOMIC)) < 0) {
 		uvc_printk(KERN_ERR, "Failed to resubmit video URB (%d).\n",
@@ -1419,6 +1427,8 @@ static int uvc_alloc_urb_buffers(struct uvc_streaming *stream,
 				uvc_free_urb_buffers(stream);
 				break;
 			}
+
+			uvc_urb->stream = stream;
 		}
 
 		if (i == UVC_URBS) {
@@ -1517,7 +1527,7 @@ static int uvc_init_video_isoc(struct uvc_streaming *stream,
 		}
 
 		urb->dev = stream->dev->udev;
-		urb->context = stream;
+		urb->context = uvc_urb;
 		urb->pipe = usb_rcvisocpipe(stream->dev->udev,
 				ep->desc.bEndpointAddress);
 #ifndef CONFIG_DMA_NONCOHERENT
@@ -1584,8 +1594,8 @@ static int uvc_init_video_bulk(struct uvc_streaming *stream,
 			return -ENOMEM;
 		}
 
-		usb_fill_bulk_urb(urb, stream->dev->udev, pipe, uvc_urb->buffer,
-				  size, uvc_video_complete, stream);
+		usb_fill_bulk_urb(urb, stream->dev->udev, pipe,	uvc_urb->buffer,
+				  size, uvc_video_complete, uvc_urb);
 #ifndef CONFIG_DMA_NONCOHERENT
 		urb->transfer_flags = URB_NO_TRANSFER_DMA_MAP;
 		urb->transfer_dma = uvc_urb->dma;
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 2a5dc7f09463..81a9a419a423 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -483,11 +483,13 @@ struct uvc_stats_stream {
  * struct uvc_urb - URB context management structure
  *
  * @urb: the URB described by this context structure
+ * @stream: UVC streaming context
  * @buffer: memory storage for the URB
  * @dma: DMA coherent addressing for the urb_buffer
  */
 struct uvc_urb {
 	struct urb *urb;
+	struct uvc_streaming *stream;
 
 	char *buffer;
 	dma_addr_t dma;
@@ -523,8 +525,7 @@ struct uvc_streaming {
 	/* Buffers queue. */
 	unsigned int frozen : 1;
 	struct uvc_video_queue queue;
-	void (*decode) (struct urb *urb, struct uvc_streaming *video,
-			struct uvc_buffer *buf);
+	void (*decode)(struct uvc_urb *uvc_urb, struct uvc_buffer *buf);
 
 	/* Context data used by the bulk completion handler. */
 	struct {
@@ -780,8 +781,7 @@ extern struct usb_host_endpoint *uvc_find_endpoint(
 		struct usb_host_interface *alts, __u8 epaddr);
 
 /* Quirks support */
-void uvc_video_decode_isight(struct urb *urb, struct uvc_streaming *stream,
-		struct uvc_buffer *buf);
+void uvc_video_decode_isight(struct uvc_urb *uvc_urb, struct uvc_buffer *buf);
 
 /* debugfs and statistics */
 void uvc_debugfs_init(void);
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFT PATCH v2 4/6] uvcvideo: queue: Simplify spin-lock usage
  2018-01-09 13:09 ` Kieran Bingham
  (?)
@ 2018-01-09 13:09 ` Kieran Bingham
  -1 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-09 13:09 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: Guennadi Liakhovetski, laurent, Olivier BRAUN, Troy Kisky,
	Kieran Bingham

Both uvc_start_streaming(), and uvc_stop_streaming() are called from
userspace context. As such, they do not need to save the IRQ state, and
can use spin_lock_irq() and spin_unlock_irq() respectively.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
---
 drivers/media/usb/uvc/uvc_queue.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_queue.c b/drivers/media/usb/uvc/uvc_queue.c
index 4a581d631525..ddac4d89a291 100644
--- a/drivers/media/usb/uvc/uvc_queue.c
+++ b/drivers/media/usb/uvc/uvc_queue.c
@@ -158,7 +158,6 @@ static int uvc_start_streaming(struct vb2_queue *vq, unsigned int count)
 {
 	struct uvc_video_queue *queue = vb2_get_drv_priv(vq);
 	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
-	unsigned long flags;
 	int ret;
 
 	queue->buf_used = 0;
@@ -167,9 +166,9 @@ static int uvc_start_streaming(struct vb2_queue *vq, unsigned int count)
 	if (ret == 0)
 		return 0;
 
-	spin_lock_irqsave(&queue->irqlock, flags);
+	spin_lock_irq(&queue->irqlock);
 	uvc_queue_return_buffers(queue, UVC_BUF_STATE_QUEUED);
-	spin_unlock_irqrestore(&queue->irqlock, flags);
+	spin_unlock_irq(&queue->irqlock);
 
 	return ret;
 }
@@ -178,13 +177,12 @@ static void uvc_stop_streaming(struct vb2_queue *vq)
 {
 	struct uvc_video_queue *queue = vb2_get_drv_priv(vq);
 	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
-	unsigned long flags;
 
 	uvc_video_enable(stream, 0);
 
-	spin_lock_irqsave(&queue->irqlock, flags);
+	spin_lock_irq(&queue->irqlock);
 	uvc_queue_return_buffers(queue, UVC_BUF_STATE_ERROR);
-	spin_unlock_irqrestore(&queue->irqlock, flags);
+	spin_unlock_irq(&queue->irqlock);
 }
 
 static const struct vb2_ops uvc_queue_qops = {
-- 
git-series 0.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFT PATCH v2 6/6] uvcvideo: Move decode processing to process context
  2018-01-09 13:09 ` [RFT PATCH v2 6/6] uvcvideo: Move decode processing to process context Kieran Bingham
@ 2018-01-11 17:54   ` Kieran Bingham
  0 siblings, 0 replies; 9+ messages in thread
From: Kieran Bingham @ 2018-01-11 17:54 UTC (permalink / raw)
  To: linux-media, linux-kernel, Kieran Bingham
  Cc: Guennadi Liakhovetski, Laurent Pinchart, Olivier BRAUN, Troy Kisky

Hi Kieran !

Thanking you for the patch might be rather self serving here :D

On 09/01/18 13:09, Kieran Bingham wrote:
> Newer high definition cameras, and cameras with multiple lenses such as
> the range of stereo-vision cameras now available have ever increasing
> data rates.
> 
> The inclusion of a variable length packet header in URB packets mean
> that we must memcpy the frame data out to our destination 'manually'.
> This can result in data rates of up to 2 gigabits per second being
> processed.
> 
> To improve efficiency, and maximise throughput, handle the URB decode
> processing through a work queue to move it from interrupt context, and
> allow multiple processors to work on URBs in parallel.
> 
> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> 
> ---
> v2:
>  - Lock full critical section of usb_submit_urb()
> 
>  drivers/media/usb/uvc/uvc_queue.c |  12 +++-
>  drivers/media/usb/uvc/uvc_video.c | 111 +++++++++++++++++++++++++------
>  drivers/media/usb/uvc/uvcvideo.h  |  24 +++++++-
>  3 files changed, 129 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/media/usb/uvc/uvc_queue.c b/drivers/media/usb/uvc/uvc_queue.c
> index 5a9987e547d3..598087eeb5c2 100644
> --- a/drivers/media/usb/uvc/uvc_queue.c
> +++ b/drivers/media/usb/uvc/uvc_queue.c
> @@ -179,10 +179,22 @@ static void uvc_stop_streaming(struct vb2_queue *vq)
>  	struct uvc_video_queue *queue = vb2_get_drv_priv(vq);
>  	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
>  
> +	/* Prevent new buffers coming in. */
> +	spin_lock_irq(&queue->irqlock);
> +	queue->flags |= UVC_QUEUE_STOPPING;
> +	spin_unlock_irq(&queue->irqlock);
> +
> +	/*
> +	 * All pending work should be completed before disabling the stream, as
> +	 * all URBs will be free'd during uvc_video_enable(s, 0).
> +	 */
> +	flush_workqueue(stream->async_wq);
> +
>  	uvc_video_enable(stream, 0);
>  
>  	spin_lock_irq(&queue->irqlock);
>  	uvc_queue_return_buffers(queue, UVC_BUF_STATE_ERROR);
> +	queue->flags &= ~UVC_QUEUE_STOPPING;
>  	spin_unlock_irq(&queue->irqlock);
>  }
>  
> diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
> index 3878bec3276e..a9ddc4f27012 100644
> --- a/drivers/media/usb/uvc/uvc_video.c
> +++ b/drivers/media/usb/uvc/uvc_video.c
> @@ -1058,21 +1058,67 @@ static int uvc_video_decode_start(struct uvc_streaming *stream,
>  	return data[0];
>  }
>  
> -static void uvc_video_decode_data(struct uvc_streaming *stream,
> -		struct uvc_buffer *buf, const __u8 *data, int len)
> +/*
> + * uvc_video_decode_data_work: Asynchronous memcpy processing
> + *
> + * Perform memcpy tasks in process context, with completion handlers
> + * to return the URB, and buffer handles.
> + *
> + * The work submitter must pre-determine that the work is safe
> + */
> +static void uvc_video_decode_data_work(struct work_struct *work)

This handles asynchronous memory copy work - which could equally be encode work
- so _decode_ might not be an appropriate name here.

>  {
> -	unsigned int maxlen, nbytes;
> -	void *mem;
> +	struct uvc_urb *uvc_urb = container_of(work, struct uvc_urb, work);
> +	struct uvc_streaming *stream = uvc_urb->stream;
> +	struct uvc_video_queue *queue = &stream->queue;
> +	unsigned int i;
> +	int ret;
> +
> +	for (i = 0; i < uvc_urb->packets; i++) {
> +		struct uvc_decode_op *op = &uvc_urb->decodes[i];
> +
> +		memcpy(op->dst, op->src, op->len);
> +
> +		/* Release reference taken on this buffer */
> +		uvc_queue_buffer_release(op->buf);
> +	}
> +
> +	/*
> +	 * Prevent resubmitting URBs when shutting down to ensure that no new
> +	 * work item will be scheduled after uvc_stop_streaming() flushes the
> +	 * work queue.
> +	 */
> +	spin_lock_irq(&queue->irqlock);
> +	if (!(queue->flags & UVC_QUEUE_STOPPING)) {
> +		ret = usb_submit_urb(uvc_urb->urb, GFP_ATOMIC);
> +		if (ret < 0)
> +			uvc_printk(KERN_ERR,
> +				   "Failed to resubmit video URB (%d).\n",
> +				   ret);
> +	}
> +	spin_unlock_irq(&queue->irqlock);
> +}
> +
> +static void uvc_video_decode_data(struct uvc_decode_op *decode,
> +		struct uvc_urb *uvc_urb, struct uvc_buffer *buf,
> +		const __u8 *data, int len)
> +{
> +	unsigned int maxlen;
>  
>  	if (len <= 0)
>  		return;
>  
> -	/* Copy the video data to the buffer. */
>  	maxlen = buf->length - buf->bytesused;
> -	mem = buf->mem + buf->bytesused;
> -	nbytes = min((unsigned int)len, maxlen);
> -	memcpy(mem, data, nbytes);
> -	buf->bytesused += nbytes;
> +
> +	/* Take a buffer reference for async work */
> +	kref_get(&buf->ref);
> +
> +	decode->buf = buf;
> +	decode->src = data;
> +	decode->dst = buf->mem + buf->bytesused;
> +	decode->len = min_t(unsigned int, len, maxlen);
> +
> +	buf->bytesused += decode->len;
>  
>  	/* Complete the current frame if the buffer size was exceeded. */
>  	if (len > maxlen) {
> @@ -1080,6 +1126,8 @@ static void uvc_video_decode_data(struct uvc_streaming *stream,
>  		buf->error = 1;
>  		buf->state = UVC_BUF_STATE_READY;
>  	}
> +
> +	uvc_urb->packets++;
>  }
>  
>  static void uvc_video_decode_end(struct uvc_streaming *stream,
> @@ -1162,6 +1210,8 @@ static void uvc_video_decode_isoc(struct uvc_urb *uvc_urb,
>  	int ret, i;
>  
>  	for (i = 0; i < urb->number_of_packets; ++i) {
> +		struct uvc_decode_op *op = &uvc_urb->decodes[uvc_urb->packets];
> +

This structure is only ever used by the uvc_video_decode_data() function, and it
could be obtained in there directly.


>  		if (urb->iso_frame_desc[i].status < 0) {
>  			uvc_trace(UVC_TRACE_FRAME, "USB isochronous frame "
>  				"lost (%d).\n", urb->iso_frame_desc[i].status);
> @@ -1187,7 +1237,7 @@ static void uvc_video_decode_isoc(struct uvc_urb *uvc_urb,
>  			continue;
>  
>  		/* Decode the payload data. */
> -		uvc_video_decode_data(stream, buf, mem + ret,
> +		uvc_video_decode_data(op, uvc_urb, buf, mem + ret,
>  			urb->iso_frame_desc[i].actual_length - ret);
>  
>  		/* Process the header again. */
> @@ -1248,9 +1298,12 @@ static void uvc_video_decode_bulk(struct uvc_urb *uvc_urb,
>  	 * sure buf is never dereferenced if NULL.
>  	 */
>  
> -	/* Process video data. */
> -	if (!stream->bulk.skip_payload && buf != NULL)
> -		uvc_video_decode_data(stream, buf, mem, len);
> +	/* Prepare video data for processing. */
> +	if (!stream->bulk.skip_payload && buf != NULL) {
> +		struct uvc_decode_op *op = &uvc_urb->decodes[0];

Again - no reason to obtain this op here.


> +
> +		uvc_video_decode_data(op, uvc_urb, buf, mem, len);
> +	}
>  
>  	/* Detect the payload end by a URB smaller than the maximum size (or
>  	 * a payload size equal to the maximum) and process the header again.
> @@ -1322,7 +1375,8 @@ static void uvc_video_complete(struct urb *urb)
>  	struct uvc_streaming *stream = uvc_urb->stream;
>  	struct uvc_video_queue *queue = &stream->queue;
>  	struct uvc_buffer *buf = NULL;
> -	int ret;
> +	unsigned long flags;
> +	bool stopping;
>  
>  	switch (urb->status) {
>  	case 0:
> @@ -1342,14 +1396,30 @@ static void uvc_video_complete(struct urb *urb)
>  		return;
>  	}
>  
> +	/*
> +	 * Simply accept and discard completed URBs without processing when the
> +	 * stream is being shutdown. URBs will be freed as part of the
> +	 * uvc_video_enable(s, 0) action, so we must not queue asynchronous
> +	 * work based upon them.
> +	 */
> +	spin_lock_irqsave(&queue->irqlock, flags);
> +	stopping = queue->flags & UVC_QUEUE_STOPPING;
> +	spin_unlock_irqrestore(&queue->irqlock, flags);
> +
> +	if (stopping)
> +		return;
> +

We can race here if the stream gets stopped at this point...! (Thanks Laurent)


>  	buf = uvc_queue_get_current_buffer(queue);
>  
> +	/* Re-initialise the URB packet work */
> +	uvc_urb->packets = 0;
> +
> +	/* Process the URB headers, but work is deferred to a work queue */
>  	stream->decode(uvc_urb, buf);
>  
> -	if ((ret = usb_submit_urb(urb, GFP_ATOMIC)) < 0) {
> -		uvc_printk(KERN_ERR, "Failed to resubmit video URB (%d).\n",
> -			ret);
> -	}
> +	/* Handle any heavy lifting required */
> +	INIT_WORK(&uvc_urb->work, uvc_video_decode_data_work);> +	queue_work(stream->async_wq, &uvc_urb->work);

The encode operations don't currently schedule any work to be done - but the URB
isn't submitted until the work queue runs, which adds unnecessary latency to
non-async functionality.!

>  }
>  
>  /*
> @@ -1620,6 +1690,11 @@ static int uvc_init_video(struct uvc_streaming *stream, gfp_t gfp_flags)
>  
>  	uvc_video_stats_start(stream);
>  
> +	stream->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI,
> +			0);
> +	if (!stream->async_wq)
> +		return -ENOMEM;
> +
>  	if (intf->num_altsetting > 1) {
>  		struct usb_host_endpoint *best_ep = NULL;
>  		unsigned int best_psize = UINT_MAX;
> diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
> index 6a18dbfc3e5b..13dc0b2bb8b9 100644
> --- a/drivers/media/usb/uvc/uvcvideo.h
> +++ b/drivers/media/usb/uvc/uvcvideo.h
> @@ -411,6 +411,7 @@ struct uvc_buffer {
>  
>  #define UVC_QUEUE_DISCONNECTED		(1 << 0)
>  #define UVC_QUEUE_DROP_CORRUPTED	(1 << 1)
> +#define UVC_QUEUE_STOPPING		(1 << 2)
>  
>  struct uvc_video_queue {
>  	struct vb2_queue queue;
> @@ -483,12 +484,30 @@ struct uvc_stats_stream {
>  };
>  
>  /**
> + * struct uvc_decode_op: Context structure to schedule asynchronous memcpy
> + *
> + * @buf: active buf object for this decode
> + * @dst: copy destination address
> + * @src: copy source address
> + * @len: copy length
> + */
> +struct uvc_decode_op {
> +	struct uvc_buffer *buf;
> +	void *dst;
> +	const __u8 *src;
> +	int len;
> +};
> +
> +/**
>   * struct uvc_urb - URB context management structure
>   *
>   * @urb: the URB described by this context structure
>   * @stream: UVC streaming context
>   * @buffer: memory storage for the URB
>   * @dma: DMA coherent addressing for the urb_buffer
> + * @packets: counter to indicate the number of copy operations
> + * @decodes: work descriptors for asynchronous copy operations
> + * @work: work queue entry for asynchronous decode
>   */
>  struct uvc_urb {
>  	struct urb *urb;
> @@ -496,6 +515,10 @@ struct uvc_urb {
>  
>  	char *buffer;
>  	dma_addr_t dma;
> +
> +	unsigned int packets;
> +	struct uvc_decode_op decodes[UVC_MAX_PACKETS];

Whilst not currently, the encode functions could also use this system - so
perhaps decodes isn't the right name here.

	struct uvc_copy_op copy_operations[UVC_MAX_PACKETS];

might be a better description

> +	struct work_struct work;
>  };
>  
>  struct uvc_streaming {
> @@ -528,6 +551,7 @@ struct uvc_streaming {
>  	/* Buffers queue. */
>  	unsigned int frozen : 1;
>  	struct uvc_video_queue queue;
> +	struct workqueue_struct *async_wq;
>  	void (*decode)(struct uvc_urb *uvc_urb, struct uvc_buffer *buf);
>  
>  	/* Context data used by the bulk completion handler. */
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-01-11 17:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-09 13:09 [RFT PATCH v2 0/6] Asynchronous UVC Kieran Bingham
2018-01-09 13:09 ` Kieran Bingham
2018-01-09 13:09 ` [RFT PATCH v2 4/6] uvcvideo: queue: Simplify spin-lock usage Kieran Bingham
2018-01-09 13:09 ` [RFT PATCH v2 1/6] uvcvideo: Refactor URB descriptors Kieran Bingham
2018-01-09 13:09 ` [RFT PATCH v2 6/6] uvcvideo: Move decode processing to process context Kieran Bingham
2018-01-11 17:54   ` Kieran Bingham
2018-01-09 13:09 ` [RFT PATCH v2 3/6] uvcvideo: Protect queue internals with helper Kieran Bingham
2018-01-09 13:09 ` [RFT PATCH v2 2/6] uvcvideo: Convert decode functions to use new context structure Kieran Bingham
2018-01-09 13:09 ` [RFT PATCH v2 5/6] uvcvideo: queue: Support asynchronous buffer handling Kieran Bingham

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.