All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] stk1160: allocate urb buffs with the DMA noncontiguous API
@ 2022-01-25  8:02 Dafna Hirschfeld
  2022-01-25  8:02 ` [PATCH v2 1/4] media: stk1160: fix number of buffers in case not all buffers are created Dafna Hirschfeld
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Dafna Hirschfeld @ 2022-01-25  8:02 UTC (permalink / raw)
  To: ezequiel
  Cc: Dafna Hirschfeld, kernel, linux-media, laurent.pinchart,
	hverkuil, ribalda, tfiga, senozhatsky, hch, dafna3

This set replaces urb buffers allocation to use the DMA
noncontiguous API. Similarly to a commit sent to uvc: [1]
This improves performance on ARM while not damaging performance
on other platforms. The cpu performance on ARM
improves because the new API does not cause the cache disable on ARM
while the coherent API do.
Measurements are shown in patch 4/4

[1] https://lkml.org/lkml/2021/3/12/1506

Patches Summary:
* patches 1+2 are bug fixes.
* patch 3 changes the way data is stored. This is a preparation
to patch 4.
* patch 4 is the change of API to use the noncontiguougs.

Changes since v1:
1. reduce line length to be not too longer than 80 char
2. reformulate commit log of patch 4/4 and add measurements results
3. in patch 4, invalidate the vmap range before the direct mapping range.
4. test regressions for patches 1-3 (not including patch 4) and improve vars names
5. patch 2 is a new bug fix I found
6. patch 1 is extended to fix two cases of not allocating all intended urb buffers.

Dafna Hirschfeld (4):
  media: stk1160: fix number of buffers in case not all buffers are
    created
  media: stk1160: If start stream fails, return buffers with
    VB2_BUF_STATE_QUEUED
  media: stk1160: move transfer_buffer and urb to same struct
    'stk1160_urb'
  media: stk1160: use dma_alloc_noncontiguous API

 drivers/media/usb/stk1160/stk1160-core.c  |   2 +-
 drivers/media/usb/stk1160/stk1160-v4l.c   |  16 ++-
 drivers/media/usb/stk1160/stk1160-video.c | 142 ++++++++++++----------
 drivers/media/usb/stk1160/stk1160.h       |  23 +++-
 4 files changed, 104 insertions(+), 79 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/4] media: stk1160: fix number of buffers in case not all buffers are created
  2022-01-25  8:02 [PATCH v2 0/4] stk1160: allocate urb buffs with the DMA noncontiguous API Dafna Hirschfeld
@ 2022-01-25  8:02 ` Dafna Hirschfeld
  2022-02-06 11:39   ` Ezequiel Garcia
  2022-01-25  8:02 ` [PATCH v2 2/4] media: stk1160: If start stream fails, return buffers with VB2_BUF_STATE_QUEUED Dafna Hirschfeld
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Dafna Hirschfeld @ 2022-01-25  8:02 UTC (permalink / raw)
  To: ezequiel
  Cc: Dafna Hirschfeld, kernel, linux-media, laurent.pinchart,
	hverkuil, ribalda, tfiga, senozhatsky, hch, dafna3

In case we fail to allocate a transfer_buffer then we
break the buffers creation loop and update the number of
buffers to the number of successfully allocated which should
be 'i' and not 'i - 1' nor 'i + 1'

Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
---
 drivers/media/usb/stk1160/stk1160-video.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
index 202b084f65a2..92c8b1fba2b0 100644
--- a/drivers/media/usb/stk1160/stk1160-video.c
+++ b/drivers/media/usb/stk1160/stk1160-video.c
@@ -511,15 +511,15 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 	usb_free_urb(dev->isoc_ctl.urb[i]);
 	dev->isoc_ctl.urb[i] = NULL;
 
-	stk1160_warn("%d urbs allocated. Trying to continue...\n", i - 1);
+	stk1160_warn("%d urbs allocated. Trying to continue...\n", i);
 
-	dev->isoc_ctl.num_bufs = i - 1;
+	dev->isoc_ctl.num_bufs = i;
 
 	return 0;
 
 free_i_bufs:
 	/* Save the allocated buffers so far, so we can properly free them */
-	dev->isoc_ctl.num_bufs = i+1;
+	dev->isoc_ctl.num_bufs = i;
 	stk1160_free_isoc(dev);
 	return -ENOMEM;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/4] media: stk1160: If start stream fails, return buffers with VB2_BUF_STATE_QUEUED
  2022-01-25  8:02 [PATCH v2 0/4] stk1160: allocate urb buffs with the DMA noncontiguous API Dafna Hirschfeld
  2022-01-25  8:02 ` [PATCH v2 1/4] media: stk1160: fix number of buffers in case not all buffers are created Dafna Hirschfeld
@ 2022-01-25  8:02 ` Dafna Hirschfeld
  2022-02-06 11:40   ` Ezequiel Garcia
  2022-01-25  8:02 ` [PATCH v2 3/4] media: stk1160: move transfer_buffer and urb to same struct 'stk1160_urb' Dafna Hirschfeld
  2022-01-25  8:02 ` [PATCH v2 4/4] media: stk1160: use dma_alloc_noncontiguous API Dafna Hirschfeld
  3 siblings, 1 reply; 9+ messages in thread
From: Dafna Hirschfeld @ 2022-01-25  8:02 UTC (permalink / raw)
  To: ezequiel
  Cc: Dafna Hirschfeld, kernel, linux-media, laurent.pinchart,
	hverkuil, ribalda, tfiga, senozhatsky, hch, dafna3

If the callback 'start_streaming' fails, then all
queued buffers in the driver should be returned with
state 'VB2_BUF_STATE_QUEUED'. Currently, they are
returned with 'VB2_BUF_STATE_ERROR' which is wrong.
Fix this. This also fixes the warning:

[   65.583633] WARNING: CPU: 5 PID: 593 at drivers/media/common/videobuf2/videobuf2-core.c:1612 vb2_start_streaming+0xd4/0x160 [videobuf2_common]
[   65.585027] Modules linked in: snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_soc_hdmi_codec dw_hdmi_i2s_audio saa7115 stk1160 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc crct10dif_ce panfrost snd_soc_simple_card snd_soc_audio_graph_card snd_soc_spdif_tx snd_soc_simple_card_utils gpu_sched phy_rockchip_pcie snd_soc_rockchip_i2s rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi cec drm_kms_helper drm rtc_rk808 rockchip_saradc industrialio_triggered_buffer kfifo_buf rockchip_thermal pcie_rockchip_host ip_tables x_tables ipv6
[   65.589383] CPU: 5 PID: 593 Comm: v4l2src0:src Tainted: G        W         5.16.0-rc4-62408-g32447129cb30-dirty #14
[   65.590293] Hardware name: Radxa ROCK Pi 4B (DT)
[   65.590696] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   65.591304] pc : vb2_start_streaming+0xd4/0x160 [videobuf2_common]
[   65.591850] lr : vb2_start_streaming+0x6c/0x160 [videobuf2_common]
[   65.592395] sp : ffff800012bc3ad0
[   65.592685] x29: ffff800012bc3ad0 x28: 0000000000000000 x27: ffff800012bc3cd8
[   65.593312] x26: 0000000000000000 x25: ffff00000d8a7800 x24: 0000000040045612
[   65.593938] x23: ffff800011323000 x22: ffff800012bc3cd8 x21: ffff00000908a8b0
[   65.594562] x20: ffff00000908a8c8 x19: 00000000fffffff4 x18: ffffffffffffffff
[   65.595188] x17: 000000040044ffff x16: 00400034b5503510 x15: ffff800011323f78
[   65.595813] x14: ffff000013163886 x13: ffff000013163885 x12: 00000000000002ce
[   65.596439] x11: 0000000000000028 x10: 0000000000000001 x9 : 0000000000000228
[   65.597064] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff726c5e78
[   65.597690] x5 : ffff800012bc3990 x4 : 0000000000000000 x3 : ffff000009a34880
[   65.598315] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007cd99f0
[   65.598940] Call trace:
[   65.599155]  vb2_start_streaming+0xd4/0x160 [videobuf2_common]
[   65.599672]  vb2_core_streamon+0x17c/0x1a8 [videobuf2_common]
[   65.600179]  vb2_streamon+0x54/0x88 [videobuf2_v4l2]
[   65.600619]  vb2_ioctl_streamon+0x54/0x60 [videobuf2_v4l2]
[   65.601103]  v4l_streamon+0x3c/0x50 [videodev]
[   65.601521]  __video_do_ioctl+0x1a4/0x428 [videodev]
[   65.601977]  video_usercopy+0x320/0x828 [videodev]
[   65.602419]  video_ioctl2+0x3c/0x58 [videodev]
[   65.602830]  v4l2_ioctl+0x60/0x90 [videodev]
[   65.603227]  __arm64_sys_ioctl+0xa8/0xe0
[   65.603576]  invoke_syscall+0x54/0x118
[   65.603911]  el0_svc_common.constprop.3+0x84/0x100
[   65.604332]  do_el0_svc+0x34/0xa0
[   65.604625]  el0_svc+0x1c/0x50
[   65.604897]  el0t_64_sync_handler+0x88/0xb0
[   65.605264]  el0t_64_sync+0x16c/0x170
[   65.605587] ---[ end trace 578e0ba07742170d ]---

Fixes: 8ac456495a33d ("[media] stk1160: Stop device and unqueue buffers when start_streaming() fails")
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
---
 drivers/media/usb/stk1160/stk1160-core.c |  2 +-
 drivers/media/usb/stk1160/stk1160-v4l.c  | 10 +++++-----
 drivers/media/usb/stk1160/stk1160.h      |  2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/media/usb/stk1160/stk1160-core.c b/drivers/media/usb/stk1160/stk1160-core.c
index 4e1698f78818..ce717502ea4c 100644
--- a/drivers/media/usb/stk1160/stk1160-core.c
+++ b/drivers/media/usb/stk1160/stk1160-core.c
@@ -403,7 +403,7 @@ static void stk1160_disconnect(struct usb_interface *interface)
 	/* Here is the only place where isoc get released */
 	stk1160_uninit_isoc(dev);
 
-	stk1160_clear_queue(dev);
+	stk1160_clear_queue(dev, VB2_BUF_STATE_ERROR);
 
 	video_unregister_device(&dev->vdev);
 	v4l2_device_disconnect(&dev->v4l2_dev);
diff --git a/drivers/media/usb/stk1160/stk1160-v4l.c b/drivers/media/usb/stk1160/stk1160-v4l.c
index 6a4eb616d516..1aa953469402 100644
--- a/drivers/media/usb/stk1160/stk1160-v4l.c
+++ b/drivers/media/usb/stk1160/stk1160-v4l.c
@@ -258,7 +258,7 @@ static int stk1160_start_streaming(struct stk1160 *dev)
 	stk1160_uninit_isoc(dev);
 out_stop_hw:
 	usb_set_interface(dev->udev, 0, 0);
-	stk1160_clear_queue(dev);
+	stk1160_clear_queue(dev, VB2_BUF_STATE_QUEUED);
 
 	mutex_unlock(&dev->v4l_lock);
 
@@ -306,7 +306,7 @@ static int stk1160_stop_streaming(struct stk1160 *dev)
 
 	stk1160_stop_hw(dev);
 
-	stk1160_clear_queue(dev);
+	stk1160_clear_queue(dev, VB2_BUF_STATE_ERROR);
 
 	stk1160_dbg("streaming stopped\n");
 
@@ -745,7 +745,7 @@ static const struct video_device v4l_template = {
 /********************************************************************/
 
 /* Must be called with both v4l_lock and vb_queue_lock hold */
-void stk1160_clear_queue(struct stk1160 *dev)
+void stk1160_clear_queue(struct stk1160 *dev, enum vb2_buffer_state vb2_state)
 {
 	struct stk1160_buffer *buf;
 	unsigned long flags;
@@ -756,7 +756,7 @@ void stk1160_clear_queue(struct stk1160 *dev)
 		buf = list_first_entry(&dev->avail_bufs,
 			struct stk1160_buffer, list);
 		list_del(&buf->list);
-		vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
+		vb2_buffer_done(&buf->vb.vb2_buf, vb2_state);
 		stk1160_dbg("buffer [%p/%d] aborted\n",
 			    buf, buf->vb.vb2_buf.index);
 	}
@@ -766,7 +766,7 @@ void stk1160_clear_queue(struct stk1160 *dev)
 		buf = dev->isoc_ctl.buf;
 		dev->isoc_ctl.buf = NULL;
 
-		vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
+		vb2_buffer_done(&buf->vb.vb2_buf, vb2_state);
 		stk1160_dbg("buffer [%p/%d] aborted\n",
 			    buf, buf->vb.vb2_buf.index);
 	}
diff --git a/drivers/media/usb/stk1160/stk1160.h b/drivers/media/usb/stk1160/stk1160.h
index a31ea1c80f25..a70963ce8753 100644
--- a/drivers/media/usb/stk1160/stk1160.h
+++ b/drivers/media/usb/stk1160/stk1160.h
@@ -166,7 +166,7 @@ struct regval {
 int stk1160_vb2_setup(struct stk1160 *dev);
 int stk1160_video_register(struct stk1160 *dev);
 void stk1160_video_unregister(struct stk1160 *dev);
-void stk1160_clear_queue(struct stk1160 *dev);
+void stk1160_clear_queue(struct stk1160 *dev, enum vb2_buffer_state vb2_state);
 
 /* Provided by stk1160-video.c */
 int stk1160_alloc_isoc(struct stk1160 *dev);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/4] media: stk1160: move transfer_buffer and urb to same struct 'stk1160_urb'
  2022-01-25  8:02 [PATCH v2 0/4] stk1160: allocate urb buffs with the DMA noncontiguous API Dafna Hirschfeld
  2022-01-25  8:02 ` [PATCH v2 1/4] media: stk1160: fix number of buffers in case not all buffers are created Dafna Hirschfeld
  2022-01-25  8:02 ` [PATCH v2 2/4] media: stk1160: If start stream fails, return buffers with VB2_BUF_STATE_QUEUED Dafna Hirschfeld
@ 2022-01-25  8:02 ` Dafna Hirschfeld
  2022-02-07 15:08   ` Ezequiel Garcia
  2022-01-25  8:02 ` [PATCH v2 4/4] media: stk1160: use dma_alloc_noncontiguous API Dafna Hirschfeld
  3 siblings, 1 reply; 9+ messages in thread
From: Dafna Hirschfeld @ 2022-01-25  8:02 UTC (permalink / raw)
  To: ezequiel
  Cc: Dafna Hirschfeld, kernel, linux-media, laurent.pinchart,
	hverkuil, ribalda, tfiga, senozhatsky, hch, dafna3

Instead of having two separated arrays, one for the urbs and
one for their buffers, have one array of a struct containing both.
In addition, the array is just 16 pointers, no need to dynamically
allocate it.

Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
---
 drivers/media/usb/stk1160/stk1160-v4l.c   |  2 +-
 drivers/media/usb/stk1160/stk1160-video.c | 52 ++++++++---------------
 drivers/media/usb/stk1160/stk1160.h       | 11 ++---
 3 files changed, 25 insertions(+), 40 deletions(-)

diff --git a/drivers/media/usb/stk1160/stk1160-v4l.c b/drivers/media/usb/stk1160/stk1160-v4l.c
index 1aa953469402..ebf245d44005 100644
--- a/drivers/media/usb/stk1160/stk1160-v4l.c
+++ b/drivers/media/usb/stk1160/stk1160-v4l.c
@@ -232,7 +232,7 @@ static int stk1160_start_streaming(struct stk1160 *dev)
 
 	/* submit urbs and enables IRQ */
 	for (i = 0; i < dev->isoc_ctl.num_bufs; i++) {
-		rc = usb_submit_urb(dev->isoc_ctl.urb[i], GFP_KERNEL);
+		rc = usb_submit_urb(dev->isoc_ctl.urb_ctl[i].urb, GFP_KERNEL);
 		if (rc) {
 			stk1160_err("cannot submit urb[%d] (%d)\n", i, rc);
 			goto out_uninit;
diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
index 92c8b1fba2b0..f3c0497a8539 100644
--- a/drivers/media/usb/stk1160/stk1160-video.c
+++ b/drivers/media/usb/stk1160/stk1160-video.c
@@ -347,7 +347,7 @@ void stk1160_cancel_isoc(struct stk1160 *dev)
 		 * We don't care for NULL pointer since
 		 * usb_kill_urb allows it.
 		 */
-		usb_kill_urb(dev->isoc_ctl.urb[i]);
+		usb_kill_urb(dev->isoc_ctl.urb_ctl[i].urb);
 	}
 
 	stk1160_dbg("all urbs killed\n");
@@ -366,30 +366,25 @@ void stk1160_free_isoc(struct stk1160 *dev)
 
 	for (i = 0; i < num_bufs; i++) {
 
-		urb = dev->isoc_ctl.urb[i];
+		urb = dev->isoc_ctl.urb_ctl[i].urb;
 		if (urb) {
 
-			if (dev->isoc_ctl.transfer_buffer[i]) {
+			if (dev->isoc_ctl.urb_ctl[i].transfer_buffer) {
 #ifndef CONFIG_DMA_NONCOHERENT
 				usb_free_coherent(dev->udev,
 					urb->transfer_buffer_length,
-					dev->isoc_ctl.transfer_buffer[i],
+					dev->isoc_ctl.urb_ctl[i].transfer_buffer,
 					urb->transfer_dma);
 #else
-				kfree(dev->isoc_ctl.transfer_buffer[i]);
+				kfree(dev->isoc_ctl.urb_ctl[i].transfer_buffer);
 #endif
 			}
 			usb_free_urb(urb);
-			dev->isoc_ctl.urb[i] = NULL;
+			dev->isoc_ctl.urb_ctl[i].urb = NULL;
 		}
-		dev->isoc_ctl.transfer_buffer[i] = NULL;
+		dev->isoc_ctl.urb_ctl[i].transfer_buffer = NULL;
 	}
 
-	kfree(dev->isoc_ctl.urb);
-	kfree(dev->isoc_ctl.transfer_buffer);
-
-	dev->isoc_ctl.urb = NULL;
-	dev->isoc_ctl.transfer_buffer = NULL;
 	dev->isoc_ctl.num_bufs = 0;
 
 	stk1160_dbg("all urb buffers freed\n");
@@ -429,19 +424,6 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 
 	dev->isoc_ctl.buf = NULL;
 	dev->isoc_ctl.max_pkt_size = dev->max_pkt_size;
-	dev->isoc_ctl.urb = kcalloc(num_bufs, sizeof(void *), GFP_KERNEL);
-	if (!dev->isoc_ctl.urb) {
-		stk1160_err("out of memory for urb array\n");
-		return -ENOMEM;
-	}
-
-	dev->isoc_ctl.transfer_buffer = kcalloc(num_bufs, sizeof(void *),
-						GFP_KERNEL);
-	if (!dev->isoc_ctl.transfer_buffer) {
-		stk1160_err("out of memory for usb transfers\n");
-		kfree(dev->isoc_ctl.urb);
-		return -ENOMEM;
-	}
 
 	/* allocate urbs and transfer buffers */
 	for (i = 0; i < num_bufs; i++) {
@@ -449,15 +431,17 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 		urb = usb_alloc_urb(max_packets, GFP_KERNEL);
 		if (!urb)
 			goto free_i_bufs;
-		dev->isoc_ctl.urb[i] = urb;
+		dev->isoc_ctl.urb_ctl[i].urb = urb;
 
 #ifndef CONFIG_DMA_NONCOHERENT
-		dev->isoc_ctl.transfer_buffer[i] = usb_alloc_coherent(dev->udev,
-			sb_size, GFP_KERNEL, &urb->transfer_dma);
+		dev->isoc_ctl.urb_ctl[i].transfer_buffer =
+			usb_alloc_coherent(dev->udev, sb_size, GFP_KERNEL,
+					   &urb->transfer_dma);
 #else
-		dev->isoc_ctl.transfer_buffer[i] = kmalloc(sb_size, GFP_KERNEL);
+		dev->isoc_ctl.urb_ctl[i].transfer_buffer =
+			kmalloc(sb_size, GFP_KERNEL);
 #endif
-		if (!dev->isoc_ctl.transfer_buffer[i]) {
+		if (!dev->isoc_ctl.urb_ctl[i].transfer_buffer) {
 			stk1160_err("cannot alloc %d bytes for tx[%d] buffer\n",
 				sb_size, i);
 
@@ -466,14 +450,14 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 				goto free_i_bufs;
 			goto nomore_tx_bufs;
 		}
-		memset(dev->isoc_ctl.transfer_buffer[i], 0, sb_size);
+		memset(dev->isoc_ctl.urb_ctl[i].transfer_buffer, 0, sb_size);
 
 		/*
 		 * FIXME: Where can I get the endpoint?
 		 */
 		urb->dev = dev->udev;
 		urb->pipe = usb_rcvisocpipe(dev->udev, STK1160_EP_VIDEO);
-		urb->transfer_buffer = dev->isoc_ctl.transfer_buffer[i];
+		urb->transfer_buffer = dev->isoc_ctl.urb_ctl[i].transfer_buffer;
 		urb->transfer_buffer_length = sb_size;
 		urb->complete = stk1160_isoc_irq;
 		urb->context = dev;
@@ -508,8 +492,8 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 	 * enough to work fine, so we just free the extra urb,
 	 * store the allocated count and keep going, fingers crossed!
 	 */
-	usb_free_urb(dev->isoc_ctl.urb[i]);
-	dev->isoc_ctl.urb[i] = NULL;
+	usb_free_urb(dev->isoc_ctl.urb_ctl[i].urb);
+	dev->isoc_ctl.urb_ctl[i].urb = NULL;
 
 	stk1160_warn("%d urbs allocated. Trying to continue...\n", i);
 
diff --git a/drivers/media/usb/stk1160/stk1160.h b/drivers/media/usb/stk1160/stk1160.h
index a70963ce8753..0c355bb078c1 100644
--- a/drivers/media/usb/stk1160/stk1160.h
+++ b/drivers/media/usb/stk1160/stk1160.h
@@ -84,6 +84,11 @@ struct stk1160_buffer {
 	unsigned int pos;		/* current pos inside buffer */
 };
 
+struct stk1160_urb {
+	struct urb *urb;
+	char *transfer_buffer;
+};
+
 struct stk1160_isoc_ctl {
 	/* max packet size of isoc transaction */
 	int max_pkt_size;
@@ -91,11 +96,7 @@ struct stk1160_isoc_ctl {
 	/* number of allocated urbs */
 	int num_bufs;
 
-	/* urb for isoc transfers */
-	struct urb **urb;
-
-	/* transfer buffers for isoc transfer */
-	char **transfer_buffer;
+	struct stk1160_urb urb_ctl[STK1160_NUM_BUFS];
 
 	/* current buffer */
 	struct stk1160_buffer *buf;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 4/4] media: stk1160: use dma_alloc_noncontiguous API
  2022-01-25  8:02 [PATCH v2 0/4] stk1160: allocate urb buffs with the DMA noncontiguous API Dafna Hirschfeld
                   ` (2 preceding siblings ...)
  2022-01-25  8:02 ` [PATCH v2 3/4] media: stk1160: move transfer_buffer and urb to same struct 'stk1160_urb' Dafna Hirschfeld
@ 2022-01-25  8:02 ` Dafna Hirschfeld
  2022-02-07 15:13   ` Ezequiel Garcia
  3 siblings, 1 reply; 9+ messages in thread
From: Dafna Hirschfeld @ 2022-01-25  8:02 UTC (permalink / raw)
  To: ezequiel
  Cc: Dafna Hirschfeld, kernel, linux-media, laurent.pinchart,
	hverkuil, ribalda, tfiga, senozhatsky, hch, dafna3

Replace the urb buffers allocation to use the
noncontiguous API. This improves performance on ARM
platform since the previous dma allocation was
coherent dma allocation which disables the cache on
ARM platforms. With the noncontiguous API the cache
is not disabled.
The noncontiguous API requires the driver to
handle synchronization.
This commit is similar to this one for the uvc driver:

    https://lkml.org/lkml/2021/3/12/1506

Performance tests on rock-pi4 (Arm64) shows about 15x
improvements:

== DMA NONCONTIGUOUS ==
total durations: 20.63678480 sec
urb processing durations: 0.286864889 sec
uS/qty: 286864/2508 avg: 114.379 min: 0.583 max: 155.461 (uS)
FPS: 24.92
lost: 0 done: 500
raw decode speed: 11.603 Gbits/s
bytes 414831228.000
bytes/urb: 165403

== DMA COHERENT ==
total durations: 20.73551767 sec
urb processing durations: 4.541559160 sec
uS/qty: 4541559/2509 avg: 1810.107 min: 0.583 max: 2113.163 (uS)
FPS: 24.90
lost: 0 done: 500
raw decode speed: 730.738 Mbits/s
bytes 414785444.000
bytes/urb: 165319

Performance tests on x86 laptop show no significant
difference:

== DMA NONCONTIGUOUS ==
total durations: 20.220590102 sec
urb processing durations: 0.63021818 sec
uS/qty: 63021/2512 avg: 25.088 min: 0.138 max: 146.750 (uS)
FPS: 24.72
lost: 0 done: 500
raw decode speed: 52.751 Gbits/s
bytes 415421032.000
bytes/urb: 165374

== DMA COHERENT ==
total durations: 20.220475614 sec
urb processing durations: 0.64751972 sec
uS/qty: 64751/2512 avg: 25.777 min: 0.168 max: 132.250 (uS)
FPS: 24.72
lost: 0 done: 500
raw decode speed: 51.927 Gbits/s
bytes 415422794.000
bytes/urb: 165375

Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
---
 drivers/media/usb/stk1160/stk1160-v4l.c   |   4 +
 drivers/media/usb/stk1160/stk1160-video.c | 114 +++++++++++++---------
 drivers/media/usb/stk1160/stk1160.h       |  10 ++
 3 files changed, 84 insertions(+), 44 deletions(-)

diff --git a/drivers/media/usb/stk1160/stk1160-v4l.c b/drivers/media/usb/stk1160/stk1160-v4l.c
index ebf245d44005..a1f785a5ffd8 100644
--- a/drivers/media/usb/stk1160/stk1160-v4l.c
+++ b/drivers/media/usb/stk1160/stk1160-v4l.c
@@ -232,6 +232,10 @@ static int stk1160_start_streaming(struct stk1160 *dev)
 
 	/* submit urbs and enables IRQ */
 	for (i = 0; i < dev->isoc_ctl.num_bufs; i++) {
+		struct stk1160_urb *stk_urb = &dev->isoc_ctl.urb_ctl[i];
+
+		dma_sync_sgtable_for_device(stk1160_get_dmadev(dev), stk_urb->sgt,
+					    DMA_FROM_DEVICE);
 		rc = usb_submit_urb(dev->isoc_ctl.urb_ctl[i].urb, GFP_KERNEL);
 		if (rc) {
 			stk1160_err("cannot submit urb[%d] (%d)\n", i, rc);
diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
index f3c0497a8539..6600afc565b8 100644
--- a/drivers/media/usb/stk1160/stk1160-video.c
+++ b/drivers/media/usb/stk1160/stk1160-video.c
@@ -295,7 +295,9 @@ static void stk1160_process_isoc(struct stk1160 *dev, struct urb *urb)
 static void stk1160_isoc_irq(struct urb *urb)
 {
 	int i, rc;
-	struct stk1160 *dev = urb->context;
+	struct stk1160_urb *stk_urb = urb->context;
+	struct stk1160 *dev = stk_urb->dev;
+	struct device *dma_dev = stk1160_get_dmadev(dev);
 
 	switch (urb->status) {
 	case 0:
@@ -310,6 +312,10 @@ static void stk1160_isoc_irq(struct urb *urb)
 		return;
 	}
 
+	invalidate_kernel_vmap_range(stk_urb->transfer_buffer,
+				     urb->transfer_buffer_length);
+	dma_sync_sgtable_for_cpu(dma_dev, stk_urb->sgt, DMA_FROM_DEVICE);
+
 	stk1160_process_isoc(dev, urb);
 
 	/* Reset urb buffers */
@@ -318,6 +324,7 @@ static void stk1160_isoc_irq(struct urb *urb)
 		urb->iso_frame_desc[i].actual_length = 0;
 	}
 
+	dma_sync_sgtable_for_device(dma_dev, stk_urb->sgt, DMA_FROM_DEVICE);
 	rc = usb_submit_urb(urb, GFP_ATOMIC);
 	if (rc)
 		stk1160_err("urb re-submit failed (%d)\n", rc);
@@ -353,37 +360,34 @@ void stk1160_cancel_isoc(struct stk1160 *dev)
 	stk1160_dbg("all urbs killed\n");
 }
 
+static void stk_free_urb_buffer(struct stk1160 *dev, struct stk1160_urb *stk_urb)
+{
+	struct device *dma_dev = stk1160_get_dmadev(dev);
+
+	dma_vunmap_noncontiguous(dma_dev, stk_urb->transfer_buffer);
+	dma_free_noncontiguous(dma_dev, stk_urb->urb->transfer_buffer_length,
+			       stk_urb->sgt, DMA_FROM_DEVICE);
+	usb_free_urb(stk_urb->urb);
+
+	stk_urb->transfer_buffer = NULL;
+	stk_urb->sgt = NULL;
+	stk_urb->urb = NULL;
+	stk_urb->dev = NULL;
+	stk_urb->dma = 0;
+}
+
 /*
  * Releases urb and transfer buffers
  * Obviusly, associated urb must be killed before releasing it.
  */
 void stk1160_free_isoc(struct stk1160 *dev)
 {
-	struct urb *urb;
 	int i, num_bufs = dev->isoc_ctl.num_bufs;
 
 	stk1160_dbg("freeing %d urb buffers...\n", num_bufs);
 
-	for (i = 0; i < num_bufs; i++) {
-
-		urb = dev->isoc_ctl.urb_ctl[i].urb;
-		if (urb) {
-
-			if (dev->isoc_ctl.urb_ctl[i].transfer_buffer) {
-#ifndef CONFIG_DMA_NONCOHERENT
-				usb_free_coherent(dev->udev,
-					urb->transfer_buffer_length,
-					dev->isoc_ctl.urb_ctl[i].transfer_buffer,
-					urb->transfer_dma);
-#else
-				kfree(dev->isoc_ctl.urb_ctl[i].transfer_buffer);
-#endif
-			}
-			usb_free_urb(urb);
-			dev->isoc_ctl.urb_ctl[i].urb = NULL;
-		}
-		dev->isoc_ctl.urb_ctl[i].transfer_buffer = NULL;
-	}
+	for (i = 0; i < num_bufs; i++)
+		stk_free_urb_buffer(dev, &dev->isoc_ctl.urb_ctl[i]);
 
 	dev->isoc_ctl.num_bufs = 0;
 
@@ -400,6 +404,41 @@ void stk1160_uninit_isoc(struct stk1160 *dev)
 	stk1160_free_isoc(dev);
 }
 
+static int stk1160_fill_urb(struct stk1160 *dev, struct stk1160_urb *stk_urb,
+			    int sb_size, int max_packets)
+{
+	struct device *dma_dev = stk1160_get_dmadev(dev);
+
+	stk_urb->urb = usb_alloc_urb(max_packets, GFP_KERNEL);
+	if (!stk_urb->urb)
+		return -ENOMEM;
+	stk_urb->sgt = dma_alloc_noncontiguous(dma_dev, sb_size,
+					       DMA_FROM_DEVICE, GFP_KERNEL, 0);
+
+	/*
+	 * If the buffer allocation failed, we exit but return 0 since
+	 * we allow the driver working with less buffers
+	 */
+	if (!stk_urb->sgt)
+		goto free_urb;
+
+	stk_urb->transfer_buffer = dma_vmap_noncontiguous(dma_dev, sb_size,
+							  stk_urb->sgt);
+	if (!stk_urb->transfer_buffer)
+		goto free_sgt;
+
+	stk_urb->dma = stk_urb->sgt->sgl->dma_address;
+	stk_urb->dev = dev;
+	return 0;
+free_sgt:
+	dma_free_noncontiguous(dma_dev, sb_size, stk_urb->sgt, DMA_FROM_DEVICE);
+	stk_urb->sgt = NULL;
+free_urb:
+	usb_free_urb(stk_urb->urb);
+	stk_urb->urb = NULL;
+
+	return 0;
+}
 /*
  * Allocate URBs
  */
@@ -407,6 +446,7 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 {
 	struct urb *urb;
 	int i, j, k, sb_size, max_packets, num_bufs;
+	int ret;
 
 	/*
 	 * It may be necessary to release isoc here,
@@ -428,23 +468,14 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 	/* allocate urbs and transfer buffers */
 	for (i = 0; i < num_bufs; i++) {
 
-		urb = usb_alloc_urb(max_packets, GFP_KERNEL);
-		if (!urb)
+		ret = stk1160_fill_urb(dev, &dev->isoc_ctl.urb_ctl[i],
+				       sb_size, max_packets);
+		if (ret)
 			goto free_i_bufs;
-		dev->isoc_ctl.urb_ctl[i].urb = urb;
-
-#ifndef CONFIG_DMA_NONCOHERENT
-		dev->isoc_ctl.urb_ctl[i].transfer_buffer =
-			usb_alloc_coherent(dev->udev, sb_size, GFP_KERNEL,
-					   &urb->transfer_dma);
-#else
-		dev->isoc_ctl.urb_ctl[i].transfer_buffer =
-			kmalloc(sb_size, GFP_KERNEL);
-#endif
-		if (!dev->isoc_ctl.urb_ctl[i].transfer_buffer) {
-			stk1160_err("cannot alloc %d bytes for tx[%d] buffer\n",
-				sb_size, i);
 
+		urb = dev->isoc_ctl.urb_ctl[i].urb;
+
+		if (!urb) {
 			/* Not enough transfer buffers, so just give up */
 			if (i < STK1160_MIN_BUFS)
 				goto free_i_bufs;
@@ -460,15 +491,12 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 		urb->transfer_buffer = dev->isoc_ctl.urb_ctl[i].transfer_buffer;
 		urb->transfer_buffer_length = sb_size;
 		urb->complete = stk1160_isoc_irq;
-		urb->context = dev;
+		urb->context = &dev->isoc_ctl.urb_ctl[i];
 		urb->interval = 1;
 		urb->start_frame = 0;
 		urb->number_of_packets = max_packets;
-#ifndef CONFIG_DMA_NONCOHERENT
 		urb->transfer_flags = URB_ISO_ASAP | URB_NO_TRANSFER_DMA_MAP;
-#else
-		urb->transfer_flags = URB_ISO_ASAP;
-#endif
+		urb->transfer_dma = dev->isoc_ctl.urb_ctl[i].dma;
 
 		k = 0;
 		for (j = 0; j < max_packets; j++) {
@@ -492,8 +520,6 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 	 * enough to work fine, so we just free the extra urb,
 	 * store the allocated count and keep going, fingers crossed!
 	 */
-	usb_free_urb(dev->isoc_ctl.urb_ctl[i].urb);
-	dev->isoc_ctl.urb_ctl[i].urb = NULL;
 
 	stk1160_warn("%d urbs allocated. Trying to continue...\n", i);
 
diff --git a/drivers/media/usb/stk1160/stk1160.h b/drivers/media/usb/stk1160/stk1160.h
index 0c355bb078c1..7b498d14ed7a 100644
--- a/drivers/media/usb/stk1160/stk1160.h
+++ b/drivers/media/usb/stk1160/stk1160.h
@@ -16,6 +16,8 @@
 #include <media/videobuf2-v4l2.h>
 #include <media/v4l2-device.h>
 #include <media/v4l2-ctrls.h>
+#include <linux/usb.h>
+#include <linux/usb/hcd.h>
 
 #define STK1160_VERSION		"0.9.5"
 #define STK1160_VERSION_NUM	0x000905
@@ -87,6 +89,9 @@ struct stk1160_buffer {
 struct stk1160_urb {
 	struct urb *urb;
 	char *transfer_buffer;
+	struct sg_table *sgt;
+	struct stk1160 *dev;
+	dma_addr_t dma;
 };
 
 struct stk1160_isoc_ctl {
@@ -190,3 +195,8 @@ void stk1160_select_input(struct stk1160 *dev);
 
 /* Provided by stk1160-ac97.c */
 void stk1160_ac97_setup(struct stk1160 *dev);
+
+static inline struct device *stk1160_get_dmadev(struct stk1160 *dev)
+{
+	return bus_to_hcd(dev->udev->bus)->self.sysdev;
+}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/4] media: stk1160: fix number of buffers in case not all buffers are created
  2022-01-25  8:02 ` [PATCH v2 1/4] media: stk1160: fix number of buffers in case not all buffers are created Dafna Hirschfeld
@ 2022-02-06 11:39   ` Ezequiel Garcia
  0 siblings, 0 replies; 9+ messages in thread
From: Ezequiel Garcia @ 2022-02-06 11:39 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Collabora Kernel ML, linux-media, Laurent Pinchart, Hans Verkuil,
	Ricardo Ribalda, Tomasz Figa, senozhatsky, hch, Dafna Hirschfeld

Hi Dafna,

On Tue, 25 Jan 2022 at 05:02, Dafna Hirschfeld
<dafna.hirschfeld@collabora.com> wrote:
>
> In case we fail to allocate a transfer_buffer then we
> break the buffers creation loop and update the number of
> buffers to the number of successfully allocated which should
> be 'i' and not 'i - 1' nor 'i + 1'
>
> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>

Reviewed-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>

Thanks!

> ---
>  drivers/media/usb/stk1160/stk1160-video.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
> index 202b084f65a2..92c8b1fba2b0 100644
> --- a/drivers/media/usb/stk1160/stk1160-video.c
> +++ b/drivers/media/usb/stk1160/stk1160-video.c
> @@ -511,15 +511,15 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
>         usb_free_urb(dev->isoc_ctl.urb[i]);
>         dev->isoc_ctl.urb[i] = NULL;
>
> -       stk1160_warn("%d urbs allocated. Trying to continue...\n", i - 1);
> +       stk1160_warn("%d urbs allocated. Trying to continue...\n", i);
>
> -       dev->isoc_ctl.num_bufs = i - 1;
> +       dev->isoc_ctl.num_bufs = i;
>
>         return 0;
>
>  free_i_bufs:
>         /* Save the allocated buffers so far, so we can properly free them */
> -       dev->isoc_ctl.num_bufs = i+1;
> +       dev->isoc_ctl.num_bufs = i;
>         stk1160_free_isoc(dev);
>         return -ENOMEM;
>  }
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/4] media: stk1160: If start stream fails, return buffers with VB2_BUF_STATE_QUEUED
  2022-01-25  8:02 ` [PATCH v2 2/4] media: stk1160: If start stream fails, return buffers with VB2_BUF_STATE_QUEUED Dafna Hirschfeld
@ 2022-02-06 11:40   ` Ezequiel Garcia
  0 siblings, 0 replies; 9+ messages in thread
From: Ezequiel Garcia @ 2022-02-06 11:40 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Collabora Kernel ML, linux-media, Laurent Pinchart, Hans Verkuil,
	Ricardo Ribalda, Tomasz Figa, senozhatsky, hch, Dafna Hirschfeld

Hi Dafna,

On Tue, 25 Jan 2022 at 05:08, Dafna Hirschfeld
<dafna.hirschfeld@collabora.com> wrote:
>
> If the callback 'start_streaming' fails, then all
> queued buffers in the driver should be returned with
> state 'VB2_BUF_STATE_QUEUED'. Currently, they are
> returned with 'VB2_BUF_STATE_ERROR' which is wrong.
> Fix this. This also fixes the warning:
>
> [   65.583633] WARNING: CPU: 5 PID: 593 at drivers/media/common/videobuf2/videobuf2-core.c:1612 vb2_start_streaming+0xd4/0x160 [videobuf2_common]
> [   65.585027] Modules linked in: snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_soc_hdmi_codec dw_hdmi_i2s_audio saa7115 stk1160 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc crct10dif_ce panfrost snd_soc_simple_card snd_soc_audio_graph_card snd_soc_spdif_tx snd_soc_simple_card_utils gpu_sched phy_rockchip_pcie snd_soc_rockchip_i2s rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi cec drm_kms_helper drm rtc_rk808 rockchip_saradc industrialio_triggered_buffer kfifo_buf rockchip_thermal pcie_rockchip_host ip_tables x_tables ipv6
> [   65.589383] CPU: 5 PID: 593 Comm: v4l2src0:src Tainted: G        W         5.16.0-rc4-62408-g32447129cb30-dirty #14
> [   65.590293] Hardware name: Radxa ROCK Pi 4B (DT)
> [   65.590696] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   65.591304] pc : vb2_start_streaming+0xd4/0x160 [videobuf2_common]
> [   65.591850] lr : vb2_start_streaming+0x6c/0x160 [videobuf2_common]
> [   65.592395] sp : ffff800012bc3ad0
> [   65.592685] x29: ffff800012bc3ad0 x28: 0000000000000000 x27: ffff800012bc3cd8
> [   65.593312] x26: 0000000000000000 x25: ffff00000d8a7800 x24: 0000000040045612
> [   65.593938] x23: ffff800011323000 x22: ffff800012bc3cd8 x21: ffff00000908a8b0
> [   65.594562] x20: ffff00000908a8c8 x19: 00000000fffffff4 x18: ffffffffffffffff
> [   65.595188] x17: 000000040044ffff x16: 00400034b5503510 x15: ffff800011323f78
> [   65.595813] x14: ffff000013163886 x13: ffff000013163885 x12: 00000000000002ce
> [   65.596439] x11: 0000000000000028 x10: 0000000000000001 x9 : 0000000000000228
> [   65.597064] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff726c5e78
> [   65.597690] x5 : ffff800012bc3990 x4 : 0000000000000000 x3 : ffff000009a34880
> [   65.598315] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007cd99f0
> [   65.598940] Call trace:
> [   65.599155]  vb2_start_streaming+0xd4/0x160 [videobuf2_common]
> [   65.599672]  vb2_core_streamon+0x17c/0x1a8 [videobuf2_common]
> [   65.600179]  vb2_streamon+0x54/0x88 [videobuf2_v4l2]
> [   65.600619]  vb2_ioctl_streamon+0x54/0x60 [videobuf2_v4l2]
> [   65.601103]  v4l_streamon+0x3c/0x50 [videodev]
> [   65.601521]  __video_do_ioctl+0x1a4/0x428 [videodev]
> [   65.601977]  video_usercopy+0x320/0x828 [videodev]
> [   65.602419]  video_ioctl2+0x3c/0x58 [videodev]
> [   65.602830]  v4l2_ioctl+0x60/0x90 [videodev]
> [   65.603227]  __arm64_sys_ioctl+0xa8/0xe0
> [   65.603576]  invoke_syscall+0x54/0x118
> [   65.603911]  el0_svc_common.constprop.3+0x84/0x100
> [   65.604332]  do_el0_svc+0x34/0xa0
> [   65.604625]  el0_svc+0x1c/0x50
> [   65.604897]  el0t_64_sync_handler+0x88/0xb0
> [   65.605264]  el0t_64_sync+0x16c/0x170
> [   65.605587] ---[ end trace 578e0ba07742170d ]---
>
> Fixes: 8ac456495a33d ("[media] stk1160: Stop device and unqueue buffers when start_streaming() fails")
> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>

Reviewed-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>

Thanks,

> ---
>  drivers/media/usb/stk1160/stk1160-core.c |  2 +-
>  drivers/media/usb/stk1160/stk1160-v4l.c  | 10 +++++-----
>  drivers/media/usb/stk1160/stk1160.h      |  2 +-
>  3 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/media/usb/stk1160/stk1160-core.c b/drivers/media/usb/stk1160/stk1160-core.c
> index 4e1698f78818..ce717502ea4c 100644
> --- a/drivers/media/usb/stk1160/stk1160-core.c
> +++ b/drivers/media/usb/stk1160/stk1160-core.c
> @@ -403,7 +403,7 @@ static void stk1160_disconnect(struct usb_interface *interface)
>         /* Here is the only place where isoc get released */
>         stk1160_uninit_isoc(dev);
>
> -       stk1160_clear_queue(dev);
> +       stk1160_clear_queue(dev, VB2_BUF_STATE_ERROR);
>
>         video_unregister_device(&dev->vdev);
>         v4l2_device_disconnect(&dev->v4l2_dev);
> diff --git a/drivers/media/usb/stk1160/stk1160-v4l.c b/drivers/media/usb/stk1160/stk1160-v4l.c
> index 6a4eb616d516..1aa953469402 100644
> --- a/drivers/media/usb/stk1160/stk1160-v4l.c
> +++ b/drivers/media/usb/stk1160/stk1160-v4l.c
> @@ -258,7 +258,7 @@ static int stk1160_start_streaming(struct stk1160 *dev)
>         stk1160_uninit_isoc(dev);
>  out_stop_hw:
>         usb_set_interface(dev->udev, 0, 0);
> -       stk1160_clear_queue(dev);
> +       stk1160_clear_queue(dev, VB2_BUF_STATE_QUEUED);
>
>         mutex_unlock(&dev->v4l_lock);
>
> @@ -306,7 +306,7 @@ static int stk1160_stop_streaming(struct stk1160 *dev)
>
>         stk1160_stop_hw(dev);
>
> -       stk1160_clear_queue(dev);
> +       stk1160_clear_queue(dev, VB2_BUF_STATE_ERROR);
>
>         stk1160_dbg("streaming stopped\n");
>
> @@ -745,7 +745,7 @@ static const struct video_device v4l_template = {
>  /********************************************************************/
>
>  /* Must be called with both v4l_lock and vb_queue_lock hold */
> -void stk1160_clear_queue(struct stk1160 *dev)
> +void stk1160_clear_queue(struct stk1160 *dev, enum vb2_buffer_state vb2_state)
>  {
>         struct stk1160_buffer *buf;
>         unsigned long flags;
> @@ -756,7 +756,7 @@ void stk1160_clear_queue(struct stk1160 *dev)
>                 buf = list_first_entry(&dev->avail_bufs,
>                         struct stk1160_buffer, list);
>                 list_del(&buf->list);
> -               vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> +               vb2_buffer_done(&buf->vb.vb2_buf, vb2_state);
>                 stk1160_dbg("buffer [%p/%d] aborted\n",
>                             buf, buf->vb.vb2_buf.index);
>         }
> @@ -766,7 +766,7 @@ void stk1160_clear_queue(struct stk1160 *dev)
>                 buf = dev->isoc_ctl.buf;
>                 dev->isoc_ctl.buf = NULL;
>
> -               vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> +               vb2_buffer_done(&buf->vb.vb2_buf, vb2_state);
>                 stk1160_dbg("buffer [%p/%d] aborted\n",
>                             buf, buf->vb.vb2_buf.index);
>         }
> diff --git a/drivers/media/usb/stk1160/stk1160.h b/drivers/media/usb/stk1160/stk1160.h
> index a31ea1c80f25..a70963ce8753 100644
> --- a/drivers/media/usb/stk1160/stk1160.h
> +++ b/drivers/media/usb/stk1160/stk1160.h
> @@ -166,7 +166,7 @@ struct regval {
>  int stk1160_vb2_setup(struct stk1160 *dev);
>  int stk1160_video_register(struct stk1160 *dev);
>  void stk1160_video_unregister(struct stk1160 *dev);
> -void stk1160_clear_queue(struct stk1160 *dev);
> +void stk1160_clear_queue(struct stk1160 *dev, enum vb2_buffer_state vb2_state);
>
>  /* Provided by stk1160-video.c */
>  int stk1160_alloc_isoc(struct stk1160 *dev);
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 3/4] media: stk1160: move transfer_buffer and urb to same struct 'stk1160_urb'
  2022-01-25  8:02 ` [PATCH v2 3/4] media: stk1160: move transfer_buffer and urb to same struct 'stk1160_urb' Dafna Hirschfeld
@ 2022-02-07 15:08   ` Ezequiel Garcia
  0 siblings, 0 replies; 9+ messages in thread
From: Ezequiel Garcia @ 2022-02-07 15:08 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: kernel, linux-media, laurent.pinchart, hverkuil, ribalda, tfiga,
	senozhatsky, hch, dafna3

Hi Dafna,

On Tue, Jan 25, 2022 at 10:02:12AM +0200, Dafna Hirschfeld wrote:
> Instead of having two separated arrays, one for the urbs and
> one for their buffers, have one array of a struct containing both.
> In addition, the array is just 16 pointers, no need to dynamically
> allocate it.
> 
> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>

Reviewed-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>

Thanks,
Ezequiel

> ---
>  drivers/media/usb/stk1160/stk1160-v4l.c   |  2 +-
>  drivers/media/usb/stk1160/stk1160-video.c | 52 ++++++++---------------
>  drivers/media/usb/stk1160/stk1160.h       | 11 ++---
>  3 files changed, 25 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/media/usb/stk1160/stk1160-v4l.c b/drivers/media/usb/stk1160/stk1160-v4l.c
> index 1aa953469402..ebf245d44005 100644
> --- a/drivers/media/usb/stk1160/stk1160-v4l.c
> +++ b/drivers/media/usb/stk1160/stk1160-v4l.c
> @@ -232,7 +232,7 @@ static int stk1160_start_streaming(struct stk1160 *dev)
>  
>  	/* submit urbs and enables IRQ */
>  	for (i = 0; i < dev->isoc_ctl.num_bufs; i++) {
> -		rc = usb_submit_urb(dev->isoc_ctl.urb[i], GFP_KERNEL);
> +		rc = usb_submit_urb(dev->isoc_ctl.urb_ctl[i].urb, GFP_KERNEL);
>  		if (rc) {
>  			stk1160_err("cannot submit urb[%d] (%d)\n", i, rc);
>  			goto out_uninit;
> diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
> index 92c8b1fba2b0..f3c0497a8539 100644
> --- a/drivers/media/usb/stk1160/stk1160-video.c
> +++ b/drivers/media/usb/stk1160/stk1160-video.c
> @@ -347,7 +347,7 @@ void stk1160_cancel_isoc(struct stk1160 *dev)
>  		 * We don't care for NULL pointer since
>  		 * usb_kill_urb allows it.
>  		 */
> -		usb_kill_urb(dev->isoc_ctl.urb[i]);
> +		usb_kill_urb(dev->isoc_ctl.urb_ctl[i].urb);
>  	}
>  
>  	stk1160_dbg("all urbs killed\n");
> @@ -366,30 +366,25 @@ void stk1160_free_isoc(struct stk1160 *dev)
>  
>  	for (i = 0; i < num_bufs; i++) {
>  
> -		urb = dev->isoc_ctl.urb[i];
> +		urb = dev->isoc_ctl.urb_ctl[i].urb;
>  		if (urb) {
>  
> -			if (dev->isoc_ctl.transfer_buffer[i]) {
> +			if (dev->isoc_ctl.urb_ctl[i].transfer_buffer) {
>  #ifndef CONFIG_DMA_NONCOHERENT
>  				usb_free_coherent(dev->udev,
>  					urb->transfer_buffer_length,
> -					dev->isoc_ctl.transfer_buffer[i],
> +					dev->isoc_ctl.urb_ctl[i].transfer_buffer,
>  					urb->transfer_dma);
>  #else
> -				kfree(dev->isoc_ctl.transfer_buffer[i]);
> +				kfree(dev->isoc_ctl.urb_ctl[i].transfer_buffer);
>  #endif
>  			}
>  			usb_free_urb(urb);
> -			dev->isoc_ctl.urb[i] = NULL;
> +			dev->isoc_ctl.urb_ctl[i].urb = NULL;
>  		}
> -		dev->isoc_ctl.transfer_buffer[i] = NULL;
> +		dev->isoc_ctl.urb_ctl[i].transfer_buffer = NULL;
>  	}
>  
> -	kfree(dev->isoc_ctl.urb);
> -	kfree(dev->isoc_ctl.transfer_buffer);
> -
> -	dev->isoc_ctl.urb = NULL;
> -	dev->isoc_ctl.transfer_buffer = NULL;
>  	dev->isoc_ctl.num_bufs = 0;
>  
>  	stk1160_dbg("all urb buffers freed\n");
> @@ -429,19 +424,6 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
>  
>  	dev->isoc_ctl.buf = NULL;
>  	dev->isoc_ctl.max_pkt_size = dev->max_pkt_size;
> -	dev->isoc_ctl.urb = kcalloc(num_bufs, sizeof(void *), GFP_KERNEL);
> -	if (!dev->isoc_ctl.urb) {
> -		stk1160_err("out of memory for urb array\n");
> -		return -ENOMEM;
> -	}
> -
> -	dev->isoc_ctl.transfer_buffer = kcalloc(num_bufs, sizeof(void *),
> -						GFP_KERNEL);
> -	if (!dev->isoc_ctl.transfer_buffer) {
> -		stk1160_err("out of memory for usb transfers\n");
> -		kfree(dev->isoc_ctl.urb);
> -		return -ENOMEM;
> -	}
>  
>  	/* allocate urbs and transfer buffers */
>  	for (i = 0; i < num_bufs; i++) {
> @@ -449,15 +431,17 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
>  		urb = usb_alloc_urb(max_packets, GFP_KERNEL);
>  		if (!urb)
>  			goto free_i_bufs;
> -		dev->isoc_ctl.urb[i] = urb;
> +		dev->isoc_ctl.urb_ctl[i].urb = urb;
>  
>  #ifndef CONFIG_DMA_NONCOHERENT
> -		dev->isoc_ctl.transfer_buffer[i] = usb_alloc_coherent(dev->udev,
> -			sb_size, GFP_KERNEL, &urb->transfer_dma);
> +		dev->isoc_ctl.urb_ctl[i].transfer_buffer =
> +			usb_alloc_coherent(dev->udev, sb_size, GFP_KERNEL,
> +					   &urb->transfer_dma);
>  #else
> -		dev->isoc_ctl.transfer_buffer[i] = kmalloc(sb_size, GFP_KERNEL);
> +		dev->isoc_ctl.urb_ctl[i].transfer_buffer =
> +			kmalloc(sb_size, GFP_KERNEL);
>  #endif
> -		if (!dev->isoc_ctl.transfer_buffer[i]) {
> +		if (!dev->isoc_ctl.urb_ctl[i].transfer_buffer) {
>  			stk1160_err("cannot alloc %d bytes for tx[%d] buffer\n",
>  				sb_size, i);
>  
> @@ -466,14 +450,14 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
>  				goto free_i_bufs;
>  			goto nomore_tx_bufs;
>  		}
> -		memset(dev->isoc_ctl.transfer_buffer[i], 0, sb_size);
> +		memset(dev->isoc_ctl.urb_ctl[i].transfer_buffer, 0, sb_size);
>  
>  		/*
>  		 * FIXME: Where can I get the endpoint?
>  		 */
>  		urb->dev = dev->udev;
>  		urb->pipe = usb_rcvisocpipe(dev->udev, STK1160_EP_VIDEO);
> -		urb->transfer_buffer = dev->isoc_ctl.transfer_buffer[i];
> +		urb->transfer_buffer = dev->isoc_ctl.urb_ctl[i].transfer_buffer;
>  		urb->transfer_buffer_length = sb_size;
>  		urb->complete = stk1160_isoc_irq;
>  		urb->context = dev;
> @@ -508,8 +492,8 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
>  	 * enough to work fine, so we just free the extra urb,
>  	 * store the allocated count and keep going, fingers crossed!
>  	 */
> -	usb_free_urb(dev->isoc_ctl.urb[i]);
> -	dev->isoc_ctl.urb[i] = NULL;
> +	usb_free_urb(dev->isoc_ctl.urb_ctl[i].urb);
> +	dev->isoc_ctl.urb_ctl[i].urb = NULL;
>  
>  	stk1160_warn("%d urbs allocated. Trying to continue...\n", i);
>  
> diff --git a/drivers/media/usb/stk1160/stk1160.h b/drivers/media/usb/stk1160/stk1160.h
> index a70963ce8753..0c355bb078c1 100644
> --- a/drivers/media/usb/stk1160/stk1160.h
> +++ b/drivers/media/usb/stk1160/stk1160.h
> @@ -84,6 +84,11 @@ struct stk1160_buffer {
>  	unsigned int pos;		/* current pos inside buffer */
>  };
>  
> +struct stk1160_urb {
> +	struct urb *urb;
> +	char *transfer_buffer;
> +};
> +
>  struct stk1160_isoc_ctl {
>  	/* max packet size of isoc transaction */
>  	int max_pkt_size;
> @@ -91,11 +96,7 @@ struct stk1160_isoc_ctl {
>  	/* number of allocated urbs */
>  	int num_bufs;
>  
> -	/* urb for isoc transfers */
> -	struct urb **urb;
> -
> -	/* transfer buffers for isoc transfer */
> -	char **transfer_buffer;
> +	struct stk1160_urb urb_ctl[STK1160_NUM_BUFS];
>  
>  	/* current buffer */
>  	struct stk1160_buffer *buf;
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 4/4] media: stk1160: use dma_alloc_noncontiguous API
  2022-01-25  8:02 ` [PATCH v2 4/4] media: stk1160: use dma_alloc_noncontiguous API Dafna Hirschfeld
@ 2022-02-07 15:13   ` Ezequiel Garcia
  0 siblings, 0 replies; 9+ messages in thread
From: Ezequiel Garcia @ 2022-02-07 15:13 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: kernel, linux-media, laurent.pinchart, hverkuil, ribalda, tfiga,
	senozhatsky, hch, dafna3

Hi Dafna,

On Tue, Jan 25, 2022 at 10:02:13AM +0200, Dafna Hirschfeld wrote:
> Replace the urb buffers allocation to use the
> noncontiguous API. This improves performance on ARM
> platform since the previous dma allocation was
> coherent dma allocation which disables the cache on
> ARM platforms. With the noncontiguous API the cache
> is not disabled.
> The noncontiguous API requires the driver to
> handle synchronization.
> This commit is similar to this one for the uvc driver:
> 
^^^^^^
This description still looks weird. How about:

"""
Replace the urb buffers allocation to use the noncontiguous API

This improves performance on ARM platform where DMA coherent allocations
produce uncached mappings. Note that the noncontiguous API
requires the driver to handle synchronization.

This commit is similar to this one for the uvc driver:

  https://lkml.org/lkml/2021/3/12/1506

Performance ...
""

Also, see below a little change:

> 
> Performance tests on rock-pi4 (Arm64) shows about 15x
> improvements:
> 
> == DMA NONCONTIGUOUS ==
> total durations: 20.63678480 sec
> urb processing durations: 0.286864889 sec
> uS/qty: 286864/2508 avg: 114.379 min: 0.583 max: 155.461 (uS)
> FPS: 24.92
> lost: 0 done: 500
> raw decode speed: 11.603 Gbits/s
> bytes 414831228.000
> bytes/urb: 165403
> 
> == DMA COHERENT ==
> total durations: 20.73551767 sec
> urb processing durations: 4.541559160 sec
> uS/qty: 4541559/2509 avg: 1810.107 min: 0.583 max: 2113.163 (uS)
> FPS: 24.90
> lost: 0 done: 500
> raw decode speed: 730.738 Mbits/s
> bytes 414785444.000
> bytes/urb: 165319
> 
> Performance tests on x86 laptop show no significant
> difference:
> 
> == DMA NONCONTIGUOUS ==
> total durations: 20.220590102 sec
> urb processing durations: 0.63021818 sec
> uS/qty: 63021/2512 avg: 25.088 min: 0.138 max: 146.750 (uS)
> FPS: 24.72
> lost: 0 done: 500
> raw decode speed: 52.751 Gbits/s
> bytes 415421032.000
> bytes/urb: 165374
> 
> == DMA COHERENT ==
> total durations: 20.220475614 sec
> urb processing durations: 0.64751972 sec
> uS/qty: 64751/2512 avg: 25.777 min: 0.168 max: 132.250 (uS)
> FPS: 24.72
> lost: 0 done: 500
> raw decode speed: 51.927 Gbits/s
> bytes 415422794.000
> bytes/urb: 165375
> 
> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
> ---
>  drivers/media/usb/stk1160/stk1160-v4l.c   |   4 +
>  drivers/media/usb/stk1160/stk1160-video.c | 114 +++++++++++++---------
>  drivers/media/usb/stk1160/stk1160.h       |  10 ++
>  3 files changed, 84 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/media/usb/stk1160/stk1160-v4l.c b/drivers/media/usb/stk1160/stk1160-v4l.c
> index ebf245d44005..a1f785a5ffd8 100644
> --- a/drivers/media/usb/stk1160/stk1160-v4l.c
> +++ b/drivers/media/usb/stk1160/stk1160-v4l.c
> @@ -232,6 +232,10 @@ static int stk1160_start_streaming(struct stk1160 *dev)
>  
>  	/* submit urbs and enables IRQ */
>  	for (i = 0; i < dev->isoc_ctl.num_bufs; i++) {
> +		struct stk1160_urb *stk_urb = &dev->isoc_ctl.urb_ctl[i];
> +
> +		dma_sync_sgtable_for_device(stk1160_get_dmadev(dev), stk_urb->sgt,
> +					    DMA_FROM_DEVICE);
>  		rc = usb_submit_urb(dev->isoc_ctl.urb_ctl[i].urb, GFP_KERNEL);
>  		if (rc) {
>  			stk1160_err("cannot submit urb[%d] (%d)\n", i, rc);
> diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
> index f3c0497a8539..6600afc565b8 100644
> --- a/drivers/media/usb/stk1160/stk1160-video.c
> +++ b/drivers/media/usb/stk1160/stk1160-video.c
> @@ -295,7 +295,9 @@ static void stk1160_process_isoc(struct stk1160 *dev, struct urb *urb)
>  static void stk1160_isoc_irq(struct urb *urb)
>  {
>  	int i, rc;
> -	struct stk1160 *dev = urb->context;
> +	struct stk1160_urb *stk_urb = urb->context;
> +	struct stk1160 *dev = stk_urb->dev;
> +	struct device *dma_dev = stk1160_get_dmadev(dev);
>  
>  	switch (urb->status) {
>  	case 0:
> @@ -310,6 +312,10 @@ static void stk1160_isoc_irq(struct urb *urb)
>  		return;
>  	}
>  
> +	invalidate_kernel_vmap_range(stk_urb->transfer_buffer,
> +				     urb->transfer_buffer_length);
> +	dma_sync_sgtable_for_cpu(dma_dev, stk_urb->sgt, DMA_FROM_DEVICE);
> +
>  	stk1160_process_isoc(dev, urb);
>  
>  	/* Reset urb buffers */
> @@ -318,6 +324,7 @@ static void stk1160_isoc_irq(struct urb *urb)
>  		urb->iso_frame_desc[i].actual_length = 0;
>  	}
>  
> +	dma_sync_sgtable_for_device(dma_dev, stk_urb->sgt, DMA_FROM_DEVICE);
>  	rc = usb_submit_urb(urb, GFP_ATOMIC);
>  	if (rc)
>  		stk1160_err("urb re-submit failed (%d)\n", rc);
> @@ -353,37 +360,34 @@ void stk1160_cancel_isoc(struct stk1160 *dev)
>  	stk1160_dbg("all urbs killed\n");
>  }
>  
> +static void stk_free_urb_buffer(struct stk1160 *dev, struct stk1160_urb *stk_urb)

Change this function to "stk1160_free_urb". With that,

Reviewed-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-02-07 15:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-25  8:02 [PATCH v2 0/4] stk1160: allocate urb buffs with the DMA noncontiguous API Dafna Hirschfeld
2022-01-25  8:02 ` [PATCH v2 1/4] media: stk1160: fix number of buffers in case not all buffers are created Dafna Hirschfeld
2022-02-06 11:39   ` Ezequiel Garcia
2022-01-25  8:02 ` [PATCH v2 2/4] media: stk1160: If start stream fails, return buffers with VB2_BUF_STATE_QUEUED Dafna Hirschfeld
2022-02-06 11:40   ` Ezequiel Garcia
2022-01-25  8:02 ` [PATCH v2 3/4] media: stk1160: move transfer_buffer and urb to same struct 'stk1160_urb' Dafna Hirschfeld
2022-02-07 15:08   ` Ezequiel Garcia
2022-01-25  8:02 ` [PATCH v2 4/4] media: stk1160: use dma_alloc_noncontiguous API Dafna Hirschfeld
2022-02-07 15:13   ` Ezequiel Garcia

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.