linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] vim2m: make it work properly
@ 2019-01-29 16:00 Mauro Carvalho Chehab
  2019-01-29 16:00 ` [PATCH 1/3] media: vim2m: fix driver for it to handle different fourcc formats Mauro Carvalho Chehab
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2019-01-29 16:00 UTC (permalink / raw)
  To: Linux Media Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab,
	Linux Doc Mailing List, Ezequiel Garcia, Anton Leontiev,
	Tomasz Figa, Sakari Ailus, Hans Verkuil

The vim2m driver has some issues... 

It currently fakes supporting two video formats, when in fact, it just
copies the data to the buffer;

It says it supports hflip, when, in fact, it does a 8 tiles flip...
that doesn't end well, though, due to the lack of proper video
format;

If more than one open() is called, it sometimes go to some dead lock,
preventing to stop all pipelines;

By default, it can be used only one instance, as it takes too long
to generate data (40 msecs). This is actually by purpose, as it
uses a delay work queue for that.

This patch series solve all the above issues. For the last one, a
new modprobe parameter was added, in order to allow changing the
default. For example, with this:

	# sudo modprobe vim2m default_transtime=1

the delay is reduced to 1 ms. On my tests with this pipeline:

$ gst-launch-1.0 videotestsrc ! video/x-raw,format=YUY2 ! v4l2video0convert disable-passthrough=1 extra-controls="s,horizontal_flip=0,vertical_flip=0" ! video/x-raw,format=YUY2 ! videoconvert ! fpsdisplaysink

and a similar one:

$ gst-launch-1.0 videotestsrc ! video/x-raw,format=YUY2 ! v4l2video0convert disable-passthrough=1 extra-controls="s,horizontal_flip=1,vertical_flip=1" ! video/x-raw,format=RGB16 ! videoconvert ! ximagesink

I was able to create 17 such pipelines keeping the frame rate at 30
frames per second, and up to 27 pipelines without losing frames, with
a framerate close to 20 fps.

My tests were done on a 3rd generation i7core machine (i7-3630QM).

So, it sounds good enough to be used for testing m2m, even on nowadays
CPUs with less performance.

I opted to keep the default time to 40 ms to 1 ms, in order to allow
multiple streams, but, in practice, I suspect that just one instance
should be enough for most usecases. So, I ended by keping the 40 ms
timing.

PS.: the first patch is identical to the one I submitted before,
except for a minor change on its description.

This patch series can be found on my development tree:

    https://git.linuxtv.org/mchehab/experimental.git/log/?h=vim2m

Mauro Carvalho Chehab (3):
  media: vim2m: fix driver for it to handle different fourcc formats
  media: vim2m: use per-file handler work queue
  media: vim2m: allow setting the default transaction time via parameter

 drivers/media/platform/vim2m.c | 434 ++++++++++++++++++++-------------
 1 file changed, 270 insertions(+), 164 deletions(-)

-- 
2.20.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] media: vim2m: fix driver for it to handle different fourcc formats
  2019-01-29 16:00 [PATCH 0/3] vim2m: make it work properly Mauro Carvalho Chehab
@ 2019-01-29 16:00 ` Mauro Carvalho Chehab
  2019-01-29 16:00 ` [PATCH 2/3] media: vim2m: use per-file handler work queue Mauro Carvalho Chehab
  2019-01-29 16:00 ` [PATCH 3/3] media: vim2m: allow setting the default transaction time via parameter Mauro Carvalho Chehab
  2 siblings, 0 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2019-01-29 16:00 UTC (permalink / raw)
  To: Linux Media Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab,
	Linux Doc Mailing List, Hans Verkuil, Ezequiel Garcia,
	Sakari Ailus, Tomasz Figa, Anton Leontiev

Despite vim2m is reporting that it supports RGB565BE and YUYV,
that's not true.

Right now, it just says that it supports both format, but it
doesn't actually support them.

Also, horizontal flip is not properly implemented. It sounds
that it was designed to do a pseudo-horizontal flip using 8
tiles. Yet, as it doesn't do format conversion, the result
is a mess.

I suspect that it was done this way in order to save CPU time,
at the time of OMAP2 days.

That's messy and doesn't really help if someone wants to
use vim2m to test a pipeline.

Worse than that, the unique RGB format it says it supports is
RGB565BE, with is not supported by Gstreamer. That prevents
practical usage of it, even for tests.

So, instead, properly implement fourcc format conversions,
adding a few more RGB formats:

	- RGB and BGR with 24 bits
	- RGB565LE (known as RGB16 at gstreamer)

Also allows using any of the 5 supported formats as either
capture or output.

Note: The YUYV conversion routines are based on the conversion code
written by Hans de Goede inside libv4lconvert (part of v4l-utils),
released under LGPGL 2.1 (GPL 2.0 compatible).

Tested all possible format combinations except for RGB565BE,
as Gstreamer currently doesn't support it.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
---
 drivers/media/platform/vim2m.c | 380 +++++++++++++++++++++------------
 1 file changed, 240 insertions(+), 140 deletions(-)

diff --git a/drivers/media/platform/vim2m.c b/drivers/media/platform/vim2m.c
index a7a152fb3075..ccd0576c766e 100644
--- a/drivers/media/platform/vim2m.c
+++ b/drivers/media/platform/vim2m.c
@@ -60,8 +60,6 @@ MODULE_PARM_DESC(debug, "activates debug info");
 
 /* Default transaction time in msec */
 #define MEM2MEM_DEF_TRANSTIME	40
-#define MEM2MEM_COLOR_STEP	(0xff >> 4)
-#define MEM2MEM_NUM_TILES	8
 
 /* Flags that indicate processing mode */
 #define MEM2MEM_HFLIP	(1 << 0)
@@ -82,22 +80,24 @@ static struct platform_device vim2m_pdev = {
 struct vim2m_fmt {
 	u32	fourcc;
 	int	depth;
-	/* Types the format can be used for */
-	u32	types;
 };
 
 static struct vim2m_fmt formats[] = {
 	{
-		.fourcc	= V4L2_PIX_FMT_RGB565X, /* rrrrrggg gggbbbbb */
+		.fourcc	= V4L2_PIX_FMT_RGB565,  /* rrrrrggg gggbbbbb */
 		.depth	= 16,
-		/* Both capture and output format */
-		.types	= MEM2MEM_CAPTURE | MEM2MEM_OUTPUT,
-	},
-	{
+	}, {
+		.fourcc	= V4L2_PIX_FMT_RGB565X, /* gggbbbbb rrrrrggg */
+		.depth	= 16,
+	}, {
+		.fourcc	= V4L2_PIX_FMT_RGB24,
+		.depth	= 24,
+	}, {
+		.fourcc	= V4L2_PIX_FMT_BGR24,
+		.depth	= 24,
+	}, {
 		.fourcc	= V4L2_PIX_FMT_YUYV,
 		.depth	= 16,
-		/* Output-only format */
-		.types	= MEM2MEM_OUTPUT,
 	},
 };
 
@@ -201,23 +201,222 @@ static struct vim2m_q_data *get_q_data(struct vim2m_ctx *ctx,
 	return NULL;
 }
 
+#define CLIP(color) \
+	(u8)(((color) > 0xFF) ? 0xff : (((color) < 0) ? 0 : (color)))
+
+static void copy_two_pixels(struct vim2m_fmt *in, struct vim2m_fmt *out,
+			    u8 **src, u8 **dst, bool reverse)
+{
+	u8 _r[2], _g[2], _b[2], *r, *g, *b;
+	int i, step;
+
+	// If format is the same just copy the data, respecting the width
+	if (in->fourcc == out->fourcc) {
+		int depth = out->depth >> 3;
+
+		if (reverse) {
+			if (in->fourcc == V4L2_PIX_FMT_YUYV) {
+				int u, v, y, y1;
+
+				*src -= 2;
+
+				y1 = (*src)[0]; /* copy as second point */
+				u  = (*src)[1];
+				y  = (*src)[2]; /* copy as first point */
+				v  = (*src)[3];
+
+				*src -= 2;
+
+				*(*dst)++ = y;
+				*(*dst)++ = u;
+				*(*dst)++ = y1;
+				*(*dst)++ = v;
+				return;
+			}
+
+			memcpy(*dst, *src, depth);
+			memcpy(*dst + depth, *src - depth, depth);
+			*src -= depth << 1;
+		} else {
+			memcpy(*dst, *src, depth << 1);
+			*src += depth << 1;
+		}
+		*dst += depth << 1;
+		return;
+	}
+
+	/* Step 1: read two consecutive pixels from src pointer */
+
+	r = _r;
+	g = _g;
+	b = _b;
+
+	if (reverse)
+		step = -1;
+	else
+		step = 1;
+
+	switch (in->fourcc) {
+	case V4L2_PIX_FMT_RGB565: /* rrrrrggg gggbbbbb */
+		for (i = 0; i < 2; i++) {
+			u16 pix = *(u16 *)*src;
+
+			*r++ = (u8)(((pix & 0xf800) >> 11) << 3) | 0x07;
+			*g++ = (u8)((((pix & 0x07e0) >> 5)) << 2) | 0x03;
+			*b++ = (u8)((pix & 0x1f) << 3) | 0x07;
+
+			*src += step << 1;
+		}
+		break;
+	case V4L2_PIX_FMT_RGB565X: /* gggbbbbb rrrrrggg */
+		for (i = 0; i < 2; i++) {
+			u16 pix = *(u16 *)*src;
+
+			*r++ = (u8)(((0x00f8 & pix) >> 3) << 3) | 0x07;
+			*g++ = (u8)(((pix & 0x7) << 2) |
+				    ((pix & 0xe000) >> 5)) | 0x03;
+			*b++ = (u8)(((pix & 0x1f00) >> 8) << 3) | 0x07;
+
+			*src += step << 1;
+		}
+		break;
+	case V4L2_PIX_FMT_RGB24:
+		for (i = 0; i < 2; i++) {
+			*r++ = (*src)[0];
+			*g++ = (*src)[1];
+			*b++ = (*src)[2];
+
+			*src += step * 3;
+		}
+		break;
+	case V4L2_PIX_FMT_BGR24:
+		for (i = 0; i < 2; i++) {
+			*b++ = (*src)[0];
+			*g++ = (*src)[1];
+			*r++ = (*src)[2];
+
+			*src += step * 3;
+		}
+		break;
+	default: /* V4L2_PIX_FMT_YUYV */
+	{
+		int u, v, y, y1, u1, v1, tmp;
+
+		if (reverse) {
+			*src -= 2;
+
+			y1 = (*src)[0]; /* copy as second point */
+			u  = (*src)[1];
+			y  = (*src)[2]; /* copy as first point */
+			v  = (*src)[3];
+
+			*src -= 2;
+		} else {
+			y  = *(*src)++;
+			u  = *(*src)++;
+			y1 = *(*src)++;
+			v  = *(*src)++;
+		}
+
+		u1 = (((u - 128) << 7) +  (u - 128)) >> 6;
+		tmp = (((u - 128) << 1) + (u - 128) +
+		       ((v - 128) << 2) + ((v - 128) << 1)) >> 3;
+		v1 = (((v - 128) << 1) +  (v - 128)) >> 1;
+
+		*r++ = CLIP(y + v1);
+		*g++ = CLIP(y - tmp);
+		*b++ = CLIP(y + u1);
+
+		*r = CLIP(y1 + v1);
+		*g = CLIP(y1 - tmp);
+		*b = CLIP(y1 + u1);
+		break;
+	}
+	}
+
+	/* Step 2: store two consecutive points, reversing them if needed */
+
+	r = _r;
+	g = _g;
+	b = _b;
+
+	switch (out->fourcc) {
+	case V4L2_PIX_FMT_RGB565: /* rrrrrggg gggbbbbb */
+		for (i = 0; i < 2; i++) {
+			u16 *pix = (u16 *) *dst;
+
+			*pix = ((*r << 8) & 0xf800) | ((*g << 3) & 0x07e0) |
+			       (*b >> 3);
+
+			*dst += 2;
+		}
+		return;
+	case V4L2_PIX_FMT_RGB565X: /* gggbbbbb rrrrrggg */
+		for (i = 0; i < 2; i++) {
+			u16 *pix = (u16 *) *dst;
+			u8 green = *g++ >> 2;
+
+			*pix = ((green << 8) & 0xe000) | (green & 0x07) |
+			       ((*b++ << 5) & 0x1f00) | ((*r++ & 0xf8));
+
+			*dst += 2;
+		}
+		return;
+	case V4L2_PIX_FMT_RGB24:
+		for (i = 0; i < 2; i++) {
+			*(*dst)++ = *r++;
+			*(*dst)++ = *g++;
+			*(*dst)++ = *b++;
+		}
+		return;
+	case V4L2_PIX_FMT_BGR24:
+		for (i = 0; i < 2; i++) {
+			*(*dst)++ = *b++;
+			*(*dst)++ = *g++;
+			*(*dst)++ = *r++;
+		}
+		return;
+	default: /* V4L2_PIX_FMT_YUYV */
+	{
+		u8 y, y1, u, v;
+
+		y = ((8453  * (*r) + 16594 * (*g) +  3223 * (*b)
+		     + 524288) >> 15);
+		u = ((-4878 * (*r) - 9578  * (*g) + 14456 * (*b)
+		     + 4210688) >> 15);
+		v = ((14456 * (*r++) - 12105 * (*g++) - 2351 * (*b++)
+		     + 4210688) >> 15);
+		y1 = ((8453 * (*r) + 16594 * (*g) +  3223 * (*b)
+		     + 524288) >> 15);
+
+		*(*dst)++ = y;
+		*(*dst)++ = u;
+
+		*(*dst)++ = y1;
+		*(*dst)++ = v;
+		return;
+	}
+	}
+}
 
 static int device_process(struct vim2m_ctx *ctx,
 			  struct vb2_v4l2_buffer *in_vb,
 			  struct vb2_v4l2_buffer *out_vb)
 {
 	struct vim2m_dev *dev = ctx->dev;
-	struct vim2m_q_data *q_data;
-	u8 *p_in, *p_out;
-	int x, y, t, w;
-	int tile_w, bytes_left;
-	int width, height, bytesperline;
+	struct vim2m_q_data *q_data_in, *q_data_out;
+	u8 *p_in, *p, *p_out;
+	int width, height, bytesperline, x, y, start, end, step;
+	struct vim2m_fmt *in, *out;
 
-	q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
+	q_data_in = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT);
+	in = q_data_in->fmt;
+	width = q_data_in->width;
+	height = q_data_in->height;
+	bytesperline = (q_data_in->width * q_data_in->fmt->depth) >> 3;
 
-	width	= q_data->width;
-	height	= q_data->height;
-	bytesperline	= (q_data->width * q_data->fmt->depth) >> 3;
+	q_data_out = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
+	out = q_data_out->fmt;
 
 	p_in = vb2_plane_vaddr(&in_vb->vb2_buf, 0);
 	p_out = vb2_plane_vaddr(&out_vb->vb2_buf, 0);
@@ -227,100 +426,28 @@ static int device_process(struct vim2m_ctx *ctx,
 		return -EFAULT;
 	}
 
-	if (vb2_plane_size(&in_vb->vb2_buf, 0) >
-			vb2_plane_size(&out_vb->vb2_buf, 0)) {
-		v4l2_err(&dev->v4l2_dev, "Output buffer is too small\n");
-		return -EINVAL;
-	}
-
-	tile_w = (width * (q_data[V4L2_M2M_DST].fmt->depth >> 3))
-		/ MEM2MEM_NUM_TILES;
-	bytes_left = bytesperline - tile_w * MEM2MEM_NUM_TILES;
-	w = 0;
-
-	out_vb->sequence =
-		get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE)->sequence++;
-	in_vb->sequence = q_data->sequence++;
+	out_vb->sequence = get_q_data(ctx,
+				      V4L2_BUF_TYPE_VIDEO_CAPTURE)->sequence++;
+	in_vb->sequence = q_data_in->sequence++;
 	v4l2_m2m_buf_copy_data(in_vb, out_vb, true);
 
-	switch (ctx->mode) {
-	case MEM2MEM_HFLIP | MEM2MEM_VFLIP:
-		p_out += bytesperline * height - bytes_left;
-		for (y = 0; y < height; ++y) {
-			for (t = 0; t < MEM2MEM_NUM_TILES; ++t) {
-				if (w & 0x1) {
-					for (x = 0; x < tile_w; ++x)
-						*--p_out = *p_in++ +
-							MEM2MEM_COLOR_STEP;
-				} else {
-					for (x = 0; x < tile_w; ++x)
-						*--p_out = *p_in++ -
-							MEM2MEM_COLOR_STEP;
-				}
-				++w;
-			}
-			p_in += bytes_left;
-			p_out -= bytes_left;
-		}
-		break;
+	if (ctx->mode & MEM2MEM_VFLIP) {
+		start = height - 1;
+		end = -1;
+		step = -1;
+	} else {
+		start = 0;
+		end = height;
+		step = 1;
+	}
+	for (y = start; y != end; y += step) {
+		p = p_in + (y * bytesperline);
+		if (ctx->mode & MEM2MEM_HFLIP)
+			p += bytesperline - (q_data_in->fmt->depth >> 3);
 
-	case MEM2MEM_HFLIP:
-		for (y = 0; y < height; ++y) {
-			p_out += MEM2MEM_NUM_TILES * tile_w;
-			for (t = 0; t < MEM2MEM_NUM_TILES; ++t) {
-				if (w & 0x01) {
-					for (x = 0; x < tile_w; ++x)
-						*--p_out = *p_in++ +
-							MEM2MEM_COLOR_STEP;
-				} else {
-					for (x = 0; x < tile_w; ++x)
-						*--p_out = *p_in++ -
-							MEM2MEM_COLOR_STEP;
-				}
-				++w;
-			}
-			p_in += bytes_left;
-			p_out += bytesperline;
-		}
-		break;
-
-	case MEM2MEM_VFLIP:
-		p_out += bytesperline * (height - 1);
-		for (y = 0; y < height; ++y) {
-			for (t = 0; t < MEM2MEM_NUM_TILES; ++t) {
-				if (w & 0x1) {
-					for (x = 0; x < tile_w; ++x)
-						*p_out++ = *p_in++ +
-							MEM2MEM_COLOR_STEP;
-				} else {
-					for (x = 0; x < tile_w; ++x)
-						*p_out++ = *p_in++ -
-							MEM2MEM_COLOR_STEP;
-				}
-				++w;
-			}
-			p_in += bytes_left;
-			p_out += bytes_left - 2 * bytesperline;
-		}
-		break;
-
-	default:
-		for (y = 0; y < height; ++y) {
-			for (t = 0; t < MEM2MEM_NUM_TILES; ++t) {
-				if (w & 0x1) {
-					for (x = 0; x < tile_w; ++x)
-						*p_out++ = *p_in++ +
-							MEM2MEM_COLOR_STEP;
-				} else {
-					for (x = 0; x < tile_w; ++x)
-						*p_out++ = *p_in++ -
-							MEM2MEM_COLOR_STEP;
-				}
-				++w;
-			}
-			p_in += bytes_left;
-			p_out += bytes_left;
-		}
+		for (x = 0; x < width >> 1; x++)
+			copy_two_pixels(in, out, &p, &p_out,
+					ctx->mode & MEM2MEM_HFLIP);
 	}
 
 	return 0;
@@ -433,25 +560,11 @@ static int vidioc_querycap(struct file *file, void *priv,
 
 static int enum_fmt(struct v4l2_fmtdesc *f, u32 type)
 {
-	int i, num;
 	struct vim2m_fmt *fmt;
 
-	num = 0;
-
-	for (i = 0; i < NUM_FORMATS; ++i) {
-		if (formats[i].types & type) {
-			/* index-th format of type type found ? */
-			if (num == f->index)
-				break;
-			/* Correct type but haven't reached our index yet,
-			 * just increment per-type index */
-			++num;
-		}
-	}
-
-	if (i < NUM_FORMATS) {
+	if (f->index < NUM_FORMATS) {
 		/* Format found */
-		fmt = &formats[i];
+		fmt = &formats[f->index];
 		f->pixelformat = fmt->fourcc;
 		return 0;
 	}
@@ -542,12 +655,6 @@ static int vidioc_try_fmt_vid_cap(struct file *file, void *priv,
 		f->fmt.pix.pixelformat = formats[0].fourcc;
 		fmt = find_format(f);
 	}
-	if (!(fmt->types & MEM2MEM_CAPTURE)) {
-		v4l2_err(&ctx->dev->v4l2_dev,
-			 "Fourcc format (0x%08x) invalid.\n",
-			 f->fmt.pix.pixelformat);
-		return -EINVAL;
-	}
 	f->fmt.pix.colorspace = ctx->colorspace;
 	f->fmt.pix.xfer_func = ctx->xfer_func;
 	f->fmt.pix.ycbcr_enc = ctx->ycbcr_enc;
@@ -560,19 +667,12 @@ static int vidioc_try_fmt_vid_out(struct file *file, void *priv,
 				  struct v4l2_format *f)
 {
 	struct vim2m_fmt *fmt;
-	struct vim2m_ctx *ctx = file2ctx(file);
 
 	fmt = find_format(f);
 	if (!fmt) {
 		f->fmt.pix.pixelformat = formats[0].fourcc;
 		fmt = find_format(f);
 	}
-	if (!(fmt->types & MEM2MEM_OUTPUT)) {
-		v4l2_err(&ctx->dev->v4l2_dev,
-			 "Fourcc format (0x%08x) invalid.\n",
-			 f->fmt.pix.pixelformat);
-		return -EINVAL;
-	}
 	if (!f->fmt.pix.colorspace)
 		f->fmt.pix.colorspace = V4L2_COLORSPACE_REC709;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] media: vim2m: use per-file handler work queue
  2019-01-29 16:00 [PATCH 0/3] vim2m: make it work properly Mauro Carvalho Chehab
  2019-01-29 16:00 ` [PATCH 1/3] media: vim2m: fix driver for it to handle different fourcc formats Mauro Carvalho Chehab
@ 2019-01-29 16:00 ` Mauro Carvalho Chehab
  2019-01-30 12:41   ` Ezequiel Garcia
  2019-01-29 16:00 ` [PATCH 3/3] media: vim2m: allow setting the default transaction time via parameter Mauro Carvalho Chehab
  2 siblings, 1 reply; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2019-01-29 16:00 UTC (permalink / raw)
  To: Linux Media Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab,
	Linux Doc Mailing List, Hans Verkuil, Ezequiel Garcia,
	Anton Leontiev, Sakari Ailus

It doesn't make sense to have a per-device work queue, as the
scheduler should be called per file handler. Having a single
one causes failures if multiple streams are filtered by vim2m.

So, move it to be inside the context structure.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
---
 drivers/media/platform/vim2m.c | 38 +++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/media/platform/vim2m.c b/drivers/media/platform/vim2m.c
index ccd0576c766e..a9e43070567e 100644
--- a/drivers/media/platform/vim2m.c
+++ b/drivers/media/platform/vim2m.c
@@ -146,9 +146,6 @@ struct vim2m_dev {
 
 	atomic_t		num_inst;
 	struct mutex		dev_mutex;
-	spinlock_t		irqlock;
-
-	struct delayed_work	work_run;
 
 	struct v4l2_m2m_dev	*m2m_dev;
 };
@@ -167,6 +164,10 @@ struct vim2m_ctx {
 	/* Transaction time (i.e. simulated processing time) in milliseconds */
 	u32			transtime;
 
+	struct mutex		vb_mutex;
+	struct delayed_work	work_run;
+	spinlock_t		irqlock;
+
 	/* Abort requested by m2m */
 	int			aborting;
 
@@ -490,7 +491,6 @@ static void job_abort(void *priv)
 static void device_run(void *priv)
 {
 	struct vim2m_ctx *ctx = priv;
-	struct vim2m_dev *dev = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
 
 	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
@@ -507,18 +507,18 @@ static void device_run(void *priv)
 				   &ctx->hdl);
 
 	/* Run delayed work, which simulates a hardware irq  */
-	schedule_delayed_work(&dev->work_run, msecs_to_jiffies(ctx->transtime));
+	schedule_delayed_work(&ctx->work_run, msecs_to_jiffies(ctx->transtime));
 }
 
 static void device_work(struct work_struct *w)
 {
-	struct vim2m_dev *vim2m_dev =
-		container_of(w, struct vim2m_dev, work_run.work);
 	struct vim2m_ctx *curr_ctx;
+	struct vim2m_dev *vim2m_dev;
 	struct vb2_v4l2_buffer *src_vb, *dst_vb;
 	unsigned long flags;
 
-	curr_ctx = v4l2_m2m_get_curr_priv(vim2m_dev->m2m_dev);
+	curr_ctx = container_of(w, struct vim2m_ctx, work_run.work);
+	vim2m_dev = curr_ctx->dev;
 
 	if (NULL == curr_ctx) {
 		pr_err("Instance released before the end of transaction\n");
@@ -530,10 +530,10 @@ static void device_work(struct work_struct *w)
 
 	curr_ctx->num_processed++;
 
-	spin_lock_irqsave(&vim2m_dev->irqlock, flags);
+	spin_lock_irqsave(&curr_ctx->irqlock, flags);
 	v4l2_m2m_buf_done(src_vb, VB2_BUF_STATE_DONE);
 	v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_DONE);
-	spin_unlock_irqrestore(&vim2m_dev->irqlock, flags);
+	spin_unlock_irqrestore(&curr_ctx->irqlock, flags);
 
 	if (curr_ctx->num_processed == curr_ctx->translen
 	    || curr_ctx->aborting) {
@@ -893,11 +893,10 @@ static int vim2m_start_streaming(struct vb2_queue *q, unsigned count)
 static void vim2m_stop_streaming(struct vb2_queue *q)
 {
 	struct vim2m_ctx *ctx = vb2_get_drv_priv(q);
-	struct vim2m_dev *dev = ctx->dev;
 	struct vb2_v4l2_buffer *vbuf;
 	unsigned long flags;
 
-	cancel_delayed_work_sync(&dev->work_run);
+	cancel_delayed_work_sync(&ctx->work_run);
 	for (;;) {
 		if (V4L2_TYPE_IS_OUTPUT(q->type))
 			vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
@@ -907,9 +906,9 @@ static void vim2m_stop_streaming(struct vb2_queue *q)
 			return;
 		v4l2_ctrl_request_complete(vbuf->vb2_buf.req_obj.req,
 					   &ctx->hdl);
-		spin_lock_irqsave(&ctx->dev->irqlock, flags);
+		spin_lock_irqsave(&ctx->irqlock, flags);
 		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
-		spin_unlock_irqrestore(&ctx->dev->irqlock, flags);
+		spin_unlock_irqrestore(&ctx->irqlock, flags);
 	}
 }
 
@@ -943,7 +942,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *ds
 	src_vq->ops = &vim2m_qops;
 	src_vq->mem_ops = &vb2_vmalloc_memops;
 	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
-	src_vq->lock = &ctx->dev->dev_mutex;
+	src_vq->lock = &ctx->vb_mutex;
 	src_vq->supports_requests = true;
 
 	ret = vb2_queue_init(src_vq);
@@ -957,7 +956,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *ds
 	dst_vq->ops = &vim2m_qops;
 	dst_vq->mem_ops = &vb2_vmalloc_memops;
 	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
-	dst_vq->lock = &ctx->dev->dev_mutex;
+	dst_vq->lock = &ctx->vb_mutex;
 
 	return vb2_queue_init(dst_vq);
 }
@@ -1032,6 +1031,10 @@ static int vim2m_open(struct file *file)
 
 	ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(dev->m2m_dev, ctx, &queue_init);
 
+	mutex_init(&ctx->vb_mutex);
+	spin_lock_init(&ctx->irqlock);
+	INIT_DELAYED_WORK(&ctx->work_run, device_work);
+
 	if (IS_ERR(ctx->fh.m2m_ctx)) {
 		rc = PTR_ERR(ctx->fh.m2m_ctx);
 
@@ -1112,8 +1115,6 @@ static int vim2m_probe(struct platform_device *pdev)
 	if (!dev)
 		return -ENOMEM;
 
-	spin_lock_init(&dev->irqlock);
-
 	ret = v4l2_device_register(&pdev->dev, &dev->v4l2_dev);
 	if (ret)
 		return ret;
@@ -1125,7 +1126,6 @@ static int vim2m_probe(struct platform_device *pdev)
 	vfd = &dev->vfd;
 	vfd->lock = &dev->dev_mutex;
 	vfd->v4l2_dev = &dev->v4l2_dev;
-	INIT_DELAYED_WORK(&dev->work_run, device_work);
 
 	ret = video_register_device(vfd, VFL_TYPE_GRABBER, 0);
 	if (ret) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] media: vim2m: allow setting the default transaction time via parameter
  2019-01-29 16:00 [PATCH 0/3] vim2m: make it work properly Mauro Carvalho Chehab
  2019-01-29 16:00 ` [PATCH 1/3] media: vim2m: fix driver for it to handle different fourcc formats Mauro Carvalho Chehab
  2019-01-29 16:00 ` [PATCH 2/3] media: vim2m: use per-file handler work queue Mauro Carvalho Chehab
@ 2019-01-29 16:00 ` Mauro Carvalho Chehab
  2 siblings, 0 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2019-01-29 16:00 UTC (permalink / raw)
  To: Linux Media Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab,
	Linux Doc Mailing List, Hans Verkuil, Ezequiel Garcia,
	Tomasz Figa, Sakari Ailus, Anton Leontiev

While there's a control to allow setting it at runtime, as the
control handler is per file handler, only the application setting
the m2m device can change it. As this is a custom control, it is
unlikely that existing apps would be able to set it.

Due to that, and due to the fact that v4l2-mem2mem serializes all
accesses to a m2m device, trying to setup two GStreamer
v4l2videoconvert instance at the same time will cause frame drops.

So, add an alternate way of setting its default via a modprobe parameter.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
---
 drivers/media/platform/vim2m.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/media/platform/vim2m.c b/drivers/media/platform/vim2m.c
index a9e43070567e..0e7814b2327e 100644
--- a/drivers/media/platform/vim2m.c
+++ b/drivers/media/platform/vim2m.c
@@ -41,6 +41,12 @@ static unsigned debug;
 module_param(debug, uint, 0644);
 MODULE_PARM_DESC(debug, "activates debug info");
 
+/* Default transaction time in msec */
+static unsigned default_transtime = 40; /* Max 25 fps */
+module_param(default_transtime, uint, 0644);
+MODULE_PARM_DESC(default_transtime, "default transaction time in ms");
+
+
 #define MIN_W 32
 #define MIN_H 32
 #define MAX_W 640
@@ -58,9 +64,6 @@ MODULE_PARM_DESC(debug, "activates debug info");
 /* In bytes, per queue */
 #define MEM2MEM_VID_MEM_LIMIT	(16 * 1024 * 1024)
 
-/* Default transaction time in msec */
-#define MEM2MEM_DEF_TRANSTIME	40
-
 /* Flags that indicate processing mode */
 #define MEM2MEM_HFLIP	(1 << 0)
 #define MEM2MEM_VFLIP	(1 << 1)
@@ -764,6 +767,8 @@ static int vim2m_s_ctrl(struct v4l2_ctrl *ctrl)
 
 	case V4L2_CID_TRANS_TIME_MSEC:
 		ctx->transtime = ctrl->val;
+		if (ctx->transtime < 1)
+			ctx->transtime = 1;
 		break;
 
 	case V4L2_CID_TRANS_NUM_BUFS:
@@ -961,12 +966,11 @@ static int queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *ds
 	return vb2_queue_init(dst_vq);
 }
 
-static const struct v4l2_ctrl_config vim2m_ctrl_trans_time_msec = {
+static struct v4l2_ctrl_config vim2m_ctrl_trans_time_msec = {
 	.ops = &vim2m_ctrl_ops,
 	.id = V4L2_CID_TRANS_TIME_MSEC,
 	.name = "Transaction Time (msec)",
 	.type = V4L2_CTRL_TYPE_INTEGER,
-	.def = MEM2MEM_DEF_TRANSTIME,
 	.min = 1,
 	.max = 10001,
 	.step = 1,
@@ -1008,6 +1012,8 @@ static int vim2m_open(struct file *file)
 	v4l2_ctrl_handler_init(hdl, 4);
 	v4l2_ctrl_new_std(hdl, &vim2m_ctrl_ops, V4L2_CID_HFLIP, 0, 1, 1, 0);
 	v4l2_ctrl_new_std(hdl, &vim2m_ctrl_ops, V4L2_CID_VFLIP, 0, 1, 1, 0);
+
+	vim2m_ctrl_trans_time_msec.def = default_transtime;
 	v4l2_ctrl_new_custom(hdl, &vim2m_ctrl_trans_time_msec, NULL);
 	v4l2_ctrl_new_custom(hdl, &vim2m_ctrl_trans_num_bufs, NULL);
 	if (hdl->error) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] media: vim2m: use per-file handler work queue
  2019-01-29 16:00 ` [PATCH 2/3] media: vim2m: use per-file handler work queue Mauro Carvalho Chehab
@ 2019-01-30 12:41   ` Ezequiel Garcia
  2019-01-30 13:19     ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 8+ messages in thread
From: Ezequiel Garcia @ 2019-01-30 12:41 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Linux Media Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Linux Doc Mailing List, Hans Verkuil,
	Anton Leontiev, Sakari Ailus

On Tue, 2019-01-29 at 14:00 -0200, Mauro Carvalho Chehab wrote:
> It doesn't make sense to have a per-device work queue, as the
> scheduler should be called per file handler. Having a single
> one causes failures if multiple streams are filtered by vim2m.
> 

Having a per-device workqueue should emulate a real device more closely.

But more importantly, why would having a single workqeueue fail if multiple
streams are run? The m2m should take care of the proper serialization
between multiple contexts, unless I am missing something here.

Thanks,
Eze

> So, move it to be inside the context structure.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
> ---
>  drivers/media/platform/vim2m.c | 38 +++++++++++++++++-----------------
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/media/platform/vim2m.c b/drivers/media/platform/vim2m.c
> index ccd0576c766e..a9e43070567e 100644
> --- a/drivers/media/platform/vim2m.c
> +++ b/drivers/media/platform/vim2m.c
> @@ -146,9 +146,6 @@ struct vim2m_dev {
>  
>  	atomic_t		num_inst;
>  	struct mutex		dev_mutex;
> -	spinlock_t		irqlock;
> -
> -	struct delayed_work	work_run;
>  
>  	struct v4l2_m2m_dev	*m2m_dev;
>  };
> @@ -167,6 +164,10 @@ struct vim2m_ctx {
>  	/* Transaction time (i.e. simulated processing time) in milliseconds */
>  	u32			transtime;
>  
> +	struct mutex		vb_mutex;
> +	struct delayed_work	work_run;
> +	spinlock_t		irqlock;
> +
>  	/* Abort requested by m2m */
>  	int			aborting;
>  
> @@ -490,7 +491,6 @@ static void job_abort(void *priv)
>  static void device_run(void *priv)
>  {
>  	struct vim2m_ctx *ctx = priv;
> -	struct vim2m_dev *dev = ctx->dev;
>  	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>  
>  	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
> @@ -507,18 +507,18 @@ static void device_run(void *priv)
>  				   &ctx->hdl);
>  
>  	/* Run delayed work, which simulates a hardware irq  */
> -	schedule_delayed_work(&dev->work_run, msecs_to_jiffies(ctx->transtime));
> +	schedule_delayed_work(&ctx->work_run, msecs_to_jiffies(ctx->transtime));
>  }
>  
>  static void device_work(struct work_struct *w)
>  {
> -	struct vim2m_dev *vim2m_dev =
> -		container_of(w, struct vim2m_dev, work_run.work);
>  	struct vim2m_ctx *curr_ctx;
> +	struct vim2m_dev *vim2m_dev;
>  	struct vb2_v4l2_buffer *src_vb, *dst_vb;
>  	unsigned long flags;
>  
> -	curr_ctx = v4l2_m2m_get_curr_priv(vim2m_dev->m2m_dev);
> +	curr_ctx = container_of(w, struct vim2m_ctx, work_run.work);
> +	vim2m_dev = curr_ctx->dev;
>  
>  	if (NULL == curr_ctx) {
>  		pr_err("Instance released before the end of transaction\n");
> @@ -530,10 +530,10 @@ static void device_work(struct work_struct *w)
>  
>  	curr_ctx->num_processed++;
>  
> -	spin_lock_irqsave(&vim2m_dev->irqlock, flags);
> +	spin_lock_irqsave(&curr_ctx->irqlock, flags);
>  	v4l2_m2m_buf_done(src_vb, VB2_BUF_STATE_DONE);
>  	v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_DONE);
> -	spin_unlock_irqrestore(&vim2m_dev->irqlock, flags);
> +	spin_unlock_irqrestore(&curr_ctx->irqlock, flags);
>  
>  	if (curr_ctx->num_processed == curr_ctx->translen
>  	    || curr_ctx->aborting) {
> @@ -893,11 +893,10 @@ static int vim2m_start_streaming(struct vb2_queue *q, unsigned count)
>  static void vim2m_stop_streaming(struct vb2_queue *q)
>  {
>  	struct vim2m_ctx *ctx = vb2_get_drv_priv(q);
> -	struct vim2m_dev *dev = ctx->dev;
>  	struct vb2_v4l2_buffer *vbuf;
>  	unsigned long flags;
>  
> -	cancel_delayed_work_sync(&dev->work_run);
> +	cancel_delayed_work_sync(&ctx->work_run);
>  	for (;;) {
>  		if (V4L2_TYPE_IS_OUTPUT(q->type))
>  			vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> @@ -907,9 +906,9 @@ static void vim2m_stop_streaming(struct vb2_queue *q)
>  			return;
>  		v4l2_ctrl_request_complete(vbuf->vb2_buf.req_obj.req,
>  					   &ctx->hdl);
> -		spin_lock_irqsave(&ctx->dev->irqlock, flags);
> +		spin_lock_irqsave(&ctx->irqlock, flags);
>  		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
> -		spin_unlock_irqrestore(&ctx->dev->irqlock, flags);
> +		spin_unlock_irqrestore(&ctx->irqlock, flags);
>  	}
>  }
>  
> @@ -943,7 +942,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *ds
>  	src_vq->ops = &vim2m_qops;
>  	src_vq->mem_ops = &vb2_vmalloc_memops;
>  	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> -	src_vq->lock = &ctx->dev->dev_mutex;
> +	src_vq->lock = &ctx->vb_mutex;
>  	src_vq->supports_requests = true;
>  
>  	ret = vb2_queue_init(src_vq);
> @@ -957,7 +956,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *ds
>  	dst_vq->ops = &vim2m_qops;
>  	dst_vq->mem_ops = &vb2_vmalloc_memops;
>  	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> -	dst_vq->lock = &ctx->dev->dev_mutex;
> +	dst_vq->lock = &ctx->vb_mutex;
>  
>  	return vb2_queue_init(dst_vq);
>  }
> @@ -1032,6 +1031,10 @@ static int vim2m_open(struct file *file)
>  
>  	ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(dev->m2m_dev, ctx, &queue_init);
>  
> +	mutex_init(&ctx->vb_mutex);
> +	spin_lock_init(&ctx->irqlock);
> +	INIT_DELAYED_WORK(&ctx->work_run, device_work);
> +
>  	if (IS_ERR(ctx->fh.m2m_ctx)) {
>  		rc = PTR_ERR(ctx->fh.m2m_ctx);
>  
> @@ -1112,8 +1115,6 @@ static int vim2m_probe(struct platform_device *pdev)
>  	if (!dev)
>  		return -ENOMEM;
>  
> -	spin_lock_init(&dev->irqlock);
> -
>  	ret = v4l2_device_register(&pdev->dev, &dev->v4l2_dev);
>  	if (ret)
>  		return ret;
> @@ -1125,7 +1126,6 @@ static int vim2m_probe(struct platform_device *pdev)
>  	vfd = &dev->vfd;
>  	vfd->lock = &dev->dev_mutex;
>  	vfd->v4l2_dev = &dev->v4l2_dev;
> -	INIT_DELAYED_WORK(&dev->work_run, device_work);
>  
>  	ret = video_register_device(vfd, VFL_TYPE_GRABBER, 0);
>  	if (ret) {



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] media: vim2m: use per-file handler work queue
  2019-01-30 12:41   ` Ezequiel Garcia
@ 2019-01-30 13:19     ` Mauro Carvalho Chehab
  2019-01-30 14:56       ` Ezequiel Garcia
  0 siblings, 1 reply; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2019-01-30 13:19 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Linux Media Mailing List, Jonathan Corbet, Mauro Carvalho Chehab,
	Linux Doc Mailing List, Hans Verkuil, Anton Leontiev,
	Sakari Ailus

Em Wed, 30 Jan 2019 09:41:44 -0300
Ezequiel Garcia <ezequiel@collabora.com> escreveu:

> On Tue, 2019-01-29 at 14:00 -0200, Mauro Carvalho Chehab wrote:
> > It doesn't make sense to have a per-device work queue, as the
> > scheduler should be called per file handler. Having a single
> > one causes failures if multiple streams are filtered by vim2m.
> >   
> 
> Having a per-device workqueue should emulate a real device more closely.

Yes.

> But more importantly, why would having a single workqeueue fail if multiple
> streams are run? The m2m should take care of the proper serialization
> between multiple contexts, unless I am missing something here.

Yes, the m2m core serializes the access to m2m src/dest buffer per device.

So, two instances can't access the buffer at the same time. That makes
sense for a real physical hardware, although specifically for the virtual
one, it doesn't (any may not make sense for some stateless codec, if
the codec would internally be able to handle multiple requests at the same
time).

Without this patch, when multiple instances are used, sometimes it ends 
into a dead lock preventing to stop all of them.

I didn't have time to debug where exactly it happens, but I suspect that
the issue is related to using the same mutex for both VB and open/release
instances.

Yet, I ended by opting to move all queue-specific mutex/semaphore to be
instance-based, as this makes a lot more sense, IMHO. Also, if some day
we end by allowing support for some hardware that would have support to
run multiple m2m instances in parallel, vim2m would already be ready.


> 
> Thanks,
> Eze
> 
> > So, move it to be inside the context structure.
> > 
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
> > ---
> >  drivers/media/platform/vim2m.c | 38 +++++++++++++++++-----------------
> >  1 file changed, 19 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/media/platform/vim2m.c b/drivers/media/platform/vim2m.c
> > index ccd0576c766e..a9e43070567e 100644
> > --- a/drivers/media/platform/vim2m.c
> > +++ b/drivers/media/platform/vim2m.c
> > @@ -146,9 +146,6 @@ struct vim2m_dev {
> >  
> >  	atomic_t		num_inst;
> >  	struct mutex		dev_mutex;
> > -	spinlock_t		irqlock;
> > -
> > -	struct delayed_work	work_run;
> >  
> >  	struct v4l2_m2m_dev	*m2m_dev;
> >  };
> > @@ -167,6 +164,10 @@ struct vim2m_ctx {
> >  	/* Transaction time (i.e. simulated processing time) in milliseconds */
> >  	u32			transtime;
> >  
> > +	struct mutex		vb_mutex;
> > +	struct delayed_work	work_run;
> > +	spinlock_t		irqlock;
> > +
> >  	/* Abort requested by m2m */
> >  	int			aborting;
> >  
> > @@ -490,7 +491,6 @@ static void job_abort(void *priv)
> >  static void device_run(void *priv)
> >  {
> >  	struct vim2m_ctx *ctx = priv;
> > -	struct vim2m_dev *dev = ctx->dev;
> >  	struct vb2_v4l2_buffer *src_buf, *dst_buf;
> >  
> >  	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
> > @@ -507,18 +507,18 @@ static void device_run(void *priv)
> >  				   &ctx->hdl);
> >  
> >  	/* Run delayed work, which simulates a hardware irq  */
> > -	schedule_delayed_work(&dev->work_run, msecs_to_jiffies(ctx->transtime));
> > +	schedule_delayed_work(&ctx->work_run, msecs_to_jiffies(ctx->transtime));
> >  }
> >  
> >  static void device_work(struct work_struct *w)
> >  {
> > -	struct vim2m_dev *vim2m_dev =
> > -		container_of(w, struct vim2m_dev, work_run.work);
> >  	struct vim2m_ctx *curr_ctx;
> > +	struct vim2m_dev *vim2m_dev;
> >  	struct vb2_v4l2_buffer *src_vb, *dst_vb;
> >  	unsigned long flags;
> >  
> > -	curr_ctx = v4l2_m2m_get_curr_priv(vim2m_dev->m2m_dev);
> > +	curr_ctx = container_of(w, struct vim2m_ctx, work_run.work);
> > +	vim2m_dev = curr_ctx->dev;
> >  
> >  	if (NULL == curr_ctx) {
> >  		pr_err("Instance released before the end of transaction\n");
> > @@ -530,10 +530,10 @@ static void device_work(struct work_struct *w)
> >  
> >  	curr_ctx->num_processed++;
> >  
> > -	spin_lock_irqsave(&vim2m_dev->irqlock, flags);
> > +	spin_lock_irqsave(&curr_ctx->irqlock, flags);
> >  	v4l2_m2m_buf_done(src_vb, VB2_BUF_STATE_DONE);
> >  	v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_DONE);
> > -	spin_unlock_irqrestore(&vim2m_dev->irqlock, flags);
> > +	spin_unlock_irqrestore(&curr_ctx->irqlock, flags);
> >  
> >  	if (curr_ctx->num_processed == curr_ctx->translen
> >  	    || curr_ctx->aborting) {
> > @@ -893,11 +893,10 @@ static int vim2m_start_streaming(struct vb2_queue *q, unsigned count)
> >  static void vim2m_stop_streaming(struct vb2_queue *q)
> >  {
> >  	struct vim2m_ctx *ctx = vb2_get_drv_priv(q);
> > -	struct vim2m_dev *dev = ctx->dev;
> >  	struct vb2_v4l2_buffer *vbuf;
> >  	unsigned long flags;
> >  
> > -	cancel_delayed_work_sync(&dev->work_run);
> > +	cancel_delayed_work_sync(&ctx->work_run);
> >  	for (;;) {
> >  		if (V4L2_TYPE_IS_OUTPUT(q->type))
> >  			vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> > @@ -907,9 +906,9 @@ static void vim2m_stop_streaming(struct vb2_queue *q)
> >  			return;
> >  		v4l2_ctrl_request_complete(vbuf->vb2_buf.req_obj.req,
> >  					   &ctx->hdl);
> > -		spin_lock_irqsave(&ctx->dev->irqlock, flags);
> > +		spin_lock_irqsave(&ctx->irqlock, flags);
> >  		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
> > -		spin_unlock_irqrestore(&ctx->dev->irqlock, flags);
> > +		spin_unlock_irqrestore(&ctx->irqlock, flags);
> >  	}
> >  }
> >  
> > @@ -943,7 +942,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *ds
> >  	src_vq->ops = &vim2m_qops;
> >  	src_vq->mem_ops = &vb2_vmalloc_memops;
> >  	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > -	src_vq->lock = &ctx->dev->dev_mutex;
> > +	src_vq->lock = &ctx->vb_mutex;
> >  	src_vq->supports_requests = true;
> >  
> >  	ret = vb2_queue_init(src_vq);
> > @@ -957,7 +956,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *ds
> >  	dst_vq->ops = &vim2m_qops;
> >  	dst_vq->mem_ops = &vb2_vmalloc_memops;
> >  	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> > -	dst_vq->lock = &ctx->dev->dev_mutex;
> > +	dst_vq->lock = &ctx->vb_mutex;
> >  
> >  	return vb2_queue_init(dst_vq);
> >  }
> > @@ -1032,6 +1031,10 @@ static int vim2m_open(struct file *file)
> >  
> >  	ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(dev->m2m_dev, ctx, &queue_init);
> >  
> > +	mutex_init(&ctx->vb_mutex);
> > +	spin_lock_init(&ctx->irqlock);
> > +	INIT_DELAYED_WORK(&ctx->work_run, device_work);
> > +
> >  	if (IS_ERR(ctx->fh.m2m_ctx)) {
> >  		rc = PTR_ERR(ctx->fh.m2m_ctx);
> >  
> > @@ -1112,8 +1115,6 @@ static int vim2m_probe(struct platform_device *pdev)
> >  	if (!dev)
> >  		return -ENOMEM;
> >  
> > -	spin_lock_init(&dev->irqlock);
> > -
> >  	ret = v4l2_device_register(&pdev->dev, &dev->v4l2_dev);
> >  	if (ret)
> >  		return ret;
> > @@ -1125,7 +1126,6 @@ static int vim2m_probe(struct platform_device *pdev)
> >  	vfd = &dev->vfd;
> >  	vfd->lock = &dev->dev_mutex;
> >  	vfd->v4l2_dev = &dev->v4l2_dev;
> > -	INIT_DELAYED_WORK(&dev->work_run, device_work);
> >  
> >  	ret = video_register_device(vfd, VFL_TYPE_GRABBER, 0);
> >  	if (ret) {  
> 
> 




Cheers,
Mauro

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] media: vim2m: use per-file handler work queue
  2019-01-30 13:19     ` Mauro Carvalho Chehab
@ 2019-01-30 14:56       ` Ezequiel Garcia
  2019-01-30 18:00         ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 8+ messages in thread
From: Ezequiel Garcia @ 2019-01-30 14:56 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Media Mailing List, Jonathan Corbet, Mauro Carvalho Chehab,
	Linux Doc Mailing List, Hans Verkuil, Anton Leontiev,
	Sakari Ailus

Hey Mauro,

On Wed, 2019-01-30 at 11:19 -0200, Mauro Carvalho Chehab wrote:
> Em Wed, 30 Jan 2019 09:41:44 -0300
> Ezequiel Garcia <ezequiel@collabora.com> escreveu:
> 
> > On Tue, 2019-01-29 at 14:00 -0200, Mauro Carvalho Chehab wrote:
> > > It doesn't make sense to have a per-device work queue, as the
> > > scheduler should be called per file handler. Having a single
> > > one causes failures if multiple streams are filtered by vim2m.
> > >   
> > 
> > Having a per-device workqueue should emulate a real device more closely.
> 
> Yes.
> 
> > But more importantly, why would having a single workqeueue fail if multiple
> > streams are run? The m2m should take care of the proper serialization
> > between multiple contexts, unless I am missing something here.
> 
> Yes, the m2m core serializes the access to m2m src/dest buffer per device.
> 
> So, two instances can't access the buffer at the same time. That makes
> sense for a real physical hardware, although specifically for the virtual
> one, it doesn't (any may not make sense for some stateless codec, if
> the codec would internally be able to handle multiple requests at the same
> time).
> 
> Without this patch, when multiple instances are used, sometimes it ends 
> into a dead lock preventing to stop all of them.
> 
> I didn't have time to debug where exactly it happens, but I suspect that
> the issue is related to using the same mutex for both VB and open/release
> instances.
> 
> Yet, I ended by opting to move all queue-specific mutex/semaphore to be
> instance-based, as this makes a lot more sense, IMHO. Also, if some day
> we end by allowing support for some hardware that would have support to
> run multiple m2m instances in parallel, vim2m would already be ready.
> 

I don't oppose to the idea of having a per-context workqueue.

However, I'm not too comfortable with having a bug somewhere (and not knowing
whert) and instead of fixing it, working around it. I'd rather
fix the bug instead, then decide about the workqueue.

Thanks,
Eze


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] media: vim2m: use per-file handler work queue
  2019-01-30 14:56       ` Ezequiel Garcia
@ 2019-01-30 18:00         ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2019-01-30 18:00 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Linux Media Mailing List, Jonathan Corbet, Mauro Carvalho Chehab,
	Linux Doc Mailing List, Hans Verkuil, Anton Leontiev,
	Sakari Ailus

Em Wed, 30 Jan 2019 11:56:58 -0300
Ezequiel Garcia <ezequiel@collabora.com> escreveu:

> Hey Mauro,
> 
> On Wed, 2019-01-30 at 11:19 -0200, Mauro Carvalho Chehab wrote:
> > Em Wed, 30 Jan 2019 09:41:44 -0300
> > Ezequiel Garcia <ezequiel@collabora.com> escreveu:
> >   
> > > On Tue, 2019-01-29 at 14:00 -0200, Mauro Carvalho Chehab wrote:  
> > > > It doesn't make sense to have a per-device work queue, as the
> > > > scheduler should be called per file handler. Having a single
> > > > one causes failures if multiple streams are filtered by vim2m.
> > > >     
> > > 
> > > Having a per-device workqueue should emulate a real device more closely.  
> > 
> > Yes.
> >   
> > > But more importantly, why would having a single workqeueue fail if multiple
> > > streams are run? The m2m should take care of the proper serialization
> > > between multiple contexts, unless I am missing something here.  
> > 
> > Yes, the m2m core serializes the access to m2m src/dest buffer per device.
> > 
> > So, two instances can't access the buffer at the same time. That makes
> > sense for a real physical hardware, although specifically for the virtual
> > one, it doesn't (any may not make sense for some stateless codec, if
> > the codec would internally be able to handle multiple requests at the same
> > time).
> > 
> > Without this patch, when multiple instances are used, sometimes it ends 
> > into a dead lock preventing to stop all of them.
> > 
> > I didn't have time to debug where exactly it happens, but I suspect that
> > the issue is related to using the same mutex for both VB and open/release
> > instances.
> > 
> > Yet, I ended by opting to move all queue-specific mutex/semaphore to be
> > instance-based, as this makes a lot more sense, IMHO. Also, if some day
> > we end by allowing support for some hardware that would have support to
> > run multiple m2m instances in parallel, vim2m would already be ready.
> >   
> 
> I don't oppose to the idea of having a per-context workqueue.
> 
> However, I'm not too comfortable with having a bug somewhere (and not knowing
> whert) and instead of fixing it, working around it. I'd rather
> fix the bug instead, then decide about the workqueue.

I suspect that just using a separate mutex for open/release could
be enough to make the upstream version stable, when multiple instances
are running.


However, it has been a challenge for me to debug it here, as I'm traveling
with limited resources. I'm using the same machine for both desktop, to run
a VM to access some corp resources and to do Kernel devel.

That's usually a very bad idea. Yet, I'm pretty sure that after this patch,
things are stable, as I've been able to test it without any issues and
without needing to reboot my machine.

If you have enough time, feel free to test. Otherwise, I intend to just
apply this patch series, as it fixes a few bugs and make vim2m to
actually display what it would be expected, instead of plotting just some
weird image that the current version shows.

Cheers,
Mauro

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-01-30 18:00 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-29 16:00 [PATCH 0/3] vim2m: make it work properly Mauro Carvalho Chehab
2019-01-29 16:00 ` [PATCH 1/3] media: vim2m: fix driver for it to handle different fourcc formats Mauro Carvalho Chehab
2019-01-29 16:00 ` [PATCH 2/3] media: vim2m: use per-file handler work queue Mauro Carvalho Chehab
2019-01-30 12:41   ` Ezequiel Garcia
2019-01-30 13:19     ` Mauro Carvalho Chehab
2019-01-30 14:56       ` Ezequiel Garcia
2019-01-30 18:00         ` Mauro Carvalho Chehab
2019-01-29 16:00 ` [PATCH 3/3] media: vim2m: allow setting the default transaction time via parameter Mauro Carvalho Chehab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).