All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] media: videobuf2: Add a transfer error event
@ 2021-11-08 16:02 Niklas Söderlund
  2021-11-08 16:02 ` [PATCH 1/4] media: rcar-vin: Free buffers with error if hardware stop fails Niklas Söderlund
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Niklas Söderlund @ 2021-11-08 16:02 UTC (permalink / raw)
  To: Hans Verkuil, linux-media; +Cc: linux-renesas-soc, Niklas Söderlund

Hello,

This series adds a new V4L2 event, V4L2_EVENT_XFER_ERROR. This new event 
is intended to be used when a device in the capturing pipeline 
encounters an unrecoverable error and needs to inform the capturing 
application thru the video node about the error.

The fist use-case for this is also demonstrated in this series by the 
R-Car CSI-2 receiver that generates the new event when it detects an 
error on the bus.

Patch 1/4 is a bug fix for the R-Car VIN driver that was found while 
working on this series. This patch is good to be picked-up on it's own 
but is a requirement for later patches in this series so I have opted to 
include it here as the first patch. 

Patch 2/4 adds the new V4L2 event. While patch 3/4 and 4/4 makes use of 
it in the R-Car capture pipeline. Careful readers will note that the 
last to patches already have a patch history. This is because they have 
been part of an earlier attempt a while back to solve this issue in a 
different way, and during those discussions the need for this new event 
was found.

Niklas Söderlund (4):
  media: rcar-vin: Free buffers with error if hardware stop fails
  media: videobuf2: Add a transfer error event
  rcar-vin: Stop stream when subdevice signal transfer error
  rcar-csi2: Do not try to recover after transfer error

 .../userspace-api/media/v4l/vidioc-dqevent.rst  |  5 +++++
 .../media/videodev2.h.rst.exceptions            |  1 +
 drivers/media/platform/rcar-vin/rcar-csi2.c     | 17 +++++++++--------
 drivers/media/platform/rcar-vin/rcar-dma.c      | 10 ++++++++++
 drivers/media/platform/rcar-vin/rcar-v4l2.c     | 17 ++++++++++++++++-
 include/uapi/linux/videodev2.h                  |  1 +
 6 files changed, 42 insertions(+), 9 deletions(-)

-- 
2.33.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] media: rcar-vin: Free buffers with error if hardware stop fails
  2021-11-08 16:02 [PATCH 0/4] media: videobuf2: Add a transfer error event Niklas Söderlund
@ 2021-11-08 16:02 ` Niklas Söderlund
  2021-11-15 14:36   ` Hans Verkuil
  2021-11-08 16:02 ` [PATCH 2/4] media: videobuf2: Add a transfer error event Niklas Söderlund
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Niklas Söderlund @ 2021-11-08 16:02 UTC (permalink / raw)
  To: Hans Verkuil, linux-media; +Cc: linux-renesas-soc, Niklas Söderlund

The driver already have logic to detect if it fails to stop properly and
report this error to the user. The driver however did not report the
unused buffers or buffers given to the hardware (if any) with an error,
the buffers where instead returned to user-space in the active state.

Build on the existing detection of the error condition and correctly
return the buffers with an error if it triggers.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 drivers/media/platform/rcar-vin/rcar-dma.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/media/platform/rcar-vin/rcar-dma.c b/drivers/media/platform/rcar-vin/rcar-dma.c
index 25ead9333d0046e7..79bb9081853f8781 100644
--- a/drivers/media/platform/rcar-vin/rcar-dma.c
+++ b/drivers/media/platform/rcar-vin/rcar-dma.c
@@ -1371,6 +1371,16 @@ void rvin_stop_streaming(struct rvin_dev *vin)
 
 	spin_unlock_irqrestore(&vin->qlock, flags);
 
+	/* If something went wrong, free buffers with an error. */
+	if (!buffersFreed) {
+		return_unused_buffers(vin, VB2_BUF_STATE_ERROR);
+		for (i = 0; i < HW_BUFFER_NUM; i++) {
+			if (vin->buf_hw[i].buffer)
+				vb2_buffer_done(&vin->buf_hw[i].buffer->vb2_buf,
+						VB2_BUF_STATE_ERROR);
+		}
+	}
+
 	rvin_set_stream(vin, 0);
 
 	/* disable interrupts */
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] media: videobuf2: Add a transfer error event
  2021-11-08 16:02 [PATCH 0/4] media: videobuf2: Add a transfer error event Niklas Söderlund
  2021-11-08 16:02 ` [PATCH 1/4] media: rcar-vin: Free buffers with error if hardware stop fails Niklas Söderlund
@ 2021-11-08 16:02 ` Niklas Söderlund
  2022-03-02 15:40   ` Michael Rodin
  2021-11-08 16:02 ` [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error Niklas Söderlund
  2021-11-08 16:02 ` [PATCH 4/4] rcar-csi2: Do not try to recover after " Niklas Söderlund
  3 siblings, 1 reply; 16+ messages in thread
From: Niklas Söderlund @ 2021-11-08 16:02 UTC (permalink / raw)
  To: Hans Verkuil, linux-media; +Cc: linux-renesas-soc, Niklas Söderlund

Add a new V4L2_EVENT_XFER_ERROR event to signal if a unrecoverable error
happens during video transfer.

The use-case that sparked this new event is to signal to the video
device driver that an error has happen on the CSI-2 bus from the CSI-2
receiver subdevice.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 Documentation/userspace-api/media/v4l/vidioc-dqevent.rst     | 5 +++++
 Documentation/userspace-api/media/videodev2.h.rst.exceptions | 1 +
 include/uapi/linux/videodev2.h                               | 1 +
 3 files changed, 7 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
index 6eb40073c906deba..84984641c9351aa9 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
@@ -182,6 +182,11 @@ call.
 	the regions changes. This event has a struct
 	:c:type:`v4l2_event_motion_det`
 	associated with it.
+    * - ``V4L2_EVENT_XFER_ERROR``
+      - 7
+      - This event is triggered when an transfer error is detected while
+        streaming. For example if a unrecoverable error is detected on a video
+        bus in the pipeline.
     * - ``V4L2_EVENT_PRIVATE_START``
       - 0x08000000
       - Base event number for driver-private events.
diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
index eb0b1cd37abd9381..7ed9884b879c888e 100644
--- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
+++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
@@ -498,6 +498,7 @@ replace define V4L2_EVENT_CTRL event-type
 replace define V4L2_EVENT_FRAME_SYNC event-type
 replace define V4L2_EVENT_SOURCE_CHANGE event-type
 replace define V4L2_EVENT_MOTION_DET event-type
+replace define V4L2_EVENT_XFER_ERROR event-type
 replace define V4L2_EVENT_PRIVATE_START event-type
 
 replace define V4L2_EVENT_CTRL_CH_VALUE ctrl-changes-flags
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index f118fe7a9f58d240..48d4738eb862418b 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -2373,6 +2373,7 @@ struct v4l2_streamparm {
 #define V4L2_EVENT_FRAME_SYNC			4
 #define V4L2_EVENT_SOURCE_CHANGE		5
 #define V4L2_EVENT_MOTION_DET			6
+#define V4L2_EVENT_XFER_ERROR			7
 #define V4L2_EVENT_PRIVATE_START		0x08000000
 
 /* Payload for V4L2_EVENT_VSYNC */
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2021-11-08 16:02 [PATCH 0/4] media: videobuf2: Add a transfer error event Niklas Söderlund
  2021-11-08 16:02 ` [PATCH 1/4] media: rcar-vin: Free buffers with error if hardware stop fails Niklas Söderlund
  2021-11-08 16:02 ` [PATCH 2/4] media: videobuf2: Add a transfer error event Niklas Söderlund
@ 2021-11-08 16:02 ` Niklas Söderlund
  2021-11-08 17:36   ` Hans Verkuil
  2021-11-08 16:02 ` [PATCH 4/4] rcar-csi2: Do not try to recover after " Niklas Söderlund
  3 siblings, 1 reply; 16+ messages in thread
From: Niklas Söderlund @ 2021-11-08 16:02 UTC (permalink / raw)
  To: Hans Verkuil, linux-media
  Cc: linux-renesas-soc, Niklas Söderlund, Jacopo Mondi

When a subdevice signals a transfer error stop the VIN in addition to
informing user-space of the event.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
---
* Changes since v3
- Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
- Call vb2_queue_error() when encountering the event.

* Changes since v2
- Log using vin_dbg() instead of v4l2_info().
---
 drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
--- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
+++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
@@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
 static void rvin_notify_video_device(struct rvin_dev *vin,
 				     unsigned int notification, void *arg)
 {
+	const struct v4l2_event *event;
+
 	switch (notification) {
 	case V4L2_DEVICE_NOTIFY_EVENT:
-		v4l2_event_queue(&vin->vdev, arg);
+		event = arg;
+
+		switch (event->type) {
+		case V4L2_EVENT_XFER_ERROR:
+			vin_dbg(vin,
+				"Subdevice signaled transfer error, stopping.\n");
+			rvin_stop_streaming(vin);
+			vb2_queue_error(&vin->queue);
+			break;
+		default:
+			break;
+		}
+
+		v4l2_event_queue(&vin->vdev, event);
 		break;
 	default:
 		break;
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] rcar-csi2: Do not try to recover after transfer error
  2021-11-08 16:02 [PATCH 0/4] media: videobuf2: Add a transfer error event Niklas Söderlund
                   ` (2 preceding siblings ...)
  2021-11-08 16:02 ` [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error Niklas Söderlund
@ 2021-11-08 16:02 ` Niklas Söderlund
  3 siblings, 0 replies; 16+ messages in thread
From: Niklas Söderlund @ 2021-11-08 16:02 UTC (permalink / raw)
  To: Hans Verkuil, linux-media
  Cc: linux-renesas-soc, Niklas Söderlund, Jacopo Mondi

Instead of restarting the R-Car CSI-2 receiver if a transmission error
is detected, inform the R-Car VIN driver of the error so it can stop the
whole pipeline and inform user-space. This is done to reflect a updated
usage recommendation in later versions of the datasheet.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
---
* Changes since v3
- Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
- Disable error interrupts after the first one to not spam the error
  event.

* Changes since v2
- Update spelling in commit message.
---
 drivers/media/platform/rcar-vin/rcar-csi2.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/media/platform/rcar-vin/rcar-csi2.c b/drivers/media/platform/rcar-vin/rcar-csi2.c
index 11848d0c4a55cb4c..427c236243a03ec2 100644
--- a/drivers/media/platform/rcar-vin/rcar-csi2.c
+++ b/drivers/media/platform/rcar-vin/rcar-csi2.c
@@ -902,21 +902,22 @@ static irqreturn_t rcsi2_irq(int irq, void *data)
 
 	rcsi2_write(priv, INTERRSTATE_REG, err_status);
 
-	dev_info(priv->dev, "Transfer error, restarting CSI-2 receiver\n");
-
 	return IRQ_WAKE_THREAD;
 }
 
 static irqreturn_t rcsi2_irq_thread(int irq, void *data)
 {
 	struct rcar_csi2 *priv = data;
+	struct v4l2_event event = {
+		.type = V4L2_EVENT_XFER_ERROR,
+	};
 
-	mutex_lock(&priv->lock);
-	rcsi2_stop(priv);
-	usleep_range(1000, 2000);
-	if (rcsi2_start(priv))
-		dev_warn(priv->dev, "Failed to restart CSI-2 receiver\n");
-	mutex_unlock(&priv->lock);
+	/* Disable further interrupts to not spam the transfer error event. */
+	rcsi2_write(priv, INTEN_REG, 0);
+
+	dev_err(priv->dev, "Transfer error detected.\n");
+
+	v4l2_subdev_notify_event(&priv->subdev, &event);
 
 	return IRQ_HANDLED;
 }
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2021-11-08 16:02 ` [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error Niklas Söderlund
@ 2021-11-08 17:36   ` Hans Verkuil
  2021-11-08 18:42     ` Niklas Söderlund
  0 siblings, 1 reply; 16+ messages in thread
From: Hans Verkuil @ 2021-11-08 17:36 UTC (permalink / raw)
  To: Niklas Söderlund, linux-media; +Cc: linux-renesas-soc, Jacopo Mondi

On 08/11/2021 17:02, Niklas Söderlund wrote:
> When a subdevice signals a transfer error stop the VIN in addition to
> informing user-space of the event.
> 
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> ---
> * Changes since v3
> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> - Call vb2_queue_error() when encountering the event.
> 
> * Changes since v2
> - Log using vin_dbg() instead of v4l2_info().
> ---
>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
>  static void rvin_notify_video_device(struct rvin_dev *vin,
>  				     unsigned int notification, void *arg)
>  {
> +	const struct v4l2_event *event;
> +
>  	switch (notification) {
>  	case V4L2_DEVICE_NOTIFY_EVENT:
> -		v4l2_event_queue(&vin->vdev, arg);
> +		event = arg;
> +
> +		switch (event->type) {
> +		case V4L2_EVENT_XFER_ERROR:
> +			vin_dbg(vin,
> +				"Subdevice signaled transfer error, stopping.\n");
> +			rvin_stop_streaming(vin);
> +			vb2_queue_error(&vin->queue);

Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
would also have to send this new event? Would it be possible to modify
vb2_queue_error() to raise this event? I haven't analyzed all the drivers
that call this function to see if that makes sense.

Perhaps a separate new function vb2_queue_error_with_event() would also be
an option.

Regards,

	Hans

> +			break;
> +		default:
> +			break;
> +		}
> +
> +		v4l2_event_queue(&vin->vdev, event);
>  		break;
>  	default:
>  		break;
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2021-11-08 17:36   ` Hans Verkuil
@ 2021-11-08 18:42     ` Niklas Söderlund
  2021-11-15 14:26       ` Hans Verkuil
  0 siblings, 1 reply; 16+ messages in thread
From: Niklas Söderlund @ 2021-11-08 18:42 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: linux-media, linux-renesas-soc, Jacopo Mondi

Hi Hans,

On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
> On 08/11/2021 17:02, Niklas Söderlund wrote:
> > When a subdevice signals a transfer error stop the VIN in addition to
> > informing user-space of the event.
> > 
> > Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> > Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> > ---
> > * Changes since v3
> > - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> > - Call vb2_queue_error() when encountering the event.
> > 
> > * Changes since v2
> > - Log using vin_dbg() instead of v4l2_info().
> > ---
> >  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> > --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
> >  static void rvin_notify_video_device(struct rvin_dev *vin,
> >  				     unsigned int notification, void *arg)
> >  {
> > +	const struct v4l2_event *event;
> > +
> >  	switch (notification) {
> >  	case V4L2_DEVICE_NOTIFY_EVENT:
> > -		v4l2_event_queue(&vin->vdev, arg);
> > +		event = arg;
> > +
> > +		switch (event->type) {
> > +		case V4L2_EVENT_XFER_ERROR:
> > +			vin_dbg(vin,
> > +				"Subdevice signaled transfer error, stopping.\n");
> > +			rvin_stop_streaming(vin);
> > +			vb2_queue_error(&vin->queue);
> 
> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
> would also have to send this new event? Would it be possible to modify
> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
> that call this function to see if that makes sense.
> 
> Perhaps a separate new function vb2_queue_error_with_event() would also be
> an option.

I think that maybe a good idea, but I think that would be needed on-top 
of this work as I can't really test it. Here the rcar-csi2.ko is a 
subdevice which detects the error condition and generates the event. And 
this code is in rcar-vin.ko, the video device driver which reacts to the 
event and then forwards it to user-space.

Or am I misunderstanding you? And you think I should remove the 
v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
call?

> 
> Regards,
> 
> 	Hans
> 
> > +			break;
> > +		default:
> > +			break;
> > +		}
> > +
> > +		v4l2_event_queue(&vin->vdev, event);
> >  		break;
> >  	default:
> >  		break;
> > 
> 

-- 
Kind Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2021-11-08 18:42     ` Niklas Söderlund
@ 2021-11-15 14:26       ` Hans Verkuil
  2022-03-02 16:48         ` Michael Rodin
  0 siblings, 1 reply; 16+ messages in thread
From: Hans Verkuil @ 2021-11-15 14:26 UTC (permalink / raw)
  To: Niklas Söderlund; +Cc: linux-media, linux-renesas-soc, Jacopo Mondi

On 08/11/2021 19:42, Niklas Söderlund wrote:
> Hi Hans,
> 
> On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
>> On 08/11/2021 17:02, Niklas Söderlund wrote:
>>> When a subdevice signals a transfer error stop the VIN in addition to
>>> informing user-space of the event.
>>>
>>> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
>>> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
>>> ---
>>> * Changes since v3
>>> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
>>> - Call vb2_queue_error() when encountering the event.
>>>
>>> * Changes since v2
>>> - Log using vin_dbg() instead of v4l2_info().
>>> ---
>>>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
>>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
>>> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
>>> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
>>> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
>>> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
>>>  static void rvin_notify_video_device(struct rvin_dev *vin,
>>>  				     unsigned int notification, void *arg)
>>>  {
>>> +	const struct v4l2_event *event;
>>> +
>>>  	switch (notification) {
>>>  	case V4L2_DEVICE_NOTIFY_EVENT:
>>> -		v4l2_event_queue(&vin->vdev, arg);
>>> +		event = arg;
>>> +
>>> +		switch (event->type) {
>>> +		case V4L2_EVENT_XFER_ERROR:
>>> +			vin_dbg(vin,
>>> +				"Subdevice signaled transfer error, stopping.\n");
>>> +			rvin_stop_streaming(vin);
>>> +			vb2_queue_error(&vin->queue);
>>
>> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
>> would also have to send this new event? Would it be possible to modify
>> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
>> that call this function to see if that makes sense.
>>
>> Perhaps a separate new function vb2_queue_error_with_event() would also be
>> an option.
> 
> I think that maybe a good idea, but I think that would be needed on-top 
> of this work as I can't really test it. Here the rcar-csi2.ko is a 
> subdevice which detects the error condition and generates the event. And 
> this code is in rcar-vin.ko, the video device driver which reacts to the 
> event and then forwards it to user-space.
> 
> Or am I misunderstanding you? And you think I should remove the 
> v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
> call?

Yes. And use vb2_queue_error_with_event in other drivers as well where
applicable. Hmm, it can't be called vb2_ since it is v4l2_ specific, so
perhaps v4l2_queue_error which takes a video_device and a vb2_queue as
arguments. I don't want this just in rcar since it makes perfect sense
as a generic event for such situations.

Regards,

	Hans

> 
>>
>> Regards,
>>
>> 	Hans
>>
>>> +			break;
>>> +		default:
>>> +			break;
>>> +		}
>>> +
>>> +		v4l2_event_queue(&vin->vdev, event);
>>>  		break;
>>>  	default:
>>>  		break;
>>>
>>
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] media: rcar-vin: Free buffers with error if hardware stop fails
  2021-11-08 16:02 ` [PATCH 1/4] media: rcar-vin: Free buffers with error if hardware stop fails Niklas Söderlund
@ 2021-11-15 14:36   ` Hans Verkuil
  0 siblings, 0 replies; 16+ messages in thread
From: Hans Verkuil @ 2021-11-15 14:36 UTC (permalink / raw)
  To: Niklas Söderlund, linux-media; +Cc: linux-renesas-soc

On 08/11/2021 17:02, Niklas Söderlund wrote:
> The driver already have logic to detect if it fails to stop properly and
> report this error to the user. The driver however did not report the
> unused buffers or buffers given to the hardware (if any) with an error,
> the buffers where instead returned to user-space in the active state.
> 
> Build on the existing detection of the error condition and correctly
> return the buffers with an error if it triggers.
> 
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> ---
>  drivers/media/platform/rcar-vin/rcar-dma.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/media/platform/rcar-vin/rcar-dma.c b/drivers/media/platform/rcar-vin/rcar-dma.c
> index 25ead9333d0046e7..79bb9081853f8781 100644
> --- a/drivers/media/platform/rcar-vin/rcar-dma.c
> +++ b/drivers/media/platform/rcar-vin/rcar-dma.c
> @@ -1371,6 +1371,16 @@ void rvin_stop_streaming(struct rvin_dev *vin)
>  
>  	spin_unlock_irqrestore(&vin->qlock, flags);
>  
> +	/* If something went wrong, free buffers with an error. */
> +	if (!buffersFreed) {
> +		return_unused_buffers(vin, VB2_BUF_STATE_ERROR);
> +		for (i = 0; i < HW_BUFFER_NUM; i++) {
> +			if (vin->buf_hw[i].buffer)
> +				vb2_buffer_done(&vin->buf_hw[i].buffer->vb2_buf,
> +						VB2_BUF_STATE_ERROR);
> +		}
> +	}
> +
>  	rvin_set_stream(vin, 0);
>  
>  	/* disable interrupts */
> 

I'll take this patch, but mark the other three as 'Changes Requested'.

Regards,

	Hans

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] media: videobuf2: Add a transfer error event
  2021-11-08 16:02 ` [PATCH 2/4] media: videobuf2: Add a transfer error event Niklas Söderlund
@ 2022-03-02 15:40   ` Michael Rodin
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Rodin @ 2022-03-02 15:40 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Hans Verkuil, linux-media, linux-renesas-soc, Michael Rodin

Hi Niklas,

thank you for addressing the problem, which I tried to solve in [1]!
I have a few concerns about this patch...

[1] https://lore.kernel.org/linux-renesas-soc/1592588777-100596-1-git-send-email-mrodin@de.adit-jv.com/

On Mon, Nov 08, 2021 at 05:02:18PM +0100, Niklas Söderlund wrote:
> Add a new V4L2_EVENT_XFER_ERROR event to signal if a unrecoverable error
> happens during video transfer.
> 
> The use-case that sparked this new event is to signal to the video
> device driver that an error has happen on the CSI-2 bus from the CSI-2
> receiver subdevice.
> 
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> ---
>  Documentation/userspace-api/media/v4l/vidioc-dqevent.rst     | 5 +++++
>  Documentation/userspace-api/media/videodev2.h.rst.exceptions | 1 +
>  include/uapi/linux/videodev2.h                               | 1 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> index 6eb40073c906deba..84984641c9351aa9 100644
> --- a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> +++ b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst
> @@ -182,6 +182,11 @@ call.
>  	the regions changes. This event has a struct
>  	:c:type:`v4l2_event_motion_det`
>  	associated with it.
> +    * - ``V4L2_EVENT_XFER_ERROR``
> +      - 7
> +      - This event is triggered when an transfer error is detected while
> +        streaming. For example if a unrecoverable error is detected on a video
> +        bus in the pipeline.

Could you please mention, what a userspace application and a video capture
driver are supposed to do after reception of this event? (e.g. should video
capture driver execute vb2_queue_error as done in the patch 3?) I have a
few concerns depending on the kind of error which this event will report:
 1. If an error is "unrecoverable" as you mentioned, then the overall
    video pipeline is in an unusable state, so sending this event to
    userspace probably does not make sense, since we can already signal IO
    error to userspace via vb2_queue_error.
 2. If this event is also for reporting recoverable errors, e.g. which can
    be recovered by restarting the video pipeline, then probably it makes
    sense to explicitly mention, who shall restart the video pipeline
    via streamoff/streamon: application or video capture driver?

>      * - ``V4L2_EVENT_PRIVATE_START``
>        - 0x08000000
>        - Base event number for driver-private events.
> diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> index eb0b1cd37abd9381..7ed9884b879c888e 100644
> --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> +++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> @@ -498,6 +498,7 @@ replace define V4L2_EVENT_CTRL event-type
>  replace define V4L2_EVENT_FRAME_SYNC event-type
>  replace define V4L2_EVENT_SOURCE_CHANGE event-type
>  replace define V4L2_EVENT_MOTION_DET event-type
> +replace define V4L2_EVENT_XFER_ERROR event-type
>  replace define V4L2_EVENT_PRIVATE_START event-type
>  
>  replace define V4L2_EVENT_CTRL_CH_VALUE ctrl-changes-flags
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index f118fe7a9f58d240..48d4738eb862418b 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -2373,6 +2373,7 @@ struct v4l2_streamparm {
>  #define V4L2_EVENT_FRAME_SYNC			4
>  #define V4L2_EVENT_SOURCE_CHANGE		5
>  #define V4L2_EVENT_MOTION_DET			6
> +#define V4L2_EVENT_XFER_ERROR			7
>  #define V4L2_EVENT_PRIVATE_START		0x08000000
>  
>  /* Payload for V4L2_EVENT_VSYNC */
> -- 
> 2.33.1
> 

-- 
Best Regards,
Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2021-11-15 14:26       ` Hans Verkuil
@ 2022-03-02 16:48         ` Michael Rodin
  2022-03-02 20:17           ` Niklas Söderlund
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Rodin @ 2022-03-02 16:48 UTC (permalink / raw)
  To: Hans Verkuil, Niklas Söderlund
  Cc: linux-media, linux-renesas-soc, Jacopo Mondi, Michael Rodin

Hi Niklas, Hans,

On Mon, Nov 15, 2021 at 03:26:53PM +0100, Hans Verkuil wrote:
> On 08/11/2021 19:42, Niklas Söderlund wrote:
> > Hi Hans,
> > 
> > On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
> >> On 08/11/2021 17:02, Niklas Söderlund wrote:
> >>> When a subdevice signals a transfer error stop the VIN in addition to
> >>> informing user-space of the event.
> >>>
> >>> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> >>> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> >>> ---
> >>> * Changes since v3
> >>> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> >>> - Call vb2_queue_error() when encountering the event.
> >>>
> >>> * Changes since v2
> >>> - Log using vin_dbg() instead of v4l2_info().
> >>> ---
> >>>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
> >>>  1 file changed, 16 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> >>> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> >>> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> >>> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> >>> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
> >>>  static void rvin_notify_video_device(struct rvin_dev *vin,
> >>>  				     unsigned int notification, void *arg)
> >>>  {
> >>> +	const struct v4l2_event *event;
> >>> +
> >>>  	switch (notification) {
> >>>  	case V4L2_DEVICE_NOTIFY_EVENT:
> >>> -		v4l2_event_queue(&vin->vdev, arg);
> >>> +		event = arg;
> >>> +
> >>> +		switch (event->type) {
> >>> +		case V4L2_EVENT_XFER_ERROR:
> >>> +			vin_dbg(vin,
> >>> +				"Subdevice signaled transfer error, stopping.\n");
> >>> +			rvin_stop_streaming(vin);
> >>> +			vb2_queue_error(&vin->queue);
> >>
> >> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
> >> would also have to send this new event? Would it be possible to modify
> >> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
> >> that call this function to see if that makes sense.
> >>
> >> Perhaps a separate new function vb2_queue_error_with_event() would also be
> >> an option.
> > 
> > I think that maybe a good idea, but I think that would be needed on-top 
> > of this work as I can't really test it. Here the rcar-csi2.ko is a 
> > subdevice which detects the error condition and generates the event. And 
> > this code is in rcar-vin.ko, the video device driver which reacts to the 
> > event and then forwards it to user-space.
> > 
> > Or am I misunderstanding you? And you think I should remove the 
> > v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
> > call?
> 
> Yes. And use vb2_queue_error_with_event in other drivers as well where
> applicable. Hmm, it can't be called vb2_ since it is v4l2_ specific, so
> perhaps v4l2_queue_error which takes a video_device and a vb2_queue as
> arguments. I don't want this just in rcar since it makes perfect sense
> as a generic event for such situations.

Handling errors in this way could be problematic, because a (CSI2) transfer
error does not mean a total hardware failure on Rcar3. From my experience
there are 3 kinds of CSI2 errors:
  1. errors which occur sometimes, but do not affect video streaming
  2. errors which occur on every start of streaming but usually do not
     affect actual video streaming to VIN module after the start
  3. fatal errors which require a "Software Reset" mentioned by Renesas in
     the chapter 25.3.13 of the hardware manual in order to continue
     video streaming
This patch set makes the video pipeline unusable if we get errors described
in the first scenario if I am not mistaken. In the second scenario the
video pipeline was already not usable before because we end up in a
continuous restart loop in rcar-csi2.c. And the third scenario is not
really addressed by this patch set (or maybe the job is offloaded on to
userspace)?

Maybe it's better to implement a recovery in a different way, which would
consider the three mentioned error scenarios above:
  1. Monitor rvin_irq after streaming has started, e.g. by using a timer
     (I tried someting similar in [1])
  2. restart the complete video pipeline via rvin_stop_streaming and
     rvin_start_streaming if no frame is captured in a reasonable amount
     of time (optionally after checking if a subdevice has sent a
     V4L2_EVENT_XFER_ERROR).
This would make the complete recovery process almost invisible for the
application and avoid any application changes.

What do you think?

[1] https://lore.kernel.org/linux-renesas-soc/1592588777-100596-1-git-send-email-mrodin@de.adit-jv.com/

> Regards,
> 
> 	Hans
> 
> > 
> >>
> >> Regards,
> >>
> >> 	Hans
> >>
> >>> +			break;
> >>> +		default:
> >>> +			break;
> >>> +		}
> >>> +
> >>> +		v4l2_event_queue(&vin->vdev, event);
> >>>  		break;
> >>>  	default:
> >>>  		break;
> >>>
> >>
> > 
> 

-- 
Best Regards,
Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2022-03-02 16:48         ` Michael Rodin
@ 2022-03-02 20:17           ` Niklas Söderlund
  2022-03-06 20:01             ` Michael Rodin
  0 siblings, 1 reply; 16+ messages in thread
From: Niklas Söderlund @ 2022-03-02 20:17 UTC (permalink / raw)
  To: Michael Rodin; +Cc: Hans Verkuil, linux-media, linux-renesas-soc, Jacopo Mondi

Hi Michael,

Thanks for your feedback.

On 2022-03-02 17:48:34 +0100, Michael Rodin wrote:
> Hi Niklas, Hans,
> 
> On Mon, Nov 15, 2021 at 03:26:53PM +0100, Hans Verkuil wrote:
> > On 08/11/2021 19:42, Niklas Söderlund wrote:
> > > Hi Hans,
> > > 
> > > On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
> > >> On 08/11/2021 17:02, Niklas Söderlund wrote:
> > >>> When a subdevice signals a transfer error stop the VIN in addition to
> > >>> informing user-space of the event.
> > >>>
> > >>> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> > >>> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> > >>> ---
> > >>> * Changes since v3
> > >>> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> > >>> - Call vb2_queue_error() when encountering the event.
> > >>>
> > >>> * Changes since v2
> > >>> - Log using vin_dbg() instead of v4l2_info().
> > >>> ---
> > >>>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
> > >>>  1 file changed, 16 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > >>> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> > >>> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > >>> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > >>> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
> > >>>  static void rvin_notify_video_device(struct rvin_dev *vin,
> > >>>  				     unsigned int notification, void *arg)
> > >>>  {
> > >>> +	const struct v4l2_event *event;
> > >>> +
> > >>>  	switch (notification) {
> > >>>  	case V4L2_DEVICE_NOTIFY_EVENT:
> > >>> -		v4l2_event_queue(&vin->vdev, arg);
> > >>> +		event = arg;
> > >>> +
> > >>> +		switch (event->type) {
> > >>> +		case V4L2_EVENT_XFER_ERROR:
> > >>> +			vin_dbg(vin,
> > >>> +				"Subdevice signaled transfer error, stopping.\n");
> > >>> +			rvin_stop_streaming(vin);
> > >>> +			vb2_queue_error(&vin->queue);
> > >>
> > >> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
> > >> would also have to send this new event? Would it be possible to modify
> > >> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
> > >> that call this function to see if that makes sense.
> > >>
> > >> Perhaps a separate new function vb2_queue_error_with_event() would also be
> > >> an option.
> > > 
> > > I think that maybe a good idea, but I think that would be needed on-top 
> > > of this work as I can't really test it. Here the rcar-csi2.ko is a 
> > > subdevice which detects the error condition and generates the event. And 
> > > this code is in rcar-vin.ko, the video device driver which reacts to the 
> > > event and then forwards it to user-space.
> > > 
> > > Or am I misunderstanding you? And you think I should remove the 
> > > v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
> > > call?
> > 
> > Yes. And use vb2_queue_error_with_event in other drivers as well where
> > applicable. Hmm, it can't be called vb2_ since it is v4l2_ specific, so
> > perhaps v4l2_queue_error which takes a video_device and a vb2_queue as
> > arguments. I don't want this just in rcar since it makes perfect sense
> > as a generic event for such situations.
> 
> Handling errors in this way could be problematic, because a (CSI2) transfer
> error does not mean a total hardware failure on Rcar3. From my experience
> there are 3 kinds of CSI2 errors:
>   1. errors which occur sometimes, but do not affect video streaming
>   2. errors which occur on every start of streaming but usually do not
>      affect actual video streaming to VIN module after the start
>   3. fatal errors which require a "Software Reset" mentioned by Renesas in
>      the chapter 25.3.13 of the hardware manual in order to continue
>      video streaming
> This patch set makes the video pipeline unusable if we get errors described
> in the first scenario if I am not mistaken. In the second scenario the
> video pipeline was already not usable before because we end up in a
> continuous restart loop in rcar-csi2.c. And the third scenario is not
> really addressed by this patch set (or maybe the job is offloaded on to
> userspace)?
> 
> Maybe it's better to implement a recovery in a different way, which would
> consider the three mentioned error scenarios above:
>   1. Monitor rvin_irq after streaming has started, e.g. by using a timer
>      (I tried someting similar in [1])
>   2. restart the complete video pipeline via rvin_stop_streaming and
>      rvin_start_streaming if no frame is captured in a reasonable amount
>      of time (optionally after checking if a subdevice has sent a
>      V4L2_EVENT_XFER_ERROR).
> This would make the complete recovery process almost invisible for the
> application and avoid any application changes.
> 
> What do you think?

I think you bring up a few interesting points and for discussions sake I 
think we need to split it in two. One on how we could implement 
V4L2_EVENT_XFER_ERROR in a generic sens and one on how to best deal with 
errors in the R-Car Gen3 capture pipeline.

For the proposed V4L2_EVENT_XFER_ERROR the idea from my side is that a 
driver in the pipeline shall only raise this error (and propagate it to 
the effected video node) when there is no way to recover without 
involving user-space. So when this event happens an application at the 
very least needs to do a full s_stream cycle to restart the capture 
session.

On the particulars of the VIN capture pipeline the only way I found so 
far to freak out the CSI-2 receiver enough to trigger the event with this 
patch series is to disconnect the HDMI source from the ADV7481 while 
streaming and I don't think any in kernel recover method can fix that 
;-)

Over all I do agree with your idea that if we can recover from errors 
that are recovererable that is good. For this series I would like to 
focus on the former to get V4L2_EVENT_XFER_ERROR in and then if we have 
ways to provoke and test recovery in the Gen3 pipeline we can add such 
things to the drivers. Do this make sens or do you think the change in 
the R-Car CSI-2 driver to raise V4L2_EVENT_XFER_ERROR is to harsh? My 
motivation for is is the new datasheet and discussions with Renesas, but 
then again my only way to trigger CSI-2 errors is to pull cables while 
streaming so maybe I'm biased as such issues can't really be recover 
from...

Let me know what you think, I was about to spin a new version of this 
series.

> 
> [1] https://lore.kernel.org/linux-renesas-soc/1592588777-100596-1-git-send-email-mrodin@de.adit-jv.com/
> 
> > Regards,
> > 
> > 	Hans
> > 
> > > 
> > >>
> > >> Regards,
> > >>
> > >> 	Hans
> > >>
> > >>> +			break;
> > >>> +		default:
> > >>> +			break;
> > >>> +		}
> > >>> +
> > >>> +		v4l2_event_queue(&vin->vdev, event);
> > >>>  		break;
> > >>>  	default:
> > >>>  		break;
> > >>>
> > >>
> > > 
> > 
> 
> -- 
> Best Regards,
> Michael

-- 
Kind Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2022-03-02 20:17           ` Niklas Söderlund
@ 2022-03-06 20:01             ` Michael Rodin
  2022-03-07 15:26               ` Niklas Söderlund
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Rodin @ 2022-03-06 20:01 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Rodin, Michael (ADITG/ESA1),
	Hans Verkuil, linux-media, linux-renesas-soc, Jacopo Mondi

Hi Niklas,
On Wed, Mar 02, 2022 at 09:17:37PM +0100, Niklas Söderlund wrote:
> Hi Michael,
> 
> Thanks for your feedback.
> 
> On 2022-03-02 17:48:34 +0100, Michael Rodin wrote:
> > Hi Niklas, Hans,
> > 
> > On Mon, Nov 15, 2021 at 03:26:53PM +0100, Hans Verkuil wrote:
> > > On 08/11/2021 19:42, Niklas Söderlund wrote:
> > > > Hi Hans,
> > > > 
> > > > On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
> > > >> On 08/11/2021 17:02, Niklas Söderlund wrote:
> > > >>> When a subdevice signals a transfer error stop the VIN in addition to
> > > >>> informing user-space of the event.
> > > >>>
> > > >>> Signed-off-by: Niklas Söderlund <https://urldefense.proofpoint.com/v2/url?u=http-3A__niklas.soderlund-2Brenesas-40ragnatech.se&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=isB_ZRIMVizn-pFdEQpwsKeg81nAE-QHMhjUzSozUZg&e=>
> > > >>> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> > > >>> ---
> > > >>> * Changes since v3
> > > >>> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> > > >>> - Call vb2_queue_error() when encountering the event.
> > > >>>
> > > >>> * Changes since v2
> > > >>> - Log using vin_dbg() instead of v4l2_info().
> > > >>> ---
> > > >>>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
> > > >>>  1 file changed, 16 insertions(+), 1 deletion(-)
> > > >>>
> > > >>> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > >>> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> > > >>> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > >>> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > >>> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
> > > >>>  static void rvin_notify_video_device(struct rvin_dev *vin,
> > > >>>  				     unsigned int notification, void *arg)
> > > >>>  {
> > > >>> +	const struct v4l2_event *event;
> > > >>> +
> > > >>>  	switch (notification) {
> > > >>>  	case V4L2_DEVICE_NOTIFY_EVENT:
> > > >>> -		v4l2_event_queue(&vin->vdev, arg);
> > > >>> +		event = arg;
> > > >>> +
> > > >>> +		switch (event->type) {
> > > >>> +		case V4L2_EVENT_XFER_ERROR:
> > > >>> +			vin_dbg(vin,
> > > >>> +				"Subdevice signaled transfer error, stopping.\n");
> > > >>> +			rvin_stop_streaming(vin);
> > > >>> +			vb2_queue_error(&vin->queue);
> > > >>
> > > >> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
> > > >> would also have to send this new event? Would it be possible to modify
> > > >> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
> > > >> that call this function to see if that makes sense.
> > > >>
> > > >> Perhaps a separate new function vb2_queue_error_with_event() would also be
> > > >> an option.
> > > > 
> > > > I think that maybe a good idea, but I think that would be needed on-top 
> > > > of this work as I can't really test it. Here the rcar-csi2.ko is a 
> > > > subdevice which detects the error condition and generates the event. And 
> > > > this code is in rcar-vin.ko, the video device driver which reacts to the 
> > > > event and then forwards it to user-space.
> > > > 
> > > > Or am I misunderstanding you? And you think I should remove the 
> > > > v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
> > > > call?
> > > 
> > > Yes. And use vb2_queue_error_with_event in other drivers as well where
> > > applicable. Hmm, it can't be called vb2_ since it is v4l2_ specific, so
> > > perhaps v4l2_queue_error which takes a video_device and a vb2_queue as
> > > arguments. I don't want this just in rcar since it makes perfect sense
> > > as a generic event for such situations.
> > 
> > Handling errors in this way could be problematic, because a (CSI2) transfer
> > error does not mean a total hardware failure on Rcar3. From my experience
> > there are 3 kinds of CSI2 errors:
> >   1. errors which occur sometimes, but do not affect video streaming
> >   2. errors which occur on every start of streaming but usually do not
> >      affect actual video streaming to VIN module after the start
> >   3. fatal errors which require a "Software Reset" mentioned by Renesas in
> >      the chapter 25.3.13 of the hardware manual in order to continue
> >      video streaming
> > This patch set makes the video pipeline unusable if we get errors described
> > in the first scenario if I am not mistaken. In the second scenario the
> > video pipeline was already not usable before because we end up in a
> > continuous restart loop in rcar-csi2.c. And the third scenario is not
> > really addressed by this patch set (or maybe the job is offloaded on to
> > userspace)?
> > 
> > Maybe it's better to implement a recovery in a different way, which would
> > consider the three mentioned error scenarios above:
> >   1. Monitor rvin_irq after streaming has started, e.g. by using a timer
> >      (I tried someting similar in [1])
> >   2. restart the complete video pipeline via rvin_stop_streaming and
> >      rvin_start_streaming if no frame is captured in a reasonable amount
> >      of time (optionally after checking if a subdevice has sent a
> >      V4L2_EVENT_XFER_ERROR).
> > This would make the complete recovery process almost invisible for the
> > application and avoid any application changes.
> > 
> > What do you think?
> 
> I think you bring up a few interesting points and for discussions sake I 
> think we need to split it in two. One on how we could implement 
> V4L2_EVENT_XFER_ERROR in a generic sens and one on how to best deal with 
> errors in the R-Car Gen3 capture pipeline.
> 
> For the proposed V4L2_EVENT_XFER_ERROR the idea from my side is that a 
> driver in the pipeline shall only raise this error (and propagate it to 
> the effected video node) when there is no way to recover without 
> involving user-space. So when this event happens an application at the 
> very least needs to do a full s_stream cycle to restart the capture 
> session.

Isn't a "full s_stream cycle" something what _every_ capture driver or even
vb2 framework could execute internally without offloading the task to user
space? This new event will probably burden userspace too much. Every V4L2
capture application which worked fine on other SoCs will fail on RCar3 in
the use cases where transfer errors occur until application developers
implement handling of this event. Signalling an error via vb2_queue_error
should be probably sufficient, because it already signals via EPOLLERR/EIO
to userspace to do a streamoff to clear this error (as per videobuf2-core.h).
Maybe we just have to document that userspace should try to do a streamon
after this? EPOLLERR/EIO errors will be propagated to any application, but
not every application subscribes to V4L2 events...

I took a look at the other drivers and found that imx-media-csi.c seems to
follow a similar approach to what I described in my last email. There is a
2 second timer, which monitors the csi_idmac_eof_interrupt and signals a
fatal error to the capture device via vb2_queue_error when this timer
expires. Monitoring end-of-frame interrupts looks also like a more robust
solution, because in this way we know reliably when something has failed
in the pipeline in contrast to reacting to start-of-transfer errors, which
can be false positives. I think it would be good to follow a similar
approach by adding an EOF timer to rvin_irq. This should also cover errors
on the digital pin camera interface if we get any in the future. After
this we could add a few additional improvements:
 1. configurability of the EOF timeout. 2 seconds can be too much,
    especially for automotive use cases. Probably this timeout could be
    reduced depending on the source frame rate.
 2. internal streamoff/streamon cycle after the timer expires instead of
    calling vb2_queue_error

> On the particulars of the VIN capture pipeline the only way I found so 
> far to freak out the CSI-2 receiver enough to trigger the event with this 
> patch series is to disconnect the HDMI source from the ADV7481 while 
> streaming and I don't think any in kernel recover method can fix that 
> ;-)

In this case I would expect an V4L2_EVENT_EOS or V4L2_EVENT_SOURCE_CHANGE
from the adv748x driver. Maybe we should call vb2_queue_error for these
events as well. But the described EOF timeout approach could also cover
this use case.

> Over all I do agree with your idea that if we can recover from errors 
> that are recovererable that is good. For this series I would like to 
> focus on the former to get V4L2_EVENT_XFER_ERROR in and then if we have 
> ways to provoke and test recovery in the Gen3 pipeline we can add such 
> things to the drivers. Do this make sens or do you think the change in 
> the R-Car CSI-2 driver to raise V4L2_EVENT_XFER_ERROR is to harsh? My 
> motivation for is is the new datasheet and discussions with Renesas, but 
> then again my only way to trigger CSI-2 errors is to pull cables while 
> streaming so maybe I'm biased as such issues can't really be recover 
> from...
> 
> Let me know what you think, I was about to spin a new version of this 
> series.

I believe this event might be useful for subdevice drivers which want to
request a streamoff/streamon cycle from the capture driver but it's
probably not useful for the RCar3 CSI2 driver in the current state because
reacting to start-of-transfer errors can be risky due to false positives.

What do you think about implementing an approach similar to imx-media-csi.c
first?

> > 
> > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Drenesas-2Dsoc_1592588777-2D100596-2D1-2Dgit-2Dsend-2Demail-2Dmrodin-40de.adit-2Djv.com_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=B4GPIasxu8VP04sLBFvhbmr8yhotp6GcfDQqEKjqeAc&e=
> > 
> > > Regards,
> > > 
> > > 	Hans
> > > 
> > > > 
> > > >>
> > > >> Regards,
> > > >>
> > > >> 	Hans
> > > >>
> > > >>> +			break;
> > > >>> +		default:
> > > >>> +			break;
> > > >>> +		}
> > > >>> +
> > > >>> +		v4l2_event_queue(&vin->vdev, event);
> > > >>>  		break;
> > > >>>  	default:
> > > >>>  		break;
> > > >>>
> > > >>
> > > > 
> > > 
> > 
> > -- 
> > Best Regards,
> > Michael
> 
> -- 
> Kind Regards,
> Niklas Söderlund

-- 
Best Regards,
Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2022-03-06 20:01             ` Michael Rodin
@ 2022-03-07 15:26               ` Niklas Söderlund
  2022-03-09 19:27                 ` Michael Rodin
  0 siblings, 1 reply; 16+ messages in thread
From: Niklas Söderlund @ 2022-03-07 15:26 UTC (permalink / raw)
  To: Michael Rodin; +Cc: Hans Verkuil, linux-media, linux-renesas-soc, Jacopo Mondi

Hi Michael,

On 2022-03-06 21:01:51 +0100, Michael Rodin wrote:
> Hi Niklas,
> On Wed, Mar 02, 2022 at 09:17:37PM +0100, Niklas Söderlund wrote:
> > Hi Michael,
> > 
> > Thanks for your feedback.
> > 
> > On 2022-03-02 17:48:34 +0100, Michael Rodin wrote:
> > > Hi Niklas, Hans,
> > > 
> > > On Mon, Nov 15, 2021 at 03:26:53PM +0100, Hans Verkuil wrote:
> > > > On 08/11/2021 19:42, Niklas Söderlund wrote:
> > > > > Hi Hans,
> > > > > 
> > > > > On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
> > > > >> On 08/11/2021 17:02, Niklas Söderlund wrote:
> > > > >>> When a subdevice signals a transfer error stop the VIN in addition to
> > > > >>> informing user-space of the event.
> > > > >>>
> > > > >>> Signed-off-by: Niklas Söderlund <https://urldefense.proofpoint.com/v2/url?u=http-3A__niklas.soderlund-2Brenesas-40ragnatech.se&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=isB_ZRIMVizn-pFdEQpwsKeg81nAE-QHMhjUzSozUZg&e=>
> > > > >>> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> > > > >>> ---
> > > > >>> * Changes since v3
> > > > >>> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> > > > >>> - Call vb2_queue_error() when encountering the event.
> > > > >>>
> > > > >>> * Changes since v2
> > > > >>> - Log using vin_dbg() instead of v4l2_info().
> > > > >>> ---
> > > > >>>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
> > > > >>>  1 file changed, 16 insertions(+), 1 deletion(-)
> > > > >>>
> > > > >>> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > >>> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> > > > >>> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > >>> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > >>> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
> > > > >>>  static void rvin_notify_video_device(struct rvin_dev *vin,
> > > > >>>  				     unsigned int notification, void *arg)
> > > > >>>  {
> > > > >>> +	const struct v4l2_event *event;
> > > > >>> +
> > > > >>>  	switch (notification) {
> > > > >>>  	case V4L2_DEVICE_NOTIFY_EVENT:
> > > > >>> -		v4l2_event_queue(&vin->vdev, arg);
> > > > >>> +		event = arg;
> > > > >>> +
> > > > >>> +		switch (event->type) {
> > > > >>> +		case V4L2_EVENT_XFER_ERROR:
> > > > >>> +			vin_dbg(vin,
> > > > >>> +				"Subdevice signaled transfer error, stopping.\n");
> > > > >>> +			rvin_stop_streaming(vin);
> > > > >>> +			vb2_queue_error(&vin->queue);
> > > > >>
> > > > >> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
> > > > >> would also have to send this new event? Would it be possible to modify
> > > > >> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
> > > > >> that call this function to see if that makes sense.
> > > > >>
> > > > >> Perhaps a separate new function vb2_queue_error_with_event() would also be
> > > > >> an option.
> > > > > 
> > > > > I think that maybe a good idea, but I think that would be needed on-top 
> > > > > of this work as I can't really test it. Here the rcar-csi2.ko is a 
> > > > > subdevice which detects the error condition and generates the event. And 
> > > > > this code is in rcar-vin.ko, the video device driver which reacts to the 
> > > > > event and then forwards it to user-space.
> > > > > 
> > > > > Or am I misunderstanding you? And you think I should remove the 
> > > > > v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
> > > > > call?
> > > > 
> > > > Yes. And use vb2_queue_error_with_event in other drivers as well where
> > > > applicable. Hmm, it can't be called vb2_ since it is v4l2_ specific, so
> > > > perhaps v4l2_queue_error which takes a video_device and a vb2_queue as
> > > > arguments. I don't want this just in rcar since it makes perfect sense
> > > > as a generic event for such situations.
> > > 
> > > Handling errors in this way could be problematic, because a (CSI2) transfer
> > > error does not mean a total hardware failure on Rcar3. From my experience
> > > there are 3 kinds of CSI2 errors:
> > >   1. errors which occur sometimes, but do not affect video streaming
> > >   2. errors which occur on every start of streaming but usually do not
> > >      affect actual video streaming to VIN module after the start
> > >   3. fatal errors which require a "Software Reset" mentioned by Renesas in
> > >      the chapter 25.3.13 of the hardware manual in order to continue
> > >      video streaming
> > > This patch set makes the video pipeline unusable if we get errors described
> > > in the first scenario if I am not mistaken. In the second scenario the
> > > video pipeline was already not usable before because we end up in a
> > > continuous restart loop in rcar-csi2.c. And the third scenario is not
> > > really addressed by this patch set (or maybe the job is offloaded on to
> > > userspace)?
> > > 
> > > Maybe it's better to implement a recovery in a different way, which would
> > > consider the three mentioned error scenarios above:
> > >   1. Monitor rvin_irq after streaming has started, e.g. by using a timer
> > >      (I tried someting similar in [1])
> > >   2. restart the complete video pipeline via rvin_stop_streaming and
> > >      rvin_start_streaming if no frame is captured in a reasonable amount
> > >      of time (optionally after checking if a subdevice has sent a
> > >      V4L2_EVENT_XFER_ERROR).
> > > This would make the complete recovery process almost invisible for the
> > > application and avoid any application changes.
> > > 
> > > What do you think?
> > 
> > I think you bring up a few interesting points and for discussions sake I 
> > think we need to split it in two. One on how we could implement 
> > V4L2_EVENT_XFER_ERROR in a generic sens and one on how to best deal with 
> > errors in the R-Car Gen3 capture pipeline.
> > 
> > For the proposed V4L2_EVENT_XFER_ERROR the idea from my side is that a 
> > driver in the pipeline shall only raise this error (and propagate it to 
> > the effected video node) when there is no way to recover without 
> > involving user-space. So when this event happens an application at the 
> > very least needs to do a full s_stream cycle to restart the capture 
> > session.
> 
> Isn't a "full s_stream cycle" something what _every_ capture driver or even
> vb2 framework could execute internally without offloading the task to user
> space? This new event will probably burden userspace too much. Every V4L2
> capture application which worked fine on other SoCs will fail on RCar3 in
> the use cases where transfer errors occur until application developers
> implement handling of this event. Signalling an error via vb2_queue_error
> should be probably sufficient, because it already signals via EPOLLERR/EIO
> to userspace to do a streamoff to clear this error (as per videobuf2-core.h).
> Maybe we just have to document that userspace should try to do a streamon
> after this? EPOLLERR/EIO errors will be propagated to any application, but
> not every application subscribes to V4L2 events...

I think doing a "full s_stream cycle" anywhere but from user-space to be 
wrong. It will mess-up the state of capture session, for example it 
would reset the capture sequence numbers which can freak out user-space 
3A processing. I think that if an error is hit that requires a full 
restart of the stream it should be driven from user-space.

But with this series the error is also signaled thru a vb2_queue_error 
right? What is added is for the applications that do look at V4L2 events 
to know why it happen right? I mean the vb2 queue can signal error but 
the event can tell the application if it was due to V4L2_EVENT_EOS,
V4L2_EVENT_SOURCE_CHANGE or that something went very wrong using the 
proposed V4L2_EVENT_XFER_ERROR event.

> 
> I took a look at the other drivers and found that imx-media-csi.c seems to
> follow a similar approach to what I described in my last email. There is a
> 2 second timer, which monitors the csi_idmac_eof_interrupt and signals a
> fatal error to the capture device via vb2_queue_error when this timer
> expires. Monitoring end-of-frame interrupts looks also like a more robust
> solution, because in this way we know reliably when something has failed
> in the pipeline in contrast to reacting to start-of-transfer errors, which
> can be false positives. I think it would be good to follow a similar
> approach by adding an EOF timer to rvin_irq. This should also cover errors
> on the digital pin camera interface if we get any in the future. After
> this we could add a few additional improvements:
>  1. configurability of the EOF timeout. 2 seconds can be too much,
>     especially for automotive use cases. Probably this timeout could be
>     reduced depending on the source frame rate.
>  2. internal streamoff/streamon cycle after the timer expires instead of
>     calling vb2_queue_error

I would need to deep-dive in the details but in general I think anything 
we can do to improve the robustness of capture is good. And if we within 
a subdevice can recover form errors without causing issues that is good.  
But for the work in this series I think it's two different questions.  
One is how and if we can add an event to signal something went very 
wrong and the capture pipeline can't recover and have stopped, the other 
is how we in individual drivers try to detect and recover from errors.

> 
> > On the particulars of the VIN capture pipeline the only way I found so 
> > far to freak out the CSI-2 receiver enough to trigger the event with this 
> > patch series is to disconnect the HDMI source from the ADV7481 while 
> > streaming and I don't think any in kernel recover method can fix that 
> > ;-)
> 
> In this case I would expect an V4L2_EVENT_EOS or V4L2_EVENT_SOURCE_CHANGE
> from the adv748x driver. Maybe we should call vb2_queue_error for these
> events as well. But the described EOF timeout approach could also cover
> this use case.

Hum, is not those events more stable to report "planed" events? From the 
docs.

* V4L2_EVENT_EOS
  This event is triggered when the end of a stream is reached. This is 
  typically used with MPEG decoders to report to the application when 
  the last of the MPEG stream has been decoded.

* V4L2_EVENT_SOURCE_CHANGE
  This event is triggered when a source parameter change is detected 
  during runtime by the video device. It can be a runtime resolution 
  change triggered by a video decoder or the format change happening on 
  an input connector. This event requires that the id matches the input 
  index (when used with a video device node) or the pad index (when used 
  with a subdevice node) from which you want to receive events.

  This event has a struct v4l2_event_src_change associated with it. The 
  changes bitfield denotes what has changed for the subscribed pad. If 
  multiple events occurred before application could dequeue them, then 
  the changes will have the ORed value of all the events generated.

While I think the idea for V4L2_EVENT_XFER_ERROR more shall describe the 
case something "unnatural" happens that we can't really recover from.  
Maybe "ran over cable with lawn mower". I only brought up pulling the 
cable as it's the only way I have to trigger it easily (I don't have a 
lawn :-).

> 
> > Over all I do agree with your idea that if we can recover from errors 
> > that are recovererable that is good. For this series I would like to 
> > focus on the former to get V4L2_EVENT_XFER_ERROR in and then if we have 
> > ways to provoke and test recovery in the Gen3 pipeline we can add such 
> > things to the drivers. Do this make sens or do you think the change in 
> > the R-Car CSI-2 driver to raise V4L2_EVENT_XFER_ERROR is to harsh? My 
> > motivation for is is the new datasheet and discussions with Renesas, but 
> > then again my only way to trigger CSI-2 errors is to pull cables while 
> > streaming so maybe I'm biased as such issues can't really be recover 
> > from...
> > 
> > Let me know what you think, I was about to spin a new version of this 
> > series.
> 
> I believe this event might be useful for subdevice drivers which want to
> request a streamoff/streamon cycle from the capture driver but it's
> probably not useful for the RCar3 CSI2 driver in the current state because
> reacting to start-of-transfer errors can be risky due to false positives.
> 
> What do you think about implementing an approach similar to imx-media-csi.c
> first?

I'm not sure really what I think. Maybe if I understand better how to 
generate these false positives it would help me understand. I also get a 
feeling we are trying to solve two different problems. I want to find a 
way to just terminate capture when an unrecoverable error is hit while I 
get the feeling you are more focused on how to recover from recoverable 
errors in the best way. Maybe I'm missing understanding something, sorry 
if I am.

I think both problems are important but do not overlap to a large 
degree.

> 
> > > 
> > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Drenesas-2Dsoc_1592588777-2D100596-2D1-2Dgit-2Dsend-2Demail-2Dmrodin-40de.adit-2Djv.com_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=B4GPIasxu8VP04sLBFvhbmr8yhotp6GcfDQqEKjqeAc&e=
> > > 
> > > > Regards,
> > > > 
> > > > 	Hans
> > > > 
> > > > > 
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> 	Hans
> > > > >>
> > > > >>> +			break;
> > > > >>> +		default:
> > > > >>> +			break;
> > > > >>> +		}
> > > > >>> +
> > > > >>> +		v4l2_event_queue(&vin->vdev, event);
> > > > >>>  		break;
> > > > >>>  	default:
> > > > >>>  		break;
> > > > >>>
> > > > >>
> > > > > 
> > > > 
> > > 
> > > -- 
> > > Best Regards,
> > > Michael
> > 
> > -- 
> > Kind Regards,
> > Niklas Söderlund
> 
> -- 
> Best Regards,
> Michael

-- 
Kind Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2022-03-07 15:26               ` Niklas Söderlund
@ 2022-03-09 19:27                 ` Michael Rodin
  2022-05-19 18:09                   ` Michael Rodin
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Rodin @ 2022-03-09 19:27 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Michael Rodin, Hans Verkuil, linux-media, linux-renesas-soc,
	Jacopo Mondi

Hi Niklas,

On Mon, Mar 07, 2022 at 04:26:23PM +0100, Niklas Söderlund wrote:
> Hi Michael,
> 
> On 2022-03-06 21:01:51 +0100, Michael Rodin wrote:
> > Hi Niklas,
> > On Wed, Mar 02, 2022 at 09:17:37PM +0100, Niklas Söderlund wrote:
> > > Hi Michael,
> > > 
> > > Thanks for your feedback.
> > > 
> > > On 2022-03-02 17:48:34 +0100, Michael Rodin wrote:
> > > > Hi Niklas, Hans,
> > > > 
> > > > On Mon, Nov 15, 2021 at 03:26:53PM +0100, Hans Verkuil wrote:
> > > > > On 08/11/2021 19:42, Niklas Söderlund wrote:
> > > > > > Hi Hans,
> > > > > > 
> > > > > > On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
> > > > > >> On 08/11/2021 17:02, Niklas Söderlund wrote:
> > > > > >>> When a subdevice signals a transfer error stop the VIN in addition to
> > > > > >>> informing user-space of the event.
> > > > > >>>
> > > > > >>> Signed-off-by: Niklas Söderlund <https://urldefense.proofpoint.com/v2/url?u=http-3A__niklas.soderlund-2Brenesas-40ragnatech.se&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=isB_ZRIMVizn-pFdEQpwsKeg81nAE-QHMhjUzSozUZg&e=>
> > > > > >>> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> > > > > >>> ---
> > > > > >>> * Changes since v3
> > > > > >>> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> > > > > >>> - Call vb2_queue_error() when encountering the event.
> > > > > >>>
> > > > > >>> * Changes since v2
> > > > > >>> - Log using vin_dbg() instead of v4l2_info().
> > > > > >>> ---
> > > > > >>>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
> > > > > >>>  1 file changed, 16 insertions(+), 1 deletion(-)
> > > > > >>>
> > > > > >>> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > > >>> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> > > > > >>> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > > >>> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > > >>> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
> > > > > >>>  static void rvin_notify_video_device(struct rvin_dev *vin,
> > > > > >>>  				     unsigned int notification, void *arg)
> > > > > >>>  {
> > > > > >>> +	const struct v4l2_event *event;
> > > > > >>> +
> > > > > >>>  	switch (notification) {
> > > > > >>>  	case V4L2_DEVICE_NOTIFY_EVENT:
> > > > > >>> -		v4l2_event_queue(&vin->vdev, arg);
> > > > > >>> +		event = arg;
> > > > > >>> +
> > > > > >>> +		switch (event->type) {
> > > > > >>> +		case V4L2_EVENT_XFER_ERROR:
> > > > > >>> +			vin_dbg(vin,
> > > > > >>> +				"Subdevice signaled transfer error, stopping.\n");
> > > > > >>> +			rvin_stop_streaming(vin);
> > > > > >>> +			vb2_queue_error(&vin->queue);
> > > > > >>
> > > > > >> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
> > > > > >> would also have to send this new event? Would it be possible to modify
> > > > > >> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
> > > > > >> that call this function to see if that makes sense.
> > > > > >>
> > > > > >> Perhaps a separate new function vb2_queue_error_with_event() would also be
> > > > > >> an option.
> > > > > > 
> > > > > > I think that maybe a good idea, but I think that would be needed on-top 
> > > > > > of this work as I can't really test it. Here the rcar-csi2.ko is a 
> > > > > > subdevice which detects the error condition and generates the event. And 
> > > > > > this code is in rcar-vin.ko, the video device driver which reacts to the 
> > > > > > event and then forwards it to user-space.
> > > > > > 
> > > > > > Or am I misunderstanding you? And you think I should remove the 
> > > > > > v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
> > > > > > call?
> > > > > 
> > > > > Yes. And use vb2_queue_error_with_event in other drivers as well where
> > > > > applicable. Hmm, it can't be called vb2_ since it is v4l2_ specific, so
> > > > > perhaps v4l2_queue_error which takes a video_device and a vb2_queue as
> > > > > arguments. I don't want this just in rcar since it makes perfect sense
> > > > > as a generic event for such situations.
> > > > 
> > > > Handling errors in this way could be problematic, because a (CSI2) transfer
> > > > error does not mean a total hardware failure on Rcar3. From my experience
> > > > there are 3 kinds of CSI2 errors:
> > > >   1. errors which occur sometimes, but do not affect video streaming
> > > >   2. errors which occur on every start of streaming but usually do not
> > > >      affect actual video streaming to VIN module after the start
> > > >   3. fatal errors which require a "Software Reset" mentioned by Renesas in
> > > >      the chapter 25.3.13 of the hardware manual in order to continue
> > > >      video streaming
> > > > This patch set makes the video pipeline unusable if we get errors described
> > > > in the first scenario if I am not mistaken. In the second scenario the
> > > > video pipeline was already not usable before because we end up in a
> > > > continuous restart loop in rcar-csi2.c. And the third scenario is not
> > > > really addressed by this patch set (or maybe the job is offloaded on to
> > > > userspace)?
> > > > 
> > > > Maybe it's better to implement a recovery in a different way, which would
> > > > consider the three mentioned error scenarios above:
> > > >   1. Monitor rvin_irq after streaming has started, e.g. by using a timer
> > > >      (I tried someting similar in [1])
> > > >   2. restart the complete video pipeline via rvin_stop_streaming and
> > > >      rvin_start_streaming if no frame is captured in a reasonable amount
> > > >      of time (optionally after checking if a subdevice has sent a
> > > >      V4L2_EVENT_XFER_ERROR).
> > > > This would make the complete recovery process almost invisible for the
> > > > application and avoid any application changes.
> > > > 
> > > > What do you think?
> > > 
> > > I think you bring up a few interesting points and for discussions sake I 
> > > think we need to split it in two. One on how we could implement 
> > > V4L2_EVENT_XFER_ERROR in a generic sens and one on how to best deal with 
> > > errors in the R-Car Gen3 capture pipeline.
> > > 
> > > For the proposed V4L2_EVENT_XFER_ERROR the idea from my side is that a 
> > > driver in the pipeline shall only raise this error (and propagate it to 
> > > the effected video node) when there is no way to recover without 
> > > involving user-space. So when this event happens an application at the 
> > > very least needs to do a full s_stream cycle to restart the capture 
> > > session.
> > 
> > Isn't a "full s_stream cycle" something what _every_ capture driver or even
> > vb2 framework could execute internally without offloading the task to user
> > space? This new event will probably burden userspace too much. Every V4L2
> > capture application which worked fine on other SoCs will fail on RCar3 in
> > the use cases where transfer errors occur until application developers
> > implement handling of this event. Signalling an error via vb2_queue_error
> > should be probably sufficient, because it already signals via EPOLLERR/EIO
> > to userspace to do a streamoff to clear this error (as per videobuf2-core.h).
> > Maybe we just have to document that userspace should try to do a streamon
> > after this? EPOLLERR/EIO errors will be propagated to any application, but
> > not every application subscribes to V4L2 events...
> 
> I think doing a "full s_stream cycle" anywhere but from user-space to be 
> wrong. It will mess-up the state of capture session, for example it 
> would reset the capture sequence numbers which can freak out user-space 
> 3A processing. I think that if an error is hit that requires a full 
> restart of the stream it should be driven from user-space.
> 
> But with this series the error is also signaled thru a vb2_queue_error 
> right? What is added is for the applications that do look at V4L2 events 
> to know why it happen right? I mean the vb2 queue can signal error but 
> the event can tell the application if it was due to V4L2_EVENT_EOS,
> V4L2_EVENT_SOURCE_CHANGE or that something went very wrong using the 
> proposed V4L2_EVENT_XFER_ERROR event.

Agree, now I see that it makes sense to have an additional V4L2 event so
when userspace receives EPOLLERR or EIO, it can check what went wrong by
dequeuing the received event. If the event was an xfer error, userspace can
restart the streaming session once and wait for the next EPOLLERR or EIO.

> > 
> > I took a look at the other drivers and found that imx-media-csi.c seems to
> > follow a similar approach to what I described in my last email. There is a
> > 2 second timer, which monitors the csi_idmac_eof_interrupt and signals a
> > fatal error to the capture device via vb2_queue_error when this timer
> > expires. Monitoring end-of-frame interrupts looks also like a more robust
> > solution, because in this way we know reliably when something has failed
> > in the pipeline in contrast to reacting to start-of-transfer errors, which
> > can be false positives. I think it would be good to follow a similar
> > approach by adding an EOF timer to rvin_irq. This should also cover errors
> > on the digital pin camera interface if we get any in the future. After
> > this we could add a few additional improvements:
> >  1. configurability of the EOF timeout. 2 seconds can be too much,
> >     especially for automotive use cases. Probably this timeout could be
> >     reduced depending on the source frame rate.
> >  2. internal streamoff/streamon cycle after the timer expires instead of
> >     calling vb2_queue_error
> 
> I would need to deep-dive in the details but in general I think anything 
> we can do to improve the robustness of capture is good. And if we within 
> a subdevice can recover form errors without causing issues that is good.  
> But for the work in this series I think it's two different questions.  
> One is how and if we can add an event to signal something went very 
> wrong and the capture pipeline can't recover and have stopped, the other 
> is how we in individual drivers try to detect and recover from errors.

Agree, this is rather an idea for the second question and a suggestion how
the patches 3 and 4 could be improved.

> > 
> > > On the particulars of the VIN capture pipeline the only way I found so 
> > > far to freak out the CSI-2 receiver enough to trigger the event with this 
> > > patch series is to disconnect the HDMI source from the ADV7481 while 
> > > streaming and I don't think any in kernel recover method can fix that 
> > > ;-)
> > 
> > In this case I would expect an V4L2_EVENT_EOS or V4L2_EVENT_SOURCE_CHANGE
> > from the adv748x driver. Maybe we should call vb2_queue_error for these
> > events as well. But the described EOF timeout approach could also cover
> > this use case.
> 
> Hum, is not those events more stable to report "planed" events? From the 
> docs.
> 
> * V4L2_EVENT_EOS
>   This event is triggered when the end of a stream is reached. This is 
>   typically used with MPEG decoders to report to the application when 
>   the last of the MPEG stream has been decoded.
> 
> * V4L2_EVENT_SOURCE_CHANGE
>   This event is triggered when a source parameter change is detected 
>   during runtime by the video device. It can be a runtime resolution 
>   change triggered by a video decoder or the format change happening on 
>   an input connector. This event requires that the id matches the input 
>   index (when used with a video device node) or the pad index (when used 
>   with a subdevice node) from which you want to receive events.
> 
>   This event has a struct v4l2_event_src_change associated with it. The 
>   changes bitfield denotes what has changed for the subscribed pad. If 
>   multiple events occurred before application could dequeue them, then 
>   the changes will have the ORed value of all the events generated.
> 
> While I think the idea for V4L2_EVENT_XFER_ERROR more shall describe the 
> case something "unnatural" happens that we can't really recover from.  
> Maybe "ran over cable with lawn mower". I only brought up pulling the 
> cable as it's the only way I have to trigger it easily (I don't have a 
> lawn :-).

I was thinking about this discussion [1]. If I understand the
recommendation from Hans correctly, userspace should get a SOURCE_CHANGE
event and then query dv timings to know whether HDMI cable has been
disconnected or connected.

[1] Subject: "Re: [PATCH 1/3] v4l2-subdev: Add subdev ioctl support for ENUM/GET/SET INPUT"

> > 
> > > Over all I do agree with your idea that if we can recover from errors 
> > > that are recovererable that is good. For this series I would like to 
> > > focus on the former to get V4L2_EVENT_XFER_ERROR in and then if we have 
> > > ways to provoke and test recovery in the Gen3 pipeline we can add such 
> > > things to the drivers. Do this make sens or do you think the change in 
> > > the R-Car CSI-2 driver to raise V4L2_EVENT_XFER_ERROR is to harsh? My 
> > > motivation for is is the new datasheet and discussions with Renesas, but 
> > > then again my only way to trigger CSI-2 errors is to pull cables while 
> > > streaming so maybe I'm biased as such issues can't really be recover 
> > > from...
> > > 
> > > Let me know what you think, I was about to spin a new version of this 
> > > series.
> > 
> > I believe this event might be useful for subdevice drivers which want to
> > request a streamoff/streamon cycle from the capture driver but it's
> > probably not useful for the RCar3 CSI2 driver in the current state because
> > reacting to start-of-transfer errors can be risky due to false positives.
> > 
> > What do you think about implementing an approach similar to imx-media-csi.c
> > first?
> 
> I'm not sure really what I think. Maybe if I understand better how to 
> generate these false positives it would help me understand. I also get a 
> feeling we are trying to solve two different problems. I want to find a 
> way to just terminate capture when an unrecoverable error is hit while I 
> get the feeling you are more focused on how to recover from recoverable 
> errors in the best way. Maybe I'm missing understanding something, sorry 
> if I am.
> 
> I think both problems are important but do not overlap to a large 
> degree.

I think, one possibility to generate false positive CSI2 errors on a
Salvator board is by manipulating CSI2-related registers in the adv7482. I
took a look at the adv7482 hardware manual and I have found one method by
reenabling automatically calculated DPHY timings:

1. start streaming from HDMI (I am using a V-SG4K-HDI signal generator which
outputs 1080P@60).
# media-ctl -d /dev/media0 -r
# media-ctl -d /dev/media0 -l "'adv748x 4-0070 hdmi':1 -> 'adv748x 4-0070 txa':0 [1]"
# media-ctl -d /dev/media0 -l "'rcar_csi2 feaa0000.csi2':1 -> 'VIN1 output':0 [1]"
# media-ctl -d /dev/media0 --set-v4l2 "'adv748x 4-0070 hdmi':1 [fmt:RGB888_1X24/1920x1080 field:none]"
# media-ctl -d /dev/media0 --set-v4l2 "'adv748x 4-0070 txa':1 [fmt:RGB888_1X24/1920x1080 field:none]"
# media-ctl -d /dev/media0 --set-v4l2 "'rcar_csi2 feaa0000.csi2':1 [fmt:RGB888_1X24/1920x1080 field:none]"
# v4l2-ctl --stream-mmap --device /dev/video1 --verbose
2. in another shell reenable calculation of the DPHY timings in adv7482:
# modprobe i2c-dev
# i2cset -f -y 4 0x64 0x00 0x04 && i2cset -f -y 4 0x64 0x00 0x24
# dmesg
[ 8806.752375] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
[ 8806.759642] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
[ 8806.766881] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
[ 8806.774109] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
# 

There will be some transfer errors, however I have disabled the restart in the
CSI2 driver so rcar-csi2 will not attempt to recover any more. In
approximately half of the trials I see that streaming can continue despite
CSI2 errors. In the other half of my trials streaming stops and I have to
restart it via v4l2-ctl. Maybe its even possible to simulate CSI2 errors
which are completely harmless by changing other adv7482 registers. Or was
my example already helpful to understand what I mean by those false positives?
I hope this makes clear, why its not good to stop streaming and trigger the
xfer error event each time we get a CSI2 error but instead setup a timeout
and wait for the frame like its done in the imx driver.

> > 
> > > > 
> > > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Drenesas-2Dsoc_1592588777-2D100596-2D1-2Dgit-2Dsend-2Demail-2Dmrodin-40de.adit-2Djv.com_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=B4GPIasxu8VP04sLBFvhbmr8yhotp6GcfDQqEKjqeAc&e=
> > > > 
> > > > > Regards,
> > > > > 
> > > > > 	Hans
> > > > > 
> > > > > > 
> > > > > >>
> > > > > >> Regards,
> > > > > >>
> > > > > >> 	Hans
> > > > > >>
> > > > > >>> +			break;
> > > > > >>> +		default:
> > > > > >>> +			break;
> > > > > >>> +		}
> > > > > >>> +
> > > > > >>> +		v4l2_event_queue(&vin->vdev, event);
> > > > > >>>  		break;
> > > > > >>>  	default:
> > > > > >>>  		break;
> > > > > >>>
> > > > > >>
> > > > > > 
> > > > > 
> > > > 
> > > > -- 
> > > > Best Regards,
> > > > Michael
> > > 
> > > -- 
> > > Kind Regards,
> > > Niklas Söderlund
> > 
> > -- 
> > Best Regards,
> > Michael
> 
> -- 
> Kind Regards,
> Niklas Söderlund

-- 
Best Regards,
Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error
  2022-03-09 19:27                 ` Michael Rodin
@ 2022-05-19 18:09                   ` Michael Rodin
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Rodin @ 2022-05-19 18:09 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Michael Rodin, Hans Verkuil, linux-media, linux-renesas-soc,
	Jacopo Mondi


Hi Niklas, Hans,

I have restarted this patch series with some changes [1], I hope it's ok
for you and I am looking forward to your comments.

Thank you!

[1] https://lore.kernel.org/linux-media/1652983210-1194-1-git-send-email-mrodin@de.adit-jv.com/

On Wed, Mar 09, 2022 at 08:27:07PM +0100, Michael Rodin wrote:
> Hi Niklas,
> 
> On Mon, Mar 07, 2022 at 04:26:23PM +0100, Niklas Söderlund wrote:
> > Hi Michael,
> > 
> > On 2022-03-06 21:01:51 +0100, Michael Rodin wrote:
> > > Hi Niklas,
> > > On Wed, Mar 02, 2022 at 09:17:37PM +0100, Niklas Söderlund wrote:
> > > > Hi Michael,
> > > > 
> > > > Thanks for your feedback.
> > > > 
> > > > On 2022-03-02 17:48:34 +0100, Michael Rodin wrote:
> > > > > Hi Niklas, Hans,
> > > > > 
> > > > > On Mon, Nov 15, 2021 at 03:26:53PM +0100, Hans Verkuil wrote:
> > > > > > On 08/11/2021 19:42, Niklas Söderlund wrote:
> > > > > > > Hi Hans,
> > > > > > > 
> > > > > > > On 2021-11-08 18:36:25 +0100, Hans Verkuil wrote:
> > > > > > >> On 08/11/2021 17:02, Niklas Söderlund wrote:
> > > > > > >>> When a subdevice signals a transfer error stop the VIN in addition to
> > > > > > >>> informing user-space of the event.
> > > > > > >>>
> > > > > > >>> Signed-off-by: Niklas Söderlund <https://urldefense.proofpoint.com/v2/url?u=http-3A__niklas.soderlund-2Brenesas-40ragnatech.se&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=isB_ZRIMVizn-pFdEQpwsKeg81nAE-QHMhjUzSozUZg&e=>
> > > > > > >>> Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
> > > > > > >>> ---
> > > > > > >>> * Changes since v3
> > > > > > >>> - Switch to new V4L2_EVENT_XFER_ERROR from V4L2_EVENT_EOS.
> > > > > > >>> - Call vb2_queue_error() when encountering the event.
> > > > > > >>>
> > > > > > >>> * Changes since v2
> > > > > > >>> - Log using vin_dbg() instead of v4l2_info().
> > > > > > >>> ---
> > > > > > >>>  drivers/media/platform/rcar-vin/rcar-v4l2.c | 17 ++++++++++++++++-
> > > > > > >>>  1 file changed, 16 insertions(+), 1 deletion(-)
> > > > > > >>>
> > > > > > >>> diff --git a/drivers/media/platform/rcar-vin/rcar-v4l2.c b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > > > >>> index a5bfa76fdac6e55a..bf17fdefe90aabf5 100644
> > > > > > >>> --- a/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > > > >>> +++ b/drivers/media/platform/rcar-vin/rcar-v4l2.c
> > > > > > >>> @@ -992,9 +992,24 @@ void rvin_v4l2_unregister(struct rvin_dev *vin)
> > > > > > >>>  static void rvin_notify_video_device(struct rvin_dev *vin,
> > > > > > >>>  				     unsigned int notification, void *arg)
> > > > > > >>>  {
> > > > > > >>> +	const struct v4l2_event *event;
> > > > > > >>> +
> > > > > > >>>  	switch (notification) {
> > > > > > >>>  	case V4L2_DEVICE_NOTIFY_EVENT:
> > > > > > >>> -		v4l2_event_queue(&vin->vdev, arg);
> > > > > > >>> +		event = arg;
> > > > > > >>> +
> > > > > > >>> +		switch (event->type) {
> > > > > > >>> +		case V4L2_EVENT_XFER_ERROR:
> > > > > > >>> +			vin_dbg(vin,
> > > > > > >>> +				"Subdevice signaled transfer error, stopping.\n");
> > > > > > >>> +			rvin_stop_streaming(vin);
> > > > > > >>> +			vb2_queue_error(&vin->queue);
> > > > > > >>
> > > > > > >> Hmm, wouldn't it be the case that every driver that calls vb2_queue_error()
> > > > > > >> would also have to send this new event? Would it be possible to modify
> > > > > > >> vb2_queue_error() to raise this event? I haven't analyzed all the drivers
> > > > > > >> that call this function to see if that makes sense.
> > > > > > >>
> > > > > > >> Perhaps a separate new function vb2_queue_error_with_event() would also be
> > > > > > >> an option.
> > > > > > > 
> > > > > > > I think that maybe a good idea, but I think that would be needed on-top 
> > > > > > > of this work as I can't really test it. Here the rcar-csi2.ko is a 
> > > > > > > subdevice which detects the error condition and generates the event. And 
> > > > > > > this code is in rcar-vin.ko, the video device driver which reacts to the 
> > > > > > > event and then forwards it to user-space.
> > > > > > > 
> > > > > > > Or am I misunderstanding you? And you think I should remove the 
> > > > > > > v4l2_event_queue() below in favor of a new vb2_queue_error_with_event() 
> > > > > > > call?
> > > > > > 
> > > > > > Yes. And use vb2_queue_error_with_event in other drivers as well where
> > > > > > applicable. Hmm, it can't be called vb2_ since it is v4l2_ specific, so
> > > > > > perhaps v4l2_queue_error which takes a video_device and a vb2_queue as
> > > > > > arguments. I don't want this just in rcar since it makes perfect sense
> > > > > > as a generic event for such situations.
> > > > > 
> > > > > Handling errors in this way could be problematic, because a (CSI2) transfer
> > > > > error does not mean a total hardware failure on Rcar3. From my experience
> > > > > there are 3 kinds of CSI2 errors:
> > > > >   1. errors which occur sometimes, but do not affect video streaming
> > > > >   2. errors which occur on every start of streaming but usually do not
> > > > >      affect actual video streaming to VIN module after the start
> > > > >   3. fatal errors which require a "Software Reset" mentioned by Renesas in
> > > > >      the chapter 25.3.13 of the hardware manual in order to continue
> > > > >      video streaming
> > > > > This patch set makes the video pipeline unusable if we get errors described
> > > > > in the first scenario if I am not mistaken. In the second scenario the
> > > > > video pipeline was already not usable before because we end up in a
> > > > > continuous restart loop in rcar-csi2.c. And the third scenario is not
> > > > > really addressed by this patch set (or maybe the job is offloaded on to
> > > > > userspace)?
> > > > > 
> > > > > Maybe it's better to implement a recovery in a different way, which would
> > > > > consider the three mentioned error scenarios above:
> > > > >   1. Monitor rvin_irq after streaming has started, e.g. by using a timer
> > > > >      (I tried someting similar in [1])
> > > > >   2. restart the complete video pipeline via rvin_stop_streaming and
> > > > >      rvin_start_streaming if no frame is captured in a reasonable amount
> > > > >      of time (optionally after checking if a subdevice has sent a
> > > > >      V4L2_EVENT_XFER_ERROR).
> > > > > This would make the complete recovery process almost invisible for the
> > > > > application and avoid any application changes.
> > > > > 
> > > > > What do you think?
> > > > 
> > > > I think you bring up a few interesting points and for discussions sake I 
> > > > think we need to split it in two. One on how we could implement 
> > > > V4L2_EVENT_XFER_ERROR in a generic sens and one on how to best deal with 
> > > > errors in the R-Car Gen3 capture pipeline.
> > > > 
> > > > For the proposed V4L2_EVENT_XFER_ERROR the idea from my side is that a 
> > > > driver in the pipeline shall only raise this error (and propagate it to 
> > > > the effected video node) when there is no way to recover without 
> > > > involving user-space. So when this event happens an application at the 
> > > > very least needs to do a full s_stream cycle to restart the capture 
> > > > session.
> > > 
> > > Isn't a "full s_stream cycle" something what _every_ capture driver or even
> > > vb2 framework could execute internally without offloading the task to user
> > > space? This new event will probably burden userspace too much. Every V4L2
> > > capture application which worked fine on other SoCs will fail on RCar3 in
> > > the use cases where transfer errors occur until application developers
> > > implement handling of this event. Signalling an error via vb2_queue_error
> > > should be probably sufficient, because it already signals via EPOLLERR/EIO
> > > to userspace to do a streamoff to clear this error (as per videobuf2-core.h).
> > > Maybe we just have to document that userspace should try to do a streamon
> > > after this? EPOLLERR/EIO errors will be propagated to any application, but
> > > not every application subscribes to V4L2 events...
> > 
> > I think doing a "full s_stream cycle" anywhere but from user-space to be 
> > wrong. It will mess-up the state of capture session, for example it 
> > would reset the capture sequence numbers which can freak out user-space 
> > 3A processing. I think that if an error is hit that requires a full 
> > restart of the stream it should be driven from user-space.
> > 
> > But with this series the error is also signaled thru a vb2_queue_error 
> > right? What is added is for the applications that do look at V4L2 events 
> > to know why it happen right? I mean the vb2 queue can signal error but 
> > the event can tell the application if it was due to V4L2_EVENT_EOS,
> > V4L2_EVENT_SOURCE_CHANGE or that something went very wrong using the 
> > proposed V4L2_EVENT_XFER_ERROR event.
> 
> Agree, now I see that it makes sense to have an additional V4L2 event so
> when userspace receives EPOLLERR or EIO, it can check what went wrong by
> dequeuing the received event. If the event was an xfer error, userspace can
> restart the streaming session once and wait for the next EPOLLERR or EIO.
> 
> > > 
> > > I took a look at the other drivers and found that imx-media-csi.c seems to
> > > follow a similar approach to what I described in my last email. There is a
> > > 2 second timer, which monitors the csi_idmac_eof_interrupt and signals a
> > > fatal error to the capture device via vb2_queue_error when this timer
> > > expires. Monitoring end-of-frame interrupts looks also like a more robust
> > > solution, because in this way we know reliably when something has failed
> > > in the pipeline in contrast to reacting to start-of-transfer errors, which
> > > can be false positives. I think it would be good to follow a similar
> > > approach by adding an EOF timer to rvin_irq. This should also cover errors
> > > on the digital pin camera interface if we get any in the future. After
> > > this we could add a few additional improvements:
> > >  1. configurability of the EOF timeout. 2 seconds can be too much,
> > >     especially for automotive use cases. Probably this timeout could be
> > >     reduced depending on the source frame rate.
> > >  2. internal streamoff/streamon cycle after the timer expires instead of
> > >     calling vb2_queue_error
> > 
> > I would need to deep-dive in the details but in general I think anything 
> > we can do to improve the robustness of capture is good. And if we within 
> > a subdevice can recover form errors without causing issues that is good.  
> > But for the work in this series I think it's two different questions.  
> > One is how and if we can add an event to signal something went very 
> > wrong and the capture pipeline can't recover and have stopped, the other 
> > is how we in individual drivers try to detect and recover from errors.
> 
> Agree, this is rather an idea for the second question and a suggestion how
> the patches 3 and 4 could be improved.
> 
> > > 
> > > > On the particulars of the VIN capture pipeline the only way I found so 
> > > > far to freak out the CSI-2 receiver enough to trigger the event with this 
> > > > patch series is to disconnect the HDMI source from the ADV7481 while 
> > > > streaming and I don't think any in kernel recover method can fix that 
> > > > ;-)
> > > 
> > > In this case I would expect an V4L2_EVENT_EOS or V4L2_EVENT_SOURCE_CHANGE
> > > from the adv748x driver. Maybe we should call vb2_queue_error for these
> > > events as well. But the described EOF timeout approach could also cover
> > > this use case.
> > 
> > Hum, is not those events more stable to report "planed" events? From the 
> > docs.
> > 
> > * V4L2_EVENT_EOS
> >   This event is triggered when the end of a stream is reached. This is 
> >   typically used with MPEG decoders to report to the application when 
> >   the last of the MPEG stream has been decoded.
> > 
> > * V4L2_EVENT_SOURCE_CHANGE
> >   This event is triggered when a source parameter change is detected 
> >   during runtime by the video device. It can be a runtime resolution 
> >   change triggered by a video decoder or the format change happening on 
> >   an input connector. This event requires that the id matches the input 
> >   index (when used with a video device node) or the pad index (when used 
> >   with a subdevice node) from which you want to receive events.
> > 
> >   This event has a struct v4l2_event_src_change associated with it. The 
> >   changes bitfield denotes what has changed for the subscribed pad. If 
> >   multiple events occurred before application could dequeue them, then 
> >   the changes will have the ORed value of all the events generated.
> > 
> > While I think the idea for V4L2_EVENT_XFER_ERROR more shall describe the 
> > case something "unnatural" happens that we can't really recover from.  
> > Maybe "ran over cable with lawn mower". I only brought up pulling the 
> > cable as it's the only way I have to trigger it easily (I don't have a 
> > lawn :-).
> 
> I was thinking about this discussion [1]. If I understand the
> recommendation from Hans correctly, userspace should get a SOURCE_CHANGE
> event and then query dv timings to know whether HDMI cable has been
> disconnected or connected.
> 
> [1] Subject: "Re: [PATCH 1/3] v4l2-subdev: Add subdev ioctl support for ENUM/GET/SET INPUT"
> 
> > > 
> > > > Over all I do agree with your idea that if we can recover from errors 
> > > > that are recovererable that is good. For this series I would like to 
> > > > focus on the former to get V4L2_EVENT_XFER_ERROR in and then if we have 
> > > > ways to provoke and test recovery in the Gen3 pipeline we can add such 
> > > > things to the drivers. Do this make sens or do you think the change in 
> > > > the R-Car CSI-2 driver to raise V4L2_EVENT_XFER_ERROR is to harsh? My 
> > > > motivation for is is the new datasheet and discussions with Renesas, but 
> > > > then again my only way to trigger CSI-2 errors is to pull cables while 
> > > > streaming so maybe I'm biased as such issues can't really be recover 
> > > > from...
> > > > 
> > > > Let me know what you think, I was about to spin a new version of this 
> > > > series.
> > > 
> > > I believe this event might be useful for subdevice drivers which want to
> > > request a streamoff/streamon cycle from the capture driver but it's
> > > probably not useful for the RCar3 CSI2 driver in the current state because
> > > reacting to start-of-transfer errors can be risky due to false positives.
> > > 
> > > What do you think about implementing an approach similar to imx-media-csi.c
> > > first?
> > 
> > I'm not sure really what I think. Maybe if I understand better how to 
> > generate these false positives it would help me understand. I also get a 
> > feeling we are trying to solve two different problems. I want to find a 
> > way to just terminate capture when an unrecoverable error is hit while I 
> > get the feeling you are more focused on how to recover from recoverable 
> > errors in the best way. Maybe I'm missing understanding something, sorry 
> > if I am.
> > 
> > I think both problems are important but do not overlap to a large 
> > degree.
> 
> I think, one possibility to generate false positive CSI2 errors on a
> Salvator board is by manipulating CSI2-related registers in the adv7482. I
> took a look at the adv7482 hardware manual and I have found one method by
> reenabling automatically calculated DPHY timings:
> 
> 1. start streaming from HDMI (I am using a V-SG4K-HDI signal generator which
> outputs 1080P@60).
> # media-ctl -d /dev/media0 -r
> # media-ctl -d /dev/media0 -l "'adv748x 4-0070 hdmi':1 -> 'adv748x 4-0070 txa':0 [1]"
> # media-ctl -d /dev/media0 -l "'rcar_csi2 feaa0000.csi2':1 -> 'VIN1 output':0 [1]"
> # media-ctl -d /dev/media0 --set-v4l2 "'adv748x 4-0070 hdmi':1 [fmt:RGB888_1X24/1920x1080 field:none]"
> # media-ctl -d /dev/media0 --set-v4l2 "'adv748x 4-0070 txa':1 [fmt:RGB888_1X24/1920x1080 field:none]"
> # media-ctl -d /dev/media0 --set-v4l2 "'rcar_csi2 feaa0000.csi2':1 [fmt:RGB888_1X24/1920x1080 field:none]"
> # v4l2-ctl --stream-mmap --device /dev/video1 --verbose
> 2. in another shell reenable calculation of the DPHY timings in adv7482:
> # modprobe i2c-dev
> # i2cset -f -y 4 0x64 0x00 0x04 && i2cset -f -y 4 0x64 0x00 0x24
> # dmesg
> [ 8806.752375] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
> [ 8806.759642] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
> [ 8806.766881] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
> [ 8806.774109] rcar-csi2 feaa0000.csi2: Transfer error, restarting CSI-2 receiver
> # 
> 
> There will be some transfer errors, however I have disabled the restart in the
> CSI2 driver so rcar-csi2 will not attempt to recover any more. In
> approximately half of the trials I see that streaming can continue despite
> CSI2 errors. In the other half of my trials streaming stops and I have to
> restart it via v4l2-ctl. Maybe its even possible to simulate CSI2 errors
> which are completely harmless by changing other adv7482 registers. Or was
> my example already helpful to understand what I mean by those false positives?
> I hope this makes clear, why its not good to stop streaming and trigger the
> xfer error event each time we get a CSI2 error but instead setup a timeout
> and wait for the frame like its done in the imx driver.
> 
> > > 
> > > > > 
> > > > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Drenesas-2Dsoc_1592588777-2D100596-2D1-2Dgit-2Dsend-2Demail-2Dmrodin-40de.adit-2Djv.com_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=iHMOF-_0M012_DHC6tHxMudpVvpBDqktUtXsKy4xpY0&s=B4GPIasxu8VP04sLBFvhbmr8yhotp6GcfDQqEKjqeAc&e=
> > > > > 
> > > > > > Regards,
> > > > > > 
> > > > > > 	Hans
> > > > > > 
> > > > > > > 
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >>
> > > > > > >> 	Hans
> > > > > > >>
> > > > > > >>> +			break;
> > > > > > >>> +		default:
> > > > > > >>> +			break;
> > > > > > >>> +		}
> > > > > > >>> +
> > > > > > >>> +		v4l2_event_queue(&vin->vdev, event);
> > > > > > >>>  		break;
> > > > > > >>>  	default:
> > > > > > >>>  		break;
> > > > > > >>>
> > > > > > >>
> > > > > > > 
> > > > > > 
> > > > > 
> > > > > -- 
> > > > > Best Regards,
> > > > > Michael
> > > > 
> > > > -- 
> > > > Kind Regards,
> > > > Niklas Söderlund
> > > 
> > > -- 
> > > Best Regards,
> > > Michael
> > 
> > -- 
> > Kind Regards,
> > Niklas Söderlund
> 
> -- 
> Best Regards,
> Michael

-- 
Best Regards,
Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-05-19 18:09 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08 16:02 [PATCH 0/4] media: videobuf2: Add a transfer error event Niklas Söderlund
2021-11-08 16:02 ` [PATCH 1/4] media: rcar-vin: Free buffers with error if hardware stop fails Niklas Söderlund
2021-11-15 14:36   ` Hans Verkuil
2021-11-08 16:02 ` [PATCH 2/4] media: videobuf2: Add a transfer error event Niklas Söderlund
2022-03-02 15:40   ` Michael Rodin
2021-11-08 16:02 ` [PATCH 3/4] rcar-vin: Stop stream when subdevice signal transfer error Niklas Söderlund
2021-11-08 17:36   ` Hans Verkuil
2021-11-08 18:42     ` Niklas Söderlund
2021-11-15 14:26       ` Hans Verkuil
2022-03-02 16:48         ` Michael Rodin
2022-03-02 20:17           ` Niklas Söderlund
2022-03-06 20:01             ` Michael Rodin
2022-03-07 15:26               ` Niklas Söderlund
2022-03-09 19:27                 ` Michael Rodin
2022-05-19 18:09                   ` Michael Rodin
2021-11-08 16:02 ` [PATCH 4/4] rcar-csi2: Do not try to recover after " Niklas Söderlund

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.