linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
@ 2017-04-25  6:19 Alexandre Courbot
  2017-04-25 19:15 ` Jacek Anaszewski
  0 siblings, 1 reply; 8+ messages in thread
From: Alexandre Courbot @ 2017-04-25  6:19 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, Jacek Anaszewski, Mauro Carvalho Chehab
  Cc: linux-media, linux-kernel, Alexandre Courbot

v4l2_m2m_job_finish(), which is called from the interrupt handler with
slock acquired, can call the device_run() hook immediately if another
context was in the queue. This hook also acquires slock, resulting in
a deadlock for this scenario.

Fix this by releasing slock right before calling v4l2_m2m_job_finish().
This is safe to do as the state of the hardware cannot change before
v4l2_m2m_job_finish() is called anyway.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
index 52dc7941db65..223b4379929e 100644
--- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
+++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
@@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void *dev_id)
 	if (curr_ctx->mode == S5P_JPEG_ENCODE)
 		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
 	v4l2_m2m_buf_done(dst_buf, state);
-	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
 
 	curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
 	spin_unlock(&jpeg->slock);
 
 	s5p_jpeg_clear_int(jpeg->regs);
 
+	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
 	return IRQ_HANDLED;
 }
 
@@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void *priv)
 		v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
 	}
 
-	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
 	if (jpeg->variant->version == SJPEG_EXYNOS4)
 		curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
 
 	spin_unlock(&jpeg->slock);
+
+	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
 	return IRQ_HANDLED;
 }
 
@@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void *dev_id)
 	if (curr_ctx->mode == S5P_JPEG_ENCODE)
 		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
 	v4l2_m2m_buf_done(dst_buf, state);
-	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
 
 	curr_ctx->subsampling =
 			exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
+
+	spin_unlock(&jpeg->slock);
+
+	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
+	return IRQ_HANDLED;
+
 exit_unlock:
 	spin_unlock(&jpeg->slock);
 	return IRQ_HANDLED;
-- 
2.13.0.rc0.306.g87b477812d-goog

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
  2017-04-25  6:19 [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition Alexandre Courbot
@ 2017-04-25 19:15 ` Jacek Anaszewski
  2017-04-26  2:54   ` Alexandre Courbot
  0 siblings, 1 reply; 8+ messages in thread
From: Jacek Anaszewski @ 2017-04-25 19:15 UTC (permalink / raw)
  To: Alexandre Courbot, Andrzej Pietrasiewicz, Mauro Carvalho Chehab
  Cc: linux-media, linux-kernel

Hi Alexandre,

Thanks for the patch.

On 04/25/2017 08:19 AM, Alexandre Courbot wrote:
> v4l2_m2m_job_finish(), which is called from the interrupt handler with
> slock acquired, can call the device_run() hook immediately if another
> context was in the queue. This hook also acquires slock, resulting in
> a deadlock for this scenario.
> 
> Fix this by releasing slock right before calling v4l2_m2m_job_finish().
> This is safe to do as the state of the hardware cannot change before
> v4l2_m2m_job_finish() is called anyway.
> 
> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> ---
>  drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
> index 52dc7941db65..223b4379929e 100644
> --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
> +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
> @@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void *dev_id)
>  	if (curr_ctx->mode == S5P_JPEG_ENCODE)
>  		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>  	v4l2_m2m_buf_done(dst_buf, state);
> -	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>  
>  	curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
>  	spin_unlock(&jpeg->slock);
>  
>  	s5p_jpeg_clear_int(jpeg->regs);
>  
> +	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>  	return IRQ_HANDLED;
>  }
>  
> @@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void *priv)
>  		v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
>  	}
>  
> -	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>  	if (jpeg->variant->version == SJPEG_EXYNOS4)
>  		curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
>  
>  	spin_unlock(&jpeg->slock);
> +
> +	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>  	return IRQ_HANDLED;
>  }
>  
> @@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void *dev_id)
>  	if (curr_ctx->mode == S5P_JPEG_ENCODE)
>  		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>  	v4l2_m2m_buf_done(dst_buf, state);
> -	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>  
>  	curr_ctx->subsampling =
>  			exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
> +
> +	spin_unlock(&jpeg->slock);
> +
> +	v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
> +	return IRQ_HANDLED;
> +
>  exit_unlock:
>  	spin_unlock(&jpeg->slock);
>  	return IRQ_HANDLED;
> 

Acked-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>

Just out of curiosity - could you share how you discovered the problem -
by some static checkers or trying to use the driver?

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
  2017-04-25 19:15 ` Jacek Anaszewski
@ 2017-04-26  2:54   ` Alexandre Courbot
  2017-04-26 19:35     ` Jacek Anaszewski
  0 siblings, 1 reply; 8+ messages in thread
From: Alexandre Courbot @ 2017-04-26  2:54 UTC (permalink / raw)
  To: Jacek Anaszewski
  Cc: Andrzej Pietrasiewicz, Mauro Carvalho Chehab, linux-media, linux-kernel

On Wed, Apr 26, 2017 at 4:15 AM, Jacek Anaszewski
<jacek.anaszewski@gmail.com> wrote:
> Hi Alexandre,
>
> Thanks for the patch.
>
> On 04/25/2017 08:19 AM, Alexandre Courbot wrote:
>> v4l2_m2m_job_finish(), which is called from the interrupt handler with
>> slock acquired, can call the device_run() hook immediately if another
>> context was in the queue. This hook also acquires slock, resulting in
>> a deadlock for this scenario.
>>
>> Fix this by releasing slock right before calling v4l2_m2m_job_finish().
>> This is safe to do as the state of the hardware cannot change before
>> v4l2_m2m_job_finish() is called anyway.
>>
>> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
>> ---
>>  drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>> index 52dc7941db65..223b4379929e 100644
>> --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
>> +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>> @@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void *dev_id)
>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>       v4l2_m2m_buf_done(dst_buf, state);
>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>
>>       curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
>>       spin_unlock(&jpeg->slock);
>>
>>       s5p_jpeg_clear_int(jpeg->regs);
>>
>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>       return IRQ_HANDLED;
>>  }
>>
>> @@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void *priv)
>>               v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
>>       }
>>
>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>       if (jpeg->variant->version == SJPEG_EXYNOS4)
>>               curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
>>
>>       spin_unlock(&jpeg->slock);
>> +
>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>       return IRQ_HANDLED;
>>  }
>>
>> @@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void *dev_id)
>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>       v4l2_m2m_buf_done(dst_buf, state);
>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>
>>       curr_ctx->subsampling =
>>                       exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
>> +
>> +     spin_unlock(&jpeg->slock);
>> +
>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>> +     return IRQ_HANDLED;
>> +
>>  exit_unlock:
>>       spin_unlock(&jpeg->slock);
>>       return IRQ_HANDLED;
>>
>
> Acked-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
>
> Just out of curiosity - could you share how you discovered the problem -
> by some static checkers or trying to use the driver?

We discovered this issue after adding a new unit test for the jpeg
codec in Chromium OS:

https://bugs.chromium.org/p/chromium/issues/detail?id=705971

>From what I understand the test spawns different processes that access
the codec device concurrently, creating the situation leading to the
bug.

On a slightly related note, I was thinking whether it would make sense
to move the call to  v4l2_m2m_job_finish() (and maybe other parts of
the current interrupt handler) into a worker or a threaded interrupt
handler so as to reduce the time we spend with interrupts disabled.
Can I have your input on this idea?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
  2017-04-26  2:54   ` Alexandre Courbot
@ 2017-04-26 19:35     ` Jacek Anaszewski
  2017-05-29  7:29       ` Alexandre Courbot
  0 siblings, 1 reply; 8+ messages in thread
From: Jacek Anaszewski @ 2017-04-26 19:35 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Andrzej Pietrasiewicz, Mauro Carvalho Chehab, linux-media, linux-kernel

On 04/26/2017 04:54 AM, Alexandre Courbot wrote:
> On Wed, Apr 26, 2017 at 4:15 AM, Jacek Anaszewski
> <jacek.anaszewski@gmail.com> wrote:
>> Hi Alexandre,
>>
>> Thanks for the patch.
>>
>> On 04/25/2017 08:19 AM, Alexandre Courbot wrote:
>>> v4l2_m2m_job_finish(), which is called from the interrupt handler with
>>> slock acquired, can call the device_run() hook immediately if another
>>> context was in the queue. This hook also acquires slock, resulting in
>>> a deadlock for this scenario.
>>>
>>> Fix this by releasing slock right before calling v4l2_m2m_job_finish().
>>> This is safe to do as the state of the hardware cannot change before
>>> v4l2_m2m_job_finish() is called anyway.
>>>
>>> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
>>> ---
>>>  drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
>>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>> index 52dc7941db65..223b4379929e 100644
>>> --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>> +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>> @@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void *dev_id)
>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>       v4l2_m2m_buf_done(dst_buf, state);
>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>
>>>       curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
>>>       spin_unlock(&jpeg->slock);
>>>
>>>       s5p_jpeg_clear_int(jpeg->regs);
>>>
>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>       return IRQ_HANDLED;
>>>  }
>>>
>>> @@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void *priv)
>>>               v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
>>>       }
>>>
>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>       if (jpeg->variant->version == SJPEG_EXYNOS4)
>>>               curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
>>>
>>>       spin_unlock(&jpeg->slock);
>>> +
>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>       return IRQ_HANDLED;
>>>  }
>>>
>>> @@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void *dev_id)
>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>       v4l2_m2m_buf_done(dst_buf, state);
>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>
>>>       curr_ctx->subsampling =
>>>                       exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
>>> +
>>> +     spin_unlock(&jpeg->slock);
>>> +
>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>> +     return IRQ_HANDLED;
>>> +
>>>  exit_unlock:
>>>       spin_unlock(&jpeg->slock);
>>>       return IRQ_HANDLED;
>>>
>>
>> Acked-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
>>
>> Just out of curiosity - could you share how you discovered the problem -
>> by some static checkers or trying to use the driver?
> 
> We discovered this issue after adding a new unit test for the jpeg
> codec in Chromium OS:
> 
> https://bugs.chromium.org/p/chromium/issues/detail?id=705971
> 
>>From what I understand the test spawns different processes that access
> the codec device concurrently, creating the situation leading to the
> bug.

Thanks for the explanation. Nice fix.

> On a slightly related note, I was thinking whether it would make sense
> to move the call to  v4l2_m2m_job_finish() (and maybe other parts of
> the current interrupt handler) into a worker or a threaded interrupt
> handler so as to reduce the time we spend with interrupts disabled.
> Can I have your input on this idea?

Right, all remaining drivers call it from workers.
Feel free to submit a patch.

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
  2017-04-26 19:35     ` Jacek Anaszewski
@ 2017-05-29  7:29       ` Alexandre Courbot
  2017-05-29 19:08         ` Jacek Anaszewski
  0 siblings, 1 reply; 8+ messages in thread
From: Alexandre Courbot @ 2017-05-29  7:29 UTC (permalink / raw)
  To: Jacek Anaszewski
  Cc: Andrzej Pietrasiewicz, Mauro Carvalho Chehab, linux-media, linux-kernel

Hi everyone,

On Thu, Apr 27, 2017 at 4:35 AM, Jacek Anaszewski
<jacek.anaszewski@gmail.com> wrote:
> On 04/26/2017 04:54 AM, Alexandre Courbot wrote:
>> On Wed, Apr 26, 2017 at 4:15 AM, Jacek Anaszewski
>> <jacek.anaszewski@gmail.com> wrote:
>>> Hi Alexandre,
>>>
>>> Thanks for the patch.
>>>
>>> On 04/25/2017 08:19 AM, Alexandre Courbot wrote:
>>>> v4l2_m2m_job_finish(), which is called from the interrupt handler with
>>>> slock acquired, can call the device_run() hook immediately if another
>>>> context was in the queue. This hook also acquires slock, resulting in
>>>> a deadlock for this scenario.
>>>>
>>>> Fix this by releasing slock right before calling v4l2_m2m_job_finish().
>>>> This is safe to do as the state of the hardware cannot change before
>>>> v4l2_m2m_job_finish() is called anyway.
>>>>
>>>> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
>>>> ---
>>>>  drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
>>>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>> index 52dc7941db65..223b4379929e 100644
>>>> --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>> +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>> @@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void *dev_id)
>>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>>       v4l2_m2m_buf_done(dst_buf, state);
>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>
>>>>       curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
>>>>       spin_unlock(&jpeg->slock);
>>>>
>>>>       s5p_jpeg_clear_int(jpeg->regs);
>>>>
>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>       return IRQ_HANDLED;
>>>>  }
>>>>
>>>> @@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void *priv)
>>>>               v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
>>>>       }
>>>>
>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>       if (jpeg->variant->version == SJPEG_EXYNOS4)
>>>>               curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
>>>>
>>>>       spin_unlock(&jpeg->slock);
>>>> +
>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>       return IRQ_HANDLED;
>>>>  }
>>>>
>>>> @@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void *dev_id)
>>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>>       v4l2_m2m_buf_done(dst_buf, state);
>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>
>>>>       curr_ctx->subsampling =
>>>>                       exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
>>>> +
>>>> +     spin_unlock(&jpeg->slock);
>>>> +
>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>> +     return IRQ_HANDLED;
>>>> +
>>>>  exit_unlock:
>>>>       spin_unlock(&jpeg->slock);
>>>>       return IRQ_HANDLED;
>>>>
>>>
>>> Acked-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
>>>
>>> Just out of curiosity - could you share how you discovered the problem -
>>> by some static checkers or trying to use the driver?
>>
>> We discovered this issue after adding a new unit test for the jpeg
>> codec in Chromium OS:
>>
>> https://bugs.chromium.org/p/chromium/issues/detail?id=705971
>>
>>>From what I understand the test spawns different processes that access
>> the codec device concurrently, creating the situation leading to the
>> bug.
>
> Thanks for the explanation. Nice fix.

Gentle ping as I am not seeing this patch in the tree yet. Thanks.

>
>> On a slightly related note, I was thinking whether it would make sense
>> to move the call to  v4l2_m2m_job_finish() (and maybe other parts of
>> the current interrupt handler) into a worker or a threaded interrupt
>> handler so as to reduce the time we spend with interrupts disabled.
>> Can I have your input on this idea?
>
> Right, all remaining drivers call it from workers.
> Feel free to submit a patch.
>
> --
> Best regards,
> Jacek Anaszewski

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
  2017-05-29  7:29       ` Alexandre Courbot
@ 2017-05-29 19:08         ` Jacek Anaszewski
       [not found]           ` <CGME20170530083004epcas1p22ffe80b6a9442bcd3a6fb0b02381759f@epcas1p2.samsung.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Jacek Anaszewski @ 2017-05-29 19:08 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Alexandre Courbot, Andrzej Pietrasiewicz, linux-media, linux-kernel

Hi Mauro,

This patch seems to have lost somehow. Could you help merging it?

Thanks,
Jacek Anaszewski

On 05/29/2017 09:29 AM, Alexandre Courbot wrote:
> Hi everyone,
> 
> On Thu, Apr 27, 2017 at 4:35 AM, Jacek Anaszewski
> <jacek.anaszewski@gmail.com> wrote:
>> On 04/26/2017 04:54 AM, Alexandre Courbot wrote:
>>> On Wed, Apr 26, 2017 at 4:15 AM, Jacek Anaszewski
>>> <jacek.anaszewski@gmail.com> wrote:
>>>> Hi Alexandre,
>>>>
>>>> Thanks for the patch.
>>>>
>>>> On 04/25/2017 08:19 AM, Alexandre Courbot wrote:
>>>>> v4l2_m2m_job_finish(), which is called from the interrupt handler withHi s
>>>>> slock acquired, can call the device_run() hook immediately if another
>>>>> context was in the queue. This hook also acquires slock, resulting in
>>>>> a deadlock for this scenario.
>>>>>
>>>>> Fix this by releasing slock right before calling v4l2_m2m_job_finish().
>>>>> This is safe to do as the state of the hardware cannot change before
>>>>> v4l2_m2m_job_finish() is called anyway.
>>>>>
>>>>> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
>>>>> ---
>>>>>  drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
>>>>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>>> index 52dc7941db65..223b4379929e 100644
>>>>> --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>>> +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>>> @@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void *dev_id)
>>>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>>>       v4l2_m2m_buf_done(dst_buf, state);
>>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>>
>>>>>       curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
>>>>>       spin_unlock(&jpeg->slock);
>>>>>
>>>>>       s5p_jpeg_clear_int(jpeg->regs);
>>>>>
>>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>>       return IRQ_HANDLED;
>>>>>  }
>>>>>
>>>>> @@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void *priv)
>>>>>               v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
>>>>>       }
>>>>>
>>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>>       if (jpeg->variant->version == SJPEG_EXYNOS4)
>>>>>               curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
>>>>>
>>>>>       spin_unlock(&jpeg->slock);
>>>>> +
>>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>>       return IRQ_HANDLED;
>>>>>  }
>>>>>
>>>>> @@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void *dev_id)
>>>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>>>       v4l2_m2m_buf_done(dst_buf, state);
>>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>>
>>>>>       curr_ctx->subsampling =
>>>>>                       exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
>>>>> +
>>>>> +     spin_unlock(&jpeg->slock);
>>>>> +
>>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>> +     return IRQ_HANDLED;
>>>>> +
>>>>>  exit_unlock:
>>>>>       spin_unlock(&jpeg->slock);
>>>>>       return IRQ_HANDLED;
>>>>>
>>>>
>>>> Acked-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
>>>>
>>>> Just out of curiosity - could you share how you discovered the problem -
>>>> by some static checkers or trying to use the driver?
>>>
>>> We discovered this issue after adding a new unit test for the jpeg
>>> codec in Chromium OS:
>>>
>>> https://bugs.chromium.org/p/chromium/issues/detail?id=705971
>>>
>>> >From what I understand the test spawns different processes that access
>>> the codec device concurrently, creating the situation leading to the
>>> bug.
>>
>> Thanks for the explanation. Nice fix.
> 
> Gentle ping as I am not seeing this patch in the tree yet. Thanks.
> 
>>
>>> On a slightly related note, I was thinking whether it would make sense
>>> to move the call to  v4l2_m2m_job_finish() (and maybe other parts of
>>> the current interrupt handler) into a worker or a threaded interrupt
>>> handler so as to reduce the time we spend with interrupts disabled.
>>> Can I have your input on this idea?
>>
>> Right, all remaining drivers call it from workers.
>> Feel free to submit a patch.
>>
>> --
>> Best regards,
>> Jacek Anaszewski
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
       [not found]           ` <CGME20170530083004epcas1p22ffe80b6a9442bcd3a6fb0b02381759f@epcas1p2.samsung.com>
@ 2017-05-30  8:30             ` Sylwester Nawrocki
  2017-05-30  8:50               ` Alexandre Courbot
  0 siblings, 1 reply; 8+ messages in thread
From: Sylwester Nawrocki @ 2017-05-30  8:30 UTC (permalink / raw)
  To: Jacek Anaszewski
  Cc: Mauro Carvalho Chehab, Alexandre Courbot, Andrzej Pietrasiewicz,
	linux-media, linux-kernel

Hi,

On 05/29/2017 09:08 PM, Jacek Anaszewski wrote:
> This patch seems to have lost somehow. Could you help merging it?

It's not lost, it has been on my todo queue. I have applied it now.

-- 
Thanks,
Sylwester

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition
  2017-05-30  8:30             ` Sylwester Nawrocki
@ 2017-05-30  8:50               ` Alexandre Courbot
  0 siblings, 0 replies; 8+ messages in thread
From: Alexandre Courbot @ 2017-05-30  8:50 UTC (permalink / raw)
  To: Sylwester Nawrocki
  Cc: Jacek Anaszewski, Mauro Carvalho Chehab, Andrzej Pietrasiewicz,
	linux-media, linux-kernel

On Tue, May 30, 2017 at 5:30 PM, Sylwester Nawrocki
<s.nawrocki@samsung.com> wrote:
> Hi,
>
> On 05/29/2017 09:08 PM, Jacek Anaszewski wrote:
>>
>> This patch seems to have lost somehow. Could you help merging it?
>
>
> It's not lost, it has been on my todo queue. I have applied it now.

Awesome, thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-05-30  8:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-25  6:19 [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition Alexandre Courbot
2017-04-25 19:15 ` Jacek Anaszewski
2017-04-26  2:54   ` Alexandre Courbot
2017-04-26 19:35     ` Jacek Anaszewski
2017-05-29  7:29       ` Alexandre Courbot
2017-05-29 19:08         ` Jacek Anaszewski
     [not found]           ` <CGME20170530083004epcas1p22ffe80b6a9442bcd3a6fb0b02381759f@epcas1p2.samsung.com>
2017-05-30  8:30             ` Sylwester Nawrocki
2017-05-30  8:50               ` Alexandre Courbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).