All of lore.kernel.org
 help / color / mirror / Atom feed
* NULL pointer dereference at v4l2_ctrl_request_complete
@ 2021-03-31 16:14 Ezequiel Garcia
  2021-03-31 22:34 ` Ezequiel Garcia
  0 siblings, 1 reply; 8+ messages in thread
From: Ezequiel Garcia @ 2021-03-31 16:14 UTC (permalink / raw)
  To: Hans Verkuil, linux-media

Hi Hans,

We have found this crash in mainline:

[  258.174662] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
[  258.182967] Mem abort info:
[  258.187746]   ESR = 0x96000004
[  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
[  258.199108]   SET = 0, FnV = 0
[  258.203732]   EA = 0, S1PTW = 0
[  258.208389] Data abort info:
[  258.212743]   ISV = 0, ISS = 0x00000004
[  258.218322]   CM = 0, WnR = 0
[  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
[  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
gpu_sched fuse
[  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
     5.12.0-rc2+ #106
[  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
[  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
[  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[  258.283426] pc : __memcpy+0x7c/0x180
[  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
[  258.290585] sp : ffff8000116fbc60
[  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
[  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
[  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
[  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
[  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
[  258.312688] x19: ffff000006812a00 x18: 0000000000000000
[  258.316708] x17: 0000000000000000 x16: 0000000000000000
[  258.320724] x15: 0000000000000000 x14: 0000000000000000
[  258.324744] x13: 0000000000000000 x12: 0000000000000000
[  258.328761] x11: 0000000000000000 x10: 0000000000000000
[  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
[  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
[  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
[  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
[  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
[  258.352864] Call trace:
[  258.354010]  __memcpy+0x7c/0x180
[  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
[  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
[  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
[  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
[  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
[  258.376279]  process_one_work+0x1c0/0x484
[  258.378995]  worker_thread+0x70/0x434
[  258.381364]  kthread+0x158/0x160
[  258.383296]  ret_from_fork+0x10/0x34
[  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
[  258.390377] ---[ end trace fea6ecb96dad642d ]---

This needs several concurrent streams to run at the same time,
it doesn't happen with just one; it was reproduced in rkvdec
and hantro drivers. The above stacktrace is on MPEG-2,
but we have found it on H.264 as well.

I've traced this to:

                v4l2_ctrl_lock(ctrl);
                if (ref->req) {
                        ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);

And:

static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
                       union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
{
        if (ctrl == NULL)
                return;
        memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
                               ^^^^^^^^^^^^
                               This is NULL

I've been staring at this file and trying to understand how this can happen
but haven't been able to make any progress so far...

Any ideas ?

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: NULL pointer dereference at v4l2_ctrl_request_complete
  2021-03-31 16:14 NULL pointer dereference at v4l2_ctrl_request_complete Ezequiel Garcia
@ 2021-03-31 22:34 ` Ezequiel Garcia
  2021-04-01  0:23   ` Alexandre Courbot
  2021-04-01  9:34   ` Hans Verkuil
  0 siblings, 2 replies; 8+ messages in thread
From: Ezequiel Garcia @ 2021-03-31 22:34 UTC (permalink / raw)
  To: Hans Verkuil, linux-media

Hi Hans,

On Wed, 31 Mar 2021 at 13:14, Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hi Hans,
>
> We have found this crash in mainline:
>
> [  258.174662] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000000
> [  258.182967] Mem abort info:
> [  258.187746]   ESR = 0x96000004
> [  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  258.199108]   SET = 0, FnV = 0
> [  258.203732]   EA = 0, S1PTW = 0
> [  258.208389] Data abort info:
> [  258.212743]   ISV = 0, ISS = 0x00000004
> [  258.218322]   CM = 0, WnR = 0
> [  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
> [  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> [  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
> v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
> videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
> gpu_sched fuse
> [  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
>      5.12.0-rc2+ #106
> [  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
> [  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
> [  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> [  258.283426] pc : __memcpy+0x7c/0x180
> [  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
> [  258.290585] sp : ffff8000116fbc60
> [  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
> [  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
> [  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
> [  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
> [  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
> [  258.312688] x19: ffff000006812a00 x18: 0000000000000000
> [  258.316708] x17: 0000000000000000 x16: 0000000000000000
> [  258.320724] x15: 0000000000000000 x14: 0000000000000000
> [  258.324744] x13: 0000000000000000 x12: 0000000000000000
> [  258.328761] x11: 0000000000000000 x10: 0000000000000000
> [  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
> [  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
> [  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
> [  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
> [  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
> [  258.352864] Call trace:
> [  258.354010]  __memcpy+0x7c/0x180
> [  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
> [  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
> [  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
> [  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
> [  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
> [  258.376279]  process_one_work+0x1c0/0x484
> [  258.378995]  worker_thread+0x70/0x434
> [  258.381364]  kthread+0x158/0x160
> [  258.383296]  ret_from_fork+0x10/0x34
> [  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
> [  258.390377] ---[ end trace fea6ecb96dad642d ]---
>
> This needs several concurrent streams to run at the same time,
> it doesn't happen with just one; it was reproduced in rkvdec
> and hantro drivers. The above stacktrace is on MPEG-2,
> but we have found it on H.264 as well.
>
> I've traced this to:
>
>                 v4l2_ctrl_lock(ctrl);
>                 if (ref->req) {
>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
>
> And:
>
> static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
>                        union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
> {
>         if (ctrl == NULL)
>                 return;
>         memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
>                                ^^^^^^^^^^^^
>                                This is NULL
>
> I've been staring at this file and trying to understand how this can happen
> but haven't been able to make any progress so far...
>
> Any ideas ?
>

This issue may be somehow linked to controls that haven't been set,
and thus have their initialization value.

My traces indicate that the controls with NULL values are those
that GStreamer is not setting. I can confirm that by re-testing
with GStreamer setting all the controls always.

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: NULL pointer dereference at v4l2_ctrl_request_complete
  2021-03-31 22:34 ` Ezequiel Garcia
@ 2021-04-01  0:23   ` Alexandre Courbot
  2021-04-01  0:33     ` Alexandre Courbot
  2021-04-01  9:34   ` Hans Verkuil
  1 sibling, 1 reply; 8+ messages in thread
From: Alexandre Courbot @ 2021-04-01  0:23 UTC (permalink / raw)
  To: Ezequiel Garcia; +Cc: Hans Verkuil, linux-media

Hi Ezequiel,

On Thu, Apr 1, 2021 at 7:35 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hi Hans,
>
> On Wed, 31 Mar 2021 at 13:14, Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> >
> > Hi Hans,
> >
> > We have found this crash in mainline:
> >
> > [  258.174662] Unable to handle kernel NULL pointer dereference at
> > virtual address 0000000000000000
> > [  258.182967] Mem abort info:
> > [  258.187746]   ESR = 0x96000004
> > [  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [  258.199108]   SET = 0, FnV = 0
> > [  258.203732]   EA = 0, S1PTW = 0
> > [  258.208389] Data abort info:
> > [  258.212743]   ISV = 0, ISS = 0x00000004
> > [  258.218322]   CM = 0, WnR = 0
> > [  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
> > [  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> > [  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > [  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
> > v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
> > videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
> > gpu_sched fuse
> > [  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
> >      5.12.0-rc2+ #106
> > [  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
> > [  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
> > [  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > [  258.283426] pc : __memcpy+0x7c/0x180
> > [  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
> > [  258.290585] sp : ffff8000116fbc60
> > [  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
> > [  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
> > [  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
> > [  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
> > [  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
> > [  258.312688] x19: ffff000006812a00 x18: 0000000000000000
> > [  258.316708] x17: 0000000000000000 x16: 0000000000000000
> > [  258.320724] x15: 0000000000000000 x14: 0000000000000000
> > [  258.324744] x13: 0000000000000000 x12: 0000000000000000
> > [  258.328761] x11: 0000000000000000 x10: 0000000000000000
> > [  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
> > [  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
> > [  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
> > [  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
> > [  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
> > [  258.352864] Call trace:
> > [  258.354010]  __memcpy+0x7c/0x180
> > [  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
> > [  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
> > [  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
> > [  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
> > [  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
> > [  258.376279]  process_one_work+0x1c0/0x484
> > [  258.378995]  worker_thread+0x70/0x434
> > [  258.381364]  kthread+0x158/0x160
> > [  258.383296]  ret_from_fork+0x10/0x34
> > [  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
> > [  258.390377] ---[ end trace fea6ecb96dad642d ]---
> >
> > This needs several concurrent streams to run at the same time,
> > it doesn't happen with just one; it was reproduced in rkvdec
> > and hantro drivers. The above stacktrace is on MPEG-2,
> > but we have found it on H.264 as well.
> >
> > I've traced this to:
> >
> >                 v4l2_ctrl_lock(ctrl);
> >                 if (ref->req) {
> >                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
> >
> > And:
> >
> > static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
> >                        union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
> > {
> >         if (ctrl == NULL)
> >                 return;
> >         memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
> >                                ^^^^^^^^^^^^
> >                                This is NULL
> >
> > I've been staring at this file and trying to understand how this can happen
> > but haven't been able to make any progress so far...
> >
> > Any ideas ?
> >
>
> This issue may be somehow linked to controls that haven't been set,
> and thus have their initialization value.
>
> My traces indicate that the controls with NULL values are those
> that GStreamer is not setting. I can confirm that by re-testing
> with GStreamer setting all the controls always.

If I am not mixing up issues then the following patch should fix this:

https://patchwork.linuxtv.org/project/linux-media/patch/ce92a438-4708-74a0-8fb4-c8fb6495e6b8@xs4all.nl/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: NULL pointer dereference at v4l2_ctrl_request_complete
  2021-04-01  0:23   ` Alexandre Courbot
@ 2021-04-01  0:33     ` Alexandre Courbot
  0 siblings, 0 replies; 8+ messages in thread
From: Alexandre Courbot @ 2021-04-01  0:33 UTC (permalink / raw)
  To: Ezequiel Garcia; +Cc: Hans Verkuil, linux-media

On Thu, Apr 1, 2021 at 9:23 AM Alexandre Courbot <acourbot@chromium.org> wrote:
>
> Hi Ezequiel,
>
> On Thu, Apr 1, 2021 at 7:35 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> >
> > Hi Hans,
> >
> > On Wed, 31 Mar 2021 at 13:14, Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> > >
> > > Hi Hans,
> > >
> > > We have found this crash in mainline:
> > >
> > > [  258.174662] Unable to handle kernel NULL pointer dereference at
> > > virtual address 0000000000000000
> > > [  258.182967] Mem abort info:
> > > [  258.187746]   ESR = 0x96000004
> > > [  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > [  258.199108]   SET = 0, FnV = 0
> > > [  258.203732]   EA = 0, S1PTW = 0
> > > [  258.208389] Data abort info:
> > > [  258.212743]   ISV = 0, ISS = 0x00000004
> > > [  258.218322]   CM = 0, WnR = 0
> > > [  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
> > > [  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> > > [  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > > [  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
> > > v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
> > > videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
> > > gpu_sched fuse
> > > [  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
> > >      5.12.0-rc2+ #106
> > > [  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
> > > [  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
> > > [  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > [  258.283426] pc : __memcpy+0x7c/0x180
> > > [  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
> > > [  258.290585] sp : ffff8000116fbc60
> > > [  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
> > > [  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
> > > [  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
> > > [  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
> > > [  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
> > > [  258.312688] x19: ffff000006812a00 x18: 0000000000000000
> > > [  258.316708] x17: 0000000000000000 x16: 0000000000000000
> > > [  258.320724] x15: 0000000000000000 x14: 0000000000000000
> > > [  258.324744] x13: 0000000000000000 x12: 0000000000000000
> > > [  258.328761] x11: 0000000000000000 x10: 0000000000000000
> > > [  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
> > > [  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
> > > [  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
> > > [  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
> > > [  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
> > > [  258.352864] Call trace:
> > > [  258.354010]  __memcpy+0x7c/0x180
> > > [  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
> > > [  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
> > > [  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
> > > [  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
> > > [  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
> > > [  258.376279]  process_one_work+0x1c0/0x484
> > > [  258.378995]  worker_thread+0x70/0x434
> > > [  258.381364]  kthread+0x158/0x160
> > > [  258.383296]  ret_from_fork+0x10/0x34
> > > [  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
> > > [  258.390377] ---[ end trace fea6ecb96dad642d ]---
> > >
> > > This needs several concurrent streams to run at the same time,
> > > it doesn't happen with just one; it was reproduced in rkvdec
> > > and hantro drivers. The above stacktrace is on MPEG-2,
> > > but we have found it on H.264 as well.
> > >
> > > I've traced this to:
> > >
> > >                 v4l2_ctrl_lock(ctrl);
> > >                 if (ref->req) {
> > >                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
> > >
> > > And:
> > >
> > > static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
> > >                        union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
> > > {
> > >         if (ctrl == NULL)
> > >                 return;
> > >         memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
> > >                                ^^^^^^^^^^^^
> > >                                This is NULL
> > >
> > > I've been staring at this file and trying to understand how this can happen
> > > but haven't been able to make any progress so far...
> > >
> > > Any ideas ?
> > >
> >
> > This issue may be somehow linked to controls that haven't been set,
> > and thus have their initialization value.
> >
> > My traces indicate that the controls with NULL values are those
> > that GStreamer is not setting. I can confirm that by re-testing
> > with GStreamer setting all the controls always.
>
> If I am not mixing up issues then the following patch should fix this:
>
> https://patchwork.linuxtv.org/project/linux-media/patch/ce92a438-4708-74a0-8fb4-c8fb6495e6b8@xs4all.nl/

... or maybe not. I just ran into the same crash after applying it
locally, so guess I was just lucky for a short while. :/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: NULL pointer dereference at v4l2_ctrl_request_complete
  2021-03-31 22:34 ` Ezequiel Garcia
  2021-04-01  0:23   ` Alexandre Courbot
@ 2021-04-01  9:34   ` Hans Verkuil
  2021-04-12  8:28     ` Alexandre Courbot
  1 sibling, 1 reply; 8+ messages in thread
From: Hans Verkuil @ 2021-04-01  9:34 UTC (permalink / raw)
  To: Ezequiel Garcia, linux-media

On 01/04/2021 00:34, Ezequiel Garcia wrote:
> Hi Hans,
> 
> On Wed, 31 Mar 2021 at 13:14, Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
>>
>> Hi Hans,
>>
>> We have found this crash in mainline:
>>
>> [  258.174662] Unable to handle kernel NULL pointer dereference at
>> virtual address 0000000000000000
>> [  258.182967] Mem abort info:
>> [  258.187746]   ESR = 0x96000004
>> [  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [  258.199108]   SET = 0, FnV = 0
>> [  258.203732]   EA = 0, S1PTW = 0
>> [  258.208389] Data abort info:
>> [  258.212743]   ISV = 0, ISS = 0x00000004
>> [  258.218322]   CM = 0, WnR = 0
>> [  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
>> [  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
>> [  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>> [  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
>> v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
>> videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
>> gpu_sched fuse
>> [  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
>>      5.12.0-rc2+ #106
>> [  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
>> [  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
>> [  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>> [  258.283426] pc : __memcpy+0x7c/0x180
>> [  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
>> [  258.290585] sp : ffff8000116fbc60
>> [  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
>> [  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
>> [  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
>> [  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
>> [  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
>> [  258.312688] x19: ffff000006812a00 x18: 0000000000000000
>> [  258.316708] x17: 0000000000000000 x16: 0000000000000000
>> [  258.320724] x15: 0000000000000000 x14: 0000000000000000
>> [  258.324744] x13: 0000000000000000 x12: 0000000000000000
>> [  258.328761] x11: 0000000000000000 x10: 0000000000000000
>> [  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
>> [  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
>> [  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
>> [  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
>> [  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
>> [  258.352864] Call trace:
>> [  258.354010]  __memcpy+0x7c/0x180
>> [  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
>> [  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
>> [  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
>> [  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
>> [  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
>> [  258.376279]  process_one_work+0x1c0/0x484
>> [  258.378995]  worker_thread+0x70/0x434
>> [  258.381364]  kthread+0x158/0x160
>> [  258.383296]  ret_from_fork+0x10/0x34
>> [  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
>> [  258.390377] ---[ end trace fea6ecb96dad642d ]---
>>
>> This needs several concurrent streams to run at the same time,
>> it doesn't happen with just one; it was reproduced in rkvdec
>> and hantro drivers. The above stacktrace is on MPEG-2,
>> but we have found it on H.264 as well.
>>
>> I've traced this to:
>>
>>                 v4l2_ctrl_lock(ctrl);
>>                 if (ref->req) {
>>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
>>
>> And:
>>
>> static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
>>                        union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
>> {
>>         if (ctrl == NULL)
>>                 return;
>>         memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
>>                                ^^^^^^^^^^^^
>>                                This is NULL
>>
>> I've been staring at this file and trying to understand how this can happen
>> but haven't been able to make any progress so far...
>>
>> Any ideas ?

Can you test with this patch:

---------------------------------------
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index 757d215c2be4..19c56cb9f86a 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -2703,8 +2703,11 @@ static int handler_new_ref(struct v4l2_ctrl_handler *hdl,
 	if (hdl->error)
 		return hdl->error;

-	if (allocate_req)
+	if (allocate_req) {
 		size_extra_req = ctrl->elems * ctrl->elem_size;
+		if (!size_extra_req)
+			pr_info("empty size_extra_req %x %s\n", ctrl->id, ctrl->name);
+	}
 	new_ref = kzalloc(sizeof(*new_ref) + size_extra_req, GFP_KERNEL);
 	if (!new_ref)
 		return handler_set_err(hdl, -ENOMEM);
@@ -4610,6 +4613,8 @@ void v4l2_ctrl_request_complete(struct media_request *req,

 		v4l2_ctrl_lock(ctrl);
 		if (ref->req) {
+			if (!ref->req->p_req.p)
+				pr_info("v4l2_ctrl_request_complete %x %s\n", ctrl->id, ctrl->name);
 			ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
 		} else {
 			ptr_to_ptr(ctrl, ctrl->p_cur, ref->p_req);
---------------------------------------

I want to know if there are controls where size_extra_req == 0 (meaning
that p_req.p is also NULL), and which control encounters the NULL p_req.p.

Regards,

	Hans

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: NULL pointer dereference at v4l2_ctrl_request_complete
  2021-04-01  9:34   ` Hans Verkuil
@ 2021-04-12  8:28     ` Alexandre Courbot
  2021-04-12  8:45       ` Hans Verkuil
  0 siblings, 1 reply; 8+ messages in thread
From: Alexandre Courbot @ 2021-04-12  8:28 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: Ezequiel Garcia, linux-media

Hi Hans,

I have been able to reproduce the issue with the stateless part of the
mtk-vcodec decoder driver (not merged as of today), which shows the
exact same crash.

With your patch applied I got the following output right before the
crash, when decoding H.264 or VP9:

H.264:
[  549.264784] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
[  976.630017] v4l2-ctrls: v4l2_ctrl_request_complete 980001 User Controls
[  125.152149] v4l2-ctrls: v4l2_ctrl_request_complete 980001 User Controls
[  330.049192] v4l2-ctrls: v4l2_ctrl_request_complete 990001 Codec Controls

VP9:
[  215.945812] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
[  428.564501] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
[  830.970340] v4l2-ctrls: v4l2_ctrl_request_complete 990b00 VP9 Profile

The other pr_info() in your patch was never seen.

Interestingly these controls are not set by requests, and are not even
supposed to be set on a decoder. I am not quite sure where this could
come from...

Cheers,
Alex.

On Thu, Apr 1, 2021 at 6:36 PM Hans Verkuil <hverkuil-cisco@xs4all.nl> wrote:
>
> On 01/04/2021 00:34, Ezequiel Garcia wrote:
> > Hi Hans,
> >
> > On Wed, 31 Mar 2021 at 13:14, Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> >>
> >> Hi Hans,
> >>
> >> We have found this crash in mainline:
> >>
> >> [  258.174662] Unable to handle kernel NULL pointer dereference at
> >> virtual address 0000000000000000
> >> [  258.182967] Mem abort info:
> >> [  258.187746]   ESR = 0x96000004
> >> [  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
> >> [  258.199108]   SET = 0, FnV = 0
> >> [  258.203732]   EA = 0, S1PTW = 0
> >> [  258.208389] Data abort info:
> >> [  258.212743]   ISV = 0, ISS = 0x00000004
> >> [  258.218322]   CM = 0, WnR = 0
> >> [  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
> >> [  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> >> [  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> >> [  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
> >> v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
> >> videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
> >> gpu_sched fuse
> >> [  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
> >>      5.12.0-rc2+ #106
> >> [  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
> >> [  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
> >> [  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> >> [  258.283426] pc : __memcpy+0x7c/0x180
> >> [  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
> >> [  258.290585] sp : ffff8000116fbc60
> >> [  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
> >> [  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
> >> [  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
> >> [  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
> >> [  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
> >> [  258.312688] x19: ffff000006812a00 x18: 0000000000000000
> >> [  258.316708] x17: 0000000000000000 x16: 0000000000000000
> >> [  258.320724] x15: 0000000000000000 x14: 0000000000000000
> >> [  258.324744] x13: 0000000000000000 x12: 0000000000000000
> >> [  258.328761] x11: 0000000000000000 x10: 0000000000000000
> >> [  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
> >> [  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
> >> [  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
> >> [  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
> >> [  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
> >> [  258.352864] Call trace:
> >> [  258.354010]  __memcpy+0x7c/0x180
> >> [  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
> >> [  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
> >> [  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
> >> [  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
> >> [  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
> >> [  258.376279]  process_one_work+0x1c0/0x484
> >> [  258.378995]  worker_thread+0x70/0x434
> >> [  258.381364]  kthread+0x158/0x160
> >> [  258.383296]  ret_from_fork+0x10/0x34
> >> [  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
> >> [  258.390377] ---[ end trace fea6ecb96dad642d ]---
> >>
> >> This needs several concurrent streams to run at the same time,
> >> it doesn't happen with just one; it was reproduced in rkvdec
> >> and hantro drivers. The above stacktrace is on MPEG-2,
> >> but we have found it on H.264 as well.
> >>
> >> I've traced this to:
> >>
> >>                 v4l2_ctrl_lock(ctrl);
> >>                 if (ref->req) {
> >>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
> >>
> >> And:
> >>
> >> static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
> >>                        union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
> >> {
> >>         if (ctrl == NULL)
> >>                 return;
> >>         memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
> >>                                ^^^^^^^^^^^^
> >>                                This is NULL
> >>
> >> I've been staring at this file and trying to understand how this can happen
> >> but haven't been able to make any progress so far...
> >>
> >> Any ideas ?
>
> Can you test with this patch:
>
> ---------------------------------------
> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
> index 757d215c2be4..19c56cb9f86a 100644
> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
> @@ -2703,8 +2703,11 @@ static int handler_new_ref(struct v4l2_ctrl_handler *hdl,
>         if (hdl->error)
>                 return hdl->error;
>
> -       if (allocate_req)
> +       if (allocate_req) {
>                 size_extra_req = ctrl->elems * ctrl->elem_size;
> +               if (!size_extra_req)
> +                       pr_info("empty size_extra_req %x %s\n", ctrl->id, ctrl->name);
> +       }
>         new_ref = kzalloc(sizeof(*new_ref) + size_extra_req, GFP_KERNEL);
>         if (!new_ref)
>                 return handler_set_err(hdl, -ENOMEM);
> @@ -4610,6 +4613,8 @@ void v4l2_ctrl_request_complete(struct media_request *req,
>
>                 v4l2_ctrl_lock(ctrl);
>                 if (ref->req) {
> +                       if (!ref->req->p_req.p)
> +                               pr_info("v4l2_ctrl_request_complete %x %s\n", ctrl->id, ctrl->name);
>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
>                 } else {
>                         ptr_to_ptr(ctrl, ctrl->p_cur, ref->p_req);
> ---------------------------------------
>
> I want to know if there are controls where size_extra_req == 0 (meaning
> that p_req.p is also NULL), and which control encounters the NULL p_req.p.
>
> Regards,
>
>         Hans

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: NULL pointer dereference at v4l2_ctrl_request_complete
  2021-04-12  8:28     ` Alexandre Courbot
@ 2021-04-12  8:45       ` Hans Verkuil
  2021-04-12  8:55         ` Alexandre Courbot
  0 siblings, 1 reply; 8+ messages in thread
From: Hans Verkuil @ 2021-04-12  8:45 UTC (permalink / raw)
  To: Alexandre Courbot; +Cc: Ezequiel Garcia, linux-media

Hi Alexandre,

On 12/04/2021 10:28, Alexandre Courbot wrote:
> Hi Hans,
> 
> I have been able to reproduce the issue with the stateless part of the
> mtk-vcodec decoder driver (not merged as of today), which shows the
> exact same crash.
> 
> With your patch applied I got the following output right before the
> crash, when decoding H.264 or VP9:
> 
> H.264:
> [  549.264784] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
> [  976.630017] v4l2-ctrls: v4l2_ctrl_request_complete 980001 User Controls
> [  125.152149] v4l2-ctrls: v4l2_ctrl_request_complete 980001 User Controls
> [  330.049192] v4l2-ctrls: v4l2_ctrl_request_complete 990001 Codec Controls
> 
> VP9:
> [  215.945812] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
> [  428.564501] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
> [  830.970340] v4l2-ctrls: v4l2_ctrl_request_complete 990b00 VP9 Profile
> 
> The other pr_info() in your patch was never seen.
> 
> Interestingly these controls are not set by requests, and are not even
> supposed to be set on a decoder. I am not quite sure where this could
> come from...

I found the issue and Ezequiel successfully tested a patch I mailed him privately.

The patch needs a bit of cleanup, but I plan to post it today.

Regards,

	Hans

> 
> Cheers,
> Alex.
> 
> On Thu, Apr 1, 2021 at 6:36 PM Hans Verkuil <hverkuil-cisco@xs4all.nl> wrote:
>>
>> On 01/04/2021 00:34, Ezequiel Garcia wrote:
>>> Hi Hans,
>>>
>>> On Wed, 31 Mar 2021 at 13:14, Ezequiel Garcia
>>> <ezequiel@vanguardiasur.com.ar> wrote:
>>>>
>>>> Hi Hans,
>>>>
>>>> We have found this crash in mainline:
>>>>
>>>> [  258.174662] Unable to handle kernel NULL pointer dereference at
>>>> virtual address 0000000000000000
>>>> [  258.182967] Mem abort info:
>>>> [  258.187746]   ESR = 0x96000004
>>>> [  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
>>>> [  258.199108]   SET = 0, FnV = 0
>>>> [  258.203732]   EA = 0, S1PTW = 0
>>>> [  258.208389] Data abort info:
>>>> [  258.212743]   ISV = 0, ISS = 0x00000004
>>>> [  258.218322]   CM = 0, WnR = 0
>>>> [  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
>>>> [  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
>>>> [  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>>> [  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
>>>> v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
>>>> videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
>>>> gpu_sched fuse
>>>> [  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
>>>>      5.12.0-rc2+ #106
>>>> [  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
>>>> [  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
>>>> [  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>>> [  258.283426] pc : __memcpy+0x7c/0x180
>>>> [  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
>>>> [  258.290585] sp : ffff8000116fbc60
>>>> [  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
>>>> [  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
>>>> [  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
>>>> [  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
>>>> [  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
>>>> [  258.312688] x19: ffff000006812a00 x18: 0000000000000000
>>>> [  258.316708] x17: 0000000000000000 x16: 0000000000000000
>>>> [  258.320724] x15: 0000000000000000 x14: 0000000000000000
>>>> [  258.324744] x13: 0000000000000000 x12: 0000000000000000
>>>> [  258.328761] x11: 0000000000000000 x10: 0000000000000000
>>>> [  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
>>>> [  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
>>>> [  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
>>>> [  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
>>>> [  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
>>>> [  258.352864] Call trace:
>>>> [  258.354010]  __memcpy+0x7c/0x180
>>>> [  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
>>>> [  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
>>>> [  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
>>>> [  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
>>>> [  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
>>>> [  258.376279]  process_one_work+0x1c0/0x484
>>>> [  258.378995]  worker_thread+0x70/0x434
>>>> [  258.381364]  kthread+0x158/0x160
>>>> [  258.383296]  ret_from_fork+0x10/0x34
>>>> [  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
>>>> [  258.390377] ---[ end trace fea6ecb96dad642d ]---
>>>>
>>>> This needs several concurrent streams to run at the same time,
>>>> it doesn't happen with just one; it was reproduced in rkvdec
>>>> and hantro drivers. The above stacktrace is on MPEG-2,
>>>> but we have found it on H.264 as well.
>>>>
>>>> I've traced this to:
>>>>
>>>>                 v4l2_ctrl_lock(ctrl);
>>>>                 if (ref->req) {
>>>>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
>>>>
>>>> And:
>>>>
>>>> static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
>>>>                        union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
>>>> {
>>>>         if (ctrl == NULL)
>>>>                 return;
>>>>         memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
>>>>                                ^^^^^^^^^^^^
>>>>                                This is NULL
>>>>
>>>> I've been staring at this file and trying to understand how this can happen
>>>> but haven't been able to make any progress so far...
>>>>
>>>> Any ideas ?
>>
>> Can you test with this patch:
>>
>> ---------------------------------------
>> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
>> index 757d215c2be4..19c56cb9f86a 100644
>> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
>> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
>> @@ -2703,8 +2703,11 @@ static int handler_new_ref(struct v4l2_ctrl_handler *hdl,
>>         if (hdl->error)
>>                 return hdl->error;
>>
>> -       if (allocate_req)
>> +       if (allocate_req) {
>>                 size_extra_req = ctrl->elems * ctrl->elem_size;
>> +               if (!size_extra_req)
>> +                       pr_info("empty size_extra_req %x %s\n", ctrl->id, ctrl->name);
>> +       }
>>         new_ref = kzalloc(sizeof(*new_ref) + size_extra_req, GFP_KERNEL);
>>         if (!new_ref)
>>                 return handler_set_err(hdl, -ENOMEM);
>> @@ -4610,6 +4613,8 @@ void v4l2_ctrl_request_complete(struct media_request *req,
>>
>>                 v4l2_ctrl_lock(ctrl);
>>                 if (ref->req) {
>> +                       if (!ref->req->p_req.p)
>> +                               pr_info("v4l2_ctrl_request_complete %x %s\n", ctrl->id, ctrl->name);
>>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
>>                 } else {
>>                         ptr_to_ptr(ctrl, ctrl->p_cur, ref->p_req);
>> ---------------------------------------
>>
>> I want to know if there are controls where size_extra_req == 0 (meaning
>> that p_req.p is also NULL), and which control encounters the NULL p_req.p.
>>
>> Regards,
>>
>>         Hans


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: NULL pointer dereference at v4l2_ctrl_request_complete
  2021-04-12  8:45       ` Hans Verkuil
@ 2021-04-12  8:55         ` Alexandre Courbot
  0 siblings, 0 replies; 8+ messages in thread
From: Alexandre Courbot @ 2021-04-12  8:55 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: Ezequiel Garcia, linux-media

Hi Hans,

On Mon, Apr 12, 2021 at 5:45 PM Hans Verkuil <hverkuil-cisco@xs4all.nl> wrote:
>
> Hi Alexandre,
>
> On 12/04/2021 10:28, Alexandre Courbot wrote:
> > Hi Hans,
> >
> > I have been able to reproduce the issue with the stateless part of the
> > mtk-vcodec decoder driver (not merged as of today), which shows the
> > exact same crash.
> >
> > With your patch applied I got the following output right before the
> > crash, when decoding H.264 or VP9:
> >
> > H.264:
> > [  549.264784] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
> > [  976.630017] v4l2-ctrls: v4l2_ctrl_request_complete 980001 User Controls
> > [  125.152149] v4l2-ctrls: v4l2_ctrl_request_complete 980001 User Controls
> > [  330.049192] v4l2-ctrls: v4l2_ctrl_request_complete 990001 Codec Controls
> >
> > VP9:
> > [  215.945812] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
> > [  428.564501] v4l2-ctrls: v4l2_ctrl_request_complete 990a6b H264 Profile
> > [  830.970340] v4l2-ctrls: v4l2_ctrl_request_complete 990b00 VP9 Profile
> >
> > The other pr_info() in your patch was never seen.
> >
> > Interestingly these controls are not set by requests, and are not even
> > supposed to be set on a decoder. I am not quite sure where this could
> > come from...
>
> I found the issue and Ezequiel successfully tested a patch I mailed him privately.
>
> The patch needs a bit of cleanup, but I plan to post it today.

Oh, that's good news! Will be happy to test it then.

Cheers,
Alex.

>
> Regards,
>
>         Hans
>
> >
> > Cheers,
> > Alex.
> >
> > On Thu, Apr 1, 2021 at 6:36 PM Hans Verkuil <hverkuil-cisco@xs4all.nl> wrote:
> >>
> >> On 01/04/2021 00:34, Ezequiel Garcia wrote:
> >>> Hi Hans,
> >>>
> >>> On Wed, 31 Mar 2021 at 13:14, Ezequiel Garcia
> >>> <ezequiel@vanguardiasur.com.ar> wrote:
> >>>>
> >>>> Hi Hans,
> >>>>
> >>>> We have found this crash in mainline:
> >>>>
> >>>> [  258.174662] Unable to handle kernel NULL pointer dereference at
> >>>> virtual address 0000000000000000
> >>>> [  258.182967] Mem abort info:
> >>>> [  258.187746]   ESR = 0x96000004
> >>>> [  258.192100]   EC = 0x25: DABT (current EL), IL = 32 bits
> >>>> [  258.199108]   SET = 0, FnV = 0
> >>>> [  258.203732]   EA = 0, S1PTW = 0
> >>>> [  258.208389] Data abort info:
> >>>> [  258.212743]   ISV = 0, ISS = 0x00000004
> >>>> [  258.218322]   CM = 0, WnR = 0
> >>>> [  258.222760] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000444b0000
> >>>> [  258.230822] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> >>>> [  258.239097] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> >>>> [  258.246162] Modules linked in: hantro_vpu(C) videobuf2_vmalloc
> >>>> v4l2_h264 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem
> >>>> videobuf2_v4l2 videobuf2_common videodev mc etnaviv fsl_imx8_ddr_perf
> >>>> gpu_sched fuse
> >>>> [  258.263431] CPU: 0 PID: 34 Comm: kworker/0:1 Tainted: G         C
> >>>>      5.12.0-rc2+ #106
> >>>> [  258.270312] Hardware name: NXP i.MX8MQ EVK (DT)
> >>>> [  258.273542] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
> >>>> [  258.278716] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> >>>> [  258.283426] pc : __memcpy+0x7c/0x180
> >>>> [  258.285709] lr : v4l2_ctrl_request_complete+0x7c/0x200 [videodev]
> >>>> [  258.290585] sp : ffff8000116fbc60
> >>>> [  258.292598] x29: ffff8000116fbc60 x28: ffff80001142b780
> >>>> [  258.296617] x27: ffff80001127bc80 x26: 00000000e9b00000
> >>>> [  258.300637] x25: 00000000e9a00000 x24: ffff0000067e2500
> >>>> [  258.304653] x23: ffff0000067e2538 x22: ffff00000387d238
> >>>> [  258.308671] x21: ffff0000067e2598 x20: ffff000004644e00
> >>>> [  258.312688] x19: ffff000006812a00 x18: 0000000000000000
> >>>> [  258.316708] x17: 0000000000000000 x16: 0000000000000000
> >>>> [  258.320724] x15: 0000000000000000 x14: 0000000000000000
> >>>> [  258.324744] x13: 0000000000000000 x12: 0000000000000000
> >>>> [  258.328761] x11: 0000000000000000 x10: 0000000000000000
> >>>> [  258.332776] x9 : 0000000000000000 x8 : 0000000000000000
> >>>> [  258.336793] x7 : 0000000000000000 x6 : ffff000006812a40
> >>>> [  258.340811] x5 : 0000000000000006 x4 : 0000000000000000
> >>>> [  258.344826] x3 : 0000000000000010 x2 : 0000000000000010
> >>>> [  258.348845] x1 : 0000000000000000 x0 : ffff000006812a40
> >>>> [  258.352864] Call trace:
> >>>> [  258.354010]  __memcpy+0x7c/0x180
> >>>> [  258.355941]  hantro_end_prepare_run+0x2c/0x60 [hantro_vpu]
> >>>> [  258.360150]  hantro_g1_mpeg2_dec_run+0x3b0/0x7f0 [hantro_vpu]
> >>>> [  258.364615]  device_run+0xa8/0xbc [hantro_vpu]
> >>>> [  258.367777]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
> >>>> [  258.371723]  v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
> >>>> [  258.376279]  process_one_work+0x1c0/0x484
> >>>> [  258.378995]  worker_thread+0x70/0x434
> >>>> [  258.381364]  kthread+0x158/0x160
> >>>> [  258.383296]  ret_from_fork+0x10/0x34
> >>>> [  258.385582] Code: a8c12027 a88120c7 a8c12027 a88120c7 (a8c12027)
> >>>> [  258.390377] ---[ end trace fea6ecb96dad642d ]---
> >>>>
> >>>> This needs several concurrent streams to run at the same time,
> >>>> it doesn't happen with just one; it was reproduced in rkvdec
> >>>> and hantro drivers. The above stacktrace is on MPEG-2,
> >>>> but we have found it on H.264 as well.
> >>>>
> >>>> I've traced this to:
> >>>>
> >>>>                 v4l2_ctrl_lock(ctrl);
> >>>>                 if (ref->req) {
> >>>>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
> >>>>
> >>>> And:
> >>>>
> >>>> static void ptr_to_ptr(struct v4l2_ctrl *ctrl,
> >>>>                        union v4l2_ctrl_ptr from, union v4l2_ctrl_ptr to)
> >>>> {
> >>>>         if (ctrl == NULL)
> >>>>                 return;
> >>>>         memcpy(to.p, from.p_const, ctrl->elems * ctrl->elem_size);
> >>>>                                ^^^^^^^^^^^^
> >>>>                                This is NULL
> >>>>
> >>>> I've been staring at this file and trying to understand how this can happen
> >>>> but haven't been able to make any progress so far...
> >>>>
> >>>> Any ideas ?
> >>
> >> Can you test with this patch:
> >>
> >> ---------------------------------------
> >> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
> >> index 757d215c2be4..19c56cb9f86a 100644
> >> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
> >> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
> >> @@ -2703,8 +2703,11 @@ static int handler_new_ref(struct v4l2_ctrl_handler *hdl,
> >>         if (hdl->error)
> >>                 return hdl->error;
> >>
> >> -       if (allocate_req)
> >> +       if (allocate_req) {
> >>                 size_extra_req = ctrl->elems * ctrl->elem_size;
> >> +               if (!size_extra_req)
> >> +                       pr_info("empty size_extra_req %x %s\n", ctrl->id, ctrl->name);
> >> +       }
> >>         new_ref = kzalloc(sizeof(*new_ref) + size_extra_req, GFP_KERNEL);
> >>         if (!new_ref)
> >>                 return handler_set_err(hdl, -ENOMEM);
> >> @@ -4610,6 +4613,8 @@ void v4l2_ctrl_request_complete(struct media_request *req,
> >>
> >>                 v4l2_ctrl_lock(ctrl);
> >>                 if (ref->req) {
> >> +                       if (!ref->req->p_req.p)
> >> +                               pr_info("v4l2_ctrl_request_complete %x %s\n", ctrl->id, ctrl->name);
> >>                         ptr_to_ptr(ctrl, ref->req->p_req, ref->p_req);
> >>                 } else {
> >>                         ptr_to_ptr(ctrl, ctrl->p_cur, ref->p_req);
> >> ---------------------------------------
> >>
> >> I want to know if there are controls where size_extra_req == 0 (meaning
> >> that p_req.p is also NULL), and which control encounters the NULL p_req.p.
> >>
> >> Regards,
> >>
> >>         Hans
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-04-12  9:01 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-31 16:14 NULL pointer dereference at v4l2_ctrl_request_complete Ezequiel Garcia
2021-03-31 22:34 ` Ezequiel Garcia
2021-04-01  0:23   ` Alexandre Courbot
2021-04-01  0:33     ` Alexandre Courbot
2021-04-01  9:34   ` Hans Verkuil
2021-04-12  8:28     ` Alexandre Courbot
2021-04-12  8:45       ` Hans Verkuil
2021-04-12  8:55         ` Alexandre Courbot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.