dri-devel.lists.freedesktop.org archive mirror
* Re: [Intel-gfx] [PATCH v1] drm/i915: fix null pointer dereference
From: Łukasz Bartosik @ 2022-08-23  7:47 UTC
  To: linux-kernel, dri-devel, Nathan Chancellor, keescook
  Cc: Tvrtko Ursulin, llvm, upstream, intel-gfx, Rodrigo Vivi

>
> Hi all,
>
> Apologies in advance if you see this twice. I did not see the original
> make it to either lore.kernel.org or the freedesktop.org archives so I
> figured it might have been sent into the void.
>
> On Tue, Feb 01, 2022 at 04:33:54PM +0100, Lukasz Bartosik wrote:
> > From: Łukasz Bartosik <lb@semihalf.com>
> >
> > Asus Chromebook CX550 crashes during boot on the v5.17-rc1 kernel.
> > The root cause is a NULL pointer dereference of bi_next
> > in tgl_get_bw_info() in drivers/gpu/drm/i915/display/intel_bw.c.
> >
> > BUG: kernel NULL pointer dereference, address: 000000000000002e
> > PGD 0 P4D 0
> > Oops: 0002 [#1] PREEMPT SMP NOPTI
> > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G     U            5.17.0-rc1
> > Hardware name: Google Delbin/Delbin, BIOS Google_Delbin.13672.156.3 05/14/2021
> > RIP: 0010:tgl_get_bw_info+0x2de/0x510
> > ...
> > [    2.554467] Call Trace:
> > [    2.554467]  <TASK>
> > [    2.554467]  intel_bw_init_hw+0x14a/0x434
> > [    2.554467]  ? _printk+0x59/0x73
> > [    2.554467]  ? _dev_err+0x77/0x91
> > [    2.554467]  i915_driver_hw_probe+0x329/0x33e
> > [    2.554467]  i915_driver_probe+0x4c8/0x638
> > [    2.554467]  i915_pci_probe+0xf8/0x14e
> > [    2.554467]  ? _raw_spin_unlock_irqrestore+0x12/0x2c
> > [    2.554467]  pci_device_probe+0xaa/0x142
> > [    2.554467]  really_probe+0x13f/0x2f4
> > [    2.554467]  __driver_probe_device+0x9e/0xd3
> > [    2.554467]  driver_probe_device+0x24/0x7c
> > [    2.554467]  __driver_attach+0xba/0xcf
> > [    2.554467]  ? driver_attach+0x1f/0x1f
> > [    2.554467]  bus_for_each_dev+0x8c/0xc0
> > [    2.554467]  bus_add_driver+0x11b/0x1f7
> > [    2.554467]  driver_register+0x60/0xea
> > [    2.554467]  ? mipi_dsi_bus_init+0x16/0x16
> > [    2.554467]  i915_init+0x2c/0xb9
> > [    2.554467]  ? mipi_dsi_bus_init+0x16/0x16
> > [    2.554467]  do_one_initcall+0x12e/0x2b3
> > [    2.554467]  do_initcall_level+0xd6/0xf3
> > [    2.554467]  do_initcalls+0x4e/0x79
> > [    2.554467]  kernel_init_freeable+0xed/0x14d
> > [    2.554467]  ? rest_init+0xc1/0xc1
> > [    2.554467]  kernel_init+0x1a/0x120
> > [    2.554467]  ret_from_fork+0x1f/0x30
> > [    2.554467]  </TASK>
> > ...
> > Kernel panic - not syncing: Fatal exception
> >
> > Fixes: c64a9a7c05be ("drm/i915: Update memory bandwidth formulae")
> > Signed-off-by: Łukasz Bartosik <lb@semihalf.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_bw.c | 16 +++++++++-------
> >  1 file changed, 9 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
> > index 2da4aacc956b..bd0ed68b7faa 100644
> > --- a/drivers/gpu/drm/i915/display/intel_bw.c
> > +++ b/drivers/gpu/drm/i915/display/intel_bw.c
> > @@ -404,15 +404,17 @@ static int tgl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel
> >               int clpchgroup;
> >               int j;
> >
> > -             if (i < num_groups - 1)
> > -                     bi_next = &dev_priv->max_bw[i + 1];
> > -
> >               clpchgroup = (sa->deburst * qi.deinterleave / num_channels) << i;
> >
> > -             if (i < num_groups - 1 && clpchgroup < clperchgroup)
> > -                     bi_next->num_planes = (ipqdepth - clpchgroup) / clpchgroup + 1;
> > -             else
> > -                     bi_next->num_planes = 0;
> > +             if (i < num_groups - 1) {
> > +                     bi_next = &dev_priv->max_bw[i + 1];
> > +
> > +                     if (clpchgroup < clperchgroup)
> > +                             bi_next->num_planes = (ipqdepth - clpchgroup) /
> > +                                                    clpchgroup + 1;
> > +                     else
> > +                             bi_next->num_planes = 0;
> > +             }
> >
> >               bi->num_qgv_points = qi.num_points;
> >               bi->num_psf_gv_points = qi.num_psf_points;
> > --
> > 2.35.0.rc2.247.g8bbb082509-goog
> >
> >
>
> Was this patch ever applied or was the issue fixed in a different way?
> If CONFIG_INIT_STACK_ALL_ZERO is enabled (it is on by default when the
> compiler supports it), bi_next will be deterministically initialized to
> NULL, which means 'bi_next->num_planes = 0' will crash when the first if
> statement is not taken (i.e. 'i >= num_groups - 1'). This was reported to
> us at [1], so it impacts real users (and I have been applying this change
> locally for six months). I see some discussion in this thread; was it
> ever resolved?
>
> [1]: https://github.com/ClangBuiltLinux/linux/issues/1626
>
> Cheers,
> Nathan

The patch was not accepted upstream. I gave up after sending two reminders
that the issue was still present, neither of which drew any upstream reaction.
I have also been applying the patch locally for a few months.
Thanks for bringing it to upstream's attention again.
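
For readers following along, the control flow discussed above can be
sketched outside the kernel. This is only an illustrative model, not the
driver code: struct bw_info, fill_num_planes(), count_unassigned_writes()
and the base_clpchgroup parameter are hypothetical names, and it assumes
bi_next is declared per loop iteration (the hunk does not show the
declaration). The first function mirrors the patched logic; the second
counts, without dereferencing anything, how often the pre-patch logic
would have written through a bi_next that was never assigned.

```c
#include <assert.h>

/* Hypothetical stand-in for struct intel_bw_info; only the field the patch touches. */
struct bw_info {
	int num_planes;
};

/*
 * Patched control flow: bi_next exists only inside the branch that proves
 * a next group is present, so the last iteration never touches it.
 * base_clpchgroup models (deburst * deinterleave / num_channels).
 */
void fill_num_planes(struct bw_info *max_bw, int num_groups,
		     int base_clpchgroup, int clperchgroup, int ipqdepth)
{
	assert(base_clpchgroup > 0);	/* keeps the division below well defined */

	for (int i = 0; i < num_groups; i++) {
		int clpchgroup = base_clpchgroup << i;

		if (i < num_groups - 1) {
			struct bw_info *bi_next = &max_bw[i + 1];

			if (clpchgroup < clperchgroup)
				bi_next->num_planes = (ipqdepth - clpchgroup) / clpchgroup + 1;
			else
				bi_next->num_planes = 0;
		}
	}
}

/*
 * Pre-patch control flow, modelled without pointers so it is safe to run:
 * returns how many iterations reach the old "else" branch while bi_next
 * was never assigned.
 */
int count_unassigned_writes(int num_groups, int base_clpchgroup, int clperchgroup)
{
	int unassigned = 0;

	for (int i = 0; i < num_groups; i++) {
		int has_next = 0;	/* models whether bi_next was assigned */
		int clpchgroup = base_clpchgroup << i;

		if (i < num_groups - 1)
			has_next = 1;

		if (i < num_groups - 1 && clpchgroup < clperchgroup)
			continue;	/* guarded write, a next group always exists */
		else if (!has_next)
			unassigned++;	/* old code: bi_next->num_planes = 0 */
	}

	return unassigned;
}
```

In this model the last loop iteration always falls into the old else
branch with bi_next unassigned, which matches the crash described above
when the stack variable is zero-initialized.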


* Re: [Intel-gfx] [PATCH v1] drm/i915: fix null pointer dereference
From: Jani Nikula @ 2022-08-25  7:37 UTC
  To: Łukasz Bartosik, linux-kernel, dri-devel, Nathan Chancellor,
	keescook
  Cc: Tvrtko Ursulin, llvm, upstream, Sripada, Radhakrishna, intel-gfx,
	Anusha Srivatsa, Rodrigo Vivi

On Tue, 23 Aug 2022, Łukasz Bartosik <lb@semihalf.com> wrote:
>
> The patch was not accepted upstream. I gave up after sending two reminders
> that the issue was still present, neither of which drew any upstream reaction.
> I have also been applying the patch locally for a few months.
> Thanks for bringing it to upstream's attention again.

Apologies for us dropping the ball here. There were objections to the
code from Ville [1], but nobody stepped up to clean it up. I think this
was really more about the commit being fixed, c64a9a7c05be ("drm/i915:
Update memory bandwidth formulae"), than about the patch at hand.

In any case, I've gone ahead and pushed this patch to drm-intel-next
now. With the Fixes tag it should eventually find its way to stable
v5.17+. Thank you for the patch, review - and nagging. ;)

What still remains is cleaning up the code. But that should never have
stalled the fix for months. Sorry again.


BR,
Jani.


[1] https://lore.kernel.org/r/YgOYBfQJF7hIzEPE@intel.com

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [Intel-gfx] [PATCH v1] drm/i915: fix null pointer dereference
From: Nathan Chancellor @ 2022-08-25 15:38 UTC
  To: Jani Nikula
  Cc: Tvrtko Ursulin, llvm, upstream, keescook, Sripada, Radhakrishna,
	intel-gfx, linux-kernel, dri-devel, Anusha Srivatsa,
	Rodrigo Vivi, Łukasz Bartosik

On Thu, Aug 25, 2022 at 10:37:14AM +0300, Jani Nikula wrote:
> 
> Apologies for us dropping the ball here. There were objections to the
> code from Ville [1], but nobody stepped up to clean it up. I think this
> was really more about the commit being fixed, c64a9a7c05be ("drm/i915:
> Update memory bandwidth formulae"), than about the patch at hand.
> 
> In any case, I've gone ahead and pushed this patch to drm-intel-next
> now. With the Fixes tag it should eventually find its way to stable
> v5.17+. Thank you for the patch, review - and nagging. ;)
> 
> What still remains is cleaning up the code. But that should never have
> stalled the fix for months. Sorry again.

No worries, better late than never :) Thanks for applying the change!

Cheers,
Nathan
