From: "Łukasz Bartosik" <lb@semihalf.com>
To: linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
Nathan Chancellor <nathan@kernel.org>,
keescook@chromium.org
Cc: Jani Nikula <jani.nikula@linux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
intel-gfx@lists.freedesktop.org, upstream@semihalf.com,
llvm@lists.linux.dev
Subject: Re: [Intel-gfx] [PATCH v1] drm/i915: fix null pointer dereference
Date: Tue, 23 Aug 2022 09:47:33 +0200
Message-ID: <CAK8ByeL=1EtgBRGh9hhHofgpRqB--CQgih+tAJwFv_MchDhcSw@mail.gmail.com>
In-Reply-To: <YwPoCqvQ02kUl9tP@dev-arch.thelio-3990X>
>
> Hi all,
>
> Apologies in advance if you see this twice. I did not see the original
> make it to either lore.kernel.org or the freedesktop.org archives so I
> figured it might have been sent into the void.
>
> On Tue, Feb 01, 2022 at 04:33:54PM +0100, Lukasz Bartosik wrote:
> > From: Łukasz Bartosik <lb@semihalf.com>
> >
> > The Asus Chromebook CX550 crashes during boot on the v5.17-rc1 kernel.
> > The root cause is a NULL pointer dereference of bi_next
> > in tgl_get_bw_info() in drivers/gpu/drm/i915/display/intel_bw.c.
> >
> > BUG: kernel NULL pointer dereference, address: 000000000000002e
> > PGD 0 P4D 0
> > Oops: 0002 [#1] PREEMPT SMP NOPTI
> > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G U 5.17.0-rc1
> > Hardware name: Google Delbin/Delbin, BIOS Google_Delbin.13672.156.3 05/14/2021
> > RIP: 0010:tgl_get_bw_info+0x2de/0x510
> > ...
> > [ 2.554467] Call Trace:
> > [ 2.554467] <TASK>
> > [ 2.554467] intel_bw_init_hw+0x14a/0x434
> > [ 2.554467] ? _printk+0x59/0x73
> > [ 2.554467] ? _dev_err+0x77/0x91
> > [ 2.554467] i915_driver_hw_probe+0x329/0x33e
> > [ 2.554467] i915_driver_probe+0x4c8/0x638
> > [ 2.554467] i915_pci_probe+0xf8/0x14e
> > [ 2.554467] ? _raw_spin_unlock_irqrestore+0x12/0x2c
> > [ 2.554467] pci_device_probe+0xaa/0x142
> > [ 2.554467] really_probe+0x13f/0x2f4
> > [ 2.554467] __driver_probe_device+0x9e/0xd3
> > [ 2.554467] driver_probe_device+0x24/0x7c
> > [ 2.554467] __driver_attach+0xba/0xcf
> > [ 2.554467] ? driver_attach+0x1f/0x1f
> > [ 2.554467] bus_for_each_dev+0x8c/0xc0
> > [ 2.554467] bus_add_driver+0x11b/0x1f7
> > [ 2.554467] driver_register+0x60/0xea
> > [ 2.554467] ? mipi_dsi_bus_init+0x16/0x16
> > [ 2.554467] i915_init+0x2c/0xb9
> > [ 2.554467] ? mipi_dsi_bus_init+0x16/0x16
> > [ 2.554467] do_one_initcall+0x12e/0x2b3
> > [ 2.554467] do_initcall_level+0xd6/0xf3
> > [ 2.554467] do_initcalls+0x4e/0x79
> > [ 2.554467] kernel_init_freeable+0xed/0x14d
> > [ 2.554467] ? rest_init+0xc1/0xc1
> > [ 2.554467] kernel_init+0x1a/0x120
> > [ 2.554467] ret_from_fork+0x1f/0x30
> > [ 2.554467] </TASK>
> > ...
> > Kernel panic - not syncing: Fatal exception
> >
> > Fixes: c64a9a7c05be ("drm/i915: Update memory bandwidth formulae")
> > Signed-off-by: Łukasz Bartosik <lb@semihalf.com>
> > ---
> > drivers/gpu/drm/i915/display/intel_bw.c | 16 +++++++++-------
> > 1 file changed, 9 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
> > index 2da4aacc956b..bd0ed68b7faa 100644
> > --- a/drivers/gpu/drm/i915/display/intel_bw.c
> > +++ b/drivers/gpu/drm/i915/display/intel_bw.c
> > @@ -404,15 +404,17 @@ static int tgl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel
> >  		int clpchgroup;
> >  		int j;
> >
> > -		if (i < num_groups - 1)
> > -			bi_next = &dev_priv->max_bw[i + 1];
> > -
> >  		clpchgroup = (sa->deburst * qi.deinterleave / num_channels) << i;
> >
> > -		if (i < num_groups - 1 && clpchgroup < clperchgroup)
> > -			bi_next->num_planes = (ipqdepth - clpchgroup) / clpchgroup + 1;
> > -		else
> > -			bi_next->num_planes = 0;
> > +		if (i < num_groups - 1) {
> > +			bi_next = &dev_priv->max_bw[i + 1];
> > +
> > +			if (clpchgroup < clperchgroup)
> > +				bi_next->num_planes = (ipqdepth - clpchgroup) /
> > +						      clpchgroup + 1;
> > +			else
> > +				bi_next->num_planes = 0;
> > +		}
> >
> >  		bi->num_qgv_points = qi.num_points;
> >  		bi->num_psf_gv_points = qi.num_psf_points;
> > --
> > 2.35.0.rc2.247.g8bbb082509-goog
> >
> >
>
> Was this patch ever applied or was the issue fixed in a different way?
> If CONFIG_INIT_STACK_ALL_ZERO is enabled (it is on by default when the
> compiler supports it), bi_next will be deterministically initialized to
> NULL, which means 'bi_next->num_planes = 0' will crash when the first if
> statement is not taken (i.e. 'i >= num_groups - 1'). This was reported to
> us at [1] so it impacts real users (and I have been applying this change
> locally for six months). I see some discussion in this thread, was it
> ever resolved?
>
> [1]: https://github.com/ClangBuiltLinux/linux/issues/1626
>
> Cheers,
> Nathan
The patch was not accepted upstream. I gave up after sending two reminders
that the issue was still present, neither of which drew any upstream response.
I have also been applying the patch locally for a few months.
Thanks for bringing it to upstream's attention again.
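
For anyone following the thread without the driver source at hand, the
control-flow problem can be sketched in a standalone C program. This is only
an illustration under stated assumptions, not the i915 code: struct bw_info,
fill_num_planes(), unpatched_derefs_null() and the base_clpchgroup parameter
are hypothetical names made up for the example. Only the shape of the
conditions follows the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the driver's max_bw[] array. */
struct bw_info {
	int num_planes;
};

/*
 * Shape of the unpatched logic: bi_next is assigned only when
 * i < num_groups - 1, but the else branch of the second condition writes
 * through it unconditionally. With the pointer zero-initialized (as with
 * CONFIG_INIT_STACK_ALL_ZERO), that write is a NULL dereference exactly
 * when no next group exists.
 */
static bool unpatched_derefs_null(int i, int num_groups)
{
	return i >= num_groups - 1;
}

/*
 * Shape of the patched logic: bi_next is formed and written only inside
 * the guarded block, so the last iteration touches nothing.
 * Returns the number of max_bw[] entries written.
 */
static int fill_num_planes(struct bw_info *max_bw, int num_groups,
			   int base_clpchgroup, int clperchgroup, int ipqdepth)
{
	int writes = 0;

	assert(max_bw != NULL);

	for (int i = 0; i < num_groups; i++) {
		struct bw_info *bi_next = NULL; /* explicit here; implicit with zero-init */
		int clpchgroup = base_clpchgroup << i;

		if (i < num_groups - 1) {
			bi_next = &max_bw[i + 1];

			if (clpchgroup < clperchgroup)
				bi_next->num_planes =
					(ipqdepth - clpchgroup) / clpchgroup + 1;
			else
				bi_next->num_planes = 0;
			writes++;
		}
	}

	return writes;
}
```

With three groups, the guarded version writes entries 1 and 2 and leaves the
last iteration alone, whereas the unpatched shape would write through a NULL
bi_next on iteration i = 2.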
Thread overview: 25+ messages
2022-02-01 15:33 [Intel-gfx] [PATCH v1] drm/i915: fix null pointer dereference Lukasz Bartosik
2022-02-01 15:49 ` Jani Nikula
2022-02-08 16:20 ` Łukasz Bartosik
2022-02-09 2:02 ` Sripada, Radhakrishna
2022-02-09 10:31 ` Ville Syrjälä
2022-03-08 15:38 ` Łukasz Bartosik
2022-02-02 14:27 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
2022-02-02 15:31 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2022-08-22 17:14 ` [Intel-gfx] [PATCH v1] " Nathan Chancellor
2022-08-22 20:30 ` Kees Cook
2022-08-22 20:33 ` Nathan Chancellor
2022-08-23 7:47 ` Łukasz Bartosik [this message]
2022-08-25 7:37 ` Jani Nikula
2022-08-25 15:38 ` Nathan Chancellor
2022-08-23 9:27 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: fix null pointer dereference (rev2) Patchwork
2022-08-23 11:30 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: fix null pointer dereference (rev3) Patchwork
2022-08-23 18:53 ` [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: fix null pointer dereference (rev4) Patchwork
2022-08-24 23:27 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork