linux-kernel.vger.kernel.org archive mirror
* 5.15-rc1 i915 blank screen booting on ThinkPads
@ 2021-09-16  4:37 Hugh Dickins
  2021-09-16  8:44 ` Tvrtko Ursulin
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2021-09-16  4:37 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Vinay Belgaumkar,
	Michal Wajdeczko, Tvrtko Ursulin, Sujaritha Sundaresan,
	John Harrison, Daniele Ceraolo Spurio, Matt Roper,
	Lucas De Marchi, Matthew Brost, Dave Airlie, Daniel Vetter,
	Pavel Machek, Hugh Dickins, linux-kernel

Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
kernel crashed in some way.

I wanted to say what i915 generations these are, but don't know where
to look - I don't see it in dmesg, even when DRM_I915_DEBUG enabled.

Possibly relevant: builtin kernels, CONFIG_MODULES off, no initrd.

On the older laptop:

First bisection showed first bad commit
41e5c17ebfc2 "drm/i915/guc/slpc: Sysfs hooks for SLPC"

But reverting that still crashed boot with blank screen (and
reverting the two related commits after it made no difference).

Second bisection, starting from 5.15-rc1 bad and 41e5c17ebfc2 "good",
but patching it out each time before building, showed first bad commit
3ffe82d701a4 "drm/i915/xehp: handle new steering options"

That one did not revert cleanly from 5.15-rc1, but reverting
927dfdd09d8c "drm/i915/dg2: Add SQIDI steering" then
1705f22c86fb "drm/i915/dg2: Update steering tables" then
768fe28dd3dc "drm/i915/xehpsdv: Define steering tables" then
3ffe82d701a4 "drm/i915/xehp: handle new steering options"
worked (there was one very easy fixup needed somewhere).

And 5.15-rc1 with those five reversions boots and runs fine...
on that older laptop.  But reverting those from the kernel on the
newer laptop did not help at all, still booting with blank screen
(or no more lines shown after the switch from VGA).  Put them back.

On the newer laptop, bisection showed first bad commit
62eaf0ae217d "drm/i915/guc: Support request cancellation"

And 5.15-rc1 with that reverted boots and runs fine on the newer.

I am hoping that there will be some i915 fixups to come in a later rc!
May be nothing more than uninitialized variables or NULL pointers.
You'll probably want more info from me: please ask, but I'm slow.

Thanks,
Hugh


* Re: 5.15-rc1 i915 blank screen booting on ThinkPads
  2021-09-16  4:37 5.15-rc1 i915 blank screen booting on ThinkPads Hugh Dickins
@ 2021-09-16  8:44 ` Tvrtko Ursulin
  2021-09-16 10:17   ` Jani Nikula
  0 siblings, 1 reply; 8+ messages in thread
From: Tvrtko Ursulin @ 2021-09-16  8:44 UTC (permalink / raw)
  To: Hugh Dickins, intel-gfx
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Vinay Belgaumkar,
	Michal Wajdeczko, Sujaritha Sundaresan, John Harrison,
	Daniele Ceraolo Spurio, Matt Roper, Lucas De Marchi,
	Matthew Brost, Dave Airlie, Daniel Vetter, Pavel Machek,
	linux-kernel


Hi,

On 16/09/2021 05:37, Hugh Dickins wrote:
> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
> kernel crashed in some way.

The T420s could be Sandy Bridge and the X1 Carbon Kaby Lake.

> I wanted to say what i915 generations these are, but don't know where
> to look - I don't see it in dmesg, even when DRM_I915_DEBUG enabled.
> 
> Possibly relevant: builtin kernels, CONFIG_MODULES off, no initrd.
> 
> On the older laptop:
> 
> First bisection showed first bad commit
> 41e5c17ebfc2 "drm/i915/guc/slpc: Sysfs hooks for SLPC"
> 
> But reverting that still crashed boot with blank screen (and
> reverting the two related commits after it made no difference).
> 
> Second bisection, starting from 5.15-rc1 bad and 41e5c17ebfc2 "good",
> but patching it out each time before building, showed first bad commit
> 3ffe82d701a4 "drm/i915/xehp: handle new steering options"
> 
> That one did not revert cleanly from 5.15-rc1, but reverting
> 927dfdd09d8c "drm/i915/dg2: Add SQIDI steering" then
> 1705f22c86fb "drm/i915/dg2: Update steering tables" then
> 768fe28dd3dc "drm/i915/xehpsdv: Define steering tables" then
> 3ffe82d701a4 "drm/i915/xehp: handle new steering options"
> worked (there was one very easy fixup needed somewhere).
> 
> And 5.15-rc1 with those five reversions boots and runs fine...
> on that older laptop.  But reverting those from the kernel on the
> newer laptop did not help at all, still booting with blank screen
> (or no more lines shown after the switch from VGA).  Put them back.

Bisect results sound suspicious since the steering patches do not come 
into play on SandyBridge.

> On the newer laptop, bisection showed first bad commit
> 62eaf0ae217d "drm/i915/guc: Support request cancellation"
> 
> And 5.15-rc1 with that reverted boots and runs fine on the newer.

But not on the older laptop?

Given that the bisect points to this commit, it may be worth building
both kernels with CONFIG_DRM_I915_REQUEST_TIMEOUT=0 (no reverts) to see
what happens. But first the logs, which I'll ask for next.

> I am hoping that there will be some i915 fixups to come in a later rc!
> May be nothing more than uninitialized variables or NULL pointers.
> You'll probably want more info from me: please ask, but I'm slow.

Kernel logs with drm.debug=0xe, with the broken black screen state, 
would probably answer a lot of questions if you could gather it from 
both machines?

Regards,

Tvrtko


* Re: 5.15-rc1 i915 blank screen booting on ThinkPads
  2021-09-16  8:44 ` Tvrtko Ursulin
@ 2021-09-16 10:17   ` Jani Nikula
  2021-09-17 21:26     ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Jani Nikula @ 2021-09-16 10:17 UTC (permalink / raw)
  To: Tvrtko Ursulin, Hugh Dickins, intel-gfx
  Cc: Joonas Lahtinen, Rodrigo Vivi, Vinay Belgaumkar,
	Michal Wajdeczko, Sujaritha Sundaresan, John Harrison,
	Daniele Ceraolo Spurio, Matt Roper, Lucas De Marchi,
	Matthew Brost, Dave Airlie, Daniel Vetter, Pavel Machek,
	linux-kernel

On Thu, 16 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
> Hi,
>
> On 16/09/2021 05:37, Hugh Dickins wrote:
>> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
>> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
>> kernel crashed in some way.
>
> T420s could be SandyBridge and X1 Carbon KabyLake.
>
>> I wanted to say what i915 generations these are, but don't know where
>> to look - I don't see it in dmesg, even when DRM_I915_DEBUG enabled.
>> 
>> Possibly relevant: builtin kernels, CONFIG_MODULES off, no initrd.
>> 
>> On the older laptop:
>> 
>> First bisection showed first bad commit
>> 41e5c17ebfc2 "drm/i915/guc/slpc: Sysfs hooks for SLPC"
>> 
>> But reverting that still crashed boot with blank screen (and
>> reverting the two related commits after it made no difference).
>> 
>> Second bisection, starting from 5.15-rc1 bad and 41e5c17ebfc2 "good",
>> but patching it out each time before building, showed first bad commit
>> 3ffe82d701a4 "drm/i915/xehp: handle new steering options"
>> 
>> That one did not revert cleanly from 5.15-rc1, but reverting
>> 927dfdd09d8c "drm/i915/dg2: Add SQIDI steering" then
>> 1705f22c86fb "drm/i915/dg2: Update steering tables" then
>> 768fe28dd3dc "drm/i915/xehpsdv: Define steering tables" then
>> 3ffe82d701a4 "drm/i915/xehp: handle new steering options"
>> worked (there was one very easy fixup needed somewhere).
>> 
>> And 5.15-rc1 with those five reversions boots and runs fine...
>> on that older laptop.  But reverting those from the kernel on the
>> newer laptop did not help at all, still booting with blank screen
>> (or no more lines shown after the switch from VGA).  Put them back.
>
> Bisect results sound suspicious since the steering patches do not come 
> into play on SandyBridge.
>
>> On the newer laptop, bisection showed first bad commit
>> 62eaf0ae217d "drm/i915/guc: Support request cancellation"
>> 
>> And 5.15-rc1 with that reverted boots and runs fine on the newer.
> But not on the older laptop?
>
> Given that the bisect points to this commit, it may be worth building
> both kernels with CONFIG_DRM_I915_REQUEST_TIMEOUT=0 (no reverts) to see
> what happens. But first the logs, which I'll ask for next.
>
>> I am hoping that there will be some i915 fixups to come in a later rc!
>> May be nothing more than uninitialized variables or NULL pointers.
>> You'll probably want more info from me: please ask, but I'm slow.
>
> Kernel logs with drm.debug=0xe, with the broken black screen state, 
> would probably answer a lot of questions if you could gather it from 
> both machines?

And for that, I think it's best to file separate bugs at [1] and attach
the logs there. It helps keep the info in one place. Thanks.

BR,
Jani.


[1] https://gitlab.freedesktop.org/drm/intel/issues/new


>
> Regards,
>
> Tvrtko

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: 5.15-rc1 i915 blank screen booting on ThinkPads
  2021-09-16 10:17   ` Jani Nikula
@ 2021-09-17 21:26     ` Hugh Dickins
  2021-09-17 21:30       ` Matthew Brost
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2021-09-17 21:26 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Tvrtko Ursulin, Hugh Dickins, intel-gfx, Joonas Lahtinen,
	Rodrigo Vivi, Vinay Belgaumkar, Michal Wajdeczko,
	Sujaritha Sundaresan, John Harrison, Daniele Ceraolo Spurio,
	Matt Roper, Lucas De Marchi, Matthew Brost, Dave Airlie,
	Daniel Vetter, Pavel Machek, linux-kernel

On Thu, 16 Sep 2021, Jani Nikula wrote:
> On Thu, 16 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
> > On 16/09/2021 05:37, Hugh Dickins wrote:
> >> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
> >> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
> >> kernel crashed in some way.
...
> > Kernel logs with drm.debug=0xe, with the broken black screen state, 
> > would probably answer a lot of questions if you could gather it from 
> > both machines?
> 
> And for that, I think it's best to file separate bugs at [1] and attach
> the logs there. It helps keep the info in one place. Thanks.
> 
> BR,
> Jani.
> 
> [1] https://gitlab.freedesktop.org/drm/intel/issues/new

Thanks for the quick replies: but of course, getting kernel logs was
the difficult part, this being bootup, with just a blank screen, and
no logging to disk at this stage.  I've never needed it before, but
netconsole to the rescue.

Problem then obvious, both machines now working,
please let me skip the bug reports, here's a patch:

[PATCH] drm/i915: fix blank screen booting crashes

5.15-rc1 crashes with blank screen when booting up on two ThinkPads
using i915.  Bisections converge convincingly, but arrive at different
and surprising "culprits", none of them the actual culprit.

netconsole (with init_netconsole() hacked to call i915_init() when
logging has started, instead of by module_init()) tells the story:

kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!
with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify().
I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that
function needs to be 4-byte aligned.

Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
Signed-off-by: Hugh Dickins <hughd@google.com>
---

 drivers/gpu/drm/i915/gt/intel_context.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -362,6 +362,7 @@ static int __intel_context_active(struct
 	return 0;
 }
 
+__aligned(4)	/* Respect the I915_SW_FENCE_MASK */
 static int sw_fence_dummy_notify(struct i915_sw_fence *sf,
 				 enum i915_sw_fence_notify state)
 {


* Re: 5.15-rc1 i915 blank screen booting on ThinkPads
  2021-09-17 21:26     ` Hugh Dickins
@ 2021-09-17 21:30       ` Matthew Brost
  2021-09-17 22:52         ` Jani Nikula
  0 siblings, 1 reply; 8+ messages in thread
From: Matthew Brost @ 2021-09-17 21:30 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Jani Nikula, Tvrtko Ursulin, intel-gfx, Joonas Lahtinen,
	Rodrigo Vivi, Vinay Belgaumkar, Michal Wajdeczko,
	Sujaritha Sundaresan, John Harrison, Daniele Ceraolo Spurio,
	Matt Roper, Lucas De Marchi, Dave Airlie, Daniel Vetter,
	Pavel Machek, linux-kernel

On Fri, Sep 17, 2021 at 02:26:48PM -0700, Hugh Dickins wrote:
> On Thu, 16 Sep 2021, Jani Nikula wrote:
> > On Thu, 16 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
> > > On 16/09/2021 05:37, Hugh Dickins wrote:
> > >> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
> > >> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
> > >> kernel crashed in some way.
> ...
> > > Kernel logs with drm.debug=0xe, with the broken black screen state, 
> > > would probably answer a lot of questions if you could gather it from 
> > > both machines?
> > 
> > And for that, I think it's best to file separate bugs at [1] and attach
> > the logs there. It helps keep the info in one place. Thanks.
> > 
> > BR,
> > Jani.
> > 
> > [1] https://gitlab.freedesktop.org/drm/intel/issues/new
> 
> Thanks for the quick replies: but of course, getting kernel logs was
> the difficult part, this being bootup, with just a blank screen, and
> no logging to disk at this stage.  I've never needed it before, but
> netconsole to the rescue.
> 
> Problem then obvious, both machines now working,
> please let me skip the bug reports, here's a patch:
> 

Thanks for finding / fixing this Hugh. I will post this patch in a way
our CI system can understand.

Matt 



* Re: 5.15-rc1 i915 blank screen booting on ThinkPads
  2021-09-17 21:30       ` Matthew Brost
@ 2021-09-17 22:52         ` Jani Nikula
  2021-09-17 23:29           ` Matthew Brost
  0 siblings, 1 reply; 8+ messages in thread
From: Jani Nikula @ 2021-09-17 22:52 UTC (permalink / raw)
  To: Matthew Brost, Hugh Dickins
  Cc: Tvrtko Ursulin, intel-gfx, Joonas Lahtinen, Rodrigo Vivi,
	Vinay Belgaumkar, Michal Wajdeczko, Sujaritha Sundaresan,
	John Harrison, Daniele Ceraolo Spurio, Matt Roper,
	Lucas De Marchi, Dave Airlie, Daniel Vetter, Pavel Machek,
	linux-kernel

On Fri, 17 Sep 2021, Matthew Brost <matthew.brost@intel.com> wrote:
> On Fri, Sep 17, 2021 at 02:26:48PM -0700, Hugh Dickins wrote:
>> On Thu, 16 Sep 2021, Jani Nikula wrote:
>> > On Thu, 16 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
>> > > On 16/09/2021 05:37, Hugh Dickins wrote:
>> > >> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
>> > >> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
>> > >> kernel crashed in some way.
>> ...
>> > > Kernel logs with drm.debug=0xe, with the broken black screen state, 
>> > > would probably answer a lot of questions if you could gather it from 
>> > > both machines?
>> > 
>> > And for that, I think it's best to file separate bugs at [1] and attach
>> > the logs there. It helps keep the info in one place. Thanks.
>> > 
>> > BR,
>> > Jani.
>> > 
>> > [1] https://gitlab.freedesktop.org/drm/intel/issues/new
>> 
>> Thanks for the quick replies: but of course, getting kernel logs was
>> the difficult part, this being bootup, with just a blank screen, and
>> no logging to disk at this stage.  I've never needed it before, but
>> netconsole to the rescue.
>> 
>> Problem then obvious, both machines now working,
>> please let me skip the bug reports, here's a patch:
>> 
>
> Thanks for finding / fixing this Hugh. I will post this patch in a way
> our CI system can understand.

Thanks indeed!

Matt, please get rid of the BUG_ON while at it, and make it a
WARN. Oopsing doesn't do anyone any good.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: 5.15-rc1 i915 blank screen booting on ThinkPads
  2021-09-17 22:52         ` Jani Nikula
@ 2021-09-17 23:29           ` Matthew Brost
  2021-09-18  0:22             ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Matthew Brost @ 2021-09-17 23:29 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Hugh Dickins, Tvrtko Ursulin, intel-gfx, Joonas Lahtinen,
	Rodrigo Vivi, Vinay Belgaumkar, Michal Wajdeczko,
	Sujaritha Sundaresan, John Harrison, Daniele Ceraolo Spurio,
	Matt Roper, Lucas De Marchi, Dave Airlie, Daniel Vetter,
	Pavel Machek, linux-kernel

On Sat, Sep 18, 2021 at 01:52:48AM +0300, Jani Nikula wrote:
> On Fri, 17 Sep 2021, Matthew Brost <matthew.brost@intel.com> wrote:
> > On Fri, Sep 17, 2021 at 02:26:48PM -0700, Hugh Dickins wrote:
> >> On Thu, 16 Sep 2021, Jani Nikula wrote:
> >> > On Thu, 16 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
> >> > > On 16/09/2021 05:37, Hugh Dickins wrote:
> >> > >> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
> >> > >> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
> >> > >> kernel crashed in some way.
> >> ...
> >> > > Kernel logs with drm.debug=0xe, with the broken black screen state, 
> >> > > would probably answer a lot of questions if you could gather it from 
> >> > > both machines?
> >> > 
> >> > And for that, I think it's best to file separate bugs at [1] and attach
> >> > the logs there. It helps keep the info in one place. Thanks.
> >> > 
> >> > BR,
> >> > Jani.
> >> > 
> >> > [1] https://gitlab.freedesktop.org/drm/intel/issues/new
> >> 
> >> Thanks for the quick replies: but of course, getting kernel logs was
> >> the difficult part, this being bootup, with just a blank screen, and
> >> no logging to disk at this stage.  I've never needed it before, but
> >> netconsole to the rescue.
> >> 
> >> Problem then obvious, both machines now working,
> >> please let me skip the bug reports, here's a patch:
> >> 
> >
> > Thanks for finding / fixing this Hugh. I will post this patch in a way
> > our CI system can understand.
> 
> Thanks indeed!
> 
> Matt, please get rid of the BUG_ON while at it, and make it a
> WARN. Oopsing doesn't do anyone any good.
> 

Sure. Will do. Long term we should just look to rip out crap like this
(i.e. stealing bits from aligned addresses for flags).

Matt



* Re: 5.15-rc1 i915 blank screen booting on ThinkPads
  2021-09-17 23:29           ` Matthew Brost
@ 2021-09-18  0:22             ` Hugh Dickins
  0 siblings, 0 replies; 8+ messages in thread
From: Hugh Dickins @ 2021-09-18  0:22 UTC (permalink / raw)
  To: Matthew Brost
  Cc: Jani Nikula, Hugh Dickins, Tvrtko Ursulin, intel-gfx,
	Joonas Lahtinen, Rodrigo Vivi, Vinay Belgaumkar,
	Michal Wajdeczko, Sujaritha Sundaresan, John Harrison,
	Daniele Ceraolo Spurio, Matt Roper, Lucas De Marchi, Dave Airlie,
	Daniel Vetter, Pavel Machek, linux-kernel

On Fri, 17 Sep 2021, Matthew Brost wrote:
> On Sat, Sep 18, 2021 at 01:52:48AM +0300, Jani Nikula wrote:
> > On Fri, 17 Sep 2021, Matthew Brost <matthew.brost@intel.com> wrote:
> > > On Fri, Sep 17, 2021 at 02:26:48PM -0700, Hugh Dickins wrote:
> > >> On Thu, 16 Sep 2021, Jani Nikula wrote:
> > >> > On Thu, 16 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
> > >> > > On 16/09/2021 05:37, Hugh Dickins wrote:
> > >> > >> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
> > >> > >> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
> > >> > >> kernel crashed in some way.
> > >> ...
> > >> > > Kernel logs with drm.debug=0xe, with the broken black screen state, 
> > >> > > would probably answer a lot of questions if you could gather it from 
> > >> > > both machines?
> > >> > 
> > >> > And for that, I think it's best to file separate bugs at [1] and attach
> > >> > the logs there. It helps keep the info in one place. Thanks.
> > >> > 
> > >> > BR,
> > >> > Jani.
> > >> > 
> > >> > [1] https://gitlab.freedesktop.org/drm/intel/issues/new
> > >> 
> > >> Thanks for the quick replies: but of course, getting kernel logs was
> > >> the difficult part, this being bootup, with just a blank screen, and
> > >> no logging to disk at this stage.  I've never needed it before, but
> > >> netconsole to the rescue.
> > >> 
> > >> Problem then obvious, both machines now working,
> > >> please let me skip the bug reports, here's a patch:
> > >> 
> > >
> > > Thanks for finding / fixing this Hugh. I will post this patch in a way
> > > our CI system can understand.
> > 
> > Thanks indeed!
> > 
> > Matt, please get rid of the BUG_ON while at it, and make it a
> > WARN. Oopsing doesn't do anyone any good.
> > 
> 
> Sure. Will do. Long term we should just look to rip out crap like this
> (i.e. stealing bits from aligned addresses for flags).

It just crossed my mind that I never did due diligence on _other_
callers of i915_sw_fence_init().  In fact they're okay, but that's
because their fence functions are all declared with the
#define __i915_sw_fence_call __aligned(4)
from i915_sw_fence.h, which I had not seen when I sent the patch.

I'm not going to resend, but if I were you, I'd quietly edit that
patch to use __i915_sw_fence_call in place of my __aligned(4).

Thanks,
Hugh


