dri-devel.lists.freedesktop.org archive mirror
* Re: [REGRESSION]: nouveau: Asynchronous wait on fence
       [not found] ` <5ecf0eac-a089-4da9-b76e-b45272c98393@leemhuis.info>
@ 2023-11-15  6:19   ` Owen T. Heisler
  2023-11-21 15:16     ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 6+ messages in thread
From: Owen T. Heisler @ 2023-11-15  6:19 UTC
  To: Linux regressions mailing list, stable
  Cc: Sasha Levin, Karol Herbst, nouveau, dri-devel, Danilo Krummrich

On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 28.10.23 04:46, Owen T. Heisler wrote:
>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>
>> ## Problem
>>
>> 1. Connect external display to DVI port on dock and run X with both
>>     displays in use.
>> 2. Wait hours or days.
>> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>>     responding to keyboard/mouse input. In *some* cases it is possible to
>>     switch to a virtual TTY with Ctrl+Alt+Fn and log in there.

> You thus might want to check if the problem occurs with 6.6 -- and
> ideally also check if reverting the culprit there fixes things for you.

Hi Thorsten and others,

The problem also occurs with v6.6. Here is a decoded kernel log from an 
untainted kernel:

https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log

The culprit commit does not revert cleanly on v6.6. I have not yet 
attempted to resolve the conflicts.

I have also updated the bug description at
<https://gitlab.freedesktop.org/drm/nouveau/-/issues/180>.

Thanks,
Owen

* Re: [REGRESSION]: nouveau: Asynchronous wait on fence
  2023-11-15  6:19   ` [REGRESSION]: nouveau: Asynchronous wait on fence Owen T. Heisler
@ 2023-11-21 15:16     ` Linux regression tracking (Thorsten Leemhuis)
  2023-11-21 20:23       ` Owen T. Heisler
  0 siblings, 1 reply; 6+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-11-21 15:16 UTC
  To: Owen T. Heisler, Linux regressions mailing list, stable
  Cc: Sasha Levin, Karol Herbst, nouveau, dri-devel, Danilo Krummrich

On 15.11.23 07:19, Owen T. Heisler wrote:
> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>
>>> ## Problem
>>>
>>> 1. Connect external display to DVI port on dock and run X with both
>>>     displays in use.
>>> 2. Wait hours or days.
>>> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>>>     responding to keyboard/mouse input. In *some* cases it is possible
>>>     to switch to a virtual TTY with Ctrl+Alt+Fn and log in there.
> 
>> You thus might want to check if the problem occurs with 6.6 -- and
>> ideally also check if reverting the culprit there fixes things for you.
> 
> The problem also occurs with v6.6.

In the meantime, you might want to give 6.7-rc a try as well on the off
chance that it improves things, even if that is unlikely.

> Here is a decoded kernel log from an
> untainted kernel:
> 
> https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log
> 
> The culprit commit does not revert cleanly on v6.6. I have not yet
> attempted to resolve the conflicts.
> 
> I have also updated the bug description at
> <https://gitlab.freedesktop.org/drm/nouveau/-/issues/180>.

Maybe one of the nouveau developers can take a quick look at
d386a4b54607cf and suggest a simple way to revert it in the latest
mainline. Maybe just removing the main chunk of code that is added is all
that it takes.

Ciao, Thorsten

* Re: [REGRESSION]: nouveau: Asynchronous wait on fence
  2023-11-21 15:16     ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-11-21 20:23       ` Owen T. Heisler
  2023-11-29  0:37         ` Owen T. Heisler
  0 siblings, 1 reply; 6+ messages in thread
From: Owen T. Heisler @ 2023-11-21 20:23 UTC
  To: Linux regressions mailing list, stable
  Cc: Sasha Levin, Karol Herbst, nouveau, dri-devel, Danilo Krummrich

On 11/21/23 09:16, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 15.11.23 07:19, Owen T. Heisler wrote:
>> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>>
>>>> ## Problem
>>>>
>>>> 1. Connect external display to DVI port on dock and run X with both
>>>>      displays in use.
>>>> 2. Wait hours or days.
>>>> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>>>>      responding to keyboard/mouse input. In *some* cases it is possible
>>>>      to switch to a virtual TTY with Ctrl+Alt+Fn and log in there.

>> Here is a decoded kernel log from an
>> untainted kernel:
>>
>> https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log

> Maybe one of the nouveau developers can take a quick look at
> d386a4b54607cf and suggest a simple way to revert it in the latest
> mainline. Maybe just removing the main chunk of code that is added is all
> that it takes.

I was able to resolve the revert conflict; it was indeed trivial though 
I did not realize it initially. I am currently testing v6.6 with the 
culprit commit reverted. I need to test for at least a full week (ending 
11-23) before I can assume it fixes the problem.
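
For anyone who wants to repeat the revert test described above, here is a
minimal sketch of the git steps, wrapped in Python. It is not the exact
procedure used for this report. Assumptions: the current directory is a
kernel git checkout that contains the v6.6 tag and the culprit commit
object, the branch name is hypothetical, and the commit ID is the one from
the "#regzbot introduced" line (the ID that applies to your tree may
differ). By default the script only prints the commands.

```python
import subprocess

# ID taken from the "#regzbot introduced" line in this thread; the ID that
# is valid on your tree may differ (assumption).
CULPRIT = "d386a4b54607cf6f76e23815c2c9a3abc1d66882"
BASE_TAG = "v6.6"
BRANCH = "v6.6-revert-test"  # hypothetical branch name


def revert_plan(base_tag, commit, branch):
    """Return the git commands that create a test branch and revert one commit."""
    return [
        ["git", "checkout", "-b", branch, base_tag],
        ["git", "revert", "--no-edit", commit],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises at the first failure, e.g. a revert conflict
            # that has to be resolved by hand.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_plan(revert_plan(BASE_TAG, CULPRIT, BRANCH))
```

If the revert stops with a conflict, git leaves the tree in a state where the
conflict can be resolved manually and the revert completed with
"git revert --continue".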

After that I can try the latest v6.7-rc as you suggested.

I have updated the bug description at
<https://gitlab.freedesktop.org/drm/nouveau/-/issues/180>.

Thanks again,
Owen

--
Owen T. Heisler
<https://owenh.net>

* Re: [REGRESSION]: nouveau: Asynchronous wait on fence
  2023-11-21 20:23       ` Owen T. Heisler
@ 2023-11-29  0:37         ` Owen T. Heisler
  2023-12-05 12:33           ` Thorsten Leemhuis
  0 siblings, 1 reply; 6+ messages in thread
From: Owen T. Heisler @ 2023-11-29  0:37 UTC
  To: Linux regressions mailing list, stable
  Cc: Sasha Levin, Karol Herbst, nouveau, dri-devel, Danilo Krummrich

On 11/21/23 14:23, Owen T. Heisler wrote:
> On 11/21/23 09:16, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 15.11.23 07:19, Owen T. Heisler wrote:
>>> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>>>
>>>>> 3. Suddenly the secondary Nvidia-connected display turns off and X 
>>>>> stops responding to keyboard/mouse input.

> I am currently testing v6.6 with the culprit commit reverted.

- v6.6: fails
- v6.6 with the culprit commit reverted: works

See <https://gitlab.freedesktop.org/drm/nouveau/-/issues/180> for full 
details including a decoded kernel log.

Thanks,
Owen

--
Owen T. Heisler
<https://owenh.net>

* Re: [REGRESSION]: nouveau: Asynchronous wait on fence
  2023-11-29  0:37         ` Owen T. Heisler
@ 2023-12-05 12:33           ` Thorsten Leemhuis
  2023-12-06  4:08             ` Owen T. Heisler
  0 siblings, 1 reply; 6+ messages in thread
From: Thorsten Leemhuis @ 2023-12-05 12:33 UTC
  To: Owen T. Heisler, Linux regressions mailing list, stable
  Cc: Sasha Levin, Karol Herbst, nouveau, dri-devel, Danilo Krummrich

Karol, Lyude, and Daniel:

On 29.11.23 01:37, Owen T. Heisler wrote:
> On 11/21/23 14:23, Owen T. Heisler wrote:
>> On 11/21/23 09:16, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 15.11.23 07:19, Owen T. Heisler wrote:
>>>> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>>>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>>>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>>>>
>>>>>> 3. Suddenly the secondary Nvidia-connected display turns off and X
>>>>>> stops responding to keyboard/mouse input.
> 
>> I am currently testing v6.6 with the culprit commit reverted.
> 
> - v6.6: fails
> - v6.6 with the culprit commit reverted: works
> 
> See <https://gitlab.freedesktop.org/drm/nouveau/-/issues/180> for full
> details including a decoded kernel log.

Not sure about the others, but it's kind of confusing that you update
the issue descriptions all the time and never add a comment to that ticket.

Anyway: Nouveau maintainers, could any of you at least comment on this?
Sure, the regression is caused by an old commit (6eaa1f3c59a707 was
merged for v5.14-rc7) and reverting it likely is not an option, but it
nevertheless would be great if this could be solved somehow.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke



* Re: [REGRESSION]: nouveau: Asynchronous wait on fence
  2023-12-05 12:33           ` Thorsten Leemhuis
@ 2023-12-06  4:08             ` Owen T. Heisler
  0 siblings, 0 replies; 6+ messages in thread
From: Owen T. Heisler @ 2023-12-06  4:08 UTC
  To: Thorsten Leemhuis, Linux regressions mailing list, stable
  Cc: Sasha Levin, Karol Herbst, nouveau, dri-devel, Danilo Krummrich

Hi Thorsten and others,

On 12/5/23 06:33, Thorsten Leemhuis wrote:
> On 29.11.23 01:37, Owen T. Heisler wrote:
>> On 11/21/23 14:23, Owen T. Heisler wrote:
>>> On 11/21/23 09:16, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 15.11.23 07:19, Owen T. Heisler wrote:
>>>>> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>>>>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>>>>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>>>>>
>>>>>>> 3. Suddenly the secondary Nvidia-connected display turns off and X
>>>>>>> stops responding to keyboard/mouse input.
>>
>>> I am currently testing v6.6 with the culprit commit reverted.
>>
>> - v6.6: fails
>> - v6.6 with the culprit commit reverted: works
>>
>> See <https://gitlab.freedesktop.org/drm/nouveau/-/issues/180> for full
>> details including a decoded kernel log.
> 
> Not sure about the others, but it's kind of confusing that you update
> the issue descriptions all the time and never add a comment to that ticket.

Thank you for the feedback; I will use comments more for future updates 
there. I didn't know anyone was following that issue (I haven't received 
any reply from nouveau developers on the nouveau list [1] or on gitlab 
[2]) so I have tried to keep that issue description succinct and 
up-to-date for anyone reading it for the first time.

[1]: 
<https://lists.freedesktop.org/archives/nouveau/2022-September/041001.html>
[2]: But Karol Herbst did add the "regression" label.

> Anyway: Nouveau maintainers, could any of you at least comment on this?
> Sure, the regression is caused by an old commit (6eaa1f3c59a707 was
> merged for v5.14-rc7) and reverting it likely is not an option, but it
> nevertheless would be great if this could be solved somehow.

Also, if anyone has ideas for stress tests or anything else that I could
use to trigger the crash, please share.

Thanks,
Owen

--
Owen T. Heisler
<https://owenh.net>


Thread overview: 6+ messages
     [not found] <6f027566-c841-4415-bc85-ce11a5832b14@owenh.net>
     [not found] ` <5ecf0eac-a089-4da9-b76e-b45272c98393@leemhuis.info>
2023-11-15  6:19   ` [REGRESSION]: nouveau: Asynchronous wait on fence Owen T. Heisler
2023-11-21 15:16     ` Linux regression tracking (Thorsten Leemhuis)
2023-11-21 20:23       ` Owen T. Heisler
2023-11-29  0:37         ` Owen T. Heisler
2023-12-05 12:33           ` Thorsten Leemhuis
2023-12-06  4:08             ` Owen T. Heisler
