* 5.4+: PAGE FAULT crashes the system multiple times per 24h
@ 2020-02-10 14:39 Udo van den Heuvel
2020-02-10 16:04 ` Gabriel C
From: Udo van den Heuvel @ 2020-02-10 14:39 UTC
To: linux-mm@vger.kernel.org
Hello,
Would this be a bug in the mm area?
For bug https://bugzilla.kernel.org/show_bug.cgi?id=206191 I have been
bisecting away, but now the process has landed me with a kernel that cannot
find the root fs (with either good or bad bisect choices).
Pictures of the crash that is the reason for this bisect:
https://bugzilla.kernel.org/attachment.cgi?id=286787
https://bugzilla.kernel.org/attachment.cgi?id=286789
https://bugzilla.kernel.org/attachment.cgi?id=286791
https://bugzilla.kernel.org/attachment.cgi?id=286793
How can I proceed from here with the bisecting?
Did someone perhaps find the root cause for the page fault?
As the crash is fairly easy to reproduce I can test patches...
Please let me know!
Kind regards,
Udo
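On the bisecting question: a commit that cannot be tested for an unrelated reason (here, a kernel that cannot find the root fs) is normally marked with `git bisect skip` rather than guessed as good or bad. If the test were ever automated with `git bisect run` (an assumption; this thread tests by hand on real hardware), the same idea maps onto the test script's exit status, where 125 means "skip". A minimal sketch, in which `bisect_exit_code` and its two flags are hypothetical names:

```python
# Exit statuses understood by `git bisect run`:
#   0   -> current commit is good
#   125 -> current commit cannot be tested, skip it
#   1..127 (except 125) -> current commit is bad
GOOD, BAD, SKIP = 0, 1, 125


def bisect_exit_code(booted: bool, page_fault_seen: bool) -> int:
    """Map one test run onto a `git bisect run` exit status.

    `booted` and `page_fault_seen` are hypothetical inputs; how they are
    obtained (by hand, from a VM, from a serial console log) is left open.
    """
    if not booted:
        # An unbootable kernel says nothing about the page fault itself.
        return SKIP
    return BAD if page_fault_seen else GOOD
```

When bisecting by hand, the equivalent of the `SKIP` branch is simply running `git bisect skip` for the kernel that fails to mount its root fs.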
* Re: 5.4+: PAGE FAULT crashes the system multiple times per 24h
2020-02-10 14:39 5.4+: PAGE FAULT crashes the system multiple times per 24h Udo van den Heuvel
@ 2020-02-10 16:04 ` Gabriel C
2020-02-10 16:24 ` Udo van den Heuvel
From: Gabriel C @ 2020-02-10 16:04 UTC
To: Udo van den Heuvel; +Cc: linux-mm@vger.kernel.org
On Mon, 10 Feb 2020 at 15:39, Udo van den Heuvel
<udovdh@xs4all.nl> wrote:
>
> Hello,
Hi,
>
> Would this be a bug in the mm area?
I don't know; it's possible.
It could be anything: a bad overclock, bad RAM, or broken firmware could
be a cause too.
>
> For bug https://bugzilla.kernel.org/show_bug.cgi?id=206191 I have been
> bisecting away, but now the process has landed me with a kernel that cannot
> find the root fs (with either good or bad bisect choices).
>
> Pictures of the crash that is the reason for this bisect:
> https://bugzilla.kernel.org/attachment.cgi?id=286787
> https://bugzilla.kernel.org/attachment.cgi?id=286789
> https://bugzilla.kernel.org/attachment.cgi?id=286791
> https://bugzilla.kernel.org/attachment.cgi?id=286793
>
I looked at some of your logs. I hit freezes/crashes similar to yours
with an R3 APU a while back.
That was caused by a mismatch in kernel -> Xorg driver <-> mesa code + firmware.
I think first you should try to fix your amdgpu bug which is this one:
https://gitlab.freedesktop.org/drm/amd/issues/963
And the fixes are the patchset there:
https://patchwork.freedesktop.org/series/72733/
Also, can you try booting without all these crazy options?
As an example, why would you need to force ACPI on your HW?
BR,
Gabriel C.
* Re: 5.4+: PAGE FAULT crashes the system multiple times per 24h
2020-02-10 16:04 ` Gabriel C
@ 2020-02-10 16:24 ` Udo van den Heuvel
2020-02-10 17:01 ` Gabriel C
From: Udo van den Heuvel @ 2020-02-10 16:24 UTC
To: Gabriel C; +Cc: linux-mm@vger.kernel.org
Hello Gabriel,
Thank you kindly for your email and the links in there; I will most
certainly look into those.
On 10-02-2020 17:04, Gabriel C wrote:
> I think first you should try to fix your amdgpu bug which is this one:
> https://gitlab.freedesktop.org/drm/amd/issues/963
>
> And the fixes are the patchset there:
> https://patchwork.freedesktop.org/series/72733/
Thanks, will try those on 5.5.2.
> Also, can you try booting without all these crazy options?
What is crazy here?
Each one has a story.
> As an example why would you need to force ACPI on your HW?
Force?
Because then I can be certain it will be there, this has been there for
quite a while.
Or would you suggest I run my x86_64 without acpi? (I am not an expert
in this area yet)
noexec=on noexec32=on vga=0xF06 SYSFONT=latarcyrheb-sun16
LANG=en_US.UTF-8 KEYTABLE=us
fbcon=font:VGA8x16
Not important I guess.
acpi_enforce_resources=lax
To avoid conflict.
radeon.pcie_gen2=1
To enable PCIE gen 2
cgroup_disable=memory
No control groups for memory.
threadirqs
Threads for IRQs.
plymouth.enable=0 rd.plymouth=0
No plymouth.
mce=dont_log_ce
To avoid logging.
panic=0
Kernel behaviour.
rd.lvm.vg=myvg rd.lvm.vg=ssdvg
To have the kernel open the vg
radeon.dpm=1
We want power management
zswap.enabled=1
We want zswap.
rd.auto=1
Enable auto-assembly of special devices like cryptoLUKS, dmraid,
mdraid or lvm.
audit=0
No audit.
systemd.log_level=warning
Less systemd clutter in logging.
ip=192.168.10.70::192.168.10.98:255.255.255.0:::off:192.168.10.98
rd.neednet=1
This is unnecessary.
net.ifnames=0
Old style network interface names.
amdgpu.gttsize=8192
Had to do with viewing larger PDFs, for genealogy etc.
clocksource=hpet
We want hpet. Not tsc.
amdgpu.lockup_timeout=0
rd.luks.options=discard
We want to use discard on our ssd's.
elevator=mq-deadline
We want a different scheduler for ssd versus hdd.
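When comparing a command line like the one above against a reduced one, it can help to split it into key/value pairs first. A minimal sketch, assuming whitespace-separated tokens with optional shell-style quoting; `parse_cmdline` is a hypothetical helper, not a kernel or dracut tool:

```python
import shlex
from collections import defaultdict
from typing import Dict, List, Optional


def parse_cmdline(cmdline: str) -> Dict[str, List[Optional[str]]]:
    """Split a kernel command line into {option: [values...]}.

    Options may repeat (e.g. rd.lvm.vg=...), so every option maps to a
    list. Bare flags without '=' (e.g. threadirqs) get the value None.
    """
    params: Dict[str, List[Optional[str]]] = defaultdict(list)
    for token in shlex.split(cmdline):
        # Split on the first '=' only; values such as ip=... contain more.
        key, sep, value = token.partition("=")
        params[key].append(value if sep else None)
    return dict(params)
```

On a running system the input string would typically be the contents of /proc/cmdline.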
Kind regards,
Udo
* Re: 5.4+: PAGE FAULT crashes the system multiple times per 24h
2020-02-10 16:24 ` Udo van den Heuvel
@ 2020-02-10 17:01 ` Gabriel C
2020-02-11 2:56 ` Udo van den Heuvel
2020-02-11 17:04 ` Udo van den Heuvel
From: Gabriel C @ 2020-02-10 17:01 UTC
To: Udo van den Heuvel; +Cc: linux-mm@vger.kernel.org
On Mon, 10 Feb 2020 at 17:25, Udo van den Heuvel
<udovdh@xs4all.nl> wrote:
>
> Hello Gabriel,
>
> Thank you kindly for your email and the links in there; I will most
> certainly look into those.
>
> On 10-02-2020 17:04, Gabriel C wrote:
> > I think first you should try to fix your amdgpu bug which is this one:
> > https://gitlab.freedesktop.org/drm/amd/issues/963
> >
> > And the fixes are the patchset there:
> > https://patchwork.freedesktop.org/series/72733/
>
> Thanks, will try those on 5.5.2.
>
> > Also, can you try booting without all these crazy options?
>
> What is crazy here?
> Each one has a story.
>
Sure, I'm not saying to not use these.
But try to boot a kernel with only what you need to boot when hunting bugs.
As an example, if such a kernel works then you know for sure that one of
the options, or a combination of them, causes the bug.
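The narrowing-down described here can be done by halving the list of optional parameters on each boot. A minimal sketch, assuming exactly one option is responsible and that the crash depends only on that option being present (combinations of options are not covered); `find_culprit` and the `crashes` callback are hypothetical names, and in practice `crashes` means "boot with these options plus whatever is needed to reach the root fs, then watch for the page fault":

```python
from typing import Callable, List, Optional, Sequence


def find_culprit(
    options: Sequence[str],
    crashes: Callable[[List[str]], bool],
) -> Optional[str]:
    """Binary-search for the single option whose presence triggers a crash.

    Returns None if booting with all options does not crash at all.
    """
    candidates = list(options)
    if not crashes(candidates):
        return None
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if crashes(first_half):
            # The culprit is among the options we just booted with.
            candidates = first_half
        else:
            # Otherwise it must be in the half we left out.
            candidates = candidates[middle:]
    return candidates[0]
```

With the roughly two dozen options listed earlier in the thread, this needs about five test boots instead of one per option.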
> > As an example why would you need to force ACPI on your HW?
>
> Force?
> Because then I can be certain it will be there, this has been there for
> quite a while.
> Or would you suggest I run my x86_64 without acpi? (I am not an expert
> in this area yet)
The force parameter is used to try to enable ACPI on HW that has it OFF by
default; you don't need that.
....
> rd.luks.options=discard
>
> We want to use discard on our ssd's.
Use mount options?
> elevator=mq-deadline
> We want a different scheduler for ssd versus hdd.
If you really want that you should use udev rules for SSD/NVME/HDD/USB etc.
BR,
Gabriel C.
* Re: 5.4+: PAGE FAULT crashes the system multiple times per 24h
2020-02-10 17:01 ` Gabriel C
@ 2020-02-11 2:56 ` Udo van den Heuvel
2020-02-11 17:04 ` Udo van den Heuvel
From: Udo van den Heuvel @ 2020-02-11 2:56 UTC
To: Gabriel C; +Cc: linux-mm@vger.kernel.org
On 10-02-2020 18:01, Gabriel C wrote:
>> rd.luks.options=discard
>>
>> We want to use discard on our ssd's.
>
> Use mount options?
Not enough to make it work.
>
>> elevator=mq-deadline
>> We want a different scheduler for ssd versus hdd.
>
> If you really want that you should use udev rules for SSD/NVME/HDD/USB etc.
Simply loading the scheduler module and setting the scheduler in rc.local
is easier.
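Setting the scheduler at boot comes down to writing a scheduler name into each block device's sysfs queue directory. A minimal sketch of that logic, written in Python rather than as an rc.local shell snippet; the policy (mq-deadline for non-rotational devices, bfq for rotational ones) is an assumption for illustration and is not stated anywhere in this thread, and `choose_scheduler` and `apply_schedulers` are hypothetical names:

```python
from pathlib import Path
from typing import Dict


def choose_scheduler(rotational_flag: str) -> str:
    """Pick a scheduler from the contents of queue/rotational.

    "1" means a spinning disk, "0" an SSD or NVMe device.
    The mapping itself is an assumed example policy.
    """
    return "bfq" if rotational_flag.strip() == "1" else "mq-deadline"


def apply_schedulers(sys_block: Path = Path("/sys/block")) -> Dict[str, str]:
    """Write the chosen scheduler for every device under sys_block.

    Needs root on a real system, and the write raises OSError if the
    requested scheduler module is not loaded.
    """
    chosen: Dict[str, str] = {}
    for dev in sorted(sys_block.iterdir()):
        rotational = dev / "queue" / "rotational"
        scheduler = dev / "queue" / "scheduler"
        if not (rotational.is_file() and scheduler.is_file()):
            continue  # not a device with a request queue
        name = choose_scheduler(rotational.read_text())
        scheduler.write_text(name)
        chosen[dev.name] = name
    return chosen
```

The `sys_block` parameter exists only so the function can be tried against a scratch directory before pointing it at the real /sys/block.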
Udo
* Re: 5.4+: PAGE FAULT crashes the system multiple times per 24h
2020-02-10 17:01 ` Gabriel C
2020-02-11 2:56 ` Udo van den Heuvel
@ 2020-02-11 17:04 ` Udo van den Heuvel
From: Udo van den Heuvel @ 2020-02-11 17:04 UTC
Cc: linux-mm@vger.kernel.org
On 10-02-2020 18:01, Gabriel C wrote:
> But try to boot a kernel with only what you need to boot when hunting bugs.
> As an example, if such a kernel works then you know for sure that one of
> the options, or a combination of them, causes the bug.
These options are reasonable and necessary; so far things worked OK.
So why would they start being an issue?
And how can I even proceed when the kernel cannot find a rootfs anymore
while bisecting?
5.5.2 also has the page fault issue.
So why Linus calls 5.5.x 'stable' is beyond me.
How can I continue and find the root cause for the page fault hang?
> The force parameter is used to try to enable ACPI on HW that has it OFF by
> default; you don't need that.
I booted 5.5.3 without acpi=force, and the dmesg output with `acpi` in it
looks similar.
So acpi=force will be removed from future kernel command lines.
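A comparison like this one can be made less eyeball-dependent by stripping the timestamps and diffing only the ACPI-related lines of two saved dmesg outputs. A minimal sketch, assuming the default `[seconds.micros]` timestamp prefix; `acpi_lines` and `acpi_diff` are hypothetical helper names:

```python
import difflib
import re
from typing import List

# Matches the default dmesg prefix, e.g. "[    0.123456] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def acpi_lines(dmesg_text: str) -> List[str]:
    """Return the lines mentioning ACPI (any case), without timestamps."""
    return [
        _TIMESTAMP.sub("", line)
        for line in dmesg_text.splitlines()
        if "acpi" in line.lower()
    ]


def acpi_diff(with_force: str, without_force: str) -> List[str]:
    """Unified diff of the ACPI lines of two dmesg captures.

    An empty list means the ACPI-related output is identical.
    """
    return list(
        difflib.unified_diff(
            acpi_lines(with_force),
            acpi_lines(without_force),
            fromfile="acpi=force",
            tofile="default",
            lineterm="",
        )
    )
```

The two inputs would be the text of `dmesg` saved after each boot; an empty result supports dropping the parameter.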
>> We want to use discard on our ssd's.
>
> Use mount options?
Not enough to make it work for LUKS.
>> elevator=mq-deadline
>> We want a different scheduler for ssd versus hdd.
>
> If you really want that you should use udev rules for SSD/NVME/HDD/USB etc.
/etc/rc.d/rc.local is easier.
Look at the overhead of a service file.
Same as the overhead of NetworkManager versus a few kilobytes of
network-scripts.
But Fedora thinks otherwise....
Kind regards,
Udo
* 5.4+: PAGE FAULT crashes the system multiple times per 24h
@ 2020-02-09 8:39 Udo van den Heuvel
From: Udo van den Heuvel @ 2020-02-09 8:39 UTC
To: linux-kernel
Hello,
For bug https://bugzilla.kernel.org/show_bug.cgi?id=206191 I have been
bisecting away, but now the process has landed me with a kernel that cannot
find the root fs (with either good or bad bisect choices).
Pictures of the crash that is the reason for this bisect:
https://bugzilla.kernel.org/attachment.cgi?id=286787
https://bugzilla.kernel.org/attachment.cgi?id=286789
https://bugzilla.kernel.org/attachment.cgi?id=286791
https://bugzilla.kernel.org/attachment.cgi?id=286793
How can I proceed from here with the bisecting?
Did someone perhaps find the root cause for the page fault?
As the crash is fairly easy to reproduce I can test patches...
Please let me know!
Kind regards,
Udo