From mboxrd@z Thu Jan 1 00:00:00 1970
From: Julian Margetson
Subject: Re: [BUG/REGRESSION] Kernel 4.5-rc1 on Acube Sam460ex AMCC 460ex Power PC motherboards
Date: Wed, 27 Jan 2016 06:18:35 -0400
Message-ID: <56A8997B.3090400@candw.ms>
References: <55592246.5090505@candw.ms> <555A25D7.7070000@candw.ms> <56815C0F.4010207@candw.ms> <56A173FA.2090405@candw.ms> <56A1E59B.5060200@vodafone.de> <56A6190D.3090609@candw.ms> <56A68702.5040909@candw.ms>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2073382003=="
Return-path:
Received: from smtp754.redcondor.net (smtp754.redcondor.net [208.80.206.54]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1600F6E6AF for ; Wed, 27 Jan 2016 02:19:04 -0800 (PST)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: Dan Williams
Cc: Dave Hansen, Maling list - DRI developers, Alex Perez, Christian Zigotzky
List-Id: dri-devel@lists.freedesktop.org

This is a multi-part message in MIME format.
--===============2073382003==
Content-Type: multipart/alternative; boundary="------------020406010300060605080206"

This is a multi-part message in MIME format.
--------------020406010300060605080206
Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: 7bit

On 1/26/2016 9:43 PM, Dan Williams wrote:
> On Mon, Jan 25, 2016 at 12:35 PM, Julian Margetson <runaway@candw.ms> wrote:
>> On 1/25/2016 3:20 PM, Dan Williams wrote:
> [..]
>> Hmm, this commit could only cause a behavior change if it modifies the
>> value of the pfn as seen by insert_pfn().  Can you try the attached
>> debug patch to see if that assumption is being violated?
>>
>> Had to manually delete the lines in the second part of the patch.
> Sorry about that, I had based it directly on that failing commit rather
> than 4.5-rc1.  A reflowed version is attached.
>
>> [   42.557813] Oops: Machine check, sig: 7 [#1]
>> [   42.562350] PREEMPT Canyonlands
>> [   42.565692] Modules linked in:
>> [   42.568933] CPU: 0 PID: 495 Comm: Xorg Tainted: G        W       4.5.0-rc1-Sam460ex #1
>> [   42.577291] task: ee3adcc0 ti: ee260000 task.ti: ee260000
>> [   42.582984] NIP: 1ff72480 LR: 1ff72404 CTR: 1ff724d0
>> [   42.588220] REGS: ee261f10 TRAP: 0214   Tainted: G        W        (4.5.0-rc1-Sam460ex)
>> [   42.596663] MSR: 0002d000 <CE,EE,PR,ME>  CR: 24004242  XER: 00000000
>> [   42.603512]
>> GPR00: 1f436134 bfc4dac0 b79cb6f0 b718dffc b69a4008 00000780 00000004 00000000
>> GPR08: 00000000 b718dffc 00000000 bfc4da70 1ff72404 2080dff4 00000000 00000780
>> GPR16: 00000000 00000020 00000000 00000000 00001e00 20aaa620 00000438 b69a4008
>> GPR24: 00000780 bfc4db18 20a94760 b718e000 b718e000 b69a4008 2007aff4 00001e00
>> [   42.635363] NIP [1ff72480] 0x1ff72480
>> [   42.639225] LR [1ff72404] 0x1ff72404
>> [   42.642991] Call Trace:
>> [   42.798393] ---[ end trace 8fcfa5f0e9942055 ]---
> I'm not familiar with powerpc crash dumps, so there's not much
> information I can glean from this.  Can any folks on the cc translate
> a powerpc "Machine check"?
>
> I'm down to looking at differences between the passing and failing
> cases.  Can you print out the value of the pte entry and the pfn in
> insert_pfn, like the following:
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 30991f83d0bf..c44e387130b2 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1521,6 +1521,8 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>                 entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
>         else
>                 entry = pte_mkspecial(pfn_t_pte(pfn, prot));
> +       pr_info("%s: entry: %#llx pfn: %#lx\n", __func__,
> +                       (unsigned long long) entry, pfn_t_to_pfn(pfn));
>         set_pte_at(mm, addr, pte, entry);
>         update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */
>
> ...of course for the passing case you'll need to drop the call to
> pfn_t_to_pfn() and just print the pfn directly.
>
> Thank you for the help tracking this down, it's much appreciated.

Happy to help out. Just need some guidance sometimes as I am relatively new at this.
-----------------------------------------------------------------------------------------------------------------------------
[   15.802615] systemd[1]: Started Journal Service.
[   44.263074] Oops: Machine check, sig: 7 [#1]
[   44.267603] PREEMPT Canyonlands
[   44.270938] Modules linked in:
[   44.274182] CPU: 0 PID: 586 Comm: Xorg Tainted: G        W       4.5.0-rc1-Sam460ex #2
[   44.282538] task: ecd505c0 ti: efff2000 task.ti: ecd76000
[   44.288239] NIP: c0000cec LR: 1fb81404 CTR: 1fb814d0
[   44.293483] REGS: efff3f10 TRAP: 0214   Tainted: G        W        (4.5.0-rc1-Sam460ex)
[   44.301926] MSR: 00021000 <CE,ME>  CR: 84004242  XER: 00000000
[   44.308185]
GPR00: 1f045134 bfd0ce80 b7e7b6f0 b763dffc b6e54008 00000780 00000004 00000000
GPR08: 00000000 b763dffc b6e54010 ecf50000 ecf50000 00000009 00000000 00000780
GPR16: 00000000 00000020 00000000 00000000 00001e00 2079b638 00000438 b6e54008
GPR24: 00000780 bfd0ced8 20785770 b763e000 b763e000 b6e54008 1fc89ff4 00001e00
[   44.340039] NIP [c0000cec] DataTLBError44x+0x6c/0x90
[   44.345279] LR [1fb81404] 0x1fb81404
[   44.349053] Call Trace:
[   44.351631] Instruction dump:
[   44.354776] 7d7342a6 816b0040 7d92eaa6 7db00aa6 51ac063e 7d92eba6 7d9e0aa6 39a00009
[   44.363081] 518d57bc 554c6cfa 7d6c582e 556c0029 <4182003c> 514cbd38 816c0000 818c0004
[   44.524699] ---[ end trace 439fa29153308785 ]---
[   44.529322]
[   47.216536] insert_pfn: entry: 0x80ed246b pfn: 0x80ed2
[   47.221777] insert_pfn: entry: 0x80ed346b pfn: 0x80ed3
[   47.228485] insert_pfn: entry: 0x80ed446b pfn: 0x80ed4
[   47.237798] insert_pfn: entry: 0x80ed546b pfn: 0x80ed5
[   47.249809] insert_pfn: entry: 0x80ed646b pfn: 0x80ed6
[   47.257588] insert_pfn: entry: 0x80ed746b pfn: 0x80ed7
[   47.265879] insert_pfn: entry: 0x80ed846b pfn: 0x80ed8
[   47.275825] insert_pfn: entry: 0x80ed946b pfn: 0x80ed9
[   47.281437] insert_pfn: entry: 0x80eda46b pfn: 0x80eda
[   47.288113] insert_pfn: entry: 0x80edb46b pfn: 0x80edb
[   47.293660] insert_pfn: entry: 0x80edc46b pfn: 0x80edc
[   47.299834] insert_pfn: entry: 0x80edd46b pfn: 0x80edd
[   47.305223] insert_pfn: entry: 0x80ede46b pfn: 0x80ede
[   47.314891] insert_pfn: entry: 0x80edf46b pfn: 0x80edf
[   47.329777] insert_pfn: entry: 0x80ee046b pfn: 0x80ee0
[   47.339769] insert_pfn: entry: 0x80ee146b pfn: 0x80ee1
[   47.349777] Machine check in kernel mode.
[   47.353814] Data Write PLB Error
[   47.357049] Vector: 214  at [efff3f10]
[   47.360799]     pc: c0000cec: DataTLBError44x+0x6c/0x90
[   47.366085]     lr: 2008f404
[   47.369002]     sp: bfe76110
[   47.371885]    msr: 21000
[   47.374506]   current = 0xeced85c0
[   47.377910]     pid   = 668, comm = Xorg
[   47.381835] Linux version 4.5.0-rc1-Sam460ex (root@julian-VirtualBox) (gcc version 4.8.2 (Ubuntu 4.8.2-16ubuntu3) ) #2 PREEMPT Wed Jan 27 06:07:01 AST 2016
[   47.395758] enter ? for help
[   47.398638] mon>  <no input ...>
[   49.401927] Oops: Machine check, sig: 7 [#2]
[   49.406450] PREEMPT Canyonlands
[   49.409783] Modules linked in:
[   49.413026] CPU: 0 PID: 668 Comm: Xorg Tainted: G      D W       4.5.0-rc1-Sam460ex #2
[   49.421383] task: eced85c0 ti: efff2000 task.ti: ecf8c000
[   49.427075] NIP: c0000cec LR: 2008f404 CTR: 2008f4d0
[   49.432311] REGS: efff3f10 TRAP: 0214   Tainted: G      D W        (4.5.0-rc1-Sam460ex)
[   49.440755] MSR: 00021000 <CE,ME>  CR: 88004262  XER: 00000000
[   49.447013]
GPR00: 1f553134 bfe76110 b7d6d6f0 b752fffc b6d46008 00000780 00000004 00000000
GPR08: 00000000 b752fffc b6d46010 ecef9000 ecef9000 00000009 00000000 00000780
GPR16: 00000000 00000020 00000000 00000000 00001e00 20eb5650 00000438 b6d46008
GPR24: 00000780 bfe76168 20e9f728 b7530000 b7530000 b6d46008 20197ff4 00001e00
[   49.478867] NIP [c0000cec] DataTLBError44x+0x6c/0x90
[   49.484108] LR [2008f404] 0x2008f404
[   49.487881] Call Trace:
[   49.490460] Instruction dump:
[   49.493603] 7d7342a6 816b0040 7d92eaa6 7db00aa6 51ac063e 7d92eba6 7d9e0aa6 39a00009
[   49.501909] 518d57bc 554c6cfa 7d6c582e 556c0029 <4182003c> 514cbd38 816c0000 818c0004
[   49.510404] ---[ end trace 439fa29153308786 ]---
[   49.515026]


--------------020406010300060605080206--
--===============2073382003==--