* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes [not found] <20200902114222.181353-1-aneesh.kumar@linux.ibm.com> @ 2020-09-04 6:48 ` Anshuman Khandual 2020-09-04 15:26 ` Gerald Schaefer [not found] ` <20200902114222.181353-14-aneesh.kumar@linux.ibm.com> 1 sibling, 1 reply; 15+ messages in thread From: Anshuman Khandual @ 2020-09-04 6:48 UTC (permalink / raw) To: Aneesh Kumar K.V, linux-mm, akpm Cc: linux-s390, mpe, Vineet Gupta, linux-riscv, linux-snps-arc, linuxppc-dev, Gerald Schaefer On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: > This patch series includes fixes for debug_vm_pgtable test code so that > they follow page table updates rules correctly. The first two patches introduce > changes w.r.t ppc64. The patches are included in this series for completeness. We can > merge them via ppc64 tree if required. > > Hugetlb test is disabled on ppc64 because that needs larger change to satisfy > page table update rules. > > These tests are broken w.r.t page table update rules and results in kernel > crash as below. > > [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! > cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] > pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 > lr: c0000000005eeeec: pte_update+0x11c/0x190 > sp: c000000c6d1e7950 > msr: 8000000002029033 > current = 0xc000000c6d172c80 > paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 > pid = 1, comm = swapper/0 > kernel BUG at arch/powerpc/mm/pgtable.c:304! 
> [link register ] c0000000005eeeec pte_update+0x11c/0x190 > [c000000c6d1e7950] 0000000000000001 (unreliable) > [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 > [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 > [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 > [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 > [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 > [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > With DEBUG_VM disabled > > [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 > [ 20.530183] Faulting instruction address: 0xc0000000000df330 > cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] > pc: c0000000000df330: memset+0x68/0x104 > lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > sp: c000000c6d19f990 > msr: 8000000002009033 > dar: 0 > current = 0xc000000c6d177480 > paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 > pid = 1, comm = swapper/0 > [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) > [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 > [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 > [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 > [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 > [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > Changes from v3: > * Address review feedback > * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure. 
This version

- Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
- Runs on arm64 and x86 without any regression, at least none that I have noticed
- It would be great if this could get tested on s390, arc, riscv and ppc32 platforms as well

+ linux-riscv <linux-riscv@lists.infradead.org>
+ linux-snps-arc@lists.infradead.org <linux-snps-arc@lists.infradead.org>
+ linux-s390@vger.kernel.org
+ Gerald Schaefer <gerald.schaefer@de.ibm.com>
+ Vineet Gupta <vgupta@synopsys.com>

There is still an open git bisect issue on the arm64 platform which ideally should be fixed.

- Anshuman

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-04 6:48 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Anshuman Khandual @ 2020-09-04 15:26 ` Gerald Schaefer 2020-09-04 16:01 ` Gerald Schaefer 2020-09-09 8:08 ` Anshuman Khandual 0 siblings, 2 replies; 15+ messages in thread From: Gerald Schaefer @ 2020-09-04 15:26 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On Fri, 4 Sep 2020 12:18:05 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: > > This patch series includes fixes for debug_vm_pgtable test code so that > > they follow page table updates rules correctly. The first two patches introduce > > changes w.r.t ppc64. The patches are included in this series for completeness. We can > > merge them via ppc64 tree if required. > > > > Hugetlb test is disabled on ppc64 because that needs larger change to satisfy > > page table update rules. > > > > These tests are broken w.r.t page table update rules and results in kernel > > crash as below. > > > > [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! > > cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] > > pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 > > lr: c0000000005eeeec: pte_update+0x11c/0x190 > > sp: c000000c6d1e7950 > > msr: 8000000002029033 > > current = 0xc000000c6d172c80 > > paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 > > pid = 1, comm = swapper/0 > > kernel BUG at arch/powerpc/mm/pgtable.c:304! 
> > [link register ] c0000000005eeeec pte_update+0x11c/0x190 > > [c000000c6d1e7950] 0000000000000001 (unreliable) > > [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 > > [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 > > [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 > > [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > > [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 > > [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 > > [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > > > With DEBUG_VM disabled > > > > [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 > > [ 20.530183] Faulting instruction address: 0xc0000000000df330 > > cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] > > pc: c0000000000df330: memset+0x68/0x104 > > lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > > sp: c000000c6d19f990 > > msr: 8000000002009033 > > dar: 0 > > current = 0xc000000c6d177480 > > paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 > > pid = 1, comm = swapper/0 > > [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > > [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) > > [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 > > [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 > > [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > > [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 > > [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 > > [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > > > Changes from v3: > > * Address review feedback > > * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure. 
> > This version > > - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE) > - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed > - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well When I quickly tested v3, it worked fine, but now it turned out to only work fine "sometimes", both v3 and v4. I need to look into it further, but so far it seems related to the hugetlb_advanced_tests(). I guess there was already some discussion on this test, but we did not receive all of the thread(s). Please always add at least linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com> for further discussions. That being said, sorry for duplications, this might already have been discussed. Preliminary analysis showed that it only seems to go wrong for certain random vaddr values. I cannot make any sense of that yet, but what seems strange to me is that the hugetlb_advanced_tests() take a (real) pte_t pointer as input, and also use that for all kinds of operations (set_huge_pte_at, huge_ptep_get_and_clear, etc.). Although all the hugetlb code in the kernel is (mis)using pte_t pointers instead of the correct pmd/pud_t pointers like THP, that is just for historic reasons. The pointers will actually never point to a real pte_t (i.e. page table entry), but of course to a pmd or pud entry, depending on hugepage size. What is passed in as ptep to hugetlb_advanced_tests() seems to be the result from the previous ptep = pte_alloc_map(mm, pmdp, vaddr), so I would expect that it points to a real page table entry. Need to investigate further, but IIUC, using such a pointer for adding large pte entries (i.e. pmd/pud entries) at least feels very wrong to me, and I assume it is related to the issues we see on s390. We actually see different issues, e.g. 
once a panic directly in hugetlb_advanced_tests() ->
huge_ptep_get_and_clear(), but also indirect symptoms after
debug_vm_pgtable() completes, like this:

[ 10.533901] BUG task_struct (Not tainted): Padding overwritten. 0x0000000019f798c7-0x0000000019f798c7 @offset=30087

Last but not least, what I said about the pte vs. pmd/pud of course
also applies to the hugetlb_basic_tests(), although they do not use
a pte_t pointer directly, and in particular do not write to it. Still,
the pte_aligned pfn parameter is not guaranteed to also be pmd/pud
aligned, which doesn't feel right.

So, for now, until this is sorted out, I guess we also need to exclude
s390 at least from the hugetlb_advanced_tests(). The
hugetlb_basic_tests() seem to work fine so far (probably by chance :-))
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-04 15:26 ` Gerald Schaefer @ 2020-09-04 16:01 ` Gerald Schaefer 2020-09-04 17:53 ` Gerald Schaefer ` (2 more replies) 2020-09-09 8:08 ` Anshuman Khandual 1 sibling, 3 replies; 15+ messages in thread From: Gerald Schaefer @ 2020-09-04 16:01 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On Fri, 4 Sep 2020 17:26:47 +0200 Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > On Fri, 4 Sep 2020 12:18:05 +0530 > Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > > > > > On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: > > > This patch series includes fixes for debug_vm_pgtable test code so that > > > they follow page table updates rules correctly. The first two patches introduce > > > changes w.r.t ppc64. The patches are included in this series for completeness. We can > > > merge them via ppc64 tree if required. > > > > > > Hugetlb test is disabled on ppc64 because that needs larger change to satisfy > > > page table update rules. > > > > > > These tests are broken w.r.t page table update rules and results in kernel > > > crash as below. > > > > > > [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! > > > cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] > > > pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 > > > lr: c0000000005eeeec: pte_update+0x11c/0x190 > > > sp: c000000c6d1e7950 > > > msr: 8000000002029033 > > > current = 0xc000000c6d172c80 > > > paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 > > > pid = 1, comm = swapper/0 > > > kernel BUG at arch/powerpc/mm/pgtable.c:304! 
> > > [link register ] c0000000005eeeec pte_update+0x11c/0x190 > > > [c000000c6d1e7950] 0000000000000001 (unreliable) > > > [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 > > > [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 > > > [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 > > > [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > > > [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 > > > [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 > > > [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > > > > > With DEBUG_VM disabled > > > > > > [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 > > > [ 20.530183] Faulting instruction address: 0xc0000000000df330 > > > cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] > > > pc: c0000000000df330: memset+0x68/0x104 > > > lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > > > sp: c000000c6d19f990 > > > msr: 8000000002009033 > > > dar: 0 > > > current = 0xc000000c6d177480 > > > paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 > > > pid = 1, comm = swapper/0 > > > [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > > > [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) > > > [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 > > > [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 > > > [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > > > [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 > > > [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 > > > [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > > > > > Changes from v3: > > > * Address review feedback > > > * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure. 
> >
> > This version
> >
> > - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
> > - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
> > - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well
>
> When I quickly tested v3, it worked fine, but now it turned out to
> only work fine "sometimes", both v3 and v4. I need to look into it
> further, but so far it seems related to the hugetlb_advanced_tests().
>
> I guess there was already some discussion on this test, but we did
> not receive all of the thread(s). Please always add at least
> linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
> for further discussions.

BTW, by "myself" I mean the new address gerald.schaefer@linux.ibm.com.
The old gerald.schaefer@de.ibm.com seems to work (again), but is not
very reliable.

BTW2, a quick test with this change (so far) made the issues on s390
go away:

@@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
 	spin_unlock(ptl);

 #ifndef CONFIG_PPC_BOOK3S_64
-	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+	hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
 #endif

 	spin_lock(&mm->page_table_lock);

That would better match the "pte_t pointer" usage for hugetlb code,
i.e. just cast a pmd_t pointer to it. I also changed to pmd_aligned,
but I think the root cause is the pte_t pointer.

I am not entirely sure, though, whether that would really be the
correct fix. I somehow lost whatever little track I had of what these
tests really want to check, and whether that would still be valid with
that change.
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-04 16:01 ` Gerald Schaefer @ 2020-09-04 17:53 ` Gerald Schaefer 2020-09-09 8:38 ` Anshuman Khandual 2020-09-08 15:39 ` Gerald Schaefer 2020-09-09 8:15 ` Anshuman Khandual 2 siblings, 1 reply; 15+ messages in thread From: Gerald Schaefer @ 2020-09-04 17:53 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On Fri, 4 Sep 2020 18:01:15 +0200 Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > On Fri, 4 Sep 2020 17:26:47 +0200 > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > > > On Fri, 4 Sep 2020 12:18:05 +0530 > > Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > > > > > > > > > On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: > > > > This patch series includes fixes for debug_vm_pgtable test code so that > > > > they follow page table updates rules correctly. The first two patches introduce > > > > changes w.r.t ppc64. The patches are included in this series for completeness. We can > > > > merge them via ppc64 tree if required. > > > > > > > > Hugetlb test is disabled on ppc64 because that needs larger change to satisfy > > > > page table update rules. > > > > > > > > These tests are broken w.r.t page table update rules and results in kernel > > > > crash as below. > > > > > > > > [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! > > > > cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] > > > > pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 > > > > lr: c0000000005eeeec: pte_update+0x11c/0x190 > > > > sp: c000000c6d1e7950 > > > > msr: 8000000002029033 > > > > current = 0xc000000c6d172c80 > > > > paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 > > > > pid = 1, comm = swapper/0 > > > > kernel BUG at arch/powerpc/mm/pgtable.c:304! 
> > > > [link register ] c0000000005eeeec pte_update+0x11c/0x190 > > > > [c000000c6d1e7950] 0000000000000001 (unreliable) > > > > [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 > > > > [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 > > > > [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 > > > > [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > > > > [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 > > > > [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 > > > > [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > > > > > > > With DEBUG_VM disabled > > > > > > > > [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 > > > > [ 20.530183] Faulting instruction address: 0xc0000000000df330 > > > > cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] > > > > pc: c0000000000df330: memset+0x68/0x104 > > > > lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > > > > sp: c000000c6d19f990 > > > > msr: 8000000002009033 > > > > dar: 0 > > > > current = 0xc000000c6d177480 > > > > paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 > > > > pid = 1, comm = swapper/0 > > > > [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > > > > [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) > > > > [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 > > > > [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 > > > > [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > > > > [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 > > > > [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 > > > > [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > > > > > > > > Changes from v3: > > > > * Address review feedback > > > > * Move page table depost and withdraw patch after adding pmdlock to avoid 
bisect failure. > > > > > > This version > > > > > > - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE) > > > - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed > > > - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well > > > > When I quickly tested v3, it worked fine, but now it turned out to > > only work fine "sometimes", both v3 and v4. I need to look into it > > further, but so far it seems related to the hugetlb_advanced_tests(). > > > > I guess there was already some discussion on this test, but we did > > not receive all of the thread(s). Please always add at least > > linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com> > > for further discussions. > > BTW, with myself I mean the new address gerald.schaefer@linux.ibm.com. > The old gerald.schaefer@de.ibm.com seems to work (again), but is not > very reliable. > > BTW2, a quick test with this change (so far) made the issues on s390 > go away: > > @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void) > spin_unlock(ptl); > > #ifndef CONFIG_PPC_BOOK3S_64 > - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot); > + hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot); > #endif > > spin_lock(&mm->page_table_lock); > > That would more match the "pte_t pointer" usage for hugetlb code, > i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned, > but I think the root cause is the pte_t pointer. > > Not entirely sure though if that would really be the correct fix. > I somehow lost whatever little track I had about what these tests > really want to check, and if that would still be valid with that > change. Another potential issue, apparently not for s390, but maybe for others, is that the vaddr passed to hugetlb_advanced_tests() is also not pmd/pud size aligned, like you did in pmd/pud_advanced_tests(). 
I guess for the hugetlb_advanced_tests() you need to choose whether
you want to test pmd or pud hugepages, and prepare the *ptep, pfn and
vaddr input accordingly. If you only check for CONFIG_HUGETLB_PAGE,
then probably only pmd hugepages would be safe, since there might be
architectures that support only one hugepage size.

So, for s390, at least the ptep input value is a problem. I still
need to better understand how it goes wrong, but it seems to be
fixed when using a proper pmdp, and it also works with pudp.

For others, especially the apparent issues on ppc64, the
non-hugepage-aligned input pfn and vaddr might also be an issue,
e.g. power at least seems to use the vaddr in its set_huge_pte_at()
implementation for some pmd_off(mm, addr) calculation.

Again, sorry if this was already discussed, I missed most of it
and honestly didn't properly look at the scarce mails that we did
receive...
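To make the alignment point above concrete, here is a minimal sketch in plain C (user space, not kernel code). It assumes 4 KiB base pages and 2 MiB pmd-sized hugepages; the EX_* constants and ex_* helpers are hypothetical names used only for illustration, and the real sizes and helpers are architecture-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for illustration only: 4 KiB pages, 2 MiB pmd hugepages. */
#define EX_PAGE_SHIFT 12
#define EX_PMD_SHIFT  21
#define EX_PMD_SIZE   (UINT64_C(1) << EX_PMD_SHIFT)
#define EX_PMD_MASK   (~(EX_PMD_SIZE - 1))

/* Round a virtual address down to the start of its pmd-sized region. */
static inline uint64_t ex_pmd_align_vaddr(uint64_t vaddr)
{
    return vaddr & EX_PMD_MASK;
}

/* Round a page frame number down to the first pfn of its pmd-sized region. */
static inline uint64_t ex_pmd_align_pfn(uint64_t pfn)
{
    uint64_t pfns_per_pmd = UINT64_C(1) << (EX_PMD_SHIFT - EX_PAGE_SHIFT);

    /* The mask trick below only works for power-of-two region sizes. */
    assert((pfns_per_pmd & (pfns_per_pmd - 1)) == 0);
    return pfn & ~(pfns_per_pmd - 1);
}

/* Return 1 if the address sits exactly on a pmd-sized boundary. */
static inline int ex_is_pmd_aligned(uint64_t vaddr)
{
    return (vaddr & (EX_PMD_SIZE - 1)) == 0;
}
```

With these assumed sizes, a page-aligned but otherwise arbitrary vaddr will usually fail ex_is_pmd_aligned(), which is the situation described above for the vaddr and pfn handed to hugetlb_advanced_tests().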
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-04 17:53 ` Gerald Schaefer @ 2020-09-09 8:38 ` Anshuman Khandual 0 siblings, 0 replies; 15+ messages in thread From: Anshuman Khandual @ 2020-09-09 8:38 UTC (permalink / raw) To: Gerald Schaefer Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On 09/04/2020 11:23 PM, Gerald Schaefer wrote: > On Fri, 4 Sep 2020 18:01:15 +0200 > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > >> On Fri, 4 Sep 2020 17:26:47 +0200 >> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: >> >>> On Fri, 4 Sep 2020 12:18:05 +0530 >>> Anshuman Khandual <anshuman.khandual@arm.com> wrote: >>> >>>> >>>> >>>> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: >>>>> This patch series includes fixes for debug_vm_pgtable test code so that >>>>> they follow page table updates rules correctly. The first two patches introduce >>>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can >>>>> merge them via ppc64 tree if required. >>>>> >>>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy >>>>> page table update rules. >>>>> >>>>> These tests are broken w.r.t page table update rules and results in kernel >>>>> crash as below. >>>>> >>>>> [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! >>>>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] >>>>> pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 >>>>> lr: c0000000005eeeec: pte_update+0x11c/0x190 >>>>> sp: c000000c6d1e7950 >>>>> msr: 8000000002029033 >>>>> current = 0xc000000c6d172c80 >>>>> paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 >>>>> pid = 1, comm = swapper/0 >>>>> kernel BUG at arch/powerpc/mm/pgtable.c:304! 
>>>>> [link register ] c0000000005eeeec pte_update+0x11c/0x190 >>>>> [c000000c6d1e7950] 0000000000000001 (unreliable) >>>>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 >>>>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 >>>>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 >>>>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 >>>>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 >>>>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 >>>>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c >>>>> >>>>> With DEBUG_VM disabled >>>>> >>>>> [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 >>>>> [ 20.530183] Faulting instruction address: 0xc0000000000df330 >>>>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] >>>>> pc: c0000000000df330: memset+0x68/0x104 >>>>> lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 >>>>> sp: c000000c6d19f990 >>>>> msr: 8000000002009033 >>>>> dar: 0 >>>>> current = 0xc000000c6d177480 >>>>> paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 >>>>> pid = 1, comm = swapper/0 >>>>> [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 >>>>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) >>>>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 >>>>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 >>>>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 >>>>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 >>>>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 >>>>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c >>>>> >>>>> Changes from v3: >>>>> * Address review feedback >>>>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure. 
>>>> >>>> This version >>>> >>>> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE) >>>> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed >>>> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well >>> >>> When I quickly tested v3, it worked fine, but now it turned out to >>> only work fine "sometimes", both v3 and v4. I need to look into it >>> further, but so far it seems related to the hugetlb_advanced_tests(). >>> >>> I guess there was already some discussion on this test, but we did >>> not receive all of the thread(s). Please always add at least >>> linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com> >>> for further discussions. >> >> BTW, with myself I mean the new address gerald.schaefer@linux.ibm.com. >> The old gerald.schaefer@de.ibm.com seems to work (again), but is not >> very reliable. >> >> BTW2, a quick test with this change (so far) made the issues on s390 >> go away: >> >> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void) >> spin_unlock(ptl); >> >> #ifndef CONFIG_PPC_BOOK3S_64 >> - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot); >> + hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot); >> #endif >> >> spin_lock(&mm->page_table_lock); >> >> That would more match the "pte_t pointer" usage for hugetlb code, >> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned, >> but I think the root cause is the pte_t pointer. >> >> Not entirely sure though if that would really be the correct fix. >> I somehow lost whatever little track I had about what these tests >> really want to check, and if that would still be valid with that >> change. > > Another potential issue, apparently not for s390, but maybe for > others, is that the vaddr passed to hugetlb_advanced_tests() is > also not pmd/pud size aligned, like you did in pmd/pud_advanced_tests(). 
>
> I guess for the hugetlb_advanced_tests() you need to choose if
> you want to test pmd or pud hugepages, and accordingly prepare
> the *ptep, pfn and vaddr input. If you only check for CONFIG_HUGETLB_PAGE,
> then probably only pmd hugepages would be safe, there might be
> architectures only supporting one hugepage size.

I guess preparing for PMD based HugeTLB tests should be sufficient for
now; the tests can be extended later on to cover other levels.

>
> So, for s390, at least the ptep input value is a problem. Still
> need to better understand how it goes wrong, but it seems to be
> fixed when using proper pmdp, and also works with pudp.
>
> For others, especially the apparent issues on ppc64, the other
> non-hugepage aligned input pfn and vaddr might also be an issue,
> e.g. power at least seems to use the vaddr in its set_huge_pte_at()
> implementation for some pmd_off(mm, addr) calculation.
>
> Again, sorry if this was already discussed, I missed most of it
> and honestly didn't properly look at the scarce mails that we did
> receive...

Sure, I will consider these points and try to improve the tests afterwards.
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-04 16:01 ` Gerald Schaefer 2020-09-04 17:53 ` Gerald Schaefer @ 2020-09-08 15:39 ` Gerald Schaefer 2020-09-09 6:08 ` Aneesh Kumar K.V 2020-09-09 8:15 ` Anshuman Khandual 2 siblings, 1 reply; 15+ messages in thread From: Gerald Schaefer @ 2020-09-08 15:39 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On Fri, 4 Sep 2020 18:01:15 +0200 Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: [...] > > BTW2, a quick test with this change (so far) made the issues on s390 > go away: > > @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void) > spin_unlock(ptl); > > #ifndef CONFIG_PPC_BOOK3S_64 > - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot); > + hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot); > #endif > > spin_lock(&mm->page_table_lock); > > That would more match the "pte_t pointer" usage for hugetlb code, > i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned, > but I think the root cause is the pte_t pointer. > > Not entirely sure though if that would really be the correct fix. > I somehow lost whatever little track I had about what these tests > really want to check, and if that would still be valid with that > change. Uh oh, wasn't aware that this (or some predecessor) already went upstream, and broke our debug kernel today. I found out now what goes (horribly) wrong on s390, see below for more details. In short, using hugetlb primitives with ptep pointers that do _not_ point to a pmd or pud entry will not work on s390. It also seems to make no sense to verify / test such a thing in general, as it would also be a severe bug if any kernel code would do that. After all, with hugepages, there are no pte tables, only pmd etc. tables. 
My change above would fix the issue for s390, but I still cannot
completely judge whether it would break other things for your tests.
In general, for normal kernel code, much of what you do would be very
broken, but I guess your tests are doing such "special" things because
they can. E.g. because they operate on some "sandbox" mm and page
tables, and you also do not need properly populated page tables for
some exit / free cleanup, you just throw them away explicitly with
pXd_free at the end.

So it might just be "the right thing" to pass a casted pmd pointer to
hugetlb_advanced_tests(), to simulate and test (proper) usage of the
hugetlb primitives. I also see no other way to make this work for
s390 than using a proper pmd/pud pointer. If that is not possible,
please add us to the #ifndef.

So, for all those interested, here is what goes wrong on s390.
huge_ptep_get_and_clear() uses the "idte" instruction for the clearing
(and TLB invalidation) part. That instruction expects a "region or
segment table" origin, which is a pmd/pud/p4d/pgd, but not a pte table.
Even worse, when we calculate the table origin from the given ptep
(which *should* not point to a pte), due to different table sizes for
pte / pXd tables, we end up at some place before the given pte table.

The "idte" instruction also gets the virtual address, and does the
corresponding index addition to the given table origin. Depending on
the pmd_index, we then either end up within the pte table again, in
which case we see a panic because idte complains about seeing a pte
value, or, if we are unlucky, we end up outside the pte table. In that
case, depending on the content of that memory location, idte might
succeed, effectively corrupting that memory.

That explains why we only see the panic sometimes, depending on the
random vaddr, other symptoms at other times, and probably completely
silent memory corruption for the rest...
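The origin arithmetic described above can be illustrated with a small toy model in plain C. This is only a sketch under assumptions that are not stated in this thread: 8-byte entries, 2 KiB pte tables, 2048-entry pmd tables covering 1 MiB segments, and a table origin that is rounded down to a 4 KiB boundary before the index is added back. All toy_* names are hypothetical, and nothing here executes or emulates the real idte instruction.

```c
#include <assert.h>
#include <stdint.h>

/* All values below are assumptions made for this toy model. */
#define TOY_ENTRY_SIZE     UINT64_C(8)     /* bytes per table entry */
#define TOY_PTE_TABLE_SIZE UINT64_C(2048)  /* bytes per pte table */
#define TOY_PMD_ENTRIES    UINT64_C(2048)  /* entries per pmd table */
#define TOY_PMD_SHIFT      20              /* 1 MiB per pmd entry */
#define TOY_ORIGIN_ALIGN   UINT64_C(4096)  /* assumed origin alignment */

/* Index of the pmd entry that covers vaddr. */
static inline uint64_t toy_pmd_index(uint64_t vaddr)
{
    return (vaddr >> TOY_PMD_SHIFT) & (TOY_PMD_ENTRIES - 1);
}

/* Table origin derived from an entry pointer, assuming it is a pmd entry. */
static inline uint64_t toy_table_origin(uint64_t entry_addr, uint64_t vaddr)
{
    return entry_addr - toy_pmd_index(vaddr) * TOY_ENTRY_SIZE;
}

/* Address the model ends up touching: aligned origin plus index offset. */
static inline uint64_t toy_touched_addr(uint64_t entry_addr, uint64_t vaddr)
{
    uint64_t origin = toy_table_origin(entry_addr, vaddr);

    origin &= ~(TOY_ORIGIN_ALIGN - 1);
    return origin + toy_pmd_index(vaddr) * TOY_ENTRY_SIZE;
}

/* Return 1 if addr lies inside the table [base, base + size). */
static inline int toy_inside_table(uint64_t addr, uint64_t base, uint64_t size)
{
    assert(size > 0);
    return addr >= base && addr < base + size;
}
```

In this model, a real pmd entry pointer inside a suitably aligned pmd table always maps back onto itself. A pte entry pointer, in contrast, can map to a different slot of the same pte table or to memory before that table, depending on the pmd index, which mirrors the "sometimes panic, sometimes silent corruption" behaviour described above.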
_______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-08 15:39 ` Gerald Schaefer @ 2020-09-09 6:08 ` Aneesh Kumar K.V 2020-09-09 11:16 ` Gerald Schaefer 0 siblings, 1 reply; 15+ messages in thread From: Aneesh Kumar K.V @ 2020-09-09 6:08 UTC (permalink / raw) To: Gerald Schaefer, Anshuman Khandual Cc: linux-s390, mpe, linux-mm, Vineet Gupta, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

Gerald Schaefer <gerald.schaefer@linux.ibm.com> writes:

> On Fri, 4 Sep 2020 18:01:15 +0200
> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
>
> [...]
>>
>> BTW2, a quick test with this change (so far) made the issues on s390
>> go away:
>>
>> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
>> spin_unlock(ptl);
>>
>> #ifndef CONFIG_PPC_BOOK3S_64
>> - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>> + hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
>> #endif
>>
>> spin_lock(&mm->page_table_lock);
>>
>> That would more match the "pte_t pointer" usage for hugetlb code,
>> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
>> but I think the root cause is the pte_t pointer.
>>
>> Not entirely sure though if that would really be the correct fix.
>> I somehow lost whatever little track I had about what these tests
>> really want to check, and if that would still be valid with that
>> change.
>
> Uh oh, wasn't aware that this (or some predecessor) already went
> upstream, and broke our debug kernel today.

Not sure I followed the above. Are you seeing the s390 kernel crash after this patch series, or with the original patchset? As noted in my patch, the hugetlb test is broken and we should fix that. A quick fix is to comment out that test for s390 too, as I have done for PPC64.
-aneesh
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-09 6:08 ` Aneesh Kumar K.V @ 2020-09-09 11:16 ` Gerald Schaefer 0 siblings, 0 replies; 15+ messages in thread From: Gerald Schaefer @ 2020-09-09 11:16 UTC (permalink / raw) To: Aneesh Kumar K.V Cc: linux-s390, Anshuman Khandual, mpe, linux-mm, Vineet Gupta, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On Wed, 09 Sep 2020 11:38:39 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote: > Gerald Schaefer <gerald.schaefer@linux.ibm.com> writes: > > > On Fri, 4 Sep 2020 18:01:15 +0200 > > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > > > > [...] > >> > >> BTW2, a quick test with this change (so far) made the issues on s390 > >> go away: > >> > >> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void) > >> spin_unlock(ptl); > >> > >> #ifndef CONFIG_PPC_BOOK3S_64 > >> - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot); > >> + hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot); > >> #endif > >> > >> spin_lock(&mm->page_table_lock); > >> > >> That would more match the "pte_t pointer" usage for hugetlb code, > >> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned, > >> but I think the root cause is the pte_t pointer. > >> > >> Not entirely sure though if that would really be the correct fix. > >> I somehow lost whatever little track I had about what these tests > >> really want to check, and if that would still be valid with that > >> change. > > > > Uh oh, wasn't aware that this (or some predecessor) already went > > upstream, and broke our debug kernel today. > > Not sure i followed the above. Are you finding that s390 kernel crash > after this patch series or the original patchset? As noted in my patch > the hugetlb test is broken and we should fix that. A quick fix is to > comment out that test for s390 too as i have done for PPC64. 
We see it with both. It has basically been broken for as long as there has been a hugetlb test using real pte pointers. It doesn't always show, depending on the random vaddr, so it slipped through earlier testing.

I guess we would also have had a chance or two to notice this earlier, through better review or more careful reading of previous mails. I must admit that I neglected this a bit.
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-04 16:01 ` Gerald Schaefer 2020-09-04 17:53 ` Gerald Schaefer 2020-09-08 15:39 ` Gerald Schaefer @ 2020-09-09 8:15 ` Anshuman Khandual 2020-09-09 11:10 ` Gerald Schaefer 2 siblings, 1 reply; 15+ messages in thread From: Anshuman Khandual @ 2020-09-09 8:15 UTC (permalink / raw) To: Gerald Schaefer Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On 09/04/2020 09:31 PM, Gerald Schaefer wrote: > On Fri, 4 Sep 2020 17:26:47 +0200 > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > >> On Fri, 4 Sep 2020 12:18:05 +0530 >> Anshuman Khandual <anshuman.khandual@arm.com> wrote: >> >>> >>> >>> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: >>>> This patch series includes fixes for debug_vm_pgtable test code so that >>>> they follow page table updates rules correctly. The first two patches introduce >>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can >>>> merge them via ppc64 tree if required. >>>> >>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy >>>> page table update rules. >>>> >>>> These tests are broken w.r.t page table update rules and results in kernel >>>> crash as below. >>>> >>>> [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! >>>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] >>>> pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 >>>> lr: c0000000005eeeec: pte_update+0x11c/0x190 >>>> sp: c000000c6d1e7950 >>>> msr: 8000000002029033 >>>> current = 0xc000000c6d172c80 >>>> paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 >>>> pid = 1, comm = swapper/0 >>>> kernel BUG at arch/powerpc/mm/pgtable.c:304! 
>>>> [link register ] c0000000005eeeec pte_update+0x11c/0x190 >>>> [c000000c6d1e7950] 0000000000000001 (unreliable) >>>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 >>>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 >>>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 >>>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 >>>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 >>>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 >>>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c >>>> >>>> With DEBUG_VM disabled >>>> >>>> [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 >>>> [ 20.530183] Faulting instruction address: 0xc0000000000df330 >>>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] >>>> pc: c0000000000df330: memset+0x68/0x104 >>>> lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 >>>> sp: c000000c6d19f990 >>>> msr: 8000000002009033 >>>> dar: 0 >>>> current = 0xc000000c6d177480 >>>> paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 >>>> pid = 1, comm = swapper/0 >>>> [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 >>>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) >>>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 >>>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 >>>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 >>>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 >>>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 >>>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c >>>> >>>> Changes from v3: >>>> * Address review feedback >>>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure. 
>>> >>> This version >>> >>> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE) >>> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed >>> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well >> >> When I quickly tested v3, it worked fine, but now it turned out to >> only work fine "sometimes", both v3 and v4. I need to look into it >> further, but so far it seems related to the hugetlb_advanced_tests(). >> >> I guess there was already some discussion on this test, but we did >> not receive all of the thread(s). Please always add at least >> linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com> >> for further discussions. > > BTW, with myself I mean the new address gerald.schaefer@linux.ibm.com. > The old gerald.schaefer@de.ibm.com seems to work (again), but is not > very reliable. Sure, noted. > > BTW2, a quick test with this change (so far) made the issues on s390 > go away: > > @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void) > spin_unlock(ptl); > > #ifndef CONFIG_PPC_BOOK3S_64 > - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot); > + hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot); > #endif > > spin_lock(&mm->page_table_lock); > > That would more match the "pte_t pointer" usage for hugetlb code, > i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned, > but I think the root cause is the pte_t pointer. Ideally, the pte_t pointer used here should be from huge_pte_alloc() not from pte_alloc_map_lock() as the case currently. > > Not entirely sure though if that would really be the correct fix. > I somehow lost whatever little track I had about what these tests > really want to check, and if that would still be valid with that > change. 
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-09 8:15 ` Anshuman Khandual @ 2020-09-09 11:10 ` Gerald Schaefer 0 siblings, 0 replies; 15+ messages in thread From: Gerald Schaefer @ 2020-09-09 11:10 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Wed, 9 Sep 2020 13:45:48 +0530
Anshuman Khandual <anshuman.khandual@arm.com> wrote:

[...]
> >
> > That would more match the "pte_t pointer" usage for hugetlb code,
> > i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
> > but I think the root cause is the pte_t pointer.
>
> Ideally, the pte_t pointer used here should be from huge_pte_alloc()
> not from pte_alloc_map_lock() as the case currently.

Ah, good point. I assumed that this would also always return cast pmd etc. pointers, and never pte pointers. Unfortunately, that doesn't seem to be true for all architectures, e.g. ia64, parisc, (some) powerpc, where they really do a pte_alloc_map() for some reason.

I guess that means you cannot simply cast the pmd pointer, as I suggested, although I really do not understand how any architecture can work with real ptes for hugepages. But that's fair, s390 also does some things that nobody would expect or understand on other architectures...

So, for using huge_pte_alloc() you would also need a size. Iterating over the hstates with for_each_hstate() could be an option, if they are already initialized at that point. Then you have the size(s) from huge_page_size(hstate), and can actually call the hugetlb tests for all supported sizes, with a proper pointer from huge_pte_alloc().
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-04 15:26 ` Gerald Schaefer 2020-09-04 16:01 ` Gerald Schaefer @ 2020-09-09 8:08 ` Anshuman Khandual 2020-09-09 11:36 ` Gerald Schaefer 1 sibling, 1 reply; 15+ messages in thread From: Anshuman Khandual @ 2020-09-09 8:08 UTC (permalink / raw) To: Gerald Schaefer Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On 09/04/2020 08:56 PM, Gerald Schaefer wrote: > On Fri, 4 Sep 2020 12:18:05 +0530 > Anshuman Khandual <anshuman.khandual@arm.com> wrote: > >> >> >> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: >>> This patch series includes fixes for debug_vm_pgtable test code so that >>> they follow page table updates rules correctly. The first two patches introduce >>> changes w.r.t ppc64. The patches are included in this series for completeness. We can >>> merge them via ppc64 tree if required. >>> >>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy >>> page table update rules. >>> >>> These tests are broken w.r.t page table update rules and results in kernel >>> crash as below. >>> >>> [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! >>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] >>> pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 >>> lr: c0000000005eeeec: pte_update+0x11c/0x190 >>> sp: c000000c6d1e7950 >>> msr: 8000000002029033 >>> current = 0xc000000c6d172c80 >>> paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 >>> pid = 1, comm = swapper/0 >>> kernel BUG at arch/powerpc/mm/pgtable.c:304! 
>>> [link register ] c0000000005eeeec pte_update+0x11c/0x190 >>> [c000000c6d1e7950] 0000000000000001 (unreliable) >>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 >>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 >>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 >>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 >>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 >>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 >>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c >>> >>> With DEBUG_VM disabled >>> >>> [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 >>> [ 20.530183] Faulting instruction address: 0xc0000000000df330 >>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] >>> pc: c0000000000df330: memset+0x68/0x104 >>> lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 >>> sp: c000000c6d19f990 >>> msr: 8000000002009033 >>> dar: 0 >>> current = 0xc000000c6d177480 >>> paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 >>> pid = 1, comm = swapper/0 >>> [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 >>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) >>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 >>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 >>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 >>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 >>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 >>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c >>> >>> Changes from v3: >>> * Address review feedback >>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure. 
>> >> This version >> >> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE) >> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed >> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well > > When I quickly tested v3, it worked fine, but now it turned out to > only work fine "sometimes", both v3 and v4. I need to look into it > further, but so far it seems related to the hugetlb_advanced_tests(). > > I guess there was already some discussion on this test, but we did > not receive all of the thread(s). Please always add at least > linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com> > for further discussions. IIRC, the V3 series previously had all these addresses copied properly but this version once again missed copying all required addresses. > > That being said, sorry for duplications, this might already have been > discussed. Preliminary analysis showed that it only seems to go wrong > for certain random vaddr values. I cannot make any sense of that yet, > but what seems strange to me is that the hugetlb_advanced_tests() > take a (real) pte_t pointer as input, and also use that for all > kinds of operations (set_huge_pte_at, huge_ptep_get_and_clear, etc.). > > Although all the hugetlb code in the kernel is (mis)using pte_t > pointers instead of the correct pmd/pud_t pointers like THP, that > is just for historic reasons. The pointers will actually never point > to a real pte_t (i.e. page table entry), but of course to a pmd > or pud entry, depending on hugepage size. HugeTLB logically operates on a PTE entry irrespective of it's real page table level position. Nonetheless, IIUC, vaddr here should have been aligned to real page table level in which the entry is being mapped currently. 
> > What is passed in as ptep to hugetlb_advanced_tests() seems to be > the result from the previous ptep = pte_alloc_map(mm, pmdp, vaddr), > so I would expect that it points to a real page table entry. Need > to investigate further, but IIUC, using such a pointer for adding > large pte entries (i.e. pmd/pud entries) at least feels very wrong > to me, and I assume it is related to the issues we see on s390. Will look into this further. > > We actually see different issues, e.g. once a panic directly in > hugetlb_advanced_tests() -> huge_ptep_get_and_clear(), but also > indirect symptoms after debug_vm_pgtable() completes, like this: > > [ 10.533901] BUG task_struct (Not tainted): Padding overwritten. 0x0000000019f798c7-0x0000000019f798c7 @offset=30087 > > Last but not least, what I said about the pte vs. pmd/pud of > course also should apply to the hugetlb_basic_tests(), although > they are not directly using a pte_t pointer, and especially > also not writing to it. Still, the pte_aligned pfn parameter > is not guaranteed to also be pmd/pud_aligned, which doesn't > feel right. hugetlb_basic_tests() does not directly operate on real page table entries. But I do see the point wrt using pmd_aligned pfn instead. I will look into this in detail and send out something after this series settles down. > > So, for now, until this is sorted out, I guess we also need > to exclude s390 at least from the hugetlb_advanced_tests(). > The hugetlb_basic_tests() seem to work fine so far (probably > by chance :-)) _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes 2020-09-09 8:08 ` Anshuman Khandual @ 2020-09-09 11:36 ` Gerald Schaefer 0 siblings, 0 replies; 15+ messages in thread From: Gerald Schaefer @ 2020-09-09 11:36 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, mpe, akpm, linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer On Wed, 9 Sep 2020 13:38:25 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > On 09/04/2020 08:56 PM, Gerald Schaefer wrote: > > On Fri, 4 Sep 2020 12:18:05 +0530 > > Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > >> > >> > >> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote: > >>> This patch series includes fixes for debug_vm_pgtable test code so that > >>> they follow page table updates rules correctly. The first two patches introduce > >>> changes w.r.t ppc64. The patches are included in this series for completeness. We can > >>> merge them via ppc64 tree if required. > >>> > >>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy > >>> page table update rules. > >>> > >>> These tests are broken w.r.t page table update rules and results in kernel > >>> crash as below. > >>> > >>> [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! > >>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] > >>> pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 > >>> lr: c0000000005eeeec: pte_update+0x11c/0x190 > >>> sp: c000000c6d1e7950 > >>> msr: 8000000002029033 > >>> current = 0xc000000c6d172c80 > >>> paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 > >>> pid = 1, comm = swapper/0 > >>> kernel BUG at arch/powerpc/mm/pgtable.c:304! 
> >>> [link register ] c0000000005eeeec pte_update+0x11c/0x190 > >>> [c000000c6d1e7950] 0000000000000001 (unreliable) > >>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 > >>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 > >>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 > >>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > >>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 > >>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 > >>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > >>> > >>> With DEBUG_VM disabled > >>> > >>> [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 > >>> [ 20.530183] Faulting instruction address: 0xc0000000000df330 > >>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] > >>> pc: c0000000000df330: memset+0x68/0x104 > >>> lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > >>> sp: c000000c6d19f990 > >>> msr: 8000000002009033 > >>> dar: 0 > >>> current = 0xc000000c6d177480 > >>> paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 > >>> pid = 1, comm = swapper/0 > >>> [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 > >>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) > >>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 > >>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 > >>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 > >>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 > >>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 > >>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c > >>> > >>> Changes from v3: > >>> * Address review feedback > >>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure. 
> >> > >> This version > >> > >> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE) > >> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed > >> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well > > > > When I quickly tested v3, it worked fine, but now it turned out to > > only work fine "sometimes", both v3 and v4. I need to look into it > > further, but so far it seems related to the hugetlb_advanced_tests(). > > > > I guess there was already some discussion on this test, but we did > > not receive all of the thread(s). Please always add at least > > linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com> > > for further discussions. > > IIRC, the V3 series previously had all these addresses copied properly > but this version once again missed copying all required addresses. I also had issues with the de.ibm.com address, which might also have made some mails disappear, and others might simply have been overlooked be me. Don't bother, my bad. > > > > > That being said, sorry for duplications, this might already have been > > discussed. Preliminary analysis showed that it only seems to go wrong > > for certain random vaddr values. I cannot make any sense of that yet, > > but what seems strange to me is that the hugetlb_advanced_tests() > > take a (real) pte_t pointer as input, and also use that for all > > kinds of operations (set_huge_pte_at, huge_ptep_get_and_clear, etc.). > > > > Although all the hugetlb code in the kernel is (mis)using pte_t > > pointers instead of the correct pmd/pud_t pointers like THP, that > > is just for historic reasons. The pointers will actually never point > > to a real pte_t (i.e. page table entry), but of course to a pmd > > or pud entry, depending on hugepage size. > > HugeTLB logically operates on a PTE entry irrespective of it's real > page table level position. 
> Nonetheless, IIUC, vaddr here should have been aligned to real page table level in which the entry is being mapped currently.

That goes back to the time when only x86 had hugepages. x86 has the same layout for pte/pmd/etc. entries, so it simply didn't matter that the code (mis)used pte pointers / entries. But even on x86, the hugetlb pte pointers would never have pointed to real ptes, but to pmds instead. That's why I call it misuse.

s390 is very sensitive to the page table level. We can also determine the level from the entry value, which is used by some primitives. Others rely on implicit assumptions and calculations, which go wrong if the wrong level is passed in, as in this case for huge_ptep_get_and_clear(). Simply aligning vaddr / pfn will not be enough to fix this for s390; it has to be a pmd/pud pointer or, as you already mentioned, the result of huge_pte_alloc().

Furthermore, the pmd and pte layouts are different, so we simply cannot use any pte_xxx primitives on hugepage entries directly. That was the reason for introducing huge_ptep_get(), which does an implicit conversion from the real pmd/pud entry to a "fake" pte entry, which can then be used with such pte_xxx primitives. Before writing it back in set_huge_pte_at(), we then do the reverse conversion to a proper pmd/pud entry again.
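The round trip described above (real pmd/pud entry to "fake" pte and back) can be sketched in user-space C. This is a toy model under loud assumptions: the bit positions and all `DEMO_*` / `demo_*` names are hypothetical, chosen only so that the two levels encode the same attributes at different positions. They do not reflect the actual s390 entry formats or the kernel's huge_ptep_get() / set_huge_pte_at() implementations.

```c
#include <stdint.h>

/*
 * Hypothetical bit layouts: both levels carry an "invalid" and a
 * "protect" attribute, but at different bit positions.
 */
#define DEMO_ADDR_MASK    (~0ULL << 20)  /* address bits shared by both formats */
#define DEMO_PMD_INVALID  (1ULL << 5)
#define DEMO_PMD_PROTECT  (1ULL << 9)
#define DEMO_PTE_INVALID  (1ULL << 10)
#define DEMO_PTE_PROTECT  (1ULL << 8)

/* Read side: translate a pmd-format entry into a pte-format value. */
static uint64_t demo_pmd_to_fake_pte(uint64_t pmd)
{
	uint64_t pte = pmd & DEMO_ADDR_MASK;

	if (pmd & DEMO_PMD_INVALID)
		pte |= DEMO_PTE_INVALID;
	if (pmd & DEMO_PMD_PROTECT)
		pte |= DEMO_PTE_PROTECT;
	return pte;
}

/* Write side: translate the pte-format value back into pmd format. */
static uint64_t demo_fake_pte_to_pmd(uint64_t pte)
{
	uint64_t pmd = pte & DEMO_ADDR_MASK;

	if (pte & DEMO_PTE_INVALID)
		pmd |= DEMO_PMD_INVALID;
	if (pte & DEMO_PTE_PROTECT)
		pmd |= DEMO_PMD_PROTECT;
	return pmd;
}
```

In this model, generic pte-style checks would only ever see the value returned by `demo_pmd_to_fake_pte()`, and the table would only ever store the value returned by `demo_fake_pte_to_pmd()`. Testing pte-format bits directly on a stored pmd-format entry would read the wrong attributes, which is the reason given above for not using pte_xxx primitives on hugepage entries.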
[parent not found: <20200902114222.181353-14-aneesh.kumar@linux.ibm.com>]
[parent not found: <20200911021358.GA3656343@ubuntu-n2-xlarge-x86>]
* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test [not found] ` <20200911021358.GA3656343@ubuntu-n2-xlarge-x86> @ 2020-09-11 5:21 ` Aneesh Kumar K.V 2020-09-23 3:14 ` Anshuman Khandual 0 siblings, 1 reply; 15+ messages in thread From: Aneesh Kumar K.V @ 2020-09-11 5:21 UTC (permalink / raw) To: Nathan Chancellor Cc: Anshuman Khandual, mpe, linux-mm, akpm, linuxppc-dev, linux-riscv Nathan Chancellor <natechancellor@gmail.com> writes: > On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote: >> pte_clear_tests operate on an existing pte entry. Make sure that >> is not a none pte entry. >> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> >> --- >> mm/debug_vm_pgtable.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c >> index 9afa1354326b..c36530c69e33 100644 >> --- a/mm/debug_vm_pgtable.c >> +++ b/mm/debug_vm_pgtable.c >> @@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, >> #endif /* PAGETABLE_P4D_FOLDED */ >> >> static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep, >> - unsigned long vaddr) >> + unsigned long pfn, unsigned long vaddr, >> + pgprot_t prot) >> { >> - pte_t pte = ptep_get(ptep); >> + pte_t pte = pfn_pte(pfn, prot); >> >> pr_debug("Validating PTE clear\n"); >> pte = __pte(pte_val(pte) | RANDOM_ORVALUE); >> @@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void) >> >> ptl = pte_lockptr(mm, pmdp); >> spin_lock(ptl); >> - pte_clear_tests(mm, ptep, vaddr); >> + pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot); >> pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot); >> pte_unmap_unlock(ptep, ptl); >> >> -- > This patch causes a panic at boot for RISC-V defconfig. 
The rootfs is here if it is needed: > https://github.com/ClangBuiltLinux/boot-utils/blob/3b21a5b71451742866349ba4f18638c5a754e660/images/riscv/rootfs.cpio.zst > > $ make -skj"$(nproc)" ARCH=riscv CROSS_COMPILE=riscv64-linux- O=out/riscv distclean defconfig Image > > $ qemu-system-riscv64 -bios default -M virt -display none -initrd rootfs.cpio -kernel Image -m 512m -nodefaults -serial mon:stdio > ... > > OpenSBI v0.6 > ____ _____ ____ _____ > / __ \ / ____| _ \_ _| > | | | |_ __ ___ _ __ | (___ | |_) || | > | | | | '_ \ / _ \ '_ \ \___ \| _ < | | > | |__| | |_) | __/ | | |____) | |_) || |_ > \____/| .__/ \___|_| |_|_____/|____/_____| > | | > |_| > > Platform Name : QEMU Virt Machine > Platform HART Features : RV64ACDFIMSU > Platform Max HARTs : 8 > Current Hart : 0 > Firmware Base : 0x80000000 > Firmware Size : 120 KB > Runtime SBI Version : 0.2 > > MIDELEG : 0x0000000000000222 > MEDELEG : 0x000000000000b109 > PMP0 : 0x0000000080000000-0x000000008001ffff (A) > PMP1 : 0x0000000000000000-0xffffffffffffffff (A,R,W,X) > [ 0.000000] Linux version 5.9.0-rc4-next-20200910 (nathan@ubuntu-n2-xlarge-x86) (riscv64-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 SMP Thu Sep 10 19:10:43 MST 2020 > ... 
> [ 0.294593] NET: Registered protocol family 17
> [ 0.295781] 9pnet: Installing 9P2000 support
> [ 0.296153] Key type dns_resolver registered
> [ 0.296694] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
> [ 0.297635] Unable to handle kernel paging request at virtual address 0a7fffe01dafefc8
> [ 0.298029] Oops [#1]
> [ 0.298153] Modules linked in:
> [ 0.298433] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4-next-20200910 #1
> [ 0.298792] epc: ffffffe000205afc ra : ffffffe0008be0aa sp : ffffffe01ae73d40
> [ 0.299078] gp : ffffffe0010b9b48 tp : ffffffe01ae68000 t0 : ffffffe008152000
> [ 0.299362] t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffe01ae73d60
> [ 0.299648] s1 : bffffffffffffffb a0 : 0a7fffe01dafefc8 a1 : bffffffffffffffb
> [ 0.299948] a2 : ffffffe0010a2698 a3 : 0000000000000001 a4 : 0000000000000003
> [ 0.300231] a5 : 0000000000000800 a6 : fffffffff0000080 a7 : 000000001b642000
> [ 0.300521] s2 : ffffffe0081517b8 s3 : ffffffe008150a80 s4 : ffffffe01af30000
> [ 0.300806] s5 : ffffffe01f8ca9b8 s6 : ffffffe008150000 s7 : ffffffe0010bb100
> [ 0.301161] s8 : ffffffe0010bb108 s9 : 0000000000080202 s10: ffffffe0010bb928
> [ 0.301481] s11: 000000002008085b t3 : 0000000000000000 t4 : 0000000000000000
> [ 0.301722] t5 : 0000000000000000 t6 : ffffffe008150000
> [ 0.301947] status: 0000000000000120 badaddr: 0a7fffe01dafefc8 cause: 000000000000000f
> [ 0.302569] ---[ end trace 7ffb153d816164cf ]---
> [ 0.302797] note: swapper/0[1] exited with preempt_count 1
> [ 0.303101] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [ 0.303614] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

I guess it is the combination of a valid pte and the use of RANDOM_ORVALUE. The change below gets the kernel to boot. Can somebody familiar with the riscv pte format take a look at RANDOM_ORVALUE?
modified   mm/debug_vm_pgtable.c
@@ -548,7 +548,7 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
 	pte_t pte = pfn_pte(pfn, prot);
 
 	pr_debug("Validating PTE clear\n");
-	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
+//	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
 	set_pte_at(mm, vaddr, ptep, pte);
 	barrier();
 	pte_clear(mm, vaddr, ptep);
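The guess above, that ORing RANDOM_ORVALUE into an already valid pte is what breaks things, can be illustrated with a user-space sketch. It is not the kernel code: the PFN shift of 10 and the mask value (all bits from bit 4 upwards) are assumptions made for illustration, since the thread does not state how RANDOM_ORVALUE is defined, and all `demo_*` / `DEMO_*` names are hypothetical.

```c
#include <stdint.h>

#define DEMO_PFN_SHIFT  10          /* assumed position of the PFN field */
#define DEMO_ORVALUE    (~0xFULL)   /* assumed stand-in for RANDOM_ORVALUE */

/* Build a pte value from a page frame number and protection bits. */
static uint64_t demo_pfn_pte(uint64_t pfn, uint64_t prot)
{
	return (pfn << DEMO_PFN_SHIFT) | prot;
}

/* Extract the page frame number from a pte value. */
static uint64_t demo_pte_pfn(uint64_t pte)
{
	return pte >> DEMO_PFN_SHIFT;
}

/* What the test step does in this model: OR the mask into the pte. */
static uint64_t demo_or_random(uint64_t pte)
{
	return pte | DEMO_ORVALUE;
}
```

Starting from an all-zero (none) pte, the OR only produces a garbage value that is cleared again right away. Starting from a valid pte built from a real pfn, the same OR overwrites the PFN field, so anything that interprets the entry between set_pte_at() and pte_clear() would see a frame number that was never intended. Whether that is the actual cause of the RISC-V oops is not confirmed in this thread.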
* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-11  5:21 ` [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test Aneesh Kumar K.V
@ 2020-09-23  3:14 ` Anshuman Khandual
  0 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2020-09-23  3:14 UTC
  To: Aneesh Kumar K.V, Nathan Chancellor
  Cc: linux-mm, akpm, linuxppc-dev, linux-riscv, mpe

On 09/11/2020 10:51 AM, Aneesh Kumar K.V wrote:
> Nathan Chancellor <natechancellor@gmail.com> writes:
>
>> On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote:
>>> pte_clear_tests operate on an existing pte entry. Make sure that
>>> is not a none pte entry.
>>>
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>>  mm/debug_vm_pgtable.c | 7 ++++---
>>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index 9afa1354326b..c36530c69e33 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>>>  #endif /* PAGETABLE_P4D_FOLDED */
>>>
>>>  static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>>> -				   unsigned long vaddr)
>>> +				   unsigned long pfn, unsigned long vaddr,
>>> +				   pgprot_t prot)
>>>  {
>>> -	pte_t pte = ptep_get(ptep);
>>> +	pte_t pte = pfn_pte(pfn, prot);
>>>
>>>  	pr_debug("Validating PTE clear\n");
>>>  	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>>> @@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void)
>>>
>>>  	ptl = pte_lockptr(mm, pmdp);
>>>  	spin_lock(ptl);
>>> -	pte_clear_tests(mm, ptep, vaddr);
>>> +	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
>>>  	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>>>  	pte_unmap_unlock(ptep, ptl);
>>>
>>> --
>> This patch causes a panic at boot for RISC-V defconfig. The rootfs is here if it is needed:
>> https://github.com/ClangBuiltLinux/boot-utils/blob/3b21a5b71451742866349ba4f18638c5a754e660/images/riscv/rootfs.cpio.zst
>>
>> $ make -skj"$(nproc)" ARCH=riscv CROSS_COMPILE=riscv64-linux- O=out/riscv distclean defconfig Image
>>
>> $ qemu-system-riscv64 -bios default -M virt -display none -initrd rootfs.cpio -kernel Image -m 512m -nodefaults -serial mon:stdio
>> ...
>>
>> OpenSBI v0.6
>>    ____                    _____ ____ _____
>>   / __ \                  / ____|  _ \_   _|
>>  | |  | |_ __   ___ _ __ | (___ | |_) || |
>>  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>>  | |__| | |_) |  __/ | | |____) | |_) || |_
>>   \____/| .__/ \___|_| |_|_____/|____/_____|
>>         | |
>>         |_|
>>
>> Platform Name          : QEMU Virt Machine
>> Platform HART Features : RV64ACDFIMSU
>> Platform Max HARTs     : 8
>> Current Hart           : 0
>> Firmware Base          : 0x80000000
>> Firmware Size          : 120 KB
>> Runtime SBI Version    : 0.2
>>
>> MIDELEG : 0x0000000000000222
>> MEDELEG : 0x000000000000b109
>> PMP0    : 0x0000000080000000-0x000000008001ffff (A)
>> PMP1    : 0x0000000000000000-0xffffffffffffffff (A,R,W,X)
>> [ 0.000000] Linux version 5.9.0-rc4-next-20200910 (nathan@ubuntu-n2-xlarge-x86) (riscv64-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 SMP Thu Sep 10 19:10:43 MST 2020
>> ...
>> [ 0.294593] NET: Registered protocol family 17
>> [ 0.295781] 9pnet: Installing 9P2000 support
>> [ 0.296153] Key type dns_resolver registered
>> [ 0.296694] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
>> [ 0.297635] Unable to handle kernel paging request at virtual address 0a7fffe01dafefc8
>> [ 0.298029] Oops [#1]
>> [ 0.298153] Modules linked in:
>> [ 0.298433] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4-next-20200910 #1
>> [ 0.298792] epc: ffffffe000205afc ra : ffffffe0008be0aa sp : ffffffe01ae73d40
>> [ 0.299078] gp : ffffffe0010b9b48 tp : ffffffe01ae68000 t0 : ffffffe008152000
>> [ 0.299362] t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffe01ae73d60
>> [ 0.299648] s1 : bffffffffffffffb a0 : 0a7fffe01dafefc8 a1 : bffffffffffffffb
>> [ 0.299948] a2 : ffffffe0010a2698 a3 : 0000000000000001 a4 : 0000000000000003
>> [ 0.300231] a5 : 0000000000000800 a6 : fffffffff0000080 a7 : 000000001b642000
>> [ 0.300521] s2 : ffffffe0081517b8 s3 : ffffffe008150a80 s4 : ffffffe01af30000
>> [ 0.300806] s5 : ffffffe01f8ca9b8 s6 : ffffffe008150000 s7 : ffffffe0010bb100
>> [ 0.301161] s8 : ffffffe0010bb108 s9 : 0000000000080202 s10: ffffffe0010bb928
>> [ 0.301481] s11: 000000002008085b t3 : 0000000000000000 t4 : 0000000000000000
>> [ 0.301722] t5 : 0000000000000000 t6 : ffffffe008150000
>> [ 0.301947] status: 0000000000000120 badaddr: 0a7fffe01dafefc8 cause: 000000000000000f
>> [ 0.302569] ---[ end trace 7ffb153d816164cf ]---
>> [ 0.302797] note: swapper/0[1] exited with preempt_count 1
>> [ 0.303101] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>> [ 0.303614] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
>
>
> I guess it is the combination of a valid pte and usage of
> RANDOM_ORVALUE. The below change gets the kernel to boot. Can somebody
> familiar with riscv pte format take a look at the RANDOM_ORVALUE?
>
> modified   mm/debug_vm_pgtable.c
> @@ -548,7 +548,7 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>  	pte_t pte = pfn_pte(pfn, prot);
>
>  	pr_debug("Validating PTE clear\n");
> -	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> +//	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>  	set_pte_at(mm, vaddr, ptep, pte);
>  	barrier();
>  	pte_clear(mm, vaddr, ptep);

Do we have a fix for this problem? Otherwise we risk going into the
next release with this regression on riscv platforms.
[parent not found: <20201011200258.GA91021@roeck-us.net>]
* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  [not found] ` <20201011200258.GA91021@roeck-us.net>
@ 2020-10-12  4:29 ` Aneesh Kumar K.V
  0 siblings, 0 replies; 15+ messages in thread
From: Aneesh Kumar K.V @ 2020-10-12  4:29 UTC
  To: Guenter Roeck
  Cc: Anshuman Khandual, mpe, linux-mm, akpm, linuxppc-dev, linux-riscv

Guenter Roeck <linux@roeck-us.net> writes:

> On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote:
>> pte_clear_tests operate on an existing pte entry. Make sure that
>> is not a none pte entry.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>
> This patch causes all riscv64 images to crash. Reverting it
> as well as the follow-up patch fixes the problem, but there are
> still several warning messages starting with
> BUG kmem_cache (Not tainted): Freechain corrupt
> I did not try to track down this other problem.
>
> A detailed crash log is at
> https://kerneltests.org/builders/qemu-riscv64-next/builds/523/steps/qemubuildcommand/logs/stdio
>
> Bisect log is attached.

https://lore.kernel.org/linux-mm/87zh5wx51b.fsf@linux.ibm.com

This was mentioned earlier. The RANDOM_ORVALUE used is interacting with
some of the riscv page table accessors.

-aneesh