* [PATCHv2 2/3] arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
@ 2016-02-01 21:24 ` Laura Abbott
0 siblings, 0 replies; 24+ messages in thread
From: Laura Abbott @ 2016-02-01 21:24 UTC (permalink / raw)
To: linux-arm-kernel
On 02/01/2016 04:29 AM, Mark Rutland wrote:
> Hi,
>
> On Fri, Jan 29, 2016 at 03:46:57PM -0800, Laura Abbott wrote:
>>
>> ARCH_SUPPORTS_DEBUG_PAGEALLOC provides a hook to map and unmap
>> pages for debugging purposes. This requires memory be mapped
>> with PAGE_SIZE mappings since breaking down larger mappings
>> at runtime will lead to TLB conflicts. Check if debug_pagealloc
>> is enabled at runtime and if so, map everything with PAGE_SIZE
>> pages. Implement the functions to actually map/unmap the
>> pages at runtime.
>>
>>
>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>
> I tried to apply atop of the arm64 for-next/pgtable branch, but git
> wasn't very happy about that -- which branch/patches is this based on?
>
> I'm not sure if I'm missing something, have something I shouldn't, or if
> my MTA is corrupting patches again...
>
Hmmm, I based it off of your arm64-pagetable-rework-20160125 tag and
Ard's patch for vmalloc and set_memory_* . The patches seem to apply
on the for-next/pgtable branch as well so I'm guessing you are missing
Ard's patch.
>> ---
>> arch/arm64/Kconfig | 3 +++
>> arch/arm64/mm/mmu.c | 10 +++++++---
>> arch/arm64/mm/pageattr.c | 40 +++++++++++++++++++++++++++++++++-------
>> 3 files changed, 43 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 8cc6228..0f33218 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -537,6 +537,9 @@ config HOTPLUG_CPU
>> source kernel/Kconfig.preempt
>> source kernel/Kconfig.hz
>>
>> +config ARCH_SUPPORTS_DEBUG_PAGEALLOC
>> + def_bool y
>> +
>> config ARCH_HAS_HOLES_MEMORYMODEL
>> def_bool y if SPARSEMEM
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 103ebc0..5abbd30 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -181,7 +181,8 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
>> do {
>> next = pmd_addr_end(addr, end);
>> /* try section mapping first */
>> - if (((addr | next | phys) & ~SECTION_MASK) == 0) {
>> + if (((addr | next | phys) & ~SECTION_MASK) == 0 &&
>> + (!debug_pagealloc_enabled() || !pgtable_alloc)) {
>> pmd_t old_pmd =*pmd;
>> set_pmd(pmd, __pmd(phys |
>> pgprot_val(mk_sect_prot(prot))));
>> @@ -208,8 +209,11 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
>> }
>>
>> static inline bool use_1G_block(unsigned long addr, unsigned long next,
>> - unsigned long phys)
>> + unsigned long phys, phys_addr_t (*pgtable_alloc)(void))
>> {
>> + if (pgtable_alloc && debug_pagealloc_enabled())
>> + return false;
>> +
>> if (PAGE_SHIFT != 12)
>> return false;
>>
>> @@ -241,7 +245,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
>> /*
>> * For 4K granule only, attempt to put down a 1GB block
>> */
>> - if (use_1G_block(addr, next, phys)) {
>> + if (use_1G_block(addr, next, phys, pgtable_alloc)) {
>> pud_t old_pud = *pud;
>> set_pud(pud, __pud(phys |
>> pgprot_val(mk_sect_prot(prot))));
>
> The check for pgtable_alloc is to allow create_mapping_noalloc to use
> sections regardless, right?
>
> I got a little confused by that initially, but that seems to make sense.
>
Yeah, that was what I was going for. I went for the implication with pgtable_alloc
over adding a bool variable to create_mapping. I'll see if I can put a comment
in somewhere to clarify the intentions.
>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>> index 1360a02..69a8a7d 100644
>> --- a/arch/arm64/mm/pageattr.c
>> +++ b/arch/arm64/mm/pageattr.c
>> @@ -38,7 +38,8 @@ static int change_page_range(pte_t *ptep, pgtable_t token, unsigned long addr,
>> }
>>
>> static int change_memory_common(unsigned long addr, int numpages,
>> - pgprot_t set_mask, pgprot_t clear_mask)
>> + pgprot_t set_mask, pgprot_t clear_mask,
>> + bool ignore_vma_check)
>> {
>> unsigned long start = addr;
>> unsigned long size = PAGE_SIZE*numpages;
>> @@ -65,11 +66,14 @@ static int change_memory_common(unsigned long addr, int numpages,
>> *
>> * So check whether the [addr, addr + size) interval is entirely
>> * covered by precisely one VM area that has the VM_ALLOC flag set.
>> + *
>> + * The one exception to this is if we're forcing everything to be
>> + * page mapped with DEBUG_PAGEALLOC
>> */
>> area = find_vm_area((void *)addr);
>> - if (!area ||
>> + if (!ignore_vma_check && (!area ||
>> end > (unsigned long)area->addr + area->size ||
>> - !(area->flags & VM_ALLOC))
>> + !(area->flags & VM_ALLOC)))
>> return -EINVAL;
>
> Perhaps we could pull out everything but the VMA checks into a static
> __change_memory_common. Then we can call that directly in
> __kernel_map_pages, and don't need to pass a confusing boolean
> parameter.
Sounds fine.
>
> Otherwise, this looks good to me.
>
> Mark.
>
Thanks,
Laura
>>
>> data.set_mask = set_mask;
>> @@ -86,21 +90,24 @@ int set_memory_ro(unsigned long addr, int numpages)
>> {
>> return change_memory_common(addr, numpages,
>> __pgprot(PTE_RDONLY),
>> - __pgprot(PTE_WRITE));
>> + __pgprot(PTE_WRITE),
>> + false);
>> }
>>
>> int set_memory_rw(unsigned long addr, int numpages)
>> {
>> return change_memory_common(addr, numpages,
>> __pgprot(PTE_WRITE),
>> - __pgprot(PTE_RDONLY));
>> + __pgprot(PTE_RDONLY),
>> + false);
>> }
>>
>> int set_memory_nx(unsigned long addr, int numpages)
>> {
>> return change_memory_common(addr, numpages,
>> __pgprot(PTE_PXN),
>> - __pgprot(0));
>> + __pgprot(0),
>> + false);
>> }
>> EXPORT_SYMBOL_GPL(set_memory_nx);
>>
>> @@ -108,6 +115,25 @@ int set_memory_x(unsigned long addr, int numpages)
>> {
>> return change_memory_common(addr, numpages,
>> __pgprot(0),
>> - __pgprot(PTE_PXN));
>> + __pgprot(PTE_PXN),
>> + false);
>> }
>> EXPORT_SYMBOL_GPL(set_memory_x);
>> +
>> +#ifdef CONFIG_DEBUG_PAGEALLOC
>> +void __kernel_map_pages(struct page *page, int numpages, int enable)
>> +{
>> + unsigned long addr = (unsigned long) page_address(page);
>> +
>> + if (enable)
>> + change_memory_common(addr, numpages,
>> + __pgprot(PTE_VALID),
>> + __pgprot(0),
>> + true);
>> + else
>> + change_memory_common(addr, numpages,
>> + __pgprot(0),
>> + __pgprot(PTE_VALID),
>> + true);
>> +}
>> +#endif
>> --
>> 2.5.0
>>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCHv2 2/3] arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
2016-02-01 21:24 ` Laura Abbott
@ 2016-02-02 8:57 ` Ard Biesheuvel
-1 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2016-02-02 8:57 UTC (permalink / raw)
To: Laura Abbott
Cc: Mark Rutland, Laura Abbott, Catalin Marinas, Will Deacon,
linux-arm-kernel, linux-kernel
On 1 February 2016 at 22:24, Laura Abbott <labbott@redhat.com> wrote:
> On 02/01/2016 04:29 AM, Mark Rutland wrote:
>>
>> Hi,
>>
>> On Fri, Jan 29, 2016 at 03:46:57PM -0800, Laura Abbott wrote:
>>>
>>>
>>> ARCH_SUPPORTS_DEBUG_PAGEALLOC provides a hook to map and unmap
>>> pages for debugging purposes. This requires memory be mapped
>>> with PAGE_SIZE mappings since breaking down larger mappings
>>> at runtime will lead to TLB conflicts. Check if debug_pagealloc
>>> is enabled at runtime and if so, map everything with PAGE_SIZE
>>> pages. Implement the functions to actually map/unmap the
>>> pages at runtime.
>>>
>>>
>>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>>
>>
>> I tried to apply atop of the arm64 for-next/pgtable branch, but git
>> wasn't very happy about that -- which branch/patches is this based on?
>>
>> I'm not sure if I'm missing something, have something I shouldn't, or if
>> my MTA is corrupting patches again...
>>
>
> Hmmm, I based it off of your arm64-pagetable-rework-20160125 tag and
> Ard's patch for vmalloc and set_memory_* . The patches seem to apply
> on the for-next/pgtable branch as well so I'm guessing you are missing
> Ard's patch.
>
>
>
>>>
>>> ---
>>> arch/arm64/Kconfig | 3 +++
>>> arch/arm64/mm/mmu.c | 10 +++++++---
>>> arch/arm64/mm/pageattr.c | 40 +++++++++++++++++++++++++++++++++-------
>>> 3 files changed, 43 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index 8cc6228..0f33218 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -537,6 +537,9 @@ config HOTPLUG_CPU
>>> source kernel/Kconfig.preempt
>>> source kernel/Kconfig.hz
>>>
>>> +config ARCH_SUPPORTS_DEBUG_PAGEALLOC
>>> + def_bool y
>>> +
>>> config ARCH_HAS_HOLES_MEMORYMODEL
>>> def_bool y if SPARSEMEM
>>>
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 103ebc0..5abbd30 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -181,7 +181,8 @@ static void alloc_init_pmd(pud_t *pud, unsigned long
>>> addr, unsigned long end,
>>> do {
>>> next = pmd_addr_end(addr, end);
>>> /* try section mapping first */
>>> - if (((addr | next | phys) & ~SECTION_MASK) == 0) {
>>> + if (((addr | next | phys) & ~SECTION_MASK) == 0 &&
>>> + (!debug_pagealloc_enabled() || !pgtable_alloc)) {
>>> pmd_t old_pmd =*pmd;
>>> set_pmd(pmd, __pmd(phys |
>>>
>>> pgprot_val(mk_sect_prot(prot))));
>>> @@ -208,8 +209,11 @@ static void alloc_init_pmd(pud_t *pud, unsigned long
>>> addr, unsigned long end,
>>> }
>>>
>>> static inline bool use_1G_block(unsigned long addr, unsigned long next,
>>> - unsigned long phys)
>>> + unsigned long phys, phys_addr_t
>>> (*pgtable_alloc)(void))
>>> {
>>> + if (pgtable_alloc && debug_pagealloc_enabled())
>>> + return false;
>>> +
>>> if (PAGE_SHIFT != 12)
>>> return false;
>>>
>>> @@ -241,7 +245,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long
>>> addr, unsigned long end,
>>> /*
>>> * For 4K granule only, attempt to put down a 1GB block
>>> */
>>> - if (use_1G_block(addr, next, phys)) {
>>> + if (use_1G_block(addr, next, phys, pgtable_alloc)) {
>>> pud_t old_pud = *pud;
>>> set_pud(pud, __pud(phys |
>>>
>>> pgprot_val(mk_sect_prot(prot))));
>>
>>
>> The check for pgtable_alloc is to allow create_mapping_noalloc to use
>> sections regardless, right?
>>
>> I got a little confused by that initially, but that seems to make sense.
>>
>
> Yeah, that was what I was going for. I went for the implication with
> pgtable_alloc over adding a bool variable to create_mapping. I'll see
> if I can put a comment in somewhere to clarify the intentions.
>
Yes, the logic becomes slightly fuzzy, so it would be good to document
this one way or the other.
--
Ard.
>
>>>
>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>> index 1360a02..69a8a7d 100644
>>> --- a/arch/arm64/mm/pageattr.c
>>> +++ b/arch/arm64/mm/pageattr.c
>>> @@ -38,7 +38,8 @@ static int change_page_range(pte_t *ptep, pgtable_t
>>> token, unsigned long addr,
>>> }
>>>
>>> static int change_memory_common(unsigned long addr, int numpages,
>>> - pgprot_t set_mask, pgprot_t clear_mask)
>>> + pgprot_t set_mask, pgprot_t clear_mask,
>>> + bool ignore_vma_check)
>>> {
>>> unsigned long start = addr;
>>> unsigned long size = PAGE_SIZE*numpages;
>>> @@ -65,11 +66,14 @@ static int change_memory_common(unsigned long addr,
>>> int numpages,
>>> *
>>> * So check whether the [addr, addr + size) interval is entirely
>>> * covered by precisely one VM area that has the VM_ALLOC flag
>>> set.
>>> + *
>>> + * The one exception to this is if we're forcing everything to be
>>> + * page mapped with DEBUG_PAGEALLOC
>>> */
>>> area = find_vm_area((void *)addr);
>>> - if (!area ||
>>> + if (!ignore_vma_check && (!area ||
>>> end > (unsigned long)area->addr + area->size ||
>>> - !(area->flags & VM_ALLOC))
>>> + !(area->flags & VM_ALLOC)))
>>> return -EINVAL;
>>
>>
>> Perhaps we could pull out everything but the VMA checks into a static
>> __change_memory_common. Then we can call that directly in
>> __kernel_map_pages, and don't need to pass a confusing boolean
>> parameter.
>
>
> Sounds fine.
>
>>
>> Otherwise, this looks good to me.
>> Mark.
>>
>
> Thanks,
> Laura
>
>
>>>
>>>
>>> data.set_mask = set_mask;
>>> @@ -86,21 +90,24 @@ int set_memory_ro(unsigned long addr, int numpages)
>>> {
>>> return change_memory_common(addr, numpages,
>>> __pgprot(PTE_RDONLY),
>>> - __pgprot(PTE_WRITE));
>>> + __pgprot(PTE_WRITE),
>>> + false);
>>> }
>>>
>>> int set_memory_rw(unsigned long addr, int numpages)
>>> {
>>> return change_memory_common(addr, numpages,
>>> __pgprot(PTE_WRITE),
>>> - __pgprot(PTE_RDONLY));
>>> + __pgprot(PTE_RDONLY),
>>> + false);
>>> }
>>>
>>> int set_memory_nx(unsigned long addr, int numpages)
>>> {
>>> return change_memory_common(addr, numpages,
>>> __pgprot(PTE_PXN),
>>> - __pgprot(0));
>>> + __pgprot(0),
>>> + false);
>>> }
>>> EXPORT_SYMBOL_GPL(set_memory_nx);
>>>
>>> @@ -108,6 +115,25 @@ int set_memory_x(unsigned long addr, int numpages)
>>> {
>>> return change_memory_common(addr, numpages,
>>> __pgprot(0),
>>> - __pgprot(PTE_PXN));
>>> + __pgprot(PTE_PXN),
>>> + false);
>>> }
>>> EXPORT_SYMBOL_GPL(set_memory_x);
>>> +
>>> +#ifdef CONFIG_DEBUG_PAGEALLOC
>>> +void __kernel_map_pages(struct page *page, int numpages, int enable)
>>> +{
>>> + unsigned long addr = (unsigned long) page_address(page);
>>> +
>>> + if (enable)
>>> + change_memory_common(addr, numpages,
>>> + __pgprot(PTE_VALID),
>>> + __pgprot(0),
>>> + true);
>>> + else
>>> + change_memory_common(addr, numpages,
>>> + __pgprot(0),
>>> + __pgprot(PTE_VALID),
>>> + true);
>>> +}
>>> +#endif
>>> --
>>> 2.5.0
>>>
>
* Re: [PATCHv2 2/3] arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
2016-02-01 21:24 ` Laura Abbott
@ 2016-02-02 12:23 ` Mark Rutland
-1 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-02-02 12:23 UTC (permalink / raw)
To: Laura Abbott
Cc: Laura Abbott, Catalin Marinas, Will Deacon, Ard Biesheuvel,
linux-arm-kernel, linux-kernel
On Mon, Feb 01, 2016 at 01:24:25PM -0800, Laura Abbott wrote:
> On 02/01/2016 04:29 AM, Mark Rutland wrote:
> >Hi,
> >
> >On Fri, Jan 29, 2016 at 03:46:57PM -0800, Laura Abbott wrote:
> >>
> >>ARCH_SUPPORTS_DEBUG_PAGEALLOC provides a hook to map and unmap
> >>pages for debugging purposes. This requires memory be mapped
> >>with PAGE_SIZE mappings since breaking down larger mappings
> >>at runtime will lead to TLB conflicts. Check if debug_pagealloc
> >>is enabled at runtime and if so, map everything with PAGE_SIZE
> >>pages. Implement the functions to actually map/unmap the
> >>pages at runtime.
> >>
> >>
> >>Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
> >
> >I tried to apply atop of the arm64 for-next/pgtable branch, but git
> >wasn't very happy about that -- which branch/patches is this based on?
> >
> >I'm not sure if I'm missing something, have something I shouldn't, or if
> >my MTA is corrupting patches again...
> >
>
> Hmmm, I based it off of your arm64-pagetable-rework-20160125 tag and
> Ard's patch for vmalloc and set_memory_* . The patches seem to apply
> on the for-next/pgtable branch as well so I'm guessing you are missing
> Ard's patch.
Yup, that was it. I evidently was paying far too little attention as I'd
also missed the mm/ patch for the !CONFIG_DEBUG_PAGEALLOC case.
Is there anything else in mm/ that I've potentially missed? I'm seeing a
hang on Juno just after reaching userspace (splat below) with
debug_pagealloc=on.
It looks like something's gone wrong around find_vmap_area -- at least
one CPU is forever awaiting vmap_area_lock, and presumably some other
CPU has held it and gone into the weeds, leading to the RCU stalls and
NMI lockup warnings.
[ 31.037054] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 31.042684] 0-...: (1 ticks this GP) idle=999/140000000000001/0 softirq=738/738 fqs=418
[ 31.050795] (detected by 1, t=5255 jiffies, g=340, c=339, q=50)
[ 31.056935] rcu_preempt kthread starved for 4838 jiffies! g340 c339 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 36.509055] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/2:2H:995]
[ 36.521059] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [systemd-udevd:1048]
[ 36.533056] NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [systemd-udevd:1037]
[ 36.545055] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [systemd-udevd:1036]
[ 56.497055] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [upstart-file-br:1012]
[ 94.057052] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 94.062671] 0-...: (1 ticks this GP) idle=999/140000000000001/0 softirq=738/738 fqs=418
[ 94.070780] (detected by 1, t=21010 jiffies, g=340, c=339, q=50)
[ 94.076981] rcu_preempt kthread starved for 20593 jiffies! g340 c339 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 157.077052] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 157.082673] 0-...: (1 ticks this GP) idle=999/140000000000001/0 softirq=738/738 fqs=418
[ 157.090782] (detected by 2, t=36765 jiffies, g=340, c=339, q=50)
[ 157.096986] rcu_preempt kthread starved for 36348 jiffies! g340 c339 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 220.097052] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 220.102670] 0-...: (1 ticks this GP) idle=999/140000000000001/0 softirq=738/738 fqs=418
[ 220.110779] (detected by 2, t=52520 jiffies, g=340, c=339, q=50)
[ 220.116971] rcu_preempt kthread starved for 52103 jiffies! g340 c339 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 283.117052] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 283.122670] 0-...: (1 ticks this GP) idle=999/140000000000001/0 softirq=738/738 fqs=418
[ 283.130779] (detected by 1, t=68275 jiffies, g=340, c=339, q=50)
[ 283.136973] rcu_preempt kthread starved for 67858 jiffies! g340 c339 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
Typically show-backtrace-all-active-cpus(l) gives me something like:
[ 183.282835] CPU: 0 PID: 998 Comm: systemd-udevd Tainted: G L 4.5.0-rc1+ #7
[ 183.290783] Hardware name: ARM Juno development board (r0) (DT)
[ 183.296659] task: ffffffc97437a400 ti: ffffffc973ec8000 task.ti: ffffffc973ec8000
[ 183.304095] PC is at _raw_spin_lock+0x34/0x48
[ 183.308421] LR is at find_vmap_area+0x24/0xa0
[ 183.312746] pc : [<ffffffc00065faf4>] lr : [<ffffffc000185bc4>] pstate: 60000145
[ 183.320092] sp : ffffffc973ecb6c0
[ 183.323382] x29: ffffffc973ecb6c0 x28: ffffffbde7d50300
[ 183.328662] x27: ffffffffffffffff x26: ffffffbde7d50300
[ 183.333941] x25: 000000097e513000 x24: 0000000000000001
[ 183.339219] x23: 0000000000000000 x22: 0000000000000001
[ 183.344498] x21: ffffffc000a6dd90 x20: ffffffc000a6d000
[ 183.349778] x19: ffffffc97540c000 x18: 0000007fc4e8b960
[ 183.355057] x17: 0000007fac3088d4 x16: ffffffc0001be448
[ 183.360336] x15: 003b9aca00000000 x14: 0032aa26d4000000
[ 183.365614] x13: ffffffffa94f64df x12: 0000000000000018
[ 183.370894] x11: ffffffc97eecd730 x10: 0000000000000030
[ 183.376173] x9 : ffffffbde7d50340 x8 : ffffffc0008556a0
[ 183.381451] x7 : ffffffc0008556b8 x6 : ffffffc0008556d0
[ 183.386729] x5 : ffffffc0009d2000 x4 : 0000000000000001
[ 183.392008] x3 : 000000000000d033 x2 : 000000000000000b
[ 183.397286] x1 : 00000000d038d033 x0 : ffffffc000a6dd90
[ 183.402563]
I'll have a go with lock debugging. Otherwise do you have any ideas?
Thanks,
Mark.
* Re: [PATCHv2 2/3] arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
2016-02-02 12:23 ` Mark Rutland
@ 2016-02-02 12:31 ` Mark Rutland
-1 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-02-02 12:31 UTC (permalink / raw)
To: Laura Abbott
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, linux-kernel,
linux-arm-kernel, Laura Abbott
On Tue, Feb 02, 2016 at 12:23:18PM +0000, Mark Rutland wrote:
Is there anything else in mm/ that I've potentially missed?
> I'm seeing a hang on Juno just after reaching userspace (splat below)
> with debug_pagealloc=on.
>
> It looks like something's gone wrong around find_vmap_area -- at least
> one CPU is forever awaiting vmap_area_lock, and presumably some other
> CPU has held it and gone into the weeds, leading to the RCU stalls and
> NMI lockup warnings.
[...]
> I'll have a go with lock debugging.
FWIW, with lock debugging I get the below splat before reaching userspace.
[ 0.145869] =================================
[ 0.145901] [ INFO: inconsistent lock state ]
[ 0.145935] 4.5.0-rc1+ #8 Not tainted
[ 0.145964] ---------------------------------
[ 0.145996] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 0.146036] swapper/5/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 0.146070] (vmap_area_lock){+.?...}, at: [<ffffffc0001a749c>] find_vmap_area+0x1c/0x98
[ 0.146151] {SOFTIRQ-ON-W} state was registered at:
[ 0.146184] [<ffffffc0000fc2ac>] mark_lock+0x1bc/0x708
[ 0.146229] [<ffffffc0000feb18>] __lock_acquire+0x928/0x1d90
[ 0.146274] [<ffffffc00010032c>] lock_acquire+0x9c/0xe0
[ 0.146318] [<ffffffc0006c8cd8>] _raw_spin_lock+0x40/0x58
[ 0.146362] [<ffffffc0001a749c>] find_vmap_area+0x1c/0x98
[ 0.146406] [<ffffffc0001a9c6c>] find_vm_area+0xc/0x38
[ 0.146447] [<ffffffc000096420>] change_memory_common+0x38/0x120
[ 0.146495] [<ffffffc0000965dc>] __kernel_map_pages+0x54/0x60
[ 0.146537] [<ffffffc000176744>] get_page_from_freelist+0x86c/0x9a0
[ 0.146584] [<ffffffc000176b98>] __alloc_pages_nodemask+0xf0/0x8a0
[ 0.146629] [<ffffffc0009dd46c>] alloc_pages_exact_nid+0x48/0x90
[ 0.146675] [<ffffffc0009b8004>] page_ext_init+0x94/0x124
[ 0.146718] [<ffffffc0009a390c>] start_kernel+0x350/0x3d4
[ 0.146761] [<ffffffc0000811b4>] 0xffffffc0000811b4
[ 0.146802] irq event stamp: 402
[ 0.146830] hardirqs last enabled at (402): [<ffffffc000172c58>] free_pages_prepare+0x270/0x330
[ 0.146894] hardirqs last disabled at (401): [<ffffffc000172c58>] free_pages_prepare+0x270/0x330
[ 0.146956] softirqs last enabled at (368): [<ffffffc0000bc070>] _local_bh_enable+0x20/0x48
[ 0.147022] softirqs last disabled at (369): [<ffffffc0000bca10>] irq_exit+0xa0/0xd8
[ 0.147081]
[ 0.147081] other info that might help us debug this:
[ 0.147130] Possible unsafe locking scenario:
[ 0.147130]
[ 0.147177] CPU0
[ 0.147201] ----
[ 0.147225] lock(vmap_area_lock);
[ 0.147260] <Interrupt>
[ 0.147285] lock(vmap_area_lock);
[ 0.147321]
[ 0.147321] *** DEADLOCK ***
[ 0.147321]
[ 0.147381] 1 lock held by swapper/5/0:
[ 0.147410] #0: (rcu_callback){......}, at: [<ffffffc0001193f0>] rcu_process_callbacks+0x2b8/0x5f8
[ 0.147492]
[ 0.147492] stack backtrace:
[ 0.147538] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.5.0-rc1+ #8
[ 0.147577] Hardware name: ARM Juno development board (r0) (DT)
[ 0.147613] Call trace:
[ 0.147644] [<ffffffc000089a38>] dump_backtrace+0x0/0x180
[ 0.147684] [<ffffffc000089bcc>] show_stack+0x14/0x20
[ 0.147724] [<ffffffc000373bb8>] dump_stack+0x90/0xc8
[ 0.147764] [<ffffffc00016bb3c>] print_usage_bug.part.21+0x260/0x278
[ 0.147807] [<ffffffc0000fc238>] mark_lock+0x148/0x708
[ 0.147846] [<ffffffc0000feadc>] __lock_acquire+0x8ec/0x1d90
[ 0.147887] [<ffffffc00010032c>] lock_acquire+0x9c/0xe0
[ 0.147925] [<ffffffc0006c8cd8>] _raw_spin_lock+0x40/0x58
[ 0.147965] [<ffffffc0001a749c>] find_vmap_area+0x1c/0x98
[ 0.148003] [<ffffffc0001a9c6c>] find_vm_area+0xc/0x38
[ 0.148044] [<ffffffc000096420>] change_memory_common+0x38/0x120
[ 0.148084] [<ffffffc0000965c8>] __kernel_map_pages+0x40/0x60
[ 0.148123] [<ffffffc000172cb8>] free_pages_prepare+0x2d0/0x330
[ 0.148164] [<ffffffc000174660>] __free_pages_ok+0x20/0x108
[ 0.148203] [<ffffffc0001750e0>] __free_pages+0x30/0x50
[ 0.148241] [<ffffffc0001753ac>] __free_kmem_pages+0x24/0x50
[ 0.148280] [<ffffffc000175410>] free_kmem_pages+0x38/0x40
[ 0.148320] [<ffffffc0000b5908>] free_task+0x30/0x60
[ 0.148359] [<ffffffc0000b59f8>] __put_task_struct+0xc0/0x110
[ 0.148400] [<ffffffc0000b91bc>] delayed_put_task_struct+0x44/0x50
[ 0.148442] [<ffffffc000119430>] rcu_process_callbacks+0x2f8/0x5f8
[ 0.148482] [<ffffffc0000bc594>] __do_softirq+0x13c/0x278
[ 0.148520] [<ffffffc0000bca10>] irq_exit+0xa0/0xd8
[ 0.148559] [<ffffffc00010ad90>] __handle_domain_irq+0x60/0xb8
[ 0.148599] [<ffffffc0000824f0>] gic_handle_irq+0x58/0xa8
[ 0.148636] Exception stack(0xffffffc97594be30 to 0xffffffc97594bf50)
[ 0.148678] be20: ffffffc975933f00 0000000000000243
[ 0.148734] be40: 000000097e4a7000 0000000000000000 0000000000000000 0000000000000008
[ 0.148791] be60: 00000007de290000 00000000000270f0 0000000000000001 ffffffc975948000
[ 0.148848] be80: ffffffc00179e000 0000000000000000 ffffffc001507000 ffffffc001507f00
[ 0.148903] bea0: 000000000000000e 0000000000000007 0000000000000001 0000000000000007
[ 0.148960] bec0: 000000000000000e ffffffc000a5a000 ffffffc975948000 ffffffc000a5a000
[ 0.149017] bee0: ffffffc000a38c40 ffffffc000a3c460 ffffffc975948000 ffffffc000a5a000
[ 0.149073] bf00: ffffffc000af6000 0000000000000000 0000000000000000 ffffffc97594bf50
[ 0.149130] bf20: ffffffc0000867e0 ffffffc97594bf50 ffffffc0000867e4 0000000060000045
[ 0.149185] bf40: ffffffc97594bf50 ffffffc0000867e0
[ 0.149222] [<ffffffc0000855e4>] el1_irq+0xa4/0x114
[ 0.149260] [<ffffffc0000867e4>] arch_cpu_idle+0x14/0x20
[ 0.149299] [<ffffffc0000f7878>] default_idle_call+0x18/0x30
[ 0.149339] [<ffffffc0000f7a78>] cpu_startup_entry+0x1e8/0x240
[ 0.149380] [<ffffffc00008f064>] secondary_start_kernel+0x16c/0x198
[ 0.149419] [<00000000800827fc>] 0x800827fc
The kernel then happily ran userspace for a while, but running hackbench
was sufficient to lock it up:
[ 132.624028] BUG: spinlock lockup suspected on CPU#4, hackbench/5589
[ 132.624600] BUG: spinlock lockup suspected on CPU#5, hackbench/5270
[ 132.624619] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
[ 132.626651] BUG: spinlock lockup suspected on CPU#3, hackbench/5280
[ 132.626663] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
[ 132.628358] BUG: spinlock lockup suspected on CPU#0, init/1
[ 132.628370] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
[ 132.675602] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
[ 136.619403] BUG: spinlock lockup suspected on CPU#2, hackbench/6768
[ 136.625640] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
[ 136.675626] BUG: spinlock lockup suspected on CPU#1, hackbench/7089
[ 136.681860] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
[ 152.689601] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [hackbench:7089]
[ 155.149604] INFO: rcu_preempt self-detected stall on CPU
[ 155.149609] INFO: rcu_preempt self-detected stall on CPU
[ 155.149611] INFO: rcu_preempt self-detected stall on CPU
[ 155.149625] 1-...: (6496 ticks this GP) idle=fef/140000000000002/0 softirq=1935/1935 fqs=1
[ 155.149639]
[ 155.149640] 4-...: (6493 ticks this GP) idle=305/140000000000002/0 softirq=1961/1961 fqs=1
[ 155.149650]
[ 155.149651] rcu_preempt kthread starved for 6499 jiffies! g1204 c1203 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 155.149665] rcu_preempt kthread starved for 6499 jiffies! g1204 c1203 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 155.205127] 0-...: (6498 ticks this GP) idle=a2d/140000000000002/0 softirq=2595/2595 fqs=1
[ 155.213526] (t=6516 jiffies g=1204 c=1203 q=422)
[ 155.218307] rcu_preempt kthread starved for 6516 jiffies! g1204 c1203 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 156.677602] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [init:1]
[ 156.701602] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [hackbench:6768]
[ 156.713603] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [hackbench:5280]
[ 156.725602] NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [hackbench:5589]
[ 156.737603] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [hackbench:5270]
Thanks,
Mark.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCHv2 2/3] arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
2016-02-02 12:31 ` Mark Rutland
@ 2016-02-02 16:08 ` Laura Abbott
-1 siblings, 0 replies; 24+ messages in thread
From: Laura Abbott @ 2016-02-02 16:08 UTC (permalink / raw)
To: Mark Rutland
Cc: Ard Biesheuvel, Catalin Marinas, Will Deacon, linux-kernel,
linux-arm-kernel, Laura Abbott
On 02/02/2016 04:31 AM, Mark Rutland wrote:
> On Tue, Feb 02, 2016 at 12:23:18PM +0000, Mark Rutland wrote:
> Is there anything else in mm/ that I've potentially missed?
>> I'm seeing a hang on Juno just after reaching userspace (splat below)
>> with debug_pagealloc=on.
>>
>> It looks like something's gone wrong around find_vmap_area -- at least
>> one CPU is forever awaiting vmap_area_lock, and presumably some other
>> CPU has held it and gone into the weeds, leading to the RCU stalls and
>> NMI lockup warnings.
>
> [...]
>
>> I'll have a go with lock debugging.
>
> FWIW, with lock debugging I get the below splat before reaching userspace.
>
> [ 0.145869] =================================
> [ 0.145901] [ INFO: inconsistent lock state ]
> [ 0.145935] 4.5.0-rc1+ #8 Not tainted
> [ 0.145964] ---------------------------------
> [ 0.145996] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> [ 0.146036] swapper/5/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
> [ 0.146070] (vmap_area_lock){+.?...}, at: [<ffffffc0001a749c>] find_vmap_area+0x1c/0x98
> [ 0.146151] {SOFTIRQ-ON-W} state was registered at:
> [ 0.146184] [<ffffffc0000fc2ac>] mark_lock+0x1bc/0x708
> [ 0.146229] [<ffffffc0000feb18>] __lock_acquire+0x928/0x1d90
> [ 0.146274] [<ffffffc00010032c>] lock_acquire+0x9c/0xe0
> [ 0.146318] [<ffffffc0006c8cd8>] _raw_spin_lock+0x40/0x58
> [ 0.146362] [<ffffffc0001a749c>] find_vmap_area+0x1c/0x98
> [ 0.146406] [<ffffffc0001a9c6c>] find_vm_area+0xc/0x38
> [ 0.146447] [<ffffffc000096420>] change_memory_common+0x38/0x120
> [ 0.146495] [<ffffffc0000965dc>] __kernel_map_pages+0x54/0x60
> [ 0.146537] [<ffffffc000176744>] get_page_from_freelist+0x86c/0x9a0
> [ 0.146584] [<ffffffc000176b98>] __alloc_pages_nodemask+0xf0/0x8a0
> [ 0.146629] [<ffffffc0009dd46c>] alloc_pages_exact_nid+0x48/0x90
> [ 0.146675] [<ffffffc0009b8004>] page_ext_init+0x94/0x124
> [ 0.146718] [<ffffffc0009a390c>] start_kernel+0x350/0x3d4
> [ 0.146761] [<ffffffc0000811b4>] 0xffffffc0000811b4
> [ 0.146802] irq event stamp: 402
> [ 0.146830] hardirqs last enabled at (402): [<ffffffc000172c58>] free_pages_prepare+0x270/0x330
> [ 0.146894] hardirqs last disabled at (401): [<ffffffc000172c58>] free_pages_prepare+0x270/0x330
> [ 0.146956] softirqs last enabled at (368): [<ffffffc0000bc070>] _local_bh_enable+0x20/0x48
> [ 0.147022] softirqs last disabled at (369): [<ffffffc0000bca10>] irq_exit+0xa0/0xd8
> [ 0.147081]
> [ 0.147081] other info that might help us debug this:
> [ 0.147130] Possible unsafe locking scenario:
> [ 0.147130]
> [ 0.147177] CPU0
> [ 0.147201] ----
> [ 0.147225] lock(vmap_area_lock);
> [ 0.147260] <Interrupt>
> [ 0.147285] lock(vmap_area_lock);
> [ 0.147321]
> [ 0.147321] *** DEADLOCK ***
> [ 0.147321]
> [ 0.147381] 1 lock held by swapper/5/0:
> [ 0.147410] #0: (rcu_callback){......}, at: [<ffffffc0001193f0>] rcu_process_callbacks+0x2b8/0x5f8
> [ 0.147492]
> [ 0.147492] stack backtrace:
> [ 0.147538] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.5.0-rc1+ #8
> [ 0.147577] Hardware name: ARM Juno development board (r0) (DT)
> [ 0.147613] Call trace:
> [ 0.147644] [<ffffffc000089a38>] dump_backtrace+0x0/0x180
> [ 0.147684] [<ffffffc000089bcc>] show_stack+0x14/0x20
> [ 0.147724] [<ffffffc000373bb8>] dump_stack+0x90/0xc8
> [ 0.147764] [<ffffffc00016bb3c>] print_usage_bug.part.21+0x260/0x278
> [ 0.147807] [<ffffffc0000fc238>] mark_lock+0x148/0x708
> [ 0.147846] [<ffffffc0000feadc>] __lock_acquire+0x8ec/0x1d90
> [ 0.147887] [<ffffffc00010032c>] lock_acquire+0x9c/0xe0
> [ 0.147925] [<ffffffc0006c8cd8>] _raw_spin_lock+0x40/0x58
> [ 0.147965] [<ffffffc0001a749c>] find_vmap_area+0x1c/0x98
> [ 0.148003] [<ffffffc0001a9c6c>] find_vm_area+0xc/0x38
> [ 0.148044] [<ffffffc000096420>] change_memory_common+0x38/0x120
> [ 0.148084] [<ffffffc0000965c8>] __kernel_map_pages+0x40/0x60
> [ 0.148123] [<ffffffc000172cb8>] free_pages_prepare+0x2d0/0x330
> [ 0.148164] [<ffffffc000174660>] __free_pages_ok+0x20/0x108
> [ 0.148203] [<ffffffc0001750e0>] __free_pages+0x30/0x50
> [ 0.148241] [<ffffffc0001753ac>] __free_kmem_pages+0x24/0x50
> [ 0.148280] [<ffffffc000175410>] free_kmem_pages+0x38/0x40
> [ 0.148320] [<ffffffc0000b5908>] free_task+0x30/0x60
> [ 0.148359] [<ffffffc0000b59f8>] __put_task_struct+0xc0/0x110
> [ 0.148400] [<ffffffc0000b91bc>] delayed_put_task_struct+0x44/0x50
> [ 0.148442] [<ffffffc000119430>] rcu_process_callbacks+0x2f8/0x5f8
> [ 0.148482] [<ffffffc0000bc594>] __do_softirq+0x13c/0x278
> [ 0.148520] [<ffffffc0000bca10>] irq_exit+0xa0/0xd8
> [ 0.148559] [<ffffffc00010ad90>] __handle_domain_irq+0x60/0xb8
> [ 0.148599] [<ffffffc0000824f0>] gic_handle_irq+0x58/0xa8
> [ 0.148636] Exception stack(0xffffffc97594be30 to 0xffffffc97594bf50)
> [ 0.148678] be20: ffffffc975933f00 0000000000000243
> [ 0.148734] be40: 000000097e4a7000 0000000000000000 0000000000000000 0000000000000008
> [ 0.148791] be60: 00000007de290000 00000000000270f0 0000000000000001 ffffffc975948000
> [ 0.148848] be80: ffffffc00179e000 0000000000000000 ffffffc001507000 ffffffc001507f00
> [ 0.148903] bea0: 000000000000000e 0000000000000007 0000000000000001 0000000000000007
> [ 0.148960] bec0: 000000000000000e ffffffc000a5a000 ffffffc975948000 ffffffc000a5a000
> [ 0.149017] bee0: ffffffc000a38c40 ffffffc000a3c460 ffffffc975948000 ffffffc000a5a000
> [ 0.149073] bf00: ffffffc000af6000 0000000000000000 0000000000000000 ffffffc97594bf50
> [ 0.149130] bf20: ffffffc0000867e0 ffffffc97594bf50 ffffffc0000867e4 0000000060000045
> [ 0.149185] bf40: ffffffc97594bf50 ffffffc0000867e0
> [ 0.149222] [<ffffffc0000855e4>] el1_irq+0xa4/0x114
> [ 0.149260] [<ffffffc0000867e4>] arch_cpu_idle+0x14/0x20
> [ 0.149299] [<ffffffc0000f7878>] default_idle_call+0x18/0x30
> [ 0.149339] [<ffffffc0000f7a78>] cpu_startup_entry+0x1e8/0x240
> [ 0.149380] [<ffffffc00008f064>] secondary_start_kernel+0x16c/0x198
> [ 0.149419] [<00000000800827fc>] 0x800827fc
>
> The kernel then happily ran userspace for a while, but running hackbench
> was sufficient to lock it up:
>
> [ 132.624028] BUG: spinlock lockup suspected on CPU#4, hackbench/5589
> [ 132.624600] BUG: spinlock lockup suspected on CPU#5, hackbench/5270
> [ 132.624619] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
> [ 132.626651] BUG: spinlock lockup suspected on CPU#3, hackbench/5280
> [ 132.626663] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
> [ 132.628358] BUG: spinlock lockup suspected on CPU#0, init/1
> [ 132.628370] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
> [ 132.675602] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
> [ 136.619403] BUG: spinlock lockup suspected on CPU#2, hackbench/6768
> [ 136.625640] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
> [ 136.675626] BUG: spinlock lockup suspected on CPU#1, hackbench/7089
> [ 136.681860] lock: vmap_area_lock+0x0/0x38, .magic: dead4ead, .owner: hackbench/7089, .owner_cpu: 1
> [ 152.689601] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [hackbench:7089]
> [ 155.149604] INFO: rcu_preempt self-detected stall on CPU
> [ 155.149609] INFO: rcu_preempt self-detected stall on CPU
> [ 155.149611] INFO: rcu_preempt self-detected stall on CPU
> [ 155.149625] 1-...: (6496 ticks this GP) idle=fef/140000000000002/0 softirq=1935/1935 fqs=1
> [ 155.149639]
> [ 155.149640] 4-...: (6493 ticks this GP) idle=305/140000000000002/0 softirq=1961/1961 fqs=1
> [ 155.149650]
> [ 155.149651] rcu_preempt kthread starved for 6499 jiffies! g1204 c1203 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
> [ 155.149665] rcu_preempt kthread starved for 6499 jiffies! g1204 c1203 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
> [ 155.205127] 0-...: (6498 ticks this GP) idle=a2d/140000000000002/0 softirq=2595/2595 fqs=1
> [ 155.213526] (t=6516 jiffies g=1204 c=1203 q=422)
> [ 155.218307] rcu_preempt kthread starved for 6516 jiffies! g1204 c1203 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
> [ 156.677602] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [init:1]
> [ 156.701602] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [hackbench:6768]
> [ 156.713603] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [hackbench:5280]
> [ 156.725602] NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [hackbench:5589]
> [ 156.737603] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [hackbench:5270]
>
> Thanks,
> Mark.
>
Yes, this is absolutely a deadlock on vmap_area_lock. I suspect your suggestion to pull it out,
so we don't do the extra check in this path, should fix it.
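For reference, the shape of the fix being discussed might look something like the following. This is a hedged sketch only, not the applied patch; `__change_memory_common` is assumed here to be a helper that modifies linear-map PTEs without any vmalloc-area lookup. The point is that the debug_pagealloc hook flips PTE_VALID on the linear mapping directly, so it never reaches find_vm_area() and never touches vmap_area_lock, making it safe to call from the softirq free path:

```c
/*
 * Hedged sketch, not the applied patch: operate on the linear map
 * directly instead of going through change_memory_common(), which
 * takes vmap_area_lock via find_vm_area() and so must not be called
 * from softirq context (e.g. the RCU free path).
 */
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
	unsigned long addr = (unsigned long)page_address(page);

	if (enable)
		__change_memory_common(addr, numpages * PAGE_SIZE,
				       __pgprot(PTE_VALID),	/* set */
				       __pgprot(0));		/* clear */
	else
		__change_memory_common(addr, numpages * PAGE_SIZE,
				       __pgprot(0),		/* set */
				       __pgprot(PTE_VALID));	/* clear */
}
```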
Thanks,
Laura
^ permalink raw reply [flat|nested] 24+ messages in thread
> [ 156.737603] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [hackbench:5270]
>
> Thanks,
> Mark.
>
Yes, this is absolutely a deadlock on vmap_area_lock. Your suggestion to pull the check
out, so we don't take the lock a second time, should fix it, I suspect.
Thanks,
Laura