All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jerome Glisse <jglisse@redhat.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Will Deacon <will.deacon@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: struct dev_pagemap corruption
Date: Fri, 5 Apr 2019 09:53:24 -0400	[thread overview]
Message-ID: <20190405135324.GA5627@redhat.com> (raw)
In-Reply-To: <7885dce0-edbe-db04-b5ec-bd271c9a0612@arm.com>

On Fri, Apr 05, 2019 at 10:10:22AM +0530, Anshuman Khandual wrote:
> Hello,
> 
> On arm64 platform "struct dev_pagemap" is getting corrupted during ZONE_DEVICE
> unmapping path through device_destroy(). Its device memory range end address
> (pgmap->res.end) which is getting corrupted in this particular case. AFAICS
> pgmap which gets initialized by the driver and mapped with devm_memremap_pages()
> should retain it's values during the unmapping path as well. Is this assumption
> right ?
> 
> [   62.779412] Call trace:
> [   62.779808]  dump_backtrace+0x0/0x118
> [   62.780460]  show_stack+0x14/0x20
> [   62.781204]  dump_stack+0xa8/0xcc
> [   62.781941]  devm_memremap_pages_release+0x24/0x1d8
> [   62.783021]  devm_action_release+0x10/0x18
> [   62.783911]  release_nodes+0x1b0/0x220
> [   62.784732]  devres_release_all+0x34/0x50
> [   62.785623]  device_release+0x24/0x90
> [   62.786454]  kobject_put+0x74/0xe8
> [   62.787214]  device_destroy+0x48/0x58
> [   62.788041]  zone_device_public_altmap_init+0x404/0x42c [zone_device_public_altmap]
> [   62.789675]  do_one_initcall+0x74/0x190
> [   62.790528]  do_init_module+0x50/0x1c0
> [   62.791346]  load_module+0x1be4/0x2140
> [   62.792192]  __se_sys_finit_module+0xb8/0xc8
> [   62.793128]  __arm64_sys_finit_module+0x18/0x20
> [   62.794128]  el0_svc_handler+0x88/0x100
> [   62.794989]  el0_svc+0x8/0xc
> 
> The problem can be traced down here.
> 
> diff --git a/drivers/base/devres.c b/drivers/base/devres.c
> index e038e2b3b7ea..2a410c88c596 100644
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -33,7 +33,7 @@ struct devres {
>          * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
>          * buffer alignment as if it was allocated by plain kmalloc().
>          */
> -       u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
> +       u8 __aligned(__alignof__(unsigned long long)) data[];
>  };

I doubt that pgmap->res.end get corrupted during device_destroy() but given
that the above changes fix the issue it kind of boggle the mind. If i where
to debug this i would probably run a kernel with qemu -s to get a gdbserver
and then attach gdb and set breakpoint on devm_memremap_pages() then when
that trigger i would set memory watch on the pgmap->res.end (there use to be
way to use memory as pmem through kernel boot option).

A printk alternative solution is, assuming you only have one pgmap, add a
global static struct pgmap *debug_pgmap = NULL; in memremap.c set that in
devm_memremap_pages() and add an helper function:

void debug_pgmap(const char *file, unsigned line)
{
    printk(... file, line);
    printk(... debug_pmap->res.end);
}

In a header:
#define DEBUG_PGMAP debug_pgmap(__FILE__, __LINE__);

Then sprinkle DEBUG_PGMAP within device_destroy(), device_unregister(),
device_release() and see when it get corrupted.

gdb would be faster but sometime i got issue with memory watchpoint and virt.

Cheers,
Jérôme


WARNING: multiple messages have this Message-ID (diff)
From: Jerome Glisse <jglisse@redhat.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Will Deacon <will.deacon@arm.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: struct dev_pagemap corruption
Date: Fri, 5 Apr 2019 09:53:24 -0400	[thread overview]
Message-ID: <20190405135324.GA5627@redhat.com> (raw)
In-Reply-To: <7885dce0-edbe-db04-b5ec-bd271c9a0612@arm.com>

On Fri, Apr 05, 2019 at 10:10:22AM +0530, Anshuman Khandual wrote:
> Hello,
> 
> On arm64 platform "struct dev_pagemap" is getting corrupted during ZONE_DEVICE
> unmapping path through device_destroy(). Its device memory range end address
> (pgmap->res.end) which is getting corrupted in this particular case. AFAICS
> pgmap which gets initialized by the driver and mapped with devm_memremap_pages()
> should retain it's values during the unmapping path as well. Is this assumption
> right ?
> 
> [   62.779412] Call trace:
> [   62.779808]  dump_backtrace+0x0/0x118
> [   62.780460]  show_stack+0x14/0x20
> [   62.781204]  dump_stack+0xa8/0xcc
> [   62.781941]  devm_memremap_pages_release+0x24/0x1d8
> [   62.783021]  devm_action_release+0x10/0x18
> [   62.783911]  release_nodes+0x1b0/0x220
> [   62.784732]  devres_release_all+0x34/0x50
> [   62.785623]  device_release+0x24/0x90
> [   62.786454]  kobject_put+0x74/0xe8
> [   62.787214]  device_destroy+0x48/0x58
> [   62.788041]  zone_device_public_altmap_init+0x404/0x42c [zone_device_public_altmap]
> [   62.789675]  do_one_initcall+0x74/0x190
> [   62.790528]  do_init_module+0x50/0x1c0
> [   62.791346]  load_module+0x1be4/0x2140
> [   62.792192]  __se_sys_finit_module+0xb8/0xc8
> [   62.793128]  __arm64_sys_finit_module+0x18/0x20
> [   62.794128]  el0_svc_handler+0x88/0x100
> [   62.794989]  el0_svc+0x8/0xc
> 
> The problem can be traced down here.
> 
> diff --git a/drivers/base/devres.c b/drivers/base/devres.c
> index e038e2b3b7ea..2a410c88c596 100644
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -33,7 +33,7 @@ struct devres {
>          * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
>          * buffer alignment as if it was allocated by plain kmalloc().
>          */
> -       u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
> +       u8 __aligned(__alignof__(unsigned long long)) data[];
>  };

I doubt that pgmap->res.end get corrupted during device_destroy() but given
that the above changes fix the issue it kind of boggle the mind. If i where
to debug this i would probably run a kernel with qemu -s to get a gdbserver
and then attach gdb and set breakpoint on devm_memremap_pages() then when
that trigger i would set memory watch on the pgmap->res.end (there use to be
way to use memory as pmem through kernel boot option).

A printk alternative solution is, assuming you only have one pgmap, add a
global static struct pgmap *debug_pgmap = NULL; in memremap.c set that in
devm_memremap_pages() and add an helper function:

void debug_pgmap(const char *file, unsigned line)
{
    printk(... file, line);
    printk(... debug_pmap->res.end);
}

In a header:
#define DEBUG_PGMAP debug_pgmap(__FILE__, __LINE__);

Then sprinkle DEBUG_PGMAP within device_destroy(), device_unregister(),
device_release() and see when it get corrupted.

gdb would be faster but sometime i got issue with memory watchpoint and virt.

Cheers,
Jérôme

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2019-04-05 13:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-05  4:40 struct dev_pagemap corruption Anshuman Khandual
2019-04-05  4:40 ` Anshuman Khandual
2019-04-05 13:37 ` Robin Murphy
2019-04-05 13:37   ` Robin Murphy
2019-04-05 14:54   ` Anshuman Khandual
2019-04-05 14:54     ` Anshuman Khandual
2019-04-05 15:57     ` Robin Murphy
2019-04-05 15:57       ` Robin Murphy
2019-04-07 22:25     ` Dan Williams
2019-04-07 22:25       ` Dan Williams
2019-04-05 13:42 ` Catalin Marinas
2019-04-05 13:42   ` Catalin Marinas
2019-04-05 14:58   ` Anshuman Khandual
2019-04-05 14:58     ` Anshuman Khandual
2019-04-05 13:53 ` Jerome Glisse [this message]
2019-04-05 13:53   ` Jerome Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190405135324.GA5627@redhat.com \
    --to=jglisse@redhat.com \
    --cc=anshuman.khandual@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.