linux-mm.kvack.org archive mirror
* [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
@ 2018-11-13 12:38 Yongkai Wu
  2018-11-13 13:04 ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Yongkai Wu @ 2018-11-13 12:38 UTC (permalink / raw)
  To: mike.kravetz; +Cc: linux-mm, linux-kernel

It is better to keep the page mapping info when free_huge_page() hits the
VM_BUG_ON_PAGE, so we can get more information from the coredump for
further analysis.

Signed-off-by: Yongkai Wu <nic_w@163.com>
---
 mm/hugetlb.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c007fb5..ba693bb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1248,10 +1248,11 @@ void free_huge_page(struct page *page)
  (struct hugepage_subpool *)page_private(page);
  bool restore_reserve;

+        VM_BUG_ON_PAGE(page_count(page), page);
+        VM_BUG_ON_PAGE(page_mapcount(page), page);
+
  set_page_private(page, 0);
  page->mapping = NULL;
- VM_BUG_ON_PAGE(page_count(page), page);
- VM_BUG_ON_PAGE(page_mapcount(page), page);
  restore_reserve = PagePrivate(page);
  ClearPagePrivate(page);

-- 
1.8.3.1

* Re: [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
  2018-11-13 12:38 [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE Yongkai Wu
@ 2018-11-13 13:04 ` Michal Hocko
  2018-11-13 15:12   ` Yongkai Wu
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2018-11-13 13:04 UTC (permalink / raw)
  To: Yongkai Wu; +Cc: mike.kravetz, linux-mm, linux-kernel

On Tue 13-11-18 20:38:16, Yongkai Wu wrote:
> It is better to keep the page mapping info when free_huge_page() hits the
> VM_BUG_ON_PAGE, so we can get more information from the coredump for
> further analysis.

The patch seems to be whitespace damaged. That aside, have you
actually seen a case where preserving the page state would help to nail
down any bug?

I am not objecting to the patch; it actually makes some sense to me. I
am just curious about the background motivation.
 
> Signed-off-by: Yongkai Wu <nic_w@163.com>
> ---
>  mm/hugetlb.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c007fb5..ba693bb 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1248,10 +1248,11 @@ void free_huge_page(struct page *page)
>   (struct hugepage_subpool *)page_private(page);
>   bool restore_reserve;
> 
> +        VM_BUG_ON_PAGE(page_count(page), page);
> +        VM_BUG_ON_PAGE(page_mapcount(page), page);
> +
>   set_page_private(page, 0);
>   page->mapping = NULL;
> - VM_BUG_ON_PAGE(page_count(page), page);
> - VM_BUG_ON_PAGE(page_mapcount(page), page);
>   restore_reserve = PagePrivate(page);
>   ClearPagePrivate(page);
> 
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
  2018-11-13 13:04 ` Michal Hocko
@ 2018-11-13 15:12   ` Yongkai Wu
  2018-11-13 15:24     ` Michal Hocko
  2018-11-13 18:04     ` Mike Kravetz
  0 siblings, 2 replies; 8+ messages in thread
From: Yongkai Wu @ 2018-11-13 15:12 UTC (permalink / raw)
  To: mhocko; +Cc: mike.kravetz, linux-mm, linux-kernel

Dear Maintainer,
I met a VM_BUG_ON_PAGE issue on CentOS 7.4 some days ago. When the issue
first happened, I could only tell that it happened in free_huge_page()
while soft offlining a huge page. But because page->mapping had been set
to NULL, I could not get any further information about how the issue
happened.

So I modified the code as the patch shows, applied the new code to our
production line, and waited some time; then the issue came again. This
time I could recover the whole file path that triggered the issue by
using the crash tool to get the inode, dentry, and so on. That helped me
find a way to reproduce the issue quite easily, and I finally found the
root cause and solved it.

I think that if we keep page->mapping, we can even walk the rmap to
check more details by using the crash tool to analyse the coredump.
So I think preserving the page state would more or less help debugging.
But if it is not so meaningful, just let it go. ^_^

Thank you for your time.

Best Regards

On Tue, Nov 13, 2018 at 9:04 PM Michal Hocko <mhocko@kernel.org> wrote:

> On Tue 13-11-18 20:38:16, Yongkai Wu wrote:
> > It is better to keep the page mapping info when free_huge_page() hits the
> > VM_BUG_ON_PAGE, so we can get more information from the coredump for
> > further analysis.
>
> The patch seems to be whitespace damaged. That aside, have you
> actually seen a case where preserving the page state would help to nail
> down any bug?
>
> I am not objecting to the patch; it actually makes some sense to me. I
> am just curious about the background motivation.
>
> > Signed-off-by: Yongkai Wu <nic_w@163.com>
> > ---
> >  mm/hugetlb.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index c007fb5..ba693bb 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1248,10 +1248,11 @@ void free_huge_page(struct page *page)
> >   (struct hugepage_subpool *)page_private(page);
> >   bool restore_reserve;
> >
> > +        VM_BUG_ON_PAGE(page_count(page), page);
> > +        VM_BUG_ON_PAGE(page_mapcount(page), page);
> > +
> >   set_page_private(page, 0);
> >   page->mapping = NULL;
> > - VM_BUG_ON_PAGE(page_count(page), page);
> > - VM_BUG_ON_PAGE(page_mapcount(page), page);
> >   restore_reserve = PagePrivate(page);
> >   ClearPagePrivate(page);
> >
> > --
> > 1.8.3.1
>
> --
> Michal Hocko
> SUSE Labs
>

* Re: [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
  2018-11-13 15:12   ` Yongkai Wu
@ 2018-11-13 15:24     ` Michal Hocko
  2018-11-13 18:04     ` Mike Kravetz
  1 sibling, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2018-11-13 15:24 UTC (permalink / raw)
  To: Yongkai Wu; +Cc: mike.kravetz, linux-mm, linux-kernel

[Please do not top-post]

On Tue 13-11-18 23:12:24, Yongkai Wu wrote:
> Dear Maintainer,
> I met a VM_BUG_ON_PAGE issue on CentOS 7.4 some days ago. When the issue
> first happened, I could only tell that it happened in free_huge_page()
> while soft offlining a huge page. But because page->mapping had been set
> to NULL, I could not get any further information about how the issue
> happened.
>
> So I modified the code as the patch shows, applied the new code to our
> production line, and waited some time; then the issue came again. This
> time I could recover the whole file path that triggered the issue by
> using the crash tool to get the inode, dentry, and so on. That helped me
> find a way to reproduce the issue quite easily, and I finally found the
> root cause and solved it.

OK, thanks for the clarification. Please repost without
the patch being clobbered by your email client and feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

A note about your experience in the changelog would be useful IMHO.
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
  2018-11-13 15:12   ` Yongkai Wu
  2018-11-13 15:24     ` Michal Hocko
@ 2018-11-13 18:04     ` Mike Kravetz
  2018-11-14 15:37       ` Yongkai Wu
  2018-11-15  8:30       ` Yongkai Wu
  1 sibling, 2 replies; 8+ messages in thread
From: Mike Kravetz @ 2018-11-13 18:04 UTC (permalink / raw)
  To: Yongkai Wu, mhocko; +Cc: linux-mm, linux-kernel

On 11/13/18 7:12 AM, Yongkai Wu wrote:
> Dear Maintainer,
> I met a VM_BUG_ON_PAGE issue on CentOS 7.4 some days ago. When the issue
> first happened, I could only tell that it happened in free_huge_page()
> while soft offlining a huge page. But because page->mapping had been set
> to NULL, I could not get any further information about how the issue
> happened.
>
> So I modified the code as the patch shows, applied the new code to our
> production line, and waited some time; then the issue came again. This
> time I could recover the whole file path that triggered the issue by
> using the crash tool to get the inode, dentry, and so on. That helped me
> find a way to reproduce the issue quite easily, and I finally found the
> root cause and solved it.

Thank you for the information and the patch.

As previously stated by Michal, please add some additional information to the
change log (commit message) and fix the formatting of the patch.

Can you tell us more about the root cause of your issue?  What was the issue?
How could you reproduce it?  How did you solve it?
-- 
Mike Kravetz

* Re: [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
  2018-11-13 18:04     ` Mike Kravetz
@ 2018-11-14 15:37       ` Yongkai Wu
  2018-11-15  8:30       ` Yongkai Wu
  1 sibling, 0 replies; 8+ messages in thread
From: Yongkai Wu @ 2018-11-14 15:37 UTC (permalink / raw)
  To: mike.kravetz; +Cc: mhocko, linux-mm, linux-kernel

Dear Mike,
Glad to get your reply. I am sorry for the late reply; I am on a business
trip today. I will give more detail about the issue I met later.
Because I have no experience with sending a kernel patch correctly, it
may take some time for me to learn that before I resend the patch.

Best regards

On Wed, Nov 14, 2018 at 2:05 AM Mike Kravetz <mike.kravetz@oracle.com>
wrote:

> On 11/13/18 7:12 AM, Yongkai Wu wrote:
> > Dear Maintainer,
> > I met a VM_BUG_ON_PAGE issue on CentOS 7.4 some days ago. When the issue
> > first happened, I could only tell that it happened in free_huge_page()
> > while soft offlining a huge page. But because page->mapping had been set
> > to NULL, I could not get any further information about how the issue
> > happened.
> >
> > So I modified the code as the patch shows, applied the new code to our
> > production line, and waited some time; then the issue came again. This
> > time I could recover the whole file path that triggered the issue by
> > using the crash tool to get the inode, dentry, and so on. That helped me
> > find a way to reproduce the issue quite easily, and I finally found the
> > root cause and solved it.
>
> Thank you for the information and the patch.
>
> As previously stated by Michal, please add some additional information
> to the change log (commit message) and fix the formatting of the patch.
>
> Can you tell us more about the root cause of your issue?  What was the
> issue?
> How could you reproduce it?  How did you solve it?
> --
> Mike Kravetz
>

* Re: [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
  2018-11-13 18:04     ` Mike Kravetz
  2018-11-14 15:37       ` Yongkai Wu
@ 2018-11-15  8:30       ` Yongkai Wu
  2018-11-15  8:52         ` Michal Hocko
  1 sibling, 1 reply; 8+ messages in thread
From: Yongkai Wu @ 2018-11-15  8:30 UTC (permalink / raw)
  To: mike.kravetz; +Cc: mhocko, linux-mm, linux-kernel

Dear Maintainer,
The following is the detailed information about the issue I met.
It may be a bit long to read, but I will try to explain it thoroughly.



We use CentOS 7.4 (kernel version 3.10.0-693.el7.x86_64).
Several days ago I met a kernel panic; the kernel log showed:

[759990.616719] EDAC MC2: 51 CE memory read error on CPU_SrcID#1_MC#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x4dd4f69 offset:0xcc0 grain:32 syndrome:0x0 -  OVERFLOW err_code:0101:0091 socket:1 imc:0 rank:0 bg:1 ba:1 row:1371a col:1a8)
[759990.627721] soft offline: 0x4dd4f69: migration failed 1, type 6fffff00008000
[759990.627743] ------------[ cut here ]------------
[759990.627763] kernel BUG at /data/rpmbuild/BUILD/kernel-3.10.0/kernel-3.10.0/mm/hugetlb.c:1250!
[759990.628390] CPU: 27 PID: 457768 Comm: mcelog
[759990.628413] Hardware name: Lenovo HR650X           /HR650X     , BIOS HR6N0333 06/23/2018
[759990.628433] task: ffff882f7f4f4f10 ti: ffff883326f10000 task.ti: ffff883326f10000
[759990.628452] RIP: 0010:[<ffffffff811cc094>]  [<ffffffff811cc094>] free_huge_page+0x1e4/0x200
[759990.628479] RSP: 0018:ffff883326f13d80  EFLAGS: 00010213
[759990.628493] RAX: 0000000000000001 RBX: ffffea0137000000 RCX: 0000000000000012
[759990.628511] RDX: 0000000040000000 RSI: ffffffff81f55500 RDI: ffffea0137000000
[759990.628529] RBP: ffff883326f13da8 R08: ffffffff81f4e0e8 R09: 0000000000000000
......
[759990.628741] Call Trace:
[759990.628752]  [<ffffffff8169fe6a>] __put_compound_page+0x1f/0x22
[759990.628768]  [<ffffffff8169fea2>] put_compound_page+0x35/0x174
[759990.628786]  [<ffffffff811905a5>] put_page+0x45/0x50
[759990.629591]  [<ffffffff811d09a0>] putback_active_hugepage+0xd0/0xf0
[759990.630365]  [<ffffffff811fa0eb>] soft_offline_page+0x4db/0x580
[759990.631134]  [<ffffffff81453595>] store_soft_offline_page+0xa5/0xe0
[759990.631900]  [<ffffffff8143a298>] dev_attr_store+0x18/0x30
[759990.632660]  [<ffffffff81280486>] sysfs_write_file+0xc6/0x140
[759990.633409]  [<ffffffff8120101d>] vfs_write+0xbd/0x1e0
[759990.634148]  [<ffffffff81201e2f>] SyS_write+0x7f/0xe0
[759990.634870]  [<ffffffff816b4849>] system_call_fastpath+0x16/0x1b




I checked the coredump by disassembling free_huge_page():
0xffffffff811cbf78 <free_huge_page+0xc8>:       cmp    $0xffffffff,%eax
0xffffffff811cbf7b <free_huge_page+0xcb>:       jne    0xffffffff811cc094 <free_huge_page+0x1e4>
......
0xffffffff811cc094 <free_huge_page+0x1e4>:    ud2


Checking against the source code, I could only tell the panic reason: the
page had page->_count == 0 but page->_mapcount == 1, so it hit the
VM_BUG_ON_PAGE(page_mapcount(page), page); the ud2 at free_huge_page+0x1e4
is how BUG() traps on x86_64. But I could not get any further clue about
how the issue happened.




So I modified the code as the patch shows, applied the new code to our
production line, and waited some days; then the issue came again on
another server. This time, by analysing the coredump with the crash tool,
I could recover the whole file path that triggered the issue.
For example:
crash> page.mapping ffffea02f9000000
      mapping = 0xffff88b098ae8160
crash> address_space.host 0xffff88b098ae8160
      host = 0xffff88b098ae8010
crash> inode.i_dentry 0xffff88b098ae8010
        i_dentry = {
              first = 0xffff88b0bbeb58b0
        }
crash> dentry.d_name.name -l dentry.d_alias 0xffff88b0bbeb58b0
       d_name.name = 0xffff88b0bbeb5838 "file_a"

So I know the issue happened while soft offlining a page of the file
"file_a". I can also recover the whole file path by walking
dentry.d_parent and checking each dentry's name.
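
For reference, one step of that parent walk looks like the sketch below
(the addresses and the "dir_b" name here are hypothetical, not taken from
the actual coredump); repeating it until d_parent points to itself reaches
the filesystem root, which yields the full path:

crash> dentry.d_parent 0xffff88b0bbeb58b0
      d_parent = 0xffff88b0bbeb5700
crash> dentry.d_name.name 0xffff88b0bbeb5700
      d_name.name = 0xffff88b0bbeb5738 "dir_b"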




Checking with another team, I learned that their user component uses
file_a all the time, so page->_mapcount not being equal to -1 seems
normal; it is page->_count == 0 that was abnormal at that time.

I guessed that if I triggered a soft offline on the physical address of a
page used by file_a, the issue might reproduce. So I wrote a user
application that mmaps file_a and gets the physical address of the page;
the key steps are as follows:
      fd = open(FILE_A_PATH, O_RDWR, 0666);
      buf = mmap(NULL, pagesize, PROT_READ, MAP_SHARED, fd, 0);
      phys_addr = vtop((unsigned long long)buf);
In the function vtop(), I use /proc/pid/pagemap to get the physical
address of the page.
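
For completeness, a minimal sketch of such a vtop() is shown below. This
is an illustrative reconstruction, not my exact code: it assumes the
standard pagemap layout, where each 64-bit entry has bit 63 set when the
page is present and carries the PFN in bits 0-54.

#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

static unsigned long long vtop(unsigned long long vaddr)
{
    uint64_t entry;
    long psize = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);

    if (fd < 0)
        return 0;
    /* one 8-byte entry per virtual page of the process */
    if (pread(fd, &entry, sizeof(entry),
              (vaddr / psize) * sizeof(entry)) != sizeof(entry)) {
        close(fd);
        return 0;
    }
    close(fd);
    if (!(entry & (1ULL << 63)))    /* bit 63: page present */
        return 0;
    /* bits 0-54: page frame number */
    return (entry & ((1ULL << 55) - 1)) * psize + (vaddr % psize);
}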

Suppose the physical address is 0xbe40000000; then I can trigger a soft
offline on that address:
        echo 0xbe40000000 > /sys/devices/system/memory/soft_offline_page
After I triggered it two or more times, the issue reproduced.





Then I used SystemTap to probe page->_count and page->_mapcount in the
functions soft_offline_page(), putback_active_hugepage(), and
migrate_pages(). Part of my SystemTap script:

function get_page_mapcount:long (page:long) %{
    struct page *page;

    page = (struct page *)STAP_ARG_page;
    if(page == NULL)
        STAP_RETURN(NULL);
    else
        STAP_RETURN(page_mapcount(page)-1);
%}

probe kernel.function("migrate_pages")
{
    page2 = get_page_from_migrate_pages($from);
    printf("Now exec migrate_pages -- page=%p,pfn=%ld,phy_addr=0x%lx,page_flags=0x%lx\n",
           page2, get_page_pfn(page2), get_page_phy_addr(page2), get_page_flags(page2));
    printf("page->mapping=%p,page->_count=%d,page->_mapcount=%d\n",
           get_page_mapping(page2), get_page_count(page2), get_page_mapcount(page2));
    print_backtrace();
}
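
One practical note: because the script embeds C in %{ ... %} blocks, it
has to run in SystemTap guru mode; a typical invocation (the script name
is just a placeholder) is:

    stap -g soft_offline_probe.stp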

Then I triggered soft offline to reproduce the issue, and finally found
the root cause. On CentOS 7.4, by the time we get into
soft_offline_huge_page(), get_any_page() has increased page->_count by
only 1 (isolate_huge_page() also increments page->_count by 1, but the
following put_page() releases that reference).

Because we use 1 GB hugepages, hugepage_migration_supported() always
returns false, so soft_offline_huge_page() --> migrate_pages() -->
unmap_and_move_huge_page() calls putback_active_hugepage() to decrease
page->_count by 1, just as the code shows:
static int unmap_and_move_huge_page(new_page_t get_new_page, ...)
{
    ......
    if (!hugepage_migration_supported(page_hstate(hpage))) {
        putback_active_hugepage(hpage);  /* ==> decreases page->_count by 1 */
        return -ENOSYS;
    }
    ......

Then, when we return to soft_offline_huge_page(), page->_count is
decreased by 1 again by putback_active_hugepage():
static int soft_offline_huge_page(struct page *page, int flags)
{
    ret = migrate_pages(&pagelist, new_page, MPOL_MF_MOVE_ALL,
                        MIGRATE_SYNC, MR_MEMORY_FAILURE);
    if (ret) {
        pr_info("soft offline: %#lx: migration failed %d, type %lx\n",
                pfn, ret, page->flags);
        putback_active_hugepage(hpage);  /* ==> decreases page->_count by 1 again */
        ......
    } else {
        ......
    }
}
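
Putting the two paths together, the reference-count flow for a 1 GB
hugepage during soft offline (a summary of my reading of the code above)
is:

    get_any_page()                        page->_count += 1
    isolate_huge_page(); put_page()       page->_count += 1, then -= 1 (net 0)
    unmap_and_move_huge_page()
      --> putback_active_hugepage()       page->_count -= 1 (migration unsupported)
    soft_offline_huge_page() error path
      --> putback_active_hugepage()       page->_count -= 1 (extra decrement)

One more reference is dropped than was taken, so a page that is still
mapped (page->_mapcount >= 0) can reach free_huge_page() with
page->_count == 0 and trip the VM_BUG_ON_PAGE.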


So when soft_offline_page() is called on a 1 GB hugepage, page->_count is
abnormally decreased by one extra reference!



[ I removed one putback_active_hugepage() call in soft_offline_huge_page()
to fix this issue. ]

I also checked the latest kernel code on GitHub (4.19); it seems this
issue is already fixed by the following code:
static int soft_offline_huge_page(struct page *page, int flags)
{
    ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
                        MIGRATE_SYNC, MR_MEMORY_FAILURE);
    if (ret) {
        pr_info("soft offline: %#lx: hugepage migration failed %d, type %lx (%pGp)\n",
                pfn, ret, page->flags, &page->flags);
        if (!list_empty(&pagelist))      /* ==> this check seems to fix the issue I met */
            putback_movable_pages(&pagelist);
        if (ret > 0)
            ret = -EIO;
    } else {
        ......
    }
}
But I could not find a similar bug-fix report or commit log.

* Re: [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE
  2018-11-15  8:30       ` Yongkai Wu
@ 2018-11-15  8:52         ` Michal Hocko
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2018-11-15  8:52 UTC (permalink / raw)
  To: Yongkai Wu; +Cc: mike.kravetz, linux-mm, linux-kernel

On Thu 15-11-18 16:30:32, Yongkai Wu wrote:
[...]

Thanks for the clarification. It can be helpful for somebody trying to
debug a similar issue in the future.

> But I could not find a similar bug-fix report or commit log.

What about 6bc9b56433b7 ("mm: fix race on soft-offlining free huge pages")?

-- 
Michal Hocko
SUSE Labs

end of thread, other threads:[~2018-11-15  8:52 UTC | newest]

Thread overview: 8+ messages
2018-11-13 12:38 [PATCH] mm/hugetlb.c: keep the page mapping info when free_huge_page() hits the VM_BUG_ON_PAGE Yongkai Wu
2018-11-13 13:04 ` Michal Hocko
2018-11-13 15:12   ` Yongkai Wu
2018-11-13 15:24     ` Michal Hocko
2018-11-13 18:04     ` Mike Kravetz
2018-11-14 15:37       ` Yongkai Wu
2018-11-15  8:30       ` Yongkai Wu
2018-11-15  8:52         ` Michal Hocko
