All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ebru Akagunduz <ebru.akagunduz@gmail.com>
To: vbabka@suse.cz, sergey.senozhatsky.work@gmail.com,
	akpm@linux-foundation.org
Cc: mhocko@kernel.org, kirill.shutemov@linux.intel.com,
	sfr@canb.auug.org.au, linux-mm@kvack.org,
	linux-next@vger.kernel.org, linux-kernel@vger.kernel.org,
	riel@redhat.com, aarcange@redhat.com
Subject: Re: [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write_failed lockup
Date: Thu, 2 Jun 2016 21:58:56 +0300	[thread overview]
Message-ID: <20160602185856.GA3854@debian> (raw)
In-Reply-To: <0c47a3a0-5530-b257-1c1f-28ed44ba97e6@suse.cz>

On Thu, Jun 02, 2016 at 03:24:05PM +0200, Vlastimil Babka wrote:
> [+CC's]
> 
> On 06/02/2016 03:48 AM, Sergey Senozhatsky wrote:
> >On (06/01/16 13:11), Stephen Rothwell wrote:
> >>Hi all,
> >>
> >>Changes since 20160531:
> >>
> >>My fixes tree contains:
> >>
> >>  of: silence warnings due to max() usage
> >>
> >>The arm tree gained a conflict against Linus' tree.
> >>
> >>Non-merge commits (relative to Linus' tree): 1100
> >> 936 files changed, 38159 insertions(+), 17475 deletions(-)
> >
> >Hello,
> >
> >the cc1 process ended up in DN state during kernel -j4 compilation.
> >
> >...
> >[ 2856.323052] INFO: task cc1:4582 blocked for more than 21 seconds.
> >[ 2856.323055]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> >[ 2856.323056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >[ 2856.323059] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> >[ 2856.323062]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> >[ 2856.323065]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> >[ 2856.323067]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> >[ 2856.323068] Call Trace:
> >[ 2856.323074]  [<ffffffff81441e33>] schedule+0x83/0x98
> >[ 2856.323077]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> >[ 2856.323080]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> >[ 2856.323083]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> >[ 2856.323084]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> >[ 2856.323086]  [<ffffffff81443630>] down_write+0x1f/0x2e
> >[ 2856.323089]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> >[ 2856.323091]  [<ffffffff8103702a>] mmput+0x29/0xc5
> >[ 2856.323093]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> >[ 2856.323095]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> >[ 2856.323097]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> >[ 2856.323099]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> >[ 2856.323101]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f
> >
> >[ 2877.322853] INFO: task cc1:4582 blocked for more than 21 seconds.
> >[ 2877.322858]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> >[ 2877.322858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >[ 2877.322861] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> >[ 2877.322865]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> >[ 2877.322867]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> >[ 2877.322867]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> >[ 2877.322870] Call Trace:
> >[ 2877.322875]  [<ffffffff81441e33>] schedule+0x83/0x98
> >[ 2877.322878]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> >[ 2877.322881]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> >[ 2877.322884]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> >[ 2877.322885]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> >[ 2877.322887]  [<ffffffff81443630>] down_write+0x1f/0x2e
> >[ 2877.322890]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> >[ 2877.322892]  [<ffffffff8103702a>] mmput+0x29/0xc5
> >[ 2877.322894]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> >[ 2877.322896]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> >[ 2877.322898]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> >[ 2877.322900]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> >[ 2877.322902]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f
> 
> I think it's this patch:
> 
> http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-make-swapin-readahead-under-down_read-of-mmap_sem.patch
> 
> Some parts of the code in collapse_huge_page() that were under
> down_write(mmap_sem) are under down_read() after the patch. But
> there's "goto out" which continues via "goto out_up_write" which
> does up_write(mmap_sem) so there's an imbalance. One path seems to
> go via both up_read() and up_write(). I can imagine this can cause a
> stuck down_write() among other things?
Recently, I realized the same imbalance, it is an obvious
inconsistency. I don't know, this issue can be related with
mine. I'll send a fix patch.

Kind regards.

WARNING: multiple messages have this Message-ID (diff)
From: Ebru Akagunduz <ebru.akagunduz@gmail.com>
To: vbabka@suse.cz, sergey.senozhatsky.work@gmail.com,
	akpm@linux-foundation.org
Cc: mhocko@kernel.org, kirill.shutemov@linux.intel.com,
	sfr@canb.auug.org.au, linux-mm@kvack.org,
	linux-next@vger.kernel.org, linux-kernel@vger.kernel.org,
	riel@redhat.com, aarcange@redhat.com
Subject: Re: [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write_failed lockup
Date: Thu, 2 Jun 2016 21:58:56 +0300	[thread overview]
Message-ID: <20160602185856.GA3854@debian> (raw)
In-Reply-To: <0c47a3a0-5530-b257-1c1f-28ed44ba97e6@suse.cz>

On Thu, Jun 02, 2016 at 03:24:05PM +0200, Vlastimil Babka wrote:
> [+CC's]
> 
> On 06/02/2016 03:48 AM, Sergey Senozhatsky wrote:
> >On (06/01/16 13:11), Stephen Rothwell wrote:
> >>Hi all,
> >>
> >>Changes since 20160531:
> >>
> >>My fixes tree contains:
> >>
> >>  of: silence warnings due to max() usage
> >>
> >>The arm tree gained a conflict against Linus' tree.
> >>
> >>Non-merge commits (relative to Linus' tree): 1100
> >> 936 files changed, 38159 insertions(+), 17475 deletions(-)
> >
> >Hello,
> >
> >the cc1 process ended up in DN state during kernel -j4 compilation.
> >
> >...
> >[ 2856.323052] INFO: task cc1:4582 blocked for more than 21 seconds.
> >[ 2856.323055]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> >[ 2856.323056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >[ 2856.323059] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> >[ 2856.323062]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> >[ 2856.323065]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> >[ 2856.323067]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> >[ 2856.323068] Call Trace:
> >[ 2856.323074]  [<ffffffff81441e33>] schedule+0x83/0x98
> >[ 2856.323077]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> >[ 2856.323080]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> >[ 2856.323083]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> >[ 2856.323084]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> >[ 2856.323086]  [<ffffffff81443630>] down_write+0x1f/0x2e
> >[ 2856.323089]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> >[ 2856.323091]  [<ffffffff8103702a>] mmput+0x29/0xc5
> >[ 2856.323093]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> >[ 2856.323095]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> >[ 2856.323097]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> >[ 2856.323099]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> >[ 2856.323101]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f
> >
> >[ 2877.322853] INFO: task cc1:4582 blocked for more than 21 seconds.
> >[ 2877.322858]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> >[ 2877.322858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >[ 2877.322861] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> >[ 2877.322865]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> >[ 2877.322867]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> >[ 2877.322867]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> >[ 2877.322870] Call Trace:
> >[ 2877.322875]  [<ffffffff81441e33>] schedule+0x83/0x98
> >[ 2877.322878]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> >[ 2877.322881]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> >[ 2877.322884]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> >[ 2877.322885]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> >[ 2877.322887]  [<ffffffff81443630>] down_write+0x1f/0x2e
> >[ 2877.322890]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> >[ 2877.322892]  [<ffffffff8103702a>] mmput+0x29/0xc5
> >[ 2877.322894]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> >[ 2877.322896]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> >[ 2877.322898]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> >[ 2877.322900]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> >[ 2877.322902]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f
> 
> I think it's this patch:
> 
> http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-make-swapin-readahead-under-down_read-of-mmap_sem.patch
> 
> Some parts of the code in collapse_huge_page() that were under
> down_write(mmap_sem) are under down_read() after the patch. But
> there's "goto out" which continues via "goto out_up_write" which
> does up_write(mmap_sem) so there's an imbalance. One path seems to
> go via both up_read() and up_write(). I can imagine this can cause a
> stuck down_write() among other things?
Recently, I realized the same imbalance, it is an obvious
inconsistency. I don't know, this issue can be related with
mine. I'll send a fix patch.

Kind regards.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-06-02 18:59 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-01  3:11 linux-next: Tree for Jun 1 Stephen Rothwell
2016-06-02  1:48 ` [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write_failed lockup Sergey Senozhatsky
2016-06-02  1:48   ` Sergey Senozhatsky
2016-06-02  9:21   ` Michal Hocko
2016-06-02  9:21     ` Michal Hocko
2016-06-02 12:08     ` Sergey Senozhatsky
2016-06-02 12:08       ` Sergey Senozhatsky
2016-06-02 12:21       ` Michal Hocko
2016-06-02 12:21         ` Michal Hocko
2016-06-03 13:51         ` Andrea Arcangeli
2016-06-03 13:51           ` Andrea Arcangeli
2016-06-03 14:46           ` Michal Hocko
2016-06-03 14:46             ` Michal Hocko
2016-06-03 15:10             ` Andrea Arcangeli
2016-06-03 15:10               ` Andrea Arcangeli
2016-06-07  7:34               ` Michal Hocko
2016-06-07  7:34                 ` Michal Hocko
2016-06-08  8:19               ` Vlastimil Babka
2016-06-08  8:19                 ` Vlastimil Babka
2016-06-03  7:15     ` Sergey Senozhatsky
2016-06-03  7:15       ` Sergey Senozhatsky
2016-06-03  7:25       ` Michal Hocko
2016-06-03  7:25         ` Michal Hocko
2016-06-03  8:43         ` Sergey Senozhatsky
2016-06-03  8:43           ` Sergey Senozhatsky
2016-06-03  9:55           ` Michal Hocko
2016-06-03  9:55             ` Michal Hocko
2016-06-03 10:05             ` Michal Hocko
2016-06-03 10:05               ` Michal Hocko
2016-06-03 13:38               ` Sergey Senozhatsky
2016-06-03 13:38                 ` Sergey Senozhatsky
2016-06-03 13:45                 ` Michal Hocko
2016-06-03 13:45                   ` Michal Hocko
2016-06-03 13:49                   ` Michal Hocko
2016-06-03 13:49                     ` Michal Hocko
2016-06-03 13:49                     ` Michal Hocko
2016-06-04  7:51                     ` Sergey Senozhatsky
2016-06-04  7:51                       ` Sergey Senozhatsky
2016-06-06  8:39                       ` Michal Hocko
2016-06-06  8:39                         ` Michal Hocko
2016-06-02 13:24   ` Vlastimil Babka
2016-06-02 13:24     ` Vlastimil Babka
2016-06-02 18:58     ` Ebru Akagunduz [this message]
2016-06-02 18:58       ` Ebru Akagunduz
2016-06-03  1:00       ` Sergey Senozhatsky
2016-06-03  1:00         ` Sergey Senozhatsky
2016-06-03  1:29         ` Sergey Senozhatsky
2016-06-03  1:29           ` Sergey Senozhatsky
2016-06-03  4:14           ` Sergey Senozhatsky
2016-06-03  4:14             ` Sergey Senozhatsky
2016-06-03 12:28     ` [PATCH] mm, thp: fix locking inconsistency in collapse_huge_page Ebru Akagunduz
2016-06-03 12:28       ` Ebru Akagunduz
2016-06-06 13:05       ` Vlastimil Babka
2016-06-06 13:05         ` Vlastimil Babka
2016-06-09  3:51         ` Sergey Senozhatsky
2016-06-09  3:51           ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160602185856.GA3854@debian \
    --to=ebru.akagunduz@gmail.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-next@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=riel@redhat.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sfr@canb.auug.org.au \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.