linux-next.vger.kernel.org archive mirror
From: Ebru Akagunduz <ebru.akagunduz@gmail.com>
To: vbabka@suse.cz, sergey.senozhatsky.work@gmail.com,
	akpm@linux-foundation.org
Cc: mhocko@kernel.org, kirill.shutemov@linux.intel.com,
	sfr@canb.auug.org.au, linux-mm@kvack.org,
	linux-next@vger.kernel.org, linux-kernel@vger.kernel.org,
	riel@redhat.com, aarcange@redhat.com
Subject: Re: [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write_failed lockup
Date: Thu, 2 Jun 2016 21:58:56 +0300
Message-ID: <20160602185856.GA3854@debian>
In-Reply-To: <0c47a3a0-5530-b257-1c1f-28ed44ba97e6@suse.cz>

On Thu, Jun 02, 2016 at 03:24:05PM +0200, Vlastimil Babka wrote:
> [+CC's]
> 
> On 06/02/2016 03:48 AM, Sergey Senozhatsky wrote:
> >On (06/01/16 13:11), Stephen Rothwell wrote:
> >>Hi all,
> >>
> >>Changes since 20160531:
> >>
> >>My fixes tree contains:
> >>
> >>  of: silence warnings due to max() usage
> >>
> >>The arm tree gained a conflict against Linus' tree.
> >>
> >>Non-merge commits (relative to Linus' tree): 1100
> >> 936 files changed, 38159 insertions(+), 17475 deletions(-)
> >
> >Hello,
> >
> >the cc1 process ended up in DN state during kernel -j4 compilation.
> >
> >...
> >[ 2856.323052] INFO: task cc1:4582 blocked for more than 21 seconds.
> >[ 2856.323055]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> >[ 2856.323056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >[ 2856.323059] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> >[ 2856.323062]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> >[ 2856.323065]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> >[ 2856.323067]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> >[ 2856.323068] Call Trace:
> >[ 2856.323074]  [<ffffffff81441e33>] schedule+0x83/0x98
> >[ 2856.323077]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> >[ 2856.323080]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> >[ 2856.323083]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> >[ 2856.323084]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> >[ 2856.323086]  [<ffffffff81443630>] down_write+0x1f/0x2e
> >[ 2856.323089]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> >[ 2856.323091]  [<ffffffff8103702a>] mmput+0x29/0xc5
> >[ 2856.323093]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> >[ 2856.323095]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> >[ 2856.323097]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> >[ 2856.323099]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> >[ 2856.323101]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f
> >
> >[ 2877.322853] INFO: task cc1:4582 blocked for more than 21 seconds.
> >[ 2877.322858]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> >[ 2877.322858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >[ 2877.322861] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> >[ 2877.322865]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> >[ 2877.322867]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> >[ 2877.322867]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> >[ 2877.322870] Call Trace:
> >[ 2877.322875]  [<ffffffff81441e33>] schedule+0x83/0x98
> >[ 2877.322878]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> >[ 2877.322881]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> >[ 2877.322884]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> >[ 2877.322885]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> >[ 2877.322887]  [<ffffffff81443630>] down_write+0x1f/0x2e
> >[ 2877.322890]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> >[ 2877.322892]  [<ffffffff8103702a>] mmput+0x29/0xc5
> >[ 2877.322894]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> >[ 2877.322896]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> >[ 2877.322898]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> >[ 2877.322900]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> >[ 2877.322902]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f
> 
> I think it's this patch:
> 
> http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-make-swapin-readahead-under-down_read-of-mmap_sem.patch
> 
> Some parts of the code in collapse_huge_page() that were under
> down_write(mmap_sem) are under down_read() after the patch. But
> there's "goto out" which continues via "goto out_up_write" which
> does up_write(mmap_sem) so there's an imbalance. One path seems to
> go via both up_read() and up_write(). I can imagine this can cause a
> stuck down_write() among other things?
I recently noticed the same imbalance; it is an obvious
inconsistency. I don't know whether this issue is related to
mine. I'll send a patch to fix it.

Kind regards.
