From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Eric Wheeler <linux-mm@lists.ewheeler.net>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	mhocko@kernel.org, hannes@cmpxchg.org, minchan@kernel.org,
	ying.huang@intel.com, mgorman@techsingularity.net,
	vdavydov.dev@gmail.com, akpm@linux-foundation.org,
	shakeelb@google.com, gthelen@google.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm,vmscan: Kill global shrinker lock.
Date: Fri, 26 Jan 2018 12:12:00 +0900
Message-ID: <201801260312.w0Q3C0tr067684@www262.sakura.ne.jp>
In-Reply-To: <alpine.LRH.2.11.1801252209010.6864@mail.ewheeler.net>

Eric Wheeler wrote:
> Hi Tetsuo,
> 
> Thank you for looking into this!
> 
> I tried running this C program in 4.14.15 but did not get a deadlock, just 
> OOM kills. Is the patch required to induce the deadlock?

This reproducer is not expected to trigger an actual deadlock. Running it
with this patch applied causes a lockdep warning. I was only trying to
suggest the possibility that suddenly making shrink_slab() a no-op might
cause unexpected results. We still don't know what is happening in your case.

> 
> Also, what are you doing to XFS to make it trigger?

Nothing.

Would you answer Michal's questions

  Is this a permanent state or does the holder eventually releases the lock?

  Do you remember the last good kernel?

and my guess

  Since commit 0bcac06f27d75285 was not backported to the 4.14-stable kernel,
  this is unlikely to be a bug introduced by 0bcac06f27d75285 unless Eric
  explicitly backported 0bcac06f27d75285.

?

Can you take SysRq-t (e.g. "echo t > /proc/sysrq-trigger") when processes
get stuck? I think we need to know what other threads are doing while
__lock_page() is waiting, in order to distinguish "somebody forgot to unlock
the page" from "somebody is still doing something (e.g. waiting for memory
allocation) in order to unlock the page".
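
For reference, capturing SysRq-t and saving the result could look roughly
like the sketch below. It assumes a root shell and that the whole dump fits
in the kernel log buffer. The names take_sysrq_t, count_lock_page_waiters,
SYSRQ_TRIGGER and LOG_CMD are made up for illustration; only
/proc/sysrq-trigger and dmesg are real interfaces.

```shell
#!/bin/sh
# Illustrative helpers, not part of any existing tool.
# SYSRQ_TRIGGER and LOG_CMD can be overridden for a dry run.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
LOG_CMD="${LOG_CMD:-dmesg}"

# take_sysrq_t OUTFILE
# Ask the kernel to dump the state and stack of every task (SysRq-t),
# then save the kernel log to OUTFILE.
take_sysrq_t() {
    out="$1"
    echo t > "$SYSRQ_TRIGGER" || return 1
    # Assumption: one second is enough for the dump to reach the log.
    sleep 1
    $LOG_CMD > "$out"
}

# count_lock_page_waiters LOGFILE
# Count log lines mentioning __lock_page. This also matches related
# symbols such as __lock_page_killable, so treat it as a rough count.
count_lock_page_waiters() {
    grep -c '__lock_page' "$1"
}
```

When processes are stuck, running "take_sysrq_t /tmp/sysrq-t.log" a few
times, some seconds apart, and comparing the traces should show whether the
same threads stay blocked in the same place.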

If you can take SysRq-t, taking it with the patch at
http://lkml.kernel.org/r/1510833448-19918-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
applied and the kernel built with CONFIG_DEBUG_SHOW_MEMALLOC_LINE=y should
give us more clues (e.g. how long threads have been waiting for memory
allocation).
