From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mikael Pettersson
Subject: Re: [3.13 regression] kswapd0 and ksoftirqd/0 CPU hogs
Date: Wed, 1 Jun 2016 08:36:14 +0200
Message-ID: <22350.33374.510810.977906@gargle.gargle.HOWL>
References: <21393.43065.207399.530921@gargle.gargle.HOWL>
	<21426.40682.197715.245775@gargle.gargle.HOWL>
	<21786.40688.53001.509365@gargle.gargle.HOWL>
	<22217.61083.254294.356622@gargle.gargle.HOWL>
	<22349.25130.697316.272584@gargle.gargle.HOWL>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-lf0-f65.google.com ([209.85.215.65]:32957 "EHLO
	mail-lf0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1757170AbcFAGgS (ORCPT ); Wed, 1 Jun 2016 02:36:18 -0400
Received: by mail-lf0-f65.google.com with SMTP id w16so899014lfd.0
	for ; Tue, 31 May 2016 23:36:18 -0700 (PDT)
In-Reply-To:
Sender: linux-m68k-owner@vger.kernel.org
List-Id: linux-m68k@vger.kernel.org
To: Geert Uytterhoeven
Cc: Mikael Pettersson , Finn Thain , Michael Schmitz , Andreas Schwab , Linux/m68k

Geert Uytterhoeven writes:
> Hi Mikael,
>
> On Tue, May 31, 2016 at 12:06 PM, Mikael Pettersson wrote:
> > Finn Thain writes:
> > > On Sun, 21 Feb 2016, Mikael Pettersson wrote:
> > > > I've done two git bisects on this. The first one was inconclusive
> > > > (pointed to a harmless commit), but the second one ended up with:
> > > >
> > > > # first bad commit: [ac4de9543aca59f2b763746647577302fbedd57e] Merge branch 'akpm' (patches from Andrew Morton)
> > > >
> > > > That's a big pile of VM changes, so I think it could be the culprit.
> > >
> > > I think this issue may date back to v2.6.38 or earlier.
> > >
> > > The redhat.com bug report was closed in 2012 but Fedora users were still
> > > seeing the problem after it was supposedly fixed.
> > > https://bugzilla.redhat.com/show_bug.cgi?id=712019
> > >
> > > That page also has a link to the bug report for Ubuntu:
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/484045
> > >
> > > BTW, I came across this recently: "Rik van Riel pointed out that [the
> > > kswapd thread] tends to be slow for [the purpose of compaction], and it
> > > can get stuck in a shrinker somewhere waiting for a lock."
> > > http://lwn.net/Articles/684611/
> > >
> > > Perhaps a stack trace would help to ascertain whether this is the same
> > > known bug or not (?)
> > >
> > > --
> >
> > FWIW, my latest round(s) of bisects identified the following:
> >
> > fdbadebec27cc92358ed4f593e8763cf10b82687 is the first bad commit
> > commit fdbadebec27cc92358ed4f593e8763cf10b82687
> > Author: Li Zefan
> > Date: Thu Sep 12 15:13:19 2013 -0700
> >
> >     memcg: remove redundant code in mem_cgroup_force_empty_write()
> >
> >     vfs guarantees the cgroup won't be destroyed, so it's redundant to get a
> >     css reference.
> >
> >     Signed-off-by: Li Zefan
> >     Acked-by: Michal Hocko
> >     Cc: KAMEZAWA Hiroyuki
> >     Cc: Johannes Weiner
> >     Cc: Tejun Heo
> >     Signed-off-by: Andrew Morton
> >     Signed-off-by: Linus Torvalds
> >
> > :040000 040000 1f6b5b056995067c7c60e6f87e9cd1f181e8fbeb ea29d63e70ce2320e144fac7b157a146d41360bf M mm
> >
> > This appears to be the first commit in the merge (git bisect refuses to
> > bisect before it), so either it's it or the problem predates the merge.
>
> That's upstream commit c33bd8354f3a3bb26a98d2b6bf600b7b35657328?
>
> Well done! Looks indeed like a suspect for the behavior you're seeing.
> I suppose you will follow up with the mm people?

Alas, this was a false find. This commit is definitely in the bad range,
but reverting it from 3.12.60 doesn't eliminate the kswapd bug. I'll have
to bisect mainline again.