From: Mel Gorman <mgorman@suse.de>
To: James Bottomley <James.Bottomley@suse.de>
Cc: Jan Kara <jack@suse.cz>, colin.king@canonical.com,
	Chris Mason <chris.mason@oracle.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	mgorman@novell.com
Subject: Re: [BUG] fatal hang untarring 90GB file, possibly writeback related.
Date: Thu, 28 Apr 2011 20:21:04 +0100
Message-ID: <20110428192104.GA4658@suse.de>
In-Reply-To: <1304015436.2598.19.camel@mulgrave.site>

On Thu, Apr 28, 2011 at 01:30:36PM -0500, James Bottomley wrote:
> On Thu, 2011-04-28 at 18:18 +0100, Mel Gorman wrote:
> > On Thu, Apr 28, 2011 at 11:56:17AM -0500, James Bottomley wrote:
> > > # Events: 6K cycles
> > > #
> > > # Overhead      Command        Shared Object  Symbol
> > > # ........  ...........  ...................  .......................................
> > > #
> > >     20.41%      kswapd0  [kernel.kallsyms]    [k] shrink_slab
> > >                 |
> > >                 --- shrink_slab
> > >                    |
> > >                    |--99.91%-- kswapd
> > >                    |          kthread
> > >                    |          kernel_thread_helper
> > >                     --0.09%-- [...]
> > >
> >
> > Ok. I can't see how the patch "mm: vmscan: reclaim order-0 and use
> > compaction instead of lumpy reclaim" is related unless we are seeing
> > two problems that happen to manifest in a similar manner.
> >
> > However, there were a number of changes made to dcache in particular
> > for 2.6.38. Specifically thinks like dentry_kill use trylock and is
> > happy to loop around if it fails to acquire anything. See things like
> > this for example;
>
> OK, so for this, I tried a 2.6.37 kernel. It doesn't work very well,
> networking is hosed for no reason I can see (probably systemd / cgroups
> problems).
>
> However, it runs enough for me to say that the tar proceeds to
> completion in a non-PREEMPT kernel. (I tried several times for good
> measure). That makes this definitely a regression of some sort, but it
> doesn't definitively identify the dcache code ... it could be an ext4
> bug that got introduced in 2.6.38 either.
>

True, it could be any shrinker and dcache is just a guess.

> > <SNIP>
> >
> > Way hey, cgroups are also in the mix. How jolly.
> >
> > Is systemd a common element of the machines hitting this bug by any
> > chance?
>
> Well, yes, the bug report is against FC15, which needs cgroups for
> systemd.
>

Ok although we do not have direct evidence that it's the problem yet. A
broken shrinker could just mean we are also trying to aggressively
reclaim in cgroups.

> > The remaining traces seem to be follow-on damage related to the three
> > issues of "shrinkers are bust in some manner" causing "we are not
> > getting over the min watermark" and as a side-show "we are spending lots
> > of time doing something unspecified but unhelpful in cgroups".
>
> Heh, well find a way for me to verify this: I can't turn off cgroups
> because systemd then won't work and the machine won't boot ...
>

Same testcase, same kernel but a distro that is not using systemd to
verify if cgroups are the problem. Not ideal I know. When I'm back
online Tuesday, I'll try reproducing this on a !Fedora distribution. In
the meantime, the following untested hatchet job might spit out
which shrinker we are getting stuck in. It is also breaking out of
the shrink_slab loop so it'd even be interesting to see if the bug
is mitigated in any way.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c74a501..ed99104 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -225,6 +225,7 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 {
 	struct shrinker *shrinker;
 	unsigned long ret = 0;
+	unsigned long shrink_expired = jiffies + HZ;
 
 	if (scanned == 0)
 		scanned = SWAP_CLUSTER_MAX;
@@ -270,6 +271,14 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 								gfp_mask);
 			if (shrink_ret == -1)
 				break;
+			if (time_after(jiffies, shrink_expired)) {
+				printk(KERN_WARNING "Slab shrinker %p gone mental"
+					" comm=%s nr=%ld\n",
+					shrinker->shrink,
+					current->comm,
+					shrinker->nr);
+				break;
+			}
 			if (shrink_ret < nr_before)
 				ret += nr_before - shrink_ret;
 			count_vm_events(SLABS_SCANNED, this_scan);
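
Archive note (not part of the original message): the patch above bounds the time spent in shrink_slab() by recording a deadline of jiffies + HZ (roughly one second) and bailing out once time_after() reports that the deadline has passed. The sketch below is a user-space illustration of that pattern only, not kernel code. The names tick_after() and passes_before_deadline() are hypothetical stand-ins introduced here; the comparison mirrors the usual wraparound-safe form of time_after(), and, as in the kernel, it relies on the common two's-complement conversion from unsigned long to long.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for the kernel's time_after(a, b):
 * true if tick count a is later than tick count b. The unsigned
 * subtraction makes the result correct even if the counter wrapped
 * between b and a, provided the two values are less than half the
 * counter range apart.
 */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Simulate a loop that does one unit of work per pass and gives up
 * once a time budget is exceeded, like the break added to
 * shrink_slab() in the patch. Each pass is assumed to cost
 * ticks_per_pass ticks. Returns the number of passes executed.
 */
static int passes_before_deadline(unsigned long start, unsigned long budget,
				  unsigned long ticks_per_pass, int max_passes)
{
	unsigned long now = start;
	unsigned long deadline = start + budget;	/* may wrap; that is fine */
	int passes = 0;

	while (passes < max_passes) {
		now += ticks_per_pass;	/* pretend one shrinker call took this long */
		passes++;
		if (tick_after(now, deadline))
			break;		/* over budget: stop, as the patch does */
	}
	return passes;
}
```

Starting 10 ticks before the counter wraps, with a budget of 100 ticks and 30 ticks per pass, the loop should stop on the fourth pass, which is the first one to finish after the deadline. A naive `now > deadline` comparison would instead misfire around the wrap point.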