From: James Bottomley <James.Bottomley@suse.de>
To: Mel Gorman <mgorman@suse.de>
Cc: Jan Kara <jack@suse.cz>,
	colin.king@canonical.com, Chris Mason <chris.mason@oracle.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	mgorman@novell.com
Subject: Re: [BUG] fatal hang untarring 90GB file, possibly writeback related.
Date: Thu, 28 Apr 2011 17:43:48 -0500
Message-ID: <1304030629.2598.42.camel@mulgrave.site>
In-Reply-To: <1304025145.2598.24.camel@mulgrave.site>

On Thu, 2011-04-28 at 16:12 -0500, James Bottomley wrote:
> On Thu, 2011-04-28 at 14:59 -0500, James Bottomley wrote:
> > Actually, talking to Chris, I think I can get the system up using
> > init=/bin/bash without systemd, so I can try the no cgroup config.
> 
> OK, so a non-PREEMPT non-CGROUP kernel has survived three back to back
> runs of untar without locking or getting kswapd pegged, so I'm pretty
> certain this is cgroups related.  The next steps are to turn cgroups
> back on but try disabling the memory and IO controllers.

I tried a non-PREEMPT kernel with CGROUP enabled but with
CGROUP_MEM_RES_CTLR disabled.

The results are curious:  the untar does complete (I've done three
back-to-back runs).  However, I did get one soft lockup in kswapd
(below), and the system recovered from it instead of halting I/O and
hanging as it did previously.

The soft lockup is in shrink_slab, so perhaps it's a combination of slab
shrinker and cgroup memory controller issues?

James

---
[  670.823843] BUG: soft lockup - CPU#2 stuck for 67s! [kswapd0:46]
[  670.825472] Modules linked in: netconsole configfs cpufreq_ondemand acpi_cpufreq freq_table mperf snd_hda_codec_hdmi snd_hda_codec_conexant arc4 snd_hda_intel btusb snd_hda_codec snd_hwdep iwlagn snd_seq mac80211 bluetooth snd_seq_device uvcvideo snd_pcm cfg80211 wmi microcode e1000e videodev xhci_hcd rfkill snd_timer iTCO_wdt v4l2_compat_ioctl32 iTCO_vendor_support pcspkr i2c_i801 snd soundcore snd_page_alloc joydev uinput ipv6 sdhci_pci sdhci mmc_core i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[  670.830864] CPU 2 
[  670.830881] Modules linked in: netconsole configfs cpufreq_ondemand acpi_cpufreq freq_table mperf snd_hda_codec_hdmi snd_hda_codec_conexant arc4 snd_hda_intel btusb snd_hda_codec snd_hwdep iwlagn snd_seq mac80211 bluetooth snd_seq_device uvcvideo snd_pcm cfg80211 wmi microcode e1000e videodev xhci_hcd rfkill snd_timer iTCO_wdt v4l2_compat_ioctl32 iTCO_vendor_support pcspkr i2c_i801 snd soundcore snd_page_alloc joydev uinput ipv6 sdhci_pci sdhci mmc_core i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[  670.838385] 
[  670.840289] Pid: 46, comm: kswapd0 Not tainted 2.6.39-rc4+ #3 LENOVO 4170CTO/4170CTO
[  670.842193] RIP: 0010:[<ffffffff810e07cb>]  [<ffffffff810e07cb>] shrink_slab+0x86/0x166
[  670.844063] RSP: 0018:ffff88006eea5da0  EFLAGS: 00000206
[  670.845881] RAX: 0000000000000000 RBX: ffff88006eea5de0 RCX: 0000000000000002
[  670.847652] RDX: 0000000000000000 RSI: ffff88006eea5d60 RDI: ffff88006eea5d60
[  670.849394] RBP: ffff88006eea5de0 R08: 000000000000000c R09: 0000000000000000
[  670.851091] R10: 0000000000000001 R11: 000000000000005f R12: ffffffff8147b50e
[  670.852733] R13: ffff8801005e6e00 R14: 0000000000000010 R15: 0000000000017fb6
[  670.854351] FS:  0000000000000000(0000) GS:ffff880100280000(0000) knlGS:0000000000000000
[  670.855968] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  670.857555] CR2: 00000037d90ae040 CR3: 0000000001a03000 CR4: 00000000000406e0
[  670.859138] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  670.860720] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  670.862320] Process kswapd0 (pid: 46, threadinfo ffff88006eea4000, task ffff88006eeb0000)
[  670.863932] Stack:
[  670.865477]  0000000000000001 0000000000000080 ffff880000000002 ffff8801005e6e00
[  670.867023]  ffff8801005e6000 0000000000000002 0000000000000000 000000000000000c
[  670.868558]  ffff88006eea5ee0 ffffffff810e308c 0000000000000003 ffff88006eeb0000
[  670.870120] Call Trace:
[  670.871652]  [<ffffffff810e308c>] kswapd+0x4f0/0x774
[  670.873218]  [<ffffffff810e2b9c>] ? try_to_free_pages+0xe5/0xe5
[  670.874786]  [<ffffffff8106ce57>] kthread+0x84/0x8c
[  670.876327]  [<ffffffff8147bc64>] kernel_thread_helper+0x4/0x10
[  670.877871]  [<ffffffff8106cdd3>] ? kthread_worker_fn+0x148/0x148
[  670.879403]  [<ffffffff8147bc60>] ? gs_change+0x13/0x13
[  670.880932] Code: 83 eb 10 e9 ce 00 00 00 44 89 f2 31 f6 48 89 df ff 13 48 63 4b 08 4c 63 e8 48 8b 45 c8 31 d2 48 f7 f1 31 d2 49 0f af c5 49 f7 f7 
[  670.881086]  03 43 20 48 85 c0 48 89 43 20 79 18 48 8b 33 48 89 c2 48 c7 
[  670.884285] Call Trace:
[  670.885884]  [<ffffffff810e308c>] kswapd+0x4f0/0x774
[  670.887462]  [<ffffffff810e2b9c>] ? try_to_free_pages+0xe5/0xe5
[  670.889031]  [<ffffffff8106ce57>] kthread+0x84/0x8c
[  670.890578]  [<ffffffff8147bc64>] kernel_thread_helper+0x4/0x10
[  670.892130]  [<ffffffff8106cdd3>] ? kthread_worker_fn+0x148/0x148
[  670.893653]  [<ffffffff8147bc60>] ? gs_change+0x13/0x13
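
For anyone comparing several of these reports, the following is a minimal sketch of pulling the key fields out of a soft-lockup trace: the stuck CPU, the duration, the task, and the symbol+offset frames. The regular expressions are assumptions based only on the line format shown in this report, and the helper names (parse_lockup, parse_frames) are hypothetical, not part of any kernel tooling.

```python
import re

# Sample lines copied from the report above.
LOG = """\
[  670.823843] BUG: soft lockup - CPU#2 stuck for 67s! [kswapd0:46]
[  670.842193] RIP: 0010:[<ffffffff810e07cb>]  [<ffffffff810e07cb>] shrink_slab+0x86/0x166
[  670.871652]  [<ffffffff810e308c>] kswapd+0x4f0/0x774
[  670.873218]  [<ffffffff810e2b9c>] ? try_to_free_pages+0xe5/0xe5
"""

# Assumed format of the watchdog headline, as seen in this log.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

# Assumed format of a frame: [<address>] optional "? " symbol+0xoffset/0xsize.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unreliable>\? )?"
    r"(?P<sym>[A-Za-z_]\w*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_lockup(text):
    """Return CPU, duration, and task of the first soft lockup, or None."""
    m = LOCKUP_RE.search(text)
    if m is None:
        return None
    return {
        "cpu": int(m["cpu"]),
        "seconds": int(m["secs"]),
        "comm": m["comm"],
        "pid": int(m["pid"]),
    }


def parse_frames(text):
    """Return one dict per line that carries an address and symbol+offset."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue
        frames.append({
            "addr": int(m["addr"], 16),
            "sym": m["sym"],
            "off": int(m["off"], 16),
            "size": int(m["size"], 16),
            # Frames printed with a leading "?" are not reliable.
            "reliable": m["unreliable"] is None,
        })
    return frames


if __name__ == "__main__":
    print(parse_lockup(LOG))
    for frame in parse_frames(LOG):
        # Function start is the frame address minus the offset into it.
        start = frame["addr"] - frame["off"]
        print(f"{frame['sym']:>20} start=0x{start:x} reliable={frame['reliable']}")
```

Subtracting the offset from the address gives the start of the function, which is handy when matching the RIP against a disassembly of the same kernel build.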




