linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* lockups shortly after booting in current git.
@ 2012-03-29 15:55 Dave Jones
  2012-03-29 18:23 ` Linus Torvalds
       [not found] ` <CA+55aFxOcMt_mcr+ZYwc-SpKbROnh4Gn7jqrFY_SZcBy1Ev7Qw@mail.gmail.com>
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Jones @ 2012-03-29 15:55 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linus Torvalds

In v3.3-8839-gb5174fa I see some behaviour on my laptop
that I haven't noticed on other machines.
It boots up, I can log in, and start up a terminal, firefox, banshee
and use the system, and all is good. Then if I lock the screen,
walk away for a while and come back to it, the loadavg has sky-rocketed,
and every process is hung. Trying to run any commands that perform disk io
hang immediately.

There's a (truncated) sysrq-t output at http://codemonkey.org.uk/junk/process-trace.txt

Top showed kworker was going nuts. Trying to run perf resulted in a black screen.

anyone have any ideas ?

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 15:55 lockups shortly after booting in current git Dave Jones
@ 2012-03-29 18:23 ` Linus Torvalds
  2012-03-29 18:29   ` Dave Jones
       [not found] ` <CA+55aFxOcMt_mcr+ZYwc-SpKbROnh4Gn7jqrFY_SZcBy1Ev7Qw@mail.gmail.com>
  1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2012-03-29 18:23 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, Wu Fengguang, Theodore Ts'o

On Thu, Mar 29, 2012 at 8:55 AM, Dave Jones <davej@redhat.com> wrote:
> In v3.3-8839-gb5174fa I see some behaviour on my laptop
> that I haven't noticed on other machines.
> It boots up, I can log in, and start up a terminal, firefox, banshee
> and use the system, and all is good. Then if I lock the screen,
> walk away for a while and come back to it, the loadavg has sky-rocketed,
> and every process is hung. Trying to run any commands that perform disk io
> hang immediately.
>
> There's a (truncated) sysrq-t output at http://codemonkey.org.uk/junk/process-trace.txt
>
> Top showed kworker was going nuts. Trying to run perf resulted in a black screen.

Ok, so the filesystem and the disk do seem to be very active. That
could be normal, but I assume this trace was done at a time when you
*expected* the system to be idle.

Your process trace has quite a few processes waiting for disk or filesystem:

3x firefox (these are waiting for the sys_mmap below):
  schedule+0x29/0x70
  rwsem_down_failed_common+0xb5/0x150
  rwsem_down_read_failed+0x15/0x17
  call_rwsem_down_read_failed+0x14/0x30
  do_page_fault+0x1b6/0x5d0
  page_fault+0x25/0x30

firefox (this one is waiting for the page fault below to finish):
  schedule+0x29/0x70
  rwsem_down_failed_common+0xb5/0x150
  rwsem_down_write_failed+0x13/0x20
  call_rwsem_down_write_failed+0x13/0x20
  sys_mmap_pgoff+0x125/0x210
  sys_mmap+0x22/0x30
  system_call_fastpath+0x16/0x1b

firefox (waiting for the jbd2 journal):
  schedule+0x29/0x70
  start_this_handle+0x2dd/0x6d0
  jbd2__journal_start+0x12a/0x1a0
  jbd2_journal_start+0x13/0x20
  ext4_journal_start_sb+0x7e/0x2d0
  ext4_dirty_inode+0x26/0x60
  __mark_inode_dirty+0x3c/0x240
  file_update_time+0xd7/0x140
  do_wp_page+0x3ec/0x720
  handle_pte_fault+0x27d/0xa10
  handle_mm_fault+0x1e8/0x2f0
  do_page_fault+0x16b/0x5d0
  page_fault+0x25/0x30

npviewer.bin (waiting for inode->i_mutex - apparently the same file as
'banshee'):
  __mutex_lock_common+0x1b5/0x4c0
  mutex_lock_nested+0x44/0x50
  generic_file_aio_write+0x58/0xe0
  ext4_file_write+0xca/0x270
  do_sync_write+0xd2/0x110
  vfs_write+0xb6/0x180
  sys_write+0x4d/0x90
  sysenter_dispatch+0x7/0x26

banshee (waiting on actual IO):
  schedule+0x29/0x70
  io_schedule+0x8f/0xd0
  sleep_on_page+0xe/0x20
  __wait_on_bit+0x5f/0x90
  wait_on_page_bit+0x78/0x80
  grab_cache_page_write_begin+0xae/0xf0
  ext4_da_write_begin+0xa0/0x2e0
  generic_file_buffered_write+0x107/0x2c0
  __generic_file_aio_write+0x22c/0x440
  generic_file_aio_write+0x74/0xe0
  ext4_file_write+0xca/0x270
  do_sync_write+0xd2/0x110
  vfs_write+0xb6/0x180
  sys_write+0x4d/0x90
  system_call_fastpath+0x16/0x1b

xscreensaver-co D (waiting for the ext4 journal):
  schedule+0x29/0x70
  start_this_handle+0x2dd/0x6d0
  jbd2__journal_start+0x12a/0x1a0
  jbd2_journal_start+0x13/0x20
  ext4_journal_start_sb+0x7e/0x2d0
  ext4_dirty_inode+0x26/0x60
  __mark_inode_dirty+0x3c/0x240
  file_update_time+0xd7/0x140
  __generic_file_aio_write+0x1f8/0x440
  generic_file_aio_write+0x74/0xe0
  ext4_file_write+0xca/0x270
  do_sync_write+0xd2/0x110
  vfs_write+0xb6/0x180
  sys_write+0x4d/0x90
  system_call_fastpath+0x16/0x1b

kworker/1:0 D (waiting for scsi_test_unit_ready):
  schedule+0x29/0x70
  schedule_timeout+0x385/0x4f0
  wait_for_common+0x112/0x160
  wait_for_completion+0x1d/0x20
  blk_execute_rq+0x1b8/0x200
  scsi_execute+0xee/0x180
  scsi_execute_req+0xbd/0x130
  scsi_test_unit_ready+0x9d/0x140
  sd_check_events+0x12e/0x1c0
  disk_events_workfn+0x62/0x120
  process_one_work+0x1d3/0x740
  worker_thread+0x15f/0x350
  kthread+0xb7/0xc0
  kernel_thread_helper+0x4/0x10

so it ostensibly looks like a disk driver thing, but at the same time
what I find very surprising is that these are all waiting for
*writes*.

And quite frankly, that's quite unusual. Writes are often async, so
when the disk goes south it usually shows up as people waiting for the
*reads*. Sure, eventually the flushing gets to be synchronous and you
*also* get writers blocked, but from my experience it tends to be the
readers that block first.

Now, it's a truncated list of processes, and looking at the lock data
at the end there are some missing processes that are clearly also
stuck:

  1 lock held by xfconfd/1512:
  #0:  (&type->i_mutex_dir_key#3){+.+.+.}, at: [<ffffffff811d42bb>]
do_last+0x38b/0x900

but even that looks like a *writer* because anything blocked within
that do_last() function would likely be blocked on "vfs_create()":
it's somebody opening a non-existing file with O_CREAT.

Now, lack of readers *may* be because you have everything cached, so
maybe I'm just barking up the wrong tree, but my gut feel is that this
mig hbe writeback-related rather than necessarily related to disk IO
itself.

That's doubly true since I see processes waiting for the journal lock,
but I don't actually see anything waiting for journal IO completion.
Did something not release the journal lock. Even banshee - that *is*
waiting for actual IO - seems to wait for page_writeback, which is not
done by the *disk* driver, but by the filesystem (end_page_writeback).
So that would be consistent with the ext4 changes having broken that.

Was there a previous post-3.3 git kernel that worked?

                      Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 18:23 ` Linus Torvalds
@ 2012-03-29 18:29   ` Dave Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Jones @ 2012-03-29 18:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel, Wu Fengguang, Theodore Ts'o

On Thu, Mar 29, 2012 at 11:23:57AM -0700, Linus Torvalds wrote:

 > That's doubly true since I see processes waiting for the journal lock,
 > but I don't actually see anything waiting for journal IO completion.
 > Did something not release the journal lock. Even banshee - that *is*
 > waiting for actual IO - seems to wait for page_writeback, which is not
 > done by the *disk* driver, but by the filesystem (end_page_writeback).
 > So that would be consistent with the ext4 changes having broken that.
 > 
 > Was there a previous post-3.3 git kernel that worked?

This is the first one I tried. (I've not been anywhere with the laptop
since 3.3 -- clearly I need to get out more). 

3.3 final worked ok though.

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
       [not found]   ` <20120329182632.GA6891@redhat.com>
@ 2012-03-29 18:36     ` Linus Torvalds
  2012-03-29 19:53       ` Dave Jones
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2012-03-29 18:36 UTC (permalink / raw)
  To: Dave Jones, Theodore Ts'o, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 11:26 AM, Dave Jones <davej@redhat.com> wrote:
> On Thu, Mar 29, 2012 at 09:03:31AM -0700, Linus Torvalds wrote:
>  > Can you do sysrq-w (or whatever it is that does only the D-state
>  > processes)? That might give the more relevant process information without
>  > everything else making it unreadable...
>
> http://codemonkey.org.uk/junk/process-trace2.txt

Ok, this doesn't look all that different, but now it's not truncated
any more because the random "waits for children or poll" cases are
gone.

It's still basically all blocked on the ext4 journal. All writes as
far as I can see.

That said, there's that odd kworker/1:2 there that seems to be
constantly running. Can you get the EIP for that one (several times)
by doing sysrq-p (or maybe sysrq-l is better - that should give the
stack trace too).

                        Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 18:36     ` Linus Torvalds
@ 2012-03-29 19:53       ` Dave Jones
  2012-03-29 20:10         ` Linus Torvalds
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-03-29 19:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Theodore Ts'o, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 11:36:49AM -0700, Linus Torvalds wrote:
 > On Thu, Mar 29, 2012 at 11:26 AM, Dave Jones <davej@redhat.com> wrote:
 > > On Thu, Mar 29, 2012 at 09:03:31AM -0700, Linus Torvalds wrote:
 > >  > Can you do sysrq-w (or whatever it is that does only the D-state
 > >  > processes)? That might give the more relevant process information without
 > >  > everything else making it unreadable...
 > >
 > > http://codemonkey.org.uk/junk/process-trace2.txt
 > 
 > Ok, this doesn't look all that different, but now it's not truncated
 > any more because the random "waits for children or poll" cases are
 > gone.
 > 
 > It's still basically all blocked on the ext4 journal. All writes as
 > far as I can see.
 > 
 > That said, there's that odd kworker/1:2 there that seems to be
 > constantly running. Can you get the EIP for that one (several times)
 > by doing sysrq-p (or maybe sysrq-l is better - that should give the
 > stack trace too).

Took a while to reproduce that time (but now I have a theory, going
to try and reproduce that after sending this mail).

sysrq-p looks kinda boring. I couldn't get sysrq-l to coincide
with kworker running.

http://codemonkey.org.uk/junk/process-trace3.txt

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 19:53       ` Dave Jones
@ 2012-03-29 20:10         ` Linus Torvalds
  2012-03-29 20:26           ` Dave Jones
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2012-03-29 20:10 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Theodore Ts'o, Wu Fengguang,
	Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 12:53 PM, Dave Jones <davej@redhat.com> wrote:
>
> sysrq-p looks kinda boring. I couldn't get sysrq-l to coincide
> with kworker running.

Yeah, none of that looks interesting.

Apparently kworker isn't actually using all CPU after all.

                    Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 20:10         ` Linus Torvalds
@ 2012-03-29 20:26           ` Dave Jones
  2012-03-29 20:38             ` Linus Torvalds
  2012-03-29 20:39             ` Ted Ts'o
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Jones @ 2012-03-29 20:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Theodore Ts'o, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 01:10:21PM -0700, Linus Torvalds wrote:
 > On Thu, Mar 29, 2012 at 12:53 PM, Dave Jones <davej@redhat.com> wrote:
 > >
 > > sysrq-p looks kinda boring. I couldn't get sysrq-l to coincide
 > > with kworker running.
 > 
 > Yeah, none of that looks interesting.
 > 
 > Apparently kworker isn't actually using all CPU after all.

Ok, so progress, kinda.  I can now reproduce it in 10 minutes
just by starting a make -j8 on the kernel, and running fsx in parallel
on the same ssd.

While that's building, I'll click around in firefox, and after a few
minutes, it comes to a standstill. At that point, I can't spawn
new shells. 

kworker does seem to be a red herring. This time, I'm looking at top and..

top - 16:24:02 up 16 min, 10 users,  load average: 11.27, 9.62, 5.58
Tasks: 164 total,   1 running, 163 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2%us,  3.3%sy,  0.0%ni, 49.1%id, 47.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3991860k total,  1832652k used,  2159208k free,    67320k buffers
Swap:  6109180k total,        0k used,  6109180k free,   696908k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                 
 1082 root      20   0  175m  20m 9.9m S  4.6  0.5   1:18.24 Xorg                                                                                    
 2918 davej     20   0 15260 1404 1008 R  1.6  0.0   0:17.23 top                                                                                     
    7 root      -2   0     0    0    0 S  0.3  0.0   0:03.38 rcuc/0                         


Pretty dull. Loadavg is consistent at 11, nothing is making forward progress,
and make/fsx are ignoring ctrl-c/ctrl-z

I had a perf top running in another window. It took doesn't show anything exciting..

  8.83%  [kernel]                            [k] read_hpet
  8.26%  [kernel]                            [k] lock_is_held
  5.53%  [kernel]                            [k] __lock_acquire
  4.71%  [kernel]                            [k] sub_preempt_count
  4.32%  [kernel]                            [k] add_preempt_count
  3.90%  [kernel]                            [k] debug_smp_processor_id
  3.86%  [kernel]                            [k] __module_address
  3.24%  [kernel]                            [k] sched_clock_local
  2.92%  [kernel]                            [k] lock_release
  2.91%  [kernel]                            [k] rcu_lockdep_current_cpu_online
  2.22%  [iwlwifi]                           [k] iwl_trans_pcie_read32
  1.92%  [kernel]                            [k] lock_acquired
  1.84%  [kernel]                            [k] match_held_lock
  1.78%  [kernel]                            [k] rcu_is_cpu_idle
  1.76%  [kernel]                            [k] trace_hardirqs_off_caller
  1.72%  [kernel]                            [k] debug_lockdep_rcu_enabled
  1.70%  [kernel]                            [k] native_read_tsc
  1.62%  [kernel]                            [k] local_clock

I'll go back to trying the bisect now that I know how to reproduce it quickly.
Do you think it might be worth restricting the bisect to fs/ ?
Or shall I just do the whole tree bisect from 3.3 ?

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 20:26           ` Dave Jones
@ 2012-03-29 20:38             ` Linus Torvalds
  2012-03-29 20:39             ` Ted Ts'o
  1 sibling, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2012-03-29 20:38 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Theodore Ts'o, Wu Fengguang,
	Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 1:26 PM, Dave Jones <davej@redhat.com> wrote:
>
> I'll go back to trying the bisect now that I know how to reproduce it quickly.
> Do you think it might be worth restricting the bisect to fs/ ?
> Or shall I just do the whole tree bisect from 3.3 ?

Despite my "it looks like writes only" theory it *might* be a libata
issue (I assume this is a SATA drive), so I wouldn't force it to just
fs/.

There are also patches in mm/, and who knows if it might be some
locking or RCU change that triggers it too.

So I would suggest a whole-tree bisect. At least start out that way
for a few rounds. It's painful when there has been 8000+ commits since
v3.3, but it may be that as you bisect it becomes clear that it's some
particular subsystem, and at *that* point maybe it's worth saying
"don't bother bisecting through all the ARM changes", just start a new
bisect with the known good/bad points and limit it to subsystem X.

After three bisections, you should be down to a thousand commits. And
at that point it becomes worth looking at "ok, which subsystems seem
to be represented in these 1000 commits" to see if you can narrow
things down.

                            Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 20:26           ` Dave Jones
  2012-03-29 20:38             ` Linus Torvalds
@ 2012-03-29 20:39             ` Ted Ts'o
  2012-03-29 20:59               ` Linus Torvalds
  1 sibling, 1 reply; 19+ messages in thread
From: Ted Ts'o @ 2012-03-29 20:39 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 04:26:19PM -0400, Dave Jones wrote:
> 
> I'll go back to trying the bisect now that I know how to reproduce it quickly.
> Do you think it might be worth restricting the bisect to fs/ ?
> Or shall I just do the whole tree bisect from 3.3 ?

I'd suggest restricting the bisect to fs/ first, and if that comes up
with something non-sensical, you can always use the good/bad data
points for a whole tree bisect.

One commit that you might want to try testing is b43d17f31^, and see
whether that works.  One potential commit that might be suspicious is
b43d17f31 given the stack trace shows one process apparently waiting
on PageWriteback.

					- Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 20:39             ` Ted Ts'o
@ 2012-03-29 20:59               ` Linus Torvalds
  2012-03-29 21:12                 ` Dave Jones
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2012-03-29 20:59 UTC (permalink / raw)
  To: Ted Ts'o, Dave Jones, Linus Torvalds, Wu Fengguang,
	Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 1:39 PM, Ted Ts'o <tytso@mit.edu> wrote:
>
> I'd suggest restricting the bisect to fs/ first, and if that comes up
> with something non-sensical, you can always use the good/bad data
> points for a whole tree bisect.
>
> One commit that you might want to try testing is b43d17f31^, and see
> whether that works.  One potential commit that might be suspicious is
> b43d17f31 given the stack trace shows one process apparently waiting
> on PageWriteback.

Ahh, that does sound interesting and relevant.

Yeah, maybe the directed approach is worth it. In fact, considering
the above commie, maybe it might be worth it to be *very* directed,
and just test that one thing first, and maybe even skip the bisection
entirely and just test a revert if it looks promising.

                       Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 20:59               ` Linus Torvalds
@ 2012-03-29 21:12                 ` Dave Jones
  2012-03-29 21:45                   ` Ted Ts'o
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-03-29 21:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ted Ts'o, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 01:59:33PM -0700, Linus Torvalds wrote:
 > On Thu, Mar 29, 2012 at 1:39 PM, Ted Ts'o <tytso@mit.edu> wrote:
 > >
 > > I'd suggest restricting the bisect to fs/ first, and if that comes up
 > > with something non-sensical, you can always use the good/bad data
 > > points for a whole tree bisect.
 > >
 > > One commit that you might want to try testing is b43d17f31^, and see
 > > whether that works.  One potential commit that might be suspicious is
 > > b43d17f31 given the stack trace shows one process apparently waiting
 > > on PageWriteback.
 > 
 > Ahh, that does sound interesting and relevant.
 > 
 > Yeah, maybe the directed approach is worth it. In fact, considering
 > the above commie, maybe it might be worth it to be *very* directed,
 > and just test that one thing first, and maybe even skip the bisection
 > entirely and just test a revert if it looks promising.

I'll try a build with just that reverted, given the bisect build is taking a while.

Any thoughts on any printk's I could add to verify a situation occurred or not ?
The problem with bisecting a bug like this is that it's hard to tell if
the bug has been fixed, or if I've just not hit it yet.

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 21:12                 ` Dave Jones
@ 2012-03-29 21:45                   ` Ted Ts'o
  2012-03-29 21:49                     ` Dave Jones
  2012-03-29 21:52                     ` Dave Jones
  0 siblings, 2 replies; 19+ messages in thread
From: Ted Ts'o @ 2012-03-29 21:45 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 05:12:44PM -0400, Dave Jones wrote:
> 
> I'll try a build with just that reverted, given the bisect build is taking a while.
> 

Something else you could try doing without even have to do a rebuild
is to just mount the filesystem with the mount option
nomblk_io_submit, which avoids using any of the code in
fs/ext4/page_io.c.  (This option causes ext4 will send blocks to the
block layer old fashioned way, on 4k block at a time, and rely on the
elevator code to coaslece the write requests.)

> Any thoughts on any printk's I could add to verify a situation occurred or not ?
> The problem with bisecting a bug like this is that it's hard to tell if
> the bug has been fixed, or if I've just not hit it yet.

If it really is about the PageWriteback bit not getting cleared, not
really.  If you're willing to expand the struct page to include a
timestamp, we could use that to note pages which have been in
writeback for a long time, but that's obviously quite expensive.

But actually, normally when it's a PageWriteback stall, usually you
get a soft lockup warning, assuming that was compiled into the system.
And I didn't see that in your trace, which is surprising given the
symptoms you described.  Was it perhaps not included in your log file
snippet?  Or was soft lockup detection not enabled?

	     	      	     	       	   - Ted



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 21:45                   ` Ted Ts'o
@ 2012-03-29 21:49                     ` Dave Jones
  2012-03-29 21:52                       ` Linus Torvalds
  2012-03-29 21:52                     ` Dave Jones
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-03-29 21:49 UTC (permalink / raw)
  To: Ted Ts'o, Linus Torvalds, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 02:45:10PM -0700, Ted Ts'o wrote:

 > > I'll try a build with just that reverted, given the bisect build is taking a while.

up 30 minutes with that reverted, no problems so far..

 > But actually, normally when it's a PageWriteback stall, usually you
 > get a soft lockup warning, assuming that was compiled into the system.
 > And I didn't see that in your trace, which is surprising given the
 > symptoms you described.  Was it perhaps not included in your log file
 > snippet?  Or was soft lockup detection not enabled?

Enabled, didn't trigger.

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 21:49                     ` Dave Jones
@ 2012-03-29 21:52                       ` Linus Torvalds
  2012-03-29 22:54                         ` Dave Jones
  2012-03-29 23:25                         ` Ted Ts'o
  0 siblings, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2012-03-29 21:52 UTC (permalink / raw)
  To: Dave Jones, Ted Ts'o, Linus Torvalds, Wu Fengguang,
	Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 2:49 PM, Dave Jones <davej@redhat.com> wrote:
> On Thu, Mar 29, 2012 at 02:45:10PM -0700, Ted Ts'o wrote:
>
>  > > I'll try a build with just that reverted, given the bisect build is taking a while.
>
> up 30 minutes with that reverted, no problems so far..

Goodie. Let it run for a while more, and really pound on it.

Ted, are there any downsides to just reverting that commit (ie any
subtle interactions) for now?  That's assuming that Dave's testing
continues to confirm that it is that one commit.

                    Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 21:45                   ` Ted Ts'o
  2012-03-29 21:49                     ` Dave Jones
@ 2012-03-29 21:52                     ` Dave Jones
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Jones @ 2012-03-29 21:52 UTC (permalink / raw)
  To: Ted Ts'o, Linus Torvalds, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 02:45:10PM -0700, Ted Ts'o wrote:
 > On Thu, Mar 29, 2012 at 05:12:44PM -0400, Dave Jones wrote:
 > > 
 > > I'll try a build with just that reverted, given the bisect build is taking a while.
 > > 

Just in case it gives additional clues.. The partition I'm doing builds on
which suffers the hangs is on a luks backed device.

	Dave


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 21:52                       ` Linus Torvalds
@ 2012-03-29 22:54                         ` Dave Jones
  2012-03-29 23:04                           ` Linus Torvalds
  2012-03-29 23:25                         ` Ted Ts'o
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-03-29 22:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ted Ts'o, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 02:52:40PM -0700, Linus Torvalds wrote:
 
 > > up 30 minutes with that reverted, no problems so far..
 > 
 > Goodie. Let it run for a while more, and really pound on it.
 > 

90 mins of 2x kernel builds in parallel along with
3x fsx runs. then my battery ran out.

looks like we have a winner.

Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 22:54                         ` Dave Jones
@ 2012-03-29 23:04                           ` Linus Torvalds
  2012-03-29 23:35                             ` Ted Ts'o
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2012-03-29 23:04 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Ted Ts'o, Wu Fengguang,
	Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 3:54 PM, Dave Jones <davej@redhat.com> wrote:
>
> 90 mins of 2x kernel builds in parallel along with
> 3x fsx runs. then my battery ran out.
>
> looks like we have a winner.

Ok, let's revert it. Do you want to write the description of what
happened, or should I just do it?

                  Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 21:52                       ` Linus Torvalds
  2012-03-29 22:54                         ` Dave Jones
@ 2012-03-29 23:25                         ` Ted Ts'o
  1 sibling, 0 replies; 19+ messages in thread
From: Ted Ts'o @ 2012-03-29 23:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dave Jones, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 02:52:40PM -0700, Linus Torvalds wrote:
> >
> > up 30 minutes with that reverted, no problems so far..
> 
> Goodie. Let it run for a while more, and really pound on it.
> 
> Ted, are there any downsides to just reverting that commit (ie any
> subtle interactions) for now?  That's assuming that Dave's testing
> continues to confirm that it is that one commit.

That commit fixes a race which is seen when you write into fallocated
(and hence uninitialized) disk blocks under *very* heavy memory
pressure.  Furthermore, although theoretically it could trigger under
normal direct I/O writes, it only seems to trigger if you are issuing
a huge number of AIO writes, such that a just-written page can get
evicted from memory, and then read back into memory, before the
workqueue has a chance to update the extent tree.

This race has been around for a little over a year, and no one noticed
until two months ago; it only happens under fairly exotic conditions,
and in fact even after trying very hard to create a simple repro under
lab conditions, we could only reproduce the problem and confirm the
fix on production servers running MySQL on very fast PCIe-attached
flash devices.

Given that Dave was able to hit this problem pretty quickly, if we
confirm that this commit is at fault, the only reasonable thing to do
is to revert it IMO.

    	      	      	   	       	  - Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: lockups shortly after booting in current git.
  2012-03-29 23:04                           ` Linus Torvalds
@ 2012-03-29 23:35                             ` Ted Ts'o
  0 siblings, 0 replies; 19+ messages in thread
From: Ted Ts'o @ 2012-03-29 23:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dave Jones, Wu Fengguang, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 04:04:50PM -0700, Linus Torvalds wrote:
> On Thu, Mar 29, 2012 at 3:54 PM, Dave Jones <davej@redhat.com> wrote:
> >
> > 90 mins of 2x kernel builds in parallel along with
> > 3x fsx runs. then my battery ran out.
> >
> > looks like we have a winner.
> 
> Ok, let's revert it. Do you want to write the description of what
> happened, or should I just do it?

How about this:

Commit b43d17f31 (ext4: don't release page refs in ext4_end_bio) is
apprently causing system lockups for a system using a LUKS backed
system partition.  This luckup can be reproduced by running a make -j8
kernel compile in parallel with fsx.

Since the commit in question was fixing a rare and fairly exotic race,
and the lockup was found fairly quickly, it is being reverted as a
regression fix.

						- Ted

P.S.  I've been using 3.3 + the ext4 development tree on my laptop for
a while now, with a LUKS-encrypted+LVM setup on both a SSD and HDD on
a Ubuntu 10.04 based system.  I've not noticed the problem that Dave
is seeing under normal use.  So I'm wondering what was triggering it
on what should have been an idle desktop system in his case.  But this
is definitely a case where we should revert first, and ask questions
later.

Dave, thanks for noticing and reporting this!


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-03-29 23:35 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-29 15:55 lockups shortly after booting in current git Dave Jones
2012-03-29 18:23 ` Linus Torvalds
2012-03-29 18:29   ` Dave Jones
     [not found] ` <CA+55aFxOcMt_mcr+ZYwc-SpKbROnh4Gn7jqrFY_SZcBy1Ev7Qw@mail.gmail.com>
     [not found]   ` <20120329182632.GA6891@redhat.com>
2012-03-29 18:36     ` Linus Torvalds
2012-03-29 19:53       ` Dave Jones
2012-03-29 20:10         ` Linus Torvalds
2012-03-29 20:26           ` Dave Jones
2012-03-29 20:38             ` Linus Torvalds
2012-03-29 20:39             ` Ted Ts'o
2012-03-29 20:59               ` Linus Torvalds
2012-03-29 21:12                 ` Dave Jones
2012-03-29 21:45                   ` Ted Ts'o
2012-03-29 21:49                     ` Dave Jones
2012-03-29 21:52                       ` Linus Torvalds
2012-03-29 22:54                         ` Dave Jones
2012-03-29 23:04                           ` Linus Torvalds
2012-03-29 23:35                             ` Ted Ts'o
2012-03-29 23:25                         ` Ted Ts'o
2012-03-29 21:52                     ` Dave Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).