linux-fsdevel.vger.kernel.org archive mirror
* Re: Linux-next parallel cp workload hang
       [not found] <20160518013802.GA4679@ZX.nay.redhat.com>
@ 2016-05-18  2:06 ` Al Viro
  0 siblings, 0 replies; 9+ messages in thread
From: Al Viro @ 2016-05-18  2:06 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: linux-next, linux-kernel, linux-fsdevel, Dave Chinner

[dchinner Cc'd]

On Wed, May 18, 2016 at 09:38:02AM +0800, Xiong Zhou wrote:
> Hi,
> 
> Parallel cp workload (xfstests generic/273) hangs like below.
> It's reproducible with a small chance, less than 1/100 I think.
> 
> Have hit this in linux-next 20160504, 0506 and 0510 trees, testing on
> xfs with loop or block device. Ext4 survived several rounds
> of testing.
> 
> The Linux-next 20160510 tree hangs within 500 rounds of testing, several
> times. The same tree with the vfs parallel lookup patchset reverted
> survived 900 rounds of testing. Reverted commits are attached.
> 
> Bisecting in this patchset identified this commit:
> 
> 3b0a3c1ac1598722fc289da19219d14f2a37b31f is the first bad commit
> commit 3b0a3c1ac1598722fc289da19219d14f2a37b31f
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date:   Wed Apr 20 23:42:46 2016 -0400
> 
>     simple local filesystems: switch to ->iterate_shared()
>     
>     no changes needed (XFS isn't simple, but it has the same parallelism
>     in the interesting parts exercised from CXFS).
>     
> With this commit reverted on top of Linux next 0510 tree, 5000+ rounds
> of testing passed.

Joy...  So much for CXFS testing, probably...  All that commit changes for
XFS is allowing xfs_readdir() to be called in parallel for different struct
file.  We can always revert the XFS part, of course, but it would be
interesting to find out what happens to be dependent on external exclusion
there.


* Re: Linux-next parallel cp workload hang
  2016-05-18 23:02           ` Dave Chinner
@ 2016-05-19  6:22             ` Xiong Zhou
  0 siblings, 0 replies; 9+ messages in thread
From: Xiong Zhou @ 2016-05-19  6:22 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Xiong Zhou, linux-next, viro, linux-kernel, linux-fsdevel

On Thu, May 19, 2016 at 09:02:31AM +1000, Dave Chinner wrote:
> On Thu, May 19, 2016 at 12:17:26AM +1000, Dave Chinner wrote:
> > Patch below should fix the deadlock.
> 
> The test has been running for several hours without failure using
> this patch, so I'd say this fixes the problem...

Yes, the same for me.

Thanks.

Xiong

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: Linux-next parallel cp workload hang
  2016-05-18 14:17         ` Dave Chinner
@ 2016-05-18 23:02           ` Dave Chinner
  2016-05-19  6:22             ` Xiong Zhou
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2016-05-18 23:02 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: linux-next, viro, linux-kernel, linux-fsdevel

On Thu, May 19, 2016 at 12:17:26AM +1000, Dave Chinner wrote:
> Patch below should fix the deadlock.

The test has been running for several hours without failure using
this patch, so I'd say this fixes the problem...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Linux-next parallel cp workload hang
  2016-05-18 11:46       ` Xiong Zhou
@ 2016-05-18 14:17         ` Dave Chinner
  2016-05-18 23:02           ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2016-05-18 14:17 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: linux-next, viro, linux-kernel, linux-fsdevel

On Wed, May 18, 2016 at 07:46:17PM +0800, Xiong Zhou wrote:
> 
> On Wed, May 18, 2016 at 07:54:09PM +1000, Dave Chinner wrote:
> > On Wed, May 18, 2016 at 04:31:50PM +0800, Xiong Zhou wrote:
> > > Hi,
> > > 
> > > On Wed, May 18, 2016 at 03:56:34PM +1000, Dave Chinner wrote:
> > > > On Wed, May 18, 2016 at 09:46:15AM +0800, Xiong Zhou wrote:
> > > > > Hi,
> > > > > 
> > > > > Parallel cp workload (xfstests generic/273) hangs like below.
> > > > > It's reproducible with a small chance, less than 1/100 I think.
> > > > > 
> > > > > Have hit this in linux-next 20160504, 0506 and 0510 trees, testing on
> > > > > xfs with loop or block device. Ext4 survived several rounds
> > > > > of testing.
> > > > > 
> > > > > The Linux-next 20160510 tree hangs within 500 rounds of testing, several
> > > > > times. The same tree with the vfs parallel lookup patchset reverted
> > > > > survived 900 rounds of testing. Reverted commits are attached.

Ok, this is trivial to reproduce. Al - I've hit this 9 times out of
ten running it on a 4p VM with a pair of 4GB ram disks using all
the current upstream default mkfs and mount configurations. On the
tenth attempt I got the tracing to capture what I needed to see -
process 7340 was the last xfs_buf_lock_done trace without an unlock
trace, and that process had this trace:

schedule
rwsem_down_read_failed
call_rwsem_down_read_failed
down_read
xfs_ilock
xfs_ilock_data_map_shared
xfs_dir2_leaf_getdents
xfs_readdir
xfs_file_readdir
iterate_dir
SyS_getdents
entry_SYSCALL_64_fastpath

Which means it's holding a buffer lock while trying to get the
ilock(shared). That's never going to end well - I'm now wondering
why lockdep hasn't been all over this lock order inversion....

Essentially, it's a three-process deadlock involving
shared/exclusive barriers and inverted lock orders. It's a
pre-existing problem with buffer mapping lock orders, nothing to do
with the VFS parallelisation code.

process 1		process 2		process 3
---------		---------		---------
readdir
iolock(shared)
  get_leaf_dents
    iterate entries
       ilock(shared)
       map, lock and read buffer
       iunlock(shared)
       process entries in buffer
       .....
       						readdir
						iolock(shared)
						  get_leaf_dents
						    iterate entries
						      ilock(shared)
						      map, lock buffer
						      <blocks>
       			finish ->iterate_shared
			file_accessed()
			  ->update_time
			    start transaction
			    ilock(excl)
			    <blocks>
	.....
	finishes processing buffer
	get next buffer
	  ilock(shared)
	  <blocks>

And that's the deadlock.

Now that I know what the problem is, I can say that process 2 - the
transactional timestamp update - is the reason the readdir
operations are blocking like this. And I know why CXFS never hit
this - it doesn't use the VFS paths, so the VFS calls to update
timestamps don't exist during concurrent readdir operations on the
CXFS metadata server. Hence process 2 doesn't exist and no exclusive
barriers are put in amongst the shared locking....

Patch below should fix the deadlock.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


xfs: concurrent readdir hangs on data buffer locks

From: Dave Chinner <dchinner@redhat.com>

There's a three-process deadlock involving shared/exclusive barriers
and inverted lock orders in the directory readdir implementation.
It's a pre-existing problem with lock ordering, exposed by the
VFS parallelisation code.

process 1               process 2               process 3
---------               ---------               ---------
readdir
iolock(shared)
  get_leaf_dents
    iterate entries
       ilock(shared)
       map, lock and read buffer
       iunlock(shared)
       process entries in buffer
       .....
                                                readdir
                                                iolock(shared)
                                                  get_leaf_dents
                                                    iterate entries
                                                      ilock(shared)
                                                      map, lock buffer
                                                      <blocks>
                        finish ->iterate_shared
                        file_accessed()
                          ->update_time
                            start transaction
                            ilock(excl)
                            <blocks>
        .....
        finishes processing buffer
        get next buffer
          ilock(shared)
          <blocks>

And that's the deadlock.

Fix this by dropping the current buffer lock in process 1 before
trying to map the next buffer. This means we keep the lock order of
ilock -> buffer lock intact and hence will allow process 3 to make
progress and drop its ilock(shared) once it is done.

Reported-by: Xiong Zhou <xzhou@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_dir2_readdir.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
index 93b3ab0..21501dc 100644
--- a/fs/xfs/xfs_dir2_readdir.c
+++ b/fs/xfs/xfs_dir2_readdir.c
@@ -273,10 +273,11 @@ xfs_dir2_leaf_readbuf(
 	size_t			bufsize,
 	struct xfs_dir2_leaf_map_info *mip,
 	xfs_dir2_off_t		*curoff,
-	struct xfs_buf		**bpp)
+	struct xfs_buf		**bpp,
+	bool			trim_map)
 {
 	struct xfs_inode	*dp = args->dp;
-	struct xfs_buf		*bp = *bpp;
+	struct xfs_buf		*bp = NULL;
 	struct xfs_bmbt_irec	*map = mip->map;
 	struct blk_plug		plug;
 	int			error = 0;
@@ -286,13 +287,10 @@ xfs_dir2_leaf_readbuf(
 	struct xfs_da_geometry	*geo = args->geo;
 
 	/*
-	 * If we have a buffer, we need to release it and
-	 * take it out of the mapping.
+	 * If the caller just finished processing a buffer, it will tell us
+	 * we need to trim that block out of the mapping now it is done.
 	 */
-
-	if (bp) {
-		xfs_trans_brelse(NULL, bp);
-		bp = NULL;
+	if (trim_map) {
 		mip->map_blocks -= geo->fsbcount;
 		/*
 		 * Loop to get rid of the extents for the
@@ -533,10 +531,17 @@ xfs_dir2_leaf_getdents(
 		 */
 		if (!bp || ptr >= (char *)bp->b_addr + geo->blksize) {
 			int	lock_mode;
+			bool	trim_map = false;
+
+			if (bp) {
+				xfs_trans_brelse(NULL, bp);
+				bp = NULL;
+				trim_map = true;
+			}
 
 			lock_mode = xfs_ilock_data_map_shared(dp);
 			error = xfs_dir2_leaf_readbuf(args, bufsize, map_info,
-						      &curoff, &bp);
+						      &curoff, &bp, trim_map);
 			xfs_iunlock(dp, lock_mode);
 			if (error || !map_info->map_valid)
 				break;


* Re: Linux-next parallel cp workload hang
  2016-05-18  9:54     ` Dave Chinner
@ 2016-05-18 11:46       ` Xiong Zhou
  2016-05-18 14:17         ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Xiong Zhou @ 2016-05-18 11:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Xiong Zhou, linux-next, viro, linux-kernel, linux-fsdevel


Hi,

On Wed, May 18, 2016 at 07:54:09PM +1000, Dave Chinner wrote:
> On Wed, May 18, 2016 at 04:31:50PM +0800, Xiong Zhou wrote:
> > Hi,
> > 
> > On Wed, May 18, 2016 at 03:56:34PM +1000, Dave Chinner wrote:
> > > On Wed, May 18, 2016 at 09:46:15AM +0800, Xiong Zhou wrote:
> > > > Hi,
> > > > 
> > > > Parallel cp workload (xfstests generic/273) hangs like below.
> > > > It's reproducible with a small chance, less than 1/100 I think.
> > > > 
> > > > Have hit this in linux-next 20160504, 0506 and 0510 trees, testing on
> > > > xfs with loop or block device. Ext4 survived several rounds
> > > > of testing.
> > > > 
> > > > The Linux-next 20160510 tree hangs within 500 rounds of testing, several
> > > > times. The same tree with the vfs parallel lookup patchset reverted
> > > > survived 900 rounds of testing. Reverted commits are attached.
> > > What hardware?
> > 
> > A HP prototype host.
> 
> description? cpus, memory, etc? I want to have some idea of what
> hardware I need to reproduce this...

#lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                48
On-line CPU(s) list:   0-47
Thread(s) per core:    2
Core(s) per socket:    12
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
Stepping:              2
CPU MHz:               2596.918
BogoMIPS:              5208.33
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              30720K
NUMA node0 CPU(s):     0-11,24-35
NUMA node1 CPU(s):     12-23,36-47

#free -m
        total        used        free      shared  buff/cache   available
Mem:    31782         623       27907           9        3251       30491
Swap:   10239           0       10239

> 
> xfs_info from the scratch filesystem would also be handy.

meta-data=/dev/pmem1             isize=256    agcount=4, agsize=131072 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=524288, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> > > Can you reproduce this with CONFIG_XFS_DEBUG=y set? if you can, and
> > > it doesn't trigger any warnings or asserts, can you then try to
> > > reproduce it while tracing the following events:
> > > 
> > > 	xfs_buf_lock
> > > 	xfs_buf_lock_done
> > > 	xfs_buf_trylock
> > > 	xfs_buf_unlock
> > > 
> > > So we might be able to see if there's an unexpected buffer
> > > locking/state pattern occurring when the hang occurs?
> > 
> > Yes, I've reproduced this with both CONFIG_XFS_DEBUG=y and the tracers
> > on. There is some trace output from after the hang had lasted for a while.
> 
> I'm not actually interested in the trace after the hang - I'm
> interested in what happened leading up to the hang. The output
> you've given me tells me that the directory block at that offset is locked
> but nothing in the trace tells me what locked it.
> 
> Can I suggest using trace-cmd to record the events, then when the
> test hangs kill the check process so that trace-cmd terminates and
> gathers the events. Then dump the report to a text file and attach
> that?

Sure. Trace report, dmesg, ps axjf after Ctrl+C are attached.

Thanks for the instructions and patience.
Xiong
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

[-- Attachment #2: g273-trace-report.tar.gz --]
[-- Type: application/gzip, Size: 244506 bytes --]


* Re: Linux-next parallel cp workload hang
  2016-05-18  8:31   ` Xiong Zhou
@ 2016-05-18  9:54     ` Dave Chinner
  2016-05-18 11:46       ` Xiong Zhou
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2016-05-18  9:54 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: linux-next, viro, linux-kernel, linux-fsdevel

On Wed, May 18, 2016 at 04:31:50PM +0800, Xiong Zhou wrote:
> Hi,
> 
> On Wed, May 18, 2016 at 03:56:34PM +1000, Dave Chinner wrote:
> > On Wed, May 18, 2016 at 09:46:15AM +0800, Xiong Zhou wrote:
> > > Hi,
> > > 
> > > Parallel cp workload (xfstests generic/273) hangs like below.
> > > It's reproducible with a small chance, less than 1/100 I think.
> > > 
> > > Have hit this in linux-next 20160504, 0506 and 0510 trees, testing on
> > > xfs with loop or block device. Ext4 survived several rounds
> > > of testing.
> > > 
> > > The Linux-next 20160510 tree hangs within 500 rounds of testing, several
> > > times. The same tree with the vfs parallel lookup patchset reverted
> > > survived 900 rounds of testing. Reverted commits are attached.
> > What hardware?
> 
> A HP prototype host.

description? cpus, memory, etc? I want to have some idea of what
hardware I need to reproduce this...

xfs_info from the scratch filesystem would also be handy.

> > Can you reproduce this with CONFIG_XFS_DEBUG=y set? if you can, and
> > it doesn't trigger any warnings or asserts, can you then try to
> > reproduce it while tracing the following events:
> > 
> > 	xfs_buf_lock
> > 	xfs_buf_lock_done
> > 	xfs_buf_trylock
> > 	xfs_buf_unlock
> > 
> > So we might be able to see if there's an unexpected buffer
> > locking/state pattern occurring when the hang occurs?
> 
> Yes, I've reproduced this with both CONFIG_XFS_DEBUG=y and the tracers
> on. There is some trace output from after the hang had lasted for a while.

I'm not actually interested in the trace after the hang - I'm
interested in what happened leading up to the hang. The output
you've given me tells me that the directory block at that offset is locked
but nothing in the trace tells me what locked it.

Can I suggest using trace-cmd to record the events, then when the
test hangs kill the check process so that trace-cmd terminates and
gathers the events. Then dump the report to a text file and attach
that?
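
For example, something along these lines should do it (a suggested
invocation only; adjust the test and output names to taste):

  # record just the xfs buffer locking tracepoints while the test runs
  trace-cmd record -e xfs:xfs_buf_lock -e xfs:xfs_buf_lock_done \
                   -e xfs:xfs_buf_trylock -e xfs:xfs_buf_unlock \
                   ./check generic/273

  # once the hung check process has been killed and trace-cmd has
  # written trace.dat, dump it to text for attaching
  trace-cmd report > g273-trace-report.txt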

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Linux-next parallel cp workload hang
  2016-05-18  5:56 ` Dave Chinner
@ 2016-05-18  8:31   ` Xiong Zhou
  2016-05-18  9:54     ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Xiong Zhou @ 2016-05-18  8:31 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Xiong Zhou, linux-next, viro, linux-kernel, linux-fsdevel


Hi,

On Wed, May 18, 2016 at 03:56:34PM +1000, Dave Chinner wrote:
> On Wed, May 18, 2016 at 09:46:15AM +0800, Xiong Zhou wrote:
> > Hi,
> > 
> > Parallel cp workload (xfstests generic/273) hangs like below.
> > It's reproducible with a small chance, less than 1/100 I think.
> > 
> > Have hit this in linux-next 20160504, 0506 and 0510 trees, testing on
> > xfs with loop or block device. Ext4 survived several rounds
> > of testing.
> > 
> > The Linux-next 20160510 tree hangs within 500 rounds of testing, several
> > times. The same tree with the vfs parallel lookup patchset reverted
> > survived 900 rounds of testing. Reverted commits are attached.
> What hardware?

A HP prototype host.

> 
> > Bisecting in this patchset identified this commit:
> > 
> > 3b0a3c1ac1598722fc289da19219d14f2a37b31f is the first bad commit
> > commit 3b0a3c1ac1598722fc289da19219d14f2a37b31f
> > Author: Al Viro <viro@zeniv.linux.org.uk>
> > Date:   Wed Apr 20 23:42:46 2016 -0400
> > 
> >     simple local filesystems: switch to ->iterate_shared()
> >     
> >     no changes needed (XFS isn't simple, but it has the same parallelism
> >     in the interesting parts exercised from CXFS).
> >     
> > With this commit reverted on top of Linux next 0510 tree, 5000+ rounds
> > of testing passed.
> > 
> > Although 2000 rounds of testing had been conducted before the good/bad
> > verdict, I'm not 100 percent sure about all this, since it's
> > so hard to hit, and I am not that lucky.
> > 
> > Bisect log and full blocked state process dump log are also attached.
> > 
> > Furthermore, this was first hit when testing fs dax on nvdimm;
> > however, it's reproducible without the dax mount option, and also
> > reproducible on a loop device, it just seems harder to hit.
> > 
> > Thanks,
> > Xiong
> > 
> > [0.771475] INFO: task cp:49033 blocked for more than 120 seconds.
> > [0.794263]       Not tainted 4.6.0-rc6-next-20160504 #5
> > [0.812515] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> > this message.
> > [0.841801] cp              D ffff880b4e977928     0 49033  49014
> > 0x00000080
> > [0.868923]  ffff880b4e977928 ffff880ba275d380 ffff880b8d712b80
> > ffff880b4e978000
> > [0.897504]  7fffffffffffffff 0000000000000002 0000000000000000
> > ffff880b8d712b80
> > [0.925234]  ffff880b4e977940 ffffffff816cbc25 ffff88035a1dabb0
> > ffff880b4e9779e8
> > [0.953237] Call Trace:
> > [0.958314]  [<ffffffff816cbc25>] schedule+0x35/0x80
> > [0.974854]  [<ffffffff816ceb71>] schedule_timeout+0x231/0x2d0
> > [0.995728]  [<ffffffff810d2e2d>] ? down_trylock+0x2d/0x40
> > [1.015351]  [<ffffffffa03fa6c2>] ? xfs_iext_bno_to_ext+0xa2/0x190 [xfs]
> > [1.040182]  [<ffffffff816cdb89>] __down_common+0xaa/0x104
> > [1.059021]  [<ffffffffa04076d2>] ? _xfs_buf_find+0x162/0x340 [xfs]
> > [1.081357]  [<ffffffff816cdc00>] __down+0x1d/0x1f
> > [1.097166]  [<ffffffff810d2e81>] down+0x41/0x50
> > [1.112869]  [<ffffffffa04074bc>] xfs_buf_lock+0x3c/0xf0 [xfs]
> > [1.134504]  [<ffffffffa04076d2>] _xfs_buf_find+0x162/0x340 [xfs]
> > [1.156871]  [<ffffffffa04078da>] xfs_buf_get_map+0x2a/0x270 [xfs]
> 
> So what's holding that directory data buffer lock? It should only be
> held if there is either IO in progress, or a modification of the
> buffer in progress that is blocked somewhere else.
> 
> > [1.180010]  [<ffffffffa04084ed>] xfs_buf_read_map+0x2d/0x180 [xfs]
> > [1.203538]  [<ffffffffa0435f21>] xfs_trans_read_buf_map+0xf1/0x300 [xfs]
> > [1.229194]  [<ffffffffa03e75b1>] xfs_da_read_buf+0xd1/0x100 [xfs]
> > [1.251948]  [<ffffffffa03ed5e6>] xfs_dir3_data_read+0x26/0x60 [xfs]
> > [1.275736]  [<ffffffffa040923e>] xfs_dir2_leaf_readbuf.isra.12+0x1be/0x4a0 [xfs]
> > [1.305094]  [<ffffffff816cdc92>] ? down_read+0x12/0x30
> > [1.323787]  [<ffffffffa04187a4>] ? xfs_ilock+0xe4/0x110 [xfs]
> > [1.345114]  [<ffffffffa040965b>] xfs_dir2_leaf_getdents+0x13b/0x3d0 [xfs]
> > [1.371818]  [<ffffffffa0409f26>] xfs_readdir+0x1a6/0x1c0 [xfs]
> 
> So we should be holding the ilock in shared mode here...
> 
> > [1.393471]  [<ffffffffa040bcbb>] xfs_file_readdir+0x2b/0x30 [xfs]
> > [1.416874]  [<ffffffff8122cec3>] iterate_dir+0x173/0x190
> > [1.436709]  [<ffffffff81003276>] ? do_audit_syscall_entry+0x66/0x70
> > [1.460951]  [<ffffffff8122d338>] SyS_getdents+0x98/0x120
> > [1.480566]  [<ffffffff8122cee0>] ? iterate_dir+0x190/0x190
> > [1.500909]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
> > [1.520847]  [<ffffffff816cf9a1>] entry_SYSCALL64_slow_path+0x25/0x25
> > [1.545372] INFO: task cp:49040 blocked for more than 120 seconds.
> > [1.568933]       Not tainted 4.6.0-rc6-next-20160504 #5
> > [1.587943] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> > this message.
> > [1.618544] cp              D ffff880b91463b00     0 49040  49016
> > 0x00000080
> > [1.645502]  ffff880b91463b00 ffff880464d5c140 ffff88029b475700
> > ffff880b91464000
> > [1.674145]  ffff880411c42610 ffffffffffffffff ffff880411c42628
> > ffff8802c10bc610
> > [1.702834]  ffff880b91463b18 ffffffff816cbc25 ffff88029b475700
> > ffff880b91463b88
> > [1.731501] Call Trace:
> > [1.736866]  [<ffffffff816cbc25>] schedule+0x35/0x80
> > [1.754119]  [<ffffffff816ce3c2>] rwsem_down_read_failed+0xf2/0x140
> > [1.777411]  [<ffffffffa0418800>] ? xfs_ilock_data_map_shared+0x30/0x40
> > [xfs]
> > [1.805090]  [<ffffffff81355e08>] call_rwsem_down_read_failed+0x18/0x30
> > [1.830482]  [<ffffffff816cdca0>] down_read+0x20/0x30
> > [1.848505]  [<ffffffffa04187a4>] xfs_ilock+0xe4/0x110 [xfs]
> > [1.869293]  [<ffffffffa0418800>] xfs_ilock_data_map_shared+0x30/0x40
> 
> And this is an attempt to lock the inode shared, so if that is
> failing while there's another shared holder, that means there's an
> exclusive waiter queued up (i.e. read ilock held -> write blocked -> read
> blocked).
> 
> 
> So looking at dump-g273xfs0510:
> 
> [  845.727907] INFO: task cp:40126 blocked for more than 120 seconds.
> [  845.751175]       Not tainted 4.6.0-rc7-next-20160510 #9
> [  845.770011] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  845.799967] cp              D ffff88046b64bcb8     0 40126  40106 0x00000080
> [  845.826636]  ffff88046b64bcb8 ffffffffa0458d36 ffff8804685c2b80 0000000002408240
> [  845.855232]  ffff88046b64c000 ffff880426b08450 ffff880426b08450 ffff880426b08468
> [  845.883836]  ffff880426b08458 ffff88046b64bcd0 ffffffff816ce9c5 ffff8804685c2b80
> [  845.911854] Call Trace:
> [  845.917076]  [<ffffffffa0458d36>] ? kmem_zone_alloc+0x96/0x120 [xfs]
> [  845.940302]  [<ffffffff816ce9c5>] schedule+0x35/0x80
> [  845.957118]  [<ffffffff816d1534>] rwsem_down_write_failed+0x114/0x230
> [  845.981242]  [<ffffffffa045a294>] ? xlog_grant_head_check+0x54/0x100 [xfs]
> [  846.007673]  [<ffffffffa04492e4>] ? xfs_vn_update_time+0xd4/0x160 [xfs]
> [  846.032009]  [<ffffffff81358217>] call_rwsem_down_write_failed+0x17/0x30
> [  846.056909]  [<ffffffff816d0a7d>] down_write+0x2d/0x40
> [  846.074580]  [<ffffffffa044b78f>] xfs_ilock+0xcf/0x110 [xfs]
> [  846.094681]  [<ffffffffa04492e4>] xfs_vn_update_time+0xd4/0x160 [xfs]
> [  846.118308]  [<ffffffff81238648>] touch_atime+0xa8/0xd0
> [  846.136385]  [<ffffffff8122f132>] iterate_dir+0x112/0x190
> [  846.155995]  [<ffffffff8122f608>] SyS_getdents+0x98/0x120
> [  846.175819]  [<ffffffff8122f1b0>] ? iterate_dir+0x190/0x190
> [  846.195808]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
> [  846.215019]  [<ffffffff816d2761>] entry_SYSCALL64_slow_path+0x25/0x25
> 
> 
> Timestamp updates are transactional, requiring exclusive access to
> the directory ilock. That's why we have a train-smash of down_read()
> calls blocking in these dumps - they are queued up behind this call.
> This is expected under workloads where a couple of hundred threads
> are all reading the same directory and trying to open files in it.
> 
> So why isn't the current read holder making progress? Why is it
> blocked on a directory data buffer lock when the only accesses
> should be read only?
> 
> There's nothing in either of the stack traces that tells me this. I
> suspect that there may be some kind of buffer readahead vs buffer
> read interaction that results in leaking a buffer lock, but it is
> not obvious what that may be yet.
> 
> Can you reproduce this with CONFIG_XFS_DEBUG=y set? if you can, and
> it doesn't trigger any warnings or asserts, can you then try to
> reproduce it while tracing the following events:
> 
> 	xfs_buf_lock
> 	xfs_buf_lock_done
> 	xfs_buf_trylock
> 	xfs_buf_unlock
> 
> So we might be able to see if there's an unexpected buffer
> locking/state pattern occurring when the hang occurs?

Yes, I've reproduced this with both CONFIG_XFS_DEBUG=y and the tracers
on. There is some trace output from after the hang had lasted for a while.
Tracing output, dmesg, blocked-task stack dumps and config are attached.
BTW, reproduced on the mainline tree this time, at commit 0b7962a.

> 
> Also, if you run on slower storage, does the hang get harder or
> easier to hit?

Harder for me.
From easiest to hardest to hit: nvdimm device with the dax mount option,
then nvdimm device without the dax mount option, then loop device.

Thanks,
Xiong

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

[-- Attachment #2: g273-mainline.tar.gz --]
[-- Type: application/gzip, Size: 139695 bytes --]


* Re: Linux-next parallel cp workload hang
  2016-05-18  1:46 Xiong Zhou
@ 2016-05-18  5:56 ` Dave Chinner
  2016-05-18  8:31   ` Xiong Zhou
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2016-05-18  5:56 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: linux-next, viro, linux-kernel, linux-fsdevel

On Wed, May 18, 2016 at 09:46:15AM +0800, Xiong Zhou wrote:
> Hi,
> 
> Parallel cp workload (xfstests generic/273) hangs like below.
> It's reproducible with a small chance, less than 1/100 I think.
> 
> Have hit this in linux-next 20160504, 0506 and 0510 trees, testing on
> xfs with loop or block device. Ext4 survived several rounds
> of testing.
> 
> The Linux-next 20160510 tree hangs within 500 rounds of testing, several
> times. The same tree with the vfs parallel lookup patchset reverted
> survived 900 rounds of testing. Reverted commits are attached.

What hardware?

> Bisecting in this patchset identified this commit:
> 
> 3b0a3c1ac1598722fc289da19219d14f2a37b31f is the first bad commit
> commit 3b0a3c1ac1598722fc289da19219d14f2a37b31f
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date:   Wed Apr 20 23:42:46 2016 -0400
> 
>     simple local filesystems: switch to ->iterate_shared()
>     
>     no changes needed (XFS isn't simple, but it has the same parallelism
>     in the interesting parts exercised from CXFS).
>     
> With this commit reverted on top of Linux next 0510 tree, 5000+ rounds
> of testing passed.
> 
> Although 2000 rounds of testing had been conducted before the good/bad
> verdict, I'm not 100 percent sure about all this, since it's
> so hard to hit, and I am not that lucky.
> 
> Bisect log and full blocked state process dump log are also attached.
> 
> Furthermore, this was first hit when testing fs dax on nvdimm;
> however, it's reproducible without the dax mount option, and also
> reproducible on a loop device, it just seems harder to hit.
> 
> Thanks,
> Xiong
> 
> [0.771475] INFO: task cp:49033 blocked for more than 120 seconds.
> [0.794263]       Not tainted 4.6.0-rc6-next-20160504 #5
> [0.812515] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [0.841801] cp              D ffff880b4e977928     0 49033  49014
> 0x00000080
> [0.868923]  ffff880b4e977928 ffff880ba275d380 ffff880b8d712b80
> ffff880b4e978000
> [0.897504]  7fffffffffffffff 0000000000000002 0000000000000000
> ffff880b8d712b80
> [0.925234]  ffff880b4e977940 ffffffff816cbc25 ffff88035a1dabb0
> ffff880b4e9779e8
> [0.953237] Call Trace:
> [0.958314]  [<ffffffff816cbc25>] schedule+0x35/0x80
> [0.974854]  [<ffffffff816ceb71>] schedule_timeout+0x231/0x2d0
> [0.995728]  [<ffffffff810d2e2d>] ? down_trylock+0x2d/0x40
> [1.015351]  [<ffffffffa03fa6c2>] ? xfs_iext_bno_to_ext+0xa2/0x190 [xfs]
> [1.040182]  [<ffffffff816cdb89>] __down_common+0xaa/0x104
> [1.059021]  [<ffffffffa04076d2>] ? _xfs_buf_find+0x162/0x340 [xfs]
> [1.081357]  [<ffffffff816cdc00>] __down+0x1d/0x1f
> [1.097166]  [<ffffffff810d2e81>] down+0x41/0x50
> [1.112869]  [<ffffffffa04074bc>] xfs_buf_lock+0x3c/0xf0 [xfs]
> [1.134504]  [<ffffffffa04076d2>] _xfs_buf_find+0x162/0x340 [xfs]
> [1.156871]  [<ffffffffa04078da>] xfs_buf_get_map+0x2a/0x270 [xfs]

So what's holding that directory data buffer lock? It should only be
held if there is either IO in progress, or a modification of the
buffer in progress that is blocked somewhere else.

> [1.180010]  [<ffffffffa04084ed>] xfs_buf_read_map+0x2d/0x180 [xfs]
> [1.203538]  [<ffffffffa0435f21>] xfs_trans_read_buf_map+0xf1/0x300 [xfs]
> [1.229194]  [<ffffffffa03e75b1>] xfs_da_read_buf+0xd1/0x100 [xfs]
> [1.251948]  [<ffffffffa03ed5e6>] xfs_dir3_data_read+0x26/0x60 [xfs]
> [1.275736]  [<ffffffffa040923e>] xfs_dir2_leaf_readbuf.isra.12+0x1be/0x4a0 [xfs]
> [1.305094]  [<ffffffff816cdc92>] ? down_read+0x12/0x30
> [1.323787]  [<ffffffffa04187a4>] ? xfs_ilock+0xe4/0x110 [xfs]
> [1.345114]  [<ffffffffa040965b>] xfs_dir2_leaf_getdents+0x13b/0x3d0 [xfs]
> [1.371818]  [<ffffffffa0409f26>] xfs_readdir+0x1a6/0x1c0 [xfs]

So we should be holding the ilock in shared mode here...

> [1.393471]  [<ffffffffa040bcbb>] xfs_file_readdir+0x2b/0x30 [xfs]
> [1.416874]  [<ffffffff8122cec3>] iterate_dir+0x173/0x190
> [1.436709]  [<ffffffff81003276>] ? do_audit_syscall_entry+0x66/0x70
> [1.460951]  [<ffffffff8122d338>] SyS_getdents+0x98/0x120
> [1.480566]  [<ffffffff8122cee0>] ? iterate_dir+0x190/0x190
> [1.500909]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
> [1.520847]  [<ffffffff816cf9a1>] entry_SYSCALL64_slow_path+0x25/0x25
> [1.545372] INFO: task cp:49040 blocked for more than 120 seconds.
> [1.568933]       Not tainted 4.6.0-rc6-next-20160504 #5
> [1.587943] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [1.618544] cp              D ffff880b91463b00     0 49040  49016
> 0x00000080
> [1.645502]  ffff880b91463b00 ffff880464d5c140 ffff88029b475700
> ffff880b91464000
> [1.674145]  ffff880411c42610 ffffffffffffffff ffff880411c42628
> ffff8802c10bc610
> [1.702834]  ffff880b91463b18 ffffffff816cbc25 ffff88029b475700
> ffff880b91463b88
> [1.731501] Call Trace:
> [1.736866]  [<ffffffff816cbc25>] schedule+0x35/0x80
> [1.754119]  [<ffffffff816ce3c2>] rwsem_down_read_failed+0xf2/0x140
> [1.777411]  [<ffffffffa0418800>] ? xfs_ilock_data_map_shared+0x30/0x40
> [xfs]
> [1.805090]  [<ffffffff81355e08>] call_rwsem_down_read_failed+0x18/0x30
> [1.830482]  [<ffffffff816cdca0>] down_read+0x20/0x30
> [1.848505]  [<ffffffffa04187a4>] xfs_ilock+0xe4/0x110 [xfs]
> [1.869293]  [<ffffffffa0418800>] xfs_ilock_data_map_shared+0x30/0x40

And this is an attempt to lock the inode shared, so if that is
failing while there's another shared holder, that means there's an
exclusive waiter queued up (i.e. read ilock held -> write blocked -> read
blocked).


So looking at dump-g273xfs0510:

[  845.727907] INFO: task cp:40126 blocked for more than 120 seconds.
[  845.751175]       Not tainted 4.6.0-rc7-next-20160510 #9
[  845.770011] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  845.799967] cp              D ffff88046b64bcb8     0 40126  40106 0x00000080
[  845.826636]  ffff88046b64bcb8 ffffffffa0458d36 ffff8804685c2b80 0000000002408240
[  845.855232]  ffff88046b64c000 ffff880426b08450 ffff880426b08450 ffff880426b08468
[  845.883836]  ffff880426b08458 ffff88046b64bcd0 ffffffff816ce9c5 ffff8804685c2b80
[  845.911854] Call Trace:
[  845.917076]  [<ffffffffa0458d36>] ? kmem_zone_alloc+0x96/0x120 [xfs]
[  845.940302]  [<ffffffff816ce9c5>] schedule+0x35/0x80
[  845.957118]  [<ffffffff816d1534>] rwsem_down_write_failed+0x114/0x230
[  845.981242]  [<ffffffffa045a294>] ? xlog_grant_head_check+0x54/0x100 [xfs]
[  846.007673]  [<ffffffffa04492e4>] ? xfs_vn_update_time+0xd4/0x160 [xfs]
[  846.032009]  [<ffffffff81358217>] call_rwsem_down_write_failed+0x17/0x30
[  846.056909]  [<ffffffff816d0a7d>] down_write+0x2d/0x40
[  846.074580]  [<ffffffffa044b78f>] xfs_ilock+0xcf/0x110 [xfs]
[  846.094681]  [<ffffffffa04492e4>] xfs_vn_update_time+0xd4/0x160 [xfs]
[  846.118308]  [<ffffffff81238648>] touch_atime+0xa8/0xd0
[  846.136385]  [<ffffffff8122f132>] iterate_dir+0x112/0x190
[  846.155995]  [<ffffffff8122f608>] SyS_getdents+0x98/0x120
[  846.175819]  [<ffffffff8122f1b0>] ? iterate_dir+0x190/0x190
[  846.195808]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[  846.215019]  [<ffffffff816d2761>] entry_SYSCALL64_slow_path+0x25/0x25


Timestamp updates are transactional, requiring exclusive access to
the directory ilock. That's why we have a train-smash of down_read()
calls blocking in these dumps - they are queued up behind this call.
This is expected under workloads where a couple of hundred threads
are all reading the same directory and trying to open files in it.

So why isn't the current read holder making progress? Why is it
blocked on a directory data buffer lock when the only accesses
should be read only?

There's nothing in either of the stack traces that tells me this. I
suspect that there may be some kind of buffer readahead vs buffer
read interaction that results in leaking a buffer lock, but it is
not obvious what that may be yet.

Can you reproduce this with CONFIG_XFS_DEBUG=y set? if you can, and
it doesn't trigger any warnings or asserts, can you then try to
reproduce it while tracing the following events:

	xfs_buf_lock
	xfs_buf_lock_done
	xfs_buf_trylock
	xfs_buf_unlock

So we might be able to see if there's an unexpected buffer
locking/state pattern occurring when the hang occurs?
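
For example, the tracepoints can be enabled directly through tracefs
(a sketch that assumes tracefs is mounted at /sys/kernel/debug/tracing;
adjust paths as needed):

  cd /sys/kernel/debug/tracing
  for ev in xfs_buf_lock xfs_buf_lock_done xfs_buf_trylock xfs_buf_unlock; do
          echo 1 > events/xfs/$ev/enable
  done
  echo 1 > tracing_on
  # run the reproducer; once it hangs, save what was captured
  cat trace > /tmp/xfs-buf-trace.txt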

Also, if you run on slower storage, does the hang get harder or
easier to hit?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Linux-next parallel cp workload hang
@ 2016-05-18  1:46 Xiong Zhou
  2016-05-18  5:56 ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Xiong Zhou @ 2016-05-18  1:46 UTC (permalink / raw)
  To: linux-next, viro; +Cc: linux-kernel, linux-fsdevel


Hi,

Parallel cp workload (xfstests generic/273) hangs like below.
It's reproducible with a small chance, less than 1/100 I think.

Have hit this in linux-next 20160504, 0506 and 0510 trees, testing on
xfs with loop or block device. Ext4 survived several rounds
of testing.

The Linux-next 20160510 tree hangs within 500 rounds of testing, several
times. The same tree with the vfs parallel lookup patchset reverted
survived 900 rounds of testing. Reverted commits are attached.
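
A "round" here is one run of the test; a driver loop along these lines
reproduces that setup (an illustrative sketch, not the actual harness used):

  # from an xfstests checkout, with TEST_DEV/TEST_DIR/SCRATCH_DEV/SCRATCH_MNT
  # for the xfs filesystems configured in local.config
  for i in $(seq 1 5000); do
          echo "round $i"
          ./check generic/273 || break
  done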

Bisecting in this patchset identified this commit:

3b0a3c1ac1598722fc289da19219d14f2a37b31f is the first bad commit
commit 3b0a3c1ac1598722fc289da19219d14f2a37b31f
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Wed Apr 20 23:42:46 2016 -0400

    simple local filesystems: switch to ->iterate_shared()
    
    no changes needed (XFS isn't simple, but it has the same parallelism
    in the interesting parts exercised from CXFS).
    
With this commit reverted on top of Linux next 0510 tree, 5000+ rounds
of testing passed.

Although 2000 rounds of testing had been conducted before the good/bad
verdict, I'm not 100 percent sure about all this, since it's
so hard to hit, and I am not that lucky.

Bisect log and full blocked state process dump log are also attached.

Furthermore, this was first hit when testing fs dax on nvdimm;
however, it's reproducible without the dax mount option, and also
reproducible on a loop device, it just seems harder to hit.

Thanks,
Xiong

[0.771475] INFO: task cp:49033 blocked for more than 120 seconds.
[0.794263]       Not tainted 4.6.0-rc6-next-20160504 #5
[0.812515] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[0.841801] cp              D ffff880b4e977928     0 49033  49014 0x00000080
[0.868923]  ffff880b4e977928 ffff880ba275d380 ffff880b8d712b80 ffff880b4e978000
[0.897504]  7fffffffffffffff 0000000000000002 0000000000000000 ffff880b8d712b80
[0.925234]  ffff880b4e977940 ffffffff816cbc25 ffff88035a1dabb0 ffff880b4e9779e8
[0.953237] Call Trace:
[0.958314]  [<ffffffff816cbc25>] schedule+0x35/0x80
[0.974854]  [<ffffffff816ceb71>] schedule_timeout+0x231/0x2d0
[0.995728]  [<ffffffff810d2e2d>] ? down_trylock+0x2d/0x40
[1.015351]  [<ffffffffa03fa6c2>] ? xfs_iext_bno_to_ext+0xa2/0x190 [xfs]
[1.040182]  [<ffffffff816cdb89>] __down_common+0xaa/0x104
[1.059021]  [<ffffffffa04076d2>] ? _xfs_buf_find+0x162/0x340 [xfs]
[1.081357]  [<ffffffff816cdc00>] __down+0x1d/0x1f
[1.097166]  [<ffffffff810d2e81>] down+0x41/0x50
[1.112869]  [<ffffffffa04074bc>] xfs_buf_lock+0x3c/0xf0 [xfs]
[1.134504]  [<ffffffffa04076d2>] _xfs_buf_find+0x162/0x340 [xfs]
[1.156871]  [<ffffffffa04078da>] xfs_buf_get_map+0x2a/0x270 [xfs]
[1.180010]  [<ffffffffa04084ed>] xfs_buf_read_map+0x2d/0x180 [xfs]
[1.203538]  [<ffffffffa0435f21>] xfs_trans_read_buf_map+0xf1/0x300 [xfs]
[1.229194]  [<ffffffffa03e75b1>] xfs_da_read_buf+0xd1/0x100 [xfs]
[1.251948]  [<ffffffffa03ed5e6>] xfs_dir3_data_read+0x26/0x60 [xfs]
[1.275736]  [<ffffffffa040923e>] xfs_dir2_leaf_readbuf.isra.12+0x1be/0x4a0 [xfs]
[1.305094]  [<ffffffff816cdc92>] ? down_read+0x12/0x30
[1.323787]  [<ffffffffa04187a4>] ? xfs_ilock+0xe4/0x110 [xfs]
[1.345114]  [<ffffffffa040965b>] xfs_dir2_leaf_getdents+0x13b/0x3d0 [xfs]
[1.371818]  [<ffffffffa0409f26>] xfs_readdir+0x1a6/0x1c0 [xfs]
[1.393471]  [<ffffffffa040bcbb>] xfs_file_readdir+0x2b/0x30 [xfs]
[1.416874]  [<ffffffff8122cec3>] iterate_dir+0x173/0x190
[1.436709]  [<ffffffff81003276>] ? do_audit_syscall_entry+0x66/0x70
[1.460951]  [<ffffffff8122d338>] SyS_getdents+0x98/0x120
[1.480566]  [<ffffffff8122cee0>] ? iterate_dir+0x190/0x190
[1.500909]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[1.520847]  [<ffffffff816cf9a1>] entry_SYSCALL64_slow_path+0x25/0x25
[1.545372] INFO: task cp:49040 blocked for more than 120 seconds.
[1.568933]       Not tainted 4.6.0-rc6-next-20160504 #5
[1.587943] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1.618544] cp              D ffff880b91463b00     0 49040  49016 0x00000080
[1.645502]  ffff880b91463b00 ffff880464d5c140 ffff88029b475700 ffff880b91464000
[1.674145]  ffff880411c42610 ffffffffffffffff ffff880411c42628 ffff8802c10bc610
[1.702834]  ffff880b91463b18 ffffffff816cbc25 ffff88029b475700 ffff880b91463b88
[1.731501] Call Trace:
[1.736866]  [<ffffffff816cbc25>] schedule+0x35/0x80
[1.754119]  [<ffffffff816ce3c2>] rwsem_down_read_failed+0xf2/0x140
[1.777411]  [<ffffffffa0418800>] ? xfs_ilock_data_map_shared+0x30/0x40 [xfs]
[1.805090]  [<ffffffff81355e08>] call_rwsem_down_read_failed+0x18/0x30
[1.830482]  [<ffffffff816cdca0>] down_read+0x20/0x30
[1.848505]  [<ffffffffa04187a4>] xfs_ilock+0xe4/0x110 [xfs]
[1.869293]  [<ffffffffa0418800>] xfs_ilock_data_map_shared+0x30/0x40 [xfs]
[1.896775]  [<ffffffffa040bc60>] xfs_dir_open+0x30/0x60 [xfs]
[1.917882]  [<ffffffff8121603f>] do_dentry_open+0x20f/0x320
[1.938919]  [<ffffffffa040bc30>] ? xfs_file_mmap+0x50/0x50 [xfs]
[1.961532]  [<ffffffff81217487>] vfs_open+0x57/0x60
[1.978945]  [<ffffffff81226f65>] path_openat+0x325/0x14e0
[1.999273]  [<ffffffff81228533>] ? putname+0x53/0x60
[2.017695]  [<ffffffff81229581>] do_filp_open+0x91/0x100
[2.036893]  [<ffffffff81237866>] ? __alloc_fd+0x46/0x180
[2.055479]  [<ffffffff812177e4>] do_sys_open+0x124/0x210
[2.073783]  [<ffffffff8113110b>] ? __audit_syscall_exit+0x1db/0x260
[2.096426]  [<ffffffff81217904>] SyS_openat+0x14/0x20
[2.113690]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[2.132417]  [<ffffffff816cf9a1>] entry_SYSCALL64_slow_path+0x25/0x25


[-- Attachment #2: g273-block-dumps.tar.gz --]
[-- Type: application/gzip, Size: 102883 bytes --]


Thread overview: 9+ messages
     [not found] <20160518013802.GA4679@ZX.nay.redhat.com>
2016-05-18  2:06 ` Linux-next parallel cp workload hang Al Viro
2016-05-18  1:46 Xiong Zhou
2016-05-18  5:56 ` Dave Chinner
2016-05-18  8:31   ` Xiong Zhou
2016-05-18  9:54     ` Dave Chinner
2016-05-18 11:46       ` Xiong Zhou
2016-05-18 14:17         ` Dave Chinner
2016-05-18 23:02           ` Dave Chinner
2016-05-19  6:22             ` Xiong Zhou
