* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
       [not found]   ` <20110608150241.8412a63d.akpm@linux-foundation.org>
@ 2011-06-09  3:32     ` Wu Fengguang
  2011-06-09  3:54       ` Wu Fengguang
  0 siblings, 1 reply; 18+ messages in thread
From: Wu Fengguang @ 2011-06-09  3:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, linux-fsdevel, bugzilla-daemon, daaugusto,
	kernel-bugzilla, listposter, justincase, clopez

On Thu, Jun 09, 2011 at 06:02:41AM +0800, Andrew Morton wrote:
> (update by email because bugzilla is broken)
> 
> On Wed, 8 Jun 2011 21:38:43 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=18632
> 
> I always get a bit worried when I see dm-crypt involved - it doesn't
> seem terribly well-behaved or reliable.
> 
> That being said, some of the other reports here indicate that we are
> hitting a sync() livelock, sometimes fsync().

These comments all mention "blocked sync(1) on concurrent heavy writes".

        Comment #6 From Douglas Augusto
        Comment #12 From Nathan Grennan 
        Comment #16 From justincase@yopmail.com 
        Comment #23 From Carlos Alberto Lopez Perez

Comment #17 From Justin Wheeler is special in that fio is doing
read-only IO. Could it be updating the mtime, with the mtime writes in
turn starved by the higher-priority continuous reads? (Or at least the
write speed cannot keep up with the newly dirtied mtimes.)
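
One quick experiment to test that theory (illustrative; it assumes the
fio files live under /fs) is to re-run the read-only job with timestamp
updates suppressed, which for pure reads means atime, and see whether
the stalls go away:

        mount -o remount,noatime /fs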

> A pile of patches from
> Wu Fengguang which could/should address these issues is about to hit
> the linux-next tree.

Some livelock fixes will show up in tomorrow's linux-next tree. For
now, the fixes are available here:

git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback.git fs-writeback
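
For anyone who wants to test them, the branch can be fetched with
something like the following (the remote name is arbitrary):

        git remote add wfg git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback.git
        git fetch wfg
        git checkout -b fs-writeback wfg/fs-writeback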

Thanks,
Fengguang


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09  3:32     ` [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds Wu Fengguang
@ 2011-06-09  3:54       ` Wu Fengguang
  2011-06-09  8:27         ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Wu Fengguang @ 2011-06-09  3:54 UTC (permalink / raw)
  To: Dave Chinner, Christoph Hellwig
  Cc: Andrew Morton, Jan Kara, linux-fsdevel, bugzilla-daemon,
	daaugusto, kernel-bugzilla, listposter, justincase, clopez

On Thu, Jun 09, 2011 at 11:32:17AM +0800, Wu Fengguang wrote:
> On Thu, Jun 09, 2011 at 06:02:41AM +0800, Andrew Morton wrote:
> > (update by email because bugzilla is broken)
> > 
> > On Wed, 8 Jun 2011 21:38:43 GMT
> > bugzilla-daemon@bugzilla.kernel.org wrote:
> > 
> > > https://bugzilla.kernel.org/show_bug.cgi?id=18632
> > 
> > I always get a bit worried when I see dm-crypt involved - it doesn't
> > seem terribly well-behaved or reliable.
> > 
> > That being said, some of the other reports here indicate that we are
> > hitting a sync() livelock, sometimes fsync().
> 
> These comments all mention "blocked sync(1) on concurrent heavy writes".
> 
>         Comment #6 From Douglas Augusto
>         Comment #12 From Nathan Grennan 
>         Comment #16 From justincase@yopmail.com 

>         Comment #23 From Carlos Alberto Lopez Perez

Oh, it's not sync(1) in Carlos' case, but random commands like man/dd,
as well as

        task xfssyncd:451 blocked for more than 120 seconds.

It looks more like an XFS-related livelock, because I ran into a
similar problem in a very simple XFS setup on a plain disk partition.

Thanks,
Fengguang

> Comment #17 From Justin Wheeler is special in that fio is doing
> read-only IO. Could it be updating the mtime, with the mtime writes in
> turn starved by the higher-priority continuous reads? (Or at least the
> write speed cannot keep up with the newly dirtied mtimes.)
> 
> > A pile of patches from
> > Wu Fengguang which could/should address these issues is about to hit
> > the linux-next tree.
> 
> Some livelock fixes will show up in tomorrow's linux-next tree. For
> now, the fixes are available here:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback.git fs-writeback
> 
> Thanks,
> Fengguang


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09  3:54       ` Wu Fengguang
@ 2011-06-09  8:27         ` Christoph Hellwig
  2011-06-09  9:09           ` Wu Fengguang
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2011-06-09  8:27 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Dave Chinner, Christoph Hellwig, Andrew Morton, Jan Kara,
	linux-fsdevel, bugzilla-daemon, daaugusto, kernel-bugzilla,
	listposter, justincase, clopez

On Thu, Jun 09, 2011 at 11:54:26AM +0800, Wu Fengguang wrote:
> Oh, it's not sync(1) in Carlos' case, but random commands like man/dd,
> as well as
> 
>         task xfssyncd:451 blocked for more than 120 seconds.
> 
> It looks more like an XFS-related livelock, because I ran into a
> similar problem in a very simple XFS setup on a plain disk partition.

Care to explain what issues you saw, and with what setup and kernel
version?  Also, the "task blocked for more than 120 seconds" message
usually comes with a stack trace; what does it look like?



* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09  8:27         ` Christoph Hellwig
@ 2011-06-09  9:09           ` Wu Fengguang
  2011-06-09 11:02             ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Wu Fengguang @ 2011-06-09  9:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Chinner, Andrew Morton, Jan Kara, linux-fsdevel,
	bugzilla-daemon, daaugusto, kernel-bugzilla, listposter,
	justincase, clopez

On Thu, Jun 09, 2011 at 04:27:18PM +0800, Christoph Hellwig wrote:
> On Thu, Jun 09, 2011 at 11:54:26AM +0800, Wu Fengguang wrote:
> > Oh, it's not sync(1) in Carlos' case, but random commands like man/dd,
> > as well as
> > 
> >         task xfssyncd:451 blocked for more than 120 seconds.
> > 
> > It looks more like an XFS-related livelock, because I ran into a
> > similar problem in a very simple XFS setup on a plain disk partition.
> 
> Care to explain what issues you saw, and with what setup and kernel
> version?  Also, the "task blocked for more than 120 seconds" message
> usually comes with a stack trace; what does it look like?

I have a sync livelock test script and it sometimes livelocked on XFS
even with the livelock fix patches. Ext4 is always OK.

[ 3581.181253] sync            D ffff8800b6ca15d8  4560  4403   4392 0x00000000
[ 3581.181734]  ffff88006f775bc8 0000000000000046 ffff8800b6ca12b8 00000001b6ca1938
[ 3581.182411]  ffff88006f774000 00000000001d2e40 00000000001d2e40 ffff8800b6ca1280
[ 3581.183088]  00000000001d2e40 ffff88006f775fd8 00000340af111ef2 00000000001d2e40
[ 3581.183765] Call Trace:
[ 3581.184008]  [<ffffffff8109be73>] ? lock_release_holdtime+0xa3/0xab
[ 3581.184392]  [<ffffffff8108cc0d>] ? prepare_to_wait+0x6c/0x79
[ 3581.184756]  [<ffffffff8108cc0d>] ? prepare_to_wait+0x6c/0x79
[ 3581.185120]  [<ffffffff812ed520>] xfs_ioend_wait+0x87/0x9f
[ 3581.185474]  [<ffffffff8108c97a>] ? wake_up_bit+0x2a/0x2a
[ 3581.185827]  [<ffffffff812f742a>] xfs_sync_inode_data+0x92/0x9d
[ 3581.186198]  [<ffffffff812f76e2>] xfs_inode_ag_walk+0x1a5/0x287
[ 3581.186569]  [<ffffffff812f779b>] ? xfs_inode_ag_walk+0x25e/0x287
[ 3581.186946]  [<ffffffff812f7398>] ? xfs_sync_worker+0x69/0x69
[ 3581.187311]  [<ffffffff812e2354>] ? xfs_perag_get+0x68/0xd0
[ 3581.187669]  [<ffffffff81092175>] ? local_clock+0x41/0x5a
[ 3581.188020]  [<ffffffff8109be73>] ? lock_release_holdtime+0xa3/0xab
[ 3581.188403]  [<ffffffff812e22ec>] ? xfs_check_sizes+0x160/0x160
[ 3581.188773]  [<ffffffff812e2354>] ? xfs_perag_get+0x68/0xd0
[ 3581.189130]  [<ffffffff812e236c>] ? xfs_perag_get+0x80/0xd0
[ 3581.189488]  [<ffffffff812e22ec>] ? xfs_check_sizes+0x160/0x160
[ 3581.189858]  [<ffffffff812f7831>] ? xfs_inode_ag_iterator+0x6d/0x8f
[ 3581.190241]  [<ffffffff812f7398>] ? xfs_sync_worker+0x69/0x69
[ 3581.190606]  [<ffffffff812f780b>] xfs_inode_ag_iterator+0x47/0x8f
[ 3581.190982]  [<ffffffff811611f5>] ? __sync_filesystem+0x7a/0x7a
[ 3581.191352]  [<ffffffff812f7877>] xfs_sync_data+0x24/0x43
[ 3581.191703]  [<ffffffff812f7911>] xfs_quiesce_data+0x2c/0x88
[ 3581.192065]  [<ffffffff812f5556>] xfs_fs_sync_fs+0x21/0x48
[ 3581.192419]  [<ffffffff811611e1>] __sync_filesystem+0x66/0x7a
[ 3581.192783]  [<ffffffff8116120b>] sync_one_sb+0x16/0x18
[ 3581.193128]  [<ffffffff8113e3e3>] iterate_supers+0x72/0xce
[ 3581.193482]  [<ffffffff81161140>] sync_filesystems+0x20/0x22
[ 3581.193842]  [<ffffffff8116127e>] sys_sync+0x21/0x33
[ 3581.194177]  [<ffffffff819016c2>] system_call_fastpath+0x16/0x1b

I just reconfirmed the problem on 3.0-rc2 with and without the livelock
fix patches (writeback fixes and cleanups v5 at http://lkml.org/lkml/2011/6/7/569),
and found that the situation has improved for XFS. It still has a much
longer sync time than ext4, but no longer stays stuck until dd exits.

root@fat /home/wfg# ./sync-livelock.sh

3.0-rc2, xfs:

        sync time: 20
        sync time: 26
        sync time: 27

3.0-rc2, ext4:
        sync time: 4
        sync time: 4
        sync time: 3

3.0-rc2 with livelock fix patches, xfs:
        sync time: 18
        sync time: 21
        sync time: 14
        sync time: 20
        sync time: 21

Thanks,
Fengguang
---
$ cat ./sync-livelock.sh
#!/bin/sh

umount /dev/sda7
mkfs.xfs -f /dev/sda7
# mkfs.ext4 /dev/sda7
# mkfs.btrfs /dev/sda7
mount /dev/sda7 /fs

# cap dirty memory at 50MB so writeback is continuously active
echo $((50<<20)) > /proc/sys/vm/dirty_bytes

# start 10 background writers that keep dirtying pages
pid=
for i in `seq 10`
do
        dd if=/dev/zero of=/fs/zero-$i bs=1M count=1000 &
        pid="$pid $!"
done

sleep 1

# time a global sync while the writers are still running
tic=$(date +'%s')
sync
tac=$(date +'%s')

echo
echo sync time: $((tac-tic))
egrep '(Dirty|Writeback|NFS_Unstable)' /proc/meminfo

# if any dd is still alive, sync returned while writers were active,
# i.e. it did not livelock
pidof dd > /dev/null && { kill -9 $pid; echo sync NOT livelocked; }


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09  9:09           ` Wu Fengguang
@ 2011-06-09 11:02             ` Christoph Hellwig
  2011-06-09 12:11               ` Wu Fengguang
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2011-06-09 11:02 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Christoph Hellwig, Dave Chinner, Andrew Morton, Jan Kara,
	linux-fsdevel, bugzilla-daemon, daaugusto, kernel-bugzilla,
	listposter, justincase, clopez

On Thu, Jun 09, 2011 at 05:09:06PM +0800, Wu Fengguang wrote:
> I have a sync livelock test script and it sometimes livelocked on XFS
> even with the livelock fix patches. Ext4 is always OK.

This sounds similar to the cfq issue just posted to lkml as
"CFQ: async queue blocks the whole system".

Does this happen with non-CFQ I/O schedulers, too?

> [ 3581.185120]  [<ffffffff812ed520>] xfs_ioend_wait+0x87/0x9f

This waits for the I/O completion to actually arrive - something that
XFS does correctly in both sync and fsync, but ext4 only does for fsync.
It might have some issues in the way it's implemented; I'll look at
whether we can do something.  But I suspect cfq delaying async writes
too much is definitely going to cause issues for us here.



* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 11:02             ` Christoph Hellwig
@ 2011-06-09 12:11               ` Wu Fengguang
  2011-06-09 12:17                 ` Wu Fengguang
                                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Wu Fengguang @ 2011-06-09 12:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Chinner, Andrew Morton, Jan Kara, linux-fsdevel,
	bugzilla-daemon, daaugusto, kernel-bugzilla, listposter,
	justincase, clopez, Tao Ma, Jens Axboe, Vivek Goyal

On Thu, Jun 09, 2011 at 07:02:14PM +0800, Christoph Hellwig wrote:
> On Thu, Jun 09, 2011 at 05:09:06PM +0800, Wu Fengguang wrote:
> > I have a sync livelock test script and it sometimes livelocked on XFS
> > even with the livelock fix patches. Ext4 is always OK.
> 
> This sounds similar to the cfq issue just posted to lkml as
> "CFQ: async queue blocks the whole system".

I once ran two dd's doing sequential reads and writes in parallel, and
found the write dd to be completely blocked (note: I can no longer
reproduce it today, on 3.0-rc2). At the time I thought: "Wow, this is
good for a typical desktop". But yes, it is livelocking async IOs,
which is bad.

> Does this happen with non-CFQ I/O schedulers, too?

Just tried the deadline scheduler, sync times are still long:

echo deadline > /sys/block/sda/queue/scheduler

sync time: 21
sync time: 22
sync time: 29

Also tried disabling the cfq low latency feature,

echo cfq > /sys/block/sda/queue/scheduler
echo 0   > /sys/block/sda/queue/iosched/low_latency

However, the low_latency value seems to have no noticeable effect on
the sync time (and also doesn't considerably improve async dd write
bandwidth in the presence of a parallel dd read).

sync time: 19
sync time: 24
sync time: 22

> > [ 3581.185120]  [<ffffffff812ed520>] xfs_ioend_wait+0x87/0x9f
> 
> This waits for the I/O completion to actually arrive - something that
> XFS does correctly in both sync and fsync, but ext4 only does for fsync.

Would it help to flush the disk _once_ at the end of sync?
(Perhaps it's not as easy in complex storage setups.)
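
Roughly this idea, as an untested sketch (the helper is hypothetical;
error handling and the multi-device case are glossed over, which is
exactly the caveat above):

        /*
         * Instead of waiting per inode, issue a single cache flush for
         * the filesystem's backing device at the very end of sync.
         * s_bdev is NULL for multi-device filesystems.
         */
        static int sync_flush_device_once(struct super_block *sb)
        {
                if (!sb->s_bdev)
                        return 0;
                return blkdev_issue_flush(sb->s_bdev, GFP_KERNEL, NULL);
        }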

> It might have some issues in the way it's implemented; I'll look at
> whether we can do something.  But I suspect cfq delaying async writes
> too much is definitely going to cause issues for us here.

It's definitely a problem that cfq delays async writes too much.
However in Carlos's report,

        https://bugzilla.kernel.org/attachment.cgi?id=61222

there are no sync(1) or fsync running at all. So it may be indicating
a different problem.

Thanks,
Fengguang


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 12:11               ` Wu Fengguang
@ 2011-06-09 12:17                 ` Wu Fengguang
  2011-06-09 12:17                 ` Christoph Hellwig
  2011-06-09 14:12                 ` Tao Ma
  2 siblings, 0 replies; 18+ messages in thread
From: Wu Fengguang @ 2011-06-09 12:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Chinner, Andrew Morton, Jan Kara, linux-fsdevel,
	bugzilla-daemon, daaugusto, kernel-bugzilla, listposter,
	justincase, clopez, Tao Ma, Jens Axboe, Vivek Goyal

> > It might have some issues in the way it's implemented; I'll look at
> > whether we can do something.  But I suspect cfq delaying async writes
> > too much is definitely going to cause issues for us here.
> 
> It's definitely a problem that cfq delays async writes too much.
> However in Carlos's report,
> 
>         https://bugzilla.kernel.org/attachment.cgi?id=61222
> 
> there are no sync(1) or fsync running at all. So it may be indicating
> a different problem.

Sorry, your assumption is still valid: there are several bash processes
doing xfs_read()s, which will generate the sync IO required to livelock
async writes.
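
(The pattern is easy to reproduce, roughly like this; the file names
are made up:

        dd if=/dev/zero of=/fs/zero bs=1M count=1000 &  # async writes
        dd if=/fs/bigfile of=/dev/null bs=64k           # steady sync reads

and then watch the writer's progress stall while the reads run.)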

Thanks,
Fengguang


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 12:11               ` Wu Fengguang
  2011-06-09 12:17                 ` Wu Fengguang
@ 2011-06-09 12:17                 ` Christoph Hellwig
  2011-06-09 12:43                   ` Wu Fengguang
  2011-06-09 14:12                 ` Tao Ma
  2 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2011-06-09 12:17 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Christoph Hellwig, Dave Chinner, Andrew Morton, Jan Kara,
	linux-fsdevel, bugzilla-daemon, daaugusto, kernel-bugzilla,
	listposter, justincase, clopez, Tao Ma, Jens Axboe, Vivek Goyal

On Thu, Jun 09, 2011 at 08:11:17PM +0800, Wu Fengguang wrote:
> > This waits for the I/O completion to actually arrive - something that
> > XFS does correctly in both sync and fsync, but ext4 only does for fsync.
> 
> Would it help to flush the disk _once_ at the end of sync?
> (Perhaps it's not as easy in complex storage setups.)

XFS currently does its own passes over data and metadata in sync. To
some extent they are overkill, but things like the ioend wait that's
prominent in your trace are actually needed to catch all in-flight
direct I/O.

> It's definitely a problem that cfq delays async writes too much.
> However in Carlos's report,
> 
>         https://bugzilla.kernel.org/attachment.cgi?id=61222
> 
> there are no sync(1) or fsync running at all. So it may be indicating
> a different problem.

If you have a heavy read load we still need to write back timestamps,
and it kinda sounds like he's hitting that.  No guarantee that's
actually the cause, though.
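
(A simple check, assuming sysstat's iostat is available: run

        iostat -x 1

while the supposedly read-only workload runs; any write traffic that
shows up on the device can only come from timestamp/metadata updates
or the journal.)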



* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 12:17                 ` Christoph Hellwig
@ 2011-06-09 12:43                   ` Wu Fengguang
  2011-06-09 13:23                     ` Christoph Hellwig
  2011-06-09 13:56                     ` Tao Ma
  0 siblings, 2 replies; 18+ messages in thread
From: Wu Fengguang @ 2011-06-09 12:43 UTC (permalink / raw)
  To: Christoph Hellwig, Carlos Alberto Lopez Perez
  Cc: Dave Chinner, Andrew Morton, Jan Kara, linux-fsdevel,
	bugzilla-daemon, daaugusto, kernel-bugzilla, listposter,
	justincase, Tao Ma, Jens Axboe, Vivek Goyal

> > It's definitely a problem that cfq delays async writes too much.
> > However in Carlos's report,
> > 
> >         https://bugzilla.kernel.org/attachment.cgi?id=61222
> > 
> > there are no sync(1) or fsync running at all. So it may be indicating
> > a different problem.
> 
> If you have a heavy read load we still need to write back timestamps,
> and it kinda sounds like he's hitting that.  No guarantee that's
> actually the cause, though.

Right, in the other email I do find the reads.

So Carlos, it's worthwhile to try the deadline scheduler, which will
at least improve read/write workloads (likely the one you are running)
that don't involve sync/fsync.
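
For example (sdX stands for the disk backing the filesystem; the
setting is per device and does not persist across reboots):

        echo deadline > /sys/block/sdX/queue/scheduler

Booting with elevator=deadline makes it the default instead.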

Thanks,
Fengguang


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 12:43                   ` Wu Fengguang
@ 2011-06-09 13:23                     ` Christoph Hellwig
  2011-06-10  3:21                       ` Wu Fengguang
  2011-06-09 13:56                     ` Tao Ma
  1 sibling, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2011-06-09 13:23 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Christoph Hellwig, Carlos Alberto Lopez Perez, Dave Chinner,
	Andrew Morton, Jan Kara, linux-fsdevel, bugzilla-daemon,
	daaugusto, kernel-bugzilla, listposter, justincase, Tao Ma,
	Jens Axboe, Vivek Goyal

Here is a patch that removes the XFS-internal writing of data in sync
and instead trusts the VFS code to get it right, and also does the
ioend_wait under a lock, to protect against new pending I/Os showing up.

It passed a single xfsqa run so far, so at least it won't blow up
completely.

Can you check if that helps with your livelock test case?  (And yes, I
should go and try it myself.  Will do as soon as I get a bit more
time).


Index: xfs/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_super.c	2011-06-09 14:17:06.952532698 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_super.c	2011-06-09 14:17:36.742531190 +0200
@@ -991,7 +991,7 @@ xfs_fs_write_inode(
 		 * ->sync_fs call do that for thus, which reduces the number
 		 * of synchronous log foces dramatically.
 		 */
-		xfs_ioend_wait(ip);
+//		xfs_ioend_wait(ip);
 		xfs_ilock(ip, XFS_ILOCK_SHARED);
 		if (ip->i_update_core) {
 			error = xfs_log_inode(ip);
Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2011-06-09 14:16:59.575866405 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2011-06-09 14:22:31.675849591 +0200
@@ -215,6 +215,19 @@ xfs_inode_ag_iterator(
 }
 
 STATIC int
+xfs_wait_ioend_cb(
+	struct xfs_inode	*ip,
+	struct xfs_perag	*pag,
+	int			flags)
+{
+	xfs_ilock(ip, XFS_IOLOCK_SHARED);
+	xfs_ioend_wait(ip);
+	xfs_iunlock(ip, XFS_IOLOCK_SHARED);
+
+	return 0;
+}
+
+STATIC int
 xfs_sync_inode_data(
 	struct xfs_inode	*ip,
 	struct xfs_perag	*pag,
@@ -359,14 +372,14 @@ xfs_quiesce_data(
 {
 	int			error, error2 = 0;
 
-	/* push non-blocking */
-	xfs_sync_data(mp, 0);
-	xfs_qm_sync(mp, SYNC_TRYLOCK);
-
-	/* push and block till complete */
-	xfs_sync_data(mp, SYNC_WAIT);
 	xfs_qm_sync(mp, SYNC_WAIT);
 
+	error = xfs_inode_ag_iterator(mp, xfs_wait_ioend_cb, 0);
+	if (error)
+		return XFS_ERROR(error);
+
+	xfs_log_force(mp, XFS_LOG_SYNC);
+
 	/* write superblock and hoover up shutdown errors */
 	error = xfs_sync_fsdata(mp);
 


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 12:43                   ` Wu Fengguang
  2011-06-09 13:23                     ` Christoph Hellwig
@ 2011-06-09 13:56                     ` Tao Ma
  1 sibling, 0 replies; 18+ messages in thread
From: Tao Ma @ 2011-06-09 13:56 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Christoph Hellwig, Carlos Alberto Lopez Perez, Dave Chinner,
	Andrew Morton, Jan Kara, linux-fsdevel, bugzilla-daemon,
	daaugusto, kernel-bugzilla, listposter, justincase, Jens Axboe,
	Vivek Goyal

On 06/09/2011 08:43 PM, Wu Fengguang wrote:
>>> It's definitely a problem that cfq delays async writes too much.
>>> However in Carlos's report,
>>>
>>>         https://bugzilla.kernel.org/attachment.cgi?id=61222
>>>
>>> there are no sync(1) or fsync running at all. So it may be indicating
>>> a different problem.
>>
>> If you have a heavy read load we still need to write back timestamps,
>> and it kinda sounds like he's hitting that.  No guarantee that's
>> actually the cause, though.
> 
> Right, in the other email I do find the reads.
> 
> So Carlos, it's worthwhile to try the deadline scheduler, which will
> at least improve read/write workloads (likely the one you are running)
> that don't involve sync/fsync.
Yes, the deadline scheduler surely should help in these cases.

Regards,
Tao


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 12:11               ` Wu Fengguang
  2011-06-09 12:17                 ` Wu Fengguang
  2011-06-09 12:17                 ` Christoph Hellwig
@ 2011-06-09 14:12                 ` Tao Ma
  2011-06-09 14:21                   ` Christoph Hellwig
  2 siblings, 1 reply; 18+ messages in thread
From: Tao Ma @ 2011-06-09 14:12 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Christoph Hellwig, Dave Chinner, Andrew Morton, Jan Kara,
	linux-fsdevel, bugzilla-daemon, daaugusto, kernel-bugzilla,
	listposter, justincase, clopez, Jens Axboe, Vivek Goyal

Hi Fengguang and Christoph,
On 06/09/2011 08:11 PM, Wu Fengguang wrote:
> On Thu, Jun 09, 2011 at 07:02:14PM +0800, Christoph Hellwig wrote:
>> On Thu, Jun 09, 2011 at 05:09:06PM +0800, Wu Fengguang wrote:
>>> I have a sync livelock test script and it sometimes livelocked on XFS
>>> even with the livelock fix patches. Ext4 is always OK.
>>
>> This sounds similar to the cfq issue just posted to lkml as
>> "CFQ: async queue blocks the whole system".
Just want to say more about the situation here. Actually the flusher
is far too easily blocked by the sync requests, and whenever it is
blocked it takes quite a long time to get back (because of several cfq
design decisions). So do you think we can use WRITE_SYNC for the bdev
inodes in the flusher? AFAICS, in most of the cases when a volume is
mounted, the writeback for a bdev inode means metadata writeback. That
metadata is very important for a file system and should be written out
as soon as possible. I ran my test cases with the change, and now the
livelock doesn't show up anymore.
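
Roughly, the change I tested looks like this (an illustrative sketch,
not the exact patch; the helper name is made up):

        /*
         * Pages of a bdev inode are metadata whenever a filesystem is
         * mounted on the device, so submit them as WRITE_SYNC instead
         * of letting cfq park them in the starved async queue.
         */
        static int bdev_write_op(struct inode *inode,
                                 struct writeback_control *wbc)
        {
                if (wbc->sync_mode == WB_SYNC_ALL)
                        return WRITE_SYNC;
                if (S_ISBLK(inode->i_mode))     /* bdev inode */
                        return WRITE_SYNC;
                return WRITE;
        }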

Regards,
Tao
> 
> I once ran two dd's doing sequential reads and writes in parallel, and
> found the write dd to be completely blocked (note: I can no longer
> reproduce it today, on 3.0-rc2). At the time I thought: "Wow, this is
> good for a typical desktop". But yes, it is livelocking async IOs,
> which is bad.
> 
>> Does this happen with non-CFQ I/O schedulers, too?
> 
> Just tried the deadline scheduler, sync times are still long:
> 
> echo deadline > /sys/block/sda/queue/scheduler
> 
> sync time: 21
> sync time: 22
> sync time: 29
> 
> Also tried disabling the cfq low latency feature,
> 
> echo cfq > /sys/block/sda/queue/scheduler
> echo 0   > /sys/block/sda/queue/iosched/low_latency
> 
> However, the low_latency value seems to have no noticeable effect on
> the sync time (and also doesn't considerably improve async dd write
> bandwidth in the presence of a parallel dd read).
> 
> sync time: 19
> sync time: 24
> sync time: 22
> 
>>> [ 3581.185120]  [<ffffffff812ed520>] xfs_ioend_wait+0x87/0x9f
>>
>> This waits for the I/O completion to actually arrive - something that
>> XFS does correctly in both sync and fsync, but ext4 only does for fsync.
> 
> Would it help to flush the disk _once_ at the end of sync?
> (Perhaps it's not as easy in complex storage setups.)
> 
>> It might have some issues in the way it's implemented; I'll look at
>> whether we can do something.  But I suspect cfq delaying async writes
>> too much is definitely going to cause issues for us here.
> 
> It's definitely a problem that cfq delays async writes too much.
> However in Carlos's report,
> 
>         https://bugzilla.kernel.org/attachment.cgi?id=61222
> 
> there are no sync(1) or fsync running at all. So it may be indicating
> a different problem.
> 
> Thanks,
> Fengguang



* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 14:12                 ` Tao Ma
@ 2011-06-09 14:21                   ` Christoph Hellwig
  2011-06-09 14:32                     ` Vivek Goyal
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2011-06-09 14:21 UTC (permalink / raw)
  To: Tao Ma
  Cc: Wu Fengguang, Christoph Hellwig, Dave Chinner, Andrew Morton,
	Jan Kara, linux-fsdevel, bugzilla-daemon, daaugusto,
	kernel-bugzilla, listposter, justincase, clopez, Jens Axboe,
	Vivek Goyal

On Thu, Jun 09, 2011 at 10:12:38PM +0800, Tao Ma wrote:
> Just want to say more about the situation here. Actually the flusher
> is far too easily blocked by the sync requests, and whenever it is
> blocked it takes quite a long time to get back (because of several cfq
> design decisions). So do you think we can use WRITE_SYNC for the bdev
> inodes in the flusher? AFAICS, in most of the cases when a volume is
> mounted, the writeback for a bdev inode means metadata writeback. That
> metadata is very important for a file system and should be written out
> as soon as possible. I ran my test cases with the change, and now the
> livelock doesn't show up anymore.

It's not a very good guestimate for metadata.  A lot of metadata is
either stored in directories (often happens for directories) or doesn't
use the pagecache writeback functions at all.

The major problem here seems to be that async requests simply starve
sync requests far too much.



* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 14:21                   ` Christoph Hellwig
@ 2011-06-09 14:32                     ` Vivek Goyal
  2011-06-09 14:51                       ` Tao Ma
  0 siblings, 1 reply; 18+ messages in thread
From: Vivek Goyal @ 2011-06-09 14:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tao Ma, Wu Fengguang, Dave Chinner, Andrew Morton, Jan Kara,
	linux-fsdevel, bugzilla-daemon, daaugusto, kernel-bugzilla,
	listposter, justincase, clopez, Jens Axboe, Shaohua Li

On Thu, Jun 09, 2011 at 10:21:33AM -0400, Christoph Hellwig wrote:
> On Thu, Jun 09, 2011 at 10:12:38PM +0800, Tao Ma wrote:
> > Just want to say more about the situation here. Actually the flusher
> > is far too easily blocked by the sync requests, and whenever it is
> > blocked it takes quite a long time to get back (because of several cfq
> > design decisions). So do you think we can use WRITE_SYNC for the bdev
> > inodes in the flusher? AFAICS, in most of the cases when a volume is
> > mounted, the writeback for a bdev inode means metadata writeback. That
> > metadata is very important for a file system and should be written out
> > as soon as possible. I ran my test cases with the change, and now the
> > livelock doesn't show up anymore.
> 
> It's not a very good guestimate for metadata.  A lot of metadata is
> either stored in directories (often happens for directories) or doesn't
> use the pagecache writeback functions at all.
> 
> The major problem here seems to be that async requests simply starve
> sync requests far too much.

You mean sync requests starve async requests?

It is possible that CFQ can starve async requests for a long time in
the presence of sync requests. If that's the case, all the reported
issues should go away with the deadline scheduler.

As I mentioned in another mail, one commit loaded the dice very
heavily in favor of sync requests.

commit f8ae6e3eb8251be32c6e913393d9f8d9e0609489
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Fri Jan 14 08:41:02 2011 +0100

    block cfq: make queue preempt work for queues from different workload


I will do some quick tests and try to write a small patch where we
keep track of how many times the sync and async workloads have been
scheduled and make sure we don't starve async requests completely.
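
The rough shape of what I have in mind (pseudo-code only; all names
are made up, and the real logic would hook into cfq's workload
selection):

        #define MAX_SYNC_STREAK 16      /* arbitrary cap, to be tuned */

        /*
         * Prefer the sync workload, but after too many back-to-back
         * sync selections force one async selection so that async
         * requests cannot be starved indefinitely.
         */
        static int choose_sync_workload(unsigned int *sync_streak,
                                        int sync_pending, int async_pending)
        {
                if (sync_pending &&
                    (!async_pending || *sync_streak < MAX_SYNC_STREAK)) {
                        (*sync_streak)++;
                        return 1;       /* dispatch sync */
                }
                *sync_streak = 0;
                return 0;               /* dispatch async (or idle) */
        }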

Thanks
Vivek


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 14:32                     ` Vivek Goyal
@ 2011-06-09 14:51                       ` Tao Ma
  0 siblings, 0 replies; 18+ messages in thread
From: Tao Ma @ 2011-06-09 14:51 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Christoph Hellwig, Wu Fengguang, Dave Chinner, Andrew Morton,
	Jan Kara, linux-fsdevel, bugzilla-daemon, daaugusto,
	kernel-bugzilla, listposter, justincase, clopez, Jens Axboe,
	Shaohua Li

On 06/09/2011 10:32 PM, Vivek Goyal wrote:
> On Thu, Jun 09, 2011 at 10:21:33AM -0400, Christoph Hellwig wrote:
>> On Thu, Jun 09, 2011 at 10:12:38PM +0800, Tao Ma wrote:
>>> Just want to say more about the situation here. Actually the flusher
>>> is far too easily blocked by the sync requests, and whenever it is
>>> blocked it takes quite a long time to get back (because of several cfq
>>> design decisions). So do you think we can use WRITE_SYNC for the bdev
>>> inodes in the flusher? AFAICS, in most of the cases when a volume is
>>> mounted, the writeback for a bdev inode means metadata writeback. That
>>> metadata is very important for a file system and should be written out
>>> as soon as possible. I ran my test cases with the change, and now the
>>> livelock doesn't show up anymore.
>>
>> It's not a very good guestimate for metadata.  A lot of metadata is
>> either stored in directories (often happens for directories) or doesn't
>> use the pagecache writeback functions at all.
>>
>> The major problem here seems to be that async requests simply starve
>> sync requests far too much.
> 
> You mean sync requests starve async requests?
> 
> It is possible that CFQ can starve async requests for a long time in
> the presence of sync requests. If that's the case, all the reported
> issues should go away with the deadline scheduler.
> 
> As I mentioned in another mail, one commit loaded the dice very
> heavily in favor of sync requests.
> 
> commit f8ae6e3eb8251be32c6e913393d9f8d9e0609489
> Author: Shaohua Li <shaohua.li@intel.com>
> Date:   Fri Jan 14 08:41:02 2011 +0100
> 
>     block cfq: make queue preempt work for queues from different workload
> 
> 
> I will do some quick tests and try to write a small patch where we
> keep track of how many times the sync and async workloads have been
> scheduled and make sure we don't starve async requests completely.
Oh, so you mean the patch is the culprit? I will try reverting it to
see whether the system works better.
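
For example, on top of the affected tree:

        git revert f8ae6e3eb8251be32c6e913393d9f8d9e0609489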

Regards,
Tao


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-09 13:23                     ` Christoph Hellwig
@ 2011-06-10  3:21                       ` Wu Fengguang
  2011-06-19 15:56                         ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Wu Fengguang @ 2011-06-10  3:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Carlos Alberto Lopez Perez, Dave Chinner, Andrew Morton,
	Jan Kara, linux-fsdevel, bugzilla-daemon, daaugusto,
	kernel-bugzilla, listposter, justincase, Tao Ma, Jens Axboe,
	Vivek Goyal

On Thu, Jun 09, 2011 at 09:23:57PM +0800, Christoph Hellwig wrote:
> Here is a patch that removes the XFS-internal writing of data in sync
> and instead trusts the VFS code to get it right, and also does the
> ioend_wait under a lock, to protect against new pending I/Os showing up.
> 
> It passed a single xfsqa run so far, so at least it won't blow up
> completely.
> 
> Can you check if that helps with your livelock test case?  (And yes, I
> should go and try it myself.  Will do as soon as I get a bit more
> time).

My pleasure. It's easier to compare results on the same test bed and
I've made it very convenient to test patches :)

The patch does improve the sync time.

before patch                   after patch
--------------------------------------------------
sync time: 18                  sync time: 18
sync time: 21                  sync time: 15
sync time: 14                  sync time: 14
sync time: 20                  sync time: 16
sync time: 21                  sync time: 14
                               
avg        18.800              avg        15.400 
stddev      2.638              stddev      1.497 

I also did extra test runs and collected sync(1) call traces by doing
"echo t > /proc/sysrq-trigger" at random points.  The ones in XFS are:

[   78.395664] sync            D 0000000000000002  5296  3024   3003 0x00000000
[   78.396162]  ffff8800b4229de8 0000000000000046 ffffffff819252fa 0000000000000000
[   78.396850]  ffff8800b4228000 00000000001d1a40 00000000001d1a40 ffff8800aefa22f0
[   78.397519]  00000000001d1a40 ffff8800b4229fd8 00000000001d1a40 ffff8800b4229fd8
[   78.398189] Call Trace:
[   78.398431]  [<ffffffff819252fa>] ? schedule+0x374/0x992
[   78.398778]  [<ffffffff812e906f>] ? _xfs_log_force+0x1de/0x226
[   78.399143]  [<ffffffff81092be5>] ? local_clock+0x41/0x5a
[   78.399493]  [<ffffffff8109d329>] ? lock_release_holdtime+0xa3/0xab
[   78.399875]  [<ffffffff812e906f>] ? _xfs_log_force+0x1de/0x226
[   78.400239]  [<ffffffff812e9074>] _xfs_log_force+0x1e3/0x226
[   78.400599]  [<ffffffff81067c10>] ? try_to_wake_up+0x28b/0x28b
[   78.400964]  [<ffffffff813052f9>] ? xfs_inode_ag_iterator+0x6d/0x8f
[   78.401347]  [<ffffffff81164771>] ? __sync_filesystem+0x7a/0x7a
[   78.401715]  [<ffffffff812e90cf>] xfs_log_force+0x18/0x3a
[   78.402064]  [<ffffffff813053e8>] xfs_quiesce_data+0x3a/0x96
[   78.402424]  [<ffffffff81302f82>] xfs_fs_sync_fs+0x21/0x48
[   78.402777]  [<ffffffff8116475d>] __sync_filesystem+0x66/0x7a
[   78.403138]  [<ffffffff81164787>] sync_one_sb+0x16/0x18
[   78.403482]  [<ffffffff81142d37>] iterate_supers+0x72/0xc8
[   78.403835]  [<ffffffff811646bc>] sync_filesystems+0x20/0x22
[   78.404198]  [<ffffffff811647fa>] sys_sync+0x21/0x33
[   78.404533]  [<ffffffff8192f742>] system_call_fastpath+0x16/0x1b

[  221.106751] sync            D 0000000000000007  5296  3136   3116 0x00000000
[  221.107229]  ffff8800aee7bb40 0000000000000046 ffffffff819252fa 0000000000000000
[  221.107903]  ffff8800aee7a000 00000000001d1a40 00000000001d1a40 ffff88009c6a45e0
[  221.108574]  00000000001d1a40 ffff8800aee7bfd8 00000000001d1a40 ffff8800aee7bfd8
[  221.109246] Call Trace:
[  221.109486]  [<ffffffff819252fa>] ? schedule+0x374/0x992
[  221.109831]  [<ffffffff81928012>] ? rwsem_down_failed_common+0xc7/0x103
[  221.110225]  [<ffffffff81092be5>] ? local_clock+0x41/0x5a
[  221.110574]  [<ffffffff8109d329>] ? lock_release_holdtime+0xa3/0xab
[  221.110956]  [<ffffffff81928012>] ? rwsem_down_failed_common+0xc7/0x103
[  221.111352]  [<ffffffff8192801e>] rwsem_down_failed_common+0xd3/0x103
[  221.111738]  [<ffffffff81928075>] rwsem_down_read_failed+0x12/0x14
[  221.112116]  [<ffffffff813c81a4>] call_rwsem_down_read_failed+0x14/0x30
[  221.112513]  [<ffffffff812de648>] ? xfs_ilock+0x4d/0xba
[  221.112855]  [<ffffffff812de62b>] ? xfs_ilock+0x30/0xba
[  221.113199]  [<ffffffff8109153b>] ? down_read_nested+0x74/0x89
[  221.113563]  [<ffffffff812de648>] ? xfs_ilock+0x4d/0xba
[  221.113904]  [<ffffffff812de648>] xfs_ilock+0x4d/0xba
[  221.114243]  [<ffffffff81304ee0>] xfs_wait_ioend_cb+0x1b/0x36
[  221.114604]  [<ffffffff813051a9>] xfs_inode_ag_walk+0x1a5/0x288
[  221.114973]  [<ffffffff81305263>] ? xfs_inode_ag_walk+0x25f/0x288
[  221.115348]  [<ffffffff81304ec5>] ? xfs_sync_inode_data+0x9d/0x9d
[  221.115722]  [<ffffffff8109d329>] ? lock_release_holdtime+0xa3/0xab
[  221.116103]  [<ffffffff812efb94>] ? xfs_check_sizes+0x160/0x160
[  221.116473]  [<ffffffff812efbfc>] ? xfs_perag_get+0x68/0xd1
[  221.116828]  [<ffffffff812efc14>] ? xfs_perag_get+0x80/0xd1
[  221.117185]  [<ffffffff812efb94>] ? xfs_check_sizes+0x160/0x160
[  221.117553]  [<ffffffff813052f9>] ? xfs_inode_ag_iterator+0x6d/0x8f
[  221.117936]  [<ffffffff81304ec5>] ? xfs_sync_inode_data+0x9d/0x9d
[  221.118312]  [<ffffffff813052d3>] xfs_inode_ag_iterator+0x47/0x8f
[  221.118689]  [<ffffffff81164771>] ? __sync_filesystem+0x7a/0x7a
[  221.119058]  [<ffffffff813053d1>] xfs_quiesce_data+0x23/0x96
[  221.119417]  [<ffffffff81302f82>] xfs_fs_sync_fs+0x21/0x48
[  221.119771]  [<ffffffff8116475d>] __sync_filesystem+0x66/0x7a
[  221.120134]  [<ffffffff81164787>] sync_one_sb+0x16/0x18
[  221.120477]  [<ffffffff81142d37>] iterate_supers+0x72/0xc8
[  221.120828]  [<ffffffff811646bc>] sync_filesystems+0x20/0x22
[  221.121188]  [<ffffffff811647fa>] sys_sync+0x21/0x33

[  349.972430] sync            D 0000000000000007  5296  3312   3292 0x00000000
[  349.972907]  ffff88009c77fc08 0000000000000046 ffffffff819252fa 0000000100000070
[  349.973575]  ffff88009c77e000 00000000001d1a40 00000000001d1a40 ffff8800b56b0000
[  349.974250]  00000000001d1a40 ffff88009c77ffd8 00000000001d1a40 ffff88009c77ffd8
[  349.974921] Call Trace:
[  349.975159]  [<ffffffff819252fa>] ? schedule+0x374/0x992
[  349.975504]  [<ffffffff8109d329>] ? lock_release_holdtime+0xa3/0xab
[  349.975880]  [<ffffffff8108d5c5>] ? prepare_to_wait+0x6c/0x79
[  349.976238]  [<ffffffff812fade3>] xfs_ioend_wait+0x87/0x9f
[  349.976588]  [<ffffffff8108d332>] ? wake_up_bit+0x2a/0x2a
[  349.976932]  [<ffffffff812de648>] ? xfs_ilock+0x4d/0xba
[  349.977272]  [<ffffffff81304ee8>] xfs_wait_ioend_cb+0x23/0x36
[  349.977631]  [<ffffffff813051a9>] xfs_inode_ag_walk+0x1a5/0x288
[  349.977997]  [<ffffffff81305263>] ? xfs_inode_ag_walk+0x25f/0x288
[  349.978371]  [<ffffffff81304ec5>] ? xfs_sync_inode_data+0x9d/0x9d
[  349.978748]  [<ffffffff8109d329>] ? lock_release_holdtime+0xa3/0xab
[  349.979129]  [<ffffffff812efb94>] ? xfs_check_sizes+0x160/0x160
[  349.979499]  [<ffffffff812efbfc>] ? xfs_perag_get+0x68/0xd1
[  349.979856]  [<ffffffff812efc14>] ? xfs_perag_get+0x80/0xd1
[  349.980211]  [<ffffffff812efb94>] ? xfs_check_sizes+0x160/0x160
[  349.980582]  [<ffffffff813052f9>] ? xfs_inode_ag_iterator+0x6d/0x8f
[  349.980964]  [<ffffffff81304ec5>] ? xfs_sync_inode_data+0x9d/0x9d
[  349.981338]  [<ffffffff813052d3>] xfs_inode_ag_iterator+0x47/0x8f
[  349.981717]  [<ffffffff81164771>] ? __sync_filesystem+0x7a/0x7a
[  349.982085]  [<ffffffff813053d1>] xfs_quiesce_data+0x23/0x96
[  349.982443]  [<ffffffff81302f82>] xfs_fs_sync_fs+0x21/0x48
[  349.982798]  [<ffffffff8116475d>] __sync_filesystem+0x66/0x7a
[  349.983159]  [<ffffffff81164787>] sync_one_sb+0x16/0x18
[  349.983504]  [<ffffffff81142d37>] iterate_supers+0x72/0xc8
[  349.983857]  [<ffffffff811646bc>] sync_filesystems+0x20/0x22
[  349.984215]  [<ffffffff811647fa>] sys_sync+0x21/0x33
[  349.984552]  [<ffffffff8192f742>] system_call_fastpath+0x16/0x1b


[  419.417496] sync            D 0000000000000006  5296  3412   3392 0x00000000
[  419.417972]  ffff8800aef6dc08 0000000000000046 ffffffff819252fa 0000000100000070
[  419.418645]  ffff8800aef6c000 00000000001d1a40 00000000001d1a40 ffff8800b704a2f0
[  419.419317]  00000000001d1a40 ffff8800aef6dfd8 00000000001d1a40 ffff8800aef6dfd8
[  419.419987] Call Trace:
[  419.420228]  [<ffffffff819252fa>] ? schedule+0x374/0x992
[  419.420576]  [<ffffffff8109d329>] ? lock_release_holdtime+0xa3/0xab
[  419.420957]  [<ffffffff8108d5c5>] ? prepare_to_wait+0x6c/0x79
[  419.421321]  [<ffffffff812fade3>] xfs_ioend_wait+0x87/0x9f
[  419.421673]  [<ffffffff8108d332>] ? wake_up_bit+0x2a/0x2a
[  419.422020]  [<ffffffff812de648>] ? xfs_ilock+0x4d/0xba
[  419.422364]  [<ffffffff81304ee8>] xfs_wait_ioend_cb+0x23/0x36
[  419.422724]  [<ffffffff813051a9>] xfs_inode_ag_walk+0x1a5/0x288
[  419.423093]  [<ffffffff81305263>] ? xfs_inode_ag_walk+0x25f/0x288
[  419.423470]  [<ffffffff81304ec5>] ? xfs_sync_inode_data+0x9d/0x9d
[  419.423844]  [<ffffffff8109d329>] ? lock_release_holdtime+0xa3/0xab
[  419.424226]  [<ffffffff812efb94>] ? xfs_check_sizes+0x160/0x160
[  419.424596]  [<ffffffff812efbfc>] ? xfs_perag_get+0x68/0xd1
[  419.424951]  [<ffffffff812efc14>] ? xfs_perag_get+0x80/0xd1
[  419.425307]  [<ffffffff812efb94>] ? xfs_check_sizes+0x160/0x160
[  419.425676]  [<ffffffff813052f9>] ? xfs_inode_ag_iterator+0x6d/0x8f
[  419.426058]  [<ffffffff81304ec5>] ? xfs_sync_inode_data+0x9d/0x9d
[  419.426435]  [<ffffffff813052d3>] xfs_inode_ag_iterator+0x47/0x8f
[  419.426812]  [<ffffffff81164771>] ? __sync_filesystem+0x7a/0x7a
[  419.427180]  [<ffffffff813053d1>] xfs_quiesce_data+0x23/0x96
[  419.427542]  [<ffffffff81302f82>] xfs_fs_sync_fs+0x21/0x48
[  419.427895]  [<ffffffff8116475d>] __sync_filesystem+0x66/0x7a
[  419.428258]  [<ffffffff81164787>] sync_one_sb+0x16/0x18
[  419.428601]  [<ffffffff81142d37>] iterate_supers+0x72/0xc8
[  419.428954]  [<ffffffff811646bc>] sync_filesystems+0x20/0x22
[  419.429315]  [<ffffffff811647fa>] sys_sync+0x21/0x33
[  419.429648]  [<ffffffff8192f742>] system_call_fastpath+0x16/0x1b

Thanks,
Fengguang


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-10  3:21                       ` Wu Fengguang
@ 2011-06-19 15:56                         ` Christoph Hellwig
  2011-06-19 16:33                           ` Wu Fengguang
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2011-06-19 15:56 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Christoph Hellwig, Carlos Alberto Lopez Perez, Dave Chinner,
	Andrew Morton, Jan Kara, linux-fsdevel, bugzilla-daemon,
	daaugusto, kernel-bugzilla, listposter, justincase, Tao Ma,
	Jens Axboe, Vivek Goyal

On Fri, Jun 10, 2011 at 11:21:49AM +0800, Wu Fengguang wrote:
> > Can you check if that helps with your livelock test case?  (And yes, I
> > should go and try it myself.  Will do as soon as I get a bit more
> > time).
> 
> My pleasure. It's easier to compare results on the same test bed and
> I've made it very convenient to test patches :)

Here's one that should be even better, although in its current form it
breaks a subtle corner case of sync semantics, so it'll need more work
if I go down that path:

Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2011-06-17 14:16:18.442399481 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2011-06-18 17:55:44.864025123 +0200
@@ -359,14 +359,16 @@ xfs_quiesce_data(
 {
 	int			error, error2 = 0;
 
-	/* push non-blocking */
-	xfs_sync_data(mp, 0);
 	xfs_qm_sync(mp, SYNC_TRYLOCK);
-
-	/* push and block till complete */
-	xfs_sync_data(mp, SYNC_WAIT);
 	xfs_qm_sync(mp, SYNC_WAIT);
 
+	/* flush all pending size updates and unwritten extent conversions */
+	flush_workqueue(xfsconvertd_workqueue);
+	flush_workqueue(xfsdatad_workqueue);
+
+	/* force out the newly dirtied log buffers */
+	xfs_log_force(mp, XFS_LOG_SYNC);
+
 	/* write superblock and hoover up shutdown errors */
 	error = xfs_sync_fsdata(mp);
 
Index: xfs/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_super.c	2011-06-18 17:51:05.660705925 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_super.c	2011-06-18 17:52:50.107367305 +0200
@@ -929,45 +929,12 @@ xfs_fs_write_inode(
 		 * ->sync_fs call do that for thus, which reduces the number
 		 * of synchronous log foces dramatically.
 		 */
-		xfs_ioend_wait(ip);
 		xfs_ilock(ip, XFS_ILOCK_SHARED);
-		if (ip->i_update_core) {
+		if (ip->i_update_core)
 			error = xfs_log_inode(ip);
-			if (error)
-				goto out_unlock;
-		}
-	} else {
-		/*
-		 * We make this non-blocking if the inode is contended, return
-		 * EAGAIN to indicate to the caller that they did not succeed.
-		 * This prevents the flush path from blocking on inodes inside
-		 * another operation right now, they get caught later by
-		 * xfs_sync.
-		 */
-		if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
-			goto out;
-
-		if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip))
-			goto out_unlock;
-
-		/*
-		 * Now we have the flush lock and the inode is not pinned, we
-		 * can check if the inode is really clean as we know that
-		 * there are no pending transaction completions, it is not
-		 * waiting on the delayed write queue and there is no IO in
-		 * progress.
-		 */
-		if (xfs_inode_clean(ip)) {
-			xfs_ifunlock(ip);
-			error = 0;
-			goto out_unlock;
-		}
-		error = xfs_iflush(ip, SYNC_TRYLOCK);
+		xfs_iunlock(ip, XFS_ILOCK_SHARED);
 	}
 
- out_unlock:
-	xfs_iunlock(ip, XFS_ILOCK_SHARED);
- out:
 	/*
 	 * if we failed to write out the inode then mark
 	 * it dirty again so we'll try again later.


* Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
  2011-06-19 15:56                         ` Christoph Hellwig
@ 2011-06-19 16:33                           ` Wu Fengguang
  0 siblings, 0 replies; 18+ messages in thread
From: Wu Fengguang @ 2011-06-19 16:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Carlos Alberto Lopez Perez, Dave Chinner, Andrew Morton,
	Jan Kara, linux-fsdevel, bugzilla-daemon, daaugusto,
	kernel-bugzilla, listposter, justincase, Tao Ma, Jens Axboe,
	Vivek Goyal

On Sun, Jun 19, 2011 at 11:56:46PM +0800, Christoph Hellwig wrote:
> On Fri, Jun 10, 2011 at 11:21:49AM +0800, Wu Fengguang wrote:
> > > Can you check if that helps with your livelock test case?  (And yes, I
> > > should go and try it myself.  Will do as soon as I get a bit more
> > > time).
> > 
> > My pleasure. It's easier to compare results on the same test bed and
> > I've made it very convenient to test patches :)
> 
> Here's one that should be even better, although in its current form it
> breaks a subtle corner case of sync semantics, so it'll need more work
> if I go down that path:

The sync(1)s are much faster now; in most invocations they are at the
same level as ext4 :)

root@fat /home/wfg# ./sync-livelock.sh

sync time: 3
sync time: 5
sync time: 5
sync time: 6
sync time: 15
sync time: 5
sync time: 5
sync time: 5
sync time: 4
sync time: 10
sync time: 4
sync time: 4

Thanks,
Fengguang

> Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
> ===================================================================
> --- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2011-06-17 14:16:18.442399481 +0200
> +++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2011-06-18 17:55:44.864025123 +0200
> @@ -359,14 +359,16 @@ xfs_quiesce_data(
>  {
>  	int			error, error2 = 0;
>  
> -	/* push non-blocking */
> -	xfs_sync_data(mp, 0);
>  	xfs_qm_sync(mp, SYNC_TRYLOCK);
> -
> -	/* push and block till complete */
> -	xfs_sync_data(mp, SYNC_WAIT);
>  	xfs_qm_sync(mp, SYNC_WAIT);
>  
> +	/* flush all pending size updates and unwritten extent conversions */
> +	flush_workqueue(xfsconvertd_workqueue);
> +	flush_workqueue(xfsdatad_workqueue);
> +
> +	/* force out the newly dirtied log buffers */
> +	xfs_log_force(mp, XFS_LOG_SYNC);
> +
>  	/* write superblock and hoover up shutdown errors */
>  	error = xfs_sync_fsdata(mp);
>  
> Index: xfs/fs/xfs/linux-2.6/xfs_super.c
> ===================================================================
> --- xfs.orig/fs/xfs/linux-2.6/xfs_super.c	2011-06-18 17:51:05.660705925 +0200
> +++ xfs/fs/xfs/linux-2.6/xfs_super.c	2011-06-18 17:52:50.107367305 +0200
> @@ -929,45 +929,12 @@ xfs_fs_write_inode(
>  		 * ->sync_fs call do that for thus, which reduces the number
>  		 * of synchronous log foces dramatically.
>  		 */
> -		xfs_ioend_wait(ip);
>  		xfs_ilock(ip, XFS_ILOCK_SHARED);
> -		if (ip->i_update_core) {
> +		if (ip->i_update_core)
>  			error = xfs_log_inode(ip);
> -			if (error)
> -				goto out_unlock;
> -		}
> -	} else {
> -		/*
> -		 * We make this non-blocking if the inode is contended, return
> -		 * EAGAIN to indicate to the caller that they did not succeed.
> -		 * This prevents the flush path from blocking on inodes inside
> -		 * another operation right now, they get caught later by
> -		 * xfs_sync.
> -		 */
> -		if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
> -			goto out;
> -
> -		if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip))
> -			goto out_unlock;
> -
> -		/*
> -		 * Now we have the flush lock and the inode is not pinned, we
> -		 * can check if the inode is really clean as we know that
> -		 * there are no pending transaction completions, it is not
> -		 * waiting on the delayed write queue and there is no IO in
> -		 * progress.
> -		 */
> -		if (xfs_inode_clean(ip)) {
> -			xfs_ifunlock(ip);
> -			error = 0;
> -			goto out_unlock;
> -		}
> -		error = xfs_iflush(ip, SYNC_TRYLOCK);
> +		xfs_iunlock(ip, XFS_ILOCK_SHARED);
>  	}
>  
> - out_unlock:
> -	xfs_iunlock(ip, XFS_ILOCK_SHARED);
> - out:
>  	/*
>  	 * if we failed to write out the inode then mark
>  	 * it dirty again so we'll try again later.

