linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] fs: include device name in error messages about freezing
@ 2014-05-13 18:31 Mateusz Guzik
  2014-05-13 18:31 ` [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems Mateusz Guzik
  0 siblings, 1 reply; 15+ messages in thread
From: Mateusz Guzik @ 2014-05-13 18:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Josef Bacik, Jan Kara, Al Viro, Eric Sandeen

While here use pr_err instead of printk(KERN_ERR...)

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Josef Bacik <jbacik@fb.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Sandeen <esandeen@redhat.com>

---
 fs/super.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 48377f7..017e10a 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1323,8 +1323,7 @@ int freeze_super(struct super_block *sb)
 	if (sb->s_op->freeze_fs) {
 		ret = sb->s_op->freeze_fs(sb);
 		if (ret) {
-			printk(KERN_ERR
-				"VFS:Filesystem freeze failed\n");
+			pr_err("VFS:Filesystem %s freeze failed\n", sb->s_id);
 			sb->s_writers.frozen = SB_UNFROZEN;
 			smp_wmb();
 			wake_up(&sb->s_writers.wait_unfrozen);
@@ -1364,8 +1363,7 @@ int thaw_super(struct super_block *sb)
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
 		if (error) {
-			printk(KERN_ERR
-				"VFS:Filesystem thaw failed\n");
+			pr_err("VFS:Filesystem %s thaw failed\n", sb->s_id);
 			up_write(&sb->s_umount);
 			return error;
 		}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-13 18:31 [PATCH 1/2] fs: include device name in error messages about freezing Mateusz Guzik
@ 2014-05-13 18:31 ` Mateusz Guzik
  2014-05-13 18:39   ` Joe Perches
  2014-05-14 21:54   ` Dave Chinner
  0 siblings, 2 replies; 15+ messages in thread
From: Mateusz Guzik @ 2014-05-13 18:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Josef Bacik, Jan Kara, Al Viro, Eric Sandeen

This helps hang troubleshooting efforts when only dmesg is available.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Josef Bacik <jbacik@fb.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Sandeen <esandeen@redhat.com>

---
 fs/super.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 017e10a..581c008 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1289,12 +1289,9 @@ int freeze_super(struct super_block *sb)
 		return 0;	/* sic - it's "nothing to do" */
 	}
 
-	if (sb->s_flags & MS_RDONLY) {
+	if (sb->s_flags & MS_RDONLY)
 		/* Nothing to do really... */
-		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
-		up_write(&sb->s_umount);
-		return 0;
-	}
+		goto out;
 
 	/* From now on, no new normal writers can start */
 	sb->s_writers.frozen = SB_FREEZE_WRITE;
@@ -1335,8 +1332,10 @@ int freeze_super(struct super_block *sb)
 	 * This is just for debugging purposes so that fs can warn if it
 	 * sees write activity when frozen is set to SB_FREEZE_COMPLETE.
 	 */
+out:
 	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
 	up_write(&sb->s_umount);
+	pr_info("VFS:Filesystem %s thawed\n", sb->s_id);
 	return 0;
 }
 EXPORT_SYMBOL(freeze_super);
@@ -1374,7 +1373,7 @@ out:
 	smp_wmb();
 	wake_up(&sb->s_writers.wait_unfrozen);
 	deactivate_locked_super(sb);
-
+	pr_info("VFS:Filesystem %s thawed\n", sb->s_id);
 	return 0;
 }
 EXPORT_SYMBOL(thaw_super);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-13 18:31 ` [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems Mateusz Guzik
@ 2014-05-13 18:39   ` Joe Perches
  2014-05-13 18:53     ` Mateusz Guzik
  2014-05-14 21:54   ` Dave Chinner
  1 sibling, 1 reply; 15+ messages in thread
From: Joe Perches @ 2014-05-13 18:39 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: linux-kernel, linux-fsdevel, Josef Bacik, Jan Kara, Al Viro,
	Eric Sandeen

On Tue, 2014-05-13 at 20:31 +0200, Mateusz Guzik wrote:
> This helps hang troubleshooting efforts when only dmesg is available.
[]
> diff --git a/fs/super.c b/fs/super.c
[]
> @@ -1289,12 +1289,9 @@ int freeze_super(struct super_block *sb)
>  		return 0;	/* sic - it's "nothing to do" */
>  	}
>  
> -	if (sb->s_flags & MS_RDONLY) {
> +	if (sb->s_flags & MS_RDONLY)
>  		/* Nothing to do really... */
> -		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> -		up_write(&sb->s_umount);
> -		return 0;
> -	}
> +		goto out;

Should this block really be part of this patch?
If so, please add a small note to the commit log.

I'm not sure making the reader goto out for this
bit of the code needs doing as there are multiple
blocks with return above here.

Trivially, when there is a comment in a single line
like this, then braces are often used too.
 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-13 18:39   ` Joe Perches
@ 2014-05-13 18:53     ` Mateusz Guzik
  2014-05-13 19:00       ` Joe Perches
  0 siblings, 1 reply; 15+ messages in thread
From: Mateusz Guzik @ 2014-05-13 18:53 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-kernel, linux-fsdevel, Josef Bacik, Jan Kara, Al Viro,
	Eric Sandeen

On Tue, May 13, 2014 at 11:39:31AM -0700, Joe Perches wrote:
> On Tue, 2014-05-13 at 20:31 +0200, Mateusz Guzik wrote:
> > This helps hang troubleshooting efforts when only dmesg is available.
> []
> > diff --git a/fs/super.c b/fs/super.c
> []
> > @@ -1289,12 +1289,9 @@ int freeze_super(struct super_block *sb)
> >  		return 0;	/* sic - it's "nothing to do" */
> >  	}
> >  
> > -	if (sb->s_flags & MS_RDONLY) {
> > +	if (sb->s_flags & MS_RDONLY)
> >  		/* Nothing to do really... */
> > -		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> > -		up_write(&sb->s_umount);
> > -		return 0;
> > -	}
> > +		goto out;
> 
> Should this block really be part of this patch?
> If so, please add a small note to the commit log.
> 

This is the same code which you can find at the end of the function.
I added the label so that I can write freeze printk only once.

> I'm not sure making the reader goto out for this
> bit of the code needs doing as there are multiple
> blocks with return above here.
> 

There are 2. One just returns an error and second one does not change
s_writers.frozen, so unreezing of something like this would fail.

However, the case which got modified results in unfreezable fs, so not
printing the message for this case would result in unmatched entries in
the log.

> Trivially, when there is a comment in a single line
> like this, then braces are often used too.
>  

Thanks, I'll send a modified patch later. Waiting for some other comments.

-- 
Mateusz Guzik

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-13 18:53     ` Mateusz Guzik
@ 2014-05-13 19:00       ` Joe Perches
  2014-05-13 19:06         ` Mateusz Guzik
  0 siblings, 1 reply; 15+ messages in thread
From: Joe Perches @ 2014-05-13 19:00 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: linux-kernel, linux-fsdevel, Josef Bacik, Jan Kara, Al Viro,
	Eric Sandeen

On Tue, 2014-05-13 at 20:53 +0200, Mateusz Guzik wrote:
> This is the same code which you can find at the end of the function.
> I added the label so that I can write freeze printk only once.

Yes, but for a read-only filesystem is the message useful?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-13 19:00       ` Joe Perches
@ 2014-05-13 19:06         ` Mateusz Guzik
  0 siblings, 0 replies; 15+ messages in thread
From: Mateusz Guzik @ 2014-05-13 19:06 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-kernel, linux-fsdevel, Josef Bacik, Jan Kara, Al Viro,
	Eric Sandeen

On Tue, May 13, 2014 at 12:00:21PM -0700, Joe Perches wrote:
> On Tue, 2014-05-13 at 20:53 +0200, Mateusz Guzik wrote:
> > This is the same code which you can find at the end of the function.
> > I added the label so that I can write freeze printk only once.
> 
> Yes, but for a read-only filesystem is the message useful?
> 

I see no harm in printing it and it seems more correct given that
s_writers.frozen changed, but I'm fine either way.

btw, I just noticed a copy-pasto - the same message for freezing and
unfreezing found its way into the patch. sigh.
-- 
Mateusz Guzik

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-13 18:31 ` [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems Mateusz Guzik
  2014-05-13 18:39   ` Joe Perches
@ 2014-05-14 21:54   ` Dave Chinner
  2014-05-15  2:17     ` Theodore Ts'o
  2014-05-15  9:42     ` Mateusz Guzik
  1 sibling, 2 replies; 15+ messages in thread
From: Dave Chinner @ 2014-05-14 21:54 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: linux-kernel, linux-fsdevel, Josef Bacik, Jan Kara, Al Viro,
	Eric Sandeen

On Tue, May 13, 2014 at 08:31:02PM +0200, Mateusz Guzik wrote:
> This helps hang troubleshooting efforts when only dmesg is available.

I really don't think that spamming dmesg every time a filesystem is
frozen or thawed is a good idea. This happens a *lot* when systems
are using snapshots, and for the most part nobody cares about
freeze/thaw cycles because they almost always work just fine.

I'd think that /proc/self/mounts would be a much better place to
indicate that the fs is frozen. After all, that's where we tell
people whether the filesystem is ro or rw, and frozen is just
a temporary, non-invasive ro state...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-14 21:54   ` Dave Chinner
@ 2014-05-15  2:17     ` Theodore Ts'o
  2014-05-15  3:04       ` Dave Chinner
  2014-05-15  9:42     ` Mateusz Guzik
  1 sibling, 1 reply; 15+ messages in thread
From: Theodore Ts'o @ 2014-05-15  2:17 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mateusz Guzik, linux-kernel, linux-fsdevel, Josef Bacik,
	Jan Kara, Al Viro, Eric Sandeen

On Thu, May 15, 2014 at 07:54:57AM +1000, Dave Chinner wrote:
> On Tue, May 13, 2014 at 08:31:02PM +0200, Mateusz Guzik wrote:
> > This helps hang troubleshooting efforts when only dmesg is available.
> 
> I really don't think that spamming dmesg every time a filesystem is
> frozen or thawed is a good idea. This happens a *lot* when systems
> are using snapshots, and for the most part nobody cares about
> freeze/thaw cycles because they almost always work just fine.

The problems people are worried about are when there is freezing and
unfreezing taking place in combination with a machine-level
suspend/resume cycle, no?  What if we only spammed dmesg when doing a
suspend/resume?

						- Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-15  2:17     ` Theodore Ts'o
@ 2014-05-15  3:04       ` Dave Chinner
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2014-05-15  3:04 UTC (permalink / raw)
  To: Theodore Ts'o, Mateusz Guzik, linux-kernel, linux-fsdevel,
	Josef Bacik, Jan Kara, Al Viro, Eric Sandeen

On Wed, May 14, 2014 at 10:17:49PM -0400, Theodore Ts'o wrote:
> On Thu, May 15, 2014 at 07:54:57AM +1000, Dave Chinner wrote:
> > On Tue, May 13, 2014 at 08:31:02PM +0200, Mateusz Guzik wrote:
> > > This helps hang troubleshooting efforts when only dmesg is available.
> > 
> > I really don't think that spamming dmesg every time a filesystem is
> > frozen or thawed is a good idea. This happens a *lot* when systems
> > are using snapshots, and for the most part nobody cares about
> > freeze/thaw cycles because they almost always work just fine.
> 
> The problems people are worried about are when there is freezing and
> unfreezing taking place in combination with a machine-level
> suspend/resume cycle, no?  What if we only spammed dmesg when doing a
> suspend/resume?

Filesystems are not frozen by power management based suspend/resume
cycles. That uses sync and then assumes that it can't race with
with the filesystem doing metadata writeback while the suspend code
is creating the freeze image. Hence there won't by any syslog
spam because freeze is not used. ;)

Sure, I've been advocating that suspend should use freezing rather
than sync for years, but that's never happened...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-14 21:54   ` Dave Chinner
  2014-05-15  2:17     ` Theodore Ts'o
@ 2014-05-15  9:42     ` Mateusz Guzik
  2014-05-15 10:01       ` Jan Kara
  1 sibling, 1 reply; 15+ messages in thread
From: Mateusz Guzik @ 2014-05-15  9:42 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-kernel, linux-fsdevel, Josef Bacik, Jan Kara, Al Viro,
	Eric Sandeen

On Thu, May 15, 2014 at 07:54:57AM +1000, Dave Chinner wrote:
> On Tue, May 13, 2014 at 08:31:02PM +0200, Mateusz Guzik wrote:
> > This helps hang troubleshooting efforts when only dmesg is available.
> 
> I really don't think that spamming dmesg every time a filesystem is
> frozen or thawed is a good idea. This happens a *lot* when systems
> are using snapshots, and for the most part nobody cares about
> freeze/thaw cycles because they almost always work just fine.
> 

I agree it may get noisy.

> I'd think that /proc/self/mounts would be a much better place to
> indicate that the fs is frozen. After all, that's where we tell
> people whether the filesystem is ro or rw, and frozen is just
> a temporary, non-invasive ro state...
> 

Except you can't inspect /proc/self/mounts when the only thing you got
is dmesg, so this does not really help my case.

That said, I'll try to come up with a different solution.

Poorly reported side-effects of frozen I/O are only a part of the real
problem which is hung task detector being able to typically report
backtraces of "victims" only. I came up with printks becuase these are
a cheap way and would help us out in a lot of cases.

So general idea is to support providing callbacks when setting tasks to
TASK_UNINTERRUPTIBLE which could either tell the user what's up directly
or would perform some heuristics. I'll post this in a separate thread
later, maybe with PoC.

Thanks,
-- 
Mateusz Guzik

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-15  9:42     ` Mateusz Guzik
@ 2014-05-15 10:01       ` Jan Kara
  2014-05-15 10:43         ` Lukáš Czerner
  2014-05-15 12:47         ` Theodore Ts'o
  0 siblings, 2 replies; 15+ messages in thread
From: Jan Kara @ 2014-05-15 10:01 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: Dave Chinner, linux-kernel, linux-fsdevel, Josef Bacik, Jan Kara,
	Al Viro, Eric Sandeen

On Thu 15-05-14 11:42:37, Mateusz Guzik wrote:
> On Thu, May 15, 2014 at 07:54:57AM +1000, Dave Chinner wrote:
> > On Tue, May 13, 2014 at 08:31:02PM +0200, Mateusz Guzik wrote:
> > > This helps hang troubleshooting efforts when only dmesg is available.
> > 
> > I really don't think that spamming dmesg every time a filesystem is
> > frozen or thawed is a good idea. This happens a *lot* when systems
> > are using snapshots, and for the most part nobody cares about
> > freeze/thaw cycles because they almost always work just fine.
> > 
> 
> I agree it may get noisy.
> 
> > I'd think that /proc/self/mounts would be a much better place to
> > indicate that the fs is frozen. After all, that's where we tell
> > people whether the filesystem is ro or rw, and frozen is just
> > a temporary, non-invasive ro state...
> > 
> 
> Except you can't inspect /proc/self/mounts when the only thing you got
> is dmesg, so this does not really help my case.
> 
> That said, I'll try to come up with a different solution.
> 
> Poorly reported side-effects of frozen I/O are only a part of the real
> problem which is hung task detector being able to typically report
> backtraces of "victims" only. I came up with printks becuase these are
> a cheap way and would help us out in a lot of cases.
  I was tracking down a couple of times what the hell is freezing the
filesystem (and not unfreezing it) and I agree with Mateusz it would be
nice if we could tell after the fact who froze the fs. Maybe we could store
that information in superblock and dump it during emergency thaw?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-15 10:01       ` Jan Kara
@ 2014-05-15 10:43         ` Lukáš Czerner
  2014-05-15 12:47         ` Theodore Ts'o
  1 sibling, 0 replies; 15+ messages in thread
From: Lukáš Czerner @ 2014-05-15 10:43 UTC (permalink / raw)
  To: Jan Kara
  Cc: Mateusz Guzik, Dave Chinner, linux-kernel, linux-fsdevel,
	Josef Bacik, Al Viro, Eric Sandeen

On Thu, 15 May 2014, Jan Kara wrote:

> Date: Thu, 15 May 2014 12:01:57 +0200
> From: Jan Kara <jack@suse.cz>
> To: Mateusz Guzik <mguzik@redhat.com>
> Cc: Dave Chinner <david@fromorbit.com>, linux-kernel@vger.kernel.org,
>     linux-fsdevel@vger.kernel.org, Josef Bacik <jbacik@fb.com>,
>     Jan Kara <jack@suse.cz>, Al Viro <viro@ZenIV.linux.org.uk>,
>     Eric Sandeen <esandeen@redhat.com>
> Subject: Re: [PATCH 2/2] fs: print a message when freezing/unfreezing
>     filesystems
> 
> On Thu 15-05-14 11:42:37, Mateusz Guzik wrote:
> > On Thu, May 15, 2014 at 07:54:57AM +1000, Dave Chinner wrote:
> > > On Tue, May 13, 2014 at 08:31:02PM +0200, Mateusz Guzik wrote:
> > > > This helps hang troubleshooting efforts when only dmesg is available.
> > > 
> > > I really don't think that spamming dmesg every time a filesystem is
> > > frozen or thawed is a good idea. This happens a *lot* when systems
> > > are using snapshots, and for the most part nobody cares about
> > > freeze/thaw cycles because they almost always work just fine.
> > > 
> > 
> > I agree it may get noisy.
> > 
> > > I'd think that /proc/self/mounts would be a much better place to
> > > indicate that the fs is frozen. After all, that's where we tell
> > > people whether the filesystem is ro or rw, and frozen is just
> > > a temporary, non-invasive ro state...
> > > 
> > 
> > Except you can't inspect /proc/self/mounts when the only thing you got
> > is dmesg, so this does not really help my case.
> > 
> > That said, I'll try to come up with a different solution.
> > 
> > Poorly reported side-effects of frozen I/O are only a part of the real
> > problem which is hung task detector being able to typically report
> > backtraces of "victims" only. I came up with printks becuase these are
> > a cheap way and would help us out in a lot of cases.
>   I was tracking down a couple of times what the hell is freezing the
> filesystem (and not unfreezing it) and I agree with Mateusz it would be
> nice if we could tell after the fact who froze the fs. Maybe we could store
> that information in superblock and dump it during emergency thaw?

I like that idea and if we agree to create procfs file to list
frozen file system this might be one of the useful information to
report there.

-Lukas

> 
> 								Honza
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-15 10:01       ` Jan Kara
  2014-05-15 10:43         ` Lukáš Czerner
@ 2014-05-15 12:47         ` Theodore Ts'o
  2014-05-15 13:46           ` Jan Kara
  1 sibling, 1 reply; 15+ messages in thread
From: Theodore Ts'o @ 2014-05-15 12:47 UTC (permalink / raw)
  To: Jan Kara
  Cc: Mateusz Guzik, Dave Chinner, linux-kernel, linux-fsdevel,
	Josef Bacik, Al Viro, Eric Sandeen

On Thu, May 15, 2014 at 12:01:57PM +0200, Jan Kara wrote:
>   I was tracking down a couple of times what the hell is freezing the
> filesystem (and not unfreezing it) and I agree with Mateusz it would be
> nice if we could tell after the fact who froze the fs. Maybe we could store
> that information in superblock and dump it during emergency thaw?

Saving it in the superblock would require changing a bunch of file
systems.  What if we store this information in memory, and print it
out under certain conditions (i.e., after a soft lockup detection, or
upon request of some magic sysrq request)?

Or we could create a tunable threshold and print a message after a
file system has been frozen more than a particular specified duration,
with that duration set conservatively to something like 60 or 120
seconds by default.

						- Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-15 12:47         ` Theodore Ts'o
@ 2014-05-15 13:46           ` Jan Kara
  2014-05-15 14:15             ` Theodore Ts'o
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2014-05-15 13:46 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jan Kara, Mateusz Guzik, Dave Chinner, linux-kernel,
	linux-fsdevel, Josef Bacik, Al Viro, Eric Sandeen

On Thu 15-05-14 08:47:25, Ted Tso wrote:
> On Thu, May 15, 2014 at 12:01:57PM +0200, Jan Kara wrote:
> >   I was tracking down a couple of times what the hell is freezing the
> > filesystem (and not unfreezing it) and I agree with Mateusz it would be
> > nice if we could tell after the fact who froze the fs. Maybe we could store
> > that information in superblock and dump it during emergency thaw?
> 
> Saving it in the superblock would require changing a bunch of file
> systems.  What if we store this information in memory, and print it
> out under certain conditions (i.e., after a soft lockup detection, or
> upon request of some magic sysrq request)?
  By 'superblock' I meant 'struct super_block' ;) So we are in agreement I
believe.

> Or we could create a tunable threshold and print a message after a
> file system has been frozen more than a particular specified duration,
> with that duration set conservatively to something like 60 or 120
> seconds by default.
  I was thinking about this as well but all these "warn after X seconds"
warnings tend to have quite a few false positives in practice so dumping
this in emergency-thaw sysrq handler or exposing the information somewhere
in proc (e.g. mountinfo) would look like a better option to me.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
  2014-05-15 13:46           ` Jan Kara
@ 2014-05-15 14:15             ` Theodore Ts'o
  0 siblings, 0 replies; 15+ messages in thread
From: Theodore Ts'o @ 2014-05-15 14:15 UTC (permalink / raw)
  To: Jan Kara
  Cc: Mateusz Guzik, Dave Chinner, linux-kernel, linux-fsdevel,
	Josef Bacik, Al Viro, Eric Sandeen

On Thu, May 15, 2014 at 03:46:10PM +0200, Jan Kara wrote:
> > Saving it in the superblock would require changing a bunch of file
> > systems.  What if we store this information in memory, and print it
> > out under certain conditions (i.e., after a soft lockup detection, or
> > upon request of some magic sysrq request)?
>   By 'superblock' I meant 'struct super_block' ;) So we are in agreement I
> believe.

Ah, yes, we're in agreement.  I thought you were talking about the
on-disk superblock.

> > Or we could create a tunable threshold and print a message after a
> > file system has been frozen more than a particular specified duration,
> > with that duration set conservatively to something like 60 or 120
> > seconds by default.
>   I was thinking about this as well but all these "warn after X seconds"
> warnings tend to have quite a few false positives in practice so dumping
> this in emergency-thaw sysrq handler or exposing the information somewhere
> in proc (e.g. mountinfo) would look like a better option to me.

Well, we already have the soft lockup warning, which sometimes has
some false positives, but in practice, if a process is runable but
doesn't get to run in 2 minutes (the default is 20 seconds, but we've
used 2 minutes to avoid the false positive problem on a super busy
system), something is probably clearly wrong.

Similarly, if a process is trying to write to a frozen file system,
and can't after two minutes, something is almost certainly wrong, or
least, it's something a system administrator should know about it.  We
can argue over whether the default threshold should be 20 seconds, or
120 seconds, or 2 hours, but I think there would be agreement that for
pretty much any configuration, there is some delay after which
printing a message is actually the right thing to do.  (Yes, "time
that a process is waiting to write to a frozen file system != time the
file system is frozen" --- the latter is easier to implement, but if
people feel strongly about it, the former isn't that much more
difficult.)

The problem with using an sysrq handler is the user has to know how to
use it.  If the user files a bug saying the system has mysteriously
hung, the fact that the system log contains a hint as to what might be
going on would be very useful for an enterprise distribiution's help
desk.  (Yes, this won't help if it's the root file system is the one
that's been frozen, unless the customer has configured remote syslog.
But for many cases, it might provide a vital clue that could save a
lot of time and support costs.)

						- Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-05-15 14:15 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-13 18:31 [PATCH 1/2] fs: include device name in error messages about freezing Mateusz Guzik
2014-05-13 18:31 ` [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems Mateusz Guzik
2014-05-13 18:39   ` Joe Perches
2014-05-13 18:53     ` Mateusz Guzik
2014-05-13 19:00       ` Joe Perches
2014-05-13 19:06         ` Mateusz Guzik
2014-05-14 21:54   ` Dave Chinner
2014-05-15  2:17     ` Theodore Ts'o
2014-05-15  3:04       ` Dave Chinner
2014-05-15  9:42     ` Mateusz Guzik
2014-05-15 10:01       ` Jan Kara
2014-05-15 10:43         ` Lukáš Czerner
2014-05-15 12:47         ` Theodore Ts'o
2014-05-15 13:46           ` Jan Kara
2014-05-15 14:15             ` Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).