[PATCH] xfs: Fix possible truncation of log data in xlog_bread

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-22  8:12 ` Tony Lu
  0 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-22  8:12 UTC (permalink / raw)
  To: xfs, Ben Myers, Alex Elder, Dave Chinner
  Cc: linux-kernel, Tony Lu, linux-fsdevel, Chris Metcalf

I encountered the following panic when using xfs partitions as rootfs, which
is due to the truncated log data read by xlog_bread_noalign(). We should
extend the buffer by one extra log sector to ensure there's enough space to
accommodate requested log data, which we indeed did in xlog_get_bp(), but we
forgot to do in xlog_bread_noalign().

XFS mounting filesystem sda2
Starting XFS recovery on filesystem: sda2 (logdev: internal)
XFS: xlog_recover_process_data: bad clientid
XFS: log mount/recovery failed: error 5
XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
Please append a correct "root=" boot option; here are the available partitio:
0800       156290904 sda  driver: sd
  0801        31463271 sda1 00000000-0000-0000-0000-000000000000
  0802        31463302 sda2 00000000-0000-0000-0000-000000000000
  0803        31463302 sda3 00000000-0000-0000-0000-000000000000
  0804               1 sda4 00000000-0000-0000-0000-000000000000
  0805        10490413 sda5 00000000-0000-0000-0000-000000000000
  0806        51407968 sda6 00000000-0000-0000-0000-000000000000
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)

Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
  frame 0: 0xfffffff70016e5a0 dump_stack+0x0/0x20 (sp 0xfffffe03fbedfe88)
  frame 1: 0xfffffff7004af470 panic+0x150/0x3a0 (sp 0xfffffe03fbedfe88)
  frame 2: 0xfffffff700881e88 mount_block_root+0x2c0/0x4c8 (sp 0xfffffe03fbe)
  frame 3: 0xfffffff700882390 prepare_namespace+0x250/0x358 (sp 0xfffffe03fb)
  frame 4: 0xfffffff700880778 kernel_init+0x4c8/0x520 (sp 0xfffffe03fbedffb0)
  frame 5: 0xfffffff70011ecb8 start_kernel_thread+0x18/0x20 (sp 0xfffffe03fb)
Stack dump complete

Signed-off-by: Zhigang Lu <zlu@tilera.com>
Reviewed-by: Chris Metcalf <cmetcalf@tilera.com>
---
 fs/xfs/xfs_log_recover.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 96fcbb8..64264a5 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -179,6 +179,21 @@ xlog_bread_noalign(
                return EFSCORRUPTED;
        }
 
+       /*
+        * The blk_no may be a non-sector-aligned block offset, in
+        * which case we round down the blk_no to be aligned with
+        * the sector size, and if the nbblks is sector-aligned,
+        * an I/O of the size nbblks could truncate the requested
+        * log data.  If the requested size is only 1 basic block it
+        * will never straddle a sector boundary, so this won't be
+        * an issue.  Nor will this be a problem if the log I/O is
+        * done in basic blocks (sector size 1).  But otherwise we
+        * extend the buffer by one extra log sector to ensure
+        * there's space to accommodate this possibility.
+        */
+       if (nbblks > 1 && log->l_sectBBsize > 1)
+               nbblks += log->l_sectBBsize;
+
        blk_no = round_down(blk_no, log->l_sectBBsize);
        nbblks = round_up(nbblks, log->l_sectBBsize);
 
-- 
1.7.10.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-22  8:12 ` Tony Lu
  0 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-22  8:12 UTC (permalink / raw)
  To: xfs, Ben Myers, Alex Elder, Dave Chinner
  Cc: linux-fsdevel, linux-kernel, Chris Metcalf, Tony Lu

I encountered the following panic when using xfs partitions as rootfs, which
is due to the truncated log data read by xlog_bread_noalign(). We should
extend the buffer by one extra log sector to ensure there's enough space to
accommodate requested log data, which we indeed did in xlog_get_bp(), but we
forgot to do in xlog_bread_noalign().

XFS mounting filesystem sda2
Starting XFS recovery on filesystem: sda2 (logdev: internal)
XFS: xlog_recover_process_data: bad clientid
XFS: log mount/recovery failed: error 5
XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
Please append a correct "root=" boot option; here are the available partitio:
0800       156290904 sda  driver: sd
  0801        31463271 sda1 00000000-0000-0000-0000-000000000000
  0802        31463302 sda2 00000000-0000-0000-0000-000000000000
  0803        31463302 sda3 00000000-0000-0000-0000-000000000000
  0804               1 sda4 00000000-0000-0000-0000-000000000000
  0805        10490413 sda5 00000000-0000-0000-0000-000000000000
  0806        51407968 sda6 00000000-0000-0000-0000-000000000000
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)

Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
  frame 0: 0xfffffff70016e5a0 dump_stack+0x0/0x20 (sp 0xfffffe03fbedfe88)
  frame 1: 0xfffffff7004af470 panic+0x150/0x3a0 (sp 0xfffffe03fbedfe88)
  frame 2: 0xfffffff700881e88 mount_block_root+0x2c0/0x4c8 (sp 0xfffffe03fbe)
  frame 3: 0xfffffff700882390 prepare_namespace+0x250/0x358 (sp 0xfffffe03fb)
  frame 4: 0xfffffff700880778 kernel_init+0x4c8/0x520 (sp 0xfffffe03fbedffb0)
  frame 5: 0xfffffff70011ecb8 start_kernel_thread+0x18/0x20 (sp 0xfffffe03fb)
Stack dump complete

Signed-off-by: Zhigang Lu <zlu@tilera.com>
Reviewed-by: Chris Metcalf <cmetcalf@tilera.com>
---
 fs/xfs/xfs_log_recover.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 96fcbb8..64264a5 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -179,6 +179,21 @@ xlog_bread_noalign(
                return EFSCORRUPTED;
        }
 
+       /*
+        * The blk_no may be a non-sector-aligned block offset, in
+        * which case we round down the blk_no to be aligned with
+        * the sector size, and if the nbblks is sector-aligned,
+        * an I/O of the size nbblks could truncate the requested
+        * log data.  If the requested size is only 1 basic block it
+        * will never straddle a sector boundary, so this won't be
+        * an issue.  Nor will this be a problem if the log I/O is
+        * done in basic blocks (sector size 1).  But otherwise we
+        * extend the buffer by one extra log sector to ensure
+        * there's space to accommodate this possibility.
+        */
+       if (nbblks > 1 && log->l_sectBBsize > 1)
+               nbblks += log->l_sectBBsize;
+
        blk_no = round_down(blk_no, log->l_sectBBsize);
        nbblks = round_up(nbblks, log->l_sectBBsize);
 
-- 
1.7.10.3

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-22  8:12 ` Tony Lu
@ 2013-02-22 19:14   ` Ben Myers
  -1 siblings, 0 replies; 27+ messages in thread
From: Ben Myers @ 2013-02-22 19:14 UTC (permalink / raw)
  To: Tony Lu
  Cc: xfs, Alex Elder, Dave Chinner, linux-fsdevel, linux-kernel,
	Chris Metcalf

Hi Tony,

On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
> I encountered the following panic when using xfs partitions as rootfs, which
> is due to the truncated log data read by xlog_bread_noalign(). We should
> extend the buffer by one extra log sector to ensure there's enough space to
> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> forgot to do in xlog_bread_noalign().
> 
> XFS mounting filesystem sda2
> Starting XFS recovery on filesystem: sda2 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
> Please append a correct "root=" boot option; here are the available partitio:
> 0800       156290904 sda  driver: sd
>   0801        31463271 sda1 00000000-0000-0000-0000-000000000000
>   0802        31463302 sda2 00000000-0000-0000-0000-000000000000
>   0803        31463302 sda3 00000000-0000-0000-0000-000000000000
>   0804               1 sda4 00000000-0000-0000-0000-000000000000
>   0805        10490413 sda5 00000000-0000-0000-0000-000000000000
>   0806        51407968 sda6 00000000-0000-0000-0000-000000000000
> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)
> 
> Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
>   frame 0: 0xfffffff70016e5a0 dump_stack+0x0/0x20 (sp 0xfffffe03fbedfe88)
>   frame 1: 0xfffffff7004af470 panic+0x150/0x3a0 (sp 0xfffffe03fbedfe88)
>   frame 2: 0xfffffff700881e88 mount_block_root+0x2c0/0x4c8 (sp 0xfffffe03fbe)
>   frame 3: 0xfffffff700882390 prepare_namespace+0x250/0x358 (sp 0xfffffe03fb)
>   frame 4: 0xfffffff700880778 kernel_init+0x4c8/0x520 (sp 0xfffffe03fbedffb0)
>   frame 5: 0xfffffff70011ecb8 start_kernel_thread+0x18/0x20 (sp 0xfffffe03fb)
> Stack dump complete
> 
> Signed-off-by: Zhigang Lu <zlu@tilera.com>
> Reviewed-by: Chris Metcalf <cmetcalf@tilera.com>

Looks fine to me.  I'll pull it in after some testing.

Do you happen to have a metadump of this filesystem?

Reviewed-by: Ben Myers <bpm@sgi.com>

Thanks!
-Ben

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-22 19:14   ` Ben Myers
  0 siblings, 0 replies; 27+ messages in thread
From: Ben Myers @ 2013-02-22 19:14 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Dave Chinner,
	linux-fsdevel

Hi Tony,

On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
> I encountered the following panic when using xfs partitions as rootfs, which
> is due to the truncated log data read by xlog_bread_noalign(). We should
> extend the buffer by one extra log sector to ensure there's enough space to
> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> forgot to do in xlog_bread_noalign().
> 
> XFS mounting filesystem sda2
> Starting XFS recovery on filesystem: sda2 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
> Please append a correct "root=" boot option; here are the available partitio:
> 0800       156290904 sda  driver: sd
>   0801        31463271 sda1 00000000-0000-0000-0000-000000000000
>   0802        31463302 sda2 00000000-0000-0000-0000-000000000000
>   0803        31463302 sda3 00000000-0000-0000-0000-000000000000
>   0804               1 sda4 00000000-0000-0000-0000-000000000000
>   0805        10490413 sda5 00000000-0000-0000-0000-000000000000
>   0806        51407968 sda6 00000000-0000-0000-0000-000000000000
> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)
> 
> Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
>   frame 0: 0xfffffff70016e5a0 dump_stack+0x0/0x20 (sp 0xfffffe03fbedfe88)
>   frame 1: 0xfffffff7004af470 panic+0x150/0x3a0 (sp 0xfffffe03fbedfe88)
>   frame 2: 0xfffffff700881e88 mount_block_root+0x2c0/0x4c8 (sp 0xfffffe03fbe)
>   frame 3: 0xfffffff700882390 prepare_namespace+0x250/0x358 (sp 0xfffffe03fb)
>   frame 4: 0xfffffff700880778 kernel_init+0x4c8/0x520 (sp 0xfffffe03fbedffb0)
>   frame 5: 0xfffffff70011ecb8 start_kernel_thread+0x18/0x20 (sp 0xfffffe03fb)
> Stack dump complete
> 
> Signed-off-by: Zhigang Lu <zlu@tilera.com>
> Reviewed-by: Chris Metcalf <cmetcalf@tilera.com>

Looks fine to me.  I'll pull it in after some testing.

Do you happen to have a metadump of this filesystem?

Reviewed-by: Ben Myers <bpm@sgi.com>

Thanks!
-Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-22  8:12 ` Tony Lu
@ 2013-02-23  0:08   ` Dave Chinner
  -1 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-23  0:08 UTC (permalink / raw)
  To: Tony Lu
  Cc: xfs, Ben Myers, Alex Elder, Dave Chinner, linux-fsdevel,
	linux-kernel, Chris Metcalf

On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
> I encountered the following panic when using xfs partitions as rootfs, which
> is due to the truncated log data read by xlog_bread_noalign(). We should
> extend the buffer by one extra log sector to ensure there's enough space to
> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> forgot to do in xlog_bread_noalign().

We've never done that round up in xlog_bread_noalign(). It shouldn't
be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
fundamentally different things. That is, xlog_get_bp() is ensuring
the buffer is large enough for the upcoming IO that will be
requested, while xlog_bread_noalign() is simply ensuring what it is
passed is correctly aligned to device sector boundaries.

So, if you have to fudge an extra block for xlog_bread_noalign(),
that implies that what xlog_bread_noalign() was passed was probably
not correct. It also implies that you are using sector sizes larger
than 512 bytes, because that's the only time this might matter. Put
simply, this:

> XFS mounting filesystem sda2
> Starting XFS recovery on filesystem: sda2 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed

Is not sufficient information for me to determine if you've correctly
analysed the problem you were seeing and that this is the correct
fix for it. I don't even know what kernel you are seeing this on, or
how you are reproducing it.

Note that I'm not saying the fix isn't necessary or correct, just
that I cannot review it based this commit message.  Given that this
code is essentially unchanged in behaviour since the large sector
size support was adding in 2003(*), understanding how it is
deficient is critical part of the reviewi process....

Information you need to provide so I have a chance of reviewing
whether it is correct or not:

	- what kernel you saw this on,
	- what the filesystem configuration was
	- what workload reproduced this problem (a test case would
	  be nice, and xfstest even better)
	- the actual contents of the log that lead to the short read
	  during recovery
	- whether xfs_logprint was capable of parsing the log
	  correctly
	- where in the actual log recovery process the failure
	  occurred (e.g. was it trying to recover transactions from
	  a section of a wrapped log?)

IOWs, please show your working so we can determine if this is the
root cause of the problem you are seeing. :)

(*) http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=f14e527f411712f89178c31370b5d733ea1d0280

FWIW, I think your change might need work - there's the possibility
that is can round up the length beyond the end of the log if we ask
to read up to the last sector of the log (i.e. blkno + blklen ==
end of log) and then round up blklen by one sector....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-23  0:08   ` Dave Chinner
  0 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-23  0:08 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
> I encountered the following panic when using xfs partitions as rootfs, which
> is due to the truncated log data read by xlog_bread_noalign(). We should
> extend the buffer by one extra log sector to ensure there's enough space to
> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> forgot to do in xlog_bread_noalign().

We've never done that round up in xlog_bread_noalign(). It shouldn't
be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
fundamentally different things. That is, xlog_get_bp() is ensuring
the buffer is large enough for the upcoming IO that will be
requested, while xlog_bread_noalign() is simply ensuring what it is
passed is correctly aligned to device sector boundaries.

So, if you have to fudge an extra block for xlog_bread_noalign(),
that implies that what xlog_bread_noalign() was passed was probably
not correct. It also implies that you are using sector sizes larger
than 512 bytes, because that's the only time this might matter. Put
simply, this:

> XFS mounting filesystem sda2
> Starting XFS recovery on filesystem: sda2 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed

Is not sufficient information for me to determine if you've correctly
analysed the problem you were seeing and that this is the correct
fix for it. I don't even know what kernel you are seeing this on, or
how you are reproducing it.

Note that I'm not saying the fix isn't necessary or correct, just
that I cannot review it based this commit message.  Given that this
code is essentially unchanged in behaviour since the large sector
size support was adding in 2003(*), understanding how it is
deficient is critical part of the reviewi process....

Information you need to provide so I have a chance of reviewing
whether it is correct or not:

	- what kernel you saw this on,
	- what the filesystem configuration was
	- what workload reproduced this problem (a test case would
	  be nice, and xfstest even better)
	- the actual contents of the log that lead to the short read
	  during recovery
	- whether xfs_logprint was capable of parsing the log
	  correctly
	- where in the actual log recovery process the failure
	  occurred (e.g. was it trying to recover transactions from
	  a section of a wrapped log?)

IOWs, please show your working so we can determine if this is the
root cause of the problem you are seeing. :)

(*) http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=f14e527f411712f89178c31370b5d733ea1d0280

FWIW, I think your change might need work - there's the possibility
that is can round up the length beyond the end of the log if we ask
to read up to the last sector of the log (i.e. blkno + blklen ==
end of log) and then round up blklen by one sector....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-23  0:08   ` Dave Chinner
@ 2013-02-23  7:06     ` Tony Lu
  -1 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-23  7:06 UTC (permalink / raw)
  To: Dave Chinner, Ben Myers
  Cc: xfs, Alex Elder, Dave Chinner, linux-fsdevel, linux-kernel,
	Chris Metcalf

>From: Dave Chinner [mailto:david@fromorbit.com]
>On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
>> I encountered the following panic when using xfs partitions as rootfs, which
>> is due to the truncated log data read by xlog_bread_noalign(). We should
>> extend the buffer by one extra log sector to ensure there's enough space to
>> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
>> forgot to do in xlog_bread_noalign().
>
>We've never done that round up in xlog_bread_noalign(). It shouldn't
>be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
>fundamentally different things. That is, xlog_get_bp() is ensuring
>the buffer is large enough for the upcoming IO that will be
>requested, while xlog_bread_noalign() is simply ensuring what it is
>passed is correctly aligned to device sector boundaries.

I set the sector size as 4096 when making the xfs filesystem.
-sh-4.1# mkfs.xfs -s size=4096 -f /dev/sda3

In this case, xlog_bread_noalign() needs to do such round up and round down frequently. And it is used to ensure what it is passed is aligned to the log sector size, but not the device sector boundaries.

Here is the debug info I added when mounting this xfs partition.
-sh-4.1# mount /dev/sda3 /home/
XFS (sda3): Mounting Filesystem
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=61447,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
.......
xlog_bread_noalign:blk_no=8695,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=4600,nbblks=4096,l_sectBBsize=8
xlog_bread_noalign:blk_no=8184,nbblks=512,l_sectBBsize=8

>So, if you have to fudge an extra block for xlog_bread_noalign(),
>that implies that what xlog_bread_noalign() was passed was probably
>not correct. It also implies that you are using sector sizes larger
>than 512 bytes, because that's the only time this might matter. Put
>simply, this:

While debugging, I found when it crashed, the blk_no was not align to the log sector size and nnblks was aligned to the log sector size, which makes sense.

For example, if xlog_bread_noalign() wants to read blocks from #1 to # 9, in which case the passed parameter blk_no is 1, and nbblks is 8, sectBBsize is 8, after the round down and round up operations, we get blk_no as 0, and nbblks as still 8. We definitely lose the last block of the log data.

>> XFS mounting filesystem sda2
>> Starting XFS recovery on filesystem: sda2 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failed
>
>Is not sufficient information for me to determine if you've correctly
>analysed the problem you were seeing and that this is the correct
>fix for it. I don't even know what kernel you are seeing this on, or
>how you are reproducing it.

I was using the 2.6.38.6 kernel, and using xfs as a rootfs partition. After untaring the rootfs files on the xfs partition, and tried to reboot from the xfs, then the panic occasionally occurred.

>
>Note that I'm not saying the fix isn't necessary or correct, just
>that I cannot review it based this commit message.  Given that this
>code is essentially unchanged in behaviour since the large sector
>size support was adding in 2003(*), understanding how it is
>deficient is critical part of the reviewi process....
>
>Information you need to provide so I have a chance of reviewing
>whether it is correct or not:
>
>	- what kernel you saw this on,
>	- what the filesystem configuration was
>	- what workload reproduced this problem (a test case would
>	  be nice, and xfstest even better)
>	- the actual contents of the log that lead to the short read
>	  during recovery
>	- whether xfs_logprint was capable of parsing the log
>	  correctly
>	- where in the actual log recovery process the failure
>	  occurred (e.g. was it trying to recover transactions from
>	  a section of a wrapped log?)

I hope I can provide the corrupted log for you, but probably I could not find it, since I fixed this bug a year ago. Recently when I do some clean-up on my code, I find this one, so I think I should return it back to the community.

>IOWs, please show your working so we can determine if this is the
>root cause of the problem you are seeing. :)
>
>(*)
>http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff
>;h=f14e527f411712f89178c31370b5d733ea1d0280
>
>FWIW, I think your change might need work - there's the possibility
>that is can round up the length beyond the end of the log if we ask
>to read up to the last sector of the log (i.e. blkno + blklen ==
>end of log) and then round up blklen by one sector....
>
Good catch, you are right on this. To avoid this possibility, I changed the patch a little bit as following.
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -171,6 +171,7 @@ xlog_bread_noalign(
        struct xfs_buf  *bp)
 {
        int             error;
+       xfs_daddr_t     orig_blk_no = blk_no;
 
        if (!xlog_buf_bbcount_valid(log, nbblks)) {
                xfs_warn(log->l_mp, "Invalid block length (0x%x) for buffer",
@@ -180,6 +181,13 @@ xlog_bread_noalign(
        }
 
        blk_no = round_down(blk_no, log->l_sectBBsize);
+
+       /* 
+        * If the blk_no is rounded down to be sector-aligned, we need to
+        * increase the nbblks accordingly, avoiding truncating the requested
+        * log data.
+        */
+       nbblks += orig_blk_no - blk_no;
        nbblks = round_up(nbblks, log->l_sectBBsize);
 
        ASSERT(nbblks > 0);

Thanks
-Tony

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-23  7:06     ` Tony Lu
  0 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-23  7:06 UTC (permalink / raw)
  To: Dave Chinner, Ben Myers
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Dave Chinner,
	linux-fsdevel

>From: Dave Chinner [mailto:david@fromorbit.com]
>On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
>> I encountered the following panic when using xfs partitions as rootfs, which
>> is due to the truncated log data read by xlog_bread_noalign(). We should
>> extend the buffer by one extra log sector to ensure there's enough space to
>> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
>> forgot to do in xlog_bread_noalign().
>
>We've never done that round up in xlog_bread_noalign(). It shouldn't
>be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
>fundamentally different things. That is, xlog_get_bp() is ensuring
>the buffer is large enough for the upcoming IO that will be
>requested, while xlog_bread_noalign() is simply ensuring what it is
>passed is correctly aligned to device sector boundaries.

I set the sector size as 4096 when making the xfs filesystem.
-sh-4.1# mkfs.xfs -s size=4096 -f /dev/sda3

In this case, xlog_bread_noalign() needs to do such round up and round down frequently. And it is used to ensure what it is passed is aligned to the log sector size, but not the device sector boundaries.

Here is the debug info I added when mounting this xfs partition.
-sh-4.1# mount /dev/sda3 /home/
XFS (sda3): Mounting Filesystem
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=61447,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
.......
xlog_bread_noalign:blk_no=8695,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=4600,nbblks=4096,l_sectBBsize=8
xlog_bread_noalign:blk_no=8184,nbblks=512,l_sectBBsize=8

>So, if you have to fudge an extra block for xlog_bread_noalign(),
>that implies that what xlog_bread_noalign() was passed was probably
>not correct. It also implies that you are using sector sizes larger
>than 512 bytes, because that's the only time this might matter. Put
>simply, this:

While debugging, I found when it crashed, the blk_no was not align to the log sector size and nnblks was aligned to the log sector size, which makes sense.

For example, if xlog_bread_noalign() wants to read blocks from #1 to # 9, in which case the passed parameter blk_no is 1, and nbblks is 8, sectBBsize is 8, after the round down and round up operations, we get blk_no as 0, and nbblks as still 8. We definitely lose the last block of the log data.

>> XFS mounting filesystem sda2
>> Starting XFS recovery on filesystem: sda2 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failed
>
>Is not sufficient information for me to determine if you've correctly
>analysed the problem you were seeing and that this is the correct
>fix for it. I don't even know what kernel you are seeing this on, or
>how you are reproducing it.

I was using the 2.6.38.6 kernel, and using xfs as a rootfs partition. After untaring the rootfs files on the xfs partition, and tried to reboot from the xfs, then the panic occasionally occurred.

>
>Note that I'm not saying the fix isn't necessary or correct, just
>that I cannot review it based this commit message.  Given that this
>code is essentially unchanged in behaviour since the large sector
>size support was adding in 2003(*), understanding how it is
>deficient is critical part of the reviewi process....
>
>Information you need to provide so I have a chance of reviewing
>whether it is correct or not:
>
>	- what kernel you saw this on,
>	- what the filesystem configuration was
>	- what workload reproduced this problem (a test case would
>	  be nice, and xfstest even better)
>	- the actual contents of the log that lead to the short read
>	  during recovery
>	- whether xfs_logprint was capable of parsing the log
>	  correctly
>	- where in the actual log recovery process the failure
>	  occurred (e.g. was it trying to recover transactions from
>	  a section of a wrapped log?)

I hope I can provide the corrupted log for you, but probably I could not find it, since I fixed this bug a year ago. Recently when I do some clean-up on my code, I find this one, so I think I should return it back to the community.

>IOWs, please show your working so we can determine if this is the
>root cause of the problem you are seeing. :)
>
>(*)
>http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff
>;h=f14e527f411712f89178c31370b5d733ea1d0280
>
>FWIW, I think your change might need work - there's the possibility
>that is can round up the length beyond the end of the log if we ask
>to read up to the last sector of the log (i.e. blkno + blklen ==
>end of log) and then round up blklen by one sector....
>
Good catch, you are right on this. To avoid this possibility, I changed the patch a little bit as following.
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -171,6 +171,7 @@ xlog_bread_noalign(
        struct xfs_buf  *bp)
 {
        int             error;
+       xfs_daddr_t     orig_blk_no = blk_no;
 
        if (!xlog_buf_bbcount_valid(log, nbblks)) {
                xfs_warn(log->l_mp, "Invalid block length (0x%x) for buffer",
@@ -180,6 +181,13 @@ xlog_bread_noalign(
        }
 
        blk_no = round_down(blk_no, log->l_sectBBsize);
+
+       /* 
+        * If the blk_no is rounded down to be sector-aligned, we need to
+        * increase the nbblks accordingly, avoiding truncating the requested
+        * log data.
+        */
+       nbblks += orig_blk_no - blk_no;
        nbblks = round_up(nbblks, log->l_sectBBsize);
 
        ASSERT(nbblks > 0);

Thanks
-Tony

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-22 19:14   ` Ben Myers
@ 2013-02-23  8:32     ` Tony Lu
  -1 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-23  8:32 UTC (permalink / raw)
  To: Ben Myers
  Cc: xfs, Alex Elder, Dave Chinner, linux-fsdevel, linux-kernel,
	Chris Metcalf

>-----Original Message-----
>From: Ben Myers [mailto:bpm@sgi.com]
>
>Hi Tony,
>
>On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
>> I encountered the following panic when using xfs partitions as rootfs, which
>> is due to the truncated log data read by xlog_bread_noalign(). We should
>> extend the buffer by one extra log sector to ensure there's enough space to
>> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
>> forgot to do in xlog_bread_noalign().
>>
>> XFS mounting filesystem sda2
>> Starting XFS recovery on filesystem: sda2 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
>> Please append a correct "root=" boot option; here are the available partitio:
>> 0800       156290904 sda  driver: sd
>>   0801        31463271 sda1 00000000-0000-0000-0000-000000000000
>>   0802        31463302 sda2 00000000-0000-0000-0000-000000000000
>>   0803        31463302 sda3 00000000-0000-0000-0000-000000000000
>>   0804               1 sda4 00000000-0000-0000-0000-000000000000
>>   0805        10490413 sda5 00000000-0000-0000-0000-000000000000
>>   0806        51407968 sda6 00000000-0000-0000-0000-000000000000
>> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)
>>
>> Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
>>   frame 0: 0xfffffff70016e5a0 dump_stack+0x0/0x20 (sp 0xfffffe03fbedfe88)
>>   frame 1: 0xfffffff7004af470 panic+0x150/0x3a0 (sp 0xfffffe03fbedfe88)
>>   frame 2: 0xfffffff700881e88 mount_block_root+0x2c0/0x4c8 (sp
>0xfffffe03fbe)
>>   frame 3: 0xfffffff700882390 prepare_namespace+0x250/0x358 (sp
>0xfffffe03fb)
>>   frame 4: 0xfffffff700880778 kernel_init+0x4c8/0x520 (sp
>0xfffffe03fbedffb0)
>>   frame 5: 0xfffffff70011ecb8 start_kernel_thread+0x18/0x20 (sp
>0xfffffe03fb)
>> Stack dump complete
>>
>> Signed-off-by: Zhigang Lu <zlu@tilera.com>
>> Reviewed-by: Chris Metcalf <cmetcalf@tilera.com>
>
>Looks fine to me.  I'll pull it in after some testing.
>
>Do you happen to have a metadump of this filesystem?
>
>Reviewed-by: Ben Myers <bpm@sgi.com>

Sorry I did not keep the metadump of it. But I kept some debugging info when I debugged and fixed it a year ago.

Starting XFS recovery on filesystem: ram0 (logdev: internal)
xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
xlog_bread_noalign--before round down/up: blk_no=0xf4e,nbblks=0x3f
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x40
XFS: xlog_recover_process_data: bad clientid
Assertion failed: 0, file: /home/scratch/zlu/zlu-main/sys/linux/source/fs/xfs/xfs_log_recover.c, line: 2852
BUG: failure at /home/scratch/zlu/zlu-main/sys/linux/source/fs/xfs/support/debug.c:100/assfail()!
Kernel panic - not syncing: BUG!

Starting stack dump of tid 843, pid 843 (mount) on cpu 1 at cycle 345934778384
  frame 0: 0xfffffff7001380a0 dump_stack+0x0/0x20 (sp 0xfffffe43e55df7b0)
  frame 1: 0xfffffff7003b5470 panic+0x150/0x3a0 (sp 0xfffffe43e55df7b0)
  frame 2: 0xfffffff700824cf0 assfail+0x80/0x80 (sp 0xfffffe43e55df858)
  frame 3: 0xfffffff70037c7c0 xlog_recover_process_data+0x598/0x698 (sp 0xfffffe43e55df868)
  frame 4: 0xfffffff7002c55e8 xlog_do_recovery_pass+0x810/0x908 (sp 0xfffffe43e55df8e8)
  frame 5: 0xfffffff70068f0d8 xlog_do_log_recovery+0xc8/0x1d8 (sp 0xfffffe43e55dfa48)
  frame 6: 0xfffffff70054cf60 xlog_do_recover+0x48/0x380 (sp 0xfffffe43e55dfa88)
  frame 7: 0xfffffff7006fdbf0 xlog_recover+0x138/0x170 (sp 0xfffffe43e55dfac0)
  frame 8: 0xfffffff7005b2d70 xfs_log_mount+0x150/0x2e8 (sp 0xfffffe43e55dfb00)
  frame 9: 0xfffffff700269830 xfs_mountfs+0x510/0xb20 (sp 0xfffffe43e55dfb38)
  frame 10: 0xfffffff700486930 xfs_fs_fill_super+0x2e0/0x3f0 (sp 0xfffffe43e55dfba8)
  frame 11: 0xfffffff7000950c8 mount_bdev+0x168/0x2d0 (sp 0xfffffe43e55dfbe0)
  frame 12: 0xfffffff700071e08 vfs_kern_mount+0x110/0x408 (sp 0xfffffe43e55dfc50)
  frame 13: 0xfffffff7000badf8 do_kern_mount+0x68/0x1e0 (sp 0xfffffe43e55dfc98)
  frame 14: 0xfffffff700046470 do_mount+0x200/0x878 (sp 0xfffffe43e55dfcd8)
  frame 15: 0xfffffff7000c8050 sys_mount+0xd0/0x1a0 (sp 0xfffffe43e55dfd60)
  frame 16: 0xfffffff7001a2c30 handle_syscall+0x280/0x340 (sp 0xfffffe43e55dfdc0)
  <syscall while in user mode>
  frame 17: 0xaaaad46688 libc-2.12.so[aaaac20000+1d0000] (sp 0x1ffffddf4b0)
  frame 18: 0x15555555560 mount[15555550000+20000] (sp 0x1ffffddf4b0)
  frame 19: 0x15555557dc0 mount[15555550000+20000] (sp 0x1ffffddf500)
  frame 20: 0x15555558a80 mount[15555550000+20000] (sp 0x1ffffddf858)
  frame 21: 0x15555559a60 mount[15555550000+20000] (sp 0x1ffffddf930)
  frame 22: 0xaaaac3e5e8 libc-2.12.so[aaaac20000+1d0000] (sp 0x1ffffddfaf8)
Stack dump complete
Client requested halt.

Thanks
-Tony

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-23  8:32     ` Tony Lu
  0 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-23  8:32 UTC (permalink / raw)
  To: Ben Myers
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Dave Chinner,
	linux-fsdevel

>-----Original Message-----
>From: Ben Myers [mailto:bpm@sgi.com]
>
>Hi Tony,
>
>On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
>> I encountered the following panic when using xfs partitions as rootfs, which
>> is due to the truncated log data read by xlog_bread_noalign(). We should
>> extend the buffer by one extra log sector to ensure there's enough space to
>> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
>> forgot to do in xlog_bread_noalign().
>>
>> XFS mounting filesystem sda2
>> Starting XFS recovery on filesystem: sda2 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
>> Please append a correct "root=" boot option; here are the available partitio:
>> 0800       156290904 sda  driver: sd
>>   0801        31463271 sda1 00000000-0000-0000-0000-000000000000
>>   0802        31463302 sda2 00000000-0000-0000-0000-000000000000
>>   0803        31463302 sda3 00000000-0000-0000-0000-000000000000
>>   0804               1 sda4 00000000-0000-0000-0000-000000000000
>>   0805        10490413 sda5 00000000-0000-0000-0000-000000000000
>>   0806        51407968 sda6 00000000-0000-0000-0000-000000000000
>> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)
>>
>> Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
>>   frame 0: 0xfffffff70016e5a0 dump_stack+0x0/0x20 (sp 0xfffffe03fbedfe88)
>>   frame 1: 0xfffffff7004af470 panic+0x150/0x3a0 (sp 0xfffffe03fbedfe88)
>>   frame 2: 0xfffffff700881e88 mount_block_root+0x2c0/0x4c8 (sp
>0xfffffe03fbe)
>>   frame 3: 0xfffffff700882390 prepare_namespace+0x250/0x358 (sp
>0xfffffe03fb)
>>   frame 4: 0xfffffff700880778 kernel_init+0x4c8/0x520 (sp
>0xfffffe03fbedffb0)
>>   frame 5: 0xfffffff70011ecb8 start_kernel_thread+0x18/0x20 (sp
>0xfffffe03fb)
>> Stack dump complete
>>
>> Signed-off-by: Zhigang Lu <zlu@tilera.com>
>> Reviewed-by: Chris Metcalf <cmetcalf@tilera.com>
>
>Looks fine to me.  I'll pull it in after some testing.
>
>Do you happen to have a metadump of this filesystem?
>
>Reviewed-by: Ben Myers <bpm@sgi.com>

Sorry I did not keep the metadump of it. But I kept some debugging info when I debugged and fixed it a year ago.

Starting XFS recovery on filesystem: ram0 (logdev: internal)
xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
xlog_bread_noalign--before round down/up: blk_no=0xf4e,nbblks=0x3f
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x40
XFS: xlog_recover_process_data: bad clientid
Assertion failed: 0, file: /home/scratch/zlu/zlu-main/sys/linux/source/fs/xfs/xfs_log_recover.c, line: 2852
BUG: failure at /home/scratch/zlu/zlu-main/sys/linux/source/fs/xfs/support/debug.c:100/assfail()!
Kernel panic - not syncing: BUG!

Starting stack dump of tid 843, pid 843 (mount) on cpu 1 at cycle 345934778384
  frame 0: 0xfffffff7001380a0 dump_stack+0x0/0x20 (sp 0xfffffe43e55df7b0)
  frame 1: 0xfffffff7003b5470 panic+0x150/0x3a0 (sp 0xfffffe43e55df7b0)
  frame 2: 0xfffffff700824cf0 assfail+0x80/0x80 (sp 0xfffffe43e55df858)
  frame 3: 0xfffffff70037c7c0 xlog_recover_process_data+0x598/0x698 (sp 0xfffffe43e55df868)
  frame 4: 0xfffffff7002c55e8 xlog_do_recovery_pass+0x810/0x908 (sp 0xfffffe43e55df8e8)
  frame 5: 0xfffffff70068f0d8 xlog_do_log_recovery+0xc8/0x1d8 (sp 0xfffffe43e55dfa48)
  frame 6: 0xfffffff70054cf60 xlog_do_recover+0x48/0x380 (sp 0xfffffe43e55dfa88)
  frame 7: 0xfffffff7006fdbf0 xlog_recover+0x138/0x170 (sp 0xfffffe43e55dfac0)
  frame 8: 0xfffffff7005b2d70 xfs_log_mount+0x150/0x2e8 (sp 0xfffffe43e55dfb00)
  frame 9: 0xfffffff700269830 xfs_mountfs+0x510/0xb20 (sp 0xfffffe43e55dfb38)
  frame 10: 0xfffffff700486930 xfs_fs_fill_super+0x2e0/0x3f0 (sp 0xfffffe43e55dfba8)
  frame 11: 0xfffffff7000950c8 mount_bdev+0x168/0x2d0 (sp 0xfffffe43e55dfbe0)
  frame 12: 0xfffffff700071e08 vfs_kern_mount+0x110/0x408 (sp 0xfffffe43e55dfc50)
  frame 13: 0xfffffff7000badf8 do_kern_mount+0x68/0x1e0 (sp 0xfffffe43e55dfc98)
  frame 14: 0xfffffff700046470 do_mount+0x200/0x878 (sp 0xfffffe43e55dfcd8)
  frame 15: 0xfffffff7000c8050 sys_mount+0xd0/0x1a0 (sp 0xfffffe43e55dfd60)
  frame 16: 0xfffffff7001a2c30 handle_syscall+0x280/0x340 (sp 0xfffffe43e55dfdc0)
  <syscall while in user mode>
  frame 17: 0xaaaad46688 libc-2.12.so[aaaac20000+1d0000] (sp 0x1ffffddf4b0)
  frame 18: 0x15555555560 mount[15555550000+20000] (sp 0x1ffffddf4b0)
  frame 19: 0x15555557dc0 mount[15555550000+20000] (sp 0x1ffffddf500)
  frame 20: 0x15555558a80 mount[15555550000+20000] (sp 0x1ffffddf858)
  frame 21: 0x15555559a60 mount[15555550000+20000] (sp 0x1ffffddf930)
  frame 22: 0xaaaac3e5e8 libc-2.12.so[aaaac20000+1d0000] (sp 0x1ffffddfaf8)
Stack dump complete
Client requested halt.

Thanks
-Tony

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-23  7:06     ` Tony Lu
@ 2013-02-23 23:55       ` Dave Chinner
  -1 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-23 23:55 UTC (permalink / raw)
  To: Tony Lu
  Cc: Ben Myers, xfs, Alex Elder, Dave Chinner, linux-fsdevel,
	linux-kernel, Chris Metcalf

On Sat, Feb 23, 2013 at 07:06:10AM +0000, Tony Lu wrote:
> >From: Dave Chinner [mailto:david@fromorbit.com]
> >On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
> >> I encountered the following panic when using xfs partitions as rootfs, which
> >> is due to the truncated log data read by xlog_bread_noalign(). We should
> >> extend the buffer by one extra log sector to ensure there's enough space to
> >> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> >> forgot to do in xlog_bread_noalign().
> >
> >We've never done that round up in xlog_bread_noalign(). It shouldn't
> >be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
> >fundamentally different things. That is, xlog_get_bp() is ensuring
> >the buffer is large enough for the upcoming IO that will be
> >requested, while xlog_bread_noalign() is simply ensuring what it is
> >passed is correctly aligned to device sector boundaries.
> 
> I set the sector size as 4096 when making the xfs filesystem.
> -sh-4.1# mkfs.xfs -s size=4096 -f /dev/sda3

.....

> In this case, xlog_bread_noalign() needs to do such round up and
> round down frequently. And it is used to ensure what it is passed
> is aligned to the log sector size, but not the device sector
> boundaries.

If you have a 4k sector device, then the log sector size is the same
as the physical device. Hence the log code assumes that if you have
a specific log sector size, it is operating on a device that has the
physical IO constraints of that sector size.

> >So, if you have to fudge an extra block for xlog_bread_noalign(),
> >that implies that what xlog_bread_noalign() was passed was
> >probably not correct. It also implies that you are using sector
> >sizes larger than 512 bytes, because that's the only time this
> >might matter. Put simply, this:
> 
> While debugging, I found when it crashed, the blk_no was not align
> to the log sector size and nnblks was aligned to the log sector
> size, which makes sense.

Actually, it doesn't. The log writes done by the kernel are supposed
to be aligned and padded to sector size, which means that we should
never see an unaligned block numbers when reading log buffer headers
back off disk. i.e. when you run mkfs.xfs -s size=4096, you end up
with a a log stripe unit of 4096 bytes, which means it pads
every write to 4096 byte boundaries rather than 512 byte boundaries.

> Starting XFS recovery on filesystem: ram0 (logdev: internal)

Ok, you're using a ramdisk and not a real 4k sector device. Hence it
won't fail if we do 512 byte aligned IO rather than 4k aligned IO.

> xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
> xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
> xlog_bread_noalign--before round down/up: blk_no=0xf4e,nbblks=0x3f
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x40
> XFS: xlog_recover_process_data: bad clientid
>
> For example, if xlog_bread_noalign() wants to read blocks from #1
> to # 9, in which case the passed parameter blk_no is 1, and nbblks
> is 8, sectBBsize is 8, after the round down and round up
> operations, we get blk_no as 0, and nbblks as still 8. We
> definitely lose the last block of the log data.

Yes, I fully understand that. But I also understand how the log
works and that this behaviour *should not happen*. That's why
I'm asking questions about what the problem you are trying to fix.

The issue here is that the log buffer write was not aligned to the
underlying sector size. That is, what we see here is a header block
read, followed by the log buffer data read. The header size is
determined by the iclogbuf size - a 512 byte block implies default
32k iclogbuf size - and the following data region read of 63 blocks
also indicates a 32k iclogbuf size.

IOWs, what we have here is a 32k log buffer write apparently at a
sector-unaligned block address (0xf4d = 3917 which is not a multiple
of 8). This is why log recovery went wrong: a fundamental
architectural assumption the log is built around has somehow been
violated.

That is, the log recovery failure does not appear to be a problem
with the sector alignment done by xlog_bread_noalign() - it appears
to be a failure with the alignment of log buffer IO written to the
log. That's a far more serious problem than a log recovery problem,
but I can't see how that could occur and so I need a test case that
reproduces the recovery failure for deeper analysis....

> I was using the 2.6.38.6 kernel, and using xfs as a rootfs
> partition. After untaring the rootfs files on the xfs partition,
> and tried to reboot from the xfs, then the panic occasionally
> occurred.

Ramdisks don't persist over a reboot, so you must have had some
other way of reproducing the problem. Can you tell me how you
reproduced it on a ramdisk? Better yet, send me a script that
reproduces the problem?

> I hope I can provide the corrupted log for you, but probably I
> could not find it, since I fixed this bug a year ago. Recently
> when I do some clean-up on my code, I find this one, so I think I
> should return it back to the community.

Great to hear, but I'd really like it if you pushed fixes for XFS
bugs upstream the moment you find them. You'll get help triaging the
problem, determining that the fix is correct, etc, and that will
benefit your customers by providing them with a more robust product.
The community also benefits from getting bug fixes much faster....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-23 23:55       ` Dave Chinner
  0 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-23 23:55 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

On Sat, Feb 23, 2013 at 07:06:10AM +0000, Tony Lu wrote:
> >From: Dave Chinner [mailto:david@fromorbit.com]
> >On Fri, Feb 22, 2013 at 08:12:52AM +0000, Tony Lu wrote:
> >> I encountered the following panic when using xfs partitions as rootfs, which
> >> is due to the truncated log data read by xlog_bread_noalign(). We should
> >> extend the buffer by one extra log sector to ensure there's enough space to
> >> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> >> forgot to do in xlog_bread_noalign().
> >
> >We've never done that round up in xlog_bread_noalign(). It shouldn't
> >be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
> >fundamentally different things. That is, xlog_get_bp() is ensuring
> >the buffer is large enough for the upcoming IO that will be
> >requested, while xlog_bread_noalign() is simply ensuring what it is
> >passed is correctly aligned to device sector boundaries.
> 
> I set the sector size as 4096 when making the xfs filesystem.
> -sh-4.1# mkfs.xfs -s size=4096 -f /dev/sda3

.....

> In this case, xlog_bread_noalign() needs to do such round up and
> round down frequently. And it is used to ensure what it is passed
> is aligned to the log sector size, but not the device sector
> boundaries.

If you have a 4k sector device, then the log sector size is the same
as the physical device. Hence the log code assumes that if you have
a specific log sector size, it is operating on a device that has the
physical IO constraints of that sector size.

> >So, if you have to fudge an extra block for xlog_bread_noalign(),
> >that implies that what xlog_bread_noalign() was passed was
> >probably not correct. It also implies that you are using sector
> >sizes larger than 512 bytes, because that's the only time this
> >might matter. Put simply, this:
> 
> While debugging, I found when it crashed, the blk_no was not align
> to the log sector size and nnblks was aligned to the log sector
> size, which makes sense.

Actually, it doesn't. The log writes done by the kernel are supposed
to be aligned and padded to sector size, which means that we should
never see an unaligned block numbers when reading log buffer headers
back off disk. i.e. when you run mkfs.xfs -s size=4096, you end up
with a a log stripe unit of 4096 bytes, which means it pads
every write to 4096 byte boundaries rather than 512 byte boundaries.

> Starting XFS recovery on filesystem: ram0 (logdev: internal)

Ok, you're using a ramdisk and not a real 4k sector device. Hence it
won't fail if we do 512 byte aligned IO rather than 4k aligned IO.

> xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
> xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
> xlog_bread_noalign--before round down/up: blk_no=0xf4e,nbblks=0x3f
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x40
> XFS: xlog_recover_process_data: bad clientid
>
> For example, if xlog_bread_noalign() wants to read blocks from #1
> to # 9, in which case the passed parameter blk_no is 1, and nbblks
> is 8, sectBBsize is 8, after the round down and round up
> operations, we get blk_no as 0, and nbblks as still 8. We
> definitely lose the last block of the log data.

Yes, I fully understand that. But I also understand how the log
works and that this behaviour *should not happen*. That's why
I'm asking questions about what the problem you are trying to fix.

The issue here is that the log buffer write was not aligned to the
underlying sector size. That is, what we see here is a header block
read, followed by the log buffer data read. The header size is
determined by the iclogbuf size - a 512 byte block implies default
32k iclogbuf size - and the following data region read of 63 blocks
also indicates a 32k iclogbuf size.

IOWs, what we have here is a 32k log buffer write apparently at a
sector-unaligned block address (0xf4d = 3917 which is not a multiple
of 8). This is why log recovery went wrong: a fundamental
architectural assumption the log is built around has somehow been
violated.

That is, the log recovery failure does not appear to be a problem
with the sector alignment done by xlog_bread_noalign() - it appears
to be a failure with the alignment of log buffer IO written to the
log. That's a far more serious problem than a log recovery problem,
but I can't see how that could occur and so I need a test case that
reproduces the recovery failure for deeper analysis....

> I was using the 2.6.38.6 kernel, and using xfs as a rootfs
> partition. After untaring the rootfs files on the xfs partition,
> and tried to reboot from the xfs, then the panic occasionally
> occurred.

Ramdisks don't persist over a reboot, so you must have had some
other way of reproducing the problem. Can you tell me how you
reproduced it on a ramdisk? Better yet, send me a script that
reproduces the problem?

> I hope I can provide the corrupted log for you, but probably I
> could not find it, since I fixed this bug a year ago. Recently
> when I do some clean-up on my code, I find this one, so I think I
> should return it back to the community.

Great to hear, but I'd really like it if you pushed fixes for XFS
bugs upstream the moment you find them. You'll get help triaging the
problem, determining that the fix is correct, etc, and that will
benefit your customers by providing them with a more robust product.
The community also benefits from getting bug fixes much faster....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-23 23:55       ` Dave Chinner
@ 2013-02-24  4:46         ` Tony Lu
  -1 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-24  4:46 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Ben Myers, xfs, Alex Elder, Dave Chinner, linux-fsdevel,
	linux-kernel, Chris Metcalf

>> For example, if xlog_bread_noalign() wants to read blocks from #1
>> to # 9, in which case the passed parameter blk_no is 1, and nbblks
>> is 8, sectBBsize is 8, after the round down and round up
>> operations, we get blk_no as 0, and nbblks as still 8. We
>> definitely lose the last block of the log data.
>
>Yes, I fully understand that. But I also understand how the log
>works and that this behaviour *should not happen*. That's why
>I'm asking questions about what the problem you are trying to fix.

I am not sure about this, since I saw many reads on non-sector-align blocks even when successfully mounting good XFS partitions. 
-sh-4.1# mount /dev/sda3 /home/
XFS (sda3): Mounting Filesystem
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=61447,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
.......
xlog_bread_noalign:blk_no=8695,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=4600,nbblks=4096,l_sectBBsize=8
xlog_bread_noalign:blk_no=8184,nbblks=512,l_sectBBsize=8

And also there is code in xlog_write_log_records() which handles non-sector-align reads and writes.

	/* We may need to do a read at the start to fill in part of
	 * the buffer in the starting sector not covered by the first
	 * write below.
	 */
	balign = round_down(start_block, sectbb);
	if (balign != start_block) {
		error = xlog_bread_noalign(log, start_block, 1, bp);
		if (error)
			goto out_put_bp;

		j = start_block - balign;
	}

>Ramdisks don't persist over a reboot, so you must have had some
>other way of reproducing the problem. Can you tell me how you
>reproduced it on a ramdisk? Better yet, send me a script that
>reproduces the problem?

I will try to reproduce it. Basically it is a loop of mount, creating many files and unmount.

Thanks
-Tony

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-24  4:46         ` Tony Lu
  0 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-24  4:46 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

>> For example, if xlog_bread_noalign() wants to read blocks from #1
>> to # 9, in which case the passed parameter blk_no is 1, and nbblks
>> is 8, sectBBsize is 8, after the round down and round up
>> operations, we get blk_no as 0, and nbblks as still 8. We
>> definitely lose the last block of the log data.
>
>Yes, I fully understand that. But I also understand how the log
>works and that this behaviour *should not happen*. That's why
>I'm asking questions about what the problem you are trying to fix.

I am not sure about this, since I saw many reads on non-sector-align blocks even when successfully mounting good XFS partitions. 
-sh-4.1# mount /dev/sda3 /home/
XFS (sda3): Mounting Filesystem
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=61447,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
.......
xlog_bread_noalign:blk_no=8695,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=4600,nbblks=4096,l_sectBBsize=8
xlog_bread_noalign:blk_no=8184,nbblks=512,l_sectBBsize=8

And also there is code in xlog_write_log_records() which handles non-sector-align reads and writes.

	/* We may need to do a read at the start to fill in part of
	 * the buffer in the starting sector not covered by the first
	 * write below.
	 */
	balign = round_down(start_block, sectbb);
	if (balign != start_block) {
		error = xlog_bread_noalign(log, start_block, 1, bp);
		if (error)
			goto out_put_bp;

		j = start_block - balign;
	}

>Ramdisks don't persist over a reboot, so you must have had some
>other way of reproducing the problem. Can you tell me how you
>reproduced it on a ramdisk? Better yet, send me a script that
>reproduces the problem?

I will try to reproduce it. Basically it is a loop of mount, creating many files and unmount.

Thanks
-Tony

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-24  4:46         ` Tony Lu
@ 2013-02-24 14:10           ` Dave Chinner
  -1 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-24 14:10 UTC (permalink / raw)
  To: Tony Lu
  Cc: Ben Myers, xfs, Alex Elder, Dave Chinner, linux-fsdevel,
	linux-kernel, Chris Metcalf

On Sun, Feb 24, 2013 at 04:46:30AM +0000, Tony Lu wrote:
> >> For example, if xlog_bread_noalign() wants to read blocks from #1
> >> to # 9, in which case the passed parameter blk_no is 1, and nbblks
> >> is 8, sectBBsize is 8, after the round down and round up
> >> operations, we get blk_no as 0, and nbblks as still 8. We
> >> definitely lose the last block of the log data.
> >
> >Yes, I fully understand that. But I also understand how the log
> >works and that this behaviour *should not happen*. That's why
> >I'm asking questions about what the problem you are trying to fix.
> 
> I am not sure about this, since I saw many reads on
> non-sector-align blocks even when successfully mounting good XFS
> partitions.

I didn't say that non-sector align reads should not be attempted by
log recovery - it's obvious from the on disk format of the log that
we have to parse it in chunks of 512 bytes to make sense of it's
contents, and that leads to the 512 byte reads and other subsequent
unaligned reads.

*However*

Seeing that there are unaligned reads occurring does not mean that
the structures in the log should be unaligned. Your test output
indicated a log record header at an unaligned block address, and
that's incorrect. It doesn't matter what the rest of the log
recovery code does with non-aligned IO - the fact is that your debug
implies that the contents of the log is corrupt and that implies a
deeper problem....

> And also there is code in xlog_write_log_records() which handles
> non-sector-align reads and writes.

Yes, it does handle it, but that doesn't mean that it is correct to
pass unaligned block ranges to it.

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-24 14:10           ` Dave Chinner
  0 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-24 14:10 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

On Sun, Feb 24, 2013 at 04:46:30AM +0000, Tony Lu wrote:
> >> For example, if xlog_bread_noalign() wants to read blocks from #1
> >> to # 9, in which case the passed parameter blk_no is 1, and nbblks
> >> is 8, sectBBsize is 8, after the round down and round up
> >> operations, we get blk_no as 0, and nbblks as still 8. We
> >> definitely lose the last block of the log data.
> >
> >Yes, I fully understand that. But I also understand how the log
> >works and that this behaviour *should not happen*. That's why
> >I'm asking questions about what the problem you are trying to fix.
> 
> I am not sure about this, since I saw many reads on
> non-sector-align blocks even when successfully mounting good XFS
> partitions.

I didn't say that non-sector align reads should not be attempted by
log recovery - it's obvious from the on disk format of the log that
we have to parse it in chunks of 512 bytes to make sense of it's
contents, and that leads to the 512 byte reads and other subsequent
unaligned reads.

*However*

Seeing that there are unaligned reads occurring does not mean that
the structures in the log should be unaligned. Your test output
indicated a log record header at an unaligned block address, and
that's incorrect. It doesn't matter what the rest of the log
recovery code does with non-aligned IO - the fact is that your debug
implies that the contents of the log is corrupt and that implies a
deeper problem....

> And also there is code in xlog_write_log_records() which handles
> non-sector-align reads and writes.

Yes, it does handle it, but that doesn't mean that it is correct to
pass unaligned block ranges to it.

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-24 14:10           ` Dave Chinner
  (?)
@ 2013-02-26  7:28           ` Tony Lu
  2013-02-26 20:52               ` Dave Chinner
  2013-03-01 15:51               ` Mark Tinguely
  -1 siblings, 2 replies; 27+ messages in thread
From: Tony Lu @ 2013-02-26  7:28 UTC (permalink / raw)
  To: Dave Chinner, Ben Myers
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Dave Chinner,
	linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 4131 bytes --]

I get a reliable way to reproduce this bug. The logprint and metadump are attached.

Kernel version: 2.6.38.8
Mkfs.xfs version: xfsprogs-3.1.1
mkfs.xfs -s size=4096 /dev/sda1

Run the following mount-cp-umount script to reproduce:
#!/bin/sh
device=/dev/sda1
mount_point=/mnt
times=10

for ((num=1;num<=$times;num++))
do
        echo "$num mount $device $mount_point"
        mount $device $mount_point

        echo "cp -rf /bin $mount_point/$num"
        cp -rf /bin $mount_point/$num

        echo "$num umount $device $mount_point"
        umount $mount_point

#num=$(($num + 1))
done

After several times of mount/cp/umount, this xfs crashes, and the xfs partition can not be mounted any more. Here is the output of console.
-sh-4.1# ./umount-test 
1 mount /dev/sda1 /mnt
XFS mounting filesystem sda1
cp -rf /bin /mnt/1
1 umount /dev/sda1 /mnt
2 mount /dev/sda1 /mnt
XFS mounting filesystem sda1
cp -rf /bin /mnt/2
2 umount /dev/sda1 /mnt
3 mount /dev/sda1 /mnt
XFS mounting filesystem sda1
cp -rf /bin /mnt/3
3 umount /dev/sda1 /mnt
4 mount /dev/sda1 /mnt
XFS mounting filesystem sda1
cp -rf /bin /mnt/4
4 umount /dev/sda1 /mnt
5 mount /dev/sda1 /mnt
XFS mounting filesystem sda1
Starting XFS recovery on filesystem: sda1 (logdev: internal)
Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin /mnt/5
5 umount /dev/sda1 /mnt
6 mount /dev/sda1 /mnt

XFS mounting filesystem sda1
Starting XFS recovery on filesystem: sda1 (logdev: internal)
Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
cp -rf /bin /mnt/6
6 umount /dev/sda1 /mnt
7 mount /dev/sda1 /mnt

XFS mounting filesystem sda1
cp -rf /bin /mnt/7
7 umount /dev/sda1 /mnt
Interrupt
8 mount /dev/sda1 /mnt
XFS mounting filesystem sda1
Starting XFS recovery on filesystem: sda1 (logdev: internal)
XFS: xlog_recover_process_data: bad clientid
XFS: log mount/recovery failed: error 5
XFS: log mount failed

Thanks
-Tony
>-----Original Message-----
>From: Dave Chinner [mailto:david@fromorbit.com]
>Sent: Sunday, February 24, 2013 10:10 PM
>To: Tony Lu
>Cc: Ben Myers; xfs@oss.sgi.com; Alex Elder; Dave Chinner;
>linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; Chris Metcalf
>Subject: Re: [PATCH] xfs: Fix possible truncation of log data in
>xlog_bread_noalign()
>
>On Sun, Feb 24, 2013 at 04:46:30AM +0000, Tony Lu wrote:
>> >> For example, if xlog_bread_noalign() wants to read blocks from #1
>> >> to # 9, in which case the passed parameter blk_no is 1, and nbblks
>> >> is 8, sectBBsize is 8, after the round down and round up
>> >> operations, we get blk_no as 0, and nbblks as still 8. We
>> >> definitely lose the last block of the log data.
>> >
>> >Yes, I fully understand that. But I also understand how the log
>> >works and that this behaviour *should not happen*. That's why
>> >I'm asking questions about what the problem you are trying to fix.
>>
>> I am not sure about this, since I saw many reads on
>> non-sector-align blocks even when successfully mounting good XFS
>> partitions.
>
>I didn't say that non-sector align reads should not be attempted by
>log recovery - it's obvious from the on disk format of the log that
>we have to parse it in chunks of 512 bytes to make sense of it's
>contents, and that leads to the 512 byte reads and other subsequent
>unaligned reads.
>
>*However*
>
>Seeing that there are unaligned reads occurring does not mean that
>the structures in the log should be unaligned. Your test output
>indicated a log record header at an unaligned block address, and
>that's incorrect. It doesn't matter what the rest of the log
>recovery code does with non-aligned IO - the fact is that your debug
>implies that the contents of the log is corrupt and that implies a
>deeper problem....
>
>> And also there is code in xlog_write_log_records() which handles
>> non-sector-align reads and writes.
>
>Yes, it does handle it, but that doesn't mean that it is correct to
>pass unaligned block ranges to it.
>
>Cheers,
>
>Dave.
>
>--
>Dave Chinner
>david@fromorbit.com

[-- Attachment #2: metadump.tar.gz --]
[-- Type: application/x-gzip, Size: 539952 bytes --]

[-- Attachment #3: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-26  7:28           ` Tony Lu
@ 2013-02-26 20:52               ` Dave Chinner
  2013-03-01 15:51               ` Mark Tinguely
  1 sibling, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-26 20:52 UTC (permalink / raw)
  To: Tony Lu
  Cc: Ben Myers, xfs, Alex Elder, Dave Chinner, linux-fsdevel,
	linux-kernel, Chris Metcalf

On Tue, Feb 26, 2013 at 07:28:19AM +0000, Tony Lu wrote:
> I get a reliable way to reproduce this bug. The logprint and metadump are attached.
> 
> Kernel version: 2.6.38.8

This is important....

....

.... because this:

> 4 umount /dev/sda1 /mnt
> 5 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)

Indicates that the unmount record is either not being written, it is
being written when there log has not been fully flushed or log
recovery is not finding it. You need to copy out the log
first to determine what the state of the log is before you mount the
filesystem - that way if log recovery is run you can see whether it
was supposed to run. (i.e. a clean log should never run recovery,
and unmount should always leave a clean log).

Either way, I'm more than 10,000 iterations into a run of 100k
iterations of this script on 3.8.0, and I have not seen a single log
recovery attempt occur. That implies you are seeing a bug in 2.6.38
that has since been fixed. It would be a good idea for you to
upgrade the system to a 3.8 kernel and determine if you can still
reproduce the problem on your system - that way we'll know if the
bug really has been fixed or not....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-02-26 20:52               ` Dave Chinner
  0 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-02-26 20:52 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

On Tue, Feb 26, 2013 at 07:28:19AM +0000, Tony Lu wrote:
> I get a reliable way to reproduce this bug. The logprint and metadump are attached.
> 
> Kernel version: 2.6.38.8

This is important....

....

.... because this:

> 4 umount /dev/sda1 /mnt
> 5 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)

Indicates that the unmount record is either not being written, it is
being written when there log has not been fully flushed or log
recovery is not finding it. You need to copy out the log
first to determine what the state of the log is before you mount the
filesystem - that way if log recovery is run you can see whether it
was supposed to run. (i.e. a clean log should never run recovery,
and unmount should always leave a clean log).

Either way, I'm more than 10,000 iterations into a run of 100k
iterations of this script on 3.8.0, and I have not seen a single log
recovery attempt occur. That implies you are seeing a bug in 2.6.38
that has since been fixed. It would be a good idea for you to
upgrade the system to a 3.8 kernel and determine if you can still
reproduce the problem on your system - that way we'll know if the
bug really has been fixed or not....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-02-26  7:28           ` Tony Lu
@ 2013-03-01 15:51               ` Mark Tinguely
  2013-03-01 15:51               ` Mark Tinguely
  1 sibling, 0 replies; 27+ messages in thread
From: Mark Tinguely @ 2013-03-01 15:51 UTC (permalink / raw)
  To: Tony Lu
  Cc: Dave Chinner, Ben Myers, Alex Elder, linux-kernel, Chris Metcalf,
	xfs, Dave Chinner, linux-fsdevel

On 02/26/13 01:28, Tony Lu wrote:
> I get a reliable way to reproduce this bug. The logprint and metadump are attached.
>
> Kernel version: 2.6.38.8
> Mkfs.xfs version: xfsprogs-3.1.1
> mkfs.xfs -s size=4096 /dev/sda1
>
> Run the following mount-cp-umount script to reproduce:
> #!/bin/sh
> device=/dev/sda1
> mount_point=/mnt
> times=10
>
> for ((num=1;num<=$times;num++))
> do
>          echo "$num mount $device $mount_point"
>          mount $device $mount_point
>
>          echo "cp -rf /bin $mount_point/$num"
>          cp -rf /bin $mount_point/$num
>
>          echo "$num umount $device $mount_point"
>          umount $mount_point
>
> #num=$(($num + 1))
> done
>
> After several times of mount/cp/umount, this xfs crashes, and the xfs partition can not be mounted any more. Here is the output of console.
> -sh-4.1# ./umount-test
> 1 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/1
> 1 umount /dev/sda1 /mnt
> 2 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/2
> 2 umount /dev/sda1 /mnt
> 3 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/3
> 3 umount /dev/sda1 /mnt
> 4 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/4
> 4 umount /dev/sda1 /mnt
> 5 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin /mnt/5
> 5 umount /dev/sda1 /mnt
> 6 mount /dev/sda1 /mnt
>
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
> cp -rf /bin /mnt/6
> 6 umount /dev/sda1 /mnt
> 7 mount /dev/sda1 /mnt
>
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/7
> 7 umount /dev/sda1 /mnt
> Interrupt
> 8 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed
>
> Thanks
> -Tony

It works fine on a 2.6.32 machine I had sitting around - and I never 
required log recovery.

I think you need to answer Dave's question as to why is your unmounts 
are requiring recovery?

Are there errors in the /var/log/messages?

I downloaded the Linux 2.6.38.8 source and take a look if I can recreate 
the problem.

--Mark.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-03-01 15:51               ` Mark Tinguely
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Tinguely @ 2013-03-01 15:51 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

On 02/26/13 01:28, Tony Lu wrote:
> I get a reliable way to reproduce this bug. The logprint and metadump are attached.
>
> Kernel version: 2.6.38.8
> Mkfs.xfs version: xfsprogs-3.1.1
> mkfs.xfs -s size=4096 /dev/sda1
>
> Run the following mount-cp-umount script to reproduce:
> #!/bin/sh
> device=/dev/sda1
> mount_point=/mnt
> times=10
>
> for ((num=1;num<=$times;num++))
> do
>          echo "$num mount $device $mount_point"
>          mount $device $mount_point
>
>          echo "cp -rf /bin $mount_point/$num"
>          cp -rf /bin $mount_point/$num
>
>          echo "$num umount $device $mount_point"
>          umount $mount_point
>
> #num=$(($num + 1))
> done
>
> After several times of mount/cp/umount, this xfs crashes, and the xfs partition can not be mounted any more. Here is the output of console.
> -sh-4.1# ./umount-test
> 1 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/1
> 1 umount /dev/sda1 /mnt
> 2 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/2
> 2 umount /dev/sda1 /mnt
> 3 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/3
> 3 umount /dev/sda1 /mnt
> 4 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/4
> 4 umount /dev/sda1 /mnt
> 5 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin /mnt/5
> 5 umount /dev/sda1 /mnt
> 6 mount /dev/sda1 /mnt
>
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
> cp -rf /bin /mnt/6
> 6 umount /dev/sda1 /mnt
> 7 mount /dev/sda1 /mnt
>
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/7
> 7 umount /dev/sda1 /mnt
> Interrupt
> 8 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed
>
> Thanks
> -Tony

It works fine on a 2.6.32 machine I had sitting around - and I never 
required log recovery.

I think you need to answer Dave's question as to why is your unmounts 
are requiring recovery?

Are there errors in the /var/log/messages?

I downloaded the Linux 2.6.38.8 source and take a look if I can recreate 
the problem.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-03-01 15:51               ` Mark Tinguely
@ 2013-03-01 20:24                 ` Mark Tinguely
  -1 siblings, 0 replies; 27+ messages in thread
From: Mark Tinguely @ 2013-03-01 20:24 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

On 03/01/13 09:51, Mark Tinguely wrote:
> On 02/26/13 01:28, Tony Lu wrote:
>> I get a reliable way to reproduce this bug. The logprint and metadump
>> are attached.
>>
>> Kernel version: 2.6.38.8
>> Mkfs.xfs version: xfsprogs-3.1.1
>> mkfs.xfs -s size=4096 /dev/sda1
>>
>> Run the following mount-cp-umount script to reproduce:
>> #!/bin/sh
>> device=/dev/sda1
>> mount_point=/mnt
>> times=10
>>
>> for ((num=1;num<=$times;num++))
>> do
>> echo "$num mount $device $mount_point"
>> mount $device $mount_point
>>
>> echo "cp -rf /bin $mount_point/$num"
>> cp -rf /bin $mount_point/$num
>>
>> echo "$num umount $device $mount_point"
>> umount $mount_point
>>
>> #num=$(($num + 1))
>> done
>>
>> After several times of mount/cp/umount, this xfs crashes, and the xfs
>> partition can not be mounted any more. Here is the output of console.
>> -sh-4.1# ./umount-test
>> 1 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/1
>> 1 umount /dev/sda1 /mnt
>> 2 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/2
>> 2 umount /dev/sda1 /mnt
>> 3 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/3
>> 3 umount /dev/sda1 /mnt
>> 4 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/4
>> 4 umount /dev/sda1 /mnt
>> 5 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin
>> /mnt/5
>> 5 umount /dev/sda1 /mnt
>> 6 mount /dev/sda1 /mnt
>>
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
>> cp -rf /bin /mnt/6
>> 6 umount /dev/sda1 /mnt
>> 7 mount /dev/sda1 /mnt
>>
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/7
>> 7 umount /dev/sda1 /mnt
>> Interrupt
>> 8 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failed
>>
>> Thanks
>> -Tony
>
> It works fine on a 2.6.32 machine I had sitting around - and I never
> required log recovery.
>
> I think you need to answer Dave's question as to why is your unmounts
> are requiring recovery?
>
> Are there errors in the /var/log/messages?
>
> I downloaded the Linux 2.6.38.8 source and take a look if I can recreate
> the problem.
>
> --Mark.

I could not reproduce the problem on a vanilla install. XFS shutdown and 
remounted cleanly running your script (several iterations looping set to 
100).

I started fsstress on another XFS partition on the same disk to see if I 
could force a shutdown race. With CONFIG_XFS_DEBUG=y, I could trigger 
other ASSERTs on the fsstress partition so I never stayed up long enough 
to cause a shutdown race.

Not wanting to patch that version of Linux/XFS, I am bailing here. If 
you want to turn on the XFS debug it may point out why your filesystem 
is not shutting down cleanly.

--Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-03-01 20:24                 ` Mark Tinguely
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Tinguely @ 2013-03-01 20:24 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

On 03/01/13 09:51, Mark Tinguely wrote:
> On 02/26/13 01:28, Tony Lu wrote:
>> I get a reliable way to reproduce this bug. The logprint and metadump
>> are attached.
>>
>> Kernel version: 2.6.38.8
>> Mkfs.xfs version: xfsprogs-3.1.1
>> mkfs.xfs -s size=4096 /dev/sda1
>>
>> Run the following mount-cp-umount script to reproduce:
>> #!/bin/sh
>> device=/dev/sda1
>> mount_point=/mnt
>> times=10
>>
>> for ((num=1;num<=$times;num++))
>> do
>> echo "$num mount $device $mount_point"
>> mount $device $mount_point
>>
>> echo "cp -rf /bin $mount_point/$num"
>> cp -rf /bin $mount_point/$num
>>
>> echo "$num umount $device $mount_point"
>> umount $mount_point
>>
>> #num=$(($num + 1))
>> done
>>
>> After several times of mount/cp/umount, this xfs crashes, and the xfs
>> partition can not be mounted any more. Here is the output of console.
>> -sh-4.1# ./umount-test
>> 1 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/1
>> 1 umount /dev/sda1 /mnt
>> 2 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/2
>> 2 umount /dev/sda1 /mnt
>> 3 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/3
>> 3 umount /dev/sda1 /mnt
>> 4 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/4
>> 4 umount /dev/sda1 /mnt
>> 5 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin
>> /mnt/5
>> 5 umount /dev/sda1 /mnt
>> 6 mount /dev/sda1 /mnt
>>
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
>> cp -rf /bin /mnt/6
>> 6 umount /dev/sda1 /mnt
>> 7 mount /dev/sda1 /mnt
>>
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/7
>> 7 umount /dev/sda1 /mnt
>> Interrupt
>> 8 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failed
>>
>> Thanks
>> -Tony
>
> It works fine on a 2.6.32 machine I had sitting around - and I never
> required log recovery.
>
> I think you need to answer Dave's question as to why is your unmounts
> are requiring recovery?
>
> Are there errors in the /var/log/messages?
>
> I downloaded the Linux 2.6.38.8 source and take a look if I can recreate
> the problem.
>
> --Mark.

I could not reproduce the problem on a vanilla install. XFS shutdown and 
remounted cleanly running your script (several iterations looping set to 
100).

I started fsstress on another XFS partition on the same disk to see if I 
could force a shutdown race. With CONFIG_XFS_DEBUG=y, I could trigger 
other ASSERTs on the fsstress partition so I never stayed up long enough 
to cause a shutdown race.

Not wanting to patch that version of Linux/XFS, I am bailing here. If 
you want to turn on the XFS debug it may point out why your filesystem 
is not shutting down cleanly.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-03-01 20:24                 ` Mark Tinguely
@ 2013-03-04  8:32                   ` Tony Lu
  -1 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-03-04  8:32 UTC (permalink / raw)
  To: Mark Tinguely
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

Thanks for you following up.

My apologize that I just found that it is one change I made before that causes this problem. This change forces mkfs.xfs to format xfs partitions whose sectorsize were not smaller than 4096 bytes, which was due to a bug that earlier versions of xfs used (struct *page)->private(long) as a bitmap to represent each block's state within a page (the size of a page could be 64K or larger, then it needs 128 bit or more to represent each block's state within a page).

This is reproducible on 2.6.38.6 kernel on X86. But I do not get why this change makes the xfs log inconsistent during mount/cp/umount operations.

diff -dur xfsprogs-3.1.4.ori/include/xfs_alloc_btree.h xfsprogs-3.1.4/include/xfs_alloc_btree.h
--- xfsprogs-3.1.4.ori/include/xfs_alloc_btree.h        2010-01-30 03:46:13.000000000 +0800
+++ xfsprogs-3.1.4/include/xfs_alloc_btree.h    2013-03-04 16:11:41.000000000 +0800
@@ -59,7 +59,7 @@
 #define XFS_MAX_BLOCKSIZE_LOG  16      /* i.e. 65536 bytes */
 #define XFS_MIN_BLOCKSIZE      (1 << XFS_MIN_BLOCKSIZE_LOG)
 #define XFS_MAX_BLOCKSIZE      (1 << XFS_MAX_BLOCKSIZE_LOG)
-#define XFS_MIN_SECTORSIZE_LOG 9       /* i.e. 512 bytes */
+#define XFS_MIN_SECTORSIZE_LOG 12      /* i.e. 512 bytes */
 #define XFS_MAX_SECTORSIZE_LOG 15      /* i.e. 32768 bytes */
 #define XFS_MIN_SECTORSIZE     (1 << XFS_MIN_SECTORSIZE_LOG)
 #define XFS_MAX_SECTORSIZE     (1 << XFS_MAX_SECTORSIZE_LOG)

Thanks
-Tony

>-----Original Message-----
>From: Mark Tinguely [mailto:tinguely@sgi.com]
>Sent: Saturday, March 02, 2013 4:24 AM
>To: Tony Lu
>Cc: Alex Elder; linux-kernel@vger.kernel.org; Chris Metcalf; xfs@oss.sgi.com;
>Ben Myers; Dave Chinner; linux-fsdevel@vger.kernel.org
>Subject: Re: [PATCH] xfs: Fix possible truncation of log data in
>xlog_bread_noalign()
>
>On 03/01/13 09:51, Mark Tinguely wrote:
>> On 02/26/13 01:28, Tony Lu wrote:
>>> I get a reliable way to reproduce this bug. The logprint and metadump
>>> are attached.
>>>
>>> Kernel version: 2.6.38.8
>>> Mkfs.xfs version: xfsprogs-3.1.1
>>> mkfs.xfs -s size=4096 /dev/sda1
>>>
>>> Run the following mount-cp-umount script to reproduce:
>>> #!/bin/sh
>>> device=/dev/sda1
>>> mount_point=/mnt
>>> times=10
>>>
>>> for ((num=1;num<=$times;num++))
>>> do
>>> echo "$num mount $device $mount_point"
>>> mount $device $mount_point
>>>
>>> echo "cp -rf /bin $mount_point/$num"
>>> cp -rf /bin $mount_point/$num
>>>
>>> echo "$num umount $device $mount_point"
>>> umount $mount_point
>>>
>>> #num=$(($num + 1))
>>> done
>>>
>>> After several times of mount/cp/umount, this xfs crashes, and the xfs
>>> partition can not be mounted any more. Here is the output of console.
>>> -sh-4.1# ./umount-test
>>> 1 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/1
>>> 1 umount /dev/sda1 /mnt
>>> 2 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/2
>>> 2 umount /dev/sda1 /mnt
>>> 3 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/3
>>> 3 umount /dev/sda1 /mnt
>>> 4 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/4
>>> 4 umount /dev/sda1 /mnt
>>> 5 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin
>>> /mnt/5
>>> 5 umount /dev/sda1 /mnt
>>> 6 mount /dev/sda1 /mnt
>>>
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
>>> cp -rf /bin /mnt/6
>>> 6 umount /dev/sda1 /mnt
>>> 7 mount /dev/sda1 /mnt
>>>
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/7
>>> 7 umount /dev/sda1 /mnt
>>> Interrupt
>>> 8 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> XFS: xlog_recover_process_data: bad clientid
>>> XFS: log mount/recovery failed: error 5
>>> XFS: log mount failed
>>>
>>> Thanks
>>> -Tony
>>
>> It works fine on a 2.6.32 machine I had sitting around - and I never
>> required log recovery.
>>
>> I think you need to answer Dave's question as to why is your unmounts
>> are requiring recovery?
>>
>> Are there errors in the /var/log/messages?
>>
>> I downloaded the Linux 2.6.38.8 source and take a look if I can recreate
>> the problem.
>>
>> --Mark.
>
>I could not reproduce the problem on a vanilla install. XFS shutdown and
>remounted cleanly running your script (several iterations looping set to
>100).
>
>I started fsstress on another XFS partition on the same disk to see if I
>could force a shutdown race. With CONFIG_XFS_DEBUG=y, I could trigger
>other ASSERTs on the fsstress partition so I never stayed up long enough
>to cause a shutdown race.
>
>Not wanting to patch that version of Linux/XFS, I am bailing here. If
>you want to turn on the XFS debug it may point out why your filesystem
>is not shutting down cleanly.
>
>--Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-03-04  8:32                   ` Tony Lu
  0 siblings, 0 replies; 27+ messages in thread
From: Tony Lu @ 2013-03-04  8:32 UTC (permalink / raw)
  To: Mark Tinguely
  Cc: Alex Elder, linux-kernel, Chris Metcalf, xfs, Ben Myers,
	Dave Chinner, linux-fsdevel

Thanks for you following up.

My apologize that I just found that it is one change I made before that causes this problem. This change forces mkfs.xfs to format xfs partitions whose sectorsize were not smaller than 4096 bytes, which was due to a bug that earlier versions of xfs used (struct *page)->private(long) as a bitmap to represent each block's state within a page (the size of a page could be 64K or larger, then it needs 128 bit or more to represent each block's state within a page).

This is reproducible on 2.6.38.6 kernel on X86. But I do not get why this change makes the xfs log inconsistent during mount/cp/umount operations.

diff -dur xfsprogs-3.1.4.ori/include/xfs_alloc_btree.h xfsprogs-3.1.4/include/xfs_alloc_btree.h
--- xfsprogs-3.1.4.ori/include/xfs_alloc_btree.h        2010-01-30 03:46:13.000000000 +0800
+++ xfsprogs-3.1.4/include/xfs_alloc_btree.h    2013-03-04 16:11:41.000000000 +0800
@@ -59,7 +59,7 @@
 #define XFS_MAX_BLOCKSIZE_LOG  16      /* i.e. 65536 bytes */
 #define XFS_MIN_BLOCKSIZE      (1 << XFS_MIN_BLOCKSIZE_LOG)
 #define XFS_MAX_BLOCKSIZE      (1 << XFS_MAX_BLOCKSIZE_LOG)
-#define XFS_MIN_SECTORSIZE_LOG 9       /* i.e. 512 bytes */
+#define XFS_MIN_SECTORSIZE_LOG 12      /* i.e. 512 bytes */
 #define XFS_MAX_SECTORSIZE_LOG 15      /* i.e. 32768 bytes */
 #define XFS_MIN_SECTORSIZE     (1 << XFS_MIN_SECTORSIZE_LOG)
 #define XFS_MAX_SECTORSIZE     (1 << XFS_MAX_SECTORSIZE_LOG)

Thanks
-Tony

>-----Original Message-----
>From: Mark Tinguely [mailto:tinguely@sgi.com]
>Sent: Saturday, March 02, 2013 4:24 AM
>To: Tony Lu
>Cc: Alex Elder; linux-kernel@vger.kernel.org; Chris Metcalf; xfs@oss.sgi.com;
>Ben Myers; Dave Chinner; linux-fsdevel@vger.kernel.org
>Subject: Re: [PATCH] xfs: Fix possible truncation of log data in
>xlog_bread_noalign()
>
>On 03/01/13 09:51, Mark Tinguely wrote:
>> On 02/26/13 01:28, Tony Lu wrote:
>>> I get a reliable way to reproduce this bug. The logprint and metadump
>>> are attached.
>>>
>>> Kernel version: 2.6.38.8
>>> Mkfs.xfs version: xfsprogs-3.1.1
>>> mkfs.xfs -s size=4096 /dev/sda1
>>>
>>> Run the following mount-cp-umount script to reproduce:
>>> #!/bin/sh
>>> device=/dev/sda1
>>> mount_point=/mnt
>>> times=10
>>>
>>> for ((num=1;num<=$times;num++))
>>> do
>>> echo "$num mount $device $mount_point"
>>> mount $device $mount_point
>>>
>>> echo "cp -rf /bin $mount_point/$num"
>>> cp -rf /bin $mount_point/$num
>>>
>>> echo "$num umount $device $mount_point"
>>> umount $mount_point
>>>
>>> #num=$(($num + 1))
>>> done
>>>
>>> After several times of mount/cp/umount, this xfs crashes, and the xfs
>>> partition can not be mounted any more. Here is the output of console.
>>> -sh-4.1# ./umount-test
>>> 1 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/1
>>> 1 umount /dev/sda1 /mnt
>>> 2 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/2
>>> 2 umount /dev/sda1 /mnt
>>> 3 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/3
>>> 3 umount /dev/sda1 /mnt
>>> 4 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/4
>>> 4 umount /dev/sda1 /mnt
>>> 5 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin
>>> /mnt/5
>>> 5 umount /dev/sda1 /mnt
>>> 6 mount /dev/sda1 /mnt
>>>
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
>>> cp -rf /bin /mnt/6
>>> 6 umount /dev/sda1 /mnt
>>> 7 mount /dev/sda1 /mnt
>>>
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/7
>>> 7 umount /dev/sda1 /mnt
>>> Interrupt
>>> 8 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> XFS: xlog_recover_process_data: bad clientid
>>> XFS: log mount/recovery failed: error 5
>>> XFS: log mount failed
>>>
>>> Thanks
>>> -Tony
>>
>> It works fine on a 2.6.32 machine I had sitting around - and I never
>> required log recovery.
>>
>> I think you need to answer Dave's question as to why is your unmounts
>> are requiring recovery?
>>
>> Are there errors in the /var/log/messages?
>>
>> I downloaded the Linux 2.6.38.8 source and take a look if I can recreate
>> the problem.
>>
>> --Mark.
>
>I could not reproduce the problem on a vanilla install. XFS shutdown and
>remounted cleanly running your script (several iterations looping set to
>100).
>
>I started fsstress on another XFS partition on the same disk to see if I
>could force a shutdown race. With CONFIG_XFS_DEBUG=y, I could trigger
>other ASSERTs on the fsstress partition so I never stayed up long enough
>to cause a shutdown race.
>
>Not wanting to patch that version of Linux/XFS, I am bailing here. If
>you want to turn on the XFS debug it may point out why your filesystem
>is not shutting down cleanly.
>
>--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
  2013-03-04  8:32                   ` Tony Lu
@ 2013-03-04 21:03                     ` Dave Chinner
  -1 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-03-04 21:03 UTC (permalink / raw)
  To: Tony Lu
  Cc: Mark Tinguely, Alex Elder, linux-kernel, Chris Metcalf, xfs,
	Ben Myers, Dave Chinner, linux-fsdevel

On Mon, Mar 04, 2013 at 08:32:45AM +0000, Tony Lu wrote:
> Thanks for you following up.
> 
> My apologize that I just found that it is one change I made before
> that causes this problem. This change forces mkfs.xfs to format
> xfs partitions whose sectorsize were not smaller than 4096 bytes,
> which was due to a bug that earlier versions of xfs used (struct
> *page)->private(long) as a bitmap to represent each block's state
> within a page (the size of a page could be 64K or larger, then it
> needs 128 bit or more to represent each block's state within a
> page).

You do realise that bug doesn't affect x86-64 platforms as they
don't support 64k pages?

> This is reproducible on 2.6.38.6 kernel on X86. But I do not get
> why this change makes the xfs log inconsistent during
> mount/cp/umount operations.

Neither do I, and I don't care to look any further because the
problem is of your own making. In future, please check first that
the bug you are reporting is reproducable on a current upstream
kernel and userspace.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()
@ 2013-03-04 21:03                     ` Dave Chinner
  0 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2013-03-04 21:03 UTC (permalink / raw)
  To: Tony Lu
  Cc: Alex Elder, Mark Tinguely, linux-kernel, Chris Metcalf, xfs,
	Ben Myers, Dave Chinner, linux-fsdevel

On Mon, Mar 04, 2013 at 08:32:45AM +0000, Tony Lu wrote:
> Thanks for you following up.
> 
> My apologize that I just found that it is one change I made before
> that causes this problem. This change forces mkfs.xfs to format
> xfs partitions whose sectorsize were not smaller than 4096 bytes,
> which was due to a bug that earlier versions of xfs used (struct
> *page)->private(long) as a bitmap to represent each block's state
> within a page (the size of a page could be 64K or larger, then it
> needs 128 bit or more to represent each block's state within a
> page).

You do realise that bug doesn't affect x86-64 platforms as they
don't support 64k pages?

> This is reproducible on 2.6.38.6 kernel on X86. But I do not get
> why this change makes the xfs log inconsistent during
> mount/cp/umount operations.

Neither do I, and I don't care to look any further because the
problem is of your own making. In future, please check first that
the bug you are reporting is reproducable on a current upstream
kernel and userspace.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2013-03-04 21:03 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-22  8:12 [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign() Tony Lu
2013-02-22  8:12 ` Tony Lu
2013-02-22 19:14 ` Ben Myers
2013-02-22 19:14   ` Ben Myers
2013-02-23  8:32   ` Tony Lu
2013-02-23  8:32     ` Tony Lu
2013-02-23  0:08 ` Dave Chinner
2013-02-23  0:08   ` Dave Chinner
2013-02-23  7:06   ` Tony Lu
2013-02-23  7:06     ` Tony Lu
2013-02-23 23:55     ` Dave Chinner
2013-02-23 23:55       ` Dave Chinner
2013-02-24  4:46       ` Tony Lu
2013-02-24  4:46         ` Tony Lu
2013-02-24 14:10         ` Dave Chinner
2013-02-24 14:10           ` Dave Chinner
2013-02-26  7:28           ` Tony Lu
2013-02-26 20:52             ` Dave Chinner
2013-02-26 20:52               ` Dave Chinner
2013-03-01 15:51             ` Mark Tinguely
2013-03-01 15:51               ` Mark Tinguely
2013-03-01 20:24               ` Mark Tinguely
2013-03-01 20:24                 ` Mark Tinguely
2013-03-04  8:32                 ` Tony Lu
2013-03-04  8:32                   ` Tony Lu
2013-03-04 21:03                   ` Dave Chinner
2013-03-04 21:03                     ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.