linux-xfs.vger.kernel.org archive mirror
* [Bug 201453] New: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount
@ 2018-10-17 11:20 bugzilla-daemon
  2018-10-17 17:49 ` [Bug 201453] " bugzilla-daemon
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-10-17 11:20 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=201453

            Bug ID: 201453
           Summary: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad
                    (negative number) as agi freecount
           Product: File System
           Version: 2.5
    Kernel Version: v4.18
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@kernel-bugs.kernel.org
          Reporter: zlang@redhat.com
        Regression: No

Description of problem:
On s390x, I hit an xfs/490 failure (I can't reproduce it on x86_64). Debugging
manually, I found:

# mkfs.xfs -f -m finobt=0 /dev/loop1                                            
meta-data=/dev/loop1             isize=512    agcount=4, agsize=786496 blks     
         =                       sectsz=512   attr=2, projid32bit=1             
         =                       crc=1        finobt=0, sparse=1, rmapbt=0      
         =                       reflink=1
data     =                       bsize=4096   blocks=3145984, imaxpct=25        
         =                       sunit=0      swidth=0 blks                     
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1               
log      =internal log           bsize=4096   blocks=2560, version=2            
         =                       sectsz=512   sunit=0 blks, lazy-count=1        
realtime =none                   extsz=4096   blocks=0, rtextents=0             
# mount /dev/loop1 /mnt/testarea/scratch/                                       
# mkdir /mnt/testarea/scratch/dir                                               
# xfs_io -fc "pwrite 0 4096" -c fsync /mnt/testarea/scratch/dir/testfile        
wrote 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0000 sec (150.240 MiB/sec and 38461.5385 ops/sec)               
# stat -c %i /mnt/testarea/scratch/dir/testfile                                 
132
# umount /dev/loop1                                                             
# _scratch_xfs_db -c "convert inode 132 agno"                                   
0x0 (0)
# _scratch_xfs_get_metadata_field "recs[1].freecount" "agi 0" "addr root"       
-197
# xfs_db -c "agi 0" -c "addr root" -c "print recs[1]" /dev/loop1
recs[1] = [startino,holemask,count,freecount,free]
1:[128,0,64,-197,0xffffffffffffffe0]


Version-Release number of selected component (if applicable):
kernel 4.18
xfsprogs 4.19.0-rc0

How reproducible:
100% on s390x with loop device (at least from my testing)

Steps to Reproduce:
run xfs/490 on s390x

Actual results:
as above

Expected results:
test pass

Additional info:
I don't think this is a kernel problem; the negative number is probably not
what is actually on disk, because SCRATCH_DEV can still be mounted without errors:

[root@ibm-z-110 xfstests]# mount /dev/loop1 /mnt/testarea/scratch
[root@ibm-z-110 xfstests]# dmesg|tail
[ 7289.790976] XFS (loop1): Mounting V5 Filesystem
[ 7289.796601] XFS (loop1): Ending clean mount

And it's not reproducible on x86_64:
# xfs_db -c "agi 0" -c "addr root" -c "print recs[1]" /dev/loop1
recs[1] = [startino,holemask,count,freecount,free] 
1:[128,0,64,59,0xffffffffffffffe0]

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 201453] Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount
  2018-10-17 11:20 [Bug 201453] New: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount bugzilla-daemon
@ 2018-10-17 17:49 ` bugzilla-daemon
  2018-10-18  1:41 ` [Bug 201453] New: " Dave Chinner
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-10-17 17:49 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=201453

--- Comment #1 from Zorro Lang (zlang@redhat.com) ---
Created attachment 279077
  --> https://bugzilla.kernel.org/attachment.cgi?id=279077&action=edit
xfs metadump


* Re: [Bug 201453] New: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount
  2018-10-17 11:20 [Bug 201453] New: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount bugzilla-daemon
  2018-10-17 17:49 ` [Bug 201453] " bugzilla-daemon
@ 2018-10-18  1:41 ` Dave Chinner
  2018-10-18  1:41 ` [Bug 201453] " bugzilla-daemon
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2018-10-18  1:41 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-xfs

On Wed, Oct 17, 2018 at 11:20:42AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> # _scratch_xfs_get_metadata_field "recs[1].freecount" "agi 0" "addr root"       
> -197
> ]# xfs_db -c "agi 0" -c "addr root" -c "print recs[1]" /dev/loop1               
> recs[1] = [startino,holemask,count,freecount,free]
> 1:[128,0,64,-197,0xffffffffffffffe0]
.....
> And it's not reproducible on x86_64:
> # xfs_db -c "agi 0" -c "addr root" -c "print recs[1]" /dev/loop1
> recs[1] = [startino,holemask,count,freecount,free] 
> 1:[128,0,64,59,0xffffffffffffffe0]

-197 = -(256 - 59)

This looks like a sign extension problem in the xfs_db code. s390 is
a big endian system, right?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* [Bug 201453] Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount
  2018-10-17 11:20 [Bug 201453] New: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount bugzilla-daemon
                   ` (2 preceding siblings ...)
  2018-10-18  1:41 ` [Bug 201453] " bugzilla-daemon
@ 2018-10-18  1:50 ` bugzilla-daemon
  2018-10-18  6:25 ` bugzilla-daemon
  2018-10-24 20:10 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-10-18  1:50 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=201453

Eric Sandeen (sandeen@sandeen.net) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandeen@sandeen.net

--- Comment #3 from Eric Sandeen (sandeen@sandeen.net) ---
Yep, does this fix it?

diff --git a/db/btblock.c b/db/btblock.c
index cbd2990..5a5b061 100644
--- a/db/btblock.c
+++ b/db/btblock.c
@@ -513,7 +513,7 @@ const field_t       inobt_sprec_flds[] = {
        { "holemask", FLDT_UINT16X, OI(ROFF(ir_u.sp.ir_holemask)), C1, 0,
          TYP_NONE },
        { "count", FLDT_UINT8D, OI(ROFF(ir_u.sp.ir_count)), C1, 0, TYP_NONE },
-       { "freecount", FLDT_INT8D, OI(ROFF(ir_u.sp.ir_freecount)), C1, 0,
+       { "freecount", FLDT_UINT8D, OI(ROFF(ir_u.sp.ir_freecount)), C1, 0,
          TYP_NONE },
        { "free", FLDT_INOFREE, OI(ROFF(ir_free)), C1, 0, TYP_NONE },
        { NULL }


* [Bug 201453] Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount
  2018-10-17 11:20 [Bug 201453] New: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount bugzilla-daemon
                   ` (3 preceding siblings ...)
  2018-10-18  1:50 ` bugzilla-daemon
@ 2018-10-18  6:25 ` bugzilla-daemon
  2018-10-24 20:10 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-10-18  6:25 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=201453

--- Comment #4 from Zorro Lang (zlang@redhat.com) ---
(In reply to Eric Sandeen from comment #3)
> Yep, does this fix it?

Yes, this helps.
# xfs_db -c "agi 0" -c "addr root" -c "print recs[1]" /dev/loop1
recs[1] = [startino,holemask,count,freecount,free]
1:[64,0,64,59,0xffffffffffffffe0]


> 
> diff --git a/db/btblock.c b/db/btblock.c
> index cbd2990..5a5b061 100644
> --- a/db/btblock.c
> +++ b/db/btblock.c
> @@ -513,7 +513,7 @@ const field_t       inobt_sprec_flds[] = {
>         { "holemask", FLDT_UINT16X, OI(ROFF(ir_u.sp.ir_holemask)), C1, 0,
>           TYP_NONE },
>         { "count", FLDT_UINT8D, OI(ROFF(ir_u.sp.ir_count)), C1, 0, TYP_NONE },
> -       { "freecount", FLDT_INT8D, OI(ROFF(ir_u.sp.ir_freecount)), C1, 0,
> +       { "freecount", FLDT_UINT8D, OI(ROFF(ir_u.sp.ir_freecount)), C1, 0,
>           TYP_NONE },
>         { "free", FLDT_INOFREE, OI(ROFF(ir_free)), C1, 0, TYP_NONE },
>         { NULL }


* [Bug 201453] Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount
  2018-10-17 11:20 [Bug 201453] New: Bug 1640090 - [xfstests xfs/490]: xfs_db print a bad (negative number) as agi freecount bugzilla-daemon
                   ` (4 preceding siblings ...)
  2018-10-18  6:25 ` bugzilla-daemon
@ 2018-10-24 20:10 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-10-24 20:10 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=201453

--- Comment #5 from Eric Sandeen (sandeen@sandeen.net) ---
Zorro correctly points out that big vs. little endianness certainly should
not matter for this u8.

What does matter is the signed type, because getbitval is doing tricks to try
to handle sign extension and it does it differently for big vs. little endian:

                if (getbit_l(p, bit + i)) {
                        /* If the last bit is on and we care about sign
                         * bits and we don't have a full 64 bit
                         * container, turn all bits on between the
                         * sign bit and the most sig bit.
                         */

                        /* handle endian swap here */
#if __BYTE_ORDER == LITTLE_ENDIAN
                        if (i == 0 && signext && nbits < 64)
                                rval = (~0ULL) << nbits;
                        rval |= 1ULL << (nbits - i - 1);
#else
                        if ((i == (nbits - 1)) && signext && nbits < 64)
                                rval |= ((~0ULL) << nbits);
                        rval |= 1ULL << (nbits - i - 1);
#endif

Switching it to FLDT_UINT8D makes "signext" false so none of this happens, but
that's papering over the underlying bug with signed types.

The bug seems to be the test for if ((i == (nbits - 1)) ...) - this is testing
the last / rightmost bit in the number, which is /not/ the MSB.

But I cannot seem to wrap my head around the right way to fix it, yet.

