All of lore.kernel.org
 help / color / mirror / Atom feed
* xfs: Uninitialized memory read at xlog_write
@ 2017-09-04 12:19 Tetsuo Handa
  2017-09-04 17:34 ` Darrick J. Wong
  2017-09-11 15:01 ` Brian Foster
  0 siblings, 2 replies; 14+ messages in thread
From: Tetsuo Handa @ 2017-09-04 12:19 UTC (permalink / raw)
  To: sandeen, dchinner; +Cc: linux-xfs

Hello.

I hit kmemcheck splat using linux-next-20170901.

[ 1360.354052] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffffa17673526960)
[ 1360.354053] 00000000000000006a1004080000000000000000000000000000000000000000
[ 1360.354059]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
[ 1360.354063]  ^
[ 1360.354080] RIP: 0010:xlog_write+0x378/0x890 [xfs]
[ 1360.354080] RSP: 0018:ffffafeac12b7cf0 EFLAGS: 00010202
[ 1360.354081] RAX: ffffafeac1808234 RBX: 0000000000000038 RCX: 000000000000000c
[ 1360.354082] RDX: 000000000000000e RSI: ffffa17673526960 RDI: ffffafeac180823c
[ 1360.354082] RBP: ffffafeac12b7d98 R08: 0000000000000000 R09: 0000000000000000
[ 1360.354083] R10: ffffa176737f8000 R11: 0000000000000000 R12: 0000000000000038
[ 1360.354083] R13: 0000000000000002 R14: 0000000000000000 R15: ffffa17673526900
[ 1360.354084] FS:  0000000000000000(0000) GS:ffffffffb0c30000(0000) knlGS:0000000000000000
[ 1360.354085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1360.354085] CR2: ffffa17674a66320 CR3: 0000000135c77003 CR4: 00000000000606f0
[ 1360.354133]  xlog_write+0x378/0x890 [xfs]
[ 1360.354142]  xlog_cil_push+0x1f4/0x370 [xfs]
[ 1360.354151]  xlog_cil_push_work+0xd/0x10 [xfs]
[ 1360.354207]  process_one_work+0x1cd/0x330
[ 1360.354208]  worker_thread+0x222/0x3e0
[ 1360.354209]  kthread+0xfe/0x130
[ 1360.354212]  ret_from_fork+0x25/0x30
[ 1360.354213]  0xffffffffffffffff

# ./scripts/faddr2line fs/xfs/xfs.ko xlog_write+0x378/0x890
xlog_write+0x378/0x890:
__inline_memcpy at /data/linux-next/./arch/x86/include/asm/string_64.h:13
 (inlined by) xlog_write at /data/linux-next/fs/xfs/xfs_log.c:2479

                        ASSERT(copy_len >= 0);
    3136:       0f 88 77 01 00 00       js     32b3 <xlog_write+0x4d3>
                        if (copy_len > 0) {
    313c:       0f 84 d6 02 00 00       je     3418 <xlog_write+0x638>
                                memcpy(ptr, reg->i_addr + copy_off, copy_len);
    3142:       48 63 da                movslq %edx,%rbx
    3145:       49 63 f6                movslq %r14d,%rsi
    3148:       48 89 c7                mov    %rax,%rdi
                     "1:\ttestb $1,%b4\n\t"
                     "je 2f\n\t"
                     "movsb\n"
                     "2:"
                     : "=&c" (d0), "=&D" (d1), "=&S" (d2)
                     : "0" (n / 4), "q" (n), "1" ((long)to), "2" ((long)from)
    314b:       48 89 da                mov    %rbx,%rdx
    314e:       48 03 31                add    (%rcx),%rsi
    3151:       48 c1 ea 02             shr    $0x2,%rdx
        asm volatile("rep ; movsl\n\t"
    3155:       48 89 d1                mov    %rdx,%rcx
    3158:       f3 a5                   rep movsl %ds:(%rsi),%es:(%rdi)    // <= xlog_write+0x378/0x890
    315a:       f6 c3 02                test   $0x2,%bl
    315d:       74 02                   je     3161 <xlog_write+0x381>
    315f:       66 a5                   movsw  %ds:(%rsi),%es:(%rdi)
    3161:       f6 c3 01                test   $0x1,%bl
    3164:       74 01                   je     3167 <xlog_write+0x387>
    3166:       a4                      movsb  %ds:(%rsi),%es:(%rdi)
    3167:       8b 55 c4                mov    -0x3c(%rbp),%edx
        *len -= bytes;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-04 12:19 xfs: Uninitialized memory read at xlog_write Tetsuo Handa
@ 2017-09-04 17:34 ` Darrick J. Wong
  2017-09-04 21:44   ` Tetsuo Handa
  2017-09-11 15:01 ` Brian Foster
  1 sibling, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-09-04 17:34 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: sandeen, dchinner, linux-xfs

On Mon, Sep 04, 2017 at 09:19:41PM +0900, Tetsuo Handa wrote:
> Hello.
> 
> I hit kmemcheck splat using linux-next-20170901.

What was running at the time?

Would be nice to know what log vector that was...

--D

> [ 1360.354052] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffffa17673526960)
> [ 1360.354053] 00000000000000006a1004080000000000000000000000000000000000000000
> [ 1360.354059]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> [ 1360.354063]  ^
> [ 1360.354080] RIP: 0010:xlog_write+0x378/0x890 [xfs]
> [ 1360.354080] RSP: 0018:ffffafeac12b7cf0 EFLAGS: 00010202
> [ 1360.354081] RAX: ffffafeac1808234 RBX: 0000000000000038 RCX: 000000000000000c
> [ 1360.354082] RDX: 000000000000000e RSI: ffffa17673526960 RDI: ffffafeac180823c
> [ 1360.354082] RBP: ffffafeac12b7d98 R08: 0000000000000000 R09: 0000000000000000
> [ 1360.354083] R10: ffffa176737f8000 R11: 0000000000000000 R12: 0000000000000038
> [ 1360.354083] R13: 0000000000000002 R14: 0000000000000000 R15: ffffa17673526900
> [ 1360.354084] FS:  0000000000000000(0000) GS:ffffffffb0c30000(0000) knlGS:0000000000000000
> [ 1360.354085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1360.354085] CR2: ffffa17674a66320 CR3: 0000000135c77003 CR4: 00000000000606f0
> [ 1360.354133]  xlog_write+0x378/0x890 [xfs]
> [ 1360.354142]  xlog_cil_push+0x1f4/0x370 [xfs]
> [ 1360.354151]  xlog_cil_push_work+0xd/0x10 [xfs]
> [ 1360.354207]  process_one_work+0x1cd/0x330
> [ 1360.354208]  worker_thread+0x222/0x3e0
> [ 1360.354209]  kthread+0xfe/0x130
> [ 1360.354212]  ret_from_fork+0x25/0x30
> [ 1360.354213]  0xffffffffffffffff
> 
> # ./scripts/faddr2line fs/xfs/xfs.ko xlog_write+0x378/0x890
> xlog_write+0x378/0x890:
> __inline_memcpy at /data/linux-next/./arch/x86/include/asm/string_64.h:13
>  (inlined by) xlog_write at /data/linux-next/fs/xfs/xfs_log.c:2479
> 
>                         ASSERT(copy_len >= 0);
>     3136:       0f 88 77 01 00 00       js     32b3 <xlog_write+0x4d3>
>                         if (copy_len > 0) {
>     313c:       0f 84 d6 02 00 00       je     3418 <xlog_write+0x638>
>                                 memcpy(ptr, reg->i_addr + copy_off, copy_len);
>     3142:       48 63 da                movslq %edx,%rbx
>     3145:       49 63 f6                movslq %r14d,%rsi
>     3148:       48 89 c7                mov    %rax,%rdi
>                      "1:\ttestb $1,%b4\n\t"
>                      "je 2f\n\t"
>                      "movsb\n"
>                      "2:"
>                      : "=&c" (d0), "=&D" (d1), "=&S" (d2)
>                      : "0" (n / 4), "q" (n), "1" ((long)to), "2" ((long)from)
>     314b:       48 89 da                mov    %rbx,%rdx
>     314e:       48 03 31                add    (%rcx),%rsi
>     3151:       48 c1 ea 02             shr    $0x2,%rdx
>         asm volatile("rep ; movsl\n\t"
>     3155:       48 89 d1                mov    %rdx,%rcx
>     3158:       f3 a5                   rep movsl %ds:(%rsi),%es:(%rdi)    // <= xlog_write+0x378/0x890
>     315a:       f6 c3 02                test   $0x2,%bl
>     315d:       74 02                   je     3161 <xlog_write+0x381>
>     315f:       66 a5                   movsw  %ds:(%rsi),%es:(%rdi)
>     3161:       f6 c3 01                test   $0x1,%bl
>     3164:       74 01                   je     3167 <xlog_write+0x387>
>     3166:       a4                      movsb  %ds:(%rsi),%es:(%rdi)
>     3167:       8b 55 c4                mov    -0x3c(%rbp),%edx
>         *len -= bytes;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-04 17:34 ` Darrick J. Wong
@ 2017-09-04 21:44   ` Tetsuo Handa
  0 siblings, 0 replies; 14+ messages in thread
From: Tetsuo Handa @ 2017-09-04 21:44 UTC (permalink / raw)
  To: darrick.wong; +Cc: sandeen, dchinner, linux-xfs

Darrick J. Wong wrote:
> On Mon, Sep 04, 2017 at 09:19:41PM +0900, Tetsuo Handa wrote:
> > Hello.
> > 
> > I hit kmemcheck splat using linux-next-20170901.
> 
> What was running at the time?
> 
> Would be nice to know what log vector that was...

Nothing special. Just booted CentOS 7 using KMEMCHECK=y linux-next-20170901 kernel.

> 
> --D
> 
> > [ 1360.354052] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffffa17673526960)
> > [ 1360.354053] 00000000000000006a1004080000000000000000000000000000000000000000
> > [ 1360.354059]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> > [ 1360.354063]  ^
> > [ 1360.354080] RIP: 0010:xlog_write+0x378/0x890 [xfs]
> > [ 1360.354080] RSP: 0018:ffffafeac12b7cf0 EFLAGS: 00010202
> > [ 1360.354081] RAX: ffffafeac1808234 RBX: 0000000000000038 RCX: 000000000000000c
> > [ 1360.354082] RDX: 000000000000000e RSI: ffffa17673526960 RDI: ffffafeac180823c
> > [ 1360.354082] RBP: ffffafeac12b7d98 R08: 0000000000000000 R09: 0000000000000000
> > [ 1360.354083] R10: ffffa176737f8000 R11: 0000000000000000 R12: 0000000000000038
> > [ 1360.354083] R13: 0000000000000002 R14: 0000000000000000 R15: ffffa17673526900
> > [ 1360.354084] FS:  0000000000000000(0000) GS:ffffffffb0c30000(0000) knlGS:0000000000000000
> > [ 1360.354085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 1360.354085] CR2: ffffa17674a66320 CR3: 0000000135c77003 CR4: 00000000000606f0
> > [ 1360.354133]  xlog_write+0x378/0x890 [xfs]
> > [ 1360.354142]  xlog_cil_push+0x1f4/0x370 [xfs]
> > [ 1360.354151]  xlog_cil_push_work+0xd/0x10 [xfs]
> > [ 1360.354207]  process_one_work+0x1cd/0x330
> > [ 1360.354208]  worker_thread+0x222/0x3e0
> > [ 1360.354209]  kthread+0xfe/0x130
> > [ 1360.354212]  ret_from_fork+0x25/0x30
> > [ 1360.354213]  0xffffffffffffffff
> > 
> > # ./scripts/faddr2line fs/xfs/xfs.ko xlog_write+0x378/0x890
> > xlog_write+0x378/0x890:
> > __inline_memcpy at /data/linux-next/./arch/x86/include/asm/string_64.h:13
> >  (inlined by) xlog_write at /data/linux-next/fs/xfs/xfs_log.c:2479
> > 
> >                         ASSERT(copy_len >= 0);
> >     3136:       0f 88 77 01 00 00       js     32b3 <xlog_write+0x4d3>
> >                         if (copy_len > 0) {
> >     313c:       0f 84 d6 02 00 00       je     3418 <xlog_write+0x638>
> >                                 memcpy(ptr, reg->i_addr + copy_off, copy_len);
> >     3142:       48 63 da                movslq %edx,%rbx
> >     3145:       49 63 f6                movslq %r14d,%rsi
> >     3148:       48 89 c7                mov    %rax,%rdi
> >                      "1:\ttestb $1,%b4\n\t"
> >                      "je 2f\n\t"
> >                      "movsb\n"
> >                      "2:"
> >                      : "=&c" (d0), "=&D" (d1), "=&S" (d2)
> >                      : "0" (n / 4), "q" (n), "1" ((long)to), "2" ((long)from)
> >     314b:       48 89 da                mov    %rbx,%rdx
> >     314e:       48 03 31                add    (%rcx),%rsi
> >     3151:       48 c1 ea 02             shr    $0x2,%rdx
> >         asm volatile("rep ; movsl\n\t"
> >     3155:       48 89 d1                mov    %rdx,%rcx
> >     3158:       f3 a5                   rep movsl %ds:(%rsi),%es:(%rdi)    // <= xlog_write+0x378/0x890
> >     315a:       f6 c3 02                test   $0x2,%bl
> >     315d:       74 02                   je     3161 <xlog_write+0x381>
> >     315f:       66 a5                   movsw  %ds:(%rsi),%es:(%rdi)
> >     3161:       f6 c3 01                test   $0x1,%bl
> >     3164:       74 01                   je     3167 <xlog_write+0x387>
> >     3166:       a4                      movsb  %ds:(%rsi),%es:(%rdi)
> >     3167:       8b 55 c4                mov    -0x3c(%rbp),%edx
> >         *len -= bytes;
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-04 12:19 xfs: Uninitialized memory read at xlog_write Tetsuo Handa
  2017-09-04 17:34 ` Darrick J. Wong
@ 2017-09-11 15:01 ` Brian Foster
  2017-09-13  7:14   ` Tetsuo Handa
  1 sibling, 1 reply; 14+ messages in thread
From: Brian Foster @ 2017-09-11 15:01 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: sandeen, dchinner, linux-xfs

On Mon, Sep 04, 2017 at 09:19:41PM +0900, Tetsuo Handa wrote:
> Hello.
> 
> I hit kmemcheck splat using linux-next-20170901.
> 
> [ 1360.354052] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffffa17673526960)
> [ 1360.354053] 00000000000000006a1004080000000000000000000000000000000000000000
> [ 1360.354059]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> [ 1360.354063]  ^
> [ 1360.354080] RIP: 0010:xlog_write+0x378/0x890 [xfs]
> [ 1360.354080] RSP: 0018:ffffafeac12b7cf0 EFLAGS: 00010202
> [ 1360.354081] RAX: ffffafeac1808234 RBX: 0000000000000038 RCX: 000000000000000c
> [ 1360.354082] RDX: 000000000000000e RSI: ffffa17673526960 RDI: ffffafeac180823c
> [ 1360.354082] RBP: ffffafeac12b7d98 R08: 0000000000000000 R09: 0000000000000000
> [ 1360.354083] R10: ffffa176737f8000 R11: 0000000000000000 R12: 0000000000000038
> [ 1360.354083] R13: 0000000000000002 R14: 0000000000000000 R15: ffffa17673526900
> [ 1360.354084] FS:  0000000000000000(0000) GS:ffffffffb0c30000(0000) knlGS:0000000000000000
> [ 1360.354085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1360.354085] CR2: ffffa17674a66320 CR3: 0000000135c77003 CR4: 00000000000606f0
> [ 1360.354133]  xlog_write+0x378/0x890 [xfs]
> [ 1360.354142]  xlog_cil_push+0x1f4/0x370 [xfs]
> [ 1360.354151]  xlog_cil_push_work+0xd/0x10 [xfs]
> [ 1360.354207]  process_one_work+0x1cd/0x330
> [ 1360.354208]  worker_thread+0x222/0x3e0
> [ 1360.354209]  kthread+0xfe/0x130
> [ 1360.354212]  ret_from_fork+0x25/0x30
> [ 1360.354213]  0xffffffffffffffff
> 

I threw a kmemcheck enabled kernel on a couple boxes out of curiosity.
I didn't reproduce anything related to XFS with the "allow partially
uninit." option enabled, even after running a quick group run in
xfstests. Without that option, I did see an xlog_write() splat on a
rhel7 boot but it was preceded by a constant stream of other reports
(__d_lookup(), link_path_walk(), etc.) and ultimately resulted in a
failed/hung bootup (after quite some time). I'm in the process of seeing
a similar result on a Fedora vm running for-next. This isn't really
debuggable in the current form due to the volume of preceding errors,
the required boot time and the fact that my machines ultimately fall
over.

Do you see different behavior? Do you have the "allow partially
uninitialized memory" option enabled or disabled? I wonder if there are
fixes for these other errors I'm seeing in linux-next that XFS for-next
hasn't picked up yet..?

Brian

> # ./scripts/faddr2line fs/xfs/xfs.ko xlog_write+0x378/0x890
> xlog_write+0x378/0x890:
> __inline_memcpy at /data/linux-next/./arch/x86/include/asm/string_64.h:13
>  (inlined by) xlog_write at /data/linux-next/fs/xfs/xfs_log.c:2479
> 
>                         ASSERT(copy_len >= 0);
>     3136:       0f 88 77 01 00 00       js     32b3 <xlog_write+0x4d3>
>                         if (copy_len > 0) {
>     313c:       0f 84 d6 02 00 00       je     3418 <xlog_write+0x638>
>                                 memcpy(ptr, reg->i_addr + copy_off, copy_len);
>     3142:       48 63 da                movslq %edx,%rbx
>     3145:       49 63 f6                movslq %r14d,%rsi
>     3148:       48 89 c7                mov    %rax,%rdi
>                      "1:\ttestb $1,%b4\n\t"
>                      "je 2f\n\t"
>                      "movsb\n"
>                      "2:"
>                      : "=&c" (d0), "=&D" (d1), "=&S" (d2)
>                      : "0" (n / 4), "q" (n), "1" ((long)to), "2" ((long)from)
>     314b:       48 89 da                mov    %rbx,%rdx
>     314e:       48 03 31                add    (%rcx),%rsi
>     3151:       48 c1 ea 02             shr    $0x2,%rdx
>         asm volatile("rep ; movsl\n\t"
>     3155:       48 89 d1                mov    %rdx,%rcx
>     3158:       f3 a5                   rep movsl %ds:(%rsi),%es:(%rdi)    // <= xlog_write+0x378/0x890
>     315a:       f6 c3 02                test   $0x2,%bl
>     315d:       74 02                   je     3161 <xlog_write+0x381>
>     315f:       66 a5                   movsw  %ds:(%rsi),%es:(%rdi)
>     3161:       f6 c3 01                test   $0x1,%bl
>     3164:       74 01                   je     3167 <xlog_write+0x387>
>     3166:       a4                      movsb  %ds:(%rsi),%es:(%rdi)
>     3167:       8b 55 c4                mov    -0x3c(%rbp),%edx
>         *len -= bytes;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-11 15:01 ` Brian Foster
@ 2017-09-13  7:14   ` Tetsuo Handa
  2017-09-13  9:43     ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2017-09-13  7:14 UTC (permalink / raw)
  To: bfoster; +Cc: sandeen, dchinner, linux-xfs

Brian Foster wrote:
> I threw a kmemcheck enabled kernel on a couple boxes out of curiosity.
> I didn't reproduce anything related to XFS with the "allow partially
> uninit." option enabled, even after running a quick group run in
> xfstests. Without that option, I did see an xlog_write() splat on a
> rhel7 boot but it was preceded by a constant stream of other reports
> (__d_lookup(), link_path_walk(), etc.) and ultimately resulted in a
> failed/hung bootup (after quite some time). I'm in the process of seeing
> a similar result on a Fedora vm running for-next. This isn't really
> debuggable in the current form due to the volume of preceding errors,
> the required boot time and the fact that my machines ultimately fall
> over.
> 
> Do you see different behavior? Do you have the "allow partially
> uninitialized memory" option enabled or disabled? I wonder if there are
> fixes for these other errors I'm seeing in linux-next that XFS for-next
> hasn't picked up yet..?

When I tested using linux-next-20170901, xlog_write() was the first splat
(after I workarounded take_dentry_name_snapshot() splat).
Today I retested using linux.git with a different VM and got a different result.

Here are console log and kernel config as of 6d8ef53e8b2fed8b ("Merge tag 'f2fs-for-4.14'
of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs") on linux.git .

http://I-love.SAKURA.ne.jp/tmp/serial-20170913.txt.xz
http://I-love.SAKURA.ne.jp/tmp/config-4.13-20170913


CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_KMEMCHECK=y
# CONFIG_KMEMCHECK_DISABLED_BY_DEFAULT is not set
CONFIG_KMEMCHECK_ENABLED_BY_DEFAULT=y
# CONFIG_KMEMCHECK_ONESHOT_BY_DEFAULT is not set
CONFIG_KMEMCHECK_QUEUE_SIZE=64
CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT=5
CONFIG_KMEMCHECK_PARTIAL_OK=y
# CONFIG_KMEMCHECK_BITOPS_OK is not set

----------
         Mounting Huge Pages File System...

[  OK  ] Stopped target Switch Root.

[  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
[ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
[ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
[ 1054.691515]  ^
[ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0
[ 1054.691519] RSP: 0018:ffff8801356d7cd8 EFLAGS: 00010202
[ 1054.691520] RAX: 0000000000000038 RBX: 0000000000000002 RCX: 000000000000000c
[ 1054.691520] RDX: 0000000000000038 RSI: ffff880135396660 RDI: ffffc9000180823c
[ 1054.691521] RBP: ffff8801356d7d70 R08: ffffc90001808234 R09: 0000000000000000
[ 1054.691521] R10: 0000000000000038 R11: 0000000000000000 R12: 0000000000000000
[ 1054.691521] R13: ffff880135396600 R14: 00000000000002b4 R15: ffff8801323f2000
[ 1054.691522] FS:  0000000000000000(0000) GS:ffff88013f400000(0000) knlGS:0000000000000000
[ 1054.691523] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1054.691524] CR2: ffff8801351a3c18 CR3: 000000013d5e4005 CR4: 00000000001606f0
[ 1054.691525]  xlog_write+0x344/0x6b0
[ 1054.691526]  xlog_cil_push+0x240/0x460
[ 1054.691527]  xlog_cil_push_work+0x10/0x20
[ 1054.691529]  process_one_work+0x121/0x2a0
[ 1054.691529]  worker_thread+0x1b7/0x390
[ 1054.691531]  kthread+0xff/0x140
[ 1054.691532]  ret_from_fork+0x22/0x30
[ 1054.691534]  0xffffffffffffffff


[  OK  ] Stopped target Initrd Root File System.
----------

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-13  7:14   ` Tetsuo Handa
@ 2017-09-13  9:43     ` Dave Chinner
  2017-09-13  9:59       ` Tetsuo Handa
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2017-09-13  9:43 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: bfoster, sandeen, dchinner, linux-xfs

On Wed, Sep 13, 2017 at 04:14:37PM +0900, Tetsuo Handa wrote:
> Brian Foster wrote:
> > I threw a kmemcheck enabled kernel on a couple boxes out of curiosity.
> > I didn't reproduce anything related to XFS with the "allow partially
> > uninit." option enabled, even after running a quick group run in
> > xfstests. Without that option, I did see an xlog_write() splat on a
> > rhel7 boot but it was preceded by a constant stream of other reports
> > (__d_lookup(), link_path_walk(), etc.) and ultimately resulted in a
> > failed/hung bootup (after quite some time). I'm in the process of seeing
> > a similar result on a Fedora vm running for-next. This isn't really
> > debuggable in the current form due to the volume of preceding errors,
> > the required boot time and the fact that my machines ultimately fall
> > over.
> > 
> > Do you see different behavior? Do you have the "allow partially
> > uninitialized memory" option enabled or disabled? I wonder if there are
> > fixes for these other errors I'm seeing in linux-next that XFS for-next
> > hasn't picked up yet..?
> 
> When I tested using linux-next-20170901, xlog_write() was the first splat
> (after I workarounded take_dentry_name_snapshot() splat).
> Today I retested using linux.git with a different VM and got a different result.
> 
> Here are console log and kernel config as of 6d8ef53e8b2fed8b ("Merge tag 'f2fs-for-4.14'
> of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs") on linux.git .
> 
> http://I-love.SAKURA.ne.jp/tmp/serial-20170913.txt.xz
> http://I-love.SAKURA.ne.jp/tmp/config-4.13-20170913
> 
> 
> CONFIG_HAVE_ARCH_KMEMCHECK=y
> CONFIG_KMEMCHECK=y
> # CONFIG_KMEMCHECK_DISABLED_BY_DEFAULT is not set
> CONFIG_KMEMCHECK_ENABLED_BY_DEFAULT=y
> # CONFIG_KMEMCHECK_ONESHOT_BY_DEFAULT is not set
> CONFIG_KMEMCHECK_QUEUE_SIZE=64
> CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT=5
> CONFIG_KMEMCHECK_PARTIAL_OK=y
> # CONFIG_KMEMCHECK_BITOPS_OK is not set
> 
> ----------
>          Mounting Huge Pages File System...
> 
> [  OK  ] Stopped target Switch Root.
> 
> [  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
> [ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
> [ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> [ 1054.691515]  ^
> [ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0

What line of code does this correspond to?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-13  9:43     ` Dave Chinner
@ 2017-09-13  9:59       ` Tetsuo Handa
  2017-09-13 21:40         ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2017-09-13  9:59 UTC (permalink / raw)
  To: david; +Cc: bfoster, sandeen, dchinner, linux-xfs

Dave Chinner wrote:
> On Wed, Sep 13, 2017 at 04:14:37PM +0900, Tetsuo Handa wrote:
> > Brian Foster wrote:
> > > I threw a kmemcheck enabled kernel on a couple boxes out of curiosity.
> > > I didn't reproduce anything related to XFS with the "allow partially
> > > uninit." option enabled, even after running a quick group run in
> > > xfstests. Without that option, I did see an xlog_write() splat on a
> > > rhel7 boot but it was preceded by a constant stream of other reports
> > > (__d_lookup(), link_path_walk(), etc.) and ultimately resulted in a
> > > failed/hung bootup (after quite some time). I'm in the process of seeing
> > > a similar result on a Fedora vm running for-next. This isn't really
> > > debuggable in the current form due to the volume of preceding errors,
> > > the required boot time and the fact that my machines ultimately fall
> > > over.
> > > 
> > > Do you see different behavior? Do you have the "allow partially
> > > uninitialized memory" option enabled or disabled? I wonder if there are
> > > fixes for these other errors I'm seeing in linux-next that XFS for-next
> > > hasn't picked up yet..?
> > 
> > When I tested using linux-next-20170901, xlog_write() was the first splat
> > (after I workarounded take_dentry_name_snapshot() splat).
> > Today I retested using linux.git with a different VM and got a different result.
> > 
> > Here are console log and kernel config as of 6d8ef53e8b2fed8b ("Merge tag 'f2fs-for-4.14'
> > of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs") on linux.git .
> > 
> > http://I-love.SAKURA.ne.jp/tmp/serial-20170913.txt.xz
> > http://I-love.SAKURA.ne.jp/tmp/config-4.13-20170913
> > 
> > 
> > CONFIG_HAVE_ARCH_KMEMCHECK=y
> > CONFIG_KMEMCHECK=y
> > # CONFIG_KMEMCHECK_DISABLED_BY_DEFAULT is not set
> > CONFIG_KMEMCHECK_ENABLED_BY_DEFAULT=y
> > # CONFIG_KMEMCHECK_ONESHOT_BY_DEFAULT is not set
> > CONFIG_KMEMCHECK_QUEUE_SIZE=64
> > CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT=5
> > CONFIG_KMEMCHECK_PARTIAL_OK=y
> > # CONFIG_KMEMCHECK_BITOPS_OK is not set
> > 
> > ----------
> >          Mounting Huge Pages File System...
> > 
> > [  OK  ] Stopped target Switch Root.
> > 
> > [  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
> > [ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
> > [ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> > [ 1054.691515]  ^
> > [ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0
> 
> What line of code does this correspond to?
> 

                        /*
                         * Copy region.
                         *
                         * Unmount records just log an opheader, so can have
                         * empty payloads with no data region to copy. Hence we
                         * only copy the payload if the vector says it has data
                         * to copy.
                         */
                        ASSERT(copy_len >= 0);
                        if (copy_len > 0) {
                                memcpy(ptr, reg->i_addr + copy_off, copy_len); // <= xlog_write+0x344/0x6b0
                                xlog_write_adv_cnt(&ptr, &len, &log_offset,
                                                   copy_len);
                        }

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-13  9:59       ` Tetsuo Handa
@ 2017-09-13 21:40         ` Dave Chinner
  2017-09-14 10:15           ` Tetsuo Handa
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2017-09-13 21:40 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: bfoster, sandeen, dchinner, linux-xfs

On Wed, Sep 13, 2017 at 06:59:38PM +0900, Tetsuo Handa wrote:
> Dave Chinner wrote:
> > On Wed, Sep 13, 2017 at 04:14:37PM +0900, Tetsuo Handa wrote:
> > > [  OK  ] Stopped target Switch Root.
> > > 
> > > [  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
> > > [ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
> > > [ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> > > [ 1054.691515]  ^
> > > [ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0
> > 
> > What line of code does this correspond to?
> > 
> 
>                         /*
>                          * Copy region.
>                          *
>                          * Unmount records just log an opheader, so can have
>                          * empty payloads with no data region to copy. Hence we
>                          * only copy the payload if the vector says it has data
>                          * to copy.
>                          */
>                         ASSERT(copy_len >= 0);
>                         if (copy_len > 0) {
>                                 memcpy(ptr, reg->i_addr + copy_off, copy_len); // <= xlog_write+0x344/0x6b0
>                                 xlog_write_adv_cnt(&ptr, &len, &log_offset,
>                                                    copy_len);
>                         }
> 

Ok, that's what I suspected. The region being copied is set up
in xlog_cil_insert_format_items(), so problem is in one of the
->iop_format methods it calls to format the dirty metadata into the
region.

And given that the address is ...6660, it's likely the offset into
the structure being copied is 96 bytes.

$ pahole...
.....
struct xfs_log_dinode {
.....
       xfs_agino_t                di_next_unlinked;     /*    96     4 */
.....

Try the patch below.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: log a dummy next_unlinked value

From: Dave Chinner <dchinner@redhat.com>

Prevent kmemcheck from throwing warnings about reading 32 bits of
uninitialised memory when formatting an inode region into the incore
log buffer by writing a dummy value of zero into the
di_next_unlinked field.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_inode_item.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 6d0f74ec31e8..67c59a6c8b7c 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -364,6 +364,9 @@ xfs_inode_to_log_dinode(
 	to->di_dmstate = from->di_dmstate;
 	to->di_flags = from->di_flags;
 
+	/* log a dummy value to ensure logged structure is fully initialised */
+	to->di_next_unlinked = NULLAGINO;
+
 	if (from->di_version == 3) {
 		to->di_changecount = inode->i_version;
 		to->di_crtime.t_sec = from->di_crtime.t_sec;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-13 21:40         ` Dave Chinner
@ 2017-09-14 10:15           ` Tetsuo Handa
  2017-09-14 22:44             ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2017-09-14 10:15 UTC (permalink / raw)
  To: david; +Cc: bfoster, sandeen, dchinner, linux-xfs

Dave Chinner wrote:
> On Wed, Sep 13, 2017 at 06:59:38PM +0900, Tetsuo Handa wrote:
> > Dave Chinner wrote:
> > > On Wed, Sep 13, 2017 at 04:14:37PM +0900, Tetsuo Handa wrote:
> > > > [  OK  ] Stopped target Switch Root.
> > > > 
> > > > [  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
> > > > [ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
> > > > [ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> > > > [ 1054.691515]  ^
> > > > [ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0
> > > 
> > > What line of code does this correspond to?
> > > 
> > 
> >                         /*
> >                          * Copy region.
> >                          *
> >                          * Unmount records just log an opheader, so can have
> >                          * empty payloads with no data region to copy. Hence we
> >                          * only copy the payload if the vector says it has data
> >                          * to copy.
> >                          */
> >                         ASSERT(copy_len >= 0);
> >                         if (copy_len > 0) {
> >                                 memcpy(ptr, reg->i_addr + copy_off, copy_len); // <= xlog_write+0x344/0x6b0
> >                                 xlog_write_adv_cnt(&ptr, &len, &log_offset,
> >                                                    copy_len);
> >                         }
> > 
> 
> Ok, that's what I suspected. The region being copied is set up
> in xlog_cil_insert_format_items(), so problem is in one of the
> ->iop_format methods it calls to format the dirty metadata into the
> region.
> 
> And given that the address is ...6660, it's likely the offset into
> the structure being copied is 96 bytes.
> 
> $ pahole...
> .....
> struct xfs_log_dinode {
> .....
>        xfs_agino_t                di_next_unlinked;     /*    96     4 */
> .....
> 
> Try the patch below.

That patch did not help.

I checked values passed to memcpy() using below patch.

----------
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index c5107c7..f91c4c7 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -2476,6 +2476,8 @@
 			 */
 			ASSERT(copy_len >= 0);
 			if (copy_len > 0) {
+				printk(KERN_INFO "ptr=%p reg->i_addr=%p copy_off=%u copy_len=%u\n",
+				       ptr, reg->i_addr, copy_off, copy_len);
 				memcpy(ptr, reg->i_addr + copy_off, copy_len);
 				xlog_write_adv_cnt(&ptr, &len, &log_offset,
 						   copy_len);
----------

The copy_len was not multiple of sizeof(struct xfs_log_dinode).
Thus, I guess we can't assume this is "struct xfs_log_dinode".

----------
         Starting Load/Save Random Seed...

         Starting Configure read-only root support...

[ 1106.927991] ptr=ffffc90001c08218 reg->i_addr=ffff880134c7fda8 copy_off=0 copy_len=16
[ 1106.928022] ptr=ffffc90001c08234 reg->i_addr=ffff88013395f858 copy_off=0 copy_len=56
[ 1106.928100] ptr=ffffc90001c08278 reg->i_addr=ffff88013395f890 copy_off=0 copy_len=96
[ 1106.932354] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88013395f860)
[ 1106.932355] 58f895330188ffff0c0e06040000000000000000000000000000000000000000
[ 1106.932362]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
[ 1106.932368]  ^
[ 1106.932432] RIP: 0010:xlog_write+0x69a/0x730
[ 1106.932433] RSP: 0018:ffff880134c7fcc8 EFLAGS: 00010282
[ 1106.932434] RAX: 0000000000000038 RBX: ffff88013395f800 RCX: 000000000000000c
[ 1106.932434] RDX: ffffc90001c08234 RSI: ffff88013395f860 RDI: ffffc90001c0823c
[ 1106.932435] RBP: ffff880134c7fd70 R08: 0000000000000000 R09: 0000000000000000
[ 1106.932435] R10: ffffffff81dedfd8 R11: 0000000000000000 R12: 0000000000000038
[ 1106.932436] R13: 0000000000000002 R14: 0000000000000000 R15: ffff88013487f000
[ 1106.932437] FS:  0000000000000000(0000) GS:ffff88013f400000(0000) knlGS:0000000000000000
[ 1106.932438] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1106.932461] CR2: ffff88013486e418 CR3: 00000001339c9004 CR4: 00000000000606f0
[ 1106.932465]  xlog_write+0x69a/0x730
[ 1106.932467]  xlog_cil_push+0x240/0x460
[ 1106.932468]  xlog_cil_push_work+0x10/0x20
[ 1106.932470]  process_one_work+0x121/0x2a0
[ 1106.932471]  worker_thread+0x1b7/0x390
[ 1106.932472]  kthread+0xff/0x140
[ 1106.932506]  ret_from_fork+0x22/0x30
[ 1106.932509]  0xffffffffffffffff
[ 1137.332948] ptr=ffffc90001c10218 reg->i_addr=ffff880133bbbda8 copy_off=0 copy_len=16
[ 1137.332976] ptr=ffffc90001c10234 reg->i_addr=ffff880135240258 copy_off=0 copy_len=24
[ 1137.333024] ptr=ffffc90001c10258 reg->i_addr=ffff880135240270 copy_off=0 copy_len=384
[ 1167.850472] ptr=ffffc90001c18218 reg->i_addr=ffff88013182fda8 copy_off=0 copy_len=16
[ 1167.850503] ptr=ffffc90001c18234 reg->i_addr=ffff880136614a58 copy_off=0 copy_len=24
[ 1167.850555] ptr=ffffc90001c18258 reg->i_addr=ffff880136614a70 copy_off=0 copy_len=384
         Starting udev Coldplug all Devices...

         Starting Create Static Device Nodes in /dev...
----------

----------
[ 1561.441679] ptr=ffffc90001c08218 reg->i_addr=ffff880134c7fda8 copy_off=0 copy_len=16
[ 1561.441708] ptr=ffffc90001c08234 reg->i_addr=ffff8801319a4058 copy_off=0 copy_len=24
[ 1561.441755] ptr=ffffc90001c08258 reg->i_addr=ffff8801319a4070 copy_off=0 copy_len=128
[ 1561.441881] ptr=ffffc90001c082e4 reg->i_addr=ffff880131452068 copy_off=0 copy_len=24
[ 1561.441928] ptr=ffffc90001c08308 reg->i_addr=ffff880131452080 copy_off=0 copy_len=128
[ 1561.442048] ptr=ffffc90001c08394 reg->i_addr=ffff880131452100 copy_off=0 copy_len=3840
[ 1561.448086] ptr=ffffc90001c092a0 reg->i_addr=ffff88013636d068 copy_off=0 copy_len=24
[ 1561.448134] ptr=ffffc90001c092c4 reg->i_addr=ffff88013636d080 copy_off=0 copy_len=128
[ 1561.448253] ptr=ffffc90001c09350 reg->i_addr=ffff88013636d100 copy_off=0 copy_len=3456
[ 1561.454000] ptr=ffffc90001c0a0dc reg->i_addr=ffff880135b7a868 copy_off=0 copy_len=56
[ 1561.454073] ptr=ffffc90001c0a120 reg->i_addr=ffff880135b7a8a0 copy_off=0 copy_len=96
[ 1561.454170] ptr=ffffc90001c0a18c reg->i_addr=ffff880135b7a900 copy_off=0 copy_len=16
[ 1561.455971] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135b7a874)
[ 1561.455973] 10000000070000003b12030005000000616310003a504e50e8ed120800000000
[ 1561.455979]  i i i i i i i i i i i i i i i i u u i i u u u u i i i i i i i i
[ 1561.455984]                                          ^
[ 1561.455989] RIP: 0010:xlog_write+0x69a/0x730
[ 1561.455989] RSP: 0018:ffff880134c7fcc8 EFLAGS: 00010282
[ 1561.455990] RAX: 0000000000000038 RBX: ffff880135b7a800 RCX: 000000000000000b
[ 1561.455991] RDX: ffffc90001c0a0dc RSI: ffff880135b7a874 RDI: ffffc90001c0a0e8
[ 1561.455991] RBP: ffff880134c7fd70 R08: 0000000000000000 R09: 0000000000000004
[ 1561.455992] R10: ffffffff81df2a14 R11: 0000000000000000 R12: 0000000000000038
[ 1561.455992] R13: 000000000000000a R14: 0000000000000000 R15: ffff88013487f170
[ 1561.455993] FS:  0000000000000000(0000) GS:ffff88013f400000(0000) knlGS:0000000000000000
[ 1561.455994] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1561.455997] CR2: ffff88013486e418 CR3: 00000001319fc005 CR4: 00000000000606f0
[ 1561.455999]  xlog_write+0x69a/0x730
[ 1561.456000]  xlog_cil_push+0x240/0x460
[ 1561.456002]  xlog_cil_push_work+0x10/0x20
[ 1561.456003]  process_one_work+0x121/0x2a0
[ 1561.456004]  worker_thread+0x1b7/0x390
[ 1561.456005]  kthread+0xff/0x140
[ 1561.456007]  ret_from_fork+0x22/0x30
[ 1561.456009]  0xffffffffffffffff
----------

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-14 10:15           ` Tetsuo Handa
@ 2017-09-14 22:44             ` Dave Chinner
  2017-09-15 11:19               ` Tetsuo Handa
  2017-09-18 18:21               ` Darrick J. Wong
  0 siblings, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2017-09-14 22:44 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: bfoster, sandeen, dchinner, linux-xfs

On Thu, Sep 14, 2017 at 07:15:38PM +0900, Tetsuo Handa wrote:
> Dave Chinner wrote:
> > On Wed, Sep 13, 2017 at 06:59:38PM +0900, Tetsuo Handa wrote:
> > > Dave Chinner wrote:
> > > > On Wed, Sep 13, 2017 at 04:14:37PM +0900, Tetsuo Handa wrote:
> > > > > [  OK  ] Stopped target Switch Root.
> > > > > 
> > > > > [  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
> > > > > [ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
> > > > > [ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> > > > > [ 1054.691515]  ^
> > > > > [ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0
> > > > 
> > > > What line of code does this correspond to?
> > > > 
> > > 
> > >                         /*
> > >                          * Copy region.
> > >                          *
> > >                          * Unmount records just log an opheader, so can have
> > >                          * empty payloads with no data region to copy. Hence we
> > >                          * only copy the payload if the vector says it has data
> > >                          * to copy.
> > >                          */
> > >                         ASSERT(copy_len >= 0);
> > >                         if (copy_len > 0) {
> > >                                 memcpy(ptr, reg->i_addr + copy_off, copy_len); // <= xlog_write+0x344/0x6b0
> > >                                 xlog_write_adv_cnt(&ptr, &len, &log_offset,
> > >                                                    copy_len);
> > >                         }
> > > 
> > 
> > Ok, that's what I suspected. The region being copied is set up
> > in xlog_cil_insert_format_items(), so problem is in one of the
> > ->iop_format methods it calls to format the dirty metadata into the
> > region.
> > 
> > And given that the address is ...6660, it's likely the offset into
> > the structure being copied is 96 bytes.
> > 
> > $ pahole...
> > .....
> > struct xfs_log_dinode {
> > .....
> >        xfs_agino_t                di_next_unlinked;     /*    96     4 */
> > .....
> > 
> > Try the patch below.
> 
> That patch did not help.
> 
> I checked values passed to memcpy() using below patch.

ok....

> 
> ----------
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index c5107c7..f91c4c7 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2476,6 +2476,8 @@
>  			 */
>  			ASSERT(copy_len >= 0);
>  			if (copy_len > 0) {
> +				printk(KERN_INFO "ptr=%p reg->i_addr=%p copy_off=%u copy_len=%u\n",
> +				       ptr, reg->i_addr, copy_off, copy_len);

You need to print out the reg->i_type here. Then we'll know exactly
where it came from.

>  				memcpy(ptr, reg->i_addr + copy_off, copy_len);
>  				xlog_write_adv_cnt(&ptr, &len, &log_offset,
>  						   copy_len);
> ----------
> 
> The copy_len was not multiple of sizeof(struct xfs_log_dinode).
> Thus, I guess we can't assume this is "struct xfs_log_dinode".

It still could be - we don't always log the entire structure.

static inline uint xfs_log_dinode_size(int version)
{
        if (version == 3)
		return sizeof(struct xfs_log_dinode);
	return offsetof(struct xfs_log_dinode, di_next_unlinked);
}

i.e. If it's v2 inode, the logged region is 96 bytes in length, not
176.

> 
> ----------
>          Starting Load/Save Random Seed...
> 
>          Starting Configure read-only root support...
> 
> [ 1106.927991] ptr=ffffc90001c08218 reg->i_addr=ffff880134c7fda8 copy_off=0 copy_len=16
> [ 1106.928022] ptr=ffffc90001c08234 reg->i_addr=ffff88013395f858 copy_off=0 copy_len=56
> [ 1106.928100] ptr=ffffc90001c08278 reg->i_addr=ffff88013395f890 copy_off=0 copy_len=96
> [ 1106.932354] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88013395f860)

Hold on- that warning has come from the /prior/ region copy, not
the current region!

i.e. 2nd last copy region was ffff88013395f858 for 0x38 bytes, and
this spans address that failed ffff88013395f860. i.e. it's 8 bytes
into that region. It's 48 bytes before the region that we were
copying when kmemcheck triggered!

I'm going to guess that we've got a log op header, followed by
a inode format header, follow by something like a v2 inode core.
i.e. log op headers are 16 bytes, xfs_inode_log_format are 56 bytes,
and a v2 inode core is 96 bytes.

So, the inode log format header:

struct xfs_inode_log_format {
        uint16_t                   ilf_type;             /*     0     2 */
        uint16_t                   ilf_size;             /*     2     2 */
        uint32_t                   ilf_fields;           /*     4     4 */
        uint16_t                   ilf_asize;            /*     8     2 */
        uint16_t                   ilf_dsize;            /*    10     2 */

        /* XXX 4 bytes hole, try to pack */

        uint64_t                   ilf_ino;              /*    16     8 */
        union {
                uint32_t           ilfu_rdev;            /*           4 */
                uuid_t             ilfu_uuid;            /*          16 */
        } ilf_u;                                         /*    24    16 */
        int64_t                    ilf_blkno;            /*    40     8 */
        int32_t                    ilf_len;              /*    48     4 */
        int32_t                    ilf_boffset;          /*    52     4 */

        /* size: 56, cachelines: 1, members: 10 */
        /* sum members: 52, holes: 1, sum holes: 4 */
        /* last cacheline: 56 bytes */
};

Oh, that's right. Someone, a long, long time ago, screwed up the on
disk inode log structure but nobody noticed it due to the Irix MIPS
compilers padding the structure identically on 32 and 64 bit
systems. Port to linux, and i386/x86-64 padded it differently. The
fix at the time was to make it right in recovery and continue to
use the native structure at runtime. Ok, let's fix that properly
now.

Try the patch below.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: Don't log uninitialised fields in inode structures

From: Dave Chinner <dchinner@redhat.com>

Prevent kmemcheck from throwing warnings about reading uninitialised
memory when formatting inodes into the incore log buffer. There are
several issues here - we don't always log all the fields in the
inode log format item, and we never log the inode the
di_next_unlinked field.

In the case of the inode log format item, this is aused b cerbated
by the old xfs_inode_log_format structure padding issue. Hence make
the padded, 64 bit aligned version of the structure the one we always
use for formatting the log and get rid of the 64 bit variant. This
means we'll always log the 64-bit version and so recovery only needs
to convert from the unpadded 32 bit version from older 32 bit
kernels.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_log_format.h | 27 +++++----------
 fs/xfs/xfs_inode_item.c        | 79 ++++++++++++++++++++++--------------------
 fs/xfs/xfs_ondisk.h            |  2 +-
 3 files changed, 50 insertions(+), 58 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 8372e9bcd7b6..71de185735e0 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -270,6 +270,7 @@ typedef struct xfs_inode_log_format {
 	uint32_t		ilf_fields;	/* flags for fields logged */
 	uint16_t		ilf_asize;	/* size of attr d/ext/root */
 	uint16_t		ilf_dsize;	/* size of data/ext/root */
+	uint32_t		ilf_pad;	/* pad for 64 bit boundary */
 	uint64_t		ilf_ino;	/* inode number */
 	union {
 		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
@@ -280,29 +281,17 @@ typedef struct xfs_inode_log_format {
 	int32_t			ilf_boffset;	/* off of inode in buffer */
 } xfs_inode_log_format_t;
 
-typedef struct xfs_inode_log_format_32 {
-	uint16_t		ilf_type;	/* inode log item type */
-	uint16_t		ilf_size;	/* size of this item */
-	uint32_t		ilf_fields;	/* flags for fields logged */
-	uint16_t		ilf_asize;	/* size of attr d/ext/root */
-	uint16_t		ilf_dsize;	/* size of data/ext/root */
-	uint64_t		ilf_ino;	/* inode number */
-	union {
-		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
-		uuid_t		ilfu_uuid;	/* mount point value */
-	} ilf_u;
-	int64_t			ilf_blkno;	/* blkno of inode buffer */
-	int32_t			ilf_len;	/* len of inode buffer */
-	int32_t			ilf_boffset;	/* off of inode in buffer */
-} __attribute__((packed)) xfs_inode_log_format_32_t;
-
-typedef struct xfs_inode_log_format_64 {
+/*
+ * Old 32 bit systems will log in this format without the 64 bit
+ * alignment padding. Recovery will detect this and convert it to the
+ * correct format.
+ */
+struct xfs_inode_log_format_32 {
 	uint16_t		ilf_type;	/* inode log item type */
 	uint16_t		ilf_size;	/* size of this item */
 	uint32_t		ilf_fields;	/* flags for fields logged */
 	uint16_t		ilf_asize;	/* size of attr d/ext/root */
 	uint16_t		ilf_dsize;	/* size of data/ext/root */
-	uint32_t		ilf_pad;	/* pad for 64 bit boundary */
 	uint64_t		ilf_ino;	/* inode number */
 	union {
 		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
@@ -311,7 +300,7 @@ typedef struct xfs_inode_log_format_64 {
 	int64_t			ilf_blkno;	/* blkno of inode buffer */
 	int32_t			ilf_len;	/* len of inode buffer */
 	int32_t			ilf_boffset;	/* off of inode in buffer */
-} xfs_inode_log_format_64_t;
+} __attribute__((packed));
 
 
 /*
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 6d0f74ec31e8..aec9cf36b5b7 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -364,6 +364,9 @@ xfs_inode_to_log_dinode(
 	to->di_dmstate = from->di_dmstate;
 	to->di_flags = from->di_flags;
 
+	/* log a dummy value to ensure log structure is fully initialised */
+	to->di_next_unlinked = NULLAGINO;
+
 	if (from->di_version == 3) {
 		to->di_changecount = inode->i_version;
 		to->di_crtime.t_sec = from->di_crtime.t_sec;
@@ -404,6 +407,11 @@ xfs_inode_item_format_core(
  * the second with the on-disk inode structure, and a possible third and/or
  * fourth with the inode data/extents/b-tree root and inode attributes
  * data/extents/b-tree root.
+ *
+ * Note: Always use the 64 bit inode log format structure so we don't
+ * leave an uninitialised hole in the format item on 64 bit systems. Log
+ * recovery on 32 bit systems handles this just fine, so there's no reason
+ * for not using an initialising the properly padded structure all the time.
  */
 STATIC void
 xfs_inode_item_format(
@@ -412,8 +420,8 @@ xfs_inode_item_format(
 {
 	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
 	struct xfs_inode	*ip = iip->ili_inode;
-	struct xfs_inode_log_format *ilf;
 	struct xfs_log_iovec	*vecp = NULL;
+	struct xfs_inode_log_format *ilf;
 
 	ASSERT(ip->i_d.di_version > 1);
 
@@ -425,7 +433,17 @@ xfs_inode_item_format(
 	ilf->ilf_boffset = ip->i_imap.im_boffset;
 	ilf->ilf_fields = XFS_ILOG_CORE;
 	ilf->ilf_size = 2; /* format + core */
-	xlog_finish_iovec(lv, vecp, sizeof(struct xfs_inode_log_format));
+
+	/*
+	 * make sure we don't leak uninitialised data into the log in the case
+	 * when we don't log every field in the inode.
+	 */
+	ilf->ilf_dsize = 0;
+	ilf->ilf_asize = 0;
+	ilf->ilf_pad = 0;
+	uuid_copy(&ilf->ilf_u.ilfu_uuid, &uuid_null);
+
+	xlog_finish_iovec(lv, vecp, sizeof(*ilf));
 
 	xfs_inode_item_format_core(ip, lv, &vecp);
 	xfs_inode_item_format_data_fork(iip, ilf, lv, &vecp);
@@ -855,44 +873,29 @@ xfs_istale_done(
 }
 
 /*
- * convert an xfs_inode_log_format struct from either 32 or 64 bit versions
- * (which can have different field alignments) to the native version
+ * convert an xfs_inode_log_format struct from the old 32 bit version
+ * (which can have different field alignments) to the native 64 bit version
  */
 int
 xfs_inode_item_format_convert(
-	xfs_log_iovec_t		*buf,
-	xfs_inode_log_format_t	*in_f)
+	struct xfs_log_iovec		*buf,
+	struct xfs_inode_log_format	*in_f)
 {
-	if (buf->i_len == sizeof(xfs_inode_log_format_32_t)) {
-		xfs_inode_log_format_32_t *in_f32 = buf->i_addr;
-
-		in_f->ilf_type = in_f32->ilf_type;
-		in_f->ilf_size = in_f32->ilf_size;
-		in_f->ilf_fields = in_f32->ilf_fields;
-		in_f->ilf_asize = in_f32->ilf_asize;
-		in_f->ilf_dsize = in_f32->ilf_dsize;
-		in_f->ilf_ino = in_f32->ilf_ino;
-		/* copy biggest field of ilf_u */
-		uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f32->ilf_u.ilfu_uuid);
-		in_f->ilf_blkno = in_f32->ilf_blkno;
-		in_f->ilf_len = in_f32->ilf_len;
-		in_f->ilf_boffset = in_f32->ilf_boffset;
-		return 0;
-	} else if (buf->i_len == sizeof(xfs_inode_log_format_64_t)){
-		xfs_inode_log_format_64_t *in_f64 = buf->i_addr;
-
-		in_f->ilf_type = in_f64->ilf_type;
-		in_f->ilf_size = in_f64->ilf_size;
-		in_f->ilf_fields = in_f64->ilf_fields;
-		in_f->ilf_asize = in_f64->ilf_asize;
-		in_f->ilf_dsize = in_f64->ilf_dsize;
-		in_f->ilf_ino = in_f64->ilf_ino;
-		/* copy biggest field of ilf_u */
-		uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f64->ilf_u.ilfu_uuid);
-		in_f->ilf_blkno = in_f64->ilf_blkno;
-		in_f->ilf_len = in_f64->ilf_len;
-		in_f->ilf_boffset = in_f64->ilf_boffset;
-		return 0;
-	}
-	return -EFSCORRUPTED;
+	struct xfs_inode_log_format_32	*in_f32 = buf->i_addr;
+
+	if (buf->i_len != sizeof(*in_f32))
+		return -EFSCORRUPTED;
+
+	in_f->ilf_type = in_f32->ilf_type;
+	in_f->ilf_size = in_f32->ilf_size;
+	in_f->ilf_fields = in_f32->ilf_fields;
+	in_f->ilf_asize = in_f32->ilf_asize;
+	in_f->ilf_dsize = in_f32->ilf_dsize;
+	in_f->ilf_ino = in_f32->ilf_ino;
+	/* copy biggest field of ilf_u */
+	uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f32->ilf_u.ilfu_uuid);
+	in_f->ilf_blkno = in_f32->ilf_blkno;
+	in_f->ilf_len = in_f32->ilf_len;
+	in_f->ilf_boffset = in_f32->ilf_boffset;
+	return 0;
 }
diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h
index 0c381d71b242..0492436a053f 100644
--- a/fs/xfs/xfs_ondisk.h
+++ b/fs/xfs/xfs_ondisk.h
@@ -134,7 +134,7 @@ xfs_check_ondisk_structs(void)
 	XFS_CHECK_STRUCT_SIZE(struct xfs_icreate_log,		28);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_ictimestamp,		8);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format_32,	52);
-	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format_64,	56);
+	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format,	56);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_qoff_logformat,	20);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_trans_header,		16);
 }

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-14 22:44             ` Dave Chinner
@ 2017-09-15 11:19               ` Tetsuo Handa
  2017-09-18 18:21               ` Darrick J. Wong
  1 sibling, 0 replies; 14+ messages in thread
From: Tetsuo Handa @ 2017-09-15 11:19 UTC (permalink / raw)
  To: david; +Cc: bfoster, sandeen, dchinner, linux-xfs

Dave Chinner wrote:
> On Thu, Sep 14, 2017 at 07:15:38PM +0900, Tetsuo Handa wrote:
> > 
> > ----------
> >          Starting Load/Save Random Seed...
> > 
> >          Starting Configure read-only root support...
> > 
> > [ 1106.927991] ptr=ffffc90001c08218 reg->i_addr=ffff880134c7fda8 copy_off=0 copy_len=16
> > [ 1106.928022] ptr=ffffc90001c08234 reg->i_addr=ffff88013395f858 copy_off=0 copy_len=56
> > [ 1106.928100] ptr=ffffc90001c08278 reg->i_addr=ffff88013395f890 copy_off=0 copy_len=96
> > [ 1106.932354] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88013395f860)
> 
> Hold on- that warning has come from the /prior/ region copy, not
> the current region!
> 
> i.e. 2nd last copy region was ffff88013395f858 for 0x38 bytes, and
> this spans address that failed ffff88013395f860. i.e. it's 8 bytes
> into that region. It's 48 bytes before the region that we were
> copying when kmemcheck triggered!
> 
> I'm going to guess that we've got a log op header, followed by
> a inode format header, follow by something like a v2 inode core.
> i.e. log op headers are 16 bytes, xfs_inode_log_format are 56 bytes,
> and a v2 inode core is 96 bytes.
> 
> So, the inode log format header:
> 
> struct xfs_inode_log_format {
>         uint16_t                   ilf_type;             /*     0     2 */
>         uint16_t                   ilf_size;             /*     2     2 */
>         uint32_t                   ilf_fields;           /*     4     4 */
>         uint16_t                   ilf_asize;            /*     8     2 */
>         uint16_t                   ilf_dsize;            /*    10     2 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         uint64_t                   ilf_ino;              /*    16     8 */
>         union {
>                 uint32_t           ilfu_rdev;            /*           4 */
>                 uuid_t             ilfu_uuid;            /*          16 */
>         } ilf_u;                                         /*    24    16 */
>         int64_t                    ilf_blkno;            /*    40     8 */
>         int32_t                    ilf_len;              /*    48     4 */
>         int32_t                    ilf_boffset;          /*    52     4 */
> 
>         /* size: 56, cachelines: 1, members: 10 */
>         /* sum members: 52, holes: 1, sum holes: 4 */
>         /* last cacheline: 56 bytes */
> };
> 
> Oh, that's right. Someone, a long, long time ago, screwed up the on
> disk inode log structure but nobody noticed it due to the Irix MIPS
> compilers padding the structure identically on 32 and 64 bit
> systems. Port to linux, and i386/x86-64 padded it differently. The
> fix at the time was to make it right in recovery and continue to
> use the native structure at runtime. Ok, let's fix that properly
> now.
> 
> Try the patch below.

Bingo! This patch fixes the XFS splat. Thank you.

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> xfs: Don't log uninitialised fields in inode structures
> 
> From: Dave Chinner <dchinner@redhat.com>
> 
> Prevent kmemcheck from throwing warnings about reading uninitialised
> memory when formatting inodes into the incore log buffer. There are
> several issues here - we don't always log all the fields in the
> inode log format item, and we never log the inode the
> di_next_unlinked field.
> 
> In the case of the inode log format item, this is aused b cerbated
> by the old xfs_inode_log_format structure padding issue. Hence make
> the padded, 64 bit aligned version of the structure the one we always
> use for formatting the log and get rid of the 64 bit variant. This
> means we'll always log the 64-bit version and so recovery only needs
> to convert from the unpadded 32 bit version from older 32 bit
> kernels.
> 
> Signed-Off-By: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_log_format.h | 27 +++++----------
>  fs/xfs/xfs_inode_item.c        | 79 ++++++++++++++++++++++--------------------
>  fs/xfs/xfs_ondisk.h            |  2 +-
>  3 files changed, 50 insertions(+), 58 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-14 22:44             ` Dave Chinner
  2017-09-15 11:19               ` Tetsuo Handa
@ 2017-09-18 18:21               ` Darrick J. Wong
  2017-09-20  0:46                 ` Dave Chinner
  1 sibling, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-09-18 18:21 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Tetsuo Handa, bfoster, sandeen, dchinner, linux-xfs

On Fri, Sep 15, 2017 at 08:44:31AM +1000, Dave Chinner wrote:
> On Thu, Sep 14, 2017 at 07:15:38PM +0900, Tetsuo Handa wrote:
> > Dave Chinner wrote:
> > > On Wed, Sep 13, 2017 at 06:59:38PM +0900, Tetsuo Handa wrote:
> > > > Dave Chinner wrote:
> > > > > On Wed, Sep 13, 2017 at 04:14:37PM +0900, Tetsuo Handa wrote:
> > > > > > [  OK  ] Stopped target Switch Root.
> > > > > > 
> > > > > > [  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
> > > > > > [ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
> > > > > > [ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> > > > > > [ 1054.691515]  ^
> > > > > > [ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0
> > > > > 
> > > > > What line of code does this correspond to?
> > > > > 
> > > > 
> > > >                         /*
> > > >                          * Copy region.
> > > >                          *
> > > >                          * Unmount records just log an opheader, so can have
> > > >                          * empty payloads with no data region to copy. Hence we
> > > >                          * only copy the payload if the vector says it has data
> > > >                          * to copy.
> > > >                          */
> > > >                         ASSERT(copy_len >= 0);
> > > >                         if (copy_len > 0) {
> > > >                                 memcpy(ptr, reg->i_addr + copy_off, copy_len); // <= xlog_write+0x344/0x6b0
> > > >                                 xlog_write_adv_cnt(&ptr, &len, &log_offset,
> > > >                                                    copy_len);
> > > >                         }
> > > > 
> > > 
> > > Ok, that's what I suspected. The region being copied is set up
> > > in xlog_cil_insert_format_items(), so problem is in one of the
> > > ->iop_format methods it calls to format the dirty metadata into the
> > > region.
> > > 
> > > And given that the address is ...6660, it's likely the offset into
> > > the structure being copied is 96 bytes.
> > > 
> > > $ pahole...
> > > .....
> > > struct xfs_log_dinode {
> > > .....
> > >        xfs_agino_t                di_next_unlinked;     /*    96     4 */
> > > .....
> > > 
> > > Try the patch below.
> > 
> > That patch did not help.
> > 
> > I checked values passed to memcpy() using below patch.
> 
> ok....
> 
> > 
> > ----------
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index c5107c7..f91c4c7 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -2476,6 +2476,8 @@
> >  			 */
> >  			ASSERT(copy_len >= 0);
> >  			if (copy_len > 0) {
> > +				printk(KERN_INFO "ptr=%p reg->i_addr=%p copy_off=%u copy_len=%u\n",
> > +				       ptr, reg->i_addr, copy_off, copy_len);
> 
> You need to print out the reg->i_type here. Then we'll know exactly
> where it came from.
> 
> >  				memcpy(ptr, reg->i_addr + copy_off, copy_len);
> >  				xlog_write_adv_cnt(&ptr, &len, &log_offset,
> >  						   copy_len);
> > ----------
> > 
> > The copy_len was not multiple of sizeof(struct xfs_log_dinode).
> > Thus, I guess we can't assume this is "struct xfs_log_dinode".
> 
> It still could be - we don't always log the entire structure.
> 
> static inline uint xfs_log_dinode_size(int version)
> {
>         if (version == 3)
> 		return sizeof(struct xfs_log_dinode);
> 	return offsetof(struct xfs_log_dinode, di_next_unlinked);
> }
> 
> i.e. If it's v2 inode, the logged region is 96 bytes in length, not
> 176.
> 
> > 
> > ----------
> >          Starting Load/Save Random Seed...
> > 
> >          Starting Configure read-only root support...
> > 
> > [ 1106.927991] ptr=ffffc90001c08218 reg->i_addr=ffff880134c7fda8 copy_off=0 copy_len=16
> > [ 1106.928022] ptr=ffffc90001c08234 reg->i_addr=ffff88013395f858 copy_off=0 copy_len=56
> > [ 1106.928100] ptr=ffffc90001c08278 reg->i_addr=ffff88013395f890 copy_off=0 copy_len=96
> > [ 1106.932354] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88013395f860)
> 
> Hold on- that warning has come from the /prior/ region copy, not
> the current region!
> 
> i.e. 2nd last copy region was ffff88013395f858 for 0x38 bytes, and
> this spans address that failed ffff88013395f860. i.e. it's 8 bytes
> into that region. It's 48 bytes before the region that we were
> copying when kmemcheck triggered!
> 
> I'm going to guess that we've got a log op header, followed by
> a inode format header, follow by something like a v2 inode core.
> i.e. log op headers are 16 bytes, xfs_inode_log_format are 56 bytes,
> and a v2 inode core is 96 bytes.
> 
> So, the inode log format header:
> 
> struct xfs_inode_log_format {
>         uint16_t                   ilf_type;             /*     0     2 */
>         uint16_t                   ilf_size;             /*     2     2 */
>         uint32_t                   ilf_fields;           /*     4     4 */
>         uint16_t                   ilf_asize;            /*     8     2 */
>         uint16_t                   ilf_dsize;            /*    10     2 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         uint64_t                   ilf_ino;              /*    16     8 */
>         union {
>                 uint32_t           ilfu_rdev;            /*           4 */
>                 uuid_t             ilfu_uuid;            /*          16 */
>         } ilf_u;                                         /*    24    16 */
>         int64_t                    ilf_blkno;            /*    40     8 */
>         int32_t                    ilf_len;              /*    48     4 */
>         int32_t                    ilf_boffset;          /*    52     4 */
> 
>         /* size: 56, cachelines: 1, members: 10 */
>         /* sum members: 52, holes: 1, sum holes: 4 */
>         /* last cacheline: 56 bytes */
> };
> 
> Oh, that's right. Someone, a long, long time ago, screwed up the on
> disk inode log structure but nobody noticed it due to the Irix MIPS
> compilers padding the structure identically on 32 and 64 bit
> systems. Port to linux, and i386/x86-64 padded it differently. The
> fix at the time was to make it right in recovery and continue to
> use the native structure at runtime. Ok, let's fix that properly
> now.
> 
> Try the patch below.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> xfs: Don't log uninitialised fields in inode structures
> 
> From: Dave Chinner <dchinner@redhat.com>
> 
> Prevent kmemcheck from throwing warnings about reading uninitialised
> memory when formatting inodes into the incore log buffer. There are
> several issues here - we don't always log all the fields in the
> inode log format item, and we never log the inode the
> di_next_unlinked field.
> 
> In the case of the inode log format item, this is aused b cerbated

"is aused b cerbated" ?

> by the old xfs_inode_log_format structure padding issue. Hence make
> the padded, 64 bit aligned version of the structure the one we always
> use for formatting the log and get rid of the 64 bit variant. This
> means we'll always log the 64-bit version and so recovery only needs
> to convert from the unpadded 32 bit version from older 32 bit
> kernels.

And those old 32-bit kernels can read the xfs_inode_log_format{,_64}
structures, right?

> Signed-Off-By: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_log_format.h | 27 +++++----------
>  fs/xfs/xfs_inode_item.c        | 79 ++++++++++++++++++++++--------------------
>  fs/xfs/xfs_ondisk.h            |  2 +-
>  3 files changed, 50 insertions(+), 58 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
> index 8372e9bcd7b6..71de185735e0 100644
> --- a/fs/xfs/libxfs/xfs_log_format.h
> +++ b/fs/xfs/libxfs/xfs_log_format.h
> @@ -270,6 +270,7 @@ typedef struct xfs_inode_log_format {
>  	uint32_t		ilf_fields;	/* flags for fields logged */
>  	uint16_t		ilf_asize;	/* size of attr d/ext/root */
>  	uint16_t		ilf_dsize;	/* size of data/ext/root */
> +	uint32_t		ilf_pad;	/* pad for 64 bit boundary */
>  	uint64_t		ilf_ino;	/* inode number */
>  	union {
>  		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
> @@ -280,29 +281,17 @@ typedef struct xfs_inode_log_format {
>  	int32_t			ilf_boffset;	/* off of inode in buffer */
>  } xfs_inode_log_format_t;
>  
> -typedef struct xfs_inode_log_format_32 {
> -	uint16_t		ilf_type;	/* inode log item type */
> -	uint16_t		ilf_size;	/* size of this item */
> -	uint32_t		ilf_fields;	/* flags for fields logged */
> -	uint16_t		ilf_asize;	/* size of attr d/ext/root */
> -	uint16_t		ilf_dsize;	/* size of data/ext/root */
> -	uint64_t		ilf_ino;	/* inode number */
> -	union {
> -		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
> -		uuid_t		ilfu_uuid;	/* mount point value */
> -	} ilf_u;
> -	int64_t			ilf_blkno;	/* blkno of inode buffer */
> -	int32_t			ilf_len;	/* len of inode buffer */
> -	int32_t			ilf_boffset;	/* off of inode in buffer */
> -} __attribute__((packed)) xfs_inode_log_format_32_t;
> -
> -typedef struct xfs_inode_log_format_64 {
> +/*
> + * Old 32 bit systems will log in this format without the 64 bit
> + * alignment padding. Recovery will detect this and convert it to the
> + * correct format.
> + */
> +struct xfs_inode_log_format_32 {
>  	uint16_t		ilf_type;	/* inode log item type */
>  	uint16_t		ilf_size;	/* size of this item */
>  	uint32_t		ilf_fields;	/* flags for fields logged */
>  	uint16_t		ilf_asize;	/* size of attr d/ext/root */
>  	uint16_t		ilf_dsize;	/* size of data/ext/root */
> -	uint32_t		ilf_pad;	/* pad for 64 bit boundary */
>  	uint64_t		ilf_ino;	/* inode number */
>  	union {
>  		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
> @@ -311,7 +300,7 @@ typedef struct xfs_inode_log_format_64 {
>  	int64_t			ilf_blkno;	/* blkno of inode buffer */
>  	int32_t			ilf_len;	/* len of inode buffer */
>  	int32_t			ilf_boffset;	/* off of inode in buffer */
> -} xfs_inode_log_format_64_t;
> +} __attribute__((packed));
>  
>  
>  /*
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index 6d0f74ec31e8..aec9cf36b5b7 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -364,6 +364,9 @@ xfs_inode_to_log_dinode(
>  	to->di_dmstate = from->di_dmstate;
>  	to->di_flags = from->di_flags;
>  
> +	/* log a dummy value to ensure log structure is fully initialised */
> +	to->di_next_unlinked = NULLAGINO;
> +
>  	if (from->di_version == 3) {
>  		to->di_changecount = inode->i_version;
>  		to->di_crtime.t_sec = from->di_crtime.t_sec;
> @@ -404,6 +407,11 @@ xfs_inode_item_format_core(
>   * the second with the on-disk inode structure, and a possible third and/or
>   * fourth with the inode data/extents/b-tree root and inode attributes
>   * data/extents/b-tree root.
> + *
> + * Note: Always use the 64 bit inode log format structure so we don't
> + * leave an uninitialised hole in the format item on 64 bit systems. Log
> + * recovery on 32 bit systems handles this just fine, so there's no reason
> + * for not using an initialising the properly padded structure all the time.
>   */
>  STATIC void
>  xfs_inode_item_format(
> @@ -412,8 +420,8 @@ xfs_inode_item_format(
>  {
>  	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
>  	struct xfs_inode	*ip = iip->ili_inode;
> -	struct xfs_inode_log_format *ilf;
>  	struct xfs_log_iovec	*vecp = NULL;
> +	struct xfs_inode_log_format *ilf;
>  
>  	ASSERT(ip->i_d.di_version > 1);
>  
> @@ -425,7 +433,17 @@ xfs_inode_item_format(
>  	ilf->ilf_boffset = ip->i_imap.im_boffset;
>  	ilf->ilf_fields = XFS_ILOG_CORE;
>  	ilf->ilf_size = 2; /* format + core */
> -	xlog_finish_iovec(lv, vecp, sizeof(struct xfs_inode_log_format));
> +
> +	/*
> +	 * make sure we don't leak uninitialised data into the log in the case
> +	 * when we don't log every field in the inode.
> +	 */
> +	ilf->ilf_dsize = 0;
> +	ilf->ilf_asize = 0;
> +	ilf->ilf_pad = 0;
> +	uuid_copy(&ilf->ilf_u.ilfu_uuid, &uuid_null);
> +
> +	xlog_finish_iovec(lv, vecp, sizeof(*ilf));
>  
>  	xfs_inode_item_format_core(ip, lv, &vecp);
>  	xfs_inode_item_format_data_fork(iip, ilf, lv, &vecp);
> @@ -855,44 +873,29 @@ xfs_istale_done(
>  }
>  
>  /*
> - * convert an xfs_inode_log_format struct from either 32 or 64 bit versions
> - * (which can have different field alignments) to the native version
> + * convert an xfs_inode_log_format struct from the old 32 bit version
> + * (which can have different field alignments) to the native 64 bit version
>   */
>  int
>  xfs_inode_item_format_convert(
> -	xfs_log_iovec_t		*buf,
> -	xfs_inode_log_format_t	*in_f)
> +	struct xfs_log_iovec		*buf,
> +	struct xfs_inode_log_format	*in_f)
>  {
> -	if (buf->i_len == sizeof(xfs_inode_log_format_32_t)) {
> -		xfs_inode_log_format_32_t *in_f32 = buf->i_addr;
> -
> -		in_f->ilf_type = in_f32->ilf_type;
> -		in_f->ilf_size = in_f32->ilf_size;
> -		in_f->ilf_fields = in_f32->ilf_fields;
> -		in_f->ilf_asize = in_f32->ilf_asize;
> -		in_f->ilf_dsize = in_f32->ilf_dsize;
> -		in_f->ilf_ino = in_f32->ilf_ino;
> -		/* copy biggest field of ilf_u */
> -		uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f32->ilf_u.ilfu_uuid);
> -		in_f->ilf_blkno = in_f32->ilf_blkno;
> -		in_f->ilf_len = in_f32->ilf_len;
> -		in_f->ilf_boffset = in_f32->ilf_boffset;
> -		return 0;
> -	} else if (buf->i_len == sizeof(xfs_inode_log_format_64_t)){
> -		xfs_inode_log_format_64_t *in_f64 = buf->i_addr;
> -
> -		in_f->ilf_type = in_f64->ilf_type;
> -		in_f->ilf_size = in_f64->ilf_size;
> -		in_f->ilf_fields = in_f64->ilf_fields;
> -		in_f->ilf_asize = in_f64->ilf_asize;
> -		in_f->ilf_dsize = in_f64->ilf_dsize;
> -		in_f->ilf_ino = in_f64->ilf_ino;
> -		/* copy biggest field of ilf_u */
> -		uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f64->ilf_u.ilfu_uuid);
> -		in_f->ilf_blkno = in_f64->ilf_blkno;
> -		in_f->ilf_len = in_f64->ilf_len;
> -		in_f->ilf_boffset = in_f64->ilf_boffset;
> -		return 0;
> -	}
> -	return -EFSCORRUPTED;
> +	struct xfs_inode_log_format_32	*in_f32 = buf->i_addr;
> +
> +	if (buf->i_len != sizeof(*in_f32))
> +		return -EFSCORRUPTED;
> +
> +	in_f->ilf_type = in_f32->ilf_type;
> +	in_f->ilf_size = in_f32->ilf_size;
> +	in_f->ilf_fields = in_f32->ilf_fields;
> +	in_f->ilf_asize = in_f32->ilf_asize;
> +	in_f->ilf_dsize = in_f32->ilf_dsize;
> +	in_f->ilf_ino = in_f32->ilf_ino;
> +	/* copy biggest field of ilf_u */
> +	uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f32->ilf_u.ilfu_uuid);
> +	in_f->ilf_blkno = in_f32->ilf_blkno;
> +	in_f->ilf_len = in_f32->ilf_len;
> +	in_f->ilf_boffset = in_f32->ilf_boffset;
> +	return 0;
>  }
> diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h
> index 0c381d71b242..0492436a053f 100644
> --- a/fs/xfs/xfs_ondisk.h
> +++ b/fs/xfs/xfs_ondisk.h
> @@ -134,7 +134,7 @@ xfs_check_ondisk_structs(void)
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_icreate_log,		28);
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_ictimestamp,		8);
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format_32,	52);
> -	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format_64,	56);
> +	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format,	56);

Will require a change to xfs/122 as well...

--D

>  	XFS_CHECK_STRUCT_SIZE(struct xfs_qoff_logformat,	20);
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_trans_header,		16);
>  }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-18 18:21               ` Darrick J. Wong
@ 2017-09-20  0:46                 ` Dave Chinner
  2017-09-20  0:49                   ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2017-09-20  0:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Tetsuo Handa, bfoster, sandeen, dchinner, linux-xfs

On Mon, Sep 18, 2017 at 11:21:21AM -0700, Darrick J. Wong wrote:
> On Fri, Sep 15, 2017 at 08:44:31AM +1000, Dave Chinner wrote:
> > xfs: Don't log uninitialised fields in inode structures
> > 
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Prevent kmemcheck from throwing warnings about reading uninitialised
> > memory when formatting inodes into the incore log buffer. There are
> > several issues here - we don't always log all the fields in the
> > inode log format item, and we never log the inode the
> > di_next_unlinked field.
> > 
> > In the case of the inode log format item, this is aused b cerbated
> 
> "is aused b cerbated" ?

vi fail when replacing the word "used" with "exacerbated". :)

> > by the old xfs_inode_log_format structure padding issue. Hence make
> > the padded, 64 bit aligned version of the structure the one we always
> > use for formatting the log and get rid of the 64 bit variant. This
> > means we'll always log the 64-bit version and so recovery only needs
> > to convert from the unpadded 32 bit version from older 32 bit
> > kernels.
> 
> And those old 32-bit kernels can read the xfs_inode_log_format{,_64}
> structures, right?

Yes. With the in-memory format now being the same as the old
xfs_inode_log_format_64 structure in the log, we only need a
conversion from "32 bit in log to 64 bit in memory" on all
platforms.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs: Uninitialized memory read at xlog_write
  2017-09-20  0:46                 ` Dave Chinner
@ 2017-09-20  0:49                   ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-09-20  0:49 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Tetsuo Handa, bfoster, sandeen, dchinner, linux-xfs

On Wed, Sep 20, 2017 at 10:46:39AM +1000, Dave Chinner wrote:
> On Mon, Sep 18, 2017 at 11:21:21AM -0700, Darrick J. Wong wrote:
> > On Fri, Sep 15, 2017 at 08:44:31AM +1000, Dave Chinner wrote:
> > > xfs: Don't log uninitialised fields in inode structures
> > > 
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > Prevent kmemcheck from throwing warnings about reading uninitialised
> > > memory when formatting inodes into the incore log buffer. There are
> > > several issues here - we don't always log all the fields in the
> > > inode log format item, and we never log the inode the
> > > di_next_unlinked field.
> > > 
> > > In the case of the inode log format item, this is aused b cerbated
> > 
> > "is aused b cerbated" ?
> 
> vi fail when replacing the word "used" with "exacerbated". :)
> 
> > > by the old xfs_inode_log_format structure padding issue. Hence make
> > > the padded, 64 bit aligned version of the structure the one we always
> > > use for formatting the log and get rid of the 64 bit variant. This
> > > means we'll always log the 64-bit version and so recovery only needs
> > > to convert from the unpadded 32 bit version from older 32 bit
> > > kernels.
> > 
> > And those old 32-bit kernels can read the xfs_inode_log_format{,_64}
> > structures, right?
> 
> Yes. With the in-memory format now being the same as the old
> xfs_inode_log_format_64 structure in the log, we only need a
> conversion from "32 bit in log to 64 bit in memory" on all
> platforms.

Ok, that's what I thought.  Will await a positive test result + proper
patch submission. :)

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-09-20  0:50 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-04 12:19 xfs: Uninitialized memory read at xlog_write Tetsuo Handa
2017-09-04 17:34 ` Darrick J. Wong
2017-09-04 21:44   ` Tetsuo Handa
2017-09-11 15:01 ` Brian Foster
2017-09-13  7:14   ` Tetsuo Handa
2017-09-13  9:43     ` Dave Chinner
2017-09-13  9:59       ` Tetsuo Handa
2017-09-13 21:40         ` Dave Chinner
2017-09-14 10:15           ` Tetsuo Handa
2017-09-14 22:44             ` Dave Chinner
2017-09-15 11:19               ` Tetsuo Handa
2017-09-18 18:21               ` Darrick J. Wong
2017-09-20  0:46                 ` Dave Chinner
2017-09-20  0:49                   ` Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.