linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 4.0 kernel XFS filesystem crash when running AIM7's disk workload
@ 2015-04-17 17:38 Waiman Long
  2015-04-17 23:45 ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Waiman Long @ 2015-04-17 17:38 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Linux Kernel Mailing List

Hi Dave,

When I was running the AIM7's disk workload on a 8-socket Westmere-EX 
server with 4.0 kernel, the kernel crash. A set of small ramdisks were 
created (ramdisk_size=271072). Those ramdisks were formatted with XFS 
filesystem before the test began. The kernel log was:

XFS (ram12): Mounting V4 Filesystem
XFS (ram12): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram12): Log size out of supported range. Continuing onwards, but if 
log hangs are
experienced then please report this message in the bug report.
XFS (ram12): Ending clean mount
XFS (ram13): Mounting V4 Filesystem
XFS (ram13): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram13): Log size out of supported range. Continuing onwards, but if 
log hangs are
experienced then please report this message in the bug report.
XFS (ram13): Ending clean mount
XFS (ram14): Mounting V4 Filesystem
XFS (ram14): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram14): Log size out of supported range. Continuing onwards, but if 
log hangs are
experienced then please report this message in the bug report.
XFS (ram14): Ending clean mount
XFS (ram15): Mounting V4 Filesystem
XFS (ram15): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram15): Log size out of supported range. Continuing onwards, but if 
log hangs are
experienced then please report this message in the bug report.
XFS (ram15): Ending clean mount
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff812abd6d>] __memcpy+0xd/0x110
PGD 29f7655f067 PUD 29f75a80067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: xfs exportfs libcrc32c ebtable_nat ebtables 
xt_CHECKSUM iptable_mangle bridge stp llc autofs4 ipt_REJECT 
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables 
ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state 
nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan 
vhost tun kvm_intel kvm ipmi_si ipmi_msghandler tpm_infineon iTCO_wdt 
iTCO_vendor_support wmi acpi_cpufreq microcode pcspkr serio_raw qlcnic 
be2net vxlan udp_tunnel ip6_udp_tunnel ses enclosure igb dca ptp 
pps_core lpc_ich mfd_core hpilo hpwdt sg i7core_edac edac_core 
netxen_nic ext4(E) jbd2(E) mbcache(E) sr_mod(E) cdrom(E) sd_mod(E) 
lpfc(E) qla2xxx(E) scsi_transport_fc(E) pata_acpi(E) ata_generic(E) 
ata_piix(E) hpsa(E) radeon(E) ttm(E) drm_kms_helper(E) drm(E) 
i2c_algo_bit(E) i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) 
dm_mod(E)
CPU: 69 PID: 116603 Comm: xfsaild/ram5 Tainted: G            E   4.0.0 #2
Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
task: ffff8b9f7eeb4f80 ti: ffff8b9f7f1ac000 task.ti: ffff8b9f7f1ac000
RIP: 0010:[<ffffffff812abd6d>]  [<ffffffff812abd6d>] __memcpy+0xd/0x110
RSP: 0018:ffff8b9f7f1afc10  EFLAGS: 00010206
RAX: ffff88102476a3cc RBX: ffff889ff2ab5000 RCX: 0000000000000005
RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88102476a3cc
RBP: ffff8b9f7f1afc18 R08: 0000000000000001 R09: ffff88102476a3cc
R10: ffff8a1f6c03ea80 R11: 0000000000000000 R12: ffff8b1ff1269400
R13: ffff8b1f64837c98 R14: ffff881038701200 R15: ffff88102476a300
FS:  0000000000000000(0000) GS:ffff8b1fffa40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000029f7655e000 CR4: 00000000000006e0
Stack:
  ffffffffa0ca8c41 ffff8b9f7f1afc68 ffffffffa0cc4803 ffff8b9f7f1afc68
  ffffffffa0cd2777 ffff8b9f7f1afc68 ffff8b1ff1269400 ffff8a9f59022800
  ffff8b1f7c932718 0000000000000003 ffff8a9f590228e4 ffff8b9f7f1afce8
Call Trace:
  [<ffffffffa0ca8c41>] ? xfs_iflush_fork+0x181/0x240 [xfs]
  [<ffffffffa0cc4803>] xfs_iflush_int+0x1f3/0x320 [xfs]
  [<ffffffffa0cd2777>] ? kmem_alloc+0x87/0x100 [xfs]
  [<ffffffffa0cc60a5>] xfs_iflush_cluster+0x295/0x380 [xfs]
  [<ffffffffa0cc8ff4>] xfs_iflush+0xf4/0x1f0 [xfs]
  [<ffffffffa0cda22a>] xfs_inode_item_push+0xea/0x130 [xfs]
  [<ffffffffa0ce140d>] xfsaild_push+0x10d/0x500 [xfs]
  [<ffffffff810b7c20>] ? lock_timer_base+0x70/0x70
  [<ffffffffa0ce1898>] xfsaild+0x98/0x130 [xfs]
  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
  [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff815c5748>] ret_from_fork+0x58/0x90
  [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
Code: 0f b6 c0 5b c9 c3 0f 1f 84 00 00 00 00 00 e8 2b f9 ff ff 80 7b 25 
00 74 c8 eb d3 90 90 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 
a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c
RIP  [<ffffffff812abd6d>] __memcpy+0xd/0x110
  RSP <ffff8b9f7f1afc10>
CR2: 0000000000000000
---[ end trace fb8a4add69562a76 ]---

The xfs_iflush_fork+0x181/0x240 (385) IP address is at:

823        case XFS_DINODE_FMT_LOCAL:
824            if ((iip->ili_fields & dataflag[whichfork]) &&
    0x00000000000023c0 <+336>:    movslq %ecx,%rcx
    0x00000000000023c3 <+339>:    movswl 0x0(%rcx,%rcx,1),%eax
    0x00000000000023cb <+347>:    test   %eax,0x90(%rdx)
    0x00000000000023d1 <+353>:    je     0x2350 <xfs_iflush_fork+224>
    0x00000000000023da <+362>:    test   %edx,%edx
    0x00000000000023dc <+364>:    jle    0x2350 <xfs_iflush_fork+224>

825                (ifp->if_bytes > 0)) {
    0x00000000000023d7 <+359>:    mov    (%r10),%edx

826                ASSERT(ifp->if_u1.if_data != NULL);
827                ASSERT(ifp->if_bytes <= XFS_IFORK_SIZE(ip, whichfork));
828                memcpy(cp, ifp->if_u1.if_data, ifp->if_bytes);
    0x00000000000023e2 <+370>:    mov    0x18(%r10),%rsi
    0x00000000000023e6 <+374>:    movslq %edx,%rdx
    0x00000000000023e9 <+377>:    mov    %r9,%rdi
    0x00000000000023ec <+380>:    callq  0x23f1 <xfs_iflush_fork+385>

829            }
830            break;


  xfs_iflush_int+0x1f3/0x320 (499) [xfs]:

flush_fork(ip, dip, iip, XFS_DATA_FORK);
    0x0000000000000335 <+245>:    xor    %ecx,%ecx
    0x0000000000000337 <+247>:    mov    %r13,%rdx
    0x000000000000033a <+250>:    mov    %r15,%rsi
    0x000000000000033d <+253>:    mov    %r12,%rdi
    0x0000000000000340 <+256>:    callq  0x345 <xfs_iflush_int+261>

3409        if (XFS_IFORK_Q(ip))
    0x0000000000000345 <+261>:    cmpb   $0x0,0x14a(%r12)
    0x000000000000034e <+270>:    jne    0x420 <xfs_iflush_int+480>

3410            xfs_iflush_fork(ip, dip, iip, XFS_ATTR_FORK);
    0x0000000000000420 <+480>:    mov    $0x1,%ecx
    0x0000000000000425 <+485>:    mov    %r13,%rdx
    0x0000000000000428 <+488>:    mov    %r15,%rsi
    0x000000000000042b <+491>:    mov    %r12,%rdi
    0x000000000000042e <+494>:    callq  0x433 <xfs_iflush_int+499>
    0x0000000000000433 <+499>:    jmpq   0x354 <xfs_iflush_int+276>
    0x0000000000000438 <+504>:    nopl   0x0(%rax,%rax,1)

The crash can be reproduced pretty consistently. Please let me know if 
you need additional information.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: 4.0 kernel XFS filesystem crash when running AIM7's disk workload
  2015-04-17 17:38 4.0 kernel XFS filesystem crash when running AIM7's disk workload Waiman Long
@ 2015-04-17 23:45 ` Dave Chinner
  2015-04-21 20:52   ` Waiman Long
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2015-04-17 23:45 UTC (permalink / raw)
  To: Waiman Long; +Cc: xfs, Linux Kernel Mailing List

On Fri, Apr 17, 2015 at 01:38:49PM -0400, Waiman Long wrote:
> Hi Dave,
> 
> When I was running the AIM7's disk workload on a 8-socket
> Westmere-EX server with 4.0 kernel, the kernel crash. A set of small
> ramdisks were created (ramdisk_size=271072). Those ramdisks were
> formatted with XFS filesystem before the test began. The kernel log
> was:
> 
> XFS (ram12): Mounting V4 Filesystem
> XFS (ram12): Log size 1424 blocks too small, minimum size is 1596 blocks
> XFS (ram12): Log size out of supported range. Continuing onwards,
> but if log hangs are
> experienced then please report this message in the bug report.

First thing you need to do is upgrade xfsprogs so that this message
goes away. or use "mkfs.xfs -l size=10m" so that the log is larger
than the minimum.

> XFS (ram15): Ending clean mount
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: [<ffffffff812abd6d>] __memcpy+0xd/0x110
> PGD 29f7655f067 PUD 29f75a80067 PMD 0
> Oops: 0000 [#1] SMP
> Modules linked in: xfs exportfs libcrc32c ebtable_nat ebtables
> xt_CHECKSUM iptable_mangle bridge stp llc autofs4 ipt_REJECT
> nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter
> ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
> nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6
> vhost_net macvtap macvlan vhost tun kvm_intel kvm ipmi_si
> ipmi_msghandler tpm_infineon iTCO_wdt iTCO_vendor_support wmi
> acpi_cpufreq microcode pcspkr serio_raw qlcnic be2net vxlan
> udp_tunnel ip6_udp_tunnel ses enclosure igb dca ptp pps_core lpc_ich
> mfd_core hpilo hpwdt sg i7core_edac edac_core netxen_nic ext4(E)
> jbd2(E) mbcache(E) sr_mod(E) cdrom(E) sd_mod(E) lpfc(E) qla2xxx(E)
> scsi_transport_fc(E) pata_acpi(E) ata_generic(E) ata_piix(E) hpsa(E)
> radeon(E) ttm(E) drm_kms_helper(E) drm(E) i2c_algo_bit(E)
> i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)

Why do you have a mix of signed and unsigned modules loaded?

> CPU: 69 PID: 116603 Comm: xfsaild/ram5 Tainted: G            E   4.0.0 #2
> Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
> task: ffff8b9f7eeb4f80 ti: ffff8b9f7f1ac000 task.ti: ffff8b9f7f1ac000
> RIP: 0010:[<ffffffff812abd6d>]  [<ffffffff812abd6d>] __memcpy+0xd/0x110
> RSP: 0018:ffff8b9f7f1afc10  EFLAGS: 00010206
> RAX: ffff88102476a3cc RBX: ffff889ff2ab5000 RCX: 0000000000000005
> RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88102476a3cc

edx = 6 bytes.

> RBP: ffff8b9f7f1afc18 R08: 0000000000000001 R09: ffff88102476a3cc
> R10: ffff8a1f6c03ea80 R11: 0000000000000000 R12: ffff8b1ff1269400
> R13: ffff8b1f64837c98 R14: ffff881038701200 R15: ffff88102476a300
> FS:  0000000000000000(0000) GS:ffff8b1fffa40000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000029f7655e000 CR4: 00000000000006e0
> Stack:
>  ffffffffa0ca8c41 ffff8b9f7f1afc68 ffffffffa0cc4803 ffff8b9f7f1afc68
>  ffffffffa0cd2777 ffff8b9f7f1afc68 ffff8b1ff1269400 ffff8a9f59022800
>  ffff8b1f7c932718 0000000000000003 ffff8a9f590228e4 ffff8b9f7f1afce8
> Call Trace:
>  [<ffffffffa0ca8c41>] ? xfs_iflush_fork+0x181/0x240 [xfs]
>  [<ffffffffa0cc4803>] xfs_iflush_int+0x1f3/0x320 [xfs]
>  [<ffffffffa0cd2777>] ? kmem_alloc+0x87/0x100 [xfs]
>  [<ffffffffa0cc60a5>] xfs_iflush_cluster+0x295/0x380 [xfs]
>  [<ffffffffa0cc8ff4>] xfs_iflush+0xf4/0x1f0 [xfs]
>  [<ffffffffa0cda22a>] xfs_inode_item_push+0xea/0x130 [xfs]
>  [<ffffffffa0ce140d>] xfsaild_push+0x10d/0x500 [xfs]
>  [<ffffffff810b7c20>] ? lock_timer_base+0x70/0x70
>  [<ffffffffa0ce1898>] xfsaild+0x98/0x130 [xfs]
>  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
>  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
>  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
>  [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
>  [<ffffffff815c5748>] ret_from_fork+0x58/0x90
>  [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
> Code: 0f b6 c0 5b c9 c3 0f 1f 84 00 00 00 00 00 e8 2b f9 ff ff 80 7b
> 25 00 74 c8 eb d3 90 90 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07
> <f3> 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c
> RIP  [<ffffffff812abd6d>] __memcpy+0xd/0x110
>  RSP <ffff8b9f7f1afc10>
> CR2: 0000000000000000
> ---[ end trace fb8a4add69562a76 ]---
> 
> The xfs_iflush_fork+0x181/0x240 (385) IP address is at:
> 

(rearrange slightly to make more sense)

> 823        case XFS_DINODE_FMT_LOCAL:
> 824            if ((iip->ili_fields & dataflag[whichfork]) &&
>    0x00000000000023c0 <+336>:    movslq %ecx,%rcx
>    0x00000000000023c3 <+339>:    movswl 0x0(%rcx,%rcx,1),%eax
>    0x00000000000023cb <+347>:    test   %eax,0x90(%rdx)
>    0x00000000000023d1 <+353>:    je     0x2350 <xfs_iflush_fork+224>
> 
> 825                (ifp->if_bytes > 0)) {
>    0x00000000000023d7 <+359>:    mov    (%r10),%edx
>    0x00000000000023da <+362>:    test   %edx,%edx
>    0x00000000000023dc <+364>:    jle    0x2350 <xfs_iflush_fork+224>

So the contents of rdx says that the inode fork size is 6 bytes in
local format. The call location also indicates that it is the
attribute fork that is in being flushed. The minimum size of the
attr fork is 3 bytes - an empty header. However, then ext valid size
has a second header that adds 4 bytes to the size, plus the bytes
inteh attr name and value.

Hence a size of 6 bytes is invalid, and probably indicates that
there is some form of memory corruption going on here.

IIRC, we haven't touched this code for a while - can you test 3.19
and see if it has the same problem? If it doesn't have the problem,
and given you can reliably reproduce the crash, can you run a
bisect to find the cause?

FWIW, there's been a few dentry cache related crashes reported late
in the 4.0-rc series that point to memory corruption as the cause of
the panics, so I'm wondering if this is another symptom of the same
problem....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: 4.0 kernel XFS filesystem crash when running AIM7's disk workload
  2015-04-17 23:45 ` Dave Chinner
@ 2015-04-21 20:52   ` Waiman Long
  2015-04-21 21:59     ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Waiman Long @ 2015-04-21 20:52 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Linux Kernel Mailing List

On 04/17/2015 07:45 PM, Dave Chinner wrote:
> On Fri, Apr 17, 2015 at 01:38:49PM -0400, Waiman Long wrote:
>> Hi Dave,
>>
>> When I was running the AIM7's disk workload on a 8-socket
>> Westmere-EX server with 4.0 kernel, the kernel crash. A set of small
>> ramdisks were created (ramdisk_size=271072). Those ramdisks were
>> formatted with XFS filesystem before the test began. The kernel log
>> was:
>>
>> XFS (ram12): Mounting V4 Filesystem
>> XFS (ram12): Log size 1424 blocks too small, minimum size is 1596 blocks
>> XFS (ram12): Log size out of supported range. Continuing onwards,
>> but if log hangs are
>> experienced then please report this message in the bug report.
> First thing you need to do is upgrade xfsprogs so that this message
> goes away. or use "mkfs.xfs -l size=10m" so that the log is larger
> than the minimum.
>
>> XFS (ram15): Ending clean mount
>> BUG: unable to handle kernel NULL pointer dereference at           (null)
>> IP: [<ffffffff812abd6d>] __memcpy+0xd/0x110
>> PGD 29f7655f067 PUD 29f75a80067 PMD 0
>> Oops: 0000 [#1] SMP
>> Modules linked in: xfs exportfs libcrc32c ebtable_nat ebtables
>> xt_CHECKSUM iptable_mangle bridge stp llc autofs4 ipt_REJECT
>> nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter
>> ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
>> nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6
>> vhost_net macvtap macvlan vhost tun kvm_intel kvm ipmi_si
>> ipmi_msghandler tpm_infineon iTCO_wdt iTCO_vendor_support wmi
>> acpi_cpufreq microcode pcspkr serio_raw qlcnic be2net vxlan
>> udp_tunnel ip6_udp_tunnel ses enclosure igb dca ptp pps_core lpc_ich
>> mfd_core hpilo hpwdt sg i7core_edac edac_core netxen_nic ext4(E)
>> jbd2(E) mbcache(E) sr_mod(E) cdrom(E) sd_mod(E) lpfc(E) qla2xxx(E)
>> scsi_transport_fc(E) pata_acpi(E) ata_generic(E) ata_piix(E) hpsa(E)
>> radeon(E) ttm(E) drm_kms_helper(E) drm(E) i2c_algo_bit(E)
>> i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
> Why do you have a mix of signed and unsigned modules loaded?

I did the test on a RHEL 6.6 system. The 4.0 kernel is unsigned, but 
there are some additional RHEL modules loaded at boot up time.

>> CPU: 69 PID: 116603 Comm: xfsaild/ram5 Tainted: G            E   4.0.0 #2
>> Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
>> task: ffff8b9f7eeb4f80 ti: ffff8b9f7f1ac000 task.ti: ffff8b9f7f1ac000
>> RIP: 0010:[<ffffffff812abd6d>]  [<ffffffff812abd6d>] __memcpy+0xd/0x110
>> RSP: 0018:ffff8b9f7f1afc10  EFLAGS: 00010206
>> RAX: ffff88102476a3cc RBX: ffff889ff2ab5000 RCX: 0000000000000005
>> RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88102476a3cc
> edx = 6 bytes.
>
>> RBP: ffff8b9f7f1afc18 R08: 0000000000000001 R09: ffff88102476a3cc
>> R10: ffff8a1f6c03ea80 R11: 0000000000000000 R12: ffff8b1ff1269400
>> R13: ffff8b1f64837c98 R14: ffff881038701200 R15: ffff88102476a300
>> FS:  0000000000000000(0000) GS:ffff8b1fffa40000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: 0000000000000000 CR3: 0000029f7655e000 CR4: 00000000000006e0
>> Stack:
>>   ffffffffa0ca8c41 ffff8b9f7f1afc68 ffffffffa0cc4803 ffff8b9f7f1afc68
>>   ffffffffa0cd2777 ffff8b9f7f1afc68 ffff8b1ff1269400 ffff8a9f59022800
>>   ffff8b1f7c932718 0000000000000003 ffff8a9f590228e4 ffff8b9f7f1afce8
>> Call Trace:
>>   [<ffffffffa0ca8c41>] ? xfs_iflush_fork+0x181/0x240 [xfs]
>>   [<ffffffffa0cc4803>] xfs_iflush_int+0x1f3/0x320 [xfs]
>>   [<ffffffffa0cd2777>] ? kmem_alloc+0x87/0x100 [xfs]
>>   [<ffffffffa0cc60a5>] xfs_iflush_cluster+0x295/0x380 [xfs]
>>   [<ffffffffa0cc8ff4>] xfs_iflush+0xf4/0x1f0 [xfs]
>>   [<ffffffffa0cda22a>] xfs_inode_item_push+0xea/0x130 [xfs]
>>   [<ffffffffa0ce140d>] xfsaild_push+0x10d/0x500 [xfs]
>>   [<ffffffff810b7c20>] ? lock_timer_base+0x70/0x70
>>   [<ffffffffa0ce1898>] xfsaild+0x98/0x130 [xfs]
>>   [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
>>   [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
>>   [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
>>   [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
>>   [<ffffffff815c5748>] ret_from_fork+0x58/0x90
>>   [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
>> Code: 0f b6 c0 5b c9 c3 0f 1f 84 00 00 00 00 00 e8 2b f9 ff ff 80 7b
>> 25 00 74 c8 eb d3 90 90 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07
>> <f3>  48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c
>> RIP  [<ffffffff812abd6d>] __memcpy+0xd/0x110
>>   RSP<ffff8b9f7f1afc10>
>> CR2: 0000000000000000
>> ---[ end trace fb8a4add69562a76 ]---
>>
>> The xfs_iflush_fork+0x181/0x240 (385) IP address is at:
>>
> (rearrange slightly to make more sense)
>
>> 823        case XFS_DINODE_FMT_LOCAL:
>> 824            if ((iip->ili_fields&  dataflag[whichfork])&&
>>     0x00000000000023c0<+336>:    movslq %ecx,%rcx
>>     0x00000000000023c3<+339>:    movswl 0x0(%rcx,%rcx,1),%eax
>>     0x00000000000023cb<+347>:    test   %eax,0x90(%rdx)
>>     0x00000000000023d1<+353>:    je     0x2350<xfs_iflush_fork+224>
>>
>> 825                (ifp->if_bytes>  0)) {
>>     0x00000000000023d7<+359>:    mov    (%r10),%edx
>>     0x00000000000023da<+362>:    test   %edx,%edx
>>     0x00000000000023dc<+364>:    jle    0x2350<xfs_iflush_fork+224>
> So the contents of rdx says that the inode fork size is 6 bytes in
> local format. The call location also indicates that it is the
> attribute fork that is in being flushed. The minimum size of the
> attr fork is 3 bytes - an empty header. However, then ext valid size
> has a second header that adds 4 bytes to the size, plus the bytes
> inteh attr name and value.
>
> Hence a size of 6 bytes is invalid, and probably indicates that
> there is some form of memory corruption going on here.
>
> IIRC, we haven't touched this code for a while - can you test 3.19
> and see if it has the same problem? If it doesn't have the problem,
> and given you can reliably reproduce the crash, can you run a
> bisect to find the cause?

I have done the bisection and the following commit in 3.13 is the one 
that cause the problem, I think:

f7be2d7f594cbc7a00902b5427332a1ad519a528
xfs: push down inactive transaction mgmt for truncate

I looked at the patch, and it didn't seem quite right, but I don't know 
much about the XFS internal to be sure. Maybe you can take a look at that.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: 4.0 kernel XFS filesystem crash when running AIM7's disk workload
  2015-04-21 20:52   ` Waiman Long
@ 2015-04-21 21:59     ` Dave Chinner
  2015-04-22 17:45       ` Waiman Long
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2015-04-21 21:59 UTC (permalink / raw)
  To: Waiman Long; +Cc: xfs, Linux Kernel Mailing List

On Tue, Apr 21, 2015 at 04:52:37PM -0400, Waiman Long wrote:
> On 04/17/2015 07:45 PM, Dave Chinner wrote:
> >On Fri, Apr 17, 2015 at 01:38:49PM -0400, Waiman Long wrote:
> >>Hi Dave,
> >>
> >>When I was running the AIM7's disk workload on a 8-socket
> >>Westmere-EX server with 4.0 kernel, the kernel crash. A set of small
> >>ramdisks were created (ramdisk_size=271072). Those ramdisks were
> >>formatted with XFS filesystem before the test began. The kernel log
> >>was:
> >>
> >>XFS (ram12): Mounting V4 Filesystem
> >>XFS (ram12): Log size 1424 blocks too small, minimum size is 1596 blocks
> >>XFS (ram12): Log size out of supported range. Continuing onwards,
> >>but if log hangs are
> >>experienced then please report this message in the bug report.
> >First thing you need to do is upgrade xfsprogs so that this message
> >goes away. or use "mkfs.xfs -l size=10m" so that the log is larger
> >than the minimum.
> >
> >>XFS (ram15): Ending clean mount
> >>BUG: unable to handle kernel NULL pointer dereference at           (null)
> >>IP: [<ffffffff812abd6d>] __memcpy+0xd/0x110
> >>PGD 29f7655f067 PUD 29f75a80067 PMD 0
> >>Oops: 0000 [#1] SMP
> >>Modules linked in: xfs exportfs libcrc32c ebtable_nat ebtables
> >>xt_CHECKSUM iptable_mangle bridge stp llc autofs4 ipt_REJECT
> >>nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter
> >>ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
> >>nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6
> >>vhost_net macvtap macvlan vhost tun kvm_intel kvm ipmi_si
> >>ipmi_msghandler tpm_infineon iTCO_wdt iTCO_vendor_support wmi
> >>acpi_cpufreq microcode pcspkr serio_raw qlcnic be2net vxlan
> >>udp_tunnel ip6_udp_tunnel ses enclosure igb dca ptp pps_core lpc_ich
> >>mfd_core hpilo hpwdt sg i7core_edac edac_core netxen_nic ext4(E)
> >>jbd2(E) mbcache(E) sr_mod(E) cdrom(E) sd_mod(E) lpfc(E) qla2xxx(E)
> >>scsi_transport_fc(E) pata_acpi(E) ata_generic(E) ata_piix(E) hpsa(E)
> >>radeon(E) ttm(E) drm_kms_helper(E) drm(E) i2c_algo_bit(E)
> >>i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
> >Why do you have a mix of signed and unsigned modules loaded?
> 
> I did the test on a RHEL 6.6 system. The 4.0 kernel is unsigned, but
> there are some additional RHEL modules loaded at boot up time.

Wait, what?

Do you have rhel 6.6 modules loaded into a 4.0 kernel? If so, I'd
suggest you fix things so that doesn't happen before running any
more tests...

> >>CPU: 69 PID: 116603 Comm: xfsaild/ram5 Tainted: G            E   4.0.0 #2
> >>Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
> >>task: ffff8b9f7eeb4f80 ti: ffff8b9f7f1ac000 task.ti: ffff8b9f7f1ac000
> >>RIP: 0010:[<ffffffff812abd6d>]  [<ffffffff812abd6d>] __memcpy+0xd/0x110
> >>RSP: 0018:ffff8b9f7f1afc10  EFLAGS: 00010206
> >>RAX: ffff88102476a3cc RBX: ffff889ff2ab5000 RCX: 0000000000000005
> >>RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88102476a3cc
> >edx = 6 bytes.
> >
> >>RBP: ffff8b9f7f1afc18 R08: 0000000000000001 R09: ffff88102476a3cc
> >>R10: ffff8a1f6c03ea80 R11: 0000000000000000 R12: ffff8b1ff1269400
> >>R13: ffff8b1f64837c98 R14: ffff881038701200 R15: ffff88102476a300
> >>FS:  0000000000000000(0000) GS:ffff8b1fffa40000(0000) knlGS:0000000000000000
> >>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>CR2: 0000000000000000 CR3: 0000029f7655e000 CR4: 00000000000006e0
> >>Stack:
> >>  ffffffffa0ca8c41 ffff8b9f7f1afc68 ffffffffa0cc4803 ffff8b9f7f1afc68
> >>  ffffffffa0cd2777 ffff8b9f7f1afc68 ffff8b1ff1269400 ffff8a9f59022800
> >>  ffff8b1f7c932718 0000000000000003 ffff8a9f590228e4 ffff8b9f7f1afce8
> >>Call Trace:
> >>  [<ffffffffa0ca8c41>] ? xfs_iflush_fork+0x181/0x240 [xfs]
> >>  [<ffffffffa0cc4803>] xfs_iflush_int+0x1f3/0x320 [xfs]
> >>  [<ffffffffa0cd2777>] ? kmem_alloc+0x87/0x100 [xfs]
> >>  [<ffffffffa0cc60a5>] xfs_iflush_cluster+0x295/0x380 [xfs]
> >>  [<ffffffffa0cc8ff4>] xfs_iflush+0xf4/0x1f0 [xfs]
> >>  [<ffffffffa0cda22a>] xfs_inode_item_push+0xea/0x130 [xfs]
> >>  [<ffffffffa0ce140d>] xfsaild_push+0x10d/0x500 [xfs]
> >>  [<ffffffff810b7c20>] ? lock_timer_base+0x70/0x70
> >>  [<ffffffffa0ce1898>] xfsaild+0x98/0x130 [xfs]
> >>  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
> >>  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
> >>  [<ffffffffa0ce1800>] ? xfsaild_push+0x500/0x500 [xfs]
> >>  [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
> >>  [<ffffffff815c5748>] ret_from_fork+0x58/0x90
> >>  [<ffffffff81074b50>] ? kthread_freezable_should_stop+0x70/0x70
> >>Code: 0f b6 c0 5b c9 c3 0f 1f 84 00 00 00 00 00 e8 2b f9 ff ff 80 7b
> >>25 00 74 c8 eb d3 90 90 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07
> >><f3>  48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c
> >>RIP  [<ffffffff812abd6d>] __memcpy+0xd/0x110
> >>  RSP<ffff8b9f7f1afc10>
> >>CR2: 0000000000000000
> >>---[ end trace fb8a4add69562a76 ]---
> >>
> >>The xfs_iflush_fork+0x181/0x240 (385) IP address is at:
> >>
> >(rearrange slightly to make more sense)
> >
> >>823        case XFS_DINODE_FMT_LOCAL:
> >>824            if ((iip->ili_fields&  dataflag[whichfork])&&
> >>    0x00000000000023c0<+336>:    movslq %ecx,%rcx
> >>    0x00000000000023c3<+339>:    movswl 0x0(%rcx,%rcx,1),%eax
> >>    0x00000000000023cb<+347>:    test   %eax,0x90(%rdx)
> >>    0x00000000000023d1<+353>:    je     0x2350<xfs_iflush_fork+224>
> >>
> >>825                (ifp->if_bytes>  0)) {
> >>    0x00000000000023d7<+359>:    mov    (%r10),%edx
> >>    0x00000000000023da<+362>:    test   %edx,%edx
> >>    0x00000000000023dc<+364>:    jle    0x2350<xfs_iflush_fork+224>
> >So the contents of rdx says that the inode fork size is 6 bytes in
> >local format. The call location also indicates that it is the
> >attribute fork that is in being flushed. The minimum size of the
> >attr fork is 3 bytes - an empty header. However, then ext valid size
> >has a second header that adds 4 bytes to the size, plus the bytes
> >inteh attr name and value.
> >
> >Hence a size of 6 bytes is invalid, and probably indicates that
> >there is some form of memory corruption going on here.
> >
> >IIRC, we haven't touched this code for a while - can you test 3.19
> >and see if it has the same problem? If it doesn't have the problem,
> >and given you can reliably reproduce the crash, can you run a
> >bisect to find the cause?
> 
> I have done the bisection and the following commit in 3.13 is the
> one that cause the problem, I think:
> 
> f7be2d7f594cbc7a00902b5427332a1ad519a528
> xfs: push down inactive transaction mgmt for truncate
> 
> I looked at the patch, and it didn't seem quite right,

In what way?

> but I don't
> know much about the XFS internal to be sure. Maybe you can take a
> look at that.

Doesn't actually seem very likely - that's mostly just a factoring
patch, and it is called on every inode that is reclaimed from
memory, so it's not like that code path doesn't get well tested....

So, I'm confused - I thought you were reporting a recent regression.
Are you actually reporting a regression between a RHEL 6.6 kernel
and the current mainline kernel?  Is this the first time you've run
this test on XFS on a kernel more recent than RHEL6.6?

Details, please; they are important.

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: 4.0 kernel XFS filesystem crash when running AIM7's disk workload
  2015-04-21 21:59     ` Dave Chinner
@ 2015-04-22 17:45       ` Waiman Long
  0 siblings, 0 replies; 5+ messages in thread
From: Waiman Long @ 2015-04-22 17:45 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Linux Kernel Mailing List

On 04/21/2015 05:59 PM, Dave Chinner wrote:
> On Tue, Apr 21, 2015 at 04:52:37PM -0400, Waiman Long wrote:
>> On 04/17/2015 07:45 PM, Dave Chinner wrote:
>>> On Fri, Apr 17, 2015 at 01:38:49PM -0400, Waiman Long wrote:
>>>> Hi Dave,
>>>>
>>>> When I was running the AIM7's disk workload on a 8-socket
>>>> Westmere-EX server with 4.0 kernel, the kernel crash. A set of small
>>>> ramdisks were created (ramdisk_size=271072). Those ramdisks were
>>>> formatted with XFS filesystem before the test began. The kernel log
>>>> was:
>>>>
>>>> XFS (ram12): Mounting V4 Filesystem
>>>> XFS (ram12): Log size 1424 blocks too small, minimum size is 1596 blocks
>>>> XFS (ram12): Log size out of supported range. Continuing onwards,
>>>> but if log hangs are
>>>> experienced then please report this message in the bug report.
>>> First thing you need to do is upgrade xfsprogs so that this message
>>> goes away. or use "mkfs.xfs -l size=10m" so that the log is larger
>>> than the minimum.
>>>
>>>> XFS (ram15): Ending clean mount
>>>> BUG: unable to handle kernel NULL pointer dereference at           (null)
>>>> IP: [<ffffffff812abd6d>] __memcpy+0xd/0x110
>>>> PGD 29f7655f067 PUD 29f75a80067 PMD 0
>>>> Oops: 0000 [#1] SMP
>>>> Modules linked in: xfs exportfs libcrc32c ebtable_nat ebtables
>>>> xt_CHECKSUM iptable_mangle bridge stp llc autofs4 ipt_REJECT
>>>> nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter
>>>> ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
>>>> nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6
>>>> vhost_net macvtap macvlan vhost tun kvm_intel kvm ipmi_si
>>>> ipmi_msghandler tpm_infineon iTCO_wdt iTCO_vendor_support wmi
>>>> acpi_cpufreq microcode pcspkr serio_raw qlcnic be2net vxlan
>>>> udp_tunnel ip6_udp_tunnel ses enclosure igb dca ptp pps_core lpc_ich
>>>> mfd_core hpilo hpwdt sg i7core_edac edac_core netxen_nic ext4(E)
>>>> jbd2(E) mbcache(E) sr_mod(E) cdrom(E) sd_mod(E) lpfc(E) qla2xxx(E)
>>>> scsi_transport_fc(E) pata_acpi(E) ata_generic(E) ata_piix(E) hpsa(E)
>>>> radeon(E) ttm(E) drm_kms_helper(E) drm(E) i2c_algo_bit(E)
>>>> i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
>>> Why do you have a mix of signed and unsigned modules loaded?
>> I did the test on a RHEL 6.6 system. The 4.0 kernel is unsigned, but
>> there are some additional RHEL modules loaded at boot up time.
> Wait, what?
>
> Do you have rhel 6.6 modules loaded into a 4.0 kernel? If so, I'd
> suggest you fix things so that doesn't happen before running any
> more tests...

No, I didn't. I thought the system startup scripts may have loaded some 
additional kernel modules, but I didn't check to see if it is really the 
case. Anyway, this is not the issue that was causing problem that I saw.


>>> 823        case XFS_DINODE_FMT_LOCAL:
>>> 824            if ((iip->ili_fields&   dataflag[whichfork])&&
>>>     0x00000000000023c0<+336>:    movslq %ecx,%rcx
>>>     0x00000000000023c3<+339>:    movswl 0x0(%rcx,%rcx,1),%eax
>>>     0x00000000000023cb<+347>:    test   %eax,0x90(%rdx)
>>>     0x00000000000023d1<+353>:    je     0x2350<xfs_iflush_fork+224>
>>>
>>> 825                (ifp->if_bytes>   0)) {
>>>     0x00000000000023d7<+359>:    mov    (%r10),%edx
>>>     0x00000000000023da<+362>:    test   %edx,%edx
>>>     0x00000000000023dc<+364>:    jle    0x2350<xfs_iflush_fork+224>
>>> So the contents of rdx says that the inode fork size is 6 bytes in
>>> local format. The call location also indicates that it is the
>>> attribute fork that is in being flushed. The minimum size of the
>>> attr fork is 3 bytes - an empty header. However, then ext valid size
>>> has a second header that adds 4 bytes to the size, plus the bytes
>>> inteh attr name and value.
>>>
>>> Hence a size of 6 bytes is invalid, and probably indicates that
>>> there is some form of memory corruption going on here.
>>>
>>> IIRC, we haven't touched this code for a while - can you test 3.19
>>> and see if it has the same problem? If it doesn't have the problem,
>>> and given you can reliably reproduce the crash, can you run a
>>> bisect to find the cause?
>> I have done the bisection and the following commit in 3.13 is the
>> one that cause the problem, I think:
>>
>> f7be2d7f594cbc7a00902b5427332a1ad519a528
>> xfs: push down inactive transaction mgmt for truncate
>>
>> I looked at the patch, and it didn't seem quite right,
> In what way?
>
>> but I don't
>> know much about the XFS internal to be sure. Maybe you can take a
>> look at that.
> Doesn't actually seem very likely - that's mostly just a factoring
> patch, and it is called on every inode that is reclaimed from
> memory, so it's not like that code path doesn't get well tested....
>
> So, I'm confused - I thought you were reporting a recent regression.
> Are you actually reporting a regression between a RHEL 6.6 kernel
> and the current mainline kernel?  Is this the first time you've run
> this test on XFS on a kernel more recent than RHEL6.6?
>
> Details, please; they are important.
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> Cheers,
>
> Dave.

I have just sent out a patch to fix this problem. Please let me know if 
there is any problem with the patch.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-04-22 17:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-17 17:38 4.0 kernel XFS filesystem crash when running AIM7's disk workload Waiman Long
2015-04-17 23:45 ` Dave Chinner
2015-04-21 20:52   ` Waiman Long
2015-04-21 21:59     ` Dave Chinner
2015-04-22 17:45       ` Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).