linux-kernel.vger.kernel.org archive mirror
* [BUG] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c53df
@ 2017-06-06 10:00 Li Wang
  2017-06-06 21:04 ` Theodore Ts'o
  2017-06-07  3:27 ` [LTP] " Eryu Guan
  0 siblings, 2 replies; 3+ messages in thread
From: Li Wang @ 2017-06-06 10:00 UTC (permalink / raw)
  To: ebiggers, jack, tytso; +Cc: linux-kernel, ltp, linux-ext4, Chunyu Hu

Hi,

ltp/access04 always panics the latest mainline kernel (4.12-rc4) on
ppc64le. From the call trace, I guess the cause is probably that the
test mounts an ext2 file system using the ext4 driver.

A simple way to reproduce:

# dd of=wangli if=/dev/zero count=1024 bs=1024
# mkfs -t ext2 wangli
# mount -t ext4 wangli /mnt/


Have there been any recent changes in ext4 (in kernel 4.12-rc4)?


[  318.557844] EXT4-fs (loop0): mounting ext2 file system using the ext4 subsystem
[  318.558104] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c53df
[  318.558109] Faulting instruction address: 0xc000000000918b28
[  318.558114] Oops: Kernel access of bad area, sig: 7 [#1]
[  318.558117] SMP NR_CPUS=2048
[  318.558117] NUMA
[  318.558120] pSeries
[  318.558124] Modules linked in: ext4 jbd2 mbcache loop rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sg pseries_rng nfsd auth_rpcgss nfs_acl lockd ghash_generic gf128mul xts vmx_crypto grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
[  318.558152] CPU: 2 PID: 40748 Comm: access04 Not tainted 4.12.0-rc4 #1
[  318.558155] task: c0000003889fb200 task.stack: c0000003ac134000
[  318.558158] NIP: c000000000918b28 LR: c00000000011c5d4 CTR: c000000000130900
[  318.558162] REGS: c0000003ac137420 TRAP: 0600   Not tainted  (4.12.0-rc4)
[  318.558164] MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[  318.558171]   CR: 28028842  XER: 00000000
[  318.558174] CFAR: c00000000011c5d0 DAR: c0000001c52c53df DSISR: 00000000 SOFTE: 0
[  318.558174] GPR00: c00000000011c5d4 c0000003ac1376a0 c000000001049000 c0000001c52c53df
[  318.558174] GPR04: c0000004788657f0 0000000000000000 0000000000000000 0000000000000001
[  318.558174] GPR08: 0000000477be0000 0000000000000000 0000000080000002 0000000000000000
[  318.558174] GPR12: c000000000130900 c00000000fac1500 0000000000000000 c0000004648b6800
[  318.558174] GPR16: 0000000000000000 c000000408ad0400 0000000000000000 0000000000040001
[  318.558174] GPR20: 0000000000000001 0000000000000000 0000000000004000 c000000000cc5780
[  318.558174] GPR24: 00000001c45ffc5f 0000000000000000 c000000000cc5780 c0000001c52c53df
[  318.558174] GPR28: c000000009d06034 0000000000000004 0000000000000800 c0000001c52c53df
[  318.558222] NIP [c000000000918b28] _raw_spin_lock+0x28/0xc0
[  318.558226] LR [c00000000011c5d4] try_to_wake_up+0x1f4/0x5b0
[  318.558229] Call Trace:
[  318.558231] [c0000003ac1376a0] [c000000009d06034] 0xc000000009d06034 (unreliable)
[  318.558236] [c0000003ac1376d0] [c00000000011c5d4] try_to_wake_up+0x1f4/0x5b0
[  318.558241] [c0000003ac137750] [c000000000102828] create_worker+0x148/0x250
[  318.558245] [c0000003ac1377f0] [c0000000001059dc] alloc_unbound_pwq+0x3bc/0x4c0
[  318.558249] [c0000003ac137850] [c00000000010601c] apply_wqattrs_prepare+0x2ac/0x320
[  318.558253] [c0000003ac1378c0] [c0000000001060cc] apply_workqueue_attrs_locked+0x3c/0xa0
[  318.558257] [c0000003ac1378f0] [c00000000010662c] apply_workqueue_attrs+0x4c/0x80
[  318.558261] [c0000003ac137930] [c0000000001081cc] __alloc_workqueue_key+0x16c/0x4e0
[  318.558280] [c0000003ac1379f0] [d000000008455ca0] ext4_fill_super+0x1c70/0x3390 [ext4]
[  318.558286] [c0000003ac137b30] [c000000000316bdc] mount_bdev+0x21c/0x250
[  318.558298] [c0000003ac137bd0] [d00000000844db20] ext4_mount+0x20/0x40 [ext4]
[  318.558303] [c0000003ac137bf0] [c000000000318184] mount_fs+0x74/0x210
[  318.558307] [c0000003ac137ca0] [c00000000033fd18] vfs_kern_mount+0x68/0x1d0
[  318.558310] [c0000003ac137d10] [c000000000344a28] do_mount+0x278/0xef0
[  318.558314] [c0000003ac137de0] [c000000000345ac4] SyS_mount+0x94/0x100
[  318.558319] [c0000003ac137e30] [c00000000000af84] system_call+0x38/0xe0
[  318.558322] Instruction dump:
[  318.558324] 990d02bc 4bffffc8 3c4c0073 38420500 7c0802a6 fbe1fff8 7c7f1b78 f8010010
[  318.558329] f821ffd1 39400000 994d02bc 814d0008 <7d201829> 2c090000 40c20010 7d40192d
[  318.558336] ---[ end trace a2b72248c6bfebea ]---




More information about the test environment
------------------------------------------
# uname -rm
4.12.0-rc4 ppc64le

# lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    8
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          2
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8 (architected), altivec supported
Hypervisor vendor:     pHyp
Virtualization type:   para
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s):     0-15
NUMA node1 CPU(s):


-- 
Li Wang
liwang@redhat.com


* Re: [BUG] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c53df
From: Theodore Ts'o @ 2017-06-06 21:04 UTC (permalink / raw)
  To: Li Wang; +Cc: ebiggers, jack, linux-kernel, ltp, linux-ext4, Chunyu Hu

On Tue, Jun 06, 2017 at 06:00:34PM +0800, Li Wang wrote:
> Hi,
> 
> ltp/access04 always panics the latest mainline kernel (4.12-rc4) on
> ppc64le. From the call trace, I guess the cause is probably that the
> test mounts an ext2 file system using the ext4 driver.
> 
> A simple way to reproduce:
> 
> # dd of=wangli if=/dev/zero count=1024 bs=1024
> # mkfs -t ext2 wangli
> # mount -t ext4 wangli /mnt/

So I'm guessing from the stack trace that the crash is happening while
creating a workqueue:

> [  318.558229] Call Trace:
> [  318.558231] [c0000003ac1376a0] [c000000009d06034] 0xc000000009d06034 (unreliable)
> [  318.558236] [c0000003ac1376d0] [c00000000011c5d4] try_to_wake_up+0x1f4/0x5b0
> [  318.558241] [c0000003ac137750] [c000000000102828] create_worker+0x148/0x250
> [  318.558245] [c0000003ac1377f0] [c0000000001059dc] alloc_unbound_pwq+0x3bc/0x4c0
> [  318.558249] [c0000003ac137850] [c00000000010601c] apply_wqattrs_prepare+0x2ac/0x320
> [  318.558253] [c0000003ac1378c0] [c0000000001060cc] apply_workqueue_attrs_locked+0x3c/0xa0
> [  318.558257] [c0000003ac1378f0] [c00000000010662c] apply_workqueue_attrs+0x4c/0x80
> [  318.558261] [c0000003ac137930] [c0000000001081cc] __alloc_workqueue_key+0x16c/0x4e0
> [  318.558280] [c0000003ac1379f0] [d000000008455ca0] ext4_fill_super+0x1c70/0x3390 [ext4]
  ...

And the ext4 code in question is just doing this:

	EXT4_SB(sb)->rsv_conversion_wq =
		alloc_workqueue("ext4-rsv-conversion", WQ_MEM_RECLAIM | WQ_UNBOUND, 1);

This looks pretty boring, and I don't see anything specific to ext4 or
to the file system being mounted that would cause the crash.

Can you bisect this by any chance?  When was the last kernel version
where this worked?  When did it first fail?

Thanks,

					- Ted


* Re: [LTP] [BUG] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c53df
From: Eryu Guan @ 2017-06-07  3:27 UTC (permalink / raw)
  To: Li Wang; +Cc: ebiggers, jack, tytso, linux-ext4, linux-kernel, ltp

On Tue, Jun 06, 2017 at 06:00:34PM +0800, Li Wang wrote:
> Hi,
> 
> ltp/access04 always panics the latest mainline kernel (4.12-rc4) on
> ppc64le. From the call trace, I guess the cause is probably that the
> test mounts an ext2 file system using the ext4 driver.
> 
> A simple way to reproduce:
> 
> # dd of=wangli if=/dev/zero count=1024 bs=1024
> # mkfs -t ext2 wangli
> # mount -t ext4 wangli /mnt/

I can't reproduce this crash on a ppc64le host, either with your
reproducer or with the ltp access04 test.

> 
> 
> Have there been any recent changes in ext4 (in kernel 4.12-rc4)?

I don't think it's an ext4 bug. I've seen similar crashes twice while
testing the 4.12-rc4 kernel: once running fstests on XFS, and once
running ltp on ext3. It doesn't seem to be related to filesystem code.

[  828.119270] run fstests generic/034 at 2017-06-06 19:16:10 
[  828.720341] XFS (sda5): Unmounting Filesystem 
[  828.814003] device-mapper: uevent: version 1.0.3 
[  828.814096] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c5e7f 
[  828.814103] Faulting instruction address: 0xc0000000004d214c 
[  828.814109] Oops: Kernel access of bad area, sig: 7 [#1] 
[  828.814113] SMP NR_CPUS=2048  
[  828.814114] NUMA  
[  828.814117] pSeries 
[  828.814122] Modules linked in: dm_mod(+) sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp 
[  828.814150] CPU: 10 PID: 137772 Comm: modprobe Not tainted 4.12.0-rc4 #1 
[  828.814155] task: c0000003fe13c800 task.stack: c00000046ec68000 
[  828.814163] NIP: c0000000004d214c LR: c00000000011c884 CTR: c000000000130900 
[  828.814168] REGS: c00000046ec6b3d0 TRAP: 0600   Not tainted  (4.12.0-rc4) 
[  828.814173] MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> 
[  828.814184]   CR: 28228244  XER: 00000005 
[  828.814191] CFAR: c00000000011c880 DAR: c0000001c52c5e7f DSISR: 00000000 SOFTE: 0  
[  828.814191] GPR00: c00000000011c848 c00000046ec6b650 c000000001049100 c0000003f3b77020  
[  828.814191] GPR04: c0000003f3b77020 c0000001c52c5e7f 0000000000000000 0000000000000001  
[  828.814191] GPR08: 0008f92d89943c42 00000024000048b7 0000000000000008 0000000000000000  
[  828.814191] GPR12: c000000000130900 c00000000fac6900 d000000007dd3908 d000000007dd3908  
[  828.814191] GPR16: c00000046ec6bdec c00000046ec6bda0 000000000000ff20 0000000000000000  
[  828.814191] GPR20: 00000000000052f8 0000000000000000 0000000000004000 c000000000cc5780  
[  828.814191] GPR24: 00000001c45ffc5f 0000000000000000 00000001c45ffc5f c00000000107dd00  
[  828.814191] GPR28: c0000003f3b77834 0000000000000004 0000000000000800 c0000003f3b77000  
[  828.814257] NIP [c0000000004d214c] llist_add_batch+0xc/0x40 
[  828.814263] LR [c00000000011c884] try_to_wake_up+0x4a4/0x5b0 
[  828.814268] Call Trace: 
[  828.814273] [c00000046ec6b650] [c00000000011c848] try_to_wake_up+0x468/0x5b0 (unreliable) 
[  828.814282] [c00000046ec6b6d0] [c000000000102828] create_worker+0x148/0x250 
[  828.814290] [c00000046ec6b770] [c0000000001059dc] alloc_unbound_pwq+0x3bc/0x4c0 
[  828.814296] [c00000046ec6b7d0] [c00000000010601c] apply_wqattrs_prepare+0x2ac/0x320 
[  828.814304] [c00000046ec6b840] [c0000000001060cc] apply_workqueue_attrs_locked+0x3c/0xa0 
[  828.814313] [c00000046ec6b870] [c00000000010662c] apply_workqueue_attrs+0x4c/0x80 
[  828.814322] [c00000046ec6b8b0] [c0000000001081cc] __alloc_workqueue_key+0x16c/0x4e0 
[  828.814343] [c00000046ec6b970] [d000000007e04748] local_init+0xdc/0x1a4 [dm_mod] 
[  828.814362] [c00000046ec6b9f0] [d000000007e04854] dm_init+0x44/0xc4 [dm_mod] 
[  828.814375] [c00000046ec6ba30] [c00000000000ccf0] do_one_initcall+0x60/0x1c0 
[  828.814390] [c00000046ec6baf0] [c00000000091e748] do_init_module+0x8c/0x244 
[  828.814405] [c00000046ec6bb80] [c000000000197e08] load_module+0x12f8/0x1600 
[  828.814414] [c00000046ec6bd30] [c000000000198388] SyS_finit_module+0xa8/0x110 
[  828.814424] [c00000046ec6be30] [c00000000000af84] system_call+0x38/0xe0 
[  828.814429] Instruction dump: 
[  828.814436] 60420000 38600000 4e800020 60000000 60420000 7c832378 4e800020 60000000  
[  828.814448] 60000000 e9250000 f9240000 7c0004ac <7d4028a8> 7c2a4800 40c20010 7c6029ad  
[  828.814466] ---[ end trace 87ec4ff1fa8e1a3d ]--- 

I suspect it's a regression introduced in the 4.12-rc4 kernel; I didn't
see such crashes when testing the 4.12-rc3 kernel. I'll bisect once I've
worked out a reliable reproducer (unless you can reliably reproduce it
with your reproducer :).

Thanks,
Eryu
