linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [next-20180727][qla2xxx][BUG] WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280
@ 2018-08-01  6:33 Abdul Haleem
  2018-08-01  7:19 ` jianchao.wang
  0 siblings, 1 reply; 2+ messages in thread
From: Abdul Haleem @ 2018-08-01  6:33 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-block, linux-fsdevel, linux-ext4, linux-scsi, linux-next,
	Stephen Rothwell, linux-kernel, jejb, Jens Axboe, dgilbert,
	bart.vanassche, rosattig, kyle.mahlkuch

Greeting's

linux-next kernel crashes when creating file system on scsi disks using
mkfs.ext4 (Discarding device blocks) 

Machine: Power 8 Power VM LPAR
kernel : 4.18.0-rc6-next-20180727
adapter : Fibre Channel [0c04]: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA [1077:2532]
Test:  mkfs.ext4 /dev/mapper/maptha

Above command triggers WARN_ON_ONCE at line:
0xc000000000690080 is in scsi_end_request (drivers/scsi/scsi_lib.c:691)
686	
687		if (blk_queue_add_random(q))
688			add_disk_randomness(req->rq_disk);
689	
690		if (!blk_rq_is_scsi(req)) {
691			WARN_ON_ONCE(!(cmd->flags & SCMD_INITIALIZED));
692			cmd->flags &= ~SCMD_INITIALIZED;
693			destroy_rcu_head(&cmd->rcu);
694		}
695	

followed by kernel crash

# mkfs -t ext4 /dev/mapper/mpatha
mke2fs 1.43.1 (08-Jun-2016)
Found a dos partition table in /dev/mapper/mpatha
Proceed anyway? (y,n) y
Discarding device blocks: 
qla2xxx [0106:a0:00.1]-801c:2: Abort command issued nexus=2:1:0 --  1 2002.
qla2xxx [0106:a0:00.0]-801c:0: Abort command issued nexus=0:1:0 --  1 2002.
qla2xxx [0106:a0:00.1]-801c:2: Abort command issued nexus=2:1:0 --  1 2002.
qla2xxx [0106:a0:00.0]-801c:0: Abort command issued nexus=0:1:0 --  1 2002.
WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280
Modules linked in: xt_CHECKSUM(E) ipt_MASQUERADE(E) tun(E) kvm_pr(E)
kvm(E) ip6t_rpfilter(E) ip6t_REJECT(E) nf_reject_ipv6(E) ipt_REJECT(E)
nf_reject_ipv4(E) xt_conntrack(E) ip_set(E) nfnetlink(E) ebtable_nat(E)
ebtable_broute(E) bridge(E) stp(E) llc(E) ip6table_raw(E)
ip6table_nat(E) nf_nat_ipv6(E) ip6table_security(E) ip6table_mangle(E)
iptable_raw(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E)
nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) iptable_security(E)
iptable_mangle(E) ebtable_filter(E) ebtables(E) ip6table_filter(E)
ip6_tables(E) iptable_filter(E) ip_tables(E) dm_service_time(E) xts(E)
vmx_crypto(E) pseries_rng(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E)
grace(E) sunrpc(E) dm_multipath(E) dm_mod(E) sch_fq_codel(E) ext4(E)
mbcache(E) jbd2(E) fscrypto(E) sr_mod(E)
 cdrom(E) sd_mod(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E)
qla2xxx(E) nvme_fc(E) nvme_fabrics(E) nvme_core(E) scsi_transport_fc(E)
tg3(E)
CPU: 12 PID: 511 Comm: kworker/12:2 Tainted: G            E     4.18.0-rc6-next-20180727-autotest-autotest #1
Workqueue: qla2xxx_wq qla25xx_free_rsp_que [qla2xxx]
NIP:  c000000000690080 LR: c00000000068fe80 CTR: d000000005050788
REGS: c00000027d39b380 TRAP: 0700   Tainted: G            E      (4.18.0-rc6-next-20180727-autotest-autotest)
MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002824  XER: 20000000
CFAR: c00000000068feb8 IRQMASK: 1 
GPR00: c00000000068fe80 c00000027d39b600 c00000000113de00 0000000000000000 
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 
GPR08: c00000027dce3000 0000000000000001 0000000000000000 d000000004330708 
GPR12: 0000000000002200 c00000000ec57600 c00000000012cf78 c000000289561b40 
GPR16: 0000000000000000 0000000000000000 0000000000044000 0000000000004007 
GPR20: c00000027dce3860 0000000000000000 0000000000000000 c000000280db9c60 
GPR24: 0000000000000002 c00000000153567c c0000002712a1cc8 0000000000000000 
GPR28: 0000000000000000 c0000002712a1cc8 c00000027e217000 c0000002708efe00 
NIP [c000000000690080] scsi_end_request+0x250/0x280
LR [c00000000068fe80] scsi_end_request+0x50/0x280
Call Trace:
[c00000027d39b600] [c00000000068fe80] scsi_end_request+0x50/0x280 (unreliable)
[c00000027d39b660] [c0000000006904ac] scsi_io_completion+0x29c/0x7d0
[c00000027d39b710] [c0000000006848e4] scsi_finish_command+0x104/0x1c0
[c00000027d39b790] [c00000000068f148] scsi_softirq_done+0x198/0x1f0
[c00000027d39b820] [c0000000004f2b80] blk_mq_complete_request+0x130/0x1d0
[c00000027d39b860] [c00000000068d27c] scsi_mq_done+0x2c/0xe0
[c00000027d39b890] [d000000004291080] qla2xxx_qpair_sp_compl+0xa8/0x140 [qla2xxx]
[c00000027d39b900] [d0000000042cc9d0] qla2x00_process_completed_request+0x68/0x140 [qla2xxx]
------------[ cut here ]------------
kernel BUG at block/blk-core.c:3196!
[c00000027d39b970] [d0000000042cd4ac] qla2x00_status_entry.isra.7+0x4f4/0x1750 [qla2xxx]
Oops: Exception in kernel mode, sig: 5 [#1]
[c00000027d39bb00] [d0000000042cfab0] qla24xx_process_response_queue+0x7d8/0xba0 [qla2xxx]
LE SMP NR_CPUS=2048 
[c00000027d39bc40] [d0000000042f8c48] qla25xx_free_rsp_que+0x1f0/0x270 [qla2xxx]
NUMA pSeries
[c00000027d39bc80] [c00000000012478c] process_one_work+0x24c/0x500
Modules linked in: xt_CHECKSUM(E)
[c00000027d39bd20] [c000000000124acc] worker_thread+0x8c/0x590
 ipt_MASQUERADE(E)
[c00000027d39bdc0] [c00000000012d0c8] kthread+0x158/0x1a0
 tun(E)
[c00000027d39be30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80
 kvm_pr(E)
Instruction dump:
 kvm(E)
7f23cb78 
 ip6t_rpfilter(E)
4bed1d15 
 ip6t_REJECT(E)
60000000 3c62003f 
 nf_reject_ipv6(E)
e8637888 
 ipt_REJECT(E)
7f24cb78 
 nf_reject_ipv4(E)
4bc9ade1 
 xt_conntrack(E)
60000000 
 ip_set(E)
e93f0258 
 nfnetlink(E)
e9290120 
 ebtable_nat(E)
fb4900c0 
 ebtable_broute(E)
4bffff48 
 bridge(E)
<0fe00000> 
 stp(E)
4bfffe38 
 llc(E)
60000000 60000000 
 ip6table_raw(E)
---[ end trace ee1b33dccf2a8b8a ]---
 ip6table_nat(E) nf_nat_ipv6(E) ip6table_security(E) ip6table_mangle(E)
iptable_raw(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E)
nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) iptable_security(E)
iptable_mangle(E) ebtable_filter(E) ebtables(E) ip6table_filter(E)
ip6_tables(E) iptable_filter(E) ip_tables(E) dm_service_time(E) xts(E)
vmx_crypto(E) pseries_rng(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E)
grace(E) sunrpc(E) dm_multipath(E) dm_mod(E) sch_fq_codel(E) ext4(E)
mbcache(E) jbd2(E) fscrypto(E) sr_mod(E) cdrom(E) sd_mod(E) ibmvscsi(E)
scsi_transport_srp(E) ibmveth(E) qla2xxx(E) nvme_fc(E) nvme_fabrics(E)
nvme_core(E) scsi_transport_fc(E) tg3(E)
CPU: 5 PID: 0 Comm: swapper/5 Tainted: G        W   E     4.18.0-rc6-next-20180727-autotest-autotest #1
NIP:  c0000000004e5160 LR: c0000000004e529c CTR: c000000000022a30
REGS: c00000028ffb39f0 TRAP: 0700   Tainted: G        W   E      (4.18.0-rc6-next-20180727-autotest-autotest)
MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28244482  XER: 2000000f
CFAR: c0000000004e52a4 IRQMASK: 1 
GPR00: c0000000004e529c c00000028ffb3c70 c00000000113de00 c00000028911cc00 
GPR04: 0000002461ee9abb ffffffffffffffff 0000000001f401d3 0010d9418d5afe73 
GPR08: 0000000000000018 0000000000000001 00000000000005ff d000000005b89418 
GPR12: c000000000022a30 c00000000eca8a00 c00000028bcc3f90 0000000000000000 
GPR16: 0000000000000001 c000000000db7200 c000000001163b00 00000000ffffc7db 
GPR20: 000000000000000a c000000000db7200 c000000001162200 c00000028ffb0000 
GPR24: c00000028ffb0000 0000000000000100 0000000000000028 c000000001162200 
GPR28: 0000000000000000 0000002461ee9abb c00000027de90998 c00000028911cc00 
NIP [c0000000004e5160] blk_finish_request+0x70/0x200
LR [c0000000004e529c] blk_finish_request+0x1ac/0x200
Call Trace:
[c00000028ffb3c70] [c0000000004e529c] blk_finish_request+0x1ac/0x200 (unreliable)
[c00000028ffb3cb0] [c0000000004e5360] blk_end_bidi_request+0x70/0xc0
[c00000028ffb3cf0] [c0000000004e540c] blk_end_request_all+0x2c/0x60
[c00000028ffb3d10] [d000000005b86ae4] map_request+0x8dc/0x920 [dm_mod]
[c00000028ffb3db0] [c0000000004f0470] blk_done_softirq+0xc0/0xf0
[c00000028ffb3df0] [c0000000009e5ed0] __do_softirq+0x170/0x3ac
[c00000028ffb3ef0] [c000000000108690] irq_exit+0x1b0/0x1d0
[c00000028ffb3f20] [c000000000016ac8] __do_irq+0x98/0x210
[c00000028ffb3f90] [c000000000028830] call_do_irq+0x14/0x24
[c00000028bcc39e0] [c000000000016cd4] do_IRQ+0x94/0x140
[c00000028bcc3a30] [c000000000008c24] hardware_interrupt_common+0x114/0x120
--- interrupt: 501 at plpar_hcall_norets+0x1c/0x28
    LR = check_and_cede_processor+0x2c/0x40
[c00000028bcc3d20] [c00000028bcc0000] 0xc00000028bcc0000 (unreliable)
[c00000028bcc3d80] [c0000000007b8cb8] shared_cede_loop+0x48/0x130
[c00000028bcc3db0] [c0000000007b5fd0] cpuidle_enter_state+0xa0/0x3f0
[c00000028bcc3e10] [c000000000144ae4] call_cpuidle+0x44/0x90
[c00000028bcc3e30] [c0000000001450d8] do_idle+0x328/0x3a0
[c00000028bcc3ec0] [c000000000145354] cpu_startup_entry+0x34/0x40
[c00000028bcc3ef0] [c0000000000495e0] start_secondary+0x4d0/0x510
[c00000028bcc3f90] [c00000000000ab70] start_secondary_prolog+0x10/0x14
Instruction dump:
2fa90000 409e01ac 813f0018 752a0002 40820150 712a0004 40820128 7fe9fb78 
e9490041 7d295278 3149ffff 7d2a4910 <0b090000> 3d220037 39290e84 81290000 
---[ end trace ee1b33dccf2a8b8b ]---

Kernel panic - not syncing: Fatal exception in interrupt
WARNING: CPU: 5 PID: 0 at drivers/tty/vt/vt.c:4220 do_unblank_screen+0x1f0/0x280
Modules linked in: xt_CHECKSUM(E) ipt_MASQUERADE(E) tun(E) kvm_pr(E)
kvm(E) ip6t_rpfilter(E) ip6t_REJECT(E) nf_reject_ipv6(E) ipt_REJECT(E)
nf_reject_ipv4(E) xt_conntrack(E) ip_set(E) nfnetlink(E) ebtable_nat(E)
ebtable_broute(E) bridge(E) stp(E) llc(E) ip6table_raw(E)
ip6table_nat(E) nf_nat_ipv6(E) ip6table_security(E) ip6table_mangle(E)
iptable_raw(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E)
nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) iptable_security(E)
iptable_mangle(E) ebtable_filter(E) ebtables(E) ip6table_filter(E)
ip6_tables(E) iptable_filter(E) ip_tables(E) dm_service_time(E) xts(E)
vmx_crypto(E) pseries_rng(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E)
grace(E) sunrpc(E) dm_multipath(E) dm_mod(E) sch_fq_codel(E) ext4(E)
mbcache(E) jbd2(E) fscrypto(E) sr_mod(E)
 cdrom(E) sd_mod(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E)
qla2xxx(E) nvme_fc(E) nvme_fabrics(E) nvme_core(E) scsi_transport_fc(E)
tg3(E)
CPU: 5 PID: 0 Comm: swapper/5 Tainted: G      D W   E     4.18.0-rc6-next-20180727-autotest-autotest #1
NIP:  c000000000605360 LR: c0000000006051ac CTR: c00000000002c0f0
REGS: c00000028ffb3320 TRAP: 0700   Tainted: G      D W   E      (4.18.0-rc6-next-20180727-autotest-autotest)
MSR:  8000000000021033 <SF,ME,IR,DR,RI,LE>  CR: 88242222  XER: 20000009
CFAR: c0000000006051c0 IRQMASK: 3 
GPR00: c0000000006053a4 c00000028ffb35a0 c00000000113de00 0000000000000000 
GPR04: 0000000000000003 000000000000000a 0000000030656665 6320303030303030 
GPR08: 000000028eeb0000 0000000000000000 0000000000000000 3030303731326537 
GPR12: c00000000002c0f0 c00000000eca8a00 c00000028bcc3f90 0000000000000000 
GPR16: 0000000000000001 c000000000db7200 c000000001163b00 00000000ffffc7db 
GPR20: 000000000000000a c000000000db7200 c000000001162200 c00000028ffb0000 
GPR24: c00000028ffb0000 0000000000000100 c000000000feb688 c0000000012f6658 
GPR28: c0000000004e5160 0000000000000000 c0000000014050c0 0000000000000000 
NIP [c000000000605360] do_unblank_screen+0x1f0/0x280
LR [c0000000006051ac] do_unblank_screen+0x3c/0x280
Call Trace:
[c00000028ffb35a0] [c0000000006053a4] do_unblank_screen+0x234/0x280 (unreliable)
[c00000028ffb3620] [c00000000052bbf0] bust_spinlocks+0x40/0x80
[c00000028ffb3640] [c000000000100200] panic+0x1b4/0x314
[c00000028ffb36e0] [c000000000023d48] oops_end+0x1d8/0x200
[c00000028ffb3760] [c000000000024254] _exception_pkey+0x1c4/0x1f0
[c00000028ffb3900] [c000000000026330] program_check_exception+0x2c0/0x3e0
[c00000028ffb3980] [c000000000008e94] program_check_common+0x134/0x140
--- interrupt: 700 at blk_finish_request+0x70/0x200
    LR = blk_finish_request+0x1ac/0x200
[c00000028ffb3cb0] [c0000000004e5360] blk_end_bidi_request+0x70/0xc0
[c00000028ffb3cf0] [c0000000004e540c] blk_end_request_all+0x2c/0x60
[c00000028ffb3d10] [d000000005b86ae4] map_request+0x8dc/0x920 [dm_mod]
[c00000028ffb3db0] [c0000000004f0470] blk_done_softirq+0xc0/0xf0
[c00000028ffb3df0] [c0000000009e5ed0] __do_softirq+0x170/0x3ac
[c00000028ffb3ef0] [c000000000108690] irq_exit+0x1b0/0x1d0
[c00000028ffb3f20] [c000000000016ac8] __do_irq+0x98/0x210
[c00000028ffb3f90] [c000000000028830] call_do_irq+0x14/0x24
[c00000028bcc39e0] [c000000000016cd4] do_IRQ+0x94/0x140
[c00000028bcc3a30] [c000000000008c24] hardware_interrupt_common+0x114/0x120
--- interrupt: 501 at plpar_hcall_norets+0x1c/0x28
    LR = check_and_cede_processor+0x2c/0x40
[c00000028bcc3d20] [c00000028bcc0000] 0xc00000028bcc0000 (unreliable)
[c00000028bcc3d80] [c0000000007b8cb8] shared_cede_loop+0x48/0x130
[c00000028bcc3db0] [c0000000007b5fd0] cpuidle_enter_state+0xa0/0x3f0
[c00000028bcc3e10] [c000000000144ae4] call_cpuidle+0x44/0x90
[c00000028bcc3e30] [c0000000001450d8] do_idle+0x328/0x3a0
[c00000028bcc3ec0] [c000000000145354] cpu_startup_entry+0x34/0x40
[c00000028bcc3ef0] [c0000000000495e0] start_secondary+0x4d0/0x510
[c00000028bcc3f90] [c00000000000ab70] start_secondary_prolog+0x10/0x14
Instruction dump:
4bb79dc5 60000000 38210080 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 
4e800020 60000000 60000000 60000000 <0fe00000> 4bfffe60 60000000 60000000 
---[ end trace ee1b33dccf2a8b8c ]---
Rebooting in 10 seconds..

# lsblk
NAME     MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT 
sdd        8:48   0   30G  0 disk  
└─mpatha 253:1    0   30G  0 mpath 
sdb        8:16   0   30G  0 disk  
└─mpatha 253:1    0   30G  0 mpath 

# lsscsi
[0:0:1:0]    disk    AIX      VDASD            0001  /dev/sda 
[1:0:0:0]    disk    IBM      2145             0000  /dev/sdb 
[1:0:1:0]    disk    IBM      2145             0000  /dev/sdd 

# lsscsi -H
[0]    ibmvscsi      
[1]    qla2xxx       

-- 
Regard's

Abdul Haleem
IBM Linux Technology Centre

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [next-20180727][qla2xxx][BUG] WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280
  2018-08-01  6:33 [next-20180727][qla2xxx][BUG] WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280 Abdul Haleem
@ 2018-08-01  7:19 ` jianchao.wang
  0 siblings, 0 replies; 2+ messages in thread
From: jianchao.wang @ 2018-08-01  7:19 UTC (permalink / raw)
  To: Abdul Haleem, linuxppc-dev, Madhani, Himanshu
  Cc: linux-block, linux-fsdevel, linux-ext4, linux-scsi, linux-next,
	Stephen Rothwell, linux-kernel, jejb, Jens Axboe, dgilbert,
	bart.vanassche, rosattig, kyle.mahlkuch

Hi Abdul

On 08/01/2018 02:33 PM, Abdul Haleem wrote:
> # mkfs -t ext4 /dev/mapper/mpatha
> mke2fs 1.43.1 (08-Jun-2016)
> Found a dos partition table in /dev/mapper/mpatha
> Proceed anyway? (y,n) y
> Discarding device blocks: 
> qla2xxx [0106:a0:00.1]-801c:2: Abort command issued nexus=2:1:0 --  1 2002.
> qla2xxx [0106:a0:00.0]-801c:0: Abort command issued nexus=0:1:0 --  1 2002.
> qla2xxx [0106:a0:00.1]-801c:2: Abort command issued nexus=2:1:0 --  1 2002.
> qla2xxx [0106:a0:00.0]-801c:0: Abort command issued nexus=0:1:0 --  1 2002.
> WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280
...
> NIP [c000000000690080] scsi_end_request+0x250/0x280
> LR [c00000000068fe80] scsi_end_request+0x50/0x280
> Call Trace:
> [c00000027d39b600] [c00000000068fe80] scsi_end_request+0x50/0x280 (unreliable)
> [c00000027d39b660] [c0000000006904ac] scsi_io_completion+0x29c/0x7d0
> [c00000027d39b710] [c0000000006848e4] scsi_finish_command+0x104/0x1c0
> [c00000027d39b790] [c00000000068f148] scsi_softirq_done+0x198/0x1f0
> [c00000027d39b820] [c0000000004f2b80] blk_mq_complete_request+0x130/0x1d0
> [c00000027d39b860] [c00000000068d27c] scsi_mq_done+0x2c/0xe0
> [c00000027d39b890] [d000000004291080] qla2xxx_qpair_sp_compl+0xa8/0x140 [qla2xxx]
> [c00000027d39b900] [d0000000042cc9d0] qla2x00_process_completed_request+0x68/0x140 [qla2xxx]
> ------------[ cut here ]------------
> kernel BUG at block/blk-core.c:3196!

blk_finish_request

BUG_ON(blk_queued_rq(req))

We are also suffering a similar issue on qla2xxx,
the BUG_ON in blk_finish_request is triggered while there are lots of command aborted.
The root cause should be qla2xxx driver still invoke scsi_done for an aborted command
and cause race between requeue path and normal complete path.

Add Himanshu Madhani from qlogic team.
It seems that they are working on this.

Thanks
Jianchao

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2018-08-01  9:04 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-01  6:33 [next-20180727][qla2xxx][BUG] WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280 Abdul Haleem
2018-08-01  7:19 ` jianchao.wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).