All of lore.kernel.org
 help / color / mirror / Atom feed
* megaraid_sas problem for scsi_add_host() fail
@ 2020-02-27 18:41 John Garry
  2020-03-02 11:10 ` Sumit Saxena
  0 siblings, 1 reply; 6+ messages in thread
From: John Garry @ 2020-02-27 18:41 UTC (permalink / raw)
  To: Kashyap Desai, Sumit Saxena, Shivasharan S; +Cc: megaraidlinux.pdl, linux-scsi

Hi guys,

While do some testing, I noticed that the megaraid SAS driver has a 
problem and looks to corrupt my system when the call to scsi_add_host() 
in the driver fails:

[   62.516871] megasas: 07.713.01.00-rc1
[   62.526189] megaraid_sas 0000:08:00.0: Adding to iommu group 1
[   62.571790] megaraid_sas 0000:08:00.0: BAR:0x0  BAR's 
base_addr(phys):0x0000080010000000  mapped virt_addr:0x(____ptrval____)
[   62.571802] megaraid_sas 0000:08:00.0: FW now in Ready state
[   62.583811] megaraid_sas 0000:08:00.0: 63 bit DMA mask and 63 bit 
consistent mask
[   62.602143] megaraid_sas 0000:08:00.0: firmware supports msix : (128)
[   62.780250] megaraid_sas 0000:08:00.0: requested/available msix 128/128
[   62.794292] megaraid_sas 0000:08:00.0: current msix/online cpus : 
(128/128)
[   62.809011] megaraid_sas 0000:08:00.0: RDPQ mode : (enabled)
[   62.820968] megaraid_sas 0000:08:00.0: Current firmware supports 
maximum commands: 4077 LDIO threshold: 0
[   62.937043] megaraid_sas 0000:08:00.0: Configured max firmware 
commands: 4076
[   63.509185] megaraid_sas 0000:08:00.0: Performance mode :Latency
[   63.521906] megaraid_sas 0000:08:00.0: FW supports sync cache : Yes
[   63.535148] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is 
called outbound_intr_mask:0x40000009
[   63.610607] megaraid_sas 0000:08:00.0: FW provided supportMaxExtLDs: 
1 max_lds: 64
[   63.626618] megaraid_sas 0000:08:00.0: controller type : MR(2048MB)
[   63.639870] megaraid_sas 0000:08:00.0: Online Controller Reset(OCR) : 
Enabled
[   63.654945] megaraid_sas 0000:08:00.0: Secure JBOD support : Yes
[   63.667661] megaraid_sas 0000:08:00.0: NVMe passthru support : Yes
[   63.667672] megaraid_sas 0000:08:00.0: FW provided TM TaskAbort/Reset 
timeout : 6 secs/60 secs
[   63.698922] megaraid_sas 0000:08:00.0: JBOD sequence map support : Yes
[   63.712715] megaraid_sas 0000:08:00.0: PCI Lane Margining support : No
[   63.754764] megaraid_sas 0000:08:00.0: NVME page size : (4096)
[   63.787258] megaraid_sas 0000:08:00.0: megasas_enable_intr_fusion is 
called outbound_intr_mask:0x40000000
[   63.807485] megaraid_sas 0000:08:00.0: INIT adapter done
[   63.822235] megaraid_sas 0000:08:00.0: pci id : 
(0x1000)/(0x0016)/(0x19e5)/(0xd215)
[   63.838652] megaraid_sas 0000:08:00.0: unevenspan support : no
[   63.850980] megaraid_sas 0000:08:00.0: firmware crash dump : no
[   63.863499] megaraid_sas 0000:08:00.0: JBOD sequence map : enabled
[   63.877352] scsi host0: Avago SAS based MegaRAID driver
[   63.890398] megaraid_sas 0000:08:00.0: Failed to add host from 
megasas_io_attach 6802
[   63.906999] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is 
called outbound_intr_mask:0x40000009
[   64.591755] nvme 0000:81:00.0: Adding to iommu group 2
[   64.636476] nvme nvme0: pci function 0000:81:00.0
[   64.669635] libphy: Fixed MDIO Bus: probed
[   64.680255] tun: Universal TUN/TAP device driver, 1.6
[   64.694422] thunder_xcv, ver 1.0
[   64.702042] thunder_bgx, ver 1.0
[   64.709277] nicpf, ver 1.0
[   64.718144] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[   64.730402] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[   64.743337] igb: Intel(R) Gigabit Ethernet Network Driver - version 
5.6.0-k
[   64.754981] nvme nvme0: Removing after probe failure status: -12
[   64.757953] igb: Copyright (c) 2007-2014 Intel Corporation.
[   64.782805] igbvf: Intel(R) Gigabit Virtual Function Network Driver - 
version 2.4.0-k
[   64.799423] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
[   64.813848] sky2: driver version 1.30
[   64.825564] VFIO - User Level meta-driver version: 0.3
[   64.848089] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   64.862029] ehci-pci: EHCI PCI platform driver
[   64.873445] ehci-pci 0000:7a:01.0: Adding to iommu group 3
[   64.886700] 
==================================================================
[   64.901999] BUG: KASAN: slab-out-of-bounds in 
run_timer_softirq+0x6f4/0xae0
[   64.916663] Write of size 8 at addr ffff0026b931aae0 by task swapper/0/0

[   64.933914] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
[   64.952240] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, 
BIOS 2280-V2 CS V3.B160.01 02/24/2020
[   64.972575] Call trace:
[   64.977729]  dump_backtrace+0x0/0x298
[   64.985439]  show_stack+0x14/0x20
[   64.992418]  dump_stack+0x118/0x190
[   64.999762]  print_address_description.isra.9+0x6c/0x3b8
[   65.010953]  __kasan_report+0x134/0x23c
[   65.019029]  kasan_report+0xc/0x18
[   65.026188]  __asan_store8+0x94/0xb8
[   65.033720]  run_timer_softirq+0x6f4/0xae0
[   65.042343]  efi_header_end+0x16c/0x840
[   65.050420]  irq_exit+0x19c/0x1a8
[   65.057396]  __handle_domain_irq+0x7c/0xe0
[   65.066022]  gic_handle_irq+0x64/0x168
[   65.073917]  el1_irq+0xbc/0x180
[   65.080528]  arch_cpu_idle+0x3c/0x320
[   65.088239]  default_idle_call+0x28/0x4c
[   65.096502]  do_idle+0x278/0x348
[   65.103295]  cpu_startup_entry+0x24/0x40
[   65.111554]  rest_init+0x1c4/0x298
[   65.118718]  arch_call_rest_init+0xc/0x14
[   65.127159]  start_kernel+0x848/0x888

[   65.138006] Allocated by task 0:
[   65.144802] (stack is not available)

[   65.155465] Freed by task 0:
[   65.161530] (stack is not available)

[   65.172193] The buggy address belongs to the object at ffff0026b931aa00
  which belongs to the cache pool_workqueue of size 256
[   65.199113] The buggy address is located 224 bytes inside of
  256-byte region [ffff0026b931aa00, ffff0026b931ab00)
[   65.223840] The buggy address belongs to the page:
[   65.233931] page:fffffe009ac4c600 refcount:1 mapcount:0 
mapping:ffff0026dd81c880 index:0xffff0026b931fe00 compound_mapcount: 0
[   65.257923] flags: 0x6ffff00000010200(slab|head)
[   65.267649] raw: 6ffff00000010200 fffffe009b20b208 fffffe009ac07608 
ffff0026dd81c880
[   65.283959] raw: ffff0026b931fe00 0000000000400002 00000001ffffffff 
0000000000000000
[   65.300270] page dumped because: kasan: bad access detected

[   65.315139] Memory state around the buggy address:
[   65.325231]  ffff0026b931a980: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   65.340445]  ffff0026b931aa00: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   65.355660] >ffff0026b931aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   65.370870]                                                        ^
[   65.384256]  ffff0026b931ab00: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   65.399467]  ffff0026b931ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   65.414675] 
==================================================================
[   65.429885] Disabling lock debugging due to kernel taint
[   65.441431] Unable to handle kernel paging request at virtual address 
ffffa0001013c0b0
[   65.441695] ehci-pci 0000:7a:01.0: EHCI Host Controller
[   65.458088] Mem abort info:
[   65.469183] ehci-pci 0000:7a:01.0: new USB bus registered, assigned 
bus number 1
[   65.474927]   ESR = 0x96000007
[   65.491201] ehci-pci 0000:7a:01.0: irq 65, io mem 0x20c101000
[   65.496913]   EC = 0x25: DABT (current EL), IL = 32 bits
[   65.496918]   SET = 0, FnV = 0
[   65.496922]   EA = 0, S1PTW = 0
[   65.522586] ehci-pci 0000:7a:01.0: USB 0.0 started, EHCI 1.00
[   65.526575] Data abort info:
[   65.526580]   ISV = 0, ISS = 0x00000007
[   65.535948] hub 1-0:1.0: USB hub found
[   65.545245]   CM = 0, WnR = 0
[   65.545251] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000052530000
[   65.545256] [ffffa0001013c0b0] pgd=00002027fffff003, 
pud=00002027ffffe003, pmd=00000026dda5b003, pte=0000000000000000
[   65.551519] hub 1-0:1.0: 2 ports detected
[   65.559375] Internal error: Oops: 96000007 [#1] PREEMPT SMP
[   65.559379] Modules linked in:
[   65.569534] ehci-platform: EHCI generic platform driver
[   65.573475] CPU: 34 PID: 8 Comm: kworker/u256:0 Tainted: G    B 
       5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
[   65.573477] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, 
BIOS 2280-V2 CS V3.B160.01 02/24/2020
[   65.573487] Workqueue: poll_megasas0_status megasas_fault_detect_work
[   65.573492] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[   65.588048] ehci-orion: EHCI orion driver
[   65.609756] pc : megasas_readl+0x60/0x80
[   65.609759] lr : megasas_readl+0x1c/0x80
[   65.609761] sp : ffff0026d97bfc00
[   65.609763] x29: ffff0026d97bfc00 x28: ffff0026d97a9890
[   65.609767] x27: ffff0026d97a0618 x26: ffff0026d97a9880
[   65.609771] x25: ffff0026d9758808 x24: ffff0026b931aa28
[   65.609775] x23: ffff0026b931aa98 x22: ffffa0002931e000
[   65.609779] x21: ffff0026dd898800 x20: ffff0026b931dcd8
[   65.618543] ehci-exynos: EHCI Exynos driver
[   65.629840] x19: ffffa0001013c0b0 x18: 0000000000000000
[   65.629843] x17: 0000000000001d50 x16: ffffffffffffe240
[   65.629847] x15: 00000000000013a8 x14: 0000000000000000
[   65.629850] x13: 00000000000013a0 x12: 1fffe004db2f7f7c
[   65.629854] x11: ffff8004db2f7f78 x10: dfffa00000000000
[   65.629857] x9 : ffffa00028f679e8 x8 : ffffa0002a483a48
[   65.629861] x7 : ffffa00026d5ed94 x6 : 0000000000000000
[   65.629864] x5 : ffffa0002a483a48 x4 : 0000000000000000
[   65.629868] x3 : ffffa000279df03c x2 : 0000000000000000
[   65.636662] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[   65.647207] x1 : ef244e124d671400 x0 : 0000000000000004
[   65.647210] Call trace:
[   65.647214]  megasas_readl+0x60/0x80
[   65.647218]  megasas_read_fw_status_reg_fusion+0x2c/0x38
[   65.647221]  megasas_fault_detect_work+0x44/0x520
[   65.647226]  process_one_work+0x488/0xc08
[   65.647228]  worker_thread+0x68/0x5d0
[   65.647233]  kthread+0x1c8/0x1d0
[   65.669535] ohci-pci: OHCI PCI platform driver
[   65.689683]  ret_from_fork+0x10/0x18
[   65.689689] Code: 54ffff09 a94153f3 a8c27bfd d65f03c0 (b9400260)
[   65.689695] ---[ end trace 3632c7efc4f2d69c ]---


That's 5.6-rc3 .

Please have a look,

John




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: megaraid_sas problem for scsi_add_host() fail
  2020-02-27 18:41 megaraid_sas problem for scsi_add_host() fail John Garry
@ 2020-03-02 11:10 ` Sumit Saxena
  2020-03-02 12:13   ` John Garry
  0 siblings, 1 reply; 6+ messages in thread
From: Sumit Saxena @ 2020-03-02 11:10 UTC (permalink / raw)
  To: John Garry; +Cc: Kashyap Desai, Shivasharan S, PDL,MEGARAIDLINUX, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 10485 bytes --]

On Fri, Feb 28, 2020 at 12:11 AM John Garry <john.garry@huawei.com> wrote:
>
> Hi guys,
>
Hi John,

> While do some testing, I noticed that the megaraid SAS driver has a
> problem and looks to corrupt my system when the call to scsi_add_host()
> in the driver fails:
It is megaraid_sas driver bug. Driver does not freeup resources properly, when
scsi_add_host() fails. Please try attached patch.

Thanks,
Sumit
>
> [   62.516871] megasas: 07.713.01.00-rc1
> [   62.526189] megaraid_sas 0000:08:00.0: Adding to iommu group 1
> [   62.571790] megaraid_sas 0000:08:00.0: BAR:0x0  BAR's
> base_addr(phys):0x0000080010000000  mapped virt_addr:0x(____ptrval____)
> [   62.571802] megaraid_sas 0000:08:00.0: FW now in Ready state
> [   62.583811] megaraid_sas 0000:08:00.0: 63 bit DMA mask and 63 bit
> consistent mask
> [   62.602143] megaraid_sas 0000:08:00.0: firmware supports msix : (128)
> [   62.780250] megaraid_sas 0000:08:00.0: requested/available msix 128/128
> [   62.794292] megaraid_sas 0000:08:00.0: current msix/online cpus :
> (128/128)
> [   62.809011] megaraid_sas 0000:08:00.0: RDPQ mode : (enabled)
> [   62.820968] megaraid_sas 0000:08:00.0: Current firmware supports
> maximum commands: 4077 LDIO threshold: 0
> [   62.937043] megaraid_sas 0000:08:00.0: Configured max firmware
> commands: 4076
> [   63.509185] megaraid_sas 0000:08:00.0: Performance mode :Latency
> [   63.521906] megaraid_sas 0000:08:00.0: FW supports sync cache : Yes
> [   63.535148] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
> called outbound_intr_mask:0x40000009
> [   63.610607] megaraid_sas 0000:08:00.0: FW provided supportMaxExtLDs:
> 1 max_lds: 64
> [   63.626618] megaraid_sas 0000:08:00.0: controller type : MR(2048MB)
> [   63.639870] megaraid_sas 0000:08:00.0: Online Controller Reset(OCR) :
> Enabled
> [   63.654945] megaraid_sas 0000:08:00.0: Secure JBOD support : Yes
> [   63.667661] megaraid_sas 0000:08:00.0: NVMe passthru support : Yes
> [   63.667672] megaraid_sas 0000:08:00.0: FW provided TM TaskAbort/Reset
> timeout : 6 secs/60 secs
> [   63.698922] megaraid_sas 0000:08:00.0: JBOD sequence map support : Yes
> [   63.712715] megaraid_sas 0000:08:00.0: PCI Lane Margining support : No
> [   63.754764] megaraid_sas 0000:08:00.0: NVME page size : (4096)
> [   63.787258] megaraid_sas 0000:08:00.0: megasas_enable_intr_fusion is
> called outbound_intr_mask:0x40000000
> [   63.807485] megaraid_sas 0000:08:00.0: INIT adapter done
> [   63.822235] megaraid_sas 0000:08:00.0: pci id :
> (0x1000)/(0x0016)/(0x19e5)/(0xd215)
> [   63.838652] megaraid_sas 0000:08:00.0: unevenspan support : no
> [   63.850980] megaraid_sas 0000:08:00.0: firmware crash dump : no
> [   63.863499] megaraid_sas 0000:08:00.0: JBOD sequence map : enabled
> [   63.877352] scsi host0: Avago SAS based MegaRAID driver
> [   63.890398] megaraid_sas 0000:08:00.0: Failed to add host from
> megasas_io_attach 6802
> [   63.906999] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
> called outbound_intr_mask:0x40000009
> [   64.591755] nvme 0000:81:00.0: Adding to iommu group 2
> [   64.636476] nvme nvme0: pci function 0000:81:00.0
> [   64.669635] libphy: Fixed MDIO Bus: probed
> [   64.680255] tun: Universal TUN/TAP device driver, 1.6
> [   64.694422] thunder_xcv, ver 1.0
> [   64.702042] thunder_bgx, ver 1.0
> [   64.709277] nicpf, ver 1.0
> [   64.718144] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> [   64.730402] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> [   64.743337] igb: Intel(R) Gigabit Ethernet Network Driver - version
> 5.6.0-k
> [   64.754981] nvme nvme0: Removing after probe failure status: -12
> [   64.757953] igb: Copyright (c) 2007-2014 Intel Corporation.
> [   64.782805] igbvf: Intel(R) Gigabit Virtual Function Network Driver -
> version 2.4.0-k
> [   64.799423] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
> [   64.813848] sky2: driver version 1.30
> [   64.825564] VFIO - User Level meta-driver version: 0.3
> [   64.848089] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> [   64.862029] ehci-pci: EHCI PCI platform driver
> [   64.873445] ehci-pci 0000:7a:01.0: Adding to iommu group 3
> [   64.886700]
> ==================================================================
> [   64.901999] BUG: KASAN: slab-out-of-bounds in
> run_timer_softirq+0x6f4/0xae0
> [   64.916663] Write of size 8 at addr ffff0026b931aae0 by task swapper/0/0
>
> [   64.933914] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
> [   64.952240] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
> BIOS 2280-V2 CS V3.B160.01 02/24/2020
> [   64.972575] Call trace:
> [   64.977729]  dump_backtrace+0x0/0x298
> [   64.985439]  show_stack+0x14/0x20
> [   64.992418]  dump_stack+0x118/0x190
> [   64.999762]  print_address_description.isra.9+0x6c/0x3b8
> [   65.010953]  __kasan_report+0x134/0x23c
> [   65.019029]  kasan_report+0xc/0x18
> [   65.026188]  __asan_store8+0x94/0xb8
> [   65.033720]  run_timer_softirq+0x6f4/0xae0
> [   65.042343]  efi_header_end+0x16c/0x840
> [   65.050420]  irq_exit+0x19c/0x1a8
> [   65.057396]  __handle_domain_irq+0x7c/0xe0
> [   65.066022]  gic_handle_irq+0x64/0x168
> [   65.073917]  el1_irq+0xbc/0x180
> [   65.080528]  arch_cpu_idle+0x3c/0x320
> [   65.088239]  default_idle_call+0x28/0x4c
> [   65.096502]  do_idle+0x278/0x348
> [   65.103295]  cpu_startup_entry+0x24/0x40
> [   65.111554]  rest_init+0x1c4/0x298
> [   65.118718]  arch_call_rest_init+0xc/0x14
> [   65.127159]  start_kernel+0x848/0x888
>
> [   65.138006] Allocated by task 0:
> [   65.144802] (stack is not available)
>
> [   65.155465] Freed by task 0:
> [   65.161530] (stack is not available)
>
> [   65.172193] The buggy address belongs to the object at ffff0026b931aa00
>   which belongs to the cache pool_workqueue of size 256
> [   65.199113] The buggy address is located 224 bytes inside of
>   256-byte region [ffff0026b931aa00, ffff0026b931ab00)
> [   65.223840] The buggy address belongs to the page:
> [   65.233931] page:fffffe009ac4c600 refcount:1 mapcount:0
> mapping:ffff0026dd81c880 index:0xffff0026b931fe00 compound_mapcount: 0
> [   65.257923] flags: 0x6ffff00000010200(slab|head)
> [   65.267649] raw: 6ffff00000010200 fffffe009b20b208 fffffe009ac07608
> ffff0026dd81c880
> [   65.283959] raw: ffff0026b931fe00 0000000000400002 00000001ffffffff
> 0000000000000000
> [   65.300270] page dumped because: kasan: bad access detected
>
> [   65.315139] Memory state around the buggy address:
> [   65.325231]  ffff0026b931a980: fc fc fc fc fc fc fc fc fc fc fc fc fc
> fc fc fc
> [   65.340445]  ffff0026b931aa00: fc fc fc fc fc fc fc fc fc fc fc fc fc
> fc fc fc
> [   65.355660] >ffff0026b931aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc
> fc fc fc
> [   65.370870]                                                        ^
> [   65.384256]  ffff0026b931ab00: fc fc fc fc fc fc fc fc fc fc fc fc fc
> fc fc fc
> [   65.399467]  ffff0026b931ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc
> fc fc fc
> [   65.414675]
> ==================================================================
> [   65.429885] Disabling lock debugging due to kernel taint
> [   65.441431] Unable to handle kernel paging request at virtual address
> ffffa0001013c0b0
> [   65.441695] ehci-pci 0000:7a:01.0: EHCI Host Controller
> [   65.458088] Mem abort info:
> [   65.469183] ehci-pci 0000:7a:01.0: new USB bus registered, assigned
> bus number 1
> [   65.474927]   ESR = 0x96000007
> [   65.491201] ehci-pci 0000:7a:01.0: irq 65, io mem 0x20c101000
> [   65.496913]   EC = 0x25: DABT (current EL), IL = 32 bits
> [   65.496918]   SET = 0, FnV = 0
> [   65.496922]   EA = 0, S1PTW = 0
> [   65.522586] ehci-pci 0000:7a:01.0: USB 0.0 started, EHCI 1.00
> [   65.526575] Data abort info:
> [   65.526580]   ISV = 0, ISS = 0x00000007
> [   65.535948] hub 1-0:1.0: USB hub found
> [   65.545245]   CM = 0, WnR = 0
> [   65.545251] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000052530000
> [   65.545256] [ffffa0001013c0b0] pgd=00002027fffff003,
> pud=00002027ffffe003, pmd=00000026dda5b003, pte=0000000000000000
> [   65.551519] hub 1-0:1.0: 2 ports detected
> [   65.559375] Internal error: Oops: 96000007 [#1] PREEMPT SMP
> [   65.559379] Modules linked in:
> [   65.569534] ehci-platform: EHCI generic platform driver
> [   65.573475] CPU: 34 PID: 8 Comm: kworker/u256:0 Tainted: G    B
>        5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
> [   65.573477] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
> BIOS 2280-V2 CS V3.B160.01 02/24/2020
> [   65.573487] Workqueue: poll_megasas0_status megasas_fault_detect_work
> [   65.573492] pstate: 80c00009 (Nzcv daif +PAN +UAO)
> [   65.588048] ehci-orion: EHCI orion driver
> [   65.609756] pc : megasas_readl+0x60/0x80
> [   65.609759] lr : megasas_readl+0x1c/0x80
> [   65.609761] sp : ffff0026d97bfc00
> [   65.609763] x29: ffff0026d97bfc00 x28: ffff0026d97a9890
> [   65.609767] x27: ffff0026d97a0618 x26: ffff0026d97a9880
> [   65.609771] x25: ffff0026d9758808 x24: ffff0026b931aa28
> [   65.609775] x23: ffff0026b931aa98 x22: ffffa0002931e000
> [   65.609779] x21: ffff0026dd898800 x20: ffff0026b931dcd8
> [   65.618543] ehci-exynos: EHCI Exynos driver
> [   65.629840] x19: ffffa0001013c0b0 x18: 0000000000000000
> [   65.629843] x17: 0000000000001d50 x16: ffffffffffffe240
> [   65.629847] x15: 00000000000013a8 x14: 0000000000000000
> [   65.629850] x13: 00000000000013a0 x12: 1fffe004db2f7f7c
> [   65.629854] x11: ffff8004db2f7f78 x10: dfffa00000000000
> [   65.629857] x9 : ffffa00028f679e8 x8 : ffffa0002a483a48
> [   65.629861] x7 : ffffa00026d5ed94 x6 : 0000000000000000
> [   65.629864] x5 : ffffa0002a483a48 x4 : 0000000000000000
> [   65.629868] x3 : ffffa000279df03c x2 : 0000000000000000
> [   65.636662] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> [   65.647207] x1 : ef244e124d671400 x0 : 0000000000000004
> [   65.647210] Call trace:
> [   65.647214]  megasas_readl+0x60/0x80
> [   65.647218]  megasas_read_fw_status_reg_fusion+0x2c/0x38
> [   65.647221]  megasas_fault_detect_work+0x44/0x520
> [   65.647226]  process_one_work+0x488/0xc08
> [   65.647228]  worker_thread+0x68/0x5d0
> [   65.647233]  kthread+0x1c8/0x1d0
> [   65.669535] ohci-pci: OHCI PCI platform driver
> [   65.689683]  ret_from_fork+0x10/0x18
> [   65.689689] Code: 54ffff09 a94153f3 a8c27bfd d65f03c0 (b9400260)
> [   65.689695] ---[ end trace 3632c7efc4f2d69c ]---
>
>
> That's 5.6-rc3 .
>
> Please have a look,
>
> John
>
>
>

[-- Attachment #2: 0001-megaraid_sas-Cleanup-resources-properly-when-scsi_ad.patch --]
[-- Type: application/octet-stream, Size: 1836 bytes --]

From e787178b31b8f98e6e6af4960e4a0623448bfd77 Mon Sep 17 00:00:00 2001
From: Sumit Saxena <sumit.saxena@broadcom.com>
Date: Tue, 3 Mar 2020 00:31:48 +0530
Subject: [PATCH] megaraid_sas: Cleanup resources properly when
 scsi_add_host/megasas_start_aen  fails during device probe

megaraid_sas driver does not cleanup resources properly, when
functions- scsi_add_host()/megasas_start_aen() fails during device probe.

Fix: When scsi_add_host/megasas_start_aen_fails, do cleanup of allocated resources.
e.g. Stop firmware fault detection thread, Stop SRIOV heartbeat timer, freeup
memory being used by firmware crash dump buffers, set instance->unload flag to 1,
call scsi_remove_host, if adapter(SCSI host) is added to SCSI layer .

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 5bebdd397580..784dfd9b9e6a 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -7423,7 +7423,20 @@ static int megasas_probe_one(struct pci_dev *pdev,
 	return 0;
 
 fail_start_aen:
+	instance->unload = 1;
+	scsi_remove_host(instance->host);
 fail_io_attach:
+	/* Shutdown SR-IOV heartbeat timer */
+	if (instance->requestorId && !instance->skip_heartbeat_timer_del)
+		del_timer_sync(&instance->sriov_heartbeat_timer);
+
+	/* Stop the FW fault detection watchdog */
+	if (instance->adapter_type != MFI_SERIES)
+		megasas_fusion_stop_watchdog(instance);
+
+	if (instance->fw_crash_state != UNAVAILABLE)
+		megasas_free_host_crash_buffer(instance);
+
 	megasas_mgmt_info.count--;
 	megasas_mgmt_info.max_index--;
 	megasas_mgmt_info.instance[megasas_mgmt_info.max_index] = NULL;
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: megaraid_sas problem for scsi_add_host() fail
  2020-03-02 11:10 ` Sumit Saxena
@ 2020-03-02 12:13   ` John Garry
  2020-03-02 17:58     ` Sumit Saxena
  0 siblings, 1 reply; 6+ messages in thread
From: John Garry @ 2020-03-02 12:13 UTC (permalink / raw)
  To: Sumit Saxena; +Cc: Kashyap Desai, Shivasharan S, PDL,MEGARAIDLINUX, linux-scsi



Hi Sumit,

> It is megaraid_sas driver bug. Driver does not freeup resources properly, when
> scsi_add_host() fails. Please try attached patch.

Yeah, that looks to work. The driver gracefully failed to bind.

However we might have lots of memory leaks:


root@(none)$ echo scan > /sys/kernel/debug/kmemleak
root@(none)$ [  140.585484] kmemleak: 259 new suspected memory leaks 
(see /sys/kernel/debug/kmemleak)
[  140.585484] kmemleak: 259 new suspected memory leaks (see 
/sys/kernel/debug/kmemleak)

root@(none)$
root@(none)$ more /sys/kernel/debug/kmemleak
unreferenced object 0xffff0026b9184c00 (size 512):
   comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
   hex dump (first 32 bytes):
     60 00 00 00 61 00 00 00 62 00 00 00 63 00 00 00  `...a...b...c...
     64 00 00 00 65 00 00 00 66 00 00 00 67 00 00 00  d...e...f...g...
   backtrace:
     [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
     [<(____ptrval____)>] __kmalloc+0x174/0x280
     [<(____ptrval____)>] megasas_probe_one+0x798/0x2878
     [<(____ptrval____)>] local_pci_probe+0x74/0xf0
     [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
     [<(____ptrval____)>] process_one_work+0x488/0xc08
     [<(____ptrval____)>] worker_thread+0x330/0x5d0
     [<(____ptrval____)>] kthread+0x1c8/0x1d0
     [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff0026b922c000 (size 4096):
   comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
   hex dump (first 32 bytes):
     00 00 21 b7 26 00 ff ff 00 00 9f ff 00 00 00 00  ..!.&...........
     00 10 22 10 00 a0 ff ff 00 00 00 00 00 00 00 00  ..".............
   backtrace:
     [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
     [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
     [<(____ptrval____)>] megasas_alloc_fusion_context+0x30/0x1b0
     [<(____ptrval____)>] megasas_probe_one+0x7d8/0x2878
     [<(____ptrval____)>] local_pci_probe+0x74/0xf0
     [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
     [<(____ptrval____)>] process_one_work+0x488/0xc08
     [<(____ptrval____)>] worker_thread+0x330/0x5d0
     [<(____ptrval____)>] kthread+0x1c8/0x1d0
     [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff0026b7013000 (size 2048):
   comm "kworker/0:0", pid 5, jiffies 4294903512 (age 94.540s)
   hex dump (first 32 bytes):
     00 58 18 b9 26 00 ff ff 00 5c 18 b9 26 00 ff ff  .X..&....\..&...
     00 60 18 b9 26 00 ff ff 00 64 18 b9 26 00 ff ff  .`..&....d..&...
   backtrace:
     [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
     [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
root@(none)$


Thanks,
John

> 
> Thanks,
> Sumit
>>
>> [   62.516871] megasas: 07.713.01.00-rc1
>> [   62.526189] megaraid_sas 0000:08:00.0: Adding to iommu group 1
>> [   62.571790] megaraid_sas 0000:08:00.0: BAR:0x0  BAR's
>> base_addr(phys):0x0000080010000000  mapped virt_addr:0x(____ptrval____)
>> [   62.571802] megaraid_sas 0000:08:00.0: FW now in Ready state
>> [   62.583811] megaraid_sas 0000:08:00.0: 63 bit DMA mask and 63 bit
>> consistent mask
>> [   62.602143] megaraid_sas 0000:08:00.0: firmware supports msix : (128)
>> [   62.780250] megaraid_sas 0000:08:00.0: requested/available msix 128/128
>> [   62.794292] megaraid_sas 0000:08:00.0: current msix/online cpus :
>> (128/128)
>> [   62.809011] megaraid_sas 0000:08:00.0: RDPQ mode : (enabled)
>> [   62.820968] megaraid_sas 0000:08:00.0: Current firmware supports
>> maximum commands: 4077 LDIO threshold: 0
>> [   62.937043] megaraid_sas 0000:08:00.0: Configured max firmware
>> commands: 4076
>> [   63.509185] megaraid_sas 0000:08:00.0: Performance mode :Latency
>> [   63.521906] megaraid_sas 0000:08:00.0: FW supports sync cache : Yes
>> [   63.535148] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
>> called outbound_intr_mask:0x40000009
>> [   63.610607] megaraid_sas 0000:08:00.0: FW provided supportMaxExtLDs:
>> 1 max_lds: 64
>> [   63.626618] megaraid_sas 0000:08:00.0: controller type : MR(2048MB)
>> [   63.639870] megaraid_sas 0000:08:00.0: Online Controller Reset(OCR) :
>> Enabled
>> [   63.654945] megaraid_sas 0000:08:00.0: Secure JBOD support : Yes
>> [   63.667661] megaraid_sas 0000:08:00.0: NVMe passthru support : Yes
>> [   63.667672] megaraid_sas 0000:08:00.0: FW provided TM TaskAbort/Reset
>> timeout : 6 secs/60 secs
>> [   63.698922] megaraid_sas 0000:08:00.0: JBOD sequence map support : Yes
>> [   63.712715] megaraid_sas 0000:08:00.0: PCI Lane Margining support : No
>> [   63.754764] megaraid_sas 0000:08:00.0: NVME page size : (4096)
>> [   63.787258] megaraid_sas 0000:08:00.0: megasas_enable_intr_fusion is
>> called outbound_intr_mask:0x40000000
>> [   63.807485] megaraid_sas 0000:08:00.0: INIT adapter done
>> [   63.822235] megaraid_sas 0000:08:00.0: pci id :
>> (0x1000)/(0x0016)/(0x19e5)/(0xd215)
>> [   63.838652] megaraid_sas 0000:08:00.0: unevenspan support : no
>> [   63.850980] megaraid_sas 0000:08:00.0: firmware crash dump : no
>> [   63.863499] megaraid_sas 0000:08:00.0: JBOD sequence map : enabled
>> [   63.877352] scsi host0: Avago SAS based MegaRAID driver
>> [   63.890398] megaraid_sas 0000:08:00.0: Failed to add host from
>> megasas_io_attach 6802
>> [   63.906999] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
>> called outbound_intr_mask:0x40000009
>> [   64.591755] nvme 0000:81:00.0: Adding to iommu group 2
>> [   64.636476] nvme nvme0: pci function 0000:81:00.0
>> [   64.669635] libphy: Fixed MDIO Bus: probed
>> [   64.680255] tun: Universal TUN/TAP device driver, 1.6
>> [   64.694422] thunder_xcv, ver 1.0
>> [   64.702042] thunder_bgx, ver 1.0
>> [   64.709277] nicpf, ver 1.0
>> [   64.718144] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
>> [   64.730402] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
>> [   64.743337] igb: Intel(R) Gigabit Ethernet Network Driver - version
>> 5.6.0-k
>> [   64.754981] nvme nvme0: Removing after probe failure status: -12
>> [   64.757953] igb: Copyright (c) 2007-2014 Intel Corporation.
>> [   64.782805] igbvf: Intel(R) Gigabit Virtual Function Network Driver -
>> version 2.4.0-k
>> [   64.799423] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
>> [   64.813848] sky2: driver version 1.30
>> [   64.825564] VFIO - User Level meta-driver version: 0.3
>> [   64.848089] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
>> [   64.862029] ehci-pci: EHCI PCI platform driver
>> [   64.873445] ehci-pci 0000:7a:01.0: Adding to iommu group 3
>> [   64.886700]
>> ==================================================================
>> [   64.901999] BUG: KASAN: slab-out-of-bounds in
>> run_timer_softirq+0x6f4/0xae0
>> [   64.916663] Write of size 8 at addr ffff0026b931aae0 by task swapper/0/0
>>
>> [   64.933914] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
>> 5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
>> [   64.952240] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
>> BIOS 2280-V2 CS V3.B160.01 02/24/2020
>> [   64.972575] Call trace:
>> [   64.977729]  dump_backtrace+0x0/0x298
>> [   64.985439]  show_stack+0x14/0x20
>> [   64.992418]  dump_stack+0x118/0x190
>> [   64.999762]  print_address_description.isra.9+0x6c/0x3b8
>> [   65.010953]  __kasan_report+0x134/0x23c
>> [   65.019029]  kasan_report+0xc/0x18
>> [   65.026188]  __asan_store8+0x94/0xb8
>> [   65.033720]  run_timer_softirq+0x6f4/0xae0
>> [   65.042343]  efi_header_end+0x16c/0x840
>> [   65.050420]  irq_exit+0x19c/0x1a8
>> [   65.057396]  __handle_domain_irq+0x7c/0xe0
>> [   65.066022]  gic_handle_irq+0x64/0x168
>> [   65.073917]  el1_irq+0xbc/0x180
>> [   65.080528]  arch_cpu_idle+0x3c/0x320
>> [   65.088239]  default_idle_call+0x28/0x4c
>> [   65.096502]  do_idle+0x278/0x348
>> [   65.103295]  cpu_startup_entry+0x24/0x40
>> [   65.111554]  rest_init+0x1c4/0x298
>> [   65.118718]  arch_call_rest_init+0xc/0x14
>> [   65.127159]  start_kernel+0x848/0x888
>>
>> [   65.138006] Allocated by task 0:
>> [   65.144802] (stack is not available)
>>
>> [   65.155465] Freed by task 0:
>> [   65.161530] (stack is not available)
>>
>> [   65.172193] The buggy address belongs to the object at ffff0026b931aa00
>>    which belongs to the cache pool_workqueue of size 256
>> [   65.199113] The buggy address is located 224 bytes inside of
>>    256-byte region [ffff0026b931aa00, ffff0026b931ab00)
>> [   65.223840] The buggy address belongs to the page:
>> [   65.233931] page:fffffe009ac4c600 refcount:1 mapcount:0
>> mapping:ffff0026dd81c880 index:0xffff0026b931fe00 compound_mapcount: 0
>> [   65.257923] flags: 0x6ffff00000010200(slab|head)
>> [   65.267649] raw: 6ffff00000010200 fffffe009b20b208 fffffe009ac07608
>> ffff0026dd81c880
>> [   65.283959] raw: ffff0026b931fe00 0000000000400002 00000001ffffffff
>> 0000000000000000
>> [   65.300270] page dumped because: kasan: bad access detected
>>
>> [   65.315139] Memory state around the buggy address:
>> [   65.325231]  ffff0026b931a980: fc fc fc fc fc fc fc fc fc fc fc fc fc
>> fc fc fc
>> [   65.340445]  ffff0026b931aa00: fc fc fc fc fc fc fc fc fc fc fc fc fc
>> fc fc fc
>> [   65.355660] >ffff0026b931aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc
>> fc fc fc
>> [   65.370870]                                                        ^
>> [   65.384256]  ffff0026b931ab00: fc fc fc fc fc fc fc fc fc fc fc fc fc
>> fc fc fc
>> [   65.399467]  ffff0026b931ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc
>> fc fc fc
>> [   65.414675]
>> ==================================================================
>> [   65.429885] Disabling lock debugging due to kernel taint
>> [   65.441431] Unable to handle kernel paging request at virtual address
>> ffffa0001013c0b0
>> [   65.441695] ehci-pci 0000:7a:01.0: EHCI Host Controller
>> [   65.458088] Mem abort info:
>> [   65.469183] ehci-pci 0000:7a:01.0: new USB bus registered, assigned
>> bus number 1
>> [   65.474927]   ESR = 0x96000007
>> [   65.491201] ehci-pci 0000:7a:01.0: irq 65, io mem 0x20c101000
>> [   65.496913]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [   65.496918]   SET = 0, FnV = 0
>> [   65.496922]   EA = 0, S1PTW = 0
>> [   65.522586] ehci-pci 0000:7a:01.0: USB 0.0 started, EHCI 1.00
>> [   65.526575] Data abort info:
>> [   65.526580]   ISV = 0, ISS = 0x00000007
>> [   65.535948] hub 1-0:1.0: USB hub found
>> [   65.545245]   CM = 0, WnR = 0
>> [   65.545251] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000052530000
>> [   65.545256] [ffffa0001013c0b0] pgd=00002027fffff003,
>> pud=00002027ffffe003, pmd=00000026dda5b003, pte=0000000000000000
>> [   65.551519] hub 1-0:1.0: 2 ports detected
>> [   65.559375] Internal error: Oops: 96000007 [#1] PREEMPT SMP
>> [   65.559379] Modules linked in:
>> [   65.569534] ehci-platform: EHCI generic platform driver
>> [   65.573475] CPU: 34 PID: 8 Comm: kworker/u256:0 Tainted: G    B
>>         5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
>> [   65.573477] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
>> BIOS 2280-V2 CS V3.B160.01 02/24/2020
>> [   65.573487] Workqueue: poll_megasas0_status megasas_fault_detect_work
>> [   65.573492] pstate: 80c00009 (Nzcv daif +PAN +UAO)
>> [   65.588048] ehci-orion: EHCI orion driver
>> [   65.609756] pc : megasas_readl+0x60/0x80
>> [   65.609759] lr : megasas_readl+0x1c/0x80
>> [   65.609761] sp : ffff0026d97bfc00
>> [   65.609763] x29: ffff0026d97bfc00 x28: ffff0026d97a9890
>> [   65.609767] x27: ffff0026d97a0618 x26: ffff0026d97a9880
>> [   65.609771] x25: ffff0026d9758808 x24: ffff0026b931aa28
>> [   65.609775] x23: ffff0026b931aa98 x22: ffffa0002931e000
>> [   65.609779] x21: ffff0026dd898800 x20: ffff0026b931dcd8
>> [   65.618543] ehci-exynos: EHCI Exynos driver
>> [   65.629840] x19: ffffa0001013c0b0 x18: 0000000000000000
>> [   65.629843] x17: 0000000000001d50 x16: ffffffffffffe240
>> [   65.629847] x15: 00000000000013a8 x14: 0000000000000000
>> [   65.629850] x13: 00000000000013a0 x12: 1fffe004db2f7f7c
>> [   65.629854] x11: ffff8004db2f7f78 x10: dfffa00000000000
>> [   65.629857] x9 : ffffa00028f679e8 x8 : ffffa0002a483a48
>> [   65.629861] x7 : ffffa00026d5ed94 x6 : 0000000000000000
>> [   65.629864] x5 : ffffa0002a483a48 x4 : 0000000000000000
>> [   65.629868] x3 : ffffa000279df03c x2 : 0000000000000000
>> [   65.636662] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
>> [   65.647207] x1 : ef244e124d671400 x0 : 0000000000000004
>> [   65.647210] Call trace:
>> [   65.647214]  megasas_readl+0x60/0x80
>> [   65.647218]  megasas_read_fw_status_reg_fusion+0x2c/0x38
>> [   65.647221]  megasas_fault_detect_work+0x44/0x520
>> [   65.647226]  process_one_work+0x488/0xc08
>> [   65.647228]  worker_thread+0x68/0x5d0
>> [   65.647233]  kthread+0x1c8/0x1d0
>> [   65.669535] ohci-pci: OHCI PCI platform driver
>> [   65.689683]  ret_from_fork+0x10/0x18
>> [   65.689689] Code: 54ffff09 a94153f3 a8c27bfd d65f03c0 (b9400260)
>> [   65.689695] ---[ end trace 3632c7efc4f2d69c ]---
>>
>>
>> That's 5.6-rc3 .
>>
>> Please have a look,
>>
>> John
>>
>>
>>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: megaraid_sas problem for scsi_add_host() fail
  2020-03-02 12:13   ` John Garry
@ 2020-03-02 17:58     ` Sumit Saxena
  2020-03-02 18:34       ` John Garry
  0 siblings, 1 reply; 6+ messages in thread
From: Sumit Saxena @ 2020-03-02 17:58 UTC (permalink / raw)
  To: John Garry; +Cc: Kashyap Desai, Shivasharan S, PDL,MEGARAIDLINUX, linux-scsi

On Mon, Mar 2, 2020 at 5:43 PM John Garry <john.garry@huawei.com> wrote:
>
>
>
> Hi Sumit,
>
> > It is megaraid_sas driver bug. Driver does not freeup resources properly, when
> > scsi_add_host() fails. Please try attached patch.
>
> Yeah, that looks to work. The driver gracefully failed to bind.
Ok, I will send this patch to upstream.
>
> However we might have lots of memory leaks:
Thanks for pointing it out John. I will look into these memory leaks and send
patches with fix in the next few days.

Thanks,
Sumit

>
>
> root@(none)$ echo scan > /sys/kernel/debug/kmemleak
> root@(none)$ [  140.585484] kmemleak: 259 new suspected memory leaks
> (see /sys/kernel/debug/kmemleak)
> [  140.585484] kmemleak: 259 new suspected memory leaks (see
> /sys/kernel/debug/kmemleak)
>
> root@(none)$
> root@(none)$ more /sys/kernel/debug/kmemleak
> unreferenced object 0xffff0026b9184c00 (size 512):
>    comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
>    hex dump (first 32 bytes):
>      60 00 00 00 61 00 00 00 62 00 00 00 63 00 00 00  `...a...b...c...
>      64 00 00 00 65 00 00 00 66 00 00 00 67 00 00 00  d...e...f...g...
>    backtrace:
>      [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
>      [<(____ptrval____)>] __kmalloc+0x174/0x280
>      [<(____ptrval____)>] megasas_probe_one+0x798/0x2878
>      [<(____ptrval____)>] local_pci_probe+0x74/0xf0
>      [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
>      [<(____ptrval____)>] process_one_work+0x488/0xc08
>      [<(____ptrval____)>] worker_thread+0x330/0x5d0
>      [<(____ptrval____)>] kthread+0x1c8/0x1d0
>      [<(____ptrval____)>] ret_from_fork+0x10/0x18
> unreferenced object 0xffff0026b922c000 (size 4096):
>    comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
>    hex dump (first 32 bytes):
>      00 00 21 b7 26 00 ff ff 00 00 9f ff 00 00 00 00  ..!.&...........
>      00 10 22 10 00 a0 ff ff 00 00 00 00 00 00 00 00  ..".............
>    backtrace:
>      [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
>      [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
>      [<(____ptrval____)>] megasas_alloc_fusion_context+0x30/0x1b0
>      [<(____ptrval____)>] megasas_probe_one+0x7d8/0x2878
>      [<(____ptrval____)>] local_pci_probe+0x74/0xf0
>      [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
>      [<(____ptrval____)>] process_one_work+0x488/0xc08
>      [<(____ptrval____)>] worker_thread+0x330/0x5d0
>      [<(____ptrval____)>] kthread+0x1c8/0x1d0
>      [<(____ptrval____)>] ret_from_fork+0x10/0x18
> unreferenced object 0xffff0026b7013000 (size 2048):
>    comm "kworker/0:0", pid 5, jiffies 4294903512 (age 94.540s)
>    hex dump (first 32 bytes):
>      00 58 18 b9 26 00 ff ff 00 5c 18 b9 26 00 ff ff  .X..&....\..&...
>      00 60 18 b9 26 00 ff ff 00 64 18 b9 26 00 ff ff  .`..&....d..&...
>    backtrace:
>      [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
>      [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
> root@(none)$
>
>
> Thanks,
> John
>
> >
> > Thanks,
> > Sumit
> >>
> >> [   62.516871] megasas: 07.713.01.00-rc1
> >> [   62.526189] megaraid_sas 0000:08:00.0: Adding to iommu group 1
> >> [   62.571790] megaraid_sas 0000:08:00.0: BAR:0x0  BAR's
> >> base_addr(phys):0x0000080010000000  mapped virt_addr:0x(____ptrval____)
> >> [   62.571802] megaraid_sas 0000:08:00.0: FW now in Ready state
> >> [   62.583811] megaraid_sas 0000:08:00.0: 63 bit DMA mask and 63 bit
> >> consistent mask
> >> [   62.602143] megaraid_sas 0000:08:00.0: firmware supports msix : (128)
> >> [   62.780250] megaraid_sas 0000:08:00.0: requested/available msix 128/128
> >> [   62.794292] megaraid_sas 0000:08:00.0: current msix/online cpus :
> >> (128/128)
> >> [   62.809011] megaraid_sas 0000:08:00.0: RDPQ mode : (enabled)
> >> [   62.820968] megaraid_sas 0000:08:00.0: Current firmware supports
> >> maximum commands: 4077 LDIO threshold: 0
> >> [   62.937043] megaraid_sas 0000:08:00.0: Configured max firmware
> >> commands: 4076
> >> [   63.509185] megaraid_sas 0000:08:00.0: Performance mode :Latency
> >> [   63.521906] megaraid_sas 0000:08:00.0: FW supports sync cache : Yes
> >> [   63.535148] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
> >> called outbound_intr_mask:0x40000009
> >> [   63.610607] megaraid_sas 0000:08:00.0: FW provided supportMaxExtLDs:
> >> 1 max_lds: 64
> >> [   63.626618] megaraid_sas 0000:08:00.0: controller type : MR(2048MB)
> >> [   63.639870] megaraid_sas 0000:08:00.0: Online Controller Reset(OCR) :
> >> Enabled
> >> [   63.654945] megaraid_sas 0000:08:00.0: Secure JBOD support : Yes
> >> [   63.667661] megaraid_sas 0000:08:00.0: NVMe passthru support : Yes
> >> [   63.667672] megaraid_sas 0000:08:00.0: FW provided TM TaskAbort/Reset
> >> timeout : 6 secs/60 secs
> >> [   63.698922] megaraid_sas 0000:08:00.0: JBOD sequence map support : Yes
> >> [   63.712715] megaraid_sas 0000:08:00.0: PCI Lane Margining support : No
> >> [   63.754764] megaraid_sas 0000:08:00.0: NVME page size : (4096)
> >> [   63.787258] megaraid_sas 0000:08:00.0: megasas_enable_intr_fusion is
> >> called outbound_intr_mask:0x40000000
> >> [   63.807485] megaraid_sas 0000:08:00.0: INIT adapter done
> >> [   63.822235] megaraid_sas 0000:08:00.0: pci id :
> >> (0x1000)/(0x0016)/(0x19e5)/(0xd215)
> >> [   63.838652] megaraid_sas 0000:08:00.0: unevenspan support : no
> >> [   63.850980] megaraid_sas 0000:08:00.0: firmware crash dump : no
> >> [   63.863499] megaraid_sas 0000:08:00.0: JBOD sequence map : enabled
> >> [   63.877352] scsi host0: Avago SAS based MegaRAID driver
> >> [   63.890398] megaraid_sas 0000:08:00.0: Failed to add host from
> >> megasas_io_attach 6802
> >> [   63.906999] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
> >> called outbound_intr_mask:0x40000009
> >> [   64.591755] nvme 0000:81:00.0: Adding to iommu group 2
> >> [   64.636476] nvme nvme0: pci function 0000:81:00.0
> >> [   64.669635] libphy: Fixed MDIO Bus: probed
> >> [   64.680255] tun: Universal TUN/TAP device driver, 1.6
> >> [   64.694422] thunder_xcv, ver 1.0
> >> [   64.702042] thunder_bgx, ver 1.0
> >> [   64.709277] nicpf, ver 1.0
> >> [   64.718144] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> >> [   64.730402] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> >> [   64.743337] igb: Intel(R) Gigabit Ethernet Network Driver - version
> >> 5.6.0-k
> >> [   64.754981] nvme nvme0: Removing after probe failure status: -12
> >> [   64.757953] igb: Copyright (c) 2007-2014 Intel Corporation.
> >> [   64.782805] igbvf: Intel(R) Gigabit Virtual Function Network Driver -
> >> version 2.4.0-k
> >> [   64.799423] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
> >> [   64.813848] sky2: driver version 1.30
> >> [   64.825564] VFIO - User Level meta-driver version: 0.3
> >> [   64.848089] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> >> [   64.862029] ehci-pci: EHCI PCI platform driver
> >> [   64.873445] ehci-pci 0000:7a:01.0: Adding to iommu group 3
> >> [   64.886700]
> >> ==================================================================
> >> [   64.901999] BUG: KASAN: slab-out-of-bounds in
> >> run_timer_softirq+0x6f4/0xae0
> >> [   64.916663] Write of size 8 at addr ffff0026b931aae0 by task swapper/0/0
> >>
> >> [   64.933914] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> >> 5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
> >> [   64.952240] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
> >> BIOS 2280-V2 CS V3.B160.01 02/24/2020
> >> [   64.972575] Call trace:
> >> [   64.977729]  dump_backtrace+0x0/0x298
> >> [   64.985439]  show_stack+0x14/0x20
> >> [   64.992418]  dump_stack+0x118/0x190
> >> [   64.999762]  print_address_description.isra.9+0x6c/0x3b8
> >> [   65.010953]  __kasan_report+0x134/0x23c
> >> [   65.019029]  kasan_report+0xc/0x18
> >> [   65.026188]  __asan_store8+0x94/0xb8
> >> [   65.033720]  run_timer_softirq+0x6f4/0xae0
> >> [   65.042343]  efi_header_end+0x16c/0x840
> >> [   65.050420]  irq_exit+0x19c/0x1a8
> >> [   65.057396]  __handle_domain_irq+0x7c/0xe0
> >> [   65.066022]  gic_handle_irq+0x64/0x168
> >> [   65.073917]  el1_irq+0xbc/0x180
> >> [   65.080528]  arch_cpu_idle+0x3c/0x320
> >> [   65.088239]  default_idle_call+0x28/0x4c
> >> [   65.096502]  do_idle+0x278/0x348
> >> [   65.103295]  cpu_startup_entry+0x24/0x40
> >> [   65.111554]  rest_init+0x1c4/0x298
> >> [   65.118718]  arch_call_rest_init+0xc/0x14
> >> [   65.127159]  start_kernel+0x848/0x888
> >>
> >> [   65.138006] Allocated by task 0:
> >> [   65.144802] (stack is not available)
> >>
> >> [   65.155465] Freed by task 0:
> >> [   65.161530] (stack is not available)
> >>
> >> [   65.172193] The buggy address belongs to the object at ffff0026b931aa00
> >>    which belongs to the cache pool_workqueue of size 256
> >> [   65.199113] The buggy address is located 224 bytes inside of
> >>    256-byte region [ffff0026b931aa00, ffff0026b931ab00)
> >> [   65.223840] The buggy address belongs to the page:
> >> [   65.233931] page:fffffe009ac4c600 refcount:1 mapcount:0
> >> mapping:ffff0026dd81c880 index:0xffff0026b931fe00 compound_mapcount: 0
> >> [   65.257923] flags: 0x6ffff00000010200(slab|head)
> >> [   65.267649] raw: 6ffff00000010200 fffffe009b20b208 fffffe009ac07608
> >> ffff0026dd81c880
> >> [   65.283959] raw: ffff0026b931fe00 0000000000400002 00000001ffffffff
> >> 0000000000000000
> >> [   65.300270] page dumped because: kasan: bad access detected
> >>
> >> [   65.315139] Memory state around the buggy address:
> >> [   65.325231]  ffff0026b931a980: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >> fc fc fc
> >> [   65.340445]  ffff0026b931aa00: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >> fc fc fc
> >> [   65.355660] >ffff0026b931aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >> fc fc fc
> >> [   65.370870]                                                        ^
> >> [   65.384256]  ffff0026b931ab00: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >> fc fc fc
> >> [   65.399467]  ffff0026b931ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >> fc fc fc
> >> [   65.414675]
> >> ==================================================================
> >> [   65.429885] Disabling lock debugging due to kernel taint
> >> [   65.441431] Unable to handle kernel paging request at virtual address
> >> ffffa0001013c0b0
> >> [   65.441695] ehci-pci 0000:7a:01.0: EHCI Host Controller
> >> [   65.458088] Mem abort info:
> >> [   65.469183] ehci-pci 0000:7a:01.0: new USB bus registered, assigned
> >> bus number 1
> >> [   65.474927]   ESR = 0x96000007
> >> [   65.491201] ehci-pci 0000:7a:01.0: irq 65, io mem 0x20c101000
> >> [   65.496913]   EC = 0x25: DABT (current EL), IL = 32 bits
> >> [   65.496918]   SET = 0, FnV = 0
> >> [   65.496922]   EA = 0, S1PTW = 0
> >> [   65.522586] ehci-pci 0000:7a:01.0: USB 0.0 started, EHCI 1.00
> >> [   65.526575] Data abort info:
> >> [   65.526580]   ISV = 0, ISS = 0x00000007
> >> [   65.535948] hub 1-0:1.0: USB hub found
> >> [   65.545245]   CM = 0, WnR = 0
> >> [   65.545251] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000052530000
> >> [   65.545256] [ffffa0001013c0b0] pgd=00002027fffff003,
> >> pud=00002027ffffe003, pmd=00000026dda5b003, pte=0000000000000000
> >> [   65.551519] hub 1-0:1.0: 2 ports detected
> >> [   65.559375] Internal error: Oops: 96000007 [#1] PREEMPT SMP
> >> [   65.559379] Modules linked in:
> >> [   65.569534] ehci-platform: EHCI generic platform driver
> >> [   65.573475] CPU: 34 PID: 8 Comm: kworker/u256:0 Tainted: G    B
> >>         5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
> >> [   65.573477] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
> >> BIOS 2280-V2 CS V3.B160.01 02/24/2020
> >> [   65.573487] Workqueue: poll_megasas0_status megasas_fault_detect_work
> >> [   65.573492] pstate: 80c00009 (Nzcv daif +PAN +UAO)
> >> [   65.588048] ehci-orion: EHCI orion driver
> >> [   65.609756] pc : megasas_readl+0x60/0x80
> >> [   65.609759] lr : megasas_readl+0x1c/0x80
> >> [   65.609761] sp : ffff0026d97bfc00
> >> [   65.609763] x29: ffff0026d97bfc00 x28: ffff0026d97a9890
> >> [   65.609767] x27: ffff0026d97a0618 x26: ffff0026d97a9880
> >> [   65.609771] x25: ffff0026d9758808 x24: ffff0026b931aa28
> >> [   65.609775] x23: ffff0026b931aa98 x22: ffffa0002931e000
> >> [   65.609779] x21: ffff0026dd898800 x20: ffff0026b931dcd8
> >> [   65.618543] ehci-exynos: EHCI Exynos driver
> >> [   65.629840] x19: ffffa0001013c0b0 x18: 0000000000000000
> >> [   65.629843] x17: 0000000000001d50 x16: ffffffffffffe240
> >> [   65.629847] x15: 00000000000013a8 x14: 0000000000000000
> >> [   65.629850] x13: 00000000000013a0 x12: 1fffe004db2f7f7c
> >> [   65.629854] x11: ffff8004db2f7f78 x10: dfffa00000000000
> >> [   65.629857] x9 : ffffa00028f679e8 x8 : ffffa0002a483a48
> >> [   65.629861] x7 : ffffa00026d5ed94 x6 : 0000000000000000
> >> [   65.629864] x5 : ffffa0002a483a48 x4 : 0000000000000000
> >> [   65.629868] x3 : ffffa000279df03c x2 : 0000000000000000
> >> [   65.636662] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> >> [   65.647207] x1 : ef244e124d671400 x0 : 0000000000000004
> >> [   65.647210] Call trace:
> >> [   65.647214]  megasas_readl+0x60/0x80
> >> [   65.647218]  megasas_read_fw_status_reg_fusion+0x2c/0x38
> >> [   65.647221]  megasas_fault_detect_work+0x44/0x520
> >> [   65.647226]  process_one_work+0x488/0xc08
> >> [   65.647228]  worker_thread+0x68/0x5d0
> >> [   65.647233]  kthread+0x1c8/0x1d0
> >> [   65.669535] ohci-pci: OHCI PCI platform driver
> >> [   65.689683]  ret_from_fork+0x10/0x18
> >> [   65.689689] Code: 54ffff09 a94153f3 a8c27bfd d65f03c0 (b9400260)
> >> [   65.689695] ---[ end trace 3632c7efc4f2d69c ]---
> >>
> >>
> >> That's 5.6-rc3 .
> >>
> >> Please have a look,
> >>
> >> John
> >>
> >>
> >>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: megaraid_sas problem for scsi_add_host() fail
  2020-03-02 17:58     ` Sumit Saxena
@ 2020-03-02 18:34       ` John Garry
  2020-03-02 18:48         ` Sumit Saxena
  0 siblings, 1 reply; 6+ messages in thread
From: John Garry @ 2020-03-02 18:34 UTC (permalink / raw)
  To: Sumit Saxena; +Cc: Kashyap Desai, Shivasharan S, PDL,MEGARAIDLINUX, linux-scsi

On 02/03/2020 17:58, Sumit Saxena wrote:
> On Mon, Mar 2, 2020 at 5:43 PM John Garry <john.garry@huawei.com> wrote:
>>
>>
>>
>> Hi Sumit,
>>
>>> It is megaraid_sas driver bug. Driver does not freeup resources properly, when
>>> scsi_add_host() fails. Please try attached patch.
>>
>> Yeah, that looks to work. The driver gracefully failed to bind.
> Ok, I will send this patch to upstream.
>>
>> However we might have lots of memory leaks:
> Thanks for pointing it out John. I will look into these memory leaks and send
> patches with fix in the next few days.

Are the leaks unrelated to the fix? If so,

Reported-and-tested-by: John Garry <john.garry@huawei.com>

If they are related, then the fix just looks incomplete...

Thanks,
John

> 
> Thanks,
> Sumit
> 
>>
>>
>> root@(none)$ echo scan > /sys/kernel/debug/kmemleak
>> root@(none)$ [  140.585484] kmemleak: 259 new suspected memory leaks
>> (see /sys/kernel/debug/kmemleak)
>> [  140.585484] kmemleak: 259 new suspected memory leaks (see
>> /sys/kernel/debug/kmemleak)
>>
>> root@(none)$
>> root@(none)$ more /sys/kernel/debug/kmemleak
>> unreferenced object 0xffff0026b9184c00 (size 512):
>>     comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
>>     hex dump (first 32 bytes):
>>       60 00 00 00 61 00 00 00 62 00 00 00 63 00 00 00  `...a...b...c...
>>       64 00 00 00 65 00 00 00 66 00 00 00 67 00 00 00  d...e...f...g...
>>     backtrace:
>>       [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
>>       [<(____ptrval____)>] __kmalloc+0x174/0x280
>>       [<(____ptrval____)>] megasas_probe_one+0x798/0x2878
>>       [<(____ptrval____)>] local_pci_probe+0x74/0xf0
>>       [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
>>       [<(____ptrval____)>] process_one_work+0x488/0xc08
>>       [<(____ptrval____)>] worker_thread+0x330/0x5d0
>>       [<(____ptrval____)>] kthread+0x1c8/0x1d0
>>       [<(____ptrval____)>] ret_from_fork+0x10/0x18
>> unreferenced object 0xffff0026b922c000 (size 4096):
>>     comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
>>     hex dump (first 32 bytes):
>>       00 00 21 b7 26 00 ff ff 00 00 9f ff 00 00 00 00  ..!.&...........
>>       00 10 22 10 00 a0 ff ff 00 00 00 00 00 00 00 00  ..".............
>>     backtrace:
>>       [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
>>       [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
>>       [<(____ptrval____)>] megasas_alloc_fusion_context+0x30/0x1b0
>>       [<(____ptrval____)>] megasas_probe_one+0x7d8/0x2878
>>       [<(____ptrval____)>] local_pci_probe+0x74/0xf0
>>       [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
>>       [<(____ptrval____)>] process_one_work+0x488/0xc08
>>       [<(____ptrval____)>] worker_thread+0x330/0x5d0
>>       [<(____ptrval____)>] kthread+0x1c8/0x1d0
>>       [<(____ptrval____)>] ret_from_fork+0x10/0x18
>> unreferenced object 0xffff0026b7013000 (size 2048):
>>     comm "kworker/0:0", pid 5, jiffies 4294903512 (age 94.540s)
>>     hex dump (first 32 bytes):
>>       00 58 18 b9 26 00 ff ff 00 5c 18 b9 26 00 ff ff  .X..&....\..&...
>>       00 60 18 b9 26 00 ff ff 00 64 18 b9 26 00 ff ff  .`..&....d..&...
>>     backtrace:
>>       [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
>>       [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
>> root@(none)$
>>
>>
>> Thanks,
>> John
>>
>>>
>>> Thanks,
>>> Sumit
>>>>
>>>> [   62.516871] megasas: 07.713.01.00-rc1
>>>> [   62.526189] megaraid_sas 0000:08:00.0: Adding to iommu group 1
>>>> [   62.571790] megaraid_sas 0000:08:00.0: BAR:0x0  BAR's
>>>> base_addr(phys):0x0000080010000000  mapped virt_addr:0x(____ptrval____)
>>>> [   62.571802] megaraid_sas 0000:08:00.0: FW now in Ready state
>>>> [   62.583811] megaraid_sas 0000:08:00.0: 63 bit DMA mask and 63 bit
>>>> consistent mask
>>>> [   62.602143] megaraid_sas 0000:08:00.0: firmware supports msix : (128)
>>>> [   62.780250] megaraid_sas 0000:08:00.0: requested/available msix 128/128
>>>> [   62.794292] megaraid_sas 0000:08:00.0: current msix/online cpus :
>>>> (128/128)
>>>> [   62.809011] megaraid_sas 0000:08:00.0: RDPQ mode : (enabled)
>>>> [   62.820968] megaraid_sas 0000:08:00.0: Current firmware supports
>>>> maximum commands: 4077 LDIO threshold: 0
>>>> [   62.937043] megaraid_sas 0000:08:00.0: Configured max firmware
>>>> commands: 4076
>>>> [   63.509185] megaraid_sas 0000:08:00.0: Performance mode :Latency
>>>> [   63.521906] megaraid_sas 0000:08:00.0: FW supports sync cache : Yes
>>>> [   63.535148] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
>>>> called outbound_intr_mask:0x40000009
>>>> [   63.610607] megaraid_sas 0000:08:00.0: FW provided supportMaxExtLDs:
>>>> 1 max_lds: 64
>>>> [   63.626618] megaraid_sas 0000:08:00.0: controller type : MR(2048MB)
>>>> [   63.639870] megaraid_sas 0000:08:00.0: Online Controller Reset(OCR) :
>>>> Enabled
>>>> [   63.654945] megaraid_sas 0000:08:00.0: Secure JBOD support : Yes
>>>> [   63.667661] megaraid_sas 0000:08:00.0: NVMe passthru support : Yes
>>>> [   63.667672] megaraid_sas 0000:08:00.0: FW provided TM TaskAbort/Reset
>>>> timeout : 6 secs/60 secs
>>>> [   63.698922] megaraid_sas 0000:08:00.0: JBOD sequence map support : Yes
>>>> [   63.712715] megaraid_sas 0000:08:00.0: PCI Lane Margining support : No
>>>> [   63.754764] megaraid_sas 0000:08:00.0: NVME page size : (4096)
>>>> [   63.787258] megaraid_sas 0000:08:00.0: megasas_enable_intr_fusion is
>>>> called outbound_intr_mask:0x40000000
>>>> [   63.807485] megaraid_sas 0000:08:00.0: INIT adapter done
>>>> [   63.822235] megaraid_sas 0000:08:00.0: pci id :
>>>> (0x1000)/(0x0016)/(0x19e5)/(0xd215)
>>>> [   63.838652] megaraid_sas 0000:08:00.0: unevenspan support : no
>>>> [   63.850980] megaraid_sas 0000:08:00.0: firmware crash dump : no
>>>> [   63.863499] megaraid_sas 0000:08:00.0: JBOD sequence map : enabled
>>>> [   63.877352] scsi host0: Avago SAS based MegaRAID driver
>>>> [   63.890398] megaraid_sas 0000:08:00.0: Failed to add host from
>>>> megasas_io_attach 6802
>>>> [   63.906999] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
>>>> called outbound_intr_mask:0x40000009
>>>> [   64.591755] nvme 0000:81:00.0: Adding to iommu group 2
>>>> [   64.636476] nvme nvme0: pci function 0000:81:00.0
>>>> [   64.669635] libphy: Fixed MDIO Bus: probed
>>>> [   64.680255] tun: Universal TUN/TAP device driver, 1.6
>>>> [   64.694422] thunder_xcv, ver 1.0
>>>> [   64.702042] thunder_bgx, ver 1.0
>>>> [   64.709277] nicpf, ver 1.0
>>>> [   64.718144] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
>>>> [   64.730402] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
>>>> [   64.743337] igb: Intel(R) Gigabit Ethernet Network Driver - version
>>>> 5.6.0-k
>>>> [   64.754981] nvme nvme0: Removing after probe failure status: -12
>>>> [   64.757953] igb: Copyright (c) 2007-2014 Intel Corporation.
>>>> [   64.782805] igbvf: Intel(R) Gigabit Virtual Function Network Driver -
>>>> version 2.4.0-k
>>>> [   64.799423] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
>>>> [   64.813848] sky2: driver version 1.30
>>>> [   64.825564] VFIO - User Level meta-driver version: 0.3
>>>> [   64.848089] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
>>>> [   64.862029] ehci-pci: EHCI PCI platform driver
>>>> [   64.873445] ehci-pci 0000:7a:01.0: Adding to iommu group 3
>>>> [   64.886700]
>>>> ==================================================================
>>>> [   64.901999] BUG: KASAN: slab-out-of-bounds in
>>>> run_timer_softirq+0x6f4/0xae0
>>>> [   64.916663] Write of size 8 at addr ffff0026b931aae0 by task swapper/0/0
>>>>
>>>> [   64.933914] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
>>>> 5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
>>>> [   64.952240] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
>>>> BIOS 2280-V2 CS V3.B160.01 02/24/2020
>>>> [   64.972575] Call trace:
>>>> [   64.977729]  dump_backtrace+0x0/0x298
>>>> [   64.985439]  show_stack+0x14/0x20
>>>> [   64.992418]  dump_stack+0x118/0x190
>>>> [   64.999762]  print_address_description.isra.9+0x6c/0x3b8
>>>> [   65.010953]  __kasan_report+0x134/0x23c
>>>> [   65.019029]  kasan_report+0xc/0x18
>>>> [   65.026188]  __asan_store8+0x94/0xb8
>>>> [   65.033720]  run_timer_softirq+0x6f4/0xae0
>>>> [   65.042343]  efi_header_end+0x16c/0x840
>>>> [   65.050420]  irq_exit+0x19c/0x1a8
>>>> [   65.057396]  __handle_domain_irq+0x7c/0xe0
>>>> [   65.066022]  gic_handle_irq+0x64/0x168
>>>> [   65.073917]  el1_irq+0xbc/0x180
>>>> [   65.080528]  arch_cpu_idle+0x3c/0x320
>>>> [   65.088239]  default_idle_call+0x28/0x4c
>>>> [   65.096502]  do_idle+0x278/0x348
>>>> [   65.103295]  cpu_startup_entry+0x24/0x40
>>>> [   65.111554]  rest_init+0x1c4/0x298
>>>> [   65.118718]  arch_call_rest_init+0xc/0x14
>>>> [   65.127159]  start_kernel+0x848/0x888
>>>>
>>>> [   65.138006] Allocated by task 0:
>>>> [   65.144802] (stack is not available)
>>>>
>>>> [   65.155465] Freed by task 0:
>>>> [   65.161530] (stack is not available)
>>>>
>>>> [   65.172193] The buggy address belongs to the object at ffff0026b931aa00
>>>>     which belongs to the cache pool_workqueue of size 256
>>>> [   65.199113] The buggy address is located 224 bytes inside of
>>>>     256-byte region [ffff0026b931aa00, ffff0026b931ab00)
>>>> [   65.223840] The buggy address belongs to the page:
>>>> [   65.233931] page:fffffe009ac4c600 refcount:1 mapcount:0
>>>> mapping:ffff0026dd81c880 index:0xffff0026b931fe00 compound_mapcount: 0
>>>> [   65.257923] flags: 0x6ffff00000010200(slab|head)
>>>> [   65.267649] raw: 6ffff00000010200 fffffe009b20b208 fffffe009ac07608
>>>> ffff0026dd81c880
>>>> [   65.283959] raw: ffff0026b931fe00 0000000000400002 00000001ffffffff
>>>> 0000000000000000
>>>> [   65.300270] page dumped because: kasan: bad access detected
>>>>
>>>> [   65.315139] Memory state around the buggy address:
>>>> [   65.325231]  ffff0026b931a980: fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> fc fc fc
>>>> [   65.340445]  ffff0026b931aa00: fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> fc fc fc
>>>> [   65.355660] >ffff0026b931aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> fc fc fc
>>>> [   65.370870]                                                        ^
>>>> [   65.384256]  ffff0026b931ab00: fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> fc fc fc
>>>> [   65.399467]  ffff0026b931ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> fc fc fc
>>>> [   65.414675]
>>>> ==================================================================
>>>> [   65.429885] Disabling lock debugging due to kernel taint
>>>> [   65.441431] Unable to handle kernel paging request at virtual address
>>>> ffffa0001013c0b0
>>>> [   65.441695] ehci-pci 0000:7a:01.0: EHCI Host Controller
>>>> [   65.458088] Mem abort info:
>>>> [   65.469183] ehci-pci 0000:7a:01.0: new USB bus registered, assigned
>>>> bus number 1
>>>> [   65.474927]   ESR = 0x96000007
>>>> [   65.491201] ehci-pci 0000:7a:01.0: irq 65, io mem 0x20c101000
>>>> [   65.496913]   EC = 0x25: DABT (current EL), IL = 32 bits
>>>> [   65.496918]   SET = 0, FnV = 0
>>>> [   65.496922]   EA = 0, S1PTW = 0
>>>> [   65.522586] ehci-pci 0000:7a:01.0: USB 0.0 started, EHCI 1.00
>>>> [   65.526575] Data abort info:
>>>> [   65.526580]   ISV = 0, ISS = 0x00000007
>>>> [   65.535948] hub 1-0:1.0: USB hub found
>>>> [   65.545245]   CM = 0, WnR = 0
>>>> [   65.545251] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000052530000
>>>> [   65.545256] [ffffa0001013c0b0] pgd=00002027fffff003,
>>>> pud=00002027ffffe003, pmd=00000026dda5b003, pte=0000000000000000
>>>> [   65.551519] hub 1-0:1.0: 2 ports detected
>>>> [   65.559375] Internal error: Oops: 96000007 [#1] PREEMPT SMP
>>>> [   65.559379] Modules linked in:
>>>> [   65.569534] ehci-platform: EHCI generic platform driver
>>>> [   65.573475] CPU: 34 PID: 8 Comm: kworker/u256:0 Tainted: G    B
>>>>          5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
>>>> [   65.573477] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
>>>> BIOS 2280-V2 CS V3.B160.01 02/24/2020
>>>> [   65.573487] Workqueue: poll_megasas0_status megasas_fault_detect_work
>>>> [   65.573492] pstate: 80c00009 (Nzcv daif +PAN +UAO)
>>>> [   65.588048] ehci-orion: EHCI orion driver
>>>> [   65.609756] pc : megasas_readl+0x60/0x80
>>>> [   65.609759] lr : megasas_readl+0x1c/0x80
>>>> [   65.609761] sp : ffff0026d97bfc00
>>>> [   65.609763] x29: ffff0026d97bfc00 x28: ffff0026d97a9890
>>>> [   65.609767] x27: ffff0026d97a0618 x26: ffff0026d97a9880
>>>> [   65.609771] x25: ffff0026d9758808 x24: ffff0026b931aa28
>>>> [   65.609775] x23: ffff0026b931aa98 x22: ffffa0002931e000
>>>> [   65.609779] x21: ffff0026dd898800 x20: ffff0026b931dcd8
>>>> [   65.618543] ehci-exynos: EHCI Exynos driver
>>>> [   65.629840] x19: ffffa0001013c0b0 x18: 0000000000000000
>>>> [   65.629843] x17: 0000000000001d50 x16: ffffffffffffe240
>>>> [   65.629847] x15: 00000000000013a8 x14: 0000000000000000
>>>> [   65.629850] x13: 00000000000013a0 x12: 1fffe004db2f7f7c
>>>> [   65.629854] x11: ffff8004db2f7f78 x10: dfffa00000000000
>>>> [   65.629857] x9 : ffffa00028f679e8 x8 : ffffa0002a483a48
>>>> [   65.629861] x7 : ffffa00026d5ed94 x6 : 0000000000000000
>>>> [   65.629864] x5 : ffffa0002a483a48 x4 : 0000000000000000
>>>> [   65.629868] x3 : ffffa000279df03c x2 : 0000000000000000
>>>> [   65.636662] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
>>>> [   65.647207] x1 : ef244e124d671400 x0 : 0000000000000004
>>>> [   65.647210] Call trace:
>>>> [   65.647214]  megasas_readl+0x60/0x80
>>>> [   65.647218]  megasas_read_fw_status_reg_fusion+0x2c/0x38
>>>> [   65.647221]  megasas_fault_detect_work+0x44/0x520
>>>> [   65.647226]  process_one_work+0x488/0xc08
>>>> [   65.647228]  worker_thread+0x68/0x5d0
>>>> [   65.647233]  kthread+0x1c8/0x1d0
>>>> [   65.669535] ohci-pci: OHCI PCI platform driver
>>>> [   65.689683]  ret_from_fork+0x10/0x18
>>>> [   65.689689] Code: 54ffff09 a94153f3 a8c27bfd d65f03c0 (b9400260)
>>>> [   65.689695] ---[ end trace 3632c7efc4f2d69c ]---
>>>>
>>>>
>>>> That's 5.6-rc3 .
>>>>
>>>> Please have a look,
>>>>
>>>> John
>>>>
>>>>
>>>>
>>
> .
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: megaraid_sas problem for scsi_add_host() fail
  2020-03-02 18:34       ` John Garry
@ 2020-03-02 18:48         ` Sumit Saxena
  0 siblings, 0 replies; 6+ messages in thread
From: Sumit Saxena @ 2020-03-02 18:48 UTC (permalink / raw)
  To: John Garry; +Cc: Kashyap Desai, Shivasharan S, PDL,MEGARAIDLINUX, linux-scsi

On Tue, Mar 3, 2020 at 12:04 AM John Garry <john.garry@huawei.com> wrote:
>
> On 02/03/2020 17:58, Sumit Saxena wrote:
> > On Mon, Mar 2, 2020 at 5:43 PM John Garry <john.garry@huawei.com> wrote:
> >>
> >>
> >>
> >> Hi Sumit,
> >>
> >>> It is megaraid_sas driver bug. Driver does not freeup resources properly, when
> >>> scsi_add_host() fails. Please try attached patch.
> >>
> >> Yeah, that looks to work. The driver gracefully failed to bind.
> > Ok, I will send this patch to upstream.
> >>
> >> However we might have lots of memory leaks:
> > Thanks for pointing it out John. I will look into these memory leaks and send
> > patches with fix in the next few days.
>
> Are the leaks unrelated to the fix? If so,
There are issues in this error handling code of megaraid_sas driver which
may lead to memory leaks and kernel crash. The patch provided only fixed
the problem(kernel crash) you faced. Complete work in this area along
with memory
leak fixes may take some time. I thought of sending this fix standalone now and
can fix memory leaks issue in later patches.

Thanks,
Sumit
>
> Reported-and-tested-by: John Garry <john.garry@huawei.com>
>
> If they are related, then the fix just looks incomplete...
>
> Thanks,
> John
>
> >
> > Thanks,
> > Sumit
> >
> >>
> >>
> >> root@(none)$ echo scan > /sys/kernel/debug/kmemleak
> >> root@(none)$ [  140.585484] kmemleak: 259 new suspected memory leaks
> >> (see /sys/kernel/debug/kmemleak)
> >> [  140.585484] kmemleak: 259 new suspected memory leaks (see
> >> /sys/kernel/debug/kmemleak)
> >>
> >> root@(none)$
> >> root@(none)$ more /sys/kernel/debug/kmemleak
> >> unreferenced object 0xffff0026b9184c00 (size 512):
> >>     comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
> >>     hex dump (first 32 bytes):
> >>       60 00 00 00 61 00 00 00 62 00 00 00 63 00 00 00  `...a...b...c...
> >>       64 00 00 00 65 00 00 00 66 00 00 00 67 00 00 00  d...e...f...g...
> >>     backtrace:
> >>       [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
> >>       [<(____ptrval____)>] __kmalloc+0x174/0x280
> >>       [<(____ptrval____)>] megasas_probe_one+0x798/0x2878
> >>       [<(____ptrval____)>] local_pci_probe+0x74/0xf0
> >>       [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
> >>       [<(____ptrval____)>] process_one_work+0x488/0xc08
> >>       [<(____ptrval____)>] worker_thread+0x330/0x5d0
> >>       [<(____ptrval____)>] kthread+0x1c8/0x1d0
> >>       [<(____ptrval____)>] ret_from_fork+0x10/0x18
> >> unreferenced object 0xffff0026b922c000 (size 4096):
> >>     comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
> >>     hex dump (first 32 bytes):
> >>       00 00 21 b7 26 00 ff ff 00 00 9f ff 00 00 00 00  ..!.&...........
> >>       00 10 22 10 00 a0 ff ff 00 00 00 00 00 00 00 00  ..".............
> >>     backtrace:
> >>       [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
> >>       [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
> >>       [<(____ptrval____)>] megasas_alloc_fusion_context+0x30/0x1b0
> >>       [<(____ptrval____)>] megasas_probe_one+0x7d8/0x2878
> >>       [<(____ptrval____)>] local_pci_probe+0x74/0xf0
> >>       [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
> >>       [<(____ptrval____)>] process_one_work+0x488/0xc08
> >>       [<(____ptrval____)>] worker_thread+0x330/0x5d0
> >>       [<(____ptrval____)>] kthread+0x1c8/0x1d0
> >>       [<(____ptrval____)>] ret_from_fork+0x10/0x18
> >> unreferenced object 0xffff0026b7013000 (size 2048):
> >>     comm "kworker/0:0", pid 5, jiffies 4294903512 (age 94.540s)
> >>     hex dump (first 32 bytes):
> >>       00 58 18 b9 26 00 ff ff 00 5c 18 b9 26 00 ff ff  .X..&....\..&...
> >>       00 60 18 b9 26 00 ff ff 00 64 18 b9 26 00 ff ff  .`..&....d..&...
> >>     backtrace:
> >>       [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
> >>       [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
> >> root@(none)$
> >>
> >>
> >> Thanks,
> >> John
> >>
> >>>
> >>> Thanks,
> >>> Sumit
> >>>>
> >>>> [   62.516871] megasas: 07.713.01.00-rc1
> >>>> [   62.526189] megaraid_sas 0000:08:00.0: Adding to iommu group 1
> >>>> [   62.571790] megaraid_sas 0000:08:00.0: BAR:0x0  BAR's
> >>>> base_addr(phys):0x0000080010000000  mapped virt_addr:0x(____ptrval____)
> >>>> [   62.571802] megaraid_sas 0000:08:00.0: FW now in Ready state
> >>>> [   62.583811] megaraid_sas 0000:08:00.0: 63 bit DMA mask and 63 bit
> >>>> consistent mask
> >>>> [   62.602143] megaraid_sas 0000:08:00.0: firmware supports msix : (128)
> >>>> [   62.780250] megaraid_sas 0000:08:00.0: requested/available msix 128/128
> >>>> [   62.794292] megaraid_sas 0000:08:00.0: current msix/online cpus :
> >>>> (128/128)
> >>>> [   62.809011] megaraid_sas 0000:08:00.0: RDPQ mode : (enabled)
> >>>> [   62.820968] megaraid_sas 0000:08:00.0: Current firmware supports
> >>>> maximum commands: 4077 LDIO threshold: 0
> >>>> [   62.937043] megaraid_sas 0000:08:00.0: Configured max firmware
> >>>> commands: 4076
> >>>> [   63.509185] megaraid_sas 0000:08:00.0: Performance mode :Latency
> >>>> [   63.521906] megaraid_sas 0000:08:00.0: FW supports sync cache : Yes
> >>>> [   63.535148] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
> >>>> called outbound_intr_mask:0x40000009
> >>>> [   63.610607] megaraid_sas 0000:08:00.0: FW provided supportMaxExtLDs:
> >>>> 1 max_lds: 64
> >>>> [   63.626618] megaraid_sas 0000:08:00.0: controller type : MR(2048MB)
> >>>> [   63.639870] megaraid_sas 0000:08:00.0: Online Controller Reset(OCR) :
> >>>> Enabled
> >>>> [   63.654945] megaraid_sas 0000:08:00.0: Secure JBOD support : Yes
> >>>> [   63.667661] megaraid_sas 0000:08:00.0: NVMe passthru support : Yes
> >>>> [   63.667672] megaraid_sas 0000:08:00.0: FW provided TM TaskAbort/Reset
> >>>> timeout : 6 secs/60 secs
> >>>> [   63.698922] megaraid_sas 0000:08:00.0: JBOD sequence map support : Yes
> >>>> [   63.712715] megaraid_sas 0000:08:00.0: PCI Lane Margining support : No
> >>>> [   63.754764] megaraid_sas 0000:08:00.0: NVME page size : (4096)
> >>>> [   63.787258] megaraid_sas 0000:08:00.0: megasas_enable_intr_fusion is
> >>>> called outbound_intr_mask:0x40000000
> >>>> [   63.807485] megaraid_sas 0000:08:00.0: INIT adapter done
> >>>> [   63.822235] megaraid_sas 0000:08:00.0: pci id :
> >>>> (0x1000)/(0x0016)/(0x19e5)/(0xd215)
> >>>> [   63.838652] megaraid_sas 0000:08:00.0: unevenspan support : no
> >>>> [   63.850980] megaraid_sas 0000:08:00.0: firmware crash dump : no
> >>>> [   63.863499] megaraid_sas 0000:08:00.0: JBOD sequence map : enabled
> >>>> [   63.877352] scsi host0: Avago SAS based MegaRAID driver
> >>>> [   63.890398] megaraid_sas 0000:08:00.0: Failed to add host from
> >>>> megasas_io_attach 6802
> >>>> [   63.906999] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
> >>>> called outbound_intr_mask:0x40000009
> >>>> [   64.591755] nvme 0000:81:00.0: Adding to iommu group 2
> >>>> [   64.636476] nvme nvme0: pci function 0000:81:00.0
> >>>> [   64.669635] libphy: Fixed MDIO Bus: probed
> >>>> [   64.680255] tun: Universal TUN/TAP device driver, 1.6
> >>>> [   64.694422] thunder_xcv, ver 1.0
> >>>> [   64.702042] thunder_bgx, ver 1.0
> >>>> [   64.709277] nicpf, ver 1.0
> >>>> [   64.718144] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> >>>> [   64.730402] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> >>>> [   64.743337] igb: Intel(R) Gigabit Ethernet Network Driver - version
> >>>> 5.6.0-k
> >>>> [   64.754981] nvme nvme0: Removing after probe failure status: -12
> >>>> [   64.757953] igb: Copyright (c) 2007-2014 Intel Corporation.
> >>>> [   64.782805] igbvf: Intel(R) Gigabit Virtual Function Network Driver -
> >>>> version 2.4.0-k
> >>>> [   64.799423] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
> >>>> [   64.813848] sky2: driver version 1.30
> >>>> [   64.825564] VFIO - User Level meta-driver version: 0.3
> >>>> [   64.848089] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> >>>> [   64.862029] ehci-pci: EHCI PCI platform driver
> >>>> [   64.873445] ehci-pci 0000:7a:01.0: Adding to iommu group 3
> >>>> [   64.886700]
> >>>> ==================================================================
> >>>> [   64.901999] BUG: KASAN: slab-out-of-bounds in
> >>>> run_timer_softirq+0x6f4/0xae0
> >>>> [   64.916663] Write of size 8 at addr ffff0026b931aae0 by task swapper/0/0
> >>>>
> >>>> [   64.933914] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> >>>> 5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
> >>>> [   64.952240] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
> >>>> BIOS 2280-V2 CS V3.B160.01 02/24/2020
> >>>> [   64.972575] Call trace:
> >>>> [   64.977729]  dump_backtrace+0x0/0x298
> >>>> [   64.985439]  show_stack+0x14/0x20
> >>>> [   64.992418]  dump_stack+0x118/0x190
> >>>> [   64.999762]  print_address_description.isra.9+0x6c/0x3b8
> >>>> [   65.010953]  __kasan_report+0x134/0x23c
> >>>> [   65.019029]  kasan_report+0xc/0x18
> >>>> [   65.026188]  __asan_store8+0x94/0xb8
> >>>> [   65.033720]  run_timer_softirq+0x6f4/0xae0
> >>>> [   65.042343]  efi_header_end+0x16c/0x840
> >>>> [   65.050420]  irq_exit+0x19c/0x1a8
> >>>> [   65.057396]  __handle_domain_irq+0x7c/0xe0
> >>>> [   65.066022]  gic_handle_irq+0x64/0x168
> >>>> [   65.073917]  el1_irq+0xbc/0x180
> >>>> [   65.080528]  arch_cpu_idle+0x3c/0x320
> >>>> [   65.088239]  default_idle_call+0x28/0x4c
> >>>> [   65.096502]  do_idle+0x278/0x348
> >>>> [   65.103295]  cpu_startup_entry+0x24/0x40
> >>>> [   65.111554]  rest_init+0x1c4/0x298
> >>>> [   65.118718]  arch_call_rest_init+0xc/0x14
> >>>> [   65.127159]  start_kernel+0x848/0x888
> >>>>
> >>>> [   65.138006] Allocated by task 0:
> >>>> [   65.144802] (stack is not available)
> >>>>
> >>>> [   65.155465] Freed by task 0:
> >>>> [   65.161530] (stack is not available)
> >>>>
> >>>> [   65.172193] The buggy address belongs to the object at ffff0026b931aa00
> >>>>     which belongs to the cache pool_workqueue of size 256
> >>>> [   65.199113] The buggy address is located 224 bytes inside of
> >>>>     256-byte region [ffff0026b931aa00, ffff0026b931ab00)
> >>>> [   65.223840] The buggy address belongs to the page:
> >>>> [   65.233931] page:fffffe009ac4c600 refcount:1 mapcount:0
> >>>> mapping:ffff0026dd81c880 index:0xffff0026b931fe00 compound_mapcount: 0
> >>>> [   65.257923] flags: 0x6ffff00000010200(slab|head)
> >>>> [   65.267649] raw: 6ffff00000010200 fffffe009b20b208 fffffe009ac07608
> >>>> ffff0026dd81c880
> >>>> [   65.283959] raw: ffff0026b931fe00 0000000000400002 00000001ffffffff
> >>>> 0000000000000000
> >>>> [   65.300270] page dumped because: kasan: bad access detected
> >>>>
> >>>> [   65.315139] Memory state around the buggy address:
> >>>> [   65.325231]  ffff0026b931a980: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >>>> fc fc fc
> >>>> [   65.340445]  ffff0026b931aa00: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >>>> fc fc fc
> >>>> [   65.355660] >ffff0026b931aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >>>> fc fc fc
> >>>> [   65.370870]                                                        ^
> >>>> [   65.384256]  ffff0026b931ab00: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >>>> fc fc fc
> >>>> [   65.399467]  ffff0026b931ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc
> >>>> fc fc fc
> >>>> [   65.414675]
> >>>> ==================================================================
> >>>> [   65.429885] Disabling lock debugging due to kernel taint
> >>>> [   65.441431] Unable to handle kernel paging request at virtual address
> >>>> ffffa0001013c0b0
> >>>> [   65.441695] ehci-pci 0000:7a:01.0: EHCI Host Controller
> >>>> [   65.458088] Mem abort info:
> >>>> [   65.469183] ehci-pci 0000:7a:01.0: new USB bus registered, assigned
> >>>> bus number 1
> >>>> [   65.474927]   ESR = 0x96000007
> >>>> [   65.491201] ehci-pci 0000:7a:01.0: irq 65, io mem 0x20c101000
> >>>> [   65.496913]   EC = 0x25: DABT (current EL), IL = 32 bits
> >>>> [   65.496918]   SET = 0, FnV = 0
> >>>> [   65.496922]   EA = 0, S1PTW = 0
> >>>> [   65.522586] ehci-pci 0000:7a:01.0: USB 0.0 started, EHCI 1.00
> >>>> [   65.526575] Data abort info:
> >>>> [   65.526580]   ISV = 0, ISS = 0x00000007
> >>>> [   65.535948] hub 1-0:1.0: USB hub found
> >>>> [   65.545245]   CM = 0, WnR = 0
> >>>> [   65.545251] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000052530000
> >>>> [   65.545256] [ffffa0001013c0b0] pgd=00002027fffff003,
> >>>> pud=00002027ffffe003, pmd=00000026dda5b003, pte=0000000000000000
> >>>> [   65.551519] hub 1-0:1.0: 2 ports detected
> >>>> [   65.559375] Internal error: Oops: 96000007 [#1] PREEMPT SMP
> >>>> [   65.559379] Modules linked in:
> >>>> [   65.569534] ehci-platform: EHCI generic platform driver
> >>>> [   65.573475] CPU: 34 PID: 8 Comm: kworker/u256:0 Tainted: G    B
> >>>>          5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
> >>>> [   65.573477] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
> >>>> BIOS 2280-V2 CS V3.B160.01 02/24/2020
> >>>> [   65.573487] Workqueue: poll_megasas0_status megasas_fault_detect_work
> >>>> [   65.573492] pstate: 80c00009 (Nzcv daif +PAN +UAO)
> >>>> [   65.588048] ehci-orion: EHCI orion driver
> >>>> [   65.609756] pc : megasas_readl+0x60/0x80
> >>>> [   65.609759] lr : megasas_readl+0x1c/0x80
> >>>> [   65.609761] sp : ffff0026d97bfc00
> >>>> [   65.609763] x29: ffff0026d97bfc00 x28: ffff0026d97a9890
> >>>> [   65.609767] x27: ffff0026d97a0618 x26: ffff0026d97a9880
> >>>> [   65.609771] x25: ffff0026d9758808 x24: ffff0026b931aa28
> >>>> [   65.609775] x23: ffff0026b931aa98 x22: ffffa0002931e000
> >>>> [   65.609779] x21: ffff0026dd898800 x20: ffff0026b931dcd8
> >>>> [   65.618543] ehci-exynos: EHCI Exynos driver
> >>>> [   65.629840] x19: ffffa0001013c0b0 x18: 0000000000000000
> >>>> [   65.629843] x17: 0000000000001d50 x16: ffffffffffffe240
> >>>> [   65.629847] x15: 00000000000013a8 x14: 0000000000000000
> >>>> [   65.629850] x13: 00000000000013a0 x12: 1fffe004db2f7f7c
> >>>> [   65.629854] x11: ffff8004db2f7f78 x10: dfffa00000000000
> >>>> [   65.629857] x9 : ffffa00028f679e8 x8 : ffffa0002a483a48
> >>>> [   65.629861] x7 : ffffa00026d5ed94 x6 : 0000000000000000
> >>>> [   65.629864] x5 : ffffa0002a483a48 x4 : 0000000000000000
> >>>> [   65.629868] x3 : ffffa000279df03c x2 : 0000000000000000
> >>>> [   65.636662] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> >>>> [   65.647207] x1 : ef244e124d671400 x0 : 0000000000000004
> >>>> [   65.647210] Call trace:
> >>>> [   65.647214]  megasas_readl+0x60/0x80
> >>>> [   65.647218]  megasas_read_fw_status_reg_fusion+0x2c/0x38
> >>>> [   65.647221]  megasas_fault_detect_work+0x44/0x520
> >>>> [   65.647226]  process_one_work+0x488/0xc08
> >>>> [   65.647228]  worker_thread+0x68/0x5d0
> >>>> [   65.647233]  kthread+0x1c8/0x1d0
> >>>> [   65.669535] ohci-pci: OHCI PCI platform driver
> >>>> [   65.689683]  ret_from_fork+0x10/0x18
> >>>> [   65.689689] Code: 54ffff09 a94153f3 a8c27bfd d65f03c0 (b9400260)
> >>>> [   65.689695] ---[ end trace 3632c7efc4f2d69c ]---
> >>>>
> >>>>
> >>>> That's 5.6-rc3 .
> >>>>
> >>>> Please have a look,
> >>>>
> >>>> John
> >>>>
> >>>>
> >>>>
> >>
> > .
> >
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-03-02 18:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-27 18:41 megaraid_sas problem for scsi_add_host() fail John Garry
2020-03-02 11:10 ` Sumit Saxena
2020-03-02 12:13   ` John Garry
2020-03-02 17:58     ` Sumit Saxena
2020-03-02 18:34       ` John Garry
2020-03-02 18:48         ` Sumit Saxena

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.