linux-rdma.vger.kernel.org archive mirror
* v5.14-rc5: KASAN complains about use-after-free in __ib_process_cq()
From: Bart Van Assche @ 2021-08-23 17:30 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Christoph Hellwig, linux-rdma

Hi,

If I run blktests against Jens' for-next branch (5026771bd46e ("Merge branch
'for-5.15/io_uring-late' into for-next")) then most SRP tests time out.
Additionally, a KASAN use-after-free complaint is sometimes reported for
__ib_process_cq(). With commit 4b5f4d3fb408 ("RDMA: Split the alloc_hw_stats()
ops to port and device variants") however all SRP tests pass and no KASAN
complaints are reported. There are no changes in the SRP drivers between these
two commits. This makes me wonder whether a regression has been introduced in
the RDMA core. I have not yet run a full bisect - this is something I am
working on. Please note that I may be hitting multiple unrelated issues -
there is no evidence so far that the SRP test timeouts are related to changes
in the RDMA code. These could also be caused by changes in the block layer.

Thanks,

Bart.

root[4317]: run blktests srp/006
[ ... ]
kernel: ==================================================================
kernel: BUG: KASAN: use-after-free in __ib_process_cq+0x118/0x3d0 [ib_core]
kernel: Read of size 8 at addr ffff8881e02e9d20 by task kworker/1:27/3431
kernel:
kernel: CPU: 1 PID: 3431 Comm: kworker/1:27 Tainted: G            E     5.14.0-rc7-dbg+ #27
kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
kernel: Workqueue: srp_remove srp_remove_work [ib_srp]
kernel: Call Trace:
kernel:  show_stack+0x52/0x58
kernel:  dump_stack_lvl+0x49/0x5e
kernel:  print_address_description.constprop.0+0x24/0x150
kernel:  kasan_report.cold+0x82/0xdb
kernel:  __asan_load8+0x69/0x90
kernel:  __ib_process_cq+0x118/0x3d0 [ib_core]
kernel:  ib_process_cq_direct+0x7d/0xa0 [ib_core]
kernel:  srp_free_ch_ib+0x191/0x570 [ib_srp]
kernel:  srp_remove_work+0x174/0x2d0 [ib_srp]
kernel:  process_one_work+0x56a/0xab0
kernel:  worker_thread+0x2e7/0x700
kernel:  kthread+0x1f6/0x220
kernel:  ret_from_fork+0x1f/0x30
kernel:
kernel: The buggy address belongs to the page:
kernel: page:000000003ae07f35 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1e02e9
kernel: flags: 0x1000000000000000(node=0|zone=2)
kernel: raw: 1000000000000000 ffffea000780bc08 ffffea000780ba08 0000000000000000
kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
kernel: page dumped because: kasan: bad access detected
kernel:
kernel: Memory state around the buggy address:
kernel:  ffff8881e02e9c00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
kernel:  ffff8881e02e9c80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
kernel: >ffff8881e02e9d00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
kernel:                                ^
kernel:  ffff8881e02e9d80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
kernel:  ffff8881e02e9e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
kernel: ==================================================================
kernel: Disabling lock debugging due to kernel taint
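
Note on reading the "Memory state around the buggy address" dump above: the
following is a minimal sketch, not part of the original report. It assumes the
kernel was built with generic KASAN, where one shadow byte describes 8 bytes
of memory and 0xff marks a freed page; the report does not state the KASAN
mode. The helper names (parse_row, shadow_index) are made up for illustration.
It locates the shadow byte that covers the faulting address in the row marked
with ">".

```python
# Sketch: find the shadow byte covering the faulting address in a KASAN dump.
# Assumption: generic KASAN, one shadow byte per 8 bytes of memory.
GRANULE = 8  # bytes of memory described by one shadow byte

# Subset of generic-KASAN shadow values (as in v5.14-era mm/kasan/kasan.h).
SHADOW_MEANING = {
    0xFF: "page was freed",
    0xFB: "slab object was freed",
}

def parse_row(line: str):
    """Split a dump row like '>ffff...d00: ff ff ...' into (base, shadow bytes)."""
    addr, _, data = line.lstrip(" >").partition(":")
    return int(addr, 16), [int(b, 16) for b in data.split()]

def shadow_index(fault_addr: int, row_base: int) -> int:
    """Index, within one dump row, of the shadow byte covering fault_addr."""
    return (fault_addr - row_base) // GRANULE

# Values copied from the report above.
row = ">ffff8881e02e9d00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff"
fault = 0xffff8881e02e9d20

base, shadow = parse_row(row)
idx = shadow_index(fault, base)
print(idx, hex(shadow[idx]), SHADOW_MEANING.get(shadow[idx], "unknown"))
```

Under these assumptions the faulting address maps to the fifth shadow byte of
the marked row, which is where the "^" points, and the value there would mean
the whole page had already been freed. That would be consistent with the
"refcount:0" shown in the page dump.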

* Re: v5.14-rc5: KASAN complains about use-after-free in __ib_process_cq()
From: Jason Gunthorpe @ 2021-08-23 17:32 UTC (permalink / raw)
  To: Bart Van Assche, Leon Romanovsky; +Cc: Christoph Hellwig, linux-rdma

On Mon, Aug 23, 2021 at 10:30:39AM -0700, Bart Van Assche wrote:
> Hi,
> 
> If I run blktests against Jens' for-next branch (5026771bd46e ("Merge branch
> 'for-5.15/io_uring-late' into for-next")) then most SRP tests time out.
> Additionally, a KASAN use-after-free complaint is sometimes reported for
> __ib_process_cq(). With commit 4b5f4d3fb408 ("RDMA: Split the alloc_hw_stats()
> ops to port and device variants") however all SRP tests pass and no KASAN
> complaints are reported. There are no changes in the SRP drivers between these
> two commits. This makes me wonder whether a regression has been introduced in
> the RDMA core. I have not yet run a full bisect - this is something I am
> working on. Please note that I may be hitting multiple unrelated issues -
> there is no evidence so far that the SRP test timeouts are related to changes
> in the RDMA code. These could also be caused by changes in the block layer.

Maybe Leon's QP rework?

Jason
