linux-rdma.vger.kernel.org archive mirror
From: Jason Gunthorpe <jgg@nvidia.com>
To: Wenpeng Liang <liangwenpeng@huawei.com>
Cc: dledford@redhat.com, linux-rdma@vger.kernel.org, linuxarm@huawei.com
Subject: Re: [RESEND PATCH v2 for-next] RDMA/hns: Solve the problem that dma_pool is used during the reset
Date: Fri, 20 Aug 2021 16:29:17 -0300
Message-ID: <20210820192917.GA572552@nvidia.com>
In-Reply-To: <1629339474-43445-1-git-send-email-liangwenpeng@huawei.com>

On Thu, Aug 19, 2021 at 10:17:54AM +0800, Wenpeng Liang wrote:
> From: Lang Cheng <chenglang@huawei.com>
> 
> During the reset, the driver calls dma_pool_destroy() to release the
> dma_pool resources. If the dma_pool_free interface is called during the
> modify_qp operation, an exception will occur.
> 
> [15834.440744] Unable to handle kernel paging request at virtual address
> ffffa2cfc7725678
> ...
> [15834.660596] Call trace:
> [15834.663033]  queued_spin_lock_slowpath+0x224/0x308
> [15834.667802]  _raw_spin_lock_irqsave+0x78/0x88
> [15834.672140]  dma_pool_free+0x34/0x118
> [15834.675799]  hns_roce_free_cmd_mailbox+0x54/0x88 [hns_roce_hw_v2]
> [15834.681872]  hns_roce_v2_qp_modify.isra.57+0xcc/0x120 [hns_roce_hw_v2]
> [15834.688376]  hns_roce_v2_modify_qp+0x4d4/0x1ef8 [hns_roce_hw_v2]
> [15834.694362]  hns_roce_modify_qp+0x214/0x5a8 [hns_roce_hw_v2]
> [15834.699996]  _ib_modify_qp+0xf0/0x308
> [15834.703642]  ib_modify_qp+0x38/0x48
> [15834.707118]  rt_ktest_modify_qp+0x14c/0x998 [rdma_test]
> ...
> [15837.269216] Unable to handle kernel paging request at virtual address
> 000197c995a1d1b4
> ...
> [15837.480898] Call trace:
> [15837.483335]  __free_pages+0x28/0x78
> [15837.486807]  dma_direct_free_pages+0xa0/0xe8
> [15837.491058]  dma_direct_free+0x48/0x60
> [15837.494790]  dma_free_attrs+0xa4/0xe8
> [15837.498449]  hns_roce_buf_free+0xb0/0x150 [hns_roce_hw_v2]
> [15837.503918]  mtr_free_bufs.isra.1+0x88/0xc0 [hns_roce_hw_v2]
> [15837.509558]  hns_roce_mtr_destroy+0x60/0x80 [hns_roce_hw_v2]
> [15837.515198]  hns_roce_v2_cleanup_eq_table+0x1d0/0x2a0 [hns_roce_hw_v2]
> [15837.521701]  hns_roce_exit+0x108/0x1e0 [hns_roce_hw_v2]
> [15837.526908]  __hns_roce_hw_v2_uninit_instance.isra.75+0x70/0xb8 [hns_roce_hw_v2]
> [15837.534276]  hns_roce_hw_v2_uninit_instance+0x64/0x80 [hns_roce_hw_v2]
> [15837.540786]  hclge_uninit_client_instance+0xe8/0x1e8 [hclge]
> [15837.546419]  hnae3_uninit_client_instance+0xc4/0x118 [hnae3]
> [15837.552052]  hnae3_unregister_client+0x16c/0x1f0 [hnae3]
> [15837.557346]  hns_roce_hw_v2_exit+0x34/0x50 [hns_roce_hw_v2]
> [15837.562895]  __arm64_sys_delete_module+0x208/0x268
> [15837.567665]  el0_svc_common.constprop.4+0x110/0x200
> [15837.572520]  do_el0_svc+0x34/0x98
> [15837.575821]  el0_svc+0x14/0x40
> [15837.578862]  el0_sync_handler+0xb0/0x2d0
> [15837.582766]  el0_sync+0x140/0x180
> 
> It is caused by two concurrent processes:
> 	uninit_instance->dma_pool_destroy(cmdq)
> 	modify_qp->dma_pool_free(cmdq)

Something else has gone wrong in your system.

modify_qp is not allowed to be running after ib_unregister_device()
returns.

I see:

 [15834.707118]  rt_ktest_modify_qp+0x14c/0x998 [rdma_test]

Which suggests to me that your ULP is a test, and that the test is not
properly acting as an ib_client. When a client is unregistered, it must
close all RDMA objects and stop all activity before the client unregister
callback returns.

Jason
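
The rule stated in the reply above can be illustrated with a small
user-space model. This is only a sketch and is not code from the thread
or from the kernel: every name in it (model_client, model_qp,
model_client_add, model_client_remove, model_modify_qp,
model_live_objects) is hypothetical, and it is single-threaded, so it
does not model waiting for in-flight operations. It only shows the
ordering a client's remove callback is expected to follow: refuse new
activity first, then destroy every object it owns, all before returning.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_MAX_QPS 4

struct model_qp {
	bool alive;	/* false once the object has been destroyed */
	int state;	/* stand-in for the QP state */
};

struct model_client {
	bool registered;	/* cleared as soon as removal starts */
	struct model_qp qps[MODEL_MAX_QPS];
};

/* Stand-in for the client's add callback: create the objects it will use. */
void model_client_add(struct model_client *c)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_QPS; i++) {
		c->qps[i].alive = true;
		c->qps[i].state = 0;
	}
	c->registered = true;
}

/*
 * Stand-in for a modify_qp call issued by the client.
 * Returns 0 on success, -1 if the client may no longer touch the device.
 */
int model_modify_qp(struct model_client *c, size_t idx, int new_state)
{
	if (!c->registered || idx >= MODEL_MAX_QPS || !c->qps[idx].alive)
		return -1;
	c->qps[idx].state = new_state;
	return 0;
}

/*
 * Stand-in for the client's remove callback: stop accepting new work,
 * then destroy every object before returning to the caller.
 */
void model_client_remove(struct model_client *c)
{
	size_t i;

	c->registered = false;		/* step 1: no new activity */
	for (i = 0; i < MODEL_MAX_QPS; i++)
		c->qps[i].alive = false;	/* step 2: close all objects */
}

/* Number of objects the client still holds. */
size_t model_live_objects(const struct model_client *c)
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_MAX_QPS; i++)
		if (c->qps[i].alive)
			n++;
	return n;
}
```

In this model, once model_client_remove() has returned, the client holds
no live objects and any later model_modify_qp() call fails. A real
client would additionally have to wait for operations already in flight
to finish before its remove callback returns.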
