* [for-rc] RDMA/qedr: qedr crash while running rdma-tool.
@ 2021-08-21 7:43 Alok Prasad
2021-08-21 11:55 ` Leon Romanovsky
2021-08-22 2:45 ` kernel test robot
0 siblings, 2 replies; 6+ messages in thread
From: Alok Prasad @ 2021-08-21 7:43 UTC (permalink / raw)
To: jgg, dledford
Cc: michal.kalderon, ariel.elior, smalin, linux-rdma, Alok Prasad,
Ariel Elior
This patch fixes a crash caused by querying a QP.
This is due to the fact that when no traffic is running,
rdma_create_qp hasn't created any QP, hence qp->qed_qp is NULL.
The call trace below is generated when using the iproute2 utility
"rdma res show -dd qp" on an RDMA interface.
==========================================================================
[ 302.569794] BUG: kernel NULL pointer dereference, address: 0000000000000034
..
[ 302.570378] Hardware name: Dell Inc. PowerEdge R720/0M1GCR, BIOS 1.2.6 05/10/2012
[ 302.570500] RIP: 0010:qed_rdma_query_qp+0x33/0x1a0 [qed]
[ 302.570861] RSP: 0018:ffffba560a08f580 EFLAGS: 00010206
[ 302.570979] RAX: 0000000200000000 RBX: ffffba560a08f5b8 RCX: 0000000000000000
[ 302.571100] RDX: ffffba560a08f5b8 RSI: 0000000000000000 RDI: ffff9807ee458090
[ 302.571221] RBP: ffffba560a08f5a0 R08: 0000000000000000 R09: ffff9807890e7048
[ 302.571342] R10: ffffba560a08f658 R11: 0000000000000000 R12: 0000000000000000
[ 302.571462] R13: ffff9807ee458090 R14: ffff9807f0afb000 R15: ffffba560a08f7ec
[ 302.571583] FS: 00007fbbf8bfe740(0000) GS:ffff980aafa00000(0000) knlGS:0000000000000000
[ 302.571729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 302.571847] CR2: 0000000000000034 CR3: 00000001720ba001 CR4: 00000000000606f0
[ 302.571968] Call Trace:
[ 302.572083] qedr_query_qp+0x82/0x360 [qedr]
[ 302.572211] ib_query_qp+0x34/0x40 [ib_core]
[ 302.572361] ? ib_query_qp+0x34/0x40 [ib_core]
[ 302.572503] fill_res_qp_entry_query.isra.26+0x47/0x1d0 [ib_core]
[ 302.572670] ? __nla_put+0x20/0x30
[ 302.572788] ? nla_put+0x33/0x40
[ 302.572901] fill_res_qp_entry+0xe3/0x120 [ib_core]
[ 302.573058] res_get_common_dumpit+0x3f8/0x5d0 [ib_core]
[ 302.573213] ? fill_res_cm_id_entry+0x1f0/0x1f0 [ib_core]
[ 302.573377] nldev_res_get_qp_dumpit+0x1a/0x20 [ib_core]
[ 302.573529] netlink_dump+0x156/0x2f0
[ 302.573648] __netlink_dump_start+0x1ab/0x260
[ 302.573765] rdma_nl_rcv+0x1de/0x330 [ib_core]
[ 302.573918] ? nldev_res_get_cm_id_dumpit+0x20/0x20 [ib_core]
[ 302.574074] netlink_unicast+0x1b8/0x270
[ 302.574191] netlink_sendmsg+0x33e/0x470
[ 302.574307] sock_sendmsg+0x63/0x70
[ 302.574421] __sys_sendto+0x13f/0x180
[ 302.574536] ? setup_sgl.isra.12+0x70/0xc0
[ 302.574655] __x64_sys_sendto+0x28/0x30
[ 302.574769] do_syscall_64+0x3a/0xb0
[ 302.574884] entry_SYSCALL_64_after_hwframe+0x44/0xae
==========================================================================
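In the trace above, CR2 (the faulting address) is 0x34 and RSI, which carries the second argument on x86-64, is zero. That is consistent with qed_rdma_query_qp() reading a member at offset 0x34 through a NULL qp pointer. A minimal sketch of that arithmetic follows; struct fake_qp and fault_address() are hypothetical, and the real layout of the qed QP structure is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real qed structure: 13 32-bit words of
 * padding place the next member at byte offset 0x34.
 */
struct fake_qp {
	uint32_t pad[13];
	uint32_t cur_state;
};

/* Compile-time check that the illustrative member sits at 0x34. */
static_assert(offsetof(struct fake_qp, cur_state) == 0x34,
	      "cur_state is expected at offset 0x34 in this sketch");

/*
 * Address that a load of qp->cur_state would touch, computed without
 * dereferencing the pointer.
 */
static uintptr_t fault_address(const struct fake_qp *qp)
{
	return (uintptr_t)qp + offsetof(struct fake_qp, cur_state);
}
```

With a NULL qp the computed address is just the member offset, which is why a small non-zero address in CR2 usually points at a NULL structure pointer rather than a wild one.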
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
---
drivers/infiniband/hw/qedr/verbs.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index fdc47ef7d861..79603e3fe2db 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -2758,15 +2758,18 @@ int qedr_query_qp(struct ib_qp *ibqp,
int rc = 0;
memset(&params, 0, sizeof(params));
-
- rc = dev->ops->rdma_query_qp(dev->rdma_ctx, qp->qed_qp, &params);
- if (rc)
- goto err;
-
memset(qp_attr, 0, sizeof(*qp_attr));
memset(qp_init_attr, 0, sizeof(*qp_init_attr));
- qp_attr->qp_state = qedr_get_ibqp_state(params.state);
+ if (qp->qed_qp)
+ rc = dev->ops->rdma_query_qp(dev->rdma_ctx,
+ qp->qed_qp, &params);
+
+ if (qp->qp_type == IB_QPT_GSI)
+ qp_attr->qp_state = QED_ROCE_QP_STATE_RTS;
+ else
+ qp_attr->qp_state = qedr_get_ibqp_state(params.state);
+
qp_attr->cur_qp_state = qedr_get_ibqp_state(params.state);
qp_attr->path_mtu = ib_mtu_int_to_enum(params.mtu);
qp_attr->path_mig_state = IB_MIG_MIGRATED;
@@ -2810,8 +2813,6 @@ int qedr_query_qp(struct ib_qp *ibqp,
DP_DEBUG(dev, QEDR_MSG_QP, "QEDR_QUERY_QP: max_inline_data=%d\n",
qp_attr->cap.max_inline_data);
-
-err:
return rc;
}
--
2.17.1
* Re: [for-rc] RDMA/qedr: qedr crash while running rdma-tool.
2021-08-21 7:43 [for-rc] RDMA/qedr: qedr crash while running rdma-tool Alok Prasad
@ 2021-08-21 11:55 ` Leon Romanovsky
2021-08-24 6:19 ` Alok Prasad
2021-08-22 2:45 ` kernel test robot
1 sibling, 1 reply; 6+ messages in thread
From: Leon Romanovsky @ 2021-08-21 11:55 UTC (permalink / raw)
To: Alok Prasad
Cc: jgg, dledford, michal.kalderon, ariel.elior, smalin, linux-rdma,
Ariel Elior
On Sat, Aug 21, 2021 at 07:43:39AM +0000, Alok Prasad wrote:
> This patch fixes crash caused by querying qp.
> This is due the fact that when no traffic is running,
> rdma_create_qp hasn't created any qp hence qed->qp is null.
This description is not correct: in all QP creation flows,
dev->ops->rdma_create_qp() is called, and if qedr_create_qp() succeeds,
we will have a valid qp->qed_qp pointer.
>
> Below call trace is generated while using iproute2 utility
> "rdma res show -dd qp" on rdma interface.
>
> ==========================================================================
> [ 302.569794] BUG: kernel NULL pointer dereference, address: 0000000000000034
> ..
> [ 302.570378] Hardware name: Dell Inc. PowerEdge R720/0M1GCR, BIOS 1.2.6 05/10/2012
> [ 302.570500] RIP: 0010:qed_rdma_query_qp+0x33/0x1a0 [qed]
> [ 302.570861] RSP: 0018:ffffba560a08f580 EFLAGS: 00010206
> [ 302.570979] RAX: 0000000200000000 RBX: ffffba560a08f5b8 RCX: 0000000000000000
> [ 302.571100] RDX: ffffba560a08f5b8 RSI: 0000000000000000 RDI: ffff9807ee458090
> [ 302.571221] RBP: ffffba560a08f5a0 R08: 0000000000000000 R09: ffff9807890e7048
> [ 302.571342] R10: ffffba560a08f658 R11: 0000000000000000 R12: 0000000000000000
> [ 302.571462] R13: ffff9807ee458090 R14: ffff9807f0afb000 R15: ffffba560a08f7ec
> [ 302.571583] FS: 00007fbbf8bfe740(0000) GS:ffff980aafa00000(0000) knlGS:0000000000000000
> [ 302.571729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 302.571847] CR2: 0000000000000034 CR3: 00000001720ba001 CR4: 00000000000606f0
> [ 302.571968] Call Trace:
> [ 302.572083] qedr_query_qp+0x82/0x360 [qedr]
> [ 302.572211] ib_query_qp+0x34/0x40 [ib_core]
> [ 302.572361] ? ib_query_qp+0x34/0x40 [ib_core]
> [ 302.572503] fill_res_qp_entry_query.isra.26+0x47/0x1d0 [ib_core]
> [ 302.572670] ? __nla_put+0x20/0x30
> [ 302.572788] ? nla_put+0x33/0x40
> [ 302.572901] fill_res_qp_entry+0xe3/0x120 [ib_core]
> [ 302.573058] res_get_common_dumpit+0x3f8/0x5d0 [ib_core]
> [ 302.573213] ? fill_res_cm_id_entry+0x1f0/0x1f0 [ib_core]
> [ 302.573377] nldev_res_get_qp_dumpit+0x1a/0x20 [ib_core]
> [ 302.573529] netlink_dump+0x156/0x2f0
> [ 302.573648] __netlink_dump_start+0x1ab/0x260
> [ 302.573765] rdma_nl_rcv+0x1de/0x330 [ib_core]
> [ 302.573918] ? nldev_res_get_cm_id_dumpit+0x20/0x20 [ib_core]
> [ 302.574074] netlink_unicast+0x1b8/0x270
> [ 302.574191] netlink_sendmsg+0x33e/0x470
> [ 302.574307] sock_sendmsg+0x63/0x70
> [ 302.574421] __sys_sendto+0x13f/0x180
> [ 302.574536] ? setup_sgl.isra.12+0x70/0xc0
> [ 302.574655] __x64_sys_sendto+0x28/0x30
> [ 302.574769] do_syscall_64+0x3a/0xb0
> [ 302.574884] entry_SYSCALL_64_after_hwframe+0x44/0xae
> ==========================================================================
>
> Signed-off-by: Ariel Elior <aelior@marvell.com>
> Signed-off-by: Shai Malin <smalin@marvell.com>
> Signed-off-by: Alok Prasad <palok@marvell.com>
> ---
> drivers/infiniband/hw/qedr/verbs.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
> index fdc47ef7d861..79603e3fe2db 100644
> --- a/drivers/infiniband/hw/qedr/verbs.c
> +++ b/drivers/infiniband/hw/qedr/verbs.c
> @@ -2758,15 +2758,18 @@ int qedr_query_qp(struct ib_qp *ibqp,
> int rc = 0;
>
> memset(&params, 0, sizeof(params));
> -
> - rc = dev->ops->rdma_query_qp(dev->rdma_ctx, qp->qed_qp, &params);
> - if (rc)
> - goto err;
> -
At that point, QP should be valid.
> memset(qp_attr, 0, sizeof(*qp_attr));
> memset(qp_init_attr, 0, sizeof(*qp_init_attr));
>
> - qp_attr->qp_state = qedr_get_ibqp_state(params.state);
> + if (qp->qed_qp)
> + rc = dev->ops->rdma_query_qp(dev->rdma_ctx,
> + qp->qed_qp, &params);
> +
> + if (qp->qp_type == IB_QPT_GSI)
> + qp_attr->qp_state = QED_ROCE_QP_STATE_RTS;
> + else
> + qp_attr->qp_state = qedr_get_ibqp_state(params.state);
> +
> qp_attr->cur_qp_state = qedr_get_ibqp_state(params.state);
> qp_attr->path_mtu = ib_mtu_int_to_enum(params.mtu);
> qp_attr->path_mig_state = IB_MIG_MIGRATED;
> @@ -2810,8 +2813,6 @@ int qedr_query_qp(struct ib_qp *ibqp,
>
> DP_DEBUG(dev, QEDR_MSG_QP, "QEDR_QUERY_QP: max_inline_data=%d\n",
> qp_attr->cap.max_inline_data);
> -
> -err:
> return rc;
> }
>
> --
> 2.17.1
>
* Re: [for-rc] RDMA/qedr: qedr crash while running rdma-tool.
2021-08-21 7:43 [for-rc] RDMA/qedr: qedr crash while running rdma-tool Alok Prasad
2021-08-21 11:55 ` Leon Romanovsky
@ 2021-08-22 2:45 ` kernel test robot
1 sibling, 0 replies; 6+ messages in thread
From: kernel test robot @ 2021-08-22 2:45 UTC (permalink / raw)
To: Alok Prasad, jgg, dledford
Cc: kbuild-all, michal.kalderon, ariel.elior, smalin, linux-rdma,
Alok Prasad, Ariel Elior
[-- Attachment #1: Type: text/plain, Size: 4811 bytes --]
Hi Alok,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on rdma/for-next]
[also build test WARNING on v5.14-rc6 next-20210820]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Alok-Prasad/RDMA-qedr-qedr-crash-while-running-rdma-tool/20210821-154459
base: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/f9b6462f18a87caead9b362d4cdd049504ac3c62
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Alok-Prasad/RDMA-qedr-qedr-crash-while-running-rdma-tool/20210821-154459
git checkout f9b6462f18a87caead9b362d4cdd049504ac3c62
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=powerpc
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
drivers/infiniband/hw/qedr/verbs.c: In function 'qedr_query_qp':
>> drivers/infiniband/hw/qedr/verbs.c:2754:35: warning: implicit conversion from 'enum qed_roce_qp_state' to 'enum ib_qp_state' [-Wenum-conversion]
2754 | qp_attr->qp_state = QED_ROCE_QP_STATE_RTS;
| ^
vim +2754 drivers/infiniband/hw/qedr/verbs.c
2735
2736 int qedr_query_qp(struct ib_qp *ibqp,
2737 struct ib_qp_attr *qp_attr,
2738 int attr_mask, struct ib_qp_init_attr *qp_init_attr)
2739 {
2740 struct qed_rdma_query_qp_out_params params;
2741 struct qedr_qp *qp = get_qedr_qp(ibqp);
2742 struct qedr_dev *dev = qp->dev;
2743 int rc = 0;
2744
2745 memset(&params, 0, sizeof(params));
2746 memset(qp_attr, 0, sizeof(*qp_attr));
2747 memset(qp_init_attr, 0, sizeof(*qp_init_attr));
2748
2749 if (qp->qed_qp)
2750 rc = dev->ops->rdma_query_qp(dev->rdma_ctx,
2751 qp->qed_qp, &params);
2752
2753 if (qp->qp_type == IB_QPT_GSI)
> 2754 qp_attr->qp_state = QED_ROCE_QP_STATE_RTS;
2755 else
2756 qp_attr->qp_state = qedr_get_ibqp_state(params.state);
2757
2758 qp_attr->cur_qp_state = qedr_get_ibqp_state(params.state);
2759 qp_attr->path_mtu = ib_mtu_int_to_enum(params.mtu);
2760 qp_attr->path_mig_state = IB_MIG_MIGRATED;
2761 qp_attr->rq_psn = params.rq_psn;
2762 qp_attr->sq_psn = params.sq_psn;
2763 qp_attr->dest_qp_num = params.dest_qp;
2764
2765 qp_attr->qp_access_flags = qedr_to_ib_qp_acc_flags(&params);
2766
2767 qp_attr->cap.max_send_wr = qp->sq.max_wr;
2768 qp_attr->cap.max_recv_wr = qp->rq.max_wr;
2769 qp_attr->cap.max_send_sge = qp->sq.max_sges;
2770 qp_attr->cap.max_recv_sge = qp->rq.max_sges;
2771 qp_attr->cap.max_inline_data = dev->attr.max_inline;
2772 qp_init_attr->cap = qp_attr->cap;
2773
2774 qp_attr->ah_attr.type = RDMA_AH_ATTR_TYPE_ROCE;
2775 rdma_ah_set_grh(&qp_attr->ah_attr, NULL,
2776 params.flow_label, qp->sgid_idx,
2777 params.hop_limit_ttl, params.traffic_class_tos);
2778 rdma_ah_set_dgid_raw(&qp_attr->ah_attr, &params.dgid.bytes[0]);
2779 rdma_ah_set_port_num(&qp_attr->ah_attr, 1);
2780 rdma_ah_set_sl(&qp_attr->ah_attr, 0);
2781 qp_attr->timeout = params.timeout;
2782 qp_attr->rnr_retry = params.rnr_retry;
2783 qp_attr->retry_cnt = params.retry_cnt;
2784 qp_attr->min_rnr_timer = params.min_rnr_nak_timer;
2785 qp_attr->pkey_index = params.pkey_index;
2786 qp_attr->port_num = 1;
2787 rdma_ah_set_path_bits(&qp_attr->ah_attr, 0);
2788 rdma_ah_set_static_rate(&qp_attr->ah_attr, 0);
2789 qp_attr->alt_pkey_index = 0;
2790 qp_attr->alt_port_num = 0;
2791 qp_attr->alt_timeout = 0;
2792 memset(&qp_attr->alt_ah_attr, 0, sizeof(qp_attr->alt_ah_attr));
2793
2794 qp_attr->sq_draining = (params.state == QED_ROCE_QP_STATE_SQD) ? 1 : 0;
2795 qp_attr->max_dest_rd_atomic = params.max_dest_rd_atomic;
2796 qp_attr->max_rd_atomic = params.max_rd_atomic;
2797 qp_attr->en_sqd_async_notify = (params.sqd_async) ? 1 : 0;
2798
2799 DP_DEBUG(dev, QEDR_MSG_QP, "QEDR_QUERY_QP: max_inline_data=%d\n",
2800 qp_attr->cap.max_inline_data);
2801 return rc;
2802 }
2803
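The warning above comes from storing a qed-side constant in a field typed as enum ib_qp_state. One way to silence it, assuming the intent is to report RTS for the GSI QP, is to assign the IB-side constant IB_QPS_RTS instead. The sketch below uses stand-in enum definitions with made-up values, plus hypothetical helpers (demo_get_ibqp_state, demo_reported_state); it is not the driver's code.

```c
#include <assert.h>

/*
 * Stand-in definitions for illustration only. The numeric values are
 * made up and deliberately differ between the two enums.
 */
enum ib_qp_state {
	IB_QPS_RESET,
	IB_QPS_INIT,
	IB_QPS_RTR,
	IB_QPS_RTS,
	IB_QPS_ERR
};

enum qed_roce_qp_state {
	QED_ROCE_QP_STATE_RESET = 10,
	QED_ROCE_QP_STATE_INIT,
	QED_ROCE_QP_STATE_RTR,
	QED_ROCE_QP_STATE_RTS
};

enum demo_qp_type { DEMO_QPT_RC, DEMO_QPT_GSI };

/* In this sketch the two RTS constants are not interchangeable. */
static_assert((int)IB_QPS_RTS != (int)QED_ROCE_QP_STATE_RTS,
	      "the enums use different numbering in this sketch");

/* Hypothetical translator playing the role of qedr_get_ibqp_state(). */
static enum ib_qp_state demo_get_ibqp_state(enum qed_roce_qp_state s)
{
	switch (s) {
	case QED_ROCE_QP_STATE_RESET:
		return IB_QPS_RESET;
	case QED_ROCE_QP_STATE_INIT:
		return IB_QPS_INIT;
	case QED_ROCE_QP_STATE_RTR:
		return IB_QPS_RTR;
	case QED_ROCE_QP_STATE_RTS:
		return IB_QPS_RTS;
	}
	return IB_QPS_ERR;
}

/* State to report: a GSI QP gets the IB constant directly. */
static enum ib_qp_state demo_reported_state(enum demo_qp_type type,
					    enum qed_roce_qp_state hw_state)
{
	if (type == DEMO_QPT_GSI)
		return IB_QPS_RTS;	/* same enum type, no conversion */
	return demo_get_ibqp_state(hw_state);
}
```

Assigning the qed constant directly only gives the right answer when the two enums happen to share numeric values, which is what -Wenum-conversion is guarding against.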
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 73335 bytes --]
* RE: [for-rc] RDMA/qedr: qedr crash while running rdma-tool.
2021-08-21 11:55 ` Leon Romanovsky
@ 2021-08-24 6:19 ` Alok Prasad
2021-10-22 15:49 ` Kamal Heib
0 siblings, 1 reply; 6+ messages in thread
From: Alok Prasad @ 2021-08-24 6:19 UTC (permalink / raw)
To: Leon Romanovsky
Cc: jgg, dledford, Michal Kalderon, Ariel Elior, Shai Malin,
linux-rdma, Ariel Elior
Hi Leon,
> On Sat, Aug 21, 2021 at 07:43:39AM +0000, Alok Prasad wrote:
> > This patch fixes crash caused by querying qp.
> > This is due the fact that when no traffic is running,
> > rdma_create_qp hasn't created any qp hence qed->qp is null.
>
> This description is not correct, all QP creation flows
> dev->ops->rdma_create_qp() is called and if qedr_create_qp() successes,
> we will have valid qp->qed_qp pointer.
>
In qedr_create_qp(), the first QP we create is the GSI QP, and the
function returns immediately after creating gsi_qp. Neither
qedr_create_user_qp() nor qedr_create_kernel_qp() is called, both of
which would in turn have called dev->ops->rdma_create_qp(),
hence qp->qed_qp is NULL here.
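The flow described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the driver: struct demo_qp, demo_create_qp() and demo_query_qp() are hypothetical stand-ins, and demo_hw_qp stands in for whatever dev->ops->rdma_create_qp() would return.

```c
#include <assert.h>
#include <stddef.h>

enum demo_qp_type { DEMO_QPT_RC, DEMO_QPT_GSI };

/* Hypothetical model of the driver QP; qed_qp mirrors qp->qed_qp. */
struct demo_qp {
	enum demo_qp_type qp_type;
	void *qed_qp;
};

/* Stands in for the lower-layer object created for non-GSI QPs. */
static int demo_hw_qp;

/* GSI creation returns early, so qed_qp is never filled in. */
static void demo_create_qp(struct demo_qp *qp, enum demo_qp_type type)
{
	assert(qp != NULL);
	qp->qp_type = type;
	qp->qed_qp = NULL;

	if (type == DEMO_QPT_GSI)
		return;			/* no lower-layer create call */

	qp->qed_qp = &demo_hw_qp;	/* user/kernel QP path */
}

/*
 * Returns 1 if the lower layer would be queried, 0 if the query is
 * skipped because there is no lower-layer QP to ask.
 */
static int demo_query_qp(const struct demo_qp *qp)
{
	assert(qp != NULL);
	if (!qp->qed_qp)
		return 0;
	return 1;
}
```

In this model a query on the GSI QP skips the lower-layer call instead of passing NULL down, which is the guard the patch adds with the qp->qed_qp check.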
Anyway, I will send a v2, as the kernel test robot reported an
enum warning.
> >
> > Below call trace is generated while using iproute2 utility
> > "rdma res show -dd qp" on rdma interface.
> >
> > ==========================================================================
> > [ 302.569794] BUG: kernel NULL pointer dereference, address: 0000000000000034
> > ..
> > [ 302.570378] Hardware name: Dell Inc. PowerEdge R720/0M1GCR, BIOS 1.2.6 05/10/2012
> > [ 302.570500] RIP: 0010:qed_rdma_query_qp+0x33/0x1a0 [qed]
> > [ 302.570861] RSP: 0018:ffffba560a08f580 EFLAGS: 00010206
> > [ 302.570979] RAX: 0000000200000000 RBX: ffffba560a08f5b8 RCX: 0000000000000000
> > [ 302.571100] RDX: ffffba560a08f5b8 RSI: 0000000000000000 RDI: ffff9807ee458090
> > [ 302.571221] RBP: ffffba560a08f5a0 R08: 0000000000000000 R09: ffff9807890e7048
> > [ 302.571342] R10: ffffba560a08f658 R11: 0000000000000000 R12: 0000000000000000
> > [ 302.571462] R13: ffff9807ee458090 R14: ffff9807f0afb000 R15: ffffba560a08f7ec
> > [ 302.571583] FS: 00007fbbf8bfe740(0000) GS:ffff980aafa00000(0000) knlGS:0000000000000000
> > [ 302.571729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 302.571847] CR2: 0000000000000034 CR3: 00000001720ba001 CR4: 00000000000606f0
> > [ 302.571968] Call Trace:
> > [ 302.572083] qedr_query_qp+0x82/0x360 [qedr]
> > [ 302.572211] ib_query_qp+0x34/0x40 [ib_core]
> > [ 302.572361] ? ib_query_qp+0x34/0x40 [ib_core]
> > [ 302.572503] fill_res_qp_entry_query.isra.26+0x47/0x1d0 [ib_core]
> > [ 302.572670] ? __nla_put+0x20/0x30
> > [ 302.572788] ? nla_put+0x33/0x40
> > [ 302.572901] fill_res_qp_entry+0xe3/0x120 [ib_core]
> > [ 302.573058] res_get_common_dumpit+0x3f8/0x5d0 [ib_core]
> > [ 302.573213] ? fill_res_cm_id_entry+0x1f0/0x1f0 [ib_core]
> > [ 302.573377] nldev_res_get_qp_dumpit+0x1a/0x20 [ib_core]
> > [ 302.573529] netlink_dump+0x156/0x2f0
> > [ 302.573648] __netlink_dump_start+0x1ab/0x260
> > [ 302.573765] rdma_nl_rcv+0x1de/0x330 [ib_core]
> > [ 302.573918] ? nldev_res_get_cm_id_dumpit+0x20/0x20 [ib_core]
> > [ 302.574074] netlink_unicast+0x1b8/0x270
> > [ 302.574191] netlink_sendmsg+0x33e/0x470
> > [ 302.574307] sock_sendmsg+0x63/0x70
> > [ 302.574421] __sys_sendto+0x13f/0x180
> > [ 302.574536] ? setup_sgl.isra.12+0x70/0xc0
> > [ 302.574655] __x64_sys_sendto+0x28/0x30
> > [ 302.574769] do_syscall_64+0x3a/0xb0
> > [ 302.574884] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > ==========================================================================
> >
> > Signed-off-by: Ariel Elior <aelior@marvell.com>
> > Signed-off-by: Shai Malin <smalin@marvell.com>
> > Signed-off-by: Alok Prasad <palok@marvell.com>
> > ---
> > drivers/infiniband/hw/qedr/verbs.c | 17 +++++++++--------
> > 1 file changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
> > index fdc47ef7d861..79603e3fe2db 100644
> > --- a/drivers/infiniband/hw/qedr/verbs.c
> > +++ b/drivers/infiniband/hw/qedr/verbs.c
> > @@ -2758,15 +2758,18 @@ int qedr_query_qp(struct ib_qp *ibqp,
> > int rc = 0;
> >
> > memset(&params, 0, sizeof(params));
> > -
> > - rc = dev->ops->rdma_query_qp(dev->rdma_ctx, qp->qed_qp, &params);
> > - if (rc)
> > - goto err;
> > -
>
> At that point, QP should be valid.
>
> > memset(qp_attr, 0, sizeof(*qp_attr));
> > memset(qp_init_attr, 0, sizeof(*qp_init_attr));
> >
> > - qp_attr->qp_state = qedr_get_ibqp_state(params.state);
> > + if (qp->qed_qp)
> > + rc = dev->ops->rdma_query_qp(dev->rdma_ctx,
> > + qp->qed_qp, &params);
> > +
> > + if (qp->qp_type == IB_QPT_GSI)
> > + qp_attr->qp_state = QED_ROCE_QP_STATE_RTS;
> > + else
> > + qp_attr->qp_state = qedr_get_ibqp_state(params.state);
> > +
> > qp_attr->cur_qp_state = qedr_get_ibqp_state(params.state);
> > qp_attr->path_mtu = ib_mtu_int_to_enum(params.mtu);
> > qp_attr->path_mig_state = IB_MIG_MIGRATED;
> > @@ -2810,8 +2813,6 @@ int qedr_query_qp(struct ib_qp *ibqp,
> >
> > DP_DEBUG(dev, QEDR_MSG_QP, "QEDR_QUERY_QP: max_inline_data=%d\n",
> > qp_attr->cap.max_inline_data);
> > -
> > -err:
> > return rc;
> > }
> >
> > --
> > 2.17.1
> >
* Re: [for-rc] RDMA/qedr: qedr crash while running rdma-tool.
2021-08-24 6:19 ` Alok Prasad
@ 2021-10-22 15:49 ` Kamal Heib
2021-10-23 16:48 ` [EXT] " Alok Prasad
0 siblings, 1 reply; 6+ messages in thread
From: Kamal Heib @ 2021-10-22 15:49 UTC (permalink / raw)
To: Alok Prasad
Cc: jgg, dledford, Michal Kalderon, Ariel Elior, Shai Malin,
linux-rdma, Leon Romanovsky
On 8/24/21 09:19, Alok Prasad wrote:
> Hi Leon,
>
>> On Sat, Aug 21, 2021 at 07:43:39AM +0000, Alok Prasad wrote:
>>> This patch fixes crash caused by querying qp.
>>> This is due the fact that when no traffic is running,
>>> rdma_create_qp hasn't created any qp hence qed->qp is null.
>>
>> This description is not correct, all QP creation flows
>> dev->ops->rdma_create_qp() is called and if qedr_create_qp() successes,
>> we will have valid qp->qed_qp pointer.
>>
>
> In qedr_create_qp(), first qp we create is GSI QP
> and it immediately returns after creating gsi_qp, and none of function
> either qedr_create_user_qp() nor qedr_create_kernel_qp() is
> called, both of them would have in turned called dev->ops->rdma_create_qp(),
> hence qp->qed_qp is null here.
>
> Anyway will send a v2 as kernel test robot reported one
> Enum Warning.
Hi Alok,
Could you please tell when you plan to send a v2 for this patch?
We need this patch to get accepted in order to fix the distribution
version of the qedr driver.
Thanks,
Kamal
>
>>>
>>> Below call trace is generated while using iproute2 utility
>>> "rdma res show -dd qp" on rdma interface.
>>>
>>> ==========================================================================
>>> [ 302.569794] BUG: kernel NULL pointer dereference, address: 0000000000000034
>>> ..
>>> [ 302.570378] Hardware name: Dell Inc. PowerEdge R720/0M1GCR, BIOS 1.2.6 05/10/2012
>>> [ 302.570500] RIP: 0010:qed_rdma_query_qp+0x33/0x1a0 [qed]
>>> [ 302.570861] RSP: 0018:ffffba560a08f580 EFLAGS: 00010206
>>> [ 302.570979] RAX: 0000000200000000 RBX: ffffba560a08f5b8 RCX: 0000000000000000
>>> [ 302.571100] RDX: ffffba560a08f5b8 RSI: 0000000000000000 RDI: ffff9807ee458090
>>> [ 302.571221] RBP: ffffba560a08f5a0 R08: 0000000000000000 R09: ffff9807890e7048
>>> [ 302.571342] R10: ffffba560a08f658 R11: 0000000000000000 R12: 0000000000000000
>>> [ 302.571462] R13: ffff9807ee458090 R14: ffff9807f0afb000 R15: ffffba560a08f7ec
>>> [ 302.571583] FS: 00007fbbf8bfe740(0000) GS:ffff980aafa00000(0000) knlGS:0000000000000000
>>> [ 302.571729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 302.571847] CR2: 0000000000000034 CR3: 00000001720ba001 CR4: 00000000000606f0
>>> [ 302.571968] Call Trace:
>>> [ 302.572083] qedr_query_qp+0x82/0x360 [qedr]
>>> [ 302.572211] ib_query_qp+0x34/0x40 [ib_core]
>>> [ 302.572361] ? ib_query_qp+0x34/0x40 [ib_core]
>>> [ 302.572503] fill_res_qp_entry_query.isra.26+0x47/0x1d0 [ib_core]
>>> [ 302.572670] ? __nla_put+0x20/0x30
>>> [ 302.572788] ? nla_put+0x33/0x40
>>> [ 302.572901] fill_res_qp_entry+0xe3/0x120 [ib_core]
>>> [ 302.573058] res_get_common_dumpit+0x3f8/0x5d0 [ib_core]
>>> [ 302.573213] ? fill_res_cm_id_entry+0x1f0/0x1f0 [ib_core]
>>> [ 302.573377] nldev_res_get_qp_dumpit+0x1a/0x20 [ib_core]
>>> [ 302.573529] netlink_dump+0x156/0x2f0
>>> [ 302.573648] __netlink_dump_start+0x1ab/0x260
>>> [ 302.573765] rdma_nl_rcv+0x1de/0x330 [ib_core]
>>> [ 302.573918] ? nldev_res_get_cm_id_dumpit+0x20/0x20 [ib_core]
>>> [ 302.574074] netlink_unicast+0x1b8/0x270
>>> [ 302.574191] netlink_sendmsg+0x33e/0x470
>>> [ 302.574307] sock_sendmsg+0x63/0x70
>>> [ 302.574421] __sys_sendto+0x13f/0x180
>>> [ 302.574536] ? setup_sgl.isra.12+0x70/0xc0
>>> [ 302.574655] __x64_sys_sendto+0x28/0x30
>>> [ 302.574769] do_syscall_64+0x3a/0xb0
>>> [ 302.574884] entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> ==========================================================================
>>>
>>> Signed-off-by: Ariel Elior <aelior@marvell.com>
>>> Signed-off-by: Shai Malin <smalin@marvell.com>
>>> Signed-off-by: Alok Prasad <palok@marvell.com>
>>> ---
>>> drivers/infiniband/hw/qedr/verbs.c | 17 +++++++++--------
>>> 1 file changed, 9 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
>>> index fdc47ef7d861..79603e3fe2db 100644
>>> --- a/drivers/infiniband/hw/qedr/verbs.c
>>> +++ b/drivers/infiniband/hw/qedr/verbs.c
>>> @@ -2758,15 +2758,18 @@ int qedr_query_qp(struct ib_qp *ibqp,
>>> int rc = 0;
>>>
>>> memset(&params, 0, sizeof(params));
>>> -
>>> - rc = dev->ops->rdma_query_qp(dev->rdma_ctx, qp->qed_qp, &params);
>>> - if (rc)
>>> - goto err;
>>> -
>>
>> At that point, QP should be valid.
>>
>>> memset(qp_attr, 0, sizeof(*qp_attr));
>>> memset(qp_init_attr, 0, sizeof(*qp_init_attr));
>>>
>>> - qp_attr->qp_state = qedr_get_ibqp_state(params.state);
>>> + if (qp->qed_qp)
>>> + rc = dev->ops->rdma_query_qp(dev->rdma_ctx,
>>> + qp->qed_qp, &params);
>>> +
>>> + if (qp->qp_type == IB_QPT_GSI)
>>> + qp_attr->qp_state = QED_ROCE_QP_STATE_RTS;
>>> + else
>>> + qp_attr->qp_state = qedr_get_ibqp_state(params.state);
>>> +
>>> qp_attr->cur_qp_state = qedr_get_ibqp_state(params.state);
>>> qp_attr->path_mtu = ib_mtu_int_to_enum(params.mtu);
>>> qp_attr->path_mig_state = IB_MIG_MIGRATED;
>>> @@ -2810,8 +2813,6 @@ int qedr_query_qp(struct ib_qp *ibqp,
>>>
>>> DP_DEBUG(dev, QEDR_MSG_QP, "QEDR_QUERY_QP: max_inline_data=%d\n",
>>> qp_attr->cap.max_inline_data);
>>> -
>>> -err:
>>> return rc;
>>> }
>>>
>>> --
>>> 2.17.1
>>>
>
* RE: [EXT] Re: [for-rc] RDMA/qedr: qedr crash while running rdma-tool.
2021-10-22 15:49 ` Kamal Heib
@ 2021-10-23 16:48 ` Alok Prasad
0 siblings, 0 replies; 6+ messages in thread
From: Alok Prasad @ 2021-10-23 16:48 UTC (permalink / raw)
To: Kamal Heib
Cc: jgg, dledford, Michal Kalderon, Ariel Elior, Shai Malin,
linux-rdma, Leon Romanovsky
> -----Original Message-----
> From: Kamal Heib <kheib@redhat.com>
> Sent: 22 October 2021 21:20
> To: Alok Prasad <palok@marvell.com>
> Cc: jgg@ziepe.ca; dledford@redhat.com; Michal Kalderon <mkalderon@marvell.com>; Ariel
> Elior <aelior@marvell.com>; Shai Malin <smalin@marvell.com>; linux-rdma@vger.kernel.org;
> Leon Romanovsky <leon@kernel.org>
> Subject: [EXT] Re: [for-rc] RDMA/qedr: qedr crash while running rdma-tool.
>
> External Email
>
> ----------------------------------------------------------------------
>
>
> On 8/24/21 09:19, Alok Prasad wrote:
> > Hi Leon,
> >
> >> On Sat, Aug 21, 2021 at 07:43:39AM +0000, Alok Prasad wrote:
> >>> This patch fixes crash caused by querying qp.
> >>> This is due the fact that when no traffic is running,
> >>> rdma_create_qp hasn't created any qp hence qed->qp is null.
> >>
> >> This description is not correct, all QP creation flows
> >> dev->ops->rdma_create_qp() is called and if qedr_create_qp() successes,
> >> we will have valid qp->qed_qp pointer.
> >>
> >
> > In qedr_create_qp(), first qp we create is GSI QP
> > and it immediately returns after creating gsi_qp, and none of function
> > either qedr_create_user_qp() nor qedr_create_kernel_qp() is
> > called, both of them would have in turned called dev->ops->rdma_create_qp(),
> > hence qp->qed_qp is null here.
> >
> > Anyway will send a v2 as kernel test robot reported one
> > Enum Warning.
>
> Hi Alok,
>
> Could you please tell when you plan to send a v2 for this patch?
>
> We need this patch to get accepted in order to fix the distribution
> version of the qedr driver.
>
> Thanks,
> Kamal
Just sent! Thanks for the reminder.
Regards,
Alok
> >
> >>>
> >>> Below call trace is generated while using iproute2 utility
> >>> "rdma res show -dd qp" on rdma interface.
> >>>
> >>> ==========================================================================
> >>> [ 302.569794] BUG: kernel NULL pointer dereference, address: 0000000000000034
> >>> ..
> >>> [ 302.570378] Hardware name: Dell Inc. PowerEdge R720/0M1GCR, BIOS 1.2.6 05/10/2012
> >>> [ 302.570500] RIP: 0010:qed_rdma_query_qp+0x33/0x1a0 [qed]
> >>> [ 302.570861] RSP: 0018:ffffba560a08f580 EFLAGS: 00010206
> >>> [ 302.570979] RAX: 0000000200000000 RBX: ffffba560a08f5b8 RCX: 0000000000000000
> >>> [ 302.571100] RDX: ffffba560a08f5b8 RSI: 0000000000000000 RDI: ffff9807ee458090
> >>> [ 302.571221] RBP: ffffba560a08f5a0 R08: 0000000000000000 R09: ffff9807890e7048
> >>> [ 302.571342] R10: ffffba560a08f658 R11: 0000000000000000 R12: 0000000000000000
> >>> [ 302.571462] R13: ffff9807ee458090 R14: ffff9807f0afb000 R15: ffffba560a08f7ec
> >>> [ 302.571583] FS: 00007fbbf8bfe740(0000) GS:ffff980aafa00000(0000) knlGS:0000000000000000
> >>> [ 302.571729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> [ 302.571847] CR2: 0000000000000034 CR3: 00000001720ba001 CR4: 00000000000606f0
> >>> [ 302.571968] Call Trace:
> >>> [ 302.572083] qedr_query_qp+0x82/0x360 [qedr]
> >>> [ 302.572211] ib_query_qp+0x34/0x40 [ib_core]
> >>> [ 302.572361] ? ib_query_qp+0x34/0x40 [ib_core]
> >>> [ 302.572503] fill_res_qp_entry_query.isra.26+0x47/0x1d0 [ib_core]
> >>> [ 302.572670] ? __nla_put+0x20/0x30
> >>> [ 302.572788] ? nla_put+0x33/0x40
> >>> [ 302.572901] fill_res_qp_entry+0xe3/0x120 [ib_core]
> >>> [ 302.573058] res_get_common_dumpit+0x3f8/0x5d0 [ib_core]
> >>> [ 302.573213] ? fill_res_cm_id_entry+0x1f0/0x1f0 [ib_core]
> >>> [ 302.573377] nldev_res_get_qp_dumpit+0x1a/0x20 [ib_core]
> >>> [ 302.573529] netlink_dump+0x156/0x2f0
> >>> [ 302.573648] __netlink_dump_start+0x1ab/0x260
> >>> [ 302.573765] rdma_nl_rcv+0x1de/0x330 [ib_core]
> >>> [ 302.573918] ? nldev_res_get_cm_id_dumpit+0x20/0x20 [ib_core]
> >>> [ 302.574074] netlink_unicast+0x1b8/0x270
> >>> [ 302.574191] netlink_sendmsg+0x33e/0x470
> >>> [ 302.574307] sock_sendmsg+0x63/0x70
> >>> [ 302.574421] __sys_sendto+0x13f/0x180
> >>> [ 302.574536] ? setup_sgl.isra.12+0x70/0xc0
> >>> [ 302.574655] __x64_sys_sendto+0x28/0x30
> >>> [ 302.574769] do_syscall_64+0x3a/0xb0
> >>> [ 302.574884] entry_SYSCALL_64_after_hwframe+0x44/0xae
> >>> ==========================================================================
> >>>
> >>> Signed-off-by: Ariel Elior <aelior@marvell.com>
> >>> Signed-off-by: Shai Malin <smalin@marvell.com>
> >>> Signed-off-by: Alok Prasad <palok@marvell.com>
> >>> ---
> >>> drivers/infiniband/hw/qedr/verbs.c | 17 +++++++++--------
> >>> 1 file changed, 9 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
> >>> index fdc47ef7d861..79603e3fe2db 100644
> >>> --- a/drivers/infiniband/hw/qedr/verbs.c
> >>> +++ b/drivers/infiniband/hw/qedr/verbs.c
> >>> @@ -2758,15 +2758,18 @@ int qedr_query_qp(struct ib_qp *ibqp,
> >>> int rc = 0;
> >>>
> >>> memset(&params, 0, sizeof(params));
> >>> -
> >>> - rc = dev->ops->rdma_query_qp(dev->rdma_ctx, qp->qed_qp, &params);
> >>> - if (rc)
> >>> - goto err;
> >>> -
> >>
> >> At that point, QP should be valid.
> >>
> >>> memset(qp_attr, 0, sizeof(*qp_attr));
> >>> memset(qp_init_attr, 0, sizeof(*qp_init_attr));
> >>>
> >>> - qp_attr->qp_state = qedr_get_ibqp_state(params.state);
> >>> + if (qp->qed_qp)
> >>> + rc = dev->ops->rdma_query_qp(dev->rdma_ctx,
> >>> + qp->qed_qp, &params);
> >>> +
> >>> + if (qp->qp_type == IB_QPT_GSI)
> >>> + qp_attr->qp_state = QED_ROCE_QP_STATE_RTS;
> >>> + else
> >>> + qp_attr->qp_state = qedr_get_ibqp_state(params.state);
> >>> +
> >>> qp_attr->cur_qp_state = qedr_get_ibqp_state(params.state);
> >>> qp_attr->path_mtu = ib_mtu_int_to_enum(params.mtu);
> >>> qp_attr->path_mig_state = IB_MIG_MIGRATED;
> >>> @@ -2810,8 +2813,6 @@ int qedr_query_qp(struct ib_qp *ibqp,
> >>>
> >>> DP_DEBUG(dev, QEDR_MSG_QP, "QEDR_QUERY_QP: max_inline_data=%d\n",
> >>> qp_attr->cap.max_inline_data);
> >>> -
> >>> -err:
> >>> return rc;
> >>> }
> >>>
> >>> --
> >>> 2.17.1
> >>>
> >