linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 for-rc 0/5] Fix sge_num bug and add cqe inline, refactor rq inline
@ 2022-10-26  9:50 Haoyue Xu
  2022-10-26  9:50 ` [PATCH v2 for-rc 1/5] RDMA/hns: Fix ext_sge num error when post send Haoyue Xu
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Haoyue Xu @ 2022-10-26  9:50 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, xuhaoyue1

From: Luoyouming <luoyouming@huawei.com>

The patchset mainly fixes and refactors the code relate
to inline features and supports cqe inline in user space.
1.#1-2: Fix the problem of sge
2.#3-4: Remove enable rq inline in kernel and add compatibility handling
3.#5: Support CQE inline in user space

The user space part is named "libhns: Support cqe inline,
handle rq inline compatibility, fix sge_num bug".

Luoyouming (5):
  RDMA/hns: Fix ext_sge num error when post send
  RDMA/hns: Fix the problem of sge nums
  RDMA/hns: Remove enable rq inline in kernel and add compatibility
    handling
  RDMA/hns: Support cqe inline in user space
  RDMA/hns: Remove rq inline in kernel

 drivers/infiniband/hw/hns/hns_roce_device.h |  26 +--
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  93 ++++------
 drivers/infiniband/hw/hns/hns_roce_hw_v2.h  |   3 +-
 drivers/infiniband/hw/hns/hns_roce_main.c   |  28 ++-
 drivers/infiniband/hw/hns/hns_roce_qp.c     | 180 +++++++++++---------
 include/uapi/rdma/hns-abi.h                 |  21 +++
 6 files changed, 186 insertions(+), 165 deletions(-)

-- 
2.30.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 for-rc 1/5] RDMA/hns: Fix ext_sge num error when post send
  2022-10-26  9:50 [PATCH v2 for-rc 0/5] Fix sge_num bug and add cqe inline, refactor rq inline Haoyue Xu
@ 2022-10-26  9:50 ` Haoyue Xu
  2022-10-26  9:50 ` [PATCH v2 for-rc 2/5] RDMA/hns: Fix the problem of sge nums Haoyue Xu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Haoyue Xu @ 2022-10-26  9:50 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, xuhaoyue1

From: Luoyouming <luoyouming@huawei.com>

The max_gs is the sum of extended sge and standard sge. In function
fill_ext_sge_inl_data, max_gs does not subtract the number of extended
sges, but is directly used to calculate the size of extended sges.

Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC")

Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 1435fe2ea176..0937db738be7 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -187,20 +187,29 @@ static void set_atomic_seg(const struct ib_send_wr *wr,
 	hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SGE_NUM, valid_num_sge);
 }
 
+static unsigned int get_std_sge_num(struct hns_roce_qp *qp)
+{
+	if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_UD)
+		return 0;
+
+	return HNS_ROCE_SGE_IN_WQE;
+}
+
 static int fill_ext_sge_inl_data(struct hns_roce_qp *qp,
 				 const struct ib_send_wr *wr,
 				 unsigned int *sge_idx, u32 msg_len)
 {
 	struct ib_device *ibdev = &(to_hr_dev(qp->ibqp.device))->ib_dev;
-	unsigned int ext_sge_sz = qp->sq.max_gs * HNS_ROCE_SGE_SIZE;
 	unsigned int left_len_in_pg;
 	unsigned int idx = *sge_idx;
+	unsigned int std_sge_num;
 	unsigned int i = 0;
 	unsigned int len;
 	void *addr;
 	void *dseg;
 
-	if (msg_len > ext_sge_sz) {
+	std_sge_num = get_std_sge_num(qp);
+	if (msg_len > (qp->sq.max_gs - std_sge_num) * HNS_ROCE_SGE_SIZE) {
 		ibdev_err(ibdev,
 			  "no enough extended sge space for inline data.\n");
 		return -EINVAL;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 for-rc 2/5] RDMA/hns: Fix the problem of sge nums
  2022-10-26  9:50 [PATCH v2 for-rc 0/5] Fix sge_num bug and add cqe inline, refactor rq inline Haoyue Xu
  2022-10-26  9:50 ` [PATCH v2 for-rc 1/5] RDMA/hns: Fix ext_sge num error when post send Haoyue Xu
@ 2022-10-26  9:50 ` Haoyue Xu
  2022-10-28 16:42   ` Jason Gunthorpe
  2022-10-26  9:50 ` [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling Haoyue Xu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Haoyue Xu @ 2022-10-26  9:50 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, xuhaoyue1

From: Luoyouming <luoyouming@huawei.com>

Currently, the driver only uses max_send_sge to initialize sge num
when creating_qp. So, in the sq inline scenario, the driver may not
has enough sge to send data. For example, if max_send_sge is 16 and
max_inline_data is 1024, the driver needs 1024/16=64 sge to send data.
Therefore, the calculation method of sge num is modified to take the
maximum value of max_send_sge and max_inline_data/16 to solve this
problem.

Fixes: 05201e01be93 ("RDMA/hns: Refactor process of setting extended sge")
Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC")

Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h |   3 +
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  12 +--
 drivers/infiniband/hw/hns/hns_roce_main.c   |  18 +++-
 drivers/infiniband/hw/hns/hns_roce_qp.c     | 108 +++++++++++++++++---
 include/uapi/rdma/hns-abi.h                 |  17 +++
 5 files changed, 128 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 723e55a7de8d..f701cc86896b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -202,6 +202,7 @@ struct hns_roce_ucontext {
 	struct list_head	page_list;
 	struct mutex		page_mutex;
 	struct hns_user_mmap_entry *db_mmap_entry;
+	u32			config;
 };
 
 struct hns_roce_pd {
@@ -334,6 +335,7 @@ struct hns_roce_wq {
 	u32		head;
 	u32		tail;
 	void __iomem	*db_reg;
+	u32		ext_sge_cnt;
 };
 
 struct hns_roce_sge {
@@ -635,6 +637,7 @@ struct hns_roce_qp {
 	struct list_head	rq_node; /* all recv qps are on a list */
 	struct list_head	sq_node; /* all send qps are on a list */
 	struct hns_user_mmap_entry *dwqe_mmap_entry;
+	u32			config;
 };
 
 struct hns_roce_ib_iboe {
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 0937db738be7..65875b4cff13 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -187,14 +187,6 @@ static void set_atomic_seg(const struct ib_send_wr *wr,
 	hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SGE_NUM, valid_num_sge);
 }
 
-static unsigned int get_std_sge_num(struct hns_roce_qp *qp)
-{
-	if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_UD)
-		return 0;
-
-	return HNS_ROCE_SGE_IN_WQE;
-}
-
 static int fill_ext_sge_inl_data(struct hns_roce_qp *qp,
 				 const struct ib_send_wr *wr,
 				 unsigned int *sge_idx, u32 msg_len)
@@ -202,14 +194,12 @@ static int fill_ext_sge_inl_data(struct hns_roce_qp *qp,
 	struct ib_device *ibdev = &(to_hr_dev(qp->ibqp.device))->ib_dev;
 	unsigned int left_len_in_pg;
 	unsigned int idx = *sge_idx;
-	unsigned int std_sge_num;
 	unsigned int i = 0;
 	unsigned int len;
 	void *addr;
 	void *dseg;
 
-	std_sge_num = get_std_sge_num(qp);
-	if (msg_len > (qp->sq.max_gs - std_sge_num) * HNS_ROCE_SGE_SIZE) {
+	if (msg_len > qp->sq.ext_sge_cnt * HNS_ROCE_SGE_SIZE) {
 		ibdev_err(ibdev,
 			  "no enough extended sge space for inline data.\n");
 		return -EINVAL;
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index dcf89689a4c6..8ba68ac12388 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -354,10 +354,11 @@ static int hns_roce_alloc_uar_entry(struct ib_ucontext *uctx)
 static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx,
 				   struct ib_udata *udata)
 {
-	int ret;
 	struct hns_roce_ucontext *context = to_hr_ucontext(uctx);
-	struct hns_roce_ib_alloc_ucontext_resp resp = {};
 	struct hns_roce_dev *hr_dev = to_hr_dev(uctx->device);
+	struct hns_roce_ib_alloc_ucontext_resp resp = {};
+	struct hns_roce_ib_alloc_ucontext ucmd = {};
+	int ret;
 
 	if (!hr_dev->active)
 		return -EAGAIN;
@@ -365,6 +366,19 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx,
 	resp.qp_tab_size = hr_dev->caps.num_qps;
 	resp.srq_tab_size = hr_dev->caps.num_srqs;
 
+	ret = ib_copy_from_udata(&ucmd, udata,
+				 min(udata->inlen, sizeof(ucmd)));
+	if (ret)
+		return ret;
+
+	if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09)
+		context->config = ucmd.config & HNS_ROCE_EXSGE_FLAGS;
+
+	if (context->config & HNS_ROCE_EXSGE_FLAGS) {
+		resp.config |= HNS_ROCE_RSP_EXSGE_FLAGS;
+		resp.max_inline_data = hr_dev->caps.max_sq_inline;
+	}
+
 	ret = hns_roce_uar_alloc(hr_dev, &context->uar);
 	if (ret)
 		goto error_fail_uar_alloc;
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index f0bd82a18069..7354c9e61502 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -476,38 +476,110 @@ static int set_rq_size(struct hns_roce_dev *hr_dev, struct ib_qp_cap *cap,
 	return 0;
 }
 
-static u32 get_wqe_ext_sge_cnt(struct hns_roce_qp *qp)
+static u32 get_max_inline_data(struct hns_roce_dev *hr_dev,
+			       struct ib_qp_cap *cap)
 {
-	/* GSI/UD QP only has extended sge */
-	if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_UD)
-		return qp->sq.max_gs;
-
-	if (qp->sq.max_gs > HNS_ROCE_SGE_IN_WQE)
-		return qp->sq.max_gs - HNS_ROCE_SGE_IN_WQE;
+	if (cap->max_inline_data) {
+		cap->max_inline_data = roundup_pow_of_two(
+					cap->max_inline_data);
+		return min(cap->max_inline_data,
+			   hr_dev->caps.max_sq_inline);
+	}
 
 	return 0;
 }
 
+static void update_inline_data(struct hns_roce_qp *hr_qp,
+			       struct ib_qp_cap *cap)
+{
+	u32 sge_num = hr_qp->sq.ext_sge_cnt;
+
+	if (hr_qp->config & HNS_ROCE_EXSGE_FLAGS) {
+		if (!(hr_qp->ibqp.qp_type == IB_QPT_GSI ||
+			hr_qp->ibqp.qp_type == IB_QPT_UD))
+			sge_num = max((u32)HNS_ROCE_SGE_IN_WQE, sge_num);
+
+		cap->max_inline_data = max(cap->max_inline_data,
+					sge_num * HNS_ROCE_SGE_SIZE);
+	}
+
+	hr_qp->max_inline_data = cap->max_inline_data;
+}
+
+static u32 get_sge_num_from_max_send_sge(bool is_ud_or_gsi,
+					 u32 max_send_sge)
+{
+	unsigned int std_sge_num;
+	unsigned int min_sge;
+
+	std_sge_num = is_ud_or_gsi ? 0 : HNS_ROCE_SGE_IN_WQE;
+	min_sge = is_ud_or_gsi ? 1 : 0;
+	return max_send_sge > std_sge_num ? (max_send_sge - std_sge_num) :
+				min_sge;
+}
+
+static unsigned int get_sge_num_from_max_inl_data(bool is_ud_or_gsi,
+						  u32 max_inline_data)
+{
+	unsigned int inline_sge;
+
+	inline_sge = roundup_pow_of_two(max_inline_data) / HNS_ROCE_SGE_SIZE;
+
+	/*
+	 * if max_inline_data less than
+	 * HNS_ROCE_SGE_IN_WQE * HNS_ROCE_SGE_SIZE,
+	 * In addition to ud's mode, no need to extend sge.
+	 */
+	if ((!is_ud_or_gsi) && (inline_sge <= HNS_ROCE_SGE_IN_WQE))
+		inline_sge = 0;
+
+	return inline_sge;
+}
+
 static void set_ext_sge_param(struct hns_roce_dev *hr_dev, u32 sq_wqe_cnt,
 			      struct hns_roce_qp *hr_qp, struct ib_qp_cap *cap)
 {
+	bool is_ud_or_gsi = (hr_qp->ibqp.qp_type == IB_QPT_GSI ||
+				hr_qp->ibqp.qp_type == IB_QPT_UD);
+	unsigned int std_sge_num;
+	u32 inline_ext_sge = 0;
+	u32 ext_wqe_sge_cnt;
 	u32 total_sge_cnt;
-	u32 wqe_sge_cnt;
+
+	cap->max_inline_data = get_max_inline_data(hr_dev, cap);
 
 	hr_qp->sge.sge_shift = HNS_ROCE_SGE_SHIFT;
+	std_sge_num = is_ud_or_gsi ? 0 : HNS_ROCE_SGE_IN_WQE;
+	ext_wqe_sge_cnt = get_sge_num_from_max_send_sge(is_ud_or_gsi,
+							cap->max_send_sge);
 
-	hr_qp->sq.max_gs = max(1U, cap->max_send_sge);
+	if (hr_qp->config & HNS_ROCE_EXSGE_FLAGS) {
+		inline_ext_sge = max(ext_wqe_sge_cnt,
+				 get_sge_num_from_max_inl_data(
+				 is_ud_or_gsi, cap->max_inline_data));
+		hr_qp->sq.ext_sge_cnt = inline_ext_sge ?
+					roundup_pow_of_two(inline_ext_sge) : 0;
 
-	wqe_sge_cnt = get_wqe_ext_sge_cnt(hr_qp);
+		hr_qp->sq.max_gs = max(1U, (hr_qp->sq.ext_sge_cnt + std_sge_num));
+		hr_qp->sq.max_gs = min(hr_qp->sq.max_gs, hr_dev->caps.max_sq_sg);
+
+		ext_wqe_sge_cnt = hr_qp->sq.ext_sge_cnt;
+	} else {
+		hr_qp->sq.max_gs = max(1U, cap->max_send_sge);
+		hr_qp->sq.max_gs = min(hr_qp->sq.max_gs, hr_dev->caps.max_sq_sg);
+		hr_qp->sq.ext_sge_cnt = hr_qp->sq.max_gs;
+	}
 
 	/* If the number of extended sge is not zero, they MUST use the
 	 * space of HNS_HW_PAGE_SIZE at least.
 	 */
-	if (wqe_sge_cnt) {
-		total_sge_cnt = roundup_pow_of_two(sq_wqe_cnt * wqe_sge_cnt);
+	if (ext_wqe_sge_cnt) {
+		total_sge_cnt = roundup_pow_of_two(sq_wqe_cnt * ext_wqe_sge_cnt);
 		hr_qp->sge.sge_cnt = max(total_sge_cnt,
 				(u32)HNS_HW_PAGE_SIZE / HNS_ROCE_SGE_SIZE);
 	}
+
+	update_inline_data(hr_qp, cap);
 }
 
 static int check_sq_size_with_integrity(struct hns_roce_dev *hr_dev,
@@ -556,6 +628,7 @@ static int set_user_sq_size(struct hns_roce_dev *hr_dev,
 
 	hr_qp->sq.wqe_shift = ucmd->log_sq_stride;
 	hr_qp->sq.wqe_cnt = cnt;
+	cap->max_send_sge = hr_qp->sq.max_gs;
 
 	return 0;
 }
@@ -986,13 +1059,9 @@ static int set_qp_param(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 			struct hns_roce_ib_create_qp *ucmd)
 {
 	struct ib_device *ibdev = &hr_dev->ib_dev;
+	struct hns_roce_ucontext *uctx;
 	int ret;
 
-	if (init_attr->cap.max_inline_data > hr_dev->caps.max_sq_inline)
-		init_attr->cap.max_inline_data = hr_dev->caps.max_sq_inline;
-
-	hr_qp->max_inline_data = init_attr->cap.max_inline_data;
-
 	if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR)
 		hr_qp->sq_signal_bits = IB_SIGNAL_ALL_WR;
 	else
@@ -1015,12 +1084,17 @@ static int set_qp_param(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 			return ret;
 		}
 
+		uctx = rdma_udata_to_drv_context(udata, struct hns_roce_ucontext,
+						ibucontext);
+		hr_qp->config = uctx->config;
 		ret = set_user_sq_size(hr_dev, &init_attr->cap, hr_qp, ucmd);
 		if (ret)
 			ibdev_err(ibdev,
 				  "failed to set user SQ size, ret = %d.\n",
 				  ret);
 	} else {
+		if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09)
+			hr_qp->config = HNS_ROCE_EXSGE_FLAGS;
 		ret = set_kernel_sq_size(hr_dev, &init_attr->cap, hr_qp);
 		if (ret)
 			ibdev_err(ibdev,
diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h
index f6fde06db4b4..017da74f56af 100644
--- a/include/uapi/rdma/hns-abi.h
+++ b/include/uapi/rdma/hns-abi.h
@@ -85,13 +85,30 @@ struct hns_roce_ib_create_qp_resp {
 	__aligned_u64 dwqe_mmap_key;
 };
 
+enum {
+	HNS_ROCE_EXSGE_FLAGS = 1 << 0,
+};
+
+enum {
+	HNS_ROCE_RSP_EXSGE_FLAGS = 1 << 0,
+};
+
+
 struct hns_roce_ib_alloc_ucontext_resp {
 	__u32	qp_tab_size;
 	__u32	cqe_size;
 	__u32	srq_tab_size;
 	__u32	reserved;
+	__u32	config;
+	__u32	max_inline_data;
 };
 
+struct hns_roce_ib_alloc_ucontext {
+	__u32 config;
+	__u32 reserved;
+};
+
+
 struct hns_roce_ib_alloc_pd_resp {
 	__u32 pdn;
 };
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling
  2022-10-26  9:50 [PATCH v2 for-rc 0/5] Fix sge_num bug and add cqe inline, refactor rq inline Haoyue Xu
  2022-10-26  9:50 ` [PATCH v2 for-rc 1/5] RDMA/hns: Fix ext_sge num error when post send Haoyue Xu
  2022-10-26  9:50 ` [PATCH v2 for-rc 2/5] RDMA/hns: Fix the problem of sge nums Haoyue Xu
@ 2022-10-26  9:50 ` Haoyue Xu
  2022-10-28 16:40   ` Jason Gunthorpe
  2022-10-26  9:50 ` [PATCH v2 for-rc 4/5] RDMA/hns: Remove rq inline in kernel Haoyue Xu
  2022-10-26  9:50 ` [PATCH v2 for-rc 5/5] RDMA/hns: Support cqe inline in user space Haoyue Xu
  4 siblings, 1 reply; 12+ messages in thread
From: Haoyue Xu @ 2022-10-26  9:50 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, xuhaoyue1

From: Luoyouming <luoyouming@huawei.com>

The rq inline makes some changes as follows, Firstly, it is only
used in user space. Secondly, it should notify hardware in QP RTR
status. Thirdly, Add compatibility processing between different
user space and kernel space. Change the HNS_ROCE_CAP_FLAG_RQ_INLINE
to a new bit to prevent old kernel spaces / spaced from enabling
rq inline.

Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  6 +++--
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 28 +++++++++++++--------
 drivers/infiniband/hw/hns/hns_roce_main.c   |  5 ++++
 drivers/infiniband/hw/hns/hns_roce_qp.c     |  2 +-
 include/uapi/rdma/hns-abi.h                 |  2 ++
 5 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index f701cc86896b..9ce053fe737d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -132,7 +132,8 @@ enum hns_roce_event {
 enum {
 	HNS_ROCE_CAP_FLAG_REREG_MR		= BIT(0),
 	HNS_ROCE_CAP_FLAG_ROCE_V1_V2		= BIT(1),
-	HNS_ROCE_CAP_FLAG_RQ_INLINE		= BIT(2),
+	/* discard this bit, reserved for compatibility */
+	HNS_ROCE_CAP_FLAG_DISCARD		= BIT(2),
 	HNS_ROCE_CAP_FLAG_CQ_RECORD_DB		= BIT(3),
 	HNS_ROCE_CAP_FLAG_QP_RECORD_DB		= BIT(4),
 	HNS_ROCE_CAP_FLAG_SRQ			= BIT(5),
@@ -144,6 +145,7 @@ enum {
 	HNS_ROCE_CAP_FLAG_DIRECT_WQE		= BIT(12),
 	HNS_ROCE_CAP_FLAG_SDI_MODE		= BIT(14),
 	HNS_ROCE_CAP_FLAG_STASH			= BIT(17),
+	HNS_ROCE_CAP_FLAG_RQ_INLINE		= BIT(20),
 };
 
 #define HNS_ROCE_DB_TYPE_COUNT			2
@@ -887,7 +889,7 @@ struct hns_roce_hw {
 			 u32 step_idx);
 	int (*modify_qp)(struct ib_qp *ibqp, const struct ib_qp_attr *attr,
 			 int attr_mask, enum ib_qp_state cur_state,
-			 enum ib_qp_state new_state);
+			 enum ib_qp_state new_state, struct ib_udata *udata);
 	int (*qp_flow_control_init)(struct hns_roce_dev *hr_dev,
 			 struct hns_roce_qp *hr_qp);
 	void (*dereg_mr)(struct hns_roce_dev *hr_dev);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 65875b4cff13..b936bf6e58cc 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -4327,10 +4327,6 @@ static void modify_qp_reset_to_init(struct ib_qp *ibqp,
 	hr_reg_write(context, QPC_RQ_DB_RECORD_ADDR_H,
 		     upper_32_bits(hr_qp->rdb.dma));
 
-	if (ibqp->qp_type != IB_QPT_UD && ibqp->qp_type != IB_QPT_GSI)
-		hr_reg_write_bool(context, QPC_RQIE,
-			     hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RQ_INLINE);
-
 	hr_reg_write(context, QPC_RX_CQN, get_cqn(ibqp->recv_cq));
 
 	if (ibqp->srq) {
@@ -4521,8 +4517,11 @@ static inline enum ib_mtu get_mtu(struct ib_qp *ibqp,
 static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
 				 const struct ib_qp_attr *attr, int attr_mask,
 				 struct hns_roce_v2_qp_context *context,
-				 struct hns_roce_v2_qp_context *qpc_mask)
+				 struct hns_roce_v2_qp_context *qpc_mask,
+				 struct ib_udata *udata)
 {
+	struct hns_roce_ucontext *uctx = rdma_udata_to_drv_context(udata,
+					  struct hns_roce_ucontext, ibucontext);
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
 	struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
 	struct ib_device *ibdev = &hr_dev->ib_dev;
@@ -4642,6 +4641,14 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
 	hr_reg_write(context, QPC_LP_SGEN_INI, 3);
 	hr_reg_clear(qpc_mask, QPC_LP_SGEN_INI);
 
+	if (udata && (ibqp->qp_type == IB_QPT_RC) &&
+	    (uctx->config & HNS_ROCE_RQ_INLINE_FLAGS)) {
+		hr_reg_write_bool(context, QPC_RQIE,
+				  hr_dev->caps.flags &
+				  HNS_ROCE_CAP_FLAG_RQ_INLINE);
+		hr_reg_clear(qpc_mask, QPC_RQIE);
+	}
+
 	return 0;
 }
 
@@ -4989,7 +4996,8 @@ static int hns_roce_v2_set_abs_fields(struct ib_qp *ibqp,
 				      enum ib_qp_state cur_state,
 				      enum ib_qp_state new_state,
 				      struct hns_roce_v2_qp_context *context,
-				      struct hns_roce_v2_qp_context *qpc_mask)
+				      struct hns_roce_v2_qp_context *qpc_mask,
+				      struct ib_udata *udata)
 {
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
 	int ret = 0;
@@ -5006,7 +5014,7 @@ static int hns_roce_v2_set_abs_fields(struct ib_qp *ibqp,
 		modify_qp_init_to_init(ibqp, attr, context, qpc_mask);
 	} else if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) {
 		ret = modify_qp_init_to_rtr(ibqp, attr, attr_mask, context,
-					    qpc_mask);
+					    qpc_mask, udata);
 	} else if (cur_state == IB_QPS_RTR && new_state == IB_QPS_RTS) {
 		ret = modify_qp_rtr_to_rts(ibqp, attr, attr_mask, context,
 					   qpc_mask);
@@ -5211,7 +5219,7 @@ static void v2_set_flushed_fields(struct ib_qp *ibqp,
 static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 				 const struct ib_qp_attr *attr,
 				 int attr_mask, enum ib_qp_state cur_state,
-				 enum ib_qp_state new_state)
+				 enum ib_qp_state new_state, struct ib_udata *udata)
 {
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
 	struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
@@ -5234,7 +5242,7 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 	memset(qpc_mask, 0xff, hr_dev->caps.qpc_sz);
 
 	ret = hns_roce_v2_set_abs_fields(ibqp, attr, attr_mask, cur_state,
-					 new_state, context, qpc_mask);
+					 new_state, context, qpc_mask, udata);
 	if (ret)
 		goto out;
 
@@ -5435,7 +5443,7 @@ static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
 	if (modify_qp_is_ok(hr_qp)) {
 		/* Modify qp to reset before destroying qp */
 		ret = hns_roce_v2_modify_qp(&hr_qp->ibqp, NULL, 0,
-					    hr_qp->state, IB_QPS_RESET);
+					    hr_qp->state, IB_QPS_RESET, udata);
 		if (ret)
 			ibdev_err(ibdev,
 				  "failed to modify QP to RST, ret = %d.\n",
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 8ba68ac12388..ea1ef395f60f 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -379,6 +379,11 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx,
 		resp.max_inline_data = hr_dev->caps.max_sq_inline;
 	}
 
+	if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RQ_INLINE) {
+		context->config |= ucmd.config & HNS_ROCE_RQ_INLINE_FLAGS;
+		resp.config |= HNS_ROCE_RSP_RQ_INLINE_FLAGS;
+	}
+
 	ret = hns_roce_uar_alloc(hr_dev, &context->uar);
 	if (ret)
 		goto error_fail_uar_alloc;
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 7354c9e61502..e65b899ce82f 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -1411,7 +1411,7 @@ int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		goto out;
 
 	ret = hr_dev->hw->modify_qp(ibqp, attr, attr_mask, cur_state,
-				    new_state);
+				    new_state, udata);
 
 out:
 	mutex_unlock(&hr_qp->mutex);
diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h
index 017da74f56af..9375ac3de059 100644
--- a/include/uapi/rdma/hns-abi.h
+++ b/include/uapi/rdma/hns-abi.h
@@ -87,10 +87,12 @@ struct hns_roce_ib_create_qp_resp {
 
 enum {
 	HNS_ROCE_EXSGE_FLAGS = 1 << 0,
+	HNS_ROCE_RQ_INLINE_FLAGS = 1 << 1,
 };
 
 enum {
 	HNS_ROCE_RSP_EXSGE_FLAGS = 1 << 0,
+	HNS_ROCE_RSP_RQ_INLINE_FLAGS = 1 << 1,
 };
 
 
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 for-rc 4/5] RDMA/hns: Remove rq inline in kernel
  2022-10-26  9:50 [PATCH v2 for-rc 0/5] Fix sge_num bug and add cqe inline, refactor rq inline Haoyue Xu
                   ` (2 preceding siblings ...)
  2022-10-26  9:50 ` [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling Haoyue Xu
@ 2022-10-26  9:50 ` Haoyue Xu
  2022-10-26  9:50 ` [PATCH v2 for-rc 5/5] RDMA/hns: Support cqe inline in user space Haoyue Xu
  4 siblings, 0 replies; 12+ messages in thread
From: Haoyue Xu @ 2022-10-26  9:50 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, xuhaoyue1

From: Luoyouming <luoyouming@huawei.com>

The kernel space does not support the rq inline feature anymore,
so remove the code associated with rq inline.

Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h | 16 ------
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 53 -----------------
 drivers/infiniband/hw/hns/hns_roce_qp.c     | 64 ---------------------
 3 files changed, 133 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 9ce053fe737d..3f6f328d3a3e 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -569,21 +569,6 @@ struct hns_roce_mbox_msg {
 
 struct hns_roce_dev;
 
-struct hns_roce_rinl_sge {
-	void			*addr;
-	u32			len;
-};
-
-struct hns_roce_rinl_wqe {
-	struct hns_roce_rinl_sge *sg_list;
-	u32			 sge_cnt;
-};
-
-struct hns_roce_rinl_buf {
-	struct hns_roce_rinl_wqe *wqe_list;
-	u32			 wqe_cnt;
-};
-
 enum {
 	HNS_ROCE_FLUSH_FLAG = 0,
 };
@@ -634,7 +619,6 @@ struct hns_roce_qp {
 	/* 0: flush needed, 1: unneeded */
 	unsigned long		flush_flag;
 	struct hns_roce_work	flush_work;
-	struct hns_roce_rinl_buf rq_inl_buf;
 	struct list_head	node; /* all qps are on a list */
 	struct list_head	rq_node; /* all recv qps are on a list */
 	struct list_head	sq_node; /* all send qps are on a list */
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index b936bf6e58cc..19e326bda936 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -821,22 +821,10 @@ static void fill_recv_sge_to_wqe(const struct ib_recv_wr *wr, void *wqe,
 static void fill_rq_wqe(struct hns_roce_qp *hr_qp, const struct ib_recv_wr *wr,
 			u32 wqe_idx, u32 max_sge)
 {
-	struct hns_roce_rinl_sge *sge_list;
 	void *wqe = NULL;
-	u32 i;
 
 	wqe = hns_roce_get_recv_wqe(hr_qp, wqe_idx);
 	fill_recv_sge_to_wqe(wr, wqe, max_sge, hr_qp->rq.rsv_sge);
-
-	/* rq support inline data */
-	if (hr_qp->rq_inl_buf.wqe_cnt) {
-		sge_list = hr_qp->rq_inl_buf.wqe_list[wqe_idx].sg_list;
-		hr_qp->rq_inl_buf.wqe_list[wqe_idx].sge_cnt = (u32)wr->num_sge;
-		for (i = 0; i < wr->num_sge; i++) {
-			sge_list[i].addr = (void *)(u64)wr->sg_list[i].addr;
-			sge_list[i].len = wr->sg_list[i].length;
-		}
-	}
 }
 
 static int hns_roce_v2_post_recv(struct ib_qp *ibqp,
@@ -3612,39 +3600,6 @@ static int hns_roce_v2_req_notify_cq(struct ib_cq *ibcq,
 	return 0;
 }
 
-static int hns_roce_handle_recv_inl_wqe(struct hns_roce_v2_cqe *cqe,
-					struct hns_roce_qp *qp,
-					struct ib_wc *wc)
-{
-	struct hns_roce_rinl_sge *sge_list;
-	u32 wr_num, wr_cnt, sge_num;
-	u32 sge_cnt, data_len, size;
-	void *wqe_buf;
-
-	wr_num = hr_reg_read(cqe, CQE_WQE_IDX);
-	wr_cnt = wr_num & (qp->rq.wqe_cnt - 1);
-
-	sge_list = qp->rq_inl_buf.wqe_list[wr_cnt].sg_list;
-	sge_num = qp->rq_inl_buf.wqe_list[wr_cnt].sge_cnt;
-	wqe_buf = hns_roce_get_recv_wqe(qp, wr_cnt);
-	data_len = wc->byte_len;
-
-	for (sge_cnt = 0; (sge_cnt < sge_num) && (data_len); sge_cnt++) {
-		size = min(sge_list[sge_cnt].len, data_len);
-		memcpy((void *)sge_list[sge_cnt].addr, wqe_buf, size);
-
-		data_len -= size;
-		wqe_buf += size;
-	}
-
-	if (unlikely(data_len)) {
-		wc->status = IB_WC_LOC_LEN_ERR;
-		return -EAGAIN;
-	}
-
-	return 0;
-}
-
 static int sw_comp(struct hns_roce_qp *hr_qp, struct hns_roce_wq *wq,
 		   int num_entries, struct ib_wc *wc)
 {
@@ -3868,10 +3823,8 @@ static inline bool is_rq_inl_enabled(struct ib_wc *wc, u32 hr_opcode,
 
 static int fill_recv_wc(struct ib_wc *wc, struct hns_roce_v2_cqe *cqe)
 {
-	struct hns_roce_qp *qp = to_hr_qp(wc->qp);
 	u32 hr_opcode;
 	int ib_opcode;
-	int ret;
 
 	wc->byte_len = le32_to_cpu(cqe->byte_cnt);
 
@@ -3896,12 +3849,6 @@ static int fill_recv_wc(struct ib_wc *wc, struct hns_roce_v2_cqe *cqe)
 	else
 		wc->opcode = ib_opcode;
 
-	if (is_rq_inl_enabled(wc, hr_opcode, cqe)) {
-		ret = hns_roce_handle_recv_inl_wqe(cqe, qp, wc);
-		if (unlikely(ret))
-			return ret;
-	}
-
 	wc->sl = hr_reg_read(cqe, CQE_SL);
 	wc->src_qp = hr_reg_read(cqe, CQE_RMT_QPN);
 	wc->slid = 0;
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index e65b899ce82f..b3b84aa0f058 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -433,7 +433,6 @@ static int set_rq_size(struct hns_roce_dev *hr_dev, struct ib_qp_cap *cap,
 	if (!has_rq) {
 		hr_qp->rq.wqe_cnt = 0;
 		hr_qp->rq.max_gs = 0;
-		hr_qp->rq_inl_buf.wqe_cnt = 0;
 		cap->max_recv_wr = 0;
 		cap->max_recv_sge = 0;
 
@@ -463,12 +462,6 @@ static int set_rq_size(struct hns_roce_dev *hr_dev, struct ib_qp_cap *cap,
 				    hr_qp->rq.max_gs);
 
 	hr_qp->rq.wqe_cnt = cnt;
-	if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RQ_INLINE &&
-	    hr_qp->ibqp.qp_type != IB_QPT_UD &&
-	    hr_qp->ibqp.qp_type != IB_QPT_GSI)
-		hr_qp->rq_inl_buf.wqe_cnt = cnt;
-	else
-		hr_qp->rq_inl_buf.wqe_cnt = 0;
 
 	cap->max_recv_wr = cnt;
 	cap->max_recv_sge = hr_qp->rq.max_gs - hr_qp->rq.rsv_sge;
@@ -733,49 +726,6 @@ static int hns_roce_qp_has_rq(struct ib_qp_init_attr *attr)
 	return 1;
 }
 
-static int alloc_rq_inline_buf(struct hns_roce_qp *hr_qp,
-			       struct ib_qp_init_attr *init_attr)
-{
-	u32 max_recv_sge = init_attr->cap.max_recv_sge;
-	u32 wqe_cnt = hr_qp->rq_inl_buf.wqe_cnt;
-	struct hns_roce_rinl_wqe *wqe_list;
-	int i;
-
-	/* allocate recv inline buf */
-	wqe_list = kcalloc(wqe_cnt, sizeof(struct hns_roce_rinl_wqe),
-			   GFP_KERNEL);
-	if (!wqe_list)
-		goto err;
-
-	/* Allocate a continuous buffer for all inline sge we need */
-	wqe_list[0].sg_list = kcalloc(wqe_cnt, (max_recv_sge *
-				      sizeof(struct hns_roce_rinl_sge)),
-				      GFP_KERNEL);
-	if (!wqe_list[0].sg_list)
-		goto err_wqe_list;
-
-	/* Assign buffers of sg_list to each inline wqe */
-	for (i = 1; i < wqe_cnt; i++)
-		wqe_list[i].sg_list = &wqe_list[0].sg_list[i * max_recv_sge];
-
-	hr_qp->rq_inl_buf.wqe_list = wqe_list;
-
-	return 0;
-
-err_wqe_list:
-	kfree(wqe_list);
-
-err:
-	return -ENOMEM;
-}
-
-static void free_rq_inline_buf(struct hns_roce_qp *hr_qp)
-{
-	if (hr_qp->rq_inl_buf.wqe_list)
-		kfree(hr_qp->rq_inl_buf.wqe_list[0].sg_list);
-	kfree(hr_qp->rq_inl_buf.wqe_list);
-}
-
 static int alloc_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 			struct ib_qp_init_attr *init_attr,
 			struct ib_udata *udata, unsigned long addr)
@@ -784,18 +734,6 @@ static int alloc_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 	struct hns_roce_buf_attr buf_attr = {};
 	int ret;
 
-	if (!udata && hr_qp->rq_inl_buf.wqe_cnt) {
-		ret = alloc_rq_inline_buf(hr_qp, init_attr);
-		if (ret) {
-			ibdev_err(ibdev,
-				  "failed to alloc inline buf, ret = %d.\n",
-				  ret);
-			return ret;
-		}
-	} else {
-		hr_qp->rq_inl_buf.wqe_list = NULL;
-	}
-
 	ret = set_wqe_buf_attr(hr_dev, hr_qp, &buf_attr);
 	if (ret) {
 		ibdev_err(ibdev, "failed to split WQE buf, ret = %d.\n", ret);
@@ -815,7 +753,6 @@ static int alloc_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 	return 0;
 
 err_inline:
-	free_rq_inline_buf(hr_qp);
 
 	return ret;
 }
@@ -823,7 +760,6 @@ static int alloc_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 static void free_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
 {
 	hns_roce_mtr_destroy(hr_dev, &hr_qp->mtr);
-	free_rq_inline_buf(hr_qp);
 }
 
 static inline bool user_qp_has_sdb(struct hns_roce_dev *hr_dev,
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 for-rc 5/5] RDMA/hns: Support cqe inline in user space
  2022-10-26  9:50 [PATCH v2 for-rc 0/5] Fix sge_num bug and add cqe inline, refactor rq inline Haoyue Xu
                   ` (3 preceding siblings ...)
  2022-10-26  9:50 ` [PATCH v2 for-rc 4/5] RDMA/hns: Remove rq inline in kernel Haoyue Xu
@ 2022-10-26  9:50 ` Haoyue Xu
  4 siblings, 0 replies; 12+ messages in thread
From: Haoyue Xu @ 2022-10-26  9:50 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, xuhaoyue1

From: Luoyouming <luoyouming@huawei.com>

Enable the CQEIE field and configure the CQEIS field of QPC.
And add compatibility handling.

Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  1 +
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 12 ++++++++++++
 drivers/infiniband/hw/hns/hns_roce_hw_v2.h  |  3 ++-
 drivers/infiniband/hw/hns/hns_roce_main.c   |  5 +++++
 include/uapi/rdma/hns-abi.h                 |  2 ++
 5 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 3f6f328d3a3e..893ec7e4e777 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -145,6 +145,7 @@ enum {
 	HNS_ROCE_CAP_FLAG_DIRECT_WQE		= BIT(12),
 	HNS_ROCE_CAP_FLAG_SDI_MODE		= BIT(14),
 	HNS_ROCE_CAP_FLAG_STASH			= BIT(17),
+	HNS_ROCE_CAP_FLAG_CQE_INLINE		= BIT(19),
 	HNS_ROCE_CAP_FLAG_RQ_INLINE		= BIT(20),
 };
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 19e326bda936..7381319d128a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -4596,6 +4596,18 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
 		hr_reg_clear(qpc_mask, QPC_RQIE);
 	}
 
+	if (udata &&
+	    (ibqp->qp_type == IB_QPT_RC || ibqp->qp_type == IB_QPT_XRC_TGT) &&
+	    (uctx->config & HNS_ROCE_CQE_INLINE_FLAGS)) {
+		hr_reg_write_bool(context, QPC_CQEIE,
+				  hr_dev->caps.flags &
+				  HNS_ROCE_CAP_FLAG_CQE_INLINE);
+		hr_reg_clear(qpc_mask, QPC_CQEIE);
+
+		hr_reg_write(context, QPC_CQEIS, 0);
+		hr_reg_clear(qpc_mask, QPC_CQEIS);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index c7bf2d52c1cd..d1fc76d7d78a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -526,7 +526,8 @@ struct hns_roce_v2_qp_context {
 #define QPC_RQ_RTY_TX_ERR QPC_FIELD_LOC(607, 607)
 #define QPC_RX_CQN QPC_FIELD_LOC(631, 608)
 #define QPC_XRC_QP_TYPE QPC_FIELD_LOC(632, 632)
-#define QPC_RSV3 QPC_FIELD_LOC(634, 633)
+#define QPC_CQEIE QPC_FIELD_LOC(633, 633)
+#define QPC_CQEIS QPC_FIELD_LOC(634, 634)
 #define QPC_MIN_RNR_TIME QPC_FIELD_LOC(639, 635)
 #define QPC_RQ_PRODUCER_IDX QPC_FIELD_LOC(655, 640)
 #define QPC_RQ_CONSUMER_IDX QPC_FIELD_LOC(671, 656)
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index ea1ef395f60f..56ed24501155 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -384,6 +384,11 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx,
 		resp.config |= HNS_ROCE_RSP_RQ_INLINE_FLAGS;
 	}
 
+	if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_CQE_INLINE) {
+		context->config |= ucmd.config & HNS_ROCE_CQE_INLINE_FLAGS;
+		resp.config |= HNS_ROCE_RSP_CQE_INLINE_FLAGS;
+	}
+
 	ret = hns_roce_uar_alloc(hr_dev, &context->uar);
 	if (ret)
 		goto error_fail_uar_alloc;
diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h
index 9375ac3de059..b2b41d9d0dee 100644
--- a/include/uapi/rdma/hns-abi.h
+++ b/include/uapi/rdma/hns-abi.h
@@ -88,11 +88,13 @@ struct hns_roce_ib_create_qp_resp {
 enum {
 	HNS_ROCE_EXSGE_FLAGS = 1 << 0,
 	HNS_ROCE_RQ_INLINE_FLAGS = 1 << 1,
+	HNS_ROCE_CQE_INLINE_FLAGS = 1 << 2,
 };
 
 enum {
 	HNS_ROCE_RSP_EXSGE_FLAGS = 1 << 0,
 	HNS_ROCE_RSP_RQ_INLINE_FLAGS = 1 << 1,
+	HNS_ROCE_RSP_CQE_INLINE_FLAGS = 1 << 2,
 };
 
 
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling
  2022-10-26  9:50 ` [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling Haoyue Xu
@ 2022-10-28 16:40   ` Jason Gunthorpe
  2022-10-31  8:23     ` xuhaoyue (A)
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2022-10-28 16:40 UTC (permalink / raw)
  To: Haoyue Xu; +Cc: leon, linux-rdma, linuxarm

On Wed, Oct 26, 2022 at 05:50:52PM +0800, Haoyue Xu wrote:
> From: Luoyouming <luoyouming@huawei.com>
> 
> The rq inline makes some changes as follows, Firstly, it is only
> used in user space. Secondly, it should notify hardware in QP RTR
> status. Thirdly, Add compatibility processing between different
> user space and kernel space. Change the HNS_ROCE_CAP_FLAG_RQ_INLINE
> to a new bit to prevent old kernel spaces / spaced from enabling
> rq inline.
> 
> Signed-off-by: Luoyouming <luoyouming@huawei.com>
> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_device.h |  6 +++--
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 28 +++++++++++++--------
>  drivers/infiniband/hw/hns/hns_roce_main.c   |  5 ++++
>  drivers/infiniband/hw/hns/hns_roce_qp.c     |  2 +-
>  include/uapi/rdma/hns-abi.h                 |  2 ++
>  5 files changed, 30 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
> index f701cc86896b..9ce053fe737d 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
> @@ -132,7 +132,8 @@ enum hns_roce_event {
>  enum {
>  	HNS_ROCE_CAP_FLAG_REREG_MR		= BIT(0),
>  	HNS_ROCE_CAP_FLAG_ROCE_V1_V2		= BIT(1),
> -	HNS_ROCE_CAP_FLAG_RQ_INLINE		= BIT(2),
> +	/* discard this bit, reserved for compatibility */
> +	HNS_ROCE_CAP_FLAG_DISCARD		= BIT(2),

If it is for compatability with userspace why is this enum not under
include/uapi? Something has gone wrong here, please fix it.

Also, it is better to name this __HNS_ROCE_CAP_FLAG_RQ_INLINE to
indicate it is not used instead of 'discard'

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 for-rc 2/5] RDMA/hns: Fix the problem of sge nums
  2022-10-26  9:50 ` [PATCH v2 for-rc 2/5] RDMA/hns: Fix the problem of sge nums Haoyue Xu
@ 2022-10-28 16:42   ` Jason Gunthorpe
  2022-10-29  9:47     ` xuhaoyue (A)
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2022-10-28 16:42 UTC (permalink / raw)
  To: Haoyue Xu; +Cc: leon, linux-rdma, linuxarm

On Wed, Oct 26, 2022 at 05:50:51PM +0800, Haoyue Xu wrote:
> From: Luoyouming <luoyouming@huawei.com>
> 
> Currently, the driver only uses max_send_sge to initialize sge num
> when creating_qp. So, in the sq inline scenario, the driver may not
> has enough sge to send data. For example, if max_send_sge is 16 and
> max_inline_data is 1024, the driver needs 1024/16=64 sge to send data.
> Therefore, the calculation method of sge num is modified to take the
> maximum value of max_send_sge and max_inline_data/16 to solve this
> problem.
> 
> Fixes: 05201e01be93 ("RDMA/hns: Refactor process of setting extended sge")
> Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC")
> 
> Signed-off-by: Luoyouming <luoyouming@huawei.com>
> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_device.h |   3 +
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  12 +--
>  drivers/infiniband/hw/hns/hns_roce_main.c   |  18 +++-
>  drivers/infiniband/hw/hns/hns_roce_qp.c     | 108 +++++++++++++++++---
>  include/uapi/rdma/hns-abi.h                 |  17 +++
>  5 files changed, 128 insertions(+), 30 deletions(-)
 
There should be no space after the fixes line, please check all
patches

Also this entire series seems unsuitable to go into for-rc

You need to justify the impact of each rc patch in the commit message,
and nearly evey rc patch should have a fixes tag. 

Make the first two patches are OK for -rc, but they should have better
commit messages.

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 for-rc 2/5] RDMA/hns: Fix the problem of sge nums
  2022-10-28 16:42   ` Jason Gunthorpe
@ 2022-10-29  9:47     ` xuhaoyue (A)
  0 siblings, 0 replies; 12+ messages in thread
From: xuhaoyue (A) @ 2022-10-29  9:47 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: leon, linux-rdma, linuxarm

I will change it in V3.

On 2022/10/29 0:42:03, Jason Gunthorpe wrote:
> On Wed, Oct 26, 2022 at 05:50:51PM +0800, Haoyue Xu wrote:
>> From: Luoyouming <luoyouming@huawei.com>
>>
>> Currently, the driver only uses max_send_sge to initialize sge num
>> when creating_qp. So, in the sq inline scenario, the driver may not
>> has enough sge to send data. For example, if max_send_sge is 16 and
>> max_inline_data is 1024, the driver needs 1024/16=64 sge to send data.
>> Therefore, the calculation method of sge num is modified to take the
>> maximum value of max_send_sge and max_inline_data/16 to solve this
>> problem.
>>
>> Fixes: 05201e01be93 ("RDMA/hns: Refactor process of setting extended sge")
>> Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC")
>>
>> Signed-off-by: Luoyouming <luoyouming@huawei.com>
>> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
>> ---
>>  drivers/infiniband/hw/hns/hns_roce_device.h |   3 +
>>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  12 +--
>>  drivers/infiniband/hw/hns/hns_roce_main.c   |  18 +++-
>>  drivers/infiniband/hw/hns/hns_roce_qp.c     | 108 +++++++++++++++++---
>>  include/uapi/rdma/hns-abi.h                 |  17 +++
>>  5 files changed, 128 insertions(+), 30 deletions(-)
>  
> There should be no space after the fixes line, please check all
> patches
> 
> Also this entire series seems unsuitable to go into for-rc
> 
> You need to justify the impact of each rc patch in the commit message,
> and nearly evey rc patch should have a fixes tag. 
> 
> Make the first two patches are OK for -rc, but they should have better
> commit messages.
> 
> Jason
> .
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling
  2022-10-28 16:40   ` Jason Gunthorpe
@ 2022-10-31  8:23     ` xuhaoyue (A)
  2022-10-31 12:07       ` Jason Gunthorpe
  0 siblings, 1 reply; 12+ messages in thread
From: xuhaoyue (A) @ 2022-10-31  8:23 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: leon, linux-rdma, linuxarm

This bit is used to prevent compatibility issues in the old kernel. It is not for compatibility with userspace.
It should be a bugfix. I will separate this into a new bugfix patch.
I will change the name to __HNS_ROCE_CAP_FLAG_RQ_INLINE in V3.

On 2022/10/29 0:40:11, Jason Gunthorpe wrote:
> On Wed, Oct 26, 2022 at 05:50:52PM +0800, Haoyue Xu wrote:
>> From: Luoyouming <luoyouming@huawei.com>
>>
>> The rq inline makes some changes as follows, Firstly, it is only
>> used in user space. Secondly, it should notify hardware in QP RTR
>> status. Thirdly, Add compatibility processing between different
>> user space and kernel space. Change the HNS_ROCE_CAP_FLAG_RQ_INLINE
>> to a new bit to prevent old kernel spaces / spaced from enabling
>> rq inline.
>>
>> Signed-off-by: Luoyouming <luoyouming@huawei.com>
>> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
>> ---
>>  drivers/infiniband/hw/hns/hns_roce_device.h |  6 +++--
>>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 28 +++++++++++++--------
>>  drivers/infiniband/hw/hns/hns_roce_main.c   |  5 ++++
>>  drivers/infiniband/hw/hns/hns_roce_qp.c     |  2 +-
>>  include/uapi/rdma/hns-abi.h                 |  2 ++
>>  5 files changed, 30 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
>> index f701cc86896b..9ce053fe737d 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
>> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
>> @@ -132,7 +132,8 @@ enum hns_roce_event {
>>  enum {
>>  	HNS_ROCE_CAP_FLAG_REREG_MR		= BIT(0),
>>  	HNS_ROCE_CAP_FLAG_ROCE_V1_V2		= BIT(1),
>> -	HNS_ROCE_CAP_FLAG_RQ_INLINE		= BIT(2),
>> +	/* discard this bit, reserved for compatibility */
>> +	HNS_ROCE_CAP_FLAG_DISCARD		= BIT(2),
> 
> If it is for compatability with userspace why is this enum not under
> include/uapi? Something has gone wrong here, please fix it.
> 
> Also, it is better to name this __HNS_ROCE_CAP_FLAG_RQ_INLINE to
> indicate it is not used instead of 'discard'
> 
> Jason
> .
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling
  2022-10-31  8:23     ` xuhaoyue (A)
@ 2022-10-31 12:07       ` Jason Gunthorpe
  2022-11-03 12:50         ` xuhaoyue (A)
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2022-10-31 12:07 UTC (permalink / raw)
  To: xuhaoyue (A); +Cc: leon, linux-rdma, linuxarm

On Mon, Oct 31, 2022 at 04:23:14PM +0800, xuhaoyue (A) wrote:

> This bit is used to prevent compatibility issues in the old
> kernel. It is not for compatibility with userspace.  It should be a
> bugfix. I will separate this into a new bugfix patch.  I will change
> the name to __HNS_ROCE_CAP_FLAG_RQ_INLINE in V3.

There is no such thing as "compatability issues in the old kernel"

If it is used then name it sensibly, if it isn't then delete it.

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling
  2022-10-31 12:07       ` Jason Gunthorpe
@ 2022-11-03 12:50         ` xuhaoyue (A)
  0 siblings, 0 replies; 12+ messages in thread
From: xuhaoyue (A) @ 2022-11-03 12:50 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: leon, linux-rdma, linuxarm

You are right. This enum has gone wrong here. I will fix in the V3.

On 2022/10/31 20:07:43, Jason Gunthorpe wrote:
> On Mon, Oct 31, 2022 at 04:23:14PM +0800, xuhaoyue (A) wrote:
> 
>> This bit is used to prevent compatibility issues in the old
>> kernel. It is not for compatibility with userspace.  It should be a
>> bugfix. I will separate this into a new bugfix patch.  I will change
>> the name to __HNS_ROCE_CAP_FLAG_RQ_INLINE in V3.
> 
> There is no such thing as "compatability issues in the old kernel"
> 
> If it is used then name it sensibly, if it isn't then delete it.
> 
> Jason
> .
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-11-03 12:51 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-26  9:50 [PATCH v2 for-rc 0/5] Fix sge_num bug and add cqe inline, refactor rq inline Haoyue Xu
2022-10-26  9:50 ` [PATCH v2 for-rc 1/5] RDMA/hns: Fix ext_sge num error when post send Haoyue Xu
2022-10-26  9:50 ` [PATCH v2 for-rc 2/5] RDMA/hns: Fix the problem of sge nums Haoyue Xu
2022-10-28 16:42   ` Jason Gunthorpe
2022-10-29  9:47     ` xuhaoyue (A)
2022-10-26  9:50 ` [PATCH v2 for-rc 3/5] RDMA/hns: Remove enable rq inline in kernel and add compatibility handling Haoyue Xu
2022-10-28 16:40   ` Jason Gunthorpe
2022-10-31  8:23     ` xuhaoyue (A)
2022-10-31 12:07       ` Jason Gunthorpe
2022-11-03 12:50         ` xuhaoyue (A)
2022-10-26  9:50 ` [PATCH v2 for-rc 4/5] RDMA/hns: Remove rq inline in kernel Haoyue Xu
2022-10-26  9:50 ` [PATCH v2 for-rc 5/5] RDMA/hns: Support cqe inline in user space Haoyue Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).