* [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
@ 2022-09-27 5:53 Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 01/11] RDMA/rxe: make sure requested access is a subset of {mr,mw}->access Li Zhijian
` (12 more replies)
0 siblings, 13 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
Hey folks,
First of all, I want to say thank you to all of you, and especially Bob, who
over the past month and more gave me a lot of ideas and inspiration.
With your help, the following changes were made in this 5th version:
- new names and a new patch split scheme, suggested by Bob
- bugfix: set is_pmem true only if the whole MR is pmem; it's possible that
one MR contains both PMEM and DRAM
- introduce a feth structure instead of a raw u32
- new bugfix to rxe_lookup_mw() and lookup_mr(), see (RDMA/rxe: make sure requested access is a subset of {mr,mw}->access).
With this fix, we can remove check_placement_type(), since lookup_mr() already performs that check.
- enable the QP attr flushable
Each of these change logs also appears in the patch it belongs to.
These patches implement a *NEW* RDMA opcode, "RDMA FLUSH".
In IB SPEC 1.5[1], two new opcodes, ATOMIC WRITE and RDMA FLUSH, were
added in the MEMORY PLACEMENT EXTENSIONS section.
This patchset adds support for the new RDMA FLUSH operation to SoftRoCE on the RC service.
You can verify the patchset by building and running the rdma_flush example[2].
server:
$ ./rdma_flush_server -s [server_address] -p [port_number]
client:
$ ./rdma_flush_client -s [server_address] -p [port_number]
Corresponding pyverbs and tests (tests.test_qpex.QpExTestCase.test_qp_ex_rc_rdma_flush)
have also been added to rdma-core.
[1]: https://www.infinibandta.org/wp-content/uploads/2021/08/IBTA-Overview-of-IBTA-Volume-1-Release-1.5-and-MPE-2021-08-17-Secure.pptx
[2]: https://github.com/zhijianli88/rdma-core/tree/rdma-flush-v5
CC: Xiao Yang <yangx.jy@fujitsu.com>
CC: "Gotou, Yasunori" <y-goto@fujitsu.com>
CC: Jason Gunthorpe <jgg@ziepe.ca>
CC: Zhu Yanjun <zyjzyj2000@gmail.com>
CC: Leon Romanovsky <leon@kernel.org>
CC: Bob Pearson <rpearsonhpe@gmail.com>
CC: Mark Bloch <mbloch@nvidia.com>
CC: Wenpeng Liang <liangwenpeng@huawei.com>
CC: Tom Talpey <tom@talpey.com>
CC: "Gromadzki, Tomasz" <tomasz.gromadzki@intel.com>
CC: Dan Williams <dan.j.williams@intel.com>
CC: linux-rdma@vger.kernel.org
CC: linux-kernel@vger.kernel.org
The kernel source is also available at:
https://github.com/zhijianli88/linux/tree/rdma-flush-v5
Change log
V4:
- rework responder process
- rebase to v5.19+
- remove [7/7]: RDMA/rxe: Add RD FLUSH service support, since RD is not really supported
V3:
- Just a rebase plus commit log and comment updates
- delete patch-1: "RDMA: mr: Introduce is_pmem", which is combined into "Allow registering persistent flag for pmem MR only"
- delete patch-7
V2:
RDMA: mr: Introduce is_pmem
check the 1st byte to avoid crossing a page boundary
new scheme to check is_pmem # Dan
RDMA: Allow registering MR with flush access flags
combine [03/10] RDMA/rxe: Allow registering FLUSH flags for supported device only into this patch # Jason
split RDMA_FLUSH to 2 capabilities
RDMA/rxe: Allow registering persistent flag for pmem MR only
update commit message, get rid of confusing ib_check_flush_access_flags() # Tom
RDMA/rxe: Implement RC RDMA FLUSH service in requester side
extend flush to include length field. # Tom and Tomasz
RDMA/rxe: Implement flush execution in responder side
adjust start for WHOLE MR level # Tom
don't support DMA mr for flush # Tom
check flush return value
RDMA/rxe: Enable RDMA FLUSH capability for rxe device
adjust patch's order. move it here from [04/10]
Li Zhijian (11):
RDMA/rxe: make sure requested access is a subset of {mr,mw}->access
RDMA: Extend RDMA user ABI to support flush
RDMA: Extend RDMA kernel verbs ABI to support flush
RDMA/rxe: Extend rxe user ABI to support flush
RDMA/rxe: Allow registering persistent flag for pmem MR only
RDMA/rxe: Extend rxe packet format to support flush
RDMA/rxe: Implement RC RDMA FLUSH service in requester side
RDMA/rxe: Implement flush execution in responder side
RDMA/rxe: Implement flush completion
RDMA/cm: Make QP FLUSHABLE
RDMA/rxe: Enable RDMA FLUSH capability for rxe device
drivers/infiniband/core/cm.c | 3 +-
drivers/infiniband/sw/rxe/rxe_comp.c | 4 +-
drivers/infiniband/sw/rxe/rxe_hdr.h | 47 +++++++
drivers/infiniband/sw/rxe/rxe_loc.h | 1 +
drivers/infiniband/sw/rxe/rxe_mr.c | 81 ++++++++++-
drivers/infiniband/sw/rxe/rxe_mw.c | 3 +-
drivers/infiniband/sw/rxe/rxe_opcode.c | 17 +++
drivers/infiniband/sw/rxe/rxe_opcode.h | 16 ++-
drivers/infiniband/sw/rxe/rxe_param.h | 4 +-
drivers/infiniband/sw/rxe/rxe_req.c | 15 +-
drivers/infiniband/sw/rxe/rxe_resp.c | 180 +++++++++++++++++++++---
drivers/infiniband/sw/rxe/rxe_verbs.h | 6 +
include/rdma/ib_pack.h | 3 +
include/rdma/ib_verbs.h | 20 ++-
include/uapi/rdma/ib_user_ioctl_verbs.h | 2 +
include/uapi/rdma/ib_user_verbs.h | 16 +++
include/uapi/rdma/rdma_user_rxe.h | 7 +
17 files changed, 389 insertions(+), 36 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 31+ messages in thread
* [for-next PATCH v5 01/11] RDMA/rxe: make sure requested access is a subset of {mr,mw}->access
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-10-28 17:45 ` Jason Gunthorpe
2022-09-27 5:53 ` [for-next PATCH v5 02/11] RDMA: Extend RDMA user ABI to support flush Li Zhijian
` (11 subsequent siblings)
12 siblings, 1 reply; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
We should reject requests whose access flags are not a subset of those
registered for the MR/MW. For example, lookup_mr() should return NULL when
the requested access is 0x03 and mr->access is 0x01.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_mr.c | 2 +-
drivers/infiniband/sw/rxe/rxe_mw.c | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 502e9ada99b3..74a38d06332f 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -511,7 +511,7 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
if (unlikely((type == RXE_LOOKUP_LOCAL && mr->lkey != key) ||
(type == RXE_LOOKUP_REMOTE && mr->rkey != key) ||
- mr_pd(mr) != pd || (access && !(access & mr->access)) ||
+ mr_pd(mr) != pd || ((access & mr->access) != access) ||
mr->state != RXE_MR_STATE_VALID)) {
rxe_put(mr);
mr = NULL;
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index 902b7df7aaed..8df1c9066ed8 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -293,8 +293,7 @@ struct rxe_mw *rxe_lookup_mw(struct rxe_qp *qp, int access, u32 rkey)
if (unlikely((mw->rkey != rkey) || rxe_mw_pd(mw) != pd ||
(mw->ibmw.type == IB_MW_TYPE_2 && mw->qp != qp) ||
- (mw->length == 0) ||
- (access && !(access & mw->access)) ||
+ (mw->length == 0) || ((access & mw->access) != access) ||
mw->state != RXE_MW_STATE_VALID)) {
rxe_put(mw);
return NULL;
--
2.31.1
* [for-next PATCH v5 02/11] RDMA: Extend RDMA user ABI to support flush
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 01/11] RDMA/rxe: make sure requested access is a subset of {mr,mw}->access Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs " Li Zhijian
` (10 subsequent siblings)
12 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
This commit extends the RDMA user ABI to support the flush
operation defined in IBA A19.4.1. These changes are
backwards compatible with the existing RDMA user ABI.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5: new names and new patch split scheme, suggested by Bob
---
include/uapi/rdma/ib_user_ioctl_verbs.h | 2 ++
include/uapi/rdma/ib_user_verbs.h | 16 ++++++++++++++++
2 files changed, 18 insertions(+)
diff --git a/include/uapi/rdma/ib_user_ioctl_verbs.h b/include/uapi/rdma/ib_user_ioctl_verbs.h
index 7dd56210226f..07b105e22f6f 100644
--- a/include/uapi/rdma/ib_user_ioctl_verbs.h
+++ b/include/uapi/rdma/ib_user_ioctl_verbs.h
@@ -57,6 +57,8 @@ enum ib_uverbs_access_flags {
IB_UVERBS_ACCESS_ZERO_BASED = 1 << 5,
IB_UVERBS_ACCESS_ON_DEMAND = 1 << 6,
IB_UVERBS_ACCESS_HUGETLB = 1 << 7,
+ IB_UVERBS_ACCESS_FLUSH_GLOBAL = 1 << 8,
+ IB_UVERBS_ACCESS_FLUSH_PERSISTENT = 1 << 9,
IB_UVERBS_ACCESS_RELAXED_ORDERING = IB_UVERBS_ACCESS_OPTIONAL_FIRST,
IB_UVERBS_ACCESS_OPTIONAL_RANGE =
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 43672cb1fd57..2d5f32d9d0d9 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -105,6 +105,18 @@ enum {
IB_USER_VERBS_EX_CMD_MODIFY_CQ
};
+/* see IBA A19.4.1.1 Placement Types */
+enum ib_placement_type {
+ IB_FLUSH_GLOBAL = 1U << 0,
+ IB_FLUSH_PERSISTENT = 1U << 1,
+};
+
+/* see IBA A19.4.1.2 Selectivity Level */
+enum ib_selectivity_level {
+ IB_FLUSH_RANGE = 0,
+ IB_FLUSH_MR,
+};
+
/*
* Make sure that all structs defined in this file remain laid out so
* that they pack the same way on 32-bit and 64-bit architectures (to
@@ -466,6 +478,7 @@ enum ib_uverbs_wc_opcode {
IB_UVERBS_WC_BIND_MW = 5,
IB_UVERBS_WC_LOCAL_INV = 6,
IB_UVERBS_WC_TSO = 7,
+ IB_UVERBS_WC_FLUSH = 8,
};
struct ib_uverbs_wc {
@@ -784,6 +797,7 @@ enum ib_uverbs_wr_opcode {
IB_UVERBS_WR_RDMA_READ_WITH_INV = 11,
IB_UVERBS_WR_MASKED_ATOMIC_CMP_AND_SWP = 12,
IB_UVERBS_WR_MASKED_ATOMIC_FETCH_AND_ADD = 13,
+ IB_UVERBS_WR_FLUSH = 14,
/* Review enum ib_wr_opcode before modifying this */
};
@@ -1331,6 +1345,8 @@ enum ib_uverbs_device_cap_flags {
/* Deprecated. Please use IB_UVERBS_RAW_PACKET_CAP_SCATTER_FCS. */
IB_UVERBS_DEVICE_RAW_SCATTER_FCS = 1ULL << 34,
IB_UVERBS_DEVICE_PCI_WRITE_END_PADDING = 1ULL << 36,
+ IB_UVERBS_DEVICE_FLUSH_GLOBAL = 1ULL << 38,
+ IB_UVERBS_DEVICE_FLUSH_PERSISTENT = 1ULL << 39,
};
enum ib_uverbs_raw_packet_caps {
--
2.31.1
* [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs ABI to support flush
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 01/11] RDMA/rxe: make sure requested access is a subset of {mr,mw}->access Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 02/11] RDMA: Extend RDMA user ABI to support flush Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-09-29 6:21 ` Li Zhijian
2022-10-28 17:44 ` Jason Gunthorpe
2022-09-27 5:53 ` [for-next PATCH v5 04/11] RDMA/rxe: Extend rxe user " Li Zhijian
` (9 subsequent siblings)
12 siblings, 2 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
This commit extends the RDMA kernel verbs ABI to support the flush
operation defined in IBA A19.4.1. These changes are
backwards compatible with the existing RDMA kernel verbs ABI.
It lets a device/HCA advertise the new FLUSH attributes/capabilities, and it
lets a memory region be registered with the new FLUSH access flags.
Users can use ibv_reg_mr(3) to register the flush access flags. Only the
access flags that are also supported by the device's capabilities can be
registered successfully.
Once registered successfully, the MR is flushable. Like the device/HCA, a
flushable MR carries one or both of the GLOBAL_VISIBILITY and PERSISTENT
attributes/capabilities.
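The gating between requested MR access flags and device capabilities can be sketched in userspace-style C. The flag values are taken from the uapi headers extended by this series; the function is a stand-in for the new checks in ib_check_mr_access(), with -1 standing in for -EINVAL:

```c
#include <assert.h>
#include <stdint.h>

/* Values from the extended uapi headers in this series */
#define IB_ACCESS_FLUSH_GLOBAL      (1U << 8)
#define IB_ACCESS_FLUSH_PERSISTENT  (1U << 9)
#define IB_DEVICE_FLUSH_GLOBAL      (1ULL << 38)
#define IB_DEVICE_FLUSH_PERSISTENT  (1ULL << 39)

/* Each requested flush access flag must be backed by the matching
 * device capability, otherwise registration fails. */
static int check_flush_access(unsigned int flags, uint64_t device_cap)
{
	if ((flags & IB_ACCESS_FLUSH_GLOBAL) &&
	    !(device_cap & IB_DEVICE_FLUSH_GLOBAL))
		return -1;
	if ((flags & IB_ACCESS_FLUSH_PERSISTENT) &&
	    !(device_cap & IB_DEVICE_FLUSH_PERSISTENT))
		return -1;
	return 0;
}
```

So a device advertising only IB_DEVICE_FLUSH_GLOBAL rejects an MR registration that asks for IB_ACCESS_FLUSH_PERSISTENT.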
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5: new names and new patch split scheme, suggested by Bob
---
include/rdma/ib_pack.h | 3 +++
include/rdma/ib_verbs.h | 20 +++++++++++++++++++-
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/include/rdma/ib_pack.h b/include/rdma/ib_pack.h
index a9162f25beaf..56211d1cc9f9 100644
--- a/include/rdma/ib_pack.h
+++ b/include/rdma/ib_pack.h
@@ -84,6 +84,7 @@ enum {
/* opcode 0x15 is reserved */
IB_OPCODE_SEND_LAST_WITH_INVALIDATE = 0x16,
IB_OPCODE_SEND_ONLY_WITH_INVALIDATE = 0x17,
+ IB_OPCODE_FLUSH = 0x1C,
/* real constants follow -- see comment about above IB_OPCODE()
macro for more details */
@@ -112,6 +113,7 @@ enum {
IB_OPCODE(RC, FETCH_ADD),
IB_OPCODE(RC, SEND_LAST_WITH_INVALIDATE),
IB_OPCODE(RC, SEND_ONLY_WITH_INVALIDATE),
+ IB_OPCODE(RC, FLUSH),
/* UC */
IB_OPCODE(UC, SEND_FIRST),
@@ -149,6 +151,7 @@ enum {
IB_OPCODE(RD, ATOMIC_ACKNOWLEDGE),
IB_OPCODE(RD, COMPARE_SWAP),
IB_OPCODE(RD, FETCH_ADD),
+ IB_OPCODE(RD, FLUSH),
/* UD */
IB_OPCODE(UD, SEND_ONLY),
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 975d6e9efbcb..571838dd06eb 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -270,6 +270,9 @@ enum ib_device_cap_flags {
/* The device supports padding incoming writes to cacheline. */
IB_DEVICE_PCI_WRITE_END_PADDING =
IB_UVERBS_DEVICE_PCI_WRITE_END_PADDING,
+ /* Placement type attributes */
+ IB_DEVICE_FLUSH_GLOBAL = IB_UVERBS_DEVICE_FLUSH_GLOBAL,
+ IB_DEVICE_FLUSH_PERSISTENT = IB_UVERBS_DEVICE_FLUSH_PERSISTENT,
};
enum ib_kernel_cap_flags {
@@ -985,6 +988,7 @@ enum ib_wc_opcode {
IB_WC_REG_MR,
IB_WC_MASKED_COMP_SWAP,
IB_WC_MASKED_FETCH_ADD,
+ IB_WC_FLUSH = IB_UVERBS_WC_FLUSH,
/*
* Set value of IB_WC_RECV so consumers can test if a completion is a
* receive by testing (opcode & IB_WC_RECV).
@@ -1325,6 +1329,7 @@ enum ib_wr_opcode {
IB_UVERBS_WR_MASKED_ATOMIC_CMP_AND_SWP,
IB_WR_MASKED_ATOMIC_FETCH_AND_ADD =
IB_UVERBS_WR_MASKED_ATOMIC_FETCH_AND_ADD,
+ IB_WR_FLUSH = IB_UVERBS_WR_FLUSH,
/* These are kernel only and can not be issued by userspace */
IB_WR_REG_MR = 0x20,
@@ -1458,10 +1463,14 @@ enum ib_access_flags {
IB_ACCESS_ON_DEMAND = IB_UVERBS_ACCESS_ON_DEMAND,
IB_ACCESS_HUGETLB = IB_UVERBS_ACCESS_HUGETLB,
IB_ACCESS_RELAXED_ORDERING = IB_UVERBS_ACCESS_RELAXED_ORDERING,
+ IB_ACCESS_FLUSH_GLOBAL = IB_UVERBS_ACCESS_FLUSH_GLOBAL,
+ IB_ACCESS_FLUSH_PERSISTENT = IB_UVERBS_ACCESS_FLUSH_PERSISTENT,
+ IB_ACCESS_FLUSHABLE = IB_ACCESS_FLUSH_GLOBAL |
+ IB_ACCESS_FLUSH_PERSISTENT,
IB_ACCESS_OPTIONAL = IB_UVERBS_ACCESS_OPTIONAL_RANGE,
IB_ACCESS_SUPPORTED =
- ((IB_ACCESS_HUGETLB << 1) - 1) | IB_ACCESS_OPTIONAL,
+ ((IB_ACCESS_FLUSH_PERSISTENT << 1) - 1) | IB_ACCESS_OPTIONAL,
};
/*
@@ -4321,6 +4330,8 @@ int ib_dealloc_xrcd_user(struct ib_xrcd *xrcd, struct ib_udata *udata);
static inline int ib_check_mr_access(struct ib_device *ib_dev,
unsigned int flags)
{
+ u64 device_cap = ib_dev->attrs.device_cap_flags;
+
/*
* Local write permission is required if remote write or
* remote atomic permission is also requested.
@@ -4335,6 +4346,13 @@ static inline int ib_check_mr_access(struct ib_device *ib_dev,
if (flags & IB_ACCESS_ON_DEMAND &&
!(ib_dev->attrs.kernel_cap_flags & IBK_ON_DEMAND_PAGING))
return -EINVAL;
+
+ if ((flags & IB_ACCESS_FLUSH_GLOBAL &&
+ !(device_cap & IB_DEVICE_FLUSH_GLOBAL)) ||
+ (flags & IB_ACCESS_FLUSH_PERSISTENT &&
+ !(device_cap & IB_DEVICE_FLUSH_PERSISTENT)))
+ return -EINVAL;
+
return 0;
}
--
2.31.1
* [for-next PATCH v5 04/11] RDMA/rxe: Extend rxe user ABI to support flush
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (2 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs " Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 05/11] RDMA/rxe: Allow registering persistent flag for pmem MR only Li Zhijian
` (8 subsequent siblings)
12 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
This commit extends the rxe user ABI to support the flush
operation defined in IBA A19.4.1. These changes are
backwards compatible with the existing rxe user ABI.
Userspace requests a flush by filling in this structure.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5: new patch split scheme, suggested by Bob
---
include/uapi/rdma/rdma_user_rxe.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/uapi/rdma/rdma_user_rxe.h b/include/uapi/rdma/rdma_user_rxe.h
index 73f679dfd2df..e2b93df94590 100644
--- a/include/uapi/rdma/rdma_user_rxe.h
+++ b/include/uapi/rdma/rdma_user_rxe.h
@@ -82,6 +82,13 @@ struct rxe_send_wr {
__u32 invalidate_rkey;
} ex;
union {
+ struct {
+ __aligned_u64 remote_addr;
+ __u32 length;
+ __u32 rkey;
+ __u8 type;
+ __u8 level;
+ } flush;
struct {
__aligned_u64 remote_addr;
__u32 rkey;
--
2.31.1
* [for-next PATCH v5 05/11] RDMA/rxe: Allow registering persistent flag for pmem MR only
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (3 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 04/11] RDMA/rxe: Extend rxe user " Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-10-28 17:53 ` Jason Gunthorpe
2022-09-27 5:53 ` [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush Li Zhijian
` (7 subsequent siblings)
12 siblings, 1 reply; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
A memory region can support at most two flush access flags:
IB_ACCESS_FLUSH_PERSISTENT and IB_ACCESS_FLUSH_GLOBAL.
However, we only allow users to register the persistent flush flag on a pmem
MR, which has the ability to persist data across power cycles.
Registering a persistent access flag on a non-pmem MR will therefore be rejected.
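The "whole MR must be pmem" rule amounts to folding a per-page pmem test over every page, as rxe_mr_init_user() does with vaddr_in_pmem(). A minimal sketch, where the boolean array is a stand-in for the umem scatterlist pages:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for vaddr_in_pmem()/region_intersects(): true when the
 * page at index i is backed by persistent memory. */
static bool page_is_pmem(const bool *pmem_map, size_t i)
{
	return pmem_map[i];
}

/* is_pmem starts true for a non-empty MR and stays true only if
 * *every* page is pmem, matching the v5 bugfix: an MR mixing PMEM
 * and DRAM must not be treated as pmem. */
static bool whole_mr_is_pmem(const bool *pmem_map, size_t npages)
{
	bool is_pmem = npages > 0;

	for (size_t i = 0; i < npages; i++)
		if (is_pmem)
			is_pmem = page_is_pmem(pmem_map, i);
	return is_pmem;
}
```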
CC: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5: make sure the whole MR is pmem
V4: set is_pmem more simple
V2: new scheme check is_pmem # Dan
update commit message, get rid of confusing ib_check_flush_access_flags() # Tom
---
drivers/infiniband/sw/rxe/rxe_mr.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 74a38d06332f..1da3ad5eba64 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -112,6 +112,13 @@ void rxe_mr_init_dma(int access, struct rxe_mr *mr)
mr->type = IB_MR_TYPE_DMA;
}
+static bool vaddr_in_pmem(char *vaddr)
+{
+ return REGION_INTERSECTS ==
+ region_intersects(virt_to_phys(vaddr), 1, IORESOURCE_MEM,
+ IORES_DESC_PERSISTENT_MEMORY);
+}
+
int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
int access, struct rxe_mr *mr)
{
@@ -122,6 +129,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
int num_buf;
void *vaddr;
int err;
+ bool is_pmem = false;
int i;
umem = ib_umem_get(&rxe->ib_dev, start, length, access);
@@ -149,6 +157,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
num_buf = 0;
map = mr->map;
if (length > 0) {
+ is_pmem = true;
buf = map[0]->buf;
for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
@@ -166,6 +175,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
goto err_cleanup_map;
}
+ /* True only if the *whole* MR is pmem */
+ if (is_pmem)
+ is_pmem = vaddr_in_pmem(vaddr);
+
buf->addr = (uintptr_t)vaddr;
buf->size = PAGE_SIZE;
num_buf++;
@@ -174,6 +187,12 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
}
}
+ if (!is_pmem && access & IB_ACCESS_FLUSH_PERSISTENT) {
+ pr_warn("Cannot register IB_ACCESS_FLUSH_PERSISTENT for non-pmem memory\n");
+ err = -EINVAL;
+ goto err_release_umem;
+ }
+
mr->umem = umem;
mr->access = access;
mr->offset = ib_umem_offset(umem);
--
2.31.1
* [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (4 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 05/11] RDMA/rxe: Allow registering persistent flag for pmem MR only Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-11-11 8:43 ` Yanjun Zhu
2022-09-27 5:53 ` [for-next PATCH v5 07/11] RDMA/rxe: Implement RC RDMA FLUSH service in requester side Li Zhijian
` (6 subsequent siblings)
12 siblings, 1 reply; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
Extend the rxe opcode tables, headers, helpers and constants to support
flush operations.
Refer to IBA A19.4.1 for more details of the FETH definition.
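The FETH bit layout added below (PLT in bits 3-0, SEL in bits 5-4) can be exercised with a host-order pack/unpack pair; the byte swapping that the real helpers perform with cpu_to_be32()/be32_to_cpu() is omitted here:

```c
#include <assert.h>
#include <stdint.h>

/* FETH field layout from this patch */
#define FETH_PLT_MASK  0x0000000fU	/* bits 3-0 */
#define FETH_SEL_MASK  0x00000030U	/* bits 5-4 */
#define FETH_SEL_SHIFT 4U

/* Host-order equivalent of feth_init(): pack the selectivity level
 * and placement type into the 32-bit FETH word. */
static uint32_t feth_pack(uint8_t type, uint8_t level)
{
	return (((uint32_t)level << FETH_SEL_SHIFT) & FETH_SEL_MASK) |
	       ((uint32_t)type & FETH_PLT_MASK);
}

/* cf. __feth_plt() */
static uint32_t feth_plt(uint32_t bits)
{
	return bits & FETH_PLT_MASK;
}

/* cf. __feth_sel() */
static uint32_t feth_sel(uint32_t bits)
{
	return (bits & FETH_SEL_MASK) >> FETH_SEL_SHIFT;
}
```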
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5: new FETH structure and simplify header helper
new names and new patch split scheme, suggested by Bob.
---
drivers/infiniband/sw/rxe/rxe_hdr.h | 47 ++++++++++++++++++++++++++
drivers/infiniband/sw/rxe/rxe_opcode.c | 17 ++++++++++
drivers/infiniband/sw/rxe/rxe_opcode.h | 16 +++++----
3 files changed, 74 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_hdr.h b/drivers/infiniband/sw/rxe/rxe_hdr.h
index e432f9e37795..e995a97c54fd 100644
--- a/drivers/infiniband/sw/rxe/rxe_hdr.h
+++ b/drivers/infiniband/sw/rxe/rxe_hdr.h
@@ -607,6 +607,52 @@ static inline void reth_set_len(struct rxe_pkt_info *pkt, u32 len)
rxe_opcode[pkt->opcode].offset[RXE_RETH], len);
}
+/******************************************************************************
+ * FLUSH Extended Transport Header
+ ******************************************************************************/
+
+struct rxe_feth {
+ __be32 bits;
+};
+
+#define FETH_PLT_MASK (0x0000000f) /* bits 3-0 */
+#define FETH_SEL_MASK (0x00000030) /* bits 5-4 */
+#define FETH_SEL_SHIFT (4U)
+
+static inline u32 __feth_plt(void *arg)
+{
+ struct rxe_feth *feth = arg;
+
+ return be32_to_cpu(feth->bits) & FETH_PLT_MASK;
+}
+
+static inline u32 __feth_sel(void *arg)
+{
+ struct rxe_feth *feth = arg;
+
+ return (be32_to_cpu(feth->bits) & FETH_SEL_MASK) >> FETH_SEL_SHIFT;
+}
+
+static inline u32 feth_plt(struct rxe_pkt_info *pkt)
+{
+ return __feth_plt(pkt->hdr + rxe_opcode[pkt->opcode].offset[RXE_FETH]);
+}
+
+static inline u32 feth_sel(struct rxe_pkt_info *pkt)
+{
+ return __feth_sel(pkt->hdr + rxe_opcode[pkt->opcode].offset[RXE_FETH]);
+}
+
+static inline void feth_init(struct rxe_pkt_info *pkt, u8 type, u8 level)
+{
+ struct rxe_feth *feth = (struct rxe_feth *)
+ (pkt->hdr + rxe_opcode[pkt->opcode].offset[RXE_FETH]);
+ u32 bits = ((level << FETH_SEL_SHIFT) & FETH_SEL_MASK) |
+ (type & FETH_PLT_MASK);
+
+ feth->bits = cpu_to_be32(bits);
+}
+
/******************************************************************************
* Atomic Extended Transport Header
******************************************************************************/
@@ -910,6 +956,7 @@ enum rxe_hdr_length {
RXE_ATMETH_BYTES = sizeof(struct rxe_atmeth),
RXE_IETH_BYTES = sizeof(struct rxe_ieth),
RXE_RDETH_BYTES = sizeof(struct rxe_rdeth),
+ RXE_FETH_BYTES = sizeof(struct rxe_feth),
};
static inline size_t header_size(struct rxe_pkt_info *pkt)
diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.c b/drivers/infiniband/sw/rxe/rxe_opcode.c
index d4ba4d506f17..55aad13e57bb 100644
--- a/drivers/infiniband/sw/rxe/rxe_opcode.c
+++ b/drivers/infiniband/sw/rxe/rxe_opcode.c
@@ -101,6 +101,12 @@ struct rxe_wr_opcode_info rxe_wr_opcode_info[] = {
[IB_QPT_UC] = WR_LOCAL_OP_MASK,
},
},
+ [IB_WR_FLUSH] = {
+ .name = "IB_WR_FLUSH",
+ .mask = {
+ [IB_QPT_RC] = WR_FLUSH_MASK,
+ },
+ },
};
struct rxe_opcode_info rxe_opcode[RXE_NUM_OPCODE] = {
@@ -378,6 +384,17 @@ struct rxe_opcode_info rxe_opcode[RXE_NUM_OPCODE] = {
RXE_IETH_BYTES,
}
},
+ [IB_OPCODE_RC_FLUSH] = {
+ .name = "IB_OPCODE_RC_FLUSH",
+ .mask = RXE_FETH_MASK | RXE_RETH_MASK | RXE_FLUSH_MASK |
+ RXE_START_MASK | RXE_END_MASK | RXE_REQ_MASK,
+ .length = RXE_BTH_BYTES + RXE_FETH_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_FETH] = RXE_BTH_BYTES,
+ [RXE_RETH] = RXE_BTH_BYTES + RXE_FETH_BYTES,
+ }
+ },
/* UC */
[IB_OPCODE_UC_SEND_FIRST] = {
diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.h b/drivers/infiniband/sw/rxe/rxe_opcode.h
index 8f9aaaf260f2..02d256745793 100644
--- a/drivers/infiniband/sw/rxe/rxe_opcode.h
+++ b/drivers/infiniband/sw/rxe/rxe_opcode.h
@@ -19,7 +19,8 @@ enum rxe_wr_mask {
WR_SEND_MASK = BIT(2),
WR_READ_MASK = BIT(3),
WR_WRITE_MASK = BIT(4),
- WR_LOCAL_OP_MASK = BIT(5),
+ WR_FLUSH_MASK = BIT(5),
+ WR_LOCAL_OP_MASK = BIT(6),
WR_READ_OR_WRITE_MASK = WR_READ_MASK | WR_WRITE_MASK,
WR_WRITE_OR_SEND_MASK = WR_WRITE_MASK | WR_SEND_MASK,
@@ -47,6 +48,7 @@ enum rxe_hdr_type {
RXE_RDETH,
RXE_DETH,
RXE_IMMDT,
+ RXE_FETH,
RXE_PAYLOAD,
NUM_HDR_TYPES
};
@@ -63,6 +65,7 @@ enum rxe_hdr_mask {
RXE_IETH_MASK = BIT(RXE_IETH),
RXE_RDETH_MASK = BIT(RXE_RDETH),
RXE_DETH_MASK = BIT(RXE_DETH),
+ RXE_FETH_MASK = BIT(RXE_FETH),
RXE_PAYLOAD_MASK = BIT(RXE_PAYLOAD),
RXE_REQ_MASK = BIT(NUM_HDR_TYPES + 0),
@@ -71,13 +74,14 @@ enum rxe_hdr_mask {
RXE_WRITE_MASK = BIT(NUM_HDR_TYPES + 3),
RXE_READ_MASK = BIT(NUM_HDR_TYPES + 4),
RXE_ATOMIC_MASK = BIT(NUM_HDR_TYPES + 5),
+ RXE_FLUSH_MASK = BIT(NUM_HDR_TYPES + 6),
- RXE_RWR_MASK = BIT(NUM_HDR_TYPES + 6),
- RXE_COMP_MASK = BIT(NUM_HDR_TYPES + 7),
+ RXE_RWR_MASK = BIT(NUM_HDR_TYPES + 7),
+ RXE_COMP_MASK = BIT(NUM_HDR_TYPES + 8),
- RXE_START_MASK = BIT(NUM_HDR_TYPES + 8),
- RXE_MIDDLE_MASK = BIT(NUM_HDR_TYPES + 9),
- RXE_END_MASK = BIT(NUM_HDR_TYPES + 10),
+ RXE_START_MASK = BIT(NUM_HDR_TYPES + 9),
+ RXE_MIDDLE_MASK = BIT(NUM_HDR_TYPES + 10),
+ RXE_END_MASK = BIT(NUM_HDR_TYPES + 11),
RXE_LOOPBACK_MASK = BIT(NUM_HDR_TYPES + 12),
--
2.31.1
* [for-next PATCH v5 07/11] RDMA/rxe: Implement RC RDMA FLUSH service in requester side
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (5 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 08/11] RDMA/rxe: Implement flush execution in responder side Li Zhijian
` (5 subsequent siblings)
12 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
Implement FLUSH request operation in the requester.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V4: Remove flush union for legacy API, add WR_FLUSH_MASK
V3: Fix sparse: incorrect type in assignment; Reported-by: kernel test robot <lkp@intel.com>
V2: extend flush to include length field.
---
drivers/infiniband/sw/rxe/rxe_req.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index f63771207970..5996b0e3177a 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -241,6 +241,9 @@ static int next_opcode_rc(struct rxe_qp *qp, u32 opcode, int fits)
IB_OPCODE_RC_SEND_ONLY_WITH_IMMEDIATE :
IB_OPCODE_RC_SEND_FIRST;
+ case IB_WR_FLUSH:
+ return IB_OPCODE_RC_FLUSH;
+
case IB_WR_RDMA_READ:
return IB_OPCODE_RC_RDMA_READ_REQUEST;
@@ -421,11 +424,18 @@ static struct sk_buff *init_req_packet(struct rxe_qp *qp,
/* init optional headers */
if (pkt->mask & RXE_RETH_MASK) {
- reth_set_rkey(pkt, ibwr->wr.rdma.rkey);
+ if (pkt->mask & RXE_FETH_MASK)
+ reth_set_rkey(pkt, ibwr->wr.flush.rkey);
+ else
+ reth_set_rkey(pkt, ibwr->wr.rdma.rkey);
reth_set_va(pkt, wqe->iova);
reth_set_len(pkt, wqe->dma.resid);
}
+ /* Fill Flush Extension Transport Header */
+ if (pkt->mask & RXE_FETH_MASK)
+ feth_init(pkt, ibwr->wr.flush.type, ibwr->wr.flush.level);
+
if (pkt->mask & RXE_IMMDT_MASK)
immdt_set_imm(pkt, ibwr->ex.imm_data);
@@ -484,6 +494,9 @@ static int finish_packet(struct rxe_qp *qp, struct rxe_av *av,
memset(pad, 0, bth_pad(pkt));
}
+ } else if (pkt->mask & RXE_FLUSH_MASK) {
+ /* oA19-2: shall have no payload. */
+ wqe->dma.resid = 0;
}
return 0;
--
2.31.1
* [for-next PATCH v5 08/11] RDMA/rxe: Implement flush execution in responder side
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (6 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 07/11] RDMA/rxe: Implement RC RDMA FLUSH service in requester side Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 09/11] RDMA/rxe: Implement flush completion Li Zhijian
` (4 subsequent siblings)
12 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
Only the requested placement types that are also registered in the
destination memory region are acceptable.
Otherwise, the responder replies with NAK "Remote Access Error" when it
finds a placement type violation.
We persist data via arch_wb_cache_pmem(), which may be architecture
specific.
This commit also adds two helpers to update qp.resp from the incoming packet.
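The responder's placement-type check reduces to a subset test between the PLT bits carried in the FETH and the flush access bits registered on the destination MR. A hedged sketch, with the MR's flush access bits shifted down to align with the PLT bits for illustration (in the kernel they live at bits 8 and 9 of mr->access):

```c
#include <assert.h>
#include <stdbool.h>

/* Placement types from IBA A19.4.1.1 / the uapi header */
#define IB_FLUSH_GLOBAL      (1U << 0)
#define IB_FLUSH_PERSISTENT  (1U << 1)

/* Illustrative MR-side flush access bits, aligned with the PLT bits */
#define MR_FLUSH_GLOBAL      (1U << 0)
#define MR_FLUSH_PERSISTENT  (1U << 1)

/* Responder-side rule: every placement type requested in the FETH
 * must also be registered on the destination MR; otherwise the
 * responder NAKs with "Remote Access Error". */
static bool placement_type_ok(unsigned int req_plt,
			      unsigned int mr_flush_access)
{
	return (req_plt & mr_flush_access) == req_plt;
}
```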
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
v5: add QP attr check for flush access
rename flush_nvdimm_iova -> rxe_flush_pmem_iova()
v4: add send_read_response_ack and flush resource
---
drivers/infiniband/sw/rxe/rxe_loc.h | 1 +
drivers/infiniband/sw/rxe/rxe_mr.c | 60 +++++++++
drivers/infiniband/sw/rxe/rxe_resp.c | 180 ++++++++++++++++++++++----
drivers/infiniband/sw/rxe/rxe_verbs.h | 6 +
4 files changed, 225 insertions(+), 22 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index c2a5c8814a48..944d564a11cd 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -68,6 +68,7 @@ void rxe_mr_init_dma(int access, struct rxe_mr *mr);
int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
int access, struct rxe_mr *mr);
int rxe_mr_init_fast(int max_pages, struct rxe_mr *mr);
+int rxe_flush_pmem_iova(struct rxe_mr *mr, u64 iova, int length);
int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
enum rxe_mr_copy_dir dir);
int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info *dma,
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 1da3ad5eba64..fa7e71074233 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -4,6 +4,8 @@
* Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
*/
+#include <linux/libnvdimm.h>
+
#include "rxe.h"
#include "rxe_loc.h"
@@ -305,6 +307,64 @@ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
return addr;
}
+int rxe_flush_pmem_iova(struct rxe_mr *mr, u64 iova, int length)
+{
+ int err;
+ int bytes;
+ u8 *va;
+ struct rxe_map **map;
+ struct rxe_phys_buf *buf;
+ int m;
+ int i;
+ size_t offset;
+
+ if (length == 0)
+ return 0;
+
+ if (mr->type == IB_MR_TYPE_DMA) {
+ err = -EFAULT;
+ goto err1;
+ }
+
+ err = mr_check_range(mr, iova, length);
+ if (err) {
+ err = -EFAULT;
+ goto err1;
+ }
+
+ lookup_iova(mr, iova, &m, &i, &offset);
+
+ map = mr->map + m;
+ buf = map[0]->buf + i;
+
+ while (length > 0) {
+ va = (u8 *)(uintptr_t)buf->addr + offset;
+ bytes = buf->size - offset;
+
+ if (bytes > length)
+ bytes = length;
+
+ arch_wb_cache_pmem(va, bytes);
+
+ length -= bytes;
+
+ offset = 0;
+ buf++;
+ i++;
+
+ if (i == RXE_BUF_PER_MAP) {
+ i = 0;
+ map++;
+ buf = map[0]->buf;
+ }
+ }
+
+ return 0;
+
+err1:
+ return err;
+}
+
/* copy data from a range (vaddr, vaddr+length-1) to or from
* a mr object starting at iova.
*/
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index ed5a09e86417..0b68e5d8e1d2 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -22,6 +22,7 @@ enum resp_states {
RESPST_EXECUTE,
RESPST_READ_REPLY,
RESPST_ATOMIC_REPLY,
+ RESPST_PROCESS_FLUSH,
RESPST_COMPLETE,
RESPST_ACKNOWLEDGE,
RESPST_CLEANUP,
@@ -57,6 +58,7 @@ static char *resp_state_name[] = {
[RESPST_EXECUTE] = "EXECUTE",
[RESPST_READ_REPLY] = "READ_REPLY",
[RESPST_ATOMIC_REPLY] = "ATOMIC_REPLY",
+ [RESPST_PROCESS_FLUSH] = "PROCESS_FLUSH",
[RESPST_COMPLETE] = "COMPLETE",
[RESPST_ACKNOWLEDGE] = "ACKNOWLEDGE",
[RESPST_CLEANUP] = "CLEANUP",
@@ -253,19 +255,38 @@ static enum resp_states check_op_seq(struct rxe_qp *qp,
}
}
+static bool check_qp_attr_access(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ if (((pkt->mask & RXE_READ_MASK) &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_READ)) ||
+ ((pkt->mask & RXE_WRITE_MASK) &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_WRITE)) ||
+ ((pkt->mask & RXE_ATOMIC_MASK) &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_ATOMIC))) {
+ return false;
+ }
+
+ if (pkt->mask & RXE_FLUSH_MASK) {
+ u32 flush_type = feth_plt(pkt);
+
+ if ((flush_type & IB_FLUSH_GLOBAL &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_FLUSH_GLOBAL)) ||
+ (flush_type & IB_FLUSH_PERSISTENT &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_FLUSH_PERSISTENT)))
+ return false;
+ }
+
+ return true;
+}
+
static enum resp_states check_op_valid(struct rxe_qp *qp,
struct rxe_pkt_info *pkt)
{
switch (qp_type(qp)) {
case IB_QPT_RC:
- if (((pkt->mask & RXE_READ_MASK) &&
- !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_READ)) ||
- ((pkt->mask & RXE_WRITE_MASK) &&
- !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_WRITE)) ||
- ((pkt->mask & RXE_ATOMIC_MASK) &&
- !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_ATOMIC))) {
+ if (!check_qp_attr_access(qp, pkt))
return RESPST_ERR_UNSUPPORTED_OPCODE;
- }
break;
@@ -402,6 +423,23 @@ static enum resp_states check_length(struct rxe_qp *qp,
}
}
+static void qp_resp_from_reth(struct rxe_qp *qp, struct rxe_pkt_info *pkt)
+{
+ qp->resp.va = reth_va(pkt);
+ qp->resp.offset = 0;
+ qp->resp.rkey = reth_rkey(pkt);
+ qp->resp.resid = reth_len(pkt);
+ qp->resp.length = reth_len(pkt);
+}
+
+static void qp_resp_from_atmeth(struct rxe_qp *qp, struct rxe_pkt_info *pkt)
+{
+ qp->resp.va = atmeth_va(pkt);
+ qp->resp.offset = 0;
+ qp->resp.rkey = atmeth_rkey(pkt);
+ qp->resp.resid = sizeof(u64);
+}
+
static enum resp_states check_rkey(struct rxe_qp *qp,
struct rxe_pkt_info *pkt)
{
@@ -413,23 +451,26 @@ static enum resp_states check_rkey(struct rxe_qp *qp,
u32 pktlen;
int mtu = qp->mtu;
enum resp_states state;
- int access;
+ int access = 0;
if (pkt->mask & RXE_READ_OR_WRITE_MASK) {
- if (pkt->mask & RXE_RETH_MASK) {
- qp->resp.va = reth_va(pkt);
- qp->resp.offset = 0;
- qp->resp.rkey = reth_rkey(pkt);
- qp->resp.resid = reth_len(pkt);
- qp->resp.length = reth_len(pkt);
- }
+ if (pkt->mask & RXE_RETH_MASK)
+ qp_resp_from_reth(qp, pkt);
+
access = (pkt->mask & RXE_READ_MASK) ? IB_ACCESS_REMOTE_READ
: IB_ACCESS_REMOTE_WRITE;
+ } else if (pkt->mask & RXE_FLUSH_MASK) {
+ u32 flush_type = feth_plt(pkt);
+
+ if (pkt->mask & RXE_RETH_MASK)
+ qp_resp_from_reth(qp, pkt);
+
+ if (flush_type & IB_FLUSH_GLOBAL)
+ access |= IB_ACCESS_FLUSH_GLOBAL;
+ if (flush_type & IB_FLUSH_PERSISTENT)
+ access |= IB_ACCESS_FLUSH_PERSISTENT;
} else if (pkt->mask & RXE_ATOMIC_MASK) {
- qp->resp.va = atmeth_va(pkt);
- qp->resp.offset = 0;
- qp->resp.rkey = atmeth_rkey(pkt);
- qp->resp.resid = sizeof(u64);
+ qp_resp_from_atmeth(qp, pkt);
access = IB_ACCESS_REMOTE_ATOMIC;
} else {
return RESPST_EXECUTE;
@@ -450,7 +491,7 @@ static enum resp_states check_rkey(struct rxe_qp *qp,
if (rkey_is_mw(rkey)) {
mw = rxe_lookup_mw(qp, access, rkey);
if (!mw) {
- pr_err("%s: no MW matches rkey %#x\n",
+ pr_debug("%s: no MW matches rkey %#x\n",
__func__, rkey);
state = RESPST_ERR_RKEY_VIOLATION;
goto err;
@@ -458,7 +499,7 @@ static enum resp_states check_rkey(struct rxe_qp *qp,
mr = mw->mr;
if (!mr) {
- pr_debug("%s: MW doesn't have an MR\n", __func__);
+ pr_err("%s: MW doesn't have an MR\n", __func__);
state = RESPST_ERR_RKEY_VIOLATION;
goto err;
}
@@ -478,12 +519,21 @@ static enum resp_states check_rkey(struct rxe_qp *qp,
}
}
+ if (pkt->mask & RXE_FLUSH_MASK) {
+ /* FLUSH MR may not set va or resid
+ * no need to check range since we will flush whole mr
+ */
+ if (feth_sel(pkt) == IB_FLUSH_MR)
+ goto skip_check_range;
+ }
+
if (mr_check_range(mr, va + qp->resp.offset, resid)) {
state = RESPST_ERR_RKEY_VIOLATION;
goto err;
}
- if (pkt->mask & RXE_WRITE_MASK) {
+skip_check_range:
+ if (pkt->mask & RXE_WRITE_MASK) {
if (resid > mtu) {
if (pktlen != mtu || bth_pad(pkt)) {
state = RESPST_ERR_LENGTH;
@@ -587,11 +637,61 @@ static struct resp_res *rxe_prepare_res(struct rxe_qp *qp,
res->last_psn = pkt->psn;
res->cur_psn = pkt->psn;
break;
+ case RXE_FLUSH_MASK:
+ res->flush.va = qp->resp.va + qp->resp.offset;
+ res->flush.length = qp->resp.length;
+ res->flush.type = feth_plt(pkt);
+ res->flush.level = feth_sel(pkt);
}
return res;
}
+static enum resp_states process_flush(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ u64 length, start;
+ struct rxe_mr *mr = qp->resp.mr;
+ struct resp_res *res = qp->resp.res;
+
+ /* oA19-14, oA19-15 */
+ if (res && res->replay)
+ return RESPST_ACKNOWLEDGE;
+ else if (!res) {
+ res = rxe_prepare_res(qp, pkt, RXE_FLUSH_MASK);
+ qp->resp.res = res;
+ }
+
+ if (res->flush.level == IB_FLUSH_RANGE) {
+ start = res->flush.va;
+ length = res->flush.length;
+ } else { /* level == IB_FLUSH_MR */
+ start = mr->ibmr.iova;
+ length = mr->ibmr.length;
+ }
+
+ if (res->flush.type & IB_FLUSH_PERSISTENT) {
+ if (rxe_flush_pmem_iova(mr, start, length))
+ return RESPST_ERR_RKEY_VIOLATION;
+ /* Make data persistent. */
+ wmb();
+ } else if (res->flush.type & IB_FLUSH_GLOBAL) {
+ /* Make data globally visible. */
+ wmb();
+ }
+
+ qp->resp.msn++;
+
+ /* next expected psn, read handles this separately */
+ qp->resp.psn = (pkt->psn + 1) & BTH_PSN_MASK;
+ qp->resp.ack_psn = qp->resp.psn;
+
+ qp->resp.opcode = pkt->opcode;
+ qp->resp.status = IB_WC_SUCCESS;
+
+ return RESPST_ACKNOWLEDGE;
+}
+
/* Guarantee atomicity of atomic operations at the machine level. */
static DEFINE_SPINLOCK(atomic_ops_lock);
@@ -888,6 +988,8 @@ static enum resp_states execute(struct rxe_qp *qp, struct rxe_pkt_info *pkt)
return RESPST_READ_REPLY;
} else if (pkt->mask & RXE_ATOMIC_MASK) {
return RESPST_ATOMIC_REPLY;
+ } else if (pkt->mask & RXE_FLUSH_MASK) {
+ return RESPST_PROCESS_FLUSH;
} else {
/* Unreachable */
WARN_ON_ONCE(1);
@@ -1061,6 +1163,19 @@ static int send_atomic_ack(struct rxe_qp *qp, u8 syndrome, u32 psn)
return ret;
}
+static int send_read_response_ack(struct rxe_qp *qp, u8 syndrome, u32 psn)
+{
+ int ret = send_common_ack(qp, syndrome, psn,
+ IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY,
+ "RDMA READ response of length zero ACK");
+
+ /* have to clear this since it is used to trigger
+ * long read replies
+ */
+ qp->resp.res = NULL;
+ return ret;
+}
+
static enum resp_states acknowledge(struct rxe_qp *qp,
struct rxe_pkt_info *pkt)
{
@@ -1071,6 +1186,8 @@ static enum resp_states acknowledge(struct rxe_qp *qp,
send_ack(qp, qp->resp.aeth_syndrome, pkt->psn);
else if (pkt->mask & RXE_ATOMIC_MASK)
send_atomic_ack(qp, AETH_ACK_UNLIMITED, pkt->psn);
+ else if (pkt->mask & RXE_FLUSH_MASK)
+ send_read_response_ack(qp, AETH_ACK_UNLIMITED, pkt->psn);
else if (bth_ack(pkt))
send_ack(qp, AETH_ACK_UNLIMITED, pkt->psn);
@@ -1127,6 +1244,22 @@ static enum resp_states duplicate_request(struct rxe_qp *qp,
/* SEND. Ack again and cleanup. C9-105. */
send_ack(qp, AETH_ACK_UNLIMITED, prev_psn);
return RESPST_CLEANUP;
+ } else if (pkt->mask & RXE_FLUSH_MASK) {
+ struct resp_res *res;
+
+ /* Find the operation in our list of responder resources. */
+ res = find_resource(qp, pkt->psn);
+ if (res) {
+ res->replay = 1;
+ res->cur_psn = pkt->psn;
+ qp->resp.res = res;
+ rc = RESPST_PROCESS_FLUSH;
+ goto out;
+ }
+
+ /* Resource not found. Class D error. Drop the request. */
+ rc = RESPST_CLEANUP;
+ goto out;
} else if (pkt->mask & RXE_READ_MASK) {
struct resp_res *res;
@@ -1320,6 +1453,9 @@ int rxe_responder(void *arg)
case RESPST_ATOMIC_REPLY:
state = atomic_reply(qp, pkt);
break;
+ case RESPST_PROCESS_FLUSH:
+ state = process_flush(qp, pkt);
+ break;
case RESPST_ACKNOWLEDGE:
state = acknowledge(qp, pkt);
break;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 5f5cbfcb3569..4cfe4d8b0aaa 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -165,6 +165,12 @@ struct resp_res {
u64 va;
u32 resid;
} read;
+ struct {
+ u32 length;
+ u64 va;
+ u8 type;
+ u8 level;
+ } flush;
};
};
--
2.31.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [for-next PATCH v5 09/11] RDMA/rxe: Implement flush completion
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (7 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 08/11] RDMA/rxe: Implement flush execution in responder side Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 10/11] RDMA/cm: Make QP FLUSHABLE Li Zhijian
` (3 subsequent siblings)
12 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
Per the IBA spec, FLUSH is acknowledged with an RDMA READ response of
length zero.
Use the IB_WC_FLUSH (aka IB_UVERBS_WC_FLUSH) opcode to report a FLUSH
completion to userspace.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_comp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index fb0c008af78c..2dea786e20ad 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -104,6 +104,7 @@ static enum ib_wc_opcode wr_to_wc_opcode(enum ib_wr_opcode opcode)
case IB_WR_LOCAL_INV: return IB_WC_LOCAL_INV;
case IB_WR_REG_MR: return IB_WC_REG_MR;
case IB_WR_BIND_MW: return IB_WC_BIND_MW;
+ case IB_WR_FLUSH: return IB_WC_FLUSH;
default:
return 0xff;
@@ -263,7 +264,8 @@ static inline enum comp_state check_ack(struct rxe_qp *qp,
*/
case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:
if (wqe->wr.opcode != IB_WR_RDMA_READ &&
- wqe->wr.opcode != IB_WR_RDMA_READ_WITH_INV) {
+ wqe->wr.opcode != IB_WR_RDMA_READ_WITH_INV &&
+ wqe->wr.opcode != IB_WR_FLUSH) {
wqe->status = IB_WC_FATAL_ERR;
return COMPST_ERROR;
}
--
2.31.1
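The two rxe_comp.c changes in this patch can be modeled as a pair of small helpers: FLUSH work requests complete with a FLUSH work completion opcode, and a READ response packet now also matches an outstanding FLUSH WQE, since FLUSH is acked by a zero-length READ response. A userspace sketch (illustrative enum values, not the kernel/uverbs ABI numbers):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative opcode values -- not the kernel ABI values. */
enum wr_op { WR_RDMA_READ, WR_RDMA_READ_WITH_INV, WR_RDMA_WRITE, WR_FLUSH };
enum wc_op { WC_RDMA_READ, WC_FLUSH, WC_UNKNOWN = 0xff };

/* Model of the wr_to_wc_opcode() addition: IB_WR_FLUSH -> IB_WC_FLUSH. */
static enum wc_op wr_to_wc(enum wr_op op)
{
	switch (op) {
	case WR_RDMA_READ:
	case WR_RDMA_READ_WITH_INV:
		return WC_RDMA_READ;
	case WR_FLUSH:
		return WC_FLUSH;
	default:
		return WC_UNKNOWN;
	}
}

/* Model of the check_ack() change: a READ-response opcode on the wire is
 * legal when the outstanding WQE is a READ, READ_WITH_INV, or (now) FLUSH. */
static bool read_response_matches_wqe(enum wr_op wqe_op)
{
	return wqe_op == WR_RDMA_READ ||
	       wqe_op == WR_RDMA_READ_WITH_INV ||
	       wqe_op == WR_FLUSH;
}
```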
* [for-next PATCH v5 10/11] RDMA/cm: Make QP FLUSHABLE
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (8 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 09/11] RDMA/rxe: Implement flush completion Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 11/11] RDMA/rxe: Enable RDMA FLUSH capability for rxe device Li Zhijian
` (2 subsequent siblings)
12 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
Enable the flushable access flag for the QP.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5: new patch, inspired by Bob
---
drivers/infiniband/core/cm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 1f9938a2c475..58837aac980b 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -4096,7 +4096,8 @@ static int cm_init_qp_init_attr(struct cm_id_private *cm_id_priv,
qp_attr->qp_access_flags = IB_ACCESS_REMOTE_WRITE;
if (cm_id_priv->responder_resources)
qp_attr->qp_access_flags |= IB_ACCESS_REMOTE_READ |
- IB_ACCESS_REMOTE_ATOMIC;
+ IB_ACCESS_REMOTE_ATOMIC |
+ IB_ACCESS_FLUSHABLE;
qp_attr->pkey_index = cm_id_priv->av.pkey_index;
if (cm_id_priv->av.port)
qp_attr->port_num = cm_id_priv->av.port->port_num;
--
2.31.1
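The cm.c change above can be modeled as a small flag-selection helper: when the connection has responder resources, the QP's access flags also include remote read, remote atomic, and (newly) flush access. Bit values below are illustrative, not the kernel's:

```c
#include <assert.h>

/* Illustrative access bits -- not the ib_access_flags values. */
#define ACC_REMOTE_WRITE  (1u << 0)
#define ACC_REMOTE_READ   (1u << 1)
#define ACC_REMOTE_ATOMIC (1u << 2)
#define ACC_FLUSHABLE     (1u << 3)

/* Model of cm_init_qp_init_attr()'s qp_access_flags selection. */
static unsigned int init_qp_access_flags(int responder_resources)
{
	unsigned int flags = ACC_REMOTE_WRITE;

	if (responder_resources)
		flags |= ACC_REMOTE_READ | ACC_REMOTE_ATOMIC | ACC_FLUSHABLE;
	return flags;
}
```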
* [for-next PATCH v5 11/11] RDMA/rxe: Enable RDMA FLUSH capability for rxe device
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (9 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 10/11] RDMA/cm: Make QP FLUSHABLE Li Zhijian
@ 2022-09-27 5:53 ` Li Zhijian
2022-10-28 17:44 ` [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Jason Gunthorpe
2022-10-28 17:57 ` Jason Gunthorpe
12 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-09-27 5:53 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel, Li Zhijian
Now we are ready to enable the RDMA FLUSH capability for RXE.
It supports the Global Visibility and Persistence placement types.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_param.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
index 86c7a8bf3cbb..c7a82823a041 100644
--- a/drivers/infiniband/sw/rxe/rxe_param.h
+++ b/drivers/infiniband/sw/rxe/rxe_param.h
@@ -51,7 +51,9 @@ enum rxe_device_param {
| IB_DEVICE_SRQ_RESIZE
| IB_DEVICE_MEM_MGT_EXTENSIONS
| IB_DEVICE_MEM_WINDOW
- | IB_DEVICE_MEM_WINDOW_TYPE_2B,
+ | IB_DEVICE_MEM_WINDOW_TYPE_2B
+ | IB_DEVICE_FLUSH_GLOBAL
+ | IB_DEVICE_FLUSH_PERSISTENT,
RXE_MAX_SGE = 32,
RXE_MAX_WQE_SIZE = sizeof(struct rxe_send_wqe) +
sizeof(struct ib_sge) * RXE_MAX_SGE,
--
2.31.1
* Re: [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs ABI to support flush
2022-09-27 5:53 ` [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs " Li Zhijian
@ 2022-09-29 6:21 ` Li Zhijian
2022-09-30 18:04 ` Jason Gunthorpe
2022-10-28 17:44 ` Jason Gunthorpe
1 sibling, 1 reply; 31+ messages in thread
From: Li Zhijian @ 2022-09-29 6:21 UTC (permalink / raw)
To: Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel
Leon, Jason
On 27/09/2022 13:53, Li Zhijian wrote:
> /*
> @@ -4321,6 +4330,8 @@ int ib_dealloc_xrcd_user(struct ib_xrcd *xrcd, struct ib_udata *udata);
> static inline int ib_check_mr_access(struct ib_device *ib_dev,
> unsigned int flags)
> {
> + u64 device_cap = ib_dev->attrs.device_cap_flags;
> +
> /*
> * Local write permission is required if remote write or
> * remote atomic permission is also requested.
> @@ -4335,6 +4346,13 @@ static inline int ib_check_mr_access(struct ib_device *ib_dev,
> if (flags & IB_ACCESS_ON_DEMAND &&
> !(ib_dev->attrs.kernel_cap_flags & IBK_ON_DEMAND_PAGING))
> return -EINVAL;
> +
> + if ((flags & IB_ACCESS_FLUSH_GLOBAL &&
> + !(device_cap & IB_DEVICE_FLUSH_GLOBAL)) ||
> + (flags & IB_ACCESS_FLUSH_PERSISTENT &&
> + !(device_cap & IB_DEVICE_FLUSH_PERSISTENT)))
> + return -EINVAL;
> +
Regarding the return value of ib_check_mr_access: while updating the man page of ibv_reg_mr(3) in rdma-core,
```
IBV_ACCESS_REMOTE_READ Enable Remote Read Access
IBV_ACCESS_REMOTE_ATOMIC Enable Remote Atomic Operation Access (if supported)
IBV_ACCESS_MW_BIND Enable Memory Window Binding
IBV_ACCESS_ZERO_BASED Use byte offset from beginning of MR to access this MR, instead of a pointer address
IBV_ACCESS_ON_DEMAND Create an on-demand paging MR (if supported)
...
RETURN VALUE
ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr() returns a pointer to the registered MR, or NULL if the request fails. The local key (L_Key) field lkey is used as the lkey field of struct ibv_sge when posting
buffers with ibv_post_* verbs, and the the remote key (R_Key) field rkey is used by remote processes to perform Atomic and RDMA operations. The remote process places this rkey as the rkey field of struct ibv_send_wr
passed to the ibv_post_send function.
```
we can see that IBV_ACCESS_REMOTE_ATOMIC and IBV_ACCESS_ON_DEMAND are tagged "if supported", but currently the kernel
just returns EINVAL when a user registers an MR with IB_ACCESS_ON_DEMAND on RXE.
I wonder whether we should return -EOPNOTSUPP if the device doesn't support the requested capabilities.
Thanks
Li
> return 0;
> }
>
* Re: [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs ABI to support flush
2022-09-29 6:21 ` Li Zhijian
@ 2022-09-30 18:04 ` Jason Gunthorpe
0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2022-09-30 18:04 UTC (permalink / raw)
To: Li Zhijian
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On Thu, Sep 29, 2022 at 02:21:24PM +0800, Li Zhijian wrote:
> we can see, IBV_ACCESS_REMOTE_ATOMIC and IBV_ACCESS_ON_DEMAND are
> tagged "if supported" . but currently kernel just returns EINVAL
> when user registers a MR with IB_ACCESS_ON_DEMAND to RXE.
>
> I wonder we should return -EOPNOTSUPP if the device doesn't support requested capabilities
Yes, unsupported combinations of access flags should trigger
EOPNOTSUPP
Jason
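The agreement above — unsupported access-flag combinations should return -EOPNOTSUPP rather than -EINVAL — can be sketched as follows (bit values are illustrative, not the kernel's; EOPNOTSUPP is 95 on Linux):

```c
#include <assert.h>

/* Illustrative flag bits -- not the ib_access_flags / device_cap values. */
#define ACCESS_FLUSH_GLOBAL     (1u << 0)
#define ACCESS_FLUSH_PERSISTENT (1u << 1)
#define DEV_FLUSH_GLOBAL        (1u << 0)
#define DEV_FLUSH_PERSISTENT    (1u << 1)
#define ERR_OPNOTSUPP 95 /* Linux's EOPNOTSUPP */

/* Model of the ib_check_mr_access() hunk: reject flush access flags the
 * device does not advertise, with -EOPNOTSUPP. */
static int check_mr_access(unsigned int device_cap, unsigned int flags)
{
	if ((flags & ACCESS_FLUSH_GLOBAL &&
	     !(device_cap & DEV_FLUSH_GLOBAL)) ||
	    (flags & ACCESS_FLUSH_PERSISTENT &&
	     !(device_cap & DEV_FLUSH_PERSISTENT)))
		return -ERR_OPNOTSUPP;
	return 0;
}
```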
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (10 preceding siblings ...)
2022-09-27 5:53 ` [for-next PATCH v5 11/11] RDMA/rxe: Enable RDMA FLUSH capability for rxe device Li Zhijian
@ 2022-10-28 17:44 ` Jason Gunthorpe
2022-10-28 17:57 ` Jason Gunthorpe
12 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2022-10-28 17:44 UTC (permalink / raw)
To: Li Zhijian
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On Tue, Sep 27, 2022 at 01:53:26PM +0800, Li Zhijian wrote:
> Hey folks,
>
> Firstly i want to say thank you to all you guys, especially Bob, who in the
> past 1+ month, gave me a lots of idea and inspiration.
>
> With the your help, some changes are make in 5th version, such as:
> - new names and new patch split schemem, suggested by Bob
> - bugfix: set is_pmem true only if the whole MR is pmem. it's possible the
> one MR container both PMEM and DRAM.
> - introduce feth structure, instead of u32
> - new bugfix to rxe_lookup_mw() and lookup_mr(), see (RDMA/rxe: make sure requested access is a subset of {mr,mw}->access),
> with this fix, we remove check_placement_type(), lookup_mr() has done the such check.
> - Enable QP attr flushable
> These change logs also appear in the patch it belongs to.
>
> These patches are going to implement a *NEW* RDMA opcode "RDMA FLUSH".
> In IB SPEC 1.5[1], 2 new opcodes, ATOMIC WRITE and RDMA FLUSH were
> added in the MEMORY PLACEMENT EXTENSIONS section.
This doesn't apply anymore, I did try to fix it, but it ended up not
compiling, so it is better if you handle it and repost.
Thanks,
Jason
* Re: [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs ABI to support flush
2022-09-27 5:53 ` [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs " Li Zhijian
2022-09-29 6:21 ` Li Zhijian
@ 2022-10-28 17:44 ` Jason Gunthorpe
2022-10-29 3:15 ` Li Zhijian
1 sibling, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2022-10-28 17:44 UTC (permalink / raw)
To: Li Zhijian
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On Tue, Sep 27, 2022 at 01:53:29PM +0800, Li Zhijian wrote:
> @@ -4321,6 +4330,8 @@ int ib_dealloc_xrcd_user(struct ib_xrcd *xrcd, struct ib_udata *udata);
> static inline int ib_check_mr_access(struct ib_device *ib_dev,
> unsigned int flags)
> {
> + u64 device_cap = ib_dev->attrs.device_cap_flags;
> +
> /*
> * Local write permission is required if remote write or
> * remote atomic permission is also requested.
> @@ -4335,6 +4346,13 @@ static inline int ib_check_mr_access(struct ib_device *ib_dev,
> if (flags & IB_ACCESS_ON_DEMAND &&
> !(ib_dev->attrs.kernel_cap_flags & IBK_ON_DEMAND_PAGING))
> return -EINVAL;
> +
> + if ((flags & IB_ACCESS_FLUSH_GLOBAL &&
> + !(device_cap & IB_DEVICE_FLUSH_GLOBAL)) ||
> + (flags & IB_ACCESS_FLUSH_PERSISTENT &&
> + !(device_cap & IB_DEVICE_FLUSH_PERSISTENT)))
> + return -EINVAL;
This should be -EOPNOTSUPP as the above is changed to in for-next
Jason
* Re: [for-next PATCH v5 01/11] RDMA/rxe: make sure requested access is a subset of {mr,mw}->access
2022-09-27 5:53 ` [for-next PATCH v5 01/11] RDMA/rxe: make sure requested access is a subset of {mr,mw}->access Li Zhijian
@ 2022-10-28 17:45 ` Jason Gunthorpe
0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2022-10-28 17:45 UTC (permalink / raw)
To: Li Zhijian
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On Tue, Sep 27, 2022 at 01:53:27PM +0800, Li Zhijian wrote:
> We should reject the requests with access flags that is not registered
> by MR/MW. For example, lookup_mr() should return NULL when requested access
> is 0x03 and mr->access is 0x01.
>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_mr.c | 2 +-
> drivers/infiniband/sw/rxe/rxe_mw.c | 3 +--
> 2 files changed, 2 insertions(+), 3 deletions(-)
I'm going to apply this little bug fix to for-next
Thanks,
Jason
* Re: [for-next PATCH v5 05/11] RDMA/rxe: Allow registering persistent flag for pmem MR only
2022-09-27 5:53 ` [for-next PATCH v5 05/11] RDMA/rxe: Allow registering persistent flag for pmem MR only Li Zhijian
@ 2022-10-28 17:53 ` Jason Gunthorpe
2022-10-30 3:33 ` Li Zhijian
0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2022-10-28 17:53 UTC (permalink / raw)
To: Li Zhijian
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On Tue, Sep 27, 2022 at 01:53:31PM +0800, Li Zhijian wrote:
> @@ -122,6 +129,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
> int num_buf;
> void *vaddr;
> int err;
> + bool is_pmem = false;
> int i;
>
> umem = ib_umem_get(&rxe->ib_dev, start, length, access);
> @@ -149,6 +157,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
> num_buf = 0;
> map = mr->map;
> if (length > 0) {
> + is_pmem = true;
> buf = map[0]->buf;
>
> for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
> @@ -166,6 +175,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
> goto err_cleanup_map;
> }
>
> + /* True only if the *whole* MR is pmem */
> + if (is_pmem)
> + is_pmem = vaddr_in_pmem(vaddr);
> +
I'm not so keen on this use of resources, but this should be written more
like
phys = page_to_phys(sg_page_iter_page(&sg_iter))
region_intersects(phys + sg_iter->offset, sg_iter->length,.. )
And you understand this will make memory registration of every RXE
user a bit slower? And actual pmem will be painfully slow.
It seems like we are doing something wrong here..
> @@ -174,6 +187,12 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
> }
> }
>
> + if (!is_pmem && access & IB_ACCESS_FLUSH_PERSISTENT) {
> + pr_warn("Cannot register IB_ACCESS_FLUSH_PERSISTENT for non-pmem memory\n");
> + err = -EINVAL;
> + goto err_release_umem;
> + }
Do not pr_warn on syscall paths
Jason
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
` (11 preceding siblings ...)
2022-10-28 17:44 ` [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Jason Gunthorpe
@ 2022-10-28 17:57 ` Jason Gunthorpe
2022-11-11 2:49 ` Yanjun Zhu
12 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2022-10-28 17:57 UTC (permalink / raw)
To: Li Zhijian
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On Tue, Sep 27, 2022 at 01:53:26PM +0800, Li Zhijian wrote:
> Hey folks,
>
> Firstly i want to say thank you to all you guys, especially Bob, who in the
> past 1+ month, gave me a lots of idea and inspiration.
I would like it if someone familiar with rxe could review the
protocol parts.
Jason
* Re: [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs ABI to support flush
2022-10-28 17:44 ` Jason Gunthorpe
@ 2022-10-29 3:15 ` Li Zhijian
0 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-10-29 3:15 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On 29/10/2022 01:44, Jason Gunthorpe wrote:
> On Tue, Sep 27, 2022 at 01:53:29PM +0800, Li Zhijian wrote:
>> @@ -4321,6 +4330,8 @@ int ib_dealloc_xrcd_user(struct ib_xrcd *xrcd, struct ib_udata *udata);
>> static inline int ib_check_mr_access(struct ib_device *ib_dev,
>> unsigned int flags)
>> {
>> + u64 device_cap = ib_dev->attrs.device_cap_flags;
>> +
>> /*
>> * Local write permission is required if remote write or
>> * remote atomic permission is also requested.
>> @@ -4335,6 +4346,13 @@ static inline int ib_check_mr_access(struct ib_device *ib_dev,
>> if (flags & IB_ACCESS_ON_DEMAND &&
>> !(ib_dev->attrs.kernel_cap_flags & IBK_ON_DEMAND_PAGING))
>> return -EINVAL;
>> +
>> + if ((flags & IB_ACCESS_FLUSH_GLOBAL &&
>> + !(device_cap & IB_DEVICE_FLUSH_GLOBAL)) ||
>> + (flags & IB_ACCESS_FLUSH_PERSISTENT &&
>> + !(device_cap & IB_DEVICE_FLUSH_PERSISTENT)))
>> + return -EINVAL;
> This should be -EOPNOTSUPP as the above is changed to in for-next
Yes, my local tree(V6) had updated this. will repost this later.
>
> Jason
* Re: [for-next PATCH v5 05/11] RDMA/rxe: Allow registering persistent flag for pmem MR only
2022-10-28 17:53 ` Jason Gunthorpe
@ 2022-10-30 3:33 ` Li Zhijian
0 siblings, 0 replies; 31+ messages in thread
From: Li Zhijian @ 2022-10-30 3:33 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On 29/10/2022 01:53, Jason Gunthorpe wrote:
> On Tue, Sep 27, 2022 at 01:53:31PM +0800, Li Zhijian wrote:
>> @@ -122,6 +129,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>> int num_buf;
>> void *vaddr;
>> int err;
>> + bool is_pmem = false;
>> int i;
>>
>> umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>> @@ -149,6 +157,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>> num_buf = 0;
>> map = mr->map;
>> if (length > 0) {
>> + is_pmem = true;
>> buf = map[0]->buf;
>>
>> for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
>> @@ -166,6 +175,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>> goto err_cleanup_map;
>> }
>>
>> + /* True only if the *whole* MR is pmem */
>> + if (is_pmem)
>> + is_pmem = vaddr_in_pmem(vaddr);
>> +
> I'm not so keen on this use of resources, but this should be written more
> like
>
> phys = page_to_phys(sg_page_iter_page(&sg_iter))
> region_intersects(phys + sg_iter->offset, sg_iter->length,.. )
>
> And you understand this will make memory registration of every RXE
> user a bit slower?
Good catch, I missed it before.
I tested it in a QEMU guest in which pmem is backed by a normal file on the host.
In this case, the check adds ~9% overhead (1.2s -> 1.3s) for a 1G MR; most of the time was taken by gup.
A real pmem environment will be tested later.
To minimize the side effect, I updated the code to do the pmem MR check only when require_pmem is true.
> region_intersects(phys + sg_iter->offset, sg_iter->length,.. )
I haven't fully applied this suggestion, since I think my assumption holds that a page can only be
associated with a single memory zone, so I only check 1 byte of each page.
index 5d014cef916e..e4e7c180fa0d 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -112,6 +112,13 @@ void rxe_mr_init_dma(int access, struct rxe_mr *mr)
mr->ibmr.type = IB_MR_TYPE_DMA;
}
+static bool paddr_in_pmem(unsigned long paddr)
+{
+ return REGION_INTERSECTS ==
+ region_intersects(paddr, 1, IORESOURCE_MEM,
+ IORES_DESC_PERSISTENT_MEMORY);
+}
+
int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
int access, struct rxe_mr *mr)
{
@@ -122,6 +129,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
int num_buf;
void *vaddr;
int err;
+ bool require_pmem = access & IB_ACCESS_FLUSH_PERSISTENT;
umem = ib_umem_get(&rxe->ib_dev, start, length, access);
if (IS_ERR(umem)) {
@@ -149,6 +157,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
num_buf = 0;
map = mr->map;
if (length > 0) {
+ struct page *pg;
buf = map[0]->buf;
for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
@@ -158,13 +167,20 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
num_buf = 0;
}
- vaddr = page_address(sg_page_iter_page(&sg_iter));
+ pg = sg_page_iter_page(&sg_iter);
+ vaddr = page_address(pg);
if (!vaddr) {
pr_warn("%s: Unable to get virtual address\n",
__func__);
err = -ENOMEM;
goto err_release_umem;
}
+
+ if (require_pmem && !paddr_in_pmem(page_to_phys(pg))) {
+ err = -EINVAL;
+ goto err_release_umem;
+ }
+
buf->addr = (uintptr_t)vaddr;
num_buf++;
buf++;
> And actual pmem will be painfully slow.
>
> It seems like we are doing something wrong here..
>
Do you think we don't need this patch?
>> @@ -174,6 +187,12 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>> }
>> }
>>
>> + if (!is_pmem && access & IB_ACCESS_FLUSH_PERSISTENT) {
>> + pr_warn("Cannot register IB_ACCESS_FLUSH_PERSISTENT for non-pmem memory\n");
>> + err = -EINVAL;
>> + goto err_release_umem;
>> + }
> Do not pr_warn on syscall paths
Got it, will remove it.
Thanks
Zhijian
>
> Jason
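The one-probe-per-page scheme from the reply above can be modeled in userspace: an MR may carry the persistent-flush flag only if every backing page is pmem, and under the stated assumption that a page belongs to a single memory region, one membership test per page suffices. A sketch with an invented pmem range standing in for region_intersects():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for region_intersects(): page frame numbers in
 * [PMEM_START, PMEM_END) are treated as persistent memory.
 * The boundaries are invented for the example. */
#define PMEM_START 100UL
#define PMEM_END   200UL

static bool pfn_in_pmem(unsigned long pfn)
{
	return pfn >= PMEM_START && pfn < PMEM_END;
}

/* Registration with a persistent-flush flag succeeds only if *every*
 * page of the region is pmem -- one probe per page, as in the patch. */
static bool whole_region_is_pmem(const unsigned long *pfns, size_t npages)
{
	for (size_t i = 0; i < npages; i++)
		if (!pfn_in_pmem(pfns[i]))
			return false;
	return true;
}
```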
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-10-28 17:57 ` Jason Gunthorpe
@ 2022-11-11 2:49 ` Yanjun Zhu
2022-11-11 5:10 ` lizhijian
0 siblings, 1 reply; 31+ messages in thread
From: Yanjun Zhu @ 2022-11-11 2:49 UTC (permalink / raw)
To: Jason Gunthorpe, Li Zhijian
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
y-goto, mbloch, liangwenpeng, tom, tomasz.gromadzki,
dan.j.williams, linux-kernel
On 2022/10/29 1:57, Jason Gunthorpe wrote:
> On Tue, Sep 27, 2022 at 01:53:26PM +0800, Li Zhijian wrote:
>> Hey folks,
>>
>> Firstly I want to say thank you to all of you, especially Bob, who over
>> the past month gave me a lot of ideas and inspiration.
>
> I would like it if someone familiar with rxe could reviewed-by the
> protocol parts.
Hi, Jason
I reviewed these patches. I am fine with these patches.
Hi, Zhijian
I noticed the following:
"
$ ./rdma_flush_server -s [server_address] -p [port_number]
client:
$ ./rdma_flush_client -s [server_address] -p [port_number]
"
Can you merge the server and the client into rdma-core?
Thanks,
Zhu Yanjun
>
> Jason
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-11-11 2:49 ` Yanjun Zhu
@ 2022-11-11 5:10 ` lizhijian
2022-11-11 5:52 ` Yanjun Zhu
0 siblings, 1 reply; 31+ messages in thread
From: lizhijian @ 2022-11-11 5:10 UTC (permalink / raw)
To: Yanjun Zhu, Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 11/11/2022 10:49, Yanjun Zhu wrote:
> On 2022/10/29 1:57, Jason Gunthorpe wrote:
>> On Tue, Sep 27, 2022 at 01:53:26PM +0800, Li Zhijian wrote:
>>> Hey folks,
>>>
>>> Firstly I want to say thank you to all of you, especially Bob, who
>>> over the past month gave me a lot of ideas and inspiration.
>>
>> I would like it if someone familiar with rxe could reviewed-by the
>> protocol parts.
>
> Hi, Jason
>
> I reviewed these patches. I am fine with these patches.
>
> Hi, Zhijian
>
> I noticed the following:
> "
> $ ./rdma_flush_server -s [server_address] -p [port_number]
> client:
> $ ./rdma_flush_client -s [server_address] -p [port_number]
> "
> Can you merge the server and the client to rdma-core?
Yanjun,
Yes, there is already a draft PR here:
https://github.com/linux-rdma/rdma-core/pull/1181, but it cannot go
ahead until the kernel patches are merged.
I will post a new version in the next few days; would you mind if I add
your "Reviewed-by" in the next version?
>
> Thanks,
> Zhu Yanjun
>
>>
>> Jason
>
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-11-11 5:10 ` lizhijian
@ 2022-11-11 5:52 ` Yanjun Zhu
2022-11-11 6:10 ` lizhijian
0 siblings, 1 reply; 31+ messages in thread
From: Yanjun Zhu @ 2022-11-11 5:52 UTC (permalink / raw)
To: lizhijian, Yanjun Zhu, Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 2022/11/11 13:10, lizhijian@fujitsu.com wrote:
>
>
> On 11/11/2022 10:49, Yanjun Zhu wrote:
>> On 2022/10/29 1:57, Jason Gunthorpe wrote:
>>> On Tue, Sep 27, 2022 at 01:53:26PM +0800, Li Zhijian wrote:
>>>> Hey folks,
>>>>
>>>> Firstly I want to say thank you to all of you, especially Bob, who
>>>> over the past month gave me a lot of ideas and inspiration.
>>>
>>> I would like it if someone familiar with rxe could reviewed-by the
>>> protocol parts.
>>
>> Hi, Jason
>>
>> I reviewed these patches. I am fine with these patches.
>>
>> Hi, Zhijian
>>
>> I noticed the following:
>> "
>> $ ./rdma_flush_server -s [server_address] -p [port_number]
>> client:
>> $ ./rdma_flush_client -s [server_address] -p [port_number]
>> "
>> Can you merge the server and the client to rdma-core?
>
> Yanjun,
>
> Yes, there is already a draft PR here:
> https://github.com/linux-rdma/rdma-core/pull/1181, but it cannot go
> ahead until the kernel patches are merged.
>
> I will post a new version in the next few days; would you mind if I add
> your "Reviewed-by" in the next version?
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Thanks.
Another question: normally rxe should be able to connect to physical IB
devices, such as an mlx IB device. That is, one host runs rxe, the other
host has an mlx IB device, and the RDMA connection is created between
the two hosts.
Have you connected to an mlx IB device with this RDMA FLUSH operation?
And what was the test result?
Thanks a lot.
Zhu Yanjun
>
>
>
>>
>> Thanks,
>> Zhu Yanjun
>>
>>>
>>> Jason
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-11-11 5:52 ` Yanjun Zhu
@ 2022-11-11 6:10 ` lizhijian
2022-11-11 6:30 ` Yanjun Zhu
0 siblings, 1 reply; 31+ messages in thread
From: lizhijian @ 2022-11-11 6:10 UTC (permalink / raw)
To: Yanjun Zhu, Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 11/11/2022 13:52, Yanjun Zhu wrote:
> On 2022/11/11 13:10, lizhijian@fujitsu.com wrote:
>>
>>
>> On 11/11/2022 10:49, Yanjun Zhu wrote:
>>> On 2022/10/29 1:57, Jason Gunthorpe wrote:
>>>> On Tue, Sep 27, 2022 at 01:53:26PM +0800, Li Zhijian wrote:
>>>>> Hey folks,
>>>>>
>>>>> Firstly I want to say thank you to all of you, especially Bob, who
>>>>> over the past month gave me a lot of ideas and inspiration.
>>>>
>>>> I would like it if someone familiar with rxe could reviewed-by the
>>>> protocol parts.
>>>
>>> Hi, Jason
>>>
>>> I reviewed these patches. I am fine with these patches.
>>>
>>> Hi, Zhijian
>>>
>>> I noticed the following:
>>> "
>>> $ ./rdma_flush_server -s [server_address] -p [port_number]
>>> client:
>>> $ ./rdma_flush_client -s [server_address] -p [port_number]
>>> "
>>> Can you merge the server and the client to rdma-core?
>>
>> Yanjun,
>>
>> Yes, there is already a draft PR here:
>> https://github.com/linux-rdma/rdma-core/pull/1181, but it cannot go
>> ahead until the kernel patches are merged.
>>
>> I will post a new version in the next few days; would you mind if I add
>> your "Reviewed-by" in the next version?
>
> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> Thanks.
>
>> Another question: normally rxe should be able to connect to physical IB
>> devices, such as an mlx IB device. That is, one host runs rxe, the other
>> host has an mlx IB device, and the RDMA connection is created between
>> the two hosts.
It's fully compatible with the old operations.
>
>> Have you connected to an mlx IB device with this RDMA FLUSH operation?
>> And what was the test result?
Yes, I tested it.
After these patches, only the RXE device can register *FLUSHABLE* MRs
successfully; if mlx tries that, EOPNOTSUPP will be returned.
Similarly, since other hardware (mlx, for example) does not support the
FLUSH operation yet, EOPNOTSUPP will be returned if users try to do that.
In short, for an RXE requester, an mlx responder will return an error for
the request, and an mlx requester is not able to issue a FLUSH operation.
Thanks
Zhijian
>
> Thanks a lot.
> Zhu Yanjun
>
>>
>>
>>
>>>
>>> Thanks,
>>> Zhu Yanjun
>>>
>>>>
>>>> Jason
>
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-11-11 6:10 ` lizhijian
@ 2022-11-11 6:30 ` Yanjun Zhu
2022-11-11 6:38 ` lizhijian
0 siblings, 1 reply; 31+ messages in thread
From: Yanjun Zhu @ 2022-11-11 6:30 UTC (permalink / raw)
To: lizhijian, Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 2022/11/11 14:10, lizhijian@fujitsu.com wrote:
>
> On 11/11/2022 13:52, Yanjun Zhu wrote:
> On 2022/11/11 13:10, lizhijian@fujitsu.com wrote:
>>>
>>> On 11/11/2022 10:49, Yanjun Zhu wrote:
>>> On 2022/10/29 1:57, Jason Gunthorpe wrote:
>>>>> On Tue, Sep 27, 2022 at 01:53:26PM +0800, Li Zhijian wrote:
>>>>>> Hey folks,
>>>>>>
>>>>>> Firstly I want to say thank you to all of you, especially Bob, who
>>>>>> over the past month gave me a lot of ideas and inspiration.
>>>>> I would like it if someone familiar with rxe could reviewed-by the
>>>>> protocol parts.
>>>> Hi, Jason
>>>>
>>>> I reviewed these patches. I am fine with these patches.
>>>>
>>>> Hi, Zhijian
>>>>
>>>> I noticed the following:
>>>> "
>>>> $ ./rdma_flush_server -s [server_address] -p [port_number]
>>>> client:
>>>> $ ./rdma_flush_client -s [server_address] -p [port_number]
>>>> "
>>>> Can you merge the server and the client to rdma-core?
>>> Yanjun,
>>>
>>> Yes, there is already a draft PR here:
>>> https://github.com/linux-rdma/rdma-core/pull/1181, but it cannot go
>>> ahead until the kernel patches are merged.
>>>
>>> I will post a new version in the next few days; would you mind if I add
>>> your "Reviewed-by" in the next version?
>> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
>> Thanks.
>>
>> Another question: normally rxe should be able to connect to physical IB
>> devices, such as an mlx IB device. That is, one host runs rxe, the other
>> host has an mlx IB device, and the RDMA connection is created between
>> the two hosts.
> It's fully compatible with the old operations.
>
>
>> Have you connected to an mlx IB device with this RDMA FLUSH operation?
>> And what was the test result?
> Yes, I tested it.
>
> After these patches, only the RXE device can register *FLUSHABLE* MRs
> successfully; if mlx tries that, EOPNOTSUPP will be returned.
>
> Similarly, since other hardware (mlx, for example) does not support the
> FLUSH operation yet, EOPNOTSUPP will be returned if users try to do that.
>
> In short, for an RXE requester, an mlx responder will return an error for
> the request, and an mlx requester is not able to issue a FLUSH operation.
Thanks. Do you mean that FLUSH operation is only supported in RXE? ^_^
And MLX does not support FLUSH operation currently?
Zhu Yanjun
>
> Thanks
> Zhijian
>
>
>> Thanks a lot.
>> Zhu Yanjun
>>
>>>
>>>
>>>> Thanks,
>>>> Zhu Yanjun
>>>>
>>>>> Jason
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-11-11 6:30 ` Yanjun Zhu
@ 2022-11-11 6:38 ` lizhijian
2022-11-11 7:08 ` Yanjun Zhu
0 siblings, 1 reply; 31+ messages in thread
From: lizhijian @ 2022-11-11 6:38 UTC (permalink / raw)
To: Yanjun Zhu, Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 11/11/2022 14:30, Yanjun Zhu wrote:
>>
>> After these patches, only the RXE device can register *FLUSHABLE* MRs
>> successfully; if mlx tries that, EOPNOTSUPP will be returned.
>>
>> Similarly, since other hardware (mlx, for example) does not support the
>> FLUSH operation yet, EOPNOTSUPP will be returned if users try to do that.
>>
>> In short, for an RXE requester, an mlx responder will return an error for
>> the request, and an mlx requester is not able to issue a FLUSH operation.
>
> Thanks. Do you mean that FLUSH operation is only supported in RXE? ^_^
>
> And MLX does not support FLUSH operation currently?
IMO, FLUSH and Atomic Write were newly introduced by IBA spec 1.5,
published in 2021, so hardware/drivers (mlx) will need to do some work
to support them.
* Re: [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation
2022-11-11 6:38 ` lizhijian
@ 2022-11-11 7:08 ` Yanjun Zhu
0 siblings, 0 replies; 31+ messages in thread
From: Yanjun Zhu @ 2022-11-11 7:08 UTC (permalink / raw)
To: lizhijian, Jason Gunthorpe
Cc: Bob Pearson, Leon Romanovsky, linux-rdma, Zhu Yanjun, yangx.jy,
Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 2022/11/11 14:38, lizhijian@fujitsu.com wrote:
>
> On 11/11/2022 14:30, Yanjun Zhu wrote:
>>> After these patches, only the RXE device can register *FLUSHABLE* MRs
>>> successfully; if mlx tries that, EOPNOTSUPP will be returned.
>>>
>>> Similarly, since other hardware (mlx, for example) does not support the
>>> FLUSH operation yet, EOPNOTSUPP will be returned if users try to do that.
>>>
>>> In short, for an RXE requester, an mlx responder will return an error for
>>> the request, and an mlx requester is not able to issue a FLUSH operation.
>> Thanks. Do you mean that FLUSH operation is only supported in RXE? ^_^
>>
>> And MLX does not support FLUSH operation currently?
> IMO, FLUSH and Atomic Write were newly introduced by IBA spec 1.5,
> published in 2021, so hardware/drivers (mlx) will need to do some work
> to support them.
Thanks.
If I understood you correctly, FLUSH and Atomic Write are new features,
and from the test results they are not currently supported by the mlx
driver. Let's wait for the mlx engineers for updates about FLUSH and
Atomic Write.
IMO, it would be better if rxe could successfully connect to a physical
IB device with FLUSH and Atomic Write, such as mlx or others.
Zhu Yanjun
* Re: [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush
2022-09-27 5:53 ` [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush Li Zhijian
@ 2022-11-11 8:43 ` Yanjun Zhu
2022-11-11 8:55 ` lizhijian
0 siblings, 1 reply; 31+ messages in thread
From: Yanjun Zhu @ 2022-11-11 8:43 UTC (permalink / raw)
To: Li Zhijian, Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, y-goto, mbloch, liangwenpeng, tom,
tomasz.gromadzki, dan.j.williams, linux-kernel
On 2022/9/27 13:53, Li Zhijian wrote:
> Extend rxe opcode tables, headers, helper and constants to support
> flush operations.
>
> Refer to the IBA A19.4.1 for more FETH definition details
>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> V5: new FETH structure and simplify header helper
> new names and new patch split scheme, suggested by Bob.
> ---
> drivers/infiniband/sw/rxe/rxe_hdr.h | 47 ++++++++++++++++++++++++++
> drivers/infiniband/sw/rxe/rxe_opcode.c | 17 ++++++++++
> drivers/infiniband/sw/rxe/rxe_opcode.h | 16 +++++----
> 3 files changed, 74 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_hdr.h b/drivers/infiniband/sw/rxe/rxe_hdr.h
> index e432f9e37795..e995a97c54fd 100644
> --- a/drivers/infiniband/sw/rxe/rxe_hdr.h
> +++ b/drivers/infiniband/sw/rxe/rxe_hdr.h
> @@ -607,6 +607,52 @@ static inline void reth_set_len(struct rxe_pkt_info *pkt, u32 len)
> rxe_opcode[pkt->opcode].offset[RXE_RETH], len);
> }
>
> +/******************************************************************************
> + * FLUSH Extended Transport Header
> + ******************************************************************************/
> +
> +struct rxe_feth {
> + __be32 bits;
> +};
> +
> +#define FETH_PLT_MASK (0x0000000f) /* bits 3-0 */
> +#define FETH_SEL_MASK (0x00000030) /* bits 5-4 */
> +#define FETH_SEL_SHIFT (4U)
> +
> +static inline u32 __feth_plt(void *arg)
> +{
> + struct rxe_feth *feth = arg;
> +
> + return be32_to_cpu(feth->bits) & FETH_PLT_MASK;
> +}
> +
> +static inline u32 __feth_sel(void *arg)
> +{
> + struct rxe_feth *feth = arg;
> +
> + return (be32_to_cpu(feth->bits) & FETH_SEL_MASK) >> FETH_SEL_SHIFT;
> +}
> +
> +static inline u32 feth_plt(struct rxe_pkt_info *pkt)
> +{
> + return __feth_plt(pkt->hdr + rxe_opcode[pkt->opcode].offset[RXE_FETH]);
> +}
> +
> +static inline u32 feth_sel(struct rxe_pkt_info *pkt)
> +{
> + return __feth_sel(pkt->hdr + rxe_opcode[pkt->opcode].offset[RXE_FETH]);
> +}
> +
> +static inline void feth_init(struct rxe_pkt_info *pkt, u8 type, u8 level)
> +{
> + struct rxe_feth *feth = (struct rxe_feth *)
> + (pkt->hdr + rxe_opcode[pkt->opcode].offset[RXE_FETH]);
> + u32 bits = ((level << FETH_SEL_SHIFT) & FETH_SEL_MASK) |
> + (type & FETH_PLT_MASK);
> +
> + feth->bits = cpu_to_be32(bits);
> +}
> +
> /******************************************************************************
> * Atomic Extended Transport Header
> ******************************************************************************/
> @@ -910,6 +956,7 @@ enum rxe_hdr_length {
> RXE_ATMETH_BYTES = sizeof(struct rxe_atmeth),
> RXE_IETH_BYTES = sizeof(struct rxe_ieth),
> RXE_RDETH_BYTES = sizeof(struct rxe_rdeth),
> + RXE_FETH_BYTES = sizeof(struct rxe_feth),
> };
>
> static inline size_t header_size(struct rxe_pkt_info *pkt)
> diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.c b/drivers/infiniband/sw/rxe/rxe_opcode.c
> index d4ba4d506f17..55aad13e57bb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_opcode.c
> +++ b/drivers/infiniband/sw/rxe/rxe_opcode.c
> @@ -101,6 +101,12 @@ struct rxe_wr_opcode_info rxe_wr_opcode_info[] = {
> [IB_QPT_UC] = WR_LOCAL_OP_MASK,
> },
> },
> + [IB_WR_FLUSH] = {
> + .name = "IB_WR_FLUSH",
> + .mask = {
> + [IB_QPT_RC] = WR_FLUSH_MASK,
> + },
> + },
> };
Hi, Zhijian
I am running tests with it. Besides RC, are other modes supported, such
as RD or XRC?
Zhu Yanjun
>
> struct rxe_opcode_info rxe_opcode[RXE_NUM_OPCODE] = {
> @@ -378,6 +384,17 @@ struct rxe_opcode_info rxe_opcode[RXE_NUM_OPCODE] = {
> RXE_IETH_BYTES,
> }
> },
> + [IB_OPCODE_RC_FLUSH] = {
> + .name = "IB_OPCODE_RC_FLUSH",
> + .mask = RXE_FETH_MASK | RXE_RETH_MASK | RXE_FLUSH_MASK |
> + RXE_START_MASK | RXE_END_MASK | RXE_REQ_MASK,
> + .length = RXE_BTH_BYTES + RXE_FETH_BYTES + RXE_RETH_BYTES,
> + .offset = {
> + [RXE_BTH] = 0,
> + [RXE_FETH] = RXE_BTH_BYTES,
> + [RXE_RETH] = RXE_BTH_BYTES + RXE_FETH_BYTES,
> + }
> + },
>
> /* UC */
> [IB_OPCODE_UC_SEND_FIRST] = {
> diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.h b/drivers/infiniband/sw/rxe/rxe_opcode.h
> index 8f9aaaf260f2..02d256745793 100644
> --- a/drivers/infiniband/sw/rxe/rxe_opcode.h
> +++ b/drivers/infiniband/sw/rxe/rxe_opcode.h
> @@ -19,7 +19,8 @@ enum rxe_wr_mask {
> WR_SEND_MASK = BIT(2),
> WR_READ_MASK = BIT(3),
> WR_WRITE_MASK = BIT(4),
> - WR_LOCAL_OP_MASK = BIT(5),
> + WR_FLUSH_MASK = BIT(5),
> + WR_LOCAL_OP_MASK = BIT(6),
>
> WR_READ_OR_WRITE_MASK = WR_READ_MASK | WR_WRITE_MASK,
> WR_WRITE_OR_SEND_MASK = WR_WRITE_MASK | WR_SEND_MASK,
> @@ -47,6 +48,7 @@ enum rxe_hdr_type {
> RXE_RDETH,
> RXE_DETH,
> RXE_IMMDT,
> + RXE_FETH,
> RXE_PAYLOAD,
> NUM_HDR_TYPES
> };
> @@ -63,6 +65,7 @@ enum rxe_hdr_mask {
> RXE_IETH_MASK = BIT(RXE_IETH),
> RXE_RDETH_MASK = BIT(RXE_RDETH),
> RXE_DETH_MASK = BIT(RXE_DETH),
> + RXE_FETH_MASK = BIT(RXE_FETH),
> RXE_PAYLOAD_MASK = BIT(RXE_PAYLOAD),
>
> RXE_REQ_MASK = BIT(NUM_HDR_TYPES + 0),
> @@ -71,13 +74,14 @@ enum rxe_hdr_mask {
> RXE_WRITE_MASK = BIT(NUM_HDR_TYPES + 3),
> RXE_READ_MASK = BIT(NUM_HDR_TYPES + 4),
> RXE_ATOMIC_MASK = BIT(NUM_HDR_TYPES + 5),
> + RXE_FLUSH_MASK = BIT(NUM_HDR_TYPES + 6),
>
> - RXE_RWR_MASK = BIT(NUM_HDR_TYPES + 6),
> - RXE_COMP_MASK = BIT(NUM_HDR_TYPES + 7),
> + RXE_RWR_MASK = BIT(NUM_HDR_TYPES + 7),
> + RXE_COMP_MASK = BIT(NUM_HDR_TYPES + 8),
>
> - RXE_START_MASK = BIT(NUM_HDR_TYPES + 8),
> - RXE_MIDDLE_MASK = BIT(NUM_HDR_TYPES + 9),
> - RXE_END_MASK = BIT(NUM_HDR_TYPES + 10),
> + RXE_START_MASK = BIT(NUM_HDR_TYPES + 9),
> + RXE_MIDDLE_MASK = BIT(NUM_HDR_TYPES + 10),
> + RXE_END_MASK = BIT(NUM_HDR_TYPES + 11),
>
> RXE_LOOPBACK_MASK = BIT(NUM_HDR_TYPES + 12),
>
* Re: [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush
2022-11-11 8:43 ` Yanjun Zhu
@ 2022-11-11 8:55 ` lizhijian
2022-11-11 9:28 ` Yanjun Zhu
0 siblings, 1 reply; 31+ messages in thread
From: lizhijian @ 2022-11-11 8:55 UTC (permalink / raw)
To: Yanjun Zhu, Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 11/11/2022 16:43, Yanjun Zhu wrote:
>> /******************************************************************************
>> * Atomic Extended Transport Header
>>
>> ******************************************************************************/
>> @@ -910,6 +956,7 @@ enum rxe_hdr_length {
>> RXE_ATMETH_BYTES = sizeof(struct rxe_atmeth),
>> RXE_IETH_BYTES = sizeof(struct rxe_ieth),
>> RXE_RDETH_BYTES = sizeof(struct rxe_rdeth),
>> + RXE_FETH_BYTES = sizeof(struct rxe_feth),
>> };
>> static inline size_t header_size(struct rxe_pkt_info *pkt)
>> diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.c
>> b/drivers/infiniband/sw/rxe/rxe_opcode.c
>> index d4ba4d506f17..55aad13e57bb 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_opcode.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_opcode.c
>> @@ -101,6 +101,12 @@ struct rxe_wr_opcode_info rxe_wr_opcode_info[] = {
>> [IB_QPT_UC] = WR_LOCAL_OP_MASK,
>> },
>> },
>> + [IB_WR_FLUSH] = {
>> + .name = "IB_WR_FLUSH",
>> + .mask = {
>> + [IB_QPT_RC] = WR_FLUSH_MASK,
>> + },
>> + },
>> };
>
> Hi, Zhijian
>
> I am running tests with it. Besides RC, are other modes supported, such
> as RD or XRC?
>
Only RC is implemented for FLUSH; the current RXE only supports the RC
service[1]. BTW, XRC is on the way in Bob's patches, IIRC.
https://lore.kernel.org/r/cce0f07d-25fc-5880-69e7-001d951750b7@gmail.com
> Zhu Yanjun
* Re: [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush
2022-11-11 8:55 ` lizhijian
@ 2022-11-11 9:28 ` Yanjun Zhu
0 siblings, 0 replies; 31+ messages in thread
From: Yanjun Zhu @ 2022-11-11 9:28 UTC (permalink / raw)
To: lizhijian, Bob Pearson, Leon Romanovsky, Jason Gunthorpe, linux-rdma
Cc: Zhu Yanjun, yangx.jy, Yasunori Gotou (Fujitsu),
mbloch, liangwenpeng, tom, tomasz.gromadzki, dan.j.williams,
linux-kernel
On 2022/11/11 16:55, lizhijian@fujitsu.com wrote:
>
> On 11/11/2022 16:43, Yanjun Zhu wrote:
>>> /******************************************************************************
>>> * Atomic Extended Transport Header
>>>
>>> ******************************************************************************/
>>> @@ -910,6 +956,7 @@ enum rxe_hdr_length {
>>> RXE_ATMETH_BYTES = sizeof(struct rxe_atmeth),
>>> RXE_IETH_BYTES = sizeof(struct rxe_ieth),
>>> RXE_RDETH_BYTES = sizeof(struct rxe_rdeth),
>>> + RXE_FETH_BYTES = sizeof(struct rxe_feth),
>>> };
>>> static inline size_t header_size(struct rxe_pkt_info *pkt)
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.c
>>> b/drivers/infiniband/sw/rxe/rxe_opcode.c
>>> index d4ba4d506f17..55aad13e57bb 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_opcode.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_opcode.c
>>> @@ -101,6 +101,12 @@ struct rxe_wr_opcode_info rxe_wr_opcode_info[] = {
>>> [IB_QPT_UC] = WR_LOCAL_OP_MASK,
>>> },
>>> },
>>> + [IB_WR_FLUSH] = {
>>> + .name = "IB_WR_FLUSH",
>>> + .mask = {
>>> + [IB_QPT_RC] = WR_FLUSH_MASK,
>>> + },
>>> + },
>>> };
>> Hi, Zhijian
>>
>> I am running tests with it. Besides RC, are other modes supported, such
>> as RD or XRC?
>>
> Only RC is implemented for FLUSH; the current RXE only supports the RC
> service[1]. BTW, XRC is on the way in Bob's patches, IIRC.
>
> https://lore.kernel.org/r/cce0f07d-25fc-5880-69e7-001d951750b7@gmail.com
40 * IBA header types and methods
41 *
42 * Some of these are for reference and completeness only since
43 * rxe does not currently support RD transport
44 * most of this could be moved into IB core. ib_pack.h has
45 * part of this but is incomplete
Zhu Yanjun
>
>
>> Zhu Yanjun
end of thread, other threads:[~2022-11-11 9:29 UTC | newest]
Thread overview: 31+ messages
2022-09-27 5:53 [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 01/11] RDMA/rxe: make sure requested access is a subset of {mr,mw}->access Li Zhijian
2022-10-28 17:45 ` Jason Gunthorpe
2022-09-27 5:53 ` [for-next PATCH v5 02/11] RDMA: Extend RDMA user ABI to support flush Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 03/11] RDMA: Extend RDMA kernel verbs " Li Zhijian
2022-09-29 6:21 ` Li Zhijian
2022-09-30 18:04 ` Jason Gunthorpe
2022-10-28 17:44 ` Jason Gunthorpe
2022-10-29 3:15 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 04/11] RDMA/rxe: Extend rxe user " Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 05/11] RDMA/rxe: Allow registering persistent flag for pmem MR only Li Zhijian
2022-10-28 17:53 ` Jason Gunthorpe
2022-10-30 3:33 ` Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 06/11] RDMA/rxe: Extend rxe packet format to support flush Li Zhijian
2022-11-11 8:43 ` Yanjun Zhu
2022-11-11 8:55 ` lizhijian
2022-11-11 9:28 ` Yanjun Zhu
2022-09-27 5:53 ` [for-next PATCH v5 07/11] RDMA/rxe: Implement RC RDMA FLUSH service in requester side Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 08/11] RDMA/rxe: Implement flush execution in responder side Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 09/11] RDMA/rxe: Implement flush completion Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 10/11] RDMA/cm: Make QP FLUSHABLE Li Zhijian
2022-09-27 5:53 ` [for-next PATCH v5 11/11] RDMA/rxe: Enable RDMA FLUSH capability for rxe device Li Zhijian
2022-10-28 17:44 ` [for-next PATCH v5 00/11] RDMA/rxe: Add RDMA FLUSH operation Jason Gunthorpe
2022-10-28 17:57 ` Jason Gunthorpe
2022-11-11 2:49 ` Yanjun Zhu
2022-11-11 5:10 ` lizhijian
2022-11-11 5:52 ` Yanjun Zhu
2022-11-11 6:10 ` lizhijian
2022-11-11 6:30 ` Yanjun Zhu
2022-11-11 6:38 ` lizhijian
2022-11-11 7:08 ` Yanjun Zhu