* [rdma-core v2 0/9] Broadcom User Space RoCE Driver
@ 2017-02-18 15:43 Devesh Sharma
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This series introduces the user space RoCE driver for the Broadcom
NetXtreme-E 10/25/40/50 RDMA Ethernet Controller. This driver
depends on the bnxt_re driver posted earlier to the linux-rdma
community, which is under review.

This patch series is based on the latest master of rdma-core
repository hosted at https://github.com/linux-rdma/rdma-core.git

The Git tree for this library is hosted at the following URL on GitHub:
https://github.com/dsharma283/bnxtre-rdma-core.git
branch: bnxtre-v2

Please review and share your feedback.

v1->v2
 -- Renamed directory from bnxtre to bnxt_re
 -- Squashed byte-order conversion patch
 -- Removed version.h file

Devesh Sharma (9):
  libbnxt_re: introduce bnxtre user space RDMA provider
  libbnxt_re: Add support for user memory regions
  libbnxt_re: Add support for CQ and QP management
  libbnxt_re: Add support for posting and polling
  libbnxt_re: Allow apps to poll for flushed completions
  libbnxt_re: Enable UD control path and wqe posting
  libbnxt_re: Enable polling for UD completions
  libbnxt_re: Add support for atomic operations
  libbnxt_re: Add support for SRQ in user lib

 CMakeLists.txt                   |    1 +
 MAINTAINERS                      |    5 +
 providers/bnxt_re/CMakeLists.txt |    6 +
 providers/bnxt_re/bnxt_re-abi.h  |  375 +++++++++
 providers/bnxt_re/db.c           |  108 +++
 providers/bnxt_re/flush.h        |   85 ++
 providers/bnxt_re/main.c         |  215 ++++++
 providers/bnxt_re/main.h         |  398 ++++++++++
 providers/bnxt_re/memory.c       |   76 ++
 providers/bnxt_re/memory.h       |  154 ++++
 providers/bnxt_re/verbs.c        | 1591 ++++++++++++++++++++++++++++++++++++++
 providers/bnxt_re/verbs.h        |  101 +++
 12 files changed, 3115 insertions(+)
 create mode 100644 providers/bnxt_re/CMakeLists.txt
 create mode 100644 providers/bnxt_re/bnxt_re-abi.h
 create mode 100644 providers/bnxt_re/db.c
 create mode 100644 providers/bnxt_re/flush.h
 create mode 100644 providers/bnxt_re/main.c
 create mode 100644 providers/bnxt_re/main.h
 create mode 100644 providers/bnxt_re/memory.c
 create mode 100644 providers/bnxt_re/memory.h
 create mode 100644 providers/bnxt_re/verbs.c
 create mode 100644 providers/bnxt_re/verbs.h

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [rdma-core v2 1/9] libbnxt_re: introduce bnxtre user space RDMA provider
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 2/9] libbnxt_re: Add support for user memory regions Devesh Sharma
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

libbnxt_re is a user-space driver which provides RDMA
capability to user applications. The current framework
has the following parts working:

 -Basic Cmake framework to build and install the library.
 -Register and unregister user-space driver with uverbs
  interface.
 -List all available bnxt_re devices using "ibv_devinfo"
  admin command.
 -List all the device and port attributes using
  "ibv_devinfo" admin command.
 -Support allocate/free of protection domains.
 -Check ABI version between library and kernel module.
 -Update MAINTAINERS file
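
As an illustration of the device-listing step above, the following is a minimal sketch of matching a PCI vendor/device pair against an ID table. It assumes the IDs arrive as sysfs-style hexadecimal strings; the names example_ids, parse_hex_id and id_is_supported are hypothetical and are not part of the patch, and only two table entries are reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical ID table; the two entries mirror values from cna_table. */
struct pci_id {
	unsigned int vendor;
	unsigned int device;
};

static const struct pci_id example_ids[] = {
	{ 0x14E4, 0x16D6 },	/* BCM57412 */
	{ 0x14E4, 0x16D7 },	/* BCM57414 */
};

/* Parse a sysfs-style hex string such as "0x14e4\n". */
static unsigned int parse_hex_id(const char *text)
{
	assert(text != NULL);
	/* Base 16 accepts an optional "0x" prefix and stops at the newline. */
	return (unsigned int)strtoul(text, NULL, 16);
}

/* Return 1 if the vendor/device pair is in the table, 0 otherwise. */
static int id_is_supported(const char *vendor_text, const char *device_text)
{
	unsigned int vendor = parse_hex_id(vendor_text);
	unsigned int device = parse_hex_id(device_text);
	size_t i;

	for (i = 0; i < sizeof(example_ids) / sizeof(example_ids[0]); i++)
		if (example_ids[i].vendor == vendor &&
		    example_ids[i].device == device)
			return 1;
	return 0;
}
```

A caller would pass the contents of the "device/vendor" and "device/device" sysfs files and skip the device when the function returns 0.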

v1->v2:
 --Deleted bnxtre.driver file
 --Dropped HAVE_CONFIG_H macro and cna_table declaration
 --Renamed abi.h to bnxt_re-abi.h

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 CMakeLists.txt                   |   1 +
 MAINTAINERS                      |   5 +
 providers/bnxt_re/CMakeLists.txt |   4 +
 providers/bnxt_re/bnxt_re-abi.h  |  59 ++++++++++
 providers/bnxt_re/main.c         | 187 ++++++++++++++++++++++++++++++
 providers/bnxt_re/main.h         | 110 ++++++++++++++++++
 providers/bnxt_re/verbs.c        | 242 +++++++++++++++++++++++++++++++++++++++
 providers/bnxt_re/verbs.h        | 101 ++++++++++++++++
 8 files changed, 709 insertions(+)
 create mode 100644 providers/bnxt_re/CMakeLists.txt
 create mode 100644 providers/bnxt_re/bnxt_re-abi.h
 create mode 100644 providers/bnxt_re/main.c
 create mode 100644 providers/bnxt_re/main.h
 create mode 100644 providers/bnxt_re/verbs.c
 create mode 100644 providers/bnxt_re/verbs.h

diff --git a/CMakeLists.txt b/CMakeLists.txt
index c6dc136..1d4dbbb 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -371,6 +371,7 @@ add_subdirectory(libibcm)
 
 # Providers
 if (HAVE_COHERENT_DMA)
+add_subdirectory(providers/bnxt_re)
 add_subdirectory(providers/cxgb3)
 add_subdirectory(providers/cxgb4)
 add_subdirectory(providers/hns)
diff --git a/MAINTAINERS b/MAINTAINERS
index 2ae504c..19fe88d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -40,6 +40,11 @@ F:	*/CMakeLists.txt
 F:	*/lib*.map
 F:	buildlib/
 
+BNXT_RE USERSPACE PROVIDER (for bnxt_re.ko)
+M:	Devesh Sharma  <Devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
+S:	Supported
+F:	providers/bnxt_re/
+
 CXGB3 USERSPACE PROVIDER (for iw_cxgb3.ko)
 M:	Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
 S:	Supported
diff --git a/providers/bnxt_re/CMakeLists.txt b/providers/bnxt_re/CMakeLists.txt
new file mode 100644
index 0000000..45e609d
--- /dev/null
+++ b/providers/bnxt_re/CMakeLists.txt
@@ -0,0 +1,4 @@
+rdma_provider(bnxt_re
+	main.c
+	verbs.c
+)
diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
new file mode 100644
index 0000000..05a0888
--- /dev/null
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -0,0 +1,59 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: ABI data structure definition
+ */
+
+#ifndef __BNXT_RE_ABI_H__
+#define __BNXT_RE_ABI_H__
+
+#include <infiniband/kern-abi.h>
+
+#define BNXT_RE_ABI_VERSION 1
+
+struct bnxt_re_cntx_resp {
+	struct ibv_get_context_resp resp;
+	__u32 dev_id;
+	__u32 max_qp; /* To allocate qp-table */
+};
+
+struct bnxt_re_pd_resp {
+	struct ibv_alloc_pd_resp resp;
+	__u32 pdid;
+	__u32 dpi;
+	__u64 dbr;
+};
+
+#endif
diff --git a/providers/bnxt_re/main.c b/providers/bnxt_re/main.c
new file mode 100644
index 0000000..23f2f4f
--- /dev/null
+++ b/providers/bnxt_re/main.c
@@ -0,0 +1,187 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Device detection and initialization
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <pthread.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+
+#include "main.h"
+#include "verbs.h"
+
+#define PCI_VENDOR_ID_BROADCOM		0x14E4
+
+#define CNA(v, d)					\
+	{	.vendor = PCI_VENDOR_ID_##v,		\
+		.device = d }
+
+static const struct {
+	unsigned int vendor;
+	unsigned int device;
+} cna_table[] = {
+	CNA(BROADCOM, 0x16C0),  /* BCM57417 NPAR */
+	CNA(BROADCOM, 0x16CE),  /* BCM57311 */
+	CNA(BROADCOM, 0x16CF),  /* BCM57312 */
+	CNA(BROADCOM, 0x16DF),  /* BCM57314 */
+	CNA(BROADCOM, 0x16E5),  /* BCM57314 VF */
+	CNA(BROADCOM, 0x16E2),  /* BCM57417 */
+	CNA(BROADCOM, 0x16E3),  /* BCM57416 */
+	CNA(BROADCOM, 0x16D6),  /* BCM57412 */
+	CNA(BROADCOM, 0x16D7),  /* BCM57414 */
+	CNA(BROADCOM, 0x16D8),  /* BCM57416 Cu */
+	CNA(BROADCOM, 0x16D9),  /* BCM57417 Cu */
+	CNA(BROADCOM, 0x16C1),  /* BCM57414 VF */
+	CNA(BROADCOM, 0x16EF),  /* BCM57416 NPAR */
+	CNA(BROADCOM, 0x16ED),  /* BCM57414 NPAR */
+	CNA(BROADCOM, 0x16EB)   /* BCM57412 NPAR */
+};
+
+static struct ibv_context_ops bnxt_re_cntx_ops = {
+	.query_device  = bnxt_re_query_device,
+	.query_port    = bnxt_re_query_port,
+	.alloc_pd      = bnxt_re_alloc_pd,
+	.dealloc_pd    = bnxt_re_free_pd,
+	.reg_mr        = bnxt_re_reg_mr,
+	.dereg_mr      = bnxt_re_dereg_mr,
+	.create_cq     = bnxt_re_create_cq,
+	.poll_cq       = bnxt_re_poll_cq,
+	.req_notify_cq = bnxt_re_arm_cq,
+	.cq_event      = bnxt_re_cq_event,
+	.resize_cq     = bnxt_re_resize_cq,
+	.destroy_cq    = bnxt_re_destroy_cq,
+	.create_srq    = bnxt_re_create_srq,
+	.modify_srq    = bnxt_re_modify_srq,
+	.query_srq     = bnxt_re_query_srq,
+	.destroy_srq   = bnxt_re_destroy_srq,
+	.post_srq_recv = bnxt_re_post_srq_recv,
+	.create_qp     = bnxt_re_create_qp,
+	.query_qp      = bnxt_re_query_qp,
+	.modify_qp     = bnxt_re_modify_qp,
+	.destroy_qp    = bnxt_re_destroy_qp,
+	.post_send     = bnxt_re_post_send,
+	.post_recv     = bnxt_re_post_recv,
+	.create_ah     = bnxt_re_create_ah,
+	.destroy_ah    = bnxt_re_destroy_ah
+};
+
+static int bnxt_re_init_context(struct verbs_device *vdev,
+				struct ibv_context *ibvctx, int cmd_fd)
+{
+	struct ibv_get_context cmd;
+	struct bnxt_re_cntx_resp resp;
+	struct bnxt_re_context *cntx;
+
+	cntx = to_bnxt_re_context(ibvctx);
+
+	memset(&resp, 0, sizeof(resp));
+	ibvctx->cmd_fd = cmd_fd;
+	if (ibv_cmd_get_context(ibvctx, &cmd, sizeof(cmd),
+				&resp.resp, sizeof(resp)))
+		return errno;
+
+	cntx->dev_id = resp.dev_id;
+	cntx->max_qp = resp.max_qp;
+	ibvctx->ops = bnxt_re_cntx_ops;
+
+	return 0;
+}
+
+static void bnxt_re_uninit_context(struct verbs_device *vdev,
+				   struct ibv_context *ibvctx)
+{
+	/* Unmap if anything device specific was mapped in init_context. */
+}
+
+static struct verbs_device *bnxt_re_driver_init(const char *uverbs_sys_path,
+						int abi_version)
+{
+	char value[10];
+	struct bnxt_re_dev *dev;
+	unsigned int vendor, device;
+	int i;
+
+	if (ibv_read_sysfs_file(uverbs_sys_path, "device/vendor",
+				value, sizeof(value)) < 0)
+		return NULL;
+	vendor = strtol(value, NULL, 16);
+
+	if (ibv_read_sysfs_file(uverbs_sys_path, "device/device",
+				value, sizeof(value)) < 0)
+		return NULL;
+	device = strtol(value, NULL, 16);
+
+	for (i = 0; i < sizeof(cna_table) / sizeof(cna_table[0]); ++i)
+		if (vendor == cna_table[i].vendor &&
+		    device == cna_table[i].device)
+			goto found;
+	return NULL;
+found:
+	if (abi_version != BNXT_RE_ABI_VERSION) {
+		fprintf(stderr, DEV "FATAL: Max supported ABI of %s is %d "
+			"check for the latest version of kernel driver and "
+			"user library\n", uverbs_sys_path, abi_version);
+		return NULL;
+	}
+
+	dev = calloc(1, sizeof(*dev));
+	if (!dev) {
+		fprintf(stderr, DEV "Failed to allocate device for %s\n",
+			uverbs_sys_path);
+		return NULL;
+	}
+
+	dev->vdev.sz = sizeof(*dev);
+	dev->vdev.size_of_context =
+		sizeof(struct bnxt_re_context) - sizeof(struct ibv_context);
+
+	dev->vdev.init_context = bnxt_re_init_context;
+	dev->vdev.uninit_context = bnxt_re_uninit_context;
+
+	return &dev->vdev;
+}
+
+static __attribute__((constructor)) void bnxt_re_register_driver(void)
+{
+	verbs_register_driver("bnxtre", bnxt_re_driver_init);
+}
diff --git a/providers/bnxt_re/main.h b/providers/bnxt_re/main.h
new file mode 100644
index 0000000..d621efa
--- /dev/null
+++ b/providers/bnxt_re/main.h
@@ -0,0 +1,110 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Basic device data structures needed for book-keeping
+ */
+
+#ifndef __MAIN_H__
+#define __MAIN_H__
+
+#include <inttypes.h>
+#include <stddef.h>
+#include <endian.h>
+#include <pthread.h>
+
+#include <infiniband/driver.h>
+#include <util/udma_barrier.h>
+
+#include "bnxt_re-abi.h"
+
+struct bnxt_re_pd {
+	struct ibv_pd ibvpd;
+	uint32_t pdid;
+};
+
+struct bnxt_re_cq {
+	struct ibv_cq ibvcq;
+};
+
+struct bnxt_re_qp {
+	struct ibv_qp ibvqp;
+};
+
+struct bnxt_re_srq {
+	struct ibv_srq ibvsrq;
+};
+
+struct bnxt_re_mr {
+	struct ibv_mr ibvmr;
+};
+
+#define DEV	"bnxtre : "
+
+struct bnxt_re_dpi {
+	__u32 dpindx;
+	__u64 *dbpage;
+	pthread_spinlock_t db_lock;
+};
+
+struct bnxt_re_dev {
+	struct verbs_device vdev;
+	uint8_t abi_version;
+};
+
+struct bnxt_re_context {
+	struct ibv_context ibvctx;
+	uint32_t dev_id;
+	uint32_t max_qp;
+	uint32_t max_srq;
+	struct bnxt_re_dpi udpi;
+};
+
+static inline struct bnxt_re_dev *to_bnxt_re_dev(struct ibv_device *ibvdev)
+{
+	return container_of(ibvdev, struct bnxt_re_dev, vdev);
+}
+
+static inline struct bnxt_re_context *to_bnxt_re_context(
+		struct ibv_context *ibvctx)
+{
+	return container_of(ibvctx, struct bnxt_re_context, ibvctx);
+}
+
+static inline struct bnxt_re_pd *to_bnxt_re_pd(struct ibv_pd *ibvpd)
+{
+	return container_of(ibvpd, struct bnxt_re_pd, ibvpd);
+}
+
+#endif
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
new file mode 100644
index 0000000..5b3e1cc
--- /dev/null
+++ b/providers/bnxt_re/verbs.c
@@ -0,0 +1,242 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: User IB-Verbs implementation
+ */
+
+#include <assert.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+#include <pthread.h>
+#include <malloc.h>
+#include <sys/mman.h>
+#include <netinet/in.h>
+#include <unistd.h>
+
+#include "main.h"
+#include "verbs.h"
+
+int bnxt_re_query_device(struct ibv_context *ibvctx,
+			 struct ibv_device_attr *dev_attr)
+{
+	struct ibv_query_device cmd;
+	uint64_t fw_ver;
+	int status;
+
+	memset(dev_attr, 0, sizeof(struct ibv_device_attr));
+	status = ibv_cmd_query_device(ibvctx, dev_attr, &fw_ver,
+				      &cmd, sizeof(cmd));
+	return status;
+}
+
+int bnxt_re_query_port(struct ibv_context *ibvctx, uint8_t port,
+		       struct ibv_port_attr *port_attr)
+{
+	struct ibv_query_port cmd;
+
+	memset(port_attr, 0, sizeof(struct ibv_port_attr));
+	return ibv_cmd_query_port(ibvctx, port, port_attr, &cmd, sizeof(cmd));
+}
+
+struct ibv_pd *bnxt_re_alloc_pd(struct ibv_context *ibvctx)
+{
+	struct ibv_alloc_pd cmd;
+	struct bnxt_re_pd_resp resp;
+	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx);
+	struct bnxt_re_pd *pd;
+
+	pd = calloc(1, sizeof(*pd));
+	if (!pd)
+		return NULL;
+
+	memset(&resp, 0, sizeof(resp));
+	if (ibv_cmd_alloc_pd(ibvctx, &pd->ibvpd, &cmd, sizeof(cmd),
+			     &resp.resp, sizeof(resp)))
+		goto out;
+
+	pd->pdid = resp.pdid;
+
+	/* Map DB page now. */
+	cntx->udpi.dpindx = resp.dpi;
+	cntx->udpi.dbpage = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED,
+				 ibvctx->cmd_fd, resp.dbr);
+	if (cntx->udpi.dbpage == MAP_FAILED) {
+		(void)ibv_cmd_dealloc_pd(&pd->ibvpd);
+		goto out;
+	}
+	pthread_spin_init(&cntx->udpi.db_lock, PTHREAD_PROCESS_PRIVATE);
+
+	return &pd->ibvpd;
+out:
+	free(pd);
+	return NULL;
+}
+
+int bnxt_re_free_pd(struct ibv_pd *ibvpd)
+{
+	struct bnxt_re_pd *pd = to_bnxt_re_pd(ibvpd);
+	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvpd->context);
+	int status;
+
+	status = ibv_cmd_dealloc_pd(ibvpd);
+	if (status)
+		return status;
+
+	pthread_spin_destroy(&cntx->udpi.db_lock);
+	if (cntx->udpi.dbpage && (cntx->udpi.dbpage != MAP_FAILED))
+		munmap(cntx->udpi.dbpage, 4096);
+	free(pd);
+
+	return 0;
+}
+
+struct ibv_mr *bnxt_re_reg_mr(struct ibv_pd *ibvpd, void *sva, size_t len,
+			      int access)
+{
+	return NULL;
+}
+
+int bnxt_re_dereg_mr(struct ibv_mr *ibvmr)
+{
+	return -ENOSYS;
+}
+
+struct ibv_cq *bnxt_re_create_cq(struct ibv_context *ibvctx, int ncqe,
+				 struct ibv_comp_channel *channel, int vec)
+{
+	return NULL;
+}
+
+int bnxt_re_resize_cq(struct ibv_cq *ibvcq, int ncqe)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_destroy_cq(struct ibv_cq *ibvcq)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc)
+{
+	return -ENOSYS;
+}
+
+void bnxt_re_cq_event(struct ibv_cq *ibvcq)
+{
+
+}
+
+int bnxt_re_arm_cq(struct ibv_cq *ibvcq, int flags)
+{
+	return -ENOSYS;
+}
+
+struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
+				 struct ibv_qp_init_attr *attr)
+{
+	return NULL;
+}
+
+int bnxt_re_modify_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr,
+		      int attr_mask)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_query_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr,
+		     int attr_mask, struct ibv_qp_init_attr *init_attr)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_destroy_qp(struct ibv_qp *ibvqp)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+		      struct ibv_send_wr **bad)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
+		      struct ibv_recv_wr **bad)
+{
+	return -ENOSYS;
+}
+
+struct ibv_srq *bnxt_re_create_srq(struct ibv_pd *ibvpd,
+				   struct ibv_srq_init_attr *attr)
+{
+	return NULL;
+}
+
+int bnxt_re_modify_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr,
+		       int init_attr)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_destroy_srq(struct ibv_srq *ibvsrq)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_query_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr)
+{
+	return -ENOSYS;
+}
+
+int bnxt_re_post_srq_recv(struct ibv_srq *ibvsrq, struct ibv_recv_wr *wr,
+			  struct ibv_recv_wr **bad)
+{
+	return -ENOSYS;
+}
+
+struct ibv_ah *bnxt_re_create_ah(struct ibv_pd *ibvpd, struct ibv_ah_attr *attr)
+{
+	return NULL;
+}
+
+int bnxt_re_destroy_ah(struct ibv_ah *ibvah)
+{
+	return -ENOSYS;
+}
diff --git a/providers/bnxt_re/verbs.h b/providers/bnxt_re/verbs.h
new file mode 100644
index 0000000..1b2a4a7
--- /dev/null
+++ b/providers/bnxt_re/verbs.h
@@ -0,0 +1,101 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Internal IB-verbs function declaration
+ */
+
+#ifndef __VERBS_H__
+#define __VERBS_H__
+
+#include <assert.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+#include <pthread.h>
+#include <malloc.h>
+#include <sys/mman.h>
+#include <netinet/in.h>
+#include <unistd.h>
+
+#include <infiniband/driver.h>
+#include <infiniband/verbs.h>
+
+int bnxt_re_query_device(struct ibv_context *uctx,
+			 struct ibv_device_attr *attr);
+int bnxt_re_query_port(struct ibv_context *uctx, uint8_t port,
+		       struct ibv_port_attr *attr);
+struct ibv_pd *bnxt_re_alloc_pd(struct ibv_context *uctx);
+int bnxt_re_free_pd(struct ibv_pd *ibvpd);
+struct ibv_mr *bnxt_re_reg_mr(struct ibv_pd *ibvpd, void *buf, size_t len,
+			      int ibv_access_flags);
+int bnxt_re_dereg_mr(struct ibv_mr *ibvmr);
+
+struct ibv_cq *bnxt_re_create_cq(struct ibv_context *uctx, int ncqe,
+				 struct ibv_comp_channel *ch, int vec);
+int bnxt_re_resize_cq(struct ibv_cq *ibvcq, int ncqe);
+int bnxt_re_destroy_cq(struct ibv_cq *ibvcq);
+int bnxt_re_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc);
+void bnxt_re_cq_event(struct ibv_cq *ibvcq);
+int bnxt_re_arm_cq(struct ibv_cq *ibvcq, int flags);
+
+struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
+				 struct ibv_qp_init_attr *attr);
+int bnxt_re_modify_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr,
+		      int ibv_qp_attr_mask);
+int bnxt_re_query_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr,
+		     int attr_mask, struct ibv_qp_init_attr *init_attr);
+int bnxt_re_destroy_qp(struct ibv_qp *ibvqp);
+int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+		      struct ibv_send_wr **bad);
+int bnxt_re_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
+		      struct ibv_recv_wr **bad);
+
+struct ibv_srq *bnxt_re_create_srq(struct ibv_pd *ibvpd,
+				   struct ibv_srq_init_attr *attr);
+int bnxt_re_modify_srq(struct ibv_srq *ibvsrq,
+		       struct ibv_srq_attr *attr, int mask);
+int bnxt_re_destroy_srq(struct ibv_srq *ibvsrq);
+int bnxt_re_query_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr);
+int bnxt_re_post_srq_recv(struct ibv_srq *ibvsrq, struct ibv_recv_wr *wr,
+			  struct ibv_recv_wr **bad);
+
+struct ibv_ah *bnxt_re_create_ah(struct ibv_pd *ibvpd,
+				 struct ibv_ah_attr *attr);
+int bnxt_re_destroy_ah(struct ibv_ah *ibvah);
+
+#endif /* __VERBS_H__ */
-- 
1.8.3.1


* [rdma-core v2 2/9] libbnxt_re: Add support for user memory regions
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
  2017-02-18 15:43   ` [rdma-core v2 1/9] libbnxt_re: introduce bnxtre user space RDMA provider Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 3/9] libbnxt_re: Add support for CQ and QP management Devesh Sharma
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds code to allow user applications to register and
unregister memory buffers with the HCA. The following functions are
now supported:
 - ibv_reg_mr()
 - ibv_dereg_mr()
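
The register/unregister pair follows a common provider pattern: allocate a driver-private wrapper, issue the command, free the wrapper if the command fails, and recover the wrapper from the embedded base object on teardown. The sketch below illustrates that pattern only. It assumes stand-in types and a stubbed command (base_mr, wrapped_mr, fake_cmd_reg and the example_* helpers are all hypothetical) so that it builds without libibverbs; it is not the code from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct ibv_mr; not the real libibverbs type. */
struct base_mr {
	unsigned int lkey;
};

/* Driver-private wrapper, analogous to struct bnxt_re_mr. */
struct wrapped_mr {
	struct base_mr base;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define example_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the kernel command; fails when len is 0. */
static int fake_cmd_reg(struct base_mr *mr, size_t len)
{
	if (len == 0)
		return -1;
	mr->lkey = 0x1234;
	return 0;
}

/* Allocate the wrapper, issue the command, undo the allocation on failure. */
static struct base_mr *example_reg_mr(size_t len)
{
	struct wrapped_mr *mr = calloc(1, sizeof(*mr));

	if (!mr)
		return NULL;
	if (fake_cmd_reg(&mr->base, len)) {
		free(mr);	/* command failed: release the wrapper */
		return NULL;
	}
	return &mr->base;
}

/* Map the base object back to its wrapper and release it. */
static void example_dereg_mr(struct base_mr *base)
{
	struct wrapped_mr *mr =
		example_container_of(base, struct wrapped_mr, base);

	assert(mr != NULL);
	free(mr);
}
```

Using a container_of-style macro instead of a plain cast keeps the teardown path correct even if the base object is later moved away from the first position in the wrapper.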

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/bnxt_re-abi.h |  4 ++++
 providers/bnxt_re/verbs.c       | 26 ++++++++++++++++++++++++--
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
index 05a0888..53645d5 100644
--- a/providers/bnxt_re/bnxt_re-abi.h
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -56,4 +56,8 @@ struct bnxt_re_pd_resp {
 	__u64 dbr;
 };
 
+struct bnxt_re_mr_resp {
+	struct ibv_reg_mr_resp resp;
+};
+
 #endif
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index 5b3e1cc..72a3443 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -129,12 +129,34 @@ int bnxt_re_free_pd(struct ibv_pd *ibvpd)
 struct ibv_mr *bnxt_re_reg_mr(struct ibv_pd *ibvpd, void *sva, size_t len,
 			      int access)
 {
-	return NULL;
+	struct bnxt_re_mr *mr;
+	struct ibv_reg_mr cmd;
+	struct bnxt_re_mr_resp resp;
+
+	mr = calloc(1, sizeof(*mr));
+	if (!mr)
+		return NULL;
+
+	if (ibv_cmd_reg_mr(ibvpd, sva, len, (uint64_t)sva, access, &mr->ibvmr,
+			   &cmd, sizeof(cmd), &resp.resp, sizeof(resp))) {
+		free(mr);
+		return NULL;
+	}
+
+	return &mr->ibvmr;
 }
 
 int bnxt_re_dereg_mr(struct ibv_mr *ibvmr)
 {
-	return -ENOSYS;
+	struct bnxt_re_mr *mr = (struct bnxt_re_mr *)ibvmr;
+	int status;
+
+	status = ibv_cmd_dereg_mr(ibvmr);
+	if (status)
+		return status;
+	free(mr);
+
+	return 0;
 }
 
 struct ibv_cq *bnxt_re_create_cq(struct ibv_context *ibvctx, int ncqe,
-- 
1.8.3.1

* [rdma-core v2 3/9] libbnxt_re: Add support for CQ and QP management
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
  2017-02-18 15:43   ` [rdma-core v2 1/9] libbnxt_re: introduce bnxtre user space RDMA provider Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 2/9] libbnxt_re: Add support for user memory regions Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 4/9] libbnxt_re: Add support for posting and polling Devesh Sharma
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds support for completion queue creation and
destruction. The following are the changes:

 - Added User/Kernel ABI to communicate CQ specific parameters.
 - Added a function in a new file to allocate Page-Aligned address
   space.
 - Added a function to free page-aligned address space.
 - Added function to create and destroy completion queue.
 - Add ABI for QP creation and WQE/RQE format.
 - Add functions to allocate SQ, RQ and Search PSN address
   space.
 - Add functions to store/clean qp-handles in the form of
   a linear table. A table is maintained in every instance
   of ucontext.
 - CQ and QP contexts now hold a pointer to the DPI mapped
   during PD allocation.
 - Removed hard-coding of page size during mapping DB page.
 - Renamed a variable in PD code.
 - Add support for create-qp.
 - Add support for destroy-qp.
 - Add support for modify-qp.
 - Add support for query-qp.
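
The queue sizing arithmetic used by this patch can be sketched in isolation as
follows. This is a simplified model under stated assumptions, not the provider
code: every sketch_* name is hypothetical, the stride and PSN entry size are
passed in as parameters, and sketch_roundup_pow2() returns 1 for an input of 1,
whereas roundup_pow_of_two() in the diff returns 2 in that case.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to a multiple of al_size; al_size must be a power of two. */
static inline uint64_t sketch_align_up(uint64_t size, uint64_t al_size)
{
	assert(al_size && (al_size & (al_size - 1)) == 0);
	return (size + al_size - 1) & ~(al_size - 1);
}

/* Smallest power of two >= val (1 for val <= 1). */
static inline uint32_t sketch_roundup_pow2(uint32_t val)
{
	uint32_t r = 1;

	assert(val <= 0x80000000u);	/* avoid shifting past 32 bits */
	while (r < val)
		r <<= 1;
	return r;
}

/* CQ depth: one spare entry, power of two, clamped to the device limit. */
static inline uint32_t sketch_cq_depth(uint32_t ncqe, uint32_t max_cq_depth)
{
	uint32_t depth = sketch_roundup_pow2(ncqe + 1);

	if (depth > max_cq_depth + 1)
		depth = max_cq_depth + 1;
	return depth;
}

struct sketch_sq_layout {
	uint32_t depth;		/* usable SQ entries */
	uint32_t psn_offset;	/* byte offset of the PSN search area */
	uint64_t bytes;		/* page-aligned size of the mapping */
};

/* SQ entries followed by one PSN record per entry, in the same mapping. */
static inline struct sketch_sq_layout
sketch_sq_compute(uint32_t max_send_wr, uint32_t stride,
		  uint32_t psn_sz, uint32_t pg_size)
{
	struct sketch_sq_layout l;
	uint32_t psn_depth;

	l.depth = sketch_roundup_pow2(max_send_wr);
	/* Extra stride-sized entries for the PSN area; division truncates. */
	psn_depth = (l.depth * psn_sz) / stride;
	l.bytes = sketch_align_up((uint64_t)(l.depth + psn_depth) * stride,
				  pg_size);
	l.psn_offset = l.depth * stride;
	return l;
}
```

With the structure sizes from this patch (a 128-byte SQE stride and an 8-byte
struct bnxt_re_psns), a request for 50 send WRs on 4096-byte pages gives 64
entries, a PSN area starting at byte 8192 and a 12288-byte mapping.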

v1->v2
 -- Sorted file list
 -- Removed inline functions from abi.h

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/CMakeLists.txt |   1 +
 providers/bnxt_re/bnxt_re-abi.h  | 113 +++++++++++++++++++
 providers/bnxt_re/main.c         |   5 +
 providers/bnxt_re/main.h         |  73 ++++++++++--
 providers/bnxt_re/memory.c       |  76 +++++++++++++
 providers/bnxt_re/memory.h       |  76 +++++++++++++
 providers/bnxt_re/verbs.c        | 236 ++++++++++++++++++++++++++++++++++++++-
 7 files changed, 562 insertions(+), 18 deletions(-)
 create mode 100644 providers/bnxt_re/memory.c
 create mode 100644 providers/bnxt_re/memory.h

diff --git a/providers/bnxt_re/CMakeLists.txt b/providers/bnxt_re/CMakeLists.txt
index 45e609d..7ea5ed8 100644
--- a/providers/bnxt_re/CMakeLists.txt
+++ b/providers/bnxt_re/CMakeLists.txt
@@ -1,4 +1,5 @@
 rdma_provider(bnxt_re
 	main.c
+	memory.c
 	verbs.c
 )
diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
index 53645d5..6407e1b 100644
--- a/providers/bnxt_re/bnxt_re-abi.h
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -47,6 +47,10 @@ struct bnxt_re_cntx_resp {
 	struct ibv_get_context_resp resp;
 	__u32 dev_id;
 	__u32 max_qp; /* To allocate qp-table */
+	__u32 pg_size;
+	__u32 cqe_size;
+	__u32 max_cqd;
+	__u32 rsvd;
 };
 
 struct bnxt_re_pd_resp {
@@ -60,4 +64,113 @@ struct bnxt_re_mr_resp {
 	struct ibv_reg_mr_resp resp;
 };
 
+struct bnxt_re_cq_req {
+	struct ibv_create_cq cmd;
+	__u64 cq_va;
+	__u64 cq_handle;
+};
+
+struct bnxt_re_cq_resp {
+	struct ibv_create_cq_resp resp;
+	__u32 cqid;
+	__u32 tail;
+	__u32 phase;
+	__u32 rsvd;
+};
+
+struct bnxt_re_qp_req {
+	struct ibv_create_qp cmd;
+	__u64 qpsva;
+	__u64 qprva;
+	__u64 qp_handle;
+};
+
+struct bnxt_re_qp_resp {
+	struct ibv_create_qp_resp resp;
+	__u32 qpid;
+	__u32 rsvd;
+};
+
+struct bnxt_re_bsqe {
+	__u32 rsv_ws_fl_wt;
+	__u32 key_immd;
+};
+
+struct bnxt_re_psns {
+	__u32 opc_spsn;
+	__u32 flg_npsn;
+};
+
+struct bnxt_re_sge {
+	__u32 pa_lo;
+	__u32 pa_hi;
+	__u32 lkey;
+	__u32 length;
+};
+
+/*  Cu+ max inline data */
+#define BNXT_RE_MAX_INLINE_SIZE		0x60
+
+struct bnxt_re_send {
+	__u32 length;
+	__u32 qkey;
+	__u32 dst_qp;
+	__u32 avid;
+	__u64 rsvd;
+};
+
+struct bnxt_re_raw {
+	__u32 length;
+	__u32 rsvd1;
+	__u32 cfa_meta;
+	__u32 rsvd2;
+	__u64 rsvd3;
+};
+
+struct bnxt_re_rdma {
+	__u32 length;
+	__u32 rsvd1;
+	__u32 rva_lo;
+	__u32 rva_hi;
+	__u32 rkey;
+	__u32 rsvd2;
+};
+
+struct bnxt_re_atomic {
+	__u32 rva_lo;
+	__u32 rva_hi;
+	__u32 swp_dt_lo;
+	__u32 swp_dt_hi;
+	__u32 cmp_dt_lo;
+	__u32 cmp_dt_hi;
+};
+
+struct bnxt_re_inval {
+	__u64 rsvd[3];
+};
+
+struct bnxt_re_bind {
+	__u32 plkey;
+	__u32 lkey;
+	__u32 va_lo;
+	__u32 va_hi;
+	__u32 len_lo;
+	__u32 len_hi; /* only 40 bits are valid */
+};
+
+struct bnxt_re_brqe {
+	__u32 rsv_ws_fl_wt;
+	__u32 rsvd;
+};
+
+struct bnxt_re_rqe {
+	__u64 rsvd[3];
+};
+
+struct bnxt_re_srqe {
+	__u32 srq_tag; /* 20 bits are valid */
+	__u32 rsvd1;
+	__u64 rsvd[2];
+};
+
 #endif
diff --git a/providers/bnxt_re/main.c b/providers/bnxt_re/main.c
index 23f2f4f..c9fdd10 100644
--- a/providers/bnxt_re/main.c
+++ b/providers/bnxt_re/main.c
@@ -110,8 +110,10 @@ static int bnxt_re_init_context(struct verbs_device *vdev,
 {
 	struct ibv_get_context cmd;
 	struct bnxt_re_cntx_resp resp;
+	struct bnxt_re_dev *dev;
 	struct bnxt_re_context *cntx;
 
+	dev = to_bnxt_re_dev(&vdev->device);
 	cntx = to_bnxt_re_context(ibvctx);
 
 	memset(&resp, 0, sizeof(resp));
@@ -122,6 +124,9 @@ static int bnxt_re_init_context(struct verbs_device *vdev,
 
 	cntx->dev_id = resp.dev_id;
 	cntx->max_qp = resp.max_qp;
+	dev->pg_size = resp.pg_size;
+	dev->cqe_size = resp.cqe_size;
+	dev->max_cq_depth = resp.max_cqd;
 	ibvctx->ops = bnxt_re_cntx_ops;
 
 	return 0;
diff --git a/providers/bnxt_re/main.h b/providers/bnxt_re/main.h
index d621efa..9677ff7 100644
--- a/providers/bnxt_re/main.h
+++ b/providers/bnxt_re/main.h
@@ -48,6 +48,15 @@
 #include <util/udma_barrier.h>
 
 #include "bnxt_re-abi.h"
+#include "memory.h"
+
+#define DEV	"bnxtre : "
+
+struct bnxt_re_dpi {
+	__u32 dpindx;
+	__u64 *dbpage;
+	pthread_spinlock_t db_lock;
+};
 
 struct bnxt_re_pd {
 	struct ibv_pd ibvpd;
@@ -56,31 +65,48 @@ struct bnxt_re_pd {
 
 struct bnxt_re_cq {
 	struct ibv_cq ibvcq;
-};
-
-struct bnxt_re_qp {
-	struct ibv_qp ibvqp;
+	uint32_t cqid;
+	struct bnxt_re_queue cqq;
+	struct bnxt_re_dpi *udpi;
+	uint32_t cqe_size;
+	uint8_t  phase;
 };
 
 struct bnxt_re_srq {
 	struct ibv_srq ibvsrq;
 };
 
-struct bnxt_re_mr {
-	struct ibv_mr ibvmr;
+struct bnxt_re_qp {
+	struct ibv_qp ibvqp;
+	struct bnxt_re_queue *sqq;
+	struct bnxt_re_psns *psns; /* start ptr. */
+	struct bnxt_re_queue *rqq;
+	struct bnxt_re_srq *srq;
+	struct bnxt_re_cq *scq;
+	struct bnxt_re_cq *rcq;
+	struct bnxt_re_dpi *udpi;
+	uint64_t *swrid;
+	uint64_t *rwrid;
+	uint32_t qpid;
+	uint32_t tbl_indx;
+	uint16_t mtu;
+	uint16_t qpst;
+	uint8_t qptyp;
+	/* wrid? */
+	/* irdord? */
 };
 
-#define DEV	"bnxtre : "
-
-struct bnxt_re_dpi {
-	__u32 dpindx;
-	__u64 *dbpage;
-	pthread_spinlock_t db_lock;
+struct bnxt_re_mr {
+	struct ibv_mr ibvmr;
 };
 
 struct bnxt_re_dev {
 	struct verbs_device vdev;
 	uint8_t abi_version;
+	uint32_t pg_size;
+
+	uint32_t cqe_size;
+	uint32_t max_cq_depth;
 };
 
 struct bnxt_re_context {
@@ -107,4 +133,27 @@ static inline struct bnxt_re_pd *to_bnxt_re_pd(struct ibv_pd *ibvpd)
 	return container_of(ibvpd, struct bnxt_re_pd, ibvpd);
 }
 
+static inline struct bnxt_re_cq *to_bnxt_re_cq(struct ibv_cq *ibvcq)
+{
+	return container_of(ibvcq, struct bnxt_re_cq, ibvcq);
+}
+
+static inline struct bnxt_re_qp *to_bnxt_re_qp(struct ibv_qp *ibvqp)
+{
+	return container_of(ibvqp, struct bnxt_re_qp, ibvqp);
+}
+
+static inline uint32_t bnxt_re_get_sqe_sz(void)
+{
+	return sizeof(struct bnxt_re_bsqe) +
+	       sizeof(struct bnxt_re_send) +
+	       BNXT_RE_MAX_INLINE_SIZE;
+}
+
+static inline uint32_t bnxt_re_get_rqe_sz(void)
+{
+	return sizeof(struct bnxt_re_brqe) +
+	       sizeof(struct bnxt_re_rqe) +
+	       BNXT_RE_MAX_INLINE_SIZE;
+}
 #endif
diff --git a/providers/bnxt_re/memory.c b/providers/bnxt_re/memory.c
new file mode 100644
index 0000000..67125e9
--- /dev/null
+++ b/providers/bnxt_re/memory.c
@@ -0,0 +1,76 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Implements method to allocate page-aligned memory
+ *              buffers.
+ */
+
+#include <string.h>
+#include <sys/mman.h>
+
+#include "main.h"
+
+int bnxt_re_alloc_aligned(struct bnxt_re_queue *que, uint32_t pg_size)
+{
+	int ret, bytes;
+
+	bytes = (que->depth * que->stride);
+	que->bytes = get_aligned(bytes, pg_size);
+	que->va = mmap(NULL, que->bytes, PROT_READ | PROT_WRITE,
+		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (que->va == MAP_FAILED) {
+		que->bytes = 0;
+		return errno;
+	}
+	/* Touch pages before proceeding. */
+	memset(que->va, 0, que->bytes);
+
+	ret = ibv_dontfork_range(que->va, que->bytes);
+	if (ret) {
+		munmap(que->va, que->bytes);
+		que->bytes = 0;
+	}
+
+	return ret;
+}
+
+void bnxt_re_free_aligned(struct bnxt_re_queue *que)
+{
+	if (que->bytes) {
+		ibv_dofork_range(que->va, que->bytes);
+		munmap(que->va, que->bytes);
+		que->bytes = 0;
+	}
+}
diff --git a/providers/bnxt_re/memory.h b/providers/bnxt_re/memory.h
new file mode 100644
index 0000000..ea29a24
--- /dev/null
+++ b/providers/bnxt_re/memory.h
@@ -0,0 +1,76 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Implements data-structure to allocate page-aligned
+ *              memory buffer.
+ */
+
+#ifndef __MEMORY_H__
+#define __MEMORY_H__
+
+#include <pthread.h>
+
+struct bnxt_re_queue {
+	void *va;
+	uint32_t bytes; /* for munmap */
+	uint32_t depth; /* no. of entries */
+	uint32_t head;
+	uint32_t tail;
+	uint32_t stride;
+	pthread_spinlock_t qlock;
+};
+
+static inline unsigned long get_aligned(uint32_t size, uint32_t al_size)
+{
+	return (unsigned long)(size + al_size - 1) & ~(al_size - 1);
+}
+
+static inline unsigned long roundup_pow_of_two(unsigned long val)
+{
+	unsigned long roundup = 1;
+
+	if (val == 1)
+		return (roundup << 1);
+
+	while (roundup < val)
+		roundup <<= 1;
+
+	return roundup;
+}
+
+int bnxt_re_alloc_aligned(struct bnxt_re_queue *que, uint32_t pg_size);
+void bnxt_re_free_aligned(struct bnxt_re_queue *que);
+
+#endif
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index 72a3443..66f951c 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -79,6 +79,7 @@ struct ibv_pd *bnxt_re_alloc_pd(struct ibv_context *ibvctx)
 	struct ibv_alloc_pd cmd;
 	struct bnxt_re_pd_resp resp;
 	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx);
+	struct bnxt_re_dev *dev = to_bnxt_re_dev(ibvctx->device);
 	struct bnxt_re_pd *pd;
 
 	pd = calloc(1, sizeof(*pd));
@@ -94,7 +95,7 @@ struct ibv_pd *bnxt_re_alloc_pd(struct ibv_context *ibvctx)
 
 	/* Map DB page now. */
 	cntx->udpi.dpindx = resp.dpi;
-	cntx->udpi.dbpage = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED,
+	cntx->udpi.dbpage = mmap(NULL, dev->pg_size, PROT_WRITE, MAP_SHARED,
 				 ibvctx->cmd_fd, resp.dbr);
 	if (cntx->udpi.dbpage == MAP_FAILED) {
 		(void)ibv_cmd_dealloc_pd(&pd->ibvpd);
@@ -112,6 +113,7 @@ int bnxt_re_free_pd(struct ibv_pd *ibvpd)
 {
 	struct bnxt_re_pd *pd = to_bnxt_re_pd(ibvpd);
 	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvpd->context);
+	struct bnxt_re_dev *dev = to_bnxt_re_dev(cntx->ibvctx.device);
 	int status;
 
 	status = ibv_cmd_dealloc_pd(ibvpd);
@@ -120,7 +122,8 @@ int bnxt_re_free_pd(struct ibv_pd *ibvpd)
 
 	pthread_spin_destroy(&cntx->udpi.db_lock);
 	if (cntx->udpi.dbpage && (cntx->udpi.dbpage != MAP_FAILED))
-		munmap(cntx->udpi.dbpage, 4096);
+		munmap(cntx->udpi.dbpage, dev->pg_size);
+
 	free(pd);
 
 	return 0;
@@ -162,6 +165,48 @@ int bnxt_re_dereg_mr(struct ibv_mr *ibvmr)
 struct ibv_cq *bnxt_re_create_cq(struct ibv_context *ibvctx, int ncqe,
 				 struct ibv_comp_channel *channel, int vec)
 {
+	struct bnxt_re_cq *cq;
+	struct bnxt_re_cq_req cmd;
+	struct bnxt_re_cq_resp resp;
+
+	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx);
+	struct bnxt_re_dev *dev = to_bnxt_re_dev(ibvctx->device);
+
+	if (ncqe > dev->max_cq_depth)
+		return NULL;
+
+	cq = calloc(1, sizeof(*cq));
+	if (!cq)
+		return NULL;
+
+	cq->cqq.depth = roundup_pow_of_two(ncqe + 1);
+	if (cq->cqq.depth > dev->max_cq_depth + 1)
+		cq->cqq.depth = dev->max_cq_depth + 1;
+	cq->cqq.stride = dev->cqe_size;
+	if (bnxt_re_alloc_aligned(&cq->cqq, dev->pg_size))
+		goto fail;
+
+	pthread_spin_init(&cq->cqq.qlock, PTHREAD_PROCESS_PRIVATE);
+
+	cmd.cq_va = (uint64_t)cq->cqq.va;
+	cmd.cq_handle = (uint64_t)cq;
+
+	memset(&resp, 0, sizeof(resp));
+	if (ibv_cmd_create_cq(ibvctx, ncqe, channel, vec,
+			      &cq->ibvcq, &cmd.cmd, sizeof(cmd),
+			      &resp.resp, sizeof(resp)))
+		goto cmdfail;
+
+	cq->cqid = resp.cqid;
+	cq->phase = resp.phase;
+	cq->cqq.tail = resp.tail;
+	cq->udpi = &cntx->udpi;
+
+	return &cq->ibvcq;
+cmdfail:
+	bnxt_re_free_aligned(&cq->cqq);
+fail:
+	free(cq);
 	return NULL;
 }
 
@@ -172,7 +217,17 @@ int bnxt_re_resize_cq(struct ibv_cq *ibvcq, int ncqe)
 
 int bnxt_re_destroy_cq(struct ibv_cq *ibvcq)
 {
-	return -ENOSYS;
+	int status;
+	struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq);
+
+	status = ibv_cmd_destroy_cq(ibvcq);
+	if (status)
+		return status;
+
+	bnxt_re_free_aligned(&cq->cqq);
+	free(cq);
+
+	return 0;
 }
 
 int bnxt_re_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc)
@@ -190,27 +245,196 @@ int bnxt_re_arm_cq(struct ibv_cq *ibvcq, int flags)
 	return -ENOSYS;
 }
 
+static int bnxt_re_check_qp_limits(struct ibv_qp_init_attr *attr)
+{
+	return 0;
+}
+
+static void bnxt_re_free_queue_ptr(struct bnxt_re_qp *qp)
+{
+	if (qp->rqq)
+		free(qp->rqq);
+	if (qp->sqq)
+		free(qp->sqq);
+}
+
+static int bnxt_re_alloc_queue_ptr(struct bnxt_re_qp *qp,
+				   struct ibv_qp_init_attr *attr)
+{
+	qp->sqq = calloc(1, sizeof(struct bnxt_re_queue));
+	if (!qp->sqq)
+		return -ENOMEM;
+	if (attr->srq)
+		qp->srq = NULL;/*TODO: to_bnxt_re_srq(attr->srq);*/
+	else {
+		qp->rqq = calloc(1, sizeof(struct bnxt_re_queue));
+		if (!qp->rqq) {
+			free(qp->sqq);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static void bnxt_re_free_queues(struct bnxt_re_qp *qp)
+{
+	if (qp->rwrid)
+		free(qp->rwrid);
+	pthread_spin_destroy(&qp->rqq->qlock);
+	bnxt_re_free_aligned(qp->rqq);
+
+	if (qp->swrid)
+		free(qp->swrid);
+	pthread_spin_destroy(&qp->sqq->qlock);
+	bnxt_re_free_aligned(qp->sqq);
+}
+
+static int bnxt_re_alloc_queues(struct bnxt_re_qp *qp,
+				struct ibv_qp_init_attr *attr,
+				uint32_t pg_size) {
+	struct bnxt_re_queue *que;
+	uint32_t psn_depth;
+	int ret;
+
+	if (attr->cap.max_send_wr) {
+		que = qp->sqq;
+		que->stride = bnxt_re_get_sqe_sz();
+		que->depth = roundup_pow_of_two(attr->cap.max_send_wr);
+		/* psn_depth extra entries of size que->stride */
+		psn_depth = (que->depth * sizeof(struct bnxt_re_psns)) /
+			     que->stride;
+		que->depth += psn_depth;
+		ret = bnxt_re_alloc_aligned(qp->sqq, pg_size);
+		if (ret)
+			return ret;
+		/* exclude psns depth*/
+		que->depth -= psn_depth;
+		/* start of spsn space sizeof(struct bnxt_re_psns) each. */
+		qp->psns = (que->va + que->stride * que->depth);
+		pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE);
+		qp->swrid = calloc(que->depth, sizeof(uint64_t));
+		if (!qp->swrid) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+	}
+
+	if (attr->cap.max_recv_wr && qp->rqq) {
+		que = qp->rqq;
+		que->stride = bnxt_re_get_rqe_sz();
+		que->depth = roundup_pow_of_two(attr->cap.max_recv_wr);
+		ret = bnxt_re_alloc_aligned(qp->rqq, pg_size);
+		if (ret)
+			goto fail;
+		pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE);
+		qp->rwrid = calloc(que->depth, sizeof(uint64_t));
+		if (!qp->rwrid) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+	}
+
+	return 0;
+
+fail:
+	bnxt_re_free_queues(qp);
+	return ret;
+}
+
 struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
 				 struct ibv_qp_init_attr *attr)
 {
+	struct bnxt_re_qp *qp;
+	struct bnxt_re_qp_req req;
+	struct bnxt_re_qp_resp resp;
+
+	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvpd->context);
+	struct bnxt_re_dev *dev = to_bnxt_re_dev(cntx->ibvctx.device);
+
+	if (bnxt_re_check_qp_limits(attr))
+		return NULL;
+
+	qp = calloc(1, sizeof(*qp));
+	if (!qp)
+		return NULL;
+	/* alloc queue pointers */
+	if (bnxt_re_alloc_queue_ptr(qp, attr))
+		goto fail;
+	/* alloc queues */
+	if (bnxt_re_alloc_queues(qp, attr, dev->pg_size))
+		goto failq;
+	/* Fill ibv_cmd */
+	req.qpsva = (uint64_t)qp->sqq->va;
+	req.qprva = qp->rqq ? (uint64_t)qp->rqq->va : 0;
+	req.qp_handle = (uint64_t)qp;
+
+	if (ibv_cmd_create_qp(ibvpd, &qp->ibvqp, attr, &req.cmd, sizeof(req),
+			      &resp.resp, sizeof(resp))) {
+		goto failcmd;
+	}
+
+	qp->qpid = resp.qpid;
+	qp->qptyp = attr->qp_type;
+	qp->qpst = IBV_QPS_RESET;
+	qp->scq = to_bnxt_re_cq(attr->send_cq);
+	qp->rcq = to_bnxt_re_cq(attr->recv_cq);
+	qp->udpi = &cntx->udpi;
+
+	return &qp->ibvqp;
+failcmd:
+	bnxt_re_free_queues(qp);
+failq:
+	bnxt_re_free_queue_ptr(qp);
+fail:
+	free(qp);
+
 	return NULL;
 }
 
 int bnxt_re_modify_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr,
 		      int attr_mask)
 {
-	return -ENOSYS;
+	struct ibv_modify_qp cmd = {};
+	struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp);
+	int rc;
+
+	rc = ibv_cmd_modify_qp(ibvqp, attr, attr_mask, &cmd, sizeof(cmd));
+	if (!rc)
+		qp->qpst = ibvqp->state;
+
+	return rc;
 }
 
 int bnxt_re_query_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr,
 		     int attr_mask, struct ibv_qp_init_attr *init_attr)
 {
-	return -ENOSYS;
+	struct ibv_query_qp cmd;
+	struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp);
+	int rc;
+
+	rc = ibv_cmd_query_qp(ibvqp, attr, attr_mask, init_attr,
+			      &cmd, sizeof(cmd));
+	if (!rc)
+		qp->qpst = ibvqp->state;
+
+	return rc;
 }
 
 int bnxt_re_destroy_qp(struct ibv_qp *ibvqp)
 {
-	return -ENOSYS;
+	struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp);
+	int status;
+
+	status = ibv_cmd_destroy_qp(ibvqp);
+	if (status)
+		return status;
+
+	bnxt_re_free_queues(qp);
+	bnxt_re_free_queue_ptr(qp);
+	free(qp);
+
+	return 0;
 }
 
 int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
-- 
1.8.3.1

* [rdma-core v2 4/9] libbnxt_re: Add support for posting and polling
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-02-18 15:43   ` [rdma-core v2 3/9] libbnxt_re: Add support for CQ and QP management Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 5/9] libbnxt_re: Allow apps to poll for flushed completions Devesh Sharma
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds code to support the ibv_post_recv(),
ibv_post_send(), ibv_poll_cq() and ibv_arm_cq()
routines. With this patch, applications are able
to enqueue RQEs or WQEs, ring doorbells and poll for
completions from the CQ. Currently, this code does not
support SRQ, UD service and flush completions.
Following are the major changes:

 - Added most of the enums to handle device specific
   opcodes, masks, shifts and data structures.
 - Added a new file to define DB related routines.
 - Added routines to handle circular queue operations.
 - Added enums and a few utility functions.
 - Added bnxt_re_post_recv().
 - Add code to build and post SQEs for RDMA-WRITE,
   RDMA-READ, SEND through bnxt_re_post_send() routine.
 - Fixed a couple of bugs in create-qp and modify-qp.
 - bnxt_re_create_qp() now checks the limits.
 - Add polling support for RC send completions.
 - Add polling support for RC Recv completions.
 - Add support to ARM completion queue.
 - Cleanup CQ while QP is being destroyed.
 - Add utility functions to convert chip specific
   completion codes to IB stack specific codes.
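
The doorbell encoding introduced in db.c can be modelled in isolation as
below. The mask and shift values are copied from enum bnxt_re_db_mask in this
patch; the SK_* names and both functions are hypothetical. The sketch assumes
the index occupies the low 32 bits of the 64-bit doorbell value, which is what
the struct layout yields on a little-endian host, and it writes to a byte
array instead of a mapped doorbell page.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from enum bnxt_re_db_mask; the SK_ names are hypothetical. */
#define SK_DB_INDX_MASK		0xFFFFFu
#define SK_DB_QID_MASK		0xFFFFFu
#define SK_DB_TYP_MASK		0x0Fu
#define SK_DB_TYP_SHIFT		28

/* Pack index (low word) and type/queue-id (high word) into one value. */
uint64_t sketch_db_value(uint32_t indx, uint32_t qid, uint32_t typ)
{
	uint32_t lo = indx & SK_DB_INDX_MASK;
	uint32_t hi = (qid & SK_DB_QID_MASK) |
		      ((typ & SK_DB_TYP_MASK) << SK_DB_TYP_SHIFT);

	return ((uint64_t)hi << 32) | lo;
}

/* Serialize in little-endian byte order, independent of host endianness. */
void sketch_db_to_le_bytes(uint64_t v, uint8_t out[8])
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}
```

For example, a CQ doorbell (type 0x04) for queue id 0x12 at index 5 packs to
0x4000001200000005 under these assumptions.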

v1->v2
 -- Delete redefinition of "true" and "false"
 -- Removed unwanted wmb()
 -- Removed dead code and fixed return type mismatch.
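
The base CQE header word (flg_st_typ_ph) added by this patch can be decoded
with the masks and shifts from enum bnxt_re_bcqe_mask, roughly as sketched
below. The struct and function names are hypothetical, the word is taken in
host byte order, and treating an entry as valid when its phase bit matches the
consumer's expected phase is an assumption about the polling scheme, not
something this excerpt shows.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of the base CQE header word; names are hypothetical. */
struct sketch_cqe_hdr {
	uint8_t phase;		/* bit 0 */
	uint8_t type;		/* bits 1..4 */
	uint8_t status;		/* bits 8..15 */
	uint16_t flags;		/* bits 16..31 */
};

/* Split the word using the mask/shift values from enum bnxt_re_bcqe_mask. */
struct sketch_cqe_hdr sketch_decode_bcqe(uint32_t w)
{
	struct sketch_cqe_hdr h;

	h.phase = w & 0x01;
	h.type = (w >> 0x01) & 0x0F;
	h.status = (w >> 0x08) & 0xFF;
	h.flags = (w >> 0x10) & 0xFFFF;
	return h;
}

/* Assumed validity rule: phase bit equals the consumer's expected phase. */
int sketch_cqe_valid(uint32_t w, uint8_t expected_phase)
{
	return (w & 0x01) == (uint32_t)(expected_phase & 0x01);
}
```

A consumer built on this rule would flip its expected phase each time it wraps
around the queue, so stale entries from the previous pass are not mistaken for
new completions.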

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/CMakeLists.txt |   1 +
 providers/bnxt_re/bnxt_re-abi.h  | 174 +++++++++-
 providers/bnxt_re/db.c           |  90 ++++++
 providers/bnxt_re/main.c         |   1 +
 providers/bnxt_re/main.h         | 199 +++++++++++-
 providers/bnxt_re/memory.h       |  63 ++++
 providers/bnxt_re/verbs.c        | 665 ++++++++++++++++++++++++++++++++++++---
 7 files changed, 1152 insertions(+), 41 deletions(-)
 create mode 100644 providers/bnxt_re/db.c

diff --git a/providers/bnxt_re/CMakeLists.txt b/providers/bnxt_re/CMakeLists.txt
index 7ea5ed8..13ad287 100644
--- a/providers/bnxt_re/CMakeLists.txt
+++ b/providers/bnxt_re/CMakeLists.txt
@@ -1,4 +1,5 @@
 rdma_provider(bnxt_re
+	db.c
 	main.c
 	memory.c
 	verbs.c
diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
index 6407e1b..b7eef36 100644
--- a/providers/bnxt_re/bnxt_re-abi.h
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -43,6 +43,142 @@
 
 #define BNXT_RE_ABI_VERSION 1
 
+enum bnxt_re_wr_opcode {
+	BNXT_RE_WR_OPCD_SEND		= 0x00,
+	BNXT_RE_WR_OPCD_SEND_IMM	= 0x01,
+	BNXT_RE_WR_OPCD_SEND_INVAL	= 0x02,
+	BNXT_RE_WR_OPCD_RDMA_WRITE	= 0x04,
+	BNXT_RE_WR_OPCD_RDMA_WRITE_IMM	= 0x05,
+	BNXT_RE_WR_OPCD_RDMA_READ	= 0x06,
+	BNXT_RE_WR_OPCD_ATOMIC_CS	= 0x08,
+	BNXT_RE_WR_OPCD_ATOMIC_FA	= 0x0B,
+	BNXT_RE_WR_OPCD_LOC_INVAL	= 0x0C,
+	BNXT_RE_WR_OPCD_BIND		= 0x0E,
+	BNXT_RE_WR_OPCD_RECV		= 0x80
+};
+
+enum bnxt_re_wr_flags {
+	BNXT_RE_WR_FLAGS_INLINE		= 0x10,
+	BNXT_RE_WR_FLAGS_SE		= 0x08,
+	BNXT_RE_WR_FLAGS_UC_FENCE	= 0x04,
+	BNXT_RE_WR_FLAGS_RD_FENCE	= 0x02,
+	BNXT_RE_WR_FLAGS_SIGNALED	= 0x01
+};
+
+enum bnxt_re_wc_type {
+	BNXT_RE_WC_TYPE_SEND		= 0x00,
+	BNXT_RE_WC_TYPE_RECV_RC		= 0x01,
+	BNXT_RE_WC_TYPE_RECV_UD		= 0x02,
+	BNXT_RE_WC_TYPE_RECV_RAW	= 0x03,
+	BNXT_RE_WC_TYPE_TERM		= 0x0E,
+	BNXT_RE_WC_TYPE_COFF		= 0x0F
+};
+
+enum bnxt_re_req_wc_status {
+	BNXT_RE_REQ_ST_OK		= 0x00,
+	BNXT_RE_REQ_ST_BAD_RESP		= 0x01,
+	BNXT_RE_REQ_ST_LOC_LEN		= 0x02,
+	BNXT_RE_REQ_ST_LOC_QP_OP	= 0x03,
+	BNXT_RE_REQ_ST_PROT		= 0x04,
+	BNXT_RE_REQ_ST_MEM_OP		= 0x05,
+	BNXT_RE_REQ_ST_REM_INVAL	= 0x06,
+	BNXT_RE_REQ_ST_REM_ACC		= 0x07,
+	BNXT_RE_REQ_ST_REM_OP		= 0x08,
+	BNXT_RE_REQ_ST_RNR_NAK_XCED	= 0x09,
+	BNXT_RE_REQ_ST_TRNSP_XCED	= 0x0A,
+	BNXT_RE_REQ_ST_WR_FLUSH		= 0x0B
+};
+
+enum bnxt_re_rsp_wc_status {
+	BNXT_RE_RSP_ST_OK		= 0x00,
+	BNXT_RE_RSP_ST_LOC_ACC		= 0x01,
+	BNXT_RE_RSP_ST_LOC_LEN		= 0x02,
+	BNXT_RE_RSP_ST_LOC_PROT		= 0x03,
+	BNXT_RE_RSP_ST_LOC_QP_OP	= 0x04,
+	BNXT_RE_RSP_ST_MEM_OP		= 0x05,
+	BNXT_RE_RSP_ST_REM_INVAL	= 0x06,
+	BNXT_RE_RSP_ST_WR_FLUSH		= 0x07,
+	BNXT_RE_RSP_ST_HW_FLUSH		= 0x08
+};
+
+enum bnxt_re_hdr_offset {
+	BNXT_RE_HDR_WT_MASK		= 0xFF,
+	BNXT_RE_HDR_FLAGS_MASK		= 0xFF,
+	BNXT_RE_HDR_FLAGS_SHIFT		= 0x08,
+	BNXT_RE_HDR_WS_MASK		= 0xFF,
+	BNXT_RE_HDR_WS_SHIFT		= 0x10
+};
+
+enum bnxt_re_db_que_type {
+	BNXT_RE_QUE_TYPE_SQ		= 0x00,
+	BNXT_RE_QUE_TYPE_RQ		= 0x01,
+	BNXT_RE_QUE_TYPE_SRQ		= 0x02,
+	BNXT_RE_QUE_TYPE_SRQ_ARM	= 0x03,
+	BNXT_RE_QUE_TYPE_CQ		= 0x04,
+	BNXT_RE_QUE_TYPE_CQ_ARMSE	= 0x05,
+	BNXT_RE_QUE_TYPE_CQ_ARMALL	= 0x06,
+	BNXT_RE_QUE_TYPE_CQ_ARMENA	= 0x07,
+	BNXT_RE_QUE_TYPE_SRQ_ARMENA	= 0x08,
+	BNXT_RE_QUE_TYPE_CQ_CUT_ACK	= 0x09,
+	BNXT_RE_QUE_TYPE_NULL		= 0x0F
+};
+
+enum bnxt_re_db_mask {
+	BNXT_RE_DB_INDX_MASK		= 0xFFFFFUL,
+	BNXT_RE_DB_QID_MASK		= 0xFFFFFUL,
+	BNXT_RE_DB_TYP_MASK		= 0x0FUL,
+	BNXT_RE_DB_TYP_SHIFT		= 0x1C
+};
+
+enum bnxt_re_psns_mask {
+	BNXT_RE_PSNS_SPSN_MASK		= 0xFFFFFF,
+	BNXT_RE_PSNS_OPCD_MASK		= 0xFF,
+	BNXT_RE_PSNS_OPCD_SHIFT		= 0x18,
+	BNXT_RE_PSNS_NPSN_MASK		= 0xFFFFFF,
+	BNXT_RE_PSNS_FLAGS_MASK		= 0xFF,
+	BNXT_RE_PSNS_FLAGS_SHIFT	= 0x18
+};
+
+enum bnxt_re_bcqe_mask {
+	BNXT_RE_BCQE_PH_MASK		= 0x01,
+	BNXT_RE_BCQE_TYPE_MASK		= 0x0F,
+	BNXT_RE_BCQE_TYPE_SHIFT		= 0x01,
+	BNXT_RE_BCQE_STATUS_MASK	= 0xFF,
+	BNXT_RE_BCQE_STATUS_SHIFT	= 0x08,
+	BNXT_RE_BCQE_FLAGS_MASK		= 0xFFFFU,
+	BNXT_RE_BCQE_FLAGS_SHIFT	= 0x10,
+	BNXT_RE_BCQE_RWRID_MASK		= 0xFFFFFU,
+	BNXT_RE_BCQE_SRCQP_MASK		= 0xFF,
+	BNXT_RE_BCQE_SRCQP_SHIFT	= 0x18
+};
+
+enum bnxt_re_rc_flags_mask {
+	BNXT_RE_RC_FLAGS_SRQ_RQ_MASK	= 0x01,
+	BNXT_RE_RC_FLAGS_IMM_MASK	= 0x02,
+	BNXT_RE_RC_FLAGS_IMM_SHIFT	= 0x01,
+	BNXT_RE_RC_FLAGS_INV_MASK	= 0x04,
+	BNXT_RE_RC_FLAGS_INV_SHIFT	= 0x02,
+	BNXT_RE_RC_FLAGS_RDMA_MASK	= 0x08,
+	BNXT_RE_RC_FLAGS_RDMA_SHIFT	= 0x03
+};
+
+enum bnxt_re_ud_flags_mask {
+	BNXT_RE_UD_FLAGS_SRQ_RQ_MASK	= 0x01,
+	BNXT_RE_UD_FLAGS_IMM_MASK	= 0x02,
+	BNXT_RE_UD_FLAGS_HDR_TYP_MASK	= 0x0C,
+
+	BNXT_RE_UD_FLAGS_SRQ		= 0x01,
+	BNXT_RE_UD_FLAGS_RQ		= 0x00,
+	BNXT_RE_UD_FLAGS_ROCE		= 0x00,
+	BNXT_RE_UD_FLAGS_ROCE_IPV4	= 0x02,
+	BNXT_RE_UD_FLAGS_ROCE_IPV6	= 0x03
+};
+
+struct bnxt_re_db_hdr {
+	__u32 indx;
+	__u32 typ_qid; /* typ: 4, qid:20*/
+};
+
 struct bnxt_re_cntx_resp {
 	struct ibv_get_context_resp resp;
 	__u32 dev_id;
@@ -78,6 +214,39 @@ struct bnxt_re_cq_resp {
 	__u32 rsvd;
 };
 
+struct bnxt_re_bcqe {
+	__u32 flg_st_typ_ph;
+	__u32 qphi_rwrid;
+};
+
+struct bnxt_re_req_cqe {
+	__u64 qp_handle;
+	__u32 con_indx; /* 16 bits valid. */
+	__u32 rsvd1;
+	__u64 rsvd2;
+};
+
+struct bnxt_re_rc_cqe {
+	__u32 length;
+	__u32 imm_key;
+	__u64 qp_handle;
+	__u64 mr_handle;
+};
+
+struct bnxt_re_ud_cqe {
+	__u32 length; /* 14 bits */
+	__u32 immd;
+	__u64 qp_handle;
+	__u64 qplo_mac; /* 16:48*/
+};
+
+struct bnxt_re_term_cqe {
+	__u64 qp_handle;
+	__u32 rq_sq_cidx;
+	__u32 rsvd;
+	__u64 rsvd1;
+};
+
 struct bnxt_re_qp_req {
 	struct ibv_create_qp cmd;
 	__u64 qpsva;
@@ -164,7 +333,9 @@ struct bnxt_re_brqe {
 };
 
 struct bnxt_re_rqe {
-	__u64 rsvd[3];
+	__u32 wrid;
+	__u32 rsvd1;
+	__u64 rsvd[2];
 };
 
 struct bnxt_re_srqe {
@@ -172,5 +343,4 @@ struct bnxt_re_srqe {
 	__u32 rsvd1;
 	__u64 rsvd[2];
 };
-
 #endif
diff --git a/providers/bnxt_re/db.c b/providers/bnxt_re/db.c
new file mode 100644
index 0000000..3897aea
--- /dev/null
+++ b/providers/bnxt_re/db.c
@@ -0,0 +1,90 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Doorbell handling functions.
+ */
+
+#include "main.h"
+
+static void bnxt_re_ring_db(struct bnxt_re_dpi *dpi,
+			    struct bnxt_re_db_hdr *hdr)
+{
+	uint64_t *dbval = (uint64_t *)&hdr->indx;
+
+	pthread_spin_lock(&dpi->db_lock);
+	*dbval = htole64(*dbval);
+	iowrite64(dpi->dbpage, dbval);
+	pthread_spin_unlock(&dpi->db_lock);
+}
+
+static void bnxt_re_init_db_hdr(struct bnxt_re_db_hdr *hdr, uint32_t indx,
+				uint32_t qid, uint32_t typ)
+{
+	hdr->indx = indx & BNXT_RE_DB_INDX_MASK;
+	hdr->typ_qid = qid & BNXT_RE_DB_QID_MASK;
+	hdr->typ_qid |= ((typ & BNXT_RE_DB_TYP_MASK) << BNXT_RE_DB_TYP_SHIFT);
+}
+
+void bnxt_re_ring_rq_db(struct bnxt_re_qp *qp)
+{
+	struct bnxt_re_db_hdr hdr;
+
+	bnxt_re_init_db_hdr(&hdr, qp->rqq->tail, qp->qpid, BNXT_RE_QUE_TYPE_RQ);
+	bnxt_re_ring_db(qp->udpi, &hdr);
+}
+
+void bnxt_re_ring_sq_db(struct bnxt_re_qp *qp)
+{
+	struct bnxt_re_db_hdr hdr;
+
+	bnxt_re_init_db_hdr(&hdr, qp->sqq->tail, qp->qpid, BNXT_RE_QUE_TYPE_SQ);
+	bnxt_re_ring_db(qp->udpi, &hdr);
+}
+
+void bnxt_re_ring_cq_db(struct bnxt_re_cq *cq)
+{
+	struct bnxt_re_db_hdr hdr;
+
+	bnxt_re_init_db_hdr(&hdr, cq->cqq.head, cq->cqid, BNXT_RE_QUE_TYPE_CQ);
+	bnxt_re_ring_db(cq->udpi, &hdr);
+}
+
+void bnxt_re_ring_cq_arm_db(struct bnxt_re_cq *cq, uint8_t aflag)
+{
+	struct bnxt_re_db_hdr hdr;
+
+	bnxt_re_init_db_hdr(&hdr, cq->cqq.head, cq->cqid, aflag);
+	bnxt_re_ring_db(cq->udpi, &hdr);
+}
diff --git a/providers/bnxt_re/main.c b/providers/bnxt_re/main.c
index c9fdd10..360c57f 100644
--- a/providers/bnxt_re/main.c
+++ b/providers/bnxt_re/main.c
@@ -105,6 +105,7 @@ static struct ibv_context_ops bnxt_re_cntx_ops = {
 	.destroy_ah    = bnxt_re_destroy_ah
 };
 
+/* Context Init functions */
 static int bnxt_re_init_context(struct verbs_device *vdev,
 				struct ibv_context *ibvctx, int cmd_fd)
 {
diff --git a/providers/bnxt_re/main.h b/providers/bnxt_re/main.h
index 9677ff7..bfe7089 100644
--- a/providers/bnxt_re/main.h
+++ b/providers/bnxt_re/main.h
@@ -40,6 +40,7 @@
 #define __MAIN_H__
 
 #include <inttypes.h>
+#include <stdbool.h>
 #include <stddef.h>
 #include <endian.h>
 #include <pthread.h>
@@ -76,23 +77,40 @@ struct bnxt_re_srq {
 	struct ibv_srq ibvsrq;
 };
 
+struct bnxt_re_wrid {
+	struct bnxt_re_psns *psns;
+	uint64_t wrid;
+	uint32_t bytes;
+	uint8_t sig;
+};
+
+struct bnxt_re_qpcap {
+	uint32_t max_swr;
+	uint32_t max_rwr;
+	uint32_t max_ssge;
+	uint32_t max_rsge;
+	uint32_t max_inline;
+	uint8_t	sqsig;
+};
+
 struct bnxt_re_qp {
 	struct ibv_qp ibvqp;
 	struct bnxt_re_queue *sqq;
-	struct bnxt_re_psns *psns; /* start ptr. */
+	struct bnxt_re_wrid *swrid;
 	struct bnxt_re_queue *rqq;
+	struct bnxt_re_wrid *rwrid;
 	struct bnxt_re_srq *srq;
 	struct bnxt_re_cq *scq;
 	struct bnxt_re_cq *rcq;
 	struct bnxt_re_dpi *udpi;
-	uint64_t *swrid;
-	uint64_t *rwrid;
+	struct bnxt_re_qpcap cap;
 	uint32_t qpid;
 	uint32_t tbl_indx;
+	uint32_t sq_psn;
+	uint32_t pending_db;
 	uint16_t mtu;
 	uint16_t qpst;
 	uint8_t qptyp;
-	/* wrid? */
 	/* irdord? */
 };
 
@@ -117,6 +135,14 @@ struct bnxt_re_context {
 	struct bnxt_re_dpi udpi;
 };
 
+/* DB ring functions used internally*/
+void bnxt_re_ring_rq_db(struct bnxt_re_qp *qp);
+void bnxt_re_ring_sq_db(struct bnxt_re_qp *qp);
+void bnxt_re_ring_srq_db(struct bnxt_re_srq *srq);
+void bnxt_re_ring_cq_db(struct bnxt_re_cq *cq);
+void bnxt_re_ring_cq_arm_db(struct bnxt_re_cq *cq, uint8_t aflag);
+
+/* pointer conversion functions*/
 static inline struct bnxt_re_dev *to_bnxt_re_dev(struct ibv_device *ibvdev)
 {
 	return container_of(ibvdev, struct bnxt_re_dev, vdev);
@@ -150,10 +176,175 @@ static inline uint32_t bnxt_re_get_sqe_sz(void)
 	       BNXT_RE_MAX_INLINE_SIZE;
 }
 
+static inline uint32_t bnxt_re_get_sqe_hdr_sz(void)
+{
+	return sizeof(struct bnxt_re_bsqe) + sizeof(struct bnxt_re_send);
+}
+
 static inline uint32_t bnxt_re_get_rqe_sz(void)
 {
 	return sizeof(struct bnxt_re_brqe) +
 	       sizeof(struct bnxt_re_rqe) +
 	       BNXT_RE_MAX_INLINE_SIZE;
 }
+
+static inline uint32_t bnxt_re_get_rqe_hdr_sz(void)
+{
+	return sizeof(struct bnxt_re_brqe) + sizeof(struct bnxt_re_rqe);
+}
+
+static inline uint32_t bnxt_re_get_cqe_sz(void)
+{
+	return sizeof(struct bnxt_re_req_cqe) + sizeof(struct bnxt_re_bcqe);
+}
+
+static inline uint8_t bnxt_re_ibv_to_bnxt_wr_opcd(uint8_t ibv_opcd)
+{
+	uint8_t bnxt_opcd;
+
+	switch (ibv_opcd) {
+	case IBV_WR_SEND:
+		bnxt_opcd = BNXT_RE_WR_OPCD_SEND;
+		break;
+	case IBV_WR_SEND_WITH_IMM:
+		bnxt_opcd = BNXT_RE_WR_OPCD_SEND_IMM;
+		break;
+	case IBV_WR_RDMA_WRITE:
+		bnxt_opcd = BNXT_RE_WR_OPCD_RDMA_WRITE;
+		break;
+	case IBV_WR_RDMA_WRITE_WITH_IMM:
+		bnxt_opcd = BNXT_RE_WR_OPCD_RDMA_WRITE_IMM;
+		break;
+	case IBV_WR_RDMA_READ:
+		bnxt_opcd = BNXT_RE_WR_OPCD_RDMA_READ;
+		break;
+		/* TODO: Add other opcodes */
+	default:
+		bnxt_opcd = 0xFF;
+		break;
+	};
+
+	return bnxt_opcd;
+}
+
+static inline uint8_t bnxt_re_ibv_wr_to_wc_opcd(uint8_t wr_opcd)
+{
+	uint8_t wc_opcd;
+
+	switch (wr_opcd) {
+	case IBV_WR_SEND_WITH_IMM:
+	case IBV_WR_SEND:
+		wc_opcd = IBV_WC_SEND;
+		break;
+	case IBV_WR_RDMA_WRITE_WITH_IMM:
+	case IBV_WR_RDMA_WRITE:
+		wc_opcd = IBV_WC_RDMA_WRITE;
+		break;
+	case IBV_WR_RDMA_READ:
+		wc_opcd = IBV_WC_RDMA_READ;
+		break;
+	case IBV_WR_ATOMIC_CMP_AND_SWP:
+		wc_opcd = IBV_WC_COMP_SWAP;
+		break;
+	case IBV_WR_ATOMIC_FETCH_AND_ADD:
+		wc_opcd = IBV_WC_FETCH_ADD;
+		break;
+	default:
+		wc_opcd = 0xFF;
+		break;
+	}
+
+	return wc_opcd;
+}
+
+static inline uint8_t bnxt_re_to_ibv_wc_status(uint8_t bnxt_wcst,
+					       uint8_t is_req)
+{
+	uint8_t ibv_wcst;
+
+	if (is_req) {
+		switch (bnxt_wcst) {
+		case BNXT_RE_REQ_ST_BAD_RESP:
+			ibv_wcst = IBV_WC_BAD_RESP_ERR;
+			break;
+		case BNXT_RE_REQ_ST_LOC_LEN:
+			ibv_wcst = IBV_WC_LOC_LEN_ERR;
+			break;
+		case BNXT_RE_REQ_ST_LOC_QP_OP:
+			ibv_wcst = IBV_WC_LOC_QP_OP_ERR;
+			break;
+		case BNXT_RE_REQ_ST_PROT:
+			ibv_wcst = IBV_WC_LOC_PROT_ERR;
+			break;
+		case BNXT_RE_REQ_ST_MEM_OP:
+			ibv_wcst = IBV_WC_MW_BIND_ERR;
+			break;
+		case BNXT_RE_REQ_ST_REM_INVAL:
+			ibv_wcst = IBV_WC_REM_INV_REQ_ERR;
+			break;
+		case BNXT_RE_REQ_ST_REM_ACC:
+			ibv_wcst = IBV_WC_REM_ACCESS_ERR;
+			break;
+		case BNXT_RE_REQ_ST_REM_OP:
+			ibv_wcst = IBV_WC_REM_OP_ERR;
+			break;
+		case BNXT_RE_REQ_ST_RNR_NAK_XCED:
+			ibv_wcst = IBV_WC_RNR_RETRY_EXC_ERR;
+			break;
+		case BNXT_RE_REQ_ST_TRNSP_XCED:
+			ibv_wcst = IBV_WC_RETRY_EXC_ERR;
+			break;
+		case BNXT_RE_REQ_ST_WR_FLUSH:
+			ibv_wcst = IBV_WC_WR_FLUSH_ERR;
+			break;
+		default:
+			ibv_wcst = IBV_WC_GENERAL_ERR;
+			break;
+		}
+	} else {
+		switch (bnxt_wcst) {
+		case BNXT_RE_RSP_ST_LOC_ACC:
+			ibv_wcst = IBV_WC_LOC_ACCESS_ERR;
+			break;
+		case BNXT_RE_RSP_ST_LOC_LEN:
+			ibv_wcst = IBV_WC_LOC_LEN_ERR;
+			break;
+		case BNXT_RE_RSP_ST_LOC_PROT:
+			ibv_wcst = IBV_WC_LOC_PROT_ERR;
+			break;
+		case BNXT_RE_RSP_ST_LOC_QP_OP:
+			ibv_wcst = IBV_WC_LOC_QP_OP_ERR;
+			break;
+		case BNXT_RE_RSP_ST_MEM_OP:
+			ibv_wcst = IBV_WC_MW_BIND_ERR;
+			break;
+		case BNXT_RE_RSP_ST_REM_INVAL:
+			ibv_wcst = IBV_WC_REM_INV_REQ_ERR;
+			break;
+		case BNXT_RE_RSP_ST_WR_FLUSH:
+			ibv_wcst = IBV_WC_WR_FLUSH_ERR;
+			break;
+		case BNXT_RE_RSP_ST_HW_FLUSH:
+			ibv_wcst = IBV_WC_FATAL_ERR;
+			break;
+		default:
+			ibv_wcst = IBV_WC_GENERAL_ERR;
+			break;
+		}
+	}
+
+	return ibv_wcst;
+}
+
+static inline uint8_t bnxt_re_is_cqe_valid(struct bnxt_re_cq *cq,
+					   struct bnxt_re_bcqe *hdr)
+{
+	return ((hdr->flg_st_typ_ph & BNXT_RE_BCQE_PH_MASK) == cq->phase);
+}
+
+static inline void bnxt_re_change_cq_phase(struct bnxt_re_cq *cq)
+{
+	if (!cq->cqq.head)
+		cq->phase = (~cq->phase & BNXT_RE_BCQE_PH_MASK);
+}
 #endif
diff --git a/providers/bnxt_re/memory.h b/providers/bnxt_re/memory.h
index ea29a24..f812eb8 100644
--- a/providers/bnxt_re/memory.h
+++ b/providers/bnxt_re/memory.h
@@ -73,4 +73,67 @@ static inline unsigned long roundup_pow_of_two(unsigned long val)
 int bnxt_re_alloc_aligned(struct bnxt_re_queue *que, uint32_t pg_size);
 void bnxt_re_free_aligned(struct bnxt_re_queue *que);
 
+static inline void iowrite64(__u64 *dst, uint64_t *src)
+{
+	*(volatile __u64 *)dst = *src;
+}
+
+static inline void iowrite32(__u32 *dst, uint32_t *src)
+{
+	*(volatile __u32 *)dst = *src;
+}
+
+/* Basic queue operation */
+static inline uint32_t bnxt_re_is_que_full(struct bnxt_re_queue *que)
+{
+	return (((que->tail + 1) & (que->depth - 1)) == que->head);
+}
+
+static inline uint32_t bnxt_re_incr(uint32_t val, uint32_t max)
+{
+	return (++val & (max - 1));
+}
+
+static inline void bnxt_re_incr_tail(struct bnxt_re_queue *que)
+{
+	que->tail = bnxt_re_incr(que->tail, que->depth);
+}
+
+static inline void bnxt_re_incr_head(struct bnxt_re_queue *que)
+{
+	que->head = bnxt_re_incr(que->head, que->depth);
+}
+
+/* Memory byte order conversion functions. */
+static inline int bnxt_re_host_to_le64(uint64_t *src, int bytes)
+{
+	int qwords, indx;
+
+	if (!bytes || bytes < 8)
+		return -EINVAL;
+
+	qwords = bytes / sizeof(uint64_t);
+	for (indx = 0; indx < qwords; indx++) {
+		if (*(src + indx))
+			*(src + indx) = htole64(*(src + indx));
+	}
+
+	return qwords;
+}
+
+static inline int bnxt_re_le64_to_host(uint64_t *src, int bytes)
+{
+	int qwords, indx;
+
+	if (!bytes || bytes < 8)
+		return -EINVAL;
+
+	qwords = bytes / sizeof(uint64_t);
+	for (indx = 0; indx < qwords; indx++) {
+		if (*(src + indx))
+			*(src + indx) = le64toh(*(src + indx));
+	}
+
+	return qwords;
+}
 #endif
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index 66f951c..f2f1ce8 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -230,9 +230,263 @@ int bnxt_re_destroy_cq(struct ibv_cq *ibvcq)
 	return 0;
 }
 
+static uint8_t bnxt_re_poll_success_scqe(struct bnxt_re_qp *qp,
+					 struct ibv_wc *ibvwc,
+					 struct bnxt_re_bcqe *hdr,
+					 struct bnxt_re_req_cqe *scqe,
+					 int *cnt)
+{
+	struct bnxt_re_queue *sq = qp->sqq;
+	struct bnxt_re_wrid *swrid;
+	struct bnxt_re_psns *spsn;
+	uint8_t pcqe = false;
+	uint32_t head = sq->head;
+	uint32_t cindx;
+
+	swrid = &qp->swrid[head];
+	spsn = swrid->psns;
+	cindx = scqe->con_indx;
+
+	if (!(swrid->sig & IBV_SEND_SIGNALED)) {
+		*cnt = 0;
+	} else {
+		ibvwc->status = IBV_WC_SUCCESS;
+		ibvwc->wc_flags = 0;
+		ibvwc->qp_num = qp->qpid;
+		ibvwc->wr_id = swrid->wrid;
+		ibvwc->opcode = (le32toh(spsn->opc_spsn) >>
+				BNXT_RE_PSNS_OPCD_SHIFT) &
+				BNXT_RE_PSNS_OPCD_MASK;
+		if (ibvwc->opcode == IBV_WC_RDMA_READ ||
+		    ibvwc->opcode == IBV_WC_COMP_SWAP ||
+		    ibvwc->opcode == IBV_WC_FETCH_ADD)
+			ibvwc->byte_len = swrid->bytes;
+
+		*cnt = 1;
+	}
+
+	bnxt_re_incr_head(sq);
+	if (sq->head != cindx)
+		pcqe = true;
+
+	return pcqe;
+}
+
+static uint8_t bnxt_re_poll_scqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
+				 void *cqe, int *cnt)
+{
+	struct bnxt_re_bcqe *hdr;
+	struct bnxt_re_req_cqe *scqe;
+	uint8_t status, pcqe = false;
+
+	scqe = cqe;
+	hdr = cqe + sizeof(struct bnxt_re_req_cqe);
+
+	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
+		  BNXT_RE_BCQE_STATUS_MASK;
+	if (status == BNXT_RE_REQ_ST_OK) {
+		pcqe = bnxt_re_poll_success_scqe(qp, ibvwc, hdr, scqe, cnt);
+	} else {
+		/* TODO: Handle error completion properly. */
+		fprintf(stderr, "%s(): swc with error, vendor status = %d\n",
+			__func__, status);
+		*cnt = 1;
+		ibvwc->status = bnxt_re_to_ibv_wc_status(status, true);
+		ibvwc->wr_id = qp->swrid[qp->sqq->head].wrid;
+		bnxt_re_incr_head(qp->sqq);
+	}
+
+	return pcqe;
+}
+
+static void bnxt_re_poll_success_rcqe(struct bnxt_re_qp *qp,
+				      struct ibv_wc *ibvwc,
+				      struct bnxt_re_bcqe *hdr,
+				      struct bnxt_re_rc_cqe *rcqe)
+{
+	struct bnxt_re_queue *rq = qp->rqq;
+	struct bnxt_re_wrid *rwrid;
+	uint32_t head = rq->head;
+	uint8_t flags, is_imm, is_rdma;
+
+	rwrid = &qp->rwrid[head];
+
+	ibvwc->status = IBV_WC_SUCCESS;
+	ibvwc->wr_id = rwrid->wrid;
+	ibvwc->qp_num = qp->qpid;
+	ibvwc->byte_len = rcqe->length;
+	ibvwc->opcode = IBV_WC_RECV;
+
+	flags = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_FLAGS_SHIFT) &
+		 BNXT_RE_BCQE_FLAGS_MASK;
+	is_imm = (flags & BNXT_RE_RC_FLAGS_IMM_MASK) >>
+		  BNXT_RE_RC_FLAGS_IMM_SHIFT;
+	is_rdma = (flags & BNXT_RE_RC_FLAGS_RDMA_MASK) >>
+		   BNXT_RE_RC_FLAGS_RDMA_SHIFT;
+	ibvwc->wc_flags = 0;
+	if (is_imm) {
+		ibvwc->wc_flags |= IBV_WC_WITH_IMM;
+		ibvwc->imm_data = ntohl(rcqe->imm_key);
+		if (is_rdma)
+			ibvwc->opcode = IBV_WC_RECV_RDMA_WITH_IMM;
+	}
+
+	bnxt_re_incr_head(rq);
+}
+
+static uint8_t bnxt_re_poll_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
+				 void *cqe, int *cnt)
+{
+	struct bnxt_re_bcqe *hdr;
+	struct bnxt_re_rc_cqe *rcqe;
+	uint8_t status, pcqe = false;
+
+	rcqe = cqe;
+	hdr = cqe + sizeof(struct bnxt_re_rc_cqe);
+
+	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
+		  BNXT_RE_BCQE_STATUS_MASK;
+	if (status == BNXT_RE_RSP_ST_OK) {
+		bnxt_re_poll_success_rcqe(qp, ibvwc, hdr, rcqe);
+		*cnt = 1;
+	} else {
+		/* TODO: Process error completions properly.*/
+		*cnt = 1;
+		ibvwc->status = bnxt_re_to_ibv_wc_status(status, false);
+		if (qp->rqq) {
+			ibvwc->wr_id = qp->rwrid[qp->rqq->head].wrid;
+			bnxt_re_incr_head(qp->rqq);
+		}
+	}
+
+	return pcqe;
+}
+
+static int bnxt_re_poll_one(struct bnxt_re_cq *cq, int nwc, struct ibv_wc *wc)
+{
+	struct bnxt_re_queue *cqq = &cq->cqq;
+	struct bnxt_re_qp *qp;
+	struct bnxt_re_bcqe *hdr;
+	struct bnxt_re_req_cqe *scqe;
+	struct bnxt_re_ud_cqe *rcqe;
+	void *cqe;
+	uint64_t *qp_handle = NULL;
+	int type, cnt = 0, dqed = 0, hw_polled = 0;
+	uint8_t pcqe = false;
+
+	while (nwc) {
+		cqe = cqq->va + cqq->head * bnxt_re_get_cqe_sz();
+		bnxt_re_le64_to_host((uint64_t *)cqe, cqq->stride);
+		hdr = cqe + sizeof(struct bnxt_re_req_cqe);
+		if (!bnxt_re_is_cqe_valid(cq, hdr))
+			break;
+		type = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_TYPE_SHIFT) &
+			BNXT_RE_BCQE_TYPE_MASK;
+		switch (type) {
+		case BNXT_RE_WC_TYPE_SEND:
+			scqe = cqe;
+			qp_handle = (uint64_t *)&scqe->qp_handle;
+			qp = (struct bnxt_re_qp *)scqe->qp_handle;
+			if (!qp)
+				break; /*stale cqe. should be rung.*/
+			if (qp->qptyp == IBV_QPT_UD)
+				goto bail; /* TODO: Add UD poll */
+
+			pcqe = bnxt_re_poll_scqe(qp, wc, cqe, &cnt);
+			break;
+		case BNXT_RE_WC_TYPE_RECV_RC:
+		case BNXT_RE_WC_TYPE_RECV_UD:
+			rcqe = cqe;
+			qp_handle = (uint64_t *)&rcqe->qp_handle;
+			qp = (struct bnxt_re_qp *)rcqe->qp_handle;
+			if (!qp)
+				break; /*stale cqe. should be rung.*/
+			if (qp->srq)
+				goto bail; /*TODO: Add SRQ poll */
+
+			pcqe = bnxt_re_poll_rcqe(qp, wc, cqe, &cnt);
+			/* TODO: Process UD rcqe */
+			break;
+		case BNXT_RE_WC_TYPE_RECV_RAW:
+			break;
+		case BNXT_RE_WC_TYPE_TERM:
+			break;
+		case BNXT_RE_WC_TYPE_COFF:
+			break;
+		default:
+			break;
+		};
+
+		if (pcqe)
+			goto skipp_real;
+
+		hw_polled++;
+		if (qp_handle) {
+			*qp_handle = 0x0ULL; /* mark cqe as read */
+			qp_handle = NULL;
+		}
+		bnxt_re_incr_head(&cq->cqq);
+		bnxt_re_change_cq_phase(cq);
+skipp_real:
+		if (cnt) {
+			cnt = 0;
+			dqed++;
+			nwc--;
+			wc++;
+		}
+	}
+
+	if (hw_polled)
+		bnxt_re_ring_cq_db(cq);
+bail:
+	return dqed;
+}
+
 int bnxt_re_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc)
 {
-	return -ENOSYS;
+	struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq);
+	int dqed;
+
+	pthread_spin_lock(&cq->cqq.qlock);
+	dqed = bnxt_re_poll_one(cq, nwc, wc);
+	pthread_spin_unlock(&cq->cqq.qlock);
+
+	/* TODO: Flush Management*/
+
+	return dqed;
+}
+
+static void bnxt_re_cleanup_cq(struct bnxt_re_qp *qp, struct bnxt_re_cq *cq)
+{
+	struct bnxt_re_queue *que = &cq->cqq;
+	struct bnxt_re_bcqe *hdr;
+	struct bnxt_re_req_cqe *scqe;
+	struct bnxt_re_rc_cqe *rcqe;
+	void *cqe;
+	int indx, type;
+
+	pthread_spin_lock(&que->qlock);
+	for (indx = 0; indx < que->depth; indx++) {
+		cqe = que->va + indx * bnxt_re_get_cqe_sz();
+		hdr = cqe + sizeof(struct bnxt_re_req_cqe);
+		type = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_TYPE_SHIFT) &
+			BNXT_RE_BCQE_TYPE_MASK;
+
+		if (type == BNXT_RE_WC_TYPE_COFF)
+			continue;
+		if (type == BNXT_RE_WC_TYPE_SEND ||
+		    type == BNXT_RE_WC_TYPE_TERM) {
+			scqe = cqe;
+			if (scqe->qp_handle == (uint64_t)qp)
+				scqe->qp_handle = 0ULL;
+		} else {
+			rcqe = cqe;
+			if (rcqe->qp_handle == (uint64_t)qp)
+				rcqe->qp_handle = 0ULL;
+		}
+
+	}
+	pthread_spin_unlock(&que->qlock);
 }
 
 void bnxt_re_cq_event(struct ibv_cq *ibvcq)
@@ -242,11 +496,40 @@ void bnxt_re_cq_event(struct ibv_cq *ibvcq)
 
 int bnxt_re_arm_cq(struct ibv_cq *ibvcq, int flags)
 {
-	return -ENOSYS;
+	struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq);
+
+	pthread_spin_lock(&cq->cqq.qlock);
+	flags = !flags ? BNXT_RE_QUE_TYPE_CQ_ARMALL :
+			 BNXT_RE_QUE_TYPE_CQ_ARMSE;
+	bnxt_re_ring_cq_arm_db(cq, flags);
+	pthread_spin_unlock(&cq->cqq.qlock);
+
+	return 0;
 }
 
-static int bnxt_re_check_qp_limits(struct ibv_qp_init_attr *attr)
+static int bnxt_re_check_qp_limits(struct bnxt_re_context *cntx,
+				   struct ibv_qp_init_attr *attr)
 {
+	struct ibv_device_attr devattr;
+	int ret;
+
+	if (attr->qp_type == IBV_QPT_UD)
+		return -ENOSYS;
+
+	ret = bnxt_re_query_device(&cntx->ibvctx, &devattr);
+	if (ret)
+		return ret;
+	if (attr->cap.max_send_sge > devattr.max_sge)
+		return EINVAL;
+	if (attr->cap.max_recv_sge > devattr.max_sge)
+		return EINVAL;
+	if (attr->cap.max_inline_data > BNXT_RE_MAX_INLINE_SIZE)
+		return EINVAL;
+	if (attr->cap.max_send_wr > devattr.max_qp_wr)
+		attr->cap.max_send_wr = devattr.max_qp_wr;
+	if (attr->cap.max_recv_wr > devattr.max_qp_wr)
+		attr->cap.max_recv_wr = devattr.max_qp_wr;
+
 	return 0;
 }
 
@@ -294,49 +577,56 @@ static int bnxt_re_alloc_queues(struct bnxt_re_qp *qp,
 				struct ibv_qp_init_attr *attr,
 				uint32_t pg_size) {
 	struct bnxt_re_queue *que;
+	struct bnxt_re_psns *psns;
 	uint32_t psn_depth;
-	int ret;
-
-	if (attr->cap.max_send_wr) {
-		que = qp->sqq;
-		que->stride = bnxt_re_get_sqe_sz();
-		que->depth = roundup_pow_of_two(attr->cap.max_send_wr);
-		/* psn_depth extra entries of size que->stride */
-		psn_depth = (que->depth * sizeof(struct bnxt_re_psns)) /
-			     que->stride;
-		que->depth += psn_depth;
-		ret = bnxt_re_alloc_aligned(qp->sqq, pg_size);
-		if (ret)
-			return ret;
-		/* exclude psns depth*/
-		que->depth -= psn_depth;
-		/* start of spsn space sizeof(struct bnxt_re_psns) each. */
-		qp->psns = (que->va + que->stride * que->depth);
-		pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE);
-		qp->swrid = calloc(que->depth, sizeof(uint64_t));
-		if (!qp->swrid) {
-			ret = -ENOMEM;
-			goto fail;
-		}
+	int ret, indx;
+
+	que = qp->sqq;
+	que->stride = bnxt_re_get_sqe_sz();
+	que->depth = roundup_pow_of_two(attr->cap.max_send_wr + 1);
+	/* psn_depth extra entries of size que->stride */
+	psn_depth = (que->depth * sizeof(struct bnxt_re_psns)) /
+		     que->stride;
+	if ((que->depth * sizeof(struct bnxt_re_psns)) % que->stride)
+		psn_depth++;
+
+	que->depth += psn_depth;
+	ret = bnxt_re_alloc_aligned(qp->sqq, pg_size);
+	if (ret)
+		return ret;
+	/* exclude psns depth*/
+	que->depth -= psn_depth;
+	/* start of spsn space sizeof(struct bnxt_re_psns) each. */
+	psns = (que->va + que->stride * que->depth);
+	pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE);
+	qp->swrid = calloc(que->depth, sizeof(struct bnxt_re_wrid));
+	if (!qp->swrid) {
+		ret = -ENOMEM;
+		goto fail;
 	}
 
-	if (attr->cap.max_recv_wr && qp->rqq) {
+	for (indx = 0 ; indx < que->depth; indx++, psns++)
+		qp->swrid[indx].psns = psns;
+	qp->cap.max_swr = que->depth;
+
+	if (qp->rqq) {
 		que = qp->rqq;
 		que->stride = bnxt_re_get_rqe_sz();
-		que->depth = roundup_pow_of_two(attr->cap.max_recv_wr);
+		que->depth = roundup_pow_of_two(attr->cap.max_recv_wr + 1);
 		ret = bnxt_re_alloc_aligned(qp->rqq, pg_size);
 		if (ret)
 			goto fail;
 		pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE);
-		qp->rwrid = calloc(que->depth, sizeof(uint64_t));
+		/* For RQ only bnxt_re_wrid.wrid is used. */
+		qp->rwrid = calloc(que->depth, sizeof(struct bnxt_re_wrid));
 		if (!qp->rwrid) {
 			ret = -ENOMEM;
 			goto fail;
 		}
+		qp->cap.max_rwr = que->depth;
 	}
 
 	return 0;
-
 fail:
 	bnxt_re_free_queues(qp);
 	return ret;
@@ -348,11 +638,12 @@ struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
 	struct bnxt_re_qp *qp;
 	struct bnxt_re_qp_req req;
 	struct bnxt_re_qp_resp resp;
+	struct bnxt_re_qpcap *cap;
 
 	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvpd->context);
 	struct bnxt_re_dev *dev = to_bnxt_re_dev(cntx->ibvctx.device);
 
-	if (bnxt_re_check_qp_limits(attr))
+	if (bnxt_re_check_qp_limits(cntx, attr))
 		return NULL;
 
 	qp = calloc(1, sizeof(*qp));
@@ -365,6 +656,7 @@ struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
 	if (bnxt_re_alloc_queues(qp, attr, dev->pg_size))
 		goto failq;
 	/* Fill ibv_cmd */
+	cap = &qp->cap;
 	req.qpsva = (uint64_t)qp->sqq->va;
 	req.qprva = qp->rqq ? (uint64_t)qp->rqq->va : 0;
 	req.qp_handle = (uint64_t)qp;
@@ -380,6 +672,13 @@ struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
 	qp->scq = to_bnxt_re_cq(attr->send_cq);
 	qp->rcq = to_bnxt_re_cq(attr->recv_cq);
 	qp->udpi = &cntx->udpi;
+	/* Save/return the altered Caps. */
+	attr->cap.max_send_wr = cap->max_swr;
+	cap->max_ssge = attr->cap.max_send_sge;
+	attr->cap.max_recv_wr = cap->max_rwr;
+	cap->max_rsge = attr->cap.max_recv_sge;
+	cap->max_inline = attr->cap.max_inline_data;
+	cap->sqsig = attr->sq_sig_all;
 
 	return &qp->ibvqp;
 failcmd:
@@ -400,8 +699,15 @@ int bnxt_re_modify_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr,
 	int rc;
 
 	rc = ibv_cmd_modify_qp(ibvqp, attr, attr_mask, &cmd, sizeof(cmd));
-	if (!rc)
-		qp->qpst = ibvqp->state;
+	if (!rc) {
+		if (attr_mask & IBV_QP_STATE)
+			qp->qpst = attr->qp_state;
+
+		if (attr_mask & IBV_QP_SQ_PSN)
+			qp->sq_psn = attr->sq_psn;
+		if (attr_mask & IBV_QP_PATH_MTU)
+			qp->mtu = (0x80 << attr->path_mtu);
+	}
 
 	return rc;
 }
@@ -430,6 +736,8 @@ int bnxt_re_destroy_qp(struct ibv_qp *ibvqp)
 	if (status)
 		return status;
 
+	bnxt_re_cleanup_cq(qp, qp->rcq);
+	bnxt_re_cleanup_cq(qp, qp->scq);
 	bnxt_re_free_queues(qp);
 	bnxt_re_free_queue_ptr(qp);
 	free(qp);
@@ -437,16 +745,303 @@ int bnxt_re_destroy_qp(struct ibv_qp *ibvqp)
 	return 0;
 }
 
+static inline uint8_t bnxt_re_set_hdr_flags(struct bnxt_re_bsqe *hdr,
+					    uint32_t send_flags, uint8_t sqsig)
+{
+	uint8_t is_inline = false;
+
+	if (send_flags & IBV_SEND_SIGNALED || sqsig)
+		hdr->rsv_ws_fl_wt |= ((BNXT_RE_WR_FLAGS_SIGNALED &
+				       BNXT_RE_HDR_FLAGS_MASK) <<
+				       BNXT_RE_HDR_FLAGS_SHIFT);
+
+	if (send_flags & IBV_SEND_FENCE)
+		/*TODO: See when RD fence can be used. */
+		hdr->rsv_ws_fl_wt |= ((BNXT_RE_WR_FLAGS_UC_FENCE &
+				       BNXT_RE_HDR_FLAGS_MASK) <<
+				       BNXT_RE_HDR_FLAGS_SHIFT);
+
+	if (send_flags & IBV_SEND_SOLICITED)
+		hdr->rsv_ws_fl_wt |= ((BNXT_RE_WR_FLAGS_SE &
+				       BNXT_RE_HDR_FLAGS_MASK) <<
+				       BNXT_RE_HDR_FLAGS_SHIFT);
+	if (send_flags & IBV_SEND_INLINE) {
+		hdr->rsv_ws_fl_wt |= ((BNXT_RE_WR_FLAGS_INLINE &
+				       BNXT_RE_HDR_FLAGS_MASK) <<
+				       BNXT_RE_HDR_FLAGS_SHIFT);
+		is_inline = true;
+	}
+
+	return is_inline;
+}
+
+static int bnxt_re_build_sge(struct bnxt_re_sge *sge, struct ibv_sge *sg_list,
+			     uint32_t num_sge, uint8_t is_inline) {
+	int indx, length = 0;
+	void *dst;
+
+	if (!num_sge) {
+		memset(sge, 0, sizeof(*sge));
+		return 0;
+	}
+
+	if (is_inline) {
+		dst = sge;
+		for (indx = 0; indx < num_sge; indx++) {
+			length += sg_list[indx].length;
+			if (length > BNXT_RE_MAX_INLINE_SIZE)
+				return -ENOMEM;
+			memcpy(dst, (void *)sg_list[indx].addr,
+			       sg_list[indx].length);
+			dst = dst + sg_list[indx].length;
+		}
+	} else {
+		for (indx = 0; indx < num_sge; indx++) {
+			sge[indx].pa_lo = sg_list[indx].addr & 0xFFFFFFFFUL;
+			sge[indx].pa_hi = sg_list[indx].addr >> 32;
+			sge[indx].lkey = sg_list[indx].lkey;
+			sge[indx].length = sg_list[indx].length;
+			length += sg_list[indx].length;
+		}
+	}
+
+	return length;
+}
+
+static void bnxt_re_fill_psns(struct bnxt_re_qp *qp, struct bnxt_re_psns *psns,
+			      uint8_t opcode, uint32_t len)
+{
+	uint32_t pkt_cnt = 0, nxt_psn;
+
+	memset(psns, 0, sizeof(*psns));
+	psns->opc_spsn = qp->sq_psn & BNXT_RE_PSNS_SPSN_MASK;
+	opcode = bnxt_re_ibv_wr_to_wc_opcd(opcode);
+	psns->opc_spsn |= ((opcode & BNXT_RE_PSNS_OPCD_MASK) <<
+			    BNXT_RE_PSNS_OPCD_SHIFT);
+
+	pkt_cnt = (len / qp->mtu);
+	if (len % qp->mtu)
+		pkt_cnt++;
+	nxt_psn = ((qp->sq_psn + pkt_cnt) & BNXT_RE_PSNS_NPSN_MASK);
+	psns->flg_npsn = nxt_psn;
+	qp->sq_psn = nxt_psn;
+
+	*(uint64_t *)psns = htole64(*(uint64_t *)psns);
+}
+
+static void bnxt_re_fill_wrid(struct bnxt_re_wrid *wrid, struct ibv_send_wr *wr,
+			      uint32_t len, uint8_t sqsig)
+{
+	wrid->wrid = wr->wr_id;
+	wrid->bytes = len;
+	wrid->sig = 0;
+	if (wr->send_flags & IBV_SEND_SIGNALED || sqsig)
+		wrid->sig = IBV_SEND_SIGNALED;
+}
+
+static int bnxt_re_build_send_sqe(struct bnxt_re_qp *qp, void *wqe,
+				  struct ibv_send_wr *wr, uint8_t is_inline)
+{
+	struct bnxt_re_bsqe *hdr = wqe;
+	struct bnxt_re_send *sqe = ((void *)wqe + sizeof(struct bnxt_re_bsqe));
+	struct bnxt_re_sge *sge = ((void *)wqe + bnxt_re_get_sqe_hdr_sz());
+	uint32_t wrlen;
+	int len;
+	uint8_t opcode, qesize;
+
+	len = bnxt_re_build_sge(sge, wr->sg_list, wr->num_sge, is_inline);
+	if (len < 0)
+		return len;
+	sqe->length = len;
+
+	/* Fill Header */
+	opcode = bnxt_re_ibv_to_bnxt_wr_opcd(wr->opcode);
+	hdr->rsv_ws_fl_wt |= (opcode & BNXT_RE_HDR_WT_MASK);
+
+	if (is_inline) {
+		wrlen = get_aligned(len, 16);
+		qesize = wrlen >> 4;
+	} else {
+		qesize = wr->num_sge;
+	}
+	qesize += (bnxt_re_get_sqe_hdr_sz() >> 4);
+	hdr->rsv_ws_fl_wt |= (qesize & BNXT_RE_HDR_WS_MASK) <<
+			      BNXT_RE_HDR_WS_SHIFT;
+	return len;
+}
+
+static int bnxt_re_build_rdma_sqe(struct bnxt_re_qp *qp, void *wqe,
+				  struct ibv_send_wr *wr, uint8_t is_inline)
+{
+	struct bnxt_re_rdma *sqe = ((void *)wqe + sizeof(struct bnxt_re_bsqe));
+	int len;
+
+	len = bnxt_re_build_send_sqe(qp, wqe, wr, is_inline);
+	sqe->rva_lo = wr->wr.rdma.remote_addr & 0xFFFFFFFFUL;
+	sqe->rva_hi = (wr->wr.rdma.remote_addr >> 32);
+	sqe->rkey = wr->wr.rdma.rkey;
+
+	return len;
+}
+
 int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
 		      struct ibv_send_wr **bad)
 {
-	return -ENOSYS;
+	struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp);
+	struct bnxt_re_queue *sq = qp->sqq;
+	struct bnxt_re_bsqe *hdr;
+	struct bnxt_re_wrid *wrid;
+	struct bnxt_re_psns *psns;
+	void *sqe;
+	int ret = 0, bytes = 0;
+	uint8_t is_inline = false;
+
+	pthread_spin_lock(&sq->qlock);
+	while (wr) {
+		if ((qp->qpst != IBV_QPS_RTS) && (qp->qpst != IBV_QPS_SQD)) {
+			*bad = wr;
+			pthread_spin_unlock(&sq->qlock);
+			return EINVAL;
+		}
+
+		if ((qp->qptyp == IBV_QPT_UD) &&
+		    (wr->opcode != IBV_WR_SEND &&
+		     wr->opcode != IBV_WR_SEND_WITH_IMM)) {
+			*bad = wr;
+			pthread_spin_unlock(&sq->qlock);
+			return EINVAL;
+		}
+
+		if (bnxt_re_is_que_full(sq) ||
+		    wr->num_sge > qp->cap.max_ssge) {
+			*bad = wr;
+			pthread_spin_unlock(&sq->qlock);
+			return ENOMEM;
+		}
+
+		sqe = (void *)(sq->va + (sq->tail * sq->stride));
+		wrid = &qp->swrid[sq->tail];
+		psns = wrid->psns;
+
+		memset(sqe, 0, bnxt_re_get_sqe_sz());
+		hdr = sqe;
+		is_inline = bnxt_re_set_hdr_flags(hdr, wr->send_flags,
+						  qp->cap.sqsig);
+		switch (wr->opcode) {
+		case IBV_WR_SEND_WITH_IMM:
+			hdr->key_immd = wr->imm_data;
+		case IBV_WR_SEND:
+			bytes = bnxt_re_build_send_sqe(qp, sqe, wr, is_inline);
+			if (bytes < 0)
+				ret = ENOMEM;
+			break;
+		case IBV_WR_RDMA_WRITE_WITH_IMM:
+			hdr->key_immd = wr->imm_data;
+		case IBV_WR_RDMA_WRITE:
+			bytes = bnxt_re_build_rdma_sqe(qp, sqe, wr, is_inline);
+			if (bytes < 0)
+				ret = ENOMEM;
+			break;
+		case IBV_WR_RDMA_READ:
+			bytes = bnxt_re_build_rdma_sqe(qp, sqe, wr, false);
+			if (bytes < 0)
+				ret = ENOMEM;
+			break;
+		default:
+			ret = EINVAL;
+			break;
+		}
+
+		if (ret) {
+			*bad = wr;
+			break;
+		}
+
+		bnxt_re_fill_wrid(wrid, wr, bytes, qp->cap.sqsig);
+		bnxt_re_fill_psns(qp, psns, wr->opcode, bytes);
+		bnxt_re_host_to_le64((uint64_t *)sqe, sq->stride);
+		bnxt_re_incr_tail(sq);
+		wr = wr->next;
+		wmb(); /* write barrier */
+
+		bnxt_re_ring_sq_db(qp);
+	}
+
+	pthread_spin_unlock(&sq->qlock);
+	return ret;
+}
+
+static int bnxt_re_build_rqe(struct bnxt_re_qp *qp, struct ibv_recv_wr *wr,
+			     void *rqe)
+{
+	struct bnxt_re_brqe *hdr = rqe;
+	struct bnxt_re_rqe *rwr;
+	struct bnxt_re_sge *sge;
+	struct bnxt_re_wrid *wrid;
+	int wqe_sz, len;
+
+	rwr = (rqe + sizeof(struct bnxt_re_brqe));
+	sge = (rqe + bnxt_re_get_rqe_hdr_sz());
+	wrid = &qp->rwrid[qp->rqq->tail];
+
+	len = bnxt_re_build_sge(sge, wr->sg_list, wr->num_sge, false);
+	hdr->rsv_ws_fl_wt = BNXT_RE_WR_OPCD_RECV;
+	wqe_sz = wr->num_sge + (bnxt_re_get_rqe_hdr_sz() >> 4); /* 16B align */
+	hdr->rsv_ws_fl_wt |= ((wqe_sz & BNXT_RE_HDR_WS_MASK) <<
+			       BNXT_RE_HDR_WS_SHIFT);
+	rwr->wrid = qp->rqq->tail;
+
+	/* Fill wrid */
+	wrid->wrid = wr->wr_id;
+	wrid->bytes = len; /* N.A. for RQE */
+	wrid->sig = 0; /* N.A. for RQE */
+
+	return len;
 }
 
 int bnxt_re_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
 		      struct ibv_recv_wr **bad)
 {
-	return -ENOSYS;
+	struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp);
+	struct bnxt_re_queue *rq = qp->rqq;
+	void *rqe;
+	int ret;
+
+	pthread_spin_lock(&rq->qlock);
+	while (wr) {
+		/* check QP state, abort if it is ERR or RST */
+		if (qp->qpst == IBV_QPS_RESET || qp->qpst == IBV_QPS_ERR) {
+			*bad = wr;
+			pthread_spin_unlock(&rq->qlock);
+			return EINVAL;
+		}
+
+		if (bnxt_re_is_que_full(rq) ||
+		    wr->num_sge > qp->cap.max_rsge) {
+			pthread_spin_unlock(&rq->qlock);
+			*bad = wr;
+			return ENOMEM;
+		}
+
+		rqe = (void *)(rq->va + (rq->tail * rq->stride));
+		memset(rqe, 0, bnxt_re_get_rqe_sz());
+		ret = bnxt_re_build_rqe(qp, wr, rqe);
+		if (ret < 0) {
+			pthread_spin_unlock(&rq->qlock);
+			*bad = wr;
+			return ENOMEM;
+		}
+
+		bnxt_re_host_to_le64((uint64_t *)rqe, rq->stride);
+		bnxt_re_incr_tail(rq);
+		wr = wr->next;
+
+		wmb(); /* write barrier */
+		bnxt_re_ring_rq_db(qp);
+	}
+	pthread_spin_unlock(&rq->qlock);
+
+	return 0;
 }
 
 struct ibv_srq *bnxt_re_create_srq(struct ibv_pd *ibvpd,
-- 
1.8.3.1


* [rdma-core v2 5/9] libbnxt_re: Allow apps to poll for flushed completions
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-02-18 15:43   ` [rdma-core v2 4/9] libbnxt_re: Add support for posting and polling Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 6/9] libbnxt_re: Enable UD control path and wqe posting Devesh Sharma
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds support for reporting flush completions.
The following is an overview of the algorithm used.

Step-1: Poll a completion from the h/w CQ.
Step-2: Check the status. If it is an error, go to Step-3; otherwise
        report the completion to the user based on the con_idx
        reported.
Step-3: Report the completion with the actual error to the consumer
        and, regardless of the con_idx reported in the completion,
        do the following:
        3a. Add this QP to the CQ flush list if it was not there
            already. If this is a req-error, add the QP to the
            send-flush-list; otherwise add it to the recv-flush-list.
        3b. Change the QP-soft-state to ERROR if it was not in error
            already.

Step-4: If the next CQE is a TERM CQE, extract it and make sure it
        is not reported to the consumer. Then do the following as
        further processing:
        4a. Add this QP to whichever of the send-flush-list and
            recv-flush-list it is absent from.
        4b. Change the QP-soft-state to ERROR if it was not in error
            already.
Step-5: Continue polling from both the h/w CQ and the flush lists
        until all the queues are empty.

The QP is removed from the flush list during destroy-qp.
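
The two core ideas of the flush handling can be sketched in isolation:
a QP is queued on a flush list at most once, and every work request
still outstanding on its queue is reported as a flush error. The block
below is only a minimal sketch. All sketch_* names are hypothetical
stand-ins, a plain singly linked list replaces ccan/list.h, and nodes
are pushed at the head rather than the tail for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct sketch_queue {
	uint32_t head;    /* consumer index */
	uint32_t tail;    /* producer index */
	uint32_t depth;   /* number of slots, must be a power of two */
	uint64_t wrid[8]; /* work request ids, indexed by slot */
};

struct sketch_wc {
	uint64_t wr_id;
	int flushed;      /* 1 when reported as a flush error */
};

struct sketch_flush_node {
	bool on_list;     /* plays the role of 'valid' in flush.h */
	struct sketch_flush_node *next;
};

/* Queue a node only once, so repeated error CQEs do not re-add the QP. */
static void sketch_flush_add(struct sketch_flush_node **head,
			     struct sketch_flush_node *node)
{
	if (node->on_list)
		return;
	node->next = *head;
	*head = node;
	node->on_list = true;
}

/* Report up to nwc outstanding entries as flush completions. */
static int sketch_poll_flush(struct sketch_queue *q, struct sketch_wc *wc,
			     int nwc)
{
	int cnt = 0;

	assert(q->depth && (q->depth & (q->depth - 1)) == 0);
	while (cnt < nwc && q->head != q->tail) {
		wc[cnt].wr_id = q->wrid[q->head];
		wc[cnt].flushed = 1;
		q->head = (q->head + 1) & (q->depth - 1);
		cnt++;
	}
	return cnt;
}
```

A caller would keep invoking sketch_poll_flush() on each queued QP
until it returns 0, which corresponds to Step-5 above.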

Further, it adds network-to-host byte-order conversion on
the received immediate data.
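
As a rough illustration of what such a conversion does, the sketch
below decodes four bytes in network (big-endian) order into a host
integer. sketch_imm_to_host() is a hypothetical helper and not a
function from this patch; only ntohl() is a real library call.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode immediate data received as 4 wire bytes. */
static uint32_t sketch_imm_to_host(const uint8_t wire[4])
{
	uint32_t be;

	assert(wire != NULL);
	memcpy(&be, wire, sizeof(be)); /* raw big-endian bytes */
	return ntohl(be);              /* value in host byte order */
}
```

The result is the same on little-endian and big-endian hosts, which is
the point of converting at the boundary.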

This patch also handles the hardware-specific requirement
to skip reporting h/w flush error CQEs to the consumer while
still ringing the CQ-DB for them.

v1->v2
 -- Used ccan/list.h instead.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/flush.h  |  85 ++++++++++++++++
 providers/bnxt_re/main.c   |   5 +
 providers/bnxt_re/main.h   |   6 ++
 providers/bnxt_re/memory.h |   5 +
 providers/bnxt_re/verbs.c  | 243 ++++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 320 insertions(+), 24 deletions(-)
 create mode 100644 providers/bnxt_re/flush.h

diff --git a/providers/bnxt_re/flush.h b/providers/bnxt_re/flush.h
new file mode 100644
index 0000000..a39ea71
--- /dev/null
+++ b/providers/bnxt_re/flush.h
@@ -0,0 +1,85 @@
+/*
+ * Broadcom NetXtreme-E User Space RoCE driver
+ *
+ * Copyright (c) 2015-2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: A few wrappers for flush queue management
+ */
+
+#ifndef __FLUSH_H__
+#define __FLUSH_H__
+
+#include <ccan/list.h>
+
+struct bnxt_re_fque_node {
+	uint8_t valid;
+	struct list_node list;
+};
+
+static inline void fque_init_node(struct bnxt_re_fque_node *node)
+{
+	list_node_init(&node->list);
+	node->valid = false;
+}
+
+static inline void fque_add_node_tail(struct list_head *head,
+				      struct bnxt_re_fque_node *new)
+{
+	list_add_tail(head, &new->list);
+	new->valid = true;
+}
+
+static inline void fque_del_node(struct bnxt_re_fque_node *entry)
+{
+	entry->valid = false;
+	list_del(&entry->list);
+}
+
+static inline uint8_t _fque_node_valid(struct bnxt_re_fque_node *node)
+{
+	return node->valid;
+}
+
+static inline void bnxt_re_fque_add_node(struct list_head *head,
+					 struct bnxt_re_fque_node *node)
+{
+	if (!_fque_node_valid(node))
+		fque_add_node_tail(head, node);
+}
+
+static inline void bnxt_re_fque_del_node(struct bnxt_re_fque_node *node)
+{
+	if (_fque_node_valid(node))
+		fque_del_node(node);
+}
+#endif	/* __FLUSH_H__ */
diff --git a/providers/bnxt_re/main.c b/providers/bnxt_re/main.c
index 360c57f..e58c953 100644
--- a/providers/bnxt_re/main.c
+++ b/providers/bnxt_re/main.c
@@ -128,6 +128,7 @@ static int bnxt_re_init_context(struct verbs_device *vdev,
 	dev->pg_size = resp.pg_size;
 	dev->cqe_size = resp.cqe_size;
 	dev->max_cq_depth = resp.max_cqd;
+	pthread_spin_init(&cntx->fqlock, PTHREAD_PROCESS_PRIVATE);
 	ibvctx->ops = bnxt_re_cntx_ops;
 
 	return 0;
@@ -136,7 +137,11 @@ static int bnxt_re_init_context(struct verbs_device *vdev,
 static void bnxt_re_uninit_context(struct verbs_device *vdev,
 				   struct ibv_context *ibvctx)
 {
+	struct bnxt_re_context *cntx;
+
+	cntx = to_bnxt_re_context(ibvctx);
 	/* Unmap if anything device specific was mapped in init_context. */
+	pthread_spin_destroy(&cntx->fqlock);
 }
 
 static struct verbs_device *bnxt_re_driver_init(const char *uverbs_sys_path,
diff --git a/providers/bnxt_re/main.h b/providers/bnxt_re/main.h
index bfe7089..d324ef6 100644
--- a/providers/bnxt_re/main.h
+++ b/providers/bnxt_re/main.h
@@ -50,6 +50,7 @@
 
 #include "bnxt_re-abi.h"
 #include "memory.h"
+#include "flush.h"
 
 #define DEV	"bnxtre : "
 
@@ -69,6 +70,8 @@ struct bnxt_re_cq {
 	uint32_t cqid;
 	struct bnxt_re_queue cqq;
 	struct bnxt_re_dpi *udpi;
+	struct list_head sfhead;
+	struct list_head rfhead;
 	uint32_t cqe_size;
 	uint8_t  phase;
 };
@@ -104,6 +107,8 @@ struct bnxt_re_qp {
 	struct bnxt_re_cq *rcq;
 	struct bnxt_re_dpi *udpi;
 	struct bnxt_re_qpcap cap;
+	struct bnxt_re_fque_node snode;
+	struct bnxt_re_fque_node rnode;
 	uint32_t qpid;
 	uint32_t tbl_indx;
 	uint32_t sq_psn;
@@ -133,6 +138,7 @@ struct bnxt_re_context {
 	uint32_t max_qp;
 	uint32_t max_srq;
 	struct bnxt_re_dpi udpi;
+	pthread_spinlock_t fqlock;
 };
 
 /* DB ring functions used internally*/
diff --git a/providers/bnxt_re/memory.h b/providers/bnxt_re/memory.h
index f812eb8..debb31a 100644
--- a/providers/bnxt_re/memory.h
+++ b/providers/bnxt_re/memory.h
@@ -89,6 +89,11 @@ static inline uint32_t bnxt_re_is_que_full(struct bnxt_re_queue *que)
 	return (((que->tail + 1) & (que->depth - 1)) == que->head);
 }
 
+static inline uint32_t bnxt_re_is_que_empty(struct bnxt_re_queue *que)
+{
+	return que->tail == que->head;
+}
+
 static inline uint32_t bnxt_re_incr(uint32_t val, uint32_t max)
 {
 	return (++val & (max - 1));
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index f2f1ce8..2273e82 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -202,6 +202,9 @@ struct ibv_cq *bnxt_re_create_cq(struct ibv_context *ibvctx, int ncqe,
 	cq->cqq.tail = resp.tail;
 	cq->udpi = &cntx->udpi;
 
+	list_head_init(&cq->sfhead);
+	list_head_init(&cq->rfhead);
+
 	return &cq->ibvcq;
 cmdfail:
 	bnxt_re_free_aligned(&cq->cqq);
@@ -230,6 +233,47 @@ int bnxt_re_destroy_cq(struct ibv_cq *ibvcq)
 	return 0;
 }
 
+static uint8_t bnxt_re_poll_err_scqe(struct bnxt_re_qp *qp,
+				     struct ibv_wc *ibvwc,
+				     struct bnxt_re_bcqe *hdr,
+				     struct bnxt_re_req_cqe *scqe, int *cnt)
+{
+	struct bnxt_re_queue *sq = qp->sqq;
+	struct bnxt_re_context *cntx;
+	struct bnxt_re_wrid *swrid;
+	struct bnxt_re_psns *spsn;
+	struct bnxt_re_cq *scq;
+	uint32_t head = sq->head;
+	uint8_t status;
+
+	scq = to_bnxt_re_cq(qp->ibvqp.send_cq);
+	cntx = to_bnxt_re_context(scq->ibvcq.context);
+	swrid = &qp->swrid[head];
+	spsn = swrid->psns;
+
+	*cnt = 1;
+	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
+		  BNXT_RE_BCQE_STATUS_MASK;
+	ibvwc->status = bnxt_re_to_ibv_wc_status(status, true);
+	ibvwc->wc_flags = 0;
+	ibvwc->wr_id = swrid->wrid;
+	ibvwc->qp_num = qp->qpid;
+	ibvwc->opcode = (le32toh(spsn->opc_spsn) >>
+			BNXT_RE_PSNS_OPCD_SHIFT) &
+			BNXT_RE_PSNS_OPCD_MASK;
+	ibvwc->byte_len = 0;
+
+	bnxt_re_incr_head(qp->sqq);
+
+	if (qp->qpst != IBV_QPS_ERR)
+		qp->qpst = IBV_QPS_ERR;
+	pthread_spin_lock(&cntx->fqlock);
+	bnxt_re_fque_add_node(&scq->sfhead, &qp->snode);
+	pthread_spin_unlock(&cntx->fqlock);
+
+	return false;
+}
+
 static uint8_t bnxt_re_poll_success_scqe(struct bnxt_re_qp *qp,
 					 struct ibv_wc *ibvwc,
 					 struct bnxt_re_bcqe *hdr,
@@ -284,21 +328,53 @@ static uint8_t bnxt_re_poll_scqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
 
 	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
 		  BNXT_RE_BCQE_STATUS_MASK;
-	if (status == BNXT_RE_REQ_ST_OK) {
+	if (status == BNXT_RE_REQ_ST_OK)
 		pcqe = bnxt_re_poll_success_scqe(qp, ibvwc, hdr, scqe, cnt);
-	} else {
-		/* TODO: Handle error completion properly. */
-		fprintf(stderr, "%s(): swc with error, vendor status = %d\n",
-			__func__, status);
-		*cnt = 1;
-		ibvwc->status = bnxt_re_to_ibv_wc_status(status, true);
-		ibvwc->wr_id = qp->swrid[qp->sqq->head].wrid;
-		bnxt_re_incr_head(qp->sqq);
-	}
+	else
+		pcqe = bnxt_re_poll_err_scqe(qp, ibvwc, hdr, scqe, cnt);
 
 	return pcqe;
 }
 
+static int bnxt_re_poll_err_rcqe(struct bnxt_re_qp *qp,
+				 struct ibv_wc *ibvwc,
+				 struct bnxt_re_bcqe *hdr,
+				 struct bnxt_re_rc_cqe *rcqe)
+{
+	struct bnxt_re_queue *rq = qp->rqq;
+	struct bnxt_re_wrid *rwrid;
+	struct bnxt_re_cq *rcq;
+	struct bnxt_re_context *cntx;
+	uint32_t head = rq->head;
+	uint8_t status;
+
+	rcq = to_bnxt_re_cq(qp->ibvqp.recv_cq);
+	cntx = to_bnxt_re_context(rcq->ibvcq.context);
+
+	rwrid = &qp->rwrid[head];
+	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
+		  BNXT_RE_BCQE_STATUS_MASK;
+	/* skip h/w flush errors */
+	if (status == BNXT_RE_RSP_ST_HW_FLUSH)
+		return 0;
+	ibvwc->status = bnxt_re_to_ibv_wc_status(status, false);
+	/* TODO: Add SRQ Processing here */
+	if (qp->rqq) {
+		ibvwc->wr_id = rwrid->wrid;
+		ibvwc->qp_num = qp->qpid;
+		ibvwc->opcode = IBV_WC_RECV;
+		ibvwc->byte_len = 0;
+		bnxt_re_incr_head(qp->rqq);
+		if (qp->qpst != IBV_QPS_ERR)
+			qp->qpst = IBV_QPS_ERR;
+		pthread_spin_lock(&cntx->fqlock);
+		bnxt_re_fque_add_node(&rcq->rfhead, &qp->rnode);
+		pthread_spin_unlock(&cntx->fqlock);
+	}
+
+	return 1;
+}
+
 static void bnxt_re_poll_success_rcqe(struct bnxt_re_qp *qp,
 				      struct ibv_wc *ibvwc,
 				      struct bnxt_re_bcqe *hdr,
@@ -346,18 +422,37 @@ static uint8_t bnxt_re_poll_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
 
 	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
 		  BNXT_RE_BCQE_STATUS_MASK;
-	if (status == BNXT_RE_RSP_ST_OK) {
+	*cnt = 1;
+	if (status == BNXT_RE_RSP_ST_OK)
 		bnxt_re_poll_success_rcqe(qp, ibvwc, hdr, rcqe);
-		*cnt = 1;
-	} else {
-		/* TODO: Process error completions properly.*/
-		*cnt = 1;
-		ibvwc->status = bnxt_re_to_ibv_wc_status(status, false);
-		if (qp->rqq) {
-			ibvwc->wr_id = qp->rwrid[qp->rqq->head].wrid;
-			bnxt_re_incr_head(qp->rqq);
-		}
-	}
+	else
+		*cnt = bnxt_re_poll_err_rcqe(qp, ibvwc, hdr, rcqe);
+
+	return pcqe;
+}
+
+static uint8_t bnxt_re_poll_term_cqe(struct bnxt_re_qp *qp,
+				     struct ibv_wc *ibvwc, void *cqe, int *cnt)
+{
+	struct bnxt_re_context *cntx;
+	struct bnxt_re_cq *scq, *rcq;
+	uint8_t pcqe = false;
+
+	scq = to_bnxt_re_cq(qp->ibvqp.send_cq);
+	rcq = to_bnxt_re_cq(qp->ibvqp.recv_cq);
+	cntx = to_bnxt_re_context(scq->ibvcq.context);
+	/* For now just add the QP to flush list without
+	 * considering the index reported in the CQE.
+	 * Continue reporting flush completions until the
+	 * SQ and RQ are empty.
+	 */
+	*cnt = 0;
+	if (qp->qpst != IBV_QPS_ERR)
+		qp->qpst = IBV_QPS_ERR;
+	pthread_spin_lock(&cntx->fqlock);
+	bnxt_re_fque_add_node(&rcq->rfhead, &qp->rnode);
+	bnxt_re_fque_add_node(&scq->sfhead, &qp->snode);
+	pthread_spin_unlock(&cntx->fqlock);
 
 	return pcqe;
 }
@@ -410,6 +505,12 @@ static int bnxt_re_poll_one(struct bnxt_re_cq *cq, int nwc, struct ibv_wc *wc)
 		case BNXT_RE_WC_TYPE_RECV_RAW:
 			break;
 		case BNXT_RE_WC_TYPE_TERM:
+			scqe = cqe;
+			qp_handle = (uint64_t *)&scqe->qp_handle;
+			qp = (struct bnxt_re_qp *)scqe->qp_handle;
+			if (!qp)
+				break;
+			pcqe = bnxt_re_poll_term_cqe(qp, wc, cqe, &cnt);
 			break;
 		case BNXT_RE_WC_TYPE_COFF:
 			break;
@@ -442,22 +543,107 @@ bail:
 	return dqed;
 }
 
+static int bnxt_re_poll_flush_wcs(struct bnxt_re_queue *que,
+				  struct bnxt_re_wrid *wridp,
+				  struct ibv_wc *ibvwc, uint32_t qpid,
+				  int nwc)
+{
+	struct bnxt_re_wrid *wrid;
+	struct bnxt_re_psns *psns;
+	uint32_t cnt = 0, head;
+	uint8_t opcode = IBV_WC_RECV;
+
+	while (nwc) {
+		if (bnxt_re_is_que_empty(que))
+			break;
+		head = que->head;
+		wrid = &wridp[head];
+		if (wrid->psns) {
+			psns = wrid->psns;
+			opcode = (psns->opc_spsn >> BNXT_RE_PSNS_OPCD_SHIFT) &
+				  BNXT_RE_PSNS_OPCD_MASK;
+		}
+
+		ibvwc->status = IBV_WC_WR_FLUSH_ERR;
+		ibvwc->opcode = opcode;
+		ibvwc->wr_id = wrid->wrid;
+		ibvwc->qp_num = qpid;
+		ibvwc->byte_len = 0;
+		ibvwc->wc_flags = 0;
+
+		bnxt_re_incr_head(que);
+		nwc--;
+		cnt++;
+		ibvwc++;
+	}
+
+	return cnt;
+}
+
+static int bnxt_re_poll_flush_lists(struct bnxt_re_cq *cq, uint32_t nwc,
+				    struct ibv_wc *ibvwc)
+{
+	struct bnxt_re_fque_node *cur, *tmp;
+	struct bnxt_re_qp *qp;
+	struct bnxt_re_queue *que;
+	int dqed = 0, left;
+
+	/* Check if flush Qs are empty */
+	if (list_empty(&cq->sfhead) && list_empty(&cq->rfhead))
+		return 0;
+
+	if (!list_empty(&cq->sfhead)) {
+		list_for_each_safe(&cq->sfhead, cur, tmp, list) {
+			qp = container_of(cur, struct bnxt_re_qp, snode);
+			que = qp->sqq;
+			if (bnxt_re_is_que_empty(que))
+				continue;
+			dqed = bnxt_re_poll_flush_wcs(que, qp->swrid, ibvwc,
+						      qp->qpid, nwc);
+		}
+	}
+
+	left = nwc - dqed;
+	if (!left)
+		return dqed;
+
+	if (!list_empty(&cq->rfhead)) {
+		list_for_each_safe(&cq->rfhead, cur, tmp, list) {
+			qp = container_of(cur, struct bnxt_re_qp, rnode);
+			que = qp->rqq;
+			if (!que || bnxt_re_is_que_empty(que))
+				continue;
+			dqed += bnxt_re_poll_flush_wcs(que, qp->rwrid,
+						       ibvwc + dqed, qp->qpid,
+						       left);
+		}
+	}
+
+	return dqed;
+}
+
 int bnxt_re_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc)
 {
 	struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq);
-	int dqed;
+	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvcq->context);
+	int dqed, left = 0;
 
 	pthread_spin_lock(&cq->cqq.qlock);
 	dqed = bnxt_re_poll_one(cq, nwc, wc);
 	pthread_spin_unlock(&cq->cqq.qlock);
-
-	/* TODO: Flush Management*/
+	/* Check if anything is there to flush. */
+	pthread_spin_lock(&cntx->fqlock);
+	left = nwc - dqed;
+	if (left)
+		dqed += bnxt_re_poll_flush_lists(cq, left, (wc + dqed));
+	pthread_spin_unlock(&cntx->fqlock);
 
 	return dqed;
 }
 
 static void bnxt_re_cleanup_cq(struct bnxt_re_qp *qp, struct bnxt_re_cq *cq)
 {
+	struct bnxt_re_context *cntx;
 	struct bnxt_re_queue *que = &cq->cqq;
 	struct bnxt_re_bcqe *hdr;
 	struct bnxt_re_req_cqe *scqe;
@@ -465,6 +651,8 @@ static void bnxt_re_cleanup_cq(struct bnxt_re_qp *qp, struct bnxt_re_cq *cq)
 	void *cqe;
 	int indx, type;
 
+	cntx = to_bnxt_re_context(cq->ibvcq.context);
+
 	pthread_spin_lock(&que->qlock);
 	for (indx = 0; indx < que->depth; indx++) {
 		cqe = que->va + indx * bnxt_re_get_cqe_sz();
@@ -487,6 +675,11 @@ static void bnxt_re_cleanup_cq(struct bnxt_re_qp *qp, struct bnxt_re_cq *cq)
 
 	}
 	pthread_spin_unlock(&que->qlock);
+
+	pthread_spin_lock(&cntx->fqlock);
+	bnxt_re_fque_del_node(&qp->snode);
+	bnxt_re_fque_del_node(&qp->rnode);
+	pthread_spin_unlock(&cntx->fqlock);
 }
 
 void bnxt_re_cq_event(struct ibv_cq *ibvcq)
@@ -679,6 +872,8 @@ struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
 	cap->max_rsge = attr->cap.max_recv_sge;
 	cap->max_inline = attr->cap.max_inline_data;
 	cap->sqsig = attr->sq_sig_all;
+	fque_init_node(&qp->snode);
+	fque_init_node(&qp->rnode);
 
 	return &qp->ibvqp;
 failcmd:
-- 
1.8.3.1


* [rdma-core v2 6/9] libbnxt_re: Enable UD control path and wqe posting
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-02-18 15:43   ` [rdma-core v2 5/9] libbnxt_re: Allow apps to poll for flushed completions Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 7/9] libbnxt_re: Enable polling for UD completions Devesh Sharma
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds the infrastructure needed to enable the unreliable
datagram (UD) control path. It also adds support for posting
Send WQEs to UD QPs. The major changes are:

 - Mmap the shared page exported from the kernel driver to
   read the AH-ID from kernel space.
 - Add support for create-AH and destroy-AH.
 - Add support for posting UD WQEs.
 - Do not use search-psn memory for UD QPs.
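
Reading the AH-ID from the shared page amounts to loading a 32-bit
value at a fixed offset. The sketch below assumes the kernel has
already written the value at offset 0x10 (BNXT_RE_SHPG_AVID_OFFT) and
uses an ordinary buffer in place of the mmap()ed page. The sketch_*
names are hypothetical. The 20-bit mask mirrors the one the patch
applies when it builds the UD WQE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_SHPG_AVID_OFFT = 0x10 }; /* same value as in bnxt_re-abi.h */

/* Load the 32-bit AH id assumed to be stored in the shared page. */
static uint32_t sketch_read_avid(const void *shpg, size_t pg_size)
{
	uint32_t avid;

	assert(shpg && pg_size >= SKETCH_SHPG_AVID_OFFT + sizeof(avid));
	/* memcpy avoids alignment and aliasing assumptions on the page. */
	memcpy(&avid, (const uint8_t *)shpg + SKETCH_SHPG_AVID_OFFT,
	       sizeof(avid));
	return avid & 0xFFFFF; /* keep the low 20 bits */
}
```

In the patch the read is serialized by a mutex around the create-AH
command, since the page holds only one AH-ID slot at a time.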

v1->v2
 --Removed extra ref of PD in ah structure

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/bnxt_re-abi.h |  7 ++++
 providers/bnxt_re/main.c        | 17 ++++++++
 providers/bnxt_re/main.h        | 12 ++++++
 providers/bnxt_re/verbs.c       | 93 ++++++++++++++++++++++++++++++++++-------
 4 files changed, 114 insertions(+), 15 deletions(-)

diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
index b7eef36..3082d76 100644
--- a/providers/bnxt_re/bnxt_re-abi.h
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -174,6 +174,13 @@ enum bnxt_re_ud_flags_mask {
 	BNXT_RE_UD_FLAGS_ROCE_IPV6	= 0x03
 };
 
+enum bnxt_re_shpg_offt {
+	BNXT_RE_SHPG_BEG_RESV_OFFT	= 0x00,
+	BNXT_RE_SHPG_AVID_OFFT		= 0x10,
+	BNXT_RE_SHPG_AVID_SIZE		= 0x04,
+	BNXT_RE_SHPG_END_RESV_OFFT	= 0xFF0
+};
+
 struct bnxt_re_db_hdr {
 	__u32 indx;
 	__u32 typ_qid; /* typ: 4, qid:20*/
diff --git a/providers/bnxt_re/main.c b/providers/bnxt_re/main.c
index e58c953..aee5c43 100644
--- a/providers/bnxt_re/main.c
+++ b/providers/bnxt_re/main.c
@@ -129,18 +129,35 @@ static int bnxt_re_init_context(struct verbs_device *vdev,
 	dev->cqe_size = resp.cqe_size;
 	dev->max_cq_depth = resp.max_cqd;
 	pthread_spin_init(&cntx->fqlock, PTHREAD_PROCESS_PRIVATE);
+	/* mmap shared page. */
+	cntx->shpg = mmap(NULL, dev->pg_size, PROT_READ | PROT_WRITE,
+			  MAP_SHARED, cmd_fd, 0);
+	if (cntx->shpg == MAP_FAILED) {
+		cntx->shpg = NULL;
+		goto failed;
+	}
+	pthread_mutex_init(&cntx->shlock, NULL);
+
 	ibvctx->ops = bnxt_re_cntx_ops;
 
 	return 0;
+failed:
+	fprintf(stderr, DEV "Failed to allocate context for device\n");
+	return errno;
 }
 
 static void bnxt_re_uninit_context(struct verbs_device *vdev,
 				   struct ibv_context *ibvctx)
 {
+	struct bnxt_re_dev *dev;
 	struct bnxt_re_context *cntx;
 
+	dev = to_bnxt_re_dev(&vdev->device);
 	cntx = to_bnxt_re_context(ibvctx);
 	/* Unmap if anything device specific was mapped in init_context. */
+	pthread_mutex_destroy(&cntx->shlock);
+	if (cntx->shpg)
+		munmap(cntx->shpg, dev->pg_size);
 	pthread_spin_destroy(&cntx->fqlock);
 }
 
diff --git a/providers/bnxt_re/main.h b/providers/bnxt_re/main.h
index d324ef6..9d99c64 100644
--- a/providers/bnxt_re/main.h
+++ b/providers/bnxt_re/main.h
@@ -123,6 +123,11 @@ struct bnxt_re_mr {
 	struct ibv_mr ibvmr;
 };
 
+struct bnxt_re_ah {
+	struct ibv_ah ibvah;
+	uint32_t avid;
+};
+
 struct bnxt_re_dev {
 	struct verbs_device vdev;
 	uint8_t abi_version;
@@ -138,6 +143,8 @@ struct bnxt_re_context {
 	uint32_t max_qp;
 	uint32_t max_srq;
 	struct bnxt_re_dpi udpi;
+	void *shpg;
+	pthread_mutex_t shlock;
 	pthread_spinlock_t fqlock;
 };
 
@@ -175,6 +182,11 @@ static inline struct bnxt_re_qp *to_bnxt_re_qp(struct ibv_qp *ibvqp)
 	return container_of(ibvqp, struct bnxt_re_qp, ibvqp);
 }
 
+static inline struct bnxt_re_ah *to_bnxt_re_ah(struct ibv_ah *ibvah)
+{
+        return container_of(ibvah, struct bnxt_re_ah, ibvah);
+}
+
 static inline uint32_t bnxt_re_get_sqe_sz(void)
 {
 	return sizeof(struct bnxt_re_bsqe) +
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index 2273e82..19c30a0 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -706,9 +706,6 @@ static int bnxt_re_check_qp_limits(struct bnxt_re_context *cntx,
 	struct ibv_device_attr devattr;
 	int ret;
 
-	if (attr->qp_type == IBV_QPT_UD)
-		return -ENOSYS;
-
 	ret = bnxt_re_query_device(&cntx->ibvctx, &devattr);
 	if (ret)
 		return ret;
@@ -784,6 +781,11 @@ static int bnxt_re_alloc_queues(struct bnxt_re_qp *qp,
 		psn_depth++;
 
 	que->depth += psn_depth;
+	/* PSN-search memory is allocated without checking the
+	 * QP type. The kernel driver does not map this memory if it
+	 * is a UD QP. UD QPs use this memory to maintain the WC opcode.
+	 * See the definition of bnxt_re_fill_psns() for the use case.
+	 */
 	ret = bnxt_re_alloc_aligned(qp->sqq, pg_size);
 	if (ret)
 		return ret;
@@ -1009,18 +1011,18 @@ static void bnxt_re_fill_psns(struct bnxt_re_qp *qp, struct bnxt_re_psns *psns,
 	uint32_t pkt_cnt = 0, nxt_psn;
 
 	memset(psns, 0, sizeof(*psns));
-	psns->opc_spsn = qp->sq_psn & BNXT_RE_PSNS_SPSN_MASK;
+	if (qp->qptyp == IBV_QPT_RC) {
+		psns->opc_spsn = qp->sq_psn & BNXT_RE_PSNS_SPSN_MASK;
+		pkt_cnt = (len / qp->mtu);
+		if (len % qp->mtu)
+			pkt_cnt++;
+		nxt_psn = ((qp->sq_psn + pkt_cnt) & BNXT_RE_PSNS_NPSN_MASK);
+		psns->flg_npsn = nxt_psn;
+		qp->sq_psn = nxt_psn;
+	}
 	opcode = bnxt_re_ibv_wr_to_wc_opcd(opcode);
 	psns->opc_spsn |= ((opcode & BNXT_RE_PSNS_OPCD_MASK) <<
 			    BNXT_RE_PSNS_OPCD_SHIFT);
-
-	pkt_cnt = (len / qp->mtu);
-	if (len % qp->mtu)
-		pkt_cnt++;
-	nxt_psn = ((qp->sq_psn + pkt_cnt) & BNXT_RE_PSNS_NPSN_MASK);
-	psns->flg_npsn = nxt_psn;
-	qp->sq_psn = nxt_psn;
-
 	*(uint64_t *)psns = htole64(*(uint64_t *)psns);
 }
 
@@ -1065,6 +1067,26 @@ static int bnxt_re_build_send_sqe(struct bnxt_re_qp *qp, void *wqe,
 	return len;
 }
 
+static int bnxt_re_build_ud_sqe(struct bnxt_re_qp *qp, void *wqe,
+				struct ibv_send_wr *wr, uint8_t is_inline)
+{
+	struct bnxt_re_send *sqe = ((void *)wqe + sizeof(struct bnxt_re_bsqe));
+	struct bnxt_re_ah *ah;
+	int len;
+
+	len = bnxt_re_build_send_sqe(qp, wqe, wr, is_inline);
+	sqe->qkey = wr->wr.ud.remote_qkey;
+	sqe->dst_qp = wr->wr.ud.remote_qpn;
+	if (!wr->wr.ud.ah) {
+		len = -EINVAL;
+		goto bail;
+	}
+	ah = to_bnxt_re_ah(wr->wr.ud.ah);
+	sqe->avid = ah->avid & 0xFFFFF;
+bail:
+	return len;
+}
+
 static int bnxt_re_build_rdma_sqe(struct bnxt_re_qp *qp, void *wqe,
 				  struct ibv_send_wr *wr, uint8_t is_inline)
 {
@@ -1126,9 +1148,14 @@ int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
 		case IBV_WR_SEND_WITH_IMM:
 			hdr->key_immd = wr->imm_data;
 		case IBV_WR_SEND:
-			bytes = bnxt_re_build_send_sqe(qp, sqe, wr, is_inline);
+			if (qp->qptyp == IBV_QPT_UD)
+				bytes = bnxt_re_build_ud_sqe(qp, sqe, wr,
+							     is_inline);
+			else
+				bytes = bnxt_re_build_send_sqe(qp, sqe, wr,
+							       is_inline);
 			if (bytes < 0)
-				ret = ENOMEM;
+				ret = (bytes == -EINVAL) ? EINVAL : ENOMEM;
 			break;
 		case IBV_WR_RDMA_WRITE_WITH_IMM:
 			hdr->key_immd = wr->imm_data;
@@ -1269,10 +1296,46 @@ int bnxt_re_post_srq_recv(struct ibv_srq *ibvsrq, struct ibv_recv_wr *wr,
 
 struct ibv_ah *bnxt_re_create_ah(struct ibv_pd *ibvpd, struct ibv_ah_attr *attr)
 {
+	struct bnxt_re_context *uctx;
+	struct bnxt_re_ah *ah;
+	struct ibv_create_ah_resp resp;
+	int status;
+
+	uctx = to_bnxt_re_context(ibvpd->context);
+
+	ah = calloc(1, sizeof(*ah));
+	if (!ah)
+		goto failed;
+
+	pthread_mutex_lock(&uctx->shlock);
+	memset(&resp, 0, sizeof(resp));
+	status = ibv_cmd_create_ah(ibvpd, &ah->ibvah, attr,
+				   &resp, sizeof(resp));
+	if (status) {
+		pthread_mutex_unlock(&uctx->shlock);
+		free(ah);
+		goto failed;
+	}
+	/* read AV ID now. */
+	rmb();
+	ah->avid = *(uint32_t *)(uctx->shpg + BNXT_RE_SHPG_AVID_OFFT);
+	pthread_mutex_unlock(&uctx->shlock);
+
+	return &ah->ibvah;
+failed:
 	return NULL;
 }
 
 int bnxt_re_destroy_ah(struct ibv_ah *ibvah)
 {
-	return -ENOSYS;
+	struct bnxt_re_ah *ah;
+	int status;
+
+	ah = to_bnxt_re_ah(ibvah);
+	status = ibv_cmd_destroy_ah(ibvah);
+	if (status)
+		return status;
+	free(ah);
+
+	return 0;
 }
-- 
1.8.3.1


* [rdma-core v2 7/9] libbnxt_re: Enable polling for UD completions
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-02-18 15:43   ` [rdma-core v2 6/9] libbnxt_re: Enable UD control path and wqe posting Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 8/9] libbnxt_re: Add support for atomic operations Devesh Sharma
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds support for polling send
and recv completions for a UD QP.
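
For UD receive completions the 24-bit source QP number is assembled
from two CQE fields: the upper 8 bits and the lower 16 bits. The sketch
below shows only that bit arithmetic. The 0xFF mask for the upper field
is an assumption based on the "higher 8 bits of 24" comment in the
patch. The lower mask and shift reuse the values of
BNXT_RE_UD_CQE_SRCQPLO_MASK and BNXT_RE_UD_CQE_SRCQPLO_SHIFT. The
sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SRCQP_HI_MASK  0xFFu   /* assumed width of the high field */
#define SKETCH_SRCQP_LO_MASK  0xFFFFu /* BNXT_RE_UD_CQE_SRCQPLO_MASK */
#define SKETCH_SRCQP_LO_SHIFT 0x30    /* BNXT_RE_UD_CQE_SRCQPLO_SHIFT */

/*
 * qphi:     high field, already shifted down to bit 0
 * qplo_mac: 64-bit word holding the MAC in bits 0..47 and the low
 *           16 bits of the source QP in bits 48..63
 */
static uint32_t sketch_src_qp(uint32_t qphi, uint64_t qplo_mac)
{
	uint32_t qpid;

	qpid = (qphi & SKETCH_SRCQP_HI_MASK) << 16;
	qpid |= (uint32_t)((qplo_mac >> SKETCH_SRCQP_LO_SHIFT) &
			   SKETCH_SRCQP_LO_MASK);
	assert(qpid <= 0xFFFFFFu); /* a QP number is 24 bits wide */
	return qpid;
}
```

For example, a high field of 0x12 and a low field of 0x3456 combine
into source QP 0x123456.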

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/bnxt_re-abi.h |  6 ++++++
 providers/bnxt_re/verbs.c       | 43 +++++++++++++++++++++++++++--------------
 2 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
index 3082d76..581e1b7 100644
--- a/providers/bnxt_re/bnxt_re-abi.h
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -174,6 +174,12 @@ enum bnxt_re_ud_flags_mask {
 	BNXT_RE_UD_FLAGS_ROCE_IPV6	= 0x03
 };
 
+enum bnxt_re_ud_cqe_mask {
+	BNXT_RE_UD_CQE_MAC_MASK		= 0xFFFFFFFFFFFFULL,
+	BNXT_RE_UD_CQE_SRCQPLO_MASK	= 0xFFFF,
+	BNXT_RE_UD_CQE_SRCQPLO_SHIFT	= 0x30
+};
+
 enum bnxt_re_shpg_offt {
 	BNXT_RE_SHPG_BEG_RESV_OFFT	= 0x00,
 	BNXT_RE_SHPG_AVID_OFFT		= 0x10,
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index 19c30a0..36a1a6e 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -336,10 +336,8 @@ static uint8_t bnxt_re_poll_scqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
 	return pcqe;
 }
 
-static int bnxt_re_poll_err_rcqe(struct bnxt_re_qp *qp,
-				 struct ibv_wc *ibvwc,
-				 struct bnxt_re_bcqe *hdr,
-				 struct bnxt_re_rc_cqe *rcqe)
+static int bnxt_re_poll_err_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
+				 struct bnxt_re_bcqe *hdr, void *cqe)
 {
 	struct bnxt_re_queue *rq = qp->rqq;
 	struct bnxt_re_wrid *rwrid;
@@ -364,6 +362,10 @@ static int bnxt_re_poll_err_rcqe(struct bnxt_re_qp *qp,
 		ibvwc->qp_num = qp->qpid;
 		ibvwc->opcode = IBV_WC_RECV;
 		ibvwc->byte_len = 0;
+		ibvwc->wc_flags = 0;
+		if (qp->qptyp == IBV_QPT_UD)
+			ibvwc->src_qp = 0;
+
 		bnxt_re_incr_head(qp->rqq);
 		if (qp->qpst != IBV_QPS_ERR)
 			qp->qpst = IBV_QPS_ERR;
@@ -375,16 +377,32 @@ static int bnxt_re_poll_err_rcqe(struct bnxt_re_qp *qp,
 	return 1;
 }
 
+static void bnxt_re_fill_ud_cqe(struct ibv_wc *ibvwc,
+				struct bnxt_re_bcqe *hdr, void *cqe)
+{
+	struct bnxt_re_ud_cqe *ucqe = cqe;
+	uint32_t qpid;
+
+	qpid = ((hdr->qphi_rwrid >> BNXT_RE_BCQE_SRCQP_SHIFT) &
+		BNXT_RE_BCQE_SRCQP_SHIFT) << 0x10; /* higher 8 bits of 24 */
+	qpid |= (ucqe->qplo_mac >> BNXT_RE_UD_CQE_SRCQPLO_SHIFT) &
+		BNXT_RE_UD_CQE_SRCQPLO_MASK; /*lower 16 of 24 */
+	ibvwc->src_qp = qpid;
+	ibvwc->wc_flags |= IBV_WC_GRH;
+	/*IB-stack ABI in user do not ask for MAC to be reported. */
+}
+
 static void bnxt_re_poll_success_rcqe(struct bnxt_re_qp *qp,
 				      struct ibv_wc *ibvwc,
-				      struct bnxt_re_bcqe *hdr,
-				      struct bnxt_re_rc_cqe *rcqe)
+				      struct bnxt_re_bcqe *hdr, void *cqe)
 {
 	struct bnxt_re_queue *rq = qp->rqq;
 	struct bnxt_re_wrid *rwrid;
+	struct bnxt_re_rc_cqe *rcqe;
 	uint32_t head = rq->head;
 	uint8_t flags, is_imm, is_rdma;
 
+	rcqe = cqe;
 	rwrid = &qp->rwrid[head];
 
 	ibvwc->status = IBV_WC_SUCCESS;
@@ -407,6 +425,9 @@ static void bnxt_re_poll_success_rcqe(struct bnxt_re_qp *qp,
 			ibvwc->opcode = IBV_WC_RECV_RDMA_WITH_IMM;
 	}
 
+	if (qp->qptyp == IBV_QPT_UD)
+		bnxt_re_fill_ud_cqe(ibvwc, hdr, cqe);
+
 	bnxt_re_incr_head(rq);
 }
 
@@ -414,19 +435,17 @@ static uint8_t bnxt_re_poll_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
 				 void *cqe, int *cnt)
 {
 	struct bnxt_re_bcqe *hdr;
-	struct bnxt_re_rc_cqe *rcqe;
 	uint8_t status, pcqe = false;
 
-	rcqe = cqe;
 	hdr = cqe + sizeof(struct bnxt_re_rc_cqe);
 
 	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
 		  BNXT_RE_BCQE_STATUS_MASK;
 	*cnt = 1;
 	if (status == BNXT_RE_RSP_ST_OK)
-		bnxt_re_poll_success_rcqe(qp, ibvwc, hdr, rcqe);
+		bnxt_re_poll_success_rcqe(qp, ibvwc, hdr, cqe);
 	else
-		*cnt = bnxt_re_poll_err_rcqe(qp, ibvwc, hdr, rcqe);
+		*cnt = bnxt_re_poll_err_rcqe(qp, ibvwc, hdr, cqe);
 
 	return pcqe;
 }
@@ -484,9 +503,6 @@ static int bnxt_re_poll_one(struct bnxt_re_cq *cq, int nwc, struct ibv_wc *wc)
 			qp = (struct bnxt_re_qp *)scqe->qp_handle;
 			if (!qp)
 				break; /*stale cqe. should be rung.*/
-			if (qp->qptyp == IBV_QPT_UD)
-				goto bail; /* TODO: Add UD poll */
-
 			pcqe = bnxt_re_poll_scqe(qp, wc, cqe, &cnt);
 			break;
 		case BNXT_RE_WC_TYPE_RECV_RC:
@@ -500,7 +516,6 @@ static int bnxt_re_poll_one(struct bnxt_re_cq *cq, int nwc, struct ibv_wc *wc)
 				goto bail; /*TODO: Add SRQ poll */
 
 			pcqe = bnxt_re_poll_rcqe(qp, wc, cqe, &cnt);
-			/* TODO: Process UD rcqe */
 			break;
 		case BNXT_RE_WC_TYPE_RECV_RAW:
 			break;
-- 
1.8.3.1


* [rdma-core v2 8/9] libbnxt_re: Add support for atomic operations
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-02-18 15:43   ` [rdma-core v2 7/9] libbnxt_re: Enable polling for UD completions Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-18 15:43   ` [rdma-core v2 9/9] libbnxt_re: Add support for SRQ in user lib Devesh Sharma
  2017-02-21 19:50   ` [rdma-core v2 0/9] Broadcom User Space RoCE Driver Jason Gunthorpe
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds support for compare-and-swap and fetch-and-add
atomic operations in the user library.
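
For readers less familiar with these verbs, the sketch below models
what the two operations do to the 64-bit value at the remote address.
It is a plain single-threaded reference model, not the hardware path
and not code from this patch. The sketch_upper32() and
sketch_lower32() helpers mirror the idea of the upper_32_bits() and
lower_32_bits() helpers added to memory.h. All sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compare-and-swap: store 'swap' only if the current value equals
 * 'compare'. The original value is always returned. */
static uint64_t sketch_cmp_swp(uint64_t *addr, uint64_t compare,
			       uint64_t swap)
{
	uint64_t old;

	assert(addr);
	old = *addr;
	if (old == compare)
		*addr = swap;
	return old;
}

/* Fetch-and-add: add 'add' to the value and return the original. */
static uint64_t sketch_fetch_add(uint64_t *addr, uint64_t add)
{
	uint64_t old;

	assert(addr);
	old = *addr;
	*addr = old + add;
	return old;
}

/* Split a 64-bit operand into two 32-bit halves. */
static uint32_t sketch_upper32(uint64_t n)
{
	return (uint32_t)(n >> 32);
}

static uint32_t sketch_lower32(uint64_t n)
{
	return (uint32_t)(n & 0xFFFFFFFFu);
}
```

In both operations the value returned to the requester is the one
found at the address before the operation took effect.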

v1->v2
 -- Fixed the missing "break"
 -- Changed macros to inline function

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/bnxt_re-abi.h |  3 ++-
 providers/bnxt_re/main.h        |  8 +++++-
 providers/bnxt_re/memory.h      | 10 +++++++
 providers/bnxt_re/verbs.c       | 58 +++++++++++++++++++++++++++++++++++------
 4 files changed, 69 insertions(+), 10 deletions(-)

diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
index 581e1b7..557221b 100644
--- a/providers/bnxt_re/bnxt_re-abi.h
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -54,7 +54,8 @@ enum bnxt_re_wr_opcode {
 	BNXT_RE_WR_OPCD_ATOMIC_FA	= 0x0B,
 	BNXT_RE_WR_OPCD_LOC_INVAL	= 0x0C,
 	BNXT_RE_WR_OPCD_BIND		= 0x0E,
-	BNXT_RE_WR_OPCD_RECV		= 0x80
+	BNXT_RE_WR_OPCD_RECV		= 0x80,
+	BNXT_RE_WR_OPCD_INVAL		= 0xFF
 };
 
 enum bnxt_re_wr_flags {
diff --git a/providers/bnxt_re/main.h b/providers/bnxt_re/main.h
index 9d99c64..a417328 100644
--- a/providers/bnxt_re/main.h
+++ b/providers/bnxt_re/main.h
@@ -236,9 +236,15 @@ static inline uint8_t bnxt_re_ibv_to_bnxt_wr_opcd(uint8_t ibv_opcd)
 	case IBV_WR_RDMA_READ:
 		bnxt_opcd = BNXT_RE_WR_OPCD_RDMA_READ;
 		break;
+	case IBV_WR_ATOMIC_CMP_AND_SWP:
+		bnxt_opcd = BNXT_RE_WR_OPCD_ATOMIC_CS;
+		break;
+	case IBV_WR_ATOMIC_FETCH_AND_ADD:
+		bnxt_opcd = BNXT_RE_WR_OPCD_ATOMIC_FA;
+		break;
 		/* TODO: Add other opcodes */
 	default:
-		bnxt_opcd = 0xFF;
+		bnxt_opcd = BNXT_RE_WR_OPCD_INVAL;
 		break;
 	};
 
diff --git a/providers/bnxt_re/memory.h b/providers/bnxt_re/memory.h
index debb31a..d7e6a92 100644
--- a/providers/bnxt_re/memory.h
+++ b/providers/bnxt_re/memory.h
@@ -83,6 +83,16 @@ static inline void iowrite32(__u32 *dst, uint32_t *src)
 	*(volatile __u32 *)dst = *src;
 }
 
+static inline __u32 upper_32_bits(uint64_t n)
+{
+	return (__u32)((n >> 16) >> 16);
+}
+
+static inline __u32 lower_32_bits(uint64_t n)
+{
+	return (__u32)(n & 0xFFFFFFFFUL);
+}
+
 /* Basic queue operation */
 static inline uint32_t bnxt_re_is_que_full(struct bnxt_re_queue *que)
 {
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index 36a1a6e..bc06386 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -1068,6 +1068,9 @@ static int bnxt_re_build_send_sqe(struct bnxt_re_qp *qp, void *wqe,
 
 	/* Fill Header */
 	opcode = bnxt_re_ibv_to_bnxt_wr_opcd(wr->opcode);
+	if (opcode == BNXT_RE_WR_OPCD_INVAL)
+		return -EINVAL;
+
 	hdr->rsv_ws_fl_wt |= (opcode & BNXT_RE_HDR_WT_MASK);
 
 	if (is_inline) {
@@ -1116,6 +1119,44 @@ static int bnxt_re_build_rdma_sqe(struct bnxt_re_qp *qp, void *wqe,
 	return len;
 }
 
+static int bnxt_re_build_cns_sqe(struct bnxt_re_qp *qp, void *wqe,
+				 struct ibv_send_wr *wr)
+{
+	struct bnxt_re_bsqe *hdr = wqe;
+	struct bnxt_re_atomic *sqe = ((void *)wqe +
+				      sizeof(struct bnxt_re_bsqe));
+	int len;
+
+	len = bnxt_re_build_send_sqe(qp, wqe, wr, false);
+	hdr->key_immd = wr->wr.atomic.rkey;
+	sqe->rva_lo = lower_32_bits(wr->wr.atomic.remote_addr);
+	sqe->rva_hi = upper_32_bits(wr->wr.atomic.remote_addr);
+	sqe->cmp_dt_lo = lower_32_bits(wr->wr.atomic.compare_add);
+	sqe->cmp_dt_hi = upper_32_bits(wr->wr.atomic.compare_add);
+	sqe->swp_dt_lo = lower_32_bits(wr->wr.atomic.swap);
+	sqe->swp_dt_hi = upper_32_bits(wr->wr.atomic.swap);
+
+	return len;
+}
+
+static int bnxt_re_build_fna_sqe(struct bnxt_re_qp *qp, void *wqe,
+				 struct ibv_send_wr *wr)
+{
+	struct bnxt_re_bsqe *hdr = wqe;
+	struct bnxt_re_atomic *sqe = ((void *)wqe +
+				      sizeof(struct bnxt_re_bsqe));
+	int len;
+
+	len = bnxt_re_build_send_sqe(qp, wqe, wr, false);
+	hdr->key_immd = wr->wr.atomic.rkey;
+	sqe->rva_lo = lower_32_bits(wr->wr.atomic.remote_addr);
+	sqe->rva_hi = upper_32_bits(wr->wr.atomic.remote_addr);
+	sqe->cmp_dt_lo = lower_32_bits(wr->wr.atomic.compare_add);
+	sqe->cmp_dt_hi = upper_32_bits(wr->wr.atomic.compare_add);
+
+	return len;
+}
+
 int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
 		      struct ibv_send_wr **bad)
 {
@@ -1169,27 +1210,28 @@ int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
 			else
 				bytes = bnxt_re_build_send_sqe(qp, sqe, wr,
 							       is_inline);
-			if (bytes < 0)
-				ret = (bytes == -EINVAL) ? EINVAL : ENOMEM;
 			break;
 		case IBV_WR_RDMA_WRITE_WITH_IMM:
 			hdr->key_immd = wr->imm_data;
 		case IBV_WR_RDMA_WRITE:
 			bytes = bnxt_re_build_rdma_sqe(qp, sqe, wr, is_inline);
-			if (bytes < 0)
-				ret = ENOMEM;
 			break;
 		case IBV_WR_RDMA_READ:
 			bytes = bnxt_re_build_rdma_sqe(qp, sqe, wr, false);
-			if (bytes < 0)
-				ret = ENOMEM;
+			break;
+		case IBV_WR_ATOMIC_CMP_AND_SWP:
+			bytes = bnxt_re_build_cns_sqe(qp, sqe, wr);
+			break;
+		case IBV_WR_ATOMIC_FETCH_AND_ADD:
+			bytes = bnxt_re_build_fna_sqe(qp, sqe, wr);
 			break;
 		default:
-			ret = EINVAL;
+			bytes = -EINVAL;
 			break;
 		}
 
-		if (ret) {
+		if (bytes < 0) {
+			ret = (bytes == -EINVAL) ? EINVAL : ENOMEM;
 			*bad = wr;
 			break;
 		}
-- 
1.8.3.1
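
The atomic WQE builders in the patch above store each 64-bit operand
(remote address, compare/add value, swap value) as two 32-bit words.
A minimal, standalone sketch of that split follows. The struct and the
function names here are hypothetical and are not the driver's own; the
patch writes the shift as (n >> 16) >> 16, which gives the same result
as the plain >> 32 used below when the argument is a uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one lo/hi field pair of an atomic WQE. */
struct atomic_words {
	uint32_t lo;
	uint32_t hi;
};

/* Upper half of a 64-bit value. */
static inline uint32_t upper_32(uint64_t n)
{
	return (uint32_t)(n >> 32);
}

/* Lower half of a 64-bit value. */
static inline uint32_t lower_32(uint64_t n)
{
	return (uint32_t)(n & 0xFFFFFFFFu);
}

/* Split a 64-bit operand into the two words a WQE field pair holds. */
static inline struct atomic_words split_u64(uint64_t n)
{
	struct atomic_words w = { .lo = lower_32(n), .hi = upper_32(n) };

	return w;
}

/* Inverse of split_u64(), useful for checking the round trip. */
static inline uint64_t join_u64(struct atomic_words w)
{
	return ((uint64_t)w.hi << 32) | w.lo;
}
```

The round trip through split_u64() and join_u64() should return the
original value for any input, which is an easy property to unit-test.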


* [rdma-core v2 9/9] libbnxt_re: Add support for SRQ in user lib
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-02-18 15:43   ` [rdma-core v2 8/9] libbnxt_re: Add support for atomic operations Devesh Sharma
@ 2017-02-18 15:43   ` Devesh Sharma
  2017-02-21 19:50   ` [rdma-core v2 0/9] Broadcom User Space RoCE Driver Jason Gunthorpe
  9 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-18 15:43 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

This patch adds support for shared receive queues (SRQ).
The changes are:
 - Add ABI for user/kernel information exchange.
 - Add functions to handle SRQ arming and doorbell ringing.
 - Add functions to create/destroy an SRQ.
 - Add functions to query/modify an SRQ.
 - Add a function to post RQEs on an SRQ.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 providers/bnxt_re/bnxt_re-abi.h |  15 +++
 providers/bnxt_re/db.c          |  18 +++
 providers/bnxt_re/main.h        |  32 ++++-
 providers/bnxt_re/verbs.c       | 261 ++++++++++++++++++++++++++++++++++------
 4 files changed, 288 insertions(+), 38 deletions(-)

diff --git a/providers/bnxt_re/bnxt_re-abi.h b/providers/bnxt_re/bnxt_re-abi.h
index 557221b..8dbb7b9 100644
--- a/providers/bnxt_re/bnxt_re-abi.h
+++ b/providers/bnxt_re/bnxt_re-abi.h
@@ -214,6 +214,7 @@ struct bnxt_re_mr_resp {
 	struct ibv_reg_mr_resp resp;
 };
 
+/* CQ */
 struct bnxt_re_cq_req {
 	struct ibv_create_cq cmd;
 	__u64 cq_va;
@@ -261,6 +262,7 @@ struct bnxt_re_term_cqe {
 	__u64 rsvd1;
 };
 
+/* QP */
 struct bnxt_re_qp_req {
 	struct ibv_create_qp cmd;
 	__u64 qpsva;
@@ -352,6 +354,19 @@ struct bnxt_re_rqe {
 	__u64 rsvd[2];
 };
 
+/* SRQ */
+struct bnxt_re_srq_req {
+	struct ibv_create_srq cmd;
+	__u64 srqva;
+	__u64 srq_handle;
+};
+
+struct bnxt_re_srq_resp {
+	struct ibv_create_srq_resp resp;
+	__u32 srqid;
+	__u32 rsvd;
+};
+
 struct bnxt_re_srqe {
 	__u32 srq_tag; /* 20 bits are valid */
 	__u32 rsvd1;
diff --git a/providers/bnxt_re/db.c b/providers/bnxt_re/db.c
index 3897aea..5128af4 100644
--- a/providers/bnxt_re/db.c
+++ b/providers/bnxt_re/db.c
@@ -73,6 +73,24 @@ void bnxt_re_ring_sq_db(struct bnxt_re_qp *qp)
 	bnxt_re_ring_db(qp->udpi, &hdr);
 }
 
+void bnxt_re_ring_srq_db(struct bnxt_re_srq *srq)
+{
+	struct bnxt_re_db_hdr hdr;
+
+	bnxt_re_init_db_hdr(&hdr, srq->srqq->tail, srq->srqid,
+			    BNXT_RE_QUE_TYPE_SRQ);
+	bnxt_re_ring_db(srq->udpi, &hdr);
+}
+
+void bnxt_re_ring_srq_arm(struct bnxt_re_srq *srq)
+{
+	struct bnxt_re_db_hdr hdr;
+
+	bnxt_re_init_db_hdr(&hdr, srq->cap.srq_limit, srq->srqid,
+			    BNXT_RE_QUE_TYPE_SRQ_ARM);
+	bnxt_re_ring_db(srq->udpi, &hdr);
+}
+
 void bnxt_re_ring_cq_db(struct bnxt_re_cq *cq)
 {
 	struct bnxt_re_db_hdr hdr;
diff --git a/providers/bnxt_re/main.h b/providers/bnxt_re/main.h
index a417328..3ddffde 100644
--- a/providers/bnxt_re/main.h
+++ b/providers/bnxt_re/main.h
@@ -76,10 +76,6 @@ struct bnxt_re_cq {
 	uint8_t  phase;
 };
 
-struct bnxt_re_srq {
-	struct ibv_srq ibvsrq;
-};
-
 struct bnxt_re_wrid {
 	struct bnxt_re_psns *psns;
 	uint64_t wrid;
@@ -96,6 +92,16 @@ struct bnxt_re_qpcap {
 	uint8_t	sqsig;
 };
 
+struct bnxt_re_srq {
+	struct ibv_srq ibvsrq;
+	struct ibv_srq_attr cap;
+	struct bnxt_re_queue *srqq;
+	struct bnxt_re_wrid *srwrid;
+	struct bnxt_re_dpi *udpi;
+	uint32_t srqid;
+	uint32_t pre_count;
+};
+
 struct bnxt_re_qp {
 	struct ibv_qp ibvqp;
 	struct bnxt_re_queue *sqq;
@@ -151,6 +157,7 @@ struct bnxt_re_context {
 /* DB ring functions used internally*/
 void bnxt_re_ring_rq_db(struct bnxt_re_qp *qp);
 void bnxt_re_ring_sq_db(struct bnxt_re_qp *qp);
+void bnxt_re_ring_srq_arm(struct bnxt_re_srq *srq);
 void bnxt_re_ring_srq_db(struct bnxt_re_srq *srq);
 void bnxt_re_ring_cq_db(struct bnxt_re_cq *cq);
 void bnxt_re_ring_cq_arm_db(struct bnxt_re_cq *cq, uint8_t aflag);
@@ -182,6 +189,11 @@ static inline struct bnxt_re_qp *to_bnxt_re_qp(struct ibv_qp *ibvqp)
 	return container_of(ibvqp, struct bnxt_re_qp, ibvqp);
 }
 
+static inline struct bnxt_re_srq *to_bnxt_re_srq(struct ibv_srq *ibvsrq)
+{
+	return container_of(ibvsrq, struct bnxt_re_srq, ibvsrq);
+}
+
 static inline struct bnxt_re_ah *to_bnxt_re_ah(struct ibv_ah *ibvah)
 {
         return container_of(ibvah, struct bnxt_re_ah, ibvah);
@@ -211,6 +223,18 @@ static inline uint32_t bnxt_re_get_rqe_hdr_sz(void)
 	return sizeof(struct bnxt_re_brqe) + sizeof(struct bnxt_re_rqe);
 }
 
+static inline uint32_t bnxt_re_get_srqe_hdr_sz(void)
+{
+	return sizeof(struct bnxt_re_brqe) + sizeof(struct bnxt_re_srqe);
+}
+
+static inline uint32_t bnxt_re_get_srqe_sz(void)
+{
+	return sizeof(struct bnxt_re_brqe) +
+	       sizeof(struct bnxt_re_srqe) +
+	       BNXT_RE_MAX_INLINE_SIZE;
+}
+
 static inline uint32_t bnxt_re_get_cqe_sz(void)
 {
 	return sizeof(struct bnxt_re_req_cqe) + sizeof(struct bnxt_re_bcqe);
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index bc06386..b626751 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -339,36 +339,40 @@ static uint8_t bnxt_re_poll_scqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
 static int bnxt_re_poll_err_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc,
 				 struct bnxt_re_bcqe *hdr, void *cqe)
 {
-	struct bnxt_re_queue *rq = qp->rqq;
+	struct bnxt_re_queue *rq;
 	struct bnxt_re_wrid *rwrid;
 	struct bnxt_re_cq *rcq;
 	struct bnxt_re_context *cntx;
-	uint32_t head = rq->head;
 	uint8_t status;
 
 	rcq = to_bnxt_re_cq(qp->ibvqp.recv_cq);
 	cntx = to_bnxt_re_context(rcq->ibvcq.context);
 
-	rwrid = &qp->rwrid[head];
+	if (!qp->srq) {
+		rq = qp->rqq;
+		rwrid = &qp->rwrid[rq->head];
+	} else {
+		rq = qp->srq->srqq;
+		rwrid = &qp->srq->srwrid[rq->head];
+	}
+
 	status = (hdr->flg_st_typ_ph >> BNXT_RE_BCQE_STATUS_SHIFT) &
 		  BNXT_RE_BCQE_STATUS_MASK;
 	/* skip h/w flush errors */
 	if (status == BNXT_RE_RSP_ST_HW_FLUSH)
 		return 0;
+
 	ibvwc->status = bnxt_re_to_ibv_wc_status(status, false);
-	/* TODO: Add SRQ Processing here */
-	if (qp->rqq) {
-		ibvwc->wr_id = rwrid->wrid;
-		ibvwc->qp_num = qp->qpid;
-		ibvwc->opcode = IBV_WC_RECV;
-		ibvwc->byte_len = 0;
-		ibvwc->wc_flags = 0;
-		if (qp->qptyp == IBV_QPT_UD)
-			ibvwc->src_qp = 0;
+	ibvwc->wr_id = rwrid->wrid;
+	ibvwc->qp_num = qp->qpid;
+	ibvwc->opcode = IBV_WC_RECV;
+	ibvwc->byte_len = 0;
+	ibvwc->wc_flags = 0;
+	if (qp->qptyp == IBV_QPT_UD)
+		ibvwc->src_qp = 0;
+	bnxt_re_incr_head(rq);
 
-		bnxt_re_incr_head(qp->rqq);
-		if (qp->qpst != IBV_QPS_ERR)
-			qp->qpst = IBV_QPS_ERR;
+	if (!qp->srq) {
 		pthread_spin_lock(&cntx->fqlock);
 		bnxt_re_fque_add_node(&rcq->rfhead, &qp->rnode);
 		pthread_spin_unlock(&cntx->fqlock);
@@ -396,14 +400,19 @@ static void bnxt_re_poll_success_rcqe(struct bnxt_re_qp *qp,
 				      struct ibv_wc *ibvwc,
 				      struct bnxt_re_bcqe *hdr, void *cqe)
 {
-	struct bnxt_re_queue *rq = qp->rqq;
+	struct bnxt_re_queue *rq;
 	struct bnxt_re_wrid *rwrid;
 	struct bnxt_re_rc_cqe *rcqe;
-	uint32_t head = rq->head;
 	uint8_t flags, is_imm, is_rdma;
 
 	rcqe = cqe;
-	rwrid = &qp->rwrid[head];
+	if (!qp->srq) {
+		rq = qp->rqq;
+		rwrid = &qp->rwrid[rq->head];
+	} else {
+		rq = qp->srq->srqq;
+		rwrid = &qp->srq->srwrid[rq->head];
+	}
 
 	ibvwc->status = IBV_WC_SUCCESS;
 	ibvwc->wr_id = rwrid->wrid;
@@ -512,9 +521,6 @@ static int bnxt_re_poll_one(struct bnxt_re_cq *cq, int nwc, struct ibv_wc *wc)
 			qp = (struct bnxt_re_qp *)rcqe->qp_handle;
 			if (!qp)
 				break; /*stale cqe. should be rung.*/
-			if (qp->srq)
-				goto bail; /*TODO: Add SRQ poll */
-
 			pcqe = bnxt_re_poll_rcqe(qp, wc, cqe, &cnt);
 			break;
 		case BNXT_RE_WC_TYPE_RECV_RAW:
@@ -554,7 +560,7 @@ skipp_real:
 
 	if (hw_polled)
 		bnxt_re_ring_cq_db(cq);
-bail:
+
 	return dqed;
 }
 
@@ -752,9 +758,7 @@ static int bnxt_re_alloc_queue_ptr(struct bnxt_re_qp *qp,
 	qp->sqq = calloc(1, sizeof(struct bnxt_re_queue));
 	if (!qp->sqq)
 		return -ENOMEM;
-	if (attr->srq)
-		qp->srq = NULL;/*TODO: to_bnxt_re_srq(attr->srq);*/
-	else {
+	if (!attr->srq) {
 		qp->rqq = calloc(1, sizeof(struct bnxt_re_queue));
 		if (!qp->rqq) {
 			free(qp->sqq);
@@ -767,10 +771,12 @@ static int bnxt_re_alloc_queue_ptr(struct bnxt_re_qp *qp,
 
 static void bnxt_re_free_queues(struct bnxt_re_qp *qp)
 {
-	if (qp->rwrid)
-		free(qp->rwrid);
-	pthread_spin_destroy(&qp->rqq->qlock);
-	bnxt_re_free_aligned(qp->rqq);
+	if (qp->rqq) {
+		if (qp->rwrid)
+			free(qp->rwrid);
+		pthread_spin_destroy(&qp->rqq->qlock);
+		bnxt_re_free_aligned(qp->rqq);
+	}
 
 	if (qp->swrid)
 		free(qp->swrid);
@@ -881,6 +887,8 @@ struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd,
 	qp->qpst = IBV_QPS_RESET;
 	qp->scq = to_bnxt_re_cq(attr->send_cq);
 	qp->rcq = to_bnxt_re_cq(attr->recv_cq);
+	if (attr->srq)
+		qp->srq = to_bnxt_re_srq(attr->srq);
 	qp->udpi = &cntx->udpi;
 	/* Save/return the altered Caps. */
 	attr->cap.max_send_wr = cap->max_swr;
@@ -1323,32 +1331,217 @@ int bnxt_re_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
 	return 0;
 }
 
+static void bnxt_re_srq_free_queue_ptr(struct bnxt_re_srq *srq)
+{
+	if (srq && srq->srqq)
+		free(srq->srqq);
+	if (srq)
+		free(srq);
+}
+
+static struct bnxt_re_srq *bnxt_re_srq_alloc_queue_ptr(void)
+{
+	struct bnxt_re_srq *srq;
+
+	srq = calloc(1, sizeof(struct bnxt_re_srq));
+	if (!srq)
+		return NULL;
+
+	srq->srqq = calloc(1, sizeof(struct bnxt_re_queue));
+	if (!srq->srqq) {
+		free(srq);
+		return NULL;
+	}
+
+	return srq;
+}
+
+static void bnxt_re_srq_free_queue(struct bnxt_re_srq *srq)
+{
+	if (srq->srwrid)
+		free(srq->srwrid);
+	pthread_spin_destroy(&srq->srqq->qlock);
+	bnxt_re_free_aligned(srq->srqq);
+}
+
+static int bnxt_re_srq_alloc_queue(struct bnxt_re_srq *srq,
+				   struct ibv_srq_init_attr *attr,
+				   uint32_t pg_size)
+{
+	struct bnxt_re_queue *que;
+	int ret;
+
+	que = srq->srqq;
+	que->depth = roundup_pow_of_two(attr->attr.max_wr + 1);
+	que->stride = bnxt_re_get_srqe_sz();
+	ret = bnxt_re_alloc_aligned(que, pg_size);
+	if (ret)
+		goto bail;
+	pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE);
+	/* For SRQ only bnxt_re_wrid.wrid is used. */
+	srq->srwrid = calloc(que->depth, sizeof(struct bnxt_re_wrid));
+	if (!srq->srwrid) {
+		ret = -ENOMEM;
+		goto bail;
+	}
+	/*TODO: update actual max depth. */
+	return 0;
+bail:
+	bnxt_re_srq_free_queue(srq);
+	return ret;
+}
+
 struct ibv_srq *bnxt_re_create_srq(struct ibv_pd *ibvpd,
 				   struct ibv_srq_init_attr *attr)
 {
+	struct bnxt_re_srq *srq;
+	struct bnxt_re_srq_req cmd;
+	struct bnxt_re_srq_resp resp;
+	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvpd->context);
+	struct bnxt_re_dev *dev = to_bnxt_re_dev(cntx->ibvctx.device);
+	int ret;
+
+	/*TODO: Check max limit on queue depth and sge.*/
+	srq = bnxt_re_srq_alloc_queue_ptr();
+	if (!srq)
+		goto fail;
+
+	if (bnxt_re_srq_alloc_queue(srq, attr, dev->pg_size))
+		goto fail;
+
+	cmd.srqva = (uint64_t)srq->srqq->va;
+	cmd.srq_handle = (uint64_t)srq;
+	ret = ibv_cmd_create_srq(ibvpd, &srq->ibvsrq, attr,
+				 &cmd.cmd, sizeof(cmd),
+				 &resp.resp, sizeof(resp));
+	if (ret)
+		goto fail;
+
+	srq->srqid = resp.srqid;
+	srq->udpi = &cntx->udpi;
+	srq->cap.max_wr = srq->srqq->depth;
+	srq->cap.max_sge = attr->attr.max_sge;
+	srq->cap.srq_limit = attr->attr.srq_limit;
+	srq->pre_count = 0;
+
+	return &srq->ibvsrq;
+fail:
+	bnxt_re_srq_free_queue_ptr(srq);
 	return NULL;
 }
 
 int bnxt_re_modify_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr,
-		       int init_attr)
+		       int attr_mask)
 {
-	return -ENOSYS;
+	struct bnxt_re_srq *srq = to_bnxt_re_srq(ibvsrq);
+	struct ibv_modify_srq cmd;
+	int status = 0;
+
+	status =  ibv_cmd_modify_srq(ibvsrq, attr, attr_mask,
+				     &cmd, sizeof(cmd));
+	if (!status && ((attr_mask & IBV_SRQ_LIMIT) &&
+			(srq->cap.srq_limit != attr->srq_limit))) {
+		srq->cap.srq_limit = attr->srq_limit;
+	}
+
+	return status;
 }
 
 int bnxt_re_destroy_srq(struct ibv_srq *ibvsrq)
 {
-	return -ENOSYS;
+	struct bnxt_re_srq *srq = to_bnxt_re_srq(ibvsrq);
+	int ret;
+
+	ret = ibv_cmd_destroy_srq(ibvsrq);
+	if (ret)
+		return ret;
+	bnxt_re_srq_free_queue(srq);
+	bnxt_re_srq_free_queue_ptr(srq);
+
+	return 0;
 }
 
 int bnxt_re_query_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr)
 {
-	return -ENOSYS;
+	struct ibv_query_srq cmd;
+	int status;
+
+	status = ibv_cmd_query_srq(ibvsrq, attr, &cmd, sizeof(cmd));
+	if (status)
+		return status;
+
+	return 0;
+}
+
+static int bnxt_re_build_srqe(struct bnxt_re_srq *srq,
+			      struct ibv_recv_wr *wr, void *srqe)
+{
+	struct bnxt_re_brqe *hdr = srqe;
+	struct bnxt_re_rqe *rwr;
+	struct bnxt_re_sge *sge;
+	struct bnxt_re_wrid *wrid;
+	int wqe_sz, len;
+
+	rwr = (srqe + sizeof(struct bnxt_re_brqe));
+	sge = (srqe + bnxt_re_get_srqe_hdr_sz());
+	wrid = &srq->srwrid[srq->srqq->tail];
+
+	len = bnxt_re_build_sge(sge, wr->sg_list, wr->num_sge, false);
+	hdr->rsv_ws_fl_wt = BNXT_RE_WR_OPCD_RECV;
+	wqe_sz = wr->num_sge + (bnxt_re_get_srqe_hdr_sz() >> 4); /* 16B align */
+	hdr->rsv_ws_fl_wt |= ((wqe_sz & BNXT_RE_HDR_WS_MASK) <<
+			       BNXT_RE_HDR_WS_SHIFT);
+	rwr->wrid = srq->srqq->tail;
+
+	/* Fill wrid */
+	wrid->wrid = wr->wr_id;
+	wrid->bytes = len; /* N.A. for RQE */
+	wrid->sig = 0; /* N.A. for RQE */
+
+	return len;
 }
 
 int bnxt_re_post_srq_recv(struct ibv_srq *ibvsrq, struct ibv_recv_wr *wr,
 			  struct ibv_recv_wr **bad)
 {
-	return -ENOSYS;
+	struct bnxt_re_srq *srq = to_bnxt_re_srq(ibvsrq);
+	struct bnxt_re_queue *rq = srq->srqq;
+	void *srqe;
+	int ret;
+
+	pthread_spin_lock(&rq->qlock);
+	while (wr) {
+		if (bnxt_re_is_que_full(rq) ||
+		    wr->num_sge > srq->cap.max_sge) {
+			*bad = wr;
+			pthread_spin_unlock(&rq->qlock);
+			return ENOMEM;
+		}
+
+		srqe = (void *)(rq->va + (rq->tail * rq->stride));
+		memset(srqe, 0, bnxt_re_get_srqe_sz());
+		ret = bnxt_re_build_srqe(srq, wr, srqe);
+		if (ret < 0) {
+			pthread_spin_unlock(&rq->qlock);
+			*bad = wr;
+			return ENOMEM;
+		}
+
+		bnxt_re_host_to_le64((uint64_t *)srqe, rq->stride);
+		bnxt_re_incr_tail(rq);
+		wr = wr->next;
+
+		wmb(); /* write barrier */
+		bnxt_re_ring_srq_db(srq);
+		if ((srq->pre_count < srq->srqq->depth) &&
+		    (++srq->pre_count > srq->cap.srq_limit)) {
+			srq->pre_count = srq->srqq->depth;
+			bnxt_re_ring_srq_arm(srq);
+		}
+	}
+	pthread_spin_unlock(&rq->qlock);
+
+	return 0;
 }
 
 struct ibv_ah *bnxt_re_create_ah(struct ibv_pd *ibvpd, struct ibv_ah_attr *attr)
-- 
1.8.3.1
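
The arming condition at the end of bnxt_re_post_srq_recv() in the patch
above is compact, so here is a standalone sketch of just that counter
logic. The struct and function names are hypothetical, and arm_count
merely stands in for the call to bnxt_re_ring_srq_arm(); this is an
illustration of what the posted code does, not a replacement for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down SRQ state; not the driver's struct. */
struct demo_srq {
	uint32_t depth;      /* number of slots in the queue */
	uint32_t srq_limit;  /* threshold requested by the application */
	uint32_t pre_count;  /* posts counted before the one-time arm */
	uint32_t arm_count;  /* how many times the arm doorbell was rung */
};

/*
 * Called once per posted receive. Returns true when the arm doorbell
 * would be rung for this post.
 */
static bool demo_srq_note_post(struct demo_srq *srq)
{
	if (srq->pre_count < srq->depth &&
	    ++srq->pre_count > srq->srq_limit) {
		/* Saturate the counter so the first test fails from now on. */
		srq->pre_count = srq->depth;
		srq->arm_count++;
		return true;
	}
	return false;
}
```

With depth 8 and srq_limit 2, the third post rings the arm doorbell and
no later post does, because pre_count is pinned at depth afterwards.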


* Re: [rdma-core v2 0/9] Broadcom User Space RoCE Driver
       [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2017-02-18 15:43   ` [rdma-core v2 9/9] libbnxt_re: Add support for SRQ in user lib Devesh Sharma
@ 2017-02-21 19:50   ` Jason Gunthorpe
       [not found]     ` <20170221195053.GG13138-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  9 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2017-02-21 19:50 UTC (permalink / raw)
  To: Devesh Sharma; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Sat, Feb 18, 2017 at 10:43:49AM -0500, Devesh Sharma wrote:
> This series introduces the user space RoCE driver for the Broadcom
> NetXtreme-E 10/25/40/50 RDMA Ethernet Controller. This driver
> is dependent on the bnxt_re driver posted earlier to linux-rdma
> community and is under review.
> 
> This patch series is based on the latest master of rdma-core
> repository hosted at https://github.com/linux-rdma/rdma-core.git
> 
> The GIT for this library is hosted at following URL on github
> https://github.com/dsharma283/bnxtre-rdma-core.git
> branch: bnxtre-v2
> 
> Please review and give your valuable feedback for the betterment.
> 
> v1->v2

It still doesn't compile; please make sure everything is travis clean
before submitting.

$ buildlib/cbuild pkg travis
[59/172] Building C object providers/hns/CMakeFiles/hns-rdmav2.dir/hns_roce_u_hw_v1.c.o
FAILED: /usr/bin/gcc-6  -Dbnxt_re_rdmav2_EXPORTS -Werror -m32  -std=gnu11 -Wall -Wextra -Wno-sign-compare -Wno-unused-parameter -Wmissing-prototypes -Wmissing-declarations -Wwrite-strings -Wformat=2 -Wshadow -Wstrict-prototypes -Wold-style-definition -Wredundant-decls -O2 -g  -fPIC -Iinclude -MMD -MT providers/bnxt_re/CMakeFiles/bnxt_re-rdmav2.dir/verbs.c.o -MF "providers/bnxt_re/CMakeFiles/bnxt_re-rdmav2.dir/verbs.c.o.d" -o providers/bnxt_re/CMakeFiles/bnxt_re-rdmav2.dir/verbs.c.o   -c ../providers/bnxt_re/verbs.c
../providers/bnxt_re/verbs.c: In function 'bnxt_re_reg_mr':
../providers/bnxt_re/verbs.c:143:38: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  if (ibv_cmd_reg_mr(ibvpd, sva, len, (uint64_t)sva, access, &mr->ibvmr,
                                      ^
../providers/bnxt_re/verbs.c: In function 'bnxt_re_create_cq':
../providers/bnxt_re/verbs.c:191:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  cmd.cq_va = (uint64_t)cq->cqq.va;
              ^
../providers/bnxt_re/verbs.c:192:18: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  cmd.cq_handle = (uint64_t)cq;
                  ^
../providers/bnxt_re/verbs.c: In function 'bnxt_re_poll_one':
../providers/bnxt_re/verbs.c:512:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    qp = (struct bnxt_re_qp *)scqe->qp_handle;
         ^
../providers/bnxt_re/verbs.c:521:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    qp = (struct bnxt_re_qp *)rcqe->qp_handle;
         ^
../providers/bnxt_re/verbs.c:531:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    qp = (struct bnxt_re_qp *)scqe->qp_handle;
         ^
../providers/bnxt_re/verbs.c: In function 'bnxt_re_cleanup_cq':
../providers/bnxt_re/verbs.c:689:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
    if (scqe->qp_handle == (uint64_t)qp)
                           ^
../providers/bnxt_re/verbs.c:693:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
    if (rcqe->qp_handle == (uint64_t)qp)
                           ^
../providers/bnxt_re/verbs.c: In function 'bnxt_re_create_qp':
../providers/bnxt_re/verbs.c:876:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  req.qpsva = (uint64_t)qp->sqq->va;
              ^
../providers/bnxt_re/verbs.c:877:24: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  req.qprva = qp->rqq ? (uint64_t)qp->rqq->va : 0;
                        ^
../providers/bnxt_re/verbs.c:878:18: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  req.qp_handle = (uint64_t)qp;
                  ^
../providers/bnxt_re/verbs.c: In function 'bnxt_re_build_sge':
../providers/bnxt_re/verbs.c:1014:16: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    memcpy(dst, (void *)sg_list[indx].addr,
                ^
../providers/bnxt_re/verbs.c: In function 'bnxt_re_create_srq':
../providers/bnxt_re/verbs.c:1412:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  cmd.srqva = (uint64_t)srq->srqq->va;
              ^
../providers/bnxt_re/verbs.c:1413:19: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  cmd.srq_handle = (uint64_t)srq;
                   ^
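
The errors above come from the -m32 build, where a pointer is 32 bits
wide and a direct cast to or from uint64_t changes size. One common way
to make such casts clean on both 32-bit and 64-bit builds is to go
through uintptr_t. The sketch below uses hypothetical helper names and
is not taken from the driver or from rdma-core.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Widen a pointer to a 64-bit ABI field. The intermediate uintptr_t
 * cast has the same size as the pointer, so the compiler has nothing
 * to warn about; the second cast is an ordinary integer widening.
 */
static inline uint64_t demo_ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/*
 * Recover a pointer from a 64-bit handle. On a 32-bit build the upper
 * half is discarded, which is safe only if the value was produced by
 * demo_ptr_to_u64() in the same process.
 */
static inline void *demo_u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

With helpers of this kind, an assignment such as cmd.cq_handle =
(uint64_t)cq would become cmd.cq_handle = demo_ptr_to_u64(cq), and the
qp_handle lookups in the poll path would use demo_u64_to_ptr().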


Hum, you might want to put the wmb() into bnxt_re_ring_db:

                bnxt_re_incr_tail(sq);
                wr = wr->next;
                wmb(); /* write barrier */

                bnxt_re_ring_sq_db(qp);

More likely to be universally correct that way.
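
A rough sketch of that arrangement, under loud assumptions: the names
are hypothetical, the doorbell register is an ordinary variable rather
than mapped device memory, and the C11 release fence is only a stand-in
for the provider's wmb(), which is what a real MMIO doorbell needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_DEPTH 4

static uint64_t demo_ring[DEMO_DEPTH];   /* stand-in for the WQE ring */
static uint32_t demo_tail;               /* producer index */
static volatile uint64_t demo_db_reg;    /* stand-in for the doorbell */

/*
 * The barrier lives inside the doorbell routine, so every caller gets
 * the ordering without having to remember it at each call site.
 */
static void demo_ring_db(uint32_t tail)
{
	/* Stand-in for wmb(): WQE stores must be visible before this. */
	atomic_thread_fence(memory_order_release);
	demo_db_reg = tail;
}

/* Post one entry: fill the slot, advance the tail, ring the doorbell. */
static void demo_post(uint64_t wqe)
{
	demo_ring[demo_tail % DEMO_DEPTH] = wqe;
	demo_tail++;
	demo_ring_db(demo_tail);
}
```

The post path then contains no barrier of its own, so a new caller of
the doorbell routine cannot forget it.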

There is only one rmb() in this driver, and it seems to be in a wonky
place. There is no reason to have a barrier after a kernel syscall; if
a barrier is needed there, it belongs in the kernel code.

It looks like an rmb() is missing after the calls to
bnxt_re_is_cqe_valid()?
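
To illustrate the ordering in question, here is a minimal consumer-side
sketch. Everything in it is an assumption made for illustration: the
CQE layout, the phase bit in bit 0 of flags, and the function names are
hypothetical, and the C11 acquire fence is only a stand-in for rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CQE layout: bit 0 of flags is the phase (valid) bit. */
struct demo_cqe {
	uint64_t payload;
	uint32_t flags;
};

/* A CQE is valid when its phase bit matches the consumer's phase. */
static bool demo_cqe_valid(const volatile struct demo_cqe *cqe,
			   uint32_t phase)
{
	return (cqe->flags & 1u) == phase;
}

/* Returns true and fills *out only when a valid CQE was consumed. */
static bool demo_poll(const volatile struct demo_cqe *cqe,
		      uint32_t phase, uint64_t *out)
{
	if (!demo_cqe_valid(cqe, phase))
		return false;

	/*
	 * Stand-in for rmb(): do not read the rest of the CQE until the
	 * valid bit has been observed, otherwise stale fields could be
	 * read ahead of the check.
	 */
	atomic_thread_fence(memory_order_acquire);

	*out = cqe->payload;
	return true;
}
```

The point is only the position of the barrier: after the validity
check and before any other field of the entry is read.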

Jason

* Re: [rdma-core v2 0/9] Broadcom User Space RoCE Driver
       [not found]     ` <20170221195053.GG13138-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-02-22  3:22       ` Devesh Sharma
       [not found]         ` <CANjDDBi7bvYMvnjgsdSXzj6ot2wJS+w1nRCp-26TgEEpE38vfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Devesh Sharma @ 2017-02-22  3:22 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma

On Wed, Feb 22, 2017 at 1:20 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>
> On Sat, Feb 18, 2017 at 10:43:49AM -0500, Devesh Sharma wrote:
> > This series introduces the user space RoCE driver for the Broadcom
> > NetXtreme-E 10/25/40/50 RDMA Ethernet Controller. This driver
> > is dependent on the bnxt_re driver posted earlier to linux-rdma
> > community and is under review.
> >
> > This patch series is based on the latest master of rdma-core
> > repository hosted at https://github.com/linux-rdma/rdma-core.git
> >
> > The GIT for this library is hosted at following URL on github
> > https://github.com/dsharma283/bnxtre-rdma-core.git
> > branch: bnxtre-v2
> >
> > Please review and give your valuable feedback for the betterment.
> >
> > v1->v2
>
> It still doesn't compile, please make sure everything is travis clean
> before submitting..

I could not find the travis package, so I could not try it before
submitting v2. I will give it another shot.

>
> $ buildlib/cbuild pkg travis
> [59/172] Building C object providers/hns/CMakeFiles/hns-rdmav2.dir/hns_roce_u_hw_v1.c.o
> FAILED: /usr/bin/gcc-6  -Dbnxt_re_rdmav2_EXPORTS -Werror -m32  -std=gnu11 -Wall -Wextra -Wno-sign-compare -Wno-unused-parameter -Wmissing-prototypes -Wmissing-declarations -Wwrite-strings -Wformat=2 -Wshadow -Wstrict-prototypes -Wold-style-definition -Wredundant-decls -O2 -g  -fPIC -Iinclude -MMD -MT providers/bnxt_re/CMakeFiles/bnxt_re-rdmav2.dir/verbs.c.o -MF "providers/bnxt_re/CMakeFiles/bnxt_re-rdmav2.dir/verbs.c.o.d" -o providers/bnxt_re/CMakeFiles/bnxt_re-rdmav2.dir/verbs.c.o   -c ../providers/bnxt_re/verbs.c
> ../providers/bnxt_re/verbs.c: In function 'bnxt_re_reg_mr':
> ../providers/bnxt_re/verbs.c:143:38: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   if (ibv_cmd_reg_mr(ibvpd, sva, len, (uint64_t)sva, access, &mr->ibvmr,
>                                       ^
> ../providers/bnxt_re/verbs.c: In function 'bnxt_re_create_cq':
> ../providers/bnxt_re/verbs.c:191:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   cmd.cq_va = (uint64_t)cq->cqq.va;
>               ^
> ../providers/bnxt_re/verbs.c:192:18: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   cmd.cq_handle = (uint64_t)cq;
>                   ^
> ../providers/bnxt_re/verbs.c: In function 'bnxt_re_poll_one':
> ../providers/bnxt_re/verbs.c:512:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
>     qp = (struct bnxt_re_qp *)scqe->qp_handle;
>          ^
> ../providers/bnxt_re/verbs.c:521:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
>     qp = (struct bnxt_re_qp *)rcqe->qp_handle;
>          ^
> ../providers/bnxt_re/verbs.c:531:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
>     qp = (struct bnxt_re_qp *)scqe->qp_handle;
>          ^
> ../providers/bnxt_re/verbs.c: In function 'bnxt_re_cleanup_cq':
> ../providers/bnxt_re/verbs.c:689:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>     if (scqe->qp_handle == (uint64_t)qp)
>                            ^
> ../providers/bnxt_re/verbs.c:693:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>     if (rcqe->qp_handle == (uint64_t)qp)
>                            ^
> ../providers/bnxt_re/verbs.c: In function 'bnxt_re_create_qp':
> ../providers/bnxt_re/verbs.c:876:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   req.qpsva = (uint64_t)qp->sqq->va;
>               ^
> ../providers/bnxt_re/verbs.c:877:24: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   req.qprva = qp->rqq ? (uint64_t)qp->rqq->va : 0;
>                         ^
> ../providers/bnxt_re/verbs.c:878:18: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   req.qp_handle = (uint64_t)qp;
>                   ^
> ../providers/bnxt_re/verbs.c: In function 'bnxt_re_build_sge':
> ../providers/bnxt_re/verbs.c:1014:16: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
>     memcpy(dst, (void *)sg_list[indx].addr,
>                 ^
> ../providers/bnxt_re/verbs.c: In function 'bnxt_re_create_srq':
> ../providers/bnxt_re/verbs.c:1412:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   cmd.srqva = (uint64_t)srq->srqq->va;
>               ^
> ../providers/bnxt_re/verbs.c:1413:19: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>   cmd.srq_handle = (uint64_t)srq;
>                    ^
>
>
> Hum, you might want to put the wmb() into bnxt_re_ring_db:
>
>                 bnxt_re_incr_tail(sq);
>                 wr = wr->next;
>                 wmb(); /* write barrier */

[DS]: That is correct. I was probably waiting for this comment before
moving it; I will fold it into v3.

>
>                 bnxt_re_ring_sq_db(qp);
>
> More likely to be universally correct that way.
>
> There is only one rmb() in this driver and it seems in a wonky
> place. There is no reason to have a barrier after a kernel syscall, if
> a barrier is needed there it belongs in the kernel code.
>
[DS]: Will remove it.

> The rmb() is missing after calls to bnxt_re_is_cqe_valid() it looks
> like?

[DS]: True, it is missing. I will correct it.
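
A minimal sketch of the pattern under discussion: check the valid
indication first, then issue a read barrier before touching the rest of
the entry. The sketch_cqe layout and sketch_poll_one() are hypothetical
and are not taken from the driver. The C11 acquire fence is only a
stand-in for rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical completion entry; the producer sets 'valid' last. */
struct sketch_cqe {
	uint32_t payload;
	uint32_t valid;
};

/*
 * Returns true and copies the payload only if the entry is valid.
 * The fence keeps the payload load from being reordered ahead of the
 * load of the valid flag.
 */
static bool sketch_poll_one(const struct sketch_cqe *cqe, uint32_t *out)
{
	if (!*(const volatile uint32_t *)&cqe->valid)
		return false;

	atomic_thread_fence(memory_order_acquire); /* stand-in for rmb() */
	*out = cqe->payload;
	return true;
}
```

Without the barrier, a weakly ordered CPU could read the payload before
the valid flag and hand stale data to the caller.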

>
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [rdma-core v2 0/9] Broadcom User Space RoCE Driver
       [not found]         ` <CANjDDBi7bvYMvnjgsdSXzj6ot2wJS+w1nRCp-26TgEEpE38vfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-02-22  5:08           ` Jason Gunthorpe
       [not found]             ` <20170222050804.GA29755-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2017-02-22  5:08 UTC (permalink / raw)
  To: Devesh Sharma; +Cc: linux-rdma

On Wed, Feb 22, 2017 at 08:52:17AM +0530, Devesh Sharma wrote:
> > It still doesn't compile, please make sure everything is travis clean
> > before submitting..
> 
> I could not find the travis package, so I could not try it before
> submitting v2. I will give it another shot.

To run travis you need to install docker and run

$ buildlib/cbuild build-images travis

Then you can run:

> > $ buildlib/cbuild pkg travis

Which emulates travis locally.

Real travis only runs once you create a PR in github.

Jason

* Re: [rdma-core v2 0/9] Broadcom User Space RoCE Driver
       [not found]             ` <20170222050804.GA29755-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-02-24 17:40               ` Devesh Sharma
  0 siblings, 0 replies; 14+ messages in thread
From: Devesh Sharma @ 2017-02-24 17:40 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma

On Wed, Feb 22, 2017 at 10:38 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Wed, Feb 22, 2017 at 08:52:17AM +0530, Devesh Sharma wrote:
>> > It still doesn't compile, please make sure everything is travis clean
>> > before submitting..
>>
>> I could not find the travis package, so I could not try it before
>> submitting v2. I will give it another shot.
>
> To run travis you need to install docker and run
>
> $ buildlib/cbuild build-images travis
>
> Then you can run:
>
>> > $ buildlib/cbuild pkg travis
>
> Which emulates travis locally.
>
> Real travis only runs once you create a PR in github.
>

Thanks Jason, I will try it out and submit v3 of this series.

> Jason

Thread overview: 14+ messages
2017-02-18 15:43 [rdma-core v2 0/9] Broadcom User Space RoCE Driver Devesh Sharma
     [not found] ` <1487432638-19607-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-02-18 15:43   ` [rdma-core v2 1/9] libbnxt_re: introduce bnxtre user space RDMA provider Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 2/9] libbnxt_re: Add support for user memory regions Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 3/9] libbnxt_re: Add support for CQ and QP management Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 4/9] libbnxt_re: Add support for posting and polling Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 5/9] libbnxt_re: Allow apps to poll for flushed completions Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 6/9] libbnxt_re: Enable UD control path and wqe posting Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 7/9] libbnxt_re: Enable polling for UD completions Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 8/9] libbnxt_re: Add support for atomic operations Devesh Sharma
2017-02-18 15:43   ` [rdma-core v2 9/9] libbnxt_re: Add support for SRQ in user lib Devesh Sharma
2017-02-21 19:50   ` [rdma-core v2 0/9] Broadcom User Space RoCE Driver Jason Gunthorpe
     [not found]     ` <20170221195053.GG13138-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-02-22  3:22       ` Devesh Sharma
     [not found]         ` <CANjDDBi7bvYMvnjgsdSXzj6ot2wJS+w1nRCp-26TgEEpE38vfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-02-22  5:08           ` Jason Gunthorpe
     [not found]             ` <20170222050804.GA29755-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-02-24 17:40               ` Devesh Sharma
