linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
@ 2021-04-06 21:01 Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 01/23] iidc: Introduce iidc.h Shiraz Saleem
                   ` (24 more replies)
  0 siblings, 25 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

The following patch series introduces a unified Intel Ethernet Protocol Driver
for RDMA (irdma) for the X722 iWARP device and a new E810 device which supports
iWARP and RoCEv2. The irdma module replaces the legacy i40iw module for X722
and extends the ABI already defined for i40iw. It is backward compatible with
legacy X722 rdma-core provider (libi40iw).

X722 and E810 are PCI network devices that are RDMA capable. The RDMA block of
this parent device is represented via an auxiliary device exported to 'irdma'
using the core auxiliary bus infrastructure recently added for 5.11 kernel.
The parent PCI netdev drivers 'i40e' and 'ice' register auxiliary RDMA devices
with private data/ops encapsulated that bind to auxiliary drivers registered
in irdma module.

This patchset was initially submitted as an RFC where in we got feedback
to come up with a generic scheme for RDMA drivers to attach to a PCI device
owned by netdev PCI driver [1]. Solutions using platform bus and MFD were explored
but rejected by the community and the consensus was to add a new bus infrastructure
to support this usage model.

Further revisions of this series along with the auxiliary bus were submitted
[2]. At this point, Greg KH requested that we take the auxiliary bus review and
revision process to an internal mailing list and garner the buy-in of a respected
kernel contributor, along with consensus of all major stakeholders including
Nvidia (for mlx5 sub-function use-case) and Intel sound driver. This process
took a while and stalled further development/review of this netdev/irdma series.
The auxiliary bus was eventually merged in 5.11.

Between v1 to v2 of this submission, the IIDC went through a major re-write
based on the feedback and we hope it is now more in alignment with what the
community wants.

This series is built against rdma for-next and currently includes the netdev
patches for ease of review. This includes updates to 'ice' driver to provide
RDMA support and converts 'i40e' driver to use the auxiliary bus infrastructure.
Once the patches are closer to merging, a shared pull request will be submitted.

v3-->v4:
* Fixup W=1 warnings in ice patches
* Fix issues uncovered by pyverbs for create user AH and multicast
* Fix descriptor set issue for fast register introduced during port to FIELD_PREP in v2 submission 

v2-->v3:
* rebase rdma for-next. Adapt to core change '1fb7f8973f51 ("RDMA: Support more than 255 rdma ports")'
* irdma Kconfig updates to conform with linux coding style.
* Fix a 0-day build issue
* Remove rdma_resource_limits selector devlink param. Follow on patch to be submitted
for it with suggestion from Parav to use devlink resource.
* Capitalize abbreviations in ice idc. e.g. 'aux' to 'AUX'

v1-->v2:
* Remove IIDC channel OPs - open, close, peer_register and peer_unregister.
  And all its associated FSM in ice PCI core driver.
* Use device_lock in ice PCI core driver while issuing IIDC ops callbacks.
* Remove peer_* verbiage from shared IIDC header and rename the structs and channel ops
  with iidc_core*/iidc_auxiliary*.
* Allocate ib_device at start and register it at the end of drv.probe() in irdma gen2 auxiliary driver.
* Use ibdev_* printing extensively throughout most of the driver
  Remove idev_to_dev, ihw_to_dev macros as no longer required in new print scheme.
* Do not bump ABI ver. to 6 in irdma. Maintain irdma ABI ver. at 5 for legacy i40iw user-provider compatibility.
* Add a boundary check in irdma_alloc_ucontext to fail binding with < 4 user-space provider version.
* Remove devlink from irdma. Add 2 new rdma-related devlink parameters added to ice PCI core driver.
* Use FIELD_PREP/FIELD_GET/GENMASK on get/set of descriptor fields versus home grown ones LS_*/RS_*.
* Bind 2 separate auxiliary drivers in irdma - one for gen1 and one for gen2 and future devices.
* Misc. driver fixes in irdma

[1] https://patchwork.kernel.org/project/linux-rdma/patch/20190215171107.6464-2-shiraz.saleem@intel.com/
[2] https://lore.kernel.org/linux-rdma/20200520070415.3392210-1-jeffrey.t.kirsher@intel.com/

Dave Ertman (4):
  iidc: Introduce iidc.h
  ice: Initialize RDMA support
  ice: Implement iidc operations
  ice: Register auxiliary device to provide RDMA

Michael J. Ruhl (1):
  RDMA/irdma: Add dynamic tracing for CM

Mustafa Ismail (13):
  RDMA/irdma: Register auxiliary driver and implement private channel
    OPs
  RDMA/irdma: Implement device initialization definitions
  RDMA/irdma: Implement HW Admin Queue OPs
  RDMA/irdma: Add HMC backing store setup functions
  RDMA/irdma: Add privileged UDA queue implementation
  RDMA/irdma: Add QoS definitions
  RDMA/irdma: Add connection manager
  RDMA/irdma: Add PBLE resource manager
  RDMA/irdma: Implement device supported verb APIs
  RDMA/irdma: Add RoCEv2 UD OP support
  RDMA/irdma: Add user/kernel shared libraries
  RDMA/irdma: Add miscellaneous utility definitions
  RDMA/irdma: Add ABI definitions

Shiraz Saleem (5):
  ice: Add devlink params support
  i40e: Prep i40e header for aux bus conversion
  i40e: Register auxiliary devices to provide RDMA
  RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
  RDMA/irdma: Update MAINTAINERS file

 Documentation/ABI/stable/sysfs-class-infiniband    |   20 -
 .../networking/devlink/devlink-params.rst          |    6 +
 Documentation/networking/devlink/ice.rst           |   13 +
 MAINTAINERS                                        |   17 +-
 drivers/infiniband/Kconfig                         |    2 +-
 drivers/infiniband/hw/Makefile                     |    2 +-
 drivers/infiniband/hw/i40iw/Kconfig                |    9 -
 drivers/infiniband/hw/i40iw/Makefile               |    9 -
 drivers/infiniband/hw/i40iw/i40iw.h                |  602 --
 drivers/infiniband/hw/i40iw/i40iw_cm.c             | 4419 --------------
 drivers/infiniband/hw/i40iw/i40iw_cm.h             |  462 --
 drivers/infiniband/hw/i40iw/i40iw_ctrl.c           | 5243 -----------------
 drivers/infiniband/hw/i40iw/i40iw_d.h              | 1746 ------
 drivers/infiniband/hw/i40iw/i40iw_hmc.c            |  821 ---
 drivers/infiniband/hw/i40iw/i40iw_hmc.h            |  241 -
 drivers/infiniband/hw/i40iw/i40iw_hw.c             |  851 ---
 drivers/infiniband/hw/i40iw/i40iw_main.c           | 2066 -------
 drivers/infiniband/hw/i40iw/i40iw_osdep.h          |  195 -
 drivers/infiniband/hw/i40iw/i40iw_p.h              |  129 -
 drivers/infiniband/hw/i40iw/i40iw_pble.c           |  613 --
 drivers/infiniband/hw/i40iw/i40iw_pble.h           |  131 -
 drivers/infiniband/hw/i40iw/i40iw_puda.c           | 1496 -----
 drivers/infiniband/hw/i40iw/i40iw_puda.h           |  188 -
 drivers/infiniband/hw/i40iw/i40iw_register.h       | 1030 ----
 drivers/infiniband/hw/i40iw/i40iw_status.h         |  101 -
 drivers/infiniband/hw/i40iw/i40iw_type.h           | 1358 -----
 drivers/infiniband/hw/i40iw/i40iw_uk.c             | 1200 ----
 drivers/infiniband/hw/i40iw/i40iw_user.h           |  422 --
 drivers/infiniband/hw/i40iw/i40iw_utils.c          | 1518 -----
 drivers/infiniband/hw/i40iw/i40iw_verbs.c          | 2652 ---------
 drivers/infiniband/hw/i40iw/i40iw_verbs.h          |  179 -
 drivers/infiniband/hw/i40iw/i40iw_vf.c             |   85 -
 drivers/infiniband/hw/i40iw/i40iw_vf.h             |   62 -
 drivers/infiniband/hw/i40iw/i40iw_virtchnl.c       |  759 ---
 drivers/infiniband/hw/i40iw/i40iw_virtchnl.h       |  124 -
 drivers/infiniband/hw/irdma/Kconfig                |   11 +
 drivers/infiniband/hw/irdma/Makefile               |   27 +
 drivers/infiniband/hw/irdma/cm.c                   | 4425 ++++++++++++++
 drivers/infiniband/hw/irdma/cm.h                   |  417 ++
 drivers/infiniband/hw/irdma/ctrl.c                 | 6143 ++++++++++++++++++++
 drivers/infiniband/hw/irdma/defs.h                 | 1162 ++++
 drivers/infiniband/hw/irdma/hmc.c                  |  739 +++
 drivers/infiniband/hw/irdma/hmc.h                  |  180 +
 drivers/infiniband/hw/irdma/hw.c                   | 2755 +++++++++
 drivers/infiniband/hw/irdma/i40iw_hw.c             |  216 +
 drivers/infiniband/hw/irdma/i40iw_hw.h             |  160 +
 drivers/infiniband/hw/irdma/i40iw_if.c             |  229 +
 drivers/infiniband/hw/irdma/icrdma_hw.c            |   84 +
 drivers/infiniband/hw/irdma/icrdma_hw.h            |   71 +
 drivers/infiniband/hw/irdma/irdma.h                |  157 +
 drivers/infiniband/hw/irdma/main.c                 |  382 ++
 drivers/infiniband/hw/irdma/main.h                 |  565 ++
 drivers/infiniband/hw/irdma/osdep.h                |   86 +
 drivers/infiniband/hw/irdma/pble.c                 |  525 ++
 drivers/infiniband/hw/irdma/pble.h                 |  136 +
 drivers/infiniband/hw/irdma/protos.h               |  116 +
 drivers/infiniband/hw/irdma/puda.c                 | 1749 ++++++
 drivers/infiniband/hw/irdma/puda.h                 |  194 +
 drivers/infiniband/hw/irdma/status.h               |   71 +
 drivers/infiniband/hw/irdma/trace.c                |  112 +
 drivers/infiniband/hw/irdma/trace.h                |    3 +
 drivers/infiniband/hw/irdma/trace_cm.h             |  458 ++
 drivers/infiniband/hw/irdma/type.h                 | 1717 ++++++
 drivers/infiniband/hw/irdma/uda.c                  |  379 ++
 drivers/infiniband/hw/irdma/uda.h                  |   63 +
 drivers/infiniband/hw/irdma/uda_d.h                |  128 +
 drivers/infiniband/hw/irdma/uk.c                   | 1737 ++++++
 drivers/infiniband/hw/irdma/user.h                 |  464 ++
 drivers/infiniband/hw/irdma/utils.c                | 2543 ++++++++
 drivers/infiniband/hw/irdma/verbs.c                | 4544 +++++++++++++++
 drivers/infiniband/hw/irdma/verbs.h                |  225 +
 drivers/infiniband/hw/irdma/ws.c                   |  406 ++
 drivers/infiniband/hw/irdma/ws.h                   |   41 +
 drivers/net/ethernet/intel/Kconfig                 |    2 +
 drivers/net/ethernet/intel/i40e/i40e.h             |    2 +
 drivers/net/ethernet/intel/i40e/i40e_client.c      |  156 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c        |    1 +
 drivers/net/ethernet/intel/ice/Makefile            |    1 +
 drivers/net/ethernet/intel/ice/ice.h               |   40 +
 drivers/net/ethernet/intel/ice/ice_adminq_cmd.h    |   33 +
 drivers/net/ethernet/intel/ice/ice_common.c        |  217 +-
 drivers/net/ethernet/intel/ice/ice_common.h        |    9 +
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c       |   58 +
 drivers/net/ethernet/intel/ice/ice_dcb_lib.h       |    3 +
 drivers/net/ethernet/intel/ice/ice_devlink.c       |   92 +-
 drivers/net/ethernet/intel/ice/ice_devlink.h       |    5 +
 drivers/net/ethernet/intel/ice/ice_hw_autogen.h    |    3 +-
 drivers/net/ethernet/intel/ice/ice_idc.c           |  793 +++
 drivers/net/ethernet/intel/ice/ice_idc_int.h       |   20 +
 drivers/net/ethernet/intel/ice/ice_lag.c           |    2 +
 drivers/net/ethernet/intel/ice/ice_lib.c           |   11 +
 drivers/net/ethernet/intel/ice/ice_lib.h           |    2 +-
 drivers/net/ethernet/intel/ice/ice_main.c          |  160 +-
 drivers/net/ethernet/intel/ice/ice_sched.c         |   69 +-
 drivers/net/ethernet/intel/ice/ice_switch.c        |   27 +
 drivers/net/ethernet/intel/ice/ice_switch.h        |    4 +
 drivers/net/ethernet/intel/ice/ice_type.h          |    4 +
 drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c   |   34 +
 include/linux/net/intel/i40e_client.h              |   14 +
 include/linux/net/intel/iidc.h                     |  242 +
 include/net/devlink.h                              |    4 +
 include/uapi/rdma/i40iw-abi.h                      |  107 -
 include/uapi/rdma/ib_user_ioctl_verbs.h            |    1 +
 include/uapi/rdma/irdma-abi.h                      |  116 +
 net/core/devlink.c                                 |    5 +
 105 files changed, 35528 insertions(+), 28900 deletions(-)
 delete mode 100644 drivers/infiniband/hw/i40iw/Kconfig
 delete mode 100644 drivers/infiniband/hw/i40iw/Makefile
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_cm.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_cm.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_ctrl.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_d.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_hmc.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_hmc.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_hw.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_main.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_osdep.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_p.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_pble.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_pble.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_puda.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_puda.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_register.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_status.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_type.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_uk.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_user.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_utils.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_verbs.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_verbs.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_vf.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_vf.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_virtchnl.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_virtchnl.h
 create mode 100644 drivers/infiniband/hw/irdma/Kconfig
 create mode 100644 drivers/infiniband/hw/irdma/Makefile
 create mode 100644 drivers/infiniband/hw/irdma/cm.c
 create mode 100644 drivers/infiniband/hw/irdma/cm.h
 create mode 100644 drivers/infiniband/hw/irdma/ctrl.c
 create mode 100644 drivers/infiniband/hw/irdma/defs.h
 create mode 100644 drivers/infiniband/hw/irdma/hmc.c
 create mode 100644 drivers/infiniband/hw/irdma/hmc.h
 create mode 100644 drivers/infiniband/hw/irdma/hw.c
 create mode 100644 drivers/infiniband/hw/irdma/i40iw_hw.c
 create mode 100644 drivers/infiniband/hw/irdma/i40iw_hw.h
 create mode 100644 drivers/infiniband/hw/irdma/i40iw_if.c
 create mode 100644 drivers/infiniband/hw/irdma/icrdma_hw.c
 create mode 100644 drivers/infiniband/hw/irdma/icrdma_hw.h
 create mode 100644 drivers/infiniband/hw/irdma/irdma.h
 create mode 100644 drivers/infiniband/hw/irdma/main.c
 create mode 100644 drivers/infiniband/hw/irdma/main.h
 create mode 100644 drivers/infiniband/hw/irdma/osdep.h
 create mode 100644 drivers/infiniband/hw/irdma/pble.c
 create mode 100644 drivers/infiniband/hw/irdma/pble.h
 create mode 100644 drivers/infiniband/hw/irdma/protos.h
 create mode 100644 drivers/infiniband/hw/irdma/puda.c
 create mode 100644 drivers/infiniband/hw/irdma/puda.h
 create mode 100644 drivers/infiniband/hw/irdma/status.h
 create mode 100644 drivers/infiniband/hw/irdma/trace.c
 create mode 100644 drivers/infiniband/hw/irdma/trace.h
 create mode 100644 drivers/infiniband/hw/irdma/trace_cm.h
 create mode 100644 drivers/infiniband/hw/irdma/type.h
 create mode 100644 drivers/infiniband/hw/irdma/uda.c
 create mode 100644 drivers/infiniband/hw/irdma/uda.h
 create mode 100644 drivers/infiniband/hw/irdma/uda_d.h
 create mode 100644 drivers/infiniband/hw/irdma/uk.c
 create mode 100644 drivers/infiniband/hw/irdma/user.h
 create mode 100644 drivers/infiniband/hw/irdma/utils.c
 create mode 100644 drivers/infiniband/hw/irdma/verbs.c
 create mode 100644 drivers/infiniband/hw/irdma/verbs.h
 create mode 100644 drivers/infiniband/hw/irdma/ws.c
 create mode 100644 drivers/infiniband/hw/irdma/ws.h
 create mode 100644 drivers/net/ethernet/intel/ice/ice_idc.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_idc_int.h
 create mode 100644 include/linux/net/intel/iidc.h
 delete mode 100644 include/uapi/rdma/i40iw-abi.h
 create mode 100644 include/uapi/rdma/irdma-abi.h

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-07 15:44   ` Jason Gunthorpe
  2021-04-07 17:35   ` Jason Gunthorpe
  2021-04-06 21:01 ` [PATCH v4 02/23] ice: Initialize RDMA support Shiraz Saleem
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

From: Dave Ertman <david.m.ertman@intel.com>

Introduce a shared header file used by the 'ice' Intel networking driver
providing RDMA support and the 'irdma' driver to provide a private
interface.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 MAINTAINERS                    |   1 +
 include/linux/net/intel/iidc.h | 242 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 243 insertions(+)
 create mode 100644 include/linux/net/intel/iidc.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a1e3502..27e424e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8955,6 +8955,7 @@ F:	Documentation/networking/device_drivers/ethernet/intel/
 F:	drivers/net/ethernet/intel/
 F:	drivers/net/ethernet/intel/*/
 F:	include/linux/avf/virtchnl.h
+F:	include/linux/net/intel/iidc.h
 
 INTEL FRAMEBUFFER DRIVER (excluding 810 and 815)
 M:	Maik Broemme <mbroemme@libmpq.org>
diff --git a/include/linux/net/intel/iidc.h b/include/linux/net/intel/iidc.h
new file mode 100644
index 0000000..2c360c4
--- /dev/null
+++ b/include/linux/net/intel/iidc.h
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2021, Intel Corporation. */
+
+#ifndef _IIDC_H_
+#define _IIDC_H_
+
+#include <linux/auxiliary_bus.h>
+#include <linux/dcbnl.h>
+#include <linux/device.h>
+#include <linux/if_ether.h>
+#include <linux/kernel.h>
+#include <linux/netdevice.h>
+
+enum iidc_event_type {
+	IIDC_EVENT_BEFORE_MTU_CHANGE,
+	IIDC_EVENT_AFTER_MTU_CHANGE,
+	IIDC_EVENT_BEFORE_TC_CHANGE,
+	IIDC_EVENT_AFTER_TC_CHANGE,
+	IIDC_EVENT_DSCP_MAP_CHANGE,
+	IIDC_EVENT_CRIT_ERR,
+	IIDC_EVENT_NBITS		/* must be last */
+};
+
+enum iidc_res_type {
+	IIDC_INVAL_RES,
+	IIDC_VSI,
+	IIDC_VEB,
+	IIDC_EVENT_Q,
+	IIDC_EGRESS_CMPL_Q,
+	IIDC_CMPL_EVENT_Q,
+	IIDC_ASYNC_EVENT_Q,
+	IIDC_DOORBELL_Q,
+	IIDC_RDMA_QSETS_TXSCHED,
+};
+
+enum iidc_reset_type {
+	IIDC_PFR,
+	IIDC_CORER,
+	IIDC_GLOBR,
+};
+
+enum iidc_rdma_protocol {
+	IIDC_RDMA_PROTOCOL_IWARP,
+	IIDC_RDMA_PROTOCOL_ROCEV2,
+};
+
+/* Struct to hold per DCB APP info */
+struct iidc_dcb_app_info {
+	u8  priority;
+	u8  selector;
+	u16 prot_id;
+};
+
+struct iidc_core_dev_info;
+
+#define IIDC_MAX_USER_PRIORITY		8
+#define IIDC_MAX_APPS			72
+#define IIDC_MAX_DSCP_MAPPING		64
+
+/* Struct to hold per RDMA Qset info */
+struct iidc_rdma_qset_params {
+	u32 teid;	/* Qset TEID */
+	u16 qs_handle; /* RDMA driver provides this */
+	u16 vport_id; /* VSI index */
+	u8 tc; /* TC branch the Qset should belong to */
+	u8 reserved[3];
+};
+
+struct iidc_res_base {
+	/* Union for future provision e.g. other res_type */
+	union {
+		struct iidc_rdma_qset_params qsets;
+	} res;
+};
+
+struct iidc_res {
+	/* Type of resource. */
+	enum iidc_res_type res_type;
+	/* Count requested */
+	u16 cnt_req;
+
+	/* Number of resources allocated. Filled in by callee.
+	 * Based on this value, caller to fill up "resources"
+	 */
+	u16 res_allocated;
+
+	/* Unique handle to resources allocated. Zero if call fails.
+	 * Allocated by callee and for now used by caller for internal
+	 * tracking purpose.
+	 */
+	u32 res_handle;
+
+	/* Peer driver has to allocate sufficient memory, to accommodate
+	 * cnt_requested before calling this function.
+	 * Memory has to be zero initialized. It is input/output param.
+	 * As a result of alloc_res API, this structures will be populated.
+	 */
+	struct iidc_res_base res[1];
+};
+
+struct iidc_qos_info {
+	u64 tc_ctx;
+	u8 rel_bw;
+	u8 prio_type;
+	u8 egress_virt_up;
+	u8 ingress_virt_up;
+};
+
+struct iidc_dscp_map {
+	u16 dscp_val;
+	u8 tc;
+};
+
+/* Struct to hold QoS info */
+struct iidc_qos_params {
+	struct iidc_qos_info tc_info[IEEE_8021QAZ_MAX_TCS];
+	u8 up2tc[IIDC_MAX_USER_PRIORITY];
+	u8 vport_relative_bw;
+	u8 vport_priority_type;
+	u32 num_apps;
+	struct iidc_dcb_app_info apps[IIDC_MAX_APPS];
+	struct iidc_dscp_map dscp_maps[IIDC_MAX_DSCP_MAPPING];
+	u8 num_tc;
+};
+
+union iidc_event_info {
+	/* IIDC_EVENT_AFTER_TC_CHANGE */
+	struct iidc_qos_params port_qos;
+	/* IIDC_EVENT_CRIT_ERR */
+	u32 reg;
+};
+
+struct iidc_event {
+	DECLARE_BITMAP(type, IIDC_EVENT_NBITS);
+	union iidc_event_info info;
+};
+
+/* Following APIs are implemented by core PCI driver */
+struct iidc_core_ops {
+	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
+	 * completion queues, Tx/Rx queues, etc...
+	 */
+	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
+			 struct iidc_res *res,
+			 int partial_acceptable);
+	int (*free_res)(struct iidc_core_dev_info *cdev_info,
+			struct iidc_res *res);
+
+	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
+			     enum iidc_reset_type reset_type);
+
+	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
+				   u16 vport_id, bool enable);
+	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg,
+		       u16 len);
+};
+
+/* Following APIs are implemented by auxiliary drivers and invoked by core PCI
+ * driver
+ */
+struct iidc_auxiliary_ops {
+	/* This event_handler is meant to be a blocking call.  For instance,
+	 * when a BEFORE_MTU_CHANGE event comes in, the event_handler will not
+	 * return until the auxiliary driver is ready for the MTU change to
+	 * happen.
+	 */
+	void (*event_handler)(struct iidc_core_dev_info *cdev_info,
+			      struct iidc_event *event);
+
+	int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id,
+			  u8 *msg, u16 len);
+};
+
+#define IIDC_RDMA_NAME	"intel_rdma"
+#define IIDC_RDMA_ID	0x00000010
+#define IIDC_MAX_NUM_AUX	4
+
+/* The const struct that instantiates cdev_info_id needs to be initialized
+ * in the .c with the macro ASSIGN_IIDC_INFO.
+ * For example:
+ * static const struct cdev_info_id cdev_info_ids[] = ASSIGN_IIDC_INFO;
+ */
+struct cdev_info_id {
+	char *name;
+	int id;
+};
+
+#define IIDC_RDMA_INFO   { .name = IIDC_RDMA_NAME,  .id = IIDC_RDMA_ID },
+
+#define ASSIGN_IIDC_INFO	\
+{				\
+	IIDC_RDMA_INFO		\
+}
+
+#define iidc_priv(x) ((x)->auxiliary_priv)
+
+/* Structure representing auxiliary driver tailored information about the core
+ * PCI dev, each auxiliary driver using the IIDC interface will have an
+ * instance of this struct dedicated to it.
+ */
+struct iidc_core_dev_info {
+	struct pci_dev *pdev; /* PCI device of corresponding to main function */
+	struct auxiliary_device *adev;
+	/* KVA / Linear address corresponding to BAR0 of underlying
+	 * pci_device.
+	 */
+	u8 __iomem *hw_addr;
+	int cdev_info_id;
+
+	u8 ftype;	/* PF(false) or VF (true) */
+
+	u16 vport_id;
+	enum iidc_rdma_protocol rdma_protocol;
+
+	struct iidc_qos_params qos_info;
+	struct net_device *netdev;
+
+	struct msix_entry *msix_entries;
+	u16 msix_count; /* How many vectors are reserved for this device */
+
+	/* Following struct contains function pointers to be initialized
+	 * by core PCI driver and called by auxiliary driver
+	 */
+	const struct iidc_core_ops *ops;
+};
+
+struct iidc_auxiliary_dev {
+	struct auxiliary_device adev;
+	struct iidc_core_dev_info *cdev_info;
+};
+
+/* structure representing the auxiliary driver. This struct is to be
+ * allocated and populated by the auxiliary driver's owner. The core PCI
+ * driver will access these ops by performing a container_of on the
+ * auxiliary_device->dev.driver.
+ */
+struct iidc_auxiliary_drv {
+	struct auxiliary_driver adrv;
+	struct iidc_auxiliary_ops *ops;
+};
+
+#endif /* _IIDC_H_*/
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 02/23] ice: Initialize RDMA support
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 01/23] iidc: Introduce iidc.h Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 03/23] ice: Implement iidc operations Shiraz Saleem
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

From: Dave Ertman <david.m.ertman@intel.com>

Probe the device's capabilities to see if it supports RDMA. If so, allocate
and reserve resources to support its operation; populate structures with
initial values.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/net/ethernet/intel/ice/Makefile         |   1 +
 drivers/net/ethernet/intel/ice/ice.h            |  33 ++++
 drivers/net/ethernet/intel/ice/ice_adminq_cmd.h |   1 +
 drivers/net/ethernet/intel/ice/ice_common.c     |  17 +-
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c    |  35 ++++
 drivers/net/ethernet/intel/ice/ice_dcb_lib.h    |   3 +
 drivers/net/ethernet/intel/ice/ice_idc.c        | 231 ++++++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_idc_int.h    |  17 ++
 drivers/net/ethernet/intel/ice/ice_lag.c        |   2 +
 drivers/net/ethernet/intel/ice/ice_lib.c        |  11 ++
 drivers/net/ethernet/intel/ice/ice_lib.h        |   2 +-
 drivers/net/ethernet/intel/ice/ice_main.c       |  99 +++++++++-
 drivers/net/ethernet/intel/ice/ice_type.h       |   1 +
 13 files changed, 445 insertions(+), 8 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_idc.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_idc_int.h

diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
index 73da4f7..5d00352 100644
--- a/drivers/net/ethernet/intel/ice/Makefile
+++ b/drivers/net/ethernet/intel/ice/Makefile
@@ -22,6 +22,7 @@ ice-y := ice_main.o	\
 	 ice_ethtool_fdir.o \
 	 ice_flex_pipe.o \
 	 ice_flow.o	\
+	 ice_idc.o	\
 	 ice_devlink.o	\
 	 ice_fw_update.o \
 	 ice_lag.o	\
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 3577064..ebd2159 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -51,6 +51,7 @@
 #include "ice_switch.h"
 #include "ice_common.h"
 #include "ice_sched.h"
+#include "ice_idc_int.h"
 #include "ice_virtchnl_pf.h"
 #include "ice_sriov.h"
 #include "ice_fdir.h"
@@ -74,6 +75,8 @@
 #define ICE_MIN_LAN_OICR_MSIX	1
 #define ICE_MIN_MSIX		(ICE_MIN_LAN_TXRX_MSIX + ICE_MIN_LAN_OICR_MSIX)
 #define ICE_FDIR_MSIX		1
+#define ICE_RDMA_NUM_AEQ_MSIX	4
+#define ICE_MIN_RDMA_MSIX	2
 #define ICE_NO_VSI		0xffff
 #define ICE_VSI_MAP_CONTIG	0
 #define ICE_VSI_MAP_SCATTER	1
@@ -84,6 +87,7 @@
 #define ICE_MAX_LG_RSS_QS	256
 #define ICE_RES_VALID_BIT	0x8000
 #define ICE_RES_MISC_VEC_ID	(ICE_RES_VALID_BIT - 1)
+#define ICE_RES_RDMA_VEC_ID	(ICE_RES_MISC_VEC_ID - 1)
 #define ICE_INVAL_Q_INDEX	0xffff
 #define ICE_INVAL_VFID		256
 
@@ -362,12 +366,14 @@ struct ice_q_vector {
 
 enum ice_pf_flags {
 	ICE_FLAG_FLTR_SYNC,
+	ICE_FLAG_IWARP_ENA,
 	ICE_FLAG_RSS_ENA,
 	ICE_FLAG_SRIOV_ENA,
 	ICE_FLAG_SRIOV_CAPABLE,
 	ICE_FLAG_DCB_CAPABLE,
 	ICE_FLAG_DCB_ENA,
 	ICE_FLAG_FD_ENA,
+	ICE_FLAG_AUX_ENA,
 	ICE_FLAG_ADV_FEATURES,
 	ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA,
 	ICE_FLAG_TOTAL_PORT_SHUTDOWN_ENA,
@@ -427,6 +433,8 @@ struct ice_pf {
 	struct mutex sw_mutex;		/* lock for protecting VSI alloc flow */
 	struct mutex tc_mutex;		/* lock to protect TC changes */
 	u32 msg_enable;
+	u16 num_rdma_msix;		/* Total MSIX vectors for RDMA driver */
+	u16 rdma_base_vector;
 
 	/* spinlock to protect the AdminQ wait list */
 	spinlock_t aq_wait_lock;
@@ -459,6 +467,8 @@ struct ice_pf {
 	unsigned long tx_timeout_last_recovery;
 	u32 tx_timeout_recovery_level;
 	char int_name[ICE_INT_NAME_STR_LEN];
+	struct iidc_core_dev_info **cdev_infos;
+	int aux_idx;
 	u32 sw_int_count;
 
 	__le64 nvm_phy_type_lo; /* NVM PHY type low */
@@ -622,6 +632,11 @@ static inline void ice_clear_sriov_cap(struct ice_pf *pf)
 void ice_fill_rss_lut(u8 *lut, u16 rss_table_size, u16 rss_size);
 int ice_schedule_reset(struct ice_pf *pf, enum ice_reset_req reset);
 void ice_print_link_msg(struct ice_vsi *vsi, bool isup);
+int ice_init_aux_devices(struct ice_pf *pf);
+int
+ice_for_each_aux(struct ice_pf *pf, void *data,
+		 int (*fn)(struct iidc_core_dev_info *, void *));
+void ice_cdev_info_refresh_msix(struct ice_pf *pf);
 const char *ice_stat_str(enum ice_status stat_err);
 const char *ice_aq_str(enum ice_aq_err aq_err);
 bool ice_is_wol_supported(struct ice_pf *pf);
@@ -645,4 +660,22 @@ int ice_aq_wait_for_event(struct ice_pf *pf, u16 opcode, unsigned long timeout,
 int ice_stop(struct net_device *netdev);
 void ice_service_task_schedule(struct ice_pf *pf);
 
+/**
+ * ice_set_rdma_cap - enable RDMA support
+ * @pf: PF struct
+ */
+static inline void ice_set_rdma_cap(struct ice_pf *pf)
+{
+	if (pf->hw.func_caps.common_cap.iwarp && pf->num_rdma_msix)
+		set_bit(ICE_FLAG_IWARP_ENA, pf->flags);
+}
+
+/**
+ * ice_clear_rdma_cap - disable RDMA support
+ * @pf: PF struct
+ */
+static inline void ice_clear_rdma_cap(struct ice_pf *pf)
+{
+	clear_bit(ICE_FLAG_IWARP_ENA, pf->flags);
+}
 #endif /* _ICE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 8018658..81735e5 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -115,6 +115,7 @@ struct ice_aqc_list_caps_elem {
 #define ICE_AQC_CAPS_PENDING_OROM_VER			0x004B
 #define ICE_AQC_CAPS_NET_VER				0x004C
 #define ICE_AQC_CAPS_PENDING_NET_VER			0x004D
+#define ICE_AQC_CAPS_IWARP				0x0051
 #define ICE_AQC_CAPS_NVM_MGMT				0x0080
 
 	u8 major_ver;
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 3d9475e..d7b15c3 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1057,7 +1057,8 @@ enum ice_status ice_check_reset(struct ice_hw *hw)
 				 GLNVM_ULD_POR_DONE_1_M |\
 				 GLNVM_ULD_PCIER_DONE_2_M)
 
-	uld_mask = ICE_RESET_DONE_MASK;
+	uld_mask = ICE_RESET_DONE_MASK | (hw->func_caps.common_cap.iwarp ?
+					  GLNVM_ULD_PE_DONE_M : 0);
 
 	/* Device is Active; check Global Reset processes are done */
 	for (cnt = 0; cnt < ICE_PF_RESET_WAIT_COUNT; cnt++) {
@@ -1854,6 +1855,10 @@ static u32 ice_get_num_per_func(struct ice_hw *hw, u32 max)
 		ice_debug(hw, ICE_DBG_INIT, "%s: nvm_unified_update = %d\n", prefix,
 			  caps->nvm_unified_update);
 		break;
+	case ICE_AQC_CAPS_IWARP:
+		caps->iwarp = (number == 1);
+		ice_debug(hw, ICE_DBG_INIT, "%s: iwarp = %d\n", prefix, caps->iwarp);
+		break;
 	case ICE_AQC_CAPS_MAX_MTU:
 		caps->max_mtu = number;
 		ice_debug(hw, ICE_DBG_INIT, "%s: max_mtu = %d\n",
@@ -1887,6 +1892,16 @@ static u32 ice_get_num_per_func(struct ice_hw *hw, u32 max)
 		caps->maxtc = 4;
 		ice_debug(hw, ICE_DBG_INIT, "reducing maxtc to %d (based on #ports)\n",
 			  caps->maxtc);
+		if (caps->iwarp) {
+			ice_debug(hw, ICE_DBG_INIT, "forcing RDMA off\n");
+			caps->iwarp = 0;
+		}
+
+		/* print message only when processing device capabilities
+		 * during initialization.
+		 */
+		if (caps == &hw->dev_caps.common_cap)
+			dev_info(ice_hw_to_dev(hw), "RDMA functionality is not available with the current device configuration.\n");
 	}
 }
 
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index 1e8f71f..3aebfa8 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -640,6 +640,7 @@ static int ice_dcb_noncontig_cfg(struct ice_pf *pf)
 void ice_pf_dcb_recfg(struct ice_pf *pf)
 {
 	struct ice_dcbx_cfg *dcbcfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
+	struct iidc_core_dev_info *rcdi;
 	u8 tc_map = 0;
 	int v, ret;
 
@@ -675,6 +676,9 @@ void ice_pf_dcb_recfg(struct ice_pf *pf)
 		if (vsi->type == ICE_VSI_PF)
 			ice_dcbnl_set_all(vsi);
 	}
+	rcdi = ice_find_cdev_info_by_id(pf, IIDC_RDMA_ID);
+	if (rcdi)
+		ice_setup_dcb_qos_info(pf, &rcdi->qos_info);
 }
 
 /**
@@ -816,6 +820,37 @@ void ice_update_dcb_stats(struct ice_pf *pf)
 }
 
 /**
+ * ice_setup_dcb_qos_info - Setup DCB QoS information
+ * @pf: ptr to ice_pf
+ * @qos_info: QoS param instance
+ */
+void ice_setup_dcb_qos_info(struct ice_pf *pf, struct iidc_qos_params *qos_info)
+{
+	struct ice_dcbx_cfg *dcbx_cfg;
+	unsigned int i;
+	u32 up2tc;
+
+	dcbx_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
+	up2tc = rd32(&pf->hw, PRTDCB_TUP2TC);
+	qos_info->num_apps = dcbx_cfg->numapps;
+
+	qos_info->num_tc = ice_dcb_get_num_tc(dcbx_cfg);
+
+	for (i = 0; i < IIDC_MAX_USER_PRIORITY; i++)
+		qos_info->up2tc[i] = (up2tc >> (i * 3)) & 0x7;
+
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++)
+		qos_info->tc_info[i].rel_bw =
+			dcbx_cfg->etscfg.tcbwtable[i];
+
+	for (i = 0; i < qos_info->num_apps; i++) {
+		qos_info->apps[i].priority = dcbx_cfg->app[i].priority;
+		qos_info->apps[i].prot_id = dcbx_cfg->app[i].prot_id;
+		qos_info->apps[i].selector = dcbx_cfg->app[i].selector;
+	}
+}
+
+/**
  * ice_dcb_process_lldp_set_mib_change - Process MIB change
  * @pf: ptr to ice_pf
  * @event: pointer to the admin queue receive event
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.h b/drivers/net/ethernet/intel/ice/ice_dcb_lib.h
index 35c21d9..b41c7ba 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.h
@@ -31,6 +31,8 @@
 ice_tx_prepare_vlan_flags_dcb(struct ice_ring *tx_ring,
 			      struct ice_tx_buf *first);
 void
+ice_setup_dcb_qos_info(struct ice_pf *pf, struct iidc_qos_params *qos_info);
+void
 ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf,
 				    struct ice_rq_event_info *event);
 void ice_vsi_cfg_netdev_tc(struct ice_vsi *vsi, u8 ena_tc);
@@ -116,6 +118,7 @@ static inline bool ice_is_dcb_active(struct ice_pf __always_unused *pf)
 #define ice_update_dcb_stats(pf) do {} while (0)
 #define ice_pf_dcb_recfg(pf) do {} while (0)
 #define ice_vsi_cfg_dcb_rings(vsi) do {} while (0)
+#define ice_setup_dcb_qos_info(pf, qos_info) do {} while (0)
 #define ice_dcb_process_lldp_set_mib_change(pf, event) do {} while (0)
 #define ice_set_cgd_num(tlan_ctx, ring) do {} while (0)
 #define ice_vsi_cfg_netdev_tc(vsi, ena_tc) do {} while (0)
diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ethernet/intel/ice/ice_idc.c
new file mode 100644
index 0000000..e21a005
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -0,0 +1,231 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2021, Intel Corporation. */
+
+/* Inter-Driver Communication */
+#include "ice.h"
+#include "ice_lib.h"
+#include "ice_dcb_lib.h"
+
+static DEFINE_IDA(ice_cdev_info_ida);
+
+static struct cdev_info_id ice_cdev_ids[] = ASSIGN_IIDC_INFO;
+
+/**
+ * ice_for_each_aux - iterate across and call function for each AUX driver
+ * @pf: pointer to private board struct
+ * @data: data to pass to function on each call
+ * @fn: pointer to function to call for each AUX driver
+ */
+int
+ice_for_each_aux(struct ice_pf *pf, void *data,
+		 int (*fn)(struct iidc_core_dev_info *, void *))
+{
+	unsigned int i;
+
+	if (!pf->cdev_infos)
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(ice_cdev_ids); i++) {
+		struct iidc_core_dev_info *cdev_info;
+
+		cdev_info = pf->cdev_infos[i];
+		if (cdev_info) {
+			int ret = fn(cdev_info, data);
+
+			if (ret)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * ice_unroll_cdev_info - destroy cdev_info resources
+ * @cdev_info: ptr to cdev_info struct
+ * @data: ptr to opaque data
+ *
+ * This function releases resources for cdev_info objects.
+ * Meant to be called from a ice_for_each_aux invocation
+ */
+int ice_unroll_cdev_info(struct iidc_core_dev_info *cdev_info,
+			 void __always_unused *data)
+{
+	if (!cdev_info)
+		return 0;
+
+	kfree(cdev_info);
+
+	return 0;
+}
+
+/**
+ * ice_cdev_info_refresh_msix - load new values into iidc_core_dev_info structs
+ * @pf: pointer to private board struct
+ */
+void ice_cdev_info_refresh_msix(struct ice_pf *pf)
+{
+	struct iidc_core_dev_info *cdev_info;
+	unsigned int i;
+
+	if (!pf->cdev_infos)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(ice_cdev_ids); i++) {
+		if (!pf->cdev_infos[i])
+			continue;
+
+		cdev_info = pf->cdev_infos[i];
+
+		switch (cdev_info->cdev_info_id) {
+		case IIDC_RDMA_ID:
+			cdev_info->msix_count = pf->num_rdma_msix;
+			cdev_info->msix_entries =
+				&pf->msix_entries[pf->rdma_base_vector];
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+/**
+ * ice_reserve_cdev_info_qvector - Reserve vector resources for AUX drivers
+ * @pf: board private structure to initialize
+ */
+static int ice_reserve_cdev_info_qvector(struct ice_pf *pf)
+{
+	if (test_bit(ICE_FLAG_IWARP_ENA, pf->flags)) {
+		int index;
+
+		index = ice_get_res(pf, pf->irq_tracker, pf->num_rdma_msix, ICE_RES_RDMA_VEC_ID);
+		if (index < 0)
+			return index;
+		pf->num_avail_sw_msix -= pf->num_rdma_msix;
+		pf->rdma_base_vector = (u16)index;
+	}
+	return 0;
+}
+
+/**
+ * ice_find_cdev_info_by_id - find cdev_info instance by its ID
+ * @pf: pointer to private board struct
+ * @cdev_info_id: AUX driver ID
+ */
+struct iidc_core_dev_info *
+ice_find_cdev_info_by_id(struct ice_pf *pf, int cdev_info_id)
+{
+	struct iidc_core_dev_info *cdev_info = NULL;
+	unsigned int i;
+
+	if (!pf->cdev_infos)
+		return NULL;
+
+	for (i = 0; i < ARRAY_SIZE(ice_cdev_ids); i++) {
+		cdev_info = pf->cdev_infos[i];
+		if (cdev_info && cdev_info->cdev_info_id == cdev_info_id)
+			break;
+		cdev_info = NULL;
+	}
+	return cdev_info;
+}
+
+/**
+ * ice_cdev_info_update_vsi - update the pf_vsi info in cdev_info struct
+ * @cdev_info: pointer to cdev_info struct
+ * @data: opaque pointer - VSI to be updated
+ */
+int ice_cdev_info_update_vsi(struct iidc_core_dev_info *cdev_info, void *data)
+{
+	struct ice_vsi *vsi = data;
+
+	if (!cdev_info)
+		return 0;
+
+	cdev_info->vport_id = vsi->vsi_num;
+	return 0;
+}
+
+/**
+ * ice_init_aux_devices - initializes cdev_info objects and AUX devices
+ * @pf: ptr to ice_pf
+ */
+int ice_init_aux_devices(struct ice_pf *pf)
+{
+	struct ice_vsi *vsi = ice_get_main_vsi(pf);
+	struct pci_dev *pdev = pf->pdev;
+	struct device *dev = &pdev->dev;
+	unsigned int i;
+	int ret;
+
+	/* Reserve vector resources */
+	ret = ice_reserve_cdev_info_qvector(pf);
+	if (ret) {
+		dev_err(dev, "failed to reserve vectors for aux drivers\n");
+		return ret;
+	}
+
+	/* This PFs auxiliary ID value */
+	pf->aux_idx = ida_alloc(&ice_cdev_info_ida, GFP_KERNEL);
+	if (pf->aux_idx < 0) {
+		dev_err(dev, "failed to allocate device ID for aux drvs\n");
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(ice_cdev_ids); i++) {
+		struct iidc_core_dev_info *cdev_info;
+		struct iidc_qos_params *qos_info;
+		struct msix_entry *entry = NULL;
+		int j;
+
+		cdev_info = kzalloc(sizeof(*cdev_info), GFP_KERNEL);
+		if (!cdev_info) {
+			ida_simple_remove(&ice_cdev_info_ida, pf->aux_idx);
+			pf->aux_idx = -1;
+			return -ENOMEM;
+		}
+
+		pf->cdev_infos[i] = cdev_info;
+
+		cdev_info->hw_addr = (u8 __iomem *)pf->hw.hw_addr;
+		cdev_info->cdev_info_id = ice_cdev_ids[i].id;
+		cdev_info->vport_id = vsi->vsi_num;
+		cdev_info->netdev = vsi->netdev;
+
+		cdev_info->pdev = pdev;
+		qos_info = &cdev_info->qos_info;
+
+		/* setup qos_info fields with defaults */
+		qos_info->num_apps = 0;
+		qos_info->num_tc = 1;
+
+		for (j = 0; j < IIDC_MAX_USER_PRIORITY; j++)
+			qos_info->up2tc[j] = 0;
+
+		qos_info->tc_info[0].rel_bw = 100;
+		for (j = 1; j < IEEE_8021QAZ_MAX_TCS; j++)
+			qos_info->tc_info[j].rel_bw = 0;
+
+		/* for DCB, override the qos_info defaults. */
+		ice_setup_dcb_qos_info(pf, qos_info);
+
+		/* make sure AUX specific resources such as msix_count and
+		 * msix_entries are initialized
+		 */
+		switch (ice_cdev_ids[i].id) {
+		case IIDC_RDMA_ID:
+			if (test_bit(ICE_FLAG_IWARP_ENA, pf->flags)) {
+				cdev_info->msix_count = pf->num_rdma_msix;
+				entry = &pf->msix_entries[pf->rdma_base_vector];
+			}
+			cdev_info->rdma_protocol = IIDC_RDMA_PROTOCOL_IWARP;
+			break;
+		default:
+			break;
+		}
+
+		cdev_info->msix_entries = entry;
+	}
+
+	return ret;
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_idc_int.h b/drivers/net/ethernet/intel/ice/ice_idc_int.h
new file mode 100644
index 0000000..67b63c6
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_idc_int.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2021, Intel Corporation. */
+
+#ifndef _ICE_IDC_INT_H_
+#define _ICE_IDC_INT_H_
+
+#include <linux/net/intel/iidc.h>
+#include "ice.h"
+
+struct ice_pf;
+
+int ice_cdev_info_update_vsi(struct iidc_core_dev_info *cdev_info, void *data);
+int ice_unroll_cdev_info(struct iidc_core_dev_info *cdev_info, void *data);
+struct iidc_core_dev_info *
+ice_find_cdev_info_by_id(struct ice_pf *pf, int cdev_info_id);
+
+#endif /* !_ICE_IDC_INT_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
index 4599fc3..37c18c6 100644
--- a/drivers/net/ethernet/intel/ice/ice_lag.c
+++ b/drivers/net/ethernet/intel/ice/ice_lag.c
@@ -172,6 +172,7 @@ static void ice_lag_info_event(struct ice_lag *lag, void *ptr)
 	}
 
 	ice_clear_sriov_cap(pf);
+	ice_clear_rdma_cap(pf);
 
 	lag->bonded = true;
 	lag->role = ICE_LAG_UNSET;
@@ -222,6 +223,7 @@ static void ice_lag_info_event(struct ice_lag *lag, void *ptr)
 	}
 
 	ice_set_sriov_cap(pf);
+	ice_set_rdma_cap(pf);
 	lag->bonded = false;
 	lag->role = ICE_LAG_NONE;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 8d4e2ad..5f97b80 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -600,6 +600,17 @@ bool ice_is_safe_mode(struct ice_pf *pf)
 }
 
 /**
+ * ice_is_aux_ena
+ * @pf: pointer to the PF struct
+ *
+ * returns true if AUX devices/drivers are supported, false otherwise
+ */
+bool ice_is_aux_ena(struct ice_pf *pf)
+{
+	return test_bit(ICE_FLAG_AUX_ENA, pf->flags);
+}
+
+/**
  * ice_vsi_clean_rss_flow_fld - Delete RSS configuration
  * @vsi: the VSI being cleaned up
  *
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.h b/drivers/net/ethernet/intel/ice/ice_lib.h
index 3da1789..0f794d7 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_lib.h
@@ -99,7 +99,7 @@ enum ice_status
 ice_vsi_cfg_mac_fltr(struct ice_vsi *vsi, const u8 *macaddr, bool set);
 
 bool ice_is_safe_mode(struct ice_pf *pf);
-
+bool ice_is_aux_ena(struct ice_pf *pf);
 bool ice_is_dflt_vsi_in_use(struct ice_sw *sw);
 
 bool ice_is_vsi_dflt_vsi(struct ice_sw *sw, struct ice_vsi *vsi);
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 2c23c8f..d1b2a6e 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -3312,6 +3312,12 @@ static void ice_set_pf_caps(struct ice_pf *pf)
 {
 	struct ice_hw_func_caps *func_caps = &pf->hw.func_caps;
 
+	clear_bit(ICE_FLAG_IWARP_ENA, pf->flags);
+	clear_bit(ICE_FLAG_AUX_ENA, pf->flags);
+	if (func_caps->common_cap.iwarp) {
+		set_bit(ICE_FLAG_IWARP_ENA, pf->flags);
+		set_bit(ICE_FLAG_AUX_ENA, pf->flags);
+	}
 	clear_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
 	if (func_caps->common_cap.dcb)
 		set_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
@@ -3391,11 +3397,12 @@ static int ice_init_pf(struct ice_pf *pf)
  */
 static int ice_ena_msix_range(struct ice_pf *pf)
 {
-	int v_left, v_actual, v_other, v_budget = 0;
+	int num_cpus, v_left, v_actual, v_other, v_budget = 0;
 	struct device *dev = ice_pf_to_dev(pf);
 	int needed, err, i;
 
 	v_left = pf->hw.func_caps.common_cap.num_msix_vectors;
+	num_cpus = num_online_cpus();
 
 	/* reserve for LAN miscellaneous handler */
 	needed = ICE_MIN_LAN_OICR_MSIX;
@@ -3417,13 +3424,23 @@ static int ice_ena_msix_range(struct ice_pf *pf)
 	v_other = v_budget;
 
 	/* reserve vectors for LAN traffic */
-	needed = min_t(int, num_online_cpus(), v_left);
+	needed = num_cpus;
 	if (v_left < needed)
 		goto no_hw_vecs_left_err;
 	pf->num_lan_msix = needed;
 	v_budget += needed;
 	v_left -= needed;
 
+	/* reserve vectors for RDMA auxiliary driver */
+	if (test_bit(ICE_FLAG_IWARP_ENA, pf->flags)) {
+		needed = num_cpus + ICE_RDMA_NUM_AEQ_MSIX;
+		if (v_left < needed)
+			goto no_hw_vecs_left_err;
+		pf->num_rdma_msix = needed;
+		v_budget += needed;
+		v_left -= needed;
+	}
+
 	pf->msix_entries = devm_kcalloc(dev, v_budget,
 					sizeof(*pf->msix_entries), GFP_KERNEL);
 	if (!pf->msix_entries) {
@@ -3453,16 +3470,46 @@ static int ice_ena_msix_range(struct ice_pf *pf)
 			err = -ERANGE;
 			goto msix_err;
 		} else {
-			int v_traffic = v_actual - v_other;
+			int v_remain = v_actual - v_other;
+			int v_rdma = 0, v_min_rdma = 0;
+
+			if (test_bit(ICE_FLAG_IWARP_ENA, pf->flags)) {
+				/* Need at least 1 interrupt in addition to
+				 * AEQ MSIX
+				 */
+				v_rdma = ICE_RDMA_NUM_AEQ_MSIX + 1;
+				v_min_rdma = ICE_MIN_RDMA_MSIX;
+			}
 
 			if (v_actual == ICE_MIN_MSIX ||
-			    v_traffic < ICE_MIN_LAN_TXRX_MSIX)
+			    v_remain < ICE_MIN_LAN_TXRX_MSIX + v_min_rdma) {
+				dev_warn(dev, "Not enough MSI-X vectors to support RDMA.\n");
+				clear_bit(ICE_FLAG_IWARP_ENA, pf->flags);
+
+				pf->num_rdma_msix = 0;
 				pf->num_lan_msix = ICE_MIN_LAN_TXRX_MSIX;
-			else
-				pf->num_lan_msix = v_traffic;
+			} else if ((v_remain < ICE_MIN_LAN_TXRX_MSIX + v_rdma) ||
+				   (v_remain - v_rdma < v_rdma)) {
+				/* Support minimum RDMA and give remaining
+				 * vectors to LAN MSIX
+				 */
+				pf->num_rdma_msix = v_min_rdma;
+				pf->num_lan_msix = v_remain - v_min_rdma;
+			} else {
+				/* Split remaining MSIX with RDMA after
+				 * accounting for AEQ MSIX
+				 */
+				pf->num_rdma_msix = (v_remain - ICE_RDMA_NUM_AEQ_MSIX) / 2 +
+						    ICE_RDMA_NUM_AEQ_MSIX;
+				pf->num_lan_msix = v_remain - pf->num_rdma_msix;
+			}
 
 			dev_notice(dev, "Enabled %d MSI-X vectors for LAN traffic.\n",
 				   pf->num_lan_msix);
+
+			if (test_bit(ICE_FLAG_IWARP_ENA, pf->flags))
+				dev_notice(dev, "Enabled %d MSI-X vectors for RDMA.\n",
+					   pf->num_rdma_msix);
 		}
 	}
 
@@ -3477,6 +3524,7 @@ static int ice_ena_msix_range(struct ice_pf *pf)
 		needed, v_left);
 	err = -ERANGE;
 exit_err:
+	pf->num_rdma_msix = 0;
 	pf->num_lan_msix = 0;
 	return err;
 }
@@ -4267,8 +4315,37 @@ static void ice_print_wake_reason(struct ice_pf *pf)
 probe_done:
 	/* ready to go, so clear down state bit */
 	clear_bit(__ICE_DOWN, pf->state);
+	/* init AUX devices only if supported */
+	if (ice_is_aux_ena(pf)) {
+		pf->cdev_infos = devm_kcalloc(dev, IIDC_MAX_NUM_AUX,
+					      sizeof(*pf->cdev_infos),
+					      GFP_KERNEL);
+		if (!pf->cdev_infos) {
+			err = -ENOMEM;
+			goto err_init_aux_unroll;
+		}
+
+		err = ice_init_aux_devices(pf);
+		if (err) {
+			dev_err(dev, "Failed to initialize aux devs: %d\n",
+				err);
+			err = -EIO;
+			goto err_init_aux_unroll;
+		}
+	} else {
+		dev_warn(dev, "RDMA is not supported on this device\n");
+	}
+
 	return 0;
 
+err_init_aux_unroll:
+	if (ice_is_aux_ena(pf)) {
+		ice_for_each_aux(pf, NULL, ice_unroll_cdev_info);
+		if (pf->cdev_infos) {
+			devm_kfree(dev, pf->cdev_infos);
+			pf->cdev_infos = NULL;
+		}
+	}
 err_send_version_unroll:
 	ice_vsi_release_all(pf);
 err_alloc_sw_unroll:
@@ -4614,6 +4691,8 @@ static int __maybe_unused ice_resume(struct device *dev)
 	if (ret)
 		dev_err(dev, "Cannot restore interrupt scheme: %d\n", ret);
 
+	ice_cdev_info_refresh_msix(pf);
+
 	clear_bit(__ICE_DOWN, pf->state);
 	/* Now perform PF reset and rebuild */
 	reset_type = ICE_RESET_PFR;
@@ -5980,6 +6059,7 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type)
 	struct device *dev = ice_pf_to_dev(pf);
 	struct ice_hw *hw = &pf->hw;
 	enum ice_status ret;
+	struct ice_vsi *vsi;
 	int err;
 
 	if (test_bit(__ICE_DOWN, pf->state))
@@ -6067,6 +6147,13 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type)
 		goto err_vsi_rebuild;
 	}
 
+	vsi = ice_get_main_vsi(pf);
+	if (!vsi) {
+		dev_err(dev, "No PF_VSI to update aux drivers\n");
+		goto err_vsi_rebuild;
+	}
+	ice_for_each_aux(pf, vsi, ice_cdev_info_update_vsi);
+
 	/* If Flow Director is active */
 	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
 		err = ice_vsi_rebuild_by_type(pf, ICE_VSI_CTRL);
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index a6cb0c35..4eaab60 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -244,6 +244,7 @@ struct ice_hw_common_caps {
 	u8 rss_table_entry_width;	/* RSS Entry width in bits */
 
 	u8 dcb;
+	u8 iwarp;
 
 	bool nvm_update_pending_nvm;
 	bool nvm_update_pending_orom;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 03/23] ice: Implement iidc operations
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 01/23] iidc: Introduce iidc.h Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 02/23] ice: Initialize RDMA support Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 04/23] ice: Register auxiliary device to provide RDMA Shiraz Saleem
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

From: Dave Ertman <david.m.ertman@intel.com>

Add implementations for supporting iidc operations for device operation
such as allocation of resources and event notifications.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h             |   1 +
 drivers/net/ethernet/intel/ice/ice_adminq_cmd.h  |  32 ++
 drivers/net/ethernet/intel/ice/ice_common.c      | 200 +++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h      |   9 +
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c     |  25 +-
 drivers/net/ethernet/intel/ice/ice_hw_autogen.h  |   3 +-
 drivers/net/ethernet/intel/ice/ice_idc.c         | 439 +++++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_idc_int.h     |   3 +
 drivers/net/ethernet/intel/ice/ice_main.c        |  50 ++-
 drivers/net/ethernet/intel/ice/ice_sched.c       |  69 +++-
 drivers/net/ethernet/intel/ice/ice_switch.c      |  27 ++
 drivers/net/ethernet/intel/ice/ice_switch.h      |   4 +
 drivers/net/ethernet/intel/ice/ice_type.h        |   3 +
 drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c |  34 ++
 14 files changed, 876 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index ebd2159..561f8fd 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -327,6 +327,7 @@ struct ice_vsi {
 	u16 req_rxq;			 /* User requested Rx queues */
 	u16 num_rx_desc;
 	u16 num_tx_desc;
+	u16 qset_handle[ICE_MAX_TRAFFIC_CLASS];
 	struct ice_tc_cfg tc_cfg;
 	struct bpf_prog *xdp_prog;
 	struct ice_ring **xdp_rings;	 /* XDP ring array */
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 81735e5..9ae3d2f 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1684,6 +1684,36 @@ struct ice_aqc_dis_txq_item {
 	__le16 q_id[];
 } __packed;
 
+/* Add Tx RDMA Queue Set (indirect 0x0C33) */
+struct ice_aqc_add_rdma_qset {
+	u8 num_qset_grps;
+	u8 reserved[7];
+	__le32 addr_high;
+	__le32 addr_low;
+};
+
+/* This is the descriptor of each Qset entry for the Add Tx RDMA Queue Set
+ * command (0x0C33). Only used within struct ice_aqc_add_rdma_qset.
+ */
+struct ice_aqc_add_tx_rdma_qset_entry {
+	__le16 tx_qset_id;
+	u8 rsvd[2];
+	__le32 qset_teid;
+	struct ice_aqc_txsched_elem info;
+};
+
+/* The format of the command buffer for Add Tx RDMA Queue Set(0x0C33)
+ * is an array of the following structs. Please note that the length of
+ * each struct ice_aqc_add_rdma_qset is variable due to the variable
+ * number of queues in each group!
+ */
+struct ice_aqc_add_rdma_qset_data {
+	__le32 parent_teid;
+	__le16 num_qsets;
+	u8 rsvd[2];
+	struct ice_aqc_add_tx_rdma_qset_entry rdma_qsets[];
+};
+
 /* Configure Firmware Logging Command (indirect 0xFF09)
  * Logging Information Read Response (indirect 0xFF10)
  * Note: The 0xFF10 command has no input parameters.
@@ -1879,6 +1909,7 @@ struct ice_aq_desc {
 		struct ice_aqc_get_set_rss_key get_set_rss_key;
 		struct ice_aqc_add_txqs add_txqs;
 		struct ice_aqc_dis_txqs dis_txqs;
+		struct ice_aqc_add_rdma_qset add_rdma_qset;
 		struct ice_aqc_add_get_update_free_vsi vsi_cmd;
 		struct ice_aqc_add_update_free_vsi_resp add_update_free_vsi_res;
 		struct ice_aqc_fw_logging fw_logging;
@@ -2027,6 +2058,7 @@ enum ice_adminq_opc {
 	/* Tx queue handling commands/events */
 	ice_aqc_opc_add_txqs				= 0x0C30,
 	ice_aqc_opc_dis_txqs				= 0x0C31,
+	ice_aqc_opc_add_rdma_qset			= 0x0C33,
 
 	/* package commands */
 	ice_aqc_opc_download_pkg			= 0x0C40,
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index d7b15c3..a6cbb92 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -3577,6 +3577,54 @@ enum ice_status
 	return status;
 }
 
+/**
+ * ice_aq_add_rdma_qsets
+ * @hw: pointer to the hardware structure
+ * @num_qset_grps: Number of RDMA Qset groups
+ * @qset_list: list of Qset groups to be added
+ * @buf_size: size of buffer for indirect command
+ * @cd: pointer to command details structure or NULL
+ *
+ * Add Tx RDMA Qsets (0x0C33)
+ */
+static enum ice_status
+ice_aq_add_rdma_qsets(struct ice_hw *hw, u8 num_qset_grps,
+		      struct ice_aqc_add_rdma_qset_data *qset_list,
+		      u16 buf_size, struct ice_sq_cd *cd)
+{
+	struct ice_aqc_add_rdma_qset_data *list;
+	struct ice_aqc_add_rdma_qset *cmd;
+	struct ice_aq_desc desc;
+	u16 i, sum_size = 0;
+
+	cmd = &desc.params.add_rdma_qset;
+
+	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_add_rdma_qset);
+
+	if (!qset_list)
+		return ICE_ERR_PARAM;
+
+	if (num_qset_grps > ICE_LAN_TXQ_MAX_QGRPS)
+		return ICE_ERR_PARAM;
+
+	for (i = 0, list = qset_list; i < num_qset_grps; i++) {
+		u16 num_qsets = le16_to_cpu(list->num_qsets);
+
+		sum_size += struct_size(list, rdma_qsets, num_qsets);
+		list = (struct ice_aqc_add_rdma_qset_data *)(list->rdma_qsets +
+							     num_qsets);
+	}
+
+	if (buf_size != sum_size)
+		return ICE_ERR_PARAM;
+
+	desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD);
+
+	cmd->num_qset_grps = num_qset_grps;
+
+	return ice_aq_send_cmd(hw, &desc, qset_list, buf_size, cd);
+}
+
 /* End of FW Admin Queue command wrappers */
 
 /**
@@ -4075,6 +4123,158 @@ enum ice_status
 }
 
 /**
+ * ice_cfg_vsi_rdma - configure the VSI RDMA queues
+ * @pi: port information structure
+ * @vsi_handle: software VSI handle
+ * @tc_bitmap: TC bitmap
+ * @max_rdmaqs: max RDMA queues array per TC
+ *
+ * This function adds/updates the VSI RDMA queues per TC.
+ */
+enum ice_status
+ice_cfg_vsi_rdma(struct ice_port_info *pi, u16 vsi_handle, u16 tc_bitmap,
+		 u16 *max_rdmaqs)
+{
+	return ice_cfg_vsi_qs(pi, vsi_handle, tc_bitmap, max_rdmaqs,
+			      ICE_SCHED_NODE_OWNER_RDMA);
+}
+
+/**
+ * ice_ena_vsi_rdma_qset
+ * @pi: port information structure
+ * @vsi_handle: software VSI handle
+ * @tc: TC number
+ * @rdma_qset: pointer to RDMA Qset
+ * @num_qsets: number of RDMA Qsets
+ * @qset_teid: pointer to Qset node TEIDs
+ *
+ * This function adds RDMA Qset
+ */
+enum ice_status
+ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
+		      u16 *rdma_qset, u16 num_qsets, u32 *qset_teid)
+{
+	struct ice_aqc_txsched_elem_data node = { 0 };
+	struct ice_aqc_add_rdma_qset_data *buf;
+	struct ice_sched_node *parent;
+	enum ice_status status;
+	struct ice_hw *hw;
+	u16 i, buf_size;
+
+	if (!pi || pi->port_state != ICE_SCHED_PORT_STATE_READY)
+		return ICE_ERR_CFG;
+	hw = pi->hw;
+
+	if (!ice_is_vsi_valid(hw, vsi_handle))
+		return ICE_ERR_PARAM;
+
+	buf_size = struct_size(buf, rdma_qsets, num_qsets);
+	buf = kzalloc(buf_size, GFP_KERNEL);
+	if (!buf)
+		return ICE_ERR_NO_MEMORY;
+	mutex_lock(&pi->sched_lock);
+
+	parent = ice_sched_get_free_qparent(pi, vsi_handle, tc,
+					    ICE_SCHED_NODE_OWNER_RDMA);
+	if (!parent) {
+		status = ICE_ERR_PARAM;
+		goto rdma_error_exit;
+	}
+	buf->parent_teid = parent->info.node_teid;
+	node.parent_teid = parent->info.node_teid;
+
+	buf->num_qsets = cpu_to_le16(num_qsets);
+	for (i = 0; i < num_qsets; i++) {
+		buf->rdma_qsets[i].tx_qset_id = cpu_to_le16(rdma_qset[i]);
+		buf->rdma_qsets[i].info.valid_sections =
+			ICE_AQC_ELEM_VALID_GENERIC | ICE_AQC_ELEM_VALID_CIR |
+			ICE_AQC_ELEM_VALID_EIR;
+		buf->rdma_qsets[i].info.generic = 0;
+		buf->rdma_qsets[i].info.cir_bw.bw_profile_idx =
+			cpu_to_le16(ICE_SCHED_DFLT_RL_PROF_ID);
+		buf->rdma_qsets[i].info.cir_bw.bw_alloc =
+			cpu_to_le16(ICE_SCHED_DFLT_BW_WT);
+		buf->rdma_qsets[i].info.eir_bw.bw_profile_idx =
+			cpu_to_le16(ICE_SCHED_DFLT_RL_PROF_ID);
+		buf->rdma_qsets[i].info.eir_bw.bw_alloc =
+			cpu_to_le16(ICE_SCHED_DFLT_BW_WT);
+	}
+	status = ice_aq_add_rdma_qsets(hw, 1, buf, buf_size, NULL);
+	if (status) {
+		ice_debug(hw, ICE_DBG_RDMA, "add RDMA qset failed\n");
+		goto rdma_error_exit;
+	}
+	node.data.elem_type = ICE_AQC_ELEM_TYPE_LEAF;
+	for (i = 0; i < num_qsets; i++) {
+		node.node_teid = buf->rdma_qsets[i].qset_teid;
+		status = ice_sched_add_node(pi, hw->num_tx_sched_layers - 1,
+					    &node);
+		if (status)
+			break;
+		qset_teid[i] = le32_to_cpu(node.node_teid);
+	}
+rdma_error_exit:
+	mutex_unlock(&pi->sched_lock);
+	kfree(buf);
+	return status;
+}
+
+/**
+ * ice_dis_vsi_rdma_qset - free RDMA resources
+ * @pi: port_info struct
+ * @count: number of RDMA Qsets to free
+ * @qset_teid: TEID of Qset node
+ * @q_id: list of queue IDs being disabled
+ */
+enum ice_status
+ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
+		      u16 *q_id)
+{
+	struct ice_aqc_dis_txq_item *qg_list;
+	enum ice_status status = 0;
+	struct ice_hw *hw;
+	u16 qg_size;
+	int i;
+
+	if (!pi || pi->port_state != ICE_SCHED_PORT_STATE_READY)
+		return ICE_ERR_CFG;
+
+	hw = pi->hw;
+
+	qg_size = struct_size(qg_list, q_id, 1);
+	qg_list = kzalloc(qg_size, GFP_KERNEL);
+	if (!qg_list)
+		return ICE_ERR_NO_MEMORY;
+
+	mutex_lock(&pi->sched_lock);
+
+	for (i = 0; i < count; i++) {
+		struct ice_sched_node *node;
+
+		node = ice_sched_find_node_by_teid(pi->root, qset_teid[i]);
+		if (!node)
+			continue;
+
+		qg_list->parent_teid = node->info.parent_teid;
+		qg_list->num_qs = 1;
+		qg_list->q_id[0] =
+			cpu_to_le16(q_id[i] |
+				    ICE_AQC_Q_DIS_BUF_ELEM_TYPE_RDMA_QSET);
+
+		status = ice_aq_dis_lan_txq(hw, 1, qg_list, qg_size,
+					    ICE_NO_RESET, 0, NULL);
+		if (status)
+			break;
+
+		ice_free_sched_node(pi, node);
+	}
+
+	mutex_unlock(&pi->sched_lock);
+	kfree(qg_list);
+	return status;
+}
+
+/**
  * ice_replay_pre_init - replay pre initialization
  * @hw: pointer to the HW struct
  *
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h
index baf4064..fbd9421 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.h
+++ b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -147,6 +147,15 @@ enum ice_status
 		  bool write, struct ice_sq_cd *cd);
 
 enum ice_status
+ice_cfg_vsi_rdma(struct ice_port_info *pi, u16 vsi_handle, u16 tc_bitmap,
+		 u16 *max_rdmaqs);
+enum ice_status
+ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
+		      u16 *rdma_qset, u16 num_qsets, u32 *qset_teid);
+enum ice_status
+ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
+		      u16 *q_id);
+enum ice_status
 ice_dis_vsi_txq(struct ice_port_info *pi, u16 vsi_handle, u8 tc, u8 num_queues,
 		u16 *q_handle, u16 *q_ids, u32 *q_teids,
 		enum ice_disq_rst_src rst_src, u16 vmvf_num,
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index 3aebfa8..306df36 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -275,6 +275,7 @@ int ice_pf_dcb_cfg(struct ice_pf *pf, struct ice_dcbx_cfg *new_cfg, bool locked)
 	struct ice_dcbx_cfg *old_cfg, *curr_cfg;
 	struct device *dev = ice_pf_to_dev(pf);
 	int ret = ICE_DCB_NO_HW_CHG;
+	struct iidc_event *event;
 	struct ice_vsi *pf_vsi;
 
 	curr_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
@@ -313,6 +314,15 @@ int ice_pf_dcb_cfg(struct ice_pf *pf, struct ice_dcbx_cfg *new_cfg, bool locked)
 		goto free_cfg;
 	}
 
+	/* Notify AUX drivers about impending change to TCs */
+	event = kzalloc(sizeof(*event), GFP_KERNEL);
+	if (!event)
+		return -ENOMEM;
+
+	set_bit(IIDC_EVENT_BEFORE_TC_CHANGE, event->type);
+	ice_send_event_to_auxs(pf, event);
+	kfree(event);
+
 	/* avoid race conditions by holding the lock while disabling and
 	 * re-enabling the VSI
 	 */
@@ -676,9 +686,22 @@ void ice_pf_dcb_recfg(struct ice_pf *pf)
 		if (vsi->type == ICE_VSI_PF)
 			ice_dcbnl_set_all(vsi);
 	}
+	/* Notify the AUX drivers that TC change is finished */
 	rcdi = ice_find_cdev_info_by_id(pf, IIDC_RDMA_ID);
-	if (rcdi)
+	if (rcdi) {
+		struct iidc_event *event;
+
 		ice_setup_dcb_qos_info(pf, &rcdi->qos_info);
+
+		event = kzalloc(sizeof(*event), GFP_KERNEL);
+		if (!event)
+			return;
+
+		set_bit(IIDC_EVENT_AFTER_TC_CHANGE, event->type);
+		event->info.port_qos = rcdi->qos_info;
+		ice_send_event_to_auxs(pf, event);
+		kfree(event);
+	}
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
index 093a181..b544290 100644
--- a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
+++ b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
@@ -110,8 +110,6 @@
 #define VPGEN_VFRSTAT_VFRD_M			BIT(0)
 #define VPGEN_VFRTRIG(_VF)			(0x00090000 + ((_VF) * 4))
 #define VPGEN_VFRTRIG_VFSWR_M			BIT(0)
-#define PFHMC_ERRORDATA				0x00520500
-#define PFHMC_ERRORINFO				0x00520400
 #define GLINT_CTL				0x0016CC54
 #define GLINT_CTL_DIS_AUTOMASK_M		BIT(0)
 #define GLINT_CTL_ITR_GRAN_200_S		16
@@ -159,6 +157,7 @@
 #define PFINT_OICR_GRST_M			BIT(20)
 #define PFINT_OICR_PCI_EXCEPTION_M		BIT(21)
 #define PFINT_OICR_HMC_ERR_M			BIT(26)
+#define PFINT_OICR_PE_PUSH_M			BIT(27)
 #define PFINT_OICR_PE_CRITERR_M			BIT(28)
 #define PFINT_OICR_VFLR_M			BIT(29)
 #define PFINT_OICR_SWINT_M			BIT(31)
diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ethernet/intel/ice/ice_idc.c
index e21a005..547718b 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc.c
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -11,6 +11,34 @@
 static struct cdev_info_id ice_cdev_ids[] = ASSIGN_IIDC_INFO;
 
 /**
+ * ice_get_auxiliary_ops - retrieve iidc_auxiliary_ops struct
+ * @cdev_info: pointer to iidc_core_dev_info struct
+ *
+ * This function has to be called with a device_lock on the
+ * cdev_info->adev.dev to avoid race conditions.
+ */
+struct iidc_auxiliary_ops *
+ice_get_auxiliary_ops(struct iidc_core_dev_info *cdev_info)
+{
+	struct iidc_auxiliary_drv *iadrv;
+	struct auxiliary_device *adev;
+
+	if (!cdev_info)
+		return NULL;
+
+	adev = cdev_info->adev;
+	if (!adev || !adev->dev.driver)
+		return NULL;
+
+	iadrv = container_of(adev->dev.driver, struct iidc_auxiliary_drv,
+			     adrv.driver);
+	if (!iadrv)
+		return NULL;
+
+	return iadrv->ops;
+}
+
+/**
  * ice_for_each_aux - iterate across and call function for each AUX driver
  * @pf: pointer to private board struct
  * @data: data to pass to function on each call
@@ -41,6 +69,40 @@
 }
 
 /**
+ * ice_send_event_to_aux - send event to a specific AUX driver
+ * @cdev_info: pointer to iidc_core_dev_info struct for this AUX
+ * @data: opaque pointer used to pass event struct
+ *
+ * This function is only meant to be called through a ice_for_each_aux call
+ */
+static int
+ice_send_event_to_aux(struct iidc_core_dev_info *cdev_info, void *data)
+{
+	struct iidc_event *event = data;
+	struct iidc_auxiliary_ops *ops;
+
+	device_lock(&cdev_info->adev->dev);
+	ops = ice_get_auxiliary_ops(cdev_info);
+	if (ops && ops->event_handler)
+		ops->event_handler(cdev_info, event);
+	device_unlock(&cdev_info->adev->dev);
+
+	return 0;
+}
+
+/**
+ * ice_send_event_to_auxs - send event to all auxiliary drivers
+ * @pf: pointer to PF struct
+ * @event: pointer to iidc_event to propagate
+ *
+ * event struct to be populated by caller
+ */
+int ice_send_event_to_auxs(struct ice_pf *pf, struct iidc_event *event)
+{
+	return ice_for_each_aux(pf, event, ice_send_event_to_aux);
+}
+
+/**
  * ice_unroll_cdev_info - destroy cdev_info resources
  * @cdev_info: ptr to cdev_info struct
  * @data: ptr to opaque data
@@ -90,6 +152,371 @@ void ice_cdev_info_refresh_msix(struct ice_pf *pf)
 }
 
 /**
+ * ice_find_vsi - Find the VSI from VSI ID
+ * @pf: The PF pointer to search in
+ * @vsi_num: The VSI ID to search for
+ */
+static struct ice_vsi *ice_find_vsi(struct ice_pf *pf, u16 vsi_num)
+{
+	int i;
+
+	ice_for_each_vsi(pf, i)
+		if (pf->vsi[i] && pf->vsi[i]->vsi_num == vsi_num)
+			return  pf->vsi[i];
+	return NULL;
+}
+
+/**
+ * ice_alloc_rdma_qsets - Allocate Leaf Nodes for RDMA Qset
+ * @cdev_info: AUX driver that is requesting the Leaf Nodes
+ * @res: Resources to be allocated
+ * @partial_acceptable: If partial allocation is acceptable
+ *
+ * This function allocates Leaf Nodes for given RDMA Qset resources
+ * for the AUX object.
+ */
+static int
+ice_alloc_rdma_qsets(struct iidc_core_dev_info *cdev_info,
+		     struct iidc_res *res,
+		     int __always_unused partial_acceptable)
+{
+	u16 max_rdmaqs[ICE_MAX_TRAFFIC_CLASS];
+	enum ice_status status;
+	struct ice_vsi *vsi;
+	struct device *dev;
+	struct ice_pf *pf;
+	int i, ret = 0;
+	u32 *qset_teid;
+	u16 *qs_handle;
+
+	if (!cdev_info || !res)
+		return -EINVAL;
+
+	pf = pci_get_drvdata(cdev_info->pdev);
+	dev = ice_pf_to_dev(pf);
+
+	if (!test_bit(ICE_FLAG_IWARP_ENA, pf->flags))
+		return -EINVAL;
+
+	if (res->cnt_req > ICE_MAX_TXQ_PER_TXQG)
+		return -EINVAL;
+
+	qset_teid = kcalloc(res->cnt_req, sizeof(*qset_teid), GFP_KERNEL);
+	if (!qset_teid)
+		return -ENOMEM;
+
+	qs_handle = kcalloc(res->cnt_req, sizeof(*qs_handle), GFP_KERNEL);
+	if (!qs_handle) {
+		kfree(qset_teid);
+		return -ENOMEM;
+	}
+
+	ice_for_each_traffic_class(i)
+		max_rdmaqs[i] = 0;
+
+	for (i = 0; i < res->cnt_req; i++) {
+		struct iidc_rdma_qset_params *qset;
+
+		qset = &res->res[i].res.qsets;
+		if (qset->vport_id != cdev_info->vport_id) {
+			dev_err(dev, "RDMA QSet invalid VSI requested\n");
+			ret = -EINVAL;
+			goto out;
+		}
+		max_rdmaqs[qset->tc]++;
+		qs_handle[i] = qset->qs_handle;
+	}
+
+	vsi = ice_find_vsi(pf, cdev_info->vport_id);
+	if (!vsi) {
+		dev_err(dev, "RDMA QSet invalid VSI\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	status = ice_cfg_vsi_rdma(vsi->port_info, vsi->idx, vsi->tc_cfg.ena_tc,
+				  max_rdmaqs);
+	if (status) {
+		dev_err(dev, "Failed VSI RDMA qset config\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	for (i = 0; i < res->cnt_req; i++) {
+		struct iidc_rdma_qset_params *qset;
+
+		qset = &res->res[i].res.qsets;
+		status = ice_ena_vsi_rdma_qset(vsi->port_info, vsi->idx,
+					       qset->tc, &qs_handle[i], 1,
+					       &qset_teid[i]);
+		if (status) {
+			dev_err(dev, "Failed VSI RDMA qset enable\n");
+			ret = -EINVAL;
+			goto out;
+		}
+		vsi->qset_handle[qset->tc] = qset->qs_handle;
+		qset->teid = qset_teid[i];
+	}
+
+out:
+	kfree(qset_teid);
+	kfree(qs_handle);
+	return ret;
+}
+
+/**
+ * ice_free_rdma_qsets - Free leaf nodes for RDMA Qset
+ * @cdev_info: AUX driver that requested Qsets to be freed
+ * @res: Resource to be freed
+ */
+static int
+ice_free_rdma_qsets(struct iidc_core_dev_info *cdev_info, struct iidc_res *res)
+{
+	enum ice_status status;
+	int count, i, ret = 0;
+	struct ice_vsi *vsi;
+	struct device *dev;
+	struct ice_pf *pf;
+	u16 vsi_id;
+	u32 *teid;
+	u16 *q_id;
+
+	if (!cdev_info || !res)
+		return -EINVAL;
+
+	pf = pci_get_drvdata(cdev_info->pdev);
+	dev = ice_pf_to_dev(pf);
+
+	count = res->res_allocated;
+	if (count > ICE_MAX_TXQ_PER_TXQG)
+		return -EINVAL;
+
+	teid = kcalloc(count, sizeof(*teid), GFP_KERNEL);
+	if (!teid)
+		return -ENOMEM;
+
+	q_id = kcalloc(count, sizeof(*q_id), GFP_KERNEL);
+	if (!q_id) {
+		kfree(teid);
+		return -ENOMEM;
+	}
+
+	vsi_id = res->res[0].res.qsets.vport_id;
+	vsi = ice_find_vsi(pf, vsi_id);
+	if (!vsi) {
+		dev_err(dev, "RDMA Invalid VSI\n");
+		ret = -EINVAL;
+		goto rdma_free_out;
+	}
+
+	for (i = 0; i < count; i++) {
+		struct iidc_rdma_qset_params *qset;
+
+		qset = &res->res[i].res.qsets;
+		if (qset->vport_id != vsi_id) {
+			dev_err(dev, "RDMA Invalid VSI ID\n");
+			ret = -EINVAL;
+			goto rdma_free_out;
+		}
+		q_id[i] = qset->qs_handle;
+		teid[i] = qset->teid;
+
+		vsi->qset_handle[qset->tc] = 0;
+	}
+
+	status = ice_dis_vsi_rdma_qset(vsi->port_info, count, teid, q_id);
+	if (status)
+		ret = -EINVAL;
+
+rdma_free_out:
+	kfree(teid);
+	kfree(q_id);
+
+	return ret;
+}
+
+/**
+ * ice_cdev_info_alloc_res - Allocate requested resources for AUX objects
+ * @cdev_info: struct for AUX driver that is requesting resources
+ * @res: Resources to be allocated
+ * @partial_acceptable: If partial allocation is acceptable
+ *
+ * This function allocates requested resources for the AUX object.
+ */
+static int
+ice_cdev_info_alloc_res(struct iidc_core_dev_info *cdev_info,
+			struct iidc_res *res, int partial_acceptable)
+{
+	struct ice_pf *pf;
+	int ret;
+
+	if (!cdev_info || !res)
+		return -EINVAL;
+
+	pf = pci_get_drvdata(cdev_info->pdev);
+	if (!ice_pf_state_is_nominal(pf))
+		return -EBUSY;
+
+	switch (res->res_type) {
+	case IIDC_RDMA_QSETS_TXSCHED:
+		ret = ice_alloc_rdma_qsets(cdev_info, res, partial_acceptable);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
+/**
+ * ice_cdev_info_free_res - Free given resources
+ * @cdev_info: struct for AUX driver that is requesting freeing of resources
+ * @res: Resources to be freed
+ *
+ * Free/Release resources allocated to given AUX objects.
+ */
+static int
+ice_cdev_info_free_res(struct iidc_core_dev_info *cdev_info,
+		       struct iidc_res *res)
+{
+	int ret;
+
+	if (!cdev_info || !res)
+		return -EINVAL;
+
+	switch (res->res_type) {
+	case IIDC_RDMA_QSETS_TXSCHED:
+		ret = ice_free_rdma_qsets(cdev_info, res);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
+/**
+ * ice_cdev_info_request_reset - request from AUX driver to perform a reset
+ * @cdev_info: struct for AUX driver that is requesting a reset
+ * @reset_type: type of reset the AUX driver is requesting
+ */
+static int
+ice_cdev_info_request_reset(struct iidc_core_dev_info *cdev_info,
+			    enum iidc_reset_type reset_type)
+{
+	enum ice_reset_req reset;
+	struct ice_pf *pf;
+
+	if (!cdev_info)
+		return -EINVAL;
+
+	pf = pci_get_drvdata(cdev_info->pdev);
+
+	switch (reset_type) {
+	case IIDC_PFR:
+		reset = ICE_RESET_PFR;
+		break;
+	case IIDC_CORER:
+		reset = ICE_RESET_CORER;
+		break;
+	case IIDC_GLOBR:
+		reset = ICE_RESET_GLOBR;
+		break;
+	default:
+		dev_err(ice_pf_to_dev(pf), "incorrect reset request from aux driver\n");
+		return -EINVAL;
+	}
+
+	return ice_schedule_reset(pf, reset);
+}
+
+/**
+ * ice_cdev_info_update_vsi_filter - update main VSI filters for RDMA
+ * @cdev_info: pointer to struct for AUX device updating filters
+ * @vsi_id: VSI HW index to update filter on
+ * @enable: bool whether to enable or disable filters
+ */
+static int
+ice_cdev_info_update_vsi_filter(struct iidc_core_dev_info *cdev_info,
+				u16 vsi_id, bool enable)
+{
+	enum ice_status status;
+	struct ice_vsi *vsi;
+	struct ice_pf *pf;
+
+	if (!cdev_info)
+		return -EINVAL;
+
+	pf = pci_get_drvdata(cdev_info->pdev);
+
+	vsi = ice_find_vsi(pf, vsi_id);
+	if (!vsi)
+		return -EINVAL;
+
+	status = ice_cfg_iwarp_fltr(&pf->hw, vsi->idx, enable);
+	if (status) {
+		dev_err(ice_pf_to_dev(pf), "Failed to  %sable iWARP filtering\n",
+			enable ? "en" : "dis");
+	} else {
+		if (enable)
+			vsi->info.q_opt_flags |= ICE_AQ_VSI_Q_OPT_PE_FLTR_EN;
+		else
+			vsi->info.q_opt_flags &= ~ICE_AQ_VSI_Q_OPT_PE_FLTR_EN;
+	}
+
+	return ice_status_to_errno(status);
+}
+
+/**
+ * ice_cdev_info_vc_send - send a virt channel message from an AUX driver
+ * @cdev_info: pointer to cdev_info struct for AUX driver
+ * @vf_id: the absolute VF ID of recipient of message
+ * @msg: pointer to message contents
+ * @len: len of message
+ */
+static int
+ice_cdev_info_vc_send(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg,
+		      u16 len)
+{
+	enum ice_status status;
+	struct ice_pf *pf;
+
+	if (!cdev_info)
+		return -EINVAL;
+	if (!msg || !len)
+		return -ENOMEM;
+
+	pf = pci_get_drvdata(cdev_info->pdev);
+	if (len > ICE_AQ_MAX_BUF_LEN)
+		return -EINVAL;
+
+	if (ice_is_reset_in_progress(pf->state))
+		return -EBUSY;
+
+	switch (cdev_info->cdev_info_id) {
+	case IIDC_RDMA_ID:
+		if (vf_id >= pf->num_alloc_vfs)
+			return -ENODEV;
+
+		/* VIRTCHNL_OP_IWARP is being used for RoCEv2 msg also */
+		status = ice_aq_send_msg_to_vf(&pf->hw, vf_id, VIRTCHNL_OP_IWARP,
+					       0, msg, len, NULL);
+		break;
+	default:
+		dev_err(ice_pf_to_dev(pf), "aux driver (%i) not supported!",
+			cdev_info->cdev_info_id);
+		return -ENODEV;
+	}
+
+	if (status)
+		dev_err(ice_pf_to_dev(pf), "Unable to send msg to VF, error %s\n",
+			ice_stat_str(status));
+	return ice_status_to_errno(status);
+}
+
+/**
  * ice_reserve_cdev_info_qvector - Reserve vector resources for AUX drivers
  * @pf: board private structure to initialize
  */
@@ -146,6 +573,16 @@ int ice_cdev_info_update_vsi(struct iidc_core_dev_info *cdev_info, void *data)
 	return 0;
 }
 
+/* Initialize the ice_ops struct, which is used in 'ice_init_aux_devices' */
+static const struct iidc_core_ops ops = {
+	.alloc_res			= ice_cdev_info_alloc_res,
+	.free_res			= ice_cdev_info_free_res,
+	.request_reset			= ice_cdev_info_request_reset,
+	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
+	.vc_send			= ice_cdev_info_vc_send,
+
+};
+
 /**
  * ice_init_aux_devices - initializes cdev_info objects and AUX devices
  * @pf: ptr to ice_pf
@@ -208,6 +645,8 @@ int ice_init_aux_devices(struct ice_pf *pf)
 
 		/* for DCB, override the qos_info defaults. */
 		ice_setup_dcb_qos_info(pf, qos_info);
+		/* Initialize ice_ops */
+		cdev_info->ops = &ops;
 
 		/* make sure AUX specific resources such as msix_count and
 		 * msix_entries are initialized
diff --git a/drivers/net/ethernet/intel/ice/ice_idc_int.h b/drivers/net/ethernet/intel/ice/ice_idc_int.h
index 67b63c6..ce100ff 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc_int.h
+++ b/drivers/net/ethernet/intel/ice/ice_idc_int.h
@@ -9,6 +9,9 @@
 
 struct ice_pf;
 
+int ice_send_event_to_auxs(struct ice_pf *pf, struct iidc_event *event);
+struct iidc_auxiliary_ops *
+ice_get_auxiliary_ops(struct iidc_core_dev_info *cdev_info);
 int ice_cdev_info_update_vsi(struct iidc_core_dev_info *cdev_info, void *data);
 int ice_unroll_cdev_info(struct iidc_core_dev_info *cdev_info, void *data);
 struct iidc_core_dev_info *
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index d1b2a6e..8baf3ac 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2607,6 +2607,7 @@ static void ice_ena_misc_vector(struct ice_pf *pf)
 	       PFINT_OICR_PCI_EXCEPTION_M |
 	       PFINT_OICR_VFLR_M |
 	       PFINT_OICR_HMC_ERR_M |
+	       PFINT_OICR_PE_PUSH_M |
 	       PFINT_OICR_PE_CRITERR_M);
 
 	wr32(hw, PFINT_OICR_ENA, val);
@@ -2677,8 +2678,6 @@ static irqreturn_t ice_misc_intr(int __always_unused irq, void *data)
 
 		/* If a reset cycle isn't already in progress, we set a bit in
 		 * pf->state so that the service task can start a reset/rebuild.
-		 * We also make note of which reset happened so that peer
-		 * devices/drivers can be informed.
 		 */
 		if (!test_and_set_bit(__ICE_RESET_OICR_RECV, pf->state)) {
 			if (reset == ICE_RESET_CORER)
@@ -2705,11 +2704,19 @@ static irqreturn_t ice_misc_intr(int __always_unused irq, void *data)
 		}
 	}
 
-	if (oicr & PFINT_OICR_HMC_ERR_M) {
-		ena_mask &= ~PFINT_OICR_HMC_ERR_M;
-		dev_dbg(dev, "HMC Error interrupt - info 0x%x, data 0x%x\n",
-			rd32(hw, PFHMC_ERRORINFO),
-			rd32(hw, PFHMC_ERRORDATA));
+#define ICE_AUX_CRIT_ERR (PFINT_OICR_PE_CRITERR_M | PFINT_OICR_HMC_ERR_M | PFINT_OICR_PE_PUSH_M)
+	if (oicr & ICE_AUX_CRIT_ERR) {
+		struct iidc_event *event;
+
+		ena_mask &= ~ICE_AUX_CRIT_ERR;
+		event = kzalloc(sizeof(*event), GFP_KERNEL);
+		if (event) {
+			set_bit(IIDC_EVENT_CRIT_ERR, event->type);
+			/* report the entire OICR value to AUX driver */
+			event->info.reg = oicr;
+			ice_send_event_to_auxs(pf, event);
+			kfree(event);
+		}
 	}
 
 	/* Report any remaining unexpected interrupts */
@@ -2719,8 +2726,7 @@ static irqreturn_t ice_misc_intr(int __always_unused irq, void *data)
 		/* If a critical error is pending there is no choice but to
 		 * reset the device.
 		 */
-		if (oicr & (PFINT_OICR_PE_CRITERR_M |
-			    PFINT_OICR_PCI_EXCEPTION_M |
+		if (oicr & (PFINT_OICR_PCI_EXCEPTION_M |
 			    PFINT_OICR_ECC_ERR_M)) {
 			set_bit(__ICE_PFR_REQ, pf->state);
 			ice_service_task_schedule(pf);
@@ -4454,10 +4460,11 @@ static void ice_remove(struct pci_dev *pdev)
 		ice_free_vfs(pf);
 	}
 
-	set_bit(__ICE_DOWN, pf->state);
 	ice_service_task_stop(pf);
 
 	ice_aq_cancel_waiting_tasks(pf);
+	ice_for_each_aux(pf, NULL, ice_unroll_cdev_info);
+	set_bit(__ICE_DOWN, pf->state);
 
 	mutex_destroy(&(&pf->hw)->fdir_fltr_lock);
 	ice_deinit_lag(pf);
@@ -6224,7 +6231,9 @@ static int ice_change_mtu(struct net_device *netdev, int new_mtu)
 	struct ice_netdev_priv *np = netdev_priv(netdev);
 	struct ice_vsi *vsi = np->vsi;
 	struct ice_pf *pf = vsi->back;
+	struct iidc_event *event;
 	u8 count = 0;
+	int err = 0;
 
 	if (new_mtu == (int)netdev->mtu) {
 		netdev_warn(netdev, "MTU is already %u\n", netdev->mtu);
@@ -6257,27 +6266,38 @@ static int ice_change_mtu(struct net_device *netdev, int new_mtu)
 		return -EBUSY;
 	}
 
+	event = kzalloc(sizeof(*event), GFP_KERNEL);
+	if (!event)
+		return -ENOMEM;
+
+	set_bit(IIDC_EVENT_BEFORE_MTU_CHANGE, event->type);
+	ice_send_event_to_auxs(pf, event);
+	clear_bit(IIDC_EVENT_BEFORE_MTU_CHANGE, event->type);
+
 	netdev->mtu = (unsigned int)new_mtu;
 
 	/* if VSI is up, bring it down and then back up */
 	if (!test_and_set_bit(__ICE_DOWN, vsi->state)) {
-		int err;
-
 		err = ice_down(vsi);
 		if (err) {
 			netdev_err(netdev, "change MTU if_down err %d\n", err);
-			return err;
+			goto free_event;
 		}
 
 		err = ice_up(vsi);
 		if (err) {
 			netdev_err(netdev, "change MTU if_up err %d\n", err);
-			return err;
+			goto free_event;
 		}
 	}
 
 	netdev_dbg(netdev, "changed MTU to %d\n", new_mtu);
-	return 0;
+free_event:
+	set_bit(IIDC_EVENT_AFTER_MTU_CHANGE, event->type);
+	ice_send_event_to_auxs(pf, event);
+	kfree(event);
+
+	return err;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c
index 2403cb3..c30e7d4 100644
--- a/drivers/net/ethernet/intel/ice/ice_sched.c
+++ b/drivers/net/ethernet/intel/ice/ice_sched.c
@@ -596,6 +596,50 @@ void ice_free_sched_node(struct ice_port_info *pi, struct ice_sched_node *node)
 }
 
 /**
+ * ice_alloc_rdma_q_ctx - allocate RDMA queue contexts for the given VSI and TC
+ * @hw: pointer to the HW struct
+ * @vsi_handle: VSI handle
+ * @tc: TC number
+ * @new_numqs: number of queues
+ */
+static enum ice_status
+ice_alloc_rdma_q_ctx(struct ice_hw *hw, u16 vsi_handle, u8 tc, u16 new_numqs)
+{
+	struct ice_vsi_ctx *vsi_ctx;
+	struct ice_q_ctx *q_ctx;
+
+	vsi_ctx = ice_get_vsi_ctx(hw, vsi_handle);
+	if (!vsi_ctx)
+		return ICE_ERR_PARAM;
+	/* allocate RDMA queue contexts */
+	if (!vsi_ctx->rdma_q_ctx[tc]) {
+		vsi_ctx->rdma_q_ctx[tc] = devm_kcalloc(ice_hw_to_dev(hw),
+						       new_numqs,
+						       sizeof(*q_ctx),
+						       GFP_KERNEL);
+		if (!vsi_ctx->rdma_q_ctx[tc])
+			return ICE_ERR_NO_MEMORY;
+		vsi_ctx->num_rdma_q_entries[tc] = new_numqs;
+		return 0;
+	}
+	/* num queues are increased, update the queue contexts */
+	if (new_numqs > vsi_ctx->num_rdma_q_entries[tc]) {
+		u16 prev_num = vsi_ctx->num_rdma_q_entries[tc];
+
+		q_ctx = devm_kcalloc(ice_hw_to_dev(hw), new_numqs,
+				     sizeof(*q_ctx), GFP_KERNEL);
+		if (!q_ctx)
+			return ICE_ERR_NO_MEMORY;
+		memcpy(q_ctx, vsi_ctx->rdma_q_ctx[tc],
+		       prev_num * sizeof(*q_ctx));
+		devm_kfree(ice_hw_to_dev(hw), vsi_ctx->rdma_q_ctx[tc]);
+		vsi_ctx->rdma_q_ctx[tc] = q_ctx;
+		vsi_ctx->num_rdma_q_entries[tc] = new_numqs;
+	}
+	return 0;
+}
+
+/**
  * ice_aq_rl_profile - performs a rate limiting task
  * @hw: pointer to the HW struct
  * @opcode: opcode for add, query, or remove profile(s)
@@ -1749,13 +1793,22 @@ struct ice_sched_node *
 	if (!vsi_ctx)
 		return ICE_ERR_PARAM;
 
-	prev_numqs = vsi_ctx->sched.max_lanq[tc];
+	if (owner == ICE_SCHED_NODE_OWNER_LAN)
+		prev_numqs = vsi_ctx->sched.max_lanq[tc];
+	else
+		prev_numqs = vsi_ctx->sched.max_rdmaq[tc];
 	/* num queues are not changed or less than the previous number */
 	if (new_numqs <= prev_numqs)
 		return status;
-	status = ice_alloc_lan_q_ctx(hw, vsi_handle, tc, new_numqs);
-	if (status)
-		return status;
+	if (owner == ICE_SCHED_NODE_OWNER_LAN) {
+		status = ice_alloc_lan_q_ctx(hw, vsi_handle, tc, new_numqs);
+		if (status)
+			return status;
+	} else {
+		status = ice_alloc_rdma_q_ctx(hw, vsi_handle, tc, new_numqs);
+		if (status)
+			return status;
+	}
 
 	if (new_numqs)
 		ice_sched_calc_vsi_child_nodes(hw, new_numqs, new_num_nodes);
@@ -1770,7 +1823,10 @@ struct ice_sched_node *
 					       new_num_nodes, owner);
 	if (status)
 		return status;
-	vsi_ctx->sched.max_lanq[tc] = new_numqs;
+	if (owner == ICE_SCHED_NODE_OWNER_LAN)
+		vsi_ctx->sched.max_lanq[tc] = new_numqs;
+	else
+		vsi_ctx->sched.max_rdmaq[tc] = new_numqs;
 
 	return 0;
 }
@@ -1836,6 +1892,7 @@ enum ice_status
 		 * recreate the child nodes all the time in these cases.
 		 */
 		vsi_ctx->sched.max_lanq[tc] = 0;
+		vsi_ctx->sched.max_rdmaq[tc] = 0;
 	}
 
 	/* update the VSI child nodes */
@@ -1965,6 +2022,8 @@ static bool ice_sched_is_leaf_node_present(struct ice_sched_node *node)
 		}
 		if (owner == ICE_SCHED_NODE_OWNER_LAN)
 			vsi_ctx->sched.max_lanq[i] = 0;
+		else
+			vsi_ctx->sched.max_rdmaq[i] = 0;
 	}
 	status = 0;
 
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c
index 67c965a..ebb8453 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.c
+++ b/drivers/net/ethernet/intel/ice/ice_switch.c
@@ -302,6 +302,10 @@ static void ice_clear_vsi_q_ctx(struct ice_hw *hw, u16 vsi_handle)
 			devm_kfree(ice_hw_to_dev(hw), vsi->lan_q_ctx[i]);
 			vsi->lan_q_ctx[i] = NULL;
 		}
+		if (vsi->rdma_q_ctx[i]) {
+			devm_kfree(ice_hw_to_dev(hw), vsi->rdma_q_ctx[i]);
+			vsi->rdma_q_ctx[i] = NULL;
+		}
 	}
 }
 
@@ -423,6 +427,29 @@ enum ice_status
 }
 
 /**
+ * ice_cfg_iwarp_fltr - enable/disable iWARP filtering on VSI
+ * @hw: pointer to HW struct
+ * @vsi_handle: VSI SW index
+ * @enable: boolean for enable/disable
+ */
+enum ice_status
+ice_cfg_iwarp_fltr(struct ice_hw *hw, u16 vsi_handle, bool enable)
+{
+	struct ice_vsi_ctx *ctx;
+
+	ctx = ice_get_vsi_ctx(hw, vsi_handle);
+	if (!ctx)
+		return ICE_ERR_DOES_NOT_EXIST;
+
+	if (enable)
+		ctx->info.q_opt_flags |= ICE_AQ_VSI_Q_OPT_PE_FLTR_EN;
+	else
+		ctx->info.q_opt_flags &= ~ICE_AQ_VSI_Q_OPT_PE_FLTR_EN;
+
+	return ice_update_vsi(hw, vsi_handle, ctx, NULL);
+}
+
+/**
  * ice_aq_alloc_free_vsi_list
  * @hw: pointer to the HW struct
  * @vsi_list_id: VSI list ID returned or used for lookup
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.h b/drivers/net/ethernet/intel/ice/ice_switch.h
index 8b4f9d3..6821cf7 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.h
+++ b/drivers/net/ethernet/intel/ice/ice_switch.h
@@ -26,6 +26,8 @@ struct ice_vsi_ctx {
 	u8 vf_num;
 	u16 num_lan_q_entries[ICE_MAX_TRAFFIC_CLASS];
 	struct ice_q_ctx *lan_q_ctx[ICE_MAX_TRAFFIC_CLASS];
+	u16 num_rdma_q_entries[ICE_MAX_TRAFFIC_CLASS];
+	struct ice_q_ctx *rdma_q_ctx[ICE_MAX_TRAFFIC_CLASS];
 };
 
 enum ice_sw_fwd_act_type {
@@ -223,6 +225,8 @@ enum ice_status
 ice_add_eth_mac(struct ice_hw *hw, struct list_head *em_list);
 enum ice_status
 ice_remove_eth_mac(struct ice_hw *hw, struct list_head *em_list);
+enum ice_status
+ice_cfg_iwarp_fltr(struct ice_hw *hw, u16 vsi_handle, bool enable);
 void ice_remove_vsi_fltr(struct ice_hw *hw, u16 vsi_handle);
 enum ice_status
 ice_add_vlan(struct ice_hw *hw, struct list_head *m_list);
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index 4eaab60..0b77358 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -45,6 +45,7 @@ static inline u32 ice_round_to_num(u32 N, u32 R)
 #define ICE_DBG_FLOW		BIT_ULL(9)
 #define ICE_DBG_SW		BIT_ULL(13)
 #define ICE_DBG_SCHED		BIT_ULL(14)
+#define ICE_DBG_RDMA		BIT_ULL(15)
 #define ICE_DBG_PKG		BIT_ULL(16)
 #define ICE_DBG_RES		BIT_ULL(17)
 #define ICE_DBG_AQ_MSG		BIT_ULL(24)
@@ -423,6 +424,7 @@ struct ice_sched_node {
 	u8 tc_num;
 	u8 owner;
 #define ICE_SCHED_NODE_OWNER_LAN	0
+#define ICE_SCHED_NODE_OWNER_RDMA	2
 };
 
 /* Access Macros for Tx Sched Elements data */
@@ -494,6 +496,7 @@ struct ice_sched_vsi_info {
 	struct ice_sched_node *ag_node[ICE_MAX_TRAFFIC_CLASS];
 	struct list_head list_entry;
 	u16 max_lanq[ICE_MAX_TRAFFIC_CLASS];
+	u16 max_rdmaq[ICE_MAX_TRAFFIC_CLASS];
 };
 
 /* driver defines the policy */
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index 1f38a8d..fa70abb 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -3680,6 +3680,37 @@ static int ice_vc_dis_vlan_stripping(struct ice_vf *vf)
 }
 
 /**
+ * ice_vc_rdma_msg - send msg to RDMA PF from VF
+ * @vf: pointer to VF info
+ * @msg: pointer to msg buffer
+ * @len: length of the message
+ *
+ * This function is called indirectly from the AQ clean function.
+ */
+static int ice_vc_rdma_msg(struct ice_vf *vf, u8 *msg, u16 len)
+{
+	struct iidc_core_dev_info *rcdi;
+	struct iidc_auxiliary_ops *ops;
+	int ret = 0;
+
+	rcdi = ice_find_cdev_info_by_id(vf->pf, IIDC_RDMA_ID);
+	if (!rcdi) {
+		pr_err("VF attempted to send message to invalid RDMA driver\n");
+		return -EIO;
+	}
+
+	device_lock(&rcdi->adev->dev);
+	ops = ice_get_auxiliary_ops(rcdi);
+	if (ops && ops->vc_receive)
+		ret = ops->vc_receive(rcdi, vf->vf_id, msg, len);
+	device_unlock(&rcdi->adev->dev);
+	if (ret)
+		pr_err("Failed to send message to RDMA driver, error %d\n", ret);
+
+	return ret;
+}
+
+/**
  * ice_vf_init_vlan_stripping - enable/disable VLAN stripping on initialization
  * @vf: VF to enable/disable VLAN stripping for on initialization
  *
@@ -3816,6 +3847,9 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event)
 	case VIRTCHNL_OP_DISABLE_VLAN_STRIPPING:
 		err = ice_vc_dis_vlan_stripping(vf);
 		break;
+	case VIRTCHNL_OP_IWARP:
+		err = ice_vc_rdma_msg(vf, msg, msglen);
+		break;
 	case VIRTCHNL_OP_UNKNOWN:
 	default:
 		dev_err(dev, "Unsupported opcode %d from VF %d\n", v_opcode,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 04/23] ice: Register auxiliary device to provide RDMA
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (2 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 03/23] ice: Implement iidc operations Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 05/23] ice: Add devlink params support Shiraz Saleem
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

From: Dave Ertman <david.m.ertman@intel.com>

Register ice client auxiliary RDMA device on the auxiliary bus per
PCIe device function for the auxiliary driver (irdma) to attach to.
It allows to realize a single RDMA driver (irdma) capable of working with
multiple netdev drivers over multi-generation Intel HW supporting RDMA.
There is no load ordering dependencies between ice and irdma.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/net/ethernet/intel/Kconfig        |   1 +
 drivers/net/ethernet/intel/ice/ice.h      |   8 +-
 drivers/net/ethernet/intel/ice/ice_idc.c  | 123 ++++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_main.c |   9 +++
 4 files changed, 140 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index 5aa8631..cbc5968 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -294,6 +294,7 @@ config ICE
 	tristate "Intel(R) Ethernet Connection E800 Series Support"
 	default n
 	depends on PCI_MSI
+	select AUXILIARY_BUS
 	select NET_DEVLINK
 	select PLDMFW
 	help
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 561f8fd..41bae4d 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -34,6 +34,7 @@
 #include <linux/if_bridge.h>
 #include <linux/ctype.h>
 #include <linux/bpf.h>
+#include <linux/auxiliary_bus.h>
 #include <linux/avf/virtchnl.h>
 #include <linux/cpu_rmap.h>
 #include <net/devlink.h>
@@ -633,6 +634,8 @@ static inline void ice_clear_sriov_cap(struct ice_pf *pf)
 void ice_fill_rss_lut(u8 *lut, u16 rss_table_size, u16 rss_size);
 int ice_schedule_reset(struct ice_pf *pf, enum ice_reset_req reset);
 void ice_print_link_msg(struct ice_vsi *vsi, bool isup);
+int ice_plug_aux_devs(struct ice_pf *pf);
+void ice_unplug_aux_devs(struct ice_pf *pf);
 int ice_init_aux_devices(struct ice_pf *pf);
 int
 ice_for_each_aux(struct ice_pf *pf, void *data,
@@ -667,8 +670,10 @@ int ice_aq_wait_for_event(struct ice_pf *pf, u16 opcode, unsigned long timeout,
  */
 static inline void ice_set_rdma_cap(struct ice_pf *pf)
 {
-	if (pf->hw.func_caps.common_cap.iwarp && pf->num_rdma_msix)
+	if (pf->hw.func_caps.common_cap.iwarp && pf->num_rdma_msix) {
 		set_bit(ICE_FLAG_IWARP_ENA, pf->flags);
+		ice_plug_aux_devs(pf);
+	}
 }
 
 /**
@@ -677,6 +682,7 @@ static inline void ice_set_rdma_cap(struct ice_pf *pf)
  */
 static inline void ice_clear_rdma_cap(struct ice_pf *pf)
 {
+	ice_unplug_aux_devs(pf);
 	clear_bit(ICE_FLAG_IWARP_ENA, pf->flags);
 }
 #endif /* _ICE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ethernet/intel/ice/ice_idc.c
index 547718b..95695d2 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc.c
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -584,6 +584,109 @@ int ice_cdev_info_update_vsi(struct iidc_core_dev_info *cdev_info, void *data)
 };
 
 /**
+ * ice_cdev_info_adev_release - function to be mapped to AUX dev's release op
+ * @dev: pointer to device to free
+ */
+static void ice_cdev_info_adev_release(struct device *dev)
+{
+	struct iidc_auxiliary_dev *iadev;
+
+	iadev = container_of(dev, struct iidc_auxiliary_dev, adev.dev);
+	kfree(iadev->adev.name);
+	kfree(iadev);
+}
+
+/**
+ * ice_plug_aux_devs - allocate and register one AUX dev per cdev_info in PF
+ * @pf: pointer to PF struct
+ */
+int ice_plug_aux_devs(struct ice_pf *pf)
+{
+	struct iidc_auxiliary_dev *iadev;
+	int ret, i;
+
+	if (!pf->cdev_infos)
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(ice_cdev_ids); i++) {
+		struct iidc_core_dev_info *cdev_info;
+		struct auxiliary_device *adev;
+
+		cdev_info = pf->cdev_infos[i];
+		if (!cdev_info)
+			continue;
+
+		iadev = kzalloc(sizeof(*iadev), GFP_KERNEL);
+		if (!iadev)
+			return -ENOMEM;
+
+		adev = &iadev->adev;
+		cdev_info->adev = adev;
+		iadev->cdev_info = cdev_info;
+
+		if (ice_cdev_ids[i].id == IIDC_RDMA_ID) {
+			if (cdev_info->rdma_protocol ==
+			    IIDC_RDMA_PROTOCOL_IWARP)
+				adev->name = kasprintf(GFP_KERNEL, "%s_%s",
+						       ice_cdev_ids[i].name,
+						       "iwarp");
+			else
+				adev->name = kasprintf(GFP_KERNEL, "%s_%s",
+						       ice_cdev_ids[i].name,
+						       "roce");
+		} else {
+			adev->name = kasprintf(GFP_KERNEL, "%s",
+					       ice_cdev_ids[i].name);
+		}
+		adev->id = pf->aux_idx;
+		adev->dev.release = ice_cdev_info_adev_release;
+		adev->dev.parent = &cdev_info->pdev->dev;
+
+		ret = auxiliary_device_init(adev);
+		if (ret) {
+			cdev_info->adev = NULL;
+			kfree(adev->name);
+			kfree(iadev);
+			return ret;
+		}
+
+		ret = auxiliary_device_add(adev);
+		if (ret) {
+			cdev_info->adev = NULL;
+			auxiliary_device_uninit(adev);
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+/**
+ * ice_unplug_aux_devs - unregister and free AUX devs
+ * @pf: pointer to PF struct
+ */
+void ice_unplug_aux_devs(struct ice_pf *pf)
+{
+	int i;
+
+	if (!pf->cdev_infos)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(ice_cdev_ids); i++) {
+		struct iidc_core_dev_info *cdev_info;
+
+		cdev_info = pf->cdev_infos[i];
+		/* if this AUX dev has already been unplugged move on */
+		if (!cdev_info->adev)
+			continue;
+
+		auxiliary_device_delete(cdev_info->adev);
+		auxiliary_device_uninit(cdev_info->adev);
+		cdev_info->adev = NULL;
+	}
+}
+
+/**
  * ice_init_aux_devices - initializes cdev_info objects and AUX devices
  * @pf: ptr to ice_pf
  */
@@ -615,6 +718,19 @@ int ice_init_aux_devices(struct ice_pf *pf)
 		struct msix_entry *entry = NULL;
 		int j;
 
+		/* structure layout needed for container_of's looks like:
+		 * iidc_auxiliary_dev (container_of super-struct for adev)
+		 * |--> auxiliary_device
+		 * |--> *iidc_core_dev_info (pointer from cdev_info struct)
+		 *
+		 * The iidc_auxiliary_device has a lifespan as long as it
+		 * is on the bus.  Once removed it will be freed and a new
+		 * one allocated if needed to re-add.
+		 *
+		 * The iidc_core_dev_info is tied to the life of the PF, and
+		 * will exist as long as the PF driver is loaded.  It will be
+		 * freed in the remove flow for the PF driver.
+		 */
 		cdev_info = kzalloc(sizeof(*cdev_info), GFP_KERNEL);
 		if (!cdev_info) {
 			ida_simple_remove(&ice_cdev_info_ida, pf->aux_idx);
@@ -666,5 +782,12 @@ int ice_init_aux_devices(struct ice_pf *pf)
 		cdev_info->msix_entries = entry;
 	}
 
+	ret = ice_plug_aux_devs(pf);
+	if (ret) {
+		ice_unplug_aux_devs(pf);
+		ice_for_each_aux(pf, NULL, ice_unroll_cdev_info);
+		ida_simple_remove(&ice_cdev_info_ida, pf->aux_idx);
+	}
+
 	return ret;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8baf3ac..3d750ba 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -466,6 +466,8 @@ static void ice_pf_dis_all_vsi(struct ice_pf *pf, bool locked)
 	if (test_bit(__ICE_PREPARED_FOR_RESET, pf->state))
 		return;
 
+	ice_unplug_aux_devs(pf);
+
 	/* Notify VFs of impending reset */
 	if (ice_check_sq_alive(hw, &hw->mailboxq))
 		ice_vc_notify_reset(pf);
@@ -2122,6 +2124,8 @@ int ice_schedule_reset(struct ice_pf *pf, enum ice_reset_req reset)
 		return -EBUSY;
 	}
 
+	ice_unplug_aux_devs(pf);
+
 	switch (reset) {
 	case ICE_RESET_PFR:
 		set_bit(__ICE_PFR_REQ, pf->state);
@@ -4463,6 +4467,7 @@ static void ice_remove(struct pci_dev *pdev)
 	ice_service_task_stop(pf);
 
 	ice_aq_cancel_waiting_tasks(pf);
+	ice_unplug_aux_devs(pf);
 	ice_for_each_aux(pf, NULL, ice_unroll_cdev_info);
 	set_bit(__ICE_DOWN, pf->state);
 
@@ -4620,6 +4625,8 @@ static int __maybe_unused ice_suspend(struct device *dev)
 	 */
 	disabled = ice_service_task_stop(pf);
 
+	ice_unplug_aux_devs(pf);
+
 	/* Already suspended?, then there is nothing to do */
 	if (test_and_set_bit(__ICE_SUSPENDED, pf->state)) {
 		if (!disabled)
@@ -6193,6 +6200,8 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type)
 
 	/* if we get here, reset flow is successful */
 	clear_bit(__ICE_RESET_FAILED, pf->state);
+
+	ice_plug_aux_devs(pf);
 	return;
 
 err_vsi_rebuild:
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 05/23] ice: Add devlink params support
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (3 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 04/23] ice: Register auxiliary device to provide RDMA Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-07 14:57   ` Jason Gunthorpe
  2021-04-06 21:01 ` [PATCH v4 06/23] i40e: Prep i40e header for aux bus conversion Shiraz Saleem
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

Add a new generic runtime devlink parameter 'rdma_protocol'
and use it in ice PCI driver. Configuration changes
result in unplugging the auxiliary RDMA device and re-plugging
it with updated values for irdma auxiiary driver to consume at
drv.probe()

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 .../networking/devlink/devlink-params.rst          |  6 ++
 Documentation/networking/devlink/ice.rst           | 13 +++
 drivers/net/ethernet/intel/ice/ice_devlink.c       | 92 +++++++++++++++++++++-
 drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
 drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
 include/net/devlink.h                              |  4 +
 net/core/devlink.c                                 |  5 ++
 7 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
index 54c9f10..0b454c3 100644
--- a/Documentation/networking/devlink/devlink-params.rst
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -114,3 +114,9 @@ own name.
        will NACK any attempt of other host to reset the device. This parameter
        is useful for setups where a device is shared by different hosts, such
        as multi-host setup.
+   * - ``rdma_protocol``
+     - string
+     - Selects the RDMA protocol selected for multi-protocol devices.
+        - ``iwarp`` iWARP
+	- ``roce`` RoCE
+	- ``ib`` Infiniband
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index a432dc4..2e04c99 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -193,3 +193,16 @@ Users can request an immediate capture of a snapshot via the
     0000000000000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 
     $ devlink region delete pci/0000:01:00.0/device-caps snapshot 1
+
+Parameters
+==========
+
+The ``ice`` driver implements the following generic and driver-specific
+parameters.
+
+.. list-table:: Generic parameters implemented
+
+   * - Name
+     - Mode
+   * - ``rdma_protocol``
+     - runtime
diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c
index cf685ee..de03eb8 100644
--- a/drivers/net/ethernet/intel/ice/ice_devlink.c
+++ b/drivers/net/ethernet/intel/ice/ice_devlink.c
@@ -449,6 +449,64 @@ static int ice_devlink_info_get(struct devlink *devlink,
 	.flash_update = ice_devlink_flash_update,
 };
 
+static int
+ice_devlink_rdma_prot_get(struct devlink *devlink, u32 id,
+			  struct devlink_param_gset_ctx *ctx)
+{
+	struct ice_pf *pf = devlink_priv(devlink);
+	struct iidc_core_dev_info *cdev_info =
+		ice_find_cdev_info_by_id(pf, IIDC_RDMA_ID);
+
+	if (cdev_info->rdma_protocol == IIDC_RDMA_PROTOCOL_IWARP)
+		strcpy(ctx->val.vstr, "iwarp");
+	else
+		strcpy(ctx->val.vstr, "roce");
+
+	return 0;
+}
+
+static int
+ice_devlink_rdma_prot_set(struct devlink *devlink, u32 id,
+			  struct devlink_param_gset_ctx *ctx)
+{
+	struct ice_pf *pf = devlink_priv(devlink);
+	struct iidc_core_dev_info *cdev_info =
+		ice_find_cdev_info_by_id(pf, IIDC_RDMA_ID);
+	enum iidc_rdma_protocol prot = !strcmp(ctx->val.vstr, "iwarp") ?
+					IIDC_RDMA_PROTOCOL_IWARP :
+					IIDC_RDMA_PROTOCOL_ROCEV2;
+
+	if (cdev_info->rdma_protocol != prot) {
+		ice_unplug_aux_devs(pf);
+		cdev_info->rdma_protocol = prot;
+		ice_plug_aux_devs(pf);
+	}
+
+	return 0;
+}
+
+static int
+ice_devlink_rdma_prot_validate(struct devlink *devlink, u32 id,
+			       union devlink_param_value val,
+			       struct netlink_ext_ack *extack)
+{
+	char *value = val.vstr;
+
+	if (!strcmp(value, "iwarp") || !strcmp(value, "roce"))
+		return 0;
+
+	NL_SET_ERR_MSG_MOD(extack, "\"iwarp\" and \"roce\" are the only supported values");
+
+	return -EINVAL;
+}
+
+static const struct devlink_param ice_devlink_params[] = {
+	DEVLINK_PARAM_GENERIC(RDMA_PROTOCOL, BIT(DEVLINK_PARAM_CMODE_RUNTIME),
+			      ice_devlink_rdma_prot_get,
+			      ice_devlink_rdma_prot_set,
+			      ice_devlink_rdma_prot_validate),
+};
+
 static void ice_devlink_free(void *devlink_ptr)
 {
 	devlink_free((struct devlink *)devlink_ptr);
@@ -491,15 +549,31 @@ int ice_devlink_register(struct ice_pf *pf)
 {
 	struct devlink *devlink = priv_to_devlink(pf);
 	struct device *dev = ice_pf_to_dev(pf);
+	union devlink_param_value value;
 	int err;
 
 	err = devlink_register(devlink, dev);
+	if (err)
+		goto err;
+
+	err = devlink_params_register(devlink, ice_devlink_params,
+				      ARRAY_SIZE(ice_devlink_params));
 	if (err) {
-		dev_err(dev, "devlink registration failed: %d\n", err);
-		return err;
+		devlink_unregister(devlink);
+		goto err;
 	}
 
+	strcpy(value.vstr, "iwarp");
+	devlink_param_driverinit_value_set(devlink,
+					   DEVLINK_PARAM_GENERIC_ID_RDMA_PROTOCOL,
+					   value);
+
 	return 0;
+
+err:
+	dev_err(dev, "devlink registration failed: %d\n", err);
+
+	return err;
 }
 
 /**
@@ -510,10 +584,24 @@ int ice_devlink_register(struct ice_pf *pf)
  */
 void ice_devlink_unregister(struct ice_pf *pf)
 {
+	devlink_params_unregister(priv_to_devlink(pf), ice_devlink_params,
+				  ARRAY_SIZE(ice_devlink_params));
 	devlink_unregister(priv_to_devlink(pf));
 }
 
 /**
+ * ice_devlink_params_publish - Publish devlink param
+ * @pf: the PF structure to cleanup
+ *
+ * Publish previously registered devlink parameters after driver
+ * is initialized
+ */
+void ice_devlink_params_publish(struct ice_pf *pf)
+{
+	devlink_params_publish(priv_to_devlink(pf));
+}
+
+/**
  * ice_devlink_create_port - Create a devlink port for this VSI
  * @vsi: the VSI to create a port for
  *
diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.h b/drivers/net/ethernet/intel/ice/ice_devlink.h
index e07e744..e7239fa 100644
--- a/drivers/net/ethernet/intel/ice/ice_devlink.h
+++ b/drivers/net/ethernet/intel/ice/ice_devlink.h
@@ -4,10 +4,15 @@
 #ifndef _ICE_DEVLINK_H_
 #define _ICE_DEVLINK_H_
 
+enum ice_devlink_param_id {
+	ICE_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX,
+};
+
 struct ice_pf *ice_allocate_pf(struct device *dev);
 
 int ice_devlink_register(struct ice_pf *pf);
 void ice_devlink_unregister(struct ice_pf *pf);
+void ice_devlink_params_publish(struct ice_pf *pf);
 int ice_devlink_create_port(struct ice_vsi *vsi);
 void ice_devlink_destroy_port(struct ice_vsi *vsi);
 
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 3d750ba..3e3a9cf 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -4346,6 +4346,8 @@ static void ice_print_wake_reason(struct ice_pf *pf)
 		dev_warn(dev, "RDMA is not supported on this device\n");
 	}
 
+	ice_devlink_params_publish(pf);
+
 	return 0;
 
 err_init_aux_unroll:
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 853420d..09e4d76 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -498,6 +498,7 @@ enum devlink_param_generic_id {
 	DEVLINK_PARAM_GENERIC_ID_RESET_DEV_ON_DRV_PROBE,
 	DEVLINK_PARAM_GENERIC_ID_ENABLE_ROCE,
 	DEVLINK_PARAM_GENERIC_ID_ENABLE_REMOTE_DEV_RESET,
+	DEVLINK_PARAM_GENERIC_ID_RDMA_PROTOCOL,
 
 	/* add new param generic ids above here*/
 	__DEVLINK_PARAM_GENERIC_ID_MAX,
@@ -538,6 +539,9 @@ enum devlink_param_generic_id {
 #define DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_NAME "enable_remote_dev_reset"
 #define DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_TYPE DEVLINK_PARAM_TYPE_BOOL
 
+#define DEVLINK_PARAM_GENERIC_RDMA_PROTOCOL_NAME "rdma_protocol"
+#define DEVLINK_PARAM_GENERIC_RDMA_PROTOCOL_TYPE DEVLINK_PARAM_TYPE_STRING
+
 #define DEVLINK_PARAM_GENERIC(_id, _cmodes, _get, _set, _validate)	\
 {									\
 	.id = DEVLINK_PARAM_GENERIC_ID_##_id,				\
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 737b61c..1bb3865 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3766,6 +3766,11 @@ static int devlink_nl_cmd_flash_update(struct sk_buff *skb,
 		.name = DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_NAME,
 		.type = DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_TYPE,
 	},
+	{
+		.id = DEVLINK_PARAM_GENERIC_ID_RDMA_PROTOCOL,
+		.name = DEVLINK_PARAM_GENERIC_RDMA_PROTOCOL_NAME,
+		.type = DEVLINK_PARAM_GENERIC_RDMA_PROTOCOL_TYPE,
+	},
 };
 
 static int devlink_param_generic_verify(const struct devlink_param *param)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 06/23] i40e: Prep i40e header for aux bus conversion
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (4 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 05/23] ice: Add devlink params support Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 07/23] i40e: Register auxiliary devices to provide RDMA Shiraz Saleem
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

Add the definitions and private ops to the i40e client
header file in preparation to convert i40e to use
the new auxiliary bus infrastructure. This header
is shared between the 'i40e' Intel networking driver
providing RDMA support and the 'irdma' driver.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 include/linux/net/intel/i40e_client.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/net/intel/i40e_client.h b/include/linux/net/intel/i40e_client.h
index f41387a..f4302f6 100644
--- a/include/linux/net/intel/i40e_client.h
+++ b/include/linux/net/intel/i40e_client.h
@@ -4,6 +4,8 @@
 #ifndef _I40E_CLIENT_H_
 #define _I40E_CLIENT_H_
 
+#include <linux/auxiliary_bus.h>
+
 #define I40E_CLIENT_STR_LENGTH 10
 
 /* Client interface version should be updated anytime there is a change in the
@@ -78,6 +80,7 @@ struct i40e_info {
 	u8 lanmac[6];
 	struct net_device *netdev;
 	struct pci_dev *pcidev;
+	struct auxiliary_device *aux_dev;
 	u8 __iomem *hw_addr;
 	u8 fid;	/* function id, PF id or VF id */
 #define I40E_CLIENT_FTYPE_PF 0
@@ -90,6 +93,7 @@ struct i40e_info {
 	struct i40e_qvlist_info *qvlist_info;
 	struct i40e_params params;
 	struct i40e_ops *ops;
+	struct i40e_client *client;
 
 	u16 msix_count;	 /* number of msix vectors*/
 	/* Array down below will be dynamically allocated based on msix_count */
@@ -100,6 +104,11 @@ struct i40e_info {
 	u32 fw_build;                   /* firmware build number */
 };
 
+struct i40e_auxiliary_device {
+	struct auxiliary_device aux_dev;
+	struct i40e_info *ldev;
+};
+
 #define I40E_CLIENT_RESET_LEVEL_PF   1
 #define I40E_CLIENT_RESET_LEVEL_CORE 2
 #define I40E_CLIENT_VSI_FLAG_TCP_ENABLE  BIT(1)
@@ -125,6 +134,11 @@ struct i40e_ops {
 			       struct i40e_client *client,
 			       bool is_vf, u32 vf_id,
 			       u32 flag, u32 valid_flag);
+
+	int (*client_device_register)(struct i40e_info *ldev);
+
+	void (*client_device_unregister)(struct i40e_info *ldev);
+
 };
 
 struct i40e_client_ops {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 07/23] i40e: Register auxiliary devices to provide RDMA
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (5 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 06/23] i40e: Prep i40e header for aux bus conversion Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 08/23] RDMA/irdma: Register auxiliary driver and implement private channel OPs Shiraz Saleem
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

Convert i40e to use the auxiliary bus infrastructure to export
the RDMA functionality of the device to the RDMA driver.
Register i40e client auxiliary RDMA device on the auxiliary bus per
PCIe device function for the new auxiliary rdma driver (irdma) to
attach to.

This replaces the need for the global i40e_register_client and
i40e_unregister_client symbols to connect the netdev and rdma
driver. It will be obsoleted once irdma replaces i40iw in the kernel
for the X722 device.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/net/ethernet/intel/Kconfig            |   1 +
 drivers/net/ethernet/intel/i40e/i40e.h        |   2 +
 drivers/net/ethernet/intel/i40e/i40e_client.c | 156 ++++++++++++++++++++++----
 drivers/net/ethernet/intel/i40e/i40e_main.c   |   1 +
 4 files changed, 140 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index cbc5968..ccb7ad6 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -241,6 +241,7 @@ config I40E
 	tristate "Intel(R) Ethernet Controller XL710 Family support"
 	imply PTP_1588_CLOCK
 	depends on PCI
+	select AUXILIARY_BUS
 	help
 	  This driver supports Intel(R) Ethernet Controller XL710 Family of
 	  devices.  For more information on how to identify your adapter, go
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index cd53981..3246fef 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -867,6 +867,8 @@ struct i40e_netdev_priv {
 	struct i40e_vsi *vsi;
 };
 
+extern struct ida i40e_client_ida;
+
 /* struct that defines an interrupt vector */
 struct i40e_q_vector {
 	struct i40e_vsi *vsi;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_client.c b/drivers/net/ethernet/intel/i40e/i40e_client.c
index a2dba32..2eb5aba 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_client.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_client.c
@@ -12,6 +12,7 @@
 static struct i40e_client *registered_client;
 static LIST_HEAD(i40e_devices);
 static DEFINE_MUTEX(i40e_device_mutex);
+DEFINE_IDA(i40e_client_ida);
 
 static int i40e_client_virtchnl_send(struct i40e_info *ldev,
 				     struct i40e_client *client,
@@ -30,11 +31,17 @@ static int i40e_client_update_vsi_ctxt(struct i40e_info *ldev,
 				       bool is_vf, u32 vf_id,
 				       u32 flag, u32 valid_flag);
 
+static int i40e_client_device_register(struct i40e_info *ldev);
+
+static void i40e_client_device_unregister(struct i40e_info *ldev);
+
 static struct i40e_ops i40e_lan_ops = {
 	.virtchnl_send = i40e_client_virtchnl_send,
 	.setup_qvlist = i40e_client_setup_qvlist,
 	.request_reset = i40e_client_request_reset,
 	.update_vsi_ctxt = i40e_client_update_vsi_ctxt,
+	.client_device_register = i40e_client_device_register,
+	.client_device_unregister = i40e_client_device_unregister,
 };
 
 /**
@@ -275,6 +282,57 @@ void i40e_client_update_msix_info(struct i40e_pf *pf)
 	cdev->lan_info.msix_entries = &pf->msix_entries[pf->iwarp_base_vector];
 }
 
+static void i40e_auxiliary_dev_release(struct device *dev)
+{
+	struct i40e_auxiliary_device *i40e_aux_dev =
+			container_of(dev, struct i40e_auxiliary_device, aux_dev.dev);
+
+	ida_simple_remove(&i40e_client_ida, i40e_aux_dev->aux_dev.id);
+	kfree(i40e_aux_dev);
+}
+
+static int i40e_register_auxiliary_dev(struct i40e_info *ldev, const char *name)
+{
+	struct i40e_auxiliary_device *i40e_aux_dev;
+	struct pci_dev *pdev = ldev->pcidev;
+	struct auxiliary_device *aux_dev;
+	int ret;
+
+	i40e_aux_dev = kzalloc(sizeof(*i40e_aux_dev), GFP_KERNEL);
+	if (!i40e_aux_dev)
+		return -ENOMEM;
+
+	i40e_aux_dev->ldev = ldev;
+
+	aux_dev = &i40e_aux_dev->aux_dev;
+	aux_dev->name = name;
+	aux_dev->dev.parent = &pdev->dev;
+	aux_dev->dev.release = i40e_auxiliary_dev_release;
+	ldev->aux_dev = aux_dev;
+
+	ret = ida_simple_get(&i40e_client_ida, 0, 0, GFP_KERNEL);
+	if (ret < 0) {
+		kfree(i40e_aux_dev);
+		return ret;
+	}
+	aux_dev->id = ret;
+
+	ret = auxiliary_device_init(aux_dev);
+	if (ret < 0) {
+		ida_simple_remove(&i40e_client_ida, aux_dev->id);
+		kfree(i40e_aux_dev);
+		return ret;
+	}
+
+	ret = auxiliary_device_add(aux_dev);
+	if (ret) {
+		auxiliary_device_uninit(aux_dev);
+		return ret;
+	}
+
+	return ret;
+}
+
 /**
  * i40e_client_add_instance - add a client instance struct to the instance list
  * @pf: pointer to the board struct
@@ -286,9 +344,6 @@ static void i40e_client_add_instance(struct i40e_pf *pf)
 	struct netdev_hw_addr *mac = NULL;
 	struct i40e_vsi *vsi = pf->vsi[pf->lan_vsi];
 
-	if (!registered_client || pf->cinst)
-		return;
-
 	cdev = kzalloc(sizeof(*cdev), GFP_KERNEL);
 	if (!cdev)
 		return;
@@ -308,11 +363,8 @@ static void i40e_client_add_instance(struct i40e_pf *pf)
 	cdev->lan_info.fw_build = pf->hw.aq.fw_build;
 	set_bit(__I40E_CLIENT_INSTANCE_NONE, &cdev->state);
 
-	if (i40e_client_get_params(vsi, &cdev->lan_info.params)) {
-		kfree(cdev);
-		cdev = NULL;
-		return;
-	}
+	if (i40e_client_get_params(vsi, &cdev->lan_info.params))
+		goto free_cdev;
 
 	mac = list_first_entry(&cdev->lan_info.netdev->dev_addrs.list,
 			       struct netdev_hw_addr, list);
@@ -324,7 +376,17 @@ static void i40e_client_add_instance(struct i40e_pf *pf)
 	cdev->client = registered_client;
 	pf->cinst = cdev;
 
-	i40e_client_update_msix_info(pf);
+	cdev->lan_info.msix_count = pf->num_iwarp_msix;
+	cdev->lan_info.msix_entries = &pf->msix_entries[pf->iwarp_base_vector];
+
+	if (i40e_register_auxiliary_dev(&cdev->lan_info, "intel_rdma_iwarp"))
+		goto free_cdev;
+
+	return;
+
+free_cdev:
+	kfree(cdev);
+	pf->cinst = NULL;
 }
 
 /**
@@ -345,7 +407,7 @@ void i40e_client_del_instance(struct i40e_pf *pf)
  **/
 void i40e_client_subtask(struct i40e_pf *pf)
 {
-	struct i40e_client *client = registered_client;
+	struct i40e_client *client;
 	struct i40e_client_instance *cdev;
 	struct i40e_vsi *vsi = pf->vsi[pf->lan_vsi];
 	int ret = 0;
@@ -359,9 +421,11 @@ void i40e_client_subtask(struct i40e_pf *pf)
 	    test_bit(__I40E_CONFIG_BUSY, pf->state))
 		return;
 
-	if (!client || !cdev)
+	if (!cdev || !cdev->client)
 		return;
 
+	client = cdev->client;
+
 	/* Here we handle client opens. If the client is down, and
 	 * the netdev is registered, then open the client.
 	 */
@@ -422,16 +486,8 @@ int i40e_lan_add_device(struct i40e_pf *pf)
 		 pf->hw.pf_id, pf->hw.bus.bus_id,
 		 pf->hw.bus.device, pf->hw.bus.func);
 
-	/* If a client has already been registered, we need to add an instance
-	 * of it to our new LAN device.
-	 */
-	if (registered_client)
-		i40e_client_add_instance(pf);
+	i40e_client_add_instance(pf);
 
-	/* Since in some cases register may have happened before a device gets
-	 * added, we can schedule a subtask to go initiate the clients if
-	 * they can be launched at probe time.
-	 */
 	set_bit(__I40E_CLIENT_SERVICE_REQUESTED, pf->state);
 	i40e_service_event_schedule(pf);
 
@@ -448,9 +504,13 @@ int i40e_lan_add_device(struct i40e_pf *pf)
  **/
 int i40e_lan_del_device(struct i40e_pf *pf)
 {
+	struct auxiliary_device *aux_dev = pf->cinst->lan_info.aux_dev;
 	struct i40e_device *ldev, *tmp;
 	int ret = -ENODEV;
 
+	auxiliary_device_delete(aux_dev);
+	auxiliary_device_uninit(aux_dev);
+
 	/* First, remove any client instance. */
 	i40e_client_del_instance(pf);
 
@@ -731,6 +791,62 @@ static int i40e_client_update_vsi_ctxt(struct i40e_info *ldev,
 	return err;
 }
 
+static int i40e_client_device_register(struct i40e_info *ldev)
+{
+	struct i40e_client *client;
+	struct i40e_pf *pf;
+
+	if (!ldev) {
+		pr_err("Failed to reg client dev: ldev ptr NULL\n");
+		return -EINVAL;
+	}
+
+	client = ldev->client;
+	pf = ldev->pf;
+	if (!client) {
+		pr_err("Failed to reg client dev: client ptr NULL\n");
+		return -EINVAL;
+	}
+
+	if (!ldev->ops || !client->ops) {
+		pr_err("Failed to reg client dev: client dev peer_ops/ops NULL\n");
+		return -EINVAL;
+	}
+
+	pf->cinst->client = ldev->client;
+	set_bit(__I40E_CLIENT_SERVICE_REQUESTED, pf->state);
+	i40e_service_event_schedule(pf);
+
+	return 0;
+}
+
+static void i40e_client_device_unregister(struct i40e_info *ldev)
+{
+	struct i40e_pf *pf = ldev->pf;
+	struct i40e_client_instance *cdev = pf->cinst;
+
+	while (test_and_set_bit(__I40E_SERVICE_SCHED, pf->state))
+		usleep_range(500, 1000);
+
+	if (!cdev || !cdev->client || !cdev->client->ops ||
+	    !cdev->client->ops->close) {
+		dev_err(&pf->pdev->dev, "Cannot close client device\n");
+		return;
+	}
+	if (test_bit(__I40E_CLIENT_INSTANCE_OPENED, &cdev->state)) {
+		cdev->client->ops->close(&cdev->lan_info, cdev->client, false);
+		clear_bit(__I40E_CLIENT_INSTANCE_OPENED, &cdev->state);
+		i40e_client_release_qvlist(&cdev->lan_info);
+	}
+
+	pf->cinst->client = NULL;
+	clear_bit(__I40E_SERVICE_SCHED, pf->state);
+}
+
+/* Retain legacy global registration/unregistration calls till i40iw is
+ * deprecated from the kernel. The irdma unified driver does not use these
+ * exported symbols.
+ */
 /**
  * i40e_register_client - Register a i40e client driver with the L2 driver
  * @client: pointer to the i40e_client struct
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 353deae..d8d3b136 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -16272,6 +16272,7 @@ static void __exit i40e_exit_module(void)
 {
 	pci_unregister_driver(&i40e_driver);
 	destroy_workqueue(i40e_wq);
+	ida_destroy(&i40e_client_ida);
 	i40e_dbg_exit();
 }
 module_exit(i40e_exit_module);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 08/23] RDMA/irdma: Register auxiliary driver and implement private channel OPs
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (6 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 07/23] i40e: Register auxiliary devices to provide RDMA Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 09/23] RDMA/irdma: Implement device initialization definitions Shiraz Saleem
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Register auxiliary drivers which can attach to auxiliary RDMA
devices from Intel PCI netdev drivers i40e and ice. Implement the private
channel ops, and register net notifiers.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/i40iw_if.c | 229 +++++++++++++
 drivers/infiniband/hw/irdma/main.c     | 382 ++++++++++++++++++++++
 drivers/infiniband/hw/irdma/main.h     | 565 +++++++++++++++++++++++++++++++++
 3 files changed, 1176 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/i40iw_if.c
 create mode 100644 drivers/infiniband/hw/irdma/main.c
 create mode 100644 drivers/infiniband/hw/irdma/main.h

diff --git a/drivers/infiniband/hw/irdma/i40iw_if.c b/drivers/infiniband/hw/irdma/i40iw_if.c
new file mode 100644
index 0000000..440dddf
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/i40iw_if.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "main.h"
+#include "i40iw_hw.h"
+#include <linux/net/intel/i40e_client.h>
+
+/**
+ * i40iw_request_reset - Request a reset
+ * @rf: RDMA PCI function
+ *
+ */
+static void i40iw_request_reset(struct irdma_pci_f *rf)
+{
+	struct i40e_info *cdev_info = rf->priv_cdev_info.cdev_info;
+
+	cdev_info->ops->request_reset(cdev_info, rf->priv_cdev_info.client, 1);
+}
+
+static void i40iw_fill_device_info(struct irdma_device *iwdev, struct i40e_info *cdev_info)
+{
+	struct irdma_pci_f *rf = iwdev->rf;
+
+	rf->rdma_ver = IRDMA_GEN_1;
+	rf->gen_ops.init_hw = i40iw_init_hw;
+	rf->gen_ops.request_reset = i40iw_request_reset;
+	rf->hw.hw_addr = cdev_info->hw_addr;
+	rf->pcidev = cdev_info->pcidev;
+	rf->sc_dev.pci_rev = rf->pcidev->revision;
+	rf->limits_sel = 5;
+	rf->protocol_used = IRDMA_IWARP_PROTOCOL_ONLY;
+	rf->iwdev = iwdev;
+
+	iwdev->init_state = INITIAL_STATE;
+	iwdev->rcv_wnd = IRDMA_CM_DEFAULT_RCV_WND_SCALED;
+	iwdev->rcv_wscale = IRDMA_CM_DEFAULT_RCV_WND_SCALE;
+	iwdev->netdev = cdev_info->netdev;
+	iwdev->vsi_num = 0;
+}
+
+/**
+ * i40iw_open - client interface operation open for iwarp/uda device
+ * @cdev_info: parent lan device information structure with data/ops
+ * @client: iwarp client information, provided during registration
+ *
+ * Called by the lan driver during the processing of client register
+ * Create device resources, set up queues, pble and hmc objects and
+ * register the device with the ib verbs interface
+ * Return 0 if successful, otherwise return error
+ */
+static int i40iw_open(struct i40e_info *cdev_info, struct i40e_client *client)
+{
+	struct irdma_priv_core_dev_info *priv_cdev_info;
+	struct irdma_l2params l2params = {};
+	struct irdma_device *iwdev;
+	struct irdma_pci_f *rf;
+	int err = -EIO;
+	int i;
+	u16 qset;
+	u16 last_qset = IRDMA_NO_QSET;
+
+	iwdev = ib_alloc_device(irdma_device, ibdev);
+	if (!iwdev)
+		return -ENOMEM;
+
+	iwdev->rf = kzalloc(sizeof(*rf), GFP_KERNEL);
+	if (!iwdev->rf) {
+		ib_dealloc_device(&iwdev->ibdev);
+		return -ENOMEM;
+	}
+
+	i40iw_fill_device_info(iwdev, cdev_info);
+	rf = iwdev->rf;
+
+	priv_cdev_info = &rf->priv_cdev_info;
+	priv_cdev_info->client = client;
+	priv_cdev_info->cdev_info = cdev_info;
+	priv_cdev_info->fn_num = cdev_info->fid;
+	priv_cdev_info->ftype = cdev_info->ftype;
+	priv_cdev_info->pf_vsi_num = 0;
+	priv_cdev_info->msix_count = cdev_info->msix_count;
+	priv_cdev_info->msix_entries = cdev_info->msix_entries;
+
+	if (irdma_ctrl_init_hw(rf)) {
+		err = -EIO;
+		goto err_ctrl_init;
+	}
+
+	l2params.mtu = (cdev_info->params.mtu) ? cdev_info->params.mtu : IRDMA_DEFAULT_MTU;
+	for (i = 0; i < I40E_CLIENT_MAX_USER_PRIORITY; i++) {
+		qset = cdev_info->params.qos.prio_qos[i].qs_handle;
+		l2params.up2tc[i] = cdev_info->params.qos.prio_qos[i].tc;
+		l2params.qs_handle_list[i] = qset;
+		if (last_qset == IRDMA_NO_QSET)
+			last_qset = qset;
+		else if ((qset != last_qset) && (qset != IRDMA_NO_QSET))
+			iwdev->dcb = true;
+	}
+
+	if (irdma_rt_init_hw(iwdev, &l2params)) {
+		err = -EIO;
+		goto err_rt_init;
+	}
+
+	err = irdma_ib_register_device(iwdev);
+	if (err)
+		goto err_ibreg;
+
+	ibdev_dbg(&iwdev->ibdev, "INIT: Gen1 PF[%d] open success cdev_info=%p\n",
+		  priv_cdev_info->fn_num, cdev_info);
+
+	return 0;
+
+err_ibreg:
+	irdma_rt_deinit_hw(iwdev);
+err_rt_init:
+	irdma_ctrl_deinit_hw(rf);
+err_ctrl_init:
+	kfree(iwdev->rf);
+	ib_dealloc_device(&iwdev->ibdev);
+
+	return err;
+}
+
+/**
+ * i40iw_l2param_change - handle mss change
+ * @cdev_info: parent lan device information structure with data/ops
+ * @client: client for parameter change
+ * @params: new parameters from L2
+ */
+static void i40iw_l2param_change(struct i40e_info *cdev_info,
+				 struct i40e_client *client,
+				 struct i40e_params *params)
+{
+	struct irdma_l2params l2params = {};
+	struct irdma_device *iwdev;
+	struct ib_device *ibdev;
+
+	ibdev = ib_device_get_by_netdev(cdev_info->netdev, RDMA_DRIVER_IRDMA);
+	if (!ibdev)
+		return;
+
+	iwdev = to_iwdev(ibdev);
+
+	if (iwdev->vsi.mtu != params->mtu) {
+		l2params.mtu_changed = true;
+		l2params.mtu = params->mtu;
+	}
+	irdma_change_l2params(&iwdev->vsi, &l2params);
+	ib_device_put(ibdev);
+}
+
+/**
+ * i40iw_close - client interface operation close for iwarp/uda device
+ * @cdev_info: parent lan device information structure with data/ops
+ * @client: client to close
+ * @reset: flag to indicate close on reset
+ *
+ * Called by the lan driver during the processing of client unregister
+ * Destroy and clean up the driver resources
+ */
+static void i40iw_close(struct i40e_info *cdev_info, struct i40e_client *client,
+			bool reset)
+{
+	struct irdma_device *iwdev;
+	struct ib_device *ibdev;
+
+	ibdev = ib_device_get_by_netdev(cdev_info->netdev, RDMA_DRIVER_IRDMA);
+	if (WARN_ON(!ibdev))
+		return;
+
+	iwdev = to_iwdev(ibdev);
+	if (reset)
+		iwdev->reset = true;
+
+	iwdev->iw_status = 0;
+	irdma_port_ibevent(iwdev);
+	ib_unregister_device_and_put(ibdev);
+	pr_debug("INIT: Gen1 VSI close complete cdev_info=%p\n", cdev_info);
+}
+
+/* client interface functions */
+static const struct i40e_client_ops i40e_ops = {
+	.open = i40iw_open,
+	.close = i40iw_close,
+	.l2_param_change = i40iw_l2param_change
+};
+
+static struct i40e_client i40iw_client = {
+	.ops = &i40e_ops,
+	.type = I40E_CLIENT_IWARP,
+};
+
+static int i40iw_probe(struct auxiliary_device *aux_dev, const struct auxiliary_device_id *id)
+{
+	struct i40e_auxiliary_device *i40e_adev = container_of(aux_dev,
+							       struct i40e_auxiliary_device,
+							       aux_dev);
+	struct i40e_info *cdev_info = i40e_adev->ldev;
+
+	strncpy(i40iw_client.name, "irdma", I40E_CLIENT_STR_LENGTH);
+	cdev_info->client = &i40iw_client;
+	cdev_info->aux_dev = aux_dev;
+
+	return cdev_info->ops->client_device_register(cdev_info);
+}
+
+static void i40iw_remove(struct auxiliary_device *aux_dev)
+{
+	struct i40e_auxiliary_device *i40e_adev = container_of(aux_dev,
+							       struct i40e_auxiliary_device,
+							       aux_dev);
+	struct i40e_info *cdev_info = i40e_adev->ldev;
+
+	cdev_info->ops->client_device_unregister(cdev_info);
+}
+
+static const struct auxiliary_device_id i40iw_auxiliary_id_table[] = {
+	{.name = "i40e.intel_rdma_iwarp", },
+	{},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, i40iw_auxiliary_id_table);
+
+struct auxiliary_driver i40iw_auxiliary_drv = {
+	.name = "gen_1",
+	.id_table = i40iw_auxiliary_id_table,
+	.probe = i40iw_probe,
+	.remove = i40iw_remove,
+};
diff --git a/drivers/infiniband/hw/irdma/main.c b/drivers/infiniband/hw/irdma/main.c
new file mode 100644
index 0000000..dcf4ab6
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/main.c
@@ -0,0 +1,382 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "main.h"
+
+MODULE_ALIAS("i40iw");
+MODULE_AUTHOR("Intel Corporation, <e1000-rdma@lists.sourceforge.net>");
+MODULE_DESCRIPTION("Intel(R) Ethernet Protocol Driver for RDMA");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static struct notifier_block irdma_inetaddr_notifier = {
+	.notifier_call = irdma_inetaddr_event
+};
+
+static struct notifier_block irdma_inetaddr6_notifier = {
+	.notifier_call = irdma_inet6addr_event
+};
+
+static struct notifier_block irdma_net_notifier = {
+	.notifier_call = irdma_net_event
+};
+
+static struct notifier_block irdma_netdevice_notifier = {
+	.notifier_call = irdma_netdevice_event
+};
+
+static void irdma_register_notifiers(void)
+{
+	register_inetaddr_notifier(&irdma_inetaddr_notifier);
+	register_inet6addr_notifier(&irdma_inetaddr6_notifier);
+	register_netevent_notifier(&irdma_net_notifier);
+	register_netdevice_notifier(&irdma_netdevice_notifier);
+}
+
+static void irdma_unregister_notifiers(void)
+{
+	unregister_netevent_notifier(&irdma_net_notifier);
+	unregister_inetaddr_notifier(&irdma_inetaddr_notifier);
+	unregister_inet6addr_notifier(&irdma_inetaddr6_notifier);
+	unregister_netdevice_notifier(&irdma_netdevice_notifier);
+}
+
+static void irdma_prep_tc_change(struct irdma_device *iwdev)
+{
+	iwdev->vsi.tc_change_pending = true;
+	irdma_sc_suspend_resume_qps(&iwdev->vsi, IRDMA_OP_SUSPEND);
+
+	/* Wait for all qp's to suspend */
+	wait_event_timeout(iwdev->suspend_wq,
+			   !atomic_read(&iwdev->vsi.qp_suspend_reqs),
+			   IRDMA_EVENT_TIMEOUT);
+	irdma_ws_reset(&iwdev->vsi);
+}
+
+static void irdma_log_invalid_mtu(u16 mtu, struct irdma_sc_dev *dev)
+{
+	if (mtu < IRDMA_MIN_MTU_IPV4)
+		ibdev_warn(to_ibdev(dev), "MTU setting [%d] too low for RDMA traffic. Minimum MTU is 576 for IPv4\n", mtu);
+	else if (mtu < IRDMA_MIN_MTU_IPV6)
+		ibdev_warn(to_ibdev(dev), "MTU setting [%d] too low for RDMA traffic. Minimum MTU is 1280 for IPv6\\n", mtu);
+}
+
+static void irdma_iidc_event_handler(struct iidc_core_dev_info *cdev_info, struct iidc_event *event)
+{
+	struct irdma_device *iwdev = dev_get_drvdata(&cdev_info->adev->dev);
+	struct irdma_l2params l2params = {};
+	int i;
+
+	if (*event->type & BIT(IIDC_EVENT_AFTER_MTU_CHANGE)) {
+		ibdev_dbg(&iwdev->ibdev, "CLNT: new MTU = %d\n",
+			  cdev_info->netdev->mtu);
+		if (iwdev->vsi.mtu != cdev_info->netdev->mtu) {
+			l2params.mtu = cdev_info->netdev->mtu;
+			l2params.mtu_changed = true;
+			irdma_log_invalid_mtu(l2params.mtu, &iwdev->rf->sc_dev);
+			irdma_change_l2params(&iwdev->vsi, &l2params);
+		}
+	} else if (*event->type & BIT(IIDC_EVENT_BEFORE_TC_CHANGE)) {
+		if (iwdev->vsi.tc_change_pending)
+			return;
+
+		irdma_prep_tc_change(iwdev);
+	} else if (*event->type & BIT(IIDC_EVENT_AFTER_TC_CHANGE)) {
+		if (!iwdev->vsi.tc_change_pending)
+			return;
+
+		l2params.tc_changed = true;
+		ibdev_dbg(&iwdev->ibdev, "CLNT: TC Change\n");
+		iwdev->dcb = event->info.port_qos.num_tc > 1;
+
+		for (i = 0; i < IIDC_MAX_USER_PRIORITY; ++i)
+			l2params.up2tc[i] = event->info.port_qos.up2tc[i];
+		irdma_change_l2params(&iwdev->vsi, &l2params);
+	} else if (*event->type & BIT(IIDC_EVENT_CRIT_ERR)) {
+		ibdev_warn(&iwdev->ibdev, "ICE OICR event notification: oicr = 0x%08x\n",
+			   event->info.reg);
+		if (event->info.reg & IRDMAPFINT_OICR_PE_CRITERR_M) {
+			u32 pe_criterr;
+
+			pe_criterr = readl(iwdev->rf->sc_dev.hw_regs[IRDMA_GLPE_CRITERR]);
+#define IRDMA_Q1_RESOURCE_ERR 0x0001024d
+			if (pe_criterr != IRDMA_Q1_RESOURCE_ERR) {
+				ibdev_err(&iwdev->ibdev, "critical PE Error, GLPE_CRITERR=0x%08x\n",
+					  pe_criterr);
+				iwdev->rf->reset = true;
+			} else {
+				ibdev_warn(&iwdev->ibdev, "Q1 Resource Check\n");
+			}
+		}
+		if (event->info.reg & IRDMAPFINT_OICR_HMC_ERR_M) {
+			ibdev_err(&iwdev->ibdev, "HMC Error\n");
+			iwdev->rf->reset = true;
+		}
+		if (event->info.reg & IRDMAPFINT_OICR_PE_PUSH_M) {
+			ibdev_err(&iwdev->ibdev, "PE Push Error\n");
+			iwdev->rf->reset = true;
+		}
+		if (iwdev->rf->reset)
+			iwdev->rf->gen_ops.request_reset(iwdev->rf);
+	}
+}
+
+/**
+ * irdma_request_reset - Request a reset
+ * @rf: RDMA PCI function
+ */
+static void irdma_request_reset(struct irdma_pci_f *rf)
+{
+	struct iidc_core_dev_info *cdev_info = rf->priv_cdev_info.cdev_info;
+
+	ibdev_warn(&rf->iwdev->ibdev, "Requesting a reset\n");
+	cdev_info->ops->request_reset(cdev_info, IIDC_PFR);
+}
+
+/**
+ * irdma_lan_register_qset - Register qset with LAN driver
+ * @vsi: vsi structure
+ * @tc_node: Traffic class node
+ */
+static enum irdma_status_code irdma_lan_register_qset(struct irdma_sc_vsi *vsi,
+						      struct irdma_ws_node *tc_node)
+{
+	struct irdma_device *iwdev = vsi->back_vsi;
+	struct iidc_core_dev_info *cdev_info = iwdev->rf->priv_cdev_info.cdev_info;
+	struct iidc_res rdma_qset_res = {};
+	int ret;
+
+	rdma_qset_res.cnt_req = 1;
+	rdma_qset_res.res_type = IIDC_RDMA_QSETS_TXSCHED;
+	rdma_qset_res.res[0].res.qsets.qs_handle = tc_node->qs_handle;
+	rdma_qset_res.res[0].res.qsets.tc = tc_node->traffic_class;
+	rdma_qset_res.res[0].res.qsets.vport_id = vsi->vsi_idx;
+	ret = cdev_info->ops->alloc_res(cdev_info, &rdma_qset_res, 0);
+	if (ret) {
+		ibdev_dbg(&iwdev->ibdev, "WS: LAN alloc_res for rdma qset failed.\n");
+		return IRDMA_ERR_REG_QSET;
+	}
+
+	tc_node->l2_sched_node_id = rdma_qset_res.res[0].res.qsets.teid;
+	vsi->qos[tc_node->user_pri].l2_sched_node_id =
+		rdma_qset_res.res[0].res.qsets.teid;
+
+	return 0;
+}
+
+/**
+ * irdma_lan_unregister_qset - Unregister qset with LAN driver
+ * @vsi: vsi structure
+ * @tc_node: Traffic class node
+ */
+static void irdma_lan_unregister_qset(struct irdma_sc_vsi *vsi,
+				      struct irdma_ws_node *tc_node)
+{
+	struct irdma_device *iwdev = vsi->back_vsi;
+	struct iidc_core_dev_info *cdev_info = iwdev->rf->priv_cdev_info.cdev_info;
+	struct iidc_res rdma_qset_res = {};
+
+	rdma_qset_res.res_allocated = 1;
+	rdma_qset_res.res_type = IIDC_RDMA_QSETS_TXSCHED;
+	rdma_qset_res.res[0].res.qsets.vport_id = vsi->vsi_idx;
+	rdma_qset_res.res[0].res.qsets.teid = tc_node->l2_sched_node_id;
+	rdma_qset_res.res[0].res.qsets.qs_handle = tc_node->qs_handle;
+
+	if (cdev_info->ops->free_res(cdev_info, &rdma_qset_res))
+		ibdev_dbg(&iwdev->ibdev, "WS: LAN free_res for rdma qset failed.\n");
+}
+
+static void irdma_remove(struct auxiliary_device *aux_dev)
+{
+	struct iidc_auxiliary_dev *iidc_adev = container_of(aux_dev,
+							    struct iidc_auxiliary_dev,
+							    adev);
+	struct iidc_core_dev_info *cdev_info = iidc_adev->cdev_info;
+	struct irdma_device *iwdev = dev_get_drvdata(&aux_dev->dev);
+
+	irdma_ib_unregister_device(iwdev);
+	cdev_info->ops->update_vport_filter(cdev_info, iwdev->vsi_num, false);
+
+	pr_debug("INIT: Gen2 device remove success cdev_info=%p\n", cdev_info);
+}
+
+static void irdma_fill_qos_info(struct irdma_l2params *l2params,
+				struct iidc_core_dev_info *cdev_info)
+{
+	int i;
+
+	l2params->mtu = cdev_info->netdev->mtu;
+	l2params->num_tc = cdev_info->qos_info.num_tc;
+	l2params->num_apps = cdev_info->qos_info.num_apps;
+	l2params->vsi_prio_type = cdev_info->qos_info.vport_priority_type;
+	l2params->vsi_rel_bw = cdev_info->qos_info.vport_relative_bw;
+	for (i = 0; i < l2params->num_tc; i++) {
+		l2params->tc_info[i].egress_virt_up =
+			cdev_info->qos_info.tc_info[i].egress_virt_up;
+		l2params->tc_info[i].ingress_virt_up =
+			cdev_info->qos_info.tc_info[i].ingress_virt_up;
+		l2params->tc_info[i].prio_type =
+			cdev_info->qos_info.tc_info[i].prio_type;
+		l2params->tc_info[i].rel_bw =
+			cdev_info->qos_info.tc_info[i].rel_bw;
+		l2params->tc_info[i].tc_ctx =
+			cdev_info->qos_info.tc_info[i].tc_ctx;
+	}
+	for (i = 0; i < IIDC_MAX_USER_PRIORITY; i++)
+		l2params->up2tc[i] = cdev_info->qos_info.up2tc[i];
+}
+
+static void irdma_fill_device_info(struct irdma_device *iwdev,
+				   struct iidc_core_dev_info *cdev_info)
+{
+	struct irdma_pci_f *rf = iwdev->rf;
+
+	rf->gen_ops.init_hw = icrdma_init_hw;
+	rf->gen_ops.request_reset = irdma_request_reset;
+	if (!cdev_info->ftype) {
+		rf->gen_ops.register_qset = irdma_lan_register_qset;
+		rf->gen_ops.unregister_qset = irdma_lan_unregister_qset;
+	}
+	rf->rdma_ver = IRDMA_GEN_2;
+	rf->rsrc_profile = IRDMA_HMC_PROFILE_DEFAULT;
+	rf->rst_to = IRDMA_RST_TIMEOUT_HZ;
+	rf->hw.hw_addr = cdev_info->hw_addr;
+	rf->pcidev = cdev_info->pdev;
+	rf->default_vsi.vsi_idx = cdev_info->vport_id;
+	rf->sc_dev.pci_rev = cdev_info->pdev->revision;
+	rf->limits_sel = 7;
+	rf->protocol_used = cdev_info->rdma_protocol == IIDC_RDMA_PROTOCOL_ROCEV2 ?
+			    IRDMA_ROCE_PROTOCOL_ONLY : IRDMA_IWARP_PROTOCOL_ONLY;
+	rf->iwdev = iwdev;
+
+	iwdev->netdev = cdev_info->netdev;
+	iwdev->init_state = INITIAL_STATE;
+	iwdev->vsi_num = cdev_info->vport_id;
+	iwdev->roce_cwnd = IRDMA_ROCE_CWND_DEFAULT;
+	iwdev->roce_ackcreds = IRDMA_ROCE_ACKCREDS_DEFAULT;
+	iwdev->rcv_wnd = IRDMA_CM_DEFAULT_RCV_WND_SCALED;
+	iwdev->rcv_wscale = IRDMA_CM_DEFAULT_RCV_WND_SCALE;
+	if (rf->protocol_used == IRDMA_ROCE_PROTOCOL_ONLY)
+		iwdev->roce_mode = true;
+}
+
+static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_device_id *id)
+{
+	struct iidc_auxiliary_dev *iidc_adev = container_of(aux_dev,
+							    struct iidc_auxiliary_dev,
+							    adev);
+	struct iidc_core_dev_info *cdev_info = iidc_adev->cdev_info;
+	struct irdma_device *iwdev;
+	struct irdma_pci_f *rf;
+	struct irdma_priv_core_dev_info *priv_cdev_info;
+	struct irdma_l2params l2params = {};
+	int err;
+
+	iwdev = ib_alloc_device(irdma_device, ibdev);
+	if (!iwdev)
+		return -ENOMEM;
+
+	iwdev->rf = kzalloc(sizeof(*rf), GFP_KERNEL);
+	if (!iwdev->rf) {
+		ib_dealloc_device(&iwdev->ibdev);
+		return -ENOMEM;
+	}
+
+	irdma_fill_device_info(iwdev, cdev_info);
+	rf = iwdev->rf;
+
+	/* save information from cdev_info to priv_cdev_info*/
+	priv_cdev_info = &rf->priv_cdev_info;
+	priv_cdev_info->cdev_info = cdev_info;
+	priv_cdev_info->fn_num = PCI_FUNC(cdev_info->pdev->devfn);
+	priv_cdev_info->ftype = cdev_info->ftype;
+	priv_cdev_info->msix_count = cdev_info->msix_count;
+	priv_cdev_info->msix_entries = cdev_info->msix_entries;
+
+	if (irdma_ctrl_init_hw(rf)) {
+		err = -EIO;
+		goto err_ctrl_init;
+	}
+
+	irdma_fill_qos_info(&l2params, cdev_info);
+	if (irdma_rt_init_hw(iwdev, &l2params)) {
+		err = -EIO;
+		goto err_rt_init;
+	}
+
+	err = irdma_ib_register_device(iwdev);
+	if (err)
+		goto err_ibreg;
+
+	cdev_info->ops->update_vport_filter(cdev_info, iwdev->vsi_num, true);
+
+	ibdev_dbg(&iwdev->ibdev, "INIT: Gen2 device probe success cdev_info=%p\n",
+		  cdev_info);
+
+	dev_set_drvdata(&aux_dev->dev, iwdev);
+
+	return 0;
+
+err_ibreg:
+	irdma_rt_deinit_hw(iwdev);
+err_rt_init:
+	irdma_ctrl_deinit_hw(rf);
+err_ctrl_init:
+	kfree(iwdev->rf);
+	ib_dealloc_device(&iwdev->ibdev);
+
+	return err;
+}
+
+static struct iidc_auxiliary_ops irdma_iidc_aux_ops = {
+	.event_handler = irdma_iidc_event_handler,
+};
+
+static const struct auxiliary_device_id irdma_auxiliary_id_table[] = {
+	{.name = "ice.intel_rdma_iwarp", },
+	{.name = "ice.intel_rdma_roce", },
+	{},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, irdma_auxiliary_id_table);
+
+static struct iidc_auxiliary_drv irdma_auxiliary_drv = {
+	.adrv = {
+	    .id_table = irdma_auxiliary_id_table,
+	    .probe = irdma_probe,
+	    .remove = irdma_remove,
+	},
+	.ops = &irdma_iidc_aux_ops,
+};
+
+static int __init irdma_init_module(void)
+{
+	int ret;
+
+	ret = auxiliary_driver_register(&i40iw_auxiliary_drv);
+	if (ret) {
+		pr_err("Failed i40iw(gen_1) auxiliary_driver_register() ret=%d\n",
+		       ret);
+		return ret;
+	}
+
+	ret = auxiliary_driver_register(&irdma_auxiliary_drv.adrv);
+	if (ret) {
+		auxiliary_driver_unregister(&i40iw_auxiliary_drv);
+		pr_err("Failed irdma auxiliary_driver_register() ret=%d\n",
+		       ret);
+		return ret;
+	}
+
+	irdma_register_notifiers();
+
+	return 0;
+}
+
+static void __exit irdma_exit_module(void)
+{
+	irdma_unregister_notifiers();
+	auxiliary_driver_unregister(&irdma_auxiliary_drv.adrv);
+	auxiliary_driver_unregister(&i40iw_auxiliary_drv);
+}
+
+module_init(irdma_init_module);
+module_exit(irdma_exit_module);
diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h
new file mode 100644
index 0000000..dd1bbf2
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/main.h
@@ -0,0 +1,565 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#ifndef IRDMA_MAIN_H
+#define IRDMA_MAIN_H
+
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/if_vlan.h>
+#include <net/addrconf.h>
+#include <net/netevent.h>
+#include <net/tcp.h>
+#include <net/ip6_route.h>
+#include <net/flow.h>
+#include <net/secure_seq.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/inetdevice.h>
+#include <linux/spinlock.h>
+#include <linux/kernel.h>
+#include <linux/delay.h>
+#include <linux/pci.h>
+#include <linux/dma-mapping.h>
+#include <linux/workqueue.h>
+#include <linux/slab.h>
+#include <linux/io.h>
+#include <linux/crc32c.h>
+#include <linux/kthread.h>
+#ifndef CONFIG_64BIT
+#include <linux/io-64-nonatomic-lo-hi.h>
+#endif
+#include <linux/auxiliary_bus.h>
+#include <linux/net/intel/iidc.h>
+#include <crypto/hash.h>
+#include <rdma/ib_smi.h>
+#include <rdma/ib_verbs.h>
+#include <rdma/ib_pack.h>
+#include <rdma/rdma_cm.h>
+#include <rdma/iw_cm.h>
+#include <rdma/ib_user_verbs.h>
+#include <rdma/ib_umem.h>
+#include <rdma/ib_cache.h>
+#include <rdma/uverbs_ioctl.h>
+#include "status.h"
+#include "osdep.h"
+#include "defs.h"
+#include "hmc.h"
+#include "type.h"
+#include "ws.h"
+#include "protos.h"
+#include "pble.h"
+#include "cm.h"
+#include <rdma/irdma-abi.h>
+#include "verbs.h"
+#include "user.h"
+#include "puda.h"
+
+extern struct auxiliary_driver i40iw_auxiliary_drv;
+
+#define IRDMA_FW_VER_DEFAULT	2
+#define IRDMA_HW_VER	        2
+
+#define IRDMA_ARP_ADD		1
+#define IRDMA_ARP_DELETE	2
+#define IRDMA_ARP_RESOLVE	3
+
+#define IRDMA_MACIP_ADD		1
+#define IRDMA_MACIP_DELETE	2
+
+#define IW_CCQ_SIZE	(IRDMA_CQP_SW_SQSIZE_2048 + 1)
+#define IW_CEQ_SIZE	2048
+#define IW_AEQ_SIZE	2048
+
+#define RX_BUF_SIZE	(1536 + 8)
+#define IW_REG0_SIZE	(4 * 1024)
+#define IW_TX_TIMEOUT	(6 * HZ)
+#define IW_FIRST_QPN	1
+
+#define IW_SW_CONTEXT_ALIGN	1024
+
+#define MAX_DPC_ITERATIONS	128
+
+#define IRDMA_EVENT_TIMEOUT		50000
+#define IRDMA_VCHNL_EVENT_TIMEOUT	100000
+#define IRDMA_RST_TIMEOUT_HZ		4
+
+#define	IRDMA_NO_QSET	0xffff
+
+#define IW_CFG_FPM_QP_COUNT		32768
+#define IRDMA_MAX_PAGES_PER_FMR		512
+#define IRDMA_MIN_PAGES_PER_FMR		1
+#define IRDMA_CQP_COMPL_RQ_WQE_FLUSHED	2
+#define IRDMA_CQP_COMPL_SQ_WQE_FLUSHED	3
+
+#define IRDMA_Q_TYPE_PE_AEQ	0x80
+#define IRDMA_Q_INVALID_IDX	0xffff
+#define IRDMA_REM_ENDPOINT_TRK_QPID	3
+
+#define IRDMA_DRV_OPT_ENA_MPA_VER_0		0x00000001
+#define IRDMA_DRV_OPT_DISABLE_MPA_CRC		0x00000002
+#define IRDMA_DRV_OPT_DISABLE_FIRST_WRITE	0x00000004
+#define IRDMA_DRV_OPT_DISABLE_INTF		0x00000008
+#define IRDMA_DRV_OPT_ENA_MSI			0x00000010
+#define IRDMA_DRV_OPT_DUAL_LOGICAL_PORT		0x00000020
+#define IRDMA_DRV_OPT_NO_INLINE_DATA		0x00000080
+#define IRDMA_DRV_OPT_DISABLE_INT_MOD		0x00000100
+#define IRDMA_DRV_OPT_DISABLE_VIRT_WQ		0x00000200
+#define IRDMA_DRV_OPT_ENA_PAU			0x00000400
+#define IRDMA_DRV_OPT_MCAST_LOGPORT_MAP		0x00000800
+
+#define IW_HMC_OBJ_TYPE_NUM	ARRAY_SIZE(iw_hmc_obj_types)
+#define IRDMA_ROCE_CWND_DEFAULT			0x400
+#define IRDMA_ROCE_ACKCREDS_DEFAULT		0x1E
+
+#define IRDMA_FLUSH_SQ		BIT(0)
+#define IRDMA_FLUSH_RQ		BIT(1)
+#define IRDMA_REFLUSH		BIT(2)
+#define IRDMA_FLUSH_WAIT	BIT(3)
+
+enum init_completion_state {
+	INVALID_STATE = 0,
+	INITIAL_STATE,
+	CQP_CREATED,
+	HMC_OBJS_CREATED,
+	HW_RSRC_INITIALIZED,
+	CCQ_CREATED,
+	CEQ0_CREATED, /* Last state of probe */
+	ILQ_CREATED,
+	IEQ_CREATED,
+	CEQS_CREATED,
+	PBLE_CHUNK_MEM,
+	AEQ_CREATED,
+	IP_ADDR_REGISTERED,  /* Last state of open */
+};
+
+struct irdma_rsrc_limits {
+	u32 qplimit;
+	u32 mrlimit;
+	u32 cqlimit;
+};
+
+struct irdma_cqp_err_info {
+	u16 maj;
+	u16 min;
+	const char *desc;
+};
+
+struct irdma_cqp_compl_info {
+	u32 op_ret_val;
+	u16 maj_err_code;
+	u16 min_err_code;
+	bool error;
+	u8 op_code;
+};
+
+struct irdma_cqp_request {
+	struct cqp_cmds_info info;
+	wait_queue_head_t waitq;
+	struct list_head list;
+	refcount_t refcnt;
+	void (*callback_fcn)(struct irdma_cqp_request *cqp_request);
+	void *param;
+	struct irdma_cqp_compl_info compl_info;
+	bool waiting:1;
+	bool request_done:1;
+	bool dynamic:1;
+};
+
+struct irdma_cqp {
+	struct irdma_sc_cqp sc_cqp;
+	spinlock_t req_lock; /* protect CQP request list */
+	spinlock_t compl_lock; /* protect CQP completion processing */
+	wait_queue_head_t waitq;
+	wait_queue_head_t remove_wq;
+	struct irdma_dma_mem sq;
+	struct irdma_dma_mem host_ctx;
+	u64 *scratch_array;
+	struct irdma_cqp_request *cqp_requests;
+	struct list_head cqp_avail_reqs;
+	struct list_head cqp_pending_reqs;
+};
+
+struct irdma_ccq {
+	struct irdma_sc_cq sc_cq;
+	struct irdma_dma_mem mem_cq;
+	struct irdma_dma_mem shadow_area;
+};
+
+struct irdma_ceq {
+	struct irdma_sc_ceq sc_ceq;
+	struct irdma_dma_mem mem;
+	u32 irq;
+	u32 msix_idx;
+	struct irdma_pci_f *rf;
+	struct tasklet_struct dpc_tasklet;
+	spinlock_t ce_lock; /* sync cq destroy with cq completion event notification */
+};
+
+struct irdma_aeq {
+	struct irdma_sc_aeq sc_aeq;
+	struct irdma_dma_mem mem;
+	struct irdma_pble_alloc palloc;
+	bool virtual_map;
+};
+
+struct irdma_arp_entry {
+	u32 ip_addr[4];
+	u8 mac_addr[ETH_ALEN];
+};
+
+struct irdma_msix_vector {
+	u32 idx;
+	u32 irq;
+	u32 cpu_affinity;
+	u32 ceq_id;
+	cpumask_t mask;
+};
+
+struct irdma_mc_table_info {
+	u32 mgn;
+	u32 dest_ip[4];
+	bool lan_fwd:1;
+	bool ipv4_valid:1;
+};
+
+struct mc_table_list {
+	struct list_head list;
+	struct irdma_mc_table_info mc_info;
+	struct irdma_mcast_grp_info mc_grp_ctx;
+};
+
+struct irdma_qv_info {
+	u32 v_idx; /* msix_vector */
+	u16 ceq_idx;
+	u16 aeq_idx;
+	u8 itr_idx;
+};
+
+struct irdma_qvlist_info {
+	u32 num_vectors;
+	struct irdma_qv_info qv_info[1];
+};
+
+struct irdma_priv_core_dev_info {
+	unsigned int fn_num;
+	bool ftype;
+	u16 pf_vsi_num;
+	u16 msix_count;
+	struct msix_entry *msix_entries;
+	void *client;
+	void *cdev_info;
+};
+
+struct irdma_gen_ops {
+	void (*init_hw)(struct irdma_sc_dev *dev);
+	void (*request_reset)(struct irdma_pci_f *rf);
+	enum irdma_status_code (*register_qset)(struct irdma_sc_vsi *vsi,
+						struct irdma_ws_node *tc_node);
+	void (*unregister_qset)(struct irdma_sc_vsi *vsi,
+				struct irdma_ws_node *tc_node);
+};
+
+struct irdma_pci_f {
+	bool reset:1;
+	bool rsrc_created:1;
+	bool msix_shared:1;
+	u8 rsrc_profile;
+	u8 *hmc_info_mem;
+	u8 *mem_rsrc;
+	u8 rdma_ver;
+	u8 rst_to;
+	enum irdma_protocol_used protocol_used;
+	u32 sd_type;
+	u32 msix_count;
+	u32 max_mr;
+	u32 max_qp;
+	u32 max_cq;
+	u32 max_ah;
+	u32 next_ah;
+	u32 max_mcg;
+	u32 next_mcg;
+	u32 max_pd;
+	u32 next_qp;
+	u32 next_cq;
+	u32 next_pd;
+	u32 max_mr_size;
+	u32 max_cqe;
+	u32 mr_stagmask;
+	u32 used_pds;
+	u32 used_cqs;
+	u32 used_mrs;
+	u32 used_qps;
+	u32 arp_table_size;
+	u32 next_arp_index;
+	u32 ceqs_count;
+	u32 next_ws_node_id;
+	u32 max_ws_node_id;
+	u32 limits_sel;
+	unsigned long *allocated_ws_nodes;
+	unsigned long *allocated_qps;
+	unsigned long *allocated_cqs;
+	unsigned long *allocated_mrs;
+	unsigned long *allocated_pds;
+	unsigned long *allocated_mcgs;
+	unsigned long *allocated_ahs;
+	unsigned long *allocated_arps;
+	enum init_completion_state init_state;
+	struct irdma_sc_dev sc_dev;
+	struct irdma_priv_core_dev_info priv_cdev_info;
+	struct auxiliary_device *aux_dev;
+	struct pci_dev *pcidev;
+	struct irdma_hw hw;
+	struct irdma_cqp cqp;
+	struct irdma_ccq ccq;
+	struct irdma_aeq aeq;
+	struct irdma_ceq *ceqlist;
+	struct irdma_hmc_pble_rsrc *pble_rsrc;
+	struct irdma_arp_entry *arp_table;
+	spinlock_t arp_lock; /*protect ARP table access*/
+	spinlock_t rsrc_lock; /* protect HW resource array access */
+	spinlock_t qptable_lock; /*protect QP table access*/
+	struct irdma_qp **qp_table;
+	spinlock_t qh_list_lock; /* protect mc_qht_list */
+	struct mc_table_list mc_qht_list;
+	struct irdma_msix_vector *iw_msixtbl;
+	struct irdma_qvlist_info *iw_qvlist;
+	struct tasklet_struct dpc_tasklet;
+	struct irdma_dma_mem obj_mem;
+	struct irdma_dma_mem obj_next;
+	atomic_t vchnl_msgs;
+	wait_queue_head_t vchnl_waitq;
+	struct workqueue_struct *cqp_cmpl_wq;
+	struct work_struct cqp_cmpl_work;
+	struct irdma_sc_vsi default_vsi;
+	void *back_fcn;
+	struct irdma_gen_ops gen_ops;
+	struct irdma_device *iwdev;
+};
+
+struct irdma_device {
+	struct ib_device ibdev;
+	struct irdma_pci_f *rf;
+	struct net_device *netdev;
+	struct workqueue_struct *cleanup_wq;
+	struct irdma_sc_vsi vsi;
+	struct irdma_cm_core cm_core;
+	u32 roce_cwnd;
+	u32 roce_ackcreds;
+	u32 vendor_id;
+	u32 vendor_part_id;
+	u32 device_cap_flags;
+	u32 push_mode;
+	u32 rcv_wnd;
+	u16 mac_ip_table_idx;
+	u16 vsi_num;
+	u8 rcv_wscale;
+	u8 iw_status;
+	bool roce_mode:1;
+	bool roce_dcqcn_en:1;
+	bool dcb:1;
+	bool reset:1;
+	bool iw_ooo:1;
+	enum init_completion_state init_state;
+
+	wait_queue_head_t suspend_wq;
+};
+static inline struct irdma_device *to_iwdev(struct ib_device *ibdev)
+{
+	return container_of(ibdev, struct irdma_device, ibdev);
+}
+
+static inline struct irdma_ucontext *to_ucontext(struct ib_ucontext *ibucontext)
+{
+	return container_of(ibucontext, struct irdma_ucontext, ibucontext);
+}
+
+static inline struct irdma_user_mmap_entry *
+to_irdma_mmap_entry(struct rdma_user_mmap_entry *rdma_entry)
+{
+	return container_of(rdma_entry, struct irdma_user_mmap_entry,
+			    rdma_entry);
+}
+
+static inline struct irdma_pd *to_iwpd(struct ib_pd *ibpd)
+{
+	return container_of(ibpd, struct irdma_pd, ibpd);
+}
+
+static inline struct irdma_ah *to_iwah(struct ib_ah *ibah)
+{
+	return container_of(ibah, struct irdma_ah, ibah);
+}
+
+static inline struct irdma_mr *to_iwmr(struct ib_mr *ibmr)
+{
+	return container_of(ibmr, struct irdma_mr, ibmr);
+}
+
+static inline struct irdma_mr *to_iwmw(struct ib_mw *ibmw)
+{
+	return container_of(ibmw, struct irdma_mr, ibmw);
+}
+
+static inline struct irdma_cq *to_iwcq(struct ib_cq *ibcq)
+{
+	return container_of(ibcq, struct irdma_cq, ibcq);
+}
+
+static inline struct irdma_qp *to_iwqp(struct ib_qp *ibqp)
+{
+	return container_of(ibqp, struct irdma_qp, ibqp);
+}
+
+static inline struct irdma_pci_f *dev_to_rf(struct irdma_sc_dev *dev)
+{
+	return container_of(dev, struct irdma_pci_f, sc_dev);
+}
+
+/**
+ * irdma_alloc_resource - allocate a resource
+ * @iwdev: device pointer
+ * @resource_array: resource bit array:
+ * @max_resources: maximum resource number
+ * @req_resources_num: Allocated resource number
+ * @next: next free id
+ **/
+static inline int irdma_alloc_rsrc(struct irdma_pci_f *rf,
+				   unsigned long *rsrc_array, u32 max_rsrc,
+				   u32 *req_rsrc_num, u32 *next)
+{
+	u32 rsrc_num;
+	unsigned long flags;
+
+	spin_lock_irqsave(&rf->rsrc_lock, flags);
+	rsrc_num = find_next_zero_bit(rsrc_array, max_rsrc, *next);
+	if (rsrc_num >= max_rsrc) {
+		rsrc_num = find_first_zero_bit(rsrc_array, max_rsrc);
+		if (rsrc_num >= max_rsrc) {
+			spin_unlock_irqrestore(&rf->rsrc_lock, flags);
+			ibdev_dbg(&rf->iwdev->ibdev,
+				  "ERR: resource [%d] allocation failed\n",
+				  rsrc_num);
+			return -EOVERFLOW;
+		}
+	}
+	__set_bit(rsrc_num, rsrc_array);
+	*next = rsrc_num + 1;
+	if (*next == max_rsrc)
+		*next = 0;
+	*req_rsrc_num = rsrc_num;
+	spin_unlock_irqrestore(&rf->rsrc_lock, flags);
+
+	return 0;
+}
+
+/**
+ * irdma_free_resource - free a resource
+ * @iwdev: device pointer
+ * @resource_array: resource array for the resource_num
+ * @resource_num: resource number to free
+ **/
+static inline void irdma_free_rsrc(struct irdma_pci_f *rf,
+				   unsigned long *rsrc_array, u32 rsrc_num)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&rf->rsrc_lock, flags);
+	__clear_bit(rsrc_num, rsrc_array);
+	spin_unlock_irqrestore(&rf->rsrc_lock, flags);
+}
+
+enum irdma_status_code irdma_ctrl_init_hw(struct irdma_pci_f *rf);
+void irdma_ctrl_deinit_hw(struct irdma_pci_f *rf);
+enum irdma_status_code irdma_rt_init_hw(struct irdma_device *iwdev,
+					struct irdma_l2params *l2params);
+void irdma_rt_deinit_hw(struct irdma_device *iwdev);
+void irdma_qp_add_ref(struct ib_qp *ibqp);
+void irdma_qp_rem_ref(struct ib_qp *ibqp);
+void irdma_free_lsmm_rsrc(struct irdma_qp *iwqp);
+struct ib_qp *irdma_get_qp(struct ib_device *ibdev, int qpn);
+void irdma_flush_wqes(struct irdma_qp *iwqp, u32 flush_mask);
+void irdma_manage_arp_cache(struct irdma_pci_f *rf, unsigned char *mac_addr,
+			    u32 *ip_addr, bool ipv4, u32 action);
+struct irdma_apbvt_entry *irdma_add_apbvt(struct irdma_device *iwdev, u16 port);
+void irdma_del_apbvt(struct irdma_device *iwdev,
+		     struct irdma_apbvt_entry *entry);
+struct irdma_cqp_request *irdma_alloc_and_get_cqp_request(struct irdma_cqp *cqp,
+							  bool wait);
+void irdma_free_cqp_request(struct irdma_cqp *cqp,
+			    struct irdma_cqp_request *cqp_request);
+void irdma_put_cqp_request(struct irdma_cqp *cqp,
+			   struct irdma_cqp_request *cqp_request);
+int irdma_alloc_local_mac_entry(struct irdma_pci_f *rf, u16 *mac_tbl_idx);
+int irdma_add_local_mac_entry(struct irdma_pci_f *rf, u8 *mac_addr, u16 idx);
+void irdma_del_local_mac_entry(struct irdma_pci_f *rf, u16 idx);
+
+u32 irdma_initialize_hw_rsrc(struct irdma_pci_f *rf);
+void irdma_port_ibevent(struct irdma_device *iwdev);
+void irdma_cm_disconn(struct irdma_qp *qp);
+
+bool irdma_cqp_crit_err(struct irdma_sc_dev *dev, u8 cqp_cmd,
+			u16 maj_err_code, u16 min_err_code);
+enum irdma_status_code
+irdma_handle_cqp_op(struct irdma_pci_f *rf,
+		    struct irdma_cqp_request *cqp_request);
+
+int irdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask,
+		    struct ib_udata *udata);
+int irdma_modify_qp_roce(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+			 int attr_mask, struct ib_udata *udata);
+void irdma_cq_wq_destroy(struct irdma_pci_f *rf, struct irdma_sc_cq *cq);
+
+void irdma_cleanup_pending_cqp_op(struct irdma_pci_f *rf);
+enum irdma_status_code irdma_hw_modify_qp(struct irdma_device *iwdev,
+					  struct irdma_qp *iwqp,
+					  struct irdma_modify_qp_info *info,
+					  bool wait);
+enum irdma_status_code irdma_qp_suspend_resume(struct irdma_sc_qp *qp,
+					       bool suspend);
+enum irdma_status_code
+irdma_manage_qhash(struct irdma_device *iwdev, struct irdma_cm_info *cminfo,
+		   enum irdma_quad_entry_type etype,
+		   enum irdma_quad_hash_manage_type mtype, void *cmnode,
+		   bool wait);
+void irdma_receive_ilq(struct irdma_sc_vsi *vsi, struct irdma_puda_buf *rbuf);
+void irdma_free_sqbuf(struct irdma_sc_vsi *vsi, void *bufp);
+void irdma_free_qp_rsrc(struct irdma_qp *iwqp);
+enum irdma_status_code irdma_setup_cm_core(struct irdma_device *iwdev, u8 ver);
+void irdma_cleanup_cm_core(struct irdma_cm_core *cm_core);
+void irdma_next_iw_state(struct irdma_qp *iwqp, u8 state, u8 del_hash, u8 term,
+			 u8 term_len);
+int irdma_send_syn(struct irdma_cm_node *cm_node, u32 sendack);
+int irdma_send_reset(struct irdma_cm_node *cm_node);
+struct irdma_cm_node *irdma_find_node(struct irdma_cm_core *cm_core,
+				      u16 rem_port, u32 *rem_addr, u16 loc_port,
+				      u32 *loc_addr, u16 vlan_id);
+enum irdma_status_code irdma_hw_flush_wqes(struct irdma_pci_f *rf,
+					   struct irdma_sc_qp *qp,
+					   struct irdma_qp_flush_info *info,
+					   bool wait);
+void irdma_gen_ae(struct irdma_pci_f *rf, struct irdma_sc_qp *qp,
+		  struct irdma_gen_ae_info *info, bool wait);
+void irdma_copy_ip_ntohl(u32 *dst, __be32 *src);
+void irdma_copy_ip_htonl(__be32 *dst, u32 *src);
+u16 irdma_get_vlan_ipv4(u32 *addr);
+struct net_device *irdma_netdev_vlan_ipv6(u32 *addr, u16 *vlan_id, u8 *mac);
+struct ib_mr *irdma_reg_phys_mr(struct ib_pd *ib_pd, u64 addr, u64 size,
+				int acc, u64 *iova_start);
+int irdma_upload_qp_context(struct irdma_qp *iwqp, bool freeze, bool raw);
+void irdma_cqp_ce_handler(struct irdma_pci_f *rf, struct irdma_sc_cq *cq);
+int irdma_ah_cqp_op(struct irdma_pci_f *rf, struct irdma_sc_ah *sc_ah, u8 cmd,
+		    bool wait,
+		    void (*callback_fcn)(struct irdma_cqp_request *cqp_request),
+		    void *cb_param);
+void irdma_gsi_ud_qp_ah_cb(struct irdma_cqp_request *cqp_request);
+int irdma_inetaddr_event(struct notifier_block *notifier, unsigned long event,
+			 void *ptr);
+int irdma_inet6addr_event(struct notifier_block *notifier, unsigned long event,
+			  void *ptr);
+int irdma_net_event(struct notifier_block *notifier, unsigned long event,
+		    void *ptr);
+int irdma_netdevice_event(struct notifier_block *notifier, unsigned long event,
+			  void *ptr);
+void irdma_add_ip(struct irdma_device *iwdev);
+void cqp_compl_worker(struct work_struct *work);
+#endif /* IRDMA_MAIN_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 09/23] RDMA/irdma: Implement device initialization definitions
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (7 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 08/23] RDMA/irdma: Register auxiliary driver and implement private channel OPs Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 10/23] RDMA/irdma: Implement HW Admin Queue OPs Shiraz Saleem
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Implement device initialization routines, interrupt set-up,
and allocate object bit-map tracking structures.
Also, add device specific attributes and register definitions.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/hw.c        | 2755 +++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/i40iw_hw.c  |  216 +++
 drivers/infiniband/hw/irdma/i40iw_hw.h  |  160 ++
 drivers/infiniband/hw/irdma/icrdma_hw.c |   84 +
 drivers/infiniband/hw/irdma/icrdma_hw.h |   71 +
 5 files changed, 3286 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/hw.c
 create mode 100644 drivers/infiniband/hw/irdma/i40iw_hw.c
 create mode 100644 drivers/infiniband/hw/irdma/i40iw_hw.h
 create mode 100644 drivers/infiniband/hw/irdma/icrdma_hw.c
 create mode 100644 drivers/infiniband/hw/irdma/icrdma_hw.h

diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
new file mode 100644
index 0000000..1d50448
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/hw.c
@@ -0,0 +1,2755 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "main.h"
+
+static struct irdma_rsrc_limits rsrc_limits_table[] = {
+	[0] = {
+		.qplimit = SZ_128,
+	},
+	[1] = {
+		.qplimit = SZ_1K,
+	},
+	[2] = {
+		.qplimit = SZ_2K,
+	},
+	[3] = {
+		.qplimit = SZ_4K,
+	},
+	[4] = {
+		.qplimit = SZ_16K,
+	},
+	[5] = {
+		.qplimit = SZ_64K,
+	},
+	[6] = {
+		.qplimit = SZ_128K,
+	},
+	[7] = {
+		.qplimit = SZ_256K,
+	},
+};
+
+/* types of hmc objects */
+static enum irdma_hmc_rsrc_type iw_hmc_obj_types[] = {
+	IRDMA_HMC_IW_QP,
+	IRDMA_HMC_IW_CQ,
+	IRDMA_HMC_IW_HTE,
+	IRDMA_HMC_IW_ARP,
+	IRDMA_HMC_IW_APBVT_ENTRY,
+	IRDMA_HMC_IW_MR,
+	IRDMA_HMC_IW_XF,
+	IRDMA_HMC_IW_XFFL,
+	IRDMA_HMC_IW_Q1,
+	IRDMA_HMC_IW_Q1FL,
+	IRDMA_HMC_IW_TIMER,
+	IRDMA_HMC_IW_FSIMC,
+	IRDMA_HMC_IW_FSIAV,
+	IRDMA_HMC_IW_RRF,
+	IRDMA_HMC_IW_RRFFL,
+	IRDMA_HMC_IW_HDR,
+	IRDMA_HMC_IW_MD,
+	IRDMA_HMC_IW_OOISC,
+	IRDMA_HMC_IW_OOISCFFL,
+};
+
+/**
+ * irdma_iwarp_ce_handler - handle iwarp completions
+ * @iwcq: iwarp cq receiving event
+ */
+static void irdma_iwarp_ce_handler(struct irdma_sc_cq *iwcq)
+{
+	struct irdma_cq *cq = iwcq->back_cq;
+
+	if (cq->ibcq.comp_handler)
+		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+}
+
+/**
+ * irdma_puda_ce_handler - handle puda completion events
+ * @rf: RDMA PCI function
+ * @cq: puda completion q for event
+ */
+static void irdma_puda_ce_handler(struct irdma_pci_f *rf,
+				  struct irdma_sc_cq *cq)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	enum irdma_status_code status;
+	u32 compl_error;
+
+	do {
+		status = irdma_puda_poll_cmpl(dev, cq, &compl_error);
+		if (status == IRDMA_ERR_Q_EMPTY)
+			break;
+		if (status) {
+			ibdev_dbg(to_ibdev(dev), "ERR: puda status = %d\n", status);
+			break;
+		}
+		if (compl_error) {
+			ibdev_dbg(to_ibdev(dev), "ERR: puda compl_err  =0x%x\n",
+				  compl_error);
+			break;
+		}
+	} while (1);
+
+	dev->ccq_ops->ccq_arm(cq);
+}
+
+/**
+ * irdma_process_ceq - handle ceq for completions
+ * @rf: RDMA PCI function
+ * @ceq: ceq having cq for completion
+ */
+static void irdma_process_ceq(struct irdma_pci_f *rf, struct irdma_ceq *ceq)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_sc_ceq *sc_ceq;
+	struct irdma_sc_cq *cq;
+	unsigned long flags;
+
+	sc_ceq = &ceq->sc_ceq;
+	do {
+		spin_lock_irqsave(&ceq->ce_lock, flags);
+		cq = dev->ceq_ops->process_ceq(dev, sc_ceq);
+		if (!cq) {
+			spin_unlock_irqrestore(&ceq->ce_lock, flags);
+			break;
+		}
+
+		if (cq->cq_type == IRDMA_CQ_TYPE_IWARP)
+			irdma_iwarp_ce_handler(cq);
+
+		spin_unlock_irqrestore(&ceq->ce_lock, flags);
+
+		if (cq->cq_type == IRDMA_CQ_TYPE_CQP)
+			queue_work(rf->cqp_cmpl_wq, &rf->cqp_cmpl_work);
+		else if (cq->cq_type == IRDMA_CQ_TYPE_ILQ ||
+			 cq->cq_type == IRDMA_CQ_TYPE_IEQ)
+			irdma_puda_ce_handler(rf, cq);
+	} while (1);
+}
+
+static void irdma_set_flush_fields(struct irdma_sc_qp *qp,
+				   struct irdma_aeqe_info *info)
+{
+	qp->sq_flush_code = info->sq;
+	qp->rq_flush_code = info->rq;
+	qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC;
+
+	switch (info->ae_id) {
+	case IRDMA_AE_AMP_UNALLOCATED_STAG:
+	case IRDMA_AE_AMP_BOUNDS_VIOLATION:
+	case IRDMA_AE_AMP_INVALID_STAG:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		fallthrough;
+	case IRDMA_AE_AMP_BAD_PD:
+	case IRDMA_AE_UDA_XMIT_BAD_PD:
+		qp->flush_code = FLUSH_PROT_ERR;
+		break;
+	case IRDMA_AE_AMP_BAD_QP:
+		qp->flush_code = FLUSH_LOC_QP_OP_ERR;
+		break;
+	case IRDMA_AE_AMP_BAD_STAG_KEY:
+	case IRDMA_AE_AMP_BAD_STAG_INDEX:
+	case IRDMA_AE_AMP_TO_WRAP:
+	case IRDMA_AE_AMP_RIGHTS_VIOLATION:
+	case IRDMA_AE_AMP_INVALIDATE_NO_REMOTE_ACCESS_RIGHTS:
+	case IRDMA_AE_PRIV_OPERATION_DENIED:
+	case IRDMA_AE_IB_INVALID_REQUEST:
+	case IRDMA_AE_IB_REMOTE_ACCESS_ERROR:
+	case IRDMA_AE_IB_REMOTE_OP_ERROR:
+		qp->flush_code = FLUSH_REM_ACCESS_ERR;
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		break;
+	case IRDMA_AE_LLP_SEGMENT_TOO_SMALL:
+	case IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER:
+	case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG:
+	case IRDMA_AE_UDA_XMIT_DGRAM_TOO_SHORT:
+	case IRDMA_AE_UDA_L4LEN_INVALID:
+	case IRDMA_AE_ROCE_RSP_LENGTH_ERROR:
+		qp->flush_code = FLUSH_LOC_LEN_ERR;
+		break;
+	case IRDMA_AE_LCE_QP_CATASTROPHIC:
+		qp->flush_code = FLUSH_FATAL_ERR;
+		break;
+	case IRDMA_AE_DDP_UBE_INVALID_MO:
+	case IRDMA_AE_IB_RREQ_AND_Q1_FULL:
+	case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR:
+		qp->flush_code = FLUSH_GENERAL_ERR;
+		break;
+	default:
+		qp->flush_code = FLUSH_FATAL_ERR;
+		break;
+	}
+}
+
+/**
+ * irdma_process_aeq - handle aeq events
+ * @rf: RDMA PCI function
+ */
+static void irdma_process_aeq(struct irdma_pci_f *rf)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_aeq *aeq = &rf->aeq;
+	struct irdma_sc_aeq *sc_aeq = &aeq->sc_aeq;
+	struct irdma_aeqe_info aeinfo;
+	struct irdma_aeqe_info *info = &aeinfo;
+	int ret;
+	struct irdma_qp *iwqp = NULL;
+	struct irdma_sc_cq *cq = NULL;
+	struct irdma_cq *iwcq = NULL;
+	struct irdma_sc_qp *qp = NULL;
+	struct irdma_qp_host_ctx_info *ctx_info = NULL;
+	struct irdma_device *iwdev = rf->iwdev;
+	unsigned long flags;
+
+	u32 aeqcnt = 0;
+
+	if (!sc_aeq->size)
+		return;
+
+	do {
+		memset(info, 0, sizeof(*info));
+		ret = dev->aeq_ops->get_next_aeqe(sc_aeq, info);
+		if (ret)
+			break;
+
+		aeqcnt++;
+		ibdev_dbg(&iwdev->ibdev,
+			  "AEQ: ae_id = 0x%x bool qp=%d qp_id = %d tcp_state=%d iwarp_state=%d ae_src=%d\n",
+			  info->ae_id, info->qp, info->qp_cq_id, info->tcp_state,
+			  info->iwarp_state, info->ae_src);
+
+		if (info->qp) {
+			spin_lock_irqsave(&rf->qptable_lock, flags);
+			iwqp = rf->qp_table[info->qp_cq_id];
+			if (!iwqp) {
+				spin_unlock_irqrestore(&rf->qptable_lock,
+						       flags);
+				if (info->ae_id == IRDMA_AE_QP_SUSPEND_COMPLETE) {
+					atomic_dec(&iwdev->vsi.qp_suspend_reqs);
+					wake_up(&iwdev->suspend_wq);
+					continue;
+				}
+				ibdev_dbg(&iwdev->ibdev, "AEQ: qp_id %d is already freed\n",
+					  info->qp_cq_id);
+				continue;
+			}
+			irdma_qp_add_ref(&iwqp->ibqp);
+			spin_unlock_irqrestore(&rf->qptable_lock, flags);
+			qp = &iwqp->sc_qp;
+			spin_lock_irqsave(&iwqp->lock, flags);
+			iwqp->hw_tcp_state = info->tcp_state;
+			iwqp->hw_iwarp_state = info->iwarp_state;
+			if (info->ae_id != IRDMA_AE_QP_SUSPEND_COMPLETE)
+				iwqp->last_aeq = info->ae_id;
+			spin_unlock_irqrestore(&iwqp->lock, flags);
+			ctx_info = &iwqp->ctx_info;
+			if (rdma_protocol_roce(&iwqp->iwdev->ibdev, 1))
+				ctx_info->roce_info->err_rq_idx_valid = true;
+			else
+				ctx_info->iwarp_info->err_rq_idx_valid = true;
+		} else {
+			if (info->ae_id != IRDMA_AE_CQ_OPERATION_ERROR)
+				continue;
+		}
+
+		switch (info->ae_id) {
+			struct irdma_cm_node *cm_node;
+		case IRDMA_AE_LLP_CONNECTION_ESTABLISHED:
+			cm_node = iwqp->cm_node;
+			if (cm_node->accept_pend) {
+				atomic_dec(&cm_node->listener->pend_accepts_cnt);
+				cm_node->accept_pend = 0;
+			}
+			iwqp->rts_ae_rcvd = 1;
+			wake_up_interruptible(&iwqp->waitq);
+			break;
+		case IRDMA_AE_LLP_FIN_RECEIVED:
+		case IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE:
+			if (qp->term_flags)
+				break;
+			if (atomic_inc_return(&iwqp->close_timer_started) == 1) {
+				iwqp->hw_tcp_state = IRDMA_TCP_STATE_CLOSE_WAIT;
+				if (iwqp->hw_tcp_state == IRDMA_TCP_STATE_CLOSE_WAIT &&
+				    iwqp->ibqp_state == IB_QPS_RTS) {
+					irdma_next_iw_state(iwqp,
+							    IRDMA_QP_STATE_CLOSING,
+							    0, 0, 0);
+					irdma_cm_disconn(iwqp);
+				}
+				irdma_schedule_cm_timer(iwqp->cm_node,
+							(struct irdma_puda_buf *)iwqp,
+							IRDMA_TIMER_TYPE_CLOSE,
+							1, 0);
+			}
+			break;
+		case IRDMA_AE_LLP_CLOSE_COMPLETE:
+			if (qp->term_flags)
+				irdma_terminate_done(qp, 0);
+			else
+				irdma_cm_disconn(iwqp);
+			break;
+		case IRDMA_AE_BAD_CLOSE:
+		case IRDMA_AE_RESET_SENT:
+			irdma_next_iw_state(iwqp, IRDMA_QP_STATE_ERROR, 1, 0,
+					    0);
+			irdma_cm_disconn(iwqp);
+			break;
+		case IRDMA_AE_LLP_CONNECTION_RESET:
+			if (atomic_read(&iwqp->close_timer_started))
+				break;
+			irdma_cm_disconn(iwqp);
+			break;
+		case IRDMA_AE_QP_SUSPEND_COMPLETE:
+			if (iwqp->iwdev->vsi.tc_change_pending) {
+				atomic_dec(&iwqp->sc_qp.vsi->qp_suspend_reqs);
+				wake_up(&iwqp->iwdev->suspend_wq);
+			}
+			break;
+		case IRDMA_AE_TERMINATE_SENT:
+			irdma_terminate_send_fin(qp);
+			break;
+		case IRDMA_AE_LLP_TERMINATE_RECEIVED:
+			irdma_terminate_received(qp, info);
+			break;
+		case IRDMA_AE_CQ_OPERATION_ERROR:
+			ibdev_err(&iwdev->ibdev,
+				  "Processing an iWARP related AE for CQ misc = 0x%04X\n",
+				  info->ae_id);
+			cq = (struct irdma_sc_cq *)(unsigned long)
+			     info->compl_ctx;
+
+			iwcq = cq->back_cq;
+
+			if (iwcq->ibcq.event_handler) {
+				struct ib_event ibevent;
+
+				ibevent.device = iwcq->ibcq.device;
+				ibevent.event = IB_EVENT_CQ_ERR;
+				ibevent.element.cq = &iwcq->ibcq;
+				iwcq->ibcq.event_handler(&ibevent,
+							 iwcq->ibcq.cq_context);
+			}
+			break;
+		case IRDMA_AE_RESET_NOT_SENT:
+		case IRDMA_AE_LLP_DOUBT_REACHABILITY:
+		case IRDMA_AE_RESOURCE_EXHAUSTION:
+			break;
+		case IRDMA_AE_PRIV_OPERATION_DENIED:
+		case IRDMA_AE_STAG_ZERO_INVALID:
+		case IRDMA_AE_IB_RREQ_AND_Q1_FULL:
+		case IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION:
+		case IRDMA_AE_DDP_UBE_INVALID_MO:
+		case IRDMA_AE_DDP_UBE_INVALID_QN:
+		case IRDMA_AE_DDP_NO_L_BIT:
+		case IRDMA_AE_RDMAP_ROE_INVALID_RDMAP_VERSION:
+		case IRDMA_AE_RDMAP_ROE_UNEXPECTED_OPCODE:
+		case IRDMA_AE_ROE_INVALID_RDMA_READ_REQUEST:
+		case IRDMA_AE_ROE_INVALID_RDMA_WRITE_OR_READ_RESP:
+		case IRDMA_AE_INVALID_ARP_ENTRY:
+		case IRDMA_AE_INVALID_TCP_OPTION_RCVD:
+		case IRDMA_AE_STALE_ARP_ENTRY:
+		case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR:
+		case IRDMA_AE_LLP_SEGMENT_TOO_SMALL:
+		case IRDMA_AE_LLP_SYN_RECEIVED:
+		case IRDMA_AE_LLP_TOO_MANY_RETRIES:
+		case IRDMA_AE_LCE_QP_CATASTROPHIC:
+		case IRDMA_AE_LCE_FUNCTION_CATASTROPHIC:
+		case IRDMA_AE_LCE_CQ_CATASTROPHIC:
+		case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG:
+			if (rdma_protocol_roce(&iwdev->ibdev, 1))
+				ctx_info->roce_info->err_rq_idx_valid = false;
+			else
+				ctx_info->iwarp_info->err_rq_idx_valid = false;
+			fallthrough;
+		default:
+			ibdev_err(&iwdev->ibdev, "abnormal ae_id = 0x%x bool qp=%d qp_id = %d\n",
+				  info->ae_id, info->qp, info->qp_cq_id);
+			if (rdma_protocol_roce(&iwdev->ibdev, 1)) {
+				if (!info->sq && ctx_info->roce_info->err_rq_idx_valid) {
+					ctx_info->roce_info->err_rq_idx = info->wqe_idx;
+					dev->iw_priv_qp_ops->qp_setctx_roce(&iwqp->sc_qp,
+									    iwqp->host_ctx.va,
+									    ctx_info);
+				}
+				irdma_set_flush_fields(qp, info);
+				irdma_cm_disconn(iwqp);
+				break;
+			}
+			if (!info->sq && ctx_info->iwarp_info->err_rq_idx_valid) {
+				ctx_info->iwarp_info->err_rq_idx = info->wqe_idx;
+				ctx_info->tcp_info_valid = false;
+				ctx_info->iwarp_info_valid = true;
+				dev->iw_priv_qp_ops->qp_setctx(&iwqp->sc_qp,
+							       iwqp->host_ctx.va,
+							       ctx_info);
+			}
+			if (iwqp->hw_iwarp_state != IRDMA_QP_STATE_RTS &&
+			    iwqp->hw_iwarp_state != IRDMA_QP_STATE_TERMINATE) {
+				irdma_next_iw_state(iwqp, IRDMA_QP_STATE_ERROR, 1, 0, 0);
+				irdma_cm_disconn(iwqp);
+			} else {
+				irdma_terminate_connection(qp, info);
+			}
+			break;
+		}
+		if (info->qp)
+			irdma_qp_rem_ref(&iwqp->ibqp);
+	} while (1);
+
+	if (aeqcnt)
+		dev->aeq_ops->repost_aeq_entries(dev, aeqcnt);
+}
+
+/**
+ * irdma_ena_intr - set up device interrupts
+ * @dev: hardware control device structure
+ * @msix_id: id of the interrupt to be enabled
+ */
+static void irdma_ena_intr(struct irdma_sc_dev *dev, u32 msix_id)
+{
+	dev->irq_ops->irdma_en_irq(dev, msix_id);
+}
+
+/**
+ * irdma_dpc - tasklet for aeq and ceq 0
+ * @t: tasklet_struct ptr
+ */
+static void irdma_dpc(struct tasklet_struct *t)
+{
+	struct irdma_pci_f *rf = from_tasklet(rf, t, dpc_tasklet);
+
+	if (rf->msix_shared)
+		irdma_process_ceq(rf, rf->ceqlist);
+	irdma_process_aeq(rf);
+	irdma_ena_intr(&rf->sc_dev, rf->iw_msixtbl[0].idx);
+}
+
+/**
+ * irdma_ceq_dpc - dpc handler for CEQ
+ * @t: tasklet_struct ptr
+ */
+static void irdma_ceq_dpc(struct tasklet_struct *t)
+{
+	struct irdma_ceq *iwceq = from_tasklet(iwceq, t, dpc_tasklet);
+	struct irdma_pci_f *rf = iwceq->rf;
+
+	irdma_process_ceq(rf, iwceq);
+	irdma_ena_intr(&rf->sc_dev, iwceq->msix_idx);
+}
+
+/**
+ * irdma_save_msix_info - copy msix vector information to iwarp device
+ * @rf: RDMA PCI function
+ *
+ * Allocate iwdev msix table and copy the cdev_info msix info to the table
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_save_msix_info(struct irdma_pci_f *rf)
+{
+	struct irdma_priv_core_dev_info *cdev_info = &rf->priv_cdev_info;
+	struct irdma_qvlist_info *iw_qvlist;
+	struct irdma_qv_info *iw_qvinfo;
+	struct msix_entry *pmsix;
+	u32 ceq_idx;
+	u32 i;
+	u32 size;
+
+	if (!cdev_info->msix_count)
+		return IRDMA_ERR_NO_INTR;
+
+	rf->msix_count = cdev_info->msix_count;
+	size = sizeof(struct irdma_msix_vector) * rf->msix_count;
+	size += sizeof(struct irdma_qvlist_info);
+	size += sizeof(struct irdma_qv_info) * rf->msix_count - 1;
+	rf->iw_msixtbl = kzalloc(size, GFP_KERNEL);
+	if (!rf->iw_msixtbl)
+		return IRDMA_ERR_NO_MEMORY;
+
+	rf->iw_qvlist = (struct irdma_qvlist_info *)
+			(&rf->iw_msixtbl[rf->msix_count]);
+	iw_qvlist = rf->iw_qvlist;
+	iw_qvinfo = iw_qvlist->qv_info;
+	iw_qvlist->num_vectors = rf->msix_count;
+	if (rf->msix_count <= num_online_cpus())
+		rf->msix_shared = true;
+
+	pmsix = cdev_info->msix_entries;
+	for (i = 0, ceq_idx = 0; i < rf->msix_count; i++, iw_qvinfo++) {
+		rf->iw_msixtbl[i].idx = pmsix->entry;
+		rf->iw_msixtbl[i].irq = pmsix->vector;
+		rf->iw_msixtbl[i].cpu_affinity = ceq_idx;
+		if (!i) {
+			iw_qvinfo->aeq_idx = 0;
+			if (rf->msix_shared)
+				iw_qvinfo->ceq_idx = ceq_idx++;
+			else
+				iw_qvinfo->ceq_idx = IRDMA_Q_INVALID_IDX;
+		} else {
+			iw_qvinfo->aeq_idx = IRDMA_Q_INVALID_IDX;
+			iw_qvinfo->ceq_idx = ceq_idx++;
+		}
+		iw_qvinfo->itr_idx = 3;
+		iw_qvinfo->v_idx = rf->iw_msixtbl[i].idx;
+		pmsix++;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_irq_handler - interrupt handler for aeq and ceq0
+ * @irq: Interrupt request number
+ * @data: RDMA PCI function
+ */
+static irqreturn_t irdma_irq_handler(int irq, void *data)
+{
+	struct irdma_pci_f *rf = data;
+
+	tasklet_schedule(&rf->dpc_tasklet);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * irdma_ceq_handler - interrupt handler for ceq
+ * @irq: interrupt request number
+ * @data: ceq pointer
+ */
+static irqreturn_t irdma_ceq_handler(int irq, void *data)
+{
+	struct irdma_ceq *iwceq = data;
+
+	if (iwceq->irq != irq)
+		ibdev_err(to_ibdev(&iwceq->rf->sc_dev), "expected irq = %d received irq = %d\n",
+			  iwceq->irq, irq);
+	tasklet_schedule(&iwceq->dpc_tasklet);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * irdma_destroy_irq - destroy device interrupts
+ * @rf: RDMA PCI function
+ * @msix_vec: msix vector to disable irq
+ * @dev_id: parameter to pass to free_irq (used during irq setup)
+ *
+ * The function is called when destroying aeq/ceq
+ */
+static void irdma_destroy_irq(struct irdma_pci_f *rf,
+			      struct irdma_msix_vector *msix_vec, void *dev_id)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+
+	dev->irq_ops->irdma_dis_irq(dev, msix_vec->idx);
+	irq_set_affinity_hint(msix_vec->irq, NULL);
+	free_irq(msix_vec->irq, dev_id);
+}
+
+/**
+ * irdma_destroy_cqp  - destroy control qp
+ * @rf: RDMA PCI function
+ * @free_hwcqp: 1 if hw cqp should be freed
+ *
+ * Issue destroy cqp request and
+ * free the resources associated with the cqp
+ */
+static void irdma_destroy_cqp(struct irdma_pci_f *rf, bool free_hwcqp)
+{
+	enum irdma_status_code status = 0;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_cqp *cqp = &rf->cqp;
+
+	if (rf->cqp_cmpl_wq)
+		destroy_workqueue(rf->cqp_cmpl_wq);
+	if (free_hwcqp)
+		status = dev->cqp_ops->cqp_destroy(dev->cqp);
+	if (status)
+		ibdev_dbg(to_ibdev(dev), "ERR: Destroy CQP failed %d\n", status);
+
+	irdma_cleanup_pending_cqp_op(rf);
+	dma_free_coherent(dev->hw->device, cqp->sq.size, cqp->sq.va,
+			  cqp->sq.pa);
+	cqp->sq.va = NULL;
+	kfree(cqp->scratch_array);
+	cqp->scratch_array = NULL;
+	kfree(cqp->cqp_requests);
+	cqp->cqp_requests = NULL;
+}
+
+static void irdma_destroy_virt_aeq(struct irdma_pci_f *rf)
+{
+	struct irdma_aeq *aeq = &rf->aeq;
+	u32 pg_cnt = DIV_ROUND_UP(aeq->mem.size, PAGE_SIZE);
+	dma_addr_t *pg_arr = (dma_addr_t *)aeq->palloc.level1.addr;
+
+	irdma_unmap_vm_page_list(&rf->hw, pg_arr, pg_cnt);
+	irdma_free_pble(rf->pble_rsrc, &aeq->palloc);
+	vfree(aeq->mem.va);
+}
+
+/**
+ * irdma_destroy_aeq - destroy aeq
+ * @rf: RDMA PCI function
+ *
+ * Issue a destroy aeq request and
+ * free the resources associated with the aeq
+ * The function is called during driver unload
+ */
+static void irdma_destroy_aeq(struct irdma_pci_f *rf)
+{
+	enum irdma_status_code status = IRDMA_ERR_NOT_READY;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_aeq *aeq = &rf->aeq;
+
+	if (!rf->msix_shared) {
+		rf->sc_dev.irq_ops->irdma_cfg_aeq(&rf->sc_dev, rf->iw_msixtbl->idx, false);
+		irdma_destroy_irq(rf, rf->iw_msixtbl, rf);
+	}
+	if (rf->reset)
+		goto exit;
+
+	aeq->sc_aeq.size = 0;
+	status = irdma_cqp_aeq_cmd(dev, &aeq->sc_aeq, IRDMA_OP_AEQ_DESTROY);
+	if (status)
+		ibdev_dbg(to_ibdev(dev), "ERR: Destroy AEQ failed %d\n", status);
+
+exit:
+	if (aeq->virtual_map) {
+		irdma_destroy_virt_aeq(rf);
+	} else {
+		dma_free_coherent(dev->hw->device, aeq->mem.size, aeq->mem.va,
+				  aeq->mem.pa);
+		aeq->mem.va = NULL;
+	}
+}
+
+/**
+ * irdma_destroy_ceq - destroy ceq
+ * @rf: RDMA PCI function
+ * @iwceq: ceq to be destroyed
+ *
+ * Issue a destroy ceq request and
+ * free the resources associated with the ceq
+ */
+static void irdma_destroy_ceq(struct irdma_pci_f *rf, struct irdma_ceq *iwceq)
+{
+	enum irdma_status_code status;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+
+	if (rf->reset)
+		goto exit;
+
+	status = dev->ceq_ops->ceq_destroy(&iwceq->sc_ceq, 0, 1);
+	if (status) {
+		ibdev_dbg(to_ibdev(dev), "ERR: CEQ destroy command failed %d\n", status);
+		goto exit;
+	}
+
+	status = dev->ceq_ops->cceq_destroy_done(&iwceq->sc_ceq);
+	if (status)
+		ibdev_dbg(to_ibdev(dev), "ERR: CEQ destroy completion failed %d\n",
+			  status);
+exit:
+	dma_free_coherent(dev->hw->device, iwceq->mem.size, iwceq->mem.va,
+			  iwceq->mem.pa);
+	iwceq->mem.va = NULL;
+}
+
+/**
+ * irdma_del_ceq_0 - destroy ceq 0
+ * @rf: RDMA PCI function
+ *
+ * Disable the ceq 0 interrupt and destroy the ceq 0
+ */
+static void irdma_del_ceq_0(struct irdma_pci_f *rf)
+{
+	struct irdma_ceq *iwceq = rf->ceqlist;
+	struct irdma_msix_vector *msix_vec;
+
+	if (rf->msix_shared) {
+		msix_vec = &rf->iw_msixtbl[0];
+		rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev,
+						  msix_vec->ceq_id,
+						  msix_vec->idx, false);
+		irdma_destroy_irq(rf, msix_vec, rf);
+	} else {
+		msix_vec = &rf->iw_msixtbl[1];
+		irdma_destroy_irq(rf, msix_vec, iwceq);
+	}
+
+	irdma_destroy_ceq(rf, iwceq);
+	rf->sc_dev.ceq_valid = false;
+	rf->ceqs_count = 0;
+}
+
+/**
+ * irdma_del_ceqs - destroy all ceq's except CEQ 0
+ * @rf: RDMA PCI function
+ *
+ * Go through all of the device ceq's, except 0, and for each
+ * ceq disable the ceq interrupt and destroy the ceq
+ */
+static void irdma_del_ceqs(struct irdma_pci_f *rf)
+{
+	struct irdma_ceq *iwceq = &rf->ceqlist[1];
+	struct irdma_msix_vector *msix_vec;
+	u32 i = 0;
+
+	if (rf->msix_shared)
+		msix_vec = &rf->iw_msixtbl[1];
+	else
+		msix_vec = &rf->iw_msixtbl[2];
+
+	for (i = 1; i < rf->ceqs_count; i++, msix_vec++, iwceq++) {
+		rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, msix_vec->ceq_id,
+						  msix_vec->idx, false);
+		irdma_destroy_irq(rf, msix_vec, iwceq);
+		irdma_cqp_ceq_cmd(&rf->sc_dev, &iwceq->sc_ceq,
+				  IRDMA_OP_CEQ_DESTROY);
+		dma_free_coherent(rf->sc_dev.hw->device, iwceq->mem.size,
+				  iwceq->mem.va, iwceq->mem.pa);
+		iwceq->mem.va = NULL;
+	}
+	rf->ceqs_count = 1;
+}
+
+/**
+ * irdma_destroy_ccq - destroy control cq
+ * @rf: RDMA PCI function
+ *
+ * Issue destroy ccq request and
+ * free the resources associated with the ccq
+ */
+static void irdma_destroy_ccq(struct irdma_pci_f *rf)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_ccq *ccq = &rf->ccq;
+	enum irdma_status_code status = 0;
+
+	if (!rf->reset)
+		status = dev->ccq_ops->ccq_destroy(dev->ccq, 0, true);
+	if (status)
+		ibdev_dbg(to_ibdev(dev), "ERR: CCQ destroy failed %d\n", status);
+	dma_free_coherent(dev->hw->device, ccq->mem_cq.size, ccq->mem_cq.va,
+			  ccq->mem_cq.pa);
+	ccq->mem_cq.va = NULL;
+}
+
+/**
+ * irdma_close_hmc_objects_type - delete hmc objects of a given type
+ * @dev: iwarp device
+ * @obj_type: the hmc object type to be deleted
+ * @hmc_info: host memory info struct
+ * @privileged: permission to close HMC objects
+ * @reset: true if called before reset
+ */
+static void irdma_close_hmc_objects_type(struct irdma_sc_dev *dev,
+					 enum irdma_hmc_rsrc_type obj_type,
+					 struct irdma_hmc_info *hmc_info,
+					 bool privileged, bool reset)
+{
+	struct irdma_hmc_del_obj_info info = {};
+
+	info.hmc_info = hmc_info;
+	info.rsrc_type = obj_type;
+	info.count = hmc_info->hmc_obj[obj_type].cnt;
+	info.privileged = privileged;
+	if (dev->hmc_ops->del_hmc_object(dev, &info, reset))
+		ibdev_dbg(to_ibdev(dev), "ERR: del HMC obj of type %d failed\n",
+			  obj_type);
+}
+
+/**
+ * irdma_del_hmc_objects - remove all device hmc objects
+ * @dev: iwarp device
+ * @hmc_info: hmc_info to free
+ * @privileged: permission to delete HMC objects
+ * @reset: true if called before reset
+ * @vers: hardware version
+ */
+static void irdma_del_hmc_objects(struct irdma_sc_dev *dev,
+				  struct irdma_hmc_info *hmc_info, bool privileged,
+				  bool reset, enum irdma_vers vers)
+{
+	unsigned int i;
+
+	for (i = 0; i < IW_HMC_OBJ_TYPE_NUM; i++) {
+		if (dev->hmc_info->hmc_obj[iw_hmc_obj_types[i]].cnt)
+			irdma_close_hmc_objects_type(dev, iw_hmc_obj_types[i],
+						     hmc_info, privileged, reset);
+		if (vers == IRDMA_GEN_1 && i == IRDMA_HMC_IW_TIMER)
+			break;
+	}
+}
+
+/**
+ * irdma_create_hmc_obj_type - create hmc object of a given type
+ * @dev: hardware control device structure
+ * @info: information for the hmc object to create
+ */
+static enum irdma_status_code
+irdma_create_hmc_obj_type(struct irdma_sc_dev *dev,
+			  struct irdma_hmc_create_obj_info *info)
+{
+	return dev->hmc_ops->create_hmc_object(dev, info);
+}
+
+/**
+ * irdma_create_hmc_objs - create all hmc objects for the device
+ * @rf: RDMA PCI function
+ * @privileged: permission to create HMC objects
+ * @vers: HW version
+ *
+ * Create the device hmc objects and allocate hmc pages
+ * Return 0 if successful, otherwise clean up and return error
+ */
+static enum irdma_status_code
+irdma_create_hmc_objs(struct irdma_pci_f *rf, bool privileged, enum irdma_vers vers)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_hmc_create_obj_info info = {};
+	enum irdma_status_code status = 0;
+	int i;
+
+	info.hmc_info = dev->hmc_info;
+	info.privileged = privileged;
+	info.entry_type = rf->sd_type;
+
+	for (i = 0; i < IW_HMC_OBJ_TYPE_NUM; i++) {
+		if (dev->hmc_info->hmc_obj[iw_hmc_obj_types[i]].cnt) {
+			info.rsrc_type = iw_hmc_obj_types[i];
+			info.count = dev->hmc_info->hmc_obj[info.rsrc_type].cnt;
+			info.add_sd_cnt = 0;
+			status = irdma_create_hmc_obj_type(dev, &info);
+			if (status) {
+				ibdev_dbg(to_ibdev(dev),
+					  "ERR: create obj type %d status = %d\n",
+					  iw_hmc_obj_types[i], status);
+				break;
+			}
+		}
+		if (vers == IRDMA_GEN_1 && i == IRDMA_HMC_IW_TIMER)
+			break;
+	}
+
+	if (!status)
+		return dev->hmc_ops->static_hmc_pages_allocated(dev->cqp, 0,
+								dev->hmc_fn_id,
+								true, true);
+
+	while (i) {
+		i--;
+		/* destroy the hmc objects of a given type */
+		if (dev->hmc_info->hmc_obj[iw_hmc_obj_types[i]].cnt)
+			irdma_close_hmc_objects_type(dev, iw_hmc_obj_types[i],
+						     dev->hmc_info, privileged,
+						     false);
+	}
+
+	return status;
+}
+
+/**
+ * irdma_obj_aligned_mem - get aligned memory from device allocated memory
+ * @rf: RDMA PCI function
+ * @memptr: points to the memory addresses
+ * @size: size of memory needed
+ * @mask: mask for the aligned memory
+ *
+ * Get aligned memory of the requested size and
+ * update the memptr to point to the new aligned memory
+ * Return 0 if successful, otherwise return no memory error
+ */
+static enum irdma_status_code
+irdma_obj_aligned_mem(struct irdma_pci_f *rf, struct irdma_dma_mem *memptr,
+		      u32 size, u32 mask)
+{
+	unsigned long va, newva;
+	unsigned long extra;
+
+	va = (unsigned long)rf->obj_next.va;
+	newva = va;
+	if (mask)
+		newva = ALIGN(va, (unsigned long)mask + 1ULL);
+	extra = newva - va;
+	memptr->va = (u8 *)va + extra;
+	memptr->pa = rf->obj_next.pa + extra;
+	memptr->size = size;
+	if (((u8 *)memptr->va + size) > ((u8 *)rf->obj_mem.va + rf->obj_mem.size))
+		return IRDMA_ERR_NO_MEMORY;
+
+	rf->obj_next.va = (u8 *)memptr->va + size;
+	rf->obj_next.pa = memptr->pa + size;
+
+	return 0;
+}
+
+/**
+ * irdma_create_cqp - create control qp
+ * @rf: RDMA PCI function
+ *
+ * Return 0, if the cqp and all the resources associated with it
+ * are successfully created, otherwise return error
+ */
+static enum irdma_status_code irdma_create_cqp(struct irdma_pci_f *rf)
+{
+	enum irdma_status_code status;
+	u32 sqsize = IRDMA_CQP_SW_SQSIZE_2048;
+	struct irdma_dma_mem mem;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_cqp_init_info cqp_init_info = {};
+	struct irdma_cqp *cqp = &rf->cqp;
+	u16 maj_err, min_err;
+	int i;
+
+	cqp->cqp_requests = kcalloc(sqsize, sizeof(*cqp->cqp_requests), GFP_KERNEL);
+	if (!cqp->cqp_requests)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp->scratch_array = kcalloc(sqsize, sizeof(*cqp->scratch_array), GFP_KERNEL);
+	if (!cqp->scratch_array) {
+		kfree(cqp->cqp_requests);
+		return IRDMA_ERR_NO_MEMORY;
+	}
+
+	dev->cqp = &cqp->sc_cqp;
+	dev->cqp->dev = dev;
+	cqp->sq.size = ALIGN(sizeof(struct irdma_cqp_sq_wqe) * sqsize,
+			     IRDMA_CQP_ALIGNMENT);
+	cqp->sq.va = dma_alloc_coherent(dev->hw->device, cqp->sq.size,
+					&cqp->sq.pa, GFP_KERNEL);
+	if (!cqp->sq.va) {
+		kfree(cqp->scratch_array);
+		kfree(cqp->cqp_requests);
+		return IRDMA_ERR_NO_MEMORY;
+	}
+
+	status = irdma_obj_aligned_mem(rf, &mem, sizeof(struct irdma_cqp_ctx),
+				       IRDMA_HOST_CTX_ALIGNMENT_M);
+	if (status)
+		goto exit;
+
+	dev->cqp->host_ctx_pa = mem.pa;
+	dev->cqp->host_ctx = mem.va;
+	/* populate the cqp init info */
+	cqp_init_info.dev = dev;
+	cqp_init_info.sq_size = sqsize;
+	cqp_init_info.sq = cqp->sq.va;
+	cqp_init_info.sq_pa = cqp->sq.pa;
+	cqp_init_info.host_ctx_pa = mem.pa;
+	cqp_init_info.host_ctx = mem.va;
+	cqp_init_info.hmc_profile = rf->rsrc_profile;
+	cqp_init_info.scratch_array = cqp->scratch_array;
+	cqp_init_info.protocol_used = rf->protocol_used;
+
+	switch (rf->rdma_ver) {
+	case IRDMA_GEN_1:
+		cqp_init_info.hw_maj_ver = IRDMA_CQPHC_HW_MAJVER_GEN_1;
+		break;
+	case IRDMA_GEN_2:
+		cqp_init_info.hw_maj_ver = IRDMA_CQPHC_HW_MAJVER_GEN_2;
+		break;
+	}
+	status = dev->cqp_ops->cqp_init(dev->cqp, &cqp_init_info);
+	if (status) {
+		ibdev_dbg(to_ibdev(dev), "ERR: cqp init status %d\n", status);
+		goto exit;
+	}
+
+	spin_lock_init(&cqp->req_lock);
+	spin_lock_init(&cqp->compl_lock);
+
+	status = dev->cqp_ops->cqp_create(dev->cqp, &maj_err, &min_err);
+	if (status) {
+		ibdev_dbg(to_ibdev(dev),
+			  "ERR: cqp create failed - status %d maj_err %d min_err %d\n",
+			  status, maj_err, min_err);
+		goto exit;
+	}
+
+	INIT_LIST_HEAD(&cqp->cqp_avail_reqs);
+	INIT_LIST_HEAD(&cqp->cqp_pending_reqs);
+
+	/* init the waitqueue of the cqp_requests and add them to the list */
+	for (i = 0; i < sqsize; i++) {
+		init_waitqueue_head(&cqp->cqp_requests[i].waitq);
+		list_add_tail(&cqp->cqp_requests[i].list, &cqp->cqp_avail_reqs);
+	}
+	init_waitqueue_head(&cqp->remove_wq);
+	return 0;
+
+exit:
+	irdma_destroy_cqp(rf, false);
+
+	return status;
+}
+
+/**
+ * irdma_create_ccq - create control cq
+ * @rf: RDMA PCI function
+ *
+ * Return 0, if the ccq and the resources associated with it
+ * are successfully created, otherwise return error
+ */
+static enum irdma_status_code irdma_create_ccq(struct irdma_pci_f *rf)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	enum irdma_status_code status;
+	struct irdma_ccq_init_info info = {};
+	struct irdma_ccq *ccq = &rf->ccq;
+
+	dev->ccq = &ccq->sc_cq;
+	dev->ccq->dev = dev;
+	info.dev = dev;
+	ccq->shadow_area.size = sizeof(struct irdma_cq_shadow_area);
+	ccq->mem_cq.size = ALIGN(sizeof(struct irdma_cqe) * IW_CCQ_SIZE,
+				 IRDMA_CQ0_ALIGNMENT);
+	ccq->mem_cq.va = dma_alloc_coherent(dev->hw->device, ccq->mem_cq.size,
+					    &ccq->mem_cq.pa, GFP_KERNEL);
+	if (!ccq->mem_cq.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	status = irdma_obj_aligned_mem(rf, &ccq->shadow_area,
+				       ccq->shadow_area.size,
+				       IRDMA_SHADOWAREA_M);
+	if (status)
+		goto exit;
+
+	ccq->sc_cq.back_cq = ccq;
+	/* populate the ccq init info */
+	info.cq_base = ccq->mem_cq.va;
+	info.cq_pa = ccq->mem_cq.pa;
+	info.num_elem = IW_CCQ_SIZE;
+	info.shadow_area = ccq->shadow_area.va;
+	info.shadow_area_pa = ccq->shadow_area.pa;
+	info.ceqe_mask = false;
+	info.ceq_id_valid = true;
+	info.shadow_read_threshold = 16;
+	info.vsi = &rf->default_vsi;
+	status = dev->ccq_ops->ccq_init(dev->ccq, &info);
+	if (!status)
+		status = dev->ccq_ops->ccq_create(dev->ccq, 0, true, true);
+exit:
+	if (status) {
+		dma_free_coherent(dev->hw->device, ccq->mem_cq.size,
+				  ccq->mem_cq.va, ccq->mem_cq.pa);
+		ccq->mem_cq.va = NULL;
+	}
+
+	return status;
+}
+
+/**
+ * irdma_alloc_set_mac - set up a mac address table entry
+ * @iwdev: irdma device
+ *
+ * Allocate a mac ip entry and add it to the hw table Return 0
+ * if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_alloc_set_mac(struct irdma_device *iwdev)
+{
+	enum irdma_status_code status;
+
+	status = irdma_alloc_local_mac_entry(iwdev->rf,
+					     &iwdev->mac_ip_table_idx);
+	if (!status) {
+		status = irdma_add_local_mac_entry(iwdev->rf,
+						   (u8 *)iwdev->netdev->dev_addr,
+						   (u8)iwdev->mac_ip_table_idx);
+		if (status)
+			irdma_del_local_mac_entry(iwdev->rf,
+						  (u8)iwdev->mac_ip_table_idx);
+	}
+	return status;
+}
+
+/**
+ * irdma_cfg_ceq_vector - set up the msix interrupt vector for
+ * ceq
+ * @rf: RDMA PCI function
+ * @iwceq: ceq associated with the vector
+ * @ceq_id: the id number of the iwceq
+ * @msix_vec: interrupt vector information
+ *
+ * Allocate interrupt resources and enable irq handling
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code
+irdma_cfg_ceq_vector(struct irdma_pci_f *rf, struct irdma_ceq *iwceq,
+		     u32 ceq_id, struct irdma_msix_vector *msix_vec)
+{
+	int status;
+
+	if (rf->msix_shared && !ceq_id) {
+		tasklet_setup(&rf->dpc_tasklet, irdma_dpc);
+		status = request_irq(msix_vec->irq, irdma_irq_handler, 0,
+				     "AEQCEQ", rf);
+	} else {
+		tasklet_setup(&iwceq->dpc_tasklet, irdma_ceq_dpc);
+
+		status = request_irq(msix_vec->irq, irdma_ceq_handler, 0,
+				     "CEQ", iwceq);
+	}
+	cpumask_clear(&msix_vec->mask);
+	cpumask_set_cpu(msix_vec->cpu_affinity, &msix_vec->mask);
+	irq_set_affinity_hint(msix_vec->irq, &msix_vec->mask);
+	if (status) {
+		ibdev_dbg(&rf->iwdev->ibdev, "ERR: ceq irq config fail\n");
+		return IRDMA_ERR_CFG;
+	}
+
+	msix_vec->ceq_id = ceq_id;
+	rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, ceq_id, msix_vec->idx, true);
+
+	return 0;
+}
+
+/**
+ * irdma_cfg_aeq_vector - set up the msix vector for aeq
+ * @rf: RDMA PCI function
+ *
+ * Allocate interrupt resources and enable irq handling
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_cfg_aeq_vector(struct irdma_pci_f *rf)
+{
+	struct irdma_msix_vector *msix_vec = rf->iw_msixtbl;
+	u32 ret = 0;
+
+	if (!rf->msix_shared) {
+		tasklet_setup(&rf->dpc_tasklet, irdma_dpc);
+		ret = request_irq(msix_vec->irq, irdma_irq_handler, 0,
+				  "irdma", rf);
+	}
+	if (ret) {
+		ibdev_dbg(&rf->iwdev->ibdev, "ERR: aeq irq config fail\n");
+		return IRDMA_ERR_CFG;
+	}
+
+	rf->sc_dev.irq_ops->irdma_cfg_aeq(&rf->sc_dev, msix_vec->idx, true);
+
+	return 0;
+}
+
+/**
+ * irdma_create_ceq - create completion event queue
+ * @rf: RDMA PCI function
+ * @iwceq: pointer to the ceq resources to be created
+ * @ceq_id: the id number of the iwceq
+ * @vsi: SC vsi struct
+ *
+ * Return 0, if the ceq and the resources associated with it
+ * are successfully created, otherwise return error
+ */
+static enum irdma_status_code irdma_create_ceq(struct irdma_pci_f *rf,
+					       struct irdma_ceq *iwceq,
+					       u32 ceq_id,
+					       struct irdma_sc_vsi *vsi)
+{
+	enum irdma_status_code status;
+	struct irdma_ceq_init_info info = {};
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	u64 scratch;
+	u32 ceq_size;
+
+	info.ceq_id = ceq_id;
+	iwceq->rf = rf;
+	ceq_size = min(rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt,
+		       dev->hw_attrs.max_hw_ceq_size);
+	iwceq->mem.size = ALIGN(sizeof(struct irdma_ceqe) * ceq_size,
+				IRDMA_CEQ_ALIGNMENT);
+	iwceq->mem.va = dma_alloc_coherent(dev->hw->device, iwceq->mem.size,
+					   &iwceq->mem.pa, GFP_KERNEL);
+	if (!iwceq->mem.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	info.ceq_id = ceq_id;
+	info.ceqe_base = iwceq->mem.va;
+	info.ceqe_pa = iwceq->mem.pa;
+	info.elem_cnt = ceq_size;
+	iwceq->sc_ceq.ceq_id = ceq_id;
+	info.dev = dev;
+	info.vsi = vsi;
+	scratch = (uintptr_t)&rf->cqp.sc_cqp;
+	status = dev->ceq_ops->ceq_init(&iwceq->sc_ceq, &info);
+	if (!status) {
+		if (dev->ceq_valid)
+			status = irdma_cqp_ceq_cmd(&rf->sc_dev, &iwceq->sc_ceq,
+						   IRDMA_OP_CEQ_CREATE);
+		else
+			status = dev->ceq_ops->cceq_create(&iwceq->sc_ceq,
+							   scratch);
+	}
+
+	if (status) {
+		dma_free_coherent(dev->hw->device, iwceq->mem.size,
+				  iwceq->mem.va, iwceq->mem.pa);
+		iwceq->mem.va = NULL;
+	}
+
+	return status;
+}
+
+/**
+ * irdma_setup_ceq_0 - create CEQ 0 and it's interrupt resource
+ * @rf: RDMA PCI function
+ *
+ * Allocate a list for all device completion event queues
+ * Create the ceq 0 and configure it's msix interrupt vector
+ * Return 0, if successfully set up, otherwise return error
+ */
+static enum irdma_status_code irdma_setup_ceq_0(struct irdma_pci_f *rf)
+{
+	struct irdma_ceq *iwceq;
+	struct irdma_msix_vector *msix_vec;
+	u32 i;
+	enum irdma_status_code status = 0;
+	u32 num_ceqs;
+
+	num_ceqs = min(rf->msix_count, rf->sc_dev.hmc_fpm_misc.max_ceqs);
+	rf->ceqlist = kcalloc(num_ceqs, sizeof(*rf->ceqlist), GFP_KERNEL);
+	if (!rf->ceqlist) {
+		status = IRDMA_ERR_NO_MEMORY;
+		goto exit;
+	}
+
+	iwceq = &rf->ceqlist[0];
+	status = irdma_create_ceq(rf, iwceq, 0, &rf->default_vsi);
+	if (status) {
+		ibdev_dbg(&rf->iwdev->ibdev, "ERR: create ceq status = %d\n",
+			  status);
+		goto exit;
+	}
+
+	spin_lock_init(&iwceq->ce_lock);
+	i = rf->msix_shared ? 0 : 1;
+	msix_vec = &rf->iw_msixtbl[i];
+	iwceq->irq = msix_vec->irq;
+	iwceq->msix_idx = msix_vec->idx;
+	status = irdma_cfg_ceq_vector(rf, iwceq, 0, msix_vec);
+	if (status) {
+		irdma_destroy_ceq(rf, iwceq);
+		goto exit;
+	}
+
+	irdma_ena_intr(&rf->sc_dev, msix_vec->idx);
+	rf->ceqs_count++;
+
+exit:
+	if (status && !rf->ceqs_count) {
+		kfree(rf->ceqlist);
+		rf->ceqlist = NULL;
+		return status;
+	}
+	rf->sc_dev.ceq_valid = true;
+
+	return 0;
+}
+
+/**
+ * irdma_setup_ceqs - manage the device ceq's and their interrupt resources
+ * @rf: RDMA PCI function
+ * @vsi: VSI structure for this CEQ
+ *
+ * Allocate a list for all device completion event queues
+ * Create the ceq's and configure their msix interrupt vectors
+ * Return 0, if ceqs are successfully set up, otherwise return error
+ */
+static enum irdma_status_code irdma_setup_ceqs(struct irdma_pci_f *rf,
+					       struct irdma_sc_vsi *vsi)
+{
+	u32 i;
+	u32 ceq_id;
+	struct irdma_ceq *iwceq;
+	struct irdma_msix_vector *msix_vec;
+	enum irdma_status_code status;
+	u32 num_ceqs;
+
+	num_ceqs = min(rf->msix_count, rf->sc_dev.hmc_fpm_misc.max_ceqs);
+	i = (rf->msix_shared) ? 1 : 2;
+	for (ceq_id = 1; i < num_ceqs; i++, ceq_id++) {
+		iwceq = &rf->ceqlist[ceq_id];
+		status = irdma_create_ceq(rf, iwceq, ceq_id, vsi);
+		if (status) {
+			ibdev_dbg(&rf->iwdev->ibdev,
+				  "ERR: create ceq status = %d\n", status);
+			goto del_ceqs;
+		}
+		spin_lock_init(&iwceq->ce_lock);
+		msix_vec = &rf->iw_msixtbl[i];
+		iwceq->irq = msix_vec->irq;
+		iwceq->msix_idx = msix_vec->idx;
+		status = irdma_cfg_ceq_vector(rf, iwceq, ceq_id, msix_vec);
+		if (status) {
+			irdma_destroy_ceq(rf, iwceq);
+			goto del_ceqs;
+		}
+		irdma_ena_intr(&rf->sc_dev, msix_vec->idx);
+		rf->ceqs_count++;
+	}
+
+	return 0;
+
+del_ceqs:
+	irdma_del_ceqs(rf);
+
+	return status;
+}
+
+static enum irdma_status_code irdma_create_virt_aeq(struct irdma_pci_f *rf,
+						    u32 size)
+{
+	enum irdma_status_code status = IRDMA_ERR_NO_MEMORY;
+	struct irdma_aeq *aeq = &rf->aeq;
+	dma_addr_t *pg_arr;
+	u32 pg_cnt;
+
+	if (rf->rdma_ver < IRDMA_GEN_2)
+		return IRDMA_NOT_SUPPORTED;
+
+	aeq->mem.size = sizeof(struct irdma_sc_aeqe) * size;
+	aeq->mem.va = vzalloc(aeq->mem.size);
+
+	if (!aeq->mem.va)
+		return status;
+
+	pg_cnt = DIV_ROUND_UP(aeq->mem.size, PAGE_SIZE);
+	status = irdma_get_pble(rf->pble_rsrc, &aeq->palloc, pg_cnt, true);
+	if (status) {
+		vfree(aeq->mem.va);
+		return status;
+	}
+
+	pg_arr = (dma_addr_t *)(uintptr_t)aeq->palloc.level1.addr;
+	status = irdma_map_vm_page_list(&rf->hw, aeq->mem.va, pg_arr, pg_cnt);
+	if (status) {
+		irdma_free_pble(rf->pble_rsrc, &aeq->palloc);
+		vfree(aeq->mem.va);
+		return status;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_create_aeq - create async event queue
+ * @rf: RDMA PCI function
+ *
+ * Return 0, if the aeq and the resources associated with it
+ * are successfully created, otherwise return error
+ */
+static enum irdma_status_code irdma_create_aeq(struct irdma_pci_f *rf)
+{
+	enum irdma_status_code status;
+	struct irdma_aeq_init_info info = {};
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_aeq *aeq = &rf->aeq;
+	struct irdma_hmc_info *hmc_info = rf->sc_dev.hmc_info;
+	u32 aeq_size;
+	u8 multiplier = (rf->protocol_used == IRDMA_IWARP_PROTOCOL_ONLY) ? 2 : 1;
+
+	aeq_size = multiplier * hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt +
+		   hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt;
+	aeq_size = min(aeq_size, dev->hw_attrs.max_hw_aeq_size);
+
+	aeq->mem.size = ALIGN(sizeof(struct irdma_sc_aeqe) * aeq_size,
+			      IRDMA_AEQ_ALIGNMENT);
+	aeq->mem.va = dma_alloc_coherent(dev->hw->device, aeq->mem.size,
+					 &aeq->mem.pa,
+					 GFP_KERNEL | __GFP_NOWARN);
+	if (aeq->mem.va)
+		goto skip_virt_aeq;
+
+	/* physically mapped aeq failed. setup virtual aeq */
+	status = irdma_create_virt_aeq(rf, aeq_size);
+	if (status)
+		return status;
+
+	info.virtual_map = true;
+	aeq->virtual_map = info.virtual_map;
+	info.pbl_chunk_size = 1;
+	info.first_pm_pbl_idx = aeq->palloc.level1.idx;
+
+skip_virt_aeq:
+	info.aeqe_base = aeq->mem.va;
+	info.aeq_elem_pa = aeq->mem.pa;
+	info.elem_cnt = aeq_size;
+	info.dev = dev;
+	info.msix_idx = rf->iw_msixtbl->idx;
+	status = dev->aeq_ops->aeq_init(&aeq->sc_aeq, &info);
+	if (status)
+		goto err;
+
+	status = irdma_cqp_aeq_cmd(dev, &aeq->sc_aeq, IRDMA_OP_AEQ_CREATE);
+	if (status)
+		goto err;
+
+	return 0;
+
+err:
+	if (aeq->virtual_map) {
+		irdma_destroy_virt_aeq(rf);
+	} else {
+		dma_free_coherent(dev->hw->device, aeq->mem.size, aeq->mem.va,
+				  aeq->mem.pa);
+		aeq->mem.va = NULL;
+	}
+
+	return status;
+}
+
+/**
+ * irdma_setup_aeq - set up the device aeq
+ * @rf: RDMA PCI function
+ *
+ * Create the aeq and configure its msix interrupt vector
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_setup_aeq(struct irdma_pci_f *rf)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	enum irdma_status_code status;
+
+	status = irdma_create_aeq(rf);
+	if (status)
+		return status;
+
+	status = irdma_cfg_aeq_vector(rf);
+	if (status) {
+		irdma_destroy_aeq(rf);
+		return status;
+	}
+
+	if (!rf->msix_shared)
+		irdma_ena_intr(dev, rf->iw_msixtbl[0].idx);
+
+	return 0;
+}
+
+/**
+ * irdma_initialize_ilq - create iwarp local queue for cm
+ * @iwdev: irdma device
+ *
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_initialize_ilq(struct irdma_device *iwdev)
+{
+	struct irdma_puda_rsrc_info info = {};
+	enum irdma_status_code status;
+
+	info.type = IRDMA_PUDA_RSRC_TYPE_ILQ;
+	info.cq_id = 1;
+	info.qp_id = 1;
+	info.count = 1;
+	info.pd_id = 1;
+	info.abi_ver = IRDMA_ABI_VER;
+	info.sq_size = min(iwdev->rf->max_qp / 2, (u32)32768);
+	info.rq_size = info.sq_size;
+	info.buf_size = 1024;
+	info.tx_buf_cnt = 2 * info.sq_size;
+	info.receive = irdma_receive_ilq;
+	info.xmit_complete = irdma_free_sqbuf;
+	status = irdma_puda_create_rsrc(&iwdev->vsi, &info);
+	if (status)
+		ibdev_dbg(&iwdev->ibdev, "ERR: ilq create fail\n");
+
+	return status;
+}
+
+/**
+ * irdma_initialize_ieq - create iwarp exception queue
+ * @iwdev: irdma device
+ *
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_initialize_ieq(struct irdma_device *iwdev)
+{
+	struct irdma_puda_rsrc_info info = {};
+	enum irdma_status_code status;
+
+	info.type = IRDMA_PUDA_RSRC_TYPE_IEQ;
+	info.cq_id = 2;
+	info.qp_id = iwdev->vsi.exception_lan_q;
+	info.count = 1;
+	info.pd_id = 2;
+	info.abi_ver = IRDMA_ABI_VER;
+	info.sq_size = min(iwdev->rf->max_qp / 2, (u32)32768);
+	info.rq_size = info.sq_size;
+	info.buf_size = iwdev->vsi.mtu + IRDMA_IPV4_PAD;
+	info.tx_buf_cnt = 4096;
+	status = irdma_puda_create_rsrc(&iwdev->vsi, &info);
+	if (status)
+		ibdev_dbg(&iwdev->ibdev, "ERR: ieq create fail\n");
+
+	return status;
+}
+
+/**
+ * irdma_reinitialize_ieq - destroy and re-create ieq
+ * @vsi: VSI structure
+ */
+void irdma_reinitialize_ieq(struct irdma_sc_vsi *vsi)
+{
+	struct irdma_device *iwdev = vsi->back_vsi;
+	struct irdma_pci_f *rf = iwdev->rf;
+
+	irdma_puda_dele_rsrc(vsi, IRDMA_PUDA_RSRC_TYPE_IEQ, false);
+	if (irdma_initialize_ieq(iwdev)) {
+		iwdev->reset = true;
+		rf->gen_ops.request_reset(rf);
+	}
+}
+
+/**
+ * irdma_hmc_setup - create hmc objects for the device
+ * @rf: RDMA PCI function
+ *
+ * Set up the device private memory space for the number and size of
+ * the hmc objects and create the objects
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_hmc_setup(struct irdma_pci_f *rf)
+{
+	enum irdma_status_code status;
+	u32 qpcnt;
+
+	if (rf->rdma_ver == IRDMA_GEN_1)
+		qpcnt = rsrc_limits_table[rf->limits_sel].qplimit * 2;
+	else
+		qpcnt = rsrc_limits_table[rf->limits_sel].qplimit;
+
+	rf->sd_type = IRDMA_SD_TYPE_DIRECT;
+	status = irdma_cfg_fpm_val(&rf->sc_dev, qpcnt);
+	if (status)
+		return status;
+
+	status = irdma_create_hmc_objs(rf, true, rf->rdma_ver);
+
+	return status;
+}
+
+/**
+ * irdma_del_init_mem - deallocate memory resources
+ * @rf: RDMA PCI function
+ */
+static void irdma_del_init_mem(struct irdma_pci_f *rf)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+
+	kfree(dev->hmc_info->sd_table.sd_entry);
+	dev->hmc_info->sd_table.sd_entry = NULL;
+	kfree(rf->mem_rsrc);
+	rf->mem_rsrc = NULL;
+	dma_free_coherent(rf->hw.device, rf->obj_mem.size, rf->obj_mem.va,
+			  rf->obj_mem.pa);
+	rf->obj_mem.va = NULL;
+	if (rf->rdma_ver != IRDMA_GEN_1) {
+		kfree(rf->allocated_ws_nodes);
+		rf->allocated_ws_nodes = NULL;
+	}
+	kfree(rf->ceqlist);
+	rf->ceqlist = NULL;
+	kfree(rf->iw_msixtbl);
+	rf->iw_msixtbl = NULL;
+	kfree(rf->hmc_info_mem);
+	rf->hmc_info_mem = NULL;
+}
+
+/**
+ * irdma_initialize_dev - initialize device
+ * @rf: RDMA PCI function
+ * @cdev_info: lan device information
+ *
+ * Allocate memory for the hmc objects and initialize iwdev
+ * Return 0 if successful, otherwise clean up the resources
+ * and return error
+ */
+static enum irdma_status_code irdma_initialize_dev(struct irdma_pci_f *rf,
+						   struct irdma_priv_core_dev_info *cdev_info)
+{
+	enum irdma_status_code status;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_device_init_info info = {};
+	struct irdma_dma_mem mem;
+	u32 size;
+
+	size = sizeof(struct irdma_hmc_pble_rsrc) +
+	       sizeof(struct irdma_hmc_info) +
+	       (sizeof(struct irdma_hmc_obj_info) * IRDMA_HMC_IW_MAX);
+
+	rf->hmc_info_mem = kzalloc(size, GFP_KERNEL);
+	if (!rf->hmc_info_mem)
+		return IRDMA_ERR_NO_MEMORY;
+
+	rf->pble_rsrc = (struct irdma_hmc_pble_rsrc *)rf->hmc_info_mem;
+	dev->hmc_info = &rf->hw.hmc;
+	dev->hmc_info->hmc_obj = (struct irdma_hmc_obj_info *)
+				 (rf->pble_rsrc + 1);
+
+	status = irdma_obj_aligned_mem(rf, &mem, IRDMA_QUERY_FPM_BUF_SIZE,
+				       IRDMA_FPM_QUERY_BUF_ALIGNMENT_M);
+	if (status)
+		goto error;
+
+	info.fpm_query_buf_pa = mem.pa;
+	info.fpm_query_buf = mem.va;
+	info.init_hw = rf->gen_ops.init_hw;
+
+	status = irdma_obj_aligned_mem(rf, &mem, IRDMA_COMMIT_FPM_BUF_SIZE,
+				       IRDMA_FPM_COMMIT_BUF_ALIGNMENT_M);
+	if (status)
+		goto error;
+
+	info.fpm_commit_buf_pa = mem.pa;
+	info.fpm_commit_buf = mem.va;
+
+	info.bar0 = rf->hw.hw_addr;
+	info.hmc_fn_id = (u8)cdev_info->fn_num;
+	info.is_pf = !cdev_info->ftype;
+	info.privileged = !cdev_info->ftype;
+	info.hw = &rf->hw;
+	status = irdma_sc_ctrl_init(rf->rdma_ver, &rf->sc_dev, &info);
+	if (status)
+		goto error;
+
+	return status;
+error:
+	kfree(rf->hmc_info_mem);
+	rf->hmc_info_mem = NULL;
+
+	return status;
+}
+
+/**
+ * irdma_rt_deinit_hw - clean up the irdma device resources
+ * @iwdev: irdma device
+ *
+ * remove the mac ip entry and ipv4/ipv6 addresses, destroy the
+ * device queues and free the pble and the hmc objects
+ */
+void irdma_rt_deinit_hw(struct irdma_device *iwdev)
+{
+	ibdev_dbg(&iwdev->ibdev, "INIT: state = %d\n", iwdev->init_state);
+
+	switch (iwdev->init_state) {
+	case IP_ADDR_REGISTERED:
+		if (iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+			irdma_del_local_mac_entry(iwdev->rf,
+						  (u8)iwdev->mac_ip_table_idx);
+		fallthrough;
+	case AEQ_CREATED:
+	case PBLE_CHUNK_MEM:
+	case CEQS_CREATED:
+	case IEQ_CREATED:
+		if (!iwdev->roce_mode)
+			irdma_puda_dele_rsrc(&iwdev->vsi, IRDMA_PUDA_RSRC_TYPE_IEQ,
+					     iwdev->reset);
+		fallthrough;
+	case ILQ_CREATED:
+		if (!iwdev->roce_mode)
+			irdma_puda_dele_rsrc(&iwdev->vsi,
+					     IRDMA_PUDA_RSRC_TYPE_ILQ,
+					     iwdev->reset);
+		break;
+	default:
+		ibdev_warn(&iwdev->ibdev, "bad init_state = %d\n", iwdev->init_state);
+		break;
+	}
+
+	irdma_cleanup_cm_core(&iwdev->cm_core);
+	if (iwdev->vsi.pestat) {
+		irdma_vsi_stats_free(&iwdev->vsi);
+		kfree(iwdev->vsi.pestat);
+	}
+	if (iwdev->cleanup_wq)
+		destroy_workqueue(iwdev->cleanup_wq);
+}
+
+/**
+ * irdma_setup_init_state - set up the initial device struct
+ * @rf: RDMA PCI function
+ *
+ * Initialize the iwarp device and its hdl information
+ * using the cdev_info and client information
+ * Return 0 if successful, otherwise return error
+ */
+static enum irdma_status_code irdma_setup_init_state(struct irdma_pci_f *rf)
+{
+	struct irdma_priv_core_dev_info *cdev_info = &rf->priv_cdev_info;
+	enum irdma_status_code status;
+
+	status = irdma_save_msix_info(rf);
+	if (status)
+		return status;
+
+	rf->hw.device = &rf->pcidev->dev;
+	rf->obj_mem.size = ALIGN(8192, IRDMA_HW_PAGE_SIZE);
+	rf->obj_mem.va = dma_alloc_coherent(rf->hw.device, rf->obj_mem.size,
+					    &rf->obj_mem.pa, GFP_KERNEL);
+	if (!rf->obj_mem.va) {
+		status = IRDMA_ERR_NO_MEMORY;
+		goto clean_msixtbl;
+	}
+
+	rf->obj_next = rf->obj_mem;
+	status = irdma_initialize_dev(rf, cdev_info);
+	if (status)
+		goto clean_obj_mem;
+
+	return 0;
+
+clean_obj_mem:
+	dma_free_coherent(rf->hw.device, rf->obj_mem.size, rf->obj_mem.va,
+			  rf->obj_mem.pa);
+	rf->obj_mem.va = NULL;
+clean_msixtbl:
+	kfree(rf->iw_msixtbl);
+	rf->iw_msixtbl = NULL;
+	return status;
+}
+
+/**
+ * irdma_get_used_rsrc - determine resources used internally
+ * @iwdev: irdma device
+ *
+ * Called at the end of open to get all internal allocations
+ */
+static void irdma_get_used_rsrc(struct irdma_device *iwdev)
+{
+	iwdev->rf->used_pds = find_next_zero_bit(iwdev->rf->allocated_pds,
+						 iwdev->rf->max_pd, 0);
+	iwdev->rf->used_qps = find_next_zero_bit(iwdev->rf->allocated_qps,
+						 iwdev->rf->max_qp, 0);
+	iwdev->rf->used_cqs = find_next_zero_bit(iwdev->rf->allocated_cqs,
+						 iwdev->rf->max_cq, 0);
+	iwdev->rf->used_mrs = find_next_zero_bit(iwdev->rf->allocated_mrs,
+						 iwdev->rf->max_mr, 0);
+}
+
+void irdma_ctrl_deinit_hw(struct irdma_pci_f *rf)
+{
+	enum init_completion_state state = rf->init_state;
+
+	rf->init_state = INVALID_STATE;
+	if (rf->rsrc_created) {
+		irdma_destroy_aeq(rf);
+		irdma_destroy_pble_prm(rf->pble_rsrc);
+		irdma_del_ceqs(rf);
+		rf->rsrc_created = false;
+	}
+	switch (state) {
+	case CEQ0_CREATED:
+		irdma_del_ceq_0(rf);
+		fallthrough;
+	case CCQ_CREATED:
+		irdma_destroy_ccq(rf);
+		fallthrough;
+	case HW_RSRC_INITIALIZED:
+	case HMC_OBJS_CREATED:
+		irdma_del_hmc_objects(&rf->sc_dev, rf->sc_dev.hmc_info, true,
+				      rf->reset, rf->rdma_ver);
+		fallthrough;
+	case CQP_CREATED:
+		irdma_destroy_cqp(rf, true);
+		fallthrough;
+	case INITIAL_STATE:
+		irdma_del_init_mem(rf);
+		break;
+	case INVALID_STATE:
+	default:
+		ibdev_warn(&rf->iwdev->ibdev, "bad init_state = %d\n", rf->init_state);
+		break;
+	}
+}
+
+/**
+ * irdma_rt_init_hw - Initializes runtime portion of HW
+ * @iwdev: irdma device
+ * @l2params: qos, tc, mtu info from netdev driver
+ *
+ * Create device queues ILQ, IEQ, CEQs and PBLEs. Setup irdma
+ * device resource objects.
+ */
+enum irdma_status_code irdma_rt_init_hw(struct irdma_device *iwdev,
+					struct irdma_l2params *l2params)
+{
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	enum irdma_status_code status;
+	struct irdma_vsi_init_info vsi_info = {};
+	struct irdma_vsi_stats_info stats_info = {};
+
+	irdma_sc_rt_init(dev);
+	vsi_info.vm_vf_type = rf->priv_cdev_info.ftype ? IRDMA_VF_TYPE : IRDMA_PF_TYPE;
+	vsi_info.dev = dev;
+	vsi_info.back_vsi = iwdev;
+	vsi_info.params = l2params;
+	vsi_info.pf_data_vsi_num = iwdev->vsi_num;
+	vsi_info.register_qset = rf->gen_ops.register_qset;
+	vsi_info.unregister_qset = rf->gen_ops.unregister_qset;
+	vsi_info.exception_lan_q = 2;
+	irdma_sc_vsi_init(&iwdev->vsi, &vsi_info);
+
+	status = irdma_setup_cm_core(iwdev, rf->rdma_ver);
+	if (status)
+		return status;
+
+	stats_info.pestat = kzalloc(sizeof(*stats_info.pestat), GFP_KERNEL);
+	if (!stats_info.pestat) {
+		irdma_cleanup_cm_core(&iwdev->cm_core);
+		return IRDMA_ERR_NO_MEMORY;
+	}
+	stats_info.fcn_id = dev->hmc_fn_id;
+	status = irdma_vsi_stats_init(&iwdev->vsi, &stats_info);
+	if (status) {
+		irdma_cleanup_cm_core(&iwdev->cm_core);
+		kfree(stats_info.pestat);
+		return status;
+	}
+
+	do {
+		if (!iwdev->roce_mode) {
+			status = irdma_initialize_ilq(iwdev);
+			if (status)
+				break;
+			iwdev->init_state = ILQ_CREATED;
+			status = irdma_initialize_ieq(iwdev);
+			if (status)
+				break;
+			iwdev->init_state = IEQ_CREATED;
+		}
+		if (!rf->rsrc_created) {
+			status = irdma_setup_ceqs(rf, &iwdev->vsi);
+			if (status)
+				break;
+
+			iwdev->init_state = CEQS_CREATED;
+
+			status = irdma_hmc_init_pble(&rf->sc_dev,
+						     rf->pble_rsrc);
+			if (status) {
+				irdma_del_ceqs(rf);
+				break;
+			}
+
+			iwdev->init_state = PBLE_CHUNK_MEM;
+
+			status = irdma_setup_aeq(rf);
+			if (status) {
+				irdma_destroy_pble_prm(rf->pble_rsrc);
+				irdma_del_ceqs(rf);
+				break;
+			}
+			iwdev->init_state = AEQ_CREATED;
+			rf->rsrc_created = true;
+		}
+
+		iwdev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY |
+					  IB_DEVICE_MEM_WINDOW |
+					  IB_DEVICE_MEM_MGT_EXTENSIONS;
+
+		if (iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+			irdma_alloc_set_mac(iwdev);
+		irdma_add_ip(iwdev);
+		iwdev->init_state = IP_ADDR_REGISTERED;
+
+		/* handles asynch cleanup tasks - disconnect CM , free qp,
+		 * free cq bufs
+		 */
+		iwdev->cleanup_wq = alloc_workqueue("irdma-cleanup-wq",
+					WQ_UNBOUND, WQ_UNBOUND_MAX_ACTIVE);
+		if (!iwdev->cleanup_wq)
+			return IRDMA_ERR_NO_MEMORY;
+		irdma_get_used_rsrc(iwdev);
+		init_waitqueue_head(&iwdev->suspend_wq);
+
+		return 0;
+	} while (0);
+
+	dev_err(&rf->pcidev->dev, "HW runtime init FAIL status = %d last cmpl = %d\n",
+		status, iwdev->init_state);
+	irdma_rt_deinit_hw(iwdev);
+
+	return status;
+}
+
+/**
+ * irdma_ctrl_init_hw - Initializes control portion of HW
+ * @rf: RDMA PCI function
+ *
+ * Create admin queues, HMC obejcts and RF resource objects
+ */
+enum irdma_status_code irdma_ctrl_init_hw(struct irdma_pci_f *rf)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	enum irdma_status_code status;
+	do {
+		status = irdma_setup_init_state(rf);
+		if (status)
+			break;
+		rf->init_state = INITIAL_STATE;
+
+		status = irdma_create_cqp(rf);
+		if (status)
+			break;
+		rf->init_state = CQP_CREATED;
+
+		status = irdma_hmc_setup(rf);
+		if (status)
+			break;
+		rf->init_state = HMC_OBJS_CREATED;
+
+		status = irdma_initialize_hw_rsrc(rf);
+		if (status)
+			break;
+		rf->init_state = HW_RSRC_INITIALIZED;
+
+		status = irdma_create_ccq(rf);
+		if (status)
+			break;
+		rf->init_state = CCQ_CREATED;
+
+		rf->sc_dev.feature_info[IRDMA_FEATURE_FW_INFO] = IRDMA_FW_VER_DEFAULT;
+		if (rf->rdma_ver != IRDMA_GEN_1)
+			status = irdma_get_rdma_features(&rf->sc_dev);
+		if (!status) {
+			u32 fw_ver = dev->feature_info[IRDMA_FEATURE_FW_INFO];
+			u8 hw_rev = dev->hw_attrs.uk_attrs.hw_rev;
+
+			if ((hw_rev == IRDMA_GEN_1 && fw_ver >= IRDMA_FW_VER_0x30010) ||
+			    (hw_rev != IRDMA_GEN_1 && fw_ver >= IRDMA_FW_VER_0x1000D))
+
+				dev->hw_attrs.uk_attrs.feature_flags |= IRDMA_FEATURE_RTS_AE |
+									IRDMA_FEATURE_CQ_RESIZE;
+		}
+
+		status = irdma_setup_ceq_0(rf);
+		if (status)
+			break;
+		rf->init_state = CEQ0_CREATED;
+		/* Handles processing of CQP completions */
+		rf->cqp_cmpl_wq = alloc_ordered_workqueue("cqp_cmpl_wq",
+						WQ_HIGHPRI | WQ_UNBOUND);
+		if (!rf->cqp_cmpl_wq) {
+			status = IRDMA_ERR_NO_MEMORY;
+			break;
+		}
+		INIT_WORK(&rf->cqp_cmpl_work, cqp_compl_worker);
+		dev->ccq_ops->ccq_arm(dev->ccq);
+		return 0;
+	} while (0);
+
+	dev_err(&rf->pcidev->dev, "IRDMA hardware initialization FAILED init_state=%d status=%d\n",
+		rf->init_state, status);
+	irdma_ctrl_deinit_hw(rf);
+	return status;
+}
+
+/**
+ * irdma_set_hw_rsrc - set hw memory resources.
+ * @rf: RDMA PCI function
+ */
+static u32 irdma_set_hw_rsrc(struct irdma_pci_f *rf)
+{
+	rf->allocated_qps = (void *)(rf->mem_rsrc +
+		   (sizeof(struct irdma_arp_entry) * rf->arp_table_size));
+	rf->allocated_cqs = &rf->allocated_qps[BITS_TO_LONGS(rf->max_qp)];
+	rf->allocated_mrs = &rf->allocated_cqs[BITS_TO_LONGS(rf->max_cq)];
+	rf->allocated_pds = &rf->allocated_mrs[BITS_TO_LONGS(rf->max_mr)];
+	rf->allocated_ahs = &rf->allocated_pds[BITS_TO_LONGS(rf->max_pd)];
+	rf->allocated_mcgs = &rf->allocated_ahs[BITS_TO_LONGS(rf->max_ah)];
+	rf->allocated_arps = &rf->allocated_mcgs[BITS_TO_LONGS(rf->max_mcg)];
+	rf->qp_table = (struct irdma_qp **)
+		(&rf->allocated_arps[BITS_TO_LONGS(rf->arp_table_size)]);
+
+	spin_lock_init(&rf->rsrc_lock);
+	spin_lock_init(&rf->arp_lock);
+	spin_lock_init(&rf->qptable_lock);
+	spin_lock_init(&rf->qh_list_lock);
+
+	return 0;
+}
+
+/**
+ * irdma_calc_mem_rsrc_size - calculate memory resources size.
+ * @rf: RDMA PCI function
+ */
+static u32 irdma_calc_mem_rsrc_size(struct irdma_pci_f *rf)
+{
+	u32 rsrc_size;
+
+	rsrc_size = sizeof(struct irdma_arp_entry) * rf->arp_table_size;
+	rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_qp);
+	rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_mr);
+	rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_cq);
+	rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_pd);
+	rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->arp_table_size);
+	rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_ah);
+	rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_mcg);
+	rsrc_size += sizeof(struct irdma_qp **) * rf->max_qp;
+
+	return rsrc_size;
+}
+
+/**
+ * irdma_initialize_hw_rsrc - initialize hw resource tracking array
+ * @rf: RDMA PCI function
+ */
+u32 irdma_initialize_hw_rsrc(struct irdma_pci_f *rf)
+{
+	u32 rsrc_size;
+	u32 mrdrvbits;
+	u32 ret;
+
+	if (rf->rdma_ver != IRDMA_GEN_1) {
+		rf->allocated_ws_nodes =
+			kcalloc(BITS_TO_LONGS(IRDMA_MAX_WS_NODES),
+				sizeof(unsigned long), GFP_KERNEL);
+		if (!rf->allocated_ws_nodes)
+			return -ENOMEM;
+
+		set_bit(0, rf->allocated_ws_nodes);
+		rf->max_ws_node_id = IRDMA_MAX_WS_NODES;
+	}
+	rf->max_cqe = rf->sc_dev.hw_attrs.uk_attrs.max_hw_cq_size;
+	rf->max_qp = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt;
+	rf->max_mr = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt;
+	rf->max_cq = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt;
+	rf->max_pd = rf->sc_dev.hw_attrs.max_hw_pds;
+	rf->arp_table_size = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_ARP].cnt;
+	rf->max_ah = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt;
+	rf->max_mcg = rf->max_qp;
+
+	rsrc_size = irdma_calc_mem_rsrc_size(rf);
+	rf->mem_rsrc = kzalloc(rsrc_size, GFP_KERNEL);
+	if (!rf->mem_rsrc) {
+		ret = -ENOMEM;
+		goto mem_rsrc_kzalloc_fail;
+	}
+
+	rf->arp_table = (struct irdma_arp_entry *)rf->mem_rsrc;
+
+	ret = irdma_set_hw_rsrc(rf);
+	if (ret)
+		goto set_hw_rsrc_fail;
+
+	set_bit(0, rf->allocated_mrs);
+	set_bit(0, rf->allocated_qps);
+	set_bit(0, rf->allocated_cqs);
+	set_bit(0, rf->allocated_pds);
+	set_bit(0, rf->allocated_arps);
+	set_bit(0, rf->allocated_ahs);
+	set_bit(0, rf->allocated_mcgs);
+	set_bit(2, rf->allocated_qps); /* qp 2 IEQ */
+	set_bit(1, rf->allocated_qps); /* qp 1 ILQ */
+	set_bit(1, rf->allocated_cqs);
+	set_bit(1, rf->allocated_pds);
+	set_bit(2, rf->allocated_cqs);
+	set_bit(2, rf->allocated_pds);
+
+	INIT_LIST_HEAD(&rf->mc_qht_list.list);
+	/* stag index mask has a minimum of 14 bits */
+	mrdrvbits = 24 - max(get_count_order(rf->max_mr), 14);
+	rf->mr_stagmask = ~(((1 << mrdrvbits) - 1) << (32 - mrdrvbits));
+
+	return 0;
+
+set_hw_rsrc_fail:
+	kfree(rf->mem_rsrc);
+	rf->mem_rsrc = NULL;
+mem_rsrc_kzalloc_fail:
+	kfree(rf->allocated_ws_nodes);
+	rf->allocated_ws_nodes = NULL;
+
+	return ret;
+}
+
+/**
+ * irdma_cqp_ce_handler - handle cqp completions
+ * @rf: RDMA PCI function
+ * @cq: cq for cqp completions
+ */
+void irdma_cqp_ce_handler(struct irdma_pci_f *rf, struct irdma_sc_cq *cq)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	u32 cqe_count = 0;
+	struct irdma_ccq_cqe_info info;
+	unsigned long flags;
+	int ret;
+
+	do {
+		memset(&info, 0, sizeof(info));
+		spin_lock_irqsave(&rf->cqp.compl_lock, flags);
+		ret = dev->ccq_ops->ccq_get_cqe_info(cq, &info);
+		spin_unlock_irqrestore(&rf->cqp.compl_lock, flags);
+		if (ret)
+			break;
+
+		cqp_request = (struct irdma_cqp_request *)
+			      (unsigned long)info.scratch;
+		if (info.error && irdma_cqp_crit_err(dev, cqp_request->info.cqp_cmd,
+						     info.maj_err_code,
+						     info.min_err_code))
+			ibdev_err(&rf->iwdev->ibdev, "cqp opcode = 0x%x maj_err_code = 0x%x min_err_code = 0x%x\n",
+				  info.op_code, info.maj_err_code, info.min_err_code);
+		if (cqp_request) {
+			cqp_request->compl_info.maj_err_code = info.maj_err_code;
+			cqp_request->compl_info.min_err_code = info.min_err_code;
+			cqp_request->compl_info.op_ret_val = info.op_ret_val;
+			cqp_request->compl_info.error = info.error;
+
+			if (cqp_request->waiting) {
+				cqp_request->request_done = true;
+				wake_up(&cqp_request->waitq);
+				irdma_put_cqp_request(&rf->cqp, cqp_request);
+			} else {
+				if (cqp_request->callback_fcn)
+					cqp_request->callback_fcn(cqp_request);
+				irdma_put_cqp_request(&rf->cqp, cqp_request);
+			}
+		}
+
+		cqe_count++;
+	} while (1);
+
+	if (cqe_count) {
+		irdma_process_bh(dev);
+		dev->ccq_ops->ccq_arm(cq);
+	}
+}
+
+/**
+ * cqp_compl_worker - Handle cqp completions
+ * @work: Pointer to work structure
+ */
+void cqp_compl_worker(struct work_struct *work)
+{
+	struct irdma_pci_f *rf = container_of(work, struct irdma_pci_f,
+					      cqp_cmpl_work);
+	struct irdma_sc_cq *cq = &rf->ccq.sc_cq;
+
+	irdma_cqp_ce_handler(rf, cq);
+}
+
+/**
+ * irdma_lookup_apbvt_entry - lookup hash table for an existing apbvt entry corresponding to port
+ * @cm_core: cm's core
+ * @port: port to identify apbvt entry
+ */
+static struct irdma_apbvt_entry *irdma_lookup_apbvt_entry(struct irdma_cm_core *cm_core,
+							  u16 port)
+{
+	struct irdma_apbvt_entry *entry;
+
+	hash_for_each_possible(cm_core->apbvt_hash_tbl, entry, hlist, port) {
+		if (entry->port == port) {
+			entry->use_cnt++;
+			return entry;
+		}
+	}
+
+	return NULL;
+}
+
+/**
+ * irdma_next_iw_state - modify qp state
+ * @iwqp: iwarp qp to modify
+ * @state: next state for qp
+ * @del_hash: del hash
+ * @term: term message
+ * @termlen: length of term message
+ */
+void irdma_next_iw_state(struct irdma_qp *iwqp, u8 state, u8 del_hash, u8 term,
+			 u8 termlen)
+{
+	struct irdma_modify_qp_info info = {};
+
+	info.next_iwarp_state = state;
+	info.remove_hash_idx = del_hash;
+	info.cq_num_valid = true;
+	info.arp_cache_idx_valid = true;
+	info.dont_send_term = true;
+	info.dont_send_fin = true;
+	info.termlen = termlen;
+
+	if (term & IRDMAQP_TERM_SEND_TERM_ONLY)
+		info.dont_send_term = false;
+	if (term & IRDMAQP_TERM_SEND_FIN_ONLY)
+		info.dont_send_fin = false;
+	if (iwqp->sc_qp.term_flags && state == IRDMA_QP_STATE_ERROR)
+		info.reset_tcp_conn = true;
+	iwqp->hw_iwarp_state = state;
+	irdma_hw_modify_qp(iwqp->iwdev, iwqp, &info, 0);
+	iwqp->iwarp_state = info.next_iwarp_state;
+}
+
+/**
+ * irdma_del_local_mac_entry - remove a mac entry from the hw
+ * table
+ * @rf: RDMA PCI function
+ * @idx: the index of the mac ip address to delete
+ */
+void irdma_del_local_mac_entry(struct irdma_pci_f *rf, u16 idx)
+{
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, true);
+	if (!cqp_request)
+		return;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = IRDMA_OP_DELETE_LOCAL_MAC_ENTRY;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.del_local_mac_entry.cqp = &iwcqp->sc_cqp;
+	cqp_info->in.u.del_local_mac_entry.scratch = (uintptr_t)cqp_request;
+	cqp_info->in.u.del_local_mac_entry.entry_idx = idx;
+	cqp_info->in.u.del_local_mac_entry.ignore_ref_count = 0;
+
+	irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(iwcqp, cqp_request);
+}
+
+/**
+ * irdma_add_local_mac_entry - add a mac ip address entry to the
+ * hw table
+ * @rf: RDMA PCI function
+ * @mac_addr: pointer to mac address
+ * @idx: the index of the mac ip address to add
+ */
+int irdma_add_local_mac_entry(struct irdma_pci_f *rf, u8 *mac_addr, u16 idx)
+{
+	struct irdma_local_mac_entry_info *info;
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->post_sq = 1;
+	info = &cqp_info->in.u.add_local_mac_entry.info;
+	ether_addr_copy(info->mac_addr, mac_addr);
+	info->entry_idx = idx;
+	cqp_info->in.u.add_local_mac_entry.scratch = (uintptr_t)cqp_request;
+	cqp_info->cqp_cmd = IRDMA_OP_ADD_LOCAL_MAC_ENTRY;
+	cqp_info->in.u.add_local_mac_entry.cqp = &iwcqp->sc_cqp;
+	cqp_info->in.u.add_local_mac_entry.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(iwcqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_alloc_local_mac_entry - allocate a mac entry
+ * @rf: RDMA PCI function
+ * @mac_tbl_idx: the index of the new mac address
+ *
+ * Allocate a mac address entry and update the mac_tbl_idx
+ * to hold the index of the newly created mac address
+ * Return 0 if successful, otherwise return error
+ */
+int irdma_alloc_local_mac_entry(struct irdma_pci_f *rf, u16 *mac_tbl_idx)
+{
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status = 0;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.alloc_local_mac_entry.cqp = &iwcqp->sc_cqp;
+	cqp_info->in.u.alloc_local_mac_entry.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	if (!status)
+		*mac_tbl_idx = (u16)cqp_request->compl_info.op_ret_val;
+
+	irdma_put_cqp_request(iwcqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_manage_apbvt_cmd - send cqp command manage apbvt
+ * @iwdev: irdma device
+ * @accel_local_port: port for apbvt
+ * @add_port: add ordelete port
+ */
+static enum irdma_status_code
+irdma_cqp_manage_apbvt_cmd(struct irdma_device *iwdev, u16 accel_local_port,
+			   bool add_port)
+{
+	struct irdma_apbvt_info *info;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, add_port);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	info = &cqp_info->in.u.manage_apbvt_entry.info;
+	memset(info, 0, sizeof(*info));
+	info->add = add_port;
+	info->port = accel_local_port;
+	cqp_info->cqp_cmd = IRDMA_OP_MANAGE_APBVT_ENTRY;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.manage_apbvt_entry.cqp = &iwdev->rf->cqp.sc_cqp;
+	cqp_info->in.u.manage_apbvt_entry.scratch = (uintptr_t)cqp_request;
+	ibdev_dbg(&iwdev->ibdev, "DEV: %s: port=0x%04x\n",
+		  (!add_port) ? "DELETE" : "ADD", accel_local_port);
+
+	status = irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_add_apbvt - add tcp port to HW apbvt table
+ * @iwdev: irdma device
+ * @port: port for apbvt
+ */
+struct irdma_apbvt_entry *irdma_add_apbvt(struct irdma_device *iwdev, u16 port)
+{
+	struct irdma_cm_core *cm_core = &iwdev->cm_core;
+	struct irdma_apbvt_entry *entry;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cm_core->apbvt_lock, flags);
+	entry = irdma_lookup_apbvt_entry(cm_core, port);
+	if (entry) {
+		spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
+		return entry;
+	}
+
+	entry = kzalloc(sizeof(*entry), GFP_ATOMIC);
+	if (!entry) {
+		spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
+		return NULL;
+	}
+
+	entry->port = port;
+	entry->use_cnt = 1;
+	hash_add(cm_core->apbvt_hash_tbl, &entry->hlist, entry->port);
+	spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
+
+	if (irdma_cqp_manage_apbvt_cmd(iwdev, port, true)) {
+		kfree(entry);
+		return NULL;
+	}
+
+	return entry;
+}
+
+/**
+ * irdma_del_apbvt - delete tcp port from HW apbvt table
+ * @iwdev: irdma device
+ * @entry: apbvt entry object
+ */
+void irdma_del_apbvt(struct irdma_device *iwdev,
+		     struct irdma_apbvt_entry *entry)
+{
+	struct irdma_cm_core *cm_core = &iwdev->cm_core;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cm_core->apbvt_lock, flags);
+	if (--entry->use_cnt) {
+		spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
+		return;
+	}
+
+	hash_del(&entry->hlist);
+	/* apbvt_lock is held across CQP delete APBVT OP (non-waiting) to
+	 * protect against race where add APBVT CQP can race ahead of the delete
+	 * APBVT for same port.
+	 */
+	irdma_cqp_manage_apbvt_cmd(iwdev, entry->port, false);
+	kfree(entry);
+	spin_unlock_irqrestore(&cm_core->apbvt_lock, flags);
+}
+
+/**
+ * irdma_manage_arp_cache - manage hw arp cache
+ * @rf: RDMA PCI function
+ * @mac_addr: mac address ptr
+ * @ip_addr: ip addr for arp cache
+ * @ipv4: flag inicating IPv4
+ * @action: add, delete or modify
+ */
+void irdma_manage_arp_cache(struct irdma_pci_f *rf, unsigned char *mac_addr,
+			    u32 *ip_addr, bool ipv4, u32 action)
+{
+	struct irdma_add_arp_cache_entry_info *info;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	int arp_index;
+
+	arp_index = irdma_arp_table(rf, ip_addr, ipv4, mac_addr, action);
+	if (arp_index == -1)
+		return;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, false);
+	if (!cqp_request)
+		return;
+
+	cqp_info = &cqp_request->info;
+	if (action == IRDMA_ARP_ADD) {
+		cqp_info->cqp_cmd = IRDMA_OP_ADD_ARP_CACHE_ENTRY;
+		info = &cqp_info->in.u.add_arp_cache_entry.info;
+		memset(info, 0, sizeof(*info));
+		info->arp_index = (u16)arp_index;
+		info->permanent = true;
+		ether_addr_copy(info->mac_addr, mac_addr);
+		cqp_info->in.u.add_arp_cache_entry.scratch =
+			(uintptr_t)cqp_request;
+		cqp_info->in.u.add_arp_cache_entry.cqp = &rf->cqp.sc_cqp;
+	} else {
+		cqp_info->cqp_cmd = IRDMA_OP_DELETE_ARP_CACHE_ENTRY;
+		cqp_info->in.u.del_arp_cache_entry.scratch =
+			(uintptr_t)cqp_request;
+		cqp_info->in.u.del_arp_cache_entry.cqp = &rf->cqp.sc_cqp;
+		cqp_info->in.u.del_arp_cache_entry.arp_index = arp_index;
+	}
+
+	cqp_info->post_sq = 1;
+	irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+}
+
+/**
+ * irdma_send_syn_cqp_callback - do syn/ack after qhash
+ * @cqp_request: qhash cqp completion
+ */
+static void irdma_send_syn_cqp_callback(struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_cm_node *cm_node = cqp_request->param;
+
+	irdma_send_syn(cm_node, 1);
+	irdma_rem_ref_cm_node(cm_node);
+}
+
+/**
+ * irdma_manage_qhash - add or modify qhash
+ * @iwdev: irdma device
+ * @cminfo: cm info for qhash
+ * @etype: type (syn or quad)
+ * @mtype: type of qhash
+ * @cmnode: cmnode associated with connection
+ * @wait: wait for completion
+ */
+enum irdma_status_code
+irdma_manage_qhash(struct irdma_device *iwdev, struct irdma_cm_info *cminfo,
+		   enum irdma_quad_entry_type etype,
+		   enum irdma_quad_hash_manage_type mtype, void *cmnode,
+		   bool wait)
+{
+	struct irdma_qhash_table_info *info;
+	enum irdma_status_code status;
+	struct irdma_cqp *iwcqp = &iwdev->rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_cm_node *cm_node = cmnode;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, wait);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	info = &cqp_info->in.u.manage_qhash_table_entry.info;
+	memset(info, 0, sizeof(*info));
+	info->vsi = &iwdev->vsi;
+	info->manage = mtype;
+	info->entry_type = etype;
+	if (cminfo->vlan_id < VLAN_N_VID) {
+		info->vlan_valid = true;
+		info->vlan_id = cminfo->vlan_id;
+	} else {
+		info->vlan_valid = false;
+	}
+	info->ipv4_valid = cminfo->ipv4;
+	info->user_pri = cminfo->user_pri;
+	ether_addr_copy(info->mac_addr, iwdev->netdev->dev_addr);
+	info->qp_num = cminfo->qh_qpid;
+	info->dest_port = cminfo->loc_port;
+	info->dest_ip[0] = cminfo->loc_addr[0];
+	info->dest_ip[1] = cminfo->loc_addr[1];
+	info->dest_ip[2] = cminfo->loc_addr[2];
+	info->dest_ip[3] = cminfo->loc_addr[3];
+	if (etype == IRDMA_QHASH_TYPE_TCP_ESTABLISHED ||
+	    etype == IRDMA_QHASH_TYPE_UDP_UNICAST ||
+	    etype == IRDMA_QHASH_TYPE_UDP_MCAST ||
+	    etype == IRDMA_QHASH_TYPE_ROCE_MCAST ||
+	    etype == IRDMA_QHASH_TYPE_ROCEV2_HW) {
+		info->src_port = cminfo->rem_port;
+		info->src_ip[0] = cminfo->rem_addr[0];
+		info->src_ip[1] = cminfo->rem_addr[1];
+		info->src_ip[2] = cminfo->rem_addr[2];
+		info->src_ip[3] = cminfo->rem_addr[3];
+	}
+	if (cmnode) {
+		cqp_request->callback_fcn = irdma_send_syn_cqp_callback;
+		cqp_request->param = cmnode;
+		if (!wait)
+			refcount_inc(&cm_node->refcnt);
+	}
+	if (info->ipv4_valid)
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: %s caller: %pS loc_port=0x%04x rem_port=0x%04x loc_addr=%pI4 rem_addr=%pI4 mac=%pM, vlan_id=%d cm_node=%p\n",
+			  (!mtype) ? "DELETE" : "ADD",
+			  __builtin_return_address(0), info->dest_port,
+			  info->src_port, info->dest_ip, info->src_ip,
+			  info->mac_addr, cminfo->vlan_id,
+			  cmnode ? cmnode : NULL);
+	else
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: %s caller: %pS loc_port=0x%04x rem_port=0x%04x loc_addr=%pI6 rem_addr=%pI6 mac=%pM, vlan_id=%d cm_node=%p\n",
+			  (!mtype) ? "DELETE" : "ADD",
+			  __builtin_return_address(0), info->dest_port,
+			  info->src_port, info->dest_ip, info->src_ip,
+			  info->mac_addr, cminfo->vlan_id,
+			  cmnode ? cmnode : NULL);
+
+	cqp_info->in.u.manage_qhash_table_entry.cqp = &iwdev->rf->cqp.sc_cqp;
+	cqp_info->in.u.manage_qhash_table_entry.scratch = (uintptr_t)cqp_request;
+	cqp_info->cqp_cmd = IRDMA_OP_MANAGE_QHASH_TABLE_ENTRY;
+	cqp_info->post_sq = 1;
+	status = irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	if (status && cm_node && !wait)
+		irdma_rem_ref_cm_node(cm_node);
+
+	irdma_put_cqp_request(iwcqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_hw_flush_wqes_callback - Check return code after flush
+ * @cqp_request: qhash cqp completion
+ */
+static void irdma_hw_flush_wqes_callback(struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_qp_flush_info *hw_info;
+	struct irdma_sc_qp *qp;
+	struct irdma_qp *iwqp;
+	struct cqp_cmds_info *cqp_info;
+
+	cqp_info = &cqp_request->info;
+	hw_info = &cqp_info->in.u.qp_flush_wqes.info;
+	qp = cqp_info->in.u.qp_flush_wqes.qp;
+	iwqp = qp->qp_uk.back_qp;
+
+	if (cqp_request->compl_info.maj_err_code)
+		return;
+
+	if (hw_info->rq &&
+	    (cqp_request->compl_info.min_err_code == IRDMA_CQP_COMPL_SQ_WQE_FLUSHED ||
+	     cqp_request->compl_info.min_err_code == 0)) {
+		/* RQ WQE flush was requested but did not happen */
+		qp->qp_uk.rq_flush_complete = true;
+	}
+	if (hw_info->sq &&
+	    (cqp_request->compl_info.min_err_code == IRDMA_CQP_COMPL_RQ_WQE_FLUSHED ||
+	     cqp_request->compl_info.min_err_code == 0)) {
+		if (IRDMA_RING_MORE_WORK(qp->qp_uk.sq_ring)) {
+			ibdev_err(&iwqp->iwdev->ibdev, "Flush QP[%d] failed, SQ has more work",
+				  qp->qp_uk.qp_id);
+			irdma_ib_qp_event(iwqp, IRDMA_QP_EVENT_CATASTROPHIC);
+		}
+		qp->qp_uk.sq_flush_complete = true;
+	}
+}
+
+/**
+ * irdma_hw_flush_wqes - flush qp's wqe
+ * @rf: RDMA PCI function
+ * @qp: hardware control qp
+ * @info: info for flush
+ * @wait: flag wait for completion
+ */
+enum irdma_status_code irdma_hw_flush_wqes(struct irdma_pci_f *rf,
+					   struct irdma_sc_qp *qp,
+					   struct irdma_qp_flush_info *info,
+					   bool wait)
+{
+	enum irdma_status_code status;
+	struct irdma_qp_flush_info *hw_info;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_qp *iwqp = qp->qp_uk.back_qp;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, wait);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	if (!wait)
+		cqp_request->callback_fcn = irdma_hw_flush_wqes_callback;
+	hw_info = &cqp_request->info.in.u.qp_flush_wqes.info;
+	memcpy(hw_info, info, sizeof(*hw_info));
+	cqp_info->cqp_cmd = IRDMA_OP_QP_FLUSH_WQES;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.qp_flush_wqes.qp = qp;
+	cqp_info->in.u.qp_flush_wqes.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	if (status) {
+		qp->qp_uk.sq_flush_complete = true;
+		qp->qp_uk.rq_flush_complete = true;
+		irdma_put_cqp_request(&rf->cqp, cqp_request);
+		return status;
+	}
+
+	if (!wait || cqp_request->compl_info.maj_err_code)
+		goto put_cqp;
+
+	if (info->rq) {
+		if (cqp_request->compl_info.min_err_code == IRDMA_CQP_COMPL_SQ_WQE_FLUSHED ||
+		    cqp_request->compl_info.min_err_code == 0) {
+			/* RQ WQE flush was requested but did not happen */
+			qp->qp_uk.rq_flush_complete = true;
+		}
+	}
+	if (info->sq) {
+		if (cqp_request->compl_info.min_err_code == IRDMA_CQP_COMPL_RQ_WQE_FLUSHED ||
+		    cqp_request->compl_info.min_err_code == 0) {
+			/*
+			 * Handling case where WQE is posted to empty SQ when
+			 * flush has not completed
+			 */
+			if (IRDMA_RING_MORE_WORK(qp->qp_uk.sq_ring)) {
+				struct irdma_cqp_request *new_req;
+
+				if (!qp->qp_uk.sq_flush_complete)
+					goto put_cqp;
+				qp->qp_uk.sq_flush_complete = false;
+				qp->flush_sq = false;
+
+				info->rq = false;
+				info->sq = true;
+				new_req = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+				if (!new_req) {
+					status = IRDMA_ERR_NO_MEMORY;
+					goto put_cqp;
+				}
+				cqp_info = &new_req->info;
+				hw_info = &new_req->info.in.u.qp_flush_wqes.info;
+				memcpy(hw_info, info, sizeof(*hw_info));
+				cqp_info->cqp_cmd = IRDMA_OP_QP_FLUSH_WQES;
+				cqp_info->post_sq = 1;
+				cqp_info->in.u.qp_flush_wqes.qp = qp;
+				cqp_info->in.u.qp_flush_wqes.scratch = (uintptr_t)new_req;
+
+				status = irdma_handle_cqp_op(rf, new_req);
+				if (new_req->compl_info.maj_err_code ||
+				    new_req->compl_info.min_err_code != IRDMA_CQP_COMPL_SQ_WQE_FLUSHED ||
+				    status) {
+					ibdev_err(&iwqp->iwdev->ibdev, "fatal QP event: SQ in error but not flushed, qp: %d",
+						  iwqp->ibqp.qp_num);
+					qp->qp_uk.sq_flush_complete = false;
+					irdma_ib_qp_event(iwqp, IRDMA_QP_EVENT_CATASTROPHIC);
+				}
+				irdma_put_cqp_request(&rf->cqp, new_req);
+			} else {
+				/* SQ WQE flush was requested but did not happen */
+				qp->qp_uk.sq_flush_complete = true;
+			}
+		} else {
+			if (!IRDMA_RING_MORE_WORK(qp->qp_uk.sq_ring))
+				qp->qp_uk.sq_flush_complete = true;
+		}
+	}
+
+	ibdev_dbg(&rf->iwdev->ibdev,
+		  "VERBS: qp_id=%d qp_type=%d qpstate=%d ibqpstate=%d last_aeq=%d hw_iw_state=%d maj_err_code=%d min_err_code=%d\n",
+		  iwqp->ibqp.qp_num, rf->protocol_used, iwqp->iwarp_state,
+		  iwqp->ibqp_state, iwqp->last_aeq, iwqp->hw_iwarp_state,
+		  cqp_request->compl_info.maj_err_code,
+		  cqp_request->compl_info.min_err_code);
+put_cqp:
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_gen_ae - generate AE
+ * @rf: RDMA PCI function
+ * @qp: qp associated with AE
+ * @info: info for ae
+ * @wait: wait for completion
+ */
+void irdma_gen_ae(struct irdma_pci_f *rf, struct irdma_sc_qp *qp,
+		  struct irdma_gen_ae_info *info, bool wait)
+{
+	struct irdma_gen_ae_info *ae_info;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, wait);
+	if (!cqp_request)
+		return;
+
+	cqp_info = &cqp_request->info;
+	ae_info = &cqp_request->info.in.u.gen_ae.info;
+	memcpy(ae_info, info, sizeof(*ae_info));
+	cqp_info->cqp_cmd = IRDMA_OP_GEN_AE;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.gen_ae.qp = qp;
+	cqp_info->in.u.gen_ae.scratch = (uintptr_t)cqp_request;
+
+	irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+}
+
+void irdma_flush_wqes(struct irdma_qp *iwqp, u32 flush_mask)
+{
+	struct irdma_qp_flush_info info = {};
+	struct irdma_pci_f *rf = iwqp->iwdev->rf;
+	u8 flush_code = iwqp->sc_qp.flush_code;
+
+	if (!(flush_mask & IRDMA_FLUSH_SQ) && !(flush_mask & IRDMA_FLUSH_RQ))
+		return;
+
+	/* Set flush info fields*/
+	info.sq = flush_mask & IRDMA_FLUSH_SQ;
+	info.rq = flush_mask & IRDMA_FLUSH_RQ;
+
+	if (flush_mask & IRDMA_REFLUSH) {
+		if (info.sq)
+			iwqp->sc_qp.flush_sq = false;
+		if (info.rq)
+			iwqp->sc_qp.flush_rq = false;
+	}
+
+	/* Generate userflush errors in CQE */
+	info.sq_major_code = IRDMA_FLUSH_MAJOR_ERR;
+	info.sq_minor_code = FLUSH_GENERAL_ERR;
+	info.rq_major_code = IRDMA_FLUSH_MAJOR_ERR;
+	info.rq_minor_code = FLUSH_GENERAL_ERR;
+	info.userflushcode = true;
+	if (flush_code) {
+		if (info.sq && iwqp->sc_qp.sq_flush_code)
+			info.sq_minor_code = flush_code;
+		if (info.rq && iwqp->sc_qp.rq_flush_code)
+			info.rq_minor_code = flush_code;
+	}
+
+	/* Issue flush */
+	(void)irdma_hw_flush_wqes(rf, &iwqp->sc_qp, &info,
+				  flush_mask & IRDMA_FLUSH_WAIT);
+	iwqp->flush_issued = true;
+}
diff --git a/drivers/infiniband/hw/irdma/i40iw_hw.c b/drivers/infiniband/hw/irdma/i40iw_hw.c
new file mode 100644
index 0000000..d2ed15e
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/i40iw_hw.c
@@ -0,0 +1,216 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "osdep.h"
+#include "type.h"
+#include "i40iw_hw.h"
+#include "status.h"
+#include "protos.h"
+
+static u32 i40iw_regs[IRDMA_MAX_REGS] = {
+	I40E_PFPE_CQPTAIL,
+	I40E_PFPE_CQPDB,
+	I40E_PFPE_CCQPSTATUS,
+	I40E_PFPE_CCQPHIGH,
+	I40E_PFPE_CCQPLOW,
+	I40E_PFPE_CQARM,
+	I40E_PFPE_CQACK,
+	I40E_PFPE_AEQALLOC,
+	I40E_PFPE_CQPERRCODES,
+	I40E_PFPE_WQEALLOC,
+	I40E_PFINT_DYN_CTLN(0),
+	I40IW_DB_ADDR_OFFSET,
+
+	I40E_GLPCI_LBARCTRL,
+	I40E_GLPE_CPUSTATUS0,
+	I40E_GLPE_CPUSTATUS1,
+	I40E_GLPE_CPUSTATUS2,
+	I40E_PFINT_AEQCTL,
+	I40E_PFINT_CEQCTL(0),
+	I40E_VSIQF_CTL(0),
+	I40E_PFHMC_PDINV,
+	I40E_GLHMC_VFPDINV(0),
+	I40E_GLPE_CRITERR,
+	0xffffffff      /* PFINT_RATEN not used in FPK */
+};
+
+static u32 i40iw_stat_offsets_32[IRDMA_HW_STAT_INDEX_MAX_32] = {
+	I40E_GLPES_PFIP4RXDISCARD(0),
+	I40E_GLPES_PFIP4RXTRUNC(0),
+	I40E_GLPES_PFIP4TXNOROUTE(0),
+	I40E_GLPES_PFIP6RXDISCARD(0),
+	I40E_GLPES_PFIP6RXTRUNC(0),
+	I40E_GLPES_PFIP6TXNOROUTE(0),
+	I40E_GLPES_PFTCPRTXSEG(0),
+	I40E_GLPES_PFTCPRXOPTERR(0),
+	I40E_GLPES_PFTCPRXPROTOERR(0),
+	I40E_GLPES_PFRXVLANERR(0)
+};
+
+static u32 i40iw_stat_offsets_64[IRDMA_HW_STAT_INDEX_MAX_64] = {
+	I40E_GLPES_PFIP4RXOCTSLO(0),
+	I40E_GLPES_PFIP4RXPKTSLO(0),
+	I40E_GLPES_PFIP4RXFRAGSLO(0),
+	I40E_GLPES_PFIP4RXMCPKTSLO(0),
+	I40E_GLPES_PFIP4TXOCTSLO(0),
+	I40E_GLPES_PFIP4TXPKTSLO(0),
+	I40E_GLPES_PFIP4TXFRAGSLO(0),
+	I40E_GLPES_PFIP4TXMCPKTSLO(0),
+	I40E_GLPES_PFIP6RXOCTSLO(0),
+	I40E_GLPES_PFIP6RXPKTSLO(0),
+	I40E_GLPES_PFIP6RXFRAGSLO(0),
+	I40E_GLPES_PFIP6RXMCPKTSLO(0),
+	I40E_GLPES_PFIP6TXOCTSLO(0),
+	I40E_GLPES_PFIP6TXPKTSLO(0),
+	I40E_GLPES_PFIP6TXFRAGSLO(0),
+	I40E_GLPES_PFIP6TXMCPKTSLO(0),
+	I40E_GLPES_PFTCPRXSEGSLO(0),
+	I40E_GLPES_PFTCPTXSEGLO(0),
+	I40E_GLPES_PFRDMARXRDSLO(0),
+	I40E_GLPES_PFRDMARXSNDSLO(0),
+	I40E_GLPES_PFRDMARXWRSLO(0),
+	I40E_GLPES_PFRDMATXRDSLO(0),
+	I40E_GLPES_PFRDMATXSNDSLO(0),
+	I40E_GLPES_PFRDMATXWRSLO(0),
+	I40E_GLPES_PFRDMAVBNDLO(0),
+	I40E_GLPES_PFRDMAVINVLO(0),
+	I40E_GLPES_PFIP4RXMCOCTSLO(0),
+	I40E_GLPES_PFIP4TXMCOCTSLO(0),
+	I40E_GLPES_PFIP6RXMCOCTSLO(0),
+	I40E_GLPES_PFIP6TXMCOCTSLO(0),
+	I40E_GLPES_PFUDPRXPKTSLO(0),
+	I40E_GLPES_PFUDPTXPKTSLO(0)
+};
+
+static u64 i40iw_masks[IRDMA_MAX_MASKS] = {
+	I40E_PFPE_CCQPSTATUS_CCQP_DONE,
+	I40E_PFPE_CCQPSTATUS_CCQP_ERR,
+	I40E_CQPSQ_STAG_PDID,
+	I40E_CQPSQ_CQ_CEQID,
+	I40E_CQPSQ_CQ_CQID,
+	I40E_COMMIT_FPM_CQCNT,
+};
+
+static u64 i40iw_shifts[IRDMA_MAX_SHIFTS] = {
+	I40E_PFPE_CCQPSTATUS_CCQP_DONE_S,
+	I40E_PFPE_CCQPSTATUS_CCQP_ERR_S,
+	I40E_CQPSQ_STAG_PDID_S,
+	I40E_CQPSQ_CQ_CEQID_S,
+	I40E_CQPSQ_CQ_CQID_S,
+	I40E_COMMIT_FPM_CQCNT_S,
+};
+
+static struct irdma_irq_ops i40iw_irq_ops;
+
+/**
+ * i40iw_config_ceq- Configure CEQ interrupt
+ * @dev: pointer to the device structure
+ * @ceq_id: Completion Event Queue ID
+ * @idx: vector index
+ * @enable: Enable CEQ interrupt when true
+ */
+static void i40iw_config_ceq(struct irdma_sc_dev *dev, u32 ceq_id, u32 idx,
+			     bool enable)
+{
+	u32 reg_val;
+
+	reg_val = FIELD_PREP(I40E_PFINT_LNKLSTN_FIRSTQ_INDX, ceq_id) |
+		  FIELD_PREP(I40E_PFINT_LNKLSTN_FIRSTQ_TYPE, QUEUE_TYPE_CEQ);
+	wr32(dev->hw, I40E_PFINT_LNKLSTN(idx - 1), reg_val);
+
+	reg_val = FIELD_PREP(I40E_PFINT_DYN_CTLN_ITR_INDX, 0x3) |
+		  FIELD_PREP(I40E_PFINT_DYN_CTLN_INTENA, 0x1);
+	wr32(dev->hw, I40E_PFINT_DYN_CTLN(idx - 1), reg_val);
+
+	reg_val = FIELD_PREP(IRDMA_GLINT_CEQCTL_CAUSE_ENA, enable) |
+		  FIELD_PREP(IRDMA_GLINT_CEQCTL_MSIX_INDX, idx) |
+		  FIELD_PREP(I40E_PFINT_CEQCTL_NEXTQ_INDX, NULL_QUEUE_INDEX) |
+		  FIELD_PREP(IRDMA_GLINT_CEQCTL_ITR_INDX, 0x3);
+
+	wr32(dev->hw, i40iw_regs[IRDMA_GLINT_CEQCTL] + 4 * ceq_id, reg_val);
+}
+
+/**
+ * i40iw_ena_irq - Enable interrupt
+ * @dev: pointer to the device structure
+ * @idx: vector index
+ */
+static void i40iw_ena_irq(struct irdma_sc_dev *dev, u32 idx)
+{
+	u32 val;
+
+	val = FIELD_PREP(IRDMA_GLINT_DYN_CTL_INTENA, 0x1) |
+	      FIELD_PREP(IRDMA_GLINT_DYN_CTL_CLEARPBA, 0x1) |
+	      FIELD_PREP(IRDMA_GLINT_DYN_CTL_ITR_INDX, 0x3);
+	wr32(dev->hw, i40iw_regs[IRDMA_GLINT_DYN_CTL] + 4 * (idx - 1), val);
+}
+
+/**
+ * i40iw_disable_irq - Disable interrupt
+ * @dev: pointer to the device structure
+ * @idx: vector index
+ */
+static void i40iw_disable_irq(struct irdma_sc_dev *dev, u32 idx)
+{
+	wr32(dev->hw, i40iw_regs[IRDMA_GLINT_DYN_CTL] + 4 * (idx - 1), 0);
+}
+
+void i40iw_init_hw(struct irdma_sc_dev *dev)
+{
+	int i;
+	u8 __iomem *hw_addr;
+
+	for (i = 0; i < IRDMA_MAX_REGS; ++i) {
+		hw_addr = dev->hw->hw_addr;
+
+		if (i == IRDMA_DB_ADDR_OFFSET)
+			hw_addr = NULL;
+
+		dev->hw_regs[i] = (u32 __iomem *)(i40iw_regs[i] + hw_addr);
+	}
+
+	for (i = 0; i < IRDMA_HW_STAT_INDEX_MAX_32; ++i)
+		dev->hw_stats_regs_32[i] = i40iw_stat_offsets_32[i];
+
+	for (i = 0; i < IRDMA_HW_STAT_INDEX_MAX_64; ++i)
+		dev->hw_stats_regs_64[i] = i40iw_stat_offsets_64[i];
+
+	dev->hw_attrs.first_hw_vf_fpm_id = I40IW_FIRST_VF_FPM_ID;
+	dev->hw_attrs.max_hw_vf_fpm_id = IRDMA_MAX_VF_FPM_ID;
+
+	for (i = 0; i < IRDMA_MAX_SHIFTS; ++i)
+		dev->hw_shifts[i] = i40iw_shifts[i];
+
+	for (i = 0; i < IRDMA_MAX_MASKS; ++i)
+		dev->hw_masks[i] = i40iw_masks[i];
+
+	dev->wqe_alloc_db = dev->hw_regs[IRDMA_WQEALLOC];
+	dev->cq_arm_db = dev->hw_regs[IRDMA_CQARM];
+	dev->aeq_alloc_db = dev->hw_regs[IRDMA_AEQALLOC];
+	dev->cqp_db = dev->hw_regs[IRDMA_CQPDB];
+	dev->cq_ack_db = dev->hw_regs[IRDMA_CQACK];
+	dev->ceq_itr_mask_db = NULL;
+	dev->aeq_itr_mask_db = NULL;
+
+	memcpy(&i40iw_irq_ops, dev->irq_ops, sizeof(i40iw_irq_ops));
+	i40iw_irq_ops.irdma_en_irq = i40iw_ena_irq;
+	i40iw_irq_ops.irdma_dis_irq = i40iw_disable_irq;
+	i40iw_irq_ops.irdma_cfg_ceq = i40iw_config_ceq;
+	dev->irq_ops = &i40iw_irq_ops;
+
+	/* Setup the hardware limits, hmc may limit further */
+	dev->hw_attrs.uk_attrs.max_hw_wq_frags = I40IW_MAX_WQ_FRAGMENT_COUNT;
+	dev->hw_attrs.uk_attrs.max_hw_read_sges = I40IW_MAX_SGE_RD;
+	dev->hw_attrs.max_hw_device_pages = I40IW_MAX_PUSH_PAGE_COUNT;
+	dev->hw_attrs.uk_attrs.max_hw_inline = I40IW_MAX_INLINE_DATA_SIZE;
+	dev->hw_attrs.max_hw_ird = I40IW_MAX_IRD_SIZE;
+	dev->hw_attrs.max_hw_ord = I40IW_MAX_ORD_SIZE;
+	dev->hw_attrs.max_hw_wqes = I40IW_MAX_WQ_ENTRIES;
+	dev->hw_attrs.uk_attrs.max_hw_rq_quanta = I40IW_QP_SW_MAX_RQ_QUANTA;
+	dev->hw_attrs.uk_attrs.max_hw_wq_quanta = I40IW_QP_SW_MAX_WQ_QUANTA;
+	dev->hw_attrs.uk_attrs.max_hw_sq_chunk = I40IW_MAX_QUANTA_PER_WR;
+	dev->hw_attrs.max_hw_pds = I40IW_MAX_PDS;
+	dev->hw_attrs.max_stat_inst = I40IW_MAX_STATS_COUNT;
+	dev->hw_attrs.max_hw_outbound_msg_size = I40IW_MAX_OUTBOUND_MSG_SIZE;
+	dev->hw_attrs.max_hw_inbound_msg_size = I40IW_MAX_INBOUND_MSG_SIZE;
+	dev->hw_attrs.max_qp_wr = I40IW_MAX_QP_WRS;
+}
diff --git a/drivers/infiniband/hw/irdma/i40iw_hw.h b/drivers/infiniband/hw/irdma/i40iw_hw.h
new file mode 100644
index 0000000..1c438b3
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/i40iw_hw.h
@@ -0,0 +1,160 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#ifndef I40IW_HW_H
+#define I40IW_HW_H
+#define I40E_VFPE_CQPTAIL1            0x0000A000 /* Reset: VFR */
+#define I40E_VFPE_CQPDB1              0x0000BC00 /* Reset: VFR */
+#define I40E_VFPE_CCQPSTATUS1         0x0000B800 /* Reset: VFR */
+#define I40E_VFPE_CCQPHIGH1           0x00009800 /* Reset: VFR */
+#define I40E_VFPE_CCQPLOW1            0x0000AC00 /* Reset: VFR */
+#define I40E_VFPE_CQARM1              0x0000B400 /* Reset: VFR */
+#define I40E_VFPE_CQACK1              0x0000B000 /* Reset: VFR */
+#define I40E_VFPE_AEQALLOC1           0x0000A400 /* Reset: VFR */
+#define I40E_VFPE_CQPERRCODES1        0x00009C00 /* Reset: VFR */
+#define I40E_VFPE_WQEALLOC1           0x0000C000 /* Reset: VFR */
+#define I40E_VFINT_DYN_CTLN(_INTVF)   (0x00024800 + ((_INTVF) * 4)) /* _i=0...511 */ /* Reset: VFR */
+
+#define I40E_PFPE_CQPTAIL             0x00008080 /* Reset: PFR */
+
+#define I40E_PFPE_CQPDB               0x00008000 /* Reset: PFR */
+#define I40E_PFPE_CCQPSTATUS          0x00008100 /* Reset: PFR */
+#define I40E_PFPE_CCQPHIGH            0x00008200 /* Reset: PFR */
+#define I40E_PFPE_CCQPLOW             0x00008180 /* Reset: PFR */
+#define I40E_PFPE_CQARM               0x00131080 /* Reset: PFR */
+#define I40E_PFPE_CQACK               0x00131100 /* Reset: PFR */
+#define I40E_PFPE_AEQALLOC            0x00131180 /* Reset: PFR */
+#define I40E_PFPE_CQPERRCODES         0x00008880 /* Reset: PFR */
+#define I40E_PFPE_WQEALLOC            0x00138C00 /* Reset: PFR */
+#define I40E_GLPCI_LBARCTRL           0x000BE484 /* Reset: POR */
+#define I40E_GLPE_CPUSTATUS0          0x0000D040 /* Reset: PE_CORER */
+#define I40E_GLPE_CPUSTATUS1          0x0000D044 /* Reset: PE_CORER */
+#define I40E_GLPE_CPUSTATUS2          0x0000D048 /* Reset: PE_CORER */
+#define I40E_GLPE_CRITERR             0x000B4000 /* Reset: PE_CORER */
+#define I40E_PFHMC_PDINV              0x000C0300 /* Reset: PFR */
+#define I40E_GLHMC_VFPDINV(_i)        (0x000C8300 + ((_i) * 4)) /* _i=0...31 */ /* Reset: CORER */
+#define I40E_PFINT_DYN_CTLN(_INTPF)   (0x00034800 + ((_INTPF) * 4)) /* _i=0...511 */	/* Reset: PFR */
+#define I40E_PFINT_AEQCTL             0x00038700 /* Reset: CORER */
+
+#define I40E_GLPES_PFIP4RXDISCARD(_i)            (0x00010600 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4RXTRUNC(_i)              (0x00010700 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4TXNOROUTE(_i)            (0x00012E00 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6RXDISCARD(_i)            (0x00011200 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6RXTRUNC(_i)              (0x00011300 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+
+#define I40E_GLPES_PFRDMAVBNDLO(_i)              (0x00014800 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4TXMCOCTSLO(_i)           (0x00012000 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6RXMCOCTSLO(_i)           (0x00011600 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6TXMCOCTSLO(_i)           (0x00012A00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFUDPRXPKTSLO(_i)             (0x00013800 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFUDPTXPKTSLO(_i)             (0x00013A00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+
+#define I40E_GLPES_PFIP6TXNOROUTE(_i)            (0x00012F00 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFTCPRTXSEG(_i)               (0x00013600 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFTCPRXOPTERR(_i)             (0x00013200 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFTCPRXPROTOERR(_i)           (0x00013300 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRXVLANERR(_i)               (0x00010000 + ((_i) * 4)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4RXOCTSLO(_i)             (0x00010200 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4RXPKTSLO(_i)             (0x00010400 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4RXFRAGSLO(_i)            (0x00010800 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4RXMCPKTSLO(_i)           (0x00010C00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4TXOCTSLO(_i)             (0x00011A00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4TXPKTSLO(_i)             (0x00011C00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4TXFRAGSLO(_i)            (0x00011E00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4TXMCPKTSLO(_i)           (0x00012200 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6RXOCTSLO(_i)             (0x00010E00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6RXPKTSLO(_i)             (0x00011000 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6RXFRAGSLO(_i)            (0x00011400 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6TXOCTSLO(_i)             (0x00012400 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6TXPKTSLO(_i)             (0x00012600 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6TXFRAGSLO(_i)            (0x00012800 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6TXMCPKTSLO(_i)           (0x00012C00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFTCPTXSEGLO(_i)              (0x00013400 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRDMARXRDSLO(_i)             (0x00013E00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRDMARXSNDSLO(_i)            (0x00014000 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRDMARXWRSLO(_i)             (0x00013C00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRDMATXRDSLO(_i)             (0x00014400 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRDMATXSNDSLO(_i)            (0x00014600 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRDMATXWRSLO(_i)             (0x00014200 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP4RXMCOCTSLO(_i)           (0x00010A00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFIP6RXMCPKTSLO(_i)           (0x00011800 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFTCPRXSEGSLO(_i)             (0x00013000 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+#define I40E_GLPES_PFRDMAVINVLO(_i)              (0x00014A00 + ((_i) * 8)) /* _i=0...15 */ /* Reset: PE_CORER */
+
+#define I40IW_DB_ADDR_OFFSET    (4 * 1024 * 1024 - 64 * 1024)
+
+#define I40IW_VF_DB_ADDR_OFFSET (64 * 1024)
+
+#define I40E_PFINT_LNKLSTN(_INTPF)           (0x00035000 + ((_INTPF) * 4)) /* _i=0...511 */ /* Reset: PFR */
+#define I40E_PFINT_LNKLSTN_MAX_INDEX         511
+#define I40E_PFINT_LNKLSTN_FIRSTQ_INDX GENMASK(10, 0)
+#define I40E_PFINT_LNKLSTN_FIRSTQ_TYPE GENMASK(12, 11)
+
+#define I40E_PFINT_CEQCTL(_INTPF)          (0x00036800 + ((_INTPF) * 4)) /* _i=0...511 */ /* Reset: CORER */
+#define I40E_PFINT_CEQCTL_MAX_INDEX        511
+
+/* shifts/masks for FLD_[LS/RS]_64 macros used in device table */
+#define I40E_PFINT_CEQCTL_MSIX_INDX_S 0
+#define I40E_PFINT_CEQCTL_MSIX_INDX GENMASK(7, 0)
+#define I40E_PFINT_CEQCTL_ITR_INDX_S 11
+#define I40E_PFINT_CEQCTL_ITR_INDX GENMASK(12, 11)
+#define I40E_PFINT_CEQCTL_MSIX0_INDX_S 13
+#define I40E_PFINT_CEQCTL_MSIX0_INDX GENMASK(15, 13)
+#define I40E_PFINT_CEQCTL_NEXTQ_INDX_S 16
+#define I40E_PFINT_CEQCTL_NEXTQ_INDX GENMASK(26, 16)
+#define I40E_PFINT_CEQCTL_NEXTQ_TYPE_S 27
+#define I40E_PFINT_CEQCTL_NEXTQ_TYPE GENMASK(28, 27)
+#define I40E_PFINT_CEQCTL_CAUSE_ENA_S 30
+#define I40E_PFINT_CEQCTL_CAUSE_ENA BIT(30)
+#define I40E_PFINT_CEQCTL_INTEVENT_S 31
+#define I40E_PFINT_CEQCTL_INTEVENT BIT(31)
+#define I40E_CQPSQ_STAG_PDID_S 48
+#define I40E_CQPSQ_STAG_PDID GENMASK_ULL(62, 48)
+#define I40E_PFPE_CCQPSTATUS_CCQP_DONE_S 0
+#define I40E_PFPE_CCQPSTATUS_CCQP_DONE BIT_ULL(0)
+#define I40E_PFPE_CCQPSTATUS_CCQP_ERR_S 31
+#define I40E_PFPE_CCQPSTATUS_CCQP_ERR BIT_ULL(31)
+#define I40E_PFINT_DYN_CTLN_ITR_INDX_S 3
+#define I40E_PFINT_DYN_CTLN_ITR_INDX GENMASK(4, 3)
+#define I40E_PFINT_DYN_CTLN_INTENA_S 0
+#define I40E_PFINT_DYN_CTLN_INTENA BIT(0)
+#define I40E_CQPSQ_CQ_CEQID_S 24
+#define I40E_CQPSQ_CQ_CEQID GENMASK(30, 24)
+#define I40E_CQPSQ_CQ_CQID_S 0
+#define I40E_CQPSQ_CQ_CQID GENMASK_ULL(15, 0)
+#define I40E_COMMIT_FPM_CQCNT_S 0
+#define I40E_COMMIT_FPM_CQCNT GENMASK_ULL(17, 0)
+
+#define I40E_VSIQF_CTL(_VSI)             (0x0020D800 + ((_VSI) * 4))
+
+enum i40iw_device_caps_const {
+	I40IW_MAX_WQ_FRAGMENT_COUNT		= 3,
+	I40IW_MAX_SGE_RD			= 1,
+	I40IW_MAX_PUSH_PAGE_COUNT		= 0,
+	I40IW_MAX_INLINE_DATA_SIZE		= 48,
+	I40IW_MAX_IRD_SIZE			= 63,
+	I40IW_MAX_ORD_SIZE			= 127,
+	I40IW_MAX_WQ_ENTRIES			= 2048,
+	I40IW_MAX_WQE_SIZE_RQ			= 128,
+	I40IW_MAX_PDS				= 32768,
+	I40IW_MAX_STATS_COUNT			= 16,
+	I40IW_MAX_CQ_SIZE			= 1048575,
+	I40IW_MAX_OUTBOUND_MSG_SIZE		= 2147483647,
+	I40IW_MAX_INBOUND_MSG_SIZE		= 2147483647,
+};
+
+#define I40IW_QP_WQE_MIN_SIZE   32
+#define I40IW_QP_WQE_MAX_SIZE   128
+#define I40IW_QP_SW_MIN_WQSIZE  4
+#define I40IW_MAX_RQ_WQE_SHIFT  2
+#define I40IW_MAX_QUANTA_PER_WR 2
+
+#define I40IW_QP_SW_MAX_SQ_QUANTA 2048
+#define I40IW_QP_SW_MAX_RQ_QUANTA 16384
+#define I40IW_QP_SW_MAX_WQ_QUANTA 2048
+#define I40IW_MAX_QP_WRS ((I40IW_QP_SW_MAX_SQ_QUANTA - IRDMA_SQ_RSVD) / I40IW_MAX_QUANTA_PER_WR)
+#define I40IW_FIRST_VF_FPM_ID 16
+#define QUEUE_TYPE_CEQ        2
+#define NULL_QUEUE_INDEX      0x7FF
+
+void i40iw_init_hw(struct irdma_sc_dev *dev);
+#endif /* I40IW_HW_H */
diff --git a/drivers/infiniband/hw/irdma/icrdma_hw.c b/drivers/infiniband/hw/irdma/icrdma_hw.c
new file mode 100644
index 0000000..97a765c
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/icrdma_hw.c
@@ -0,0 +1,84 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2017 - 2021 Intel Corporation */
+#include "osdep.h"
+#include "type.h"
+#include "icrdma_hw.h"
+
+static u32 icrdma_regs[IRDMA_MAX_REGS] = {
+	PFPE_CQPTAIL,
+	PFPE_CQPDB,
+	PFPE_CCQPSTATUS,
+	PFPE_CCQPHIGH,
+	PFPE_CCQPLOW,
+	PFPE_CQARM,
+	PFPE_CQACK,
+	PFPE_AEQALLOC,
+	PFPE_CQPERRCODES,
+	PFPE_WQEALLOC,
+	GLINT_DYN_CTL(0),
+	ICRDMA_DB_ADDR_OFFSET,
+
+	GLPCI_LBARCTRL,
+	GLPE_CPUSTATUS0,
+	GLPE_CPUSTATUS1,
+	GLPE_CPUSTATUS2,
+	PFINT_AEQCTL,
+	GLINT_CEQCTL(0),
+	VSIQF_PE_CTL1(0),
+	PFHMC_PDINV,
+	GLHMC_VFPDINV(0),
+	GLPE_CRITERR,
+	GLINT_RATE(0),
+};
+
+static u64 icrdma_masks[IRDMA_MAX_MASKS] = {
+	ICRDMA_CCQPSTATUS_CCQP_DONE,
+	ICRDMA_CCQPSTATUS_CCQP_ERR,
+	ICRDMA_CQPSQ_STAG_PDID,
+	ICRDMA_CQPSQ_CQ_CEQID,
+	ICRDMA_CQPSQ_CQ_CQID,
+	ICRDMA_COMMIT_FPM_CQCNT,
+};
+
+static u64 icrdma_shifts[IRDMA_MAX_SHIFTS] = {
+	ICRDMA_CCQPSTATUS_CCQP_DONE_S,
+	ICRDMA_CCQPSTATUS_CCQP_ERR_S,
+	ICRDMA_CQPSQ_STAG_PDID_S,
+	ICRDMA_CQPSQ_CQ_CEQID_S,
+	ICRDMA_CQPSQ_CQ_CQID_S,
+	ICRDMA_COMMIT_FPM_CQCNT_S,
+};
+
+void icrdma_init_hw(struct irdma_sc_dev *dev)
+{
+	int i;
+	u8 __iomem *hw_addr;
+
+	for (i = 0; i < IRDMA_MAX_REGS; ++i) {
+		hw_addr = dev->hw->hw_addr;
+
+		if (i == IRDMA_DB_ADDR_OFFSET)
+			hw_addr = NULL;
+
+		dev->hw_regs[i] = (u32 __iomem *)(hw_addr + icrdma_regs[i]);
+	}
+	dev->hw_attrs.max_hw_vf_fpm_id = IRDMA_MAX_VF_FPM_ID;
+	dev->hw_attrs.first_hw_vf_fpm_id = IRDMA_FIRST_VF_FPM_ID;
+
+	for (i = 0; i < IRDMA_MAX_SHIFTS; ++i)
+		dev->hw_shifts[i] = icrdma_shifts[i];
+
+	for (i = 0; i < IRDMA_MAX_MASKS; ++i)
+		dev->hw_masks[i] = icrdma_masks[i];
+
+	dev->wqe_alloc_db = dev->hw_regs[IRDMA_WQEALLOC];
+	dev->cq_arm_db = dev->hw_regs[IRDMA_CQARM];
+	dev->aeq_alloc_db = dev->hw_regs[IRDMA_AEQALLOC];
+	dev->cqp_db = dev->hw_regs[IRDMA_CQPDB];
+	dev->cq_ack_db = dev->hw_regs[IRDMA_CQACK];
+	dev->hw_attrs.max_hw_ird = ICRDMA_MAX_IRD_SIZE;
+	dev->hw_attrs.max_hw_ord = ICRDMA_MAX_ORD_SIZE;
+	dev->hw_attrs.max_stat_inst = ICRDMA_MAX_STATS_COUNT;
+
+	dev->hw_attrs.uk_attrs.max_hw_sq_chunk = IRDMA_MAX_QUANTA_PER_WR;
+}
diff --git a/drivers/infiniband/hw/irdma/icrdma_hw.h b/drivers/infiniband/hw/irdma/icrdma_hw.h
new file mode 100644
index 0000000..b65c463
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/icrdma_hw.h
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2017 - 2021 Intel Corporation */
+#ifndef ICRDMA_HW_H
+#define ICRDMA_HW_H
+
+#include "irdma.h"
+
+#define VFPE_CQPTAIL1		0x0000a000
+#define VFPE_CQPDB1		0x0000bc00
+#define VFPE_CCQPSTATUS1	0x0000b800
+#define VFPE_CCQPHIGH1		0x00009800
+#define VFPE_CCQPLOW1		0x0000ac00
+#define VFPE_CQARM1		0x0000b400
+#define VFPE_CQARM1		0x0000b400
+#define VFPE_CQACK1		0x0000b000
+#define VFPE_AEQALLOC1		0x0000a400
+#define VFPE_CQPERRCODES1	0x00009c00
+#define VFPE_WQEALLOC1		0x0000c000
+#define VFINT_DYN_CTLN(_i)	(0x00003800 + ((_i) * 4)) /* _i=0...63 */
+
+#define PFPE_CQPTAIL		0x00500880
+#define PFPE_CQPDB		0x00500800
+#define PFPE_CCQPSTATUS		0x0050a000
+#define PFPE_CCQPHIGH		0x0050a100
+#define PFPE_CCQPLOW		0x0050a080
+#define PFPE_CQARM		0x00502c00
+#define PFPE_CQACK		0x00502c80
+#define PFPE_AEQALLOC		0x00502d00
+#define GLINT_DYN_CTL(_INT)	(0x00160000 + ((_INT) * 4)) /* _i=0...2047 */
+#define GLPCI_LBARCTRL		0x0009de74
+#define GLPE_CPUSTATUS0		0x0050ba5c
+#define GLPE_CPUSTATUS1		0x0050ba60
+#define GLPE_CPUSTATUS2		0x0050ba64
+#define PFINT_AEQCTL		0x0016cb00
+#define PFPE_CQPERRCODES	0x0050a200
+#define PFPE_WQEALLOC		0x00504400
+#define GLINT_CEQCTL(_INT)	(0x0015c000 + ((_INT) * 4)) /* _i=0...2047 */
+#define VSIQF_PE_CTL1(_VSI)	(0x00414000 + ((_VSI) * 4)) /* _i=0...767 */
+#define PFHMC_PDINV		0x00520300
+#define GLHMC_VFPDINV(_i)	(0x00528300 + ((_i) * 4)) /* _i=0...31 */
+#define GLPE_CRITERR		0x00534000
+#define GLINT_RATE(_INT)	(0x0015A000 + ((_INT) * 4)) /* _i=0...2047 */ /* Reset Source: CORER */
+
+#define ICRDMA_DB_ADDR_OFFSET		(8 * 1024 * 1024 - 64 * 1024)
+
+#define ICRDMA_VF_DB_ADDR_OFFSET	(64 * 1024)
+
+/* shifts/masks for FLD_[LS/RS]_64 macros used in device table */
+#define ICRDMA_CCQPSTATUS_CCQP_DONE_S 0
+#define ICRDMA_CCQPSTATUS_CCQP_DONE BIT_ULL(0)
+#define ICRDMA_CCQPSTATUS_CCQP_ERR_S 31
+#define ICRDMA_CCQPSTATUS_CCQP_ERR BIT_ULL(31)
+#define ICRDMA_CQPSQ_STAG_PDID_S 46
+#define ICRDMA_CQPSQ_STAG_PDID GENMASK_ULL(63, 46)
+#define ICRDMA_CQPSQ_CQ_CEQID_S 22
+#define ICRDMA_CQPSQ_CQ_CEQID GENMASK_ULL(31, 22)
+#define ICRDMA_CQPSQ_CQ_CQID_S 0
+#define ICRDMA_CQPSQ_CQ_CQID GENMASK_ULL(18, 0)
+#define ICRDMA_COMMIT_FPM_CQCNT_S 0
+#define ICRDMA_COMMIT_FPM_CQCNT GENMASK_ULL(19, 0)
+
+enum icrdma_device_caps_const {
+	ICRDMA_MAX_STATS_COUNT = 128,
+
+	ICRDMA_MAX_IRD_SIZE			= 127,
+	ICRDMA_MAX_ORD_SIZE			= 255,
+
+};
+
+void icrdma_init_hw(struct irdma_sc_dev *dev);
+#endif /* ICRDMA_HW_H*/
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 10/23] RDMA/irdma: Implement HW Admin Queue OPs
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (8 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 09/23] RDMA/irdma: Implement device initialization definitions Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 11/23] RDMA/irdma: Add HMC backing store setup functions Shiraz Saleem
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

The driver posts privileged commands to the HW
Admin Queue (Control QP or CQP) to request administrative
actions from the HW. Implement create/destroy of CQP
and the supporting functions, data structures and headers
to handle the different CQP commands

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/ctrl.c  | 6143 +++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/defs.h  | 1162 +++++++
 drivers/infiniband/hw/irdma/irdma.h |  157 +
 drivers/infiniband/hw/irdma/type.h  | 1717 ++++++++++
 4 files changed, 9179 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/ctrl.c
 create mode 100644 drivers/infiniband/hw/irdma/defs.h
 create mode 100644 drivers/infiniband/hw/irdma/irdma.h
 create mode 100644 drivers/infiniband/hw/irdma/type.h

diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c
new file mode 100644
index 0000000..3cd0e18
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/ctrl.c
@@ -0,0 +1,6143 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "hmc.h"
+#include "defs.h"
+#include "type.h"
+#include "ws.h"
+#include "protos.h"
+
+/**
+ * irdma_get_qp_from_list - get next qp from a list
+ * @head: Listhead of qp's
+ * @qp: current qp
+ */
+struct irdma_sc_qp *irdma_get_qp_from_list(struct list_head *head,
+					   struct irdma_sc_qp *qp)
+{
+	struct list_head *lastentry;
+	struct list_head *entry = NULL;
+
+	if (list_empty(head))
+		return NULL;
+
+	if (!qp) {
+		entry = head->next;
+	} else {
+		lastentry = &qp->list;
+		entry = lastentry->next;
+		if (entry == head)
+			return NULL;
+	}
+
+	return container_of(entry, struct irdma_sc_qp, list);
+}
+
+/**
+ * irdma_sc_suspend_resume_qps - suspend/resume all qp's on VSI
+ * @vsi: the VSI struct pointer
+ * @op: Set to IRDMA_OP_RESUME or IRDMA_OP_SUSPEND
+ */
+void irdma_sc_suspend_resume_qps(struct irdma_sc_vsi *vsi, u8 op)
+{
+	struct irdma_sc_qp *qp = NULL;
+	u8 i;
+
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+		mutex_lock(&vsi->qos[i].qos_mutex);
+		qp = irdma_get_qp_from_list(&vsi->qos[i].qplist, qp);
+		while (qp) {
+			if (op == IRDMA_OP_RESUME) {
+				if (!qp->dev->ws_add(vsi, i)) {
+					qp->qs_handle =
+						vsi->qos[qp->user_pri].qs_handle;
+					irdma_cqp_qp_suspend_resume(qp, op);
+				} else {
+					irdma_cqp_qp_suspend_resume(qp, op);
+					irdma_modify_qp_to_err(qp);
+				}
+			} else if (op == IRDMA_OP_SUSPEND) {
+				/* issue cqp suspend command */
+				if (!irdma_cqp_qp_suspend_resume(qp, op))
+					atomic_inc(&vsi->qp_suspend_reqs);
+			}
+			qp = irdma_get_qp_from_list(&vsi->qos[i].qplist, qp);
+		}
+		mutex_unlock(&vsi->qos[i].qos_mutex);
+	}
+}
+
+/**
+ * irdma_change_l2params - given the new l2 parameters, change all qp
+ * @vsi: RDMA VSI pointer
+ * @l2params: New parameters from l2
+ */
+void irdma_change_l2params(struct irdma_sc_vsi *vsi,
+			   struct irdma_l2params *l2params)
+{
+	if (l2params->mtu_changed) {
+		vsi->mtu = l2params->mtu;
+		if (vsi->ieq)
+			irdma_reinitialize_ieq(vsi);
+	}
+
+	if (!l2params->tc_changed)
+		return;
+
+	vsi->tc_change_pending = false;
+	irdma_sc_suspend_resume_qps(vsi, IRDMA_OP_RESUME);
+}
+
+/**
+ * irdma_qp_rem_qos - remove qp from qos lists during destroy qp
+ * @qp: qp to be removed from qos
+ */
+void irdma_qp_rem_qos(struct irdma_sc_qp *qp)
+{
+	struct irdma_sc_vsi *vsi = qp->vsi;
+
+	ibdev_dbg(to_ibdev(qp->dev),
+		  "DCB: DCB: Remove qp[%d] UP[%d] qset[%d] on_qoslist[%d]\n",
+		  qp->qp_uk.qp_id, qp->user_pri, qp->qs_handle,
+		  qp->on_qoslist);
+	mutex_lock(&vsi->qos[qp->user_pri].qos_mutex);
+	if (qp->on_qoslist) {
+		qp->on_qoslist = false;
+		list_del(&qp->list);
+	}
+	mutex_unlock(&vsi->qos[qp->user_pri].qos_mutex);
+}
+
+/**
+ * irdma_qp_add_qos - called during setctx for qp to be added to qos
+ * @qp: qp to be added to qos
+ */
+void irdma_qp_add_qos(struct irdma_sc_qp *qp)
+{
+	struct irdma_sc_vsi *vsi = qp->vsi;
+
+	ibdev_dbg(to_ibdev(qp->dev),
+		  "DCB: DCB: Add qp[%d] UP[%d] qset[%d] on_qoslist[%d]\n",
+		  qp->qp_uk.qp_id, qp->user_pri, qp->qs_handle,
+		  qp->on_qoslist);
+	mutex_lock(&vsi->qos[qp->user_pri].qos_mutex);
+	if (!qp->on_qoslist) {
+		list_add(&qp->list, &vsi->qos[qp->user_pri].qplist);
+		qp->on_qoslist = true;
+		qp->qs_handle = vsi->qos[qp->user_pri].qs_handle;
+	}
+	mutex_unlock(&vsi->qos[qp->user_pri].qos_mutex);
+}
+
+/**
+ * irdma_sc_pd_init - initialize sc pd struct
+ * @dev: sc device struct
+ * @pd: sc pd ptr
+ * @pd_id: pd_id for allocated pd
+ * @abi_ver: User/Kernel ABI version
+ */
+static void irdma_sc_pd_init(struct irdma_sc_dev *dev, struct irdma_sc_pd *pd,
+			     u32 pd_id, int abi_ver)
+{
+	pd->pd_id = pd_id;
+	pd->abi_ver = abi_ver;
+	pd->dev = dev;
+}
+
+/**
+ * irdma_sc_add_arp_cache_entry - cqp wqe add arp cache entry
+ * @cqp: struct for cqp hw
+ * @info: arp entry information
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_add_arp_cache_entry(struct irdma_sc_cqp *cqp,
+			     struct irdma_add_arp_cache_entry_info *info,
+			     u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	set_64bit_val(wqe, 8, info->reach_max);
+	set_64bit_val(wqe, 16, ether_addr_to_u64(info->mac_addr));
+
+	hdr = info->arp_index |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MANAGE_ARP) |
+	      FIELD_PREP(IRDMA_CQPSQ_MAT_PERMANENT, (info->permanent ? 1 : 0)) |
+	      FIELD_PREP(IRDMA_CQPSQ_MAT_ENTRYVALID, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: ARP_CACHE_ENTRY WQE", DUMP_PREFIX_OFFSET,
+			     16, 8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_del_arp_cache_entry - dele arp cache entry
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @arp_index: arp index to delete arp entry
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_del_arp_cache_entry(struct irdma_sc_cqp *cqp, u64 scratch,
+			     u16 arp_index, bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	hdr = arp_index |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MANAGE_ARP) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: ARP_CACHE_DEL_ENTRY WQE",
+			     DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_query_arp_cache_entry - cqp wqe to query arp and arp index
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @arp_index: arp index to delete arp entry
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_query_arp_cache_entry(struct irdma_sc_cqp *cqp, u64 scratch,
+			       u16 arp_index, bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	hdr = arp_index | FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MANAGE_ARP) |
+	      FIELD_PREP(IRDMA_CQPSQ_MAT_QUERY, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QUERY_ARP_CACHE_ENTRY WQE",
+			     DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_manage_apbvt_entry - for adding and deleting apbvt entries
+ * @cqp: struct for cqp hw
+ * @info: info for apbvt entry to add or delete
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_manage_apbvt_entry(struct irdma_sc_cqp *cqp,
+			    struct irdma_apbvt_info *info, u64 scratch,
+			    bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, info->port);
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MANAGE_APBVT) |
+	      FIELD_PREP(IRDMA_CQPSQ_MAPT_ADDPORT, info->add) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: MANAGE_APBVT WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_manage_qhash_table_entry - manage quad hash entries
+ * @cqp: struct for cqp hw
+ * @info: info for quad hash to manage
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ *
+ * This is called before connection establishment is started.
+ * For passive connections, when listener is created, it will
+ * call with entry type of  IRDMA_QHASH_TYPE_TCP_SYN with local
+ * ip address and tcp port. When SYN is received (passive
+ * connections) or sent (active connections), this routine is
+ * called with entry type of IRDMA_QHASH_TYPE_TCP_ESTABLISHED
+ * and quad is passed in info.
+ *
+ * When iwarp connection is done and its state moves to RTS, the
+ * quad hash entry in the hardware will point to iwarp's qp
+ * number and requires no calls from the driver.
+ */
+static enum irdma_status_code
+irdma_sc_manage_qhash_table_entry(struct irdma_sc_cqp *cqp,
+				  struct irdma_qhash_table_info *info,
+				  u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	u64 qw1 = 0;
+	u64 qw2 = 0;
+	u64 temp;
+	struct irdma_sc_vsi *vsi = info->vsi;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 0, ether_addr_to_u64(info->mac_addr));
+
+	qw1 = FIELD_PREP(IRDMA_CQPSQ_QHASH_QPN, info->qp_num) |
+	      FIELD_PREP(IRDMA_CQPSQ_QHASH_DEST_PORT, info->dest_port);
+	if (info->ipv4_valid) {
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR3, info->dest_ip[0]));
+	} else {
+		set_64bit_val(wqe, 56,
+			      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR0, info->dest_ip[0]) |
+			      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR1, info->dest_ip[1]));
+
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR2, info->dest_ip[2]) |
+			      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR3, info->dest_ip[3]));
+	}
+	qw2 = FIELD_PREP(IRDMA_CQPSQ_QHASH_QS_HANDLE,
+			 vsi->qos[info->user_pri].qs_handle);
+	if (info->vlan_valid)
+		qw2 |= FIELD_PREP(IRDMA_CQPSQ_QHASH_VLANID, info->vlan_id);
+	set_64bit_val(wqe, 16, qw2);
+	if (info->entry_type == IRDMA_QHASH_TYPE_TCP_ESTABLISHED) {
+		qw1 |= FIELD_PREP(IRDMA_CQPSQ_QHASH_SRC_PORT, info->src_port);
+		if (!info->ipv4_valid) {
+			set_64bit_val(wqe, 40,
+				      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR0, info->src_ip[0]) |
+				      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR1, info->src_ip[1]));
+			set_64bit_val(wqe, 32,
+				      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR2, info->src_ip[2]) |
+				      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR3, info->src_ip[3]));
+		} else {
+			set_64bit_val(wqe, 32,
+				      FIELD_PREP(IRDMA_CQPSQ_QHASH_ADDR3, info->src_ip[0]));
+		}
+	}
+
+	set_64bit_val(wqe, 8, qw1);
+	temp = FIELD_PREP(IRDMA_CQPSQ_QHASH_WQEVALID, cqp->polarity) |
+	       FIELD_PREP(IRDMA_CQPSQ_QHASH_OPCODE,
+			  IRDMA_CQP_OP_MANAGE_QUAD_HASH_TABLE_ENTRY) |
+	       FIELD_PREP(IRDMA_CQPSQ_QHASH_MANAGE, info->manage) |
+	       FIELD_PREP(IRDMA_CQPSQ_QHASH_IPV4VALID, info->ipv4_valid) |
+	       FIELD_PREP(IRDMA_CQPSQ_QHASH_VLANVALID, info->vlan_valid) |
+	       FIELD_PREP(IRDMA_CQPSQ_QHASH_ENTRYTYPE, info->entry_type);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, temp);
+
+	print_hex_dump_debug("WQE: MANAGE_QHASH WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_cqp_nop - send a nop wqe
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_cqp_nop(struct irdma_sc_cqp *cqp,
+					       u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_NOP) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: NOP WQE", DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_qp_init - initialize qp
+ * @qp: sc qp
+ * @info: initialization qp info
+ */
+static enum irdma_status_code irdma_sc_qp_init(struct irdma_sc_qp *qp,
+					       struct irdma_qp_init_info *info)
+{
+	enum irdma_status_code ret_code;
+	u32 pble_obj_cnt;
+	u16 wqe_size;
+
+	if (info->qp_uk_init_info.max_sq_frag_cnt >
+	    info->pd->dev->hw_attrs.uk_attrs.max_hw_wq_frags ||
+	    info->qp_uk_init_info.max_rq_frag_cnt >
+	    info->pd->dev->hw_attrs.uk_attrs.max_hw_wq_frags)
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+
+	qp->dev = info->pd->dev;
+	qp->vsi = info->vsi;
+	qp->ieq_qp = info->vsi->exception_lan_q;
+	qp->sq_pa = info->sq_pa;
+	qp->rq_pa = info->rq_pa;
+	qp->hw_host_ctx_pa = info->host_ctx_pa;
+	qp->q2_pa = info->q2_pa;
+	qp->shadow_area_pa = info->shadow_area_pa;
+	qp->q2_buf = info->q2;
+	qp->pd = info->pd;
+	qp->hw_host_ctx = info->host_ctx;
+	info->qp_uk_init_info.wqe_alloc_db = qp->pd->dev->wqe_alloc_db;
+	ret_code = irdma_qp_uk_init(&qp->qp_uk, &info->qp_uk_init_info);
+	if (ret_code)
+		return ret_code;
+
+	qp->virtual_map = info->virtual_map;
+	pble_obj_cnt = info->pd->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt;
+
+	if ((info->virtual_map && info->sq_pa >= pble_obj_cnt) ||
+	    (info->virtual_map && info->rq_pa >= pble_obj_cnt))
+		return IRDMA_ERR_INVALID_PBLE_INDEX;
+
+	qp->llp_stream_handle = (void *)(-1);
+	qp->hw_sq_size = irdma_get_encoded_wqe_size(qp->qp_uk.sq_ring.size,
+						    IRDMA_QUEUE_TYPE_SQ_RQ);
+	ibdev_dbg(to_ibdev(qp->dev),
+		  "WQE: hw_sq_size[%04d] sq_ring.size[%04d]\n",
+		  qp->hw_sq_size, qp->qp_uk.sq_ring.size);
+	if (qp->qp_uk.uk_attrs->hw_rev == IRDMA_GEN_1 && qp->pd->abi_ver > 4)
+		wqe_size = IRDMA_WQE_SIZE_128;
+	else
+		ret_code = irdma_fragcnt_to_wqesize_rq(qp->qp_uk.max_rq_frag_cnt,
+						       &wqe_size);
+	if (ret_code)
+		return ret_code;
+
+	qp->hw_rq_size = irdma_get_encoded_wqe_size(qp->qp_uk.rq_size *
+				(wqe_size / IRDMA_QP_WQE_MIN_SIZE), IRDMA_QUEUE_TYPE_SQ_RQ);
+	ibdev_dbg(to_ibdev(qp->dev),
+		  "WQE: hw_rq_size[%04d] qp_uk.rq_size[%04d] wqe_size[%04d]\n",
+		  qp->hw_rq_size, qp->qp_uk.rq_size, wqe_size);
+	qp->sq_tph_val = info->sq_tph_val;
+	qp->rq_tph_val = info->rq_tph_val;
+	qp->sq_tph_en = info->sq_tph_en;
+	qp->rq_tph_en = info->rq_tph_en;
+	qp->rcv_tph_en = info->rcv_tph_en;
+	qp->xmit_tph_en = info->xmit_tph_en;
+	qp->qp_uk.first_sq_wq = info->qp_uk_init_info.first_sq_wq;
+	qp->qs_handle = qp->vsi->qos[qp->user_pri].qs_handle;
+
+	return 0;
+}
+
+/**
+ * irdma_sc_qp_create - create qp
+ * @qp: sc qp
+ * @info: qp create info
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_qp_create(struct irdma_sc_qp *qp, struct irdma_create_qp_info *info,
+		   u64 scratch, bool post_sq)
+{
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+	u64 hdr;
+
+	cqp = qp->dev->cqp;
+	if (qp->qp_uk.qp_id < cqp->dev->hw_attrs.min_hw_qp_id ||
+	    qp->qp_uk.qp_id > (cqp->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_QP].max_cnt - 1))
+		return IRDMA_ERR_INVALID_QP_ID;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, qp->hw_host_ctx_pa);
+	set_64bit_val(wqe, 40, qp->shadow_area_pa);
+
+	hdr = qp->qp_uk.qp_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_QP) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_ORDVALID, (info->ord_valid ? 1 : 0)) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_TOECTXVALID, info->tcp_ctx_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_MACVALID, info->mac_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_QPTYPE, qp->qp_uk.qp_type) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_VQ, qp->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_FORCELOOPBACK, info->force_lpb) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_CQNUMVALID, info->cq_num_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_ARPTABIDXVALID,
+			 info->arp_cache_idx_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_NEXTIWSTATE, info->next_iwarp_state) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QP_CREATE WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_qp_modify - modify qp cqp wqe
+ * @qp: sc qp
+ * @info: modify qp info
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_qp_modify(struct irdma_sc_qp *qp, struct irdma_modify_qp_info *info,
+		   u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+	u8 term_actions = 0;
+	u8 term_len = 0;
+
+	cqp = qp->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	if (info->next_iwarp_state == IRDMA_QP_STATE_TERMINATE) {
+		if (info->dont_send_fin)
+			term_actions += IRDMAQP_TERM_SEND_TERM_ONLY;
+		if (info->dont_send_term)
+			term_actions += IRDMAQP_TERM_SEND_FIN_ONLY;
+		if (term_actions == IRDMAQP_TERM_SEND_TERM_AND_FIN ||
+		    term_actions == IRDMAQP_TERM_SEND_TERM_ONLY)
+			term_len = info->termlen;
+	}
+
+	set_64bit_val(wqe, 8,
+		      FIELD_PREP(IRDMA_CQPSQ_QP_NEWMSS, info->new_mss) |
+		      FIELD_PREP(IRDMA_CQPSQ_QP_TERMLEN, term_len));
+	set_64bit_val(wqe, 16, qp->hw_host_ctx_pa);
+	set_64bit_val(wqe, 40, qp->shadow_area_pa);
+
+	hdr = qp->qp_uk.qp_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MODIFY_QP) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_ORDVALID, info->ord_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_TOECTXVALID, info->tcp_ctx_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_CACHEDVARVALID,
+			 info->cached_var_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_VQ, qp->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_FORCELOOPBACK, info->force_lpb) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_CQNUMVALID, info->cq_num_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_MACVALID, info->mac_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_QPTYPE, qp->qp_uk.qp_type) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_MSSCHANGE, info->mss_change) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_REMOVEHASHENTRY,
+			 info->remove_hash_idx) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_TERMACT, term_actions) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_RESETCON, info->reset_tcp_conn) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_ARPTABIDXVALID,
+			 info->arp_cache_idx_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_NEXTIWSTATE, info->next_iwarp_state) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QP_MODIFY WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_qp_destroy - cqp destroy qp
+ * @qp: sc qp
+ * @scratch: u64 saved to be used during cqp completion
+ * @remove_hash_idx: flag if to remove hash idx
+ * @ignore_mw_bnd: memory window bind flag
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_qp_destroy(struct irdma_sc_qp *qp, u64 scratch, bool remove_hash_idx,
+		    bool ignore_mw_bnd, bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+
+	cqp = qp->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, qp->hw_host_ctx_pa);
+	set_64bit_val(wqe, 40, qp->shadow_area_pa);
+
+	hdr = qp->qp_uk.qp_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_QP) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_QPTYPE, qp->qp_uk.qp_type) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_IGNOREMWBOUND, ignore_mw_bnd) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_REMOVEHASHENTRY, remove_hash_idx) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QP_DESTROY WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_get_encoded_ird_size -
+ * @ird_size: IRD size
+ * The ird from the connection is rounded to a supported HW setting and then encoded
+ * for ird_size field of qp_ctx. Consumers are expected to provide valid ird size based
+ * on hardware attributes. IRD size defaults to a value of 4 in case of invalid input
+ */
+static u8 irdma_sc_get_encoded_ird_size(u16 ird_size)
+{
+	switch (ird_size ?
+		roundup_pow_of_two(2 * ird_size) : 4) {
+	case 256:
+		return IRDMA_IRD_HW_SIZE_256;
+	case 128:
+		return IRDMA_IRD_HW_SIZE_128;
+	case 64:
+	case 32:
+		return IRDMA_IRD_HW_SIZE_64;
+	case 16:
+	case 8:
+		return IRDMA_IRD_HW_SIZE_16;
+	case 4:
+	default:
+		break;
+	}
+
+	return IRDMA_IRD_HW_SIZE_4;
+}
+
+/**
+ * irdma_sc_qp_setctx_roce - set qp's context
+ * @qp: sc qp
+ * @qp_ctx: context ptr
+ * @info: ctx info
+ */
+static void irdma_sc_qp_setctx_roce(struct irdma_sc_qp *qp, __le64 *qp_ctx,
+				    struct irdma_qp_host_ctx_info *info)
+{
+	struct irdma_roce_offload_info *roce_info;
+	struct irdma_udp_offload_info *udp;
+	u8 push_mode_en;
+	u32 push_idx;
+
+	roce_info = info->roce_info;
+	udp = info->udp_info;
+	qp->user_pri = info->user_pri;
+	if (qp->push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX) {
+		push_mode_en = 0;
+		push_idx = 0;
+	} else {
+		push_mode_en = 1;
+		push_idx = qp->push_idx;
+	}
+	set_64bit_val(qp_ctx, 0,
+		      FIELD_PREP(IRDMAQPC_RQWQESIZE, qp->qp_uk.rq_wqe_size) |
+		      FIELD_PREP(IRDMAQPC_RCVTPHEN, qp->rcv_tph_en) |
+		      FIELD_PREP(IRDMAQPC_XMITTPHEN, qp->xmit_tph_en) |
+		      FIELD_PREP(IRDMAQPC_RQTPHEN, qp->rq_tph_en) |
+		      FIELD_PREP(IRDMAQPC_SQTPHEN, qp->sq_tph_en) |
+		      FIELD_PREP(IRDMAQPC_PPIDX, push_idx) |
+		      FIELD_PREP(IRDMAQPC_PMENA, push_mode_en) |
+		      FIELD_PREP(IRDMAQPC_PDIDXHI, roce_info->pd_id >> 16) |
+		      FIELD_PREP(IRDMAQPC_DC_TCP_EN, roce_info->dctcp_en) |
+		      FIELD_PREP(IRDMAQPC_ERR_RQ_IDX_VALID, roce_info->err_rq_idx_valid) |
+		      FIELD_PREP(IRDMAQPC_ISQP1, roce_info->is_qp1) |
+		      FIELD_PREP(IRDMAQPC_ROCE_TVER, roce_info->roce_tver) |
+		      FIELD_PREP(IRDMAQPC_IPV4, udp->ipv4) |
+		      FIELD_PREP(IRDMAQPC_INSERTVLANTAG, udp->insert_vlan_tag));
+	set_64bit_val(qp_ctx, 8, qp->sq_pa);
+	set_64bit_val(qp_ctx, 16, qp->rq_pa);
+	if ((roce_info->dcqcn_en || roce_info->dctcp_en) &&
+	    !(udp->tos & 0x03))
+		udp->tos |= ECN_CODE_PT_VAL;
+	set_64bit_val(qp_ctx, 24,
+		      FIELD_PREP(IRDMAQPC_RQSIZE, qp->hw_rq_size) |
+		      FIELD_PREP(IRDMAQPC_SQSIZE, qp->hw_sq_size) |
+		      FIELD_PREP(IRDMAQPC_TTL, udp->ttl) | FIELD_PREP(IRDMAQPC_TOS, udp->tos) |
+		      FIELD_PREP(IRDMAQPC_SRCPORTNUM, udp->src_port) |
+		      FIELD_PREP(IRDMAQPC_DESTPORTNUM, udp->dst_port));
+	set_64bit_val(qp_ctx, 32,
+		      FIELD_PREP(IRDMAQPC_DESTIPADDR2, udp->dest_ip_addr[2]) |
+		      FIELD_PREP(IRDMAQPC_DESTIPADDR3, udp->dest_ip_addr[3]));
+	set_64bit_val(qp_ctx, 40,
+		      FIELD_PREP(IRDMAQPC_DESTIPADDR0, udp->dest_ip_addr[0]) |
+		      FIELD_PREP(IRDMAQPC_DESTIPADDR1, udp->dest_ip_addr[1]));
+	set_64bit_val(qp_ctx, 48,
+		      FIELD_PREP(IRDMAQPC_SNDMSS, udp->snd_mss) |
+		      FIELD_PREP(IRDMAQPC_VLANTAG, udp->vlan_tag) |
+		      FIELD_PREP(IRDMAQPC_ARPIDX, udp->arp_idx));
+	set_64bit_val(qp_ctx, 56,
+		      FIELD_PREP(IRDMAQPC_PKEY, roce_info->p_key) |
+		      FIELD_PREP(IRDMAQPC_PDIDX, roce_info->pd_id) |
+		      FIELD_PREP(IRDMAQPC_ACKCREDITS, roce_info->ack_credits) |
+		      FIELD_PREP(IRDMAQPC_FLOWLABEL, udp->flow_label));
+	set_64bit_val(qp_ctx, 64,
+		      FIELD_PREP(IRDMAQPC_QKEY, roce_info->qkey) |
+		      FIELD_PREP(IRDMAQPC_DESTQP, roce_info->dest_qp));
+	set_64bit_val(qp_ctx, 80,
+		      FIELD_PREP(IRDMAQPC_PSNNXT, udp->psn_nxt) |
+		      FIELD_PREP(IRDMAQPC_LSN, udp->lsn));
+	set_64bit_val(qp_ctx, 88,
+		      FIELD_PREP(IRDMAQPC_EPSN, udp->epsn));
+	set_64bit_val(qp_ctx, 96,
+		      FIELD_PREP(IRDMAQPC_PSNMAX, udp->psn_max) |
+		      FIELD_PREP(IRDMAQPC_PSNUNA, udp->psn_una));
+	set_64bit_val(qp_ctx, 112,
+		      FIELD_PREP(IRDMAQPC_CWNDROCE, udp->cwnd));
+	set_64bit_val(qp_ctx, 128,
+		      FIELD_PREP(IRDMAQPC_ERR_RQ_IDX, roce_info->err_rq_idx) |
+		      FIELD_PREP(IRDMAQPC_RNRNAK_THRESH, udp->rnr_nak_thresh) |
+		      FIELD_PREP(IRDMAQPC_REXMIT_THRESH, udp->rexmit_thresh) |
+		      FIELD_PREP(IRDMAQPC_RTOMIN, roce_info->rtomin));
+	set_64bit_val(qp_ctx, 136,
+		      FIELD_PREP(IRDMAQPC_TXCQNUM, info->send_cq_num) |
+		      FIELD_PREP(IRDMAQPC_RXCQNUM, info->rcv_cq_num));
+	set_64bit_val(qp_ctx, 144,
+		      FIELD_PREP(IRDMAQPC_STAT_INDEX, info->stats_idx));
+	set_64bit_val(qp_ctx, 152, ether_addr_to_u64(roce_info->mac_addr) << 16);
+	set_64bit_val(qp_ctx, 160,
+		      FIELD_PREP(IRDMAQPC_ORDSIZE, roce_info->ord_size) |
+		      FIELD_PREP(IRDMAQPC_IRDSIZE, irdma_sc_get_encoded_ird_size(roce_info->ird_size)) |
+		      FIELD_PREP(IRDMAQPC_WRRDRSPOK, roce_info->wr_rdresp_en) |
+		      FIELD_PREP(IRDMAQPC_RDOK, roce_info->rd_en) |
+		      FIELD_PREP(IRDMAQPC_USESTATSINSTANCE, info->stats_idx_valid) |
+		      FIELD_PREP(IRDMAQPC_BINDEN, roce_info->bind_en) |
+		      FIELD_PREP(IRDMAQPC_FASTREGEN, roce_info->fast_reg_en) |
+		      FIELD_PREP(IRDMAQPC_DCQCNENABLE, roce_info->dcqcn_en) |
+		      FIELD_PREP(IRDMAQPC_RCVNOICRC, roce_info->rcv_no_icrc) |
+		      FIELD_PREP(IRDMAQPC_FW_CC_ENABLE, roce_info->fw_cc_enable) |
+		      FIELD_PREP(IRDMAQPC_UDPRIVCQENABLE, roce_info->udprivcq_en) |
+		      FIELD_PREP(IRDMAQPC_PRIVEN, roce_info->priv_mode_en) |
+		      FIELD_PREP(IRDMAQPC_TIMELYENABLE, roce_info->timely_en));
+	set_64bit_val(qp_ctx, 168,
+		      FIELD_PREP(IRDMAQPC_QPCOMPCTX, info->qp_compl_ctx));
+	set_64bit_val(qp_ctx, 176,
+		      FIELD_PREP(IRDMAQPC_SQTPHVAL, qp->sq_tph_val) |
+		      FIELD_PREP(IRDMAQPC_RQTPHVAL, qp->rq_tph_val) |
+		      FIELD_PREP(IRDMAQPC_QSHANDLE, qp->qs_handle));
+	set_64bit_val(qp_ctx, 184,
+		      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR3, udp->local_ipaddr[3]) |
+		      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR2, udp->local_ipaddr[2]));
+	set_64bit_val(qp_ctx, 192,
+		      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR1, udp->local_ipaddr[1]) |
+		      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR0, udp->local_ipaddr[0]));
+	set_64bit_val(qp_ctx, 200,
+		      FIELD_PREP(IRDMAQPC_THIGH, roce_info->t_high) |
+		      FIELD_PREP(IRDMAQPC_TLOW, roce_info->t_low));
+	set_64bit_val(qp_ctx, 208,
+		      FIELD_PREP(IRDMAQPC_REMENDPOINTIDX, info->rem_endpoint_idx));
+
+	print_hex_dump_debug("WQE: QP_HOST CTX WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, qp_ctx, IRDMA_QP_CTX_SIZE, false);
+}
+
+/* irdma_sc_alloc_local_mac_entry - allocate a mac entry
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_alloc_local_mac_entry(struct irdma_sc_cqp *cqp, u64 scratch,
+			       bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	wqe = cqp->dev->cqp_ops->cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE,
+			 IRDMA_CQP_OP_ALLOCATE_LOC_MAC_TABLE_ENTRY) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: ALLOCATE_LOCAL_MAC WQE",
+			     DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+
+	if (post_sq)
+		cqp->dev->cqp_ops->cqp_post_sq(cqp);
+	return 0;
+}
+
+/**
+ * irdma_sc_add_local_mac_entry - add mac enry
+ * @cqp: struct for cqp hw
+ * @info:mac addr info
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_add_local_mac_entry(struct irdma_sc_cqp *cqp,
+			     struct irdma_local_mac_entry_info *info,
+			     u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	u64 header;
+
+	wqe = cqp->dev->cqp_ops->cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 32, ether_addr_to_u64(info->mac_addr));
+
+	header = FIELD_PREP(IRDMA_CQPSQ_MLM_TABLEIDX, info->entry_idx) |
+		 FIELD_PREP(IRDMA_CQPSQ_OPCODE,
+			    IRDMA_CQP_OP_MANAGE_LOC_MAC_TABLE) |
+		 FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, header);
+
+	print_hex_dump_debug("WQE: ADD_LOCAL_MAC WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+
+	if (post_sq)
+		cqp->dev->cqp_ops->cqp_post_sq(cqp);
+	return 0;
+}
+
+/**
+ * irdma_sc_del_local_mac_entry - cqp wqe to dele local mac
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @entry_idx: index of mac entry
+ * @ignore_ref_count: to force mac adde delete
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_del_local_mac_entry(struct irdma_sc_cqp *cqp, u64 scratch,
+			     u16 entry_idx, u8 ignore_ref_count, bool post_sq)
+{
+	__le64 *wqe;
+	u64 header;
+
+	wqe = cqp->dev->cqp_ops->cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	header = FIELD_PREP(IRDMA_CQPSQ_MLM_TABLEIDX, entry_idx) |
+		 FIELD_PREP(IRDMA_CQPSQ_OPCODE,
+			    IRDMA_CQP_OP_MANAGE_LOC_MAC_TABLE) |
+		 FIELD_PREP(IRDMA_CQPSQ_MLM_FREEENTRY, 1) |
+		 FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity) |
+		 FIELD_PREP(IRDMA_CQPSQ_MLM_IGNORE_REF_CNT, ignore_ref_count);
+
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, header);
+
+	print_hex_dump_debug("WQE: DEL_LOCAL_MAC_IPADDR WQE",
+			     DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+
+	if (post_sq)
+		cqp->dev->cqp_ops->cqp_post_sq(cqp);
+	return 0;
+}
+
+/**
+ * irdma_sc_qp_setctx - set qp's context
+ * @qp: sc qp
+ * @qp_ctx: context ptr
+ * @info: ctx info
+ */
+static void irdma_sc_qp_setctx(struct irdma_sc_qp *qp, __le64 *qp_ctx,
+			       struct irdma_qp_host_ctx_info *info)
+{
+	struct irdma_iwarp_offload_info *iw;
+	struct irdma_tcp_offload_info *tcp;
+	struct irdma_sc_dev *dev;
+	u8 push_mode_en;
+	u32 push_idx;
+	u64 qw0, qw3, qw7 = 0, qw16 = 0;
+	u64 mac = 0;
+
+	iw = info->iwarp_info;
+	tcp = info->tcp_info;
+	dev = qp->dev;
+	if (iw->rcv_mark_en) {
+		qp->pfpdu.marker_len = 4;
+		qp->pfpdu.rcv_start_seq = tcp->rcv_nxt;
+	}
+	qp->user_pri = info->user_pri;
+	if (qp->push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX) {
+		push_mode_en = 0;
+		push_idx = 0;
+	} else {
+		push_mode_en = 1;
+		push_idx = qp->push_idx;
+	}
+	qw0 = FIELD_PREP(IRDMAQPC_RQWQESIZE, qp->qp_uk.rq_wqe_size) |
+	      FIELD_PREP(IRDMAQPC_RCVTPHEN, qp->rcv_tph_en) |
+	      FIELD_PREP(IRDMAQPC_XMITTPHEN, qp->xmit_tph_en) |
+	      FIELD_PREP(IRDMAQPC_RQTPHEN, qp->rq_tph_en) |
+	      FIELD_PREP(IRDMAQPC_SQTPHEN, qp->sq_tph_en) |
+	      FIELD_PREP(IRDMAQPC_PPIDX, push_idx) |
+	      FIELD_PREP(IRDMAQPC_PMENA, push_mode_en);
+
+	set_64bit_val(qp_ctx, 8, qp->sq_pa);
+	set_64bit_val(qp_ctx, 16, qp->rq_pa);
+
+	qw3 = FIELD_PREP(IRDMAQPC_RQSIZE, qp->hw_rq_size) |
+	      FIELD_PREP(IRDMAQPC_SQSIZE, qp->hw_sq_size);
+	if (dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+		qw3 |= FIELD_PREP(IRDMAQPC_GEN1_SRCMACADDRIDX,
+				  qp->src_mac_addr_idx);
+	set_64bit_val(qp_ctx, 136,
+		      FIELD_PREP(IRDMAQPC_TXCQNUM, info->send_cq_num) |
+		      FIELD_PREP(IRDMAQPC_RXCQNUM, info->rcv_cq_num));
+	set_64bit_val(qp_ctx, 168,
+		      FIELD_PREP(IRDMAQPC_QPCOMPCTX, info->qp_compl_ctx));
+	set_64bit_val(qp_ctx, 176,
+		      FIELD_PREP(IRDMAQPC_SQTPHVAL, qp->sq_tph_val) |
+		      FIELD_PREP(IRDMAQPC_RQTPHVAL, qp->rq_tph_val) |
+		      FIELD_PREP(IRDMAQPC_QSHANDLE, qp->qs_handle) |
+		      FIELD_PREP(IRDMAQPC_EXCEPTION_LAN_QUEUE, qp->ieq_qp));
+	if (info->iwarp_info_valid) {
+		qw0 |= FIELD_PREP(IRDMAQPC_DDP_VER, iw->ddp_ver) |
+		       FIELD_PREP(IRDMAQPC_RDMAP_VER, iw->rdmap_ver) |
+		       FIELD_PREP(IRDMAQPC_DC_TCP_EN, iw->dctcp_en) |
+		       FIELD_PREP(IRDMAQPC_ECN_EN, iw->ecn_en) |
+		       FIELD_PREP(IRDMAQPC_IBRDENABLE, iw->ib_rd_en) |
+		       FIELD_PREP(IRDMAQPC_PDIDXHI, iw->pd_id >> 16) |
+		       FIELD_PREP(IRDMAQPC_ERR_RQ_IDX_VALID,
+				  iw->err_rq_idx_valid);
+		qw7 |= FIELD_PREP(IRDMAQPC_PDIDX, iw->pd_id);
+		qw16 |= FIELD_PREP(IRDMAQPC_ERR_RQ_IDX, iw->err_rq_idx) |
+			FIELD_PREP(IRDMAQPC_RTOMIN, iw->rtomin);
+		set_64bit_val(qp_ctx, 144,
+			      FIELD_PREP(IRDMAQPC_Q2ADDR, qp->q2_pa >> 8) |
+			      FIELD_PREP(IRDMAQPC_STAT_INDEX, info->stats_idx));
+
+		if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+			mac = ether_addr_to_u64(iw->mac_addr);
+
+		set_64bit_val(qp_ctx, 152,
+			      mac << 16 | FIELD_PREP(IRDMAQPC_LASTBYTESENT, iw->last_byte_sent));
+		set_64bit_val(qp_ctx, 160,
+			      FIELD_PREP(IRDMAQPC_ORDSIZE, iw->ord_size) |
+			      FIELD_PREP(IRDMAQPC_IRDSIZE, irdma_sc_get_encoded_ird_size(iw->ird_size)) |
+			      FIELD_PREP(IRDMAQPC_WRRDRSPOK, iw->wr_rdresp_en) |
+			      FIELD_PREP(IRDMAQPC_RDOK, iw->rd_en) |
+			      FIELD_PREP(IRDMAQPC_SNDMARKERS, iw->snd_mark_en) |
+			      FIELD_PREP(IRDMAQPC_BINDEN, iw->bind_en) |
+			      FIELD_PREP(IRDMAQPC_FASTREGEN, iw->fast_reg_en) |
+			      FIELD_PREP(IRDMAQPC_PRIVEN, iw->priv_mode_en) |
+			      FIELD_PREP(IRDMAQPC_USESTATSINSTANCE, info->stats_idx_valid) |
+			      FIELD_PREP(IRDMAQPC_IWARPMODE, 1) |
+			      FIELD_PREP(IRDMAQPC_RCVMARKERS, iw->rcv_mark_en) |
+			      FIELD_PREP(IRDMAQPC_ALIGNHDRS, iw->align_hdrs) |
+			      FIELD_PREP(IRDMAQPC_RCVNOMPACRC, iw->rcv_no_mpa_crc) |
+			      FIELD_PREP(IRDMAQPC_RCVMARKOFFSET, iw->rcv_mark_offset || !tcp ? iw->rcv_mark_offset : tcp->rcv_nxt) |
+			      FIELD_PREP(IRDMAQPC_SNDMARKOFFSET, iw->snd_mark_offset || !tcp ? iw->snd_mark_offset : tcp->snd_nxt) |
+			      FIELD_PREP(IRDMAQPC_TIMELYENABLE, iw->timely_en));
+	}
+	if (info->tcp_info_valid) {
+		qw0 |= FIELD_PREP(IRDMAQPC_IPV4, tcp->ipv4) |
+		       FIELD_PREP(IRDMAQPC_NONAGLE, tcp->no_nagle) |
+		       FIELD_PREP(IRDMAQPC_INSERTVLANTAG,
+				  tcp->insert_vlan_tag) |
+		       FIELD_PREP(IRDMAQPC_TIMESTAMP, tcp->time_stamp) |
+		       FIELD_PREP(IRDMAQPC_LIMIT, tcp->cwnd_inc_limit) |
+		       FIELD_PREP(IRDMAQPC_DROPOOOSEG, tcp->drop_ooo_seg) |
+		       FIELD_PREP(IRDMAQPC_DUPACK_THRESH, tcp->dup_ack_thresh);
+
+		if ((iw->ecn_en || iw->dctcp_en) && !(tcp->tos & 0x03))
+			tcp->tos |= ECN_CODE_PT_VAL;
+
+		qw3 |= FIELD_PREP(IRDMAQPC_TTL, tcp->ttl) |
+		       FIELD_PREP(IRDMAQPC_AVOIDSTRETCHACK, tcp->avoid_stretch_ack) |
+		       FIELD_PREP(IRDMAQPC_TOS, tcp->tos) |
+		       FIELD_PREP(IRDMAQPC_SRCPORTNUM, tcp->src_port) |
+		       FIELD_PREP(IRDMAQPC_DESTPORTNUM, tcp->dst_port);
+		if (dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1) {
+			qw3 |= FIELD_PREP(IRDMAQPC_GEN1_SRCMACADDRIDX, tcp->src_mac_addr_idx);
+
+			qp->src_mac_addr_idx = tcp->src_mac_addr_idx;
+		}
+		set_64bit_val(qp_ctx, 32,
+			      FIELD_PREP(IRDMAQPC_DESTIPADDR2, tcp->dest_ip_addr[2]) |
+			      FIELD_PREP(IRDMAQPC_DESTIPADDR3, tcp->dest_ip_addr[3]));
+		set_64bit_val(qp_ctx, 40,
+			      FIELD_PREP(IRDMAQPC_DESTIPADDR0, tcp->dest_ip_addr[0]) |
+			      FIELD_PREP(IRDMAQPC_DESTIPADDR1, tcp->dest_ip_addr[1]));
+		set_64bit_val(qp_ctx, 48,
+			      FIELD_PREP(IRDMAQPC_SNDMSS, tcp->snd_mss) |
+			      FIELD_PREP(IRDMAQPC_SYN_RST_HANDLING, tcp->syn_rst_handling) |
+			      FIELD_PREP(IRDMAQPC_VLANTAG, tcp->vlan_tag) |
+			      FIELD_PREP(IRDMAQPC_ARPIDX, tcp->arp_idx));
+		qw7 |= FIELD_PREP(IRDMAQPC_FLOWLABEL, tcp->flow_label) |
+		       FIELD_PREP(IRDMAQPC_WSCALE, tcp->wscale) |
+		       FIELD_PREP(IRDMAQPC_IGNORE_TCP_OPT,
+				  tcp->ignore_tcp_opt) |
+		       FIELD_PREP(IRDMAQPC_IGNORE_TCP_UNS_OPT,
+				  tcp->ignore_tcp_uns_opt) |
+		       FIELD_PREP(IRDMAQPC_TCPSTATE, tcp->tcp_state) |
+		       FIELD_PREP(IRDMAQPC_RCVSCALE, tcp->rcv_wscale) |
+		       FIELD_PREP(IRDMAQPC_SNDSCALE, tcp->snd_wscale);
+		set_64bit_val(qp_ctx, 72,
+			      FIELD_PREP(IRDMAQPC_TIMESTAMP_RECENT, tcp->time_stamp_recent) |
+			      FIELD_PREP(IRDMAQPC_TIMESTAMP_AGE, tcp->time_stamp_age));
+		set_64bit_val(qp_ctx, 80,
+			      FIELD_PREP(IRDMAQPC_SNDNXT, tcp->snd_nxt) |
+			      FIELD_PREP(IRDMAQPC_SNDWND, tcp->snd_wnd));
+		set_64bit_val(qp_ctx, 88,
+			      FIELD_PREP(IRDMAQPC_RCVNXT, tcp->rcv_nxt) |
+			      FIELD_PREP(IRDMAQPC_RCVWND, tcp->rcv_wnd));
+		set_64bit_val(qp_ctx, 96,
+			      FIELD_PREP(IRDMAQPC_SNDMAX, tcp->snd_max) |
+			      FIELD_PREP(IRDMAQPC_SNDUNA, tcp->snd_una));
+		set_64bit_val(qp_ctx, 104,
+			      FIELD_PREP(IRDMAQPC_SRTT, tcp->srtt) |
+			      FIELD_PREP(IRDMAQPC_RTTVAR, tcp->rtt_var));
+		set_64bit_val(qp_ctx, 112,
+			      FIELD_PREP(IRDMAQPC_SSTHRESH, tcp->ss_thresh) |
+			      FIELD_PREP(IRDMAQPC_CWND, tcp->cwnd));
+		set_64bit_val(qp_ctx, 120,
+			      FIELD_PREP(IRDMAQPC_SNDWL1, tcp->snd_wl1) |
+			      FIELD_PREP(IRDMAQPC_SNDWL2, tcp->snd_wl2));
+		qw16 |= FIELD_PREP(IRDMAQPC_MAXSNDWND, tcp->max_snd_window) |
+			FIELD_PREP(IRDMAQPC_REXMIT_THRESH, tcp->rexmit_thresh);
+		set_64bit_val(qp_ctx, 184,
+			      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR3, tcp->local_ipaddr[3]) |
+			      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR2, tcp->local_ipaddr[2]));
+		set_64bit_val(qp_ctx, 192,
+			      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR1, tcp->local_ipaddr[1]) |
+			      FIELD_PREP(IRDMAQPC_LOCAL_IPADDR0, tcp->local_ipaddr[0]));
+		set_64bit_val(qp_ctx, 200,
+			      FIELD_PREP(IRDMAQPC_THIGH, iw->t_high) |
+			      FIELD_PREP(IRDMAQPC_TLOW, iw->t_low));
+		set_64bit_val(qp_ctx, 208,
+			      FIELD_PREP(IRDMAQPC_REMENDPOINTIDX, info->rem_endpoint_idx));
+	}
+
+	set_64bit_val(qp_ctx, 0, qw0);
+	set_64bit_val(qp_ctx, 24, qw3);
+	set_64bit_val(qp_ctx, 56, qw7);
+	set_64bit_val(qp_ctx, 128, qw16);
+
+	print_hex_dump_debug("WQE: QP_HOST CTX", DUMP_PREFIX_OFFSET, 16, 8,
+			     qp_ctx, IRDMA_QP_CTX_SIZE, false);
+}
+
+/**
+ * irdma_sc_alloc_stag - mr stag alloc
+ * @dev: sc device struct
+ * @info: stag info
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_alloc_stag(struct irdma_sc_dev *dev,
+		    struct irdma_allocate_stag_info *info, u64 scratch,
+		    bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+	enum irdma_page_size page_size;
+
+	if (info->page_size == 0x40000000)
+		page_size = IRDMA_PAGE_SIZE_1G;
+	else if (info->page_size == 0x200000)
+		page_size = IRDMA_PAGE_SIZE_2M;
+	else
+		page_size = IRDMA_PAGE_SIZE_4K;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 8,
+		      FLD_LS_64(dev, info->pd_id, IRDMA_CQPSQ_STAG_PDID) |
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_STAGLEN, info->total_len));
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_IDX, info->stag_idx));
+	set_64bit_val(wqe, 40,
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_HMCFNIDX, info->hmc_fcn_index));
+
+	if (info->chunk_size)
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_CQPSQ_STAG_FIRSTPMPBLIDX, info->first_pm_pbl_idx));
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_ALLOC_STAG) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_MR, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_ARIGHTS, info->access_rights) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_LPBLSIZE, info->chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_HPAGESIZE, page_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_REMACCENABLED, info->remote_access) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_USEHMCFNIDX, info->use_hmc_fcn_index) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_USEPFRID, info->use_pf_rid) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: ALLOC_STAG WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_mr_reg_non_shared - non-shared mr registration
+ * @dev: sc device struct
+ * @info: mr info
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_mr_reg_non_shared(struct irdma_sc_dev *dev,
+			   struct irdma_reg_ns_stag_info *info, u64 scratch,
+			   bool post_sq)
+{
+	__le64 *wqe;
+	u64 fbo;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+	u32 pble_obj_cnt;
+	bool remote_access;
+	u8 addr_type;
+	enum irdma_page_size page_size;
+
+	if (info->page_size == 0x40000000)
+		page_size = IRDMA_PAGE_SIZE_1G;
+	else if (info->page_size == 0x200000)
+		page_size = IRDMA_PAGE_SIZE_2M;
+	else if (info->page_size == 0x1000)
+		page_size = IRDMA_PAGE_SIZE_4K;
+	else
+		return IRDMA_ERR_PARAM;
+
+	if (info->access_rights & (IRDMA_ACCESS_FLAGS_REMOTEREAD_ONLY |
+				   IRDMA_ACCESS_FLAGS_REMOTEWRITE_ONLY))
+		remote_access = true;
+	else
+		remote_access = false;
+
+	pble_obj_cnt = dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt;
+	if (info->chunk_size && info->first_pm_pbl_index >= pble_obj_cnt)
+		return IRDMA_ERR_INVALID_PBLE_INDEX;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	fbo = info->va & (info->page_size - 1);
+
+	set_64bit_val(wqe, 0,
+		      (info->addr_type == IRDMA_ADDR_TYPE_VA_BASED ?
+		      info->va : fbo));
+	set_64bit_val(wqe, 8,
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_STAGLEN, info->total_len) |
+		      FLD_LS_64(dev, info->pd_id, IRDMA_CQPSQ_STAG_PDID));
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_KEY, info->stag_key) |
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_IDX, info->stag_idx));
+	if (!info->chunk_size) {
+		set_64bit_val(wqe, 32, info->reg_addr_pa);
+		set_64bit_val(wqe, 48, 0);
+	} else {
+		set_64bit_val(wqe, 32, 0);
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_CQPSQ_STAG_FIRSTPMPBLIDX, info->first_pm_pbl_index));
+	}
+	set_64bit_val(wqe, 40, info->hmc_fcn_index);
+	set_64bit_val(wqe, 56, 0);
+
+	addr_type = (info->addr_type == IRDMA_ADDR_TYPE_VA_BASED) ? 1 : 0;
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_REG_MR) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_MR, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_LPBLSIZE, info->chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_HPAGESIZE, page_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_ARIGHTS, info->access_rights) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_REMACCENABLED, remote_access) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_VABASEDTO, addr_type) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_USEHMCFNIDX, info->use_hmc_fcn_index) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_USEPFRID, info->use_pf_rid) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: MR_REG_NS WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_mr_reg_shared - registered shared memory region
+ * @dev: sc device struct
+ * @info: info for shared memory registration
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_mr_reg_shared(struct irdma_sc_dev *dev,
+		       struct irdma_register_shared_stag *info, u64 scratch,
+		       bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 temp, fbo, hdr;
+	bool remote_access;
+	u8 addr_type;
+
+	if (info->access_rights & (IRDMA_ACCESS_FLAGS_REMOTEREAD_ONLY |
+				   IRDMA_ACCESS_FLAGS_REMOTEWRITE_ONLY))
+		remote_access = true;
+	else
+		remote_access = false;
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	fbo = info->va & (info->page_size - 1);
+
+	set_64bit_val(wqe, 0,
+		      (info->addr_type == IRDMA_ADDR_TYPE_VA_BASED ?
+		       info->va : fbo));
+	set_64bit_val(wqe, 8,
+		      FLD_LS_64(dev, info->pd_id, IRDMA_CQPSQ_STAG_PDID));
+	temp = FIELD_PREP(IRDMA_CQPSQ_STAG_KEY, info->new_stag_key) |
+	       FIELD_PREP(IRDMA_CQPSQ_STAG_IDX, info->new_stag_idx) |
+	       FIELD_PREP(IRDMA_CQPSQ_STAG_PARENTSTAGIDX,
+			  info->parent_stag_idx);
+	set_64bit_val(wqe, 16, temp);
+
+	addr_type = (info->addr_type == IRDMA_ADDR_TYPE_VA_BASED) ? 1 : 0;
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_REG_SMR) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_MR, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_ARIGHTS, info->access_rights) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_REMACCENABLED, remote_access) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_VABASEDTO, addr_type) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: MR_REG_SHARED WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_dealloc_stag - deallocate stag
+ * @dev: sc device struct
+ * @info: dealloc stag info
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_dealloc_stag(struct irdma_sc_dev *dev,
+		      struct irdma_dealloc_stag_info *info, u64 scratch,
+		      bool post_sq)
+{
+	u64 hdr;
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 8,
+		      FLD_LS_64(dev, info->pd_id, IRDMA_CQPSQ_STAG_PDID));
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_IDX, info->stag_idx));
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DEALLOC_STAG) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_MR, info->mr) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: DEALLOC_STAG WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_query_stag - query hardware for stag
+ * @dev: sc device struct
+ * @scratch: u64 saved to be used during cqp completion
+ * @stag_index: stag index for query
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_query_stag(struct irdma_sc_dev *dev,
+						  u64 scratch, u32 stag_index,
+						  bool post_sq)
+{
+	u64 hdr;
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_QUERYSTAG_IDX, stag_index));
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_QUERY_STAG) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QUERY_STAG WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_mw_alloc - mw allocate
+ * @dev: sc device struct
+ * @info: memory window allocation information
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_mw_alloc(struct irdma_sc_dev *dev, struct irdma_mw_alloc_info *info,
+		  u64 scratch, bool post_sq)
+{
+	u64 hdr;
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 8,
+		      FLD_LS_64(dev, info->pd_id, IRDMA_CQPSQ_STAG_PDID));
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_STAG_IDX, info->mw_stag_index));
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_ALLOC_STAG) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_MWTYPE, info->mw_wide) |
+	      FIELD_PREP(IRDMA_CQPSQ_STAG_MW1_BIND_DONT_VLDT_KEY,
+			 info->mw1_bind_dont_vldt_key) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: MW_ALLOC WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_mr_fast_register - Posts RDMA fast register mr WR to iwarp qp
+ * @qp: sc qp struct
+ * @info: fast mr info
+ * @post_sq: flag for cqp db to ring
+ */
+enum irdma_status_code
+irdma_sc_mr_fast_register(struct irdma_sc_qp *qp,
+			  struct irdma_fast_reg_stag_info *info, bool post_sq)
+{
+	u64 temp, hdr;
+	__le64 *wqe;
+	u32 wqe_idx;
+	enum irdma_page_size page_size;
+	struct irdma_post_sq_info sq_info = {};
+
+	if (info->page_size == 0x40000000)
+		page_size = IRDMA_PAGE_SIZE_1G;
+	else if (info->page_size == 0x200000)
+		page_size = IRDMA_PAGE_SIZE_2M;
+	else
+		page_size = IRDMA_PAGE_SIZE_4K;
+
+	sq_info.wr_id = info->wr_id;
+	sq_info.signaled = info->signaled;
+	sq_info.push_wqe = info->push_wqe;
+
+	wqe = irdma_qp_get_next_send_wqe(&qp->qp_uk, &wqe_idx,
+					 IRDMA_QP_WQE_MIN_QUANTA, 0, &sq_info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(&qp->qp_uk, wqe_idx);
+
+	ibdev_dbg(to_ibdev(qp->dev),
+		  "MR: wr_id[%llxh] wqe_idx[%04d] location[%p]\n",
+		  info->wr_id, wqe_idx,
+		  &qp->qp_uk.sq_wrtrk_array[wqe_idx].wrid);
+
+	temp = (info->addr_type == IRDMA_ADDR_TYPE_VA_BASED) ?
+		(uintptr_t)info->va : info->fbo;
+	set_64bit_val(wqe, 0, temp);
+
+	temp = FIELD_GET(IRDMAQPSQ_FIRSTPMPBLIDXHI,
+			 info->first_pm_pbl_index >> 16);
+	set_64bit_val(wqe, 8,
+		      FIELD_PREP(IRDMAQPSQ_FIRSTPMPBLIDXHI, temp) |
+		      FIELD_PREP(IRDMAQPSQ_PBLADDR >> IRDMA_HW_PAGE_SHIFT, info->reg_addr_pa));
+	set_64bit_val(wqe, 16,
+		      info->total_len |
+		      FIELD_PREP(IRDMAQPSQ_FIRSTPMPBLIDXLO, info->first_pm_pbl_index));
+
+	hdr = FIELD_PREP(IRDMAQPSQ_STAGKEY, info->stag_key) |
+	      FIELD_PREP(IRDMAQPSQ_STAGINDEX, info->stag_idx) |
+	      FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_FAST_REGISTER) |
+	      FIELD_PREP(IRDMAQPSQ_LPBLSIZE, info->chunk_size) |
+	      FIELD_PREP(IRDMAQPSQ_HPAGESIZE, page_size) |
+	      FIELD_PREP(IRDMAQPSQ_STAGRIGHTS, info->access_rights) |
+	      FIELD_PREP(IRDMAQPSQ_VABASEDTO, info->addr_type) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, (sq_info.push_wqe ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: FAST_REG WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_QP_WQE_MIN_SIZE, false);
+	if (sq_info.push_wqe) {
+		irdma_qp_push_wqe(&qp->qp_uk, wqe, IRDMA_QP_WQE_MIN_QUANTA,
+				  wqe_idx, post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(&qp->qp_uk);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_sc_gen_rts_ae - request AE generated after RTS
+ * @qp: sc qp struct
+ */
+static void irdma_sc_gen_rts_ae(struct irdma_sc_qp *qp)
+{
+	__le64 *wqe;
+	u64 hdr;
+	struct irdma_qp_uk *qp_uk;
+
+	qp_uk = &qp->qp_uk;
+
+	wqe = qp_uk->sq_base[1].elem;
+
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_NOP) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, 1) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+	print_hex_dump_debug("QP: NOP W/LOCAL FENCE WQE", DUMP_PREFIX_OFFSET,
+			     16, 8, wqe, IRDMA_QP_WQE_MIN_SIZE, false);
+
+	wqe = qp_uk->sq_base[2].elem;
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_GEN_RTS_AE) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+	print_hex_dump_debug("QP: CONN EST WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_QP_WQE_MIN_SIZE, false);
+}
+
+/**
+ * irdma_sc_send_lsmm - send last streaming mode message
+ * @qp: sc qp struct
+ * @lsmm_buf: buffer with lsmm message
+ * @size: size of lsmm buffer
+ * @stag: stag of lsmm buffer
+ */
+static void irdma_sc_send_lsmm(struct irdma_sc_qp *qp, void *lsmm_buf, u32 size,
+			       irdma_stag stag)
+{
+	__le64 *wqe;
+	u64 hdr;
+	struct irdma_qp_uk *qp_uk;
+
+	qp_uk = &qp->qp_uk;
+	wqe = qp_uk->sq_base->elem;
+
+	set_64bit_val(wqe, 0, (uintptr_t)lsmm_buf);
+	if (qp->qp_uk.uk_attrs->hw_rev == IRDMA_GEN_1) {
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_LEN, size) |
+			      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_STAG, stag));
+	} else {
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_FRAG_LEN, size) |
+			      FIELD_PREP(IRDMAQPSQ_FRAG_STAG, stag) |
+			      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity));
+	}
+	set_64bit_val(wqe, 16, 0);
+
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_RDMA_SEND) |
+	      FIELD_PREP(IRDMAQPSQ_STREAMMODE, 1) |
+	      FIELD_PREP(IRDMAQPSQ_WAITFORRCVPDU, 1) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: SEND_LSMM WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_QP_WQE_MIN_SIZE, false);
+
+	if (qp->dev->hw_attrs.uk_attrs.feature_flags & IRDMA_FEATURE_RTS_AE)
+		irdma_sc_gen_rts_ae(qp);
+}
+
+/**
+ * irdma_sc_send_lsmm_nostag - for privilege qp
+ * @qp: sc qp struct
+ * @lsmm_buf: buffer with lsmm message
+ * @size: size of lsmm buffer
+ */
+static void irdma_sc_send_lsmm_nostag(struct irdma_sc_qp *qp, void *lsmm_buf,
+				      u32 size)
+{
+	__le64 *wqe;
+	u64 hdr;
+	struct irdma_qp_uk *qp_uk;
+
+	qp_uk = &qp->qp_uk;
+	wqe = qp_uk->sq_base->elem;
+
+	set_64bit_val(wqe, 0, (uintptr_t)lsmm_buf);
+
+	if (qp->qp_uk.uk_attrs->hw_rev == IRDMA_GEN_1)
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_LEN, size));
+	else
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_FRAG_LEN, size) |
+			      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity));
+	set_64bit_val(wqe, 16, 0);
+
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_RDMA_SEND) |
+	      FIELD_PREP(IRDMAQPSQ_STREAMMODE, 1) |
+	      FIELD_PREP(IRDMAQPSQ_WAITFORRCVPDU, 1) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: SEND_LSMM_NOSTAG WQE", DUMP_PREFIX_OFFSET,
+			     16, 8, wqe, IRDMA_QP_WQE_MIN_SIZE, false);
+}
+
+/**
+ * irdma_sc_send_rtt - send last read0 or write0
+ * @qp: sc qp struct
+ * @read: Do read0 or write0
+ */
+static void irdma_sc_send_rtt(struct irdma_sc_qp *qp, bool read)
+{
+	__le64 *wqe;
+	u64 hdr;
+	struct irdma_qp_uk *qp_uk;
+
+	qp_uk = &qp->qp_uk;
+	wqe = qp_uk->sq_base->elem;
+
+	set_64bit_val(wqe, 0, 0);
+	set_64bit_val(wqe, 16, 0);
+	if (read) {
+		if (qp->qp_uk.uk_attrs->hw_rev == IRDMA_GEN_1) {
+			set_64bit_val(wqe, 8,
+				      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_STAG, 0xabcd));
+		} else {
+			set_64bit_val(wqe, 8,
+				      (u64)0xabcd | FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity));
+		}
+		hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, 0x1234) |
+		      FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_RDMA_READ) |
+		      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity);
+
+	} else {
+		if (qp->qp_uk.uk_attrs->hw_rev == IRDMA_GEN_1) {
+			set_64bit_val(wqe, 8, 0);
+		} else {
+			set_64bit_val(wqe, 8,
+				      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity));
+		}
+		hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_RDMA_WRITE) |
+		      FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity);
+	}
+
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: RTR WQE", DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_QP_WQE_MIN_SIZE, false);
+
+	if (qp->dev->hw_attrs.uk_attrs.feature_flags & IRDMA_FEATURE_RTS_AE)
+		irdma_sc_gen_rts_ae(qp);
+}
+
+/**
+ * irdma_iwarp_opcode - determine if incoming is rdma layer
+ * @info: aeq info for the packet
+ * @pkt: packet for error
+ */
+static u32 irdma_iwarp_opcode(struct irdma_aeqe_info *info, u8 *pkt)
+{
+	__be16 *mpa;
+	u32 opcode = 0xffffffff;
+
+	if (info->q2_data_written) {
+		mpa = (__be16 *)pkt;
+		opcode = ntohs(mpa[1]) & 0xf;
+	}
+
+	return opcode;
+}
+
+/**
+ * irdma_locate_mpa - return pointer to mpa in the pkt
+ * @pkt: packet with data
+ */
+static u8 *irdma_locate_mpa(u8 *pkt)
+{
+	/* skip over ethernet header */
+	pkt += IRDMA_MAC_HLEN;
+
+	/* Skip over IP and TCP headers */
+	pkt += 4 * (pkt[0] & 0x0f);
+	pkt += 4 * ((pkt[12] >> 4) & 0x0f);
+
+	return pkt;
+}
+
+/**
+ * irdma_bld_termhdr_ctrl - setup terminate hdr control fields
+ * @qp: sc qp ptr for pkt
+ * @hdr: term hdr
+ * @opcode: flush opcode for termhdr
+ * @layer_etype: error layer + error type
+ * @err: error cod ein the header
+ */
+static void irdma_bld_termhdr_ctrl(struct irdma_sc_qp *qp,
+				   struct irdma_terminate_hdr *hdr,
+				   enum irdma_flush_opcode opcode,
+				   u8 layer_etype, u8 err)
+{
+	qp->flush_code = opcode;
+	hdr->layer_etype = layer_etype;
+	hdr->error_code = err;
+}
+
+/**
+ * irdma_bld_termhdr_ddp_rdma - setup ddp and rdma hdrs in terminate hdr
+ * @pkt: ptr to mpa in offending pkt
+ * @hdr: term hdr
+ * @copy_len: offending pkt length to be copied to term hdr
+ * @is_tagged: DDP tagged or untagged
+ */
+static void irdma_bld_termhdr_ddp_rdma(u8 *pkt, struct irdma_terminate_hdr *hdr,
+				       int *copy_len, u8 *is_tagged)
+{
+	u16 ddp_seg_len;
+
+	ddp_seg_len = ntohs(*(__be16 *)pkt);
+	if (ddp_seg_len) {
+		*copy_len = 2;
+		hdr->hdrct = DDP_LEN_FLAG;
+		if (pkt[2] & 0x80) {
+			*is_tagged = 1;
+			if (ddp_seg_len >= TERM_DDP_LEN_TAGGED) {
+				*copy_len += TERM_DDP_LEN_TAGGED;
+				hdr->hdrct |= DDP_HDR_FLAG;
+			}
+		} else {
+			if (ddp_seg_len >= TERM_DDP_LEN_UNTAGGED) {
+				*copy_len += TERM_DDP_LEN_UNTAGGED;
+				hdr->hdrct |= DDP_HDR_FLAG;
+			}
+			if (ddp_seg_len >= (TERM_DDP_LEN_UNTAGGED + TERM_RDMA_LEN) &&
+			    ((pkt[3] & RDMA_OPCODE_M) == RDMA_READ_REQ_OPCODE)) {
+				*copy_len += TERM_RDMA_LEN;
+				hdr->hdrct |= RDMA_HDR_FLAG;
+			}
+		}
+	}
+}
+
+/**
+ * irdma_bld_terminate_hdr - build terminate message header
+ * @qp: qp associated with received terminate AE
+ * @info: the struct contiaing AE information
+ */
+static int irdma_bld_terminate_hdr(struct irdma_sc_qp *qp,
+				   struct irdma_aeqe_info *info)
+{
+	u8 *pkt = qp->q2_buf + Q2_BAD_FRAME_OFFSET;
+	int copy_len = 0;
+	u8 is_tagged = 0;
+	u32 opcode;
+	struct irdma_terminate_hdr *termhdr;
+
+	termhdr = (struct irdma_terminate_hdr *)qp->q2_buf;
+	memset(termhdr, 0, Q2_BAD_FRAME_OFFSET);
+
+	if (info->q2_data_written) {
+		pkt = irdma_locate_mpa(pkt);
+		irdma_bld_termhdr_ddp_rdma(pkt, termhdr, &copy_len, &is_tagged);
+	}
+
+	opcode = irdma_iwarp_opcode(info, pkt);
+	qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC;
+	qp->sq_flush_code = info->sq;
+	qp->rq_flush_code = info->rq;
+
+	switch (info->ae_id) {
+	case IRDMA_AE_AMP_UNALLOCATED_STAG:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		if (opcode == IRDMA_OP_TYPE_RDMA_WRITE)
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_PROT_ERR,
+					       (LAYER_DDP << 4) | DDP_TAGGED_BUF,
+					       DDP_TAGGED_INV_STAG);
+		else
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+					       (LAYER_RDMA << 4) | RDMAP_REMOTE_PROT,
+					       RDMAP_INV_STAG);
+		break;
+	case IRDMA_AE_AMP_BOUNDS_VIOLATION:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		if (info->q2_data_written)
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_PROT_ERR,
+					       (LAYER_DDP << 4) | DDP_TAGGED_BUF,
+					       DDP_TAGGED_BOUNDS);
+		else
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+					       (LAYER_RDMA << 4) | RDMAP_REMOTE_PROT,
+					       RDMAP_INV_BOUNDS);
+		break;
+	case IRDMA_AE_AMP_BAD_PD:
+		switch (opcode) {
+		case IRDMA_OP_TYPE_RDMA_WRITE:
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_PROT_ERR,
+					       (LAYER_DDP << 4) | DDP_TAGGED_BUF,
+					       DDP_TAGGED_UNASSOC_STAG);
+			break;
+		case IRDMA_OP_TYPE_SEND_INV:
+		case IRDMA_OP_TYPE_SEND_SOL_INV:
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+					       (LAYER_RDMA << 4) | RDMAP_REMOTE_PROT,
+					       RDMAP_CANT_INV_STAG);
+			break;
+		default:
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+					       (LAYER_RDMA << 4) | RDMAP_REMOTE_PROT,
+					       RDMAP_UNASSOC_STAG);
+		}
+		break;
+	case IRDMA_AE_AMP_INVALID_STAG:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+				       (LAYER_RDMA << 4) | RDMAP_REMOTE_PROT,
+				       RDMAP_INV_STAG);
+		break;
+	case IRDMA_AE_AMP_BAD_QP:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_LOC_QP_OP_ERR,
+				       (LAYER_DDP << 4) | DDP_UNTAGGED_BUF,
+				       DDP_UNTAGGED_INV_QN);
+		break;
+	case IRDMA_AE_AMP_BAD_STAG_KEY:
+	case IRDMA_AE_AMP_BAD_STAG_INDEX:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		switch (opcode) {
+		case IRDMA_OP_TYPE_SEND_INV:
+		case IRDMA_OP_TYPE_SEND_SOL_INV:
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_OP_ERR,
+					       (LAYER_RDMA << 4) | RDMAP_REMOTE_OP,
+					       RDMAP_CANT_INV_STAG);
+			break;
+		default:
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+					       (LAYER_RDMA << 4) | RDMAP_REMOTE_OP,
+					       RDMAP_INV_STAG);
+		}
+		break;
+	case IRDMA_AE_AMP_RIGHTS_VIOLATION:
+	case IRDMA_AE_AMP_INVALIDATE_NO_REMOTE_ACCESS_RIGHTS:
+	case IRDMA_AE_PRIV_OPERATION_DENIED:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+				       (LAYER_RDMA << 4) | RDMAP_REMOTE_PROT,
+				       RDMAP_ACCESS);
+		break;
+	case IRDMA_AE_AMP_TO_WRAP:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_ACCESS_ERR,
+				       (LAYER_RDMA << 4) | RDMAP_REMOTE_PROT,
+				       RDMAP_TO_WRAP);
+		break;
+	case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_GENERAL_ERR,
+				       (LAYER_MPA << 4) | DDP_LLP, MPA_CRC);
+		break;
+	case IRDMA_AE_LLP_SEGMENT_TOO_SMALL:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_LOC_LEN_ERR,
+				       (LAYER_DDP << 4) | DDP_CATASTROPHIC,
+				       DDP_CATASTROPHIC_LOCAL);
+		break;
+	case IRDMA_AE_LCE_QP_CATASTROPHIC:
+	case IRDMA_AE_DDP_NO_L_BIT:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_FATAL_ERR,
+				       (LAYER_DDP << 4) | DDP_CATASTROPHIC,
+				       DDP_CATASTROPHIC_LOCAL);
+		break;
+	case IRDMA_AE_DDP_INVALID_MSN_GAP_IN_MSN:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_GENERAL_ERR,
+				       (LAYER_DDP << 4) | DDP_UNTAGGED_BUF,
+				       DDP_UNTAGGED_INV_MSN_RANGE);
+		break;
+	case IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER:
+		qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_LOC_LEN_ERR,
+				       (LAYER_DDP << 4) | DDP_UNTAGGED_BUF,
+				       DDP_UNTAGGED_INV_TOO_LONG);
+		break;
+	case IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION:
+		if (is_tagged)
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_GENERAL_ERR,
+					       (LAYER_DDP << 4) | DDP_TAGGED_BUF,
+					       DDP_TAGGED_INV_DDP_VER);
+		else
+			irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_GENERAL_ERR,
+					       (LAYER_DDP << 4) | DDP_UNTAGGED_BUF,
+					       DDP_UNTAGGED_INV_DDP_VER);
+		break;
+	case IRDMA_AE_DDP_UBE_INVALID_MO:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_GENERAL_ERR,
+				       (LAYER_DDP << 4) | DDP_UNTAGGED_BUF,
+				       DDP_UNTAGGED_INV_MO);
+		break;
+	case IRDMA_AE_DDP_UBE_INVALID_MSN_NO_BUFFER_AVAILABLE:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_REM_OP_ERR,
+				       (LAYER_DDP << 4) | DDP_UNTAGGED_BUF,
+				       DDP_UNTAGGED_INV_MSN_NO_BUF);
+		break;
+	case IRDMA_AE_DDP_UBE_INVALID_QN:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_GENERAL_ERR,
+				       (LAYER_DDP << 4) | DDP_UNTAGGED_BUF,
+				       DDP_UNTAGGED_INV_QN);
+		break;
+	case IRDMA_AE_RDMAP_ROE_INVALID_RDMAP_VERSION:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_GENERAL_ERR,
+				       (LAYER_RDMA << 4) | RDMAP_REMOTE_OP,
+				       RDMAP_INV_RDMAP_VER);
+		break;
+	default:
+		irdma_bld_termhdr_ctrl(qp, termhdr, FLUSH_FATAL_ERR,
+				       (LAYER_RDMA << 4) | RDMAP_REMOTE_OP,
+				       RDMAP_UNSPECIFIED);
+		break;
+	}
+
+	if (copy_len)
+		memcpy(termhdr + 1, pkt, copy_len);
+
+	return sizeof(struct irdma_terminate_hdr) + copy_len;
+}
+
+/**
+ * irdma_terminate_send_fin() - Send fin for terminate message
+ * @qp: qp associated with received terminate AE
+ */
+void irdma_terminate_send_fin(struct irdma_sc_qp *qp)
+{
+	irdma_term_modify_qp(qp, IRDMA_QP_STATE_TERMINATE,
+			     IRDMAQP_TERM_SEND_FIN_ONLY, 0);
+}
+
+/**
+ * irdma_terminate_connection() - Bad AE and send terminate to remote QP
+ * @qp: qp associated with received terminate AE
+ * @info: the struct contiaing AE information
+ */
+void irdma_terminate_connection(struct irdma_sc_qp *qp,
+				struct irdma_aeqe_info *info)
+{
+	u8 termlen = 0;
+
+	if (qp->term_flags & IRDMA_TERM_SENT)
+		return;
+
+	termlen = irdma_bld_terminate_hdr(qp, info);
+	irdma_terminate_start_timer(qp);
+	qp->term_flags |= IRDMA_TERM_SENT;
+	irdma_term_modify_qp(qp, IRDMA_QP_STATE_TERMINATE,
+			     IRDMAQP_TERM_SEND_TERM_ONLY, termlen);
+}
+
+/**
+ * irdma_terminate_received - handle terminate received AE
+ * @qp: qp associated with received terminate AE
+ * @info: the struct contiaing AE information
+ */
+void irdma_terminate_received(struct irdma_sc_qp *qp,
+			      struct irdma_aeqe_info *info)
+{
+	u8 *pkt = qp->q2_buf + Q2_BAD_FRAME_OFFSET;
+	__be32 *mpa;
+	u8 ddp_ctl;
+	u8 rdma_ctl;
+	u16 aeq_id = 0;
+	struct irdma_terminate_hdr *termhdr;
+
+	mpa = (__be32 *)irdma_locate_mpa(pkt);
+	if (info->q2_data_written) {
+		/* did not validate the frame - do it now */
+		ddp_ctl = (ntohl(mpa[0]) >> 8) & 0xff;
+		rdma_ctl = ntohl(mpa[0]) & 0xff;
+		if ((ddp_ctl & 0xc0) != 0x40)
+			aeq_id = IRDMA_AE_LCE_QP_CATASTROPHIC;
+		else if ((ddp_ctl & 0x03) != 1)
+			aeq_id = IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION;
+		else if (ntohl(mpa[2]) != 2)
+			aeq_id = IRDMA_AE_DDP_UBE_INVALID_QN;
+		else if (ntohl(mpa[3]) != 1)
+			aeq_id = IRDMA_AE_DDP_INVALID_MSN_GAP_IN_MSN;
+		else if (ntohl(mpa[4]) != 0)
+			aeq_id = IRDMA_AE_DDP_UBE_INVALID_MO;
+		else if ((rdma_ctl & 0xc0) != 0x40)
+			aeq_id = IRDMA_AE_RDMAP_ROE_INVALID_RDMAP_VERSION;
+
+		info->ae_id = aeq_id;
+		if (info->ae_id) {
+			/* Bad terminate recvd - send back a terminate */
+			irdma_terminate_connection(qp, info);
+			return;
+		}
+	}
+
+	qp->term_flags |= IRDMA_TERM_RCVD;
+	qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC;
+	termhdr = (struct irdma_terminate_hdr *)&mpa[5];
+	if (termhdr->layer_etype == RDMAP_REMOTE_PROT ||
+	    termhdr->layer_etype == RDMAP_REMOTE_OP) {
+		irdma_terminate_done(qp, 0);
+	} else {
+		irdma_terminate_start_timer(qp);
+		irdma_terminate_send_fin(qp);
+	}
+}
+
+static enum irdma_status_code irdma_null_ws_add(struct irdma_sc_vsi *vsi,
+						u8 user_pri)
+{
+	return 0;
+}
+
+static void irdma_null_ws_remove(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	/* do nothing */
+}
+
+static void irdma_null_ws_reset(struct irdma_sc_vsi *vsi)
+{
+	/* do nothing */
+}
+
+/**
+ * irdma_sc_vsi_init - Init the vsi structure
+ * @vsi: pointer to vsi structure to initialize
+ * @info: the info used to initialize the vsi struct
+ */
+void irdma_sc_vsi_init(struct irdma_sc_vsi  *vsi,
+		       struct irdma_vsi_init_info *info)
+{
+	struct irdma_l2params *l2p;
+	int i;
+
+	vsi->dev = info->dev;
+	vsi->back_vsi = info->back_vsi;
+	vsi->register_qset = info->register_qset;
+	vsi->unregister_qset = info->unregister_qset;
+	vsi->mtu = info->params->mtu;
+	vsi->exception_lan_q = info->exception_lan_q;
+	vsi->vsi_idx = info->pf_data_vsi_num;
+	vsi->vm_vf_type = info->vm_vf_type;
+	vsi->vm_id = info->vm_id;
+	if (vsi->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+		vsi->fcn_id = info->dev->hmc_fn_id;
+
+	l2p = info->params;
+	vsi->qos_rel_bw = l2p->vsi_rel_bw;
+	vsi->qos_prio_type = l2p->vsi_prio_type;
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+		if (vsi->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+			vsi->qos[i].qs_handle = l2p->qs_handle_list[i];
+		vsi->qos[i].traffic_class = info->params->up2tc[i];
+		vsi->qos[i].rel_bw =
+			l2p->tc_info[vsi->qos[i].traffic_class].rel_bw;
+		vsi->qos[i].prio_type =
+			l2p->tc_info[vsi->qos[i].traffic_class].prio_type;
+		vsi->qos[i].valid = false;
+		mutex_init(&vsi->qos[i].qos_mutex);
+		INIT_LIST_HEAD(&vsi->qos[i].qplist);
+	}
+	if (vsi->register_qset) {
+		vsi->dev->ws_add = irdma_ws_add;
+		vsi->dev->ws_remove = irdma_ws_remove;
+		vsi->dev->ws_reset = irdma_ws_reset;
+	} else {
+		vsi->dev->ws_add = irdma_null_ws_add;
+		vsi->dev->ws_remove = irdma_null_ws_remove;
+		vsi->dev->ws_reset = irdma_null_ws_reset;
+	}
+}
+
+/**
+ * irdma_get_fcn_id - Return the function id
+ * @vsi: pointer to the vsi
+ */
+static u8 irdma_get_fcn_id(struct irdma_sc_vsi *vsi)
+{
+	struct irdma_stats_inst_info stats_info = {};
+	struct irdma_sc_dev *dev = vsi->dev;
+	u8 fcn_id = IRDMA_INVALID_FCN_ID;
+	u8 start_idx, max_stats, i;
+
+	if (dev->hw_attrs.uk_attrs.hw_rev != IRDMA_GEN_1) {
+		if (!irdma_cqp_stats_inst_cmd(vsi, IRDMA_OP_STATS_ALLOCATE,
+					      &stats_info))
+			return stats_info.stats_idx;
+	}
+
+	start_idx = 1;
+	max_stats = 16;
+	for (i = start_idx; i < max_stats; i++)
+		if (!dev->fcn_id_array[i]) {
+			fcn_id = i;
+			dev->fcn_id_array[i] = true;
+			break;
+		}
+
+	return fcn_id;
+}
+
+/**
+ * irdma_vsi_stats_init - Initialize the vsi statistics
+ * @vsi: pointer to the vsi structure
+ * @info: The info structure used for initialization
+ */
+enum irdma_status_code irdma_vsi_stats_init(struct irdma_sc_vsi *vsi,
+					    struct irdma_vsi_stats_info *info)
+{
+	u8 fcn_id = info->fcn_id;
+	struct irdma_dma_mem *stats_buff_mem;
+
+	vsi->pestat = info->pestat;
+	vsi->pestat->hw = vsi->dev->hw;
+	vsi->pestat->vsi = vsi;
+	stats_buff_mem = &vsi->pestat->gather_info.stats_buff_mem;
+	stats_buff_mem->size = ALIGN(IRDMA_GATHER_STATS_BUF_SIZE * 2, 1);
+	stats_buff_mem->va = dma_alloc_coherent(vsi->pestat->hw->device,
+						stats_buff_mem->size,
+						&stats_buff_mem->pa,
+						GFP_KERNEL);
+	if (!stats_buff_mem->va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	vsi->pestat->gather_info.gather_stats_va = stats_buff_mem->va;
+	vsi->pestat->gather_info.last_gather_stats_va =
+		(void *)((uintptr_t)stats_buff_mem->va +
+			 IRDMA_GATHER_STATS_BUF_SIZE);
+
+	irdma_hw_stats_start_timer(vsi);
+	if (info->alloc_fcn_id)
+		fcn_id = irdma_get_fcn_id(vsi);
+	if (fcn_id == IRDMA_INVALID_FCN_ID)
+		goto stats_error;
+
+	vsi->stats_fcn_id_alloc = info->alloc_fcn_id;
+	vsi->fcn_id = fcn_id;
+	if (info->alloc_fcn_id) {
+		vsi->pestat->gather_info.use_stats_inst = true;
+		vsi->pestat->gather_info.stats_inst_index = fcn_id;
+	}
+
+	return 0;
+
+stats_error:
+	dma_free_coherent(vsi->pestat->hw->device, stats_buff_mem->size,
+			  stats_buff_mem->va, stats_buff_mem->pa);
+	stats_buff_mem->va = NULL;
+
+	return IRDMA_ERR_CQP_COMPL_ERROR;
+}
+
+/**
+ * irdma_vsi_stats_free - Free the vsi stats
+ * @vsi: pointer to the vsi structure
+ */
+void irdma_vsi_stats_free(struct irdma_sc_vsi *vsi)
+{
+	struct irdma_stats_inst_info stats_info = {};
+	u8 fcn_id = vsi->fcn_id;
+	struct irdma_sc_dev *dev = vsi->dev;
+
+	if (dev->hw_attrs.uk_attrs.hw_rev != IRDMA_GEN_1) {
+		if (vsi->stats_fcn_id_alloc) {
+			stats_info.stats_idx = vsi->fcn_id;
+			irdma_cqp_stats_inst_cmd(vsi, IRDMA_OP_STATS_FREE,
+						 &stats_info);
+		}
+	} else {
+		if (vsi->stats_fcn_id_alloc &&
+		    fcn_id < vsi->dev->hw_attrs.max_stat_inst)
+			vsi->dev->fcn_id_array[fcn_id] = false;
+	}
+
+	if (!vsi->pestat)
+		return;
+	irdma_hw_stats_stop_timer(vsi);
+	dma_free_coherent(vsi->pestat->hw->device,
+			  vsi->pestat->gather_info.stats_buff_mem.size,
+			  vsi->pestat->gather_info.stats_buff_mem.va,
+			  vsi->pestat->gather_info.stats_buff_mem.pa);
+	vsi->pestat->gather_info.stats_buff_mem.va = NULL;
+}
+
+/**
+ * irdma_get_encoded_wqe_size - given wq size, returns hardware encoded size
+ * @wqsize: size of the wq (sq, rq) to encoded_size
+ * @queue_type: queue type selected for the calculation algorithm
+ */
+u8 irdma_get_encoded_wqe_size(u32 wqsize, enum irdma_queue_type queue_type)
+{
+	u8 encoded_size = 0;
+
+	/* cqp sq's hw coded value starts from 1 for size of 4
+	 * while it starts from 0 for qp' wq's.
+	 */
+	if (queue_type == IRDMA_QUEUE_TYPE_CQP)
+		encoded_size = 1;
+	wqsize >>= 2;
+	while (wqsize >>= 1)
+		encoded_size++;
+
+	return encoded_size;
+}
+
+/**
+ * irdma_sc_gather_stats - collect the statistics
+ * @cqp: struct for cqp hw
+ * @info: gather stats info structure
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_gather_stats(struct irdma_sc_cqp *cqp,
+		      struct irdma_stats_gather_info *info, u64 scratch)
+{
+	__le64 *wqe;
+	u64 temp;
+
+	if (info->stats_buff_mem.size < IRDMA_GATHER_STATS_BUF_SIZE)
+		return IRDMA_ERR_BUF_TOO_SHORT;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 40,
+		      FIELD_PREP(IRDMA_CQPSQ_STATS_HMC_FCN_INDEX, info->hmc_fcn_index));
+	set_64bit_val(wqe, 32, info->stats_buff_mem.pa);
+
+	temp = FIELD_PREP(IRDMA_CQPSQ_STATS_WQEVALID, cqp->polarity) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_USE_INST, info->use_stats_inst) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_INST_INDEX,
+			  info->stats_inst_index) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_USE_HMC_FCN_INDEX,
+			  info->use_hmc_fcn_index) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_OP, IRDMA_CQP_OP_GATHER_STATS);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, temp);
+
+	print_hex_dump_debug("STATS: GATHER_STATS WQE", DUMP_PREFIX_OFFSET,
+			     16, 8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+
+	irdma_sc_cqp_post_sq(cqp);
+	ibdev_dbg(to_ibdev(cqp->dev),
+		  "STATS: CQP SQ head 0x%x tail 0x%x size 0x%x\n",
+		  cqp->sq_ring.head, cqp->sq_ring.tail, cqp->sq_ring.size);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_manage_stats_inst - allocate or free stats instance
+ * @cqp: struct for cqp hw
+ * @info: stats info structure
+ * @alloc: alloc vs. delete flag
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_manage_stats_inst(struct irdma_sc_cqp *cqp,
+			   struct irdma_stats_inst_info *info, bool alloc,
+			   u64 scratch)
+{
+	__le64 *wqe;
+	u64 temp;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 40,
+		      FIELD_PREP(IRDMA_CQPSQ_STATS_HMC_FCN_INDEX, info->hmc_fn_id));
+	temp = FIELD_PREP(IRDMA_CQPSQ_STATS_WQEVALID, cqp->polarity) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_ALLOC_INST, alloc) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_USE_HMC_FCN_INDEX,
+			  info->use_hmc_fcn_index) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_INST_INDEX, info->stats_idx) |
+	       FIELD_PREP(IRDMA_CQPSQ_STATS_OP, IRDMA_CQP_OP_MANAGE_STATS);
+
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, temp);
+
+	print_hex_dump_debug("WQE: MANAGE_STATS WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+
+	irdma_sc_cqp_post_sq(cqp);
+	return 0;
+}
+
+/**
+ * irdma_sc_set_up_map - set the up map table
+ * @cqp: struct for cqp hw
+ * @info: User priority map info
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_set_up_map(struct irdma_sc_cqp *cqp,
+						  struct irdma_up_info *info,
+						  u64 scratch)
+{
+	__le64 *wqe;
+	u64 temp = 0;
+	int i;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++)
+		temp |= info->map[i] << (i * 8);
+
+	set_64bit_val(wqe, 0, temp);
+	set_64bit_val(wqe, 40,
+		      FIELD_PREP(IRDMA_CQPSQ_UP_CNPOVERRIDE, info->cnp_up_override) |
+		      FIELD_PREP(IRDMA_CQPSQ_UP_HMCFCNIDX, info->hmc_fcn_idx));
+
+	temp = FIELD_PREP(IRDMA_CQPSQ_UP_WQEVALID, cqp->polarity) |
+	       FIELD_PREP(IRDMA_CQPSQ_UP_USEVLAN, info->use_vlan) |
+	       FIELD_PREP(IRDMA_CQPSQ_UP_USEOVERRIDE,
+			  info->use_cnp_up_override) |
+	       FIELD_PREP(IRDMA_CQPSQ_UP_OP, IRDMA_CQP_OP_UP_MAP);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, temp);
+
+	print_hex_dump_debug("WQE: UPMAP WQE", DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_manage_ws_node - create/modify/destroy WS node
+ * @cqp: struct for cqp hw
+ * @info: node info structure
+ * @node_op: 0 for add 1 for modify, 2 for delete
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_manage_ws_node(struct irdma_sc_cqp *cqp,
+			struct irdma_ws_node_info *info,
+			enum irdma_ws_node_op node_op, u64 scratch)
+{
+	__le64 *wqe;
+	u64 temp = 0;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 32,
+		      FIELD_PREP(IRDMA_CQPSQ_WS_VSI, info->vsi) |
+		      FIELD_PREP(IRDMA_CQPSQ_WS_WEIGHT, info->weight));
+
+	temp = FIELD_PREP(IRDMA_CQPSQ_WS_WQEVALID, cqp->polarity) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_NODEOP, node_op) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_ENABLENODE, info->enable) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_NODETYPE, info->type_leaf) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_PRIOTYPE, info->prio_type) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_TC, info->tc) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_OP, IRDMA_CQP_OP_WORK_SCHED_NODE) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_PARENTID, info->parent_id) |
+	       FIELD_PREP(IRDMA_CQPSQ_WS_NODEID, info->id);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, temp);
+
+	print_hex_dump_debug("WQE: MANAGE_WS WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_qp_flush_wqes - flush qp's wqe
+ * @qp: sc qp
+ * @info: dlush information
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_qp_flush_wqes(struct irdma_sc_qp *qp, struct irdma_qp_flush_info *info,
+		       u64 scratch, bool post_sq)
+{
+	u64 temp = 0;
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+	bool flush_sq = false, flush_rq = false;
+
+	if (info->rq && !qp->flush_rq)
+		flush_rq = true;
+	if (info->sq && !qp->flush_sq)
+		flush_sq = true;
+	qp->flush_sq |= flush_sq;
+	qp->flush_rq |= flush_rq;
+
+	if (!flush_sq && !flush_rq) {
+		ibdev_dbg(to_ibdev(qp->dev),
+			  "CQP: Additional flush request ignored for qp %x\n",
+			  qp->qp_uk.qp_id);
+		return IRDMA_ERR_FLUSHED_Q;
+	}
+
+	cqp = qp->pd->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	if (info->userflushcode) {
+		if (flush_rq)
+			temp |= FIELD_PREP(IRDMA_CQPSQ_FWQE_RQMNERR,
+					   info->rq_minor_code) |
+				FIELD_PREP(IRDMA_CQPSQ_FWQE_RQMJERR,
+					   info->rq_major_code);
+		if (flush_sq)
+			temp |= FIELD_PREP(IRDMA_CQPSQ_FWQE_SQMNERR,
+					   info->sq_minor_code) |
+				FIELD_PREP(IRDMA_CQPSQ_FWQE_SQMJERR,
+					   info->sq_major_code);
+	}
+	set_64bit_val(wqe, 16, temp);
+
+	temp = (info->generate_ae) ?
+		info->ae_code | FIELD_PREP(IRDMA_CQPSQ_FWQE_AESOURCE,
+					   info->ae_src) : 0;
+	set_64bit_val(wqe, 8, temp);
+
+	hdr = qp->qp_uk.qp_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_FLUSH_WQES) |
+	      FIELD_PREP(IRDMA_CQPSQ_FWQE_GENERATE_AE, info->generate_ae) |
+	      FIELD_PREP(IRDMA_CQPSQ_FWQE_USERFLCODE, info->userflushcode) |
+	      FIELD_PREP(IRDMA_CQPSQ_FWQE_FLUSHSQ, flush_sq) |
+	      FIELD_PREP(IRDMA_CQPSQ_FWQE_FLUSHRQ, flush_rq) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QP_FLUSH WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_gen_ae - generate AE, uses flush WQE CQP OP
+ * @qp: sc qp
+ * @info: gen ae information
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_gen_ae(struct irdma_sc_qp *qp,
+					      struct irdma_gen_ae_info *info,
+					      u64 scratch, bool post_sq)
+{
+	u64 temp;
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+
+	cqp = qp->pd->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	temp = info->ae_code | FIELD_PREP(IRDMA_CQPSQ_FWQE_AESOURCE,
+					  info->ae_src);
+	set_64bit_val(wqe, 8, temp);
+
+	hdr = qp->qp_uk.qp_id | FIELD_PREP(IRDMA_CQPSQ_OPCODE,
+					   IRDMA_CQP_OP_GEN_AE) |
+	      FIELD_PREP(IRDMA_CQPSQ_FWQE_GENERATE_AE, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: GEN_AE WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/*** irdma_sc_qp_upload_context - upload qp's context
+ * @dev: sc device struct
+ * @info: upload context info ptr for return
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_qp_upload_context(struct irdma_sc_dev *dev,
+			   struct irdma_upload_context_info *info, u64 scratch,
+			   bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, info->buf_pa);
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_UCTX_QPID, info->qp_id) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_UPLOAD_CONTEXT) |
+	      FIELD_PREP(IRDMA_CQPSQ_UCTX_QPTYPE, info->qp_type) |
+	      FIELD_PREP(IRDMA_CQPSQ_UCTX_RAWFORMAT, info->raw_format) |
+	      FIELD_PREP(IRDMA_CQPSQ_UCTX_FREEZEQP, info->freeze_qp) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QP_UPLOAD_CTX WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_manage_push_page - Handle push page
+ * @cqp: struct for cqp hw
+ * @info: push page info
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_manage_push_page(struct irdma_sc_cqp *cqp,
+			  struct irdma_cqp_manage_push_page_info *info,
+			  u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	if (info->free_page &&
+	    info->push_idx >= cqp->dev->hw_attrs.max_hw_device_pages)
+		return IRDMA_ERR_INVALID_PUSH_PAGE_INDEX;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, info->qs_handle);
+	hdr = FIELD_PREP(IRDMA_CQPSQ_MPP_PPIDX, info->push_idx) |
+	      FIELD_PREP(IRDMA_CQPSQ_MPP_PPTYPE, info->push_page_type) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MANAGE_PUSH_PAGES) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity) |
+	      FIELD_PREP(IRDMA_CQPSQ_MPP_FREE_PAGE, info->free_page);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: MANAGE_PUSH_PAGES WQE", DUMP_PREFIX_OFFSET,
+			     16, 8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_suspend_qp - suspend qp for param change
+ * @cqp: struct for cqp hw
+ * @qp: sc qp struct
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_suspend_qp(struct irdma_sc_cqp *cqp,
+						  struct irdma_sc_qp *qp,
+						  u64 scratch)
+{
+	u64 hdr;
+	__le64 *wqe;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_SUSPENDQP_QPID, qp->qp_uk.qp_id) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_SUSPEND_QP) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: SUSPEND_QP WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_resume_qp - resume qp after suspend
+ * @cqp: struct for cqp hw
+ * @qp: sc qp struct
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_resume_qp(struct irdma_sc_cqp *cqp,
+						 struct irdma_sc_qp *qp,
+						 u64 scratch)
+{
+	u64 hdr;
+	__le64 *wqe;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_RESUMEQP_QSHANDLE, qp->qs_handle));
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_RESUMEQP_QPID, qp->qp_uk.qp_id) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_RESUME_QP) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: RESUME_QP WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_cq_ack - acknowledge completion q
+ * @cq: cq struct
+ */
+static void irdma_sc_cq_ack(struct irdma_sc_cq *cq)
+{
+	writel(cq->cq_uk.cq_id, cq->cq_uk.cq_ack_db);
+}
+
+/**
+ * irdma_sc_cq_init - initialize completion q
+ * @cq: cq struct
+ * @info: cq initialization info
+ */
+static enum irdma_status_code irdma_sc_cq_init(struct irdma_sc_cq *cq,
+					       struct irdma_cq_init_info *info)
+{
+	enum irdma_status_code ret_code;
+	u32 pble_obj_cnt;
+
+	pble_obj_cnt = info->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt;
+	if (info->virtual_map && info->first_pm_pbl_idx >= pble_obj_cnt)
+		return IRDMA_ERR_INVALID_PBLE_INDEX;
+
+	cq->cq_pa = info->cq_base_pa;
+	cq->dev = info->dev;
+	cq->ceq_id = info->ceq_id;
+	info->cq_uk_init_info.cqe_alloc_db = cq->dev->cq_arm_db;
+	info->cq_uk_init_info.cq_ack_db = cq->dev->cq_ack_db;
+	ret_code = irdma_cq_uk_init(&cq->cq_uk, &info->cq_uk_init_info);
+	if (ret_code)
+		return ret_code;
+
+	cq->virtual_map = info->virtual_map;
+	cq->pbl_chunk_size = info->pbl_chunk_size;
+	cq->ceqe_mask = info->ceqe_mask;
+	cq->cq_type = (info->type) ? info->type : IRDMA_CQ_TYPE_IWARP;
+	cq->shadow_area_pa = info->shadow_area_pa;
+	cq->shadow_read_threshold = info->shadow_read_threshold;
+	cq->ceq_id_valid = info->ceq_id_valid;
+	cq->tph_en = info->tph_en;
+	cq->tph_val = info->tph_val;
+	cq->first_pm_pbl_idx = info->first_pm_pbl_idx;
+	cq->vsi = info->vsi;
+
+	return 0;
+}
+
+/**
+ * irdma_sc_cq_create - create completion q
+ * @cq: cq struct
+ * @scratch: u64 saved to be used during cqp completion
+ * @check_overflow: flag for overflow check
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_cq_create(struct irdma_sc_cq *cq,
+						 u64 scratch,
+						 bool check_overflow,
+						 bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+	struct irdma_sc_ceq *ceq;
+	enum irdma_status_code ret_code = 0;
+
+	cqp = cq->dev->cqp;
+	if (cq->cq_uk.cq_id > (cqp->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].max_cnt - 1))
+		return IRDMA_ERR_INVALID_CQ_ID;
+
+	if (cq->ceq_id > (cq->dev->hmc_fpm_misc.max_ceqs - 1))
+		return IRDMA_ERR_INVALID_CEQ_ID;
+
+	ceq = cq->dev->ceq[cq->ceq_id];
+	if (ceq && ceq->reg_cq)
+		ret_code = irdma_sc_add_cq_ctx(ceq, cq);
+
+	if (ret_code)
+		return ret_code;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe) {
+		if (ceq && ceq->reg_cq)
+			irdma_sc_remove_cq_ctx(ceq, cq);
+		return IRDMA_ERR_RING_FULL;
+	}
+
+	set_64bit_val(wqe, 0, cq->cq_uk.cq_size);
+	set_64bit_val(wqe, 8, (uintptr_t)cq >> 1);
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_CQ_SHADOW_READ_THRESHOLD, cq->shadow_read_threshold));
+	set_64bit_val(wqe, 32, (cq->virtual_map ? 0 : cq->cq_pa));
+	set_64bit_val(wqe, 40, cq->shadow_area_pa);
+	set_64bit_val(wqe, 48,
+		      FIELD_PREP(IRDMA_CQPSQ_CQ_FIRSTPMPBLIDX, (cq->virtual_map ? cq->first_pm_pbl_idx : 0)));
+	set_64bit_val(wqe, 56,
+		      FIELD_PREP(IRDMA_CQPSQ_TPHVAL, cq->tph_val) |
+		      FIELD_PREP(IRDMA_CQPSQ_VSIIDX, cq->vsi->vsi_idx));
+
+	hdr = FLD_LS_64(cq->dev, cq->cq_uk.cq_id, IRDMA_CQPSQ_CQ_CQID) |
+	      FLD_LS_64(cq->dev, (cq->ceq_id_valid ? cq->ceq_id : 0),
+			IRDMA_CQPSQ_CQ_CEQID) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_CQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_LPBLSIZE, cq->pbl_chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CHKOVERFLOW, check_overflow) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_VIRTMAP, cq->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_ENCEQEMASK, cq->ceqe_mask) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CEQIDVALID, cq->ceq_id_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_TPHEN, cq->tph_en) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_AVOIDMEMCNFLCT,
+			 cq->cq_uk.avoid_mem_cflct) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: CQ_CREATE WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_cq_destroy - destroy completion q
+ * @cq: cq struct
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_cq_destroy(struct irdma_sc_cq *cq,
+						  u64 scratch, bool post_sq)
+{
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+	u64 hdr;
+	struct irdma_sc_ceq *ceq;
+
+	cqp = cq->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	ceq = cq->dev->ceq[cq->ceq_id];
+	if (ceq && ceq->reg_cq)
+		irdma_sc_remove_cq_ctx(ceq, cq);
+
+	set_64bit_val(wqe, 0, cq->cq_uk.cq_size);
+	set_64bit_val(wqe, 8, (uintptr_t)cq >> 1);
+	set_64bit_val(wqe, 40, cq->shadow_area_pa);
+	set_64bit_val(wqe, 48,
+		      (cq->virtual_map ? cq->first_pm_pbl_idx : 0));
+
+	hdr = cq->cq_uk.cq_id |
+	      FLD_LS_64(cq->dev, (cq->ceq_id_valid ? cq->ceq_id : 0),
+			IRDMA_CQPSQ_CQ_CEQID) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_CQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_LPBLSIZE, cq->pbl_chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_VIRTMAP, cq->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_ENCEQEMASK, cq->ceqe_mask) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CEQIDVALID, cq->ceq_id_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_TPHEN, cq->tph_en) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_AVOIDMEMCNFLCT, cq->cq_uk.avoid_mem_cflct) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: CQ_DESTROY WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_cq_resize - set resized cq buffer info
+ * @cq: resized cq
+ * @info: resized cq buffer info
+ */
+static void irdma_sc_cq_resize(struct irdma_sc_cq *cq, struct irdma_modify_cq_info *info)
+{
+	cq->virtual_map = info->virtual_map;
+	cq->cq_pa = info->cq_pa;
+	cq->first_pm_pbl_idx = info->first_pm_pbl_idx;
+	cq->pbl_chunk_size = info->pbl_chunk_size;
+	cq->cq_uk.ops.iw_cq_resize(&cq->cq_uk, info->cq_base, info->cq_size);
+}
+
+/**
+ * irdma_sc_cq_modify - modify a Completion Queue
+ * @cq: cq struct
+ * @info: modification info struct
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag to post to sq
+ */
+static enum irdma_status_code
+irdma_sc_cq_modify(struct irdma_sc_cq *cq, struct irdma_modify_cq_info *info,
+		   u64 scratch, bool post_sq)
+{
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+	u64 hdr;
+	u32 pble_obj_cnt;
+
+	pble_obj_cnt = cq->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt;
+	if (info->cq_resize && info->virtual_map &&
+	    info->first_pm_pbl_idx >= pble_obj_cnt)
+		return IRDMA_ERR_INVALID_PBLE_INDEX;
+
+	cqp = cq->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 0, info->cq_size);
+	set_64bit_val(wqe, 8, (uintptr_t)cq >> 1);
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_CQ_SHADOW_READ_THRESHOLD, info->shadow_read_threshold));
+	set_64bit_val(wqe, 32, info->cq_pa);
+	set_64bit_val(wqe, 40, cq->shadow_area_pa);
+	set_64bit_val(wqe, 48, info->first_pm_pbl_idx);
+	set_64bit_val(wqe, 56,
+		      FIELD_PREP(IRDMA_CQPSQ_TPHVAL, cq->tph_val) |
+		      FIELD_PREP(IRDMA_CQPSQ_VSIIDX, cq->vsi->vsi_idx));
+
+	hdr = cq->cq_uk.cq_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MODIFY_CQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CQRESIZE, info->cq_resize) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_LPBLSIZE, info->pbl_chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CHKOVERFLOW, info->check_overflow) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_VIRTMAP, info->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_ENCEQEMASK, cq->ceqe_mask) |
+	      FIELD_PREP(IRDMA_CQPSQ_TPHEN, cq->tph_en) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_AVOIDMEMCNFLCT,
+			 cq->cq_uk.avoid_mem_cflct) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: CQ_MODIFY WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_check_cqp_progress - check cqp processing progress
+ * @timeout: timeout info struct
+ * @dev: sc device struct
+ */
+static void irdma_check_cqp_progress(struct irdma_cqp_timeout *timeout,
+				     struct irdma_sc_dev *dev)
+{
+	if (timeout->compl_cqp_cmds != dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]) {
+		timeout->compl_cqp_cmds = dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS];
+		timeout->count = 0;
+	} else {
+		if (dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] !=
+		    timeout->compl_cqp_cmds)
+			timeout->count++;
+	}
+}
+
+/**
+ * irdma_get_cqp_reg_info - get head and tail for cqp using registers
+ * @cqp: struct for cqp hw
+ * @val: cqp tail register value
+ * @tail: wqtail register value
+ * @error: cqp processing err
+ */
+static inline void irdma_get_cqp_reg_info(struct irdma_sc_cqp *cqp, u32 *val,
+					  u32 *tail, u32 *error)
+{
+	*val = readl(cqp->dev->hw_regs[IRDMA_CQPTAIL]);
+	*tail = FIELD_GET(IRDMA_CQPTAIL_WQTAIL, *val);
+	*error = FIELD_GET(IRDMA_CQPTAIL_CQP_OP_ERR, *val);
+}
+
+/**
+ * irdma_cqp_poll_registers - poll cqp registers
+ * @cqp: struct for cqp hw
+ * @tail: wqtail register value
+ * @count: how many times to try for completion
+ */
+static enum irdma_status_code irdma_cqp_poll_registers(struct irdma_sc_cqp *cqp,
+						       u32 tail, u32 count)
+{
+	u32 i = 0;
+	u32 newtail, error, val;
+
+	while (i++ < count) {
+		irdma_get_cqp_reg_info(cqp, &val, &newtail, &error);
+		if (error) {
+			error = readl(cqp->dev->hw_regs[IRDMA_CQPERRCODES]);
+			ibdev_dbg(to_ibdev(cqp->dev),
+				  "CQP: CQPERRCODES error_code[x%08X]\n",
+				  error);
+			return IRDMA_ERR_CQP_COMPL_ERROR;
+		}
+		if (newtail != tail) {
+			/* SUCCESS */
+			IRDMA_RING_MOVE_TAIL(cqp->sq_ring);
+			cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++;
+			return 0;
+		}
+		udelay(cqp->dev->hw_attrs.max_sleep_count);
+	}
+
+	return IRDMA_ERR_TIMEOUT;
+}
+
+/**
+ * irdma_sc_decode_fpm_commit - decode a 64 bit value into count and base
+ * @dev: sc device struct
+ * @buf: pointer to commit buffer
+ * @buf_idx: buffer index
+ * @obj_info: object info pointer
+ * @rsrc_idx: indexs of memory resource
+ */
+static u64 irdma_sc_decode_fpm_commit(struct irdma_sc_dev *dev, __le64 *buf,
+				      u32 buf_idx, struct irdma_hmc_obj_info *obj_info,
+				      u32 rsrc_idx)
+{
+	u64 temp;
+
+	get_64bit_val(buf, buf_idx, &temp);
+
+	switch (rsrc_idx) {
+	case IRDMA_HMC_IW_QP:
+		obj_info[rsrc_idx].cnt = (u32)FIELD_GET(IRDMA_COMMIT_FPM_QPCNT, temp);
+		break;
+	case IRDMA_HMC_IW_CQ:
+		obj_info[rsrc_idx].cnt = (u32)FLD_RS_64(dev, temp, IRDMA_COMMIT_FPM_CQCNT);
+		break;
+	case IRDMA_HMC_IW_APBVT_ENTRY:
+		obj_info[rsrc_idx].cnt = 1;
+		break;
+	default:
+		obj_info[rsrc_idx].cnt = (u32)temp;
+		break;
+	}
+
+	obj_info[rsrc_idx].base = (temp >> IRDMA_COMMIT_FPM_BASE_S) * 512;
+
+	return temp;
+}
+
+/**
+ * irdma_sc_parse_fpm_commit_buf - parse fpm commit buffer
+ * @dev: pointer to dev struct
+ * @buf: ptr to fpm commit buffer
+ * @info: ptr to irdma_hmc_obj_info struct
+ * @sd: number of SDs for HMC objects
+ *
+ * parses fpm commit info and copy base value
+ * of hmc objects in hmc_info
+ */
+static enum irdma_status_code
+irdma_sc_parse_fpm_commit_buf(struct irdma_sc_dev *dev, __le64 *buf,
+			      struct irdma_hmc_obj_info *info, u32 *sd)
+{
+	u64 size;
+	u32 i;
+	u64 max_base = 0;
+	u32 last_hmc_obj = 0;
+
+	irdma_sc_decode_fpm_commit(dev, buf, 0, info,
+				   IRDMA_HMC_IW_QP);
+	irdma_sc_decode_fpm_commit(dev, buf, 8, info,
+				   IRDMA_HMC_IW_CQ);
+	/* skiping RSRVD */
+	irdma_sc_decode_fpm_commit(dev, buf, 24, info,
+				   IRDMA_HMC_IW_HTE);
+	irdma_sc_decode_fpm_commit(dev, buf, 32, info,
+				   IRDMA_HMC_IW_ARP);
+	irdma_sc_decode_fpm_commit(dev, buf, 40, info,
+				   IRDMA_HMC_IW_APBVT_ENTRY);
+	irdma_sc_decode_fpm_commit(dev, buf, 48, info,
+				   IRDMA_HMC_IW_MR);
+	irdma_sc_decode_fpm_commit(dev, buf, 56, info,
+				   IRDMA_HMC_IW_XF);
+	irdma_sc_decode_fpm_commit(dev, buf, 64, info,
+				   IRDMA_HMC_IW_XFFL);
+	irdma_sc_decode_fpm_commit(dev, buf, 72, info,
+				   IRDMA_HMC_IW_Q1);
+	irdma_sc_decode_fpm_commit(dev, buf, 80, info,
+				   IRDMA_HMC_IW_Q1FL);
+	irdma_sc_decode_fpm_commit(dev, buf, 88, info,
+				   IRDMA_HMC_IW_TIMER);
+	irdma_sc_decode_fpm_commit(dev, buf, 112, info,
+				   IRDMA_HMC_IW_PBLE);
+	/* skipping RSVD. */
+	if (dev->hw_attrs.uk_attrs.hw_rev != IRDMA_GEN_1) {
+		irdma_sc_decode_fpm_commit(dev, buf, 96, info,
+					   IRDMA_HMC_IW_FSIMC);
+		irdma_sc_decode_fpm_commit(dev, buf, 104, info,
+					   IRDMA_HMC_IW_FSIAV);
+		irdma_sc_decode_fpm_commit(dev, buf, 128, info,
+					   IRDMA_HMC_IW_RRF);
+		irdma_sc_decode_fpm_commit(dev, buf, 136, info,
+					   IRDMA_HMC_IW_RRFFL);
+		irdma_sc_decode_fpm_commit(dev, buf, 144, info,
+					   IRDMA_HMC_IW_HDR);
+		irdma_sc_decode_fpm_commit(dev, buf, 152, info,
+					   IRDMA_HMC_IW_MD);
+		irdma_sc_decode_fpm_commit(dev, buf, 160, info,
+					   IRDMA_HMC_IW_OOISC);
+		irdma_sc_decode_fpm_commit(dev, buf, 168, info,
+					   IRDMA_HMC_IW_OOISCFFL);
+	}
+
+	/* searching for the last object in HMC to find the size of the HMC area. */
+	for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++) {
+		if (info[i].base > max_base) {
+			max_base = info[i].base;
+			last_hmc_obj = i;
+		}
+	}
+
+	size = info[last_hmc_obj].cnt * info[last_hmc_obj].size +
+	       info[last_hmc_obj].base;
+
+	if (size & 0x1FFFFF)
+		*sd = (u32)((size >> 21) + 1); /* add 1 for remainder */
+	else
+		*sd = (u32)(size >> 21);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_decode_fpm_query() - Decode a 64 bit value into max count and size
+ * @buf: ptr to fpm query buffer
+ * @buf_idx: index into buf
+ * @obj_info: ptr to irdma_hmc_obj_info struct
+ * @rsrc_idx: resource index into info
+ *
+ * Decode a 64 bit value from fpm query buffer into max count and size
+ */
+static u64 irdma_sc_decode_fpm_query(__le64 *buf, u32 buf_idx,
+				     struct irdma_hmc_obj_info *obj_info,
+				     u32 rsrc_idx)
+{
+	u64 temp;
+	u32 size;
+
+	get_64bit_val(buf, buf_idx, &temp);
+	obj_info[rsrc_idx].max_cnt = (u32)temp;
+	size = (u32)(temp >> 32);
+	obj_info[rsrc_idx].size = BIT_ULL(size);
+
+	return temp;
+}
+
+/**
+ * irdma_sc_parse_fpm_query_buf() - parses fpm query buffer
+ * @dev: ptr to shared code device
+ * @buf: ptr to fpm query buffer
+ * @hmc_info: ptr to irdma_hmc_obj_info struct
+ * @hmc_fpm_misc: ptr to fpm data
+ *
+ * parses fpm query buffer and copy max_cnt and
+ * size value of hmc objects in hmc_info
+ */
+static enum irdma_status_code
+irdma_sc_parse_fpm_query_buf(struct irdma_sc_dev *dev, __le64 *buf,
+			     struct irdma_hmc_info *hmc_info,
+			     struct irdma_hmc_fpm_misc *hmc_fpm_misc)
+{
+	struct irdma_hmc_obj_info *obj_info;
+	u64 temp;
+	u32 size;
+	u16 max_pe_sds;
+
+	obj_info = hmc_info->hmc_obj;
+
+	get_64bit_val(buf, 0, &temp);
+	hmc_info->first_sd_index = (u16)FIELD_GET(IRDMA_QUERY_FPM_FIRST_PE_SD_INDEX, temp);
+	max_pe_sds = (u16)FIELD_GET(IRDMA_QUERY_FPM_MAX_PE_SDS, temp);
+
+	/* Reduce SD count for VFs by 1 to account
+	 * for PBLE backing page rounding
+	 */
+	if (hmc_info->hmc_fn_id >= dev->hw_attrs.first_hw_vf_fpm_id || !dev->privileged)
+		max_pe_sds--;
+	hmc_fpm_misc->max_sds = max_pe_sds;
+	hmc_info->sd_table.sd_cnt = max_pe_sds + hmc_info->first_sd_index;
+	get_64bit_val(buf, 8, &temp);
+	obj_info[IRDMA_HMC_IW_QP].max_cnt = (u32)FIELD_GET(IRDMA_QUERY_FPM_MAX_QPS, temp);
+	size = (u32)(temp >> 32);
+	obj_info[IRDMA_HMC_IW_QP].size = BIT_ULL(size);
+
+	get_64bit_val(buf, 16, &temp);
+	obj_info[IRDMA_HMC_IW_CQ].max_cnt = (u32)FIELD_GET(IRDMA_QUERY_FPM_MAX_CQS, temp);
+	size = (u32)(temp >> 32);
+	obj_info[IRDMA_HMC_IW_CQ].size = BIT_ULL(size);
+
+	irdma_sc_decode_fpm_query(buf, 32, obj_info, IRDMA_HMC_IW_HTE);
+	irdma_sc_decode_fpm_query(buf, 40, obj_info, IRDMA_HMC_IW_ARP);
+
+	obj_info[IRDMA_HMC_IW_APBVT_ENTRY].size = 8192;
+	obj_info[IRDMA_HMC_IW_APBVT_ENTRY].max_cnt = 1;
+
+	irdma_sc_decode_fpm_query(buf, 48, obj_info, IRDMA_HMC_IW_MR);
+	irdma_sc_decode_fpm_query(buf, 56, obj_info, IRDMA_HMC_IW_XF);
+
+	get_64bit_val(buf, 64, &temp);
+	obj_info[IRDMA_HMC_IW_XFFL].max_cnt = (u32)temp;
+	obj_info[IRDMA_HMC_IW_XFFL].size = 4;
+	hmc_fpm_misc->xf_block_size = FIELD_GET(IRDMA_QUERY_FPM_XFBLOCKSIZE, temp);
+	if (!hmc_fpm_misc->xf_block_size)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	irdma_sc_decode_fpm_query(buf, 72, obj_info, IRDMA_HMC_IW_Q1);
+	get_64bit_val(buf, 80, &temp);
+	obj_info[IRDMA_HMC_IW_Q1FL].max_cnt = (u32)temp;
+	obj_info[IRDMA_HMC_IW_Q1FL].size = 4;
+
+	hmc_fpm_misc->q1_block_size = FIELD_GET(IRDMA_QUERY_FPM_Q1BLOCKSIZE, temp);
+	if (!hmc_fpm_misc->q1_block_size)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	irdma_sc_decode_fpm_query(buf, 88, obj_info, IRDMA_HMC_IW_TIMER);
+
+	get_64bit_val(buf, 112, &temp);
+	obj_info[IRDMA_HMC_IW_PBLE].max_cnt = (u32)temp;
+	obj_info[IRDMA_HMC_IW_PBLE].size = 8;
+
+	get_64bit_val(buf, 120, &temp);
+	hmc_fpm_misc->max_ceqs = FIELD_GET(IRDMA_QUERY_FPM_MAX_CEQS, temp);
+	hmc_fpm_misc->ht_multiplier = FIELD_GET(IRDMA_QUERY_FPM_HTMULTIPLIER, temp);
+	hmc_fpm_misc->timer_bucket = FIELD_GET(IRDMA_QUERY_FPM_TIMERBUCKET, temp);
+	if (dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+		return 0;
+	irdma_sc_decode_fpm_query(buf, 96, obj_info, IRDMA_HMC_IW_FSIMC);
+	irdma_sc_decode_fpm_query(buf, 104, obj_info, IRDMA_HMC_IW_FSIAV);
+	irdma_sc_decode_fpm_query(buf, 128, obj_info, IRDMA_HMC_IW_RRF);
+
+	get_64bit_val(buf, 136, &temp);
+	obj_info[IRDMA_HMC_IW_RRFFL].max_cnt = (u32)temp;
+	obj_info[IRDMA_HMC_IW_RRFFL].size = 4;
+	hmc_fpm_misc->rrf_block_size = FIELD_GET(IRDMA_QUERY_FPM_RRFBLOCKSIZE, temp);
+	if (!hmc_fpm_misc->rrf_block_size &&
+	    obj_info[IRDMA_HMC_IW_RRFFL].max_cnt)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	irdma_sc_decode_fpm_query(buf, 144, obj_info, IRDMA_HMC_IW_HDR);
+	irdma_sc_decode_fpm_query(buf, 152, obj_info, IRDMA_HMC_IW_MD);
+	irdma_sc_decode_fpm_query(buf, 160, obj_info, IRDMA_HMC_IW_OOISC);
+
+	get_64bit_val(buf, 168, &temp);
+	obj_info[IRDMA_HMC_IW_OOISCFFL].max_cnt = (u32)temp;
+	obj_info[IRDMA_HMC_IW_OOISCFFL].size = 4;
+	hmc_fpm_misc->ooiscf_block_size = FIELD_GET(IRDMA_QUERY_FPM_OOISCFBLOCKSIZE, temp);
+	if (!hmc_fpm_misc->ooiscf_block_size &&
+	    obj_info[IRDMA_HMC_IW_OOISCFFL].max_cnt)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	return 0;
+}
+
+/**
+ * irdma_sc_find_reg_cq - find cq ctx index
+ * @ceq: ceq sc structure
+ * @cq: cq sc structure
+ */
+static u32 irdma_sc_find_reg_cq(struct irdma_sc_ceq *ceq,
+				struct irdma_sc_cq *cq)
+{
+	u32 i;
+
+	for (i = 0; i < ceq->reg_cq_size; i++) {
+		if (cq == ceq->reg_cq[i])
+			return i;
+	}
+
+	return IRDMA_INVALID_CQ_IDX;
+}
+
+/**
+ * irdma_sc_add_cq_ctx - add cq ctx tracking for ceq
+ * @ceq: ceq sc structure
+ * @cq: cq sc structure
+ */
+enum irdma_status_code irdma_sc_add_cq_ctx(struct irdma_sc_ceq *ceq,
+					   struct irdma_sc_cq *cq)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ceq->req_cq_lock, flags);
+
+	if (ceq->reg_cq_size == ceq->elem_cnt) {
+		spin_unlock_irqrestore(&ceq->req_cq_lock, flags);
+		return IRDMA_ERR_REG_CQ_FULL;
+	}
+
+	ceq->reg_cq[ceq->reg_cq_size++] = cq;
+
+	spin_unlock_irqrestore(&ceq->req_cq_lock, flags);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_remove_cq_ctx - remove cq ctx tracking for ceq
+ * @ceq: ceq sc structure
+ * @cq: cq sc structure
+ */
+void irdma_sc_remove_cq_ctx(struct irdma_sc_ceq *ceq, struct irdma_sc_cq *cq)
+{
+	unsigned long flags;
+	u32 cq_ctx_idx;
+
+	spin_lock_irqsave(&ceq->req_cq_lock, flags);
+	cq_ctx_idx = irdma_sc_find_reg_cq(ceq, cq);
+	if (cq_ctx_idx == IRDMA_INVALID_CQ_IDX)
+		goto exit;
+
+	ceq->reg_cq_size--;
+	if (cq_ctx_idx != ceq->reg_cq_size)
+		ceq->reg_cq[cq_ctx_idx] = ceq->reg_cq[ceq->reg_cq_size];
+	ceq->reg_cq[ceq->reg_cq_size] = NULL;
+
+exit:
+	spin_unlock_irqrestore(&ceq->req_cq_lock, flags);
+}
+
+/**
+ * irdma_sc_cqp_init - Initialize buffers for a control Queue Pair
+ * @cqp: IWARP control queue pair pointer
+ * @info: IWARP control queue pair init info pointer
+ *
+ * Initializes the object and context buffers for a control Queue Pair.
+ */
+static enum irdma_status_code
+irdma_sc_cqp_init(struct irdma_sc_cqp *cqp, struct irdma_cqp_init_info *info)
+{
+	u8 hw_sq_size;
+
+	if (info->sq_size > IRDMA_CQP_SW_SQSIZE_2048 ||
+	    info->sq_size < IRDMA_CQP_SW_SQSIZE_4 ||
+	    ((info->sq_size & (info->sq_size - 1))))
+		return IRDMA_ERR_INVALID_SIZE;
+
+	hw_sq_size = irdma_get_encoded_wqe_size(info->sq_size,
+						IRDMA_QUEUE_TYPE_CQP);
+	cqp->size = sizeof(*cqp);
+	cqp->sq_size = info->sq_size;
+	cqp->hw_sq_size = hw_sq_size;
+	cqp->sq_base = info->sq;
+	cqp->host_ctx = info->host_ctx;
+	cqp->sq_pa = info->sq_pa;
+	cqp->host_ctx_pa = info->host_ctx_pa;
+	cqp->dev = info->dev;
+	cqp->struct_ver = info->struct_ver;
+	cqp->hw_maj_ver = info->hw_maj_ver;
+	cqp->hw_min_ver = info->hw_min_ver;
+	cqp->scratch_array = info->scratch_array;
+	cqp->polarity = 0;
+	cqp->en_datacenter_tcp = info->en_datacenter_tcp;
+	cqp->ena_vf_count = info->ena_vf_count;
+	cqp->hmc_profile = info->hmc_profile;
+	cqp->ceqs_per_vf = info->ceqs_per_vf;
+	cqp->disable_packed = info->disable_packed;
+	cqp->rocev2_rto_policy = info->rocev2_rto_policy;
+	cqp->protocol_used = info->protocol_used;
+	memcpy(&cqp->dcqcn_params, &info->dcqcn_params, sizeof(cqp->dcqcn_params));
+	info->dev->cqp = cqp;
+
+	IRDMA_RING_INIT(cqp->sq_ring, cqp->sq_size);
+	cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] = 0;
+	cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS] = 0;
+	/* for the cqp commands backlog. */
+	INIT_LIST_HEAD(&cqp->dev->cqp_cmd_head);
+
+	writel(0, cqp->dev->hw_regs[IRDMA_CQPTAIL]);
+	writel(0, cqp->dev->hw_regs[IRDMA_CQPDB]);
+	writel(0, cqp->dev->hw_regs[IRDMA_CCQPSTATUS]);
+
+	ibdev_dbg(to_ibdev(cqp->dev),
+		  "WQE: sq_size[%04d] hw_sq_size[%04d] sq_base[%p] sq_pa[%pK] cqp[%p] polarity[x%04x]\n",
+		  cqp->sq_size, cqp->hw_sq_size, cqp->sq_base,
+		  (u64 *)(uintptr_t)cqp->sq_pa, cqp, cqp->polarity);
+	return 0;
+}
+
+/**
+ * irdma_sc_cqp_create - create cqp during bringup
+ * @cqp: struct for cqp hw
+ * @maj_err: If error, major err number
+ * @min_err: If error, minor err number
+ */
+static enum irdma_status_code irdma_sc_cqp_create(struct irdma_sc_cqp *cqp,
+						  u16 *maj_err, u16 *min_err)
+{
+	u64 temp;
+	u8 hw_rev;
+	u32 cnt = 0, p1, p2, val = 0, err_code;
+	enum irdma_status_code ret_code;
+
+	hw_rev = cqp->dev->hw_attrs.uk_attrs.hw_rev;
+	cqp->sdbuf.size = ALIGN(IRDMA_UPDATE_SD_BUFF_SIZE * cqp->sq_size,
+				IRDMA_SD_BUF_ALIGNMENT);
+	cqp->sdbuf.va = dma_alloc_coherent(cqp->dev->hw->device,
+					   cqp->sdbuf.size, &cqp->sdbuf.pa,
+					   GFP_KERNEL);
+	if (!cqp->sdbuf.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	spin_lock_init(&cqp->dev->cqp_lock);
+
+	temp = FIELD_PREP(IRDMA_CQPHC_SQSIZE, cqp->hw_sq_size) |
+	       FIELD_PREP(IRDMA_CQPHC_SVER, cqp->struct_ver) |
+	       FIELD_PREP(IRDMA_CQPHC_DISABLE_PFPDUS, cqp->disable_packed) |
+	       FIELD_PREP(IRDMA_CQPHC_CEQPERVF, cqp->ceqs_per_vf);
+	if (hw_rev >= IRDMA_GEN_2) {
+		temp |= FIELD_PREP(IRDMA_CQPHC_ROCEV2_RTO_POLICY,
+				   cqp->rocev2_rto_policy) |
+			FIELD_PREP(IRDMA_CQPHC_PROTOCOL_USED,
+				   cqp->protocol_used);
+	}
+
+	set_64bit_val(cqp->host_ctx, 0, temp);
+	set_64bit_val(cqp->host_ctx, 8, cqp->sq_pa);
+
+	temp = FIELD_PREP(IRDMA_CQPHC_ENABLED_VFS, cqp->ena_vf_count) |
+	       FIELD_PREP(IRDMA_CQPHC_HMC_PROFILE, cqp->hmc_profile);
+	set_64bit_val(cqp->host_ctx, 16, temp);
+	set_64bit_val(cqp->host_ctx, 24, (uintptr_t)cqp);
+	temp = FIELD_PREP(IRDMA_CQPHC_HW_MAJVER, cqp->hw_maj_ver) |
+	       FIELD_PREP(IRDMA_CQPHC_HW_MINVER, cqp->hw_min_ver);
+	if (hw_rev >= IRDMA_GEN_2) {
+		temp |= FIELD_PREP(IRDMA_CQPHC_MIN_RATE, cqp->dcqcn_params.min_rate) |
+			FIELD_PREP(IRDMA_CQPHC_MIN_DEC_FACTOR, cqp->dcqcn_params.min_dec_factor);
+	}
+	set_64bit_val(cqp->host_ctx, 32, temp);
+	set_64bit_val(cqp->host_ctx, 40, 0);
+	temp = 0;
+	if (hw_rev >= IRDMA_GEN_2) {
+		temp |= FIELD_PREP(IRDMA_CQPHC_DCQCN_T, cqp->dcqcn_params.dcqcn_t) |
+			FIELD_PREP(IRDMA_CQPHC_RAI_FACTOR, cqp->dcqcn_params.rai_factor) |
+			FIELD_PREP(IRDMA_CQPHC_HAI_FACTOR, cqp->dcqcn_params.hai_factor);
+	}
+	set_64bit_val(cqp->host_ctx, 48, temp);
+	temp = 0;
+	if (hw_rev >= IRDMA_GEN_2) {
+		temp |= FIELD_PREP(IRDMA_CQPHC_DCQCN_B, cqp->dcqcn_params.dcqcn_b) |
+			FIELD_PREP(IRDMA_CQPHC_DCQCN_F, cqp->dcqcn_params.dcqcn_f) |
+			FIELD_PREP(IRDMA_CQPHC_CC_CFG_VALID, cqp->dcqcn_params.cc_cfg_valid) |
+			FIELD_PREP(IRDMA_CQPHC_RREDUCE_MPERIOD, cqp->dcqcn_params.rreduce_mperiod);
+	}
+	set_64bit_val(cqp->host_ctx, 56, temp);
+	print_hex_dump_debug("WQE: CQP_HOST_CTX WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, cqp->host_ctx, IRDMA_CQP_CTX_SIZE * 8, false);
+	p1 = cqp->host_ctx_pa >> 32;
+	p2 = (u32)cqp->host_ctx_pa;
+
+	writel(p1, cqp->dev->hw_regs[IRDMA_CCQPHIGH]);
+	writel(p2, cqp->dev->hw_regs[IRDMA_CCQPLOW]);
+
+	do {
+		if (cnt++ > cqp->dev->hw_attrs.max_done_count) {
+			ret_code = IRDMA_ERR_TIMEOUT;
+			goto err;
+		}
+		udelay(cqp->dev->hw_attrs.max_sleep_count);
+		val = readl(cqp->dev->hw_regs[IRDMA_CCQPSTATUS]);
+	} while (!val);
+
+	if (FLD_RS_32(cqp->dev, val, IRDMA_CCQPSTATUS_CCQP_ERR)) {
+		ret_code = IRDMA_ERR_DEVICE_NOT_SUPPORTED;
+		goto err;
+	}
+
+	cqp->process_cqp_sds = irdma_update_sds_noccq;
+	return 0;
+
+err:
+	dma_free_coherent(cqp->dev->hw->device, cqp->sdbuf.size,
+			  cqp->sdbuf.va, cqp->sdbuf.pa);
+	cqp->sdbuf.va = NULL;
+	err_code = readl(cqp->dev->hw_regs[IRDMA_CQPERRCODES]);
+	*min_err = FIELD_GET(IRDMA_CQPERRCODES_CQP_MINOR_CODE, err_code);
+	*maj_err = FIELD_GET(IRDMA_CQPERRCODES_CQP_MAJOR_CODE, err_code);
+	return ret_code;
+}
+
+/**
+ * irdma_sc_cqp_post_sq - post of cqp's sq
+ * @cqp: struct for cqp hw
+ */
+void irdma_sc_cqp_post_sq(struct irdma_sc_cqp *cqp)
+{
+	writel(IRDMA_RING_CURRENT_HEAD(cqp->sq_ring), cqp->dev->cqp_db);
+
+	ibdev_dbg(to_ibdev(cqp->dev),
+		  "WQE: CQP SQ head 0x%x tail 0x%x size 0x%x\n",
+		  cqp->sq_ring.head, cqp->sq_ring.tail, cqp->sq_ring.size);
+}
+
+/**
+ * irdma_sc_cqp_get_next_send_wqe_idx - get next wqe on cqp sq
+ * and pass back index
+ * @cqp: CQP HW structure
+ * @scratch: private data for CQP WQE
+ * @wqe_idx: WQE index of CQP SQ
+ */
+static __le64 *irdma_sc_cqp_get_next_send_wqe_idx(struct irdma_sc_cqp *cqp,
+						  u64 scratch, u32 *wqe_idx)
+{
+	__le64 *wqe = NULL;
+	enum irdma_status_code ret_code;
+
+	if (IRDMA_RING_FULL_ERR(cqp->sq_ring)) {
+		ibdev_dbg(to_ibdev(cqp->dev),
+			  "WQE: CQP SQ is full, head 0x%x tail 0x%x size 0x%x\n",
+			  cqp->sq_ring.head, cqp->sq_ring.tail,
+			  cqp->sq_ring.size);
+		return NULL;
+	}
+	IRDMA_ATOMIC_RING_MOVE_HEAD(cqp->sq_ring, *wqe_idx, ret_code);
+	if (ret_code)
+		return NULL;
+
+	cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS]++;
+	if (!*wqe_idx)
+		cqp->polarity = !cqp->polarity;
+	wqe = cqp->sq_base[*wqe_idx].elem;
+	cqp->scratch_array[*wqe_idx] = scratch;
+	IRDMA_CQP_INIT_WQE(wqe);
+
+	return wqe;
+}
+
+/**
+ * irdma_sc_cqp_get_next_send_wqe - get next wqe on cqp sq
+ * @cqp: struct for cqp hw
+ * @scratch: private data for CQP WQE
+ */
+__le64 *irdma_sc_cqp_get_next_send_wqe(struct irdma_sc_cqp *cqp, u64 scratch)
+{
+	u32 wqe_idx;
+
+	return irdma_sc_cqp_get_next_send_wqe_idx(cqp, scratch, &wqe_idx);
+}
+
+/**
+ * irdma_sc_cqp_destroy - destroy cqp during close
+ * @cqp: struct for cqp hw
+ */
+static enum irdma_status_code irdma_sc_cqp_destroy(struct irdma_sc_cqp *cqp)
+{
+	u32 cnt = 0, val = 1;
+	enum irdma_status_code ret_code = 0;
+
+	writel(0, cqp->dev->hw_regs[IRDMA_CCQPHIGH]);
+	writel(0, cqp->dev->hw_regs[IRDMA_CCQPLOW]);
+	do {
+		if (cnt++ > cqp->dev->hw_attrs.max_done_count) {
+			ret_code = IRDMA_ERR_TIMEOUT;
+			break;
+		}
+		udelay(cqp->dev->hw_attrs.max_sleep_count);
+		val = readl(cqp->dev->hw_regs[IRDMA_CCQPSTATUS]);
+	} while (FLD_RS_32(cqp->dev, val, IRDMA_CCQPSTATUS_CCQP_DONE));
+
+	dma_free_coherent(cqp->dev->hw->device, cqp->sdbuf.size,
+			  cqp->sdbuf.va, cqp->sdbuf.pa);
+	cqp->sdbuf.va = NULL;
+	return ret_code;
+}
+
+/**
+ * irdma_sc_ccq_arm - enable intr for control cq
+ * @ccq: ccq sc struct
+ */
+static void irdma_sc_ccq_arm(struct irdma_sc_cq *ccq)
+{
+	u64 temp_val;
+	u16 sw_cq_sel;
+	u8 arm_next_se;
+	u8 arm_seq_num;
+
+	get_64bit_val(ccq->cq_uk.shadow_area, 32, &temp_val);
+	sw_cq_sel = (u16)FIELD_GET(IRDMA_CQ_DBSA_SW_CQ_SELECT, temp_val);
+	arm_next_se = (u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_NEXT_SE, temp_val);
+	arm_seq_num = (u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_SEQ_NUM, temp_val);
+	arm_seq_num++;
+	temp_val = FIELD_PREP(IRDMA_CQ_DBSA_ARM_SEQ_NUM, arm_seq_num) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_SW_CQ_SELECT, sw_cq_sel) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT_SE, arm_next_se) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT, 1);
+	set_64bit_val(ccq->cq_uk.shadow_area, 32, temp_val);
+
+	dma_wmb(); /* make sure shadow area is updated before arming */
+
+	writel(ccq->cq_uk.cq_id, ccq->dev->cq_arm_db);
+}
+
+/**
+ * irdma_sc_ccq_get_cqe_info - get ccq's cq entry
+ * @ccq: ccq sc struct
+ * @info: completion q entry to return
+ */
+static enum irdma_status_code
+irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq,
+			  struct irdma_ccq_cqe_info *info)
+{
+	u64 qp_ctx, temp, temp1;
+	__le64 *cqe;
+	struct irdma_sc_cqp *cqp;
+	u32 wqe_idx;
+	u32 error;
+	u8 polarity;
+	enum irdma_status_code ret_code = 0;
+
+	if (ccq->cq_uk.avoid_mem_cflct)
+		cqe = IRDMA_GET_CURRENT_EXTENDED_CQ_ELEM(&ccq->cq_uk);
+	else
+		cqe = IRDMA_GET_CURRENT_CQ_ELEM(&ccq->cq_uk);
+
+	get_64bit_val(cqe, 24, &temp);
+	polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, temp);
+	if (polarity != ccq->cq_uk.polarity)
+		return IRDMA_ERR_Q_EMPTY;
+
+	get_64bit_val(cqe, 8, &qp_ctx);
+	cqp = (struct irdma_sc_cqp *)(unsigned long)qp_ctx;
+	info->error = (bool)FIELD_GET(IRDMA_CQ_ERROR, temp);
+	info->maj_err_code = IRDMA_CQPSQ_MAJ_NO_ERROR;
+	info->min_err_code = (u16)FIELD_GET(IRDMA_CQ_MINERR, temp);
+	if (info->error) {
+		info->maj_err_code = (u16)FIELD_GET(IRDMA_CQ_MAJERR, temp);
+		error = readl(cqp->dev->hw_regs[IRDMA_CQPERRCODES]);
+		ibdev_dbg(to_ibdev(cqp->dev),
+			  "CQP: CQPERRCODES error_code[x%08X]\n", error);
+	}
+
+	wqe_idx = (u32)FIELD_GET(IRDMA_CQ_WQEIDX, temp);
+	info->scratch = cqp->scratch_array[wqe_idx];
+
+	get_64bit_val(cqe, 16, &temp1);
+	info->op_ret_val = (u32)FIELD_GET(IRDMA_CCQ_OPRETVAL, temp1);
+	get_64bit_val(cqp->sq_base[wqe_idx].elem, 24, &temp1);
+	info->op_code = (u8)FIELD_GET(IRDMA_CQPSQ_OPCODE, temp1);
+	info->cqp = cqp;
+
+	/*  move the head for cq */
+	IRDMA_RING_MOVE_HEAD(ccq->cq_uk.cq_ring, ret_code);
+	if (!IRDMA_RING_CURRENT_HEAD(ccq->cq_uk.cq_ring))
+		ccq->cq_uk.polarity ^= 1;
+
+	/* update cq tail in cq shadow memory also */
+	IRDMA_RING_MOVE_TAIL(ccq->cq_uk.cq_ring);
+	set_64bit_val(ccq->cq_uk.shadow_area, 0,
+		      IRDMA_RING_CURRENT_HEAD(ccq->cq_uk.cq_ring));
+
+	dma_wmb(); /* make sure shadow area is updated before moving tail */
+
+	IRDMA_RING_MOVE_TAIL(cqp->sq_ring);
+	ccq->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++;
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_poll_for_cqp_op_done - Waits for last write to complete in CQP SQ
+ * @cqp: struct for cqp hw
+ * @op_code: cqp opcode for completion
+ * @compl_info: completion q entry to return
+ */
+static enum irdma_status_code
+irdma_sc_poll_for_cqp_op_done(struct irdma_sc_cqp *cqp, u8 op_code,
+			      struct irdma_ccq_cqe_info *compl_info)
+{
+	struct irdma_ccq_cqe_info info = {};
+	struct irdma_sc_cq *ccq;
+	enum irdma_status_code ret_code = 0;
+	u32 cnt = 0;
+
+	ccq = cqp->dev->ccq;
+	while (1) {
+		if (cnt++ > 100 * cqp->dev->hw_attrs.max_done_count)
+			return IRDMA_ERR_TIMEOUT;
+
+		if (irdma_sc_ccq_get_cqe_info(ccq, &info)) {
+			udelay(cqp->dev->hw_attrs.max_sleep_count);
+			continue;
+		}
+		if (info.error && info.op_code != IRDMA_CQP_OP_QUERY_STAG) {
+			ret_code = IRDMA_ERR_CQP_COMPL_ERROR;
+			break;
+		}
+		/* make sure op code matches*/
+		if (op_code == info.op_code)
+			break;
+		ibdev_dbg(to_ibdev(cqp->dev),
+			  "WQE: opcode mismatch for my op code 0x%x, returned opcode %x\n",
+			  op_code, info.op_code);
+	}
+
+	if (compl_info)
+		memcpy(compl_info, &info, sizeof(*compl_info));
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_manage_hmc_pm_func_table - manage of function table
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @info: info for the manage function table operation
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code
+irdma_sc_manage_hmc_pm_func_table(struct irdma_sc_cqp *cqp,
+				  struct irdma_hmc_fcn_info *info,
+				  u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 0, 0);
+	set_64bit_val(wqe, 8, 0);
+	set_64bit_val(wqe, 16, 0);
+	set_64bit_val(wqe, 32, 0);
+	set_64bit_val(wqe, 40, 0);
+	set_64bit_val(wqe, 48, 0);
+	set_64bit_val(wqe, 56, 0);
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_MHMC_VFIDX, info->vf_id) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE,
+			 IRDMA_CQP_OP_MANAGE_HMC_PM_FUNC_TABLE) |
+	      FIELD_PREP(IRDMA_CQPSQ_MHMC_FREEPMFN, info->free_fcn) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: MANAGE_HMC_PM_FUNC_TABLE WQE",
+			     DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_manage_hmc_pm_func_table_done - wait for cqp wqe completion for function table
+ * @cqp: struct for cqp hw
+ */
+static enum irdma_status_code
+irdma_sc_manage_hmc_pm_func_table_done(struct irdma_sc_cqp *cqp)
+{
+	return irdma_sc_poll_for_cqp_op_done(cqp,
+					     IRDMA_CQP_OP_MANAGE_HMC_PM_FUNC_TABLE,
+					     NULL);
+}
+
+/**
+ * irdma_sc_commit_fpm_val_done - wait for cqp eqe completion
+ * for fpm commit
+ * @cqp: struct for cqp hw
+ */
+static enum irdma_status_code
+irdma_sc_commit_fpm_val_done(struct irdma_sc_cqp *cqp)
+{
+	return irdma_sc_poll_for_cqp_op_done(cqp, IRDMA_CQP_OP_COMMIT_FPM_VAL,
+					     NULL);
+}
+
+/**
+ * irdma_sc_commit_fpm_val - cqp wqe for commit fpm values
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @hmc_fn_id: hmc function id
+ * @commit_fpm_mem: Memory for fpm values
+ * @post_sq: flag for cqp db to ring
+ * @wait_type: poll ccq or cqp registers for cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_commit_fpm_val(struct irdma_sc_cqp *cqp, u64 scratch, u8 hmc_fn_id,
+			struct irdma_dma_mem *commit_fpm_mem, bool post_sq,
+			u8 wait_type)
+{
+	__le64 *wqe;
+	u64 hdr;
+	u32 tail, val, error;
+	enum irdma_status_code ret_code = 0;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, hmc_fn_id);
+	set_64bit_val(wqe, 32, commit_fpm_mem->pa);
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_BUFSIZE, IRDMA_COMMIT_FPM_BUF_SIZE) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_COMMIT_FPM_VAL) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: COMMIT_FPM_VAL WQE", DUMP_PREFIX_OFFSET,
+			     16, 8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_get_cqp_reg_info(cqp, &val, &tail, &error);
+
+	if (post_sq) {
+		irdma_sc_cqp_post_sq(cqp);
+		if (wait_type == IRDMA_CQP_WAIT_POLL_REGS)
+			ret_code = irdma_cqp_poll_registers(cqp, tail,
+							    cqp->dev->hw_attrs.max_done_count);
+		else if (wait_type == IRDMA_CQP_WAIT_POLL_CQ)
+			ret_code = irdma_sc_commit_fpm_val_done(cqp);
+	}
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_query_fpm_val_done - poll for cqp wqe completion for
+ * query fpm
+ * @cqp: struct for cqp hw
+ */
+static enum irdma_status_code
+irdma_sc_query_fpm_val_done(struct irdma_sc_cqp *cqp)
+{
+	return irdma_sc_poll_for_cqp_op_done(cqp, IRDMA_CQP_OP_QUERY_FPM_VAL,
+					     NULL);
+}
+
+/**
+ * irdma_sc_query_fpm_val - cqp wqe query fpm values
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @hmc_fn_id: hmc function id
+ * @query_fpm_mem: memory for return fpm values
+ * @post_sq: flag for cqp db to ring
+ * @wait_type: poll ccq or cqp registers for cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_query_fpm_val(struct irdma_sc_cqp *cqp, u64 scratch, u8 hmc_fn_id,
+		       struct irdma_dma_mem *query_fpm_mem, bool post_sq,
+		       u8 wait_type)
+{
+	__le64 *wqe;
+	u64 hdr;
+	u32 tail, val, error;
+	enum irdma_status_code ret_code = 0;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, hmc_fn_id);
+	set_64bit_val(wqe, 32, query_fpm_mem->pa);
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_QUERY_FPM_VAL) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: QUERY_FPM WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_get_cqp_reg_info(cqp, &val, &tail, &error);
+
+	if (post_sq) {
+		irdma_sc_cqp_post_sq(cqp);
+		if (wait_type == IRDMA_CQP_WAIT_POLL_REGS)
+			ret_code = irdma_cqp_poll_registers(cqp, tail,
+							    cqp->dev->hw_attrs.max_done_count);
+		else if (wait_type == IRDMA_CQP_WAIT_POLL_CQ)
+			ret_code = irdma_sc_query_fpm_val_done(cqp);
+	}
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_ceq_init - initialize ceq
+ * @ceq: ceq sc structure
+ * @info: ceq initialization info
+ */
+static enum irdma_status_code
+irdma_sc_ceq_init(struct irdma_sc_ceq *ceq, struct irdma_ceq_init_info *info)
+{
+	u32 pble_obj_cnt;
+
+	if (info->elem_cnt < info->dev->hw_attrs.min_hw_ceq_size ||
+	    info->elem_cnt > info->dev->hw_attrs.max_hw_ceq_size)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	if (info->ceq_id > (info->dev->hmc_fpm_misc.max_ceqs - 1))
+		return IRDMA_ERR_INVALID_CEQ_ID;
+	pble_obj_cnt = info->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt;
+
+	if (info->virtual_map && info->first_pm_pbl_idx >= pble_obj_cnt)
+		return IRDMA_ERR_INVALID_PBLE_INDEX;
+
+	ceq->size = sizeof(*ceq);
+	ceq->ceqe_base = (struct irdma_ceqe *)info->ceqe_base;
+	ceq->ceq_id = info->ceq_id;
+	ceq->dev = info->dev;
+	ceq->elem_cnt = info->elem_cnt;
+	ceq->ceq_elem_pa = info->ceqe_pa;
+	ceq->virtual_map = info->virtual_map;
+	ceq->itr_no_expire = info->itr_no_expire;
+	ceq->reg_cq = info->reg_cq;
+	ceq->reg_cq_size = 0;
+	spin_lock_init(&ceq->req_cq_lock);
+	ceq->pbl_chunk_size = (ceq->virtual_map ? info->pbl_chunk_size : 0);
+	ceq->first_pm_pbl_idx = (ceq->virtual_map ? info->first_pm_pbl_idx : 0);
+	ceq->pbl_list = (ceq->virtual_map ? info->pbl_list : NULL);
+	ceq->tph_en = info->tph_en;
+	ceq->tph_val = info->tph_val;
+	ceq->vsi = info->vsi;
+	ceq->polarity = 1;
+	IRDMA_RING_INIT(ceq->ceq_ring, ceq->elem_cnt);
+	ceq->dev->ceq[info->ceq_id] = ceq;
+
+	return 0;
+}
+
+/**
+ * irdma_sc_ceq_create - create ceq wqe
+ * @ceq: ceq sc structure
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+
+static enum irdma_status_code irdma_sc_ceq_create(struct irdma_sc_ceq *ceq,
+						  u64 scratch, bool post_sq)
+{
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+	u64 hdr;
+
+	cqp = ceq->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	set_64bit_val(wqe, 16, ceq->elem_cnt);
+	set_64bit_val(wqe, 32,
+		      (ceq->virtual_map ? 0 : ceq->ceq_elem_pa));
+	set_64bit_val(wqe, 48,
+		      (ceq->virtual_map ? ceq->first_pm_pbl_idx : 0));
+	set_64bit_val(wqe, 56,
+		      FIELD_PREP(IRDMA_CQPSQ_TPHVAL, ceq->tph_val) |
+		      FIELD_PREP(IRDMA_CQPSQ_VSIIDX, ceq->vsi->vsi_idx));
+	hdr = FIELD_PREP(IRDMA_CQPSQ_CEQ_CEQID, ceq->ceq_id) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_CEQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_CEQ_LPBLSIZE, ceq->pbl_chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_CEQ_VMAP, ceq->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_CEQ_ITRNOEXPIRE, ceq->itr_no_expire) |
+	      FIELD_PREP(IRDMA_CQPSQ_TPHEN, ceq->tph_en) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: CEQ_CREATE WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_cceq_create_done - poll for control ceq wqe to complete
+ * @ceq: ceq sc structure
+ */
+static enum irdma_status_code
+irdma_sc_cceq_create_done(struct irdma_sc_ceq *ceq)
+{
+	struct irdma_sc_cqp *cqp;
+
+	cqp = ceq->dev->cqp;
+	return irdma_sc_poll_for_cqp_op_done(cqp, IRDMA_CQP_OP_CREATE_CEQ,
+					     NULL);
+}
+
+/**
+ * irdma_sc_cceq_destroy_done - poll for destroy cceq to complete
+ * @ceq: ceq sc structure
+ */
+static enum irdma_status_code
+irdma_sc_cceq_destroy_done(struct irdma_sc_ceq *ceq)
+{
+	struct irdma_sc_cqp *cqp;
+
+	if (ceq->reg_cq)
+		irdma_sc_remove_cq_ctx(ceq, ceq->dev->ccq);
+
+	cqp = ceq->dev->cqp;
+	cqp->process_cqp_sds = irdma_update_sds_noccq;
+
+	return irdma_sc_poll_for_cqp_op_done(cqp, IRDMA_CQP_OP_DESTROY_CEQ,
+					     NULL);
+}
+
+/**
+ * irdma_sc_cceq_create - create cceq
+ * @ceq: ceq sc structure
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_cceq_create(struct irdma_sc_ceq *ceq,
+						   u64 scratch)
+{
+	enum irdma_status_code ret_code;
+	struct irdma_sc_dev *dev = ceq->dev;
+
+	dev->ccq->vsi = ceq->vsi;
+	if (ceq->reg_cq) {
+		ret_code = irdma_sc_add_cq_ctx(ceq, ceq->dev->ccq);
+		if (ret_code)
+			return ret_code;
+	}
+
+	ret_code = dev->ceq_ops->ceq_create(ceq, scratch, true);
+	if (!ret_code)
+		return irdma_sc_cceq_create_done(ceq);
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_ceq_destroy - destroy ceq
+ * @ceq: ceq sc structure
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_ceq_destroy(struct irdma_sc_ceq *ceq,
+						   u64 scratch, bool post_sq)
+{
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+	u64 hdr;
+
+	cqp = ceq->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, ceq->elem_cnt);
+	set_64bit_val(wqe, 48, ceq->first_pm_pbl_idx);
+	hdr = ceq->ceq_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_CEQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_CEQ_LPBLSIZE, ceq->pbl_chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_CEQ_VMAP, ceq->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_TPHEN, ceq->tph_en) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: CEQ_DESTROY WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_process_ceq - process ceq
+ * @dev: sc device struct
+ * @ceq: ceq sc structure
+ *
+ * It is expected caller serializes this function with cleanup_ceqes()
+ * because these functions manipulate the same ceq
+ */
+static void *irdma_sc_process_ceq(struct irdma_sc_dev *dev,
+				  struct irdma_sc_ceq *ceq)
+{
+	u64 temp;
+	__le64 *ceqe;
+	struct irdma_sc_cq *cq = NULL;
+	struct irdma_sc_cq *temp_cq;
+	u8 polarity;
+	u32 cq_idx;
+	unsigned long flags;
+
+	do {
+		cq_idx = 0;
+		ceqe = IRDMA_GET_CURRENT_CEQ_ELEM(ceq);
+		get_64bit_val(ceqe, 0, &temp);
+		polarity = (u8)FIELD_GET(IRDMA_CEQE_VALID, temp);
+		if (polarity != ceq->polarity)
+			return NULL;
+
+		temp_cq = (struct irdma_sc_cq *)(unsigned long)(temp << 1);
+		if (!temp_cq) {
+			cq_idx = IRDMA_INVALID_CQ_IDX;
+			IRDMA_RING_MOVE_TAIL(ceq->ceq_ring);
+
+			if (!IRDMA_RING_CURRENT_TAIL(ceq->ceq_ring))
+				ceq->polarity ^= 1;
+			continue;
+		}
+
+		cq = temp_cq;
+		if (ceq->reg_cq) {
+			spin_lock_irqsave(&ceq->req_cq_lock, flags);
+			cq_idx = irdma_sc_find_reg_cq(ceq, cq);
+			spin_unlock_irqrestore(&ceq->req_cq_lock, flags);
+		}
+
+		IRDMA_RING_MOVE_TAIL(ceq->ceq_ring);
+		if (!IRDMA_RING_CURRENT_TAIL(ceq->ceq_ring))
+			ceq->polarity ^= 1;
+	} while (cq_idx == IRDMA_INVALID_CQ_IDX);
+
+	if (cq)
+		irdma_sc_cq_ack(cq);
+	return cq;
+}
+
+/**
+ * irdma_sc_cleanup_ceqes - clear the valid ceqes ctx matching the cq
+ * @cq: cq for which the ceqes need to be cleaned up
+ * @ceq: ceq ptr
+ *
+ * The function is called after the cq is destroyed to cleanup
+ * its pending ceqe entries. It is expected caller serializes this
+ * function with process_ceq() in interrupt context.
+ */
+static void irdma_sc_cleanup_ceqes(struct irdma_sc_cq *cq, struct irdma_sc_ceq *ceq)
+{
+	struct irdma_sc_cq *next_cq;
+	u8 ceq_polarity = ceq->polarity;
+	__le64 *ceqe;
+	u8 polarity;
+	u64 temp;
+	int next;
+	u32 i;
+
+	next = IRDMA_RING_GET_NEXT_TAIL(ceq->ceq_ring, 0);
+
+	for (i = 1; i <= IRDMA_RING_SIZE(*ceq); i++) {
+		ceqe = IRDMA_GET_CEQ_ELEM_AT_POS(ceq, next);
+
+		get_64bit_val(ceqe, 0, &temp);
+		polarity = (u8)FIELD_GET(IRDMA_CEQE_VALID, temp);
+		if (polarity != ceq_polarity)
+			return;
+
+		next_cq = (struct irdma_sc_cq *)(unsigned long)(temp << 1);
+		if (cq == next_cq)
+			set_64bit_val(ceqe, 0, temp & IRDMA_CEQE_VALID);
+
+		next = IRDMA_RING_GET_NEXT_TAIL(ceq->ceq_ring, i);
+		if (!next)
+			ceq_polarity ^= 1;
+	}
+}
+
+/**
+ * irdma_sc_aeq_init - initialize aeq
+ * @aeq: aeq structure ptr
+ * @info: aeq initialization info
+ */
+static enum irdma_status_code
+irdma_sc_aeq_init(struct irdma_sc_aeq *aeq, struct irdma_aeq_init_info *info)
+{
+	u32 pble_obj_cnt;
+
+	if (info->elem_cnt < info->dev->hw_attrs.min_hw_aeq_size ||
+	    info->elem_cnt > info->dev->hw_attrs.max_hw_aeq_size)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	pble_obj_cnt = info->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt;
+
+	if (info->virtual_map && info->first_pm_pbl_idx >= pble_obj_cnt)
+		return IRDMA_ERR_INVALID_PBLE_INDEX;
+
+	aeq->size = sizeof(*aeq);
+	aeq->polarity = 1;
+	aeq->aeqe_base = (struct irdma_sc_aeqe *)info->aeqe_base;
+	aeq->dev = info->dev;
+	aeq->elem_cnt = info->elem_cnt;
+	aeq->aeq_elem_pa = info->aeq_elem_pa;
+	IRDMA_RING_INIT(aeq->aeq_ring, aeq->elem_cnt);
+	aeq->virtual_map = info->virtual_map;
+	aeq->pbl_list = (aeq->virtual_map ? info->pbl_list : NULL);
+	aeq->pbl_chunk_size = (aeq->virtual_map ? info->pbl_chunk_size : 0);
+	aeq->first_pm_pbl_idx = (aeq->virtual_map ? info->first_pm_pbl_idx : 0);
+	aeq->msix_idx = info->msix_idx;
+	info->dev->aeq = aeq;
+
+	return 0;
+}
+
+/**
+ * irdma_sc_aeq_create - create aeq
+ * @aeq: aeq structure ptr
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_aeq_create(struct irdma_sc_aeq *aeq,
+						  u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+
+	cqp = aeq->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	set_64bit_val(wqe, 16, aeq->elem_cnt);
+	set_64bit_val(wqe, 32,
+		      (aeq->virtual_map ? 0 : aeq->aeq_elem_pa));
+	set_64bit_val(wqe, 48,
+		      (aeq->virtual_map ? aeq->first_pm_pbl_idx : 0));
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_AEQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_AEQ_LPBLSIZE, aeq->pbl_chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_AEQ_VMAP, aeq->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: AEQ_CREATE WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_aeq_destroy - destroy aeq during close
+ * @aeq: aeq structure ptr
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_aeq_destroy(struct irdma_sc_aeq *aeq,
+						   u64 scratch, bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	struct irdma_sc_dev *dev;
+	u64 hdr;
+
+	dev = aeq->dev;
+	if (dev->privileged)
+		writel(0, dev->hw_regs[IRDMA_PFINT_AEQCTL]);
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+	set_64bit_val(wqe, 16, aeq->elem_cnt);
+	set_64bit_val(wqe, 48, aeq->first_pm_pbl_idx);
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_AEQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_AEQ_LPBLSIZE, aeq->pbl_chunk_size) |
+	      FIELD_PREP(IRDMA_CQPSQ_AEQ_VMAP, aeq->virtual_map) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: AEQ_DESTROY WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	if (post_sq)
+		irdma_sc_cqp_post_sq(cqp);
+	return 0;
+}
+
+/**
+ * irdma_sc_get_next_aeqe - get next aeq entry
+ * @aeq: aeq structure ptr
+ * @info: aeqe info to be returned
+ */
+static enum irdma_status_code
+irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, struct irdma_aeqe_info *info)
+{
+	u64 temp, compl_ctx;
+	__le64 *aeqe;
+	u16 wqe_idx;
+	u8 ae_src;
+	u8 polarity;
+
+	aeqe = IRDMA_GET_CURRENT_AEQ_ELEM(aeq);
+	get_64bit_val(aeqe, 0, &compl_ctx);
+	get_64bit_val(aeqe, 8, &temp);
+	polarity = (u8)FIELD_GET(IRDMA_AEQE_VALID, temp);
+
+	if (aeq->polarity != polarity)
+		return IRDMA_ERR_Q_EMPTY;
+
+	print_hex_dump_debug("WQE: AEQ_ENTRY WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     aeqe, 16, false);
+
+	ae_src = (u8)FIELD_GET(IRDMA_AEQE_AESRC, temp);
+	wqe_idx = (u16)FIELD_GET(IRDMA_AEQE_WQDESCIDX, temp);
+	info->qp_cq_id = (u32)FIELD_GET(IRDMA_AEQE_QPCQID_LOW, temp) |
+			 ((u32)FIELD_GET(IRDMA_AEQE_QPCQID_HI, temp) << 18);
+	info->ae_id = (u16)FIELD_GET(IRDMA_AEQE_AECODE, temp);
+	info->tcp_state = (u8)FIELD_GET(IRDMA_AEQE_TCPSTATE, temp);
+	info->iwarp_state = (u8)FIELD_GET(IRDMA_AEQE_IWSTATE, temp);
+	info->q2_data_written = (u8)FIELD_GET(IRDMA_AEQE_Q2DATA, temp);
+	info->aeqe_overflow = (bool)FIELD_GET(IRDMA_AEQE_OVERFLOW, temp);
+
+	info->ae_src = ae_src;
+	switch (info->ae_id) {
+	case IRDMA_AE_PRIV_OPERATION_DENIED:
+	case IRDMA_AE_AMP_INVALIDATE_TYPE1_MW:
+	case IRDMA_AE_AMP_MWBIND_ZERO_BASED_TYPE1_MW:
+	case IRDMA_AE_AMP_FASTREG_INVALID_PBL_HPS_CFG:
+	case IRDMA_AE_AMP_FASTREG_PBLE_MISMATCH:
+	case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG:
+	case IRDMA_AE_UDA_XMIT_BAD_PD:
+	case IRDMA_AE_UDA_XMIT_DGRAM_TOO_SHORT:
+	case IRDMA_AE_BAD_CLOSE:
+	case IRDMA_AE_RDMA_READ_WHILE_ORD_ZERO:
+	case IRDMA_AE_STAG_ZERO_INVALID:
+	case IRDMA_AE_IB_RREQ_AND_Q1_FULL:
+	case IRDMA_AE_IB_INVALID_REQUEST:
+	case IRDMA_AE_WQE_UNEXPECTED_OPCODE:
+	case IRDMA_AE_IB_REMOTE_ACCESS_ERROR:
+	case IRDMA_AE_IB_REMOTE_OP_ERROR:
+	case IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION:
+	case IRDMA_AE_DDP_UBE_INVALID_MO:
+	case IRDMA_AE_DDP_UBE_INVALID_QN:
+	case IRDMA_AE_DDP_NO_L_BIT:
+	case IRDMA_AE_RDMAP_ROE_INVALID_RDMAP_VERSION:
+	case IRDMA_AE_RDMAP_ROE_UNEXPECTED_OPCODE:
+	case IRDMA_AE_ROE_INVALID_RDMA_READ_REQUEST:
+	case IRDMA_AE_ROE_INVALID_RDMA_WRITE_OR_READ_RESP:
+	case IRDMA_AE_ROCE_RSP_LENGTH_ERROR:
+	case IRDMA_AE_INVALID_ARP_ENTRY:
+	case IRDMA_AE_INVALID_TCP_OPTION_RCVD:
+	case IRDMA_AE_STALE_ARP_ENTRY:
+	case IRDMA_AE_INVALID_AH_ENTRY:
+	case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR:
+	case IRDMA_AE_LLP_SEGMENT_TOO_SMALL:
+	case IRDMA_AE_LLP_TOO_MANY_RETRIES:
+	case IRDMA_AE_LLP_DOUBT_REACHABILITY:
+	case IRDMA_AE_LLP_CONNECTION_ESTABLISHED:
+	case IRDMA_AE_RESET_SENT:
+	case IRDMA_AE_TERMINATE_SENT:
+	case IRDMA_AE_RESET_NOT_SENT:
+	case IRDMA_AE_LCE_QP_CATASTROPHIC:
+	case IRDMA_AE_QP_SUSPEND_COMPLETE:
+	case IRDMA_AE_UDA_L4LEN_INVALID:
+		info->qp = true;
+		info->compl_ctx = compl_ctx;
+		break;
+	case IRDMA_AE_LCE_CQ_CATASTROPHIC:
+		info->cq = true;
+		info->compl_ctx = compl_ctx << 1;
+		ae_src = IRDMA_AE_SOURCE_RSVD;
+		break;
+	case IRDMA_AE_ROCE_EMPTY_MCG:
+	case IRDMA_AE_ROCE_BAD_MC_IP_ADDR:
+	case IRDMA_AE_ROCE_BAD_MC_QPID:
+	case IRDMA_AE_MCG_QP_PROTOCOL_MISMATCH:
+		fallthrough;
+	case IRDMA_AE_LLP_CONNECTION_RESET:
+	case IRDMA_AE_LLP_SYN_RECEIVED:
+	case IRDMA_AE_LLP_FIN_RECEIVED:
+	case IRDMA_AE_LLP_CLOSE_COMPLETE:
+	case IRDMA_AE_LLP_TERMINATE_RECEIVED:
+	case IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE:
+		ae_src = IRDMA_AE_SOURCE_RSVD;
+		info->qp = true;
+		info->compl_ctx = compl_ctx;
+		break;
+	default:
+		break;
+	}
+
+	switch (ae_src) {
+	case IRDMA_AE_SOURCE_RQ:
+	case IRDMA_AE_SOURCE_RQ_0011:
+		info->qp = true;
+		info->rq = true;
+		info->wqe_idx = wqe_idx;
+		info->compl_ctx = compl_ctx;
+		break;
+	case IRDMA_AE_SOURCE_CQ:
+	case IRDMA_AE_SOURCE_CQ_0110:
+	case IRDMA_AE_SOURCE_CQ_1010:
+	case IRDMA_AE_SOURCE_CQ_1110:
+		info->cq = true;
+		info->compl_ctx = compl_ctx << 1;
+		break;
+	case IRDMA_AE_SOURCE_SQ:
+	case IRDMA_AE_SOURCE_SQ_0111:
+		info->qp = true;
+		info->sq = true;
+		info->wqe_idx = wqe_idx;
+		info->compl_ctx = compl_ctx;
+		break;
+	case IRDMA_AE_SOURCE_IN_RR_WR:
+	case IRDMA_AE_SOURCE_IN_RR_WR_1011:
+		info->qp = true;
+		info->compl_ctx = compl_ctx;
+		info->in_rdrsp_wr = true;
+		break;
+	case IRDMA_AE_SOURCE_OUT_RR:
+	case IRDMA_AE_SOURCE_OUT_RR_1111:
+		info->qp = true;
+		info->compl_ctx = compl_ctx;
+		info->out_rdrsp = true;
+		break;
+	case IRDMA_AE_SOURCE_RSVD:
+	default:
+		break;
+	}
+
+	IRDMA_RING_MOVE_TAIL(aeq->aeq_ring);
+	if (!IRDMA_RING_CURRENT_TAIL(aeq->aeq_ring))
+		aeq->polarity ^= 1;
+
+	return 0;
+}
+
+/**
+ * irdma_sc_repost_aeq_entries - repost completed aeq entries
+ * @dev: sc device struct
+ * @count: allocate count
+ */
+static enum irdma_status_code
+irdma_sc_repost_aeq_entries(struct irdma_sc_dev *dev, u32 count)
+{
+	writel(count, dev->hw_regs[IRDMA_AEQALLOC]);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_aeq_create_done - create aeq
+ * @aeq: aeq structure ptr
+ */
+static enum irdma_status_code irdma_sc_aeq_create_done(struct irdma_sc_aeq *aeq)
+{
+	struct irdma_sc_cqp *cqp;
+
+	cqp = aeq->dev->cqp;
+
+	return irdma_sc_poll_for_cqp_op_done(cqp, IRDMA_CQP_OP_CREATE_AEQ,
+					     NULL);
+}
+
+/**
+ * irdma_sc_aeq_destroy_done - destroy of aeq during close
+ * @aeq: aeq structure ptr
+ */
+static enum irdma_status_code
+irdma_sc_aeq_destroy_done(struct irdma_sc_aeq *aeq)
+{
+	struct irdma_sc_cqp *cqp;
+
+	cqp = aeq->dev->cqp;
+
+	return irdma_sc_poll_for_cqp_op_done(cqp, IRDMA_CQP_OP_DESTROY_AEQ,
+					     NULL);
+}
+
+/**
+ * irdma_sc_ccq_init - initialize control cq
+ * @cq: sc's cq ctruct
+ * @info: info for control cq initialization
+ */
+static enum irdma_status_code
+irdma_sc_ccq_init(struct irdma_sc_cq *cq, struct irdma_ccq_init_info *info)
+{
+	u32 pble_obj_cnt;
+
+	if (info->num_elem < info->dev->hw_attrs.uk_attrs.min_hw_cq_size ||
+	    info->num_elem > info->dev->hw_attrs.uk_attrs.max_hw_cq_size)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	if (info->ceq_id > (info->dev->hmc_fpm_misc.max_ceqs - 1))
+		return IRDMA_ERR_INVALID_CEQ_ID;
+
+	pble_obj_cnt = info->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt;
+
+	if (info->virtual_map && info->first_pm_pbl_idx >= pble_obj_cnt)
+		return IRDMA_ERR_INVALID_PBLE_INDEX;
+
+	cq->cq_pa = info->cq_pa;
+	cq->cq_uk.cq_base = info->cq_base;
+	cq->shadow_area_pa = info->shadow_area_pa;
+	cq->cq_uk.shadow_area = info->shadow_area;
+	cq->shadow_read_threshold = info->shadow_read_threshold;
+	cq->dev = info->dev;
+	cq->ceq_id = info->ceq_id;
+	cq->cq_uk.cq_size = info->num_elem;
+	cq->cq_type = IRDMA_CQ_TYPE_CQP;
+	cq->ceqe_mask = info->ceqe_mask;
+	IRDMA_RING_INIT(cq->cq_uk.cq_ring, info->num_elem);
+	cq->cq_uk.cq_id = 0; /* control cq is id 0 always */
+	cq->ceq_id_valid = info->ceq_id_valid;
+	cq->tph_en = info->tph_en;
+	cq->tph_val = info->tph_val;
+	cq->cq_uk.avoid_mem_cflct = info->avoid_mem_cflct;
+	cq->pbl_list = info->pbl_list;
+	cq->virtual_map = info->virtual_map;
+	cq->pbl_chunk_size = info->pbl_chunk_size;
+	cq->first_pm_pbl_idx = info->first_pm_pbl_idx;
+	cq->cq_uk.polarity = true;
+	cq->vsi = info->vsi;
+	cq->cq_uk.cq_ack_db = cq->dev->cq_ack_db;
+
+	/* Only applicable to CQs other than CCQ so initialize to zero */
+	cq->cq_uk.cqe_alloc_db = NULL;
+
+	info->dev->ccq = cq;
+	return 0;
+}
+
+/**
+ * irdma_sc_ccq_create_done - poll cqp for ccq create
+ * @ccq: ccq sc struct
+ */
+static enum irdma_status_code irdma_sc_ccq_create_done(struct irdma_sc_cq *ccq)
+{
+	struct irdma_sc_cqp *cqp;
+
+	cqp = ccq->dev->cqp;
+	return irdma_sc_poll_for_cqp_op_done(cqp, IRDMA_CQP_OP_CREATE_CQ, NULL);
+}
+
+/**
+ * irdma_sc_ccq_create - create control cq
+ * @ccq: ccq sc struct
+ * @scratch: u64 saved to be used during cqp completion
+ * @check_overflow: overlow flag for ccq
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_ccq_create(struct irdma_sc_cq *ccq,
+						  u64 scratch,
+						  bool check_overflow,
+						  bool post_sq)
+{
+	enum irdma_status_code ret_code;
+
+	ret_code = irdma_sc_cq_create(ccq, scratch, check_overflow, post_sq);
+	if (ret_code)
+		return ret_code;
+
+	if (post_sq) {
+		ret_code = irdma_sc_ccq_create_done(ccq);
+		if (ret_code)
+			return ret_code;
+	}
+	ccq->dev->cqp->process_cqp_sds = irdma_cqp_sds_cmd;
+
+	return 0;
+}
+
+/**
+ * irdma_sc_ccq_destroy - destroy ccq during close
+ * @ccq: ccq sc struct
+ * @scratch: u64 saved to be used during cqp completion
+ * @post_sq: flag for cqp db to ring
+ */
+static enum irdma_status_code irdma_sc_ccq_destroy(struct irdma_sc_cq *ccq,
+						   u64 scratch, bool post_sq)
+{
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+	u64 hdr;
+	enum irdma_status_code ret_code = 0;
+	u32 tail, val, error;
+
+	cqp = ccq->dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 0, ccq->cq_uk.cq_size);
+	set_64bit_val(wqe, 8, (uintptr_t)ccq >> 1);
+	set_64bit_val(wqe, 40, ccq->shadow_area_pa);
+
+	hdr = ccq->cq_uk.cq_id |
+	      FLD_LS_64(ccq->dev, (ccq->ceq_id_valid ? ccq->ceq_id : 0),
+			IRDMA_CQPSQ_CQ_CEQID) |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_CQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_ENCEQEMASK, ccq->ceqe_mask) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CEQIDVALID, ccq->ceq_id_valid) |
+	      FIELD_PREP(IRDMA_CQPSQ_TPHEN, ccq->tph_en) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_AVOIDMEMCNFLCT, ccq->cq_uk.avoid_mem_cflct) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: CCQ_DESTROY WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_get_cqp_reg_info(cqp, &val, &tail, &error);
+
+	if (post_sq) {
+		irdma_sc_cqp_post_sq(cqp);
+		ret_code = irdma_cqp_poll_registers(cqp, tail,
+						    cqp->dev->hw_attrs.max_done_count);
+	}
+
+	cqp->process_cqp_sds = irdma_update_sds_noccq;
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_init_iw_hmc() - queries fpm values using cqp and populates hmc_info
+ * @dev : ptr to irdma_dev struct
+ * @hmc_fn_id: hmc function id
+ */
+enum irdma_status_code irdma_sc_init_iw_hmc(struct irdma_sc_dev *dev,
+					    u8 hmc_fn_id)
+{
+	struct irdma_hmc_info *hmc_info;
+	struct irdma_hmc_fpm_misc *hmc_fpm_misc;
+	struct irdma_dma_mem query_fpm_mem;
+	enum irdma_status_code ret_code = 0;
+	u8 wait_type;
+
+	if ((dev->privileged && hmc_fn_id > dev->hw_attrs.max_hw_vf_fpm_id) ||
+	    (dev->hmc_fn_id != hmc_fn_id &&
+	     hmc_fn_id < dev->hw_attrs.first_hw_vf_fpm_id))
+		return IRDMA_ERR_INVALID_HMCFN_ID;
+
+	ibdev_dbg(to_ibdev(dev), "HMC: hmc_fn_id %u, dev->hmc_fn_id %u\n",
+		  hmc_fn_id, dev->hmc_fn_id);
+	if (hmc_fn_id == dev->hmc_fn_id) {
+		hmc_info = dev->hmc_info;
+		hmc_fpm_misc = &dev->hmc_fpm_misc;
+		query_fpm_mem.pa = dev->fpm_query_buf_pa;
+		query_fpm_mem.va = dev->fpm_query_buf;
+	} else {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: Bad hmc function id: hmc_fn_id %u, dev->hmc_fn_id %u\n",
+			  hmc_fn_id, dev->hmc_fn_id);
+
+		return IRDMA_ERR_INVALID_HMCFN_ID;
+	}
+	hmc_info->hmc_fn_id = hmc_fn_id;
+	wait_type = (u8)IRDMA_CQP_WAIT_POLL_REGS;
+
+	ret_code = irdma_sc_query_fpm_val(dev->cqp, 0, hmc_info->hmc_fn_id,
+					  &query_fpm_mem, true, wait_type);
+	if (ret_code)
+		return ret_code;
+
+	/* parse the fpm_query_buf and fill hmc obj info */
+	ret_code = irdma_sc_parse_fpm_query_buf(dev, query_fpm_mem.va, hmc_info,
+						hmc_fpm_misc);
+
+	print_hex_dump_debug("HMC: QUERY FPM BUFFER", DUMP_PREFIX_OFFSET, 16,
+			     8, query_fpm_mem.va, IRDMA_QUERY_FPM_BUF_SIZE,
+			     false);
+	return ret_code;
+}
+
+/**
+ * irdma_sc_cfg_iw_fpm() - commits hmc obj cnt values using cqp
+ * command and populates fpm base address in hmc_info
+ * @dev : ptr to irdma_dev struct
+ * @hmc_fn_id: hmc function id
+ */
+static enum irdma_status_code irdma_sc_cfg_iw_fpm(struct irdma_sc_dev *dev,
+						  u8 hmc_fn_id)
+{
+	struct irdma_hmc_info *hmc_info;
+	struct irdma_hmc_obj_info *obj_info;
+	__le64 *buf;
+	struct irdma_dma_mem commit_fpm_mem;
+	enum irdma_status_code ret_code = 0;
+	u8 wait_type;
+
+	if ((dev->privileged && hmc_fn_id > dev->hw_attrs.max_hw_vf_fpm_id) ||
+	    (dev->hmc_fn_id != hmc_fn_id &&
+	     hmc_fn_id < dev->hw_attrs.first_hw_vf_fpm_id))
+		return IRDMA_ERR_INVALID_HMCFN_ID;
+
+	if (hmc_fn_id != dev->hmc_fn_id)
+		return IRDMA_ERR_INVALID_FPM_FUNC_ID;
+
+	hmc_info = dev->hmc_info;
+	if (!hmc_info)
+		return IRDMA_ERR_BAD_PTR;
+
+	obj_info = hmc_info->hmc_obj;
+	buf = dev->fpm_commit_buf;
+
+	set_64bit_val(buf, 0, (u64)obj_info[IRDMA_HMC_IW_QP].cnt);
+	set_64bit_val(buf, 8, (u64)obj_info[IRDMA_HMC_IW_CQ].cnt);
+	set_64bit_val(buf, 16, (u64)0); /* RSRVD */
+	set_64bit_val(buf, 24, (u64)obj_info[IRDMA_HMC_IW_HTE].cnt);
+	set_64bit_val(buf, 32, (u64)obj_info[IRDMA_HMC_IW_ARP].cnt);
+	set_64bit_val(buf, 40, (u64)0); /* RSVD */
+	set_64bit_val(buf, 48, (u64)obj_info[IRDMA_HMC_IW_MR].cnt);
+	set_64bit_val(buf, 56, (u64)obj_info[IRDMA_HMC_IW_XF].cnt);
+	set_64bit_val(buf, 64, (u64)obj_info[IRDMA_HMC_IW_XFFL].cnt);
+	set_64bit_val(buf, 72, (u64)obj_info[IRDMA_HMC_IW_Q1].cnt);
+	set_64bit_val(buf, 80, (u64)obj_info[IRDMA_HMC_IW_Q1FL].cnt);
+	set_64bit_val(buf, 88,
+		      (u64)obj_info[IRDMA_HMC_IW_TIMER].cnt);
+	set_64bit_val(buf, 96,
+		      (u64)obj_info[IRDMA_HMC_IW_FSIMC].cnt);
+	set_64bit_val(buf, 104,
+		      (u64)obj_info[IRDMA_HMC_IW_FSIAV].cnt);
+	set_64bit_val(buf, 112,
+		      (u64)obj_info[IRDMA_HMC_IW_PBLE].cnt);
+	set_64bit_val(buf, 120, (u64)0); /* RSVD */
+	set_64bit_val(buf, 128, (u64)obj_info[IRDMA_HMC_IW_RRF].cnt);
+	set_64bit_val(buf, 136,
+		      (u64)obj_info[IRDMA_HMC_IW_RRFFL].cnt);
+	set_64bit_val(buf, 144, (u64)obj_info[IRDMA_HMC_IW_HDR].cnt);
+	set_64bit_val(buf, 152, (u64)obj_info[IRDMA_HMC_IW_MD].cnt);
+	set_64bit_val(buf, 160,
+		      (u64)obj_info[IRDMA_HMC_IW_OOISC].cnt);
+	set_64bit_val(buf, 168,
+		      (u64)obj_info[IRDMA_HMC_IW_OOISCFFL].cnt);
+
+	commit_fpm_mem.pa = dev->fpm_commit_buf_pa;
+	commit_fpm_mem.va = dev->fpm_commit_buf;
+
+	wait_type = (u8)IRDMA_CQP_WAIT_POLL_REGS;
+	print_hex_dump_debug("HMC: COMMIT FPM BUFFER", DUMP_PREFIX_OFFSET, 16,
+			     8, commit_fpm_mem.va, IRDMA_COMMIT_FPM_BUF_SIZE,
+			     false);
+	ret_code = irdma_sc_commit_fpm_val(dev->cqp, 0, hmc_info->hmc_fn_id,
+					   &commit_fpm_mem, true, wait_type);
+	if (!ret_code)
+		ret_code = irdma_sc_parse_fpm_commit_buf(dev, dev->fpm_commit_buf,
+							 hmc_info->hmc_obj,
+							 &hmc_info->sd_table.sd_cnt);
+	print_hex_dump_debug("HMC: COMMIT FPM BUFFER", DUMP_PREFIX_OFFSET, 16,
+			     8, commit_fpm_mem.va, IRDMA_COMMIT_FPM_BUF_SIZE,
+			     false);
+
+	return ret_code;
+}
+
+/**
+ * cqp_sds_wqe_fill - fill cqp wqe doe sd
+ * @cqp: struct for cqp hw
+ * @info: sd info for wqe
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+cqp_sds_wqe_fill(struct irdma_sc_cqp *cqp, struct irdma_update_sds_info *info,
+		 u64 scratch)
+{
+	u64 data;
+	u64 hdr;
+	__le64 *wqe;
+	int mem_entries, wqe_entries;
+	struct irdma_dma_mem *sdbuf = &cqp->sdbuf;
+	u64 offset = 0;
+	u32 wqe_idx;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe_idx(cqp, scratch, &wqe_idx);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	wqe_entries = (info->cnt > 3) ? 3 : info->cnt;
+	mem_entries = info->cnt - wqe_entries;
+
+	if (mem_entries) {
+		offset = wqe_idx * IRDMA_UPDATE_SD_BUFF_SIZE;
+		memcpy(((char *)sdbuf->va + offset), &info->entry[3], mem_entries << 4);
+
+		data = (u64)sdbuf->pa + offset;
+	} else {
+		data = 0;
+	}
+	data |= FIELD_PREP(IRDMA_CQPSQ_UPESD_HMCFNID, info->hmc_fn_id);
+	set_64bit_val(wqe, 16, data);
+
+	switch (wqe_entries) {
+	case 3:
+		set_64bit_val(wqe, 48,
+			      (FIELD_PREP(IRDMA_CQPSQ_UPESD_SDCMD, info->entry[2].cmd) |
+			       FIELD_PREP(IRDMA_CQPSQ_UPESD_ENTRY_VALID, 1)));
+
+		set_64bit_val(wqe, 56, info->entry[2].data);
+		fallthrough;
+	case 2:
+		set_64bit_val(wqe, 32,
+			      (FIELD_PREP(IRDMA_CQPSQ_UPESD_SDCMD, info->entry[1].cmd) |
+			       FIELD_PREP(IRDMA_CQPSQ_UPESD_ENTRY_VALID, 1)));
+
+		set_64bit_val(wqe, 40, info->entry[1].data);
+		fallthrough;
+	case 1:
+		set_64bit_val(wqe, 0,
+			      FIELD_PREP(IRDMA_CQPSQ_UPESD_SDCMD, info->entry[0].cmd));
+
+		set_64bit_val(wqe, 8, info->entry[0].data);
+		break;
+	default:
+		break;
+	}
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_UPDATE_PE_SDS) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity) |
+	      FIELD_PREP(IRDMA_CQPSQ_UPESD_ENTRY_COUNT, mem_entries);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	if (mem_entries)
+		print_hex_dump_debug("WQE: UPDATE_PE_SDS WQE Buffer",
+				     DUMP_PREFIX_OFFSET, 16, 8,
+				     (char *)sdbuf->va + offset,
+				     mem_entries << 4, false);
+
+	print_hex_dump_debug("WQE: UPDATE_PE_SDS WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+
+	return 0;
+}
+
+/**
+ * irdma_update_pe_sds - cqp wqe for sd
+ * @dev: ptr to irdma_dev struct
+ * @info: sd info for sd's
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_update_pe_sds(struct irdma_sc_dev *dev,
+		    struct irdma_update_sds_info *info, u64 scratch)
+{
+	struct irdma_sc_cqp *cqp = dev->cqp;
+	enum irdma_status_code ret_code;
+
+	ret_code = cqp_sds_wqe_fill(cqp, info, scratch);
+	if (!ret_code)
+		irdma_sc_cqp_post_sq(cqp);
+
+	return ret_code;
+}
+
+/**
+ * irdma_update_sds_noccq - update sd before ccq created
+ * @dev: sc device struct
+ * @info: sd info for sd's
+ */
+enum irdma_status_code
+irdma_update_sds_noccq(struct irdma_sc_dev *dev,
+		       struct irdma_update_sds_info *info)
+{
+	u32 error, val, tail;
+	struct irdma_sc_cqp *cqp = dev->cqp;
+	enum irdma_status_code ret_code;
+
+	ret_code = cqp_sds_wqe_fill(cqp, info, 0);
+	if (ret_code)
+		return ret_code;
+
+	irdma_get_cqp_reg_info(cqp, &val, &tail, &error);
+
+	irdma_sc_cqp_post_sq(cqp);
+	return irdma_cqp_poll_registers(cqp, tail,
+					cqp->dev->hw_attrs.max_done_count);
+}
+
+/**
+ * irdma_sc_static_hmc_pages_allocated - cqp wqe to allocate hmc pages
+ * @cqp: struct for cqp hw
+ * @scratch: u64 saved to be used during cqp completion
+ * @hmc_fn_id: hmc function id
+ * @post_sq: flag for cqp db to ring
+ * @poll_registers: flag to poll register for cqp completion
+ */
+enum irdma_status_code
+irdma_sc_static_hmc_pages_allocated(struct irdma_sc_cqp *cqp, u64 scratch,
+				    u8 hmc_fn_id, bool post_sq,
+				    bool poll_registers)
+{
+	u64 hdr;
+	__le64 *wqe;
+	u32 tail, val, error;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_SHMC_PAGE_ALLOCATED_HMC_FN_ID, hmc_fn_id));
+
+	hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE,
+			 IRDMA_CQP_OP_SHMC_PAGES_ALLOCATED) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("WQE: SHMC_PAGES_ALLOCATED WQE",
+			     DUMP_PREFIX_OFFSET, 16, 8, wqe,
+			     IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_get_cqp_reg_info(cqp, &val, &tail, &error);
+
+	if (post_sq) {
+		irdma_sc_cqp_post_sq(cqp);
+		if (poll_registers)
+			/* check for cqp sq tail update */
+			return irdma_cqp_poll_registers(cqp, tail,
+							cqp->dev->hw_attrs.max_done_count);
+		else
+			return irdma_sc_poll_for_cqp_op_done(cqp,
+							     IRDMA_CQP_OP_SHMC_PAGES_ALLOCATED,
+							     NULL);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_cqp_ring_full - check if cqp ring is full
+ * @cqp: struct for cqp hw
+ */
+static bool irdma_cqp_ring_full(struct irdma_sc_cqp *cqp)
+{
+	return IRDMA_RING_FULL_ERR(cqp->sq_ring);
+}
+
+/**
+ * irdma_est_sd - returns approximate number of SDs for HMC
+ * @dev: sc device struct
+ * @hmc_info: hmc structure, size and count for HMC objects
+ */
+static u32 irdma_est_sd(struct irdma_sc_dev *dev,
+			struct irdma_hmc_info *hmc_info)
+{
+	int i;
+	u64 size = 0;
+	u64 sd;
+
+	for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++)
+		if (i != IRDMA_HMC_IW_PBLE)
+			size += round_up(hmc_info->hmc_obj[i].cnt *
+					 hmc_info->hmc_obj[i].size, 512);
+	if (dev->privileged)
+		size += round_up(hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt *
+			hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].size, 512);
+	if (size & 0x1FFFFF)
+		sd = (size >> 21) + 1; /* add 1 for remainder */
+	else
+		sd = size >> 21;
+	if (!dev->privileged) {
+		/* 2MB alignment for VF PBLE HMC */
+		size = hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt *
+		       hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].size;
+		if (size & 0x1FFFFF)
+			sd += (size >> 21) + 1; /* add 1 for remainder */
+		else
+			sd += size >> 21;
+	}
+	if (sd > 0xFFFFFFFF) {
+		ibdev_dbg(to_ibdev(dev), "HMC: sd overflow[%lld]\n", sd);
+		sd = 0xFFFFFFFF - 1;
+	}
+
+	return (u32)sd;
+}
+
+/**
+ * irdma_sc_query_rdma_features_done - poll cqp for query features done
+ * @cqp: struct for cqp hw
+ */
+static enum irdma_status_code
+irdma_sc_query_rdma_features_done(struct irdma_sc_cqp *cqp)
+{
+	return irdma_sc_poll_for_cqp_op_done(cqp,
+					     IRDMA_CQP_OP_QUERY_RDMA_FEATURES,
+					     NULL);
+}
+
+/**
+ * irdma_sc_query_rdma_features - query RDMA features and FW ver
+ * @cqp: struct for cqp hw
+ * @buf: buffer to hold query info
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_query_rdma_features(struct irdma_sc_cqp *cqp,
+			     struct irdma_dma_mem *buf, u64 scratch)
+{
+	__le64 *wqe;
+	u64 temp;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	temp = buf->pa;
+	set_64bit_val(wqe, 32, temp);
+
+	temp = FIELD_PREP(IRDMA_CQPSQ_QUERY_RDMA_FEATURES_WQEVALID,
+			  cqp->polarity) |
+	       FIELD_PREP(IRDMA_CQPSQ_QUERY_RDMA_FEATURES_BUF_LEN, buf->size) |
+	       FIELD_PREP(IRDMA_CQPSQ_UP_OP, IRDMA_CQP_OP_QUERY_RDMA_FEATURES);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, temp);
+
+	print_hex_dump_debug("WQE: QUERY RDMA FEATURES", DUMP_PREFIX_OFFSET,
+			     16, 8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_get_rdma_features - get RDMA features
+ * @dev: sc device struct
+ */
+enum irdma_status_code irdma_get_rdma_features(struct irdma_sc_dev *dev)
+{
+	enum irdma_status_code ret_code;
+	struct irdma_dma_mem feat_buf;
+	u64 temp;
+	u16 byte_idx, feat_type, feat_cnt, feat_idx;
+
+	feat_buf.size = ALIGN(IRDMA_FEATURE_BUF_SIZE,
+			      IRDMA_FEATURE_BUF_ALIGNMENT);
+	feat_buf.va = dma_alloc_coherent(dev->hw->device, feat_buf.size,
+					 &feat_buf.pa, GFP_KERNEL);
+	if (!feat_buf.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	ret_code = irdma_sc_query_rdma_features(dev->cqp, &feat_buf, 0);
+	if (!ret_code)
+		ret_code = irdma_sc_query_rdma_features_done(dev->cqp);
+	if (ret_code)
+		goto exit;
+
+	get_64bit_val(feat_buf.va, 0, &temp);
+	feat_cnt = (u16)FIELD_GET(IRDMA_FEATURE_CNT, temp);
+	if (feat_cnt < 2) {
+		ret_code = IRDMA_ERR_INVALID_FEAT_CNT;
+		goto exit;
+	} else if (feat_cnt > IRDMA_MAX_FEATURES) {
+		ibdev_dbg(to_ibdev(dev),
+			  "DEV: feature buf size insufficient, retrying with larger buffer\n");
+		dma_free_coherent(dev->hw->device, feat_buf.size, feat_buf.va,
+				  feat_buf.pa);
+		feat_buf.va = NULL;
+		feat_buf.size = ALIGN(8 * feat_cnt,
+				      IRDMA_FEATURE_BUF_ALIGNMENT);
+		feat_buf.va = dma_alloc_coherent(dev->hw->device,
+						 feat_buf.size, &feat_buf.pa,
+						 GFP_KERNEL);
+		if (!feat_buf.va)
+			return IRDMA_ERR_NO_MEMORY;
+
+		ret_code = irdma_sc_query_rdma_features(dev->cqp, &feat_buf, 0);
+		if (!ret_code)
+			ret_code = irdma_sc_query_rdma_features_done(dev->cqp);
+		if (ret_code)
+			goto exit;
+
+		get_64bit_val(feat_buf.va, 0, &temp);
+		feat_cnt = (u16)FIELD_GET(IRDMA_FEATURE_CNT, temp);
+		if (feat_cnt < 2) {
+			ret_code = IRDMA_ERR_INVALID_FEAT_CNT;
+			goto exit;
+		}
+	}
+
+	print_hex_dump_debug("WQE: QUERY RDMA FEATURES", DUMP_PREFIX_OFFSET,
+			     16, 8, feat_buf.va, feat_cnt * 8, false);
+
+	for (byte_idx = 0, feat_idx = 0; feat_idx < min(feat_cnt, (u16)IRDMA_MAX_FEATURES);
+	     feat_idx++, byte_idx += 8) {
+		get_64bit_val(feat_buf.va, byte_idx, &temp);
+		feat_type = FIELD_GET(IRDMA_FEATURE_TYPE, temp);
+		if (feat_type >= IRDMA_MAX_FEATURES) {
+			ibdev_dbg(to_ibdev(dev),
+				  "DEV: found unrecognized feature type %d\n",
+				  feat_type);
+			continue;
+		}
+		dev->feature_info[feat_type] = temp;
+	}
+exit:
+	dma_free_coherent(dev->hw->device, feat_buf.size, feat_buf.va,
+			  feat_buf.pa);
+	feat_buf.va = NULL;
+	return ret_code;
+}
+
+static u32 irdma_q1_cnt(struct irdma_sc_dev *dev,
+			struct irdma_hmc_info *hmc_info, u32 qpwanted)
+{
+	u32 q1_cnt;
+
+	if (dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1) {
+		q1_cnt = roundup_pow_of_two(dev->hw_attrs.max_hw_ird * 2 * qpwanted);
+	} else {
+		if (dev->cqp->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY)
+			q1_cnt = roundup_pow_of_two(dev->hw_attrs.max_hw_ird * 2 * qpwanted + 512);
+		else
+			q1_cnt = dev->hw_attrs.max_hw_ird * 2 * qpwanted;
+	}
+
+	return q1_cnt;
+}
+
+static void cfg_fpm_value_gen_1(struct irdma_sc_dev *dev,
+				struct irdma_hmc_info *hmc_info, u32 qpwanted)
+{
+	hmc_info->hmc_obj[IRDMA_HMC_IW_XF].cnt = roundup_pow_of_two(qpwanted * dev->hw_attrs.max_hw_wqes);
+}
+
+static void cfg_fpm_value_gen_2(struct irdma_sc_dev *dev,
+				struct irdma_hmc_info *hmc_info, u32 qpwanted)
+{
+	struct irdma_hmc_fpm_misc *hmc_fpm_misc = &dev->hmc_fpm_misc;
+
+	hmc_info->hmc_obj[IRDMA_HMC_IW_XF].cnt =
+		4 * hmc_fpm_misc->xf_block_size * qpwanted;
+
+	hmc_info->hmc_obj[IRDMA_HMC_IW_HDR].cnt = qpwanted;
+
+	if (hmc_info->hmc_obj[IRDMA_HMC_IW_RRF].max_cnt)
+		hmc_info->hmc_obj[IRDMA_HMC_IW_RRF].cnt = 32 * qpwanted;
+	if (hmc_info->hmc_obj[IRDMA_HMC_IW_RRFFL].max_cnt)
+		hmc_info->hmc_obj[IRDMA_HMC_IW_RRFFL].cnt =
+			hmc_info->hmc_obj[IRDMA_HMC_IW_RRF].cnt /
+			hmc_fpm_misc->rrf_block_size;
+	if (hmc_info->hmc_obj[IRDMA_HMC_IW_OOISC].max_cnt)
+		hmc_info->hmc_obj[IRDMA_HMC_IW_OOISC].cnt = 32 * qpwanted;
+	if (hmc_info->hmc_obj[IRDMA_HMC_IW_OOISCFFL].max_cnt)
+		hmc_info->hmc_obj[IRDMA_HMC_IW_OOISCFFL].cnt =
+			hmc_info->hmc_obj[IRDMA_HMC_IW_OOISC].cnt /
+			hmc_fpm_misc->ooiscf_block_size;
+}
+
+/**
+ * irdma_cfg_fpm_val - configure HMC objects
+ * @dev: sc device struct
+ * @qp_count: desired qp count
+ */
+enum irdma_status_code irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count)
+{
+	struct irdma_virt_mem virt_mem;
+	u32 i, mem_size;
+	u32 qpwanted, mrwanted, pblewanted;
+	u32 powerof2, hte;
+	u32 sd_needed;
+	u32 sd_diff;
+	u32 loop_count = 0;
+	struct irdma_hmc_info *hmc_info;
+	struct irdma_hmc_fpm_misc *hmc_fpm_misc;
+	enum irdma_status_code ret_code = 0;
+
+	hmc_info = dev->hmc_info;
+	hmc_fpm_misc = &dev->hmc_fpm_misc;
+
+	ret_code = irdma_sc_init_iw_hmc(dev, dev->hmc_fn_id);
+	if (ret_code) {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: irdma_sc_init_iw_hmc returned error_code = %d\n",
+			  ret_code);
+		return ret_code;
+	}
+
+	for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++)
+		hmc_info->hmc_obj[i].cnt = hmc_info->hmc_obj[i].max_cnt;
+	sd_needed = irdma_est_sd(dev, hmc_info);
+	ibdev_dbg(to_ibdev(dev),
+		  "HMC: FW max resources sd_needed[%08d] first_sd_index[%04d]\n",
+		  sd_needed, hmc_info->first_sd_index);
+	ibdev_dbg(to_ibdev(dev), "HMC: sd count %d where max sd is %d\n",
+		  hmc_info->sd_table.sd_cnt, hmc_fpm_misc->max_sds);
+
+	qpwanted = min(qp_count, hmc_info->hmc_obj[IRDMA_HMC_IW_QP].max_cnt);
+
+	powerof2 = 1;
+	while (powerof2 <= qpwanted)
+		powerof2 *= 2;
+	powerof2 /= 2;
+	qpwanted = powerof2;
+
+	mrwanted = hmc_info->hmc_obj[IRDMA_HMC_IW_MR].max_cnt;
+	pblewanted = hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].max_cnt;
+
+	ibdev_dbg(to_ibdev(dev),
+		  "HMC: req_qp=%d max_sd=%d, max_qp = %d, max_cq=%d, max_mr=%d, max_pble=%d, mc=%d, av=%d\n",
+		  qp_count, hmc_fpm_misc->max_sds,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_QP].max_cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].max_cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_MR].max_cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].max_cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].max_cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].max_cnt);
+	hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].cnt =
+		hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].max_cnt;
+	hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt =
+		hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].max_cnt;
+	hmc_info->hmc_obj[IRDMA_HMC_IW_ARP].cnt =
+		hmc_info->hmc_obj[IRDMA_HMC_IW_ARP].max_cnt;
+
+	hmc_info->hmc_obj[IRDMA_HMC_IW_APBVT_ENTRY].cnt = 1;
+
+	while (irdma_q1_cnt(dev, hmc_info, qpwanted) > hmc_info->hmc_obj[IRDMA_HMC_IW_Q1].max_cnt)
+		qpwanted /= 2;
+
+	do {
+		++loop_count;
+		hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt = qpwanted;
+		hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt =
+			min(2 * qpwanted, hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt);
+		hmc_info->hmc_obj[IRDMA_HMC_IW_RESERVED].cnt = 0; /* Reserved */
+		hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt = mrwanted;
+
+		hte = round_up(qpwanted + hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].cnt, 512);
+		powerof2 = 1;
+		while (powerof2 < hte)
+			powerof2 *= 2;
+		hmc_info->hmc_obj[IRDMA_HMC_IW_HTE].cnt =
+			powerof2 * hmc_fpm_misc->ht_multiplier;
+		if (dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+			cfg_fpm_value_gen_1(dev, hmc_info, qpwanted);
+		else
+			cfg_fpm_value_gen_2(dev, hmc_info, qpwanted);
+
+		hmc_info->hmc_obj[IRDMA_HMC_IW_Q1].cnt = irdma_q1_cnt(dev, hmc_info, qpwanted);
+		hmc_info->hmc_obj[IRDMA_HMC_IW_XFFL].cnt =
+			hmc_info->hmc_obj[IRDMA_HMC_IW_XF].cnt / hmc_fpm_misc->xf_block_size;
+		hmc_info->hmc_obj[IRDMA_HMC_IW_Q1FL].cnt =
+			hmc_info->hmc_obj[IRDMA_HMC_IW_Q1].cnt / hmc_fpm_misc->q1_block_size;
+		hmc_info->hmc_obj[IRDMA_HMC_IW_TIMER].cnt =
+			(round_up(qpwanted, 512) / 512 + 1) * hmc_fpm_misc->timer_bucket;
+
+		hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt = pblewanted;
+		sd_needed = irdma_est_sd(dev, hmc_info);
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: sd_needed = %d, hmc_fpm_misc->max_sds=%d, mrwanted=%d, pblewanted=%d qpwanted=%d\n",
+			  sd_needed, hmc_fpm_misc->max_sds, mrwanted,
+			  pblewanted, qpwanted);
+
+		/* Do not reduce resources further. All objects fit with max SDs */
+		if (sd_needed <= hmc_fpm_misc->max_sds)
+			break;
+
+		sd_diff = sd_needed - hmc_fpm_misc->max_sds;
+		if (sd_diff > 128) {
+			if (qpwanted > 128 && sd_diff > 144)
+				qpwanted /= 2;
+			mrwanted /= 2;
+			pblewanted /= 2;
+			continue;
+		}
+		if (dev->cqp->hmc_profile != IRDMA_HMC_PROFILE_FAVOR_VF &&
+		    pblewanted > (512 * FPM_MULTIPLIER * sd_diff)) {
+			pblewanted -= 256 * FPM_MULTIPLIER * sd_diff;
+			continue;
+		} else if (pblewanted > (100 * FPM_MULTIPLIER)) {
+			pblewanted -= 10 * FPM_MULTIPLIER;
+		} else if (pblewanted > FPM_MULTIPLIER) {
+			pblewanted -= FPM_MULTIPLIER;
+		} else if (qpwanted <= 128) {
+			if (hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].cnt > 256)
+				hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].cnt /= 2;
+			if (hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt > 256)
+				hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt /= 2;
+		}
+		if (mrwanted > FPM_MULTIPLIER)
+			mrwanted -= FPM_MULTIPLIER;
+		if (!(loop_count % 10) && qpwanted > 128) {
+			qpwanted /= 2;
+			if (hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt > 256)
+				hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt /= 2;
+		}
+	} while (loop_count < 2000);
+
+	if (sd_needed > hmc_fpm_misc->max_sds) {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: cfg_fpm failed loop_cnt=%d, sd_needed=%d, max sd count %d\n",
+			  loop_count, sd_needed, hmc_info->sd_table.sd_cnt);
+		return IRDMA_ERR_CFG;
+	}
+
+	if (loop_count > 1 && sd_needed < hmc_fpm_misc->max_sds) {
+		pblewanted += (hmc_fpm_misc->max_sds - sd_needed) * 256 *
+			      FPM_MULTIPLIER;
+		hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt = pblewanted;
+		sd_needed = irdma_est_sd(dev, hmc_info);
+	}
+
+	ibdev_dbg(to_ibdev(dev),
+		  "HMC: loop_cnt=%d, sd_needed=%d, qpcnt = %d, cqcnt=%d, mrcnt=%d, pblecnt=%d, mc=%d, ah=%d, max sd count %d, first sd index %d\n",
+		  loop_count, sd_needed,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].cnt,
+		  hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt,
+		  hmc_info->sd_table.sd_cnt, hmc_info->first_sd_index);
+
+	ret_code = irdma_sc_cfg_iw_fpm(dev, dev->hmc_fn_id);
+	if (ret_code) {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: cfg_iw_fpm returned error_code[x%08X]\n",
+			  readl(dev->hw_regs[IRDMA_CQPERRCODES]));
+		return ret_code;
+	}
+
+	mem_size = sizeof(struct irdma_hmc_sd_entry) *
+		   (hmc_info->sd_table.sd_cnt + hmc_info->first_sd_index + 1);
+	virt_mem.size = mem_size;
+	virt_mem.va = kzalloc(virt_mem.size, GFP_KERNEL);
+	if (!virt_mem.va) {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: failed to allocate memory for sd_entry buffer\n");
+		return IRDMA_ERR_NO_MEMORY;
+	}
+	hmc_info->sd_table.sd_entry = virt_mem.va;
+
+	return ret_code;
+}
+
+/**
+ * irdma_exec_cqp_cmd - execute cqp cmd when wqe are available
+ * @dev: rdma device
+ * @pcmdinfo: cqp command info
+ */
+static enum irdma_status_code irdma_exec_cqp_cmd(struct irdma_sc_dev *dev,
+						 struct cqp_cmds_info *pcmdinfo)
+{
+	enum irdma_status_code status;
+	struct irdma_dma_mem val_mem;
+	bool alloc = false;
+
+	dev->cqp_cmd_stats[pcmdinfo->cqp_cmd]++;
+	switch (pcmdinfo->cqp_cmd) {
+	case IRDMA_OP_CEQ_DESTROY:
+		status = irdma_sc_ceq_destroy(pcmdinfo->in.u.ceq_destroy.ceq,
+					      pcmdinfo->in.u.ceq_destroy.scratch,
+					      pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_AEQ_DESTROY:
+		status = irdma_sc_aeq_destroy(pcmdinfo->in.u.aeq_destroy.aeq,
+					      pcmdinfo->in.u.aeq_destroy.scratch,
+					      pcmdinfo->post_sq);
+
+		break;
+	case IRDMA_OP_CEQ_CREATE:
+		status = dev->ceq_ops->ceq_create(pcmdinfo->in.u.ceq_create.ceq,
+						  pcmdinfo->in.u.ceq_create.scratch,
+						  pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_AEQ_CREATE:
+		status = irdma_sc_aeq_create(pcmdinfo->in.u.aeq_create.aeq,
+					     pcmdinfo->in.u.aeq_create.scratch,
+					     pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_QP_UPLOAD_CONTEXT:
+		status = irdma_sc_qp_upload_context(pcmdinfo->in.u.qp_upload_context.dev,
+						    &pcmdinfo->in.u.qp_upload_context.info,
+						    pcmdinfo->in.u.qp_upload_context.scratch,
+						    pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_CQ_CREATE:
+		status = irdma_sc_cq_create(pcmdinfo->in.u.cq_create.cq,
+					    pcmdinfo->in.u.cq_create.scratch,
+					    pcmdinfo->in.u.cq_create.check_overflow,
+					    pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_CQ_MODIFY:
+		status = irdma_sc_cq_modify(pcmdinfo->in.u.cq_modify.cq,
+					    &pcmdinfo->in.u.cq_modify.info,
+					    pcmdinfo->in.u.cq_modify.scratch,
+					    pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_CQ_DESTROY:
+		status = irdma_sc_cq_destroy(pcmdinfo->in.u.cq_destroy.cq,
+					     pcmdinfo->in.u.cq_destroy.scratch,
+					     pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_QP_FLUSH_WQES:
+		status = irdma_sc_qp_flush_wqes(pcmdinfo->in.u.qp_flush_wqes.qp,
+						&pcmdinfo->in.u.qp_flush_wqes.info,
+						pcmdinfo->in.u.qp_flush_wqes.scratch,
+						pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_GEN_AE:
+		status = irdma_sc_gen_ae(pcmdinfo->in.u.gen_ae.qp,
+					 &pcmdinfo->in.u.gen_ae.info,
+					 pcmdinfo->in.u.gen_ae.scratch,
+					 pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_MANAGE_PUSH_PAGE:
+		status = irdma_sc_manage_push_page(pcmdinfo->in.u.manage_push_page.cqp,
+						   &pcmdinfo->in.u.manage_push_page.info,
+						   pcmdinfo->in.u.manage_push_page.scratch,
+						   pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_UPDATE_PE_SDS:
+		status = irdma_update_pe_sds(pcmdinfo->in.u.update_pe_sds.dev,
+					     &pcmdinfo->in.u.update_pe_sds.info,
+					     pcmdinfo->in.u.update_pe_sds.scratch);
+		break;
+	case IRDMA_OP_MANAGE_HMC_PM_FUNC_TABLE:
+		/* switch to calling through the call table */
+		status =
+			irdma_sc_manage_hmc_pm_func_table(pcmdinfo->in.u.manage_hmc_pm.dev->cqp,
+							  &pcmdinfo->in.u.manage_hmc_pm.info,
+							  pcmdinfo->in.u.manage_hmc_pm.scratch,
+							  true);
+		break;
+	case IRDMA_OP_SUSPEND:
+		status = irdma_sc_suspend_qp(pcmdinfo->in.u.suspend_resume.cqp,
+					     pcmdinfo->in.u.suspend_resume.qp,
+					     pcmdinfo->in.u.suspend_resume.scratch);
+		break;
+	case IRDMA_OP_RESUME:
+		status = irdma_sc_resume_qp(pcmdinfo->in.u.suspend_resume.cqp,
+					    pcmdinfo->in.u.suspend_resume.qp,
+					    pcmdinfo->in.u.suspend_resume.scratch);
+		break;
+	case IRDMA_OP_QUERY_FPM_VAL:
+		val_mem.pa = pcmdinfo->in.u.query_fpm_val.fpm_val_pa;
+		val_mem.va = pcmdinfo->in.u.query_fpm_val.fpm_val_va;
+		status = irdma_sc_query_fpm_val(pcmdinfo->in.u.query_fpm_val.cqp,
+						pcmdinfo->in.u.query_fpm_val.scratch,
+						pcmdinfo->in.u.query_fpm_val.hmc_fn_id,
+						&val_mem, true, IRDMA_CQP_WAIT_EVENT);
+		break;
+	case IRDMA_OP_COMMIT_FPM_VAL:
+		val_mem.pa = pcmdinfo->in.u.commit_fpm_val.fpm_val_pa;
+		val_mem.va = pcmdinfo->in.u.commit_fpm_val.fpm_val_va;
+		status = irdma_sc_commit_fpm_val(pcmdinfo->in.u.commit_fpm_val.cqp,
+						 pcmdinfo->in.u.commit_fpm_val.scratch,
+						 pcmdinfo->in.u.commit_fpm_val.hmc_fn_id,
+						 &val_mem,
+						 true,
+						 IRDMA_CQP_WAIT_EVENT);
+		break;
+	case IRDMA_OP_STATS_ALLOCATE:
+		alloc = true;
+		fallthrough;
+	case IRDMA_OP_STATS_FREE:
+		status = irdma_sc_manage_stats_inst(pcmdinfo->in.u.stats_manage.cqp,
+						    &pcmdinfo->in.u.stats_manage.info,
+						    alloc,
+						    pcmdinfo->in.u.stats_manage.scratch);
+		break;
+	case IRDMA_OP_STATS_GATHER:
+		status = irdma_sc_gather_stats(pcmdinfo->in.u.stats_gather.cqp,
+					       &pcmdinfo->in.u.stats_gather.info,
+					       pcmdinfo->in.u.stats_gather.scratch);
+		break;
+	case IRDMA_OP_WS_MODIFY_NODE:
+		status = irdma_sc_manage_ws_node(pcmdinfo->in.u.ws_node.cqp,
+						 &pcmdinfo->in.u.ws_node.info,
+						 IRDMA_MODIFY_NODE,
+						 pcmdinfo->in.u.ws_node.scratch);
+		break;
+	case IRDMA_OP_WS_DELETE_NODE:
+		status = irdma_sc_manage_ws_node(pcmdinfo->in.u.ws_node.cqp,
+						 &pcmdinfo->in.u.ws_node.info,
+						 IRDMA_DEL_NODE,
+						 pcmdinfo->in.u.ws_node.scratch);
+		break;
+	case IRDMA_OP_WS_ADD_NODE:
+		status = irdma_sc_manage_ws_node(pcmdinfo->in.u.ws_node.cqp,
+						 &pcmdinfo->in.u.ws_node.info,
+						 IRDMA_ADD_NODE,
+						 pcmdinfo->in.u.ws_node.scratch);
+		break;
+	case IRDMA_OP_SET_UP_MAP:
+		status = irdma_sc_set_up_map(pcmdinfo->in.u.up_map.cqp,
+					     &pcmdinfo->in.u.up_map.info,
+					     pcmdinfo->in.u.up_map.scratch);
+		break;
+	case IRDMA_OP_QUERY_RDMA_FEATURES:
+		status = irdma_sc_query_rdma_features(pcmdinfo->in.u.query_rdma.cqp,
+						      &pcmdinfo->in.u.query_rdma.query_buff_mem,
+						      pcmdinfo->in.u.query_rdma.scratch);
+		break;
+	case IRDMA_OP_DELETE_ARP_CACHE_ENTRY:
+		status = irdma_sc_del_arp_cache_entry(pcmdinfo->in.u.del_arp_cache_entry.cqp,
+						      pcmdinfo->in.u.del_arp_cache_entry.scratch,
+						      pcmdinfo->in.u.del_arp_cache_entry.arp_index,
+						      pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_MANAGE_APBVT_ENTRY:
+		status = irdma_sc_manage_apbvt_entry(pcmdinfo->in.u.manage_apbvt_entry.cqp,
+						     &pcmdinfo->in.u.manage_apbvt_entry.info,
+						     pcmdinfo->in.u.manage_apbvt_entry.scratch,
+						     pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_MANAGE_QHASH_TABLE_ENTRY:
+		status = irdma_sc_manage_qhash_table_entry(pcmdinfo->in.u.manage_qhash_table_entry.cqp,
+							   &pcmdinfo->in.u.manage_qhash_table_entry.info,
+							   pcmdinfo->in.u.manage_qhash_table_entry.scratch,
+							   pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_QP_MODIFY:
+		status = irdma_sc_qp_modify(pcmdinfo->in.u.qp_modify.qp,
+					    &pcmdinfo->in.u.qp_modify.info,
+					    pcmdinfo->in.u.qp_modify.scratch,
+					    pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_QP_CREATE:
+		status = irdma_sc_qp_create(pcmdinfo->in.u.qp_create.qp,
+					    &pcmdinfo->in.u.qp_create.info,
+					    pcmdinfo->in.u.qp_create.scratch,
+					    pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_QP_DESTROY:
+		status = irdma_sc_qp_destroy(pcmdinfo->in.u.qp_destroy.qp,
+					     pcmdinfo->in.u.qp_destroy.scratch,
+					     pcmdinfo->in.u.qp_destroy.remove_hash_idx,
+					     pcmdinfo->in.u.qp_destroy.ignore_mw_bnd,
+					     pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_ALLOC_STAG:
+		status = irdma_sc_alloc_stag(pcmdinfo->in.u.alloc_stag.dev,
+					     &pcmdinfo->in.u.alloc_stag.info,
+					     pcmdinfo->in.u.alloc_stag.scratch,
+					     pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_MR_REG_NON_SHARED:
+		status = irdma_sc_mr_reg_non_shared(pcmdinfo->in.u.mr_reg_non_shared.dev,
+						    &pcmdinfo->in.u.mr_reg_non_shared.info,
+						    pcmdinfo->in.u.mr_reg_non_shared.scratch,
+						    pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_DEALLOC_STAG:
+		status =
+			irdma_sc_dealloc_stag(pcmdinfo->in.u.dealloc_stag.dev,
+					      &pcmdinfo->in.u.dealloc_stag.info,
+					      pcmdinfo->in.u.dealloc_stag.scratch,
+					      pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_MW_ALLOC:
+		status = irdma_sc_mw_alloc(pcmdinfo->in.u.mw_alloc.dev,
+					   &pcmdinfo->in.u.mw_alloc.info,
+					   pcmdinfo->in.u.mw_alloc.scratch,
+					   pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_ADD_ARP_CACHE_ENTRY:
+		status = irdma_sc_add_arp_cache_entry(pcmdinfo->in.u.add_arp_cache_entry.cqp,
+						      &pcmdinfo->in.u.add_arp_cache_entry.info,
+						      pcmdinfo->in.u.add_arp_cache_entry.scratch,
+						      pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY:
+		status = dev->cqp_misc_ops->alloc_local_mac_entry(pcmdinfo->in.u.alloc_local_mac_entry.cqp,
+								  pcmdinfo->in.u.alloc_local_mac_entry.scratch,
+								  pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_ADD_LOCAL_MAC_ENTRY:
+		status = dev->cqp_misc_ops->add_local_mac_entry(pcmdinfo->in.u.add_local_mac_entry.cqp,
+								&pcmdinfo->in.u.add_local_mac_entry.info,
+								pcmdinfo->in.u.add_local_mac_entry.scratch,
+								pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_DELETE_LOCAL_MAC_ENTRY:
+		status = dev->cqp_misc_ops->del_local_mac_entry(pcmdinfo->in.u.del_local_mac_entry.cqp,
+								pcmdinfo->in.u.del_local_mac_entry.scratch,
+								pcmdinfo->in.u.del_local_mac_entry.entry_idx,
+								pcmdinfo->in.u.del_local_mac_entry.ignore_ref_count,
+								pcmdinfo->post_sq);
+		break;
+	case IRDMA_OP_AH_CREATE:
+		status = dev->iw_uda_ops->create_ah(pcmdinfo->in.u.ah_create.cqp,
+						    &pcmdinfo->in.u.ah_create.info,
+						    pcmdinfo->in.u.ah_create.scratch);
+		break;
+	case IRDMA_OP_AH_DESTROY:
+		status = dev->iw_uda_ops->destroy_ah(pcmdinfo->in.u.ah_destroy.cqp,
+						     &pcmdinfo->in.u.ah_destroy.info,
+						     pcmdinfo->in.u.ah_destroy.scratch);
+		break;
+	case IRDMA_OP_MC_CREATE:
+		status = dev->iw_uda_ops->mcast_grp_create(pcmdinfo->in.u.mc_create.cqp,
+							   &pcmdinfo->in.u.mc_create.info,
+							   pcmdinfo->in.u.mc_create.scratch);
+		break;
+	case IRDMA_OP_MC_DESTROY:
+		status = dev->iw_uda_ops->mcast_grp_destroy(pcmdinfo->in.u.mc_destroy.cqp,
+							    &pcmdinfo->in.u.mc_destroy.info,
+							    pcmdinfo->in.u.mc_destroy.scratch);
+		break;
+	case IRDMA_OP_MC_MODIFY:
+		status = dev->iw_uda_ops->mcast_grp_modify(pcmdinfo->in.u.mc_modify.cqp,
+							   &pcmdinfo->in.u.mc_modify.info,
+							   pcmdinfo->in.u.mc_modify.scratch);
+		break;
+	default:
+		status = IRDMA_NOT_SUPPORTED;
+		break;
+	}
+
+	return status;
+}
+
+/**
+ * irdma_process_cqp_cmd - process all cqp commands
+ * @dev: sc device struct
+ * @pcmdinfo: cqp command info
+ */
+enum irdma_status_code irdma_process_cqp_cmd(struct irdma_sc_dev *dev,
+					     struct cqp_cmds_info *pcmdinfo)
+{
+	enum irdma_status_code status = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->cqp_lock, flags);
+	if (list_empty(&dev->cqp_cmd_head) && !irdma_cqp_ring_full(dev->cqp))
+		status = irdma_exec_cqp_cmd(dev, pcmdinfo);
+	else
+		list_add_tail(&pcmdinfo->cqp_cmd_entry, &dev->cqp_cmd_head);
+	spin_unlock_irqrestore(&dev->cqp_lock, flags);
+	return status;
+}
+
+/**
+ * irdma_process_bh - called from tasklet for cqp list
+ * @dev: sc device struct
+ */
+enum irdma_status_code irdma_process_bh(struct irdma_sc_dev *dev)
+{
+	enum irdma_status_code status = 0;
+	struct cqp_cmds_info *pcmdinfo;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->cqp_lock, flags);
+	while (!list_empty(&dev->cqp_cmd_head) &&
+	       !irdma_cqp_ring_full(dev->cqp)) {
+		pcmdinfo = (struct cqp_cmds_info *)irdma_remove_cqp_head(dev);
+		status = irdma_exec_cqp_cmd(dev, pcmdinfo);
+		if (status)
+			break;
+	}
+	spin_unlock_irqrestore(&dev->cqp_lock, flags);
+	return status;
+}
+
+/**
+ * irdma_ena_irq - Enable interrupt
+ * @dev: pointer to the device structure
+ * @idx: vector index
+ */
+static void irdma_ena_irq(struct irdma_sc_dev *dev, u32 idx)
+{
+	u32 val;
+	u32 interval = 0;
+
+	if (dev->ceq_itr && dev->aeq->msix_idx != idx)
+		interval = dev->ceq_itr >> 1; /* 2 usec units */
+	val = FIELD_PREP(IRDMA_GLINT_DYN_CTL_ITR_INDX, 0) |
+	      FIELD_PREP(IRDMA_GLINT_DYN_CTL_INTERVAL, interval) |
+	      FIELD_PREP(IRDMA_GLINT_DYN_CTL_INTENA, 1) |
+	      FIELD_PREP(IRDMA_GLINT_DYN_CTL_CLEARPBA, 1);
+
+	if (dev->hw_attrs.uk_attrs.hw_rev != IRDMA_GEN_1)
+		writel(val, dev->hw_regs[IRDMA_GLINT_DYN_CTL] + idx);
+	else
+		writel(val, dev->hw_regs[IRDMA_GLINT_DYN_CTL] + (idx - 1));
+}
+
+/**
+ * irdma_disable_irq - Disable interrupt
+ * @dev: pointer to the device structure
+ * @idx: vector index
+ */
+static void irdma_disable_irq(struct irdma_sc_dev *dev, u32 idx)
+{
+	if (dev->hw_attrs.uk_attrs.hw_rev != IRDMA_GEN_1)
+		writel(0, dev->hw_regs[IRDMA_GLINT_DYN_CTL] + idx);
+	else
+		writel(0, dev->hw_regs[IRDMA_GLINT_DYN_CTL] + (idx - 1));
+}
+
+/**
+ * irdma_set_irq_rate_limit- Configure interrupt rate limit
+ * @dev: pointer to the device structure
+ * @idx: vector index
+ * @interval: Time interval in 4 usec units. Zero for no limit.
+ */
+static void irdma_set_irq_rate_limit(struct irdma_sc_dev *dev, u32 idx, u32 interval)
+{
+	u32 reg_val = 0;
+
+	if (interval) {
+#define IRDMA_MAX_SUPPORTED_INT_RATE_INTERVAL 59 /* 59 * 4 = 236 us */
+		if (interval > IRDMA_MAX_SUPPORTED_INT_RATE_INTERVAL)
+			interval = IRDMA_MAX_SUPPORTED_INT_RATE_INTERVAL;
+		reg_val = interval & IRDMA_GLINT_RATE_INTERVAL;
+		reg_val |= FIELD_PREP(IRDMA_GLINT_RATE_INTRL_ENA, 1);
+	}
+	writel(reg_val, dev->hw_regs[IRDMA_GLINT_RATE] + idx);
+}
+
+/**
+ * irdma_cfg_ceq- Configure CEQ interrupt
+ * @dev: pointer to the device structure
+ * @ceq_id: Completion Event Queue ID
+ * @idx: vector index
+ * @enable: True to enable, False disables
+ */
+static void irdma_cfg_ceq(struct irdma_sc_dev *dev, u32 ceq_id, u32 idx,
+			  bool enable)
+{
+	u32 reg_val;
+
+	reg_val = FIELD_PREP(IRDMA_GLINT_CEQCTL_CAUSE_ENA, enable) |
+		  FIELD_PREP(IRDMA_GLINT_CEQCTL_MSIX_INDX, idx) |
+		  FIELD_PREP(IRDMA_GLINT_CEQCTL_ITR_INDX, 3);
+
+	writel(reg_val, dev->hw_regs[IRDMA_GLINT_CEQCTL] + ceq_id);
+}
+
+/**
+ * irdma_cfg_aeq- Configure AEQ interrupt
+ * @dev: pointer to the device structure
+ * @idx: vector index
+ * @enable: True to enable, False disables
+ */
+static void irdma_cfg_aeq(struct irdma_sc_dev *dev, u32 idx, bool enable)
+{
+	u32 reg_val;
+
+	reg_val = FIELD_PREP(IRDMA_PFINT_AEQCTL_CAUSE_ENA, enable) |
+		  FIELD_PREP(IRDMA_PFINT_AEQCTL_MSIX_INDX, idx) |
+		  FIELD_PREP(IRDMA_PFINT_AEQCTL_ITR_INDX, 3);
+
+	writel(reg_val, dev->hw_regs[IRDMA_PFINT_AEQCTL]);
+}
+
+/* iwarp pd ops */
+static const struct irdma_pd_ops iw_pd_ops = {
+	.pd_init = irdma_sc_pd_init
+};
+
+static const struct irdma_priv_qp_ops iw_priv_qp_ops = {
+	.iw_mr_fast_register = irdma_sc_mr_fast_register,
+	.qp_create = irdma_sc_qp_create,
+	.qp_destroy = irdma_sc_qp_destroy,
+	.qp_flush_wqes = irdma_sc_qp_flush_wqes,
+	.qp_init = irdma_sc_qp_init,
+	.qp_modify = irdma_sc_qp_modify,
+	.qp_send_lsmm = irdma_sc_send_lsmm,
+	.qp_send_lsmm_nostag = irdma_sc_send_lsmm_nostag,
+	.qp_send_rtt = irdma_sc_send_rtt,
+	.qp_setctx = irdma_sc_qp_setctx,
+	.qp_setctx_roce = irdma_sc_qp_setctx_roce,
+	.qp_upload_context = irdma_sc_qp_upload_context,
+	.update_resume_qp = irdma_sc_resume_qp,
+	.update_suspend_qp = irdma_sc_suspend_qp,
+};
+
+static const struct irdma_mr_ops iw_mr_ops = {
+	.alloc_stag = irdma_sc_alloc_stag,
+	.dealloc_stag = irdma_sc_dealloc_stag,
+	.mr_reg_non_shared = irdma_sc_mr_reg_non_shared,
+	.mr_reg_shared = irdma_sc_mr_reg_shared,
+	.mw_alloc = irdma_sc_mw_alloc,
+	.query_stag = irdma_sc_query_stag,
+};
+
+static const struct irdma_cqp_misc_ops iw_cqp_misc_ops = {
+	.add_arp_cache_entry = irdma_sc_add_arp_cache_entry,
+	.add_local_mac_entry = irdma_sc_add_local_mac_entry,
+	.alloc_local_mac_entry = irdma_sc_alloc_local_mac_entry,
+	.cqp_nop = irdma_sc_cqp_nop,
+	.del_arp_cache_entry = irdma_sc_del_arp_cache_entry,
+	.del_local_mac_entry = irdma_sc_del_local_mac_entry,
+	.gather_stats = irdma_sc_gather_stats,
+	.manage_apbvt_entry = irdma_sc_manage_apbvt_entry,
+	.manage_push_page = irdma_sc_manage_push_page,
+	.manage_qhash_table_entry = irdma_sc_manage_qhash_table_entry,
+	.manage_stats_instance = irdma_sc_manage_stats_inst,
+	.manage_ws_node = irdma_sc_manage_ws_node,
+	.query_arp_cache_entry = irdma_sc_query_arp_cache_entry,
+	.query_rdma_features = irdma_sc_query_rdma_features,
+	.set_up_map = irdma_sc_set_up_map,
+};
+
+static const struct irdma_irq_ops iw_irq_ops = {
+	.irdma_cfg_aeq = irdma_cfg_aeq,
+	.irdma_cfg_ceq = irdma_cfg_ceq,
+	.irdma_dis_irq = irdma_disable_irq,
+	.irdma_en_irq = irdma_ena_irq,
+	.irdma_set_intrl = irdma_set_irq_rate_limit,
+};
+
+static const struct irdma_cqp_ops iw_cqp_ops = {
+	.check_cqp_progress = irdma_check_cqp_progress,
+	.cqp_create = irdma_sc_cqp_create,
+	.cqp_destroy = irdma_sc_cqp_destroy,
+	.cqp_get_next_send_wqe = irdma_sc_cqp_get_next_send_wqe,
+	.cqp_init = irdma_sc_cqp_init,
+	.cqp_post_sq = irdma_sc_cqp_post_sq,
+	.poll_for_cqp_op_done = irdma_sc_poll_for_cqp_op_done,
+};
+
+static const struct irdma_priv_cq_ops iw_priv_cq_ops = {
+	.cq_ack = irdma_sc_cq_ack,
+	.cq_create = irdma_sc_cq_create,
+	.cq_destroy = irdma_sc_cq_destroy,
+	.cq_init = irdma_sc_cq_init,
+	.cq_modify = irdma_sc_cq_modify,
+	.cq_resize = irdma_sc_cq_resize,
+};
+
+static const struct irdma_ccq_ops iw_ccq_ops = {
+	.ccq_arm = irdma_sc_ccq_arm,
+	.ccq_create = irdma_sc_ccq_create,
+	.ccq_create_done = irdma_sc_ccq_create_done,
+	.ccq_destroy = irdma_sc_ccq_destroy,
+	.ccq_get_cqe_info = irdma_sc_ccq_get_cqe_info,
+	.ccq_init = irdma_sc_ccq_init,
+};
+
+static const struct irdma_ceq_ops iw_ceq_ops = {
+	.cceq_create = irdma_sc_cceq_create,
+	.cceq_create_done = irdma_sc_cceq_create_done,
+	.cceq_destroy_done = irdma_sc_cceq_destroy_done,
+	.ceq_create = irdma_sc_ceq_create,
+	.ceq_destroy = irdma_sc_ceq_destroy,
+	.ceq_init = irdma_sc_ceq_init,
+	.process_ceq = irdma_sc_process_ceq,
+	.cleanup_ceqes = irdma_sc_cleanup_ceqes,
+};
+
+static const struct irdma_aeq_ops iw_aeq_ops = {
+	.aeq_create = irdma_sc_aeq_create,
+	.aeq_create_done = irdma_sc_aeq_create_done,
+	.aeq_destroy = irdma_sc_aeq_destroy,
+	.aeq_destroy_done = irdma_sc_aeq_destroy_done,
+	.aeq_init = irdma_sc_aeq_init,
+	.get_next_aeqe = irdma_sc_get_next_aeqe,
+	.repost_aeq_entries = irdma_sc_repost_aeq_entries,
+};
+
+static const struct irdma_hmc_ops iw_hmc_ops = {
+	.cfg_iw_fpm = irdma_sc_cfg_iw_fpm,
+	.commit_fpm_val = irdma_sc_commit_fpm_val,
+	.commit_fpm_val_done = irdma_sc_commit_fpm_val_done,
+	.create_hmc_object = irdma_sc_create_hmc_obj,
+	.del_hmc_object = irdma_sc_del_hmc_obj,
+	.init_iw_hmc = irdma_sc_init_iw_hmc,
+	.manage_hmc_pm_func_table = irdma_sc_manage_hmc_pm_func_table,
+	.manage_hmc_pm_func_table_done = irdma_sc_manage_hmc_pm_func_table_done,
+	.parse_fpm_commit_buf = irdma_sc_parse_fpm_commit_buf,
+	.parse_fpm_query_buf = irdma_sc_parse_fpm_query_buf,
+	.pf_init_vfhmc = NULL,
+	.query_fpm_val = irdma_sc_query_fpm_val,
+	.query_fpm_val_done = irdma_sc_query_fpm_val_done,
+	.static_hmc_pages_allocated = irdma_sc_static_hmc_pages_allocated,
+	.vf_cfg_vffpm = NULL,
+};
+
+/**
+ * sc_vsi_update_stats - Update statistics
+ * @vsi: sc_vsi instance to update
+ */
+static void sc_vsi_update_stats(struct irdma_sc_vsi *vsi)
+{
+	struct irdma_gather_stats *gather_stats;
+	struct irdma_gather_stats *last_gather_stats;
+
+	gather_stats = vsi->pestat->gather_info.gather_stats_va;
+	last_gather_stats = vsi->pestat->gather_info.last_gather_stats_va;
+	irdma_update_stats(&vsi->pestat->hw_stats, gather_stats,
+			   last_gather_stats);
+}
+
+static const struct irdma_vsi_ops iw_vsi_ops = {
+	.vsi_update_stats = sc_vsi_update_stats,
+};
+
+/**
+ * irdma_wait_pe_ready - Check if firmware is ready
+ * @dev: provides access to registers
+ */
+static int irdma_wait_pe_ready(struct irdma_sc_dev *dev)
+{
+	u32 statuscpu0;
+	u32 statuscpu1;
+	u32 statuscpu2;
+	u32 retrycount = 0;
+
+	do {
+		statuscpu0 = readl(dev->hw_regs[IRDMA_GLPE_CPUSTATUS0]);
+		statuscpu1 = readl(dev->hw_regs[IRDMA_GLPE_CPUSTATUS1]);
+		statuscpu2 = readl(dev->hw_regs[IRDMA_GLPE_CPUSTATUS2]);
+		if (statuscpu0 == 0x80 && statuscpu1 == 0x80 &&
+		    statuscpu2 == 0x80)
+			return 0;
+		mdelay(1000);
+	} while (retrycount++ < dev->hw_attrs.max_pe_ready_count);
+	return -1;
+}
+
+/**
+ * irdma_sc_ctrl_init - Initialize control part of device
+ * @ver: version
+ * @dev: Device pointer
+ * @info: Device init info
+ */
+enum irdma_status_code irdma_sc_ctrl_init(enum irdma_vers ver,
+					  struct irdma_sc_dev *dev,
+					  struct irdma_device_init_info *info)
+{
+	u32 val;
+	enum irdma_status_code ret_code = 0;
+	u8 db_size;
+
+	INIT_LIST_HEAD(&dev->cqp_cmd_head); /* for CQP command backlog */
+	dev->hmc_fn_id = info->hmc_fn_id;
+	dev->privileged = info->privileged;
+	dev->is_pf = info->is_pf;
+	dev->fpm_query_buf_pa = info->fpm_query_buf_pa;
+	dev->fpm_query_buf = info->fpm_query_buf;
+	dev->fpm_commit_buf_pa = info->fpm_commit_buf_pa;
+	dev->fpm_commit_buf = info->fpm_commit_buf;
+	dev->hw = info->hw;
+	dev->hw->hw_addr = info->bar0;
+	dev->irq_ops = &iw_irq_ops;
+	dev->cqp_ops = &iw_cqp_ops;
+	dev->ccq_ops = &iw_ccq_ops;
+	dev->ceq_ops = &iw_ceq_ops;
+	dev->aeq_ops = &iw_aeq_ops;
+	dev->hmc_ops = &iw_hmc_ops;
+	dev->iw_vsi_ops = &iw_vsi_ops;
+	dev->iw_priv_cq_ops = &iw_priv_cq_ops;
+
+	/* Setup the hardware limits, hmc may limit further */
+	dev->hw_attrs.min_hw_qp_id = IRDMA_MIN_IW_QP_ID;
+	dev->hw_attrs.min_hw_aeq_size = IRDMA_MIN_AEQ_ENTRIES;
+	dev->hw_attrs.max_hw_aeq_size = IRDMA_MAX_AEQ_ENTRIES;
+	dev->hw_attrs.min_hw_ceq_size = IRDMA_MIN_CEQ_ENTRIES;
+	dev->hw_attrs.max_hw_ceq_size = IRDMA_MAX_CEQ_ENTRIES;
+	dev->hw_attrs.uk_attrs.min_hw_cq_size = IRDMA_MIN_CQ_SIZE;
+	dev->hw_attrs.uk_attrs.max_hw_cq_size = IRDMA_MAX_CQ_SIZE;
+	dev->hw_attrs.uk_attrs.max_hw_wq_frags = IRDMA_MAX_WQ_FRAGMENT_COUNT;
+	dev->hw_attrs.uk_attrs.max_hw_read_sges = IRDMA_MAX_SGE_RD;
+	dev->hw_attrs.max_hw_outbound_msg_size = IRDMA_MAX_OUTBOUND_MSG_SIZE;
+	dev->hw_attrs.max_mr_size = IRDMA_MAX_MR_SIZE;
+	dev->hw_attrs.max_hw_inbound_msg_size = IRDMA_MAX_INBOUND_MSG_SIZE;
+	dev->hw_attrs.max_hw_device_pages = IRDMA_MAX_PUSH_PAGE_COUNT;
+	dev->hw_attrs.uk_attrs.max_hw_inline = IRDMA_MAX_INLINE_DATA_SIZE;
+	dev->hw_attrs.max_hw_wqes = IRDMA_MAX_WQ_ENTRIES;
+	dev->hw_attrs.max_qp_wr = IRDMA_MAX_QP_WRS(IRDMA_MAX_QUANTA_PER_WR);
+
+	dev->hw_attrs.uk_attrs.max_hw_rq_quanta = IRDMA_QP_SW_MAX_RQ_QUANTA;
+	dev->hw_attrs.uk_attrs.max_hw_wq_quanta = IRDMA_QP_SW_MAX_WQ_QUANTA;
+	dev->hw_attrs.max_hw_pds = IRDMA_MAX_PDS;
+	dev->hw_attrs.max_hw_ena_vf_count = IRDMA_MAX_PE_ENA_VF_COUNT;
+
+	dev->hw_attrs.max_pe_ready_count = 14;
+	dev->hw_attrs.max_done_count = IRDMA_DONE_COUNT;
+	dev->hw_attrs.max_sleep_count = IRDMA_SLEEP_COUNT;
+	dev->hw_attrs.max_cqp_compl_wait_time_ms = CQP_COMPL_WAIT_TIME_MS;
+
+	dev->hw_attrs.uk_attrs.hw_rev = ver;
+	info->init_hw(dev);
+	if (dev->privileged) {
+		if (irdma_wait_pe_ready(dev))
+			return IRDMA_ERR_TIMEOUT;
+
+		val = readl(dev->hw_regs[IRDMA_GLPCI_LBARCTRL]);
+		db_size = (u8)FIELD_GET(IRDMA_GLPCI_LBARCTRL_PE_DB_SIZE, val);
+		if (db_size != IRDMA_PE_DB_SIZE_4M &&
+		    db_size != IRDMA_PE_DB_SIZE_8M) {
+			ibdev_dbg(to_ibdev(dev),
+				  "DEV: RDMA PE doorbell is not enabled in CSR val 0x%x db_size=%d\n",
+				  val, db_size);
+			return IRDMA_ERR_PE_DOORBELL_NOT_ENA;
+		}
+	}
+	dev->db_addr = dev->hw->hw_addr + (uintptr_t)dev->hw_regs[IRDMA_DB_ADDR_OFFSET];
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_rt_init - Runtime initialize device
+ * @dev: IWARP device pointer
+ */
+void irdma_sc_rt_init(struct irdma_sc_dev *dev)
+{
+	mutex_init(&dev->ws_mutex);
+	irdma_device_init_uk(&dev->dev_uk);
+	dev->cqp_misc_ops = &iw_cqp_misc_ops;
+	dev->iw_pd_ops = &iw_pd_ops;
+	dev->iw_priv_qp_ops = &iw_priv_qp_ops;
+	dev->mr_ops = &iw_mr_ops;
+	dev->iw_uda_ops = &irdma_uda_ops;
+}
+
+/**
+ * irdma_update_stats - Update statistics
+ * @hw_stats: hw_stats instance to update
+ * @gather_stats: updated stat counters
+ * @last_gather_stats: last stat counters
+ */
+void irdma_update_stats(struct irdma_dev_hw_stats *hw_stats,
+			struct irdma_gather_stats *gather_stats,
+			struct irdma_gather_stats *last_gather_stats)
+{
+	u64 *stats_val = hw_stats->stats_val_32;
+
+	stats_val[IRDMA_HW_STAT_INDEX_RXVLANERR] +=
+		IRDMA_STATS_DELTA(gather_stats->rxvlanerr,
+				  last_gather_stats->rxvlanerr,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4RXDISCARD] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4rxdiscard,
+				  last_gather_stats->ip4rxdiscard,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4RXTRUNC] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4rxtrunc,
+				  last_gather_stats->ip4rxtrunc,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4TXNOROUTE] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4txnoroute,
+				  last_gather_stats->ip4txnoroute,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6RXDISCARD] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6rxdiscard,
+				  last_gather_stats->ip6rxdiscard,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6RXTRUNC] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6rxtrunc,
+				  last_gather_stats->ip6rxtrunc,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6TXNOROUTE] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6txnoroute,
+				  last_gather_stats->ip6txnoroute,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_TCPRTXSEG] +=
+		IRDMA_STATS_DELTA(gather_stats->tcprtxseg,
+				  last_gather_stats->tcprtxseg,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_TCPRXOPTERR] +=
+		IRDMA_STATS_DELTA(gather_stats->tcprxopterr,
+				  last_gather_stats->tcprxopterr,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_TCPRXPROTOERR] +=
+		IRDMA_STATS_DELTA(gather_stats->tcprxprotoerr,
+				  last_gather_stats->tcprxprotoerr,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_RXRPCNPHANDLED] +=
+		IRDMA_STATS_DELTA(gather_stats->rxrpcnphandled,
+				  last_gather_stats->rxrpcnphandled,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_RXRPCNPIGNORED] +=
+		IRDMA_STATS_DELTA(gather_stats->rxrpcnpignored,
+				  last_gather_stats->rxrpcnpignored,
+				  IRDMA_MAX_STATS_32);
+	stats_val[IRDMA_HW_STAT_INDEX_TXNPCNPSENT] +=
+		IRDMA_STATS_DELTA(gather_stats->txnpcnpsent,
+				  last_gather_stats->txnpcnpsent,
+				  IRDMA_MAX_STATS_32);
+	stats_val = hw_stats->stats_val_64;
+	stats_val[IRDMA_HW_STAT_INDEX_IP4RXOCTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4rxocts,
+				  last_gather_stats->ip4rxocts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4RXPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4rxpkts,
+				  last_gather_stats->ip4rxpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4RXFRAGS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4txfrag,
+				  last_gather_stats->ip4txfrag,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4RXMCPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4rxmcpkts,
+				  last_gather_stats->ip4rxmcpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4TXOCTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4txocts,
+				  last_gather_stats->ip4txocts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4TXPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4txpkts,
+				  last_gather_stats->ip4txpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4TXFRAGS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4txfrag,
+				  last_gather_stats->ip4txfrag,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP4TXMCPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip4txmcpkts,
+				  last_gather_stats->ip4txmcpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6RXOCTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6rxocts,
+				  last_gather_stats->ip6rxocts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6RXPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6rxpkts,
+				  last_gather_stats->ip6rxpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6RXFRAGS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6txfrags,
+				  last_gather_stats->ip6txfrags,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6RXMCPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6rxmcpkts,
+				  last_gather_stats->ip6rxmcpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6TXOCTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6txocts,
+				  last_gather_stats->ip6txocts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6TXPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6txpkts,
+				  last_gather_stats->ip6txpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6TXFRAGS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6txfrags,
+				  last_gather_stats->ip6txfrags,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_IP6TXMCPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->ip6txmcpkts,
+				  last_gather_stats->ip6txmcpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_TCPRXSEGS] +=
+		IRDMA_STATS_DELTA(gather_stats->tcprxsegs,
+				  last_gather_stats->tcprxsegs,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_TCPTXSEG] +=
+		IRDMA_STATS_DELTA(gather_stats->tcptxsegs,
+				  last_gather_stats->tcptxsegs,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMARXRDS] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmarxrds,
+				  last_gather_stats->rdmarxrds,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMARXSNDS] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmarxsnds,
+				  last_gather_stats->rdmarxsnds,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMARXWRS] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmarxwrs,
+				  last_gather_stats->rdmarxwrs,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMATXRDS] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmatxrds,
+				  last_gather_stats->rdmatxrds,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMATXSNDS] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmatxsnds,
+				  last_gather_stats->rdmatxsnds,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMATXWRS] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmatxwrs,
+				  last_gather_stats->rdmatxwrs,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMAVBND] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmavbn,
+				  last_gather_stats->rdmavbn,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RDMAVINV] +=
+		IRDMA_STATS_DELTA(gather_stats->rdmavinv,
+				  last_gather_stats->rdmavinv,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_UDPRXPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->udprxpkts,
+				  last_gather_stats->udprxpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_UDPTXPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->udptxpkts,
+				  last_gather_stats->udptxpkts,
+				  IRDMA_MAX_STATS_48);
+	stats_val[IRDMA_HW_STAT_INDEX_RXNPECNMARKEDPKTS] +=
+		IRDMA_STATS_DELTA(gather_stats->rxnpecnmrkpkts,
+				  last_gather_stats->rxnpecnmrkpkts,
+				  IRDMA_MAX_STATS_48);
+	memcpy(last_gather_stats, gather_stats, sizeof(*last_gather_stats));
+}
diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h
new file mode 100644
index 0000000..adff0e4
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/defs.h
@@ -0,0 +1,1162 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#ifndef IRDMA_DEFS_H
+#define IRDMA_DEFS_H
+
+#define IRDMA_FIRST_USER_QP_ID	3
+
+#define ECN_CODE_PT_VAL	2
+
+#define IRDMA_PUSH_OFFSET		(8 * 1024 * 1024)
+#define IRDMA_PF_FIRST_PUSH_PAGE_INDEX	16
+#define IRDMA_PF_BAR_RSVD		(60 * 1024)
+#define IRDMA_VF_PUSH_OFFSET		((8 + 64) * 1024)
+#define IRDMA_VF_FIRST_PUSH_PAGE_INDEX	2
+#define IRDMA_VF_BAR_RSVD		4096
+#define IRDMA_VF_STATS_SIZE_V0	280
+
+#define IRDMA_PE_DB_SIZE_4M	1
+#define IRDMA_PE_DB_SIZE_8M	2
+
+#define IRDMA_IRD_HW_SIZE_4	0
+#define IRDMA_IRD_HW_SIZE_16	1
+#define IRDMA_IRD_HW_SIZE_64	2
+#define IRDMA_IRD_HW_SIZE_128	3
+#define IRDMA_IRD_HW_SIZE_256	4
+
+enum irdma_protocol_used {
+	IRDMA_ANY_PROTOCOL = 0,
+	IRDMA_IWARP_PROTOCOL_ONLY = 1,
+	IRDMA_ROCE_PROTOCOL_ONLY = 2,
+};
+
+#define IRDMA_QP_STATE_INVALID		0
+#define IRDMA_QP_STATE_IDLE		1
+#define IRDMA_QP_STATE_RTS		2
+#define IRDMA_QP_STATE_CLOSING		3
+#define IRDMA_QP_STATE_SQD		3
+#define IRDMA_QP_STATE_RTR		4
+#define IRDMA_QP_STATE_TERMINATE	5
+#define IRDMA_QP_STATE_ERROR		6
+
+#define IRDMA_MAX_TRAFFIC_CLASS		8
+#define IRDMA_MAX_USER_PRIORITY		8
+#define IRDMA_MAX_APPS			8
+#define IRDMA_MAX_STATS_COUNT		128
+#define IRDMA_FIRST_NON_PF_STAT		4
+
+#define IRDMA_MIN_MTU_IPV4	576
+#define IRDMA_MIN_MTU_IPV6	1280
+#define IRDMA_MTU_TO_MSS_IPV4	40
+#define IRDMA_MTU_TO_MSS_IPV6	60
+#define IRDMA_DEFAULT_MTU	1500
+
+#define Q2_FPSN_OFFSET		64
+#define TERM_DDP_LEN_TAGGED	14
+#define TERM_DDP_LEN_UNTAGGED	18
+#define TERM_RDMA_LEN		28
+#define RDMA_OPCODE_M		0x0f
+#define RDMA_READ_REQ_OPCODE	1
+#define Q2_BAD_FRAME_OFFSET	72
+#define CQE_MAJOR_DRV		0x8000
+
+#define IRDMA_TERM_SENT		1
+#define IRDMA_TERM_RCVD		2
+#define IRDMA_TERM_DONE		4
+#define IRDMA_MAC_HLEN		14
+
+#define IRDMA_CQP_WAIT_POLL_REGS	1
+#define IRDMA_CQP_WAIT_POLL_CQ		2
+#define IRDMA_CQP_WAIT_EVENT		3
+
+#define IRDMA_AE_SOURCE_RSVD		0x0
+#define IRDMA_AE_SOURCE_RQ		0x1
+#define IRDMA_AE_SOURCE_RQ_0011		0x3
+
+#define IRDMA_AE_SOURCE_CQ		0x2
+#define IRDMA_AE_SOURCE_CQ_0110		0x6
+#define IRDMA_AE_SOURCE_CQ_1010		0xa
+#define IRDMA_AE_SOURCE_CQ_1110		0xe
+
+#define IRDMA_AE_SOURCE_SQ		0x5
+#define IRDMA_AE_SOURCE_SQ_0111		0x7
+
+#define IRDMA_AE_SOURCE_IN_RR_WR	0x9
+#define IRDMA_AE_SOURCE_IN_RR_WR_1011	0xb
+#define IRDMA_AE_SOURCE_OUT_RR		0xd
+#define IRDMA_AE_SOURCE_OUT_RR_1111	0xf
+
+#define IRDMA_TCP_STATE_NON_EXISTENT	0
+#define IRDMA_TCP_STATE_CLOSED		1
+#define IRDMA_TCP_STATE_LISTEN		2
+#define IRDMA_STATE_SYN_SEND		3
+#define IRDMA_TCP_STATE_SYN_RECEIVED	4
+#define IRDMA_TCP_STATE_ESTABLISHED	5
+#define IRDMA_TCP_STATE_CLOSE_WAIT	6
+#define IRDMA_TCP_STATE_FIN_WAIT_1	7
+#define IRDMA_TCP_STATE_CLOSING		8
+#define IRDMA_TCP_STATE_LAST_ACK	9
+#define IRDMA_TCP_STATE_FIN_WAIT_2	10
+#define IRDMA_TCP_STATE_TIME_WAIT	11
+#define IRDMA_TCP_STATE_RESERVED_1	12
+#define IRDMA_TCP_STATE_RESERVED_2	13
+#define IRDMA_TCP_STATE_RESERVED_3	14
+#define IRDMA_TCP_STATE_RESERVED_4	15
+
+#define IRDMA_CQP_SW_SQSIZE_4		4
+#define IRDMA_CQP_SW_SQSIZE_2048	2048
+
+#define IRDMA_CQ_TYPE_IWARP	1
+#define IRDMA_CQ_TYPE_ILQ	2
+#define IRDMA_CQ_TYPE_IEQ	3
+#define IRDMA_CQ_TYPE_CQP	4
+
+#define IRDMA_DONE_COUNT	1000
+#define IRDMA_SLEEP_COUNT	10
+
+#define IRDMA_UPDATE_SD_BUFF_SIZE	128
+#define IRDMA_FEATURE_BUF_SIZE		(8 * IRDMA_MAX_FEATURES)
+
+#define IRDMA_MAX_QUANTA_PER_WR	8
+
+#define IRDMA_QP_SW_MAX_WQ_QUANTA	32768
+#define IRDMA_QP_SW_MAX_SQ_QUANTA	32768
+#define IRDMA_QP_SW_MAX_RQ_QUANTA	32768
+#define IRDMA_MAX_QP_WRS(max_quanta_per_wr) \
+	((IRDMA_QP_SW_MAX_WQ_QUANTA - IRDMA_SQ_RSVD) / (max_quanta_per_wr))
+
+#define IRDMAQP_TERM_SEND_TERM_AND_FIN		0
+#define IRDMAQP_TERM_SEND_TERM_ONLY		1
+#define IRDMAQP_TERM_SEND_FIN_ONLY		2
+#define IRDMAQP_TERM_DONOT_SEND_TERM_OR_FIN	3
+
+#define IRDMA_QP_TYPE_IWARP	1
+#define IRDMA_QP_TYPE_UDA	2
+#define IRDMA_QP_TYPE_ROCE_RC	3
+#define IRDMA_QP_TYPE_ROCE_UD	4
+
+#define IRDMA_HW_PAGE_SIZE	4096
+#define IRDMA_HW_PAGE_SHIFT	12
+#define IRDMA_CQE_QTYPE_RQ	0
+#define IRDMA_CQE_QTYPE_SQ	1
+
+#define IRDMA_QP_SW_MIN_WQSIZE	8u /* in WRs*/
+#define IRDMA_QP_WQE_MIN_SIZE	32
+#define IRDMA_QP_WQE_MAX_SIZE	256
+#define IRDMA_QP_WQE_MIN_QUANTA 1
+#define IRDMA_MAX_RQ_WQE_SHIFT_GEN1 2
+#define IRDMA_MAX_RQ_WQE_SHIFT_GEN2 3
+
+#define IRDMA_SQ_RSVD	258
+#define IRDMA_RQ_RSVD	1
+
+#define IRDMA_FEATURE_RTS_AE			1ULL
+#define IRDMA_FEATURE_CQ_RESIZE			2ULL
+#define IRDMA_FEATURE_ATOMIC_OPS		32ULL
+#define IRDMA_FEATURE_SRQ			64ULL
+
+#define IRDMAQP_OP_RDMA_WRITE			0x00
+#define IRDMAQP_OP_RDMA_READ			0x01
+#define IRDMAQP_OP_RDMA_SEND			0x03
+#define IRDMAQP_OP_RDMA_SEND_INV		0x04
+#define IRDMAQP_OP_RDMA_SEND_SOL_EVENT		0x05
+#define IRDMAQP_OP_RDMA_SEND_SOL_EVENT_INV	0x06
+#define IRDMAQP_OP_BIND_MW			0x08
+#define IRDMAQP_OP_FAST_REGISTER		0x09
+#define IRDMAQP_OP_LOCAL_INVALIDATE		0x0a
+#define IRDMAQP_OP_RDMA_READ_LOC_INV		0x0b
+#define IRDMAQP_OP_NOP				0x0c
+#define IRDMAQP_OP_RDMA_WRITE_SOL		0x0d
+#define IRDMAQP_OP_GEN_RTS_AE			0x30
+
+enum irdma_cqp_op_type {
+	IRDMA_OP_CEQ_DESTROY			= 1,
+	IRDMA_OP_AEQ_DESTROY			= 2,
+	IRDMA_OP_DELETE_ARP_CACHE_ENTRY		= 3,
+	IRDMA_OP_MANAGE_APBVT_ENTRY		= 4,
+	IRDMA_OP_CEQ_CREATE			= 5,
+	IRDMA_OP_AEQ_CREATE			= 6,
+	IRDMA_OP_MANAGE_QHASH_TABLE_ENTRY	= 7,
+	IRDMA_OP_QP_MODIFY			= 8,
+	IRDMA_OP_QP_UPLOAD_CONTEXT		= 9,
+	IRDMA_OP_CQ_CREATE			= 10,
+	IRDMA_OP_CQ_DESTROY			= 11,
+	IRDMA_OP_QP_CREATE			= 12,
+	IRDMA_OP_QP_DESTROY			= 13,
+	IRDMA_OP_ALLOC_STAG			= 14,
+	IRDMA_OP_MR_REG_NON_SHARED		= 15,
+	IRDMA_OP_DEALLOC_STAG			= 16,
+	IRDMA_OP_MW_ALLOC			= 17,
+	IRDMA_OP_QP_FLUSH_WQES			= 18,
+	IRDMA_OP_ADD_ARP_CACHE_ENTRY		= 19,
+	IRDMA_OP_MANAGE_PUSH_PAGE		= 20,
+	IRDMA_OP_UPDATE_PE_SDS			= 21,
+	IRDMA_OP_MANAGE_HMC_PM_FUNC_TABLE	= 22,
+	IRDMA_OP_SUSPEND			= 23,
+	IRDMA_OP_RESUME				= 24,
+	IRDMA_OP_MANAGE_VF_PBLE_BP		= 25,
+	IRDMA_OP_QUERY_FPM_VAL			= 26,
+	IRDMA_OP_COMMIT_FPM_VAL			= 27,
+	IRDMA_OP_REQ_CMDS			= 28,
+	IRDMA_OP_CMPL_CMDS			= 29,
+	IRDMA_OP_AH_CREATE			= 30,
+	IRDMA_OP_AH_MODIFY			= 31,
+	IRDMA_OP_AH_DESTROY			= 32,
+	IRDMA_OP_MC_CREATE			= 33,
+	IRDMA_OP_MC_DESTROY			= 34,
+	IRDMA_OP_MC_MODIFY			= 35,
+	IRDMA_OP_STATS_ALLOCATE			= 36,
+	IRDMA_OP_STATS_FREE			= 37,
+	IRDMA_OP_STATS_GATHER			= 38,
+	IRDMA_OP_WS_ADD_NODE			= 39,
+	IRDMA_OP_WS_MODIFY_NODE			= 40,
+	IRDMA_OP_WS_DELETE_NODE			= 41,
+	IRDMA_OP_WS_FAILOVER_START		= 42,
+	IRDMA_OP_WS_FAILOVER_COMPLETE		= 43,
+	IRDMA_OP_SET_UP_MAP			= 44,
+	IRDMA_OP_GEN_AE				= 45,
+	IRDMA_OP_QUERY_RDMA_FEATURES		= 46,
+	IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY		= 47,
+	IRDMA_OP_ADD_LOCAL_MAC_ENTRY		= 48,
+	IRDMA_OP_DELETE_LOCAL_MAC_ENTRY		= 49,
+	IRDMA_OP_CQ_MODIFY			= 50,
+
+	/* Must be last entry*/
+	IRDMA_MAX_CQP_OPS			= 51,
+};
+
+/* CQP SQ WQES */
+#define IRDMA_CQP_OP_CREATE_QP				0
+#define IRDMA_CQP_OP_MODIFY_QP				0x1
+#define IRDMA_CQP_OP_DESTROY_QP				0x02
+#define IRDMA_CQP_OP_CREATE_CQ				0x03
+#define IRDMA_CQP_OP_MODIFY_CQ				0x04
+#define IRDMA_CQP_OP_DESTROY_CQ				0x05
+#define IRDMA_CQP_OP_ALLOC_STAG				0x09
+#define IRDMA_CQP_OP_REG_MR				0x0a
+#define IRDMA_CQP_OP_QUERY_STAG				0x0b
+#define IRDMA_CQP_OP_REG_SMR				0x0c
+#define IRDMA_CQP_OP_DEALLOC_STAG			0x0d
+#define IRDMA_CQP_OP_MANAGE_LOC_MAC_TABLE		0x0e
+#define IRDMA_CQP_OP_MANAGE_ARP				0x0f
+#define IRDMA_CQP_OP_MANAGE_VF_PBLE_BP			0x10
+#define IRDMA_CQP_OP_MANAGE_PUSH_PAGES			0x11
+#define IRDMA_CQP_OP_QUERY_RDMA_FEATURES		0x12
+#define IRDMA_CQP_OP_UPLOAD_CONTEXT			0x13
+#define IRDMA_CQP_OP_ALLOCATE_LOC_MAC_TABLE_ENTRY	0x14
+#define IRDMA_CQP_OP_UPLOAD_CONTEXT			0x13
+#define IRDMA_CQP_OP_MANAGE_HMC_PM_FUNC_TABLE		0x15
+#define IRDMA_CQP_OP_CREATE_CEQ				0x16
+#define IRDMA_CQP_OP_DESTROY_CEQ			0x18
+#define IRDMA_CQP_OP_CREATE_AEQ				0x19
+#define IRDMA_CQP_OP_DESTROY_AEQ			0x1b
+#define IRDMA_CQP_OP_CREATE_ADDR_HANDLE			0x1c
+#define IRDMA_CQP_OP_MODIFY_ADDR_HANDLE			0x1d
+#define IRDMA_CQP_OP_DESTROY_ADDR_HANDLE		0x1e
+#define IRDMA_CQP_OP_UPDATE_PE_SDS			0x1f
+#define IRDMA_CQP_OP_QUERY_FPM_VAL			0x20
+#define IRDMA_CQP_OP_COMMIT_FPM_VAL			0x21
+#define IRDMA_CQP_OP_FLUSH_WQES				0x22
+/* IRDMA_CQP_OP_GEN_AE is the same value as IRDMA_CQP_OP_FLUSH_WQES */
+#define IRDMA_CQP_OP_GEN_AE				0x22
+#define IRDMA_CQP_OP_MANAGE_APBVT			0x23
+#define IRDMA_CQP_OP_NOP				0x24
+#define IRDMA_CQP_OP_MANAGE_QUAD_HASH_TABLE_ENTRY	0x25
+#define IRDMA_CQP_OP_CREATE_MCAST_GRP			0x26
+#define IRDMA_CQP_OP_MODIFY_MCAST_GRP			0x27
+#define IRDMA_CQP_OP_DESTROY_MCAST_GRP			0x28
+#define IRDMA_CQP_OP_SUSPEND_QP				0x29
+#define IRDMA_CQP_OP_RESUME_QP				0x2a
+#define IRDMA_CQP_OP_SHMC_PAGES_ALLOCATED		0x2b
+#define IRDMA_CQP_OP_WORK_SCHED_NODE			0x2c
+#define IRDMA_CQP_OP_MANAGE_STATS			0x2d
+#define IRDMA_CQP_OP_GATHER_STATS			0x2e
+#define IRDMA_CQP_OP_UP_MAP				0x2f
+
+/* Async Events codes */
+#define IRDMA_AE_AMP_UNALLOCATED_STAG					0x0102
+#define IRDMA_AE_AMP_INVALID_STAG					0x0103
+#define IRDMA_AE_AMP_BAD_QP						0x0104
+#define IRDMA_AE_AMP_BAD_PD						0x0105
+#define IRDMA_AE_AMP_BAD_STAG_KEY					0x0106
+#define IRDMA_AE_AMP_BAD_STAG_INDEX					0x0107
+#define IRDMA_AE_AMP_BOUNDS_VIOLATION					0x0108
+#define IRDMA_AE_AMP_RIGHTS_VIOLATION					0x0109
+#define IRDMA_AE_AMP_TO_WRAP						0x010a
+#define IRDMA_AE_AMP_FASTREG_VALID_STAG					0x010c
+#define IRDMA_AE_AMP_FASTREG_MW_STAG					0x010d
+#define IRDMA_AE_AMP_FASTREG_INVALID_RIGHTS				0x010e
+#define IRDMA_AE_AMP_FASTREG_INVALID_LENGTH				0x0110
+#define IRDMA_AE_AMP_INVALIDATE_SHARED					0x0111
+#define IRDMA_AE_AMP_INVALIDATE_NO_REMOTE_ACCESS_RIGHTS			0x0112
+#define IRDMA_AE_AMP_INVALIDATE_MR_WITH_BOUND_WINDOWS			0x0113
+#define IRDMA_AE_AMP_MWBIND_VALID_STAG					0x0114
+#define IRDMA_AE_AMP_MWBIND_OF_MR_STAG					0x0115
+#define IRDMA_AE_AMP_MWBIND_TO_ZERO_BASED_STAG				0x0116
+#define IRDMA_AE_AMP_MWBIND_TO_MW_STAG					0x0117
+#define IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS				0x0118
+#define IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS				0x0119
+#define IRDMA_AE_AMP_MWBIND_TO_INVALID_PARENT				0x011a
+#define IRDMA_AE_AMP_MWBIND_BIND_DISABLED				0x011b
+#define IRDMA_AE_PRIV_OPERATION_DENIED					0x011c
+#define IRDMA_AE_AMP_INVALIDATE_TYPE1_MW				0x011d
+#define IRDMA_AE_AMP_MWBIND_ZERO_BASED_TYPE1_MW				0x011e
+#define IRDMA_AE_AMP_FASTREG_INVALID_PBL_HPS_CFG			0x011f
+#define IRDMA_AE_AMP_MWBIND_WRONG_TYPE					0x0120
+#define IRDMA_AE_AMP_FASTREG_PBLE_MISMATCH				0x0121
+#define IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG				0x0132
+#define IRDMA_AE_UDA_XMIT_BAD_PD					0x0133
+#define IRDMA_AE_UDA_XMIT_DGRAM_TOO_SHORT				0x0134
+#define IRDMA_AE_UDA_L4LEN_INVALID					0x0135
+#define IRDMA_AE_BAD_CLOSE						0x0201
+#define IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE				0x0202
+#define IRDMA_AE_CQ_OPERATION_ERROR					0x0203
+#define IRDMA_AE_RDMA_READ_WHILE_ORD_ZERO				0x0205
+#define IRDMA_AE_STAG_ZERO_INVALID					0x0206
+#define IRDMA_AE_IB_RREQ_AND_Q1_FULL					0x0207
+#define IRDMA_AE_IB_INVALID_REQUEST					0x0208
+#define IRDMA_AE_WQE_UNEXPECTED_OPCODE					0x020a
+#define IRDMA_AE_WQE_INVALID_PARAMETER					0x020b
+#define IRDMA_AE_WQE_INVALID_FRAG_DATA					0x020c
+#define IRDMA_AE_IB_REMOTE_ACCESS_ERROR					0x020d
+#define IRDMA_AE_IB_REMOTE_OP_ERROR					0x020e
+#define IRDMA_AE_WQE_LSMM_TOO_LONG					0x0220
+#define IRDMA_AE_DDP_INVALID_MSN_GAP_IN_MSN				0x0301
+#define IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER	0x0303
+#define IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION				0x0304
+#define IRDMA_AE_DDP_UBE_INVALID_MO					0x0305
+#define IRDMA_AE_DDP_UBE_INVALID_MSN_NO_BUFFER_AVAILABLE		0x0306
+#define IRDMA_AE_DDP_UBE_INVALID_QN					0x0307
+#define IRDMA_AE_DDP_NO_L_BIT						0x0308
+#define IRDMA_AE_RDMAP_ROE_INVALID_RDMAP_VERSION			0x0311
+#define IRDMA_AE_RDMAP_ROE_UNEXPECTED_OPCODE				0x0312
+#define IRDMA_AE_ROE_INVALID_RDMA_READ_REQUEST				0x0313
+#define IRDMA_AE_ROE_INVALID_RDMA_WRITE_OR_READ_RESP			0x0314
+#define IRDMA_AE_ROCE_RSP_LENGTH_ERROR					0x0316
+#define IRDMA_AE_ROCE_EMPTY_MCG						0x0380
+#define IRDMA_AE_ROCE_BAD_MC_IP_ADDR					0x0381
+#define IRDMA_AE_ROCE_BAD_MC_QPID					0x0382
+#define IRDMA_AE_MCG_QP_PROTOCOL_MISMATCH				0x0383
+#define IRDMA_AE_INVALID_ARP_ENTRY					0x0401
+#define IRDMA_AE_INVALID_TCP_OPTION_RCVD				0x0402
+#define IRDMA_AE_STALE_ARP_ENTRY					0x0403
+#define IRDMA_AE_INVALID_AH_ENTRY					0x0406
+#define IRDMA_AE_LLP_CLOSE_COMPLETE					0x0501
+#define IRDMA_AE_LLP_CONNECTION_RESET					0x0502
+#define IRDMA_AE_LLP_FIN_RECEIVED					0x0503
+#define IRDMA_AE_LLP_RECEIVED_MARKER_AND_LENGTH_FIELDS_DONT_MATCH	0x0504
+#define IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR				0x0505
+#define IRDMA_AE_LLP_SEGMENT_TOO_SMALL					0x0507
+#define IRDMA_AE_LLP_SYN_RECEIVED					0x0508
+#define IRDMA_AE_LLP_TERMINATE_RECEIVED					0x0509
+#define IRDMA_AE_LLP_TOO_MANY_RETRIES					0x050a
+#define IRDMA_AE_LLP_TOO_MANY_KEEPALIVE_RETRIES				0x050b
+#define IRDMA_AE_LLP_DOUBT_REACHABILITY					0x050c
+#define IRDMA_AE_LLP_CONNECTION_ESTABLISHED				0x050e
+#define IRDMA_AE_RESOURCE_EXHAUSTION					0x0520
+#define IRDMA_AE_RESET_SENT						0x0601
+#define IRDMA_AE_TERMINATE_SENT						0x0602
+#define IRDMA_AE_RESET_NOT_SENT						0x0603
+#define IRDMA_AE_LCE_QP_CATASTROPHIC					0x0700
+#define IRDMA_AE_LCE_FUNCTION_CATASTROPHIC				0x0701
+#define IRDMA_AE_LCE_CQ_CATASTROPHIC					0x0702
+#define IRDMA_AE_QP_SUSPEND_COMPLETE					0x0900
+
+#define FLD_LS_64(dev, val, field)	\
+	(((u64)(val) << (dev)->hw_shifts[field ## _S]) & (dev)->hw_masks[field ## _M])
+#define FLD_RS_64(dev, val, field)	\
+	((u64)((val) & (dev)->hw_masks[field ## _M]) >> (dev)->hw_shifts[field ## _S])
+#define FLD_LS_32(dev, val, field)	\
+	(((val) << (dev)->hw_shifts[field ## _S]) & (dev)->hw_masks[field ## _M])
+#define FLD_RS_32(dev, val, field)	\
+	((u64)((val) & (dev)->hw_masks[field ## _M]) >> (dev)->hw_shifts[field ## _S])
+
+#define IRDMA_STATS_DELTA(a, b, c) ((a) >= (b) ? (a) - (b) : (a) + (c) - (b))
+#define IRDMA_MAX_STATS_32	0xFFFFFFFFULL
+#define IRDMA_MAX_STATS_48	0xFFFFFFFFFFFFULL
+
+#define IRDMA_MAX_CQ_READ_THRESH 0x3FFFF
+#define IRDMA_CQPSQ_QHASH_VLANID GENMASK_ULL(43, 32)
+#define IRDMA_CQPSQ_QHASH_QPN GENMASK_ULL(49, 32)
+#define IRDMA_CQPSQ_QHASH_QS_HANDLE GENMASK_ULL(9, 0)
+#define IRDMA_CQPSQ_QHASH_SRC_PORT GENMASK_ULL(31, 16)
+#define IRDMA_CQPSQ_QHASH_DEST_PORT GENMASK_ULL(15, 0)
+#define IRDMA_CQPSQ_QHASH_ADDR0 GENMASK_ULL(63, 32)
+#define IRDMA_CQPSQ_QHASH_ADDR1 GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_QHASH_ADDR2 GENMASK_ULL(63, 32)
+#define IRDMA_CQPSQ_QHASH_ADDR3 GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_QHASH_WQEVALID BIT_ULL(63)
+#define IRDMA_CQPSQ_QHASH_OPCODE GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_QHASH_MANAGE GENMASK_ULL(62, 61)
+#define IRDMA_CQPSQ_QHASH_IPV4VALID BIT_ULL(60)
+#define IRDMA_CQPSQ_QHASH_VLANVALID BIT_ULL(59)
+#define IRDMA_CQPSQ_QHASH_ENTRYTYPE GENMASK_ULL(44, 42)
+#define IRDMA_CQPSQ_STATS_WQEVALID BIT_ULL(63)
+#define IRDMA_CQPSQ_STATS_ALLOC_INST BIT_ULL(62)
+#define IRDMA_CQPSQ_STATS_USE_HMC_FCN_INDEX BIT_ULL(60)
+#define IRDMA_CQPSQ_STATS_USE_INST BIT_ULL(61)
+#define IRDMA_CQPSQ_STATS_OP GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_STATS_INST_INDEX GENMASK_ULL(6, 0)
+#define IRDMA_CQPSQ_STATS_HMC_FCN_INDEX GENMASK_ULL(5, 0)
+#define IRDMA_CQPSQ_WS_WQEVALID BIT_ULL(63)
+#define IRDMA_CQPSQ_WS_NODEOP GENMASK_ULL(53, 52)
+
+#define IRDMA_CQPSQ_WS_ENABLENODE BIT_ULL(62)
+#define IRDMA_CQPSQ_WS_NODETYPE BIT_ULL(61)
+#define IRDMA_CQPSQ_WS_PRIOTYPE GENMASK_ULL(60, 59)
+#define IRDMA_CQPSQ_WS_TC GENMASK_ULL(58, 56)
+#define IRDMA_CQPSQ_WS_VMVFTYPE GENMASK_ULL(55, 54)
+#define IRDMA_CQPSQ_WS_VMVFNUM GENMASK_ULL(51, 42)
+#define IRDMA_CQPSQ_WS_OP GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_WS_PARENTID GENMASK_ULL(25, 16)
+#define IRDMA_CQPSQ_WS_NODEID GENMASK_ULL(9, 0)
+#define IRDMA_CQPSQ_WS_VSI GENMASK_ULL(57, 48)
+#define IRDMA_CQPSQ_WS_WEIGHT GENMASK_ULL(38, 32)
+
+#define IRDMA_CQPSQ_UP_WQEVALID BIT_ULL(63)
+#define IRDMA_CQPSQ_UP_USEVLAN BIT_ULL(62)
+#define IRDMA_CQPSQ_UP_USEOVERRIDE BIT_ULL(61)
+#define IRDMA_CQPSQ_UP_OP GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_UP_HMCFCNIDX GENMASK_ULL(5, 0)
+#define IRDMA_CQPSQ_UP_CNPOVERRIDE GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_QUERY_RDMA_FEATURES_WQEVALID BIT_ULL(63)
+#define IRDMA_CQPSQ_QUERY_RDMA_FEATURES_BUF_LEN GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_QUERY_RDMA_FEATURES_OP GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_QUERY_RDMA_FEATURES_HW_MODEL_USED GENMASK_ULL(47, 32)
+#define IRDMA_CQPSQ_QUERY_RDMA_FEATURES_HW_MAJOR_VERSION GENMASK_ULL(23, 16)
+#define IRDMA_CQPSQ_QUERY_RDMA_FEATURES_HW_MINOR_VERSION GENMASK_ULL(7, 0)
+#define IRDMA_CQPHC_SQSIZE GENMASK_ULL(11, 8)
+#define IRDMA_CQPHC_DISABLE_PFPDUS BIT_ULL(1)
+#define IRDMA_CQPHC_ROCEV2_RTO_POLICY BIT_ULL(2)
+#define IRDMA_CQPHC_PROTOCOL_USED GENMASK_ULL(4, 3)
+#define IRDMA_CQPHC_MIN_RATE GENMASK_ULL(51, 48)
+#define IRDMA_CQPHC_MIN_DEC_FACTOR GENMASK_ULL(59, 56)
+#define IRDMA_CQPHC_DCQCN_T GENMASK_ULL(15, 0)
+#define IRDMA_CQPHC_HAI_FACTOR GENMASK_ULL(47, 32)
+#define IRDMA_CQPHC_RAI_FACTOR GENMASK_ULL(63, 48)
+#define IRDMA_CQPHC_DCQCN_B GENMASK_ULL(24, 0)
+#define IRDMA_CQPHC_DCQCN_F GENMASK_ULL(27, 25)
+#define IRDMA_CQPHC_CC_CFG_VALID BIT_ULL(31)
+#define IRDMA_CQPHC_RREDUCE_MPERIOD GENMASK_ULL(63, 32)
+#define IRDMA_CQPHC_HW_MINVER GENMASK_ULL(15, 0)
+
+#define IRDMA_CQPHC_HW_MAJVER_GEN_1 0
+#define IRDMA_CQPHC_HW_MAJVER_GEN_2 1
+#define IRDMA_CQPHC_HW_MAJVER_GEN_3 2
+#define IRDMA_CQPHC_HW_MAJVER GENMASK_ULL(31, 16)
+#define IRDMA_CQPHC_CEQPERVF GENMASK_ULL(39, 32)
+
+#define IRDMA_CQPHC_ENABLED_VFS GENMASK_ULL(37, 32)
+
+#define IRDMA_CQPHC_HMC_PROFILE GENMASK_ULL(2, 0)
+#define IRDMA_CQPHC_SVER GENMASK_ULL(31, 24)
+#define IRDMA_CQPHC_SQBASE GENMASK_ULL(63, 9)
+
+#define IRDMA_CQPHC_QPCTX GENMASK_ULL(63, 0)
+#define IRDMA_QP_DBSA_HW_SQ_TAIL GENMASK_ULL(14, 0)
+#define IRDMA_CQ_DBSA_CQEIDX GENMASK_ULL(19, 0)
+#define IRDMA_CQ_DBSA_SW_CQ_SELECT GENMASK_ULL(13, 0)
+#define IRDMA_CQ_DBSA_ARM_NEXT BIT_ULL(14)
+#define IRDMA_CQ_DBSA_ARM_NEXT_SE BIT_ULL(15)
+#define IRDMA_CQ_DBSA_ARM_SEQ_NUM GENMASK_ULL(17, 16)
+
+/* CQP and iWARP Completion Queue */
+#define IRDMA_CQ_QPCTX IRDMA_CQPHC_QPCTX
+
+#define IRDMA_CCQ_OPRETVAL GENMASK_ULL(31, 0)
+
+#define IRDMA_CQ_MINERR GENMASK_ULL(15, 0)
+#define IRDMA_CQ_MAJERR GENMASK_ULL(31, 16)
+#define IRDMA_CQ_WQEIDX GENMASK_ULL(46, 32)
+#define IRDMA_CQ_EXTCQE BIT_ULL(50)
+#define IRDMA_OOO_CMPL BIT_ULL(54)
+#define IRDMA_CQ_ERROR BIT_ULL(55)
+#define IRDMA_CQ_SQ BIT_ULL(62)
+
+#define IRDMA_CQ_VALID BIT_ULL(63)
+#define IRDMA_CQ_IMMVALID BIT_ULL(62)
+#define IRDMA_CQ_UDSMACVALID BIT_ULL(61)
+#define IRDMA_CQ_UDVLANVALID BIT_ULL(60)
+#define IRDMA_CQ_UDSMAC GENMASK_ULL(47, 0)
+#define IRDMA_CQ_UDVLAN GENMASK_ULL(63, 48)
+
+#define IRDMA_CQ_IMMDATA_S 0
+#define IRDMA_CQ_IMMDATA_M (0xffffffffffffffffULL << IRDMA_CQ_IMMVALID_S)
+#define IRDMA_CQ_IMMDATALOW32 GENMASK_ULL(31, 0)
+#define IRDMA_CQ_IMMDATAUP32 GENMASK_ULL(63, 32)
+#define IRDMACQ_PAYLDLEN GENMASK_ULL(31, 0)
+#define IRDMACQ_TCPSEQNUMRTT GENMASK_ULL(63, 32)
+#define IRDMACQ_INVSTAG GENMASK_ULL(31, 0)
+#define IRDMACQ_QPID GENMASK_ULL(55, 32)
+
+#define IRDMACQ_UDSRCQPN GENMASK_ULL(31, 0)
+#define IRDMACQ_PSHDROP BIT_ULL(51)
+#define IRDMACQ_STAG BIT_ULL(53)
+#define IRDMACQ_IPV4 BIT_ULL(53)
+#define IRDMACQ_SOEVENT BIT_ULL(54)
+#define IRDMACQ_OP GENMASK_ULL(61, 56)
+
+#define IRDMA_CEQE_CQCTX GENMASK_ULL(62, 0)
+#define IRDMA_CEQE_VALID BIT_ULL(63)
+
+/* AEQE format */
+#define IRDMA_AEQE_COMPCTX IRDMA_CQPHC_QPCTX
+#define IRDMA_AEQE_QPCQID_LOW GENMASK_ULL(17, 0)
+#define IRDMA_AEQE_QPCQID_HI BIT_ULL(46)
+#define IRDMA_AEQE_WQDESCIDX GENMASK_ULL(32, 18)
+#define IRDMA_AEQE_OVERFLOW BIT_ULL(33)
+#define IRDMA_AEQE_AECODE GENMASK_ULL(45, 34)
+#define IRDMA_AEQE_AESRC GENMASK_ULL(53, 50)
+#define IRDMA_AEQE_IWSTATE GENMASK_ULL(56, 54)
+#define IRDMA_AEQE_TCPSTATE GENMASK_ULL(60, 57)
+#define IRDMA_AEQE_Q2DATA GENMASK_ULL(62, 61)
+#define IRDMA_AEQE_VALID BIT_ULL(63)
+
+#define IRDMA_UDA_QPSQ_NEXT_HDR GENMASK_ULL(23, 16)
+#define IRDMA_UDA_QPSQ_OPCODE GENMASK_ULL(37, 32)
+#define IRDMA_UDA_QPSQ_L4LEN GENMASK_ULL(45, 42)
+#define IRDMA_GEN1_UDA_QPSQ_L4LEN GENMASK_ULL(27, 24)
+#define IRDMA_UDA_QPSQ_AHIDX GENMASK_ULL(16, 0)
+#define IRDMA_UDA_QPSQ_VALID BIT_ULL(63)
+#define IRDMA_UDA_QPSQ_SIGCOMPL BIT_ULL(62)
+#define IRDMA_UDA_QPSQ_MACLEN GENMASK_ULL(62, 56)
+#define IRDMA_UDA_QPSQ_IPLEN GENMASK_ULL(54, 48)
+#define IRDMA_UDA_QPSQ_L4T GENMASK_ULL(31, 30)
+#define IRDMA_UDA_QPSQ_IIPT GENMASK_ULL(29, 28)
+#define IRDMA_UDA_PAYLOADLEN GENMASK_ULL(13, 0)
+#define IRDMA_UDA_HDRLEN GENMASK_ULL(24, 16)
+#define IRDMA_VLAN_TAG_VALID BIT_ULL(50)
+#define IRDMA_UDA_L3PROTO GENMASK_ULL(1, 0)
+#define IRDMA_UDA_L4PROTO GENMASK_ULL(17, 16)
+#define IRDMA_UDA_QPSQ_DOLOOPBACK BIT_ULL(44)
+#define IRDMA_CQPSQ_BUFSIZE GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_OPCODE GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_WQEVALID BIT_ULL(63)
+#define IRDMA_CQPSQ_TPHVAL GENMASK_ULL(7, 0)
+
+#define IRDMA_CQPSQ_VSIIDX GENMASK_ULL(17, 8)
+#define IRDMA_CQPSQ_TPHEN BIT_ULL(60)
+
+#define IRDMA_CQPSQ_PBUFADDR IRDMA_CQPHC_QPCTX
+
+/* Create/Modify/Destroy QP */
+
+#define IRDMA_CQPSQ_QP_NEWMSS GENMASK_ULL(45, 32)
+#define IRDMA_CQPSQ_QP_TERMLEN GENMASK_ULL(51, 48)
+
+#define IRDMA_CQPSQ_QP_QPCTX IRDMA_CQPHC_QPCTX
+
+#define IRDMA_CQPSQ_QP_QPID_S 0
+#define IRDMA_CQPSQ_QP_QPID_M (0xFFFFFFUL)
+
+#define IRDMA_CQPSQ_QP_OP_S 32
+#define IRDMA_CQPSQ_QP_OP_M IRDMACQ_OP_M
+#define IRDMA_CQPSQ_QP_ORDVALID BIT_ULL(42)
+#define IRDMA_CQPSQ_QP_TOECTXVALID BIT_ULL(43)
+#define IRDMA_CQPSQ_QP_CACHEDVARVALID BIT_ULL(44)
+#define IRDMA_CQPSQ_QP_VQ BIT_ULL(45)
+#define IRDMA_CQPSQ_QP_FORCELOOPBACK BIT_ULL(46)
+#define IRDMA_CQPSQ_QP_CQNUMVALID BIT_ULL(47)
+#define IRDMA_CQPSQ_QP_QPTYPE GENMASK_ULL(50, 48)
+#define IRDMA_CQPSQ_QP_MACVALID BIT_ULL(51)
+#define IRDMA_CQPSQ_QP_MSSCHANGE BIT_ULL(52)
+
+#define IRDMA_CQPSQ_QP_IGNOREMWBOUND BIT_ULL(54)
+#define IRDMA_CQPSQ_QP_REMOVEHASHENTRY BIT_ULL(55)
+#define IRDMA_CQPSQ_QP_TERMACT GENMASK_ULL(57, 56)
+#define IRDMA_CQPSQ_QP_RESETCON BIT_ULL(58)
+#define IRDMA_CQPSQ_QP_ARPTABIDXVALID BIT_ULL(59)
+#define IRDMA_CQPSQ_QP_NEXTIWSTATE GENMASK_ULL(62, 60)
+
+#define IRDMA_CQPSQ_QP_DBSHADOWADDR IRDMA_CQPHC_QPCTX
+
+#define IRDMA_CQPSQ_CQ_CQSIZE GENMASK_ULL(20, 0)
+#define IRDMA_CQPSQ_CQ_CQCTX GENMASK_ULL(62, 0)
+#define IRDMA_CQPSQ_CQ_SHADOW_READ_THRESHOLD GENMASK(17, 0)
+
+#define IRDMA_CQPSQ_CQ_OP GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_CQ_CQRESIZE BIT_ULL(43)
+#define IRDMA_CQPSQ_CQ_LPBLSIZE GENMASK_ULL(45, 44)
+#define IRDMA_CQPSQ_CQ_CHKOVERFLOW BIT_ULL(46)
+#define IRDMA_CQPSQ_CQ_VIRTMAP BIT_ULL(47)
+#define IRDMA_CQPSQ_CQ_ENCEQEMASK BIT_ULL(48)
+#define IRDMA_CQPSQ_CQ_CEQIDVALID BIT_ULL(49)
+#define IRDMA_CQPSQ_CQ_AVOIDMEMCNFLCT BIT_ULL(61)
+#define IRDMA_CQPSQ_CQ_FIRSTPMPBLIDX GENMASK_ULL(27, 0)
+
+/* Allocate/Register/Register Shared/Deallocate Stag */
+#define IRDMA_CQPSQ_STAG_VA_FBO IRDMA_CQPHC_QPCTX
+#define IRDMA_CQPSQ_STAG_STAGLEN GENMASK_ULL(45, 0)
+#define IRDMA_CQPSQ_STAG_KEY GENMASK_ULL(7, 0)
+#define IRDMA_CQPSQ_STAG_IDX GENMASK_ULL(31, 8)
+#define IRDMA_CQPSQ_STAG_IDX_S 8
+#define IRDMA_CQPSQ_STAG_PARENTSTAGIDX GENMASK_ULL(55, 32)
+#define IRDMA_CQPSQ_STAG_MR BIT_ULL(43)
+#define IRDMA_CQPSQ_STAG_MWTYPE BIT_ULL(42)
+#define IRDMA_CQPSQ_STAG_MW1_BIND_DONT_VLDT_KEY BIT_ULL(58)
+
+#define IRDMA_CQPSQ_STAG_LPBLSIZE IRDMA_CQPSQ_CQ_LPBLSIZE
+#define IRDMA_CQPSQ_STAG_HPAGESIZE GENMASK_ULL(47, 46)
+#define IRDMA_CQPSQ_STAG_ARIGHTS GENMASK_ULL(52, 48)
+#define IRDMA_CQPSQ_STAG_REMACCENABLED BIT_ULL(53)
+#define IRDMA_CQPSQ_STAG_VABASEDTO BIT_ULL(59)
+#define IRDMA_CQPSQ_STAG_USEHMCFNIDX BIT_ULL(60)
+#define IRDMA_CQPSQ_STAG_USEPFRID BIT_ULL(61)
+
+#define IRDMA_CQPSQ_STAG_PBA IRDMA_CQPHC_QPCTX
+#define IRDMA_CQPSQ_STAG_HMCFNIDX GENMASK_ULL(5, 0)
+
+#define IRDMA_CQPSQ_STAG_FIRSTPMPBLIDX GENMASK_ULL(27, 0)
+#define IRDMA_CQPSQ_QUERYSTAG_IDX IRDMA_CQPSQ_STAG_IDX
+#define IRDMA_CQPSQ_MLM_TABLEIDX GENMASK_ULL(5, 0)
+#define IRDMA_CQPSQ_MLM_FREEENTRY BIT_ULL(62)
+#define IRDMA_CQPSQ_MLM_IGNORE_REF_CNT BIT_ULL(61)
+#define IRDMA_CQPSQ_MLM_MAC0 GENMASK_ULL(7, 0)
+#define IRDMA_CQPSQ_MLM_MAC1 GENMASK_ULL(15, 8)
+#define IRDMA_CQPSQ_MLM_MAC2 GENMASK_ULL(23, 16)
+#define IRDMA_CQPSQ_MLM_MAC3 GENMASK_ULL(31, 24)
+#define IRDMA_CQPSQ_MLM_MAC4 GENMASK_ULL(39, 32)
+#define IRDMA_CQPSQ_MLM_MAC5 GENMASK_ULL(47, 40)
+#define IRDMA_CQPSQ_MAT_REACHMAX GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_MAT_MACADDR GENMASK_ULL(47, 0)
+#define IRDMA_CQPSQ_MAT_ARPENTRYIDX GENMASK_ULL(11, 0)
+#define IRDMA_CQPSQ_MAT_ENTRYVALID BIT_ULL(42)
+#define IRDMA_CQPSQ_MAT_PERMANENT BIT_ULL(43)
+#define IRDMA_CQPSQ_MAT_QUERY BIT_ULL(44)
+#define IRDMA_CQPSQ_MVPBP_PD_ENTRY_CNT GENMASK_ULL(9, 0)
+#define IRDMA_CQPSQ_MVPBP_FIRST_PD_INX GENMASK_ULL(24, 16)
+#define IRDMA_CQPSQ_MVPBP_SD_INX GENMASK_ULL(43, 32)
+#define IRDMA_CQPSQ_MVPBP_INV_PD_ENT BIT_ULL(62)
+#define IRDMA_CQPSQ_MVPBP_PD_PLPBA GENMASK_ULL(63, 3)
+
+/* Manage Push Page - MPP */
+#define IRDMA_INVALID_PUSH_PAGE_INDEX_GEN_1 0xffff
+#define IRDMA_INVALID_PUSH_PAGE_INDEX 0xffffffff
+
+#define IRDMA_CQPSQ_MPP_QS_HANDLE GENMASK_ULL(9, 0)
+#define IRDMA_CQPSQ_MPP_PPIDX GENMASK_ULL(9, 0)
+#define IRDMA_CQPSQ_MPP_PPTYPE GENMASK_ULL(61, 60)
+
+#define IRDMA_CQPSQ_MPP_FREE_PAGE BIT_ULL(62)
+
+/* Upload Context - UCTX */
+#define IRDMA_CQPSQ_UCTX_QPCTXADDR IRDMA_CQPHC_QPCTX
+#define IRDMA_CQPSQ_UCTX_QPID GENMASK_ULL(23, 0)
+#define IRDMA_CQPSQ_UCTX_QPTYPE GENMASK_ULL(51, 48)
+
+#define IRDMA_CQPSQ_UCTX_RAWFORMAT BIT_ULL(61)
+#define IRDMA_CQPSQ_UCTX_FREEZEQP BIT_ULL(62)
+
+#define IRDMA_CQPSQ_MHMC_VFIDX GENMASK_ULL(15, 0)
+#define IRDMA_CQPSQ_MHMC_FREEPMFN BIT_ULL(62)
+
+#define IRDMA_CQPSQ_SHMCRP_HMC_PROFILE GENMASK_ULL(2, 0)
+#define IRDMA_CQPSQ_SHMCRP_VFNUM GENMASK_ULL(37, 32)
+#define IRDMA_CQPSQ_CEQ_CEQSIZE GENMASK_ULL(21, 0)
+#define IRDMA_CQPSQ_CEQ_CEQID GENMASK_ULL(9, 0)
+
+#define IRDMA_CQPSQ_CEQ_LPBLSIZE IRDMA_CQPSQ_CQ_LPBLSIZE
+#define IRDMA_CQPSQ_CEQ_VMAP BIT_ULL(47)
+#define IRDMA_CQPSQ_CEQ_ITRNOEXPIRE BIT_ULL(46)
+#define IRDMA_CQPSQ_CEQ_FIRSTPMPBLIDX GENMASK_ULL(27, 0)
+#define IRDMA_CQPSQ_AEQ_AEQECNT GENMASK_ULL(18, 0)
+#define IRDMA_CQPSQ_AEQ_LPBLSIZE IRDMA_CQPSQ_CQ_LPBLSIZE
+#define IRDMA_CQPSQ_AEQ_VMAP BIT_ULL(47)
+#define IRDMA_CQPSQ_AEQ_FIRSTPMPBLIDX GENMASK_ULL(27, 0)
+
+#define IRDMA_COMMIT_FPM_QPCNT GENMASK_ULL(18, 0)
+
+#define IRDMA_COMMIT_FPM_BASE_S 32
+#define IRDMA_CQPSQ_CFPM_HMCFNID GENMASK_ULL(5, 0)
+#define IRDMA_CQPSQ_FWQE_AECODE GENMASK_ULL(15, 0)
+#define IRDMA_CQPSQ_FWQE_AESOURCE GENMASK_ULL(19, 16)
+#define IRDMA_CQPSQ_FWQE_RQMNERR GENMASK_ULL(15, 0)
+#define IRDMA_CQPSQ_FWQE_RQMJERR GENMASK_ULL(31, 16)
+#define IRDMA_CQPSQ_FWQE_SQMNERR GENMASK_ULL(47, 32)
+#define IRDMA_CQPSQ_FWQE_SQMJERR GENMASK_ULL(63, 48)
+#define IRDMA_CQPSQ_FWQE_QPID GENMASK_ULL(23, 0)
+#define IRDMA_CQPSQ_FWQE_GENERATE_AE BIT_ULL(59)
+#define IRDMA_CQPSQ_FWQE_USERFLCODE BIT_ULL(60)
+#define IRDMA_CQPSQ_FWQE_FLUSHSQ BIT_ULL(61)
+#define IRDMA_CQPSQ_FWQE_FLUSHRQ BIT_ULL(62)
+#define IRDMA_CQPSQ_MAPT_PORT GENMASK_ULL(15, 0)
+#define IRDMA_CQPSQ_MAPT_ADDPORT BIT_ULL(62)
+#define IRDMA_CQPSQ_UPESD_SDCMD GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_UPESD_SDDATALOW GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_UPESD_SDDATAHI GENMASK_ULL(63, 32)
+#define IRDMA_CQPSQ_UPESD_HMCFNID GENMASK_ULL(5, 0)
+#define IRDMA_CQPSQ_UPESD_ENTRY_VALID BIT_ULL(63)
+
+#define IRDMA_CQPSQ_UPESD_BM_PF 0
+#define IRDMA_CQPSQ_UPESD_BM_CP_LM 1
+#define IRDMA_CQPSQ_UPESD_BM_AXF 2
+#define IRDMA_CQPSQ_UPESD_BM_LM 4
+#define IRDMA_CQPSQ_UPESD_BM GENMASK_ULL(34, 32)
+#define IRDMA_CQPSQ_UPESD_ENTRY_COUNT GENMASK_ULL(3, 0)
+#define IRDMA_CQPSQ_UPESD_SKIP_ENTRY BIT_ULL(7)
+#define IRDMA_CQPSQ_SUSPENDQP_QPID GENMASK_ULL(23, 0)
+#define IRDMA_CQPSQ_RESUMEQP_QSHANDLE GENMASK_ULL(31, 0)
+#define IRDMA_CQPSQ_RESUMEQP_QPID GENMASK(23, 0)
+
+#define IRDMA_CQPSQ_MIN_STAG_INVALID 0x0001
+#define IRDMA_CQPSQ_MIN_SUSPEND_PND 0x0005
+
+#define IRDMA_CQPSQ_MAJ_NO_ERROR 0x0000
+#define IRDMA_CQPSQ_MAJ_OBJCACHE_ERROR 0xF000
+#define IRDMA_CQPSQ_MAJ_CNTXTCACHE_ERROR 0xF001
+#define IRDMA_CQPSQ_MAJ_ERROR 0xFFFF
+#define IRDMAQPC_DDP_VER GENMASK_ULL(1, 0)
+#define IRDMAQPC_IBRDENABLE BIT_ULL(2)
+#define IRDMAQPC_IPV4 BIT_ULL(3)
+#define IRDMAQPC_NONAGLE BIT_ULL(4)
+#define IRDMAQPC_INSERTVLANTAG BIT_ULL(5)
+#define IRDMAQPC_ISQP1 BIT_ULL(6)
+#define IRDMAQPC_TIMESTAMP BIT_ULL(7)
+#define IRDMAQPC_RQWQESIZE GENMASK_ULL(9, 8)
+#define IRDMAQPC_INSERTL2TAG2 BIT_ULL(11)
+#define IRDMAQPC_LIMIT GENMASK_ULL(13, 12)
+
+#define IRDMAQPC_ECN_EN BIT_ULL(14)
+#define IRDMAQPC_DROPOOOSEG BIT_ULL(15)
+#define IRDMAQPC_DUPACK_THRESH GENMASK_ULL(18, 16)
+#define IRDMAQPC_ERR_RQ_IDX_VALID BIT_ULL(19)
+#define IRDMAQPC_DIS_VLAN_CHECKS GENMASK_ULL(21, 19)
+#define IRDMAQPC_DC_TCP_EN BIT_ULL(25)
+#define IRDMAQPC_RCVTPHEN BIT_ULL(28)
+#define IRDMAQPC_XMITTPHEN BIT_ULL(29)
+#define IRDMAQPC_RQTPHEN BIT_ULL(30)
+#define IRDMAQPC_SQTPHEN BIT_ULL(31)
+#define IRDMAQPC_PPIDX GENMASK_ULL(41, 32)
+#define IRDMAQPC_PMENA BIT_ULL(47)
+#define IRDMAQPC_RDMAP_VER GENMASK_ULL(63, 62)
+#define IRDMAQPC_ROCE_TVER GENMASK_ULL(63, 60)
+
+#define IRDMAQPC_SQADDR IRDMA_CQPHC_QPCTX
+#define IRDMAQPC_RQADDR IRDMA_CQPHC_QPCTX
+#define IRDMAQPC_TTL GENMASK_ULL(7, 0)
+#define IRDMAQPC_RQSIZE GENMASK_ULL(11, 8)
+#define IRDMAQPC_SQSIZE GENMASK_ULL(15, 12)
+#define IRDMAQPC_GEN1_SRCMACADDRIDX GENMASK(21, 16)
+#define IRDMAQPC_AVOIDSTRETCHACK BIT_ULL(23)
+#define IRDMAQPC_TOS GENMASK_ULL(31, 24)
+#define IRDMAQPC_SRCPORTNUM GENMASK_ULL(47, 32)
+#define IRDMAQPC_DESTPORTNUM GENMASK_ULL(63, 48)
+#define IRDMAQPC_DESTIPADDR0 GENMASK_ULL(63, 32)
+#define IRDMAQPC_DESTIPADDR1 GENMASK_ULL(31, 0)
+#define IRDMAQPC_DESTIPADDR2 GENMASK_ULL(63, 32)
+#define IRDMAQPC_DESTIPADDR3 GENMASK_ULL(31, 0)
+#define IRDMAQPC_SNDMSS GENMASK_ULL(29, 16)
+#define IRDMAQPC_SYN_RST_HANDLING GENMASK_ULL(31, 30)
+#define IRDMAQPC_VLANTAG GENMASK_ULL(47, 32)
+#define IRDMAQPC_ARPIDX GENMASK_ULL(63, 48)
+#define IRDMAQPC_FLOWLABEL GENMASK_ULL(19, 0)
+#define IRDMAQPC_WSCALE BIT_ULL(20)
+#define IRDMAQPC_KEEPALIVE BIT_ULL(21)
+#define IRDMAQPC_IGNORE_TCP_OPT BIT_ULL(22)
+#define IRDMAQPC_IGNORE_TCP_UNS_OPT BIT_ULL(23)
+#define IRDMAQPC_TCPSTATE GENMASK_ULL(31, 28)
+#define IRDMAQPC_RCVSCALE GENMASK_ULL(35, 32)
+#define IRDMAQPC_SNDSCALE GENMASK_ULL(43, 40)
+#define IRDMAQPC_PDIDX GENMASK_ULL(63, 48)
+#define IRDMAQPC_PDIDXHI GENMASK_ULL(21, 20)
+#define IRDMAQPC_PKEY GENMASK_ULL(47, 32)
+#define IRDMAQPC_ACKCREDITS GENMASK_ULL(24, 20)
+#define IRDMAQPC_QKEY GENMASK_ULL(63, 32)
+#define IRDMAQPC_DESTQP GENMASK_ULL(23, 0)
+#define IRDMAQPC_KALIVE_TIMER_MAX_PROBES GENMASK_ULL(23, 16)
+#define IRDMAQPC_KEEPALIVE_INTERVAL GENMASK_ULL(31, 24)
+#define IRDMAQPC_TIMESTAMP_RECENT GENMASK_ULL(31, 0)
+#define IRDMAQPC_TIMESTAMP_AGE GENMASK_ULL(63, 32)
+#define IRDMAQPC_SNDNXT GENMASK_ULL(31, 0)
+#define IRDMAQPC_ISN GENMASK_ULL(55, 32)
+#define IRDMAQPC_PSNNXT GENMASK_ULL(23, 0)
+#define IRDMAQPC_LSN GENMASK_ULL(55, 32)
+#define IRDMAQPC_SNDWND GENMASK_ULL(63, 32)
+#define IRDMAQPC_RCVNXT GENMASK_ULL(31, 0)
+#define IRDMAQPC_EPSN GENMASK_ULL(23, 0)
+#define IRDMAQPC_RCVWND GENMASK_ULL(63, 32)
+#define IRDMAQPC_SNDMAX GENMASK_ULL(31, 0)
+#define IRDMAQPC_SNDUNA GENMASK_ULL(63, 32)
+#define IRDMAQPC_PSNMAX GENMASK_ULL(23, 0)
+#define IRDMAQPC_PSNUNA GENMASK_ULL(55, 32)
+#define IRDMAQPC_SRTT GENMASK_ULL(31, 0)
+#define IRDMAQPC_RTTVAR GENMASK_ULL(63, 32)
+#define IRDMAQPC_SSTHRESH GENMASK_ULL(31, 0)
+#define IRDMAQPC_CWND GENMASK_ULL(63, 32)
+#define IRDMAQPC_CWNDROCE GENMASK_ULL(55, 32)
+#define IRDMAQPC_SNDWL1 GENMASK_ULL(31, 0)
+#define IRDMAQPC_SNDWL2 GENMASK_ULL(63, 32)
+#define IRDMAQPC_ERR_RQ_IDX GENMASK_ULL(45, 32)
+#define IRDMAQPC_RTOMIN GENMASK_ULL(63, 57)
+#define IRDMAQPC_MAXSNDWND GENMASK_ULL(31, 0)
+#define IRDMAQPC_REXMIT_THRESH GENMASK_ULL(53, 48)
+#define IRDMAQPC_RNRNAK_THRESH GENMASK_ULL(56, 54)
+#define IRDMAQPC_TXCQNUM GENMASK_ULL(18, 0)
+#define IRDMAQPC_RXCQNUM GENMASK_ULL(50, 32)
+#define IRDMAQPC_STAT_INDEX GENMASK_ULL(6, 0)
+#define IRDMAQPC_Q2ADDR GENMASK_ULL(63, 8)
+#define IRDMAQPC_LASTBYTESENT GENMASK_ULL(7, 0)
+#define IRDMAQPC_MACADDRESS GENMASK_ULL(63, 16)
+#define IRDMAQPC_ORDSIZE GENMASK_ULL(7, 0)
+
+#define IRDMAQPC_IRDSIZE GENMASK_ULL(18, 16)
+
+#define IRDMAQPC_UDPRIVCQENABLE BIT_ULL(19)
+#define IRDMAQPC_WRRDRSPOK BIT_ULL(20)
+#define IRDMAQPC_RDOK BIT_ULL(21)
+#define IRDMAQPC_SNDMARKERS BIT_ULL(22)
+#define IRDMAQPC_DCQCNENABLE BIT_ULL(22)
+#define IRDMAQPC_FW_CC_ENABLE BIT_ULL(28)
+#define IRDMAQPC_RCVNOICRC BIT_ULL(31)
+#define IRDMAQPC_BINDEN BIT_ULL(23)
+#define IRDMAQPC_FASTREGEN BIT_ULL(24)
+#define IRDMAQPC_PRIVEN BIT_ULL(25)
+#define IRDMAQPC_TIMELYENABLE BIT_ULL(27)
+#define IRDMAQPC_THIGH GENMASK_ULL(63, 52)
+#define IRDMAQPC_TLOW GENMASK_ULL(39, 32)
+#define IRDMAQPC_REMENDPOINTIDX GENMASK_ULL(16, 0)
+#define IRDMAQPC_USESTATSINSTANCE BIT_ULL(26)
+#define IRDMAQPC_IWARPMODE BIT_ULL(28)
+#define IRDMAQPC_RCVMARKERS BIT_ULL(29)
+#define IRDMAQPC_ALIGNHDRS BIT_ULL(30)
+#define IRDMAQPC_RCVNOMPACRC BIT_ULL(31)
+#define IRDMAQPC_RCVMARKOFFSET GENMASK_ULL(40, 32)
+#define IRDMAQPC_SNDMARKOFFSET GENMASK_ULL(56, 48)
+
+#define IRDMAQPC_QPCOMPCTX IRDMA_CQPHC_QPCTX
+#define IRDMAQPC_SQTPHVAL GENMASK_ULL(7, 0)
+#define IRDMAQPC_RQTPHVAL GENMASK_ULL(15, 8)
+#define IRDMAQPC_QSHANDLE GENMASK_ULL(25, 16)
+#define IRDMAQPC_EXCEPTION_LAN_QUEUE GENMASK_ULL(43, 32)
+#define IRDMAQPC_LOCAL_IPADDR3 GENMASK_ULL(31, 0)
+#define IRDMAQPC_LOCAL_IPADDR2 GENMASK_ULL(63, 32)
+#define IRDMAQPC_LOCAL_IPADDR1 GENMASK_ULL(31, 0)
+#define IRDMAQPC_LOCAL_IPADDR0 GENMASK_ULL(63, 32)
+#define IRDMA_FW_VER_MINOR GENMASK_ULL(15, 0)
+#define IRDMA_FW_VER_MAJOR GENMASK_ULL(31, 16)
+#define IRDMA_FEATURE_INFO GENMASK_ULL(47, 0)
+#define IRDMA_FEATURE_CNT GENMASK_ULL(47, 32)
+#define IRDMA_FEATURE_TYPE GENMASK_ULL(63, 48)
+
+#define IRDMAQPSQ_OPCODE GENMASK_ULL(37, 32)
+#define IRDMAQPSQ_COPY_HOST_PBL BIT_ULL(43)
+#define IRDMAQPSQ_ADDFRAGCNT GENMASK_ULL(41, 38)
+#define IRDMAQPSQ_PUSHWQE BIT_ULL(56)
+#define IRDMAQPSQ_STREAMMODE BIT_ULL(58)
+#define IRDMAQPSQ_WAITFORRCVPDU BIT_ULL(59)
+#define IRDMAQPSQ_READFENCE BIT_ULL(60)
+#define IRDMAQPSQ_LOCALFENCE BIT_ULL(61)
+#define IRDMAQPSQ_UDPHEADER BIT_ULL(61)
+#define IRDMAQPSQ_L4LEN GENMASK_ULL(45, 42)
+#define IRDMAQPSQ_SIGCOMPL BIT_ULL(62)
+#define IRDMAQPSQ_VALID BIT_ULL(63)
+
+#define IRDMAQPSQ_FRAG_TO IRDMA_CQPHC_QPCTX
+#define IRDMAQPSQ_FRAG_VALID BIT_ULL(63)
+#define IRDMAQPSQ_FRAG_LEN GENMASK_ULL(62, 32)
+#define IRDMAQPSQ_FRAG_STAG GENMASK_ULL(31, 0)
+#define IRDMAQPSQ_GEN1_FRAG_LEN GENMASK_ULL(31, 0)
+#define IRDMAQPSQ_GEN1_FRAG_STAG GENMASK_ULL(63, 32)
+#define IRDMAQPSQ_REMSTAGINV GENMASK_ULL(31, 0)
+#define IRDMAQPSQ_DESTQKEY GENMASK_ULL(31, 0)
+#define IRDMAQPSQ_DESTQPN GENMASK_ULL(55, 32)
+#define IRDMAQPSQ_AHID GENMASK_ULL(16, 0)
+#define IRDMAQPSQ_INLINEDATAFLAG BIT_ULL(57)
+
+#define IRDMA_INLINE_VALID_S 7
+#define IRDMAQPSQ_INLINEDATALEN GENMASK_ULL(55, 48)
+#define IRDMAQPSQ_IMMDATAFLAG BIT_ULL(47)
+#define IRDMAQPSQ_REPORTRTT BIT_ULL(46)
+
+#define IRDMAQPSQ_IMMDATA GENMASK_ULL(63, 0)
+#define IRDMAQPSQ_REMSTAG GENMASK_ULL(31, 0)
+
+#define IRDMAQPSQ_REMTO IRDMA_CQPHC_QPCTX
+
+#define IRDMAQPSQ_STAGRIGHTS GENMASK_ULL(52, 48)
+#define IRDMAQPSQ_VABASEDTO BIT_ULL(53)
+#define IRDMAQPSQ_MEMWINDOWTYPE BIT_ULL(54)
+
+#define IRDMAQPSQ_MWLEN IRDMA_CQPHC_QPCTX
+#define IRDMAQPSQ_PARENTMRSTAG GENMASK_ULL(63, 32)
+#define IRDMAQPSQ_MWSTAG GENMASK_ULL(31, 0)
+
+#define IRDMAQPSQ_BASEVA_TO_FBO IRDMA_CQPHC_QPCTX
+
+#define IRDMAQPSQ_LOCSTAG GENMASK_ULL(31, 0)
+
+#define IRDMAQPSQ_STAGKEY GENMASK_ULL(7, 0)
+#define IRDMAQPSQ_STAGINDEX GENMASK_ULL(31, 8)
+#define IRDMAQPSQ_COPYHOSTPBLS BIT_ULL(43)
+#define IRDMAQPSQ_LPBLSIZE GENMASK_ULL(45, 44)
+#define IRDMAQPSQ_HPAGESIZE GENMASK_ULL(47, 46)
+#define IRDMAQPSQ_STAGLEN GENMASK_ULL(40, 0)
+#define IRDMAQPSQ_FIRSTPMPBLIDXLO GENMASK_ULL(63, 48)
+#define IRDMAQPSQ_FIRSTPMPBLIDXHI GENMASK_ULL(11, 0)
+#define IRDMAQPSQ_PBLADDR GENMASK_ULL(63, 12)
+
+/* iwarp QP RQ WQE common fields */
+#define IRDMAQPRQ_ADDFRAGCNT IRDMAQPSQ_ADDFRAGCNT
+#define IRDMAQPRQ_VALID IRDMAQPSQ_VALID
+#define IRDMAQPRQ_COMPLCTX IRDMA_CQPHC_QPCTX
+#define IRDMAQPRQ_FRAG_LEN IRDMAQPSQ_FRAG_LEN
+#define IRDMAQPRQ_STAG IRDMAQPSQ_FRAG_STAG
+#define IRDMAQPRQ_TO IRDMAQPSQ_FRAG_TO
+
+#define IRDMAPFINT_OICR_HMC_ERR_M BIT(26)
+#define IRDMAPFINT_OICR_PE_PUSH_M BIT(27)
+#define IRDMAPFINT_OICR_PE_CRITERR_M BIT(28)
+
+#define IRDMA_QUERY_FPM_MAX_QPS GENMASK_ULL(18, 0)
+#define IRDMA_QUERY_FPM_MAX_CQS GENMASK_ULL(19, 0)
+#define IRDMA_QUERY_FPM_FIRST_PE_SD_INDEX GENMASK_ULL(13, 0)
+#define IRDMA_QUERY_FPM_MAX_PE_SDS GENMASK_ULL(45, 32)
+#define IRDMA_QUERY_FPM_MAX_CEQS GENMASK_ULL(9, 0)
+#define IRDMA_QUERY_FPM_XFBLOCKSIZE GENMASK_ULL(63, 32)
+#define IRDMA_QUERY_FPM_Q1BLOCKSIZE GENMASK_ULL(63, 32)
+#define IRDMA_QUERY_FPM_HTMULTIPLIER GENMASK_ULL(19, 16)
+#define IRDMA_QUERY_FPM_TIMERBUCKET GENMASK_ULL(47, 32)
+#define IRDMA_QUERY_FPM_RRFBLOCKSIZE GENMASK_ULL(63, 32)
+#define IRDMA_QUERY_FPM_RRFFLBLOCKSIZE GENMASK_ULL(63, 32)
+#define IRDMA_QUERY_FPM_OOISCFBLOCKSIZE GENMASK_ULL(63, 32)
+#define IRDMA_SHMC_PAGE_ALLOCATED_HMC_FN_ID GENMASK_ULL(5, 0)
+
+#define IRDMA_GET_CURRENT_AEQ_ELEM(_aeq) \
+	( \
+		(_aeq)->aeqe_base[IRDMA_RING_CURRENT_TAIL((_aeq)->aeq_ring)].buf \
+	)
+
+#define IRDMA_GET_CURRENT_CEQ_ELEM(_ceq) \
+	( \
+		(_ceq)->ceqe_base[IRDMA_RING_CURRENT_TAIL((_ceq)->ceq_ring)].buf \
+	)
+
+#define IRDMA_GET_CEQ_ELEM_AT_POS(_ceq, _pos) \
+	( \
+		(_ceq)->ceqe_base[_pos].buf  \
+	)
+
+#define IRDMA_RING_GET_NEXT_TAIL(_ring, _idx) \
+	( \
+		((_ring).tail + (_idx)) % (_ring).size \
+	)
+
+#define IRDMA_CQP_INIT_WQE(wqe) memset(wqe, 0, 64)
+
+#define IRDMA_GET_CURRENT_CQ_ELEM(_cq) \
+	( \
+		(_cq)->cq_base[IRDMA_RING_CURRENT_HEAD((_cq)->cq_ring)].buf  \
+	)
+#define IRDMA_GET_CURRENT_EXTENDED_CQ_ELEM(_cq) \
+	( \
+		((struct irdma_extended_cqe *) \
+		((_cq)->cq_base))[IRDMA_RING_CURRENT_HEAD((_cq)->cq_ring)].buf \
+	)
+
+#define IRDMA_RING_INIT(_ring, _size) \
+	{ \
+		(_ring).head = 0; \
+		(_ring).tail = 0; \
+		(_ring).size = (_size); \
+	}
+#define IRDMA_RING_SIZE(_ring) ((_ring).size)
+#define IRDMA_RING_CURRENT_HEAD(_ring) ((_ring).head)
+#define IRDMA_RING_CURRENT_TAIL(_ring) ((_ring).tail)
+
+#define IRDMA_RING_MOVE_HEAD(_ring, _retcode) \
+	{ \
+		register u32 size; \
+		size = (_ring).size;  \
+		if (!IRDMA_RING_FULL_ERR(_ring)) { \
+			(_ring).head = ((_ring).head + 1) % size; \
+			(_retcode) = 0; \
+		} else { \
+			(_retcode) = IRDMA_ERR_RING_FULL; \
+		} \
+	}
+#define IRDMA_RING_MOVE_HEAD_BY_COUNT(_ring, _count, _retcode) \
+	{ \
+		register u32 size; \
+		size = (_ring).size; \
+		if ((IRDMA_RING_USED_QUANTA(_ring) + (_count)) < size) { \
+			(_ring).head = ((_ring).head + (_count)) % size; \
+			(_retcode) = 0; \
+		} else { \
+			(_retcode) = IRDMA_ERR_RING_FULL; \
+		} \
+	}
+#define IRDMA_SQ_RING_MOVE_HEAD(_ring, _retcode) \
+	{ \
+		register u32 size; \
+		size = (_ring).size;  \
+		if (!IRDMA_SQ_RING_FULL_ERR(_ring)) { \
+			(_ring).head = ((_ring).head + 1) % size; \
+			(_retcode) = 0; \
+		} else { \
+			(_retcode) = IRDMA_ERR_RING_FULL; \
+		} \
+	}
+#define IRDMA_SQ_RING_MOVE_HEAD_BY_COUNT(_ring, _count, _retcode) \
+	{ \
+		register u32 size; \
+		size = (_ring).size; \
+		if ((IRDMA_RING_USED_QUANTA(_ring) + (_count)) < (size - 256)) { \
+			(_ring).head = ((_ring).head + (_count)) % size; \
+			(_retcode) = 0; \
+		} else { \
+			(_retcode) = IRDMA_ERR_RING_FULL; \
+		} \
+	}
+#define IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(_ring, _count) \
+	(_ring).head = ((_ring).head + (_count)) % (_ring).size
+
+#define IRDMA_RING_MOVE_TAIL(_ring) \
+	(_ring).tail = ((_ring).tail + 1) % (_ring).size
+
+#define IRDMA_RING_MOVE_HEAD_NOCHECK(_ring) \
+	(_ring).head = ((_ring).head + 1) % (_ring).size
+
+#define IRDMA_RING_MOVE_TAIL_BY_COUNT(_ring, _count) \
+	(_ring).tail = ((_ring).tail + (_count)) % (_ring).size
+
+#define IRDMA_RING_SET_TAIL(_ring, _pos) \
+	(_ring).tail = (_pos) % (_ring).size
+
+#define IRDMA_RING_FULL_ERR(_ring) \
+	( \
+		(IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 1))  \
+	)
+
+#define IRDMA_ERR_RING_FULL2(_ring) \
+	( \
+		(IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 2))  \
+	)
+
+#define IRDMA_ERR_RING_FULL3(_ring) \
+	( \
+		(IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 3))  \
+	)
+
+#define IRDMA_SQ_RING_FULL_ERR(_ring) \
+	( \
+		(IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 257))  \
+	)
+
+#define IRDMA_ERR_SQ_RING_FULL2(_ring) \
+	( \
+		(IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 258))  \
+	)
+#define IRDMA_ERR_SQ_RING_FULL3(_ring) \
+	( \
+		(IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 259))  \
+	)
+#define IRDMA_RING_MORE_WORK(_ring) \
+	( \
+		(IRDMA_RING_USED_QUANTA(_ring) != 0) \
+	)
+
+#define IRDMA_RING_USED_QUANTA(_ring) \
+	( \
+		(((_ring).head + (_ring).size - (_ring).tail) % (_ring).size) \
+	)
+
+#define IRDMA_RING_FREE_QUANTA(_ring) \
+	( \
+		((_ring).size - IRDMA_RING_USED_QUANTA(_ring) - 1) \
+	)
+
+#define IRDMA_SQ_RING_FREE_QUANTA(_ring) \
+	( \
+		((_ring).size - IRDMA_RING_USED_QUANTA(_ring) - 257) \
+	)
+
+#define IRDMA_ATOMIC_RING_MOVE_HEAD(_ring, index, _retcode) \
+	{ \
+		index = IRDMA_RING_CURRENT_HEAD(_ring); \
+		IRDMA_RING_MOVE_HEAD(_ring, _retcode); \
+	}
+
+enum irdma_qp_wqe_size {
+	IRDMA_WQE_SIZE_32  = 32,
+	IRDMA_WQE_SIZE_64  = 64,
+	IRDMA_WQE_SIZE_96  = 96,
+	IRDMA_WQE_SIZE_128 = 128,
+	IRDMA_WQE_SIZE_256 = 256,
+};
+
+enum irdma_ws_node_op {
+	IRDMA_ADD_NODE = 0,
+	IRDMA_MODIFY_NODE,
+	IRDMA_DEL_NODE,
+};
+
+enum {	IRDMA_Q_ALIGNMENT_M		 = (128 - 1),
+	IRDMA_AEQ_ALIGNMENT_M		 = (256 - 1),
+	IRDMA_Q2_ALIGNMENT_M		 = (256 - 1),
+	IRDMA_CEQ_ALIGNMENT_M		 = (256 - 1),
+	IRDMA_CQ0_ALIGNMENT_M		 = (256 - 1),
+	IRDMA_HOST_CTX_ALIGNMENT_M	 = (4 - 1),
+	IRDMA_SHADOWAREA_M		 = (128 - 1),
+	IRDMA_FPM_QUERY_BUF_ALIGNMENT_M	 = (4 - 1),
+	IRDMA_FPM_COMMIT_BUF_ALIGNMENT_M = (4 - 1),
+};
+
+enum irdma_alignment {
+	IRDMA_CQP_ALIGNMENT	    = 0x200,
+	IRDMA_AEQ_ALIGNMENT	    = 0x100,
+	IRDMA_CEQ_ALIGNMENT	    = 0x100,
+	IRDMA_CQ0_ALIGNMENT	    = 0x100,
+	IRDMA_SD_BUF_ALIGNMENT      = 0x80,
+	IRDMA_FEATURE_BUF_ALIGNMENT = 0x8,
+};
+
+enum icrdma_protocol_used {
+	ICRDMA_ANY_PROTOCOL	   = 0,
+	ICRDMA_IWARP_PROTOCOL_ONLY = 1,
+	ICRDMA_ROCE_PROTOCOL_ONLY  = 2,
+};
+
+/**
+ * set_64bit_val - set 64 bit value to hw wqe
+ * @wqe_words: wqe addr to write
+ * @byte_index: index in wqe
+ * @val: value to write
+ **/
+static inline void set_64bit_val(__le64 *wqe_words, u32 byte_index, u64 val)
+{
+	wqe_words[byte_index >> 3] = cpu_to_le64(val);
+}
+
+/**
+ * set_32bit_val - set 32 bit value to hw wqe
+ * @wqe_words: wqe addr to write
+ * @byte_index: index in wqe
+ * @val: value to write
+ **/
+static inline void set_32bit_val(__le32 *wqe_words, u32 byte_index, u32 val)
+{
+	wqe_words[byte_index >> 2] = cpu_to_le32(val);
+}
+
+/**
+ * get_64bit_val - read 64 bit value from wqe
+ * @wqe_words: wqe addr
+ * @byte_index: index to read from
+ * @val: read value
+ **/
+static inline void get_64bit_val(__le64 *wqe_words, u32 byte_index, u64 *val)
+{
+	*val = le64_to_cpu(wqe_words[byte_index >> 3]);
+}
+
+/**
+ * get_32bit_val - read 32 bit value from wqe
+ * @wqe_words: wqe addr
+ * @byte_index: index to reaad from
+ * @val: return 32 bit value
+ **/
+static inline void get_32bit_val(__le32 *wqe_words, u32 byte_index, u32 *val)
+{
+	*val = le32_to_cpu(wqe_words[byte_index >> 2]);
+}
+#endif /* IRDMA_DEFS_H */
diff --git a/drivers/infiniband/hw/irdma/irdma.h b/drivers/infiniband/hw/irdma/irdma.h
new file mode 100644
index 0000000..37125e2
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/irdma.h
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2017 - 2021 Intel Corporation */
+#ifndef IRDMA_H
+#define IRDMA_H
+
+#define IRDMA_WQEALLOC_WQE_DESC_INDEX GENMASK(31, 20)
+
+#define IRDMA_CQPTAIL_WQTAIL GENMASK(10, 0)
+#define IRDMA_CQPTAIL_CQP_OP_ERR BIT(31)
+
+#define IRDMA_CQPERRCODES_CQP_MINOR_CODE GENMASK(15, 0)
+#define IRDMA_CQPERRCODES_CQP_MAJOR_CODE GENMASK(31, 16)
+#define IRDMA_GLPCI_LBARCTRL_PE_DB_SIZE GENMASK(5, 4)
+#define IRDMA_GLINT_RATE_INTERVAL GENMASK(5, 0)
+#define IRDMA_GLINT_RATE_INTRL_ENA BIT(6)
+#define IRDMA_GLINT_DYN_CTL_INTENA BIT(0)
+#define IRDMA_GLINT_DYN_CTL_CLEARPBA BIT(1)
+#define IRDMA_GLINT_DYN_CTL_ITR_INDX GENMASK(4, 3)
+#define IRDMA_GLINT_DYN_CTL_INTERVAL GENMASK(16, 5)
+#define IRDMA_GLINT_CEQCTL_ITR_INDX GENMASK(12, 11)
+#define IRDMA_GLINT_CEQCTL_CAUSE_ENA BIT(30)
+#define IRDMA_GLINT_CEQCTL_MSIX_INDX GENMASK(10, 0)
+#define IRDMA_PFINT_AEQCTL_MSIX_INDX GENMASK(10, 0)
+#define IRDMA_PFINT_AEQCTL_ITR_INDX GENMASK(12, 11)
+#define IRDMA_PFINT_AEQCTL_CAUSE_ENA BIT(30)
+#define IRDMA_PFHMC_PDINV_PMSDIDX GENMASK(11, 0)
+#define IRDMA_PFHMC_PDINV_PMSDPARTSEL BIT(15)
+#define IRDMA_PFHMC_PDINV_PMPDIDX GENMASK(24, 16)
+#define IRDMA_PFHMC_SDDATALOW_PMSDVALID BIT(0)
+#define IRDMA_PFHMC_SDDATALOW_PMSDTYPE BIT(1)
+#define IRDMA_PFHMC_SDDATALOW_PMSDBPCOUNT GENMASK(11, 2)
+#define IRDMA_PFHMC_SDDATALOW_PMSDDATALOW GENMASK(31, 12)
+#define IRDMA_PFHMC_SDCMD_PMSDWR BIT(31)
+
+#define IRDMA_INVALID_CQ_IDX			0xffffffff
+/* I40IW FW VER which supports RTS AE and CQ RESIZE */
+#define IRDMA_FW_VER_0x30010			0x30010
+/* IRDMA FW VER */
+#define IRDMA_FW_VER_0x1000D			0x1000D
+enum irdma_registers {
+	IRDMA_CQPTAIL,
+	IRDMA_CQPDB,
+	IRDMA_CCQPSTATUS,
+	IRDMA_CCQPHIGH,
+	IRDMA_CCQPLOW,
+	IRDMA_CQARM,
+	IRDMA_CQACK,
+	IRDMA_AEQALLOC,
+	IRDMA_CQPERRCODES,
+	IRDMA_WQEALLOC,
+	IRDMA_GLINT_DYN_CTL,
+	IRDMA_DB_ADDR_OFFSET,
+	IRDMA_GLPCI_LBARCTRL,
+	IRDMA_GLPE_CPUSTATUS0,
+	IRDMA_GLPE_CPUSTATUS1,
+	IRDMA_GLPE_CPUSTATUS2,
+	IRDMA_PFINT_AEQCTL,
+	IRDMA_GLINT_CEQCTL,
+	IRDMA_VSIQF_PE_CTL1,
+	IRDMA_PFHMC_PDINV,
+	IRDMA_GLHMC_VFPDINV,
+	IRDMA_GLPE_CRITERR,
+	IRDMA_GLINT_RATE,
+	IRDMA_MAX_REGS, /* Must be last entry */
+};
+
+enum irdma_shifts {
+	IRDMA_CCQPSTATUS_CCQP_DONE_S,
+	IRDMA_CCQPSTATUS_CCQP_ERR_S,
+	IRDMA_CQPSQ_STAG_PDID_S,
+	IRDMA_CQPSQ_CQ_CEQID_S,
+	IRDMA_CQPSQ_CQ_CQID_S,
+	IRDMA_COMMIT_FPM_CQCNT_S,
+	IRDMA_MAX_SHIFTS,
+};
+
+enum irdma_masks {
+	IRDMA_CCQPSTATUS_CCQP_DONE_M,
+	IRDMA_CCQPSTATUS_CCQP_ERR_M,
+	IRDMA_CQPSQ_STAG_PDID_M,
+	IRDMA_CQPSQ_CQ_CEQID_M,
+	IRDMA_CQPSQ_CQ_CQID_M,
+	IRDMA_COMMIT_FPM_CQCNT_M,
+	IRDMA_MAX_MASKS, /* Must be last entry */
+};
+
+#define IRDMA_MAX_MGS_PER_CTX	8
+
+struct irdma_mcast_grp_ctx_entry_info {
+	u32 qp_id;
+	bool valid_entry;
+	u16 dest_port;
+	u32 use_cnt;
+};
+
+struct irdma_mcast_grp_info {
+	u8 dest_mac_addr[ETH_ALEN];
+	u16 vlan_id;
+	u8 hmc_fcn_id;
+	bool ipv4_valid:1;
+	bool vlan_valid:1;
+	u16 mg_id;
+	u32 no_of_mgs;
+	u32 dest_ip_addr[4];
+	u16 qs_handle;
+	struct irdma_dma_mem dma_mem_mc;
+	struct irdma_mcast_grp_ctx_entry_info mg_ctx_info[IRDMA_MAX_MGS_PER_CTX];
+};
+
+enum irdma_vers {
+	IRDMA_GEN_RSVD,
+	IRDMA_GEN_1,
+	IRDMA_GEN_2,
+};
+
+struct irdma_uk_attrs {
+	u64 feature_flags;
+	u32 max_hw_wq_frags;
+	u32 max_hw_read_sges;
+	u32 max_hw_inline;
+	u32 max_hw_rq_quanta;
+	u32 max_hw_wq_quanta;
+	u32 min_hw_cq_size;
+	u32 max_hw_cq_size;
+	u16 max_hw_sq_chunk;
+	u8 hw_rev;
+};
+
+struct irdma_hw_attrs {
+	struct irdma_uk_attrs uk_attrs;
+	u64 max_hw_outbound_msg_size;
+	u64 max_hw_inbound_msg_size;
+	u64 max_mr_size;
+	u32 min_hw_qp_id;
+	u32 min_hw_aeq_size;
+	u32 max_hw_aeq_size;
+	u32 min_hw_ceq_size;
+	u32 max_hw_ceq_size;
+	u32 max_hw_device_pages;
+	u32 max_hw_vf_fpm_id;
+	u32 first_hw_vf_fpm_id;
+	u32 max_hw_ird;
+	u32 max_hw_ord;
+	u32 max_hw_wqes;
+	u32 max_hw_pds;
+	u32 max_hw_ena_vf_count;
+	u32 max_qp_wr;
+	u32 max_pe_ready_count;
+	u32 max_done_count;
+	u32 max_sleep_count;
+	u32 max_cqp_compl_wait_time_ms;
+	u16 max_stat_inst;
+};
+
+void i40iw_init_hw(struct irdma_sc_dev *dev);
+void icrdma_init_hw(struct irdma_sc_dev *dev);
+#endif /* IRDMA_H*/
diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h
new file mode 100644
index 0000000..86dd51b
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/type.h
@@ -0,0 +1,1717 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#ifndef IRDMA_TYPE_H
+#define IRDMA_TYPE_H
+#include "osdep.h"
+#include "irdma.h"
+#include "user.h"
+#include "hmc.h"
+#include "uda.h"
+#include "ws.h"
+#define IRDMA_DEBUG_ERR		"ERR"
+#define IRDMA_DEBUG_INIT	"INIT"
+#define IRDMA_DEBUG_DEV		"DEV"
+#define IRDMA_DEBUG_CM		"CM"
+#define IRDMA_DEBUG_VERBS	"VERBS"
+#define IRDMA_DEBUG_PUDA	"PUDA"
+#define IRDMA_DEBUG_ILQ		"ILQ"
+#define IRDMA_DEBUG_IEQ		"IEQ"
+#define IRDMA_DEBUG_QP		"QP"
+#define IRDMA_DEBUG_CQ		"CQ"
+#define IRDMA_DEBUG_MR		"MR"
+#define IRDMA_DEBUG_PBLE	"PBLE"
+#define IRDMA_DEBUG_WQE		"WQE"
+#define IRDMA_DEBUG_AEQ		"AEQ"
+#define IRDMA_DEBUG_CQP		"CQP"
+#define IRDMA_DEBUG_HMC		"HMC"
+#define IRDMA_DEBUG_USER	"USER"
+#define IRDMA_DEBUG_VIRT	"VIRT"
+#define IRDMA_DEBUG_DCB		"DCB"
+#define	IRDMA_DEBUG_CQE		"CQE"
+#define IRDMA_DEBUG_CLNT	"CLNT"
+#define IRDMA_DEBUG_WS		"WS"
+#define IRDMA_DEBUG_STATS	"STATS"
+
+enum irdma_page_size {
+	IRDMA_PAGE_SIZE_4K = 0,
+	IRDMA_PAGE_SIZE_2M,
+	IRDMA_PAGE_SIZE_1G,
+};
+
+enum irdma_hdrct_flags {
+	DDP_LEN_FLAG  = 0x80,
+	DDP_HDR_FLAG  = 0x40,
+	RDMA_HDR_FLAG = 0x20,
+};
+
+enum irdma_term_layers {
+	LAYER_RDMA = 0,
+	LAYER_DDP  = 1,
+	LAYER_MPA  = 2,
+};
+
+enum irdma_term_error_types {
+	RDMAP_REMOTE_PROT = 1,
+	RDMAP_REMOTE_OP   = 2,
+	DDP_CATASTROPHIC  = 0,
+	DDP_TAGGED_BUF    = 1,
+	DDP_UNTAGGED_BUF  = 2,
+	DDP_LLP		  = 3,
+};
+
+enum irdma_term_rdma_errors {
+	RDMAP_INV_STAG		  = 0x00,
+	RDMAP_INV_BOUNDS	  = 0x01,
+	RDMAP_ACCESS		  = 0x02,
+	RDMAP_UNASSOC_STAG	  = 0x03,
+	RDMAP_TO_WRAP		  = 0x04,
+	RDMAP_INV_RDMAP_VER       = 0x05,
+	RDMAP_UNEXPECTED_OP       = 0x06,
+	RDMAP_CATASTROPHIC_LOCAL  = 0x07,
+	RDMAP_CATASTROPHIC_GLOBAL = 0x08,
+	RDMAP_CANT_INV_STAG       = 0x09,
+	RDMAP_UNSPECIFIED	  = 0xff,
+};
+
+enum irdma_term_ddp_errors {
+	DDP_CATASTROPHIC_LOCAL      = 0x00,
+	DDP_TAGGED_INV_STAG	    = 0x00,
+	DDP_TAGGED_BOUNDS	    = 0x01,
+	DDP_TAGGED_UNASSOC_STAG     = 0x02,
+	DDP_TAGGED_TO_WRAP	    = 0x03,
+	DDP_TAGGED_INV_DDP_VER      = 0x04,
+	DDP_UNTAGGED_INV_QN	    = 0x01,
+	DDP_UNTAGGED_INV_MSN_NO_BUF = 0x02,
+	DDP_UNTAGGED_INV_MSN_RANGE  = 0x03,
+	DDP_UNTAGGED_INV_MO	    = 0x04,
+	DDP_UNTAGGED_INV_TOO_LONG   = 0x05,
+	DDP_UNTAGGED_INV_DDP_VER    = 0x06,
+};
+
+enum irdma_term_mpa_errors {
+	MPA_CLOSED  = 0x01,
+	MPA_CRC     = 0x02,
+	MPA_MARKER  = 0x03,
+	MPA_REQ_RSP = 0x04,
+};
+
+enum irdma_qp_event_type {
+	IRDMA_QP_EVENT_CATASTROPHIC,
+	IRDMA_QP_EVENT_ACCESS_ERR,
+};
+
+enum irdma_hw_stats_index_32b {
+	IRDMA_HW_STAT_INDEX_IP4RXDISCARD	= 0,
+	IRDMA_HW_STAT_INDEX_IP4RXTRUNC		= 1,
+	IRDMA_HW_STAT_INDEX_IP4TXNOROUTE	= 2,
+	IRDMA_HW_STAT_INDEX_IP6RXDISCARD	= 3,
+	IRDMA_HW_STAT_INDEX_IP6RXTRUNC		= 4,
+	IRDMA_HW_STAT_INDEX_IP6TXNOROUTE	= 5,
+	IRDMA_HW_STAT_INDEX_TCPRTXSEG		= 6,
+	IRDMA_HW_STAT_INDEX_TCPRXOPTERR		= 7,
+	IRDMA_HW_STAT_INDEX_TCPRXPROTOERR	= 8,
+	IRDMA_HW_STAT_INDEX_MAX_32_GEN_1	= 9, /* Must be same value as next entry */
+	IRDMA_HW_STAT_INDEX_RXVLANERR		= 9,
+	IRDMA_HW_STAT_INDEX_RXRPCNPHANDLED	= 10,
+	IRDMA_HW_STAT_INDEX_RXRPCNPIGNORED	= 11,
+	IRDMA_HW_STAT_INDEX_TXNPCNPSENT		= 12,
+	IRDMA_HW_STAT_INDEX_MAX_32, /* Must be last entry */
+};
+
+enum irdma_hw_stats_index_64b {
+	IRDMA_HW_STAT_INDEX_IP4RXOCTS	= 0,
+	IRDMA_HW_STAT_INDEX_IP4RXPKTS	= 1,
+	IRDMA_HW_STAT_INDEX_IP4RXFRAGS	= 2,
+	IRDMA_HW_STAT_INDEX_IP4RXMCPKTS	= 3,
+	IRDMA_HW_STAT_INDEX_IP4TXOCTS	= 4,
+	IRDMA_HW_STAT_INDEX_IP4TXPKTS	= 5,
+	IRDMA_HW_STAT_INDEX_IP4TXFRAGS	= 6,
+	IRDMA_HW_STAT_INDEX_IP4TXMCPKTS	= 7,
+	IRDMA_HW_STAT_INDEX_IP6RXOCTS	= 8,
+	IRDMA_HW_STAT_INDEX_IP6RXPKTS	= 9,
+	IRDMA_HW_STAT_INDEX_IP6RXFRAGS	= 10,
+	IRDMA_HW_STAT_INDEX_IP6RXMCPKTS	= 11,
+	IRDMA_HW_STAT_INDEX_IP6TXOCTS	= 12,
+	IRDMA_HW_STAT_INDEX_IP6TXPKTS	= 13,
+	IRDMA_HW_STAT_INDEX_IP6TXFRAGS	= 14,
+	IRDMA_HW_STAT_INDEX_IP6TXMCPKTS	= 15,
+	IRDMA_HW_STAT_INDEX_TCPRXSEGS	= 16,
+	IRDMA_HW_STAT_INDEX_TCPTXSEG	= 17,
+	IRDMA_HW_STAT_INDEX_RDMARXRDS	= 18,
+	IRDMA_HW_STAT_INDEX_RDMARXSNDS	= 19,
+	IRDMA_HW_STAT_INDEX_RDMARXWRS	= 20,
+	IRDMA_HW_STAT_INDEX_RDMATXRDS	= 21,
+	IRDMA_HW_STAT_INDEX_RDMATXSNDS	= 22,
+	IRDMA_HW_STAT_INDEX_RDMATXWRS	= 23,
+	IRDMA_HW_STAT_INDEX_RDMAVBND	= 24,
+	IRDMA_HW_STAT_INDEX_RDMAVINV	= 25,
+	IRDMA_HW_STAT_INDEX_MAX_64_GEN_1 = 26, /* Must be same value as next entry */
+	IRDMA_HW_STAT_INDEX_IP4RXMCOCTS	= 26,
+	IRDMA_HW_STAT_INDEX_IP4TXMCOCTS	= 27,
+	IRDMA_HW_STAT_INDEX_IP6RXMCOCTS	= 28,
+	IRDMA_HW_STAT_INDEX_IP6TXMCOCTS	= 29,
+	IRDMA_HW_STAT_INDEX_UDPRXPKTS	= 30,
+	IRDMA_HW_STAT_INDEX_UDPTXPKTS	= 31,
+	IRDMA_HW_STAT_INDEX_RXNPECNMARKEDPKTS = 32,
+	IRDMA_HW_STAT_INDEX_MAX_64, /* Must be last entry */
+};
+
+enum irdma_feature_type {
+	IRDMA_FEATURE_FW_INFO = 0,
+	IRDMA_HW_VERSION_INFO = 1,
+	IRDMA_QSETS_MAX       = 26,
+	IRDMA_MAX_FEATURES, /* Must be last entry */
+};
+
+enum irdma_sched_prio_type {
+	IRDMA_PRIO_WEIGHTED_RR     = 1,
+	IRDMA_PRIO_STRICT	   = 2,
+	IRDMA_PRIO_WEIGHTED_STRICT = 3,
+};
+
+enum irdma_vm_vf_type {
+	IRDMA_VF_TYPE = 0,
+	IRDMA_VM_TYPE,
+	IRDMA_PF_TYPE,
+};
+
+enum irdma_cqp_hmc_profile {
+	IRDMA_HMC_PROFILE_DEFAULT  = 1,
+	IRDMA_HMC_PROFILE_FAVOR_VF = 2,
+	IRDMA_HMC_PROFILE_EQUAL    = 3,
+};
+
+enum irdma_quad_entry_type {
+	IRDMA_QHASH_TYPE_TCP_ESTABLISHED = 1,
+	IRDMA_QHASH_TYPE_TCP_SYN,
+	IRDMA_QHASH_TYPE_UDP_UNICAST,
+	IRDMA_QHASH_TYPE_UDP_MCAST,
+	IRDMA_QHASH_TYPE_ROCE_MCAST,
+	IRDMA_QHASH_TYPE_ROCEV2_HW,
+};
+
+enum irdma_quad_hash_manage_type {
+	IRDMA_QHASH_MANAGE_TYPE_DELETE = 0,
+	IRDMA_QHASH_MANAGE_TYPE_ADD,
+	IRDMA_QHASH_MANAGE_TYPE_MODIFY,
+};
+
+enum irdma_syn_rst_handling {
+	IRDMA_SYN_RST_HANDLING_HW_TCP_SECURE = 0,
+	IRDMA_SYN_RST_HANDLING_HW_TCP,
+	IRDMA_SYN_RST_HANDLING_FW_TCP_SECURE,
+	IRDMA_SYN_RST_HANDLING_FW_TCP,
+};
+
+enum irdma_queue_type {
+	IRDMA_QUEUE_TYPE_SQ_RQ = 0,
+	IRDMA_QUEUE_TYPE_CQP,
+};
+
+struct irdma_sc_dev;
+struct irdma_vsi_pestat;
+struct irdma_irq_ops;
+struct irdma_cqp_ops;
+struct irdma_ccq_ops;
+struct irdma_ceq_ops;
+struct irdma_aeq_ops;
+struct irdma_mr_ops;
+struct irdma_cqp_misc_ops;
+struct irdma_pd_ops;
+struct irdma_ah_ops;
+struct irdma_priv_qp_ops;
+struct irdma_priv_cq_ops;
+struct irdma_hmc_ops;
+
+struct irdma_dcqcn_cc_params {
+	u8 cc_cfg_valid;
+	u8 min_dec_factor;
+	u8 min_rate;
+	u8 dcqcn_f;
+	u16 rai_factor;
+	u16 hai_factor;
+	u16 dcqcn_t;
+	u32 dcqcn_b;
+	u32 rreduce_mperiod;
+};
+
+struct irdma_cqp_init_info {
+	u64 cqp_compl_ctx;
+	u64 host_ctx_pa;
+	u64 sq_pa;
+	struct irdma_sc_dev *dev;
+	struct irdma_cqp_quanta *sq;
+	struct irdma_dcqcn_cc_params dcqcn_params;
+	__le64 *host_ctx;
+	u64 *scratch_array;
+	u32 sq_size;
+	u16 hw_maj_ver;
+	u16 hw_min_ver;
+	u8 struct_ver;
+	u8 hmc_profile;
+	u8 ena_vf_count;
+	u8 ceqs_per_vf;
+	bool en_datacenter_tcp:1;
+	bool disable_packed:1;
+	bool rocev2_rto_policy:1;
+	enum irdma_protocol_used protocol_used;
+};
+
+struct irdma_terminate_hdr {
+	u8 layer_etype;
+	u8 error_code;
+	u8 hdrct;
+	u8 rsvd;
+};
+
+struct irdma_cqp_sq_wqe {
+	__le64 buf[IRDMA_CQP_WQE_SIZE];
+};
+
+struct irdma_sc_aeqe {
+	__le64 buf[IRDMA_AEQE_SIZE];
+};
+
+struct irdma_ceqe {
+	__le64 buf[IRDMA_CEQE_SIZE];
+};
+
+struct irdma_cqp_ctx {
+	__le64 buf[IRDMA_CQP_CTX_SIZE];
+};
+
+struct irdma_cq_shadow_area {
+	__le64 buf[IRDMA_SHADOW_AREA_SIZE];
+};
+
+struct irdma_dev_hw_stats_offsets {
+	u32 stats_offset_32[IRDMA_HW_STAT_INDEX_MAX_32];
+	u32 stats_offset_64[IRDMA_HW_STAT_INDEX_MAX_64];
+};
+
+struct irdma_dev_hw_stats {
+	u64 stats_val_32[IRDMA_HW_STAT_INDEX_MAX_32];
+	u64 stats_val_64[IRDMA_HW_STAT_INDEX_MAX_64];
+};
+
+struct irdma_gather_stats {
+	u32 rsvd1;
+	u32 rxvlanerr;
+	u64 ip4rxocts;
+	u64 ip4rxpkts;
+	u32 ip4rxtrunc;
+	u32 ip4rxdiscard;
+	u64 ip4rxfrags;
+	u64 ip4rxmcocts;
+	u64 ip4rxmcpkts;
+	u64 ip6rxocts;
+	u64 ip6rxpkts;
+	u32 ip6rxtrunc;
+	u32 ip6rxdiscard;
+	u64 ip6rxfrags;
+	u64 ip6rxmcocts;
+	u64 ip6rxmcpkts;
+	u64 ip4txocts;
+	u64 ip4txpkts;
+	u64 ip4txfrag;
+	u64 ip4txmcocts;
+	u64 ip4txmcpkts;
+	u64 ip6txocts;
+	u64 ip6txpkts;
+	u64 ip6txfrags;
+	u64 ip6txmcocts;
+	u64 ip6txmcpkts;
+	u32 ip6txnoroute;
+	u32 ip4txnoroute;
+	u64 tcprxsegs;
+	u32 tcprxprotoerr;
+	u32 tcprxopterr;
+	u64 tcptxsegs;
+	u32 rsvd2;
+	u32 tcprtxseg;
+	u64 udprxpkts;
+	u64 udptxpkts;
+	u64 rdmarxwrs;
+	u64 rdmarxrds;
+	u64 rdmarxsnds;
+	u64 rdmatxwrs;
+	u64 rdmatxrds;
+	u64 rdmatxsnds;
+	u64 rdmavbn;
+	u64 rdmavinv;
+	u64 rxnpecnmrkpkts;
+	u32 rxrpcnphandled;
+	u32 rxrpcnpignored;
+	u32 txnpcnpsent;
+	u32 rsvd3[88];
+};
+
+struct irdma_stats_gather_info {
+	bool use_hmc_fcn_index:1;
+	bool use_stats_inst:1;
+	u8 hmc_fcn_index;
+	u8 stats_inst_index;
+	struct irdma_dma_mem stats_buff_mem;
+	void *gather_stats_va;
+	void *last_gather_stats_va;
+};
+
+struct irdma_vsi_pestat {
+	struct irdma_hw *hw;
+	struct irdma_dev_hw_stats hw_stats;
+	struct irdma_stats_gather_info gather_info;
+	struct timer_list stats_timer;
+	struct irdma_sc_vsi *vsi;
+	struct irdma_dev_hw_stats last_hw_stats;
+	spinlock_t lock; /* rdma stats lock */
+};
+
+struct irdma_hw {
+	u8 __iomem *hw_addr;
+	u8 __iomem *priv_hw_addr;
+	struct device *device;
+	struct irdma_hmc_info hmc;
+};
+
+struct irdma_pfpdu {
+	struct list_head rxlist;
+	u32 rcv_nxt;
+	u32 fps;
+	u32 max_fpdu_data;
+	u32 nextseqnum;
+	u32 rcv_start_seq;
+	bool mode:1;
+	bool mpa_crc_err:1;
+	u8  marker_len;
+	u64 total_ieq_bufs;
+	u64 fpdu_processed;
+	u64 bad_seq_num;
+	u64 crc_err;
+	u64 no_tx_bufs;
+	u64 tx_err;
+	u64 out_of_order;
+	u64 pmode_count;
+	struct irdma_sc_ah *ah;
+	struct irdma_puda_buf *ah_buf;
+	spinlock_t lock; /* fpdu processing lock */
+	struct irdma_puda_buf *lastrcv_buf;
+};
+
+struct irdma_sc_pd {
+	struct irdma_sc_dev *dev;
+	u32 pd_id;
+	int abi_ver;
+};
+
+struct irdma_cqp_quanta {
+	__le64 elem[IRDMA_CQP_WQE_SIZE];
+};
+
+struct irdma_sc_cqp {
+	u32 size;
+	u64 sq_pa;
+	u64 host_ctx_pa;
+	void *back_cqp;
+	struct irdma_sc_dev *dev;
+	enum irdma_status_code (*process_cqp_sds)(struct irdma_sc_dev *dev,
+						  struct irdma_update_sds_info *info);
+	struct irdma_dma_mem sdbuf;
+	struct irdma_ring sq_ring;
+	struct irdma_cqp_quanta *sq_base;
+	struct irdma_dcqcn_cc_params dcqcn_params;
+	__le64 *host_ctx;
+	u64 *scratch_array;
+	u32 cqp_id;
+	u32 sq_size;
+	u32 hw_sq_size;
+	u16 hw_maj_ver;
+	u16 hw_min_ver;
+	u8 struct_ver;
+	u8 polarity;
+	u8 hmc_profile;
+	u8 ena_vf_count;
+	u8 timeout_count;
+	u8 ceqs_per_vf;
+	bool en_datacenter_tcp:1;
+	bool disable_packed:1;
+	bool rocev2_rto_policy:1;
+	enum irdma_protocol_used protocol_used;
+};
+
+struct irdma_sc_aeq {
+	u32 size;
+	u64 aeq_elem_pa;
+	struct irdma_sc_dev *dev;
+	struct irdma_sc_aeqe *aeqe_base;
+	void *pbl_list;
+	u32 elem_cnt;
+	struct irdma_ring aeq_ring;
+	u8 pbl_chunk_size;
+	u32 first_pm_pbl_idx;
+	u32 msix_idx;
+	u8 polarity;
+	bool virtual_map:1;
+};
+
+struct irdma_sc_ceq {
+	u32 size;
+	u64 ceq_elem_pa;
+	struct irdma_sc_dev *dev;
+	struct irdma_ceqe *ceqe_base;
+	void *pbl_list;
+	u32 ceq_id;
+	u32 elem_cnt;
+	struct irdma_ring ceq_ring;
+	u8 pbl_chunk_size;
+	u8 tph_val;
+	u32 first_pm_pbl_idx;
+	u8 polarity;
+	struct irdma_sc_vsi *vsi;
+	struct irdma_sc_cq **reg_cq;
+	u32 reg_cq_size;
+	spinlock_t req_cq_lock; /* protect access to reg_cq array */
+	bool virtual_map:1;
+	bool tph_en:1;
+	bool itr_no_expire:1;
+};
+
+struct irdma_sc_cq {
+	struct irdma_cq_uk cq_uk;
+	u64 cq_pa;
+	u64 shadow_area_pa;
+	struct irdma_sc_dev *dev;
+	struct irdma_sc_vsi *vsi;
+	void *pbl_list;
+	void *back_cq;
+	u32 ceq_id;
+	u32 shadow_read_threshold;
+	u8 pbl_chunk_size;
+	u8 cq_type;
+	u8 tph_val;
+	u32 first_pm_pbl_idx;
+	bool ceqe_mask:1;
+	bool virtual_map:1;
+	bool check_overflow:1;
+	bool ceq_id_valid:1;
+	bool tph_en;
+};
+
+struct irdma_sc_qp {
+	struct irdma_qp_uk qp_uk;
+	u64 sq_pa;
+	u64 rq_pa;
+	u64 hw_host_ctx_pa;
+	u64 shadow_area_pa;
+	u64 q2_pa;
+	struct irdma_sc_dev *dev;
+	struct irdma_sc_vsi *vsi;
+	struct irdma_sc_pd *pd;
+	__le64 *hw_host_ctx;
+	void *llp_stream_handle;
+	struct irdma_pfpdu pfpdu;
+	u32 ieq_qp;
+	u8 *q2_buf;
+	u64 qp_compl_ctx;
+	u32 push_idx;
+	u16 qs_handle;
+	u16 push_offset;
+	u8 flush_wqes_count;
+	u8 sq_tph_val;
+	u8 rq_tph_val;
+	u8 qp_state;
+	u8 hw_sq_size;
+	u8 hw_rq_size;
+	u8 src_mac_addr_idx;
+	bool on_qoslist:1;
+	bool ieq_pass_thru:1;
+	bool sq_tph_en:1;
+	bool rq_tph_en:1;
+	bool rcv_tph_en:1;
+	bool xmit_tph_en:1;
+	bool virtual_map:1;
+	bool flush_sq:1;
+	bool flush_rq:1;
+	bool sq_flush_code:1;
+	bool rq_flush_code:1;
+	enum irdma_flush_opcode flush_code;
+	enum irdma_qp_event_type event_type;
+	u8 term_flags;
+	u8 user_pri;
+	struct list_head list;
+};
+
+struct irdma_stats_inst_info {
+	bool use_hmc_fcn_index;
+	u8 hmc_fn_id;
+	u8 stats_idx;
+};
+
+struct irdma_up_info {
+	u8 map[8];
+	u8 cnp_up_override;
+	u8 hmc_fcn_idx;
+	bool use_vlan:1;
+	bool use_cnp_up_override:1;
+};
+
+#define IRDMA_MAX_WS_NODES	0x3FF
+#define IRDMA_WS_NODE_INVALID	0xFFFF
+
+struct irdma_ws_node_info {
+	u16 id;
+	u16 vsi;
+	u16 parent_id;
+	u16 qs_handle;
+	bool type_leaf:1;
+	bool enable:1;
+	u8 prio_type;
+	u8 tc;
+	u8 weight;
+};
+
+struct irdma_hmc_fpm_misc {
+	u32 max_ceqs;
+	u32 max_sds;
+	u32 xf_block_size;
+	u32 q1_block_size;
+	u32 ht_multiplier;
+	u32 timer_bucket;
+	u32 rrf_block_size;
+	u32 ooiscf_block_size;
+};
+
+#define IRDMA_LEAF_DEFAULT_REL_BW		64
+#define IRDMA_PARENT_DEFAULT_REL_BW		1
+
+struct irdma_qos {
+	struct list_head qplist;
+	struct mutex qos_mutex; /* protect QoS attributes per QoS level */
+	u64 lan_qos_handle;
+	u32 l2_sched_node_id;
+	u16 qs_handle;
+	u8 traffic_class;
+	u8 rel_bw;
+	u8 prio_type;
+	bool valid;
+};
+
+#define IRDMA_INVALID_FCN_ID 0xff
+struct irdma_sc_vsi {
+	u16 vsi_idx;
+	struct irdma_sc_dev *dev;
+	void *back_vsi;
+	u32 ilq_count;
+	struct irdma_virt_mem ilq_mem;
+	struct irdma_puda_rsrc *ilq;
+	u32 ieq_count;
+	struct irdma_virt_mem ieq_mem;
+	struct irdma_puda_rsrc *ieq;
+	u32 exception_lan_q;
+	u16 mtu;
+	u16 vm_id;
+	u8 fcn_id;
+	enum irdma_vm_vf_type vm_vf_type;
+	bool stats_fcn_id_alloc:1;
+	bool tc_change_pending:1;
+	struct irdma_qos qos[IRDMA_MAX_USER_PRIORITY];
+	struct irdma_vsi_pestat *pestat;
+	atomic_t qp_suspend_reqs;
+	enum irdma_status_code (*register_qset)(struct irdma_sc_vsi *vsi,
+						struct irdma_ws_node *tc_node);
+	void (*unregister_qset)(struct irdma_sc_vsi *vsi,
+				struct irdma_ws_node *tc_node);
+	u8 qos_rel_bw;
+	u8 qos_prio_type;
+};
+
+struct irdma_sc_dev {
+	struct list_head cqp_cmd_head; /* head of the CQP command list */
+	spinlock_t cqp_lock; /* protect CQP list access */
+	struct irdma_dev_uk dev_uk;
+	bool fcn_id_array[IRDMA_MAX_STATS_COUNT];
+	struct irdma_dma_mem vf_fpm_query_buf[IRDMA_MAX_PE_ENA_VF_COUNT];
+	u64 fpm_query_buf_pa;
+	u64 fpm_commit_buf_pa;
+	__le64 *fpm_query_buf;
+	__le64 *fpm_commit_buf;
+	struct irdma_hw *hw;
+	u8 __iomem *db_addr;
+	u32 __iomem *wqe_alloc_db;
+	u32 __iomem *cq_arm_db;
+	u32 __iomem *aeq_alloc_db;
+	u32 __iomem *cqp_db;
+	u32 __iomem *cq_ack_db;
+	u32 __iomem *ceq_itr_mask_db;
+	u32 __iomem *aeq_itr_mask_db;
+	u32 __iomem *hw_regs[IRDMA_MAX_REGS];
+	u32 ceq_itr;   /* Interrupt throttle, usecs between interrupts: 0 disabled. 2 - 8160 */
+	u64 hw_masks[IRDMA_MAX_MASKS];
+	u64 hw_shifts[IRDMA_MAX_SHIFTS];
+	u64 hw_stats_regs_32[IRDMA_HW_STAT_INDEX_MAX_32];
+	u64 hw_stats_regs_64[IRDMA_HW_STAT_INDEX_MAX_64];
+	u64 feature_info[IRDMA_MAX_FEATURES];
+	u64 cqp_cmd_stats[IRDMA_MAX_CQP_OPS];
+	struct irdma_hw_attrs hw_attrs;
+	struct irdma_hmc_info *hmc_info;
+	struct irdma_sc_cqp *cqp;
+	struct irdma_sc_aeq *aeq;
+	struct irdma_sc_ceq *ceq[IRDMA_CEQ_MAX_COUNT];
+	struct irdma_sc_cq *ccq;
+	const struct irdma_irq_ops *irq_ops;
+	const struct irdma_cqp_ops *cqp_ops;
+	const struct irdma_ccq_ops *ccq_ops;
+	const struct irdma_ceq_ops *ceq_ops;
+	const struct irdma_aeq_ops *aeq_ops;
+	const struct irdma_pd_ops *iw_pd_ops;
+	const struct irdma_ah_ops *iw_ah_ops;
+	const struct irdma_priv_qp_ops *iw_priv_qp_ops;
+	const struct irdma_priv_cq_ops *iw_priv_cq_ops;
+	const struct irdma_mr_ops *mr_ops;
+	const struct irdma_cqp_misc_ops *cqp_misc_ops;
+	const struct irdma_hmc_ops *hmc_ops;
+	const struct irdma_uda_ops *iw_uda_ops;
+	const struct irdma_vsi_ops *iw_vsi_ops;
+	struct irdma_hmc_fpm_misc hmc_fpm_misc;
+	struct irdma_ws_node *ws_tree_root;
+	struct mutex ws_mutex; /* ws tree mutex */
+	u16 num_vfs;
+	u8 hmc_fn_id;
+	u8 vf_id;
+	bool privileged:1;
+	bool vchnl_up:1;
+	bool ceq_valid:1;
+	bool is_pf:1;
+	u8 pci_rev;
+	enum irdma_status_code (*ws_add)(struct irdma_sc_vsi *vsi, u8 user_pri);
+	void (*ws_remove)(struct irdma_sc_vsi *vsi, u8 user_pri);
+	void (*ws_reset)(struct irdma_sc_vsi *vsi);
+};
+
+struct irdma_modify_cq_info {
+	u64 cq_pa;
+	struct irdma_cqe *cq_base;
+	u32 cq_size;
+	u32 shadow_read_threshold;
+	u8 pbl_chunk_size;
+	u32 first_pm_pbl_idx;
+	bool virtual_map:1;
+	bool check_overflow;
+	bool cq_resize:1;
+};
+
+struct irdma_create_qp_info {
+	bool ord_valid:1;
+	bool tcp_ctx_valid:1;
+	bool cq_num_valid:1;
+	bool arp_cache_idx_valid:1;
+	bool mac_valid:1;
+	bool force_lpb;
+	u8 next_iwarp_state;
+};
+
+struct irdma_modify_qp_info {
+	u64 rx_win0;
+	u64 rx_win1;
+	u16 new_mss;
+	u8 next_iwarp_state;
+	u8 curr_iwarp_state;
+	u8 termlen;
+	bool ord_valid:1;
+	bool tcp_ctx_valid:1;
+	bool udp_ctx_valid:1;
+	bool cq_num_valid:1;
+	bool arp_cache_idx_valid:1;
+	bool reset_tcp_conn:1;
+	bool remove_hash_idx:1;
+	bool dont_send_term:1;
+	bool dont_send_fin:1;
+	bool cached_var_valid:1;
+	bool mss_change:1;
+	bool force_lpb:1;
+	bool mac_valid:1;
+};
+
+struct irdma_ccq_cqe_info {
+	struct irdma_sc_cqp *cqp;
+	u64 scratch;
+	u32 op_ret_val;
+	u16 maj_err_code;
+	u16 min_err_code;
+	u8 op_code;
+	bool error;
+};
+
+struct irdma_dcb_app_info {
+	u8 priority;
+	u8 selector;
+	u16 prot_id;
+};
+
+struct irdma_qos_tc_info {
+	u64 tc_ctx;
+	u8 rel_bw;
+	u8 prio_type;
+	u8 egress_virt_up;
+	u8 ingress_virt_up;
+};
+
+struct irdma_l2params {
+	struct irdma_qos_tc_info tc_info[IRDMA_MAX_USER_PRIORITY];
+	struct irdma_dcb_app_info apps[IRDMA_MAX_APPS];
+	u32 num_apps;
+	u16 qs_handle_list[IRDMA_MAX_USER_PRIORITY];
+	u16 mtu;
+	u8 up2tc[IRDMA_MAX_USER_PRIORITY];
+	u8 num_tc;
+	u8 vsi_rel_bw;
+	u8 vsi_prio_type;
+	bool mtu_changed:1;
+	bool tc_changed:1;
+};
+
+struct irdma_vsi_init_info {
+	struct irdma_sc_dev *dev;
+	void *back_vsi;
+	struct irdma_l2params *params;
+	u16 exception_lan_q;
+	u16 pf_data_vsi_num;
+	enum irdma_vm_vf_type vm_vf_type;
+	u16 vm_id;
+	enum irdma_status_code (*register_qset)(struct irdma_sc_vsi *vsi,
+						struct irdma_ws_node *tc_node);
+	void (*unregister_qset)(struct irdma_sc_vsi *vsi,
+				struct irdma_ws_node *tc_node);
+};
+
+struct irdma_vsi_stats_info {
+	struct irdma_vsi_pestat *pestat;
+	u8 fcn_id;
+	bool alloc_fcn_id;
+};
+
+struct irdma_device_init_info {
+	u64 fpm_query_buf_pa;
+	u64 fpm_commit_buf_pa;
+	__le64 *fpm_query_buf;
+	__le64 *fpm_commit_buf;
+	struct irdma_hw *hw;
+	void __iomem *bar0;
+	void (*init_hw)(struct irdma_sc_dev *dev);
+	u8 hmc_fn_id;
+	bool privileged;
+	bool is_pf;
+};
+
+struct irdma_ceq_init_info {
+	u64 ceqe_pa;
+	struct irdma_sc_dev *dev;
+	u64 *ceqe_base;
+	void *pbl_list;
+	u32 elem_cnt;
+	u32 ceq_id;
+	bool virtual_map:1;
+	bool tph_en:1;
+	bool itr_no_expire:1;
+	u8 pbl_chunk_size;
+	u8 tph_val;
+	u32 first_pm_pbl_idx;
+	struct irdma_sc_vsi *vsi;
+	struct irdma_sc_cq **reg_cq;
+	u32 reg_cq_idx;
+};
+
+struct irdma_aeq_init_info {
+	u64 aeq_elem_pa;
+	struct irdma_sc_dev *dev;
+	u32 *aeqe_base;
+	void *pbl_list;
+	u32 elem_cnt;
+	bool virtual_map;
+	u8 pbl_chunk_size;
+	u32 first_pm_pbl_idx;
+	u32 msix_idx;
+};
+
+struct irdma_ccq_init_info {
+	u64 cq_pa;
+	u64 shadow_area_pa;
+	struct irdma_sc_dev *dev;
+	struct irdma_cqe *cq_base;
+	__le64 *shadow_area;
+	void *pbl_list;
+	u32 num_elem;
+	u32 ceq_id;
+	u32 shadow_read_threshold;
+	bool ceqe_mask:1;
+	bool ceq_id_valid:1;
+	bool avoid_mem_cflct:1;
+	bool virtual_map:1;
+	bool tph_en:1;
+	u8 tph_val;
+	u8 pbl_chunk_size;
+	u32 first_pm_pbl_idx;
+	struct irdma_sc_vsi *vsi;
+};
+
+struct irdma_udp_offload_info {
+	bool ipv4:1;
+	bool insert_vlan_tag:1;
+	u8 ttl;
+	u8 tos;
+	u16 src_port;
+	u16 dst_port;
+	u32 dest_ip_addr[4];
+	u32 snd_mss;
+	u16 vlan_tag;
+	u16 arp_idx;
+	u32 flow_label;
+	u8 udp_state;
+	u32 psn_nxt;
+	u32 lsn;
+	u32 epsn;
+	u32 psn_max;
+	u32 psn_una;
+	u32 local_ipaddr[4];
+	u32 cwnd;
+	u8 rexmit_thresh;
+	u8 rnr_nak_thresh;
+};
+
+struct irdma_roce_offload_info {
+	u16 p_key;
+	u16 err_rq_idx;
+	u32 qkey;
+	u32 dest_qp;
+	u32 local_qp;
+	u8 roce_tver;
+	u8 ack_credits;
+	u8 err_rq_idx_valid;
+	u32 pd_id;
+	u16 ord_size;
+	u16 ird_size;
+	bool is_qp1:1;
+	bool udprivcq_en:1;
+	bool dcqcn_en:1;
+	bool rcv_no_icrc:1;
+	bool wr_rdresp_en:1;
+	bool bind_en:1;
+	bool fast_reg_en:1;
+	bool priv_mode_en:1;
+	bool rd_en:1;
+	bool timely_en:1;
+	bool dctcp_en:1;
+	bool fw_cc_enable:1;
+	bool use_stats_inst:1;
+	u16 t_high;
+	u16 t_low;
+	u8 last_byte_sent;
+	u8 mac_addr[ETH_ALEN];
+	u8 rtomin;
+};
+
+struct irdma_iwarp_offload_info {
+	u16 rcv_mark_offset;
+	u16 snd_mark_offset;
+	u8 ddp_ver;
+	u8 rdmap_ver;
+	u8 iwarp_mode;
+	u16 err_rq_idx;
+	u32 pd_id;
+	u16 ord_size;
+	u16 ird_size;
+	bool ib_rd_en:1;
+	bool align_hdrs:1;
+	bool rcv_no_mpa_crc:1;
+	bool err_rq_idx_valid:1;
+	bool snd_mark_en:1;
+	bool rcv_mark_en:1;
+	bool wr_rdresp_en:1;
+	bool bind_en:1;
+	bool fast_reg_en:1;
+	bool priv_mode_en:1;
+	bool rd_en:1;
+	bool timely_en:1;
+	bool use_stats_inst:1;
+	bool ecn_en:1;
+	bool dctcp_en:1;
+	u16 t_high;
+	u16 t_low;
+	u8 last_byte_sent;
+	u8 mac_addr[ETH_ALEN];
+	u8 rtomin;
+};
+
+struct irdma_tcp_offload_info {
+	bool ipv4:1;
+	bool no_nagle:1;
+	bool insert_vlan_tag:1;
+	bool time_stamp:1;
+	bool drop_ooo_seg:1;
+	bool avoid_stretch_ack:1;
+	bool wscale:1;
+	bool ignore_tcp_opt:1;
+	bool ignore_tcp_uns_opt:1;
+	u8 cwnd_inc_limit;
+	u8 dup_ack_thresh;
+	u8 ttl;
+	u8 src_mac_addr_idx;
+	u8 tos;
+	u16 src_port;
+	u16 dst_port;
+	u32 dest_ip_addr[4];
+	//u32 dest_ip_addr0;
+	//u32 dest_ip_addr1;
+	//u32 dest_ip_addr2;
+	//u32 dest_ip_addr3;
+	u32 snd_mss;
+	u16 syn_rst_handling;
+	u16 vlan_tag;
+	u16 arp_idx;
+	u32 flow_label;
+	u8 tcp_state;
+	u8 snd_wscale;
+	u8 rcv_wscale;
+	u32 time_stamp_recent;
+	u32 time_stamp_age;
+	u32 snd_nxt;
+	u32 snd_wnd;
+	u32 rcv_nxt;
+	u32 rcv_wnd;
+	u32 snd_max;
+	u32 snd_una;
+	u32 srtt;
+	u32 rtt_var;
+	u32 ss_thresh;
+	u32 cwnd;
+	u32 snd_wl1;
+	u32 snd_wl2;
+	u32 max_snd_window;
+	u8 rexmit_thresh;
+	u32 local_ipaddr[4];
+};
+
+struct irdma_qp_host_ctx_info {
+	u64 qp_compl_ctx;
+	union {
+		struct irdma_tcp_offload_info *tcp_info;
+		struct irdma_udp_offload_info *udp_info;
+	};
+	union {
+		struct irdma_iwarp_offload_info *iwarp_info;
+		struct irdma_roce_offload_info *roce_info;
+	};
+	u32 send_cq_num;
+	u32 rcv_cq_num;
+	u32 rem_endpoint_idx;
+	u8 stats_idx;
+	bool srq_valid:1;
+	bool tcp_info_valid:1;
+	bool iwarp_info_valid:1;
+	bool stats_idx_valid:1;
+	u8 user_pri;
+};
+
+struct irdma_aeqe_info {
+	u64 compl_ctx;
+	u32 qp_cq_id;
+	u16 ae_id;
+	u16 wqe_idx;
+	u8 tcp_state;
+	u8 iwarp_state;
+	bool qp:1;
+	bool cq:1;
+	bool sq:1;
+	bool rq:1;
+	bool in_rdrsp_wr:1;
+	bool out_rdrsp:1;
+	bool aeqe_overflow:1;
+	u8 q2_data_written;
+	u8 ae_src;
+};
+
+struct irdma_allocate_stag_info {
+	u64 total_len;
+	u64 first_pm_pbl_idx;
+	u32 chunk_size;
+	u32 stag_idx;
+	u32 page_size;
+	u32 pd_id;
+	u16 access_rights;
+	bool remote_access:1;
+	bool use_hmc_fcn_index:1;
+	bool use_pf_rid:1;
+	u8 hmc_fcn_index;
+};
+
+struct irdma_mw_alloc_info {
+	u32 mw_stag_index;
+	u32 page_size;
+	u32 pd_id;
+	bool remote_access:1;
+	bool mw_wide:1;
+	bool mw1_bind_dont_vldt_key:1;
+};
+
+struct irdma_reg_ns_stag_info {
+	u64 reg_addr_pa;
+	u64 va;
+	u64 total_len;
+	u32 page_size;
+	u32 chunk_size;
+	u32 first_pm_pbl_index;
+	enum irdma_addressing_type addr_type;
+	irdma_stag_index stag_idx;
+	u16 access_rights;
+	u32 pd_id;
+	irdma_stag_key stag_key;
+	bool use_hmc_fcn_index:1;
+	u8 hmc_fcn_index;
+	bool use_pf_rid:1;
+};
+
+struct irdma_fast_reg_stag_info {
+	u64 wr_id;
+	u64 reg_addr_pa;
+	u64 fbo;
+	void *va;
+	u64 total_len;
+	u32 page_size;
+	u32 chunk_size;
+	u32 first_pm_pbl_index;
+	enum irdma_addressing_type addr_type;
+	irdma_stag_index stag_idx;
+	u16 access_rights;
+	u32 pd_id;
+	irdma_stag_key stag_key;
+	bool local_fence:1;
+	bool read_fence:1;
+	bool signaled:1;
+	bool push_wqe:1;
+	bool use_hmc_fcn_index:1;
+	u8 hmc_fcn_index;
+	bool use_pf_rid:1;
+	bool defer_flag:1;
+};
+
+struct irdma_dealloc_stag_info {
+	u32 stag_idx;
+	u32 pd_id;
+	bool mr:1;
+	bool dealloc_pbl:1;
+};
+
+struct irdma_register_shared_stag {
+	u64 va;
+	enum irdma_addressing_type addr_type;
+	irdma_stag_index new_stag_idx;
+	irdma_stag_index parent_stag_idx;
+	u32 access_rights;
+	u32 pd_id;
+	u32 page_size;
+	irdma_stag_key new_stag_key;
+};
+
+struct irdma_qp_init_info {
+	struct irdma_qp_uk_init_info qp_uk_init_info;
+	struct irdma_sc_pd *pd;
+	struct irdma_sc_vsi *vsi;
+	__le64 *host_ctx;
+	u8 *q2;
+	u64 sq_pa;
+	u64 rq_pa;
+	u64 host_ctx_pa;
+	u64 q2_pa;
+	u64 shadow_area_pa;
+	u8 sq_tph_val;
+	u8 rq_tph_val;
+	bool sq_tph_en:1;
+	bool rq_tph_en:1;
+	bool rcv_tph_en:1;
+	bool xmit_tph_en:1;
+	bool virtual_map:1;
+};
+
+struct irdma_cq_init_info {
+	struct irdma_sc_dev *dev;
+	u64 cq_base_pa;
+	u64 shadow_area_pa;
+	u32 ceq_id;
+	u32 shadow_read_threshold;
+	u8 pbl_chunk_size;
+	u32 first_pm_pbl_idx;
+	bool virtual_map:1;
+	bool ceqe_mask:1;
+	bool ceq_id_valid:1;
+	bool tph_en:1;
+	u8 tph_val;
+	u8 type;
+	struct irdma_cq_uk_init_info cq_uk_init_info;
+	struct irdma_sc_vsi *vsi;
+};
+
+struct irdma_upload_context_info {
+	u64 buf_pa;
+	u32 qp_id;
+	u8 qp_type;
+	bool freeze_qp:1;
+	bool raw_format:1;
+};
+
+struct irdma_local_mac_entry_info {
+	u8 mac_addr[6];
+	u16 entry_idx;
+};
+
+struct irdma_add_arp_cache_entry_info {
+	u8 mac_addr[ETH_ALEN];
+	u32 reach_max;
+	u16 arp_index;
+	bool permanent;
+};
+
+struct irdma_apbvt_info {
+	u16 port;
+	bool add;
+};
+
+struct irdma_qhash_table_info {
+	struct irdma_sc_vsi *vsi;
+	enum irdma_quad_hash_manage_type manage;
+	enum irdma_quad_entry_type entry_type;
+	bool vlan_valid:1;
+	bool ipv4_valid:1;
+	u8 mac_addr[ETH_ALEN];
+	u16 vlan_id;
+	u8 user_pri;
+	u32 qp_num;
+	u32 dest_ip[4];
+	u32 src_ip[4];
+	u16 dest_port;
+	u16 src_port;
+};
+
+struct irdma_cqp_manage_push_page_info {
+	u32 push_idx;
+	u16 qs_handle;
+	u8 free_page;
+	u8 push_page_type;
+};
+
+struct irdma_qp_flush_info {
+	u16 sq_minor_code;
+	u16 sq_major_code;
+	u16 rq_minor_code;
+	u16 rq_major_code;
+	u16 ae_code;
+	u8 ae_src;
+	bool sq:1;
+	bool rq:1;
+	bool userflushcode:1;
+	bool generate_ae:1;
+};
+
+struct irdma_gen_ae_info {
+	u16 ae_code;
+	u8 ae_src;
+};
+
+struct irdma_cqp_timeout {
+	u64 compl_cqp_cmds;
+	u32 count;
+};
+
+struct irdma_irq_ops {
+	void (*irdma_cfg_aeq)(struct irdma_sc_dev *dev, u32 idx, bool enable);
+	void (*irdma_cfg_ceq)(struct irdma_sc_dev *dev, u32 ceq_id, u32 idx,
+			      bool enable);
+	void (*irdma_dis_irq)(struct irdma_sc_dev *dev, u32 idx);
+	void (*irdma_en_irq)(struct irdma_sc_dev *dev, u32 idx);
+	void (*irdma_set_intrl)(struct irdma_sc_dev *dev, u32 idx, u32 rate);
+};
+
+struct irdma_cqp_ops {
+	void (*check_cqp_progress)(struct irdma_cqp_timeout *cqp_timeout,
+				   struct irdma_sc_dev *dev);
+	enum irdma_status_code (*cqp_create)(struct irdma_sc_cqp *cqp,
+					     u16 *maj_err, u16 *min_err);
+	enum irdma_status_code (*cqp_destroy)(struct irdma_sc_cqp *cqp);
+	__le64 *(*cqp_get_next_send_wqe)(struct irdma_sc_cqp *cqp, u64 scratch);
+	enum irdma_status_code (*cqp_init)(struct irdma_sc_cqp *cqp,
+					   struct irdma_cqp_init_info *info);
+	void (*cqp_post_sq)(struct irdma_sc_cqp *cqp);
+	enum irdma_status_code (*poll_for_cqp_op_done)(struct irdma_sc_cqp *cqp,
+						       u8 opcode,
+						       struct irdma_ccq_cqe_info *cmpl_info);
+};
+
+struct irdma_ccq_ops {
+	void (*ccq_arm)(struct irdma_sc_cq *ccq);
+	enum irdma_status_code (*ccq_create)(struct irdma_sc_cq *ccq,
+					     u64 scratch, bool check_overflow,
+					     bool post_sq);
+	enum irdma_status_code (*ccq_create_done)(struct irdma_sc_cq *ccq);
+	enum irdma_status_code (*ccq_destroy)(struct irdma_sc_cq *ccq, u64 scratch, bool post_sq);
+	enum irdma_status_code (*ccq_get_cqe_info)(struct irdma_sc_cq *ccq,
+						   struct irdma_ccq_cqe_info *info);
+	enum irdma_status_code (*ccq_init)(struct irdma_sc_cq *ccq,
+					   struct irdma_ccq_init_info *info);
+};
+
+struct irdma_ceq_ops {
+	enum irdma_status_code (*ceq_create)(struct irdma_sc_ceq *ceq,
+					     u64 scratch, bool post_sq);
+	enum irdma_status_code (*cceq_create_done)(struct irdma_sc_ceq *ceq);
+	enum irdma_status_code (*cceq_destroy_done)(struct irdma_sc_ceq *ceq);
+	enum irdma_status_code (*cceq_create)(struct irdma_sc_ceq *ceq,
+					      u64 scratch);
+	enum irdma_status_code (*ceq_destroy)(struct irdma_sc_ceq *ceq,
+					      u64 scratch, bool post_sq);
+	enum irdma_status_code (*ceq_init)(struct irdma_sc_ceq *ceq,
+					   struct irdma_ceq_init_info *info);
+	void *(*process_ceq)(struct irdma_sc_dev *dev,
+			     struct irdma_sc_ceq *ceq);
+	void (*cleanup_ceqes)(struct irdma_sc_cq *cq,
+			      struct irdma_sc_ceq *ceq);
+};
+
+struct irdma_aeq_ops {
+	enum irdma_status_code (*aeq_init)(struct irdma_sc_aeq *aeq,
+					   struct irdma_aeq_init_info *info);
+	enum irdma_status_code (*aeq_create)(struct irdma_sc_aeq *aeq,
+					     u64 scratch, bool post_sq);
+	enum irdma_status_code (*aeq_destroy)(struct irdma_sc_aeq *aeq,
+					      u64 scratch, bool post_sq);
+	enum irdma_status_code (*get_next_aeqe)(struct irdma_sc_aeq *aeq,
+						struct irdma_aeqe_info *info);
+	enum irdma_status_code (*repost_aeq_entries)(struct irdma_sc_dev *dev,
+						     u32 count);
+	enum irdma_status_code (*aeq_create_done)(struct irdma_sc_aeq *aeq);
+	enum irdma_status_code (*aeq_destroy_done)(struct irdma_sc_aeq *aeq);
+};
+
+struct irdma_pd_ops {
+	void (*pd_init)(struct irdma_sc_dev *dev, struct irdma_sc_pd *pd,
+			u32 pd_id, int abi_ver);
+};
+
+struct irdma_priv_qp_ops {
+	enum irdma_status_code (*iw_mr_fast_register)(struct irdma_sc_qp *qp,
+						      struct irdma_fast_reg_stag_info *info,
+						      bool post_sq);
+	enum irdma_status_code (*qp_create)(struct irdma_sc_qp *qp,
+					    struct irdma_create_qp_info *info,
+					    u64 scratch, bool post_sq);
+	enum irdma_status_code (*qp_destroy)(struct irdma_sc_qp *qp,
+					     u64 scratch, bool remove_hash_idx,
+					     bool ignore_mw_bnd, bool post_sq);
+	enum irdma_status_code (*qp_flush_wqes)(struct irdma_sc_qp *qp,
+						struct irdma_qp_flush_info *info,
+						u64 scratch, bool post_sq);
+	enum irdma_status_code (*qp_init)(struct irdma_sc_qp *qp,
+					  struct irdma_qp_init_info *info);
+	enum irdma_status_code (*qp_modify)(struct irdma_sc_qp *qp,
+					    struct irdma_modify_qp_info *info,
+					    u64 scratch, bool post_sq);
+	void (*qp_send_lsmm)(struct irdma_sc_qp *qp, void *lsmm_buf, u32 size,
+			     irdma_stag stag);
+	void (*qp_send_lsmm_nostag)(struct irdma_sc_qp *qp, void *lsmm_buf,
+				    u32 size);
+	void (*qp_send_rtt)(struct irdma_sc_qp *qp, bool read);
+	void (*qp_setctx)(struct irdma_sc_qp *qp, __le64 *qp_ctx,
+			  struct irdma_qp_host_ctx_info *info);
+	void (*qp_setctx_roce)(struct irdma_sc_qp *qp, __le64 *qp_ctx,
+			       struct irdma_qp_host_ctx_info *info);
+	enum irdma_status_code (*qp_upload_context)(struct irdma_sc_dev *dev,
+						    struct irdma_upload_context_info *info,
+						    u64 scratch, bool post_sq);
+	enum irdma_status_code (*update_suspend_qp)(struct irdma_sc_cqp *cqp,
+						    struct irdma_sc_qp *qp,
+						    u64 scratch);
+	enum irdma_status_code (*update_resume_qp)(struct irdma_sc_cqp *cqp,
+						   struct irdma_sc_qp *qp,
+						   u64 scratch);
+};
+
+struct irdma_priv_cq_ops {
+	void (*cq_ack)(struct irdma_sc_cq *cq);
+	enum irdma_status_code (*cq_create)(struct irdma_sc_cq *cq, u64 scratch,
+					    bool check_overflow, bool post_sq);
+	enum irdma_status_code (*cq_destroy)(struct irdma_sc_cq *cq,
+					     u64 scratch, bool post_sq);
+	enum irdma_status_code (*cq_init)(struct irdma_sc_cq *cq,
+					  struct irdma_cq_init_info *info);
+	enum irdma_status_code (*cq_modify)(struct irdma_sc_cq *cq,
+					    struct irdma_modify_cq_info *info,
+					    u64 scratch, bool post_sq);
+	void (*cq_resize)(struct irdma_sc_cq *cq, struct irdma_modify_cq_info *info);
+};
+
+struct irdma_mr_ops {
+	enum irdma_status_code (*alloc_stag)(struct irdma_sc_dev *dev,
+					     struct irdma_allocate_stag_info *info,
+					     u64 scratch, bool post_sq);
+	enum irdma_status_code (*dealloc_stag)(struct irdma_sc_dev *dev,
+					       struct irdma_dealloc_stag_info *info,
+					       u64 scratch, bool post_sq);
+	enum irdma_status_code (*mr_reg_non_shared)(struct irdma_sc_dev *dev,
+						    struct irdma_reg_ns_stag_info *info,
+						    u64 scratch, bool post_sq);
+	enum irdma_status_code (*mr_reg_shared)(struct irdma_sc_dev *dev,
+						struct irdma_register_shared_stag *stag,
+						u64 scratch, bool post_sq);
+	enum irdma_status_code (*mw_alloc)(struct irdma_sc_dev *dev,
+					   struct irdma_mw_alloc_info *info,
+					   u64 scratch, bool post_sq);
+	enum irdma_status_code (*query_stag)(struct irdma_sc_dev *dev, u64 scratch,
+					     u32 stag_index, bool post_sq);
+};
+
+struct irdma_cqp_misc_ops {
+	enum irdma_status_code (*add_arp_cache_entry)(struct irdma_sc_cqp *cqp,
+						      struct irdma_add_arp_cache_entry_info *info,
+						      u64 scratch, bool post_sq);
+	enum irdma_status_code (*add_local_mac_entry)(struct irdma_sc_cqp *cqp,
+						      struct irdma_local_mac_entry_info *info,
+						      u64 scratch, bool post_sq);
+	enum irdma_status_code (*alloc_local_mac_entry)(struct irdma_sc_cqp *cqp,
+							u64 scratch,
+							bool post_sq);
+	enum irdma_status_code (*cqp_nop)(struct irdma_sc_cqp *cqp, u64 scratch, bool post_sq);
+	enum irdma_status_code (*del_arp_cache_entry)(struct irdma_sc_cqp *cqp,
+						      u64 scratch,
+						      u16 arp_index,
+						      bool post_sq);
+	enum irdma_status_code (*del_local_mac_entry)(struct irdma_sc_cqp *cqp,
+						      u64 scratch,
+						      u16 entry_idx,
+						      u8 ignore_ref_count,
+						      bool post_sq);
+	enum irdma_status_code (*gather_stats)(struct irdma_sc_cqp *cqp,
+					       struct irdma_stats_gather_info *info,
+					       u64 scratch);
+	enum irdma_status_code (*manage_apbvt_entry)(struct irdma_sc_cqp *cqp,
+						     struct irdma_apbvt_info *info,
+						     u64 scratch, bool post_sq);
+	enum irdma_status_code (*manage_push_page)(struct irdma_sc_cqp *cqp,
+						   struct irdma_cqp_manage_push_page_info *info,
+						   u64 scratch, bool post_sq);
+	enum irdma_status_code (*manage_qhash_table_entry)(struct irdma_sc_cqp *cqp,
+							   struct irdma_qhash_table_info *info,
+							   u64 scratch, bool post_sq);
+	enum irdma_status_code (*manage_stats_instance)(struct irdma_sc_cqp *cqp,
+							struct irdma_stats_inst_info *info,
+							bool alloc, u64 scratch);
+	enum irdma_status_code (*manage_ws_node)(struct irdma_sc_cqp *cqp,
+						 struct irdma_ws_node_info *info,
+						 enum irdma_ws_node_op node_op,
+						 u64 scratch);
+	enum irdma_status_code (*query_arp_cache_entry)(struct irdma_sc_cqp *cqp,
+							u64 scratch, u16 arp_index, bool post_sq);
+	enum irdma_status_code (*query_rdma_features)(struct irdma_sc_cqp *cqp,
+						      struct irdma_dma_mem *buf,
+						      u64 scratch);
+	enum irdma_status_code (*set_up_map)(struct irdma_sc_cqp *cqp,
+					     struct irdma_up_info *info,
+					     u64 scratch);
+};
+
+struct irdma_hmc_ops {
+	enum irdma_status_code (*cfg_iw_fpm)(struct irdma_sc_dev *dev,
+					     u8 hmc_fn_id);
+	enum irdma_status_code (*commit_fpm_val)(struct irdma_sc_cqp *cqp,
+						 u64 scratch, u8 hmc_fn_id,
+						 struct irdma_dma_mem *commit_fpm_mem,
+						 bool post_sq, u8 wait_type);
+	enum irdma_status_code (*commit_fpm_val_done)(struct irdma_sc_cqp *cqp);
+	enum irdma_status_code (*create_hmc_object)(struct irdma_sc_dev *dev,
+						    struct irdma_hmc_create_obj_info *info);
+	enum irdma_status_code (*del_hmc_object)(struct irdma_sc_dev *dev,
+						 struct irdma_hmc_del_obj_info *info,
+						 bool reset);
+	enum irdma_status_code (*init_iw_hmc)(struct irdma_sc_dev *dev, u8 hmc_fn_id);
+	enum irdma_status_code (*manage_hmc_pm_func_table)(struct irdma_sc_cqp *cqp,
+							   struct irdma_hmc_fcn_info *info,
+							   u64 scratch,
+							   bool post_sq);
+	enum irdma_status_code (*manage_hmc_pm_func_table_done)(struct irdma_sc_cqp *cqp);
+	enum irdma_status_code (*parse_fpm_commit_buf)(struct irdma_sc_dev *dev,
+						       __le64 *buf,
+						       struct irdma_hmc_obj_info *info,
+						       u32 *sd);
+	enum irdma_status_code (*parse_fpm_query_buf)(struct irdma_sc_dev *dev,
+						      __le64 *buf,
+						      struct irdma_hmc_info *hmc_info,
+						      struct irdma_hmc_fpm_misc *hmc_fpm_misc);
+	enum irdma_status_code (*pf_init_vfhmc)(struct irdma_sc_dev *dev,
+						u8 vf_hmc_fn_id,
+						u32 *vf_cnt_array);
+	enum irdma_status_code (*query_fpm_val)(struct irdma_sc_cqp *cqp,
+						u64 scratch,
+						u8 hmc_fn_id,
+						struct irdma_dma_mem *query_fpm_mem,
+						bool post_sq, u8 wait_type);
+	enum irdma_status_code (*query_fpm_val_done)(struct irdma_sc_cqp *cqp);
+	enum irdma_status_code (*static_hmc_pages_allocated)(struct irdma_sc_cqp *cqp,
+							     u64 scratch,
+							     u8 hmc_fn_id,
+							     bool post_sq,
+							     bool poll_registers);
+	enum irdma_status_code (*vf_cfg_vffpm)(struct irdma_sc_dev *dev, u32 *vf_cnt_array);
+};
+
+struct irdma_vsi_ops {
+	void (*vsi_update_stats)(struct irdma_sc_vsi *vsi);
+};
+
+struct cqp_info {
+	union {
+		struct {
+			struct irdma_sc_qp *qp;
+			struct irdma_create_qp_info info;
+			u64 scratch;
+		} qp_create;
+
+		struct {
+			struct irdma_sc_qp *qp;
+			struct irdma_modify_qp_info info;
+			u64 scratch;
+		} qp_modify;
+
+		struct {
+			struct irdma_sc_qp *qp;
+			u64 scratch;
+			bool remove_hash_idx;
+			bool ignore_mw_bnd;
+		} qp_destroy;
+
+		struct {
+			struct irdma_sc_cq *cq;
+			u64 scratch;
+			bool check_overflow;
+		} cq_create;
+
+		struct {
+			struct irdma_sc_cq *cq;
+			struct irdma_modify_cq_info info;
+			u64 scratch;
+		} cq_modify;
+
+		struct {
+			struct irdma_sc_cq *cq;
+			u64 scratch;
+		} cq_destroy;
+
+		struct {
+			struct irdma_sc_dev *dev;
+			struct irdma_allocate_stag_info info;
+			u64 scratch;
+		} alloc_stag;
+
+		struct {
+			struct irdma_sc_dev *dev;
+			struct irdma_mw_alloc_info info;
+			u64 scratch;
+		} mw_alloc;
+
+		struct {
+			struct irdma_sc_dev *dev;
+			struct irdma_reg_ns_stag_info info;
+			u64 scratch;
+		} mr_reg_non_shared;
+
+		struct {
+			struct irdma_sc_dev *dev;
+			struct irdma_dealloc_stag_info info;
+			u64 scratch;
+		} dealloc_stag;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_add_arp_cache_entry_info info;
+			u64 scratch;
+		} add_arp_cache_entry;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			u64 scratch;
+			u16 arp_index;
+		} del_arp_cache_entry;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_local_mac_entry_info info;
+			u64 scratch;
+		} add_local_mac_entry;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			u64 scratch;
+			u8 entry_idx;
+			u8 ignore_ref_count;
+		} del_local_mac_entry;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			u64 scratch;
+		} alloc_local_mac_entry;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_cqp_manage_push_page_info info;
+			u64 scratch;
+		} manage_push_page;
+
+		struct {
+			struct irdma_sc_dev *dev;
+			struct irdma_upload_context_info info;
+			u64 scratch;
+		} qp_upload_context;
+
+		struct {
+			struct irdma_sc_dev *dev;
+			struct irdma_hmc_fcn_info info;
+			u64 scratch;
+		} manage_hmc_pm;
+
+		struct {
+			struct irdma_sc_ceq *ceq;
+			u64 scratch;
+		} ceq_create;
+
+		struct {
+			struct irdma_sc_ceq *ceq;
+			u64 scratch;
+		} ceq_destroy;
+
+		struct {
+			struct irdma_sc_aeq *aeq;
+			u64 scratch;
+		} aeq_create;
+
+		struct {
+			struct irdma_sc_aeq *aeq;
+			u64 scratch;
+		} aeq_destroy;
+
+		struct {
+			struct irdma_sc_qp *qp;
+			struct irdma_qp_flush_info info;
+			u64 scratch;
+		} qp_flush_wqes;
+
+		struct {
+			struct irdma_sc_qp *qp;
+			struct irdma_gen_ae_info info;
+			u64 scratch;
+		} gen_ae;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			void *fpm_val_va;
+			u64 fpm_val_pa;
+			u8 hmc_fn_id;
+			u64 scratch;
+		} query_fpm_val;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			void *fpm_val_va;
+			u64 fpm_val_pa;
+			u8 hmc_fn_id;
+			u64 scratch;
+		} commit_fpm_val;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_apbvt_info info;
+			u64 scratch;
+		} manage_apbvt_entry;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_qhash_table_info info;
+			u64 scratch;
+		} manage_qhash_table_entry;
+
+		struct {
+			struct irdma_sc_dev *dev;
+			struct irdma_update_sds_info info;
+			u64 scratch;
+		} update_pe_sds;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_sc_qp *qp;
+			u64 scratch;
+		} suspend_resume;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_ah_info info;
+			u64 scratch;
+		} ah_create;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_ah_info info;
+			u64 scratch;
+		} ah_destroy;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_mcast_grp_info info;
+			u64 scratch;
+		} mc_create;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_mcast_grp_info info;
+			u64 scratch;
+		} mc_destroy;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_mcast_grp_info info;
+			u64 scratch;
+		} mc_modify;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_stats_inst_info info;
+			u64 scratch;
+		} stats_manage;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_stats_gather_info info;
+			u64 scratch;
+		} stats_gather;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_ws_node_info info;
+			u64 scratch;
+		} ws_node;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_up_info info;
+			u64 scratch;
+		} up_map;
+
+		struct {
+			struct irdma_sc_cqp *cqp;
+			struct irdma_dma_mem query_buff_mem;
+			u64 scratch;
+		} query_rdma;
+	} u;
+};
+
+struct cqp_cmds_info {
+	struct list_head cqp_cmd_entry;
+	u8 cqp_cmd;
+	u8 post_sq;
+	struct cqp_info in;
+};
+
+#endif /* IRDMA_TYPE_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 11/23] RDMA/irdma: Add HMC backing store setup functions
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (9 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 10/23] RDMA/irdma: Implement HW Admin Queue OPs Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 12/23] RDMA/irdma: Add privileged UDA queue implementation Shiraz Saleem
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

HW uses host memory as a backing store for a number of
protocol context objects and queue state tracking.
The Host Memory Cache (HMC) is a component responsible for
managing these objects stored in host memory.

Add the functions and data structures to manage the allocation
of backing pages used by the HMC for the various objects

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/hmc.c | 739 ++++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/hmc.h | 180 ++++++++++
 2 files changed, 919 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/hmc.c
 create mode 100644 drivers/infiniband/hw/irdma/hmc.h

diff --git a/drivers/infiniband/hw/irdma/hmc.c b/drivers/infiniband/hw/irdma/hmc.c
new file mode 100644
index 0000000..ff77a6a
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/hmc.c
@@ -0,0 +1,739 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "hmc.h"
+#include "defs.h"
+#include "type.h"
+#include "protos.h"
+
+/**
+ * irdma_find_sd_index_limit - finds segment descriptor index limit
+ * @hmc_info: pointer to the HMC configuration information structure
+ * @type: type of HMC resources we're searching
+ * @idx: starting index for the object
+ * @cnt: number of objects we're trying to create
+ * @sd_idx: pointer to return index of the segment descriptor in question
+ * @sd_limit: pointer to return the maximum number of segment descriptors
+ *
+ * This function calculates the segment descriptor index and index limit
+ * for the resource defined by irdma_hmc_rsrc_type.
+ */
+
+static void irdma_find_sd_index_limit(struct irdma_hmc_info *hmc_info, u32 type,
+				      u32 idx, u32 cnt, u32 *sd_idx,
+				      u32 *sd_limit)
+{
+	u64 fpm_addr, fpm_limit;
+
+	fpm_addr = hmc_info->hmc_obj[(type)].base +
+		   hmc_info->hmc_obj[type].size * idx;
+	fpm_limit = fpm_addr + hmc_info->hmc_obj[type].size * cnt;
+	*sd_idx = (u32)(fpm_addr / IRDMA_HMC_DIRECT_BP_SIZE);
+	*sd_limit = (u32)((fpm_limit - 1) / IRDMA_HMC_DIRECT_BP_SIZE);
+	*sd_limit += 1;
+}
+
+/**
+ * irdma_find_pd_index_limit - finds page descriptor index limit
+ * @hmc_info: pointer to the HMC configuration information struct
+ * @type: HMC resource type we're examining
+ * @idx: starting index for the object
+ * @cnt: number of objects we're trying to create
+ * @pd_idx: pointer to return page descriptor index
+ * @pd_limit: pointer to return page descriptor index limit
+ *
+ * Calculates the page descriptor index and index limit for the resource
+ * defined by irdma_hmc_rsrc_type.
+ */
+
+static void irdma_find_pd_index_limit(struct irdma_hmc_info *hmc_info, u32 type,
+				      u32 idx, u32 cnt, u32 *pd_idx,
+				      u32 *pd_limit)
+{
+	u64 fpm_adr, fpm_limit;
+
+	fpm_adr = hmc_info->hmc_obj[type].base +
+		  hmc_info->hmc_obj[type].size * idx;
+	fpm_limit = fpm_adr + (hmc_info)->hmc_obj[(type)].size * (cnt);
+	*pd_idx = (u32)(fpm_adr / IRDMA_HMC_PAGED_BP_SIZE);
+	*pd_limit = (u32)((fpm_limit - 1) / IRDMA_HMC_PAGED_BP_SIZE);
+	*pd_limit += 1;
+}
+
+/**
+ * irdma_set_sd_entry - setup entry for sd programming
+ * @pa: physical addr
+ * @idx: sd index
+ * @type: paged or direct sd
+ * @entry: sd entry ptr
+ */
+static void irdma_set_sd_entry(u64 pa, u32 idx, enum irdma_sd_entry_type type,
+			       struct irdma_update_sd_entry *entry)
+{
+	entry->data = pa |
+		      FIELD_PREP(IRDMA_PFHMC_SDDATALOW_PMSDBPCOUNT, IRDMA_HMC_MAX_BP_COUNT) |
+		      FIELD_PREP(IRDMA_PFHMC_SDDATALOW_PMSDTYPE,
+				 type == IRDMA_SD_TYPE_PAGED ? 0 : 1) |
+		      FIELD_PREP(IRDMA_PFHMC_SDDATALOW_PMSDVALID, 1);
+
+	entry->cmd = idx | FIELD_PREP(IRDMA_PFHMC_SDCMD_PMSDWR, 1) | BIT(15);
+}
+
+/**
+ * irdma_clr_sd_entry - setup entry for sd clear
+ * @idx: sd index
+ * @type: paged or direct sd
+ * @entry: sd entry ptr
+ */
+static void irdma_clr_sd_entry(u32 idx, enum irdma_sd_entry_type type,
+			       struct irdma_update_sd_entry *entry)
+{
+	entry->data = FIELD_PREP(IRDMA_PFHMC_SDDATALOW_PMSDBPCOUNT, IRDMA_HMC_MAX_BP_COUNT) |
+		      FIELD_PREP(IRDMA_PFHMC_SDDATALOW_PMSDTYPE,
+				 type == IRDMA_SD_TYPE_PAGED ? 0 : 1);
+
+	entry->cmd = idx | FIELD_PREP(IRDMA_PFHMC_SDCMD_PMSDWR, 1) | BIT(15);
+}
+
+/**
+ * irdma_invalidate_pf_hmc_pd - Invalidates the pd cache in the hardware for PF
+ * @dev: pointer to our device struct
+ * @sd_idx: segment descriptor index
+ * @pd_idx: page descriptor index
+ */
+static inline void irdma_invalidate_pf_hmc_pd(struct irdma_sc_dev *dev, u32 sd_idx,
+					      u32 pd_idx)
+{
+	u32 val = FIELD_PREP(IRDMA_PFHMC_PDINV_PMSDIDX, sd_idx) |
+		  FIELD_PREP(IRDMA_PFHMC_PDINV_PMSDPARTSEL, 1) |
+		  FIELD_PREP(IRDMA_PFHMC_PDINV_PMPDIDX, pd_idx);
+
+	writel(val, dev->hw_regs[IRDMA_PFHMC_PDINV]);
+}
+
+/**
+ * irdma_invalidate_vf_hmc_pd - Invalidates the pd cache in the hardware for VF
+ * @dev: pointer to our device struct
+ * @sd_idx: segment descriptor index
+ * @pd_idx: page descriptor index
+ * @hmc_fn_id: VF's function id
+ */
+static inline void irdma_invalidate_vf_hmc_pd(struct irdma_sc_dev *dev, u32 sd_idx,
+					      u32 pd_idx, u8 hmc_fn_id)
+{
+	u32 val = FIELD_PREP(IRDMA_PFHMC_PDINV_PMSDIDX, sd_idx) |
+		  FIELD_PREP(IRDMA_PFHMC_PDINV_PMPDIDX, pd_idx);
+
+	writel(val,
+	       dev->hw_regs[IRDMA_GLHMC_VFPDINV] + hmc_fn_id - dev->hw_attrs.first_hw_vf_fpm_id);
+}
+
+/**
+ * irdma_hmc_sd_one - setup 1 sd entry for cqp
+ * @dev: pointer to the device structure
+ * @hmc_fn_id: hmc's function id
+ * @pa: physical addr
+ * @sd_idx: sd index
+ * @type: paged or direct sd
+ * @setsd: flag to set or clear sd
+ */
+enum irdma_status_code irdma_hmc_sd_one(struct irdma_sc_dev *dev, u8 hmc_fn_id,
+					u64 pa, u32 sd_idx,
+					enum irdma_sd_entry_type type,
+					bool setsd)
+{
+	struct irdma_update_sds_info sdinfo;
+
+	sdinfo.cnt = 1;
+	sdinfo.hmc_fn_id = hmc_fn_id;
+	if (setsd)
+		irdma_set_sd_entry(pa, sd_idx, type, sdinfo.entry);
+	else
+		irdma_clr_sd_entry(sd_idx, type, sdinfo.entry);
+	return dev->cqp->process_cqp_sds(dev, &sdinfo);
+}
+
+/**
+ * irdma_hmc_sd_grp - setup group of sd entries for cqp
+ * @dev: pointer to the device structure
+ * @hmc_info: pointer to the HMC configuration information struct
+ * @sd_index: sd index
+ * @sd_cnt: number of sd entries
+ * @setsd: flag to set or clear sd
+ */
+static enum irdma_status_code irdma_hmc_sd_grp(struct irdma_sc_dev *dev,
+					       struct irdma_hmc_info *hmc_info,
+					       u32 sd_index, u32 sd_cnt,
+					       bool setsd)
+{
+	struct irdma_hmc_sd_entry *sd_entry;
+	struct irdma_update_sds_info sdinfo = {};
+	u64 pa;
+	u32 i;
+	enum irdma_status_code ret_code = 0;
+
+	sdinfo.hmc_fn_id = hmc_info->hmc_fn_id;
+	for (i = sd_index; i < sd_index + sd_cnt; i++) {
+		sd_entry = &hmc_info->sd_table.sd_entry[i];
+		if (!sd_entry || (!sd_entry->valid && setsd) ||
+		    (sd_entry->valid && !setsd))
+			continue;
+		if (setsd) {
+			pa = (sd_entry->entry_type == IRDMA_SD_TYPE_PAGED) ?
+				     sd_entry->u.pd_table.pd_page_addr.pa :
+				     sd_entry->u.bp.addr.pa;
+			irdma_set_sd_entry(pa, i, sd_entry->entry_type,
+					   &sdinfo.entry[sdinfo.cnt]);
+		} else {
+			irdma_clr_sd_entry(i, sd_entry->entry_type,
+					   &sdinfo.entry[sdinfo.cnt]);
+		}
+		sdinfo.cnt++;
+		if (sdinfo.cnt == IRDMA_MAX_SD_ENTRIES) {
+			ret_code = dev->cqp->process_cqp_sds(dev, &sdinfo);
+			if (ret_code) {
+				ibdev_dbg(to_ibdev(dev),
+					  "HMC: sd_programming failed err=%d\n",
+					  ret_code);
+				return ret_code;
+			}
+
+			sdinfo.cnt = 0;
+		}
+	}
+	if (sdinfo.cnt)
+		ret_code = dev->cqp->process_cqp_sds(dev, &sdinfo);
+
+	return ret_code;
+}
+
+/**
+ * irdma_hmc_finish_add_sd_reg - program sd entries for objects
+ * @dev: pointer to the device structure
+ * @info: create obj info
+ */
+static enum irdma_status_code
+irdma_hmc_finish_add_sd_reg(struct irdma_sc_dev *dev,
+			    struct irdma_hmc_create_obj_info *info)
+{
+	if (info->start_idx >= info->hmc_info->hmc_obj[info->rsrc_type].cnt)
+		return IRDMA_ERR_INVALID_HMC_OBJ_INDEX;
+
+	if ((info->start_idx + info->count) >
+	    info->hmc_info->hmc_obj[info->rsrc_type].cnt)
+		return IRDMA_ERR_INVALID_HMC_OBJ_COUNT;
+
+	if (!info->add_sd_cnt)
+		return 0;
+	return irdma_hmc_sd_grp(dev, info->hmc_info,
+				info->hmc_info->sd_indexes[0], info->add_sd_cnt,
+				true);
+}
+
+/**
+ * irdma_sc_create_hmc_obj - allocate backing store for hmc objects
+ * @dev: pointer to the device structure
+ * @info: pointer to irdma_hmc_create_obj_info struct
+ *
+ * This will allocate memory for PDs and backing pages and populate
+ * the sd and pd entries.
+ */
+enum irdma_status_code
+irdma_sc_create_hmc_obj(struct irdma_sc_dev *dev,
+			struct irdma_hmc_create_obj_info *info)
+{
+	struct irdma_hmc_sd_entry *sd_entry;
+	u32 sd_idx, sd_lmt;
+	u32 pd_idx = 0, pd_lmt = 0;
+	u32 pd_idx1 = 0, pd_lmt1 = 0;
+	u32 i, j;
+	bool pd_error = false;
+	enum irdma_status_code ret_code = 0;
+
+	if (info->start_idx >= info->hmc_info->hmc_obj[info->rsrc_type].cnt)
+		return IRDMA_ERR_INVALID_HMC_OBJ_INDEX;
+
+	if ((info->start_idx + info->count) >
+	    info->hmc_info->hmc_obj[info->rsrc_type].cnt) {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: error type %u, start = %u, req cnt %u, cnt = %u\n",
+			  info->rsrc_type, info->start_idx, info->count,
+			  info->hmc_info->hmc_obj[info->rsrc_type].cnt);
+		return IRDMA_ERR_INVALID_HMC_OBJ_COUNT;
+	}
+
+	irdma_find_sd_index_limit(info->hmc_info, info->rsrc_type,
+				  info->start_idx, info->count, &sd_idx,
+				  &sd_lmt);
+	if (sd_idx >= info->hmc_info->sd_table.sd_cnt ||
+	    sd_lmt > info->hmc_info->sd_table.sd_cnt) {
+		return IRDMA_ERR_INVALID_SD_INDEX;
+	}
+
+	irdma_find_pd_index_limit(info->hmc_info, info->rsrc_type,
+				  info->start_idx, info->count, &pd_idx,
+				  &pd_lmt);
+
+	for (j = sd_idx; j < sd_lmt; j++) {
+		ret_code = irdma_add_sd_table_entry(dev->hw, info->hmc_info, j,
+						    info->entry_type,
+						    IRDMA_HMC_DIRECT_BP_SIZE);
+		if (ret_code)
+			goto exit_sd_error;
+
+		sd_entry = &info->hmc_info->sd_table.sd_entry[j];
+		if (sd_entry->entry_type == IRDMA_SD_TYPE_PAGED &&
+		    (dev->hmc_info == info->hmc_info &&
+		     info->rsrc_type != IRDMA_HMC_IW_PBLE)) {
+			pd_idx1 = max(pd_idx, (j * IRDMA_HMC_MAX_BP_COUNT));
+			pd_lmt1 = min(pd_lmt, (j + 1) * IRDMA_HMC_MAX_BP_COUNT);
+			for (i = pd_idx1; i < pd_lmt1; i++) {
+				/* update the pd table entry */
+				ret_code = irdma_add_pd_table_entry(dev,
+								    info->hmc_info,
+								    i, NULL);
+				if (ret_code) {
+					pd_error = true;
+					break;
+				}
+			}
+			if (pd_error) {
+				while (i && (i > pd_idx1)) {
+					irdma_remove_pd_bp(dev, info->hmc_info,
+							   i - 1);
+					i--;
+				}
+			}
+		}
+		if (sd_entry->valid)
+			continue;
+
+		info->hmc_info->sd_indexes[info->add_sd_cnt] = (u16)j;
+		info->add_sd_cnt++;
+		sd_entry->valid = true;
+	}
+	return irdma_hmc_finish_add_sd_reg(dev, info);
+
+exit_sd_error:
+	while (j && (j > sd_idx)) {
+		sd_entry = &info->hmc_info->sd_table.sd_entry[j - 1];
+		switch (sd_entry->entry_type) {
+		case IRDMA_SD_TYPE_PAGED:
+			pd_idx1 = max(pd_idx, (j - 1) * IRDMA_HMC_MAX_BP_COUNT);
+			pd_lmt1 = min(pd_lmt, (j * IRDMA_HMC_MAX_BP_COUNT));
+			for (i = pd_idx1; i < pd_lmt1; i++)
+				irdma_prep_remove_pd_page(info->hmc_info, i);
+			break;
+		case IRDMA_SD_TYPE_DIRECT:
+			irdma_prep_remove_pd_page(info->hmc_info, (j - 1));
+			break;
+		default:
+			ret_code = IRDMA_ERR_INVALID_SD_TYPE;
+			break;
+		}
+		j--;
+	}
+
+	return ret_code;
+}
+
+/**
+ * irdma_finish_del_sd_reg - delete sd entries for objects
+ * @dev: pointer to the device structure
+ * @info: dele obj info
+ * @reset: true if called before reset
+ */
+static enum irdma_status_code
+irdma_finish_del_sd_reg(struct irdma_sc_dev *dev,
+			struct irdma_hmc_del_obj_info *info, bool reset)
+{
+	struct irdma_hmc_sd_entry *sd_entry;
+	enum irdma_status_code ret_code = 0;
+	u32 i, sd_idx;
+	struct irdma_dma_mem *mem;
+
+	if (dev->privileged && !reset)
+		ret_code = irdma_hmc_sd_grp(dev, info->hmc_info,
+					    info->hmc_info->sd_indexes[0],
+					    info->del_sd_cnt, false);
+
+	if (ret_code)
+		ibdev_dbg(to_ibdev(dev), "HMC: error cqp sd sd_grp\n");
+	for (i = 0; i < info->del_sd_cnt; i++) {
+		sd_idx = info->hmc_info->sd_indexes[i];
+		sd_entry = &info->hmc_info->sd_table.sd_entry[sd_idx];
+		mem = (sd_entry->entry_type == IRDMA_SD_TYPE_PAGED) ?
+			      &sd_entry->u.pd_table.pd_page_addr :
+			      &sd_entry->u.bp.addr;
+
+		if (!mem || !mem->va) {
+			ibdev_dbg(to_ibdev(dev), "HMC: error cqp sd mem\n");
+		} else {
+			dma_free_coherent(dev->hw->device, mem->size, mem->va,
+					  mem->pa);
+			mem->va = NULL;
+		}
+	}
+
+	return ret_code;
+}
+
+/**
+ * irdma_sc_del_hmc_obj - remove pe hmc objects
+ * @dev: pointer to the device structure
+ * @info: pointer to irdma_hmc_del_obj_info struct
+ * @reset: true if called before reset
+ *
+ * This will de-populate the SDs and PDs.  It frees
+ * the memory for PDS and backing storage.  After this function is returned,
+ * caller should deallocate memory allocated previously for
+ * book-keeping information about PDs and backing storage.
+ */
+enum irdma_status_code irdma_sc_del_hmc_obj(struct irdma_sc_dev *dev,
+					    struct irdma_hmc_del_obj_info *info,
+					    bool reset)
+{
+	struct irdma_hmc_pd_table *pd_table;
+	u32 sd_idx, sd_lmt;
+	u32 pd_idx, pd_lmt, rel_pd_idx;
+	u32 i, j;
+	enum irdma_status_code ret_code = 0;
+
+	if (info->start_idx >= info->hmc_info->hmc_obj[info->rsrc_type].cnt) {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: error start_idx[%04d]  >= [type %04d].cnt[%04d]\n",
+			  info->start_idx, info->rsrc_type,
+			  info->hmc_info->hmc_obj[info->rsrc_type].cnt);
+		return IRDMA_ERR_INVALID_HMC_OBJ_INDEX;
+	}
+
+	if ((info->start_idx + info->count) >
+	    info->hmc_info->hmc_obj[info->rsrc_type].cnt) {
+		ibdev_dbg(to_ibdev(dev),
+			  "HMC: error start_idx[%04d] + count %04d  >= [type %04d].cnt[%04d]\n",
+			  info->start_idx, info->count, info->rsrc_type,
+			  info->hmc_info->hmc_obj[info->rsrc_type].cnt);
+		return IRDMA_ERR_INVALID_HMC_OBJ_COUNT;
+	}
+
+	irdma_find_pd_index_limit(info->hmc_info, info->rsrc_type,
+				  info->start_idx, info->count, &pd_idx,
+				  &pd_lmt);
+
+	for (j = pd_idx; j < pd_lmt; j++) {
+		sd_idx = j / IRDMA_HMC_PD_CNT_IN_SD;
+
+		if (!info->hmc_info->sd_table.sd_entry[sd_idx].valid)
+			continue;
+
+		if (info->hmc_info->sd_table.sd_entry[sd_idx].entry_type !=
+		    IRDMA_SD_TYPE_PAGED)
+			continue;
+
+		rel_pd_idx = j % IRDMA_HMC_PD_CNT_IN_SD;
+		pd_table = &info->hmc_info->sd_table.sd_entry[sd_idx].u.pd_table;
+		if (pd_table->pd_entry &&
+		    pd_table->pd_entry[rel_pd_idx].valid) {
+			ret_code = irdma_remove_pd_bp(dev, info->hmc_info, j);
+			if (ret_code) {
+				ibdev_dbg(to_ibdev(dev),
+					  "HMC: remove_pd_bp error\n");
+				return ret_code;
+			}
+		}
+	}
+
+	irdma_find_sd_index_limit(info->hmc_info, info->rsrc_type,
+				  info->start_idx, info->count, &sd_idx,
+				  &sd_lmt);
+	if (sd_idx >= info->hmc_info->sd_table.sd_cnt ||
+	    sd_lmt > info->hmc_info->sd_table.sd_cnt) {
+		ibdev_dbg(to_ibdev(dev), "HMC: invalid sd_idx\n");
+		return IRDMA_ERR_INVALID_SD_INDEX;
+	}
+
+	for (i = sd_idx; i < sd_lmt; i++) {
+		pd_table = &info->hmc_info->sd_table.sd_entry[i].u.pd_table;
+		if (!info->hmc_info->sd_table.sd_entry[i].valid)
+			continue;
+		switch (info->hmc_info->sd_table.sd_entry[i].entry_type) {
+		case IRDMA_SD_TYPE_DIRECT:
+			ret_code = irdma_prep_remove_sd_bp(info->hmc_info, i);
+			if (!ret_code) {
+				info->hmc_info->sd_indexes[info->del_sd_cnt] =
+					(u16)i;
+				info->del_sd_cnt++;
+			}
+			break;
+		case IRDMA_SD_TYPE_PAGED:
+			ret_code = irdma_prep_remove_pd_page(info->hmc_info, i);
+			if (ret_code)
+				break;
+			if (dev->hmc_info != info->hmc_info &&
+			    info->rsrc_type == IRDMA_HMC_IW_PBLE &&
+			    pd_table->pd_entry) {
+				kfree(pd_table->pd_entry_virt_mem.va);
+				pd_table->pd_entry = NULL;
+			}
+			info->hmc_info->sd_indexes[info->del_sd_cnt] = (u16)i;
+			info->del_sd_cnt++;
+			break;
+		default:
+			break;
+		}
+	}
+	return irdma_finish_del_sd_reg(dev, info, reset);
+}
+
+/**
+ * irdma_add_sd_table_entry - Adds a segment descriptor to the table
+ * @hw: pointer to our hw struct
+ * @hmc_info: pointer to the HMC configuration information struct
+ * @sd_index: segment descriptor index to manipulate
+ * @type: what type of segment descriptor we're manipulating
+ * @direct_mode_sz: size to alloc in direct mode
+ */
+enum irdma_status_code irdma_add_sd_table_entry(struct irdma_hw *hw,
+						struct irdma_hmc_info *hmc_info,
+						u32 sd_index,
+						enum irdma_sd_entry_type type,
+						u64 direct_mode_sz)
+{
+	struct irdma_hmc_sd_entry *sd_entry;
+	struct irdma_dma_mem dma_mem;
+	u64 alloc_len;
+
+	sd_entry = &hmc_info->sd_table.sd_entry[sd_index];
+	if (!sd_entry->valid) {
+		if (type == IRDMA_SD_TYPE_PAGED)
+			alloc_len = IRDMA_HMC_PAGED_BP_SIZE;
+		else
+			alloc_len = direct_mode_sz;
+
+		/* allocate a 4K pd page or 2M backing page */
+		dma_mem.size = ALIGN(alloc_len, IRDMA_HMC_PD_BP_BUF_ALIGNMENT);
+		dma_mem.va = dma_alloc_coherent(hw->device, dma_mem.size,
+						&dma_mem.pa, GFP_KERNEL);
+		if (!dma_mem.va)
+			return IRDMA_ERR_NO_MEMORY;
+		if (type == IRDMA_SD_TYPE_PAGED) {
+			struct irdma_virt_mem *vmem =
+				&sd_entry->u.pd_table.pd_entry_virt_mem;
+
+			vmem->size = sizeof(struct irdma_hmc_pd_entry) * 512;
+			vmem->va = kzalloc(vmem->size, GFP_KERNEL);
+			if (!vmem->va) {
+				dma_free_coherent(hw->device, dma_mem.size,
+						  dma_mem.va, dma_mem.pa);
+				dma_mem.va = NULL;
+				return IRDMA_ERR_NO_MEMORY;
+			}
+			sd_entry->u.pd_table.pd_entry = vmem->va;
+
+			memcpy(&sd_entry->u.pd_table.pd_page_addr, &dma_mem,
+			       sizeof(sd_entry->u.pd_table.pd_page_addr));
+		} else {
+			memcpy(&sd_entry->u.bp.addr, &dma_mem,
+			       sizeof(sd_entry->u.bp.addr));
+
+			sd_entry->u.bp.sd_pd_index = sd_index;
+		}
+
+		hmc_info->sd_table.sd_entry[sd_index].entry_type = type;
+		hmc_info->sd_table.use_cnt++;
+	}
+	if (sd_entry->entry_type == IRDMA_SD_TYPE_DIRECT)
+		sd_entry->u.bp.use_cnt++;
+
+	return 0;
+}
+
+/**
+ * irdma_add_pd_table_entry - Adds page descriptor to the specified table
+ * @dev: pointer to our device structure
+ * @hmc_info: pointer to the HMC configuration information structure
+ * @pd_index: which page descriptor index to manipulate
+ * @rsrc_pg: if not NULL, use preallocated page instead of allocating new one.
+ *
+ * This function:
+ *	1. Initializes the pd entry
+ *	2. Adds pd_entry in the pd_table
+ *	3. Mark the entry valid in irdma_hmc_pd_entry structure
+ *	4. Initializes the pd_entry's ref count to 1
+ * assumptions:
+ *	1. The memory for pd should be pinned down, physically contiguous and
+ *	   aligned on 4K boundary and zeroed memory.
+ *	2. It should be 4K in size.
+ */
+enum irdma_status_code irdma_add_pd_table_entry(struct irdma_sc_dev *dev,
+						struct irdma_hmc_info *hmc_info,
+						u32 pd_index,
+						struct irdma_dma_mem *rsrc_pg)
+{
+	struct irdma_hmc_pd_table *pd_table;
+	struct irdma_hmc_pd_entry *pd_entry;
+	struct irdma_dma_mem mem;
+	struct irdma_dma_mem *page = &mem;
+	u32 sd_idx, rel_pd_idx;
+	u64 *pd_addr;
+	u64 page_desc;
+
+	if (pd_index / IRDMA_HMC_PD_CNT_IN_SD >= hmc_info->sd_table.sd_cnt)
+		return IRDMA_ERR_INVALID_PAGE_DESC_INDEX;
+
+	sd_idx = (pd_index / IRDMA_HMC_PD_CNT_IN_SD);
+	if (hmc_info->sd_table.sd_entry[sd_idx].entry_type !=
+	    IRDMA_SD_TYPE_PAGED)
+		return 0;
+
+	rel_pd_idx = (pd_index % IRDMA_HMC_PD_CNT_IN_SD);
+	pd_table = &hmc_info->sd_table.sd_entry[sd_idx].u.pd_table;
+	pd_entry = &pd_table->pd_entry[rel_pd_idx];
+	if (!pd_entry->valid) {
+		if (rsrc_pg) {
+			pd_entry->rsrc_pg = true;
+			page = rsrc_pg;
+		} else {
+			page->size = ALIGN(IRDMA_HMC_PAGED_BP_SIZE,
+					   IRDMA_HMC_PD_BP_BUF_ALIGNMENT);
+			page->va = dma_alloc_coherent(dev->hw->device,
+						      page->size, &page->pa,
+						      GFP_KERNEL);
+			if (!page->va)
+				return IRDMA_ERR_NO_MEMORY;
+
+			pd_entry->rsrc_pg = false;
+		}
+
+		memcpy(&pd_entry->bp.addr, page, sizeof(pd_entry->bp.addr));
+		pd_entry->bp.sd_pd_index = pd_index;
+		pd_entry->bp.entry_type = IRDMA_SD_TYPE_PAGED;
+		page_desc = page->pa | 0x1;
+		pd_addr = pd_table->pd_page_addr.va;
+		pd_addr += rel_pd_idx;
+		memcpy(pd_addr, &page_desc, sizeof(*pd_addr));
+		pd_entry->sd_index = sd_idx;
+		pd_entry->valid = true;
+		pd_table->use_cnt++;
+		if (hmc_info->hmc_fn_id < dev->hw_attrs.first_hw_vf_fpm_id &&
+		    dev->privileged)
+			irdma_invalidate_pf_hmc_pd(dev, sd_idx, rel_pd_idx);
+		else if (dev->hw->hmc.hmc_fn_id != hmc_info->hmc_fn_id)
+			irdma_invalidate_vf_hmc_pd(dev, sd_idx, rel_pd_idx,
+						   hmc_info->hmc_fn_id);
+	}
+	pd_entry->bp.use_cnt++;
+
+	return 0;
+}
+
+/**
+ * irdma_remove_pd_bp - remove a backing page from a page descriptor
+ * @dev: pointer to our HW structure
+ * @hmc_info: pointer to the HMC configuration information structure
+ * @idx: the page index
+ *
+ * This function:
+ *	1. Marks the entry in pd table (for paged address mode) or in sd table
+ *	   (for direct address mode) invalid.
+ *	2. Write to register PMPDINV to invalidate the backing page in FV cache
+ *	3. Decrement the ref count for the pd _entry
+ * assumptions:
+ *	1. Caller can deallocate the memory used by backing storage after this
+ *	   function returns.
+ */
+enum irdma_status_code irdma_remove_pd_bp(struct irdma_sc_dev *dev,
+					  struct irdma_hmc_info *hmc_info,
+					  u32 idx)
+{
+	struct irdma_hmc_pd_entry *pd_entry;
+	struct irdma_hmc_pd_table *pd_table;
+	struct irdma_hmc_sd_entry *sd_entry;
+	u32 sd_idx, rel_pd_idx;
+	struct irdma_dma_mem *mem;
+	u64 *pd_addr;
+
+	sd_idx = idx / IRDMA_HMC_PD_CNT_IN_SD;
+	rel_pd_idx = idx % IRDMA_HMC_PD_CNT_IN_SD;
+	if (sd_idx >= hmc_info->sd_table.sd_cnt)
+		return IRDMA_ERR_INVALID_PAGE_DESC_INDEX;
+
+	sd_entry = &hmc_info->sd_table.sd_entry[sd_idx];
+	if (sd_entry->entry_type != IRDMA_SD_TYPE_PAGED)
+		return IRDMA_ERR_INVALID_SD_TYPE;
+
+	pd_table = &hmc_info->sd_table.sd_entry[sd_idx].u.pd_table;
+	pd_entry = &pd_table->pd_entry[rel_pd_idx];
+	if (--pd_entry->bp.use_cnt)
+		return 0;
+
+	pd_entry->valid = false;
+	pd_table->use_cnt--;
+	pd_addr = pd_table->pd_page_addr.va;
+	pd_addr += rel_pd_idx;
+	memset(pd_addr, 0, sizeof(u64));
+	if (dev->privileged) {
+		if (dev->hmc_fn_id == hmc_info->hmc_fn_id)
+			irdma_invalidate_pf_hmc_pd(dev, sd_idx, idx);
+
+		else
+			irdma_invalidate_vf_hmc_pd(dev, sd_idx, idx,
+						   hmc_info->hmc_fn_id);
+	}
+
+	if (!pd_entry->rsrc_pg) {
+		mem = &pd_entry->bp.addr;
+		if (!mem || !mem->va)
+			return IRDMA_ERR_PARAM;
+
+		dma_free_coherent(dev->hw->device, mem->size, mem->va,
+				  mem->pa);
+		mem->va = NULL;
+	}
+	if (!pd_table->use_cnt)
+		kfree(pd_table->pd_entry_virt_mem.va);
+
+	return 0;
+}
+
+/**
+ * irdma_prep_remove_sd_bp - Prepares to remove a backing page from a sd entry
+ * @hmc_info: pointer to the HMC configuration information structure
+ * @idx: the page index
+ */
+enum irdma_status_code irdma_prep_remove_sd_bp(struct irdma_hmc_info *hmc_info,
+					       u32 idx)
+{
+	struct irdma_hmc_sd_entry *sd_entry;
+
+	sd_entry = &hmc_info->sd_table.sd_entry[idx];
+	if (--sd_entry->u.bp.use_cnt)
+		return IRDMA_ERR_NOT_READY;
+
+	hmc_info->sd_table.use_cnt--;
+	sd_entry->valid = false;
+
+	return 0;
+}
+
+/**
+ * irdma_prep_remove_pd_page - Prepares to remove a PD page from sd entry.
+ * @hmc_info: pointer to the HMC configuration information structure
+ * @idx: segment descriptor index to find the relevant page descriptor
+ */
+enum irdma_status_code
+irdma_prep_remove_pd_page(struct irdma_hmc_info *hmc_info, u32 idx)
+{
+	struct irdma_hmc_sd_entry *sd_entry;
+
+	sd_entry = &hmc_info->sd_table.sd_entry[idx];
+
+	if (sd_entry->u.pd_table.use_cnt)
+		return IRDMA_ERR_NOT_READY;
+
+	sd_entry->valid = false;
+	hmc_info->sd_table.use_cnt--;
+
+	return 0;
+}
diff --git a/drivers/infiniband/hw/irdma/hmc.h b/drivers/infiniband/hw/irdma/hmc.h
new file mode 100644
index 0000000..e2139c7
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/hmc.h
@@ -0,0 +1,180 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2020 Intel Corporation */
+#ifndef IRDMA_HMC_H
+#define IRDMA_HMC_H
+
+#include "defs.h"
+
+#define IRDMA_HMC_MAX_BP_COUNT			512
+#define IRDMA_MAX_SD_ENTRIES			11
+#define IRDMA_HW_DBG_HMC_INVALID_BP_MARK	0xca
+#define IRDMA_HMC_INFO_SIGNATURE		0x484d5347
+#define IRDMA_HMC_PD_CNT_IN_SD			512
+#define IRDMA_HMC_DIRECT_BP_SIZE		0x200000
+#define IRDMA_HMC_MAX_SD_COUNT			8192
+#define IRDMA_HMC_PAGED_BP_SIZE			4096
+#define IRDMA_HMC_PD_BP_BUF_ALIGNMENT		4096
+#define IRDMA_FIRST_VF_FPM_ID			8
+#define FPM_MULTIPLIER				1024
+
+enum irdma_hmc_rsrc_type {
+	IRDMA_HMC_IW_QP		 = 0,
+	IRDMA_HMC_IW_CQ		 = 1,
+	IRDMA_HMC_IW_RESERVED	 = 2,
+	IRDMA_HMC_IW_HTE	 = 3,
+	IRDMA_HMC_IW_ARP	 = 4,
+	IRDMA_HMC_IW_APBVT_ENTRY = 5,
+	IRDMA_HMC_IW_MR		 = 6,
+	IRDMA_HMC_IW_XF		 = 7,
+	IRDMA_HMC_IW_XFFL	 = 8,
+	IRDMA_HMC_IW_Q1		 = 9,
+	IRDMA_HMC_IW_Q1FL	 = 10,
+	IRDMA_HMC_IW_TIMER       = 11,
+	IRDMA_HMC_IW_FSIMC       = 12,
+	IRDMA_HMC_IW_FSIAV       = 13,
+	IRDMA_HMC_IW_PBLE	 = 14,
+	IRDMA_HMC_IW_RRF	 = 15,
+	IRDMA_HMC_IW_RRFFL       = 16,
+	IRDMA_HMC_IW_HDR	 = 17,
+	IRDMA_HMC_IW_MD		 = 18,
+	IRDMA_HMC_IW_OOISC       = 19,
+	IRDMA_HMC_IW_OOISCFFL    = 20,
+	IRDMA_HMC_IW_MAX, /* Must be last entry */
+};
+
+enum irdma_sd_entry_type {
+	IRDMA_SD_TYPE_INVALID = 0,
+	IRDMA_SD_TYPE_PAGED   = 1,
+	IRDMA_SD_TYPE_DIRECT  = 2,
+};
+
+struct irdma_hmc_obj_info {
+	u64 base;
+	u32 max_cnt;
+	u32 cnt;
+	u64 size;
+};
+
+struct irdma_hmc_bp {
+	enum irdma_sd_entry_type entry_type;
+	struct irdma_dma_mem addr;
+	u32 sd_pd_index;
+	u32 use_cnt;
+};
+
+struct irdma_hmc_pd_entry {
+	struct irdma_hmc_bp bp;
+	u32 sd_index;
+	bool rsrc_pg:1;
+	bool valid:1;
+};
+
+struct irdma_hmc_pd_table {
+	struct irdma_dma_mem pd_page_addr;
+	struct irdma_hmc_pd_entry *pd_entry;
+	struct irdma_virt_mem pd_entry_virt_mem;
+	u32 use_cnt;
+	u32 sd_index;
+};
+
+struct irdma_hmc_sd_entry {
+	enum irdma_sd_entry_type entry_type;
+	bool valid;
+	union {
+		struct irdma_hmc_pd_table pd_table;
+		struct irdma_hmc_bp bp;
+	} u;
+};
+
+struct irdma_hmc_sd_table {
+	struct irdma_virt_mem addr;
+	u32 sd_cnt;
+	u32 use_cnt;
+	struct irdma_hmc_sd_entry *sd_entry;
+};
+
+struct irdma_hmc_info {
+	u32 signature;
+	u8 hmc_fn_id;
+	u16 first_sd_index;
+	struct irdma_hmc_obj_info *hmc_obj;
+	struct irdma_virt_mem hmc_obj_virt_mem;
+	struct irdma_hmc_sd_table sd_table;
+	u16 sd_indexes[IRDMA_HMC_MAX_SD_COUNT];
+};
+
+struct irdma_update_sd_entry {
+	u64 cmd;
+	u64 data;
+};
+
+struct irdma_update_sds_info {
+	u32 cnt;
+	u8 hmc_fn_id;
+	struct irdma_update_sd_entry entry[IRDMA_MAX_SD_ENTRIES];
+};
+
+struct irdma_ccq_cqe_info;
+struct irdma_hmc_fcn_info {
+	u32 vf_id;
+	u8 free_fcn;
+};
+
+struct irdma_hmc_create_obj_info {
+	struct irdma_hmc_info *hmc_info;
+	struct irdma_virt_mem add_sd_virt_mem;
+	u32 rsrc_type;
+	u32 start_idx;
+	u32 count;
+	u32 add_sd_cnt;
+	enum irdma_sd_entry_type entry_type;
+	bool privileged;
+};
+
+struct irdma_hmc_del_obj_info {
+	struct irdma_hmc_info *hmc_info;
+	struct irdma_virt_mem del_sd_virt_mem;
+	u32 rsrc_type;
+	u32 start_idx;
+	u32 count;
+	u32 del_sd_cnt;
+	bool privileged;
+};
+
+enum irdma_status_code irdma_copy_dma_mem(struct irdma_hw *hw, void *dest_buf,
+					  struct irdma_dma_mem *src_mem,
+					  u64 src_offset, u64 size);
+enum irdma_status_code
+irdma_sc_create_hmc_obj(struct irdma_sc_dev *dev,
+			struct irdma_hmc_create_obj_info *info);
+enum irdma_status_code irdma_sc_del_hmc_obj(struct irdma_sc_dev *dev,
+					    struct irdma_hmc_del_obj_info *info,
+					    bool reset);
+enum irdma_status_code irdma_hmc_sd_one(struct irdma_sc_dev *dev, u8 hmc_fn_id,
+					u64 pa, u32 sd_idx,
+					enum irdma_sd_entry_type type,
+					bool setsd);
+enum irdma_status_code
+irdma_update_sds_noccq(struct irdma_sc_dev *dev,
+		       struct irdma_update_sds_info *info);
+struct irdma_vfdev *irdma_vfdev_from_fpm(struct irdma_sc_dev *dev,
+					 u8 hmc_fn_id);
+struct irdma_hmc_info *irdma_vf_hmcinfo_from_fpm(struct irdma_sc_dev *dev,
+						 u8 hmc_fn_id);
+enum irdma_status_code irdma_add_sd_table_entry(struct irdma_hw *hw,
+						struct irdma_hmc_info *hmc_info,
+						u32 sd_index,
+						enum irdma_sd_entry_type type,
+						u64 direct_mode_sz);
+enum irdma_status_code irdma_add_pd_table_entry(struct irdma_sc_dev *dev,
+						struct irdma_hmc_info *hmc_info,
+						u32 pd_index,
+						struct irdma_dma_mem *rsrc_pg);
+enum irdma_status_code irdma_remove_pd_bp(struct irdma_sc_dev *dev,
+					  struct irdma_hmc_info *hmc_info,
+					  u32 idx);
+enum irdma_status_code irdma_prep_remove_sd_bp(struct irdma_hmc_info *hmc_info,
+					       u32 idx);
+enum irdma_status_code
+irdma_prep_remove_pd_page(struct irdma_hmc_info *hmc_info, u32 idx);
+#endif /* IRDMA_HMC_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 12/23] RDMA/irdma: Add privileged UDA queue implementation
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (10 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 11/23] RDMA/irdma: Add HMC backing store setup functions Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 13/23] RDMA/irdma: Add QoS definitions Shiraz Saleem
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Implement privileged UDA queues to handle iWARP connection
packets and receive exceptions.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/puda.c | 1749 ++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/puda.h |  194 ++++
 2 files changed, 1943 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/puda.c
 create mode 100644 drivers/infiniband/hw/irdma/puda.h

diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c
new file mode 100644
index 0000000..4bb8ee3
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/puda.c
@@ -0,0 +1,1749 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "hmc.h"
+#include "defs.h"
+#include "type.h"
+#include "protos.h"
+#include "puda.h"
+#include "ws.h"
+
+static void irdma_ieq_receive(struct irdma_sc_vsi *vsi,
+			      struct irdma_puda_buf *buf);
+static void irdma_ieq_tx_compl(struct irdma_sc_vsi *vsi, void *sqwrid);
+static void irdma_ilq_putback_rcvbuf(struct irdma_sc_qp *qp,
+				     struct irdma_puda_buf *buf, u32 wqe_idx);
+/**
+ * irdma_puda_get_listbuf - get buffer from puda list
+ * @list: list to use for buffers (ILQ or IEQ)
+ */
+static struct irdma_puda_buf *irdma_puda_get_listbuf(struct list_head *list)
+{
+	struct irdma_puda_buf *buf = NULL;
+
+	if (!list_empty(list)) {
+		buf = (struct irdma_puda_buf *)list->next;
+		list_del((struct list_head *)&buf->list);
+	}
+
+	return buf;
+}
+
+/**
+ * irdma_puda_get_bufpool - return buffer from resource
+ * @rsrc: resource to use for buffer
+ */
+struct irdma_puda_buf *irdma_puda_get_bufpool(struct irdma_puda_rsrc *rsrc)
+{
+	struct irdma_puda_buf *buf = NULL;
+	struct list_head *list = &rsrc->bufpool;
+	unsigned long flags;
+
+	spin_lock_irqsave(&rsrc->bufpool_lock, flags);
+	buf = irdma_puda_get_listbuf(list);
+	if (buf) {
+		rsrc->avail_buf_count--;
+		buf->vsi = rsrc->vsi;
+	} else {
+		rsrc->stats_buf_alloc_fail++;
+	}
+	spin_unlock_irqrestore(&rsrc->bufpool_lock, flags);
+
+	return buf;
+}
+
+/**
+ * irdma_puda_ret_bufpool - return buffer to rsrc list
+ * @rsrc: resource to use for buffer
+ * @buf: buffer to return to resource
+ */
+void irdma_puda_ret_bufpool(struct irdma_puda_rsrc *rsrc,
+			    struct irdma_puda_buf *buf)
+{
+	unsigned long flags;
+
+	buf->do_lpb = false;
+	spin_lock_irqsave(&rsrc->bufpool_lock, flags);
+	list_add(&buf->list, &rsrc->bufpool);
+	spin_unlock_irqrestore(&rsrc->bufpool_lock, flags);
+	rsrc->avail_buf_count++;
+}
+
+/**
+ * irdma_puda_post_recvbuf - set wqe for rcv buffer
+ * @rsrc: resource ptr
+ * @wqe_idx: wqe index to use
+ * @buf: puda buffer for rcv q
+ * @initial: flag if during init time
+ */
+static void irdma_puda_post_recvbuf(struct irdma_puda_rsrc *rsrc, u32 wqe_idx,
+				    struct irdma_puda_buf *buf, bool initial)
+{
+	__le64 *wqe;
+	struct irdma_sc_qp *qp = &rsrc->qp;
+	u64 offset24 = 0;
+
+	/* Synch buffer for use by device */
+	dma_sync_single_for_device(rsrc->dev->hw->device, buf->mem.pa,
+				   buf->mem.size, DMA_BIDIRECTIONAL);
+	qp->qp_uk.rq_wrid_array[wqe_idx] = (uintptr_t)buf;
+	wqe = qp->qp_uk.rq_base[wqe_idx].elem;
+	if (!initial)
+		get_64bit_val(wqe, 24, &offset24);
+
+	offset24 = (offset24) ? 0 : FIELD_PREP(IRDMAQPSQ_VALID, 1);
+
+	set_64bit_val(wqe, 16, 0);
+	set_64bit_val(wqe, 0, buf->mem.pa);
+	if (qp->qp_uk.uk_attrs->hw_rev == IRDMA_GEN_1) {
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_LEN, buf->mem.size));
+	} else {
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_FRAG_LEN, buf->mem.size) |
+			      offset24);
+	}
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, offset24);
+}
+
+/**
+ * irdma_puda_replenish_rq - post rcv buffers
+ * @rsrc: resource to use for buffer
+ * @initial: flag if during init time
+ */
+static enum irdma_status_code
+irdma_puda_replenish_rq(struct irdma_puda_rsrc *rsrc, bool initial)
+{
+	u32 i;
+	u32 invalid_cnt = rsrc->rxq_invalid_cnt;
+	struct irdma_puda_buf *buf = NULL;
+
+	for (i = 0; i < invalid_cnt; i++) {
+		buf = irdma_puda_get_bufpool(rsrc);
+		if (!buf)
+			return IRDMA_ERR_list_empty;
+		irdma_puda_post_recvbuf(rsrc, rsrc->rx_wqe_idx, buf, initial);
+		rsrc->rx_wqe_idx = ((rsrc->rx_wqe_idx + 1) % rsrc->rq_size);
+		rsrc->rxq_invalid_cnt--;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_puda_alloc_buf - allocate mem for buffer
+ * @dev: iwarp device
+ * @len: length of buffer
+ */
+static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev,
+						   u32 len)
+{
+	struct irdma_puda_buf *buf;
+	struct irdma_virt_mem buf_mem;
+
+	buf_mem.size = sizeof(struct irdma_puda_buf);
+	buf_mem.va = kzalloc(buf_mem.size, GFP_KERNEL);
+	if (!buf_mem.va)
+		return NULL;
+
+	buf = buf_mem.va;
+	buf->mem.size = len;
+	buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL);
+	if (!buf->mem.va)
+		goto free_virt;
+	buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va,
+				     buf->mem.size, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev->hw->device, buf->mem.pa)) {
+		kfree(buf->mem.va);
+		goto free_virt;
+	}
+
+	buf->buf_mem.va = buf_mem.va;
+	buf->buf_mem.size = buf_mem.size;
+
+	return buf;
+
+free_virt:
+	kfree(buf_mem.va);
+	return NULL;
+}
+
+/**
+ * irdma_puda_dele_buf - delete buffer back to system
+ * @dev: iwarp device
+ * @buf: buffer to free
+ */
+static void irdma_puda_dele_buf(struct irdma_sc_dev *dev,
+				struct irdma_puda_buf *buf)
+{
+	dma_unmap_single(dev->hw->device, buf->mem.pa, buf->mem.size,
+			 DMA_BIDIRECTIONAL);
+	kfree(buf->mem.va);
+	kfree(buf->buf_mem.va);
+}
+
+/**
+ * irdma_puda_get_next_send_wqe - return next wqe for processing
+ * @qp: puda qp for wqe
+ * @wqe_idx: wqe index for caller
+ */
+static __le64 *irdma_puda_get_next_send_wqe(struct irdma_qp_uk *qp,
+					    u32 *wqe_idx)
+{
+	__le64 *wqe = NULL;
+	enum irdma_status_code ret_code = 0;
+
+	*wqe_idx = IRDMA_RING_CURRENT_HEAD(qp->sq_ring);
+	if (!*wqe_idx)
+		qp->swqe_polarity = !qp->swqe_polarity;
+	IRDMA_RING_MOVE_HEAD(qp->sq_ring, ret_code);
+	if (ret_code)
+		return wqe;
+
+	wqe = qp->sq_base[*wqe_idx].elem;
+
+	return wqe;
+}
+
+/**
+ * irdma_puda_poll_info - poll cq for completion
+ * @cq: cq for poll
+ * @info: info return for successful completion
+ */
+static enum irdma_status_code
+irdma_puda_poll_info(struct irdma_sc_cq *cq, struct irdma_puda_cmpl_info *info)
+{
+	struct irdma_cq_uk *cq_uk = &cq->cq_uk;
+	u64 qword0, qword2, qword3, qword6;
+	__le64 *cqe;
+	__le64 *ext_cqe = NULL;
+	u64 qword7 = 0;
+	u64 comp_ctx;
+	bool valid_bit;
+	bool ext_valid = 0;
+	u32 major_err, minor_err;
+	u32 peek_head;
+	bool error;
+	u8 polarity;
+
+	cqe = IRDMA_GET_CURRENT_CQ_ELEM(&cq->cq_uk);
+	get_64bit_val(cqe, 24, &qword3);
+	valid_bit = (bool)FIELD_GET(IRDMA_CQ_VALID, qword3);
+	if (valid_bit != cq_uk->polarity)
+		return IRDMA_ERR_Q_EMPTY;
+
+	if (cq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+		ext_valid = (bool)FIELD_GET(IRDMA_CQ_EXTCQE, qword3);
+
+	if (ext_valid) {
+		peek_head = (cq_uk->cq_ring.head + 1) % cq_uk->cq_ring.size;
+		ext_cqe = cq_uk->cq_base[peek_head].buf;
+		get_64bit_val(ext_cqe, 24, &qword7);
+		polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword7);
+		if (!peek_head)
+			polarity ^= 1;
+		if (polarity != cq_uk->polarity)
+			return IRDMA_ERR_Q_EMPTY;
+
+		IRDMA_RING_MOVE_HEAD_NOCHECK(cq_uk->cq_ring);
+		if (!IRDMA_RING_CURRENT_HEAD(cq_uk->cq_ring))
+			cq_uk->polarity = !cq_uk->polarity;
+		/* update cq tail in cq shadow memory also */
+		IRDMA_RING_MOVE_TAIL(cq_uk->cq_ring);
+	}
+
+	print_hex_dump_debug("PUDA: PUDA CQE", DUMP_PREFIX_OFFSET, 16, 8, cqe,
+			     32, false);
+	if (ext_valid)
+		print_hex_dump_debug("PUDA: PUDA EXT-CQE", DUMP_PREFIX_OFFSET,
+				     16, 8, ext_cqe, 32, false);
+
+	error = (bool)FIELD_GET(IRDMA_CQ_ERROR, qword3);
+	if (error) {
+		ibdev_dbg(to_ibdev(cq->dev), "PUDA: receive error\n");
+		major_err = (u32)(FIELD_GET(IRDMA_CQ_MAJERR, qword3));
+		minor_err = (u32)(FIELD_GET(IRDMA_CQ_MINERR, qword3));
+		info->compl_error = major_err << 16 | minor_err;
+		return IRDMA_ERR_CQ_COMPL_ERROR;
+	}
+
+	get_64bit_val(cqe, 0, &qword0);
+	get_64bit_val(cqe, 16, &qword2);
+
+	info->q_type = (u8)FIELD_GET(IRDMA_CQ_SQ, qword3);
+	info->qp_id = (u32)FIELD_GET(IRDMACQ_QPID, qword2);
+	if (cq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+		info->ipv4 = (bool)FIELD_GET(IRDMACQ_IPV4, qword3);
+
+	get_64bit_val(cqe, 8, &comp_ctx);
+	info->qp = (struct irdma_qp_uk *)(unsigned long)comp_ctx;
+	info->wqe_idx = (u32)FIELD_GET(IRDMA_CQ_WQEIDX, qword3);
+
+	if (info->q_type == IRDMA_CQE_QTYPE_RQ) {
+		if (ext_valid) {
+			info->vlan_valid = (bool)FIELD_GET(IRDMA_CQ_UDVLANVALID, qword7);
+			if (info->vlan_valid) {
+				get_64bit_val(ext_cqe, 16, &qword6);
+				info->vlan = (u16)FIELD_GET(IRDMA_CQ_UDVLAN, qword6);
+			}
+			info->smac_valid = (bool)FIELD_GET(IRDMA_CQ_UDSMACVALID, qword7);
+			if (info->smac_valid) {
+				get_64bit_val(ext_cqe, 16, &qword6);
+				info->smac[0] = (u8)((qword6 >> 40) & 0xFF);
+				info->smac[1] = (u8)((qword6 >> 32) & 0xFF);
+				info->smac[2] = (u8)((qword6 >> 24) & 0xFF);
+				info->smac[3] = (u8)((qword6 >> 16) & 0xFF);
+				info->smac[4] = (u8)((qword6 >> 8) & 0xFF);
+				info->smac[5] = (u8)(qword6 & 0xFF);
+			}
+		}
+
+		if (cq->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1) {
+			info->vlan_valid = (bool)FIELD_GET(IRDMA_VLAN_TAG_VALID, qword3);
+			info->l4proto = (u8)FIELD_GET(IRDMA_UDA_L4PROTO, qword2);
+			info->l3proto = (u8)FIELD_GET(IRDMA_UDA_L3PROTO, qword2);
+		}
+
+		info->payload_len = (u32)FIELD_GET(IRDMACQ_PAYLDLEN, qword0);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_puda_poll_cmpl - processes completion for cq
+ * @dev: iwarp device
+ * @cq: cq getting interrupt
+ * @compl_err: return any completion err
+ */
+enum irdma_status_code irdma_puda_poll_cmpl(struct irdma_sc_dev *dev,
+					    struct irdma_sc_cq *cq,
+					    u32 *compl_err)
+{
+	struct irdma_qp_uk *qp;
+	struct irdma_cq_uk *cq_uk = &cq->cq_uk;
+	struct irdma_puda_cmpl_info info = {};
+	enum irdma_status_code ret = 0;
+	struct irdma_puda_buf *buf;
+	struct irdma_puda_rsrc *rsrc;
+	u8 cq_type = cq->cq_type;
+	unsigned long flags;
+
+	if (cq_type == IRDMA_CQ_TYPE_ILQ || cq_type == IRDMA_CQ_TYPE_IEQ) {
+		rsrc = (cq_type == IRDMA_CQ_TYPE_ILQ) ? cq->vsi->ilq :
+							cq->vsi->ieq;
+	} else {
+		ibdev_dbg(to_ibdev(dev), "PUDA: qp_type error\n");
+		return IRDMA_ERR_BAD_PTR;
+	}
+
+	ret = irdma_puda_poll_info(cq, &info);
+	*compl_err = info.compl_error;
+	if (ret == IRDMA_ERR_Q_EMPTY)
+		return ret;
+	if (ret)
+		goto done;
+
+	qp = info.qp;
+	if (!qp || !rsrc) {
+		ret = IRDMA_ERR_BAD_PTR;
+		goto done;
+	}
+
+	if (qp->qp_id != rsrc->qp_id) {
+		ret = IRDMA_ERR_BAD_PTR;
+		goto done;
+	}
+
+	if (info.q_type == IRDMA_CQE_QTYPE_RQ) {
+		buf = (struct irdma_puda_buf *)(uintptr_t)
+			      qp->rq_wrid_array[info.wqe_idx];
+
+		/* reusing so synch the buffer for CPU use */
+		dma_sync_single_for_cpu(dev->hw->device, buf->mem.pa,
+					buf->mem.size, DMA_BIDIRECTIONAL);
+		/* Get all the tcpip information in the buf header */
+		ret = irdma_puda_get_tcpip_info(&info, buf);
+		if (ret) {
+			rsrc->stats_rcvd_pkt_err++;
+			if (cq_type == IRDMA_CQ_TYPE_ILQ) {
+				irdma_ilq_putback_rcvbuf(&rsrc->qp, buf,
+							 info.wqe_idx);
+			} else {
+				irdma_puda_ret_bufpool(rsrc, buf);
+				irdma_puda_replenish_rq(rsrc, false);
+			}
+			goto done;
+		}
+
+		rsrc->stats_pkt_rcvd++;
+		rsrc->compl_rxwqe_idx = info.wqe_idx;
+		ibdev_dbg(to_ibdev(dev), "PUDA: RQ completion\n");
+		rsrc->receive(rsrc->vsi, buf);
+		if (cq_type == IRDMA_CQ_TYPE_ILQ)
+			irdma_ilq_putback_rcvbuf(&rsrc->qp, buf, info.wqe_idx);
+		else
+			irdma_puda_replenish_rq(rsrc, false);
+
+	} else {
+		ibdev_dbg(to_ibdev(dev), "PUDA: SQ completion\n");
+		buf = (struct irdma_puda_buf *)(uintptr_t)
+					qp->sq_wrtrk_array[info.wqe_idx].wrid;
+
+		/* reusing so synch the buffer for CPU use */
+		dma_sync_single_for_cpu(dev->hw->device, buf->mem.pa,
+					buf->mem.size, DMA_BIDIRECTIONAL);
+		IRDMA_RING_SET_TAIL(qp->sq_ring, info.wqe_idx);
+		rsrc->xmit_complete(rsrc->vsi, buf);
+		spin_lock_irqsave(&rsrc->bufpool_lock, flags);
+		rsrc->tx_wqe_avail_cnt++;
+		spin_unlock_irqrestore(&rsrc->bufpool_lock, flags);
+		if (!list_empty(&rsrc->txpend))
+			irdma_puda_send_buf(rsrc, NULL);
+	}
+
+done:
+	IRDMA_RING_MOVE_HEAD_NOCHECK(cq_uk->cq_ring);
+	if (!IRDMA_RING_CURRENT_HEAD(cq_uk->cq_ring))
+		cq_uk->polarity = !cq_uk->polarity;
+	/* update cq tail in cq shadow memory also */
+	IRDMA_RING_MOVE_TAIL(cq_uk->cq_ring);
+	set_64bit_val(cq_uk->shadow_area, 0,
+		      IRDMA_RING_CURRENT_HEAD(cq_uk->cq_ring));
+
+	return ret;
+}
+
+/**
+ * irdma_puda_send - complete send wqe for transmit
+ * @qp: puda qp for send
+ * @info: buffer information for transmit
+ */
+enum irdma_status_code irdma_puda_send(struct irdma_sc_qp *qp,
+				       struct irdma_puda_send_info *info)
+{
+	__le64 *wqe;
+	u32 iplen, l4len;
+	u64 hdr[2];
+	u32 wqe_idx;
+	u8 iipt;
+
+	/* number of 32 bits DWORDS in header */
+	l4len = info->tcplen >> 2;
+	if (info->ipv4) {
+		iipt = 3;
+		iplen = 5;
+	} else {
+		iipt = 1;
+		iplen = 10;
+	}
+
+	wqe = irdma_puda_get_next_send_wqe(&qp->qp_uk, &wqe_idx);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	qp->qp_uk.sq_wrtrk_array[wqe_idx].wrid = (uintptr_t)info->scratch;
+	/* Third line of WQE descriptor */
+	/* maclen is in words */
+
+	if (qp->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		hdr[0] = 0; /* Dest_QPN and Dest_QKey only for UD */
+		hdr[1] = FIELD_PREP(IRDMA_UDA_QPSQ_OPCODE, IRDMA_OP_TYPE_SEND) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_L4LEN, l4len) |
+			 FIELD_PREP(IRDMAQPSQ_AHID, info->ah_id) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_SIGCOMPL, 1) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_VALID,
+				    qp->qp_uk.swqe_polarity);
+
+		/* Forth line of WQE descriptor */
+
+		set_64bit_val(wqe, 0, info->paddr);
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_FRAG_LEN, info->len) |
+			      FIELD_PREP(IRDMA_UDA_QPSQ_VALID, qp->qp_uk.swqe_polarity));
+	} else {
+		hdr[0] = FIELD_PREP(IRDMA_UDA_QPSQ_MACLEN, info->maclen >> 1) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_IPLEN, iplen) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_L4T, 1) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_IIPT, iipt) |
+			 FIELD_PREP(IRDMA_GEN1_UDA_QPSQ_L4LEN, l4len);
+
+		hdr[1] = FIELD_PREP(IRDMA_UDA_QPSQ_OPCODE, IRDMA_OP_TYPE_SEND) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_SIGCOMPL, 1) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_DOLOOPBACK, info->do_lpb) |
+			 FIELD_PREP(IRDMA_UDA_QPSQ_VALID, qp->qp_uk.swqe_polarity);
+
+		/* Forth line of WQE descriptor */
+
+		set_64bit_val(wqe, 0, info->paddr);
+		set_64bit_val(wqe, 8,
+			      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_LEN, info->len));
+	}
+
+	set_64bit_val(wqe, 16, hdr[0]);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr[1]);
+
+	print_hex_dump_debug("PUDA: PUDA SEND WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, 32, false);
+	irdma_qp_post_wr(&qp->qp_uk);
+	return 0;
+}
+
+/**
+ * irdma_puda_send_buf - transmit puda buffer
+ * @rsrc: resource to use for buffer
+ * @buf: puda buffer to transmit
+ */
+void irdma_puda_send_buf(struct irdma_puda_rsrc *rsrc,
+			 struct irdma_puda_buf *buf)
+{
+	struct irdma_puda_send_info info;
+	enum irdma_status_code ret = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&rsrc->bufpool_lock, flags);
+	/* if no wqe available or not from a completion and we have
+	 * pending buffers, we must queue new buffer
+	 */
+	if (!rsrc->tx_wqe_avail_cnt || (buf && !list_empty(&rsrc->txpend))) {
+		list_add_tail(&buf->list, &rsrc->txpend);
+		spin_unlock_irqrestore(&rsrc->bufpool_lock, flags);
+		rsrc->stats_sent_pkt_q++;
+		if (rsrc->type == IRDMA_PUDA_RSRC_TYPE_ILQ)
+			ibdev_dbg(to_ibdev(rsrc->dev),
+				  "PUDA: adding to txpend\n");
+		return;
+	}
+	rsrc->tx_wqe_avail_cnt--;
+	/* if we are coming from a completion and have pending buffers
+	 * then Get one from pending list
+	 */
+	if (!buf) {
+		buf = irdma_puda_get_listbuf(&rsrc->txpend);
+		if (!buf)
+			goto done;
+	}
+
+	info.scratch = buf;
+	info.paddr = buf->mem.pa;
+	info.len = buf->totallen;
+	info.tcplen = buf->tcphlen;
+	info.ipv4 = buf->ipv4;
+
+	if (rsrc->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		info.ah_id = buf->ah_id;
+	} else {
+		info.maclen = buf->maclen;
+		info.do_lpb = buf->do_lpb;
+	}
+
+	/* Synch buffer for use by device */
+	dma_sync_single_for_cpu(rsrc->dev->hw->device, buf->mem.pa,
+				buf->mem.size, DMA_BIDIRECTIONAL);
+	ret = irdma_puda_send(&rsrc->qp, &info);
+	if (ret) {
+		rsrc->tx_wqe_avail_cnt++;
+		rsrc->stats_sent_pkt_q++;
+		list_add(&buf->list, &rsrc->txpend);
+		if (rsrc->type == IRDMA_PUDA_RSRC_TYPE_ILQ)
+			ibdev_dbg(to_ibdev(rsrc->dev),
+				  "PUDA: adding to puda_send\n");
+	} else {
+		rsrc->stats_pkt_sent++;
+	}
+done:
+	spin_unlock_irqrestore(&rsrc->bufpool_lock, flags);
+}
+
+/**
+ * irdma_puda_qp_setctx - during init, set qp's context
+ * @rsrc: qp's resource
+ */
+static void irdma_puda_qp_setctx(struct irdma_puda_rsrc *rsrc)
+{
+	struct irdma_sc_qp *qp = &rsrc->qp;
+	__le64 *qp_ctx = qp->hw_host_ctx;
+
+	set_64bit_val(qp_ctx, 8, qp->sq_pa);
+	set_64bit_val(qp_ctx, 16, qp->rq_pa);
+	set_64bit_val(qp_ctx, 24,
+		      FIELD_PREP(IRDMAQPC_RQSIZE, qp->hw_rq_size) |
+		      FIELD_PREP(IRDMAQPC_SQSIZE, qp->hw_sq_size));
+	set_64bit_val(qp_ctx, 48,
+		      FIELD_PREP(IRDMAQPC_SNDMSS, rsrc->buf_size));
+	set_64bit_val(qp_ctx, 56, 0);
+	if (qp->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+		set_64bit_val(qp_ctx, 64, 1);
+	set_64bit_val(qp_ctx, 136,
+		      FIELD_PREP(IRDMAQPC_TXCQNUM, rsrc->cq_id) |
+		      FIELD_PREP(IRDMAQPC_RXCQNUM, rsrc->cq_id));
+	set_64bit_val(qp_ctx, 144,
+		      FIELD_PREP(IRDMAQPC_STAT_INDEX, rsrc->stats_idx));
+	set_64bit_val(qp_ctx, 160,
+		      FIELD_PREP(IRDMAQPC_PRIVEN, 1) |
+		      FIELD_PREP(IRDMAQPC_USESTATSINSTANCE, rsrc->stats_idx_valid));
+	set_64bit_val(qp_ctx, 168,
+		      FIELD_PREP(IRDMAQPC_QPCOMPCTX, (uintptr_t)qp));
+	set_64bit_val(qp_ctx, 176,
+		      FIELD_PREP(IRDMAQPC_SQTPHVAL, qp->sq_tph_val) |
+		      FIELD_PREP(IRDMAQPC_RQTPHVAL, qp->rq_tph_val) |
+		      FIELD_PREP(IRDMAQPC_QSHANDLE, qp->qs_handle));
+
+	print_hex_dump_debug("PUDA: PUDA QP CONTEXT", DUMP_PREFIX_OFFSET, 16,
+			     8, qp_ctx, IRDMA_QP_CTX_SIZE, false);
+}
+
+/**
+ * irdma_puda_qp_wqe - setup wqe for qp create
+ * @dev: Device
+ * @qp: Resource qp
+ */
+static enum irdma_status_code irdma_puda_qp_wqe(struct irdma_sc_dev *dev,
+						struct irdma_sc_qp *qp)
+{
+	struct irdma_sc_cqp *cqp;
+	__le64 *wqe;
+	u64 hdr;
+	struct irdma_ccq_cqe_info compl_info;
+	enum irdma_status_code status = 0;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, 0);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 16, qp->hw_host_ctx_pa);
+	set_64bit_val(wqe, 40, qp->shadow_area_pa);
+
+	hdr = qp->qp_uk.qp_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_QP) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_QPTYPE, IRDMA_QP_TYPE_UDA) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_CQNUMVALID, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_QP_NEXTIWSTATE, 2) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("PUDA: PUDA QP CREATE", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, 40, false);
+	irdma_sc_cqp_post_sq(cqp);
+	status = dev->cqp_ops->poll_for_cqp_op_done(dev->cqp,
+						    IRDMA_CQP_OP_CREATE_QP,
+						    &compl_info);
+
+	return status;
+}
+
+/**
+ * irdma_puda_qp_create - create qp for resource
+ * @rsrc: resource to use for buffer
+ */
+static enum irdma_status_code irdma_puda_qp_create(struct irdma_puda_rsrc *rsrc)
+{
+	struct irdma_sc_qp *qp = &rsrc->qp;
+	struct irdma_qp_uk *ukqp = &qp->qp_uk;
+	enum irdma_status_code ret = 0;
+	u32 sq_size, rq_size;
+	struct irdma_dma_mem *mem;
+
+	sq_size = rsrc->sq_size * IRDMA_QP_WQE_MIN_SIZE;
+	rq_size = rsrc->rq_size * IRDMA_QP_WQE_MIN_SIZE;
+	rsrc->qpmem.size = ALIGN((sq_size + rq_size + (IRDMA_SHADOW_AREA_SIZE << 3) + IRDMA_QP_CTX_SIZE),
+				 IRDMA_HW_PAGE_SIZE);
+	rsrc->qpmem.va = dma_alloc_coherent(rsrc->dev->hw->device,
+					    rsrc->qpmem.size, &rsrc->qpmem.pa,
+					    GFP_KERNEL);
+	if (!rsrc->qpmem.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	mem = &rsrc->qpmem;
+	memset(mem->va, 0, rsrc->qpmem.size);
+	qp->hw_sq_size = irdma_get_encoded_wqe_size(rsrc->sq_size, IRDMA_QUEUE_TYPE_SQ_RQ);
+	qp->hw_rq_size = irdma_get_encoded_wqe_size(rsrc->rq_size, IRDMA_QUEUE_TYPE_SQ_RQ);
+	qp->pd = &rsrc->sc_pd;
+	qp->qp_uk.qp_type = IRDMA_QP_TYPE_UDA;
+	qp->dev = rsrc->dev;
+	qp->qp_uk.back_qp = rsrc;
+	qp->sq_pa = mem->pa;
+	qp->rq_pa = qp->sq_pa + sq_size;
+	qp->vsi = rsrc->vsi;
+	ukqp->sq_base = mem->va;
+	ukqp->rq_base = &ukqp->sq_base[rsrc->sq_size];
+	ukqp->shadow_area = ukqp->rq_base[rsrc->rq_size].elem;
+	ukqp->uk_attrs = &qp->dev->hw_attrs.uk_attrs;
+	qp->shadow_area_pa = qp->rq_pa + rq_size;
+	qp->hw_host_ctx = ukqp->shadow_area + IRDMA_SHADOW_AREA_SIZE;
+	qp->hw_host_ctx_pa = qp->shadow_area_pa + (IRDMA_SHADOW_AREA_SIZE << 3);
+	qp->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX;
+	ukqp->qp_id = rsrc->qp_id;
+	ukqp->sq_wrtrk_array = rsrc->sq_wrtrk_array;
+	ukqp->rq_wrid_array = rsrc->rq_wrid_array;
+	ukqp->sq_size = rsrc->sq_size;
+	ukqp->rq_size = rsrc->rq_size;
+
+	IRDMA_RING_INIT(ukqp->sq_ring, ukqp->sq_size);
+	IRDMA_RING_INIT(ukqp->initial_ring, ukqp->sq_size);
+	IRDMA_RING_INIT(ukqp->rq_ring, ukqp->rq_size);
+	ukqp->wqe_alloc_db = qp->pd->dev->wqe_alloc_db;
+
+	ret = rsrc->dev->ws_add(qp->vsi, qp->user_pri);
+	if (ret) {
+		dma_free_coherent(rsrc->dev->hw->device, rsrc->qpmem.size,
+				  rsrc->qpmem.va, rsrc->qpmem.pa);
+		rsrc->qpmem.va = NULL;
+		return ret;
+	}
+
+	irdma_qp_add_qos(qp);
+	irdma_puda_qp_setctx(rsrc);
+
+	if (rsrc->dev->ceq_valid)
+		ret = irdma_cqp_qp_create_cmd(rsrc->dev, qp);
+	else
+		ret = irdma_puda_qp_wqe(rsrc->dev, qp);
+	if (ret) {
+		irdma_qp_rem_qos(qp);
+		rsrc->dev->ws_remove(qp->vsi, qp->user_pri);
+		dma_free_coherent(rsrc->dev->hw->device, rsrc->qpmem.size,
+				  rsrc->qpmem.va, rsrc->qpmem.pa);
+		rsrc->qpmem.va = NULL;
+	}
+
+	return ret;
+}
+
+/**
+ * irdma_puda_cq_wqe - setup wqe for CQ create
+ * @dev: Device
+ * @cq: resource for cq
+ */
+static enum irdma_status_code irdma_puda_cq_wqe(struct irdma_sc_dev *dev,
+						struct irdma_sc_cq *cq)
+{
+	__le64 *wqe;
+	struct irdma_sc_cqp *cqp;
+	u64 hdr;
+	struct irdma_ccq_cqe_info compl_info;
+	enum irdma_status_code status = 0;
+
+	cqp = dev->cqp;
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, 0);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 0, cq->cq_uk.cq_size);
+	set_64bit_val(wqe, 8, (uintptr_t)cq >> 1);
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_CQPSQ_CQ_SHADOW_READ_THRESHOLD, cq->shadow_read_threshold));
+	set_64bit_val(wqe, 32, cq->cq_pa);
+	set_64bit_val(wqe, 40, cq->shadow_area_pa);
+	set_64bit_val(wqe, 56,
+		      FIELD_PREP(IRDMA_CQPSQ_TPHVAL, cq->tph_val) |
+		      FIELD_PREP(IRDMA_CQPSQ_VSIIDX, cq->vsi->vsi_idx));
+
+	hdr = cq->cq_uk.cq_id |
+	      FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_CQ) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CHKOVERFLOW, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_ENCEQEMASK, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_CQ_CEQIDVALID, 1) |
+	      FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity);
+	dma_wmb(); /* make sure WQE is written before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	print_hex_dump_debug("PUDA: PUDA CREATE CQ", DUMP_PREFIX_OFFSET, 16,
+			     8, wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_sc_cqp_post_sq(dev->cqp);
+	status = dev->cqp_ops->poll_for_cqp_op_done(dev->cqp,
+						    IRDMA_CQP_OP_CREATE_CQ,
+						    &compl_info);
+	if (!status) {
+		struct irdma_sc_ceq *ceq = dev->ceq[0];
+
+		if (ceq && ceq->reg_cq)
+			status = irdma_sc_add_cq_ctx(ceq, cq);
+	}
+
+	return status;
+}
+
+/**
+ * irdma_puda_cq_create - create cq for resource
+ * @rsrc: resource for which cq to create
+ */
+static enum irdma_status_code irdma_puda_cq_create(struct irdma_puda_rsrc *rsrc)
+{
+	struct irdma_sc_dev *dev = rsrc->dev;
+	struct irdma_sc_cq *cq = &rsrc->cq;
+	enum irdma_status_code ret = 0;
+	u32 cqsize;
+	struct irdma_dma_mem *mem;
+	struct irdma_cq_init_info info = {};
+	struct irdma_cq_uk_init_info *init_info = &info.cq_uk_init_info;
+
+	cq->vsi = rsrc->vsi;
+	cqsize = rsrc->cq_size * (sizeof(struct irdma_cqe));
+	rsrc->cqmem.size = ALIGN(cqsize + sizeof(struct irdma_cq_shadow_area),
+				 IRDMA_CQ0_ALIGNMENT);
+	rsrc->cqmem.va = dma_alloc_coherent(dev->hw->device, rsrc->cqmem.size,
+					    &rsrc->cqmem.pa, GFP_KERNEL);
+	if (!rsrc->cqmem.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	mem = &rsrc->cqmem;
+	info.dev = dev;
+	info.type = (rsrc->type == IRDMA_PUDA_RSRC_TYPE_ILQ) ?
+		    IRDMA_CQ_TYPE_ILQ : IRDMA_CQ_TYPE_IEQ;
+	info.shadow_read_threshold = rsrc->cq_size >> 2;
+	info.cq_base_pa = mem->pa;
+	info.shadow_area_pa = mem->pa + cqsize;
+	init_info->cq_base = mem->va;
+	init_info->shadow_area = (__le64 *)((u8 *)mem->va + cqsize);
+	init_info->cq_size = rsrc->cq_size;
+	init_info->cq_id = rsrc->cq_id;
+	info.ceqe_mask = true;
+	info.ceq_id_valid = true;
+	info.vsi = rsrc->vsi;
+
+	ret = dev->iw_priv_cq_ops->cq_init(cq, &info);
+	if (ret)
+		goto error;
+
+	if (rsrc->dev->ceq_valid)
+		ret = irdma_cqp_cq_create_cmd(dev, cq);
+	else
+		ret = irdma_puda_cq_wqe(dev, cq);
+error:
+	if (ret) {
+		dma_free_coherent(dev->hw->device, rsrc->cqmem.size,
+				  rsrc->cqmem.va, rsrc->cqmem.pa);
+		rsrc->cqmem.va = NULL;
+	}
+
+	return ret;
+}
+
+/**
+ * irdma_puda_free_qp - free qp for resource
+ * @rsrc: resource for which qp to free
+ */
+static void irdma_puda_free_qp(struct irdma_puda_rsrc *rsrc)
+{
+	enum irdma_status_code ret;
+	struct irdma_ccq_cqe_info compl_info;
+	struct irdma_sc_dev *dev = rsrc->dev;
+
+	if (rsrc->dev->ceq_valid) {
+		irdma_cqp_qp_destroy_cmd(dev, &rsrc->qp);
+		rsrc->dev->ws_remove(rsrc->qp.vsi, rsrc->qp.user_pri);
+		return;
+	}
+
+	ret = dev->iw_priv_qp_ops->qp_destroy(&rsrc->qp, 0, false, true, true);
+	if (ret)
+		ibdev_dbg(to_ibdev(dev),
+			  "PUDA: error puda qp destroy wqe, status = %d\n",
+			  ret);
+	if (!ret) {
+		ret = dev->cqp_ops->poll_for_cqp_op_done(dev->cqp,
+							 IRDMA_CQP_OP_DESTROY_QP,
+							 &compl_info);
+		if (ret)
+			ibdev_dbg(to_ibdev(dev),
+				  "PUDA: error puda qp destroy failed, status = %d\n",
+				  ret);
+	}
+	rsrc->dev->ws_remove(rsrc->qp.vsi, rsrc->qp.user_pri);
+}
+
+/**
+ * irdma_puda_free_cq - free cq for resource
+ * @rsrc: resource for which cq to free
+ */
+static void irdma_puda_free_cq(struct irdma_puda_rsrc *rsrc)
+{
+	enum irdma_status_code ret;
+	struct irdma_ccq_cqe_info compl_info;
+	struct irdma_sc_dev *dev = rsrc->dev;
+
+	if (rsrc->dev->ceq_valid) {
+		irdma_cqp_cq_destroy_cmd(dev, &rsrc->cq);
+		return;
+	}
+
+	ret = dev->iw_priv_cq_ops->cq_destroy(&rsrc->cq, 0, true);
+	if (ret)
+		ibdev_dbg(to_ibdev(dev), "PUDA: error ieq cq destroy\n");
+	if (!ret) {
+		ret = dev->cqp_ops->poll_for_cqp_op_done(dev->cqp,
+							 IRDMA_CQP_OP_DESTROY_CQ,
+							 &compl_info);
+		if (ret)
+			ibdev_dbg(to_ibdev(dev),
+				  "PUDA: error ieq qp destroy done\n");
+	}
+}
+
+/**
+ * irdma_puda_dele_rsrc - delete all resources during close
+ * @vsi: VSI structure of device
+ * @type: type of resource to dele
+ * @reset: true if reset chip
+ */
+void irdma_puda_dele_rsrc(struct irdma_sc_vsi *vsi, enum puda_rsrc_type type,
+			  bool reset)
+{
+	struct irdma_sc_dev *dev = vsi->dev;
+	struct irdma_puda_rsrc *rsrc;
+	struct irdma_puda_buf *buf = NULL;
+	struct irdma_puda_buf *nextbuf = NULL;
+	struct irdma_virt_mem *vmem;
+	struct irdma_sc_ceq *ceq;
+
+	ceq = vsi->dev->ceq[0];
+	switch (type) {
+	case IRDMA_PUDA_RSRC_TYPE_ILQ:
+		rsrc = vsi->ilq;
+		vmem = &vsi->ilq_mem;
+		vsi->ilq = NULL;
+		if (ceq && ceq->reg_cq)
+			irdma_sc_remove_cq_ctx(ceq, &rsrc->cq);
+		break;
+	case IRDMA_PUDA_RSRC_TYPE_IEQ:
+		rsrc = vsi->ieq;
+		vmem = &vsi->ieq_mem;
+		vsi->ieq = NULL;
+		if (ceq && ceq->reg_cq)
+			irdma_sc_remove_cq_ctx(ceq, &rsrc->cq);
+		break;
+	default:
+		ibdev_dbg(to_ibdev(dev), "PUDA: error resource type = 0x%x\n",
+			  type);
+		return;
+	}
+
+	switch (rsrc->cmpl) {
+	case PUDA_HASH_CRC_COMPLETE:
+		irdma_free_hash_desc(rsrc->hash_desc);
+		fallthrough;
+	case PUDA_QP_CREATED:
+		irdma_qp_rem_qos(&rsrc->qp);
+
+		if (!reset)
+			irdma_puda_free_qp(rsrc);
+
+		dma_free_coherent(dev->hw->device, rsrc->qpmem.size,
+				  rsrc->qpmem.va, rsrc->qpmem.pa);
+		rsrc->qpmem.va = NULL;
+		fallthrough;
+	case PUDA_CQ_CREATED:
+		if (!reset)
+			irdma_puda_free_cq(rsrc);
+
+		dma_free_coherent(dev->hw->device, rsrc->cqmem.size,
+				  rsrc->cqmem.va, rsrc->cqmem.pa);
+		rsrc->cqmem.va = NULL;
+		break;
+	default:
+		ibdev_dbg(to_ibdev(rsrc->dev), "PUDA: error no resources\n");
+		break;
+	}
+	/* Free all allocated puda buffers for both tx and rx */
+	buf = rsrc->alloclist;
+	while (buf) {
+		nextbuf = buf->next;
+		irdma_puda_dele_buf(dev, buf);
+		buf = nextbuf;
+		rsrc->alloc_buf_count--;
+	}
+
+	kfree(vmem->va);
+}
+
+/**
+ * irdma_puda_allocbufs - allocate buffers for resource
+ * @rsrc: resource for buffer allocation
+ * @count: number of buffers to create
+ */
+static enum irdma_status_code irdma_puda_allocbufs(struct irdma_puda_rsrc *rsrc,
+						   u32 count)
+{
+	u32 i;
+	struct irdma_puda_buf *buf;
+	struct irdma_puda_buf *nextbuf;
+
+	for (i = 0; i < count; i++) {
+		buf = irdma_puda_alloc_buf(rsrc->dev, rsrc->buf_size);
+		if (!buf) {
+			rsrc->stats_buf_alloc_fail++;
+			return IRDMA_ERR_NO_MEMORY;
+		}
+		irdma_puda_ret_bufpool(rsrc, buf);
+		rsrc->alloc_buf_count++;
+		if (!rsrc->alloclist) {
+			rsrc->alloclist = buf;
+		} else {
+			nextbuf = rsrc->alloclist;
+			rsrc->alloclist = buf;
+			buf->next = nextbuf;
+		}
+	}
+
+	rsrc->avail_buf_count = rsrc->alloc_buf_count;
+
+	return 0;
+}
+
+/**
+ * irdma_puda_create_rsrc - create resource (ilq or ieq)
+ * @vsi: sc VSI struct
+ * @info: resource information
+ */
+enum irdma_status_code irdma_puda_create_rsrc(struct irdma_sc_vsi *vsi,
+					      struct irdma_puda_rsrc_info *info)
+{
+	struct irdma_sc_dev *dev = vsi->dev;
+	enum irdma_status_code ret = 0;
+	struct irdma_puda_rsrc *rsrc;
+	u32 pudasize;
+	u32 sqwridsize, rqwridsize;
+	struct irdma_virt_mem *vmem;
+
+	info->count = 1;
+	pudasize = sizeof(struct irdma_puda_rsrc);
+	sqwridsize = info->sq_size * sizeof(struct irdma_sq_uk_wr_trk_info);
+	rqwridsize = info->rq_size * 8;
+	switch (info->type) {
+	case IRDMA_PUDA_RSRC_TYPE_ILQ:
+		vmem = &vsi->ilq_mem;
+		break;
+	case IRDMA_PUDA_RSRC_TYPE_IEQ:
+		vmem = &vsi->ieq_mem;
+		break;
+	default:
+		return IRDMA_NOT_SUPPORTED;
+	}
+	vmem->size = pudasize + sqwridsize + rqwridsize;
+	vmem->va = kzalloc(vmem->size, GFP_KERNEL);
+	if (!vmem->va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	rsrc = vmem->va;
+	spin_lock_init(&rsrc->bufpool_lock);
+	switch (info->type) {
+	case IRDMA_PUDA_RSRC_TYPE_ILQ:
+		vsi->ilq = vmem->va;
+		vsi->ilq_count = info->count;
+		rsrc->receive = info->receive;
+		rsrc->xmit_complete = info->xmit_complete;
+		break;
+	case IRDMA_PUDA_RSRC_TYPE_IEQ:
+		vsi->ieq_count = info->count;
+		vsi->ieq = vmem->va;
+		rsrc->receive = irdma_ieq_receive;
+		rsrc->xmit_complete = irdma_ieq_tx_compl;
+		break;
+	default:
+		return IRDMA_NOT_SUPPORTED;
+	}
+
+	rsrc->type = info->type;
+	rsrc->sq_wrtrk_array = (struct irdma_sq_uk_wr_trk_info *)
+			       ((u8 *)vmem->va + pudasize);
+	rsrc->rq_wrid_array = (u64 *)((u8 *)vmem->va + pudasize + sqwridsize);
+	/* Initialize all ieq lists */
+	INIT_LIST_HEAD(&rsrc->bufpool);
+	INIT_LIST_HEAD(&rsrc->txpend);
+
+	rsrc->tx_wqe_avail_cnt = info->sq_size - 1;
+	dev->iw_pd_ops->pd_init(dev, &rsrc->sc_pd, info->pd_id, info->abi_ver);
+	rsrc->qp_id = info->qp_id;
+	rsrc->cq_id = info->cq_id;
+	rsrc->sq_size = info->sq_size;
+	rsrc->rq_size = info->rq_size;
+	rsrc->cq_size = info->rq_size + info->sq_size;
+	if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		if (rsrc->type == IRDMA_PUDA_RSRC_TYPE_ILQ)
+			rsrc->cq_size += info->rq_size;
+	}
+	rsrc->buf_size = info->buf_size;
+	rsrc->dev = dev;
+	rsrc->vsi = vsi;
+	rsrc->stats_idx = info->stats_idx;
+	rsrc->stats_idx_valid = info->stats_idx_valid;
+
+	ret = irdma_puda_cq_create(rsrc);
+	if (!ret) {
+		rsrc->cmpl = PUDA_CQ_CREATED;
+		ret = irdma_puda_qp_create(rsrc);
+	}
+	if (ret) {
+		ibdev_dbg(to_ibdev(dev),
+			  "PUDA: error qp_create type=%d, status=%d\n",
+			  rsrc->type, ret);
+		goto error;
+	}
+	rsrc->cmpl = PUDA_QP_CREATED;
+
+	ret = irdma_puda_allocbufs(rsrc, info->tx_buf_cnt + info->rq_size);
+	if (ret) {
+		ibdev_dbg(to_ibdev(dev), "PUDA: error alloc_buf\n");
+		goto error;
+	}
+
+	rsrc->rxq_invalid_cnt = info->rq_size;
+	ret = irdma_puda_replenish_rq(rsrc, true);
+	if (ret)
+		goto error;
+
+	if (info->type == IRDMA_PUDA_RSRC_TYPE_IEQ) {
+		if (!irdma_init_hash_desc(&rsrc->hash_desc)) {
+			rsrc->check_crc = true;
+			rsrc->cmpl = PUDA_HASH_CRC_COMPLETE;
+			ret = 0;
+		}
+	}
+
+	dev->ccq_ops->ccq_arm(&rsrc->cq);
+	return ret;
+
+error:
+	irdma_puda_dele_rsrc(vsi, info->type, false);
+
+	return ret;
+}
+
+/**
+ * irdma_ilq_putback_rcvbuf - ilq buffer to put back on rq
+ * @qp: ilq's qp resource
+ * @buf: puda buffer for rcv q
+ * @wqe_idx:  wqe index of completed rcvbuf
+ */
+static void irdma_ilq_putback_rcvbuf(struct irdma_sc_qp *qp,
+				     struct irdma_puda_buf *buf, u32 wqe_idx)
+{
+	__le64 *wqe;
+	u64 offset8, offset24;
+
+	/* Synch buffer for use by device */
+	dma_sync_single_for_device(qp->dev->hw->device, buf->mem.pa,
+				   buf->mem.size, DMA_BIDIRECTIONAL);
+	wqe = qp->qp_uk.rq_base[wqe_idx].elem;
+	get_64bit_val(wqe, 24, &offset24);
+	if (qp->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		get_64bit_val(wqe, 8, &offset8);
+		if (offset24)
+			offset8 &= ~FIELD_PREP(IRDMAQPSQ_VALID, 1);
+		else
+			offset8 |= FIELD_PREP(IRDMAQPSQ_VALID, 1);
+		set_64bit_val(wqe, 8, offset8);
+		dma_wmb(); /* make sure WQE is written before valid bit is set */
+	}
+	if (offset24)
+		offset24 = 0;
+	else
+		offset24 = FIELD_PREP(IRDMAQPSQ_VALID, 1);
+
+	set_64bit_val(wqe, 24, offset24);
+}
+
+/**
+ * irdma_ieq_get_fpdu_len - get length of fpdu with or without marker
+ * @pfpdu: pointer to fpdu
+ * @datap: pointer to data in the buffer
+ * @rcv_seq: seqnum of the data buffer
+ */
+static u16 irdma_ieq_get_fpdu_len(struct irdma_pfpdu *pfpdu, u8 *datap,
+				  u32 rcv_seq)
+{
+	u32 marker_seq, end_seq, blk_start;
+	u8 marker_len = pfpdu->marker_len;
+	u16 total_len = 0;
+	u16 fpdu_len;
+
+	blk_start = (pfpdu->rcv_start_seq - rcv_seq) & (IRDMA_MRK_BLK_SZ - 1);
+	if (!blk_start) {
+		total_len = marker_len;
+		marker_seq = rcv_seq + IRDMA_MRK_BLK_SZ;
+		if (marker_len && *(u32 *)datap)
+			return 0;
+	} else {
+		marker_seq = rcv_seq + blk_start;
+	}
+
+	datap += total_len;
+	fpdu_len = ntohs(*(__be16 *)datap);
+	fpdu_len += IRDMA_IEQ_MPA_FRAMING;
+	fpdu_len = (fpdu_len + 3) & 0xfffc;
+
+	if (fpdu_len > pfpdu->max_fpdu_data)
+		return 0;
+
+	total_len += fpdu_len;
+	end_seq = rcv_seq + total_len;
+	while ((int)(marker_seq - end_seq) < 0) {
+		total_len += marker_len;
+		end_seq += marker_len;
+		marker_seq += IRDMA_MRK_BLK_SZ;
+	}
+
+	return total_len;
+}
+
+/**
+ * irdma_ieq_copy_to_txbuf - copydata from rcv buf to tx buf
+ * @buf: rcv buffer with partial
+ * @txbuf: tx buffer for sending back
+ * @buf_offset: rcv buffer offset to copy from
+ * @txbuf_offset: at offset in tx buf to copy
+ * @len: length of data to copy
+ */
+static void irdma_ieq_copy_to_txbuf(struct irdma_puda_buf *buf,
+				    struct irdma_puda_buf *txbuf,
+				    u16 buf_offset, u32 txbuf_offset, u32 len)
+{
+	void *mem1 = (u8 *)buf->mem.va + buf_offset;
+	void *mem2 = (u8 *)txbuf->mem.va + txbuf_offset;
+
+	memcpy(mem2, mem1, len);
+}
+
+/**
+ * irdma_ieq_setup_tx_buf - setup tx buffer for partial handling
+ * @buf: reeive buffer with partial
+ * @txbuf: buffer to prepare
+ */
+static void irdma_ieq_setup_tx_buf(struct irdma_puda_buf *buf,
+				   struct irdma_puda_buf *txbuf)
+{
+	txbuf->tcphlen = buf->tcphlen;
+	txbuf->ipv4 = buf->ipv4;
+
+	if (buf->vsi->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		txbuf->hdrlen = txbuf->tcphlen;
+		irdma_ieq_copy_to_txbuf(buf, txbuf, IRDMA_TCP_OFFSET, 0,
+					txbuf->hdrlen);
+	} else {
+		txbuf->maclen = buf->maclen;
+		txbuf->hdrlen = buf->hdrlen;
+		irdma_ieq_copy_to_txbuf(buf, txbuf, 0, 0, buf->hdrlen);
+	}
+}
+
+/**
+ * irdma_ieq_check_first_buf - check if rcv buffer's seq is in range
+ * @buf: receive exception buffer
+ * @fps: first partial sequence number
+ */
+static void irdma_ieq_check_first_buf(struct irdma_puda_buf *buf, u32 fps)
+{
+	u32 offset;
+
+	if (buf->seqnum < fps) {
+		offset = fps - buf->seqnum;
+		if (offset > buf->datalen)
+			return;
+		buf->data += offset;
+		buf->datalen -= (u16)offset;
+		buf->seqnum = fps;
+	}
+}
+
+/**
+ * irdma_ieq_compl_pfpdu - write txbuf with full fpdu
+ * @ieq: ieq resource
+ * @rxlist: ieq's received buffer list
+ * @pbufl: temporary list for buffers for fpddu
+ * @txbuf: tx buffer for fpdu
+ * @fpdu_len: total length of fpdu
+ */
+static void irdma_ieq_compl_pfpdu(struct irdma_puda_rsrc *ieq,
+				  struct list_head *rxlist,
+				  struct list_head *pbufl,
+				  struct irdma_puda_buf *txbuf, u16 fpdu_len)
+{
+	struct irdma_puda_buf *buf;
+	u32 nextseqnum;
+	u16 txoffset, bufoffset;
+
+	buf = irdma_puda_get_listbuf(pbufl);
+	if (!buf)
+		return;
+
+	nextseqnum = buf->seqnum + fpdu_len;
+	irdma_ieq_setup_tx_buf(buf, txbuf);
+	if (buf->vsi->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		txoffset = txbuf->hdrlen;
+		txbuf->totallen = txbuf->hdrlen + fpdu_len;
+		txbuf->data = (u8 *)txbuf->mem.va + txoffset;
+	} else {
+		txoffset = buf->hdrlen;
+		txbuf->totallen = buf->hdrlen + fpdu_len;
+		txbuf->data = (u8 *)txbuf->mem.va + buf->hdrlen;
+	}
+	bufoffset = (u16)(buf->data - (u8 *)buf->mem.va);
+
+	do {
+		if (buf->datalen >= fpdu_len) {
+			/* copied full fpdu */
+			irdma_ieq_copy_to_txbuf(buf, txbuf, bufoffset, txoffset,
+						fpdu_len);
+			buf->datalen -= fpdu_len;
+			buf->data += fpdu_len;
+			buf->seqnum = nextseqnum;
+			break;
+		}
+		/* copy partial fpdu */
+		irdma_ieq_copy_to_txbuf(buf, txbuf, bufoffset, txoffset,
+					buf->datalen);
+		txoffset += buf->datalen;
+		fpdu_len -= buf->datalen;
+		irdma_puda_ret_bufpool(ieq, buf);
+		buf = irdma_puda_get_listbuf(pbufl);
+		if (!buf)
+			return;
+
+		bufoffset = (u16)(buf->data - (u8 *)buf->mem.va);
+	} while (1);
+
+	/* last buffer on the list*/
+	if (buf->datalen)
+		list_add(&buf->list, rxlist);
+	else
+		irdma_puda_ret_bufpool(ieq, buf);
+}
+
+/**
+ * irdma_ieq_create_pbufl - create buffer list for single fpdu
+ * @pfpdu: pointer to fpdu
+ * @rxlist: resource list for receive ieq buffes
+ * @pbufl: temp. list for buffers for fpddu
+ * @buf: first receive buffer
+ * @fpdu_len: total length of fpdu
+ */
+static enum irdma_status_code
+irdma_ieq_create_pbufl(struct irdma_pfpdu *pfpdu, struct list_head *rxlist,
+		       struct list_head *pbufl, struct irdma_puda_buf *buf,
+		       u16 fpdu_len)
+{
+	enum irdma_status_code status = 0;
+	struct irdma_puda_buf *nextbuf;
+	u32 nextseqnum;
+	u16 plen = fpdu_len - buf->datalen;
+	bool done = false;
+
+	nextseqnum = buf->seqnum + buf->datalen;
+	do {
+		nextbuf = irdma_puda_get_listbuf(rxlist);
+		if (!nextbuf) {
+			status = IRDMA_ERR_list_empty;
+			break;
+		}
+		list_add_tail(&nextbuf->list, pbufl);
+		if (nextbuf->seqnum != nextseqnum) {
+			pfpdu->bad_seq_num++;
+			status = IRDMA_ERR_SEQ_NUM;
+			break;
+		}
+		if (nextbuf->datalen >= plen) {
+			done = true;
+		} else {
+			plen -= nextbuf->datalen;
+			nextseqnum = nextbuf->seqnum + nextbuf->datalen;
+		}
+
+	} while (!done);
+
+	return status;
+}
+
+/**
+ * irdma_ieq_handle_partial - process partial fpdu buffer
+ * @ieq: ieq resource
+ * @pfpdu: partial management per user qp
+ * @buf: receive buffer
+ * @fpdu_len: fpdu len in the buffer
+ */
+static enum irdma_status_code
+irdma_ieq_handle_partial(struct irdma_puda_rsrc *ieq, struct irdma_pfpdu *pfpdu,
+			 struct irdma_puda_buf *buf, u16 fpdu_len)
+{
+	enum irdma_status_code status = 0;
+	u8 *crcptr;
+	u32 mpacrc;
+	u32 seqnum = buf->seqnum;
+	struct list_head pbufl; /* partial buffer list */
+	struct irdma_puda_buf *txbuf = NULL;
+	struct list_head *rxlist = &pfpdu->rxlist;
+
+	ieq->partials_handled++;
+
+	INIT_LIST_HEAD(&pbufl);
+	list_add(&buf->list, &pbufl);
+
+	status = irdma_ieq_create_pbufl(pfpdu, rxlist, &pbufl, buf, fpdu_len);
+	if (status)
+		goto error;
+
+	txbuf = irdma_puda_get_bufpool(ieq);
+	if (!txbuf) {
+		pfpdu->no_tx_bufs++;
+		status = IRDMA_ERR_NO_TXBUFS;
+		goto error;
+	}
+
+	irdma_ieq_compl_pfpdu(ieq, rxlist, &pbufl, txbuf, fpdu_len);
+	irdma_ieq_update_tcpip_info(txbuf, fpdu_len, seqnum);
+
+	crcptr = txbuf->data + fpdu_len - 4;
+	mpacrc = *(u32 *)crcptr;
+	if (ieq->check_crc) {
+		status = irdma_ieq_check_mpacrc(ieq->hash_desc, txbuf->data,
+						(fpdu_len - 4), mpacrc);
+		if (status) {
+			ibdev_dbg(to_ibdev(ieq->dev), "IEQ: error bad crc\n");
+			goto error;
+		}
+	}
+
+	print_hex_dump_debug("IEQ: IEQ TX BUFFER", DUMP_PREFIX_OFFSET, 16, 8,
+			     txbuf->mem.va, txbuf->totallen, false);
+	if (ieq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+		txbuf->ah_id = pfpdu->ah->ah_info.ah_idx;
+	txbuf->do_lpb = true;
+	irdma_puda_send_buf(ieq, txbuf);
+	pfpdu->rcv_nxt = seqnum + fpdu_len;
+	return status;
+
+error:
+	while (!list_empty(&pbufl)) {
+		buf = (struct irdma_puda_buf *)(pbufl.prev);
+		list_del(&buf->list);
+		list_add(&buf->list, rxlist);
+	}
+	if (txbuf)
+		irdma_puda_ret_bufpool(ieq, txbuf);
+
+	return status;
+}
+
+/**
+ * irdma_ieq_process_buf - process buffer rcvd for ieq
+ * @ieq: ieq resource
+ * @pfpdu: partial management per user qp
+ * @buf: receive buffer
+ */
+static enum irdma_status_code irdma_ieq_process_buf(struct irdma_puda_rsrc *ieq,
+						    struct irdma_pfpdu *pfpdu,
+						    struct irdma_puda_buf *buf)
+{
+	u16 fpdu_len = 0;
+	u16 datalen = buf->datalen;
+	u8 *datap = buf->data;
+	u8 *crcptr;
+	u16 ioffset = 0;
+	u32 mpacrc;
+	u32 seqnum = buf->seqnum;
+	u16 len = 0;
+	u16 full = 0;
+	bool partial = false;
+	struct irdma_puda_buf *txbuf;
+	struct list_head *rxlist = &pfpdu->rxlist;
+	enum irdma_status_code ret = 0;
+
+	ioffset = (u16)(buf->data - (u8 *)buf->mem.va);
+	while (datalen) {
+		fpdu_len = irdma_ieq_get_fpdu_len(pfpdu, datap, buf->seqnum);
+		if (!fpdu_len) {
+			ibdev_dbg(to_ibdev(ieq->dev),
+				  "IEQ: error bad fpdu len\n");
+			list_add(&buf->list, rxlist);
+			return IRDMA_ERR_MPA_CRC;
+		}
+
+		if (datalen < fpdu_len) {
+			partial = true;
+			break;
+		}
+		crcptr = datap + fpdu_len - 4;
+		mpacrc = *(u32 *)crcptr;
+		if (ieq->check_crc)
+			ret = irdma_ieq_check_mpacrc(ieq->hash_desc, datap,
+						     fpdu_len - 4, mpacrc);
+		if (ret) {
+			list_add(&buf->list, rxlist);
+			ibdev_dbg(to_ibdev(ieq->dev),
+				  "ERR: IRDMA_ERR_MPA_CRC\n");
+			return IRDMA_ERR_MPA_CRC;
+		}
+		full++;
+		pfpdu->fpdu_processed++;
+		ieq->fpdu_processed++;
+		datap += fpdu_len;
+		len += fpdu_len;
+		datalen -= fpdu_len;
+	}
+	if (full) {
+		/* copy full pdu's in the txbuf and send them out */
+		txbuf = irdma_puda_get_bufpool(ieq);
+		if (!txbuf) {
+			pfpdu->no_tx_bufs++;
+			list_add(&buf->list, rxlist);
+			return IRDMA_ERR_NO_TXBUFS;
+		}
+		/* modify txbuf's buffer header */
+		irdma_ieq_setup_tx_buf(buf, txbuf);
+		/* copy full fpdu's to new buffer */
+		if (ieq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+			irdma_ieq_copy_to_txbuf(buf, txbuf, ioffset,
+						txbuf->hdrlen, len);
+			txbuf->totallen = txbuf->hdrlen + len;
+			txbuf->ah_id = pfpdu->ah->ah_info.ah_idx;
+		} else {
+			irdma_ieq_copy_to_txbuf(buf, txbuf, ioffset,
+						buf->hdrlen, len);
+			txbuf->totallen = buf->hdrlen + len;
+		}
+		irdma_ieq_update_tcpip_info(txbuf, len, buf->seqnum);
+		print_hex_dump_debug("IEQ: IEQ TX BUFFER", DUMP_PREFIX_OFFSET,
+				     16, 8, txbuf->mem.va, txbuf->totallen,
+				     false);
+		txbuf->do_lpb = true;
+		irdma_puda_send_buf(ieq, txbuf);
+
+		if (!datalen) {
+			pfpdu->rcv_nxt = buf->seqnum + len;
+			irdma_puda_ret_bufpool(ieq, buf);
+			return 0;
+		}
+		buf->data = datap;
+		buf->seqnum = seqnum + len;
+		buf->datalen = datalen;
+		pfpdu->rcv_nxt = buf->seqnum;
+	}
+	if (partial)
+		return irdma_ieq_handle_partial(ieq, pfpdu, buf, fpdu_len);
+
+	return 0;
+}
+
+/**
+ * irdma_ieq_process_fpdus - process fpdu's buffers on its list
+ * @qp: qp for which partial fpdus
+ * @ieq: ieq resource
+ */
+void irdma_ieq_process_fpdus(struct irdma_sc_qp *qp,
+			     struct irdma_puda_rsrc *ieq)
+{
+	struct irdma_pfpdu *pfpdu = &qp->pfpdu;
+	struct list_head *rxlist = &pfpdu->rxlist;
+	struct irdma_puda_buf *buf;
+	enum irdma_status_code status;
+
+	do {
+		if (list_empty(rxlist))
+			break;
+		buf = irdma_puda_get_listbuf(rxlist);
+		if (!buf) {
+			ibdev_dbg(to_ibdev(ieq->dev), "IEQ: error no buf\n");
+			break;
+		}
+		if (buf->seqnum != pfpdu->rcv_nxt) {
+			/* This could be out of order or missing packet */
+			pfpdu->out_of_order++;
+			list_add(&buf->list, rxlist);
+			break;
+		}
+		/* keep processing buffers from the head of the list */
+		status = irdma_ieq_process_buf(ieq, pfpdu, buf);
+		if (status == IRDMA_ERR_MPA_CRC) {
+			pfpdu->mpa_crc_err = true;
+			while (!list_empty(rxlist)) {
+				buf = irdma_puda_get_listbuf(rxlist);
+				irdma_puda_ret_bufpool(ieq, buf);
+				pfpdu->crc_err++;
+				ieq->crc_err++;
+			}
+			/* create CQP for AE */
+			irdma_ieq_mpa_crc_ae(ieq->dev, qp);
+		}
+	} while (!status);
+}
+
+/**
+ * irdma_ieq_create_ah - create an address handle for IEQ
+ * @qp: qp pointer
+ * @buf: buf received on IEQ used to create AH
+ */
+static enum irdma_status_code irdma_ieq_create_ah(struct irdma_sc_qp *qp,
+						  struct irdma_puda_buf *buf)
+{
+	struct irdma_ah_info ah_info = {};
+
+	qp->pfpdu.ah_buf = buf;
+	irdma_puda_ieq_get_ah_info(qp, &ah_info);
+	return irdma_puda_create_ah(qp->vsi->dev, &ah_info, false,
+				    IRDMA_PUDA_RSRC_TYPE_IEQ, qp,
+				    &qp->pfpdu.ah);
+}
+
+/**
+ * irdma_ieq_handle_exception - handle qp's exception
+ * @ieq: ieq resource
+ * @qp: qp receiving excpetion
+ * @buf: receive buffer
+ */
+static void irdma_ieq_handle_exception(struct irdma_puda_rsrc *ieq,
+				       struct irdma_sc_qp *qp,
+				       struct irdma_puda_buf *buf)
+{
+	struct irdma_pfpdu *pfpdu = &qp->pfpdu;
+	u32 *hw_host_ctx = (u32 *)qp->hw_host_ctx;
+	u32 rcv_wnd = hw_host_ctx[23];
+	/* first partial seq # in q2 */
+	u32 fps = *(u32 *)(qp->q2_buf + Q2_FPSN_OFFSET);
+	struct list_head *rxlist = &pfpdu->rxlist;
+	unsigned long flags = 0;
+	u8 hw_rev = qp->dev->hw_attrs.uk_attrs.hw_rev;
+
+	print_hex_dump_debug("IEQ: IEQ RX BUFFER", DUMP_PREFIX_OFFSET, 16, 8,
+			     buf->mem.va, buf->totallen, false);
+
+	spin_lock_irqsave(&pfpdu->lock, flags);
+	pfpdu->total_ieq_bufs++;
+	if (pfpdu->mpa_crc_err) {
+		pfpdu->crc_err++;
+		goto error;
+	}
+	if (pfpdu->mode && fps != pfpdu->fps) {
+		/* clean up qp as it is new partial sequence */
+		irdma_ieq_cleanup_qp(ieq, qp);
+		ibdev_dbg(to_ibdev(ieq->dev), "IEQ: restarting new partial\n");
+		pfpdu->mode = false;
+	}
+
+	if (!pfpdu->mode) {
+		print_hex_dump_debug("IEQ: Q2 BUFFER", DUMP_PREFIX_OFFSET, 16,
+				     8, (u64 *)qp->q2_buf, 128, false);
+		/* First_Partial_Sequence_Number check */
+		pfpdu->rcv_nxt = fps;
+		pfpdu->fps = fps;
+		pfpdu->mode = true;
+		pfpdu->max_fpdu_data = (buf->ipv4) ?
+				       (ieq->vsi->mtu - IRDMA_MTU_TO_MSS_IPV4) :
+				       (ieq->vsi->mtu - IRDMA_MTU_TO_MSS_IPV6);
+		pfpdu->pmode_count++;
+		ieq->pmode_count++;
+		INIT_LIST_HEAD(rxlist);
+		irdma_ieq_check_first_buf(buf, fps);
+	}
+
+	if (!(rcv_wnd >= (buf->seqnum - pfpdu->rcv_nxt))) {
+		pfpdu->bad_seq_num++;
+		ieq->bad_seq_num++;
+		goto error;
+	}
+
+	if (!list_empty(rxlist)) {
+		if (buf->seqnum != pfpdu->nextseqnum) {
+			irdma_send_ieq_ack(qp);
+			/* throw away out-of-order, duplicates*/
+			goto error;
+		}
+	}
+	/* Insert buf before head */
+	list_add_tail(&buf->list, rxlist);
+	pfpdu->nextseqnum = buf->seqnum + buf->datalen;
+	pfpdu->lastrcv_buf = buf;
+	if (hw_rev >= IRDMA_GEN_2 && !pfpdu->ah) {
+		irdma_ieq_create_ah(qp, buf);
+		if (!pfpdu->ah)
+			goto error;
+		goto exit;
+	}
+	if (hw_rev == IRDMA_GEN_1)
+		irdma_ieq_process_fpdus(qp, ieq);
+	else if (pfpdu->ah && pfpdu->ah->ah_info.ah_valid)
+		irdma_ieq_process_fpdus(qp, ieq);
+exit:
+	spin_unlock_irqrestore(&pfpdu->lock, flags);
+
+	return;
+
+error:
+	irdma_puda_ret_bufpool(ieq, buf);
+	spin_unlock_irqrestore(&pfpdu->lock, flags);
+}
+
+/**
+ * irdma_ieq_receive - received exception buffer
+ * @vsi: VSI of device
+ * @buf: exception buffer received
+ */
+static void irdma_ieq_receive(struct irdma_sc_vsi *vsi,
+			      struct irdma_puda_buf *buf)
+{
+	struct irdma_puda_rsrc *ieq = vsi->ieq;
+	struct irdma_sc_qp *qp = NULL;
+	u32 wqe_idx = ieq->compl_rxwqe_idx;
+
+	qp = irdma_ieq_get_qp(vsi->dev, buf);
+	if (!qp) {
+		ieq->stats_bad_qp_id++;
+		irdma_puda_ret_bufpool(ieq, buf);
+	} else {
+		irdma_ieq_handle_exception(ieq, qp, buf);
+	}
+	/*
+	 * ieq->rx_wqe_idx is used by irdma_puda_replenish_rq()
+	 * on which wqe_idx to start replenish rq
+	 */
+	if (!ieq->rxq_invalid_cnt)
+		ieq->rx_wqe_idx = wqe_idx;
+	ieq->rxq_invalid_cnt++;
+}
+
+/**
+ * irdma_ieq_tx_compl - put back after sending completed exception buffer
+ * @vsi: sc VSI struct
+ * @sqwrid: pointer to puda buffer
+ */
+static void irdma_ieq_tx_compl(struct irdma_sc_vsi *vsi, void *sqwrid)
+{
+	struct irdma_puda_rsrc *ieq = vsi->ieq;
+	struct irdma_puda_buf *buf = sqwrid;
+
+	irdma_puda_ret_bufpool(ieq, buf);
+}
+
+/**
+ * irdma_ieq_cleanup_qp - qp is being destroyed
+ * @ieq: ieq resource
+ * @qp: all pending fpdu buffers
+ */
+void irdma_ieq_cleanup_qp(struct irdma_puda_rsrc *ieq, struct irdma_sc_qp *qp)
+{
+	struct irdma_puda_buf *buf;
+	struct irdma_pfpdu *pfpdu = &qp->pfpdu;
+	struct list_head *rxlist = &pfpdu->rxlist;
+
+	if (qp->pfpdu.ah) {
+		irdma_puda_free_ah(ieq->dev, qp->pfpdu.ah);
+		qp->pfpdu.ah = NULL;
+		qp->pfpdu.ah_buf = NULL;
+	}
+
+	if (!pfpdu->mode)
+		return;
+
+	while (!list_empty(rxlist)) {
+		buf = irdma_puda_get_listbuf(rxlist);
+		irdma_puda_ret_bufpool(ieq, buf);
+	}
+}
diff --git a/drivers/infiniband/hw/irdma/puda.h b/drivers/infiniband/hw/irdma/puda.h
new file mode 100644
index 0000000..db3a511
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/puda.h
@@ -0,0 +1,194 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2020 Intel Corporation */
+#ifndef IRDMA_PUDA_H
+#define IRDMA_PUDA_H
+
+#define IRDMA_IEQ_MPA_FRAMING	6
+#define IRDMA_TCP_OFFSET	40
+#define IRDMA_IPV4_PAD		20
+#define IRDMA_MRK_BLK_SZ	512
+
+enum puda_rsrc_type {
+	IRDMA_PUDA_RSRC_TYPE_ILQ = 1,
+	IRDMA_PUDA_RSRC_TYPE_IEQ,
+	IRDMA_PUDA_RSRC_TYPE_MAX, /* Must be last entry */
+};
+
+enum puda_rsrc_complete {
+	PUDA_CQ_CREATED = 1,
+	PUDA_QP_CREATED,
+	PUDA_TX_COMPLETE,
+	PUDA_RX_COMPLETE,
+	PUDA_HASH_CRC_COMPLETE,
+};
+
+struct irdma_sc_dev;
+struct irdma_sc_qp;
+struct irdma_sc_cq;
+
+struct irdma_puda_cmpl_info {
+	struct irdma_qp_uk *qp;
+	u8 q_type;
+	u8 l3proto;
+	u8 l4proto;
+	u16 vlan;
+	u32 payload_len;
+	u32 compl_error; /* No_err=0, else major and minor err code */
+	u32 qp_id;
+	u32 wqe_idx;
+	bool ipv4:1;
+	bool smac_valid:1;
+	bool vlan_valid:1;
+	u8 smac[ETH_ALEN];
+};
+
+struct irdma_puda_send_info {
+	u64 paddr; /* Physical address */
+	u32 len;
+	u32 ah_id;
+	u8 tcplen;
+	u8 maclen;
+	bool ipv4:1;
+	bool do_lpb:1;
+	void *scratch;
+};
+
+struct irdma_puda_buf {
+	struct list_head list; /* MUST be first entry */
+	struct irdma_dma_mem mem; /* DMA memory for the buffer */
+	struct irdma_puda_buf *next; /* for alloclist in rsrc struct */
+	struct irdma_virt_mem buf_mem; /* Buffer memory for this buffer */
+	void *scratch;
+	u8 *iph;
+	u8 *tcph;
+	u8 *data;
+	u16 datalen;
+	u16 vlan_id;
+	u8 tcphlen; /* tcp length in bytes */
+	u8 maclen; /* mac length in bytes */
+	u32 totallen; /* machlen+iphlen+tcphlen+datalen */
+	refcount_t refcount;
+	u8 hdrlen;
+	bool ipv4:1;
+	bool vlan_valid:1;
+	bool do_lpb:1; /* Loopback buffer */
+	bool smac_valid:1;
+	u32 seqnum;
+	u32 ah_id;
+	u8 smac[ETH_ALEN];
+	struct irdma_sc_vsi *vsi;
+};
+
+struct irdma_puda_rsrc_info {
+	void (*receive)(struct irdma_sc_vsi *vsi, struct irdma_puda_buf *buf);
+	void (*xmit_complete)(struct irdma_sc_vsi *vsi, void *sqwrid);
+	enum puda_rsrc_type type; /* ILQ or IEQ */
+	u32 count;
+	u32 pd_id;
+	u32 cq_id;
+	u32 qp_id;
+	u32 sq_size;
+	u32 rq_size;
+	u32 tx_buf_cnt; /* total bufs allocated will be rq_size + tx_buf_cnt */
+	u16 buf_size;
+	u8 stats_idx;
+	bool stats_idx_valid:1;
+	int abi_ver;
+};
+
+struct irdma_puda_rsrc {
+	struct irdma_sc_cq cq;
+	struct irdma_sc_qp qp;
+	struct irdma_sc_pd sc_pd;
+	struct irdma_sc_dev *dev;
+	struct irdma_sc_vsi *vsi;
+	struct irdma_dma_mem cqmem;
+	struct irdma_dma_mem qpmem;
+	struct irdma_virt_mem ilq_mem;
+	enum puda_rsrc_complete cmpl;
+	enum puda_rsrc_type type;
+	u16 buf_size; /*buf must be max datalen + tcpip hdr + mac */
+	u32 cq_id;
+	u32 qp_id;
+	u32 sq_size;
+	u32 rq_size;
+	u32 cq_size;
+	struct irdma_sq_uk_wr_trk_info *sq_wrtrk_array;
+	u64 *rq_wrid_array;
+	u32 compl_rxwqe_idx;
+	u32 rx_wqe_idx;
+	u32 rxq_invalid_cnt;
+	u32 tx_wqe_avail_cnt;
+	struct shash_desc *hash_desc;
+	struct list_head txpend;
+	struct list_head bufpool; /* free buffers pool list for recv and xmit */
+	u32 alloc_buf_count;
+	u32 avail_buf_count; /* snapshot of currently available buffers */
+	spinlock_t bufpool_lock;
+	struct irdma_puda_buf *alloclist;
+	void (*receive)(struct irdma_sc_vsi *vsi, struct irdma_puda_buf *buf);
+	void (*xmit_complete)(struct irdma_sc_vsi *vsi, void *sqwrid);
+	/* puda stats */
+	u64 stats_buf_alloc_fail;
+	u64 stats_pkt_rcvd;
+	u64 stats_pkt_sent;
+	u64 stats_rcvd_pkt_err;
+	u64 stats_sent_pkt_q;
+	u64 stats_bad_qp_id;
+	/* IEQ stats */
+	u64 fpdu_processed;
+	u64 bad_seq_num;
+	u64 crc_err;
+	u64 pmode_count;
+	u64 partials_handled;
+	u8 stats_idx;
+	bool check_crc:1;
+	bool stats_idx_valid:1;
+};
+
+struct irdma_puda_buf *irdma_puda_get_bufpool(struct irdma_puda_rsrc *rsrc);
+void irdma_puda_ret_bufpool(struct irdma_puda_rsrc *rsrc,
+			    struct irdma_puda_buf *buf);
+void irdma_puda_send_buf(struct irdma_puda_rsrc *rsrc,
+			 struct irdma_puda_buf *buf);
+enum irdma_status_code irdma_puda_send(struct irdma_sc_qp *qp,
+				       struct irdma_puda_send_info *info);
+enum irdma_status_code
+irdma_puda_create_rsrc(struct irdma_sc_vsi *vsi,
+		       struct irdma_puda_rsrc_info *info);
+void irdma_puda_dele_rsrc(struct irdma_sc_vsi *vsi, enum puda_rsrc_type type,
+			  bool reset);
+enum irdma_status_code irdma_puda_poll_cmpl(struct irdma_sc_dev *dev,
+					    struct irdma_sc_cq *cq,
+					    u32 *compl_err);
+
+struct irdma_sc_qp *irdma_ieq_get_qp(struct irdma_sc_dev *dev,
+				     struct irdma_puda_buf *buf);
+enum irdma_status_code
+irdma_puda_get_tcpip_info(struct irdma_puda_cmpl_info *info,
+			  struct irdma_puda_buf *buf);
+enum irdma_status_code irdma_ieq_check_mpacrc(struct shash_desc *desc,
+					      void *addr, u32 len, u32 val);
+enum irdma_status_code irdma_init_hash_desc(struct shash_desc **desc);
+void irdma_ieq_mpa_crc_ae(struct irdma_sc_dev *dev, struct irdma_sc_qp *qp);
+void irdma_free_hash_desc(struct shash_desc *desc);
+void irdma_ieq_update_tcpip_info(struct irdma_puda_buf *buf, u16 len,
+				 u32 seqnum);
+enum irdma_status_code irdma_cqp_qp_create_cmd(struct irdma_sc_dev *dev,
+					       struct irdma_sc_qp *qp);
+enum irdma_status_code irdma_cqp_cq_create_cmd(struct irdma_sc_dev *dev,
+					       struct irdma_sc_cq *cq);
+enum irdma_status_code irdma_cqp_qp_destroy_cmd(struct irdma_sc_dev *dev, struct irdma_sc_qp *qp);
+void irdma_cqp_cq_destroy_cmd(struct irdma_sc_dev *dev, struct irdma_sc_cq *cq);
+void irdma_puda_ieq_get_ah_info(struct irdma_sc_qp *qp,
+				struct irdma_ah_info *ah_info);
+enum irdma_status_code irdma_puda_create_ah(struct irdma_sc_dev *dev,
+					    struct irdma_ah_info *ah_info,
+					    bool wait, enum puda_rsrc_type type,
+					    void *cb_param,
+					    struct irdma_sc_ah **ah);
+void irdma_puda_free_ah(struct irdma_sc_dev *dev, struct irdma_sc_ah *ah);
+void irdma_ieq_process_fpdus(struct irdma_sc_qp *qp,
+			     struct irdma_puda_rsrc *ieq);
+void irdma_ieq_cleanup_qp(struct irdma_puda_rsrc *ieq, struct irdma_sc_qp *qp);
+#endif /*IRDMA_PROTOS_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 13/23] RDMA/irdma: Add QoS definitions
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (11 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 12/23] RDMA/irdma: Add privileged UDA queue implementation Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 14/23] RDMA/irdma: Add connection manager Shiraz Saleem
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Add definitions for managing the RDMA HW work scheduler (WS) tree.

A WS node is created via a control QP operation with the bandwidth
allocation, arbitration scheme, and traffic class of the QP specified.
The Qset handle returned associates the QoS parameters for the QP.
The Qset is registered with the LAN and an equivalent node is created
in the LAN packet scheduler tree.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/ws.c | 406 +++++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/ws.h |  41 ++++
 2 files changed, 447 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/ws.c
 create mode 100644 drivers/infiniband/hw/irdma/ws.h

diff --git a/drivers/infiniband/hw/irdma/ws.c b/drivers/infiniband/hw/irdma/ws.c
new file mode 100644
index 0000000..6e4d1d8
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/ws.c
@@ -0,0 +1,406 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2019 - 2020 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "hmc.h"
+#include "defs.h"
+#include "type.h"
+#include "protos.h"
+
+#include "ws.h"
+
+/**
+ * irdma_alloc_node - Allocate a WS node and init
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ * @node_type: Type of node, leaf or parent
+ * @parent: parent node pointer
+ */
+static struct irdma_ws_node *irdma_alloc_node(struct irdma_sc_vsi *vsi,
+					      u8 user_pri,
+					      enum irdma_ws_node_type node_type,
+					      struct irdma_ws_node *parent)
+{
+	struct irdma_virt_mem ws_mem;
+	struct irdma_ws_node *node;
+	u16 node_index = 0;
+
+	ws_mem.size = sizeof(struct irdma_ws_node);
+	ws_mem.va = kzalloc(ws_mem.size, GFP_KERNEL);
+	if (!ws_mem.va)
+		return NULL;
+
+	if (parent || vsi->vm_vf_type == IRDMA_VF_TYPE) {
+		node_index = irdma_alloc_ws_node_id(vsi->dev);
+		if (node_index == IRDMA_WS_NODE_INVALID) {
+			kfree(ws_mem.va);
+			return NULL;
+		}
+	}
+
+	node = ws_mem.va;
+	node->index = node_index;
+	node->vsi_index = vsi->vsi_idx;
+	INIT_LIST_HEAD(&node->child_list_head);
+	if (node_type == WS_NODE_TYPE_LEAF) {
+		node->type_leaf = true;
+		node->traffic_class = vsi->qos[user_pri].traffic_class;
+		node->user_pri = user_pri;
+		node->rel_bw = vsi->qos[user_pri].rel_bw;
+		if (!node->rel_bw)
+			node->rel_bw = 1;
+
+		node->lan_qs_handle = vsi->qos[user_pri].lan_qos_handle;
+		node->prio_type = IRDMA_PRIO_WEIGHTED_RR;
+	} else {
+		node->rel_bw = 1;
+		node->prio_type = IRDMA_PRIO_WEIGHTED_RR;
+		node->enable = true;
+	}
+
+	node->parent = parent;
+
+	return node;
+}
+
+/**
+ * irdma_free_node - Free a WS node
+ * @vsi: VSI stricture of device
+ * @node: Pointer to node to free
+ */
+static void irdma_free_node(struct irdma_sc_vsi *vsi,
+			    struct irdma_ws_node *node)
+{
+	struct irdma_virt_mem ws_mem;
+
+	if (node->index)
+		irdma_free_ws_node_id(vsi->dev, node->index);
+
+	ws_mem.va = node;
+	ws_mem.size = sizeof(struct irdma_ws_node);
+	kfree(ws_mem.va);
+}
+
+/**
+ * irdma_ws_cqp_cmd - Post CQP work scheduler node cmd
+ * @vsi: vsi pointer
+ * @node: pointer to node
+ * @cmd: add, remove or modify
+ */
+static enum irdma_status_code
+irdma_ws_cqp_cmd(struct irdma_sc_vsi *vsi, struct irdma_ws_node *node, u8 cmd)
+{
+	struct irdma_ws_node_info node_info = {};
+
+	node_info.id = node->index;
+	node_info.vsi = node->vsi_index;
+	if (node->parent)
+		node_info.parent_id = node->parent->index;
+	else
+		node_info.parent_id = node_info.id;
+
+	node_info.weight = node->rel_bw;
+	node_info.tc = node->traffic_class;
+	node_info.prio_type = node->prio_type;
+	node_info.type_leaf = node->type_leaf;
+	node_info.enable = node->enable;
+	if (irdma_cqp_ws_node_cmd(vsi->dev, cmd, &node_info)) {
+		ibdev_dbg(to_ibdev(vsi->dev), "WS: CQP WS CMD failed\n");
+		return IRDMA_ERR_NO_MEMORY;
+	}
+
+	if (node->type_leaf && cmd == IRDMA_OP_WS_ADD_NODE) {
+		node->qs_handle = node_info.qs_handle;
+		vsi->qos[node->user_pri].qs_handle = node_info.qs_handle;
+	}
+
+	return 0;
+}
+
+/**
+ * ws_find_node - Find SC WS node based on VSI id or TC
+ * @parent: parent node of First VSI or TC node
+ * @match_val: value to match
+ * @type: match type VSI/TC
+ */
+static struct irdma_ws_node *ws_find_node(struct irdma_ws_node *parent,
+					  u16 match_val,
+					  enum irdma_ws_match_type type)
+{
+	struct irdma_ws_node *node;
+
+	switch (type) {
+	case WS_MATCH_TYPE_VSI:
+		list_for_each_entry(node, &parent->child_list_head, siblings) {
+			if (node->vsi_index == match_val)
+				return node;
+		}
+		break;
+	case WS_MATCH_TYPE_TC:
+		list_for_each_entry(node, &parent->child_list_head, siblings) {
+			if (node->traffic_class == match_val)
+				return node;
+		}
+		break;
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+/**
+ * irdma_tc_in_use - Checks to see if a leaf node is in use
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+static bool irdma_tc_in_use(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	int i;
+
+	mutex_lock(&vsi->qos[user_pri].qos_mutex);
+	if (!list_empty(&vsi->qos[user_pri].qplist)) {
+		mutex_unlock(&vsi->qos[user_pri].qos_mutex);
+		return true;
+	}
+
+	/* Check if the traffic class associated with the given user priority
+	 * is in use by any other user priority. If so, nothing left to do
+	 */
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+		if (vsi->qos[i].traffic_class == vsi->qos[user_pri].traffic_class &&
+		    !list_empty(&vsi->qos[i].qplist)) {
+			mutex_unlock(&vsi->qos[user_pri].qos_mutex);
+			return true;
+		}
+	}
+	mutex_unlock(&vsi->qos[user_pri].qos_mutex);
+
+	return false;
+}
+
+/**
+ * irdma_remove_leaf - Remove leaf node unconditionally
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+static void irdma_remove_leaf(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	struct irdma_ws_node *ws_tree_root, *vsi_node, *tc_node;
+	int i;
+	u16 traffic_class;
+
+	traffic_class = vsi->qos[user_pri].traffic_class;
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++)
+		if (vsi->qos[i].traffic_class == traffic_class)
+			vsi->qos[i].valid = false;
+
+	ws_tree_root = vsi->dev->ws_tree_root;
+	if (!ws_tree_root)
+		return;
+
+	vsi_node = ws_find_node(ws_tree_root, vsi->vsi_idx,
+				WS_MATCH_TYPE_VSI);
+	if (!vsi_node)
+		return;
+
+	tc_node = ws_find_node(vsi_node,
+			       vsi->qos[user_pri].traffic_class,
+			       WS_MATCH_TYPE_TC);
+	if (!tc_node)
+		return;
+
+	irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_DELETE_NODE);
+	vsi->unregister_qset(vsi, tc_node);
+	list_del(&tc_node->siblings);
+	irdma_free_node(vsi, tc_node);
+	/* Check if VSI node can be freed */
+	if (list_empty(&vsi_node->child_list_head)) {
+		irdma_ws_cqp_cmd(vsi, vsi_node, IRDMA_OP_WS_DELETE_NODE);
+		list_del(&vsi_node->siblings);
+		irdma_free_node(vsi, vsi_node);
+		/* Free head node there are no remaining VSI nodes */
+		if (list_empty(&ws_tree_root->child_list_head)) {
+			irdma_ws_cqp_cmd(vsi, ws_tree_root,
+					 IRDMA_OP_WS_DELETE_NODE);
+			irdma_free_node(vsi, ws_tree_root);
+			vsi->dev->ws_tree_root = NULL;
+		}
+	}
+}
+
+/**
+ * irdma_ws_add - Build work scheduler tree, set RDMA qs_handle
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+enum irdma_status_code irdma_ws_add(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	struct irdma_ws_node *ws_tree_root;
+	struct irdma_ws_node *vsi_node;
+	struct irdma_ws_node *tc_node;
+	u16 traffic_class;
+	enum irdma_status_code ret = 0;
+	int i;
+
+	mutex_lock(&vsi->dev->ws_mutex);
+	if (vsi->tc_change_pending) {
+		ret = IRDMA_ERR_NOT_READY;
+		goto exit;
+	}
+
+	if (vsi->qos[user_pri].valid)
+		goto exit;
+
+	ws_tree_root = vsi->dev->ws_tree_root;
+	if (!ws_tree_root) {
+		ibdev_dbg(to_ibdev(vsi->dev), "WS: Creating root node\n");
+		ws_tree_root = irdma_alloc_node(vsi, user_pri,
+						WS_NODE_TYPE_PARENT, NULL);
+		if (!ws_tree_root) {
+			ret = IRDMA_ERR_NO_MEMORY;
+			goto exit;
+		}
+
+		ret = irdma_ws_cqp_cmd(vsi, ws_tree_root, IRDMA_OP_WS_ADD_NODE);
+		if (ret) {
+			irdma_free_node(vsi, ws_tree_root);
+			goto exit;
+		}
+
+		vsi->dev->ws_tree_root = ws_tree_root;
+	}
+
+	/* Find a second tier node that matches the VSI */
+	vsi_node = ws_find_node(ws_tree_root, vsi->vsi_idx,
+				WS_MATCH_TYPE_VSI);
+
+	/* If VSI node doesn't exist, add one */
+	if (!vsi_node) {
+		ibdev_dbg(to_ibdev(vsi->dev),
+			  "WS: Node not found matching VSI %d\n",
+			  vsi->vsi_idx);
+		vsi_node = irdma_alloc_node(vsi, user_pri, WS_NODE_TYPE_PARENT,
+					    ws_tree_root);
+		if (!vsi_node) {
+			ret = IRDMA_ERR_NO_MEMORY;
+			goto vsi_add_err;
+		}
+
+		ret = irdma_ws_cqp_cmd(vsi, vsi_node, IRDMA_OP_WS_ADD_NODE);
+		if (ret) {
+			irdma_free_node(vsi, vsi_node);
+			goto vsi_add_err;
+		}
+
+		list_add(&vsi_node->siblings, &ws_tree_root->child_list_head);
+	}
+
+	ibdev_dbg(to_ibdev(vsi->dev),
+		  "WS: Using node %d which represents VSI %d\n",
+		  vsi_node->index, vsi->vsi_idx);
+	traffic_class = vsi->qos[user_pri].traffic_class;
+	tc_node = ws_find_node(vsi_node, traffic_class,
+			       WS_MATCH_TYPE_TC);
+	if (!tc_node) {
+		/* Add leaf node */
+		ibdev_dbg(to_ibdev(vsi->dev),
+			  "WS: Node not found matching VSI %d and TC %d\n",
+			  vsi->vsi_idx, traffic_class);
+		tc_node = irdma_alloc_node(vsi, user_pri, WS_NODE_TYPE_LEAF,
+					   vsi_node);
+		if (!tc_node) {
+			ret = IRDMA_ERR_NO_MEMORY;
+			goto leaf_add_err;
+		}
+
+		ret = irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_ADD_NODE);
+		if (ret) {
+			irdma_free_node(vsi, tc_node);
+			goto leaf_add_err;
+		}
+
+		list_add(&tc_node->siblings, &vsi_node->child_list_head);
+		/*
+		 * callback to LAN to update the LAN tree with our node
+		 */
+		ret = vsi->register_qset(vsi, tc_node);
+		if (ret)
+			goto reg_err;
+
+		tc_node->enable = true;
+		ret = irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_MODIFY_NODE);
+		if (ret)
+			goto reg_err;
+	}
+	ibdev_dbg(to_ibdev(vsi->dev),
+		  "WS: Using node %d which represents VSI %d TC %d\n",
+		  tc_node->index, vsi->vsi_idx, traffic_class);
+	/*
+	 * Iterate through other UPs and update the QS handle if they have
+	 * a matching traffic class.
+	 */
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+		if (vsi->qos[i].traffic_class == traffic_class) {
+			vsi->qos[i].qs_handle = tc_node->qs_handle;
+			vsi->qos[i].lan_qos_handle = tc_node->lan_qs_handle;
+			vsi->qos[i].l2_sched_node_id = tc_node->l2_sched_node_id;
+			vsi->qos[i].valid = true;
+		}
+	}
+	goto exit;
+
+leaf_add_err:
+	if (list_empty(&vsi_node->child_list_head)) {
+		if (irdma_ws_cqp_cmd(vsi, vsi_node, IRDMA_OP_WS_DELETE_NODE))
+			goto exit;
+		list_del(&vsi_node->siblings);
+		irdma_free_node(vsi, vsi_node);
+	}
+
+vsi_add_err:
+	/* Free head node there are no remaining VSI nodes */
+	if (list_empty(&ws_tree_root->child_list_head)) {
+		irdma_ws_cqp_cmd(vsi, ws_tree_root, IRDMA_OP_WS_DELETE_NODE);
+		vsi->dev->ws_tree_root = NULL;
+		irdma_free_node(vsi, ws_tree_root);
+	}
+
+exit:
+	mutex_unlock(&vsi->dev->ws_mutex);
+	return ret;
+
+reg_err:
+	mutex_unlock(&vsi->dev->ws_mutex);
+	irdma_ws_remove(vsi, user_pri);
+	return ret;
+}
+
+/**
+ * irdma_ws_remove - Free WS scheduler node, update WS tree
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+void irdma_ws_remove(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	mutex_lock(&vsi->dev->ws_mutex);
+	if (irdma_tc_in_use(vsi, user_pri))
+		goto exit;
+	irdma_remove_leaf(vsi, user_pri);
+exit:
+	mutex_unlock(&vsi->dev->ws_mutex);
+}
+
+/**
+ * irdma_ws_reset - Reset entire WS tree
+ * @vsi: vsi pointer
+ */
+void irdma_ws_reset(struct irdma_sc_vsi *vsi)
+{
+	u8 i;
+
+	mutex_lock(&vsi->dev->ws_mutex);
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; ++i)
+		irdma_remove_leaf(vsi, i);
+	mutex_unlock(&vsi->dev->ws_mutex);
+}
diff --git a/drivers/infiniband/hw/irdma/ws.h b/drivers/infiniband/hw/irdma/ws.h
new file mode 100644
index 0000000..f0e16f6
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/ws.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2020 Intel Corporation */
+#ifndef IRDMA_WS_H
+#define IRDMA_WS_H
+
+#include "osdep.h"
+
+enum irdma_ws_node_type {
+	WS_NODE_TYPE_PARENT,
+	WS_NODE_TYPE_LEAF,
+};
+
+enum irdma_ws_match_type {
+	WS_MATCH_TYPE_VSI,
+	WS_MATCH_TYPE_TC,
+};
+
+struct irdma_ws_node {
+	struct list_head siblings;
+	struct list_head child_list_head;
+	struct irdma_ws_node *parent;
+	u64 lan_qs_handle; /* opaque handle used by LAN */
+	u32 l2_sched_node_id;
+	u16 index;
+	u16 qs_handle;
+	u16 vsi_index;
+	u8 traffic_class;
+	u8 user_pri;
+	u8 rel_bw;
+	u8 abstraction_layer; /* used for splitting a TC */
+	u8 prio_type;
+	bool type_leaf:1;
+	bool enable:1;
+};
+
+struct irdma_sc_vsi;
+enum irdma_status_code irdma_ws_add(struct irdma_sc_vsi *vsi, u8 user_pri);
+void irdma_ws_remove(struct irdma_sc_vsi *vsi, u8 user_pri);
+void irdma_ws_reset(struct irdma_sc_vsi *vsi);
+
+#endif /* IRDMA_WS_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 14/23] RDMA/irdma: Add connection manager
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (12 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 13/23] RDMA/irdma: Add QoS definitions Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 15/23] RDMA/irdma: Add PBLE resource manager Shiraz Saleem
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Add connection management (CM) implementation for
iWARP including accept, reject, connect, create_listen,
destroy_listen and CM utility functions

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/cm.c | 4425 ++++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/cm.h |  417 ++++
 2 files changed, 4842 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/cm.c
 create mode 100644 drivers/infiniband/hw/irdma/cm.h

diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
new file mode 100644
index 0000000..c0eb74a
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/cm.c
@@ -0,0 +1,4425 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "main.h"
+#include "trace.h"
+
+static void irdma_cm_post_event(struct irdma_cm_event *event);
+static void irdma_disconnect_worker(struct work_struct *work);
+
+/**
+ * irdma_free_sqbuf - put back puda buffer if refcount is 0
+ * @vsi: The VSI structure of the device
+ * @bufp: puda buffer to free
+ */
+void irdma_free_sqbuf(struct irdma_sc_vsi *vsi, void *bufp)
+{
+	struct irdma_puda_buf *buf = bufp;
+	struct irdma_puda_rsrc *ilq = vsi->ilq;
+
+	if (refcount_dec_and_test(&buf->refcount))
+		irdma_puda_ret_bufpool(ilq, buf);
+}
+
+/**
+ * irdma_record_ird_ord - Record IRD/ORD passed in
+ * @cm_node: connection's node
+ * @conn_ird: connection IRD
+ * @conn_ord: connection ORD
+ */
+static void irdma_record_ird_ord(struct irdma_cm_node *cm_node, u32 conn_ird,
+				 u32 conn_ord)
+{
+	if (conn_ird > cm_node->dev->hw_attrs.max_hw_ird)
+		conn_ird = cm_node->dev->hw_attrs.max_hw_ird;
+
+	if (conn_ord > cm_node->dev->hw_attrs.max_hw_ord)
+		conn_ord = cm_node->dev->hw_attrs.max_hw_ord;
+	else if (!conn_ord && cm_node->send_rdma0_op == SEND_RDMA_READ_ZERO)
+		conn_ord = 1;
+	cm_node->ird_size = conn_ird;
+	cm_node->ord_size = conn_ord;
+}
+
+/**
+ * irdma_copy_ip_ntohl - copy IP address from  network to host
+ * @dst: IP address in host order
+ * @src: IP address in network order (big endian)
+ */
+void irdma_copy_ip_ntohl(u32 *dst, __be32 *src)
+{
+	*dst++ = ntohl(*src++);
+	*dst++ = ntohl(*src++);
+	*dst++ = ntohl(*src++);
+	*dst = ntohl(*src);
+}
+
+/**
+ * irdma_copy_ip_htonl - copy IP address from host to network order
+ * @dst: IP address in network order (big endian)
+ * @src: IP address in host order
+ */
+void irdma_copy_ip_htonl(__be32 *dst, u32 *src)
+{
+	*dst++ = htonl(*src++);
+	*dst++ = htonl(*src++);
+	*dst++ = htonl(*src++);
+	*dst = htonl(*src);
+}
+
+/**
+ * irdma_get_addr_info
+ * @cm_node: contains ip/tcp info
+ * @cm_info: to get a copy of the cm_node ip/tcp info
+ */
+static void irdma_get_addr_info(struct irdma_cm_node *cm_node,
+				struct irdma_cm_info *cm_info)
+{
+	memset(cm_info, 0, sizeof(*cm_info));
+	cm_info->ipv4 = cm_node->ipv4;
+	cm_info->vlan_id = cm_node->vlan_id;
+	memcpy(cm_info->loc_addr, cm_node->loc_addr, sizeof(cm_info->loc_addr));
+	memcpy(cm_info->rem_addr, cm_node->rem_addr, sizeof(cm_info->rem_addr));
+	cm_info->loc_port = cm_node->loc_port;
+	cm_info->rem_port = cm_node->rem_port;
+}
+
+/**
+ * irdma_fill_sockaddr4 - fill in addr info for IPv4 connection
+ * @cm_node: connection's node
+ * @event: upper layer's cm event
+ */
+static inline void irdma_fill_sockaddr4(struct irdma_cm_node *cm_node,
+					struct iw_cm_event *event)
+{
+	struct sockaddr_in *laddr = (struct sockaddr_in *)&event->local_addr;
+	struct sockaddr_in *raddr = (struct sockaddr_in *)&event->remote_addr;
+
+	laddr->sin_family = AF_INET;
+	raddr->sin_family = AF_INET;
+
+	laddr->sin_port = htons(cm_node->loc_port);
+	raddr->sin_port = htons(cm_node->rem_port);
+
+	laddr->sin_addr.s_addr = htonl(cm_node->loc_addr[0]);
+	raddr->sin_addr.s_addr = htonl(cm_node->rem_addr[0]);
+}
+
+/**
+ * irdma_fill_sockaddr6 - fill in addr info for IPv6 connection
+ * @cm_node: connection's node
+ * @event: upper layer's cm event
+ */
+static inline void irdma_fill_sockaddr6(struct irdma_cm_node *cm_node,
+					struct iw_cm_event *event)
+{
+	struct sockaddr_in6 *laddr6 = (struct sockaddr_in6 *)&event->local_addr;
+	struct sockaddr_in6 *raddr6 = (struct sockaddr_in6 *)&event->remote_addr;
+
+	laddr6->sin6_family = AF_INET6;
+	raddr6->sin6_family = AF_INET6;
+
+	laddr6->sin6_port = htons(cm_node->loc_port);
+	raddr6->sin6_port = htons(cm_node->rem_port);
+
+	irdma_copy_ip_htonl(laddr6->sin6_addr.in6_u.u6_addr32,
+			    cm_node->loc_addr);
+	irdma_copy_ip_htonl(raddr6->sin6_addr.in6_u.u6_addr32,
+			    cm_node->rem_addr);
+}
+
+/**
+ * irdma_get_cmevent_info - for cm event upcall
+ * @cm_node: connection's node
+ * @cm_id: upper layers cm struct for the event
+ * @event: upper layer's cm event
+ */
+static inline void irdma_get_cmevent_info(struct irdma_cm_node *cm_node,
+					  struct iw_cm_id *cm_id,
+					  struct iw_cm_event *event)
+{
+	memcpy(&event->local_addr, &cm_id->m_local_addr,
+	       sizeof(event->local_addr));
+	memcpy(&event->remote_addr, &cm_id->m_remote_addr,
+	       sizeof(event->remote_addr));
+	if (cm_node) {
+		event->private_data = cm_node->pdata_buf;
+		event->private_data_len = (u8)cm_node->pdata.size;
+		event->ird = cm_node->ird_size;
+		event->ord = cm_node->ord_size;
+	}
+}
+
+/**
+ * irdma_send_cm_event - upcall cm's event handler
+ * @cm_node: connection's node
+ * @cm_id: upper layer's cm info struct
+ * @type: Event type to indicate
+ * @status: status for the event type
+ */
+static int irdma_send_cm_event(struct irdma_cm_node *cm_node,
+			       struct iw_cm_id *cm_id,
+			       enum iw_cm_event_type type, int status)
+{
+	struct iw_cm_event event = {};
+
+	event.event = type;
+	event.status = status;
+	trace_irdma_send_cm_event(cm_node, cm_id, type, status,
+				  __builtin_return_address(0));
+
+	ibdev_dbg(&cm_node->iwdev->ibdev,
+		  "CM: cm_node %p cm_id=%p state=%d accel=%d event_type=%d status=%d\n",
+		  cm_node, cm_id, cm_node->accelerated, cm_node->state, type,
+		  status);
+
+	switch (type) {
+	case IW_CM_EVENT_CONNECT_REQUEST:
+		if (cm_node->ipv4)
+			irdma_fill_sockaddr4(cm_node, &event);
+		else
+			irdma_fill_sockaddr6(cm_node, &event);
+		event.provider_data = cm_node;
+		event.private_data = cm_node->pdata_buf;
+		event.private_data_len = (u8)cm_node->pdata.size;
+		event.ird = cm_node->ird_size;
+		break;
+	case IW_CM_EVENT_CONNECT_REPLY:
+		irdma_get_cmevent_info(cm_node, cm_id, &event);
+		break;
+	case IW_CM_EVENT_ESTABLISHED:
+		event.ird = cm_node->ird_size;
+		event.ord = cm_node->ord_size;
+		break;
+	case IW_CM_EVENT_DISCONNECT:
+	case IW_CM_EVENT_CLOSE:
+		/* Wait if we are in RTS but havent issued the iwcm event upcall */
+		if (!cm_node->accelerated)
+			wait_for_completion(&cm_node->establish_comp);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return cm_id->event_handler(cm_id, &event);
+}
+
+/**
+ * irdma_timer_list_prep - add connection nodes to a list to perform timer tasks
+ * @cm_core: cm's core
+ * @timer_list: a timer list to which cm_node will be selected
+ */
+static void irdma_timer_list_prep(struct irdma_cm_core *cm_core,
+				  struct list_head *timer_list)
+{
+	struct irdma_cm_node *cm_node;
+	int bkt;
+
+	hash_for_each_rcu(cm_core->cm_hash_tbl, bkt, cm_node, list) {
+		if ((cm_node->close_entry || cm_node->send_entry) &&
+		    refcount_inc_not_zero(&cm_node->refcnt))
+			list_add(&cm_node->timer_entry, timer_list);
+	}
+}
+
+/**
+ * irdma_create_event - create cm event
+ * @cm_node: connection's node
+ * @type: Event type to generate
+ */
+static struct irdma_cm_event *irdma_create_event(struct irdma_cm_node *cm_node,
+						 enum irdma_cm_event_type type)
+{
+	struct irdma_cm_event *event;
+
+	if (!cm_node->cm_id)
+		return NULL;
+
+	event = kzalloc(sizeof(*event), GFP_ATOMIC);
+
+	if (!event)
+		return NULL;
+
+	event->type = type;
+	event->cm_node = cm_node;
+	memcpy(event->cm_info.rem_addr, cm_node->rem_addr,
+	       sizeof(event->cm_info.rem_addr));
+	memcpy(event->cm_info.loc_addr, cm_node->loc_addr,
+	       sizeof(event->cm_info.loc_addr));
+	event->cm_info.rem_port = cm_node->rem_port;
+	event->cm_info.loc_port = cm_node->loc_port;
+	event->cm_info.cm_id = cm_node->cm_id;
+	ibdev_dbg(&cm_node->iwdev->ibdev,
+		  "CM: node=%p event=%p type=%u dst=%pI4 src=%pI4\n", cm_node,
+		  event, type, event->cm_info.loc_addr,
+		  event->cm_info.rem_addr);
+	trace_irdma_create_event(cm_node, type, __builtin_return_address(0));
+	irdma_cm_post_event(event);
+
+	return event;
+}
+
+/**
+ * irdma_free_retrans_entry - free send entry
+ * @cm_node: connection's node
+ */
+static void irdma_free_retrans_entry(struct irdma_cm_node *cm_node)
+{
+	struct irdma_device *iwdev = cm_node->iwdev;
+	struct irdma_timer_entry *send_entry;
+
+	send_entry = cm_node->send_entry;
+	if (!send_entry)
+		return;
+
+	cm_node->send_entry = NULL;
+	irdma_free_sqbuf(&iwdev->vsi, send_entry->sqbuf);
+	kfree(send_entry);
+	refcount_dec(&cm_node->refcnt);
+}
+
+/**
+ * irdma_cleanup_retrans_entry - free send entry with lock
+ * @cm_node: connection's node
+ */
+static void irdma_cleanup_retrans_entry(struct irdma_cm_node *cm_node)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&cm_node->retrans_list_lock, flags);
+	irdma_free_retrans_entry(cm_node);
+	spin_unlock_irqrestore(&cm_node->retrans_list_lock, flags);
+}
+
+/**
+ * irdma_form_ah_cm_frame - get a free packet and build frame with address handle
+ * @cm_node: connection's node ionfo to use in frame
+ * @options: pointer to options info
+ * @hdr: pointer mpa header
+ * @pdata: pointer to private data
+ * @flags:  indicates FIN or ACK
+ */
+static struct irdma_puda_buf *irdma_form_ah_cm_frame(struct irdma_cm_node *cm_node,
+						     struct irdma_kmem_info *options,
+						     struct irdma_kmem_info *hdr,
+						     struct irdma_mpa_priv_info *pdata,
+						     u8 flags)
+{
+	struct irdma_puda_buf *sqbuf;
+	struct irdma_sc_vsi *vsi = &cm_node->iwdev->vsi;
+	u8 *buf;
+	struct tcphdr *tcph;
+	u16 pktsize;
+	u32 opts_len = 0;
+	u32 pd_len = 0;
+	u32 hdr_len = 0;
+
+	if (!cm_node->ah || !cm_node->ah->ah_info.ah_valid) {
+		ibdev_dbg(&cm_node->iwdev->ibdev, "CM: AH invalid\n");
+		return NULL;
+	}
+
+	sqbuf = irdma_puda_get_bufpool(vsi->ilq);
+	if (!sqbuf) {
+		ibdev_dbg(&cm_node->iwdev->ibdev, "CM: SQ buf NULL\n");
+		return NULL;
+	}
+
+	sqbuf->ah_id = cm_node->ah->ah_info.ah_idx;
+	buf = sqbuf->mem.va;
+	if (options)
+		opts_len = (u32)options->size;
+
+	if (hdr)
+		hdr_len = hdr->size;
+
+	if (pdata)
+		pd_len = pdata->size;
+
+	pktsize = sizeof(*tcph) + opts_len + hdr_len + pd_len;
+
+	memset(buf, 0, pktsize);
+
+	sqbuf->totallen = pktsize;
+	sqbuf->tcphlen = sizeof(*tcph) + opts_len;
+	sqbuf->scratch = cm_node;
+
+	tcph = (struct tcphdr *)buf;
+	buf += sizeof(*tcph);
+
+	tcph->source = htons(cm_node->loc_port);
+	tcph->dest = htons(cm_node->rem_port);
+	tcph->seq = htonl(cm_node->tcp_cntxt.loc_seq_num);
+
+	if (flags & SET_ACK) {
+		cm_node->tcp_cntxt.loc_ack_num = cm_node->tcp_cntxt.rcv_nxt;
+		tcph->ack_seq = htonl(cm_node->tcp_cntxt.loc_ack_num);
+		tcph->ack = 1;
+	} else {
+		tcph->ack_seq = 0;
+	}
+
+	if (flags & SET_SYN) {
+		cm_node->tcp_cntxt.loc_seq_num++;
+		tcph->syn = 1;
+	} else {
+		cm_node->tcp_cntxt.loc_seq_num += hdr_len + pd_len;
+	}
+
+	if (flags & SET_FIN) {
+		cm_node->tcp_cntxt.loc_seq_num++;
+		tcph->fin = 1;
+	}
+
+	if (flags & SET_RST)
+		tcph->rst = 1;
+
+	tcph->doff = (u16)((sizeof(*tcph) + opts_len + 3) >> 2);
+	sqbuf->tcphlen = tcph->doff << 2;
+	tcph->window = htons(cm_node->tcp_cntxt.rcv_wnd);
+	tcph->urg_ptr = 0;
+
+	if (opts_len) {
+		memcpy(buf, options->addr, opts_len);
+		buf += opts_len;
+	}
+
+	if (hdr_len) {
+		memcpy(buf, hdr->addr, hdr_len);
+		buf += hdr_len;
+	}
+
+	if (pdata && pdata->addr)
+		memcpy(buf, pdata->addr, pdata->size);
+
+	refcount_set(&sqbuf->refcount, 1);
+
+	print_hex_dump_debug("ILQ: TRANSMIT ILQ BUFFER", DUMP_PREFIX_OFFSET,
+			     16, 8, sqbuf->mem.va, sqbuf->totallen, false);
+
+	return sqbuf;
+}
+
+/**
+ * irdma_form_uda_cm_frame - get a free packet and build frame full tcpip packet
+ * @cm_node: connection's node ionfo to use in frame
+ * @options: pointer to options info
+ * @hdr: pointer mpa header
+ * @pdata: pointer to private data
+ * @flags:  indicates FIN or ACK
+ */
+static struct irdma_puda_buf *irdma_form_uda_cm_frame(struct irdma_cm_node *cm_node,
+						      struct irdma_kmem_info *options,
+						      struct irdma_kmem_info *hdr,
+						      struct irdma_mpa_priv_info *pdata,
+						      u8 flags)
+{
+	struct irdma_puda_buf *sqbuf;
+	struct irdma_sc_vsi *vsi = &cm_node->iwdev->vsi;
+	u8 *buf;
+
+	struct tcphdr *tcph;
+	struct iphdr *iph;
+	struct ipv6hdr *ip6h;
+	struct ethhdr *ethh;
+	u16 pktsize;
+	u16 eth_hlen = ETH_HLEN;
+	u32 opts_len = 0;
+	u32 pd_len = 0;
+	u32 hdr_len = 0;
+
+	u16 vtag;
+
+	sqbuf = irdma_puda_get_bufpool(vsi->ilq);
+	if (!sqbuf)
+		return NULL;
+
+	buf = sqbuf->mem.va;
+
+	if (options)
+		opts_len = (u32)options->size;
+
+	if (hdr)
+		hdr_len = hdr->size;
+
+	if (pdata)
+		pd_len = pdata->size;
+
+	if (cm_node->vlan_id < VLAN_N_VID)
+		eth_hlen += 4;
+
+	if (cm_node->ipv4)
+		pktsize = sizeof(*iph) + sizeof(*tcph);
+	else
+		pktsize = sizeof(*ip6h) + sizeof(*tcph);
+	pktsize += opts_len + hdr_len + pd_len;
+
+	memset(buf, 0, eth_hlen + pktsize);
+
+	sqbuf->totallen = pktsize + eth_hlen;
+	sqbuf->maclen = eth_hlen;
+	sqbuf->tcphlen = sizeof(*tcph) + opts_len;
+	sqbuf->scratch = cm_node;
+
+	ethh = (struct ethhdr *)buf;
+	buf += eth_hlen;
+
+	if (cm_node->do_lpb)
+		sqbuf->do_lpb = true;
+
+	if (cm_node->ipv4) {
+		sqbuf->ipv4 = true;
+
+		iph = (struct iphdr *)buf;
+		buf += sizeof(*iph);
+		tcph = (struct tcphdr *)buf;
+		buf += sizeof(*tcph);
+
+		ether_addr_copy(ethh->h_dest, cm_node->rem_mac);
+		ether_addr_copy(ethh->h_source, cm_node->loc_mac);
+		if (cm_node->vlan_id < VLAN_N_VID) {
+			((struct vlan_ethhdr *)ethh)->h_vlan_proto =
+				htons(ETH_P_8021Q);
+			vtag = (cm_node->user_pri << VLAN_PRIO_SHIFT) |
+			       cm_node->vlan_id;
+			((struct vlan_ethhdr *)ethh)->h_vlan_TCI = htons(vtag);
+
+			((struct vlan_ethhdr *)ethh)->h_vlan_encapsulated_proto =
+				htons(ETH_P_IP);
+		} else {
+			ethh->h_proto = htons(ETH_P_IP);
+		}
+
+		iph->version = IPVERSION;
+		iph->ihl = 5; /* 5 * 4Byte words, IP headr len */
+		iph->tos = cm_node->tos;
+		iph->tot_len = htons(pktsize);
+		iph->id = htons(++cm_node->tcp_cntxt.loc_id);
+
+		iph->frag_off = htons(0x4000);
+		iph->ttl = 0x40;
+		iph->protocol = IPPROTO_TCP;
+		iph->saddr = htonl(cm_node->loc_addr[0]);
+		iph->daddr = htonl(cm_node->rem_addr[0]);
+	} else {
+		sqbuf->ipv4 = false;
+		ip6h = (struct ipv6hdr *)buf;
+		buf += sizeof(*ip6h);
+		tcph = (struct tcphdr *)buf;
+		buf += sizeof(*tcph);
+
+		ether_addr_copy(ethh->h_dest, cm_node->rem_mac);
+		ether_addr_copy(ethh->h_source, cm_node->loc_mac);
+		if (cm_node->vlan_id < VLAN_N_VID) {
+			((struct vlan_ethhdr *)ethh)->h_vlan_proto =
+				htons(ETH_P_8021Q);
+			vtag = (cm_node->user_pri << VLAN_PRIO_SHIFT) |
+			       cm_node->vlan_id;
+			((struct vlan_ethhdr *)ethh)->h_vlan_TCI = htons(vtag);
+			((struct vlan_ethhdr *)ethh)->h_vlan_encapsulated_proto =
+				htons(ETH_P_IPV6);
+		} else {
+			ethh->h_proto = htons(ETH_P_IPV6);
+		}
+		ip6h->version = 6;
+		ip6h->priority = cm_node->tos >> 4;
+		ip6h->flow_lbl[0] = cm_node->tos << 4;
+		ip6h->flow_lbl[1] = 0;
+		ip6h->flow_lbl[2] = 0;
+		ip6h->payload_len = htons(pktsize - sizeof(*ip6h));
+		ip6h->nexthdr = 6;
+		ip6h->hop_limit = 128;
+		irdma_copy_ip_htonl(ip6h->saddr.in6_u.u6_addr32,
+				    cm_node->loc_addr);
+		irdma_copy_ip_htonl(ip6h->daddr.in6_u.u6_addr32,
+				    cm_node->rem_addr);
+	}
+
+	tcph->source = htons(cm_node->loc_port);
+	tcph->dest = htons(cm_node->rem_port);
+	tcph->seq = htonl(cm_node->tcp_cntxt.loc_seq_num);
+
+	if (flags & SET_ACK) {
+		cm_node->tcp_cntxt.loc_ack_num = cm_node->tcp_cntxt.rcv_nxt;
+		tcph->ack_seq = htonl(cm_node->tcp_cntxt.loc_ack_num);
+		tcph->ack = 1;
+	} else {
+		tcph->ack_seq = 0;
+	}
+
+	if (flags & SET_SYN) {
+		cm_node->tcp_cntxt.loc_seq_num++;
+		tcph->syn = 1;
+	} else {
+		cm_node->tcp_cntxt.loc_seq_num += hdr_len + pd_len;
+	}
+
+	if (flags & SET_FIN) {
+		cm_node->tcp_cntxt.loc_seq_num++;
+		tcph->fin = 1;
+	}
+
+	if (flags & SET_RST)
+		tcph->rst = 1;
+
+	tcph->doff = (u16)((sizeof(*tcph) + opts_len + 3) >> 2);
+	sqbuf->tcphlen = tcph->doff << 2;
+	tcph->window = htons(cm_node->tcp_cntxt.rcv_wnd);
+	tcph->urg_ptr = 0;
+
+	if (opts_len) {
+		memcpy(buf, options->addr, opts_len);
+		buf += opts_len;
+	}
+
+	if (hdr_len) {
+		memcpy(buf, hdr->addr, hdr_len);
+		buf += hdr_len;
+	}
+
+	if (pdata && pdata->addr)
+		memcpy(buf, pdata->addr, pdata->size);
+
+	refcount_set(&sqbuf->refcount, 1);
+
+	print_hex_dump_debug("ILQ: TRANSMIT ILQ BUFFER", DUMP_PREFIX_OFFSET,
+			     16, 8, sqbuf->mem.va, sqbuf->totallen, false);
+	return sqbuf;
+}
+
+/**
+ * irdma_send_reset - Send RST packet
+ * @cm_node: connection's node
+ */
+int irdma_send_reset(struct irdma_cm_node *cm_node)
+{
+	struct irdma_puda_buf *sqbuf;
+	int flags = SET_RST | SET_ACK;
+
+	trace_irdma_send_reset(cm_node, 0, __builtin_return_address(0));
+	sqbuf = cm_node->cm_core->form_cm_frame(cm_node, NULL, NULL, NULL,
+						flags);
+	if (!sqbuf)
+		return -ENOMEM;
+
+	ibdev_dbg(&cm_node->iwdev->ibdev,
+		  "CM: caller: %pS cm_node %p cm_id=%p accel=%d state=%d rem_port=0x%04x, loc_port=0x%04x rem_addr=%pI4 loc_addr=%pI4\n",
+		  __builtin_return_address(0), cm_node, cm_node->cm_id,
+		  cm_node->accelerated, cm_node->state, cm_node->rem_port,
+		  cm_node->loc_port, cm_node->rem_addr, cm_node->loc_addr);
+
+	return irdma_schedule_cm_timer(cm_node, sqbuf, IRDMA_TIMER_TYPE_SEND, 0,
+				       1);
+}
+
+/**
+ * irdma_active_open_err - send event for active side cm error
+ * @cm_node: connection's node
+ * @reset: Flag to send reset or not
+ */
+static void irdma_active_open_err(struct irdma_cm_node *cm_node, bool reset)
+{
+	trace_irdma_active_open_err(cm_node, reset,
+				    __builtin_return_address(0));
+	irdma_cleanup_retrans_entry(cm_node);
+	cm_node->cm_core->stats_connect_errs++;
+	if (reset) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: cm_node=%p state=%d\n", cm_node,
+			  cm_node->state);
+		refcount_inc(&cm_node->refcnt);
+		irdma_send_reset(cm_node);
+	}
+
+	cm_node->state = IRDMA_CM_STATE_CLOSED;
+	irdma_create_event(cm_node, IRDMA_CM_EVENT_ABORTED);
+}
+
+/**
+ * irdma_passive_open_err - handle passive side cm error
+ * @cm_node: connection's node
+ * @reset: send reset or just free cm_node
+ */
+static void irdma_passive_open_err(struct irdma_cm_node *cm_node, bool reset)
+{
+	irdma_cleanup_retrans_entry(cm_node);
+	cm_node->cm_core->stats_passive_errs++;
+	cm_node->state = IRDMA_CM_STATE_CLOSED;
+	ibdev_dbg(&cm_node->iwdev->ibdev, "CM: cm_node=%p state =%d\n",
+		  cm_node, cm_node->state);
+	trace_irdma_passive_open_err(cm_node, reset,
+				     __builtin_return_address(0));
+	if (reset)
+		irdma_send_reset(cm_node);
+	else
+		irdma_rem_ref_cm_node(cm_node);
+}
+
+/**
+ * irdma_event_connect_error - to create connect error event
+ * @event: cm information for connect event
+ */
+static void irdma_event_connect_error(struct irdma_cm_event *event)
+{
+	struct irdma_qp *iwqp;
+	struct iw_cm_id *cm_id;
+
+	cm_id = event->cm_node->cm_id;
+	if (!cm_id)
+		return;
+
+	iwqp = cm_id->provider_data;
+
+	if (!iwqp || !iwqp->iwdev)
+		return;
+
+	iwqp->cm_id = NULL;
+	cm_id->provider_data = NULL;
+	irdma_send_cm_event(event->cm_node, cm_id, IW_CM_EVENT_CONNECT_REPLY,
+			    -ECONNRESET);
+	irdma_rem_ref_cm_node(event->cm_node);
+}
+
+/**
+ * irdma_process_options - process options from TCP header
+ * @cm_node: connection's node
+ * @optionsloc: point to start of options
+ * @optionsize: size of all options
+ * @syn_pkt: flag if syn packet
+ */
+static int irdma_process_options(struct irdma_cm_node *cm_node, u8 *optionsloc,
+				 u32 optionsize, u32 syn_pkt)
+{
+	u32 tmp;
+	u32 offset = 0;
+	union all_known_options *all_options;
+	char got_mss_option = 0;
+
+	while (offset < optionsize) {
+		all_options = (union all_known_options *)(optionsloc + offset);
+		switch (all_options->base.optionnum) {
+		case OPTION_NUM_EOL:
+			offset = optionsize;
+			break;
+		case OPTION_NUM_NONE:
+			offset += 1;
+			continue;
+		case OPTION_NUM_MSS:
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: MSS Length: %d Offset: %d Size: %d\n",
+				  all_options->mss.len, offset, optionsize);
+			got_mss_option = 1;
+			if (all_options->mss.len != 4)
+				return -EINVAL;
+			tmp = ntohs(all_options->mss.mss);
+			if ((cm_node->ipv4 &&
+			     (tmp + IRDMA_MTU_TO_MSS_IPV4) < IRDMA_MIN_MTU_IPV4) ||
+			    (!cm_node->ipv4 &&
+			     (tmp + IRDMA_MTU_TO_MSS_IPV6) < IRDMA_MIN_MTU_IPV6))
+				return -EINVAL;
+			if (tmp < cm_node->tcp_cntxt.mss)
+				cm_node->tcp_cntxt.mss = tmp;
+			break;
+		case OPTION_NUM_WINDOW_SCALE:
+			cm_node->tcp_cntxt.snd_wscale =
+				all_options->windowscale.shiftcount;
+			break;
+		default:
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: Unsupported TCP Option: %x\n",
+				  all_options->base.optionnum);
+			break;
+		}
+		offset += all_options->base.len;
+	}
+	if (!got_mss_option && syn_pkt)
+		cm_node->tcp_cntxt.mss = IRDMA_CM_DEFAULT_MSS;
+
+	return 0;
+}
+
+/**
+ * irdma_handle_tcp_options - setup TCP context info after parsing TCP options
+ * @cm_node: connection's node
+ * @tcph: pointer tcp header
+ * @optionsize: size of options rcvd
+ * @passive: active or passive flag
+ */
+static int irdma_handle_tcp_options(struct irdma_cm_node *cm_node,
+				    struct tcphdr *tcph, int optionsize,
+				    int passive)
+{
+	u8 *optionsloc = (u8 *)&tcph[1];
+	int ret;
+
+	if (optionsize) {
+		ret = irdma_process_options(cm_node, optionsloc, optionsize,
+					    (u32)tcph->syn);
+		if (ret) {
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: Node %p, Sending Reset\n", cm_node);
+			if (passive)
+				irdma_passive_open_err(cm_node, true);
+			else
+				irdma_active_open_err(cm_node, true);
+			return ret;
+		}
+	}
+
+	cm_node->tcp_cntxt.snd_wnd = ntohs(tcph->window)
+				     << cm_node->tcp_cntxt.snd_wscale;
+
+	if (cm_node->tcp_cntxt.snd_wnd > cm_node->tcp_cntxt.max_snd_wnd)
+		cm_node->tcp_cntxt.max_snd_wnd = cm_node->tcp_cntxt.snd_wnd;
+
+	return 0;
+}
+
+/**
+ * irdma_build_mpa_v1 - build a MPA V1 frame
+ * @cm_node: connection's node
+ * @start_addr: address where to build frame
+ * @mpa_key: to do read0 or write0
+ */
+static void irdma_build_mpa_v1(struct irdma_cm_node *cm_node, void *start_addr,
+			       u8 mpa_key)
+{
+	struct ietf_mpa_v1 *mpa_frame = start_addr;
+
+	switch (mpa_key) {
+	case MPA_KEY_REQUEST:
+		memcpy(mpa_frame->key, IEFT_MPA_KEY_REQ, IETF_MPA_KEY_SIZE);
+		break;
+	case MPA_KEY_REPLY:
+		memcpy(mpa_frame->key, IEFT_MPA_KEY_REP, IETF_MPA_KEY_SIZE);
+		break;
+	default:
+		break;
+	}
+	mpa_frame->flags = IETF_MPA_FLAGS_CRC;
+	mpa_frame->rev = cm_node->mpa_frame_rev;
+	mpa_frame->priv_data_len = htons(cm_node->pdata.size);
+}
+
+/**
+ * irdma_build_mpa_v2 - build a MPA V2 frame
+ * @cm_node: connection's node
+ * @start_addr: buffer start address
+ * @mpa_key: to do read0 or write0
+ */
+static void irdma_build_mpa_v2(struct irdma_cm_node *cm_node, void *start_addr,
+			       u8 mpa_key)
+{
+	struct ietf_mpa_v2 *mpa_frame = start_addr;
+	struct ietf_rtr_msg *rtr_msg = &mpa_frame->rtr_msg;
+	u16 ctrl_ird, ctrl_ord;
+
+	/* initialize the upper 5 bytes of the frame */
+	irdma_build_mpa_v1(cm_node, start_addr, mpa_key);
+	mpa_frame->flags |= IETF_MPA_V2_FLAG;
+	if (cm_node->iwdev->iw_ooo) {
+		mpa_frame->flags |= IETF_MPA_FLAGS_MARKERS;
+		cm_node->rcv_mark_en = true;
+	}
+	mpa_frame->priv_data_len = cpu_to_be16(be16_to_cpu(mpa_frame->priv_data_len) +
+					       IETF_RTR_MSG_SIZE);
+
+	/* initialize RTR msg */
+	if (cm_node->mpav2_ird_ord == IETF_NO_IRD_ORD) {
+		ctrl_ird = IETF_NO_IRD_ORD;
+		ctrl_ord = IETF_NO_IRD_ORD;
+	} else {
+		ctrl_ird = (cm_node->ird_size > IETF_NO_IRD_ORD) ?
+				   IETF_NO_IRD_ORD :
+				   cm_node->ird_size;
+		ctrl_ord = (cm_node->ord_size > IETF_NO_IRD_ORD) ?
+				   IETF_NO_IRD_ORD :
+				   cm_node->ord_size;
+	}
+	ctrl_ird |= IETF_PEER_TO_PEER;
+
+	switch (mpa_key) {
+	case MPA_KEY_REQUEST:
+		ctrl_ord |= IETF_RDMA0_WRITE;
+		ctrl_ord |= IETF_RDMA0_READ;
+		break;
+	case MPA_KEY_REPLY:
+		switch (cm_node->send_rdma0_op) {
+		case SEND_RDMA_WRITE_ZERO:
+			ctrl_ord |= IETF_RDMA0_WRITE;
+			break;
+		case SEND_RDMA_READ_ZERO:
+			ctrl_ord |= IETF_RDMA0_READ;
+			break;
+		}
+		break;
+	default:
+		break;
+	}
+	rtr_msg->ctrl_ird = htons(ctrl_ird);
+	rtr_msg->ctrl_ord = htons(ctrl_ord);
+}
+
+/**
+ * irdma_cm_build_mpa_frame - build mpa frame for mpa version 1 or version 2
+ * @cm_node: connection's node
+ * @mpa: mpa: data buffer
+ * @mpa_key: to do read0 or write0
+ */
+static int irdma_cm_build_mpa_frame(struct irdma_cm_node *cm_node,
+				    struct irdma_kmem_info *mpa, u8 mpa_key)
+{
+	int hdr_len = 0;
+
+	switch (cm_node->mpa_frame_rev) {
+	case IETF_MPA_V1:
+		hdr_len = sizeof(struct ietf_mpa_v1);
+		irdma_build_mpa_v1(cm_node, mpa->addr, mpa_key);
+		break;
+	case IETF_MPA_V2:
+		hdr_len = sizeof(struct ietf_mpa_v2);
+		irdma_build_mpa_v2(cm_node, mpa->addr, mpa_key);
+		break;
+	default:
+		break;
+	}
+
+	return hdr_len;
+}
+
+/**
+ * irdma_send_mpa_request - active node send mpa request to passive node
+ * @cm_node: connection's node
+ */
+static int irdma_send_mpa_request(struct irdma_cm_node *cm_node)
+{
+	struct irdma_puda_buf *sqbuf;
+
+	cm_node->mpa_hdr.addr = &cm_node->mpa_v2_frame;
+	cm_node->mpa_hdr.size = irdma_cm_build_mpa_frame(cm_node,
+							 &cm_node->mpa_hdr,
+							 MPA_KEY_REQUEST);
+	if (!cm_node->mpa_hdr.size) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: mpa size = %d\n", cm_node->mpa_hdr.size);
+		return -EINVAL;
+	}
+
+	sqbuf = cm_node->cm_core->form_cm_frame(cm_node, NULL,
+						&cm_node->mpa_hdr,
+						&cm_node->pdata, SET_ACK);
+	if (!sqbuf)
+		return -ENOMEM;
+
+	return irdma_schedule_cm_timer(cm_node, sqbuf, IRDMA_TIMER_TYPE_SEND, 1,
+				       0);
+}
+
+/**
+ * irdma_send_mpa_reject -
+ * @cm_node: connection's node
+ * @pdata: reject data for connection
+ * @plen: length of reject data
+ */
+static int irdma_send_mpa_reject(struct irdma_cm_node *cm_node,
+				 const void *pdata, u8 plen)
+{
+	struct irdma_puda_buf *sqbuf;
+	struct irdma_mpa_priv_info priv_info;
+
+	cm_node->mpa_hdr.addr = &cm_node->mpa_v2_frame;
+	cm_node->mpa_hdr.size = irdma_cm_build_mpa_frame(cm_node,
+							 &cm_node->mpa_hdr,
+							 MPA_KEY_REPLY);
+
+	cm_node->mpa_frame.flags |= IETF_MPA_FLAGS_REJECT;
+	priv_info.addr = pdata;
+	priv_info.size = plen;
+
+	sqbuf = cm_node->cm_core->form_cm_frame(cm_node, NULL,
+						&cm_node->mpa_hdr, &priv_info,
+						SET_ACK | SET_FIN);
+	if (!sqbuf)
+		return -ENOMEM;
+
+	cm_node->state = IRDMA_CM_STATE_FIN_WAIT1;
+
+	return irdma_schedule_cm_timer(cm_node, sqbuf, IRDMA_TIMER_TYPE_SEND, 1,
+				       0);
+}
+
+/**
+ * irdma_negotiate_mpa_v2_ird_ord - negotiate MPAv2 IRD/ORD
+ * @cm_node: connection's node
+ * @buf: Data pointer
+ */
+static int irdma_negotiate_mpa_v2_ird_ord(struct irdma_cm_node *cm_node,
+					  u8 *buf)
+{
+	struct ietf_mpa_v2 *mpa_v2_frame;
+	struct ietf_rtr_msg *rtr_msg;
+	u16 ird_size;
+	u16 ord_size;
+	u16 ctrl_ord;
+	u16 ctrl_ird;
+
+	mpa_v2_frame = (struct ietf_mpa_v2 *)buf;
+	rtr_msg = &mpa_v2_frame->rtr_msg;
+
+	/* parse rtr message */
+	ctrl_ord = ntohs(rtr_msg->ctrl_ord);
+	ctrl_ird = ntohs(rtr_msg->ctrl_ird);
+	ird_size = ctrl_ird & IETF_NO_IRD_ORD;
+	ord_size = ctrl_ord & IETF_NO_IRD_ORD;
+
+	if (!(ctrl_ird & IETF_PEER_TO_PEER))
+		return -EOPNOTSUPP;
+
+	if (ird_size == IETF_NO_IRD_ORD || ord_size == IETF_NO_IRD_ORD) {
+		cm_node->mpav2_ird_ord = IETF_NO_IRD_ORD;
+		goto negotiate_done;
+	}
+
+	if (cm_node->state != IRDMA_CM_STATE_MPAREQ_SENT) {
+		/* responder */
+		if (!ord_size && (ctrl_ord & IETF_RDMA0_READ))
+			cm_node->ird_size = 1;
+		if (cm_node->ord_size > ird_size)
+			cm_node->ord_size = ird_size;
+	} else {
+		/* initiator */
+		if (!ird_size && (ctrl_ord & IETF_RDMA0_READ))
+			/* Remote peer doesn't support RDMA0_READ */
+			return -EOPNOTSUPP;
+
+		if (cm_node->ord_size > ird_size)
+			cm_node->ord_size = ird_size;
+
+		if (cm_node->ird_size < ord_size)
+		/* no resources available */
+			return -EINVAL;
+	}
+
+negotiate_done:
+	if (ctrl_ord & IETF_RDMA0_READ)
+		cm_node->send_rdma0_op = SEND_RDMA_READ_ZERO;
+	else if (ctrl_ord & IETF_RDMA0_WRITE)
+		cm_node->send_rdma0_op = SEND_RDMA_WRITE_ZERO;
+	else
+		/* Not supported RDMA0 operation */
+		return -EOPNOTSUPP;
+
+	ibdev_dbg(&cm_node->iwdev->ibdev,
+		  "CM: MPAV2 Negotiated ORD: %d, IRD: %d\n",
+		  cm_node->ord_size, cm_node->ird_size);
+	trace_irdma_negotiate_mpa_v2(cm_node);
+	return 0;
+}
+
+/**
+ * irdma_parse_mpa - process an IETF MPA frame
+ * @cm_node: connection's node
+ * @buf: Data pointer
+ * @type: to return accept or reject
+ * @len: Len of mpa buffer
+ */
+static int irdma_parse_mpa(struct irdma_cm_node *cm_node, u8 *buf, u32 *type,
+			   u32 len)
+{
+	struct ietf_mpa_v1 *mpa_frame;
+	int mpa_hdr_len, priv_data_len, ret;
+
+	*type = IRDMA_MPA_REQUEST_ACCEPT;
+
+	if (len < sizeof(struct ietf_mpa_v1)) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: ietf buffer small (%x)\n", len);
+		return -EINVAL;
+	}
+
+	mpa_frame = (struct ietf_mpa_v1 *)buf;
+	mpa_hdr_len = sizeof(struct ietf_mpa_v1);
+	priv_data_len = ntohs(mpa_frame->priv_data_len);
+
+	if (priv_data_len > IETF_MAX_PRIV_DATA_LEN) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: private_data too big %d\n", priv_data_len);
+		return -EOVERFLOW;
+	}
+
+	if (mpa_frame->rev != IETF_MPA_V1 && mpa_frame->rev != IETF_MPA_V2) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: unsupported mpa rev = %d\n", mpa_frame->rev);
+		return -EINVAL;
+	}
+
+	if (mpa_frame->rev > cm_node->mpa_frame_rev) {
+		ibdev_dbg(&cm_node->iwdev->ibdev, "CM: rev %d\n",
+			  mpa_frame->rev);
+		return -EINVAL;
+	}
+
+	cm_node->mpa_frame_rev = mpa_frame->rev;
+	if (cm_node->state != IRDMA_CM_STATE_MPAREQ_SENT) {
+		if (memcmp(mpa_frame->key, IEFT_MPA_KEY_REQ,
+			   IETF_MPA_KEY_SIZE)) {
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: Unexpected MPA Key received\n");
+			return -EINVAL;
+		}
+	} else {
+		if (memcmp(mpa_frame->key, IEFT_MPA_KEY_REP,
+			   IETF_MPA_KEY_SIZE)) {
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: Unexpected MPA Key received\n");
+			return -EINVAL;
+		}
+	}
+
+	if (priv_data_len + mpa_hdr_len > len) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: ietf buffer len(%x + %x != %x)\n",
+			  priv_data_len, mpa_hdr_len, len);
+		return -EOVERFLOW;
+	}
+
+	if (len > IRDMA_MAX_CM_BUF) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: ietf buffer large len = %d\n", len);
+		return -EOVERFLOW;
+	}
+
+	switch (mpa_frame->rev) {
+	case IETF_MPA_V2:
+		mpa_hdr_len += IETF_RTR_MSG_SIZE;
+		ret = irdma_negotiate_mpa_v2_ird_ord(cm_node, buf);
+		if (ret)
+			return ret;
+		break;
+	case IETF_MPA_V1:
+	default:
+		break;
+	}
+
+	memcpy(cm_node->pdata_buf, buf + mpa_hdr_len, priv_data_len);
+	cm_node->pdata.size = priv_data_len;
+
+	if (mpa_frame->flags & IETF_MPA_FLAGS_REJECT)
+		*type = IRDMA_MPA_REQUEST_REJECT;
+
+	if (mpa_frame->flags & IETF_MPA_FLAGS_MARKERS)
+		cm_node->snd_mark_en = true;
+
+	return 0;
+}
+
+/**
+ * irdma_schedule_cm_timer
+ * @cm_node: connection's node
+ * @sqbuf: buffer to send
+ * @type: if it is send or close
+ * @send_retrans: if rexmits to be done
+ * @close_when_complete: is cm_node to be removed
+ *
+ * note - cm_node needs to be protected before calling this. Encase in:
+ *		irdma_rem_ref_cm_node(cm_core, cm_node);
+ *		irdma_schedule_cm_timer(...)
+ *		refcount_inc(&cm_node->refcnt);
+ */
+int irdma_schedule_cm_timer(struct irdma_cm_node *cm_node,
+			    struct irdma_puda_buf *sqbuf,
+			    enum irdma_timer_type type, int send_retrans,
+			    int close_when_complete)
+{
+	struct irdma_sc_vsi *vsi = &cm_node->iwdev->vsi;
+	struct irdma_cm_core *cm_core = cm_node->cm_core;
+	struct irdma_timer_entry *new_send;
+	u32 was_timer_set;
+	unsigned long flags;
+
+	new_send = kzalloc(sizeof(*new_send), GFP_ATOMIC);
+	if (!new_send) {
+		if (type != IRDMA_TIMER_TYPE_CLOSE)
+			irdma_free_sqbuf(vsi, sqbuf);
+		return -ENOMEM;
+	}
+
+	new_send->retrycount = IRDMA_DEFAULT_RETRYS;
+	new_send->retranscount = IRDMA_DEFAULT_RETRANS;
+	new_send->sqbuf = sqbuf;
+	new_send->timetosend = jiffies;
+	new_send->type = type;
+	new_send->send_retrans = send_retrans;
+	new_send->close_when_complete = close_when_complete;
+
+	if (type == IRDMA_TIMER_TYPE_CLOSE) {
+		new_send->timetosend += (HZ / 10);
+		if (cm_node->close_entry) {
+			kfree(new_send);
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: already close entry\n");
+			return -EINVAL;
+		}
+
+		cm_node->close_entry = new_send;
+	} else { /* type == IRDMA_TIMER_TYPE_SEND */
+		spin_lock_irqsave(&cm_node->retrans_list_lock, flags);
+		cm_node->send_entry = new_send;
+		refcount_inc(&cm_node->refcnt);
+		spin_unlock_irqrestore(&cm_node->retrans_list_lock, flags);
+		new_send->timetosend = jiffies + IRDMA_RETRY_TIMEOUT;
+
+		refcount_inc(&sqbuf->refcount);
+		irdma_puda_send_buf(vsi->ilq, sqbuf);
+		if (!send_retrans) {
+			irdma_cleanup_retrans_entry(cm_node);
+			if (close_when_complete)
+				irdma_rem_ref_cm_node(cm_node);
+			return 0;
+		}
+	}
+
+	spin_lock_irqsave(&cm_core->ht_lock, flags);
+	was_timer_set = timer_pending(&cm_core->tcp_timer);
+
+	if (!was_timer_set) {
+		cm_core->tcp_timer.expires = new_send->timetosend;
+		add_timer(&cm_core->tcp_timer);
+	}
+	spin_unlock_irqrestore(&cm_core->ht_lock, flags);
+
+	return 0;
+}
+
+/**
+ * irdma_retrans_expired - Could not rexmit the packet
+ * @cm_node: connection's node
+ */
+static void irdma_retrans_expired(struct irdma_cm_node *cm_node)
+{
+	enum irdma_cm_node_state state = cm_node->state;
+
+	cm_node->state = IRDMA_CM_STATE_CLOSED;
+	switch (state) {
+	case IRDMA_CM_STATE_SYN_RCVD:
+	case IRDMA_CM_STATE_CLOSING:
+		irdma_rem_ref_cm_node(cm_node);
+		break;
+	case IRDMA_CM_STATE_FIN_WAIT1:
+	case IRDMA_CM_STATE_LAST_ACK:
+		irdma_send_reset(cm_node);
+		break;
+	default:
+		refcount_inc(&cm_node->refcnt);
+		irdma_send_reset(cm_node);
+		irdma_create_event(cm_node, IRDMA_CM_EVENT_ABORTED);
+		break;
+	}
+}
+
+/**
+ * irdma_handle_close_entry - for handling retry/timeouts
+ * @cm_node: connection's node
+ * @rem_node: flag for remove cm_node
+ */
+static void irdma_handle_close_entry(struct irdma_cm_node *cm_node,
+				     u32 rem_node)
+{
+	struct irdma_timer_entry *close_entry = cm_node->close_entry;
+	struct irdma_qp *iwqp;
+	unsigned long flags;
+
+	if (!close_entry)
+		return;
+	iwqp = (struct irdma_qp *)close_entry->sqbuf;
+	if (iwqp) {
+		spin_lock_irqsave(&iwqp->lock, flags);
+		if (iwqp->cm_id) {
+			iwqp->hw_tcp_state = IRDMA_TCP_STATE_CLOSED;
+			iwqp->hw_iwarp_state = IRDMA_QP_STATE_ERROR;
+			iwqp->last_aeq = IRDMA_AE_RESET_SENT;
+			iwqp->ibqp_state = IB_QPS_ERR;
+			spin_unlock_irqrestore(&iwqp->lock, flags);
+			irdma_cm_disconn(iwqp);
+		} else {
+			spin_unlock_irqrestore(&iwqp->lock, flags);
+		}
+	} else if (rem_node) {
+		/* TIME_WAIT state */
+		irdma_rem_ref_cm_node(cm_node);
+	}
+
+	kfree(close_entry);
+	cm_node->close_entry = NULL;
+}
+
+/**
+ * irdma_cm_timer_tick - system's timer expired callback
+ * @t: Pointer to timer_list
+ */
+static void irdma_cm_timer_tick(struct timer_list *t)
+{
+	unsigned long nexttimeout = jiffies + IRDMA_LONG_TIME;
+	struct irdma_cm_node *cm_node;
+	struct irdma_timer_entry *send_entry, *close_entry;
+	struct list_head *list_core_temp;
+	struct list_head *list_node;
+	struct irdma_cm_core *cm_core = from_timer(cm_core, t, tcp_timer);
+	struct irdma_sc_vsi *vsi;
+	u32 settimer = 0;
+	unsigned long timetosend;
+	unsigned long flags;
+	struct list_head timer_list;
+
+	INIT_LIST_HEAD(&timer_list);
+
+	rcu_read_lock();
+	irdma_timer_list_prep(cm_core, &timer_list);
+	rcu_read_unlock();
+
+	list_for_each_safe (list_node, list_core_temp, &timer_list) {
+		cm_node = container_of(list_node, struct irdma_cm_node,
+				       timer_entry);
+		close_entry = cm_node->close_entry;
+
+		if (close_entry) {
+			if (time_after(close_entry->timetosend, jiffies)) {
+				if (nexttimeout > close_entry->timetosend ||
+				    !settimer) {
+					nexttimeout = close_entry->timetosend;
+					settimer = 1;
+				}
+			} else {
+				irdma_handle_close_entry(cm_node, 1);
+			}
+		}
+
+		spin_lock_irqsave(&cm_node->retrans_list_lock, flags);
+
+		send_entry = cm_node->send_entry;
+		if (!send_entry)
+			goto done;
+		if (time_after(send_entry->timetosend, jiffies)) {
+			if (cm_node->state != IRDMA_CM_STATE_OFFLOADED) {
+				if (nexttimeout > send_entry->timetosend ||
+				    !settimer) {
+					nexttimeout = send_entry->timetosend;
+					settimer = 1;
+				}
+			} else {
+				irdma_free_retrans_entry(cm_node);
+			}
+			goto done;
+		}
+
+		if (cm_node->state == IRDMA_CM_STATE_OFFLOADED ||
+		    cm_node->state == IRDMA_CM_STATE_CLOSED) {
+			irdma_free_retrans_entry(cm_node);
+			goto done;
+		}
+
+		if (!send_entry->retranscount || !send_entry->retrycount) {
+			irdma_free_retrans_entry(cm_node);
+
+			spin_unlock_irqrestore(&cm_node->retrans_list_lock,
+					       flags);
+			irdma_retrans_expired(cm_node);
+			cm_node->state = IRDMA_CM_STATE_CLOSED;
+			spin_lock_irqsave(&cm_node->retrans_list_lock, flags);
+			goto done;
+		}
+		spin_unlock_irqrestore(&cm_node->retrans_list_lock, flags);
+
+		vsi = &cm_node->iwdev->vsi;
+		if (!cm_node->ack_rcvd) {
+			refcount_inc(&send_entry->sqbuf->refcount);
+			irdma_puda_send_buf(vsi->ilq, send_entry->sqbuf);
+			cm_node->cm_core->stats_pkt_retrans++;
+		}
+
+		spin_lock_irqsave(&cm_node->retrans_list_lock, flags);
+		if (send_entry->send_retrans) {
+			send_entry->retranscount--;
+			timetosend = (IRDMA_RETRY_TIMEOUT <<
+				      (IRDMA_DEFAULT_RETRANS -
+				       send_entry->retranscount));
+
+			send_entry->timetosend = jiffies +
+			    min(timetosend, IRDMA_MAX_TIMEOUT);
+			if (nexttimeout > send_entry->timetosend || !settimer) {
+				nexttimeout = send_entry->timetosend;
+				settimer = 1;
+			}
+		} else {
+			int close_when_complete;
+
+			close_when_complete = send_entry->close_when_complete;
+			irdma_free_retrans_entry(cm_node);
+			if (close_when_complete)
+				irdma_rem_ref_cm_node(cm_node);
+		}
+done:
+		spin_unlock_irqrestore(&cm_node->retrans_list_lock, flags);
+		irdma_rem_ref_cm_node(cm_node);
+	}
+
+	if (settimer) {
+		spin_lock_irqsave(&cm_core->ht_lock, flags);
+		if (!timer_pending(&cm_core->tcp_timer)) {
+			cm_core->tcp_timer.expires = nexttimeout;
+			add_timer(&cm_core->tcp_timer);
+		}
+		spin_unlock_irqrestore(&cm_core->ht_lock, flags);
+	}
+}
+
+/**
+ * irdma_send_syn - send SYN packet
+ * @cm_node: connection's node
+ * @sendack: flag to set ACK bit or not
+ */
+int irdma_send_syn(struct irdma_cm_node *cm_node, u32 sendack)
+{
+	struct irdma_puda_buf *sqbuf;
+	int flags = SET_SYN;
+	char optionsbuf[sizeof(struct option_mss) +
+			sizeof(struct option_windowscale) +
+			sizeof(struct option_base) + TCP_OPTIONS_PADDING];
+	struct irdma_kmem_info opts;
+	int optionssize = 0;
+	/* Sending MSS option */
+	union all_known_options *options;
+
+	opts.addr = optionsbuf;
+	if (!cm_node)
+		return -EINVAL;
+
+	options = (union all_known_options *)&optionsbuf[optionssize];
+	options->mss.optionnum = OPTION_NUM_MSS;
+	options->mss.len = sizeof(struct option_mss);
+	options->mss.mss = htons(cm_node->tcp_cntxt.mss);
+	optionssize += sizeof(struct option_mss);
+
+	options = (union all_known_options *)&optionsbuf[optionssize];
+	options->windowscale.optionnum = OPTION_NUM_WINDOW_SCALE;
+	options->windowscale.len = sizeof(struct option_windowscale);
+	options->windowscale.shiftcount = cm_node->tcp_cntxt.rcv_wscale;
+	optionssize += sizeof(struct option_windowscale);
+	options = (union all_known_options *)&optionsbuf[optionssize];
+	options->eol = OPTION_NUM_EOL;
+	optionssize += 1;
+
+	if (sendack)
+		flags |= SET_ACK;
+
+	opts.size = optionssize;
+
+	sqbuf = cm_node->cm_core->form_cm_frame(cm_node, &opts, NULL, NULL,
+						flags);
+	if (!sqbuf)
+		return -ENOMEM;
+
+	return irdma_schedule_cm_timer(cm_node, sqbuf, IRDMA_TIMER_TYPE_SEND, 1,
+				       0);
+}
+
+/**
+ * irdma_send_ack - Send ACK packet
+ * @cm_node: connection's node
+ */
+void irdma_send_ack(struct irdma_cm_node *cm_node)
+{
+	struct irdma_puda_buf *sqbuf;
+	struct irdma_sc_vsi *vsi = &cm_node->iwdev->vsi;
+
+	sqbuf = cm_node->cm_core->form_cm_frame(cm_node, NULL, NULL, NULL,
+						SET_ACK);
+	if (sqbuf)
+		irdma_puda_send_buf(vsi->ilq, sqbuf);
+}
+
+/**
+ * irdma_send_fin - Send FIN pkt
+ * @cm_node: connection's node
+ */
+static int irdma_send_fin(struct irdma_cm_node *cm_node)
+{
+	struct irdma_puda_buf *sqbuf;
+
+	sqbuf = cm_node->cm_core->form_cm_frame(cm_node, NULL, NULL, NULL,
+						SET_ACK | SET_FIN);
+	if (!sqbuf)
+		return -ENOMEM;
+
+	return irdma_schedule_cm_timer(cm_node, sqbuf, IRDMA_TIMER_TYPE_SEND, 1,
+				       0);
+}
+
+/**
+ * irdma_find_listener - find a cm node listening on this addr-port pair
+ * @cm_core: cm's core
+ * @dst_addr: listener ip addr
+ * @dst_port: listener tcp port num
+ * @vlan_id: virtual LAN ID
+ * @listener_state: state to match with listen node's
+ */
+static struct irdma_cm_listener *
+irdma_find_listener(struct irdma_cm_core *cm_core, u32 *dst_addr, u16 dst_port,
+		    u16 vlan_id, enum irdma_cm_listener_state listener_state)
+{
+	struct irdma_cm_listener *listen_node;
+	static const u32 ip_zero[4] = { 0, 0, 0, 0 };
+	u32 listen_addr[4];
+	u16 listen_port;
+	unsigned long flags;
+
+	/* walk list and find cm_node associated with this session ID */
+	spin_lock_irqsave(&cm_core->listen_list_lock, flags);
+	list_for_each_entry (listen_node, &cm_core->listen_list, list) {
+		memcpy(listen_addr, listen_node->loc_addr, sizeof(listen_addr));
+		listen_port = listen_node->loc_port;
+		/* compare node pair, return node handle if a match */
+		if ((!memcmp(listen_addr, dst_addr, sizeof(listen_addr)) ||
+		     !memcmp(listen_addr, ip_zero, sizeof(listen_addr))) &&
+		    listen_port == dst_port &&
+		    vlan_id == listen_node->vlan_id &&
+		    (listener_state & listen_node->listener_state)) {
+			refcount_inc(&listen_node->refcnt);
+			spin_unlock_irqrestore(&cm_core->listen_list_lock,
+					       flags);
+			trace_irdma_find_listener(listen_node);
+			return listen_node;
+		}
+	}
+	spin_unlock_irqrestore(&cm_core->listen_list_lock, flags);
+
+	return NULL;
+}
+
+/**
+ * irdma_del_multiple_qhash - Remove qhash and child listens
+ * @iwdev: iWarp device
+ * @cm_info: CM info for parent listen node
+ * @cm_parent_listen_node: The parent listen node
+ */
+static enum irdma_status_code
+irdma_del_multiple_qhash(struct irdma_device *iwdev,
+			 struct irdma_cm_info *cm_info,
+			 struct irdma_cm_listener *cm_parent_listen_node)
+{
+	struct irdma_cm_listener *child_listen_node;
+	enum irdma_status_code ret = IRDMA_ERR_CFG;
+	struct list_head *pos, *tpos;
+	unsigned long flags;
+
+	spin_lock_irqsave(&iwdev->cm_core.listen_list_lock, flags);
+	list_for_each_safe (pos, tpos,
+			    &cm_parent_listen_node->child_listen_list) {
+		child_listen_node = list_entry(pos, struct irdma_cm_listener,
+					       child_listen_list);
+		if (child_listen_node->ipv4)
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: removing child listen for IP=%pI4, port=%d, vlan=%d\n",
+				  child_listen_node->loc_addr,
+				  child_listen_node->loc_port,
+				  child_listen_node->vlan_id);
+		else
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: removing child listen for IP=%pI6, port=%d, vlan=%d\n",
+				  child_listen_node->loc_addr,
+				  child_listen_node->loc_port,
+				  child_listen_node->vlan_id);
+		trace_irdma_del_multiple_qhash(child_listen_node);
+		list_del(pos);
+		memcpy(cm_info->loc_addr, child_listen_node->loc_addr,
+		       sizeof(cm_info->loc_addr));
+		cm_info->vlan_id = child_listen_node->vlan_id;
+		if (child_listen_node->qhash_set) {
+			ret = irdma_manage_qhash(iwdev, cm_info,
+						 IRDMA_QHASH_TYPE_TCP_SYN,
+						 IRDMA_QHASH_MANAGE_TYPE_DELETE,
+						 NULL, false);
+			child_listen_node->qhash_set = false;
+		} else {
+			ret = 0;
+		}
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: Child listen node freed = %p\n",
+			  child_listen_node);
+		kfree(child_listen_node);
+		cm_parent_listen_node->cm_core->stats_listen_nodes_destroyed++;
+	}
+	spin_unlock_irqrestore(&iwdev->cm_core.listen_list_lock, flags);
+
+	return ret;
+}
+
+/**
+ * irdma_netdev_vlan_ipv6 - Gets the netdev and mac
+ * @addr: local IPv6 address
+ * @vlan_id: vlan id for the given IPv6 address
+ * @mac: mac address for the given IPv6 address
+ *
+ * Returns the net_device of the IPv6 address and also sets the
+ * vlan id and mac for that address.
+ */
+struct net_device *irdma_netdev_vlan_ipv6(u32 *addr, u16 *vlan_id, u8 *mac)
+{
+	struct net_device *ip_dev = NULL;
+	struct in6_addr laddr6;
+
+	if (!IS_ENABLED(CONFIG_IPV6))
+		return NULL;
+
+	irdma_copy_ip_htonl(laddr6.in6_u.u6_addr32, addr);
+	if (vlan_id)
+		*vlan_id = 0xFFFF;	/* Match rdma_vlan_dev_vlan_id() */
+	if (mac)
+		eth_zero_addr(mac);
+
+	rcu_read_lock();
+	for_each_netdev_rcu (&init_net, ip_dev) {
+		if (ipv6_chk_addr(&init_net, &laddr6, ip_dev, 1)) {
+			if (vlan_id)
+				*vlan_id = rdma_vlan_dev_vlan_id(ip_dev);
+			if (ip_dev->dev_addr && mac)
+				ether_addr_copy(mac, ip_dev->dev_addr);
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ip_dev;
+}
+
+/**
+ * irdma_get_vlan_ipv4 - Returns the vlan_id for IPv4 address
+ * @addr: local IPv4 address
+ */
+u16 irdma_get_vlan_ipv4(u32 *addr)
+{
+	struct net_device *netdev;
+	u16 vlan_id = 0xFFFF;
+
+	netdev = ip_dev_find(&init_net, htonl(addr[0]));
+	if (netdev) {
+		vlan_id = rdma_vlan_dev_vlan_id(netdev);
+		dev_put(netdev);
+	}
+
+	return vlan_id;
+}
+
+/**
+ * irdma_add_mqh_6 - Adds multiple qhashes for IPv6
+ * @iwdev: iWarp device
+ * @cm_info: CM info for parent listen node
+ * @cm_parent_listen_node: The parent listen node
+ *
+ * Adds a qhash and a child listen node for every IPv6 address
+ * on the adapter and adds the associated qhash filter
+ */
+static enum irdma_status_code
+irdma_add_mqh_6(struct irdma_device *iwdev, struct irdma_cm_info *cm_info,
+		struct irdma_cm_listener *cm_parent_listen_node)
+{
+	struct net_device *ip_dev;
+	struct inet6_dev *idev;
+	struct inet6_ifaddr *ifp, *tmp;
+	enum irdma_status_code ret = 0;
+	struct irdma_cm_listener *child_listen_node;
+	unsigned long flags;
+
+	rtnl_lock();
+	for_each_netdev(&init_net, ip_dev) {
+		if (!(ip_dev->flags & IFF_UP))
+			continue;
+
+		if (((rdma_vlan_dev_vlan_id(ip_dev) >= VLAN_N_VID) ||
+		     (rdma_vlan_dev_real_dev(ip_dev) != iwdev->netdev)) &&
+		    ip_dev != iwdev->netdev)
+			continue;
+
+		idev = __in6_dev_get(ip_dev);
+		if (!idev) {
+			ibdev_dbg(&iwdev->ibdev, "CM: idev == NULL\n");
+			break;
+		}
+		list_for_each_entry_safe (ifp, tmp, &idev->addr_list, if_list) {
+			ibdev_dbg(&iwdev->ibdev, "CM: IP=%pI6, vlan_id=%d, MAC=%pM\n",
+				  &ifp->addr, rdma_vlan_dev_vlan_id(ip_dev),
+				  ip_dev->dev_addr);
+			child_listen_node = kzalloc(sizeof(*child_listen_node), GFP_KERNEL);
+			ibdev_dbg(&iwdev->ibdev, "CM: Allocating child listener %p\n",
+				  child_listen_node);
+			if (!child_listen_node) {
+				ibdev_dbg(&iwdev->ibdev, "CM: listener memory allocation\n");
+				ret = IRDMA_ERR_NO_MEMORY;
+				goto exit;
+			}
+
+			cm_info->vlan_id = rdma_vlan_dev_vlan_id(ip_dev);
+			cm_parent_listen_node->vlan_id = cm_info->vlan_id;
+			memcpy(child_listen_node, cm_parent_listen_node,
+			       sizeof(*child_listen_node));
+			irdma_copy_ip_ntohl(child_listen_node->loc_addr,
+					    ifp->addr.in6_u.u6_addr32);
+			memcpy(cm_info->loc_addr, child_listen_node->loc_addr,
+			       sizeof(cm_info->loc_addr));
+			ret = irdma_manage_qhash(iwdev, cm_info,
+						 IRDMA_QHASH_TYPE_TCP_SYN,
+						 IRDMA_QHASH_MANAGE_TYPE_ADD,
+						 NULL, true);
+			if (ret) {
+				kfree(child_listen_node);
+				continue;
+			}
+
+			trace_irdma_add_mqh_6(iwdev, child_listen_node,
+					      ip_dev->dev_addr);
+
+			child_listen_node->qhash_set = true;
+			spin_lock_irqsave(&iwdev->cm_core.listen_list_lock, flags);
+			list_add(&child_listen_node->child_listen_list,
+				 &cm_parent_listen_node->child_listen_list);
+			spin_unlock_irqrestore(&iwdev->cm_core.listen_list_lock, flags);
+			cm_parent_listen_node->cm_core->stats_listen_nodes_created++;
+		}
+	}
+exit:
+	rtnl_unlock();
+
+	return ret;
+}
+
+/**
+ * irdma_add_mqh_4 - Adds multiple qhashes for IPv4
+ * @iwdev: iWarp device
+ * @cm_info: CM info for parent listen node
+ * @cm_parent_listen_node: The parent listen node
+ *
+ * Adds a qhash and a child listen node for every IPv4 address
+ * on the adapter and adds the associated qhash filter
+ */
+static enum irdma_status_code
+irdma_add_mqh_4(struct irdma_device *iwdev, struct irdma_cm_info *cm_info,
+		struct irdma_cm_listener *cm_parent_listen_node)
+{
+	struct net_device *ip_dev;
+	struct in_device *idev;
+	struct irdma_cm_listener *child_listen_node;
+	enum irdma_status_code ret = 0;
+	unsigned long flags;
+	const struct in_ifaddr *ifa;
+
+	rtnl_lock();
+	for_each_netdev(&init_net, ip_dev) {
+		if (!(ip_dev->flags & IFF_UP))
+			continue;
+
+		if (((rdma_vlan_dev_vlan_id(ip_dev) >= VLAN_N_VID) ||
+		     (rdma_vlan_dev_real_dev(ip_dev) != iwdev->netdev)) &&
+		    ip_dev != iwdev->netdev)
+			continue;
+
+		idev = in_dev_get(ip_dev);
+		in_dev_for_each_ifa_rtnl(ifa, idev) {
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: Allocating child CM Listener forIP=%pI4, vlan_id=%d, MAC=%pM\n",
+				  &ifa->ifa_address, rdma_vlan_dev_vlan_id(ip_dev),
+				  ip_dev->dev_addr);
+			child_listen_node = kzalloc(sizeof(*child_listen_node), GFP_KERNEL);
+			cm_parent_listen_node->cm_core->stats_listen_nodes_created++;
+			ibdev_dbg(&iwdev->ibdev, "CM: Allocating child listener %p\n",
+				  child_listen_node);
+			if (!child_listen_node) {
+				ibdev_dbg(&iwdev->ibdev, "CM: listener memory allocation\n");
+				in_dev_put(idev);
+				ret = IRDMA_ERR_NO_MEMORY;
+				goto exit;
+			}
+
+			cm_info->vlan_id = rdma_vlan_dev_vlan_id(ip_dev);
+			cm_parent_listen_node->vlan_id = cm_info->vlan_id;
+			memcpy(child_listen_node, cm_parent_listen_node,
+			       sizeof(*child_listen_node));
+			child_listen_node->loc_addr[0] =
+				ntohl(ifa->ifa_address);
+			memcpy(cm_info->loc_addr, child_listen_node->loc_addr,
+			       sizeof(cm_info->loc_addr));
+			ret = irdma_manage_qhash(iwdev, cm_info,
+						 IRDMA_QHASH_TYPE_TCP_SYN,
+						 IRDMA_QHASH_MANAGE_TYPE_ADD,
+						 NULL, true);
+			if (ret) {
+				kfree(child_listen_node);
+				cm_parent_listen_node->cm_core
+					->stats_listen_nodes_created--;
+				continue;
+			}
+
+			trace_irdma_add_mqh_4(iwdev, child_listen_node,
+					      ip_dev->dev_addr);
+
+			child_listen_node->qhash_set = true;
+			spin_lock_irqsave(&iwdev->cm_core.listen_list_lock,
+					  flags);
+			list_add(&child_listen_node->child_listen_list,
+				 &cm_parent_listen_node->child_listen_list);
+			spin_unlock_irqrestore(&iwdev->cm_core.listen_list_lock, flags);
+		}
+		in_dev_put(idev);
+	}
+exit:
+	rtnl_unlock();
+
+	return ret;
+}
+
+/**
+ * irdma_add_mqh - Adds multiple qhashes
+ * @iwdev: iWarp device
+ * @cm_info: CM info for parent listen node
+ * @cm_listen_node: The parent listen node
+ */
+static enum irdma_status_code
+irdma_add_mqh(struct irdma_device *iwdev, struct irdma_cm_info *cm_info,
+	      struct irdma_cm_listener *cm_listen_node)
+{
+	if (cm_info->ipv4)
+		return irdma_add_mqh_4(iwdev, cm_info, cm_listen_node);
+	else
+		return irdma_add_mqh_6(iwdev, cm_info, cm_listen_node);
+}
+
+/**
+ * irdma_reset_list_prep - add connection nodes slated for reset to list
+ * @cm_core: cm's core
+ * @listener: pointer to listener node
+ * @reset_list: a list to which cm_node will be selected
+ */
+static void irdma_reset_list_prep(struct irdma_cm_core *cm_core,
+				  struct irdma_cm_listener *listener,
+				  struct list_head *reset_list)
+{
+	struct irdma_cm_node *cm_node;
+	int bkt;
+
+	hash_for_each_rcu(cm_core->cm_hash_tbl, bkt, cm_node, list) {
+		if (cm_node->listener == listener &&
+		    !cm_node->accelerated &&
+		    refcount_inc_not_zero(&cm_node->refcnt))
+			list_add(&cm_node->reset_entry, reset_list);
+	}
+}
+
+/**
+ * irdma_dec_refcnt_listen - delete listener and associated cm nodes
+ * @cm_core: cm's core
+ * @listener: pointer to listener node
+ * @free_hanging_nodes: to free associated cm_nodes
+ * @apbvt_del: flag to delete the apbvt
+ */
+static int irdma_dec_refcnt_listen(struct irdma_cm_core *cm_core,
+				   struct irdma_cm_listener *listener,
+				   int free_hanging_nodes, bool apbvt_del)
+{
+	int err;
+	struct list_head *list_pos;
+	struct list_head *list_temp;
+	struct irdma_cm_node *cm_node;
+	struct list_head reset_list;
+	struct irdma_cm_info nfo;
+	enum irdma_cm_node_state old_state;
+	unsigned long flags;
+
+	trace_irdma_dec_refcnt_listen(listener, __builtin_return_address(0));
+	/* free non-accelerated child nodes for this listener */
+	INIT_LIST_HEAD(&reset_list);
+	if (free_hanging_nodes) {
+		rcu_read_lock();
+		irdma_reset_list_prep(cm_core, listener, &reset_list);
+		rcu_read_unlock();
+	}
+
+	list_for_each_safe (list_pos, list_temp, &reset_list) {
+		cm_node = container_of(list_pos, struct irdma_cm_node,
+				       reset_entry);
+		if (cm_node->state >= IRDMA_CM_STATE_FIN_WAIT1) {
+			irdma_rem_ref_cm_node(cm_node);
+			continue;
+		}
+
+		irdma_cleanup_retrans_entry(cm_node);
+		err = irdma_send_reset(cm_node);
+		if (err) {
+			cm_node->state = IRDMA_CM_STATE_CLOSED;
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: send reset failed\n");
+		} else {
+			old_state = cm_node->state;
+			cm_node->state = IRDMA_CM_STATE_LISTENER_DESTROYED;
+			if (old_state != IRDMA_CM_STATE_MPAREQ_RCVD)
+				irdma_rem_ref_cm_node(cm_node);
+		}
+	}
+
+	if (refcount_dec_and_test(&listener->refcnt)) {
+		spin_lock_irqsave(&cm_core->listen_list_lock, flags);
+		list_del(&listener->list);
+		spin_unlock_irqrestore(&cm_core->listen_list_lock, flags);
+
+		if (apbvt_del)
+			irdma_del_apbvt(listener->iwdev,
+					listener->apbvt_entry);
+		memcpy(nfo.loc_addr, listener->loc_addr, sizeof(nfo.loc_addr));
+		nfo.loc_port = listener->loc_port;
+		nfo.ipv4 = listener->ipv4;
+		nfo.vlan_id = listener->vlan_id;
+		nfo.user_pri = listener->user_pri;
+		nfo.qh_qpid = listener->iwdev->vsi.ilq->qp_id;
+
+		if (!list_empty(&listener->child_listen_list)) {
+			irdma_del_multiple_qhash(listener->iwdev, &nfo,
+						 listener);
+		} else {
+			if (listener->qhash_set)
+				irdma_manage_qhash(listener->iwdev,
+						   &nfo,
+						   IRDMA_QHASH_TYPE_TCP_SYN,
+						   IRDMA_QHASH_MANAGE_TYPE_DELETE,
+						   NULL, false);
+		}
+
+		cm_core->stats_listen_destroyed++;
+		cm_core->stats_listen_nodes_destroyed++;
+		ibdev_dbg(&listener->iwdev->ibdev,
+			  "CM: loc_port=0x%04x loc_addr=%pI4 cm_listen_node=%p cm_id=%p qhash_set=%d vlan_id=%d apbvt_del=%d\n",
+			  listener->loc_port, listener->loc_addr, listener,
+			  listener->cm_id, listener->qhash_set,
+			  listener->vlan_id, apbvt_del);
+		kfree(listener);
+		listener = NULL;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+/**
+ * irdma_cm_del_listen - delete a listener
+ * @cm_core: cm's core
+ * @listener: passive connection's listener
+ * @apbvt_del: flag to delete apbvt
+ */
+static int irdma_cm_del_listen(struct irdma_cm_core *cm_core,
+			       struct irdma_cm_listener *listener,
+			       bool apbvt_del)
+{
+	listener->listener_state = IRDMA_CM_LISTENER_PASSIVE_STATE;
+	listener->cm_id = NULL;
+
+	return irdma_dec_refcnt_listen(cm_core, listener, 1, apbvt_del);
+}
+
+/**
+ * irdma_addr_resolve_neigh - resolve neighbor address
+ * @iwdev: iwarp device structure
+ * @src_ip: local ip address
+ * @dst_ip: remote ip address
+ * @arpindex: if there is an arp entry
+ */
+static int irdma_addr_resolve_neigh(struct irdma_device *iwdev, u32 src_ip,
+				    u32 dst_ip, int arpindex)
+{
+	struct rtable *rt;
+	struct neighbour *neigh;
+	int rc = arpindex;
+	__be32 dst_ipaddr = htonl(dst_ip);
+	__be32 src_ipaddr = htonl(src_ip);
+
+	rt = ip_route_output(&init_net, dst_ipaddr, src_ipaddr, 0, 0);
+	if (IS_ERR(rt)) {
+		ibdev_dbg(&iwdev->ibdev, "CM: ip_route_output fail\n");
+		return -EINVAL;
+	}
+
+	neigh = dst_neigh_lookup(&rt->dst, &dst_ipaddr);
+	if (!neigh)
+		goto exit;
+
+	if (neigh->nud_state & NUD_VALID)
+		rc = irdma_add_arp(iwdev->rf, &dst_ip, true, neigh->ha);
+	else
+		neigh_event_send(neigh, NULL);
+	if (neigh)
+		neigh_release(neigh);
+exit:
+	ip_rt_put(rt);
+
+	return rc;
+}
+
+/**
+ * irdma_get_dst_ipv6 - get destination cache entry via ipv6 lookup
+ * @src_addr: local ipv6 sock address
+ * @dst_addr: destination ipv6 sock address
+ */
+static struct dst_entry *irdma_get_dst_ipv6(struct sockaddr_in6 *src_addr,
+					    struct sockaddr_in6 *dst_addr)
+{
+	struct dst_entry *dst = NULL;
+
+	if ((IS_ENABLED(CONFIG_IPV6))) {
+		struct flowi6 fl6 = {};
+
+		fl6.daddr = dst_addr->sin6_addr;
+		fl6.saddr = src_addr->sin6_addr;
+		if (ipv6_addr_type(&fl6.daddr) & IPV6_ADDR_LINKLOCAL)
+			fl6.flowi6_oif = dst_addr->sin6_scope_id;
+
+		dst = ip6_route_output(&init_net, NULL, &fl6);
+	}
+
+	return dst;
+}
+
+/**
+ * irdma_addr_resolve_neigh_ipv6 - resolve neighbor ipv6 address
+ * @iwdev: iwarp device structure
+ * @src: local ip address
+ * @dest: remote ip address
+ * @arpindex: if there is an arp entry
+ */
+static int irdma_addr_resolve_neigh_ipv6(struct irdma_device *iwdev, u32 *src,
+					 u32 *dest, int arpindex)
+{
+	struct neighbour *neigh;
+	int rc = arpindex;
+	struct dst_entry *dst;
+	struct sockaddr_in6 dst_addr = {};
+	struct sockaddr_in6 src_addr = {};
+
+	dst_addr.sin6_family = AF_INET6;
+	irdma_copy_ip_htonl(dst_addr.sin6_addr.in6_u.u6_addr32, dest);
+	src_addr.sin6_family = AF_INET6;
+	irdma_copy_ip_htonl(src_addr.sin6_addr.in6_u.u6_addr32, src);
+	dst = irdma_get_dst_ipv6(&src_addr, &dst_addr);
+	if (!dst || dst->error) {
+		if (dst) {
+			dst_release(dst);
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: ip6_route_output returned dst->error = %d\n",
+				  dst->error);
+		}
+		return -EINVAL;
+	}
+
+	neigh = dst_neigh_lookup(dst, dst_addr.sin6_addr.in6_u.u6_addr32);
+	if (!neigh)
+		goto exit;
+
+	ibdev_dbg(&iwdev->ibdev, "CM: dst_neigh_lookup MAC=%pM\n",
+		  neigh->ha);
+
+	trace_irdma_addr_resolve(iwdev, neigh->ha);
+
+	if (neigh->nud_state & NUD_VALID)
+		rc = irdma_add_arp(iwdev->rf, dest, false, neigh->ha);
+	else
+		neigh_event_send(neigh, NULL);
+	if (neigh)
+		neigh_release(neigh);
+exit:
+	dst_release(dst);
+
+	return rc;
+}
+
+/**
+ * irdma_find_node - find a cm node that matches the reference cm node
+ * @cm_core: cm's core
+ * @rem_port: remote tcp port num
+ * @rem_addr: remote ip addr
+ * @loc_port: local tcp port num
+ * @loc_addr: local ip addr
+ * @vlan_id: local VLAN ID
+ */
+struct irdma_cm_node *irdma_find_node(struct irdma_cm_core *cm_core,
+				      u16 rem_port, u32 *rem_addr, u16 loc_port,
+				      u32 *loc_addr, u16 vlan_id)
+{
+	struct irdma_cm_node *cm_node;
+	u32 key = (rem_port << 16) | loc_port;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(cm_core->cm_hash_tbl, cm_node, list, key) {
+		if (cm_node->vlan_id == vlan_id &&
+		    cm_node->loc_port == loc_port && cm_node->rem_port == rem_port &&
+		    !memcmp(cm_node->loc_addr, loc_addr, sizeof(cm_node->loc_addr)) &&
+		    !memcmp(cm_node->rem_addr, rem_addr, sizeof(cm_node->rem_addr))) {
+			if (!refcount_inc_not_zero(&cm_node->refcnt))
+				goto exit;
+			rcu_read_unlock();
+			trace_irdma_find_node(cm_node, 0, NULL);
+			return cm_node;
+		}
+	}
+
+exit:
+	rcu_read_unlock();
+
+	/* no owner node */
+	return NULL;
+}
+
+/**
+ * irdma_add_hte_node - add a cm node to the hash table
+ * @cm_core: cm's core
+ * @cm_node: connection's node
+ */
+static void irdma_add_hte_node(struct irdma_cm_core *cm_core,
+			       struct irdma_cm_node *cm_node)
+{
+	unsigned long flags;
+	u32 key = (cm_node->rem_port << 16) | cm_node->loc_port;
+
+	spin_lock_irqsave(&cm_core->ht_lock, flags);
+	hash_add_rcu(cm_core->cm_hash_tbl, &cm_node->list, key);
+	spin_unlock_irqrestore(&cm_core->ht_lock, flags);
+}
+
+/**
+ * irdma_ipv4_is_lpb - check if loopback
+ * @loc_addr: local addr to compare
+ * @rem_addr: remote address
+ */
+bool irdma_ipv4_is_lpb(u32 loc_addr, u32 rem_addr)
+{
+	return ipv4_is_loopback(htonl(rem_addr)) || (loc_addr == rem_addr);
+}
+
+/**
+ * irdma_ipv6_is_lpb - check if loopback
+ * @loc_addr: local addr to compare
+ * @rem_addr: remote address
+ */
+bool irdma_ipv6_is_lpb(u32 *loc_addr, u32 *rem_addr)
+{
+	struct in6_addr raddr6;
+
+	irdma_copy_ip_htonl(raddr6.in6_u.u6_addr32, rem_addr);
+
+	return !memcmp(loc_addr, rem_addr, 16) || ipv6_addr_loopback(&raddr6);
+}
+
+/**
+ * irdma_cm_create_ah - create a cm address handle
+ * @cm_node: The connection manager node to create AH for
+ * @wait: Provides option to wait for ah creation or not
+ */
+static int irdma_cm_create_ah(struct irdma_cm_node *cm_node, bool wait)
+{
+	struct irdma_ah_info ah_info = {};
+	struct irdma_device *iwdev = cm_node->iwdev;
+
+	ether_addr_copy(ah_info.mac_addr, iwdev->netdev->dev_addr);
+
+	ah_info.hop_ttl = 0x40;
+	ah_info.tc_tos = cm_node->tos;
+	ah_info.vsi = &iwdev->vsi;
+
+	if (cm_node->ipv4) {
+		ah_info.ipv4_valid = true;
+		ah_info.dest_ip_addr[0] = cm_node->rem_addr[0];
+		ah_info.src_ip_addr[0] = cm_node->loc_addr[0];
+		ah_info.do_lpbk = irdma_ipv4_is_lpb(ah_info.src_ip_addr[0],
+						    ah_info.dest_ip_addr[0]);
+	} else {
+		memcpy(ah_info.dest_ip_addr, cm_node->rem_addr,
+		       sizeof(ah_info.dest_ip_addr));
+		memcpy(ah_info.src_ip_addr, cm_node->loc_addr,
+		       sizeof(ah_info.src_ip_addr));
+		ah_info.do_lpbk = irdma_ipv6_is_lpb(ah_info.src_ip_addr,
+						    ah_info.dest_ip_addr);
+	}
+
+	ah_info.vlan_tag = cm_node->vlan_id;
+	if (cm_node->vlan_id < VLAN_N_VID) {
+		ah_info.insert_vlan_tag = 1;
+		ah_info.vlan_tag |= cm_node->user_pri << VLAN_PRIO_SHIFT;
+	}
+
+	ah_info.dst_arpindex =
+		irdma_arp_table(iwdev->rf, ah_info.dest_ip_addr,
+				ah_info.ipv4_valid, NULL, IRDMA_ARP_RESOLVE);
+
+	if (irdma_puda_create_ah(&iwdev->rf->sc_dev, &ah_info, wait,
+				 IRDMA_PUDA_RSRC_TYPE_ILQ, cm_node,
+				 &cm_node->ah))
+		return -ENOMEM;
+
+	trace_irdma_create_ah(cm_node);
+	return 0;
+}
+
+/**
+ * irdma_cm_free_ah - free a cm address handle
+ * @cm_node: The connection manager node to create AH for
+ */
+static void irdma_cm_free_ah(struct irdma_cm_node *cm_node)
+{
+	struct irdma_device *iwdev = cm_node->iwdev;
+
+	trace_irdma_cm_free_ah(cm_node);
+	irdma_puda_free_ah(&iwdev->rf->sc_dev, cm_node->ah);
+	cm_node->ah = NULL;
+}
+
+/**
+ * irdma_make_cm_node - create a new instance of a cm node
+ * @cm_core: cm's core
+ * @iwdev: iwarp device structure
+ * @cm_info: quad info for connection
+ * @listener: passive connection's listener
+ */
+static struct irdma_cm_node *
+irdma_make_cm_node(struct irdma_cm_core *cm_core, struct irdma_device *iwdev,
+		   struct irdma_cm_info *cm_info,
+		   struct irdma_cm_listener *listener)
+{
+	struct irdma_cm_node *cm_node;
+	int oldarpindex;
+	int arpindex;
+	struct net_device *netdev = iwdev->netdev;
+
+	/* create an hte and cm_node for this instance */
+	cm_node = kzalloc(sizeof(*cm_node), GFP_ATOMIC);
+	if (!cm_node)
+		return NULL;
+
+	/* set our node specific transport info */
+	cm_node->ipv4 = cm_info->ipv4;
+	cm_node->vlan_id = cm_info->vlan_id;
+	if (cm_node->vlan_id >= VLAN_N_VID && iwdev->dcb)
+		cm_node->vlan_id = 0;
+	cm_node->tos = cm_info->tos;
+	cm_node->user_pri = cm_info->user_pri;
+	if (listener) {
+		if (listener->tos != cm_info->tos)
+			ibdev_warn(&iwdev->ibdev,
+				   "application TOS[%d] and remote client TOS[%d] mismatch\n",
+				   listener->tos, cm_info->tos);
+		cm_node->tos = max(listener->tos, cm_info->tos);
+		cm_node->user_pri = rt_tos2priority(cm_node->tos);
+		ibdev_dbg(&iwdev->ibdev,
+			  "DCB: listener: TOS:[%d] UP:[%d]\n", cm_node->tos,
+			  cm_node->user_pri);
+		trace_irdma_listener_tos(iwdev, cm_node->tos,
+					 cm_node->user_pri);
+	}
+	memcpy(cm_node->loc_addr, cm_info->loc_addr, sizeof(cm_node->loc_addr));
+	memcpy(cm_node->rem_addr, cm_info->rem_addr, sizeof(cm_node->rem_addr));
+	cm_node->loc_port = cm_info->loc_port;
+	cm_node->rem_port = cm_info->rem_port;
+
+	cm_node->mpa_frame_rev = IRDMA_CM_DEFAULT_MPA_VER;
+	cm_node->send_rdma0_op = SEND_RDMA_READ_ZERO;
+	cm_node->iwdev = iwdev;
+	cm_node->dev = &iwdev->rf->sc_dev;
+
+	cm_node->ird_size = cm_node->dev->hw_attrs.max_hw_ird;
+	cm_node->ord_size = cm_node->dev->hw_attrs.max_hw_ord;
+
+	cm_node->listener = listener;
+	cm_node->cm_id = cm_info->cm_id;
+	ether_addr_copy(cm_node->loc_mac, netdev->dev_addr);
+	spin_lock_init(&cm_node->retrans_list_lock);
+	cm_node->ack_rcvd = false;
+
+	init_completion(&cm_node->establish_comp);
+	refcount_set(&cm_node->refcnt, 1);
+	/* associate our parent CM core */
+	cm_node->cm_core = cm_core;
+	cm_node->tcp_cntxt.loc_id = IRDMA_CM_DEFAULT_LOCAL_ID;
+	cm_node->tcp_cntxt.rcv_wscale = iwdev->rcv_wscale;
+	cm_node->tcp_cntxt.rcv_wnd = iwdev->rcv_wnd >> cm_node->tcp_cntxt.rcv_wscale;
+	if (cm_node->ipv4) {
+		cm_node->tcp_cntxt.loc_seq_num = secure_tcp_seq(htonl(cm_node->loc_addr[0]),
+								htonl(cm_node->rem_addr[0]),
+								htons(cm_node->loc_port),
+								htons(cm_node->rem_port));
+		cm_node->tcp_cntxt.mss = iwdev->vsi.mtu - IRDMA_MTU_TO_MSS_IPV4;
+	} else if (IS_ENABLED(CONFIG_IPV6)) {
+		__be32 loc[4] = {
+			htonl(cm_node->loc_addr[0]), htonl(cm_node->loc_addr[1]),
+			htonl(cm_node->loc_addr[2]), htonl(cm_node->loc_addr[3])
+		};
+		__be32 rem[4] = {
+			htonl(cm_node->rem_addr[0]), htonl(cm_node->rem_addr[1]),
+			htonl(cm_node->rem_addr[2]), htonl(cm_node->rem_addr[3])
+		};
+		cm_node->tcp_cntxt.loc_seq_num = secure_tcpv6_seq(loc, rem,
+								  htons(cm_node->loc_port),
+								  htons(cm_node->rem_port));
+		cm_node->tcp_cntxt.mss = iwdev->vsi.mtu - IRDMA_MTU_TO_MSS_IPV6;
+	}
+
+	if ((cm_node->ipv4 &&
+	     irdma_ipv4_is_lpb(cm_node->loc_addr[0], cm_node->rem_addr[0])) ||
+	    (!cm_node->ipv4 &&
+	     irdma_ipv6_is_lpb(cm_node->loc_addr, cm_node->rem_addr))) {
+		cm_node->do_lpb = true;
+		arpindex = irdma_arp_table(iwdev->rf, cm_node->rem_addr,
+					   cm_node->ipv4, NULL,
+					   IRDMA_ARP_RESOLVE);
+	} else {
+		oldarpindex = irdma_arp_table(iwdev->rf, cm_node->rem_addr,
+					      cm_node->ipv4, NULL,
+					      IRDMA_ARP_RESOLVE);
+		if (cm_node->ipv4)
+			arpindex = irdma_addr_resolve_neigh(iwdev,
+							    cm_info->loc_addr[0],
+							    cm_info->rem_addr[0],
+							    oldarpindex);
+		else if (IS_ENABLED(CONFIG_IPV6))
+			arpindex = irdma_addr_resolve_neigh_ipv6(iwdev,
+								 cm_info->loc_addr,
+								 cm_info->rem_addr,
+								 oldarpindex);
+		else
+			arpindex = -EINVAL;
+	}
+
+	if (arpindex < 0)
+		goto err;
+
+	ether_addr_copy(cm_node->rem_mac,
+			iwdev->rf->arp_table[arpindex].mac_addr);
+	irdma_add_hte_node(cm_core, cm_node);
+	cm_core->stats_nodes_created++;
+	return cm_node;
+
+err:
+	kfree(cm_node);
+
+	return NULL;
+}
+
+static void irdma_cm_node_free_cb(struct rcu_head *rcu_head)
+{
+	struct irdma_cm_node *cm_node =
+			    container_of(rcu_head, struct irdma_cm_node, rcu_head);
+	struct irdma_cm_core *cm_core = cm_node->cm_core;
+	struct irdma_qp *iwqp;
+	struct irdma_cm_info nfo;
+
+	/* if the node is destroyed before connection was accelerated */
+	if (!cm_node->accelerated && cm_node->accept_pend) {
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: node destroyed before established\n");
+		atomic_dec(&cm_node->listener->pend_accepts_cnt);
+	}
+	if (cm_node->close_entry)
+		irdma_handle_close_entry(cm_node, 0);
+	if (cm_node->listener) {
+		irdma_dec_refcnt_listen(cm_core, cm_node->listener, 0, true);
+	} else {
+		if (cm_node->apbvt_set) {
+			irdma_del_apbvt(cm_node->iwdev, cm_node->apbvt_entry);
+			cm_node->apbvt_set = 0;
+		}
+		irdma_get_addr_info(cm_node, &nfo);
+		if (cm_node->qhash_set) {
+			nfo.qh_qpid = cm_node->iwdev->vsi.ilq->qp_id;
+			irdma_manage_qhash(cm_node->iwdev, &nfo,
+					   IRDMA_QHASH_TYPE_TCP_ESTABLISHED,
+					   IRDMA_QHASH_MANAGE_TYPE_DELETE, NULL,
+					   false);
+			cm_node->qhash_set = 0;
+		}
+	}
+
+	iwqp = cm_node->iwqp;
+	if (iwqp) {
+		cm_node->cm_id->rem_ref(cm_node->cm_id);
+		cm_node->cm_id = NULL;
+		iwqp->cm_id = NULL;
+		irdma_qp_rem_ref(&iwqp->ibqp);
+		cm_node->iwqp = NULL;
+	} else if (cm_node->qhash_set) {
+		irdma_get_addr_info(cm_node, &nfo);
+		nfo.qh_qpid = cm_node->iwdev->vsi.ilq->qp_id;
+		irdma_manage_qhash(cm_node->iwdev, &nfo,
+				   IRDMA_QHASH_TYPE_TCP_ESTABLISHED,
+				   IRDMA_QHASH_MANAGE_TYPE_DELETE, NULL, false);
+		cm_node->qhash_set = 0;
+	}
+
+	cm_core->cm_free_ah(cm_node);
+	kfree(cm_node);
+}
+
+/**
+ * irdma_rem_ref_cm_node - destroy an instance of a cm node
+ * @cm_node: connection's node
+ */
+void irdma_rem_ref_cm_node(struct irdma_cm_node *cm_node)
+{
+	struct irdma_cm_core *cm_core = cm_node->cm_core;
+	unsigned long flags;
+
+	trace_irdma_rem_ref_cm_node(cm_node, 0, __builtin_return_address(0));
+	spin_lock_irqsave(&cm_core->ht_lock, flags);
+
+	if (!refcount_dec_and_test(&cm_node->refcnt)) {
+		spin_unlock_irqrestore(&cm_core->ht_lock, flags);
+		return;
+	}
+	if (cm_node->iwqp) {
+		cm_node->iwqp->cm_node = NULL;
+		cm_node->iwqp->cm_id = NULL;
+	}
+	hash_del_rcu(&cm_node->list);
+	cm_node->cm_core->stats_nodes_destroyed++;
+
+	spin_unlock_irqrestore(&cm_core->ht_lock, flags);
+
+	/* wait for all list walkers to exit their grace period */
+	call_rcu(&cm_node->rcu_head, irdma_cm_node_free_cb);
+}
+
+/**
+ * irdma_handle_fin_pkt - FIN packet received
+ * @cm_node: connection's node
+ */
+static void irdma_handle_fin_pkt(struct irdma_cm_node *cm_node)
+{
+	switch (cm_node->state) {
+	case IRDMA_CM_STATE_SYN_RCVD:
+	case IRDMA_CM_STATE_SYN_SENT:
+	case IRDMA_CM_STATE_ESTABLISHED:
+	case IRDMA_CM_STATE_MPAREJ_RCVD:
+		cm_node->tcp_cntxt.rcv_nxt++;
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_LAST_ACK;
+		irdma_send_fin(cm_node);
+		break;
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+		irdma_create_event(cm_node, IRDMA_CM_EVENT_ABORTED);
+		cm_node->tcp_cntxt.rcv_nxt++;
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_CLOSED;
+		refcount_inc(&cm_node->refcnt);
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_FIN_WAIT1:
+		cm_node->tcp_cntxt.rcv_nxt++;
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_CLOSING;
+		irdma_send_ack(cm_node);
+		/*
+		 * Wait for ACK as this is simultaneous close.
+		 * After we receive ACK, do not send anything.
+		 * Just rm the node.
+		 */
+		break;
+	case IRDMA_CM_STATE_FIN_WAIT2:
+		cm_node->tcp_cntxt.rcv_nxt++;
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_TIME_WAIT;
+		irdma_send_ack(cm_node);
+		irdma_schedule_cm_timer(cm_node, NULL, IRDMA_TIMER_TYPE_CLOSE,
+					1, 0);
+		break;
+	case IRDMA_CM_STATE_TIME_WAIT:
+		cm_node->tcp_cntxt.rcv_nxt++;
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_CLOSED;
+		irdma_rem_ref_cm_node(cm_node);
+		break;
+	case IRDMA_CM_STATE_OFFLOADED:
+	default:
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: bad state node state = %d\n", cm_node->state);
+		break;
+	}
+}
+
+/**
+ * irdma_handle_rst_pkt - process received RST packet
+ * @cm_node: connection's node
+ * @rbuf: receive buffer
+ */
+static void irdma_handle_rst_pkt(struct irdma_cm_node *cm_node,
+				 struct irdma_puda_buf *rbuf)
+{
+	ibdev_dbg(&cm_node->iwdev->ibdev,
+		  "CM: caller: %pS cm_node=%p state=%d rem_port=0x%04x loc_port=0x%04x rem_addr=%pI4 loc_addr=%pI4\n",
+		  __builtin_return_address(0), cm_node, cm_node->state,
+		  cm_node->rem_port, cm_node->loc_port, cm_node->rem_addr,
+		  cm_node->loc_addr);
+
+	irdma_cleanup_retrans_entry(cm_node);
+	switch (cm_node->state) {
+	case IRDMA_CM_STATE_SYN_SENT:
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+		switch (cm_node->mpa_frame_rev) {
+		case IETF_MPA_V2:
+			/* Drop down to MPA_V1*/
+			cm_node->mpa_frame_rev = IETF_MPA_V1;
+			/* send a syn and goto syn sent state */
+			cm_node->state = IRDMA_CM_STATE_SYN_SENT;
+			if (irdma_send_syn(cm_node, 0))
+				irdma_active_open_err(cm_node, false);
+			break;
+		case IETF_MPA_V1:
+		default:
+			irdma_active_open_err(cm_node, false);
+			break;
+		}
+		break;
+	case IRDMA_CM_STATE_MPAREQ_RCVD:
+		atomic_inc(&cm_node->passive_state);
+		break;
+	case IRDMA_CM_STATE_ESTABLISHED:
+	case IRDMA_CM_STATE_SYN_RCVD:
+	case IRDMA_CM_STATE_LISTENING:
+		irdma_passive_open_err(cm_node, false);
+		break;
+	case IRDMA_CM_STATE_OFFLOADED:
+		irdma_active_open_err(cm_node, false);
+		break;
+	case IRDMA_CM_STATE_CLOSED:
+		break;
+	case IRDMA_CM_STATE_FIN_WAIT2:
+	case IRDMA_CM_STATE_FIN_WAIT1:
+	case IRDMA_CM_STATE_LAST_ACK:
+	case IRDMA_CM_STATE_TIME_WAIT:
+		cm_node->state = IRDMA_CM_STATE_CLOSED;
+		irdma_rem_ref_cm_node(cm_node);
+		break;
+	default:
+		break;
+	}
+}
+
+/**
+ * irdma_handle_rcv_mpa - Process a recv'd mpa buffer
+ * @cm_node: connection's node
+ * @rbuf: receive buffer
+ */
+static void irdma_handle_rcv_mpa(struct irdma_cm_node *cm_node,
+				 struct irdma_puda_buf *rbuf)
+{
+	int err;
+	int datasize = rbuf->datalen;
+	u8 *dataloc = rbuf->data;
+
+	enum irdma_cm_event_type type = IRDMA_CM_EVENT_UNKNOWN;
+	u32 res_type;
+
+	err = irdma_parse_mpa(cm_node, dataloc, &res_type, datasize);
+	if (err) {
+		if (cm_node->state == IRDMA_CM_STATE_MPAREQ_SENT)
+			irdma_active_open_err(cm_node, true);
+		else
+			irdma_passive_open_err(cm_node, true);
+		return;
+	}
+
+	switch (cm_node->state) {
+	case IRDMA_CM_STATE_ESTABLISHED:
+		if (res_type == IRDMA_MPA_REQUEST_REJECT)
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: state for reject\n");
+		cm_node->state = IRDMA_CM_STATE_MPAREQ_RCVD;
+		type = IRDMA_CM_EVENT_MPA_REQ;
+		irdma_send_ack(cm_node); /* ACK received MPA request */
+		atomic_set(&cm_node->passive_state,
+			   IRDMA_PASSIVE_STATE_INDICATED);
+		break;
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+		irdma_cleanup_retrans_entry(cm_node);
+		if (res_type == IRDMA_MPA_REQUEST_REJECT) {
+			type = IRDMA_CM_EVENT_MPA_REJECT;
+			cm_node->state = IRDMA_CM_STATE_MPAREJ_RCVD;
+		} else {
+			type = IRDMA_CM_EVENT_CONNECTED;
+			cm_node->state = IRDMA_CM_STATE_OFFLOADED;
+		}
+		irdma_send_ack(cm_node);
+		break;
+	default:
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: wrong cm_node state =%d\n", cm_node->state);
+		break;
+	}
+	irdma_create_event(cm_node, type);
+}
+
+/**
+ * irdma_check_syn - Check for error on received syn ack
+ * @cm_node: connection's node
+ * @tcph: pointer tcp header
+ */
+static int irdma_check_syn(struct irdma_cm_node *cm_node, struct tcphdr *tcph)
+{
+	if (ntohl(tcph->ack_seq) != cm_node->tcp_cntxt.loc_seq_num) {
+		irdma_active_open_err(cm_node, true);
+		return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_check_seq - check seq numbers if OK
+ * @cm_node: connection's node
+ * @tcph: pointer tcp header
+ */
+static int irdma_check_seq(struct irdma_cm_node *cm_node, struct tcphdr *tcph)
+{
+	u32 seq;
+	u32 ack_seq;
+	u32 loc_seq_num = cm_node->tcp_cntxt.loc_seq_num;
+	u32 rcv_nxt = cm_node->tcp_cntxt.rcv_nxt;
+	u32 rcv_wnd;
+	int err = 0;
+
+	seq = ntohl(tcph->seq);
+	ack_seq = ntohl(tcph->ack_seq);
+	rcv_wnd = cm_node->tcp_cntxt.rcv_wnd;
+	if (ack_seq != loc_seq_num ||
+	    !between(seq, rcv_nxt, (rcv_nxt + rcv_wnd)))
+		err = -1;
+	if (err)
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: seq number err\n");
+
+	return err;
+}
+
+void irdma_add_conn_est_qh(struct irdma_cm_node *cm_node)
+{
+	struct irdma_cm_info nfo;
+
+	irdma_get_addr_info(cm_node, &nfo);
+	nfo.qh_qpid = cm_node->iwdev->vsi.ilq->qp_id;
+	irdma_manage_qhash(cm_node->iwdev, &nfo,
+			   IRDMA_QHASH_TYPE_TCP_ESTABLISHED,
+			   IRDMA_QHASH_MANAGE_TYPE_ADD,
+			   cm_node, false);
+	cm_node->qhash_set = true;
+}
+
+/**
+ * irdma_handle_syn_pkt - is for Passive node
+ * @cm_node: connection's node
+ * @rbuf: receive buffer
+ */
+static void irdma_handle_syn_pkt(struct irdma_cm_node *cm_node,
+				 struct irdma_puda_buf *rbuf)
+{
+	struct tcphdr *tcph = (struct tcphdr *)rbuf->tcph;
+	int err;
+	u32 inc_sequence;
+	int optionsize;
+
+	optionsize = (tcph->doff << 2) - sizeof(struct tcphdr);
+	inc_sequence = ntohl(tcph->seq);
+
+	switch (cm_node->state) {
+	case IRDMA_CM_STATE_SYN_SENT:
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+		/* Rcvd syn on active open connection */
+		irdma_active_open_err(cm_node, 1);
+		break;
+	case IRDMA_CM_STATE_LISTENING:
+		/* Passive OPEN */
+		if (atomic_read(&cm_node->listener->pend_accepts_cnt) >
+		    cm_node->listener->backlog) {
+			cm_node->cm_core->stats_backlog_drops++;
+			irdma_passive_open_err(cm_node, false);
+			break;
+		}
+		err = irdma_handle_tcp_options(cm_node, tcph, optionsize, 1);
+		if (err) {
+			irdma_passive_open_err(cm_node, false);
+			/* drop pkt */
+			break;
+		}
+		err = cm_node->cm_core->cm_create_ah(cm_node, false);
+		if (err) {
+			irdma_passive_open_err(cm_node, false);
+			/* drop pkt */
+			break;
+		}
+		cm_node->tcp_cntxt.rcv_nxt = inc_sequence + 1;
+		cm_node->accept_pend = 1;
+		atomic_inc(&cm_node->listener->pend_accepts_cnt);
+
+		cm_node->state = IRDMA_CM_STATE_SYN_RCVD;
+		break;
+	case IRDMA_CM_STATE_CLOSED:
+		irdma_cleanup_retrans_entry(cm_node);
+		refcount_inc(&cm_node->refcnt);
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_OFFLOADED:
+	case IRDMA_CM_STATE_ESTABLISHED:
+	case IRDMA_CM_STATE_FIN_WAIT1:
+	case IRDMA_CM_STATE_FIN_WAIT2:
+	case IRDMA_CM_STATE_MPAREQ_RCVD:
+	case IRDMA_CM_STATE_LAST_ACK:
+	case IRDMA_CM_STATE_CLOSING:
+	case IRDMA_CM_STATE_UNKNOWN:
+	default:
+		break;
+	}
+}
+
+/**
+ * irdma_handle_synack_pkt - Process SYN+ACK packet (active side)
+ * @cm_node: connection's node
+ * @rbuf: receive buffer
+ */
+static void irdma_handle_synack_pkt(struct irdma_cm_node *cm_node,
+				    struct irdma_puda_buf *rbuf)
+{
+	struct tcphdr *tcph = (struct tcphdr *)rbuf->tcph;
+	int err;
+	u32 inc_sequence;
+	int optionsize;
+
+	optionsize = (tcph->doff << 2) - sizeof(struct tcphdr);
+	inc_sequence = ntohl(tcph->seq);
+	switch (cm_node->state) {
+	case IRDMA_CM_STATE_SYN_SENT:
+		irdma_cleanup_retrans_entry(cm_node);
+		/* active open */
+		if (irdma_check_syn(cm_node, tcph)) {
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: check syn fail\n");
+			return;
+		}
+		cm_node->tcp_cntxt.rem_ack_num = ntohl(tcph->ack_seq);
+		/* setup options */
+		err = irdma_handle_tcp_options(cm_node, tcph, optionsize, 0);
+		if (err) {
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: cm_node=%p tcp_options failed\n",
+				  cm_node);
+			break;
+		}
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->tcp_cntxt.rcv_nxt = inc_sequence + 1;
+		irdma_send_ack(cm_node); /* ACK  for the syn_ack */
+		err = irdma_send_mpa_request(cm_node);
+		if (err) {
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: cm_node=%p irdma_send_mpa_request failed\n",
+				  cm_node);
+			break;
+		}
+		cm_node->state = IRDMA_CM_STATE_MPAREQ_SENT;
+		break;
+	case IRDMA_CM_STATE_MPAREQ_RCVD:
+		irdma_passive_open_err(cm_node, true);
+		break;
+	case IRDMA_CM_STATE_LISTENING:
+		cm_node->tcp_cntxt.loc_seq_num = ntohl(tcph->ack_seq);
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_CLOSED;
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_CLOSED:
+		cm_node->tcp_cntxt.loc_seq_num = ntohl(tcph->ack_seq);
+		irdma_cleanup_retrans_entry(cm_node);
+		refcount_inc(&cm_node->refcnt);
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_ESTABLISHED:
+	case IRDMA_CM_STATE_FIN_WAIT1:
+	case IRDMA_CM_STATE_FIN_WAIT2:
+	case IRDMA_CM_STATE_LAST_ACK:
+	case IRDMA_CM_STATE_OFFLOADED:
+	case IRDMA_CM_STATE_CLOSING:
+	case IRDMA_CM_STATE_UNKNOWN:
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+	default:
+		break;
+	}
+}
+
+/**
+ * irdma_handle_ack_pkt - process packet with ACK
+ * @cm_node: connection's node
+ * @rbuf: receive buffer
+ */
+static int irdma_handle_ack_pkt(struct irdma_cm_node *cm_node,
+				struct irdma_puda_buf *rbuf)
+{
+	struct tcphdr *tcph = (struct tcphdr *)rbuf->tcph;
+	u32 inc_sequence;
+	int ret;
+	int optionsize;
+	u32 datasize = rbuf->datalen;
+
+	optionsize = (tcph->doff << 2) - sizeof(struct tcphdr);
+
+	if (irdma_check_seq(cm_node, tcph))
+		return -EINVAL;
+
+	inc_sequence = ntohl(tcph->seq);
+	switch (cm_node->state) {
+	case IRDMA_CM_STATE_SYN_RCVD:
+		irdma_cleanup_retrans_entry(cm_node);
+		ret = irdma_handle_tcp_options(cm_node, tcph, optionsize, 1);
+		if (ret)
+			return ret;
+		cm_node->tcp_cntxt.rem_ack_num = ntohl(tcph->ack_seq);
+		cm_node->state = IRDMA_CM_STATE_ESTABLISHED;
+		if (datasize) {
+			cm_node->tcp_cntxt.rcv_nxt = inc_sequence + datasize;
+			irdma_handle_rcv_mpa(cm_node, rbuf);
+		}
+		break;
+	case IRDMA_CM_STATE_ESTABLISHED:
+		irdma_cleanup_retrans_entry(cm_node);
+		if (datasize) {
+			cm_node->tcp_cntxt.rcv_nxt = inc_sequence + datasize;
+			irdma_handle_rcv_mpa(cm_node, rbuf);
+		}
+		break;
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+		cm_node->tcp_cntxt.rem_ack_num = ntohl(tcph->ack_seq);
+		if (datasize) {
+			cm_node->tcp_cntxt.rcv_nxt = inc_sequence + datasize;
+			cm_node->ack_rcvd = false;
+			irdma_handle_rcv_mpa(cm_node, rbuf);
+		} else {
+			cm_node->ack_rcvd = true;
+		}
+		break;
+	case IRDMA_CM_STATE_LISTENING:
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_CLOSED;
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_CLOSED:
+		irdma_cleanup_retrans_entry(cm_node);
+		refcount_inc(&cm_node->refcnt);
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_LAST_ACK:
+	case IRDMA_CM_STATE_CLOSING:
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_CLOSED;
+		irdma_rem_ref_cm_node(cm_node);
+		break;
+	case IRDMA_CM_STATE_FIN_WAIT1:
+		irdma_cleanup_retrans_entry(cm_node);
+		cm_node->state = IRDMA_CM_STATE_FIN_WAIT2;
+		break;
+	case IRDMA_CM_STATE_SYN_SENT:
+	case IRDMA_CM_STATE_FIN_WAIT2:
+	case IRDMA_CM_STATE_OFFLOADED:
+	case IRDMA_CM_STATE_MPAREQ_RCVD:
+	case IRDMA_CM_STATE_UNKNOWN:
+	default:
+		irdma_cleanup_retrans_entry(cm_node);
+		break;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_process_pkt - process cm packet
+ * @cm_node: connection's node
+ * @rbuf: receive buffer
+ */
+static void irdma_process_pkt(struct irdma_cm_node *cm_node,
+			      struct irdma_puda_buf *rbuf)
+{
+	enum irdma_tcpip_pkt_type pkt_type = IRDMA_PKT_TYPE_UNKNOWN;
+	struct tcphdr *tcph = (struct tcphdr *)rbuf->tcph;
+	u32 fin_set = 0;
+	int err;
+
+	if (tcph->rst) {
+		pkt_type = IRDMA_PKT_TYPE_RST;
+	} else if (tcph->syn) {
+		pkt_type = IRDMA_PKT_TYPE_SYN;
+		if (tcph->ack)
+			pkt_type = IRDMA_PKT_TYPE_SYNACK;
+	} else if (tcph->ack) {
+		pkt_type = IRDMA_PKT_TYPE_ACK;
+	}
+	if (tcph->fin)
+		fin_set = 1;
+
+	switch (pkt_type) {
+	case IRDMA_PKT_TYPE_SYN:
+		irdma_handle_syn_pkt(cm_node, rbuf);
+		break;
+	case IRDMA_PKT_TYPE_SYNACK:
+		irdma_handle_synack_pkt(cm_node, rbuf);
+		break;
+	case IRDMA_PKT_TYPE_ACK:
+		err = irdma_handle_ack_pkt(cm_node, rbuf);
+		if (fin_set && !err)
+			irdma_handle_fin_pkt(cm_node);
+		break;
+	case IRDMA_PKT_TYPE_RST:
+		irdma_handle_rst_pkt(cm_node, rbuf);
+		break;
+	default:
+		if (fin_set &&
+		    (!irdma_check_seq(cm_node, (struct tcphdr *)rbuf->tcph)))
+			irdma_handle_fin_pkt(cm_node);
+		break;
+	}
+}
+
+/**
+ * irdma_make_listen_node - create a listen node with params
+ * @cm_core: cm's core
+ * @iwdev: iwarp device structure
+ * @cm_info: quad info for connection
+ */
+static struct irdma_cm_listener *
+irdma_make_listen_node(struct irdma_cm_core *cm_core,
+		       struct irdma_device *iwdev,
+		       struct irdma_cm_info *cm_info)
+{
+	struct irdma_cm_listener *listener;
+	unsigned long flags;
+
+	/* cannot have multiple matching listeners */
+	listener = irdma_find_listener(cm_core, cm_info->loc_addr,
+				       cm_info->loc_port, cm_info->vlan_id,
+				       IRDMA_CM_LISTENER_EITHER_STATE);
+	if (listener &&
+	    listener->listener_state == IRDMA_CM_LISTENER_ACTIVE_STATE) {
+		refcount_dec(&listener->refcnt);
+		return NULL;
+	}
+
+	if (!listener) {
+		/* create a CM listen node
+		 * 1/2 node to compare incoming traffic to
+		 */
+		listener = kzalloc(sizeof(*listener), GFP_KERNEL);
+		if (!listener)
+			return NULL;
+		cm_core->stats_listen_nodes_created++;
+		memcpy(listener->loc_addr, cm_info->loc_addr,
+		       sizeof(listener->loc_addr));
+		listener->loc_port = cm_info->loc_port;
+
+		INIT_LIST_HEAD(&listener->child_listen_list);
+
+		refcount_set(&listener->refcnt, 1);
+	} else {
+		listener->reused_node = 1;
+	}
+
+	listener->cm_id = cm_info->cm_id;
+	listener->ipv4 = cm_info->ipv4;
+	listener->vlan_id = cm_info->vlan_id;
+	atomic_set(&listener->pend_accepts_cnt, 0);
+	listener->cm_core = cm_core;
+	listener->iwdev = iwdev;
+
+	listener->backlog = cm_info->backlog;
+	listener->listener_state = IRDMA_CM_LISTENER_ACTIVE_STATE;
+
+	if (!listener->reused_node) {
+		spin_lock_irqsave(&cm_core->listen_list_lock, flags);
+		list_add(&listener->list, &cm_core->listen_list);
+		spin_unlock_irqrestore(&cm_core->listen_list_lock, flags);
+	}
+
+	return listener;
+}
+
+/**
+ * irdma_create_cm_node - make a connection node with params
+ * @cm_core: cm's core
+ * @iwdev: iwarp device structure
+ * @conn_param: connection parameters
+ * @cm_info: quad info for connection
+ * @caller_cm_node: pointer to cm_node structure to return
+ */
+static int irdma_create_cm_node(struct irdma_cm_core *cm_core,
+				struct irdma_device *iwdev,
+				struct iw_cm_conn_param *conn_param,
+				struct irdma_cm_info *cm_info,
+				struct irdma_cm_node **caller_cm_node)
+{
+	struct irdma_cm_node *cm_node;
+	u16 private_data_len = conn_param->private_data_len;
+	const void *private_data = conn_param->private_data;
+
+	/* create a CM connection node */
+	cm_node = irdma_make_cm_node(cm_core, iwdev, cm_info, NULL);
+	if (!cm_node)
+		return -ENOMEM;
+
+	/* set our node side to client (active) side */
+	cm_node->tcp_cntxt.client = 1;
+	cm_node->tcp_cntxt.rcv_wscale = IRDMA_CM_DEFAULT_RCV_WND_SCALE;
+
+	irdma_record_ird_ord(cm_node, conn_param->ird, conn_param->ord);
+
+	cm_node->pdata.size = private_data_len;
+	cm_node->pdata.addr = cm_node->pdata_buf;
+
+	memcpy(cm_node->pdata_buf, private_data, private_data_len);
+	*caller_cm_node = cm_node;
+
+	return 0;
+}
+
+/**
+ * irdma_cm_reject - reject and teardown a connection
+ * @cm_node: connection's node
+ * @pdata: ptr to private data for reject
+ * @plen: size of private data
+ */
+static int irdma_cm_reject(struct irdma_cm_node *cm_node, const void *pdata,
+			   u8 plen)
+{
+	int ret;
+	int passive_state;
+
+	if (cm_node->tcp_cntxt.client)
+		return 0;
+
+	irdma_cleanup_retrans_entry(cm_node);
+
+	passive_state = atomic_add_return(1, &cm_node->passive_state);
+	if (passive_state == IRDMA_SEND_RESET_EVENT) {
+		cm_node->state = IRDMA_CM_STATE_CLOSED;
+		irdma_rem_ref_cm_node(cm_node);
+		return 0;
+	}
+
+	if (cm_node->state == IRDMA_CM_STATE_LISTENER_DESTROYED) {
+		irdma_rem_ref_cm_node(cm_node);
+		return 0;
+	}
+
+	ret = irdma_send_mpa_reject(cm_node, pdata, plen);
+	if (!ret)
+		return 0;
+
+	cm_node->state = IRDMA_CM_STATE_CLOSED;
+	if (irdma_send_reset(cm_node))
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: send reset failed\n");
+
+	return ret;
+}
+
+/**
+ * irdma_cm_close - close of cm connection
+ * @cm_node: connection's node
+ */
+static int irdma_cm_close(struct irdma_cm_node *cm_node)
+{
+	switch (cm_node->state) {
+	case IRDMA_CM_STATE_SYN_RCVD:
+	case IRDMA_CM_STATE_SYN_SENT:
+	case IRDMA_CM_STATE_ONE_SIDE_ESTABLISHED:
+	case IRDMA_CM_STATE_ESTABLISHED:
+	case IRDMA_CM_STATE_ACCEPTING:
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+	case IRDMA_CM_STATE_MPAREQ_RCVD:
+		irdma_cleanup_retrans_entry(cm_node);
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_CLOSE_WAIT:
+		cm_node->state = IRDMA_CM_STATE_LAST_ACK;
+		irdma_send_fin(cm_node);
+		break;
+	case IRDMA_CM_STATE_FIN_WAIT1:
+	case IRDMA_CM_STATE_FIN_WAIT2:
+	case IRDMA_CM_STATE_LAST_ACK:
+	case IRDMA_CM_STATE_TIME_WAIT:
+	case IRDMA_CM_STATE_CLOSING:
+		return -EINVAL;
+	case IRDMA_CM_STATE_LISTENING:
+		irdma_cleanup_retrans_entry(cm_node);
+		irdma_send_reset(cm_node);
+		break;
+	case IRDMA_CM_STATE_MPAREJ_RCVD:
+	case IRDMA_CM_STATE_UNKNOWN:
+	case IRDMA_CM_STATE_INITED:
+	case IRDMA_CM_STATE_CLOSED:
+	case IRDMA_CM_STATE_LISTENER_DESTROYED:
+		irdma_rem_ref_cm_node(cm_node);
+		break;
+	case IRDMA_CM_STATE_OFFLOADED:
+		if (cm_node->send_entry)
+			ibdev_dbg(&cm_node->iwdev->ibdev,
+				  "CM: CM send_entry in OFFLOADED state\n");
+		irdma_rem_ref_cm_node(cm_node);
+		break;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_receive_ilq - recv an ETHERNET packet, and process it
+ * through CM
+ * @vsi: VSI structure of dev
+ * @rbuf: receive buffer
+ */
+void irdma_receive_ilq(struct irdma_sc_vsi *vsi, struct irdma_puda_buf *rbuf)
+{
+	struct irdma_cm_node *cm_node;
+	struct irdma_cm_listener *listener;
+	struct iphdr *iph;
+	struct ipv6hdr *ip6h;
+	struct tcphdr *tcph;
+	struct irdma_cm_info cm_info = {};
+	struct irdma_device *iwdev = vsi->back_vsi;
+	struct irdma_cm_core *cm_core = &iwdev->cm_core;
+	struct vlan_ethhdr *ethh;
+	u16 vtag;
+
+	/* if vlan, then maclen = 18 else 14 */
+	iph = (struct iphdr *)rbuf->iph;
+	print_hex_dump_debug("ILQ: RECEIVE ILQ BUFFER", DUMP_PREFIX_OFFSET,
+			     16, 8, rbuf->mem.va, rbuf->totallen, false);
+	if (iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		if (rbuf->vlan_valid) {
+			vtag = rbuf->vlan_id;
+			cm_info.user_pri = (vtag & VLAN_PRIO_MASK) >>
+					   VLAN_PRIO_SHIFT;
+			cm_info.vlan_id = vtag & VLAN_VID_MASK;
+		} else {
+			cm_info.vlan_id = 0xFFFF;
+		}
+	} else {
+		ethh = rbuf->mem.va;
+
+		if (ethh->h_vlan_proto == htons(ETH_P_8021Q)) {
+			vtag = ntohs(ethh->h_vlan_TCI);
+			cm_info.user_pri = (vtag & VLAN_PRIO_MASK) >>
+					   VLAN_PRIO_SHIFT;
+			cm_info.vlan_id = vtag & VLAN_VID_MASK;
+			ibdev_dbg(&cm_core->iwdev->ibdev,
+				  "CM: vlan_id=%d\n", cm_info.vlan_id);
+		} else {
+			cm_info.vlan_id = 0xFFFF;
+		}
+	}
+	tcph = (struct tcphdr *)rbuf->tcph;
+
+	if (rbuf->ipv4) {
+		cm_info.loc_addr[0] = ntohl(iph->daddr);
+		cm_info.rem_addr[0] = ntohl(iph->saddr);
+		cm_info.ipv4 = true;
+		cm_info.tos = iph->tos;
+	} else {
+		ip6h = (struct ipv6hdr *)rbuf->iph;
+		irdma_copy_ip_ntohl(cm_info.loc_addr,
+				    ip6h->daddr.in6_u.u6_addr32);
+		irdma_copy_ip_ntohl(cm_info.rem_addr,
+				    ip6h->saddr.in6_u.u6_addr32);
+		cm_info.ipv4 = false;
+		cm_info.tos = (ip6h->priority << 4) | (ip6h->flow_lbl[0] >> 4);
+	}
+	cm_info.loc_port = ntohs(tcph->dest);
+	cm_info.rem_port = ntohs(tcph->source);
+	cm_node = irdma_find_node(cm_core, cm_info.rem_port, cm_info.rem_addr,
+				  cm_info.loc_port, cm_info.loc_addr, cm_info.vlan_id);
+
+	if (!cm_node) {
+		/* Only type of packet accepted are for the
+		 * PASSIVE open (syn only)
+		 */
+		if (!tcph->syn || tcph->ack)
+			return;
+
+		listener = irdma_find_listener(cm_core,
+					       cm_info.loc_addr,
+					       cm_info.loc_port,
+					       cm_info.vlan_id,
+					       IRDMA_CM_LISTENER_ACTIVE_STATE);
+		if (!listener) {
+			cm_info.cm_id = NULL;
+			ibdev_dbg(&cm_core->iwdev->ibdev,
+				  "CM: no listener found\n");
+			return;
+		}
+
+		cm_info.cm_id = listener->cm_id;
+		cm_node = irdma_make_cm_node(cm_core, iwdev, &cm_info,
+					     listener);
+		if (!cm_node) {
+			ibdev_dbg(&cm_core->iwdev->ibdev,
+				  "CM: allocate node failed\n");
+			refcount_dec(&listener->refcnt);
+			return;
+		}
+
+		if (!tcph->rst && !tcph->fin) {
+			cm_node->state = IRDMA_CM_STATE_LISTENING;
+		} else {
+			irdma_rem_ref_cm_node(cm_node);
+			return;
+		}
+
+		refcount_inc(&cm_node->refcnt);
+	} else if (cm_node->state == IRDMA_CM_STATE_OFFLOADED) {
+		irdma_rem_ref_cm_node(cm_node);
+		return;
+	}
+
+	irdma_process_pkt(cm_node, rbuf);
+	irdma_rem_ref_cm_node(cm_node);
+}
+
+static int irdma_add_qh(struct irdma_cm_node *cm_node, bool active)
+{
+	if (!active)
+		irdma_add_conn_est_qh(cm_node);
+	return 0;
+}
+
+static void irdma_cm_free_ah_nop(struct irdma_cm_node *cm_node)
+{
+}
+
+/**
+ * irdma_setup_cm_core - setup top level instance of a cm core
+ * @iwdev: iwarp device structure
+ * @rdma_ver: HW version
+ */
+enum irdma_status_code irdma_setup_cm_core(struct irdma_device *iwdev,
+					   u8 rdma_ver)
+{
+	struct irdma_cm_core *cm_core = &iwdev->cm_core;
+
+	cm_core->iwdev = iwdev;
+	cm_core->dev = &iwdev->rf->sc_dev;
+
+	/* Handles CM event work items send to Iwarp core */
+	cm_core->event_wq = alloc_ordered_workqueue("iwarp-event-wq", 0);
+	if (!cm_core->event_wq)
+		return IRDMA_ERR_NO_MEMORY;
+
+	INIT_LIST_HEAD(&cm_core->listen_list);
+
+	timer_setup(&cm_core->tcp_timer, irdma_cm_timer_tick, 0);
+
+	spin_lock_init(&cm_core->ht_lock);
+	spin_lock_init(&cm_core->listen_list_lock);
+	spin_lock_init(&cm_core->apbvt_lock);
+	switch (rdma_ver) {
+	case IRDMA_GEN_1:
+		cm_core->form_cm_frame = irdma_form_uda_cm_frame;
+		cm_core->cm_create_ah = irdma_add_qh;
+		cm_core->cm_free_ah = irdma_cm_free_ah_nop;
+		break;
+	case IRDMA_GEN_2:
+	default:
+		cm_core->form_cm_frame = irdma_form_ah_cm_frame;
+		cm_core->cm_create_ah = irdma_cm_create_ah;
+		cm_core->cm_free_ah = irdma_cm_free_ah;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_cleanup_cm_core - deallocate a top level instance of a
+ * cm core
+ * @cm_core: cm's core
+ */
+void irdma_cleanup_cm_core(struct irdma_cm_core *cm_core)
+{
+	unsigned long flags;
+
+	if (!cm_core)
+		return;
+
+	spin_lock_irqsave(&cm_core->ht_lock, flags);
+	if (timer_pending(&cm_core->tcp_timer))
+		del_timer_sync(&cm_core->tcp_timer);
+	spin_unlock_irqrestore(&cm_core->ht_lock, flags);
+
+	destroy_workqueue(cm_core->event_wq);
+	cm_core->dev->ws_reset(&cm_core->iwdev->vsi);
+}
+
+/**
+ * irdma_init_tcp_ctx - setup qp context
+ * @cm_node: connection's node
+ * @tcp_info: offload info for tcp
+ * @iwqp: associate qp for the connection
+ */
+static void irdma_init_tcp_ctx(struct irdma_cm_node *cm_node,
+			       struct irdma_tcp_offload_info *tcp_info,
+			       struct irdma_qp *iwqp)
+{
+	tcp_info->ipv4 = cm_node->ipv4;
+	tcp_info->drop_ooo_seg = !iwqp->iwdev->iw_ooo;
+	tcp_info->wscale = true;
+	tcp_info->ignore_tcp_opt = true;
+	tcp_info->ignore_tcp_uns_opt = true;
+	tcp_info->no_nagle = false;
+
+	tcp_info->ttl = IRDMA_DEFAULT_TTL;
+	tcp_info->rtt_var = IRDMA_DEFAULT_RTT_VAR;
+	tcp_info->ss_thresh = IRDMA_DEFAULT_SS_THRESH;
+	tcp_info->rexmit_thresh = IRDMA_DEFAULT_REXMIT_THRESH;
+
+	tcp_info->tcp_state = IRDMA_TCP_STATE_ESTABLISHED;
+	tcp_info->snd_wscale = cm_node->tcp_cntxt.snd_wscale;
+	tcp_info->rcv_wscale = cm_node->tcp_cntxt.rcv_wscale;
+
+	tcp_info->snd_nxt = cm_node->tcp_cntxt.loc_seq_num;
+	tcp_info->snd_wnd = cm_node->tcp_cntxt.snd_wnd;
+	tcp_info->rcv_nxt = cm_node->tcp_cntxt.rcv_nxt;
+	tcp_info->snd_max = cm_node->tcp_cntxt.loc_seq_num;
+
+	tcp_info->snd_una = cm_node->tcp_cntxt.loc_seq_num;
+	tcp_info->cwnd = 2 * cm_node->tcp_cntxt.mss;
+	tcp_info->snd_wl1 = cm_node->tcp_cntxt.rcv_nxt;
+	tcp_info->snd_wl2 = cm_node->tcp_cntxt.loc_seq_num;
+	tcp_info->max_snd_window = cm_node->tcp_cntxt.max_snd_wnd;
+	tcp_info->rcv_wnd = cm_node->tcp_cntxt.rcv_wnd
+			    << cm_node->tcp_cntxt.rcv_wscale;
+
+	tcp_info->flow_label = 0;
+	tcp_info->snd_mss = (u32)cm_node->tcp_cntxt.mss;
+	tcp_info->tos = cm_node->tos;
+	if (cm_node->vlan_id < VLAN_N_VID) {
+		tcp_info->insert_vlan_tag = true;
+		tcp_info->vlan_tag = cm_node->vlan_id;
+		tcp_info->vlan_tag |= cm_node->user_pri << VLAN_PRIO_SHIFT;
+	}
+	if (cm_node->ipv4) {
+		tcp_info->src_port = cm_node->loc_port;
+		tcp_info->dst_port = cm_node->rem_port;
+
+		tcp_info->dest_ip_addr[3] = cm_node->rem_addr[0];
+		tcp_info->local_ipaddr[3] = cm_node->loc_addr[0];
+		tcp_info->arp_idx = (u16)irdma_arp_table(iwqp->iwdev->rf,
+							 &tcp_info->dest_ip_addr[3],
+							 true, NULL,
+							 IRDMA_ARP_RESOLVE);
+	} else {
+		tcp_info->src_port = cm_node->loc_port;
+		tcp_info->dst_port = cm_node->rem_port;
+		memcpy(tcp_info->dest_ip_addr, cm_node->rem_addr,
+		       sizeof(tcp_info->dest_ip_addr));
+		memcpy(tcp_info->local_ipaddr, cm_node->loc_addr,
+		       sizeof(tcp_info->local_ipaddr));
+
+		tcp_info->arp_idx = (u16)irdma_arp_table(iwqp->iwdev->rf,
+							 &tcp_info->dest_ip_addr[0],
+							 false, NULL,
+							 IRDMA_ARP_RESOLVE);
+	}
+}
+
+/**
+ * irdma_cm_init_tsa_conn - setup qp for RTS
+ * @iwqp: associate qp for the connection
+ * @cm_node: connection's node
+ */
+static void irdma_cm_init_tsa_conn(struct irdma_qp *iwqp,
+				   struct irdma_cm_node *cm_node)
+{
+	struct irdma_iwarp_offload_info *iwarp_info;
+	struct irdma_qp_host_ctx_info *ctx_info;
+	struct irdma_sc_dev *dev = &iwqp->iwdev->rf->sc_dev;
+
+	iwarp_info = &iwqp->iwarp_info;
+	ctx_info = &iwqp->ctx_info;
+
+	ctx_info->tcp_info = &iwqp->tcp_info;
+	ctx_info->send_cq_num = iwqp->iwscq->sc_cq.cq_uk.cq_id;
+	ctx_info->rcv_cq_num = iwqp->iwrcq->sc_cq.cq_uk.cq_id;
+
+	iwarp_info->ord_size = cm_node->ord_size;
+	iwarp_info->ird_size = cm_node->ird_size;
+	iwarp_info->rd_en = true;
+	iwarp_info->rdmap_ver = 1;
+	iwarp_info->ddp_ver = 1;
+	iwarp_info->pd_id = iwqp->iwpd->sc_pd.pd_id;
+
+	ctx_info->tcp_info_valid = true;
+	ctx_info->iwarp_info_valid = true;
+	ctx_info->user_pri = cm_node->user_pri;
+
+	irdma_init_tcp_ctx(cm_node, &iwqp->tcp_info, iwqp);
+	if (cm_node->snd_mark_en) {
+		iwarp_info->snd_mark_en = true;
+		iwarp_info->snd_mark_offset = (iwqp->tcp_info.snd_nxt & SNDMARKER_SEQNMASK) +
+					       cm_node->lsmm_size;
+	}
+
+	cm_node->state = IRDMA_CM_STATE_OFFLOADED;
+	iwqp->tcp_info.tcp_state = IRDMA_TCP_STATE_ESTABLISHED;
+	iwqp->tcp_info.src_mac_addr_idx = iwqp->iwdev->mac_ip_table_idx;
+
+	if (cm_node->rcv_mark_en) {
+		iwarp_info->rcv_mark_en = true;
+		iwarp_info->align_hdrs = true;
+	}
+
+	dev->iw_priv_qp_ops->qp_setctx(&iwqp->sc_qp, iwqp->host_ctx.va,
+				       ctx_info);
+
+	/* once tcp_info is set, no need to do it again */
+	ctx_info->tcp_info_valid = false;
+	ctx_info->iwarp_info_valid = false;
+}
+
+/**
+ * irdma_cm_disconn - when a connection is being closed
+ * @iwqp: associated qp for the connection
+ */
+void irdma_cm_disconn(struct irdma_qp *iwqp)
+{
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct disconn_work *work;
+	unsigned long flags;
+
+	work = kzalloc(sizeof(*work), GFP_ATOMIC);
+	if (!work)
+		return;
+
+	spin_lock_irqsave(&iwdev->rf->qptable_lock, flags);
+	if (!iwdev->rf->qp_table[iwqp->ibqp.qp_num]) {
+		spin_unlock_irqrestore(&iwdev->rf->qptable_lock, flags);
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: qp_id %d is already freed\n",
+			  iwqp->ibqp.qp_num);
+		kfree(work);
+		return;
+	}
+	irdma_qp_add_ref(&iwqp->ibqp);
+	spin_unlock_irqrestore(&iwdev->rf->qptable_lock, flags);
+
+	work->iwqp = iwqp;
+	INIT_WORK(&work->work, irdma_disconnect_worker);
+	queue_work(iwdev->cleanup_wq, &work->work);
+}
+
+/**
+ * irdma_qp_disconnect - free qp and close cm
+ * @iwqp: associate qp for the connection
+ */
+static void irdma_qp_disconnect(struct irdma_qp *iwqp)
+{
+	struct irdma_device *iwdev = iwqp->iwdev;
+
+	iwqp->active_conn = 0;
+	/* close the CM node down if it is still active */
+	ibdev_dbg(&iwdev->ibdev, "CM: Call close API\n");
+	irdma_cm_close(iwqp->cm_node);
+}
+
+/**
+ * irdma_cm_disconn_true - called by worker thread to disconnect qp
+ * @iwqp: associate qp for the connection
+ */
+static void irdma_cm_disconn_true(struct irdma_qp *iwqp)
+{
+	struct iw_cm_id *cm_id;
+	struct irdma_device *iwdev;
+	struct irdma_sc_qp *qp = &iwqp->sc_qp;
+	u16 last_ae;
+	u8 original_hw_tcp_state;
+	u8 original_ibqp_state;
+	int disconn_status = 0;
+	int issue_disconn = 0;
+	int issue_close = 0;
+	int issue_flush = 0;
+	unsigned long flags;
+	int err;
+
+	iwdev = iwqp->iwdev;
+	spin_lock_irqsave(&iwqp->lock, flags);
+	if (rdma_protocol_roce(&iwdev->ibdev, 1)) {
+		struct ib_qp_attr attr;
+
+		if (iwqp->flush_issued || iwqp->sc_qp.qp_uk.destroy_pending) {
+			spin_unlock_irqrestore(&iwqp->lock, flags);
+			return;
+		}
+
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+
+		attr.qp_state = IB_QPS_ERR;
+		irdma_modify_qp_roce(&iwqp->ibqp, &attr, IB_QP_STATE, NULL);
+		irdma_ib_qp_event(iwqp, qp->event_type);
+		return;
+	}
+
+	cm_id = iwqp->cm_id;
+	/* make sure we havent already closed this connection */
+	if (!cm_id) {
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+		return;
+	}
+
+	original_hw_tcp_state = iwqp->hw_tcp_state;
+	original_ibqp_state = iwqp->ibqp_state;
+	last_ae = iwqp->last_aeq;
+
+	if (qp->term_flags) {
+		issue_disconn = 1;
+		issue_close = 1;
+		iwqp->cm_id = NULL;
+		irdma_terminate_del_timer(qp);
+		if (!iwqp->flush_issued) {
+			iwqp->flush_issued = 1;
+			issue_flush = 1;
+		}
+	} else if ((original_hw_tcp_state == IRDMA_TCP_STATE_CLOSE_WAIT) ||
+		   ((original_ibqp_state == IB_QPS_RTS) &&
+		    (last_ae == IRDMA_AE_LLP_CONNECTION_RESET))) {
+		issue_disconn = 1;
+		if (last_ae == IRDMA_AE_LLP_CONNECTION_RESET)
+			disconn_status = -ECONNRESET;
+	}
+
+	if ((original_hw_tcp_state == IRDMA_TCP_STATE_CLOSED ||
+	     original_hw_tcp_state == IRDMA_TCP_STATE_TIME_WAIT ||
+	     last_ae == IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE ||
+	     last_ae == IRDMA_AE_BAD_CLOSE ||
+	     last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->reset)) {
+		issue_close = 1;
+		iwqp->cm_id = NULL;
+		qp->term_flags = 0;
+		if (!iwqp->flush_issued) {
+			iwqp->flush_issued = 1;
+			issue_flush = 1;
+		}
+	}
+
+	spin_unlock_irqrestore(&iwqp->lock, flags);
+	if (issue_flush && !iwqp->sc_qp.qp_uk.destroy_pending) {
+		irdma_flush_wqes(iwqp, IRDMA_FLUSH_SQ | IRDMA_FLUSH_RQ |
+				 IRDMA_FLUSH_WAIT);
+
+		if (qp->term_flags)
+			irdma_ib_qp_event(iwqp, qp->event_type);
+	}
+
+	if (!cm_id || !cm_id->event_handler)
+		return;
+
+	spin_lock_irqsave(&iwdev->cm_core.ht_lock, flags);
+	if (!iwqp->cm_node) {
+		spin_unlock_irqrestore(&iwdev->cm_core.ht_lock, flags);
+		return;
+	}
+	refcount_inc(&iwqp->cm_node->refcnt);
+
+	spin_unlock_irqrestore(&iwdev->cm_core.ht_lock, flags);
+
+	if (issue_disconn) {
+		err = irdma_send_cm_event(iwqp->cm_node, cm_id,
+					  IW_CM_EVENT_DISCONNECT,
+					  disconn_status);
+		if (err)
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: disconnect event failed: - cm_id = %p\n",
+				  cm_id);
+	}
+	if (issue_close) {
+		cm_id->provider_data = iwqp;
+		err = irdma_send_cm_event(iwqp->cm_node, cm_id,
+					  IW_CM_EVENT_CLOSE, 0);
+		if (err)
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: close event failed: - cm_id = %p\n",
+				  cm_id);
+		irdma_qp_disconnect(iwqp);
+	}
+	irdma_rem_ref_cm_node(iwqp->cm_node);
+}
+
+/**
+ * irdma_disconnect_worker - worker for connection close
+ * @work: points or disconn structure
+ */
+static void irdma_disconnect_worker(struct work_struct *work)
+{
+	struct disconn_work *dwork = container_of(work, struct disconn_work, work);
+	struct irdma_qp *iwqp = dwork->iwqp;
+
+	kfree(dwork);
+	irdma_cm_disconn_true(iwqp);
+	irdma_qp_rem_ref(&iwqp->ibqp);
+}
+
+/**
+ * irdma_free_lsmm_rsrc - free lsmm memory and deregister
+ * @iwqp: associate qp for the connection
+ */
+void irdma_free_lsmm_rsrc(struct irdma_qp *iwqp)
+{
+	struct irdma_device *iwdev;
+
+	iwdev = iwqp->iwdev;
+
+	if (iwqp->ietf_mem.va) {
+		if (iwqp->lsmm_mr)
+			iwdev->ibdev.ops.dereg_mr(iwqp->lsmm_mr, NULL);
+		dma_free_coherent(iwdev->rf->sc_dev.hw->device,
+				  iwqp->ietf_mem.size, iwqp->ietf_mem.va,
+				  iwqp->ietf_mem.pa);
+		iwqp->ietf_mem.va = NULL;
+		iwqp->ietf_mem.va = NULL;
+	}
+}
+
+/**
+ * irdma_accept - registered call for connection to be accepted
+ * @cm_id: cm information for passive connection
+ * @conn_param: accpet parameters
+ */
+int irdma_accept(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
+{
+	struct ib_qp *ibqp;
+	struct irdma_qp *iwqp;
+	struct irdma_device *iwdev;
+	struct irdma_sc_dev *dev;
+	struct irdma_cm_node *cm_node;
+	struct ib_qp_attr attr = {};
+	int passive_state;
+	struct ib_mr *ibmr;
+	struct irdma_pd *iwpd;
+	u16 buf_len = 0;
+	struct irdma_kmem_info accept;
+	u64 tagged_offset;
+	int wait_ret;
+	int ret = 0;
+
+	ibqp = irdma_get_qp(cm_id->device, conn_param->qpn);
+	if (!ibqp)
+		return -EINVAL;
+
+	iwqp = to_iwqp(ibqp);
+	iwdev = iwqp->iwdev;
+	dev = &iwdev->rf->sc_dev;
+	cm_node = cm_id->provider_data;
+
+	if (((struct sockaddr_in *)&cm_id->local_addr)->sin_family == AF_INET) {
+		cm_node->ipv4 = true;
+		cm_node->vlan_id = irdma_get_vlan_ipv4(cm_node->loc_addr);
+	} else {
+		cm_node->ipv4 = false;
+		irdma_netdev_vlan_ipv6(cm_node->loc_addr, &cm_node->vlan_id,
+				       NULL);
+	}
+	ibdev_dbg(&iwdev->ibdev, "CM: Accept vlan_id=%d\n",
+		  cm_node->vlan_id);
+
+	trace_irdma_accept(cm_node, 0, NULL);
+
+	if (cm_node->state == IRDMA_CM_STATE_LISTENER_DESTROYED) {
+		ret = -EINVAL;
+		goto error;
+	}
+
+	passive_state = atomic_add_return(1, &cm_node->passive_state);
+	if (passive_state == IRDMA_SEND_RESET_EVENT) {
+		ret = -ECONNRESET;
+		goto error;
+	}
+
+	buf_len = conn_param->private_data_len + IRDMA_MAX_IETF_SIZE;
+	iwqp->ietf_mem.size = ALIGN(buf_len, 1);
+	iwqp->ietf_mem.va = dma_alloc_coherent(dev->hw->device,
+					       iwqp->ietf_mem.size,
+					       &iwqp->ietf_mem.pa, GFP_KERNEL);
+	if (!iwqp->ietf_mem.va) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
+	cm_node->pdata.size = conn_param->private_data_len;
+	accept.addr = iwqp->ietf_mem.va;
+	accept.size = irdma_cm_build_mpa_frame(cm_node, &accept, MPA_KEY_REPLY);
+	memcpy((u8 *)accept.addr + accept.size, conn_param->private_data,
+	       conn_param->private_data_len);
+
+	if (cm_node->dev->ws_add(iwqp->sc_qp.vsi, cm_node->user_pri)) {
+		ret = -ENOMEM;
+		goto error;
+	}
+	iwqp->sc_qp.user_pri = cm_node->user_pri;
+	irdma_qp_add_qos(&iwqp->sc_qp);
+	/* setup our first outgoing iWarp send WQE (the IETF frame response) */
+	iwpd = iwqp->iwpd;
+	tagged_offset = (uintptr_t)iwqp->ietf_mem.va;
+	ibmr = irdma_reg_phys_mr(&iwpd->ibpd, iwqp->ietf_mem.pa, buf_len,
+				 IB_ACCESS_LOCAL_WRITE, &tagged_offset);
+	if (IS_ERR(ibmr)) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
+	ibmr->pd = &iwpd->ibpd;
+	ibmr->device = iwpd->ibpd.device;
+	iwqp->lsmm_mr = ibmr;
+	if (iwqp->page)
+		iwqp->sc_qp.qp_uk.sq_base = kmap(iwqp->page);
+
+	cm_node->lsmm_size = accept.size + conn_param->private_data_len;
+	dev->iw_priv_qp_ops->qp_send_lsmm(&iwqp->sc_qp,
+					  iwqp->ietf_mem.va,
+					  cm_node->lsmm_size,
+					  ibmr->lkey);
+
+	if (iwqp->page)
+		kunmap(iwqp->page);
+
+	iwqp->cm_id = cm_id;
+	cm_node->cm_id = cm_id;
+
+	cm_id->provider_data = iwqp;
+	iwqp->active_conn = 0;
+	iwqp->cm_node = cm_node;
+	cm_node->iwqp = iwqp;
+	irdma_cm_init_tsa_conn(iwqp, cm_node);
+	irdma_qp_add_ref(&iwqp->ibqp);
+	cm_id->add_ref(cm_id);
+
+	attr.qp_state = IB_QPS_RTS;
+	cm_node->qhash_set = false;
+	cm_node->cm_core->cm_free_ah(cm_node);
+
+	irdma_modify_qp(&iwqp->ibqp, &attr, IB_QP_STATE, NULL);
+	if (dev->hw_attrs.uk_attrs.feature_flags & IRDMA_FEATURE_RTS_AE) {
+		wait_ret = wait_event_interruptible_timeout(iwqp->waitq,
+							    iwqp->rts_ae_rcvd,
+							    IRDMA_MAX_TIMEOUT);
+		if (!wait_ret) {
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: Slow Connection: cm_node=%p, loc_port=%d, rem_port=%d, cm_id=%p\n",
+				  cm_node, cm_node->loc_port,
+				  cm_node->rem_port, cm_node->cm_id);
+			ret = -ECONNRESET;
+			goto error;
+		}
+	}
+
+	irdma_send_cm_event(cm_node, cm_id, IW_CM_EVENT_ESTABLISHED, 0);
+	cm_node->accelerated = true;
+	complete(&cm_node->establish_comp);
+
+	if (cm_node->accept_pend) {
+		atomic_dec(&cm_node->listener->pend_accepts_cnt);
+		cm_node->accept_pend = 0;
+	}
+
+	ibdev_dbg(&iwdev->ibdev,
+		  "CM: rem_port=0x%04x, loc_port=0x%04x rem_addr=%pI4 loc_addr=%pI4 cm_node=%p cm_id=%p qp_id = %d\n\n",
+		  cm_node->rem_port, cm_node->loc_port, cm_node->rem_addr,
+		  cm_node->loc_addr, cm_node, cm_id, ibqp->qp_num);
+	cm_node->cm_core->stats_accepts++;
+
+	return 0;
+error:
+	irdma_free_lsmm_rsrc(iwqp);
+	irdma_rem_ref_cm_node(cm_node);
+
+	return ret;
+}
+
+/**
+ * irdma_reject - registered call for connection to be rejected
+ * @cm_id: cm information for passive connection
+ * @pdata: private data to be sent
+ * @pdata_len: private data length
+ */
+int irdma_reject(struct iw_cm_id *cm_id, const void *pdata, u8 pdata_len)
+{
+	struct irdma_device *iwdev;
+	struct irdma_cm_node *cm_node;
+
+	cm_node = cm_id->provider_data;
+	cm_node->pdata.size = pdata_len;
+
+	trace_irdma_reject(cm_node, 0, NULL);
+
+	iwdev = to_iwdev(cm_id->device);
+	if (!iwdev)
+		return -EINVAL;
+
+	cm_node->cm_core->stats_rejects++;
+
+	if (pdata_len + sizeof(struct ietf_mpa_v2) > IRDMA_MAX_CM_BUF)
+		return -EINVAL;
+
+	return irdma_cm_reject(cm_node, pdata, pdata_len);
+}
+
+/**
+ * irdma_connect - registered call for connection to be established
+ * @cm_id: cm information for passive connection
+ * @conn_param: Information about the connection
+ */
+int irdma_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
+{
+	struct ib_qp *ibqp;
+	struct irdma_qp *iwqp;
+	struct irdma_device *iwdev;
+	struct irdma_cm_node *cm_node;
+	struct irdma_cm_info cm_info;
+	struct sockaddr_in *laddr;
+	struct sockaddr_in *raddr;
+	struct sockaddr_in6 *laddr6;
+	struct sockaddr_in6 *raddr6;
+	int ret = 0;
+
+	ibqp = irdma_get_qp(cm_id->device, conn_param->qpn);
+	if (!ibqp)
+		return -EINVAL;
+	iwqp = to_iwqp(ibqp);
+	if (!iwqp)
+		return -EINVAL;
+	iwdev = iwqp->iwdev;
+	if (!iwdev)
+		return -EINVAL;
+
+	laddr = (struct sockaddr_in *)&cm_id->m_local_addr;
+	raddr = (struct sockaddr_in *)&cm_id->m_remote_addr;
+	laddr6 = (struct sockaddr_in6 *)&cm_id->m_local_addr;
+	raddr6 = (struct sockaddr_in6 *)&cm_id->m_remote_addr;
+
+	if (!(laddr->sin_port) || !(raddr->sin_port))
+		return -EINVAL;
+
+	iwqp->active_conn = 1;
+	iwqp->cm_id = NULL;
+	cm_id->provider_data = iwqp;
+
+	/* set up the connection params for the node */
+	if (cm_id->remote_addr.ss_family == AF_INET) {
+		if (iwdev->vsi.mtu < IRDMA_MIN_MTU_IPV4)
+			return -EINVAL;
+
+		cm_info.ipv4 = true;
+		memset(cm_info.loc_addr, 0, sizeof(cm_info.loc_addr));
+		memset(cm_info.rem_addr, 0, sizeof(cm_info.rem_addr));
+		cm_info.loc_addr[0] = ntohl(laddr->sin_addr.s_addr);
+		cm_info.rem_addr[0] = ntohl(raddr->sin_addr.s_addr);
+		cm_info.loc_port = ntohs(laddr->sin_port);
+		cm_info.rem_port = ntohs(raddr->sin_port);
+		cm_info.vlan_id = irdma_get_vlan_ipv4(cm_info.loc_addr);
+	} else {
+		if (iwdev->vsi.mtu < IRDMA_MIN_MTU_IPV6)
+			return -EINVAL;
+
+		cm_info.ipv4 = false;
+		irdma_copy_ip_ntohl(cm_info.loc_addr,
+				    laddr6->sin6_addr.in6_u.u6_addr32);
+		irdma_copy_ip_ntohl(cm_info.rem_addr,
+				    raddr6->sin6_addr.in6_u.u6_addr32);
+		cm_info.loc_port = ntohs(laddr6->sin6_port);
+		cm_info.rem_port = ntohs(raddr6->sin6_port);
+		irdma_netdev_vlan_ipv6(cm_info.loc_addr, &cm_info.vlan_id,
+				       NULL);
+	}
+	cm_info.cm_id = cm_id;
+	cm_info.qh_qpid = iwdev->vsi.ilq->qp_id;
+	cm_info.tos = cm_id->tos;
+	cm_info.user_pri = rt_tos2priority(cm_id->tos);
+
+	if (iwqp->sc_qp.dev->ws_add(iwqp->sc_qp.vsi, cm_info.user_pri))
+		return -ENOMEM;
+	iwqp->sc_qp.user_pri = cm_info.user_pri;
+	irdma_qp_add_qos(&iwqp->sc_qp);
+	ibdev_dbg(&iwdev->ibdev, "DCB: TOS:[%d] UP:[%d]\n", cm_id->tos,
+		  cm_info.user_pri);
+
+	trace_irdma_dcb_tos(iwdev, cm_id->tos, cm_info.user_pri);
+
+	ret = irdma_create_cm_node(&iwdev->cm_core, iwdev, conn_param, &cm_info,
+				   &cm_node);
+	if (ret)
+		return ret;
+	ret = cm_node->cm_core->cm_create_ah(cm_node, true);
+	if (ret)
+		goto err;
+	if (irdma_manage_qhash(iwdev, &cm_info,
+			       IRDMA_QHASH_TYPE_TCP_ESTABLISHED,
+			       IRDMA_QHASH_MANAGE_TYPE_ADD, NULL, true)) {
+		ret = -EINVAL;
+		goto err;
+	}
+	cm_node->qhash_set = true;
+
+	cm_node->apbvt_entry = irdma_add_apbvt(iwdev, cm_info.loc_port);
+	if (!cm_node->apbvt_entry) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	cm_node->apbvt_set = true;
+	iwqp->cm_node = cm_node;
+	cm_node->iwqp = iwqp;
+	iwqp->cm_id = cm_id;
+	irdma_qp_add_ref(&iwqp->ibqp);
+	cm_id->add_ref(cm_id);
+
+	if (cm_node->state != IRDMA_CM_STATE_OFFLOADED) {
+		cm_node->state = IRDMA_CM_STATE_SYN_SENT;
+		ret = irdma_send_syn(cm_node, 0);
+		if (ret)
+			goto err;
+	}
+
+	ibdev_dbg(&iwdev->ibdev,
+		  "CM: rem_port=0x%04x, loc_port=0x%04x rem_addr=%pI4 loc_addr=%pI4 cm_node=%p cm_id=%p qp_id = %d\n\n",
+		  cm_node->rem_port, cm_node->loc_port, cm_node->rem_addr,
+		  cm_node->loc_addr, cm_node, cm_id, ibqp->qp_num);
+
+	trace_irdma_connect(cm_node, 0, NULL);
+
+	return 0;
+
+err:
+	if (cm_info.ipv4)
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: connect() FAILED: dest addr=%pI4",
+			  cm_info.rem_addr);
+	else
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: connect() FAILED: dest addr=%pI6",
+			  cm_info.rem_addr);
+	irdma_rem_ref_cm_node(cm_node);
+	iwdev->cm_core.stats_connect_errs++;
+
+	return ret;
+}
+
+/**
+ * irdma_create_listen - registered call creating listener
+ * @cm_id: cm information for passive connection
+ * @backlog: to max accept pending count
+ */
+int irdma_create_listen(struct iw_cm_id *cm_id, int backlog)
+{
+	struct irdma_device *iwdev;
+	struct irdma_cm_listener *cm_listen_node;
+	struct irdma_cm_info cm_info = {};
+	enum irdma_status_code err;
+	struct sockaddr_in *laddr;
+	struct sockaddr_in6 *laddr6;
+	bool wildcard = false;
+
+	iwdev = to_iwdev(cm_id->device);
+	if (!iwdev)
+		return -EINVAL;
+
+	laddr = (struct sockaddr_in *)&cm_id->m_local_addr;
+	laddr6 = (struct sockaddr_in6 *)&cm_id->m_local_addr;
+	cm_info.qh_qpid = iwdev->vsi.ilq->qp_id;
+
+	if (laddr->sin_family == AF_INET) {
+		if (iwdev->vsi.mtu < IRDMA_MIN_MTU_IPV4)
+			return -EINVAL;
+
+		cm_info.ipv4 = true;
+		cm_info.loc_addr[0] = ntohl(laddr->sin_addr.s_addr);
+		cm_info.loc_port = ntohs(laddr->sin_port);
+
+		if (laddr->sin_addr.s_addr != htonl(INADDR_ANY)) {
+			cm_info.vlan_id = irdma_get_vlan_ipv4(cm_info.loc_addr);
+		} else {
+			cm_info.vlan_id = 0xFFFF;
+			wildcard = true;
+		}
+	} else {
+		if (iwdev->vsi.mtu < IRDMA_MIN_MTU_IPV6)
+			return -EINVAL;
+
+		cm_info.ipv4 = false;
+		irdma_copy_ip_ntohl(cm_info.loc_addr,
+				    laddr6->sin6_addr.in6_u.u6_addr32);
+		cm_info.loc_port = ntohs(laddr6->sin6_port);
+		if (ipv6_addr_type(&laddr6->sin6_addr) != IPV6_ADDR_ANY) {
+			irdma_netdev_vlan_ipv6(cm_info.loc_addr,
+					       &cm_info.vlan_id, NULL);
+		} else {
+			cm_info.vlan_id = 0xFFFF;
+			wildcard = true;
+		}
+	}
+
+	if (cm_info.vlan_id >= VLAN_N_VID && iwdev->dcb)
+		cm_info.vlan_id = 0;
+	cm_info.backlog = backlog;
+	cm_info.cm_id = cm_id;
+
+	trace_irdma_create_listen(iwdev, &cm_info);
+
+	cm_listen_node = irdma_make_listen_node(&iwdev->cm_core, iwdev,
+						&cm_info);
+	if (!cm_listen_node) {
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: cm_listen_node == NULL\n");
+		return -ENOMEM;
+	}
+
+	cm_id->provider_data = cm_listen_node;
+
+	cm_listen_node->tos = cm_id->tos;
+	cm_listen_node->user_pri = rt_tos2priority(cm_id->tos);
+	cm_info.user_pri = cm_listen_node->user_pri;
+	if (!cm_listen_node->reused_node) {
+		if (wildcard) {
+			err = irdma_add_mqh(iwdev, &cm_info, cm_listen_node);
+			if (err)
+				goto error;
+		} else {
+			err = irdma_manage_qhash(iwdev, &cm_info,
+						 IRDMA_QHASH_TYPE_TCP_SYN,
+						 IRDMA_QHASH_MANAGE_TYPE_ADD,
+						 NULL, true);
+			if (err)
+				goto error;
+
+			cm_listen_node->qhash_set = true;
+		}
+
+		cm_listen_node->apbvt_entry = irdma_add_apbvt(iwdev,
+							      cm_info.loc_port);
+		if (!cm_listen_node->apbvt_entry)
+			goto error;
+	}
+	cm_id->add_ref(cm_id);
+	cm_listen_node->cm_core->stats_listen_created++;
+	ibdev_dbg(&iwdev->ibdev,
+		  "CM: loc_port=0x%04x loc_addr=%pI4 cm_listen_node=%p cm_id=%p qhash_set=%d vlan_id=%d\n",
+		  cm_listen_node->loc_port, cm_listen_node->loc_addr,
+		  cm_listen_node, cm_listen_node->cm_id,
+		  cm_listen_node->qhash_set, cm_listen_node->vlan_id);
+
+	return 0;
+
+error:
+
+	irdma_cm_del_listen(&iwdev->cm_core, cm_listen_node, false);
+
+	return -EINVAL;
+}
+
+/**
+ * irdma_destroy_listen - registered call to destroy listener
+ * @cm_id: cm information for passive connection
+ */
+int irdma_destroy_listen(struct iw_cm_id *cm_id)
+{
+	struct irdma_device *iwdev;
+
+	iwdev = to_iwdev(cm_id->device);
+	if (cm_id->provider_data)
+		irdma_cm_del_listen(&iwdev->cm_core, cm_id->provider_data,
+				    true);
+	else
+		ibdev_dbg(&iwdev->ibdev,
+			  "CM: cm_id->provider_data was NULL\n");
+
+	cm_id->rem_ref(cm_id);
+
+	return 0;
+}
+
+/**
+ * irdma_teardown_list_prep - add conn nodes slated for tear down to list
+ * @cm_core: cm's core
+ * @teardown_list: a list to which cm_node will be selected
+ * @ipaddr: pointer to ip address
+ * @nfo: pointer to cm_info structure instance
+ * @disconnect_all: flag indicating disconnect all QPs
+ */
+static void irdma_teardown_list_prep(struct irdma_cm_core *cm_core,
+				     struct list_head *teardown_list,
+				     u32 *ipaddr,
+				     struct irdma_cm_info *nfo,
+				     bool disconnect_all)
+{
+	struct irdma_cm_node *cm_node;
+	int bkt;
+
+	hash_for_each_rcu(cm_core->cm_hash_tbl, bkt, cm_node, list) {
+		if ((disconnect_all ||
+		     (nfo->vlan_id == cm_node->vlan_id &&
+		      !memcmp(cm_node->loc_addr, ipaddr, nfo->ipv4 ? 4 : 16))) &&
+		    refcount_inc_not_zero(&cm_node->refcnt))
+			list_add(&cm_node->teardown_entry, teardown_list);
+	}
+}
+
+/**
+ * irdma_cm_event_connected - handle connected active node
+ * @event: the info for cm_node of connection
+ */
+static void irdma_cm_event_connected(struct irdma_cm_event *event)
+{
+	struct irdma_qp *iwqp;
+	struct irdma_device *iwdev;
+	struct irdma_cm_node *cm_node;
+	struct irdma_sc_dev *dev;
+	struct ib_qp_attr attr = {};
+	struct iw_cm_id *cm_id;
+	int status;
+	bool read0;
+	int wait_ret = 0;
+
+	cm_node = event->cm_node;
+	cm_id = cm_node->cm_id;
+	iwqp = cm_id->provider_data;
+	iwdev = iwqp->iwdev;
+	dev = &iwdev->rf->sc_dev;
+	if (iwqp->sc_qp.qp_uk.destroy_pending) {
+		status = -ETIMEDOUT;
+		goto error;
+	}
+
+	irdma_cm_init_tsa_conn(iwqp, cm_node);
+	read0 = (cm_node->send_rdma0_op == SEND_RDMA_READ_ZERO);
+	if (iwqp->page)
+		iwqp->sc_qp.qp_uk.sq_base = kmap(iwqp->page);
+	dev->iw_priv_qp_ops->qp_send_rtt(&iwqp->sc_qp, read0);
+	if (iwqp->page)
+		kunmap(iwqp->page);
+
+	attr.qp_state = IB_QPS_RTS;
+	cm_node->qhash_set = false;
+	irdma_modify_qp(&iwqp->ibqp, &attr, IB_QP_STATE, NULL);
+	if (dev->hw_attrs.uk_attrs.feature_flags & IRDMA_FEATURE_RTS_AE) {
+		wait_ret = wait_event_interruptible_timeout(iwqp->waitq,
+							    iwqp->rts_ae_rcvd,
+							    IRDMA_MAX_TIMEOUT);
+		if (!wait_ret)
+			ibdev_dbg(&iwdev->ibdev,
+				  "CM: Slow Connection: cm_node=%p, loc_port=%d, rem_port=%d, cm_id=%p\n",
+				  cm_node, cm_node->loc_port,
+				  cm_node->rem_port, cm_node->cm_id);
+	}
+
+	irdma_send_cm_event(cm_node, cm_id, IW_CM_EVENT_CONNECT_REPLY, 0);
+	cm_node->accelerated = true;
+	complete(&cm_node->establish_comp);
+	cm_node->cm_core->cm_free_ah(cm_node);
+	return;
+
+error:
+	iwqp->cm_id = NULL;
+	cm_id->provider_data = NULL;
+	irdma_send_cm_event(event->cm_node, cm_id, IW_CM_EVENT_CONNECT_REPLY,
+			    status);
+	irdma_rem_ref_cm_node(event->cm_node);
+}
+
+/**
+ * irdma_cm_event_reset - handle reset
+ * @event: the info for cm_node of connection
+ */
+static void irdma_cm_event_reset(struct irdma_cm_event *event)
+{
+	struct irdma_cm_node *cm_node = event->cm_node;
+	struct iw_cm_id *cm_id = cm_node->cm_id;
+	struct irdma_qp *iwqp;
+
+	if (!cm_id)
+		return;
+
+	iwqp = cm_id->provider_data;
+	if (!iwqp)
+		return;
+
+	ibdev_dbg(&cm_node->iwdev->ibdev,
+		  "CM: reset event %p - cm_id = %p\n", event->cm_node, cm_id);
+	iwqp->cm_id = NULL;
+
+	irdma_send_cm_event(cm_node, cm_node->cm_id, IW_CM_EVENT_DISCONNECT,
+			    -ECONNRESET);
+	irdma_send_cm_event(cm_node, cm_node->cm_id, IW_CM_EVENT_CLOSE, 0);
+}
+
+/**
+ * irdma_cm_event_handler - send event to cm upper layer
+ * @work: pointer of cm event info.
+ */
+static void irdma_cm_event_handler(struct work_struct *work)
+{
+	struct irdma_cm_event *event = container_of(work, struct irdma_cm_event, event_work);
+	struct irdma_cm_node *cm_node;
+
+	if (!event || !event->cm_node || !event->cm_node->cm_core)
+		return;
+
+	cm_node = event->cm_node;
+	trace_irdma_cm_event_handler(cm_node, event->type, NULL);
+
+	switch (event->type) {
+	case IRDMA_CM_EVENT_MPA_REQ:
+		irdma_send_cm_event(cm_node, cm_node->cm_id,
+				    IW_CM_EVENT_CONNECT_REQUEST, 0);
+		break;
+	case IRDMA_CM_EVENT_RESET:
+		irdma_cm_event_reset(event);
+		break;
+	case IRDMA_CM_EVENT_CONNECTED:
+		if (!event->cm_node->cm_id ||
+		    event->cm_node->state != IRDMA_CM_STATE_OFFLOADED)
+			break;
+		irdma_cm_event_connected(event);
+		break;
+	case IRDMA_CM_EVENT_MPA_REJECT:
+		if (!event->cm_node->cm_id ||
+		    cm_node->state == IRDMA_CM_STATE_OFFLOADED)
+			break;
+		irdma_send_cm_event(cm_node, cm_node->cm_id,
+				    IW_CM_EVENT_CONNECT_REPLY, -ECONNREFUSED);
+		break;
+	case IRDMA_CM_EVENT_ABORTED:
+		if (!event->cm_node->cm_id ||
+		    event->cm_node->state == IRDMA_CM_STATE_OFFLOADED)
+			break;
+		irdma_event_connect_error(event);
+		break;
+	default:
+		ibdev_dbg(&cm_node->iwdev->ibdev,
+			  "CM: bad event type = %d\n", event->type);
+		break;
+	}
+
+	irdma_rem_ref_cm_node(event->cm_node);
+	kfree(event);
+}
+
+/**
+ * irdma_cm_post_event - queue event request for worker thread
+ * @event: cm node's info for up event call
+ */
+static void irdma_cm_post_event(struct irdma_cm_event *event)
+{
+	refcount_inc(&event->cm_node->refcnt);
+	INIT_WORK(&event->event_work, irdma_cm_event_handler);
+	queue_work(event->cm_node->cm_core->event_wq, &event->event_work);
+}
+
+/**
+ * irdma_cm_teardown_connections - teardown QPs
+ * @iwdev: device pointer
+ * @ipaddr: Pointer to IPv4 or IPv6 address
+ * @nfo: Connection info
+ * @disconnect_all: flag indicating disconnect all QPs
+ *
+ * teardown QPs where source or destination addr matches ip addr
+ */
+void irdma_cm_teardown_connections(struct irdma_device *iwdev, u32 *ipaddr,
+				   struct irdma_cm_info *nfo,
+				   bool disconnect_all)
+{
+	struct irdma_cm_core *cm_core = &iwdev->cm_core;
+	struct list_head *list_core_temp;
+	struct list_head *list_node;
+	struct irdma_cm_node *cm_node;
+	struct list_head teardown_list;
+	struct ib_qp_attr attr;
+	struct irdma_sc_vsi *vsi = &iwdev->vsi;
+	struct irdma_sc_qp *sc_qp;
+	struct irdma_qp *qp;
+	int i;
+
+	INIT_LIST_HEAD(&teardown_list);
+
+	rcu_read_lock();
+	irdma_teardown_list_prep(cm_core, &teardown_list, ipaddr, nfo, disconnect_all);
+	rcu_read_unlock();
+
+	list_for_each_safe (list_node, list_core_temp, &teardown_list) {
+		cm_node = container_of(list_node, struct irdma_cm_node,
+				       teardown_entry);
+		attr.qp_state = IB_QPS_ERR;
+		irdma_modify_qp(&cm_node->iwqp->ibqp, &attr, IB_QP_STATE, NULL);
+		if (iwdev->reset)
+			irdma_cm_disconn(cm_node->iwqp);
+		irdma_rem_ref_cm_node(cm_node);
+	}
+	if (!iwdev->roce_mode)
+		return;
+
+	INIT_LIST_HEAD(&teardown_list);
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+		mutex_lock(&vsi->qos[i].qos_mutex);
+		list_for_each_safe (list_node, list_core_temp,
+				    &vsi->qos[i].qplist) {
+			u32 qp_ip[4];
+
+			sc_qp = container_of(list_node, struct irdma_sc_qp,
+					     list);
+			if (sc_qp->qp_uk.qp_type != IRDMA_QP_TYPE_ROCE_RC)
+				continue;
+
+			qp = sc_qp->qp_uk.back_qp;
+			if (!disconnect_all) {
+				if (nfo->ipv4)
+					qp_ip[0] = qp->udp_info.local_ipaddr[3];
+				else
+					memcpy(qp_ip,
+					       &qp->udp_info.local_ipaddr[0],
+					       sizeof(qp_ip));
+			}
+
+			if (disconnect_all ||
+			    (nfo->vlan_id == (qp->udp_info.vlan_tag & VLAN_VID_MASK) &&
+			     !memcmp(qp_ip, ipaddr, nfo->ipv4 ? 4 : 16))) {
+				spin_lock(&iwdev->rf->qptable_lock);
+				if (iwdev->rf->qp_table[sc_qp->qp_uk.qp_id]) {
+					irdma_qp_add_ref(&qp->ibqp);
+					list_add(&qp->teardown_entry,
+						 &teardown_list);
+				}
+				spin_unlock(&iwdev->rf->qptable_lock);
+			}
+		}
+		mutex_unlock(&vsi->qos[i].qos_mutex);
+	}
+
+	list_for_each_safe (list_node, list_core_temp, &teardown_list) {
+		qp = container_of(list_node, struct irdma_qp, teardown_entry);
+		attr.qp_state = IB_QPS_ERR;
+		irdma_modify_qp_roce(&qp->ibqp, &attr, IB_QP_STATE, NULL);
+		irdma_qp_rem_ref(&qp->ibqp);
+	}
+}
+
+/**
+ * irdma_qhash_ctrl - enable/disable qhash for list
+ * @iwdev: device pointer
+ * @parent_listen_node: parent listen node
+ * @nfo: cm info node
+ * @ipaddr: Pointer to IPv4 or IPv6 address
+ * @ipv4: flag indicating IPv4 when true
+ * @ifup: flag indicating interface up when true
+ *
+ * Enables or disables the qhash for the node in the child
+ * listen list that matches ipaddr. If no matching IP was found
+ * it will allocate and add a new child listen node to the
+ * parent listen node. The listen_list_lock is assumed to be
+ * held when called.
+ */
+static void irdma_qhash_ctrl(struct irdma_device *iwdev,
+			     struct irdma_cm_listener *parent_listen_node,
+			     struct irdma_cm_info *nfo, u32 *ipaddr, bool ipv4,
+			     bool ifup)
+{
+	struct list_head *child_listen_list = &parent_listen_node->child_listen_list;
+	struct irdma_cm_listener *child_listen_node;
+	struct list_head *pos, *tpos;
+	enum irdma_status_code err;
+	bool node_allocated = false;
+	enum irdma_quad_hash_manage_type op = ifup ?
+					      IRDMA_QHASH_MANAGE_TYPE_ADD :
+					      IRDMA_QHASH_MANAGE_TYPE_DELETE;
+
+	list_for_each_safe (pos, tpos, child_listen_list) {
+		child_listen_node = list_entry(pos, struct irdma_cm_listener,
+					       child_listen_list);
+		if (!memcmp(child_listen_node->loc_addr, ipaddr, ipv4 ? 4 : 16))
+			goto set_qhash;
+	}
+
+	/* if not found then add a child listener if interface is going up */
+	if (!ifup)
+		return;
+	child_listen_node = kmemdup(parent_listen_node,
+				    sizeof(*child_listen_node), GFP_ATOMIC);
+	if (!child_listen_node)
+		return;
+
+	node_allocated = true;
+	memcpy(child_listen_node->loc_addr, ipaddr, ipv4 ? 4 : 16);
+
+set_qhash:
+	memcpy(nfo->loc_addr, child_listen_node->loc_addr,
+	       sizeof(nfo->loc_addr));
+	nfo->vlan_id = child_listen_node->vlan_id;
+	err = irdma_manage_qhash(iwdev, nfo, IRDMA_QHASH_TYPE_TCP_SYN, op, NULL,
+				 false);
+	if (!err) {
+		child_listen_node->qhash_set = ifup;
+		if (node_allocated)
+			list_add(&child_listen_node->child_listen_list,
+				 &parent_listen_node->child_listen_list);
+	} else if (node_allocated) {
+		kfree(child_listen_node);
+	}
+}
+
+/**
+ * irdma_if_notify - process an ifdown on an interface
+ * @iwdev: device pointer
+ * @netdev: network device structure
+ * @ipaddr: Pointer to IPv4 or IPv6 address
+ * @ipv4: flag indicating IPv4 when true
+ * @ifup: flag indicating interface up when true
+ */
+void irdma_if_notify(struct irdma_device *iwdev, struct net_device *netdev,
+		     u32 *ipaddr, bool ipv4, bool ifup)
+{
+	struct irdma_cm_core *cm_core = &iwdev->cm_core;
+	unsigned long flags;
+	struct irdma_cm_listener *listen_node;
+	static const u32 ip_zero[4] = { 0, 0, 0, 0 };
+	struct irdma_cm_info nfo = {};
+	u16 vlan_id = rdma_vlan_dev_vlan_id(netdev);
+	enum irdma_quad_hash_manage_type op = ifup ?
+					      IRDMA_QHASH_MANAGE_TYPE_ADD :
+					      IRDMA_QHASH_MANAGE_TYPE_DELETE;
+
+	nfo.vlan_id = vlan_id;
+	nfo.ipv4 = ipv4;
+	nfo.qh_qpid = 1;
+
+	/* Disable or enable qhash for listeners */
+	spin_lock_irqsave(&cm_core->listen_list_lock, flags);
+	list_for_each_entry (listen_node, &cm_core->listen_list, list) {
+		if (vlan_id != listen_node->vlan_id ||
+		    (memcmp(listen_node->loc_addr, ipaddr, ipv4 ? 4 : 16) &&
+		     memcmp(listen_node->loc_addr, ip_zero, ipv4 ? 4 : 16)))
+			continue;
+
+		memcpy(nfo.loc_addr, listen_node->loc_addr,
+		       sizeof(nfo.loc_addr));
+		nfo.loc_port = listen_node->loc_port;
+		nfo.user_pri = listen_node->user_pri;
+		if (!list_empty(&listen_node->child_listen_list)) {
+			irdma_qhash_ctrl(iwdev, listen_node, &nfo, ipaddr, ipv4,
+					 ifup);
+		} else if (memcmp(listen_node->loc_addr, ip_zero,
+				  ipv4 ? 4 : 16)) {
+			if (!irdma_manage_qhash(iwdev, &nfo,
+						IRDMA_QHASH_TYPE_TCP_SYN, op,
+						NULL, false))
+				listen_node->qhash_set = ifup;
+		}
+	}
+	spin_unlock_irqrestore(&cm_core->listen_list_lock, flags);
+
+	/* disconnect any connected qp's on ifdown */
+	if (!ifup)
+		irdma_cm_teardown_connections(iwdev, ipaddr, &nfo, false);
+}
diff --git a/drivers/infiniband/hw/irdma/cm.h b/drivers/infiniband/hw/irdma/cm.h
new file mode 100644
index 0000000..d03cd29
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/cm.h
@@ -0,0 +1,417 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#ifndef IRDMA_CM_H
+#define IRDMA_CM_H
+
+#define IRDMA_MPA_REQUEST_ACCEPT	1
+#define IRDMA_MPA_REQUEST_REJECT	2
+
+/* IETF MPA -- defines */
+#define IEFT_MPA_KEY_REQ	"MPA ID Req Frame"
+#define IEFT_MPA_KEY_REP	"MPA ID Rep Frame"
+#define IETF_MPA_KEY_SIZE	16
+#define IETF_MPA_VER		1
+#define IETF_MAX_PRIV_DATA_LEN	512
+#define IETF_MPA_FRAME_SIZE	20
+#define IETF_RTR_MSG_SIZE	4
+#define IETF_MPA_V2_FLAG	0x10
+#define SNDMARKER_SEQNMASK	0x000001ff
+#define IRDMA_MAX_IETF_SIZE	32
+
+/* IETF RTR MSG Fields */
+#define IETF_PEER_TO_PEER	0x8000
+#define IETF_FLPDU_ZERO_LEN	0x4000
+#define IETF_RDMA0_WRITE	0x8000
+#define IETF_RDMA0_READ		0x4000
+#define IETF_NO_IRD_ORD		0x3fff
+
+#define MAX_PORTS	65536
+
+#define IRDMA_PASSIVE_STATE_INDICATED	0
+#define IRDMA_DO_NOT_SEND_RESET_EVENT	1
+#define IRDMA_SEND_RESET_EVENT		2
+
+#define MAX_IRDMA_IFS	4
+
+#define SET_ACK		1
+#define SET_SYN		2
+#define SET_FIN		4
+#define SET_RST		8
+
+#define TCP_OPTIONS_PADDING	3
+
+#define IRDMA_DEFAULT_RETRYS	64
+#define IRDMA_DEFAULT_RETRANS	8
+#define IRDMA_DEFAULT_TTL		0x40
+#define IRDMA_DEFAULT_RTT_VAR		6
+#define IRDMA_DEFAULT_SS_THRESH		0x3fffffff
+#define IRDMA_DEFAULT_REXMIT_THRESH	8
+
+#define IRDMA_RETRY_TIMEOUT	HZ
+#define IRDMA_SHORT_TIME	10
+#define IRDMA_LONG_TIME		(2 * HZ)
+#define IRDMA_MAX_TIMEOUT	((unsigned long)(12 * HZ))
+
+#define IRDMA_CM_HASHTABLE_SIZE		1024
+#define IRDMA_CM_TCP_TIMER_INTERVAL	3000
+#define IRDMA_CM_DEFAULT_MTU		1540
+#define IRDMA_CM_DEFAULT_FRAME_CNT	10
+#define IRDMA_CM_THREAD_STACK_SIZE	256
+#define IRDMA_CM_DEFAULT_RCV_WND	64240
+#define IRDMA_CM_DEFAULT_RCV_WND_SCALED	0x3FFFC
+#define IRDMA_CM_DEFAULT_RCV_WND_SCALE	2
+#define IRDMA_CM_DEFAULT_FREE_PKTS	10
+#define IRDMA_CM_FREE_PKT_LO_WATERMARK	2
+#define IRDMA_CM_DEFAULT_MSS		536
+#define IRDMA_CM_DEFAULT_MPA_VER	2
+#define IRDMA_CM_DEFAULT_SEQ		0x159bf75f
+#define IRDMA_CM_DEFAULT_LOCAL_ID	0x3b47
+#define IRDMA_CM_DEFAULT_SEQ2		0x18ed5740
+#define IRDMA_CM_DEFAULT_LOCAL_ID2	0xb807
+#define IRDMA_MAX_CM_BUF		(IRDMA_MAX_IETF_SIZE + IETF_MAX_PRIV_DATA_LEN)
+
+enum ietf_mpa_flags {
+	IETF_MPA_FLAGS_REJECT  = 0x20,
+	IETF_MPA_FLAGS_CRC     = 0x40,
+	IETF_MPA_FLAGS_MARKERS = 0x80,
+};
+
+enum irdma_timer_type {
+	IRDMA_TIMER_TYPE_SEND,
+	IRDMA_TIMER_TYPE_CLOSE,
+};
+
+enum option_nums {
+	OPTION_NUM_EOL,
+	OPTION_NUM_NONE,
+	OPTION_NUM_MSS,
+	OPTION_NUM_WINDOW_SCALE,
+	OPTION_NUM_SACK_PERM,
+	OPTION_NUM_SACK,
+	OPTION_NUM_WRITE0 = 0xbc,
+};
+
+/* cm node transition states */
+enum irdma_cm_node_state {
+	IRDMA_CM_STATE_UNKNOWN,
+	IRDMA_CM_STATE_INITED,
+	IRDMA_CM_STATE_LISTENING,
+	IRDMA_CM_STATE_SYN_RCVD,
+	IRDMA_CM_STATE_SYN_SENT,
+	IRDMA_CM_STATE_ONE_SIDE_ESTABLISHED,
+	IRDMA_CM_STATE_ESTABLISHED,
+	IRDMA_CM_STATE_ACCEPTING,
+	IRDMA_CM_STATE_MPAREQ_SENT,
+	IRDMA_CM_STATE_MPAREQ_RCVD,
+	IRDMA_CM_STATE_MPAREJ_RCVD,
+	IRDMA_CM_STATE_OFFLOADED,
+	IRDMA_CM_STATE_FIN_WAIT1,
+	IRDMA_CM_STATE_FIN_WAIT2,
+	IRDMA_CM_STATE_CLOSE_WAIT,
+	IRDMA_CM_STATE_TIME_WAIT,
+	IRDMA_CM_STATE_LAST_ACK,
+	IRDMA_CM_STATE_CLOSING,
+	IRDMA_CM_STATE_LISTENER_DESTROYED,
+	IRDMA_CM_STATE_CLOSED,
+};
+
+enum mpa_frame_ver {
+	IETF_MPA_V1 = 1,
+	IETF_MPA_V2 = 2,
+};
+
+enum mpa_frame_key {
+	MPA_KEY_REQUEST,
+	MPA_KEY_REPLY,
+};
+
+enum send_rdma0 {
+	SEND_RDMA_READ_ZERO  = 1,
+	SEND_RDMA_WRITE_ZERO = 2,
+};
+
+enum irdma_tcpip_pkt_type {
+	IRDMA_PKT_TYPE_UNKNOWN,
+	IRDMA_PKT_TYPE_SYN,
+	IRDMA_PKT_TYPE_SYNACK,
+	IRDMA_PKT_TYPE_ACK,
+	IRDMA_PKT_TYPE_FIN,
+	IRDMA_PKT_TYPE_RST,
+};
+
+enum irdma_cm_listener_state {
+	IRDMA_CM_LISTENER_PASSIVE_STATE = 1,
+	IRDMA_CM_LISTENER_ACTIVE_STATE  = 2,
+	IRDMA_CM_LISTENER_EITHER_STATE  = 3,
+};
+
+/* CM event codes */
+enum irdma_cm_event_type {
+	IRDMA_CM_EVENT_UNKNOWN,
+	IRDMA_CM_EVENT_ESTABLISHED,
+	IRDMA_CM_EVENT_MPA_REQ,
+	IRDMA_CM_EVENT_MPA_CONNECT,
+	IRDMA_CM_EVENT_MPA_ACCEPT,
+	IRDMA_CM_EVENT_MPA_REJECT,
+	IRDMA_CM_EVENT_MPA_ESTABLISHED,
+	IRDMA_CM_EVENT_CONNECTED,
+	IRDMA_CM_EVENT_RESET,
+	IRDMA_CM_EVENT_ABORTED,
+};
+
+struct irdma_bth { /* Base Trasnport Header */
+	u8 opcode;
+	u8 flags;
+	__be16 pkey;
+	__be32 qpn;
+	__be32 apsn;
+};
+
+struct ietf_mpa_v1 {
+	u8 key[IETF_MPA_KEY_SIZE];
+	u8 flags;
+	u8 rev;
+	__be16 priv_data_len;
+	u8 priv_data[];
+};
+
+struct ietf_rtr_msg {
+	__be16 ctrl_ird;
+	__be16 ctrl_ord;
+};
+
+struct ietf_mpa_v2 {
+	u8 key[IETF_MPA_KEY_SIZE];
+	u8 flags;
+	u8 rev;
+	__be16 priv_data_len;
+	struct ietf_rtr_msg rtr_msg;
+	u8 priv_data[];
+};
+
+struct option_base {
+	u8 optionnum;
+	u8 len;
+};
+
+struct option_mss {
+	u8 optionnum;
+	u8 len;
+	__be16 mss;
+};
+
+struct option_windowscale {
+	u8 optionnum;
+	u8 len;
+	u8 shiftcount;
+};
+
+union all_known_options {
+	char eol;
+	struct option_base base;
+	struct option_mss mss;
+	struct option_windowscale windowscale;
+};
+
+struct irdma_timer_entry {
+	struct list_head list;
+	unsigned long timetosend; /* jiffies */
+	struct irdma_puda_buf *sqbuf;
+	u32 type;
+	u32 retrycount;
+	u32 retranscount;
+	u32 context;
+	u32 send_retrans;
+	int close_when_complete;
+};
+
+/* CM context params */
+struct irdma_cm_tcp_context {
+	u8 client;
+	u32 loc_seq_num;
+	u32 loc_ack_num;
+	u32 rem_ack_num;
+	u32 rcv_nxt;
+	u32 loc_id;
+	u32 rem_id;
+	u32 snd_wnd;
+	u32 max_snd_wnd;
+	u32 rcv_wnd;
+	u32 mss;
+	u8 snd_wscale;
+	u8 rcv_wscale;
+};
+
+struct irdma_apbvt_entry {
+	struct hlist_node hlist;
+	u32 use_cnt;
+	u16 port;
+};
+
+struct irdma_cm_listener {
+	struct list_head list;
+	struct iw_cm_id *cm_id;
+	struct irdma_cm_core *cm_core;
+	struct irdma_device *iwdev;
+	struct list_head child_listen_list;
+	struct irdma_apbvt_entry *apbvt_entry;
+	enum irdma_cm_listener_state listener_state;
+	refcount_t refcnt;
+	atomic_t pend_accepts_cnt;
+	u32 loc_addr[4];
+	u32 reused_node;
+	int backlog;
+	u16 loc_port;
+	u16 vlan_id;
+	u8 loc_mac[ETH_ALEN];
+	u8 user_pri;
+	u8 tos;
+	bool qhash_set:1;
+	bool ipv4:1;
+};
+
+struct irdma_kmem_info {
+	void *addr;
+	u32 size;
+};
+
+struct irdma_mpa_priv_info {
+	const void *addr;
+	u32 size;
+};
+
+struct irdma_cm_node {
+	struct irdma_qp *iwqp;
+	struct irdma_device *iwdev;
+	struct irdma_sc_dev *dev;
+	struct irdma_cm_tcp_context tcp_cntxt;
+	struct irdma_cm_core *cm_core;
+	struct irdma_timer_entry *send_entry;
+	struct irdma_timer_entry *close_entry;
+	struct irdma_cm_listener *listener;
+	struct list_head timer_entry;
+	struct list_head reset_entry;
+	struct list_head teardown_entry;
+	struct irdma_apbvt_entry *apbvt_entry;
+	struct rcu_head rcu_head;
+	struct irdma_mpa_priv_info pdata;
+	struct irdma_sc_ah *ah;
+	union {
+		struct ietf_mpa_v1 mpa_frame;
+		struct ietf_mpa_v2 mpa_v2_frame;
+	};
+	struct irdma_kmem_info mpa_hdr;
+	struct iw_cm_id *cm_id;
+	struct hlist_node list;
+	struct completion establish_comp;
+	spinlock_t retrans_list_lock; /* protect CM node rexmit updates*/
+	atomic_t passive_state;
+	refcount_t refcnt;
+	enum irdma_cm_node_state state;
+	enum send_rdma0 send_rdma0_op;
+	enum mpa_frame_ver mpa_frame_rev;
+	u32 loc_addr[4], rem_addr[4];
+	u16 loc_port, rem_port;
+	int apbvt_set;
+	int accept_pend;
+	u16 vlan_id;
+	u16 ird_size;
+	u16 ord_size;
+	u16 mpav2_ird_ord;
+	u16 lsmm_size;
+	u8 pdata_buf[IETF_MAX_PRIV_DATA_LEN];
+	u8 loc_mac[ETH_ALEN];
+	u8 rem_mac[ETH_ALEN];
+	u8 user_pri;
+	u8 tos;
+	bool ack_rcvd:1;
+	bool qhash_set:1;
+	bool ipv4:1;
+	bool snd_mark_en:1;
+	bool rcv_mark_en:1;
+	bool do_lpb:1;
+	bool accelerated:1;
+};
+
+/* Used by internal CM APIs to pass CM information*/
+struct irdma_cm_info {
+	struct iw_cm_id *cm_id;
+	u16 loc_port;
+	u16 rem_port;
+	u32 loc_addr[4];
+	u32 rem_addr[4];
+	u32 qh_qpid;
+	u16 vlan_id;
+	int backlog;
+	u8 user_pri;
+	u8 tos;
+	bool ipv4;
+};
+
+struct irdma_cm_event {
+	enum irdma_cm_event_type type;
+	struct irdma_cm_info cm_info;
+	struct work_struct event_work;
+	struct irdma_cm_node *cm_node;
+};
+
+struct irdma_cm_core {
+	struct irdma_device *iwdev;
+	struct irdma_sc_dev *dev;
+	struct list_head listen_list;
+	DECLARE_HASHTABLE(cm_hash_tbl, 8);
+	DECLARE_HASHTABLE(apbvt_hash_tbl, 8);
+	struct timer_list tcp_timer;
+	struct workqueue_struct *event_wq;
+	spinlock_t ht_lock; /* protect CM node (active side) list */
+	spinlock_t listen_list_lock; /* protect listener list */
+	spinlock_t apbvt_lock; /*serialize apbvt add/del entries*/
+	u64 stats_nodes_created;
+	u64 stats_nodes_destroyed;
+	u64 stats_listen_created;
+	u64 stats_listen_destroyed;
+	u64 stats_listen_nodes_created;
+	u64 stats_listen_nodes_destroyed;
+	u64 stats_lpbs;
+	u64 stats_accepts;
+	u64 stats_rejects;
+	u64 stats_connect_errs;
+	u64 stats_passive_errs;
+	u64 stats_pkt_retrans;
+	u64 stats_backlog_drops;
+	struct irdma_puda_buf *(*form_cm_frame)(struct irdma_cm_node *cm_node,
+						struct irdma_kmem_info *options,
+						struct irdma_kmem_info *hdr,
+						struct irdma_mpa_priv_info *pdata,
+						u8 flags);
+	int (*cm_create_ah)(struct irdma_cm_node *cm_node, bool wait);
+	void (*cm_free_ah)(struct irdma_cm_node *cm_node);
+};
+
+int irdma_schedule_cm_timer(struct irdma_cm_node *cm_node,
+			    struct irdma_puda_buf *sqbuf,
+			    enum irdma_timer_type type, int send_retrans,
+			    int close_when_complete);
+int irdma_accept(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param);
+int irdma_reject(struct iw_cm_id *cm_id, const void *pdata, u8 pdata_len);
+int irdma_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param);
+int irdma_create_listen(struct iw_cm_id *cm_id, int backlog);
+int irdma_destroy_listen(struct iw_cm_id *cm_id);
+int irdma_add_arp(struct irdma_pci_f *rf, u32 *ip, bool ipv4, u8 *mac);
+void irdma_cm_teardown_connections(struct irdma_device *iwdev, u32 *ipaddr,
+				   struct irdma_cm_info *nfo,
+				   bool disconnect_all);
+int irdma_cm_start(struct irdma_device *dev);
+int irdma_cm_stop(struct irdma_device *dev);
+bool irdma_ipv4_is_lpb(u32 loc_addr, u32 rem_addr);
+bool irdma_ipv6_is_lpb(u32 *loc_addr, u32 *rem_addr);
+int irdma_arp_table(struct irdma_pci_f *rf, u32 *ip_addr, bool ipv4,
+		    u8 *mac_addr, u32 action);
+void irdma_if_notify(struct irdma_device *iwdev, struct net_device *netdev,
+		     u32 *ipaddr, bool ipv4, bool ifup);
+bool irdma_port_in_use(struct irdma_cm_core *cm_core, u16 port);
+void irdma_send_ack(struct irdma_cm_node *cm_node);
+void irdma_lpb_nop(struct irdma_sc_qp *qp);
+void irdma_rem_ref_cm_node(struct irdma_cm_node *cm_node);
+void irdma_add_conn_est_qh(struct irdma_cm_node *cm_node);
+#endif /* IRDMA_CM_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 15/23] RDMA/irdma: Add PBLE resource manager
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (13 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 14/23] RDMA/irdma: Add connection manager Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 16/23] RDMA/irdma: Implement device supported verb APIs Shiraz Saleem
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Implement a Physical Buffer List Entry (PBLE) resource manager
to manage a pool of PBLE HMC resource objects.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/pble.c | 525 +++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/pble.h | 136 ++++++++++
 2 files changed, 661 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/pble.c
 create mode 100644 drivers/infiniband/hw/irdma/pble.h

diff --git a/drivers/infiniband/hw/irdma/pble.c b/drivers/infiniband/hw/irdma/pble.c
new file mode 100644
index 0000000..e8f29d9
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/pble.c
@@ -0,0 +1,525 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2020 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "hmc.h"
+#include "defs.h"
+#include "type.h"
+#include "protos.h"
+#include "pble.h"
+
+static enum irdma_status_code
+add_pble_prm(struct irdma_hmc_pble_rsrc *pble_rsrc);
+
+/**
+ * irdma_destroy_pble_prm - destroy prm during module unload
+ * @pble_rsrc: pble resources
+ */
+void irdma_destroy_pble_prm(struct irdma_hmc_pble_rsrc *pble_rsrc)
+{
+	struct irdma_chunk *chunk;
+	struct irdma_pble_prm *pinfo = &pble_rsrc->pinfo;
+
+	while (!list_empty(&pinfo->clist)) {
+		chunk = (struct irdma_chunk *) pinfo->clist.next;
+		list_del(&chunk->list);
+		if (chunk->type == PBLE_SD_PAGED)
+			irdma_pble_free_paged_mem(chunk);
+		if (chunk->bitmapbuf)
+			kfree(chunk->bitmapmem.va);
+		kfree(chunk->chunkmem.va);
+	}
+}
+
+/**
+ * irdma_hmc_init_pble - Initialize pble resources during module load
+ * @dev: irdma_sc_dev struct
+ * @pble_rsrc: pble resources
+ */
+enum irdma_status_code
+irdma_hmc_init_pble(struct irdma_sc_dev *dev,
+		    struct irdma_hmc_pble_rsrc *pble_rsrc)
+{
+	struct irdma_hmc_info *hmc_info;
+	u32 fpm_idx = 0;
+	enum irdma_status_code status = 0;
+
+	hmc_info = dev->hmc_info;
+	pble_rsrc->dev = dev;
+	pble_rsrc->fpm_base_addr = hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].base;
+	/* Start pble' on 4k boundary */
+	if (pble_rsrc->fpm_base_addr & 0xfff)
+		fpm_idx = (4096 - (pble_rsrc->fpm_base_addr & 0xfff)) >> 3;
+	pble_rsrc->unallocated_pble =
+		hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt - fpm_idx;
+	pble_rsrc->next_fpm_addr = pble_rsrc->fpm_base_addr + (fpm_idx << 3);
+	pble_rsrc->pinfo.pble_shift = PBLE_SHIFT;
+
+	mutex_init(&pble_rsrc->pble_mutex_lock);
+
+	spin_lock_init(&pble_rsrc->pinfo.prm_lock);
+	INIT_LIST_HEAD(&pble_rsrc->pinfo.clist);
+	if (add_pble_prm(pble_rsrc)) {
+		irdma_destroy_pble_prm(pble_rsrc);
+		status = IRDMA_ERR_NO_MEMORY;
+	}
+
+	return status;
+}
+
+/**
+ * get_sd_pd_idx -  Returns sd index, pd index and rel_pd_idx from fpm address
+ * @pble_rsrc: structure containing fpm address
+ * @idx: where to return indexes
+ */
+static void get_sd_pd_idx(struct irdma_hmc_pble_rsrc *pble_rsrc,
+			  struct sd_pd_idx *idx)
+{
+	idx->sd_idx = (u32)pble_rsrc->next_fpm_addr / IRDMA_HMC_DIRECT_BP_SIZE;
+	idx->pd_idx = (u32)(pble_rsrc->next_fpm_addr / IRDMA_HMC_PAGED_BP_SIZE);
+	idx->rel_pd_idx = (idx->pd_idx % IRDMA_HMC_PD_CNT_IN_SD);
+}
+
+/**
+ * add_sd_direct - add sd direct for pble
+ * @pble_rsrc: pble resource ptr
+ * @info: page info for sd
+ */
+static enum irdma_status_code
+add_sd_direct(struct irdma_hmc_pble_rsrc *pble_rsrc,
+	      struct irdma_add_page_info *info)
+{
+	struct irdma_sc_dev *dev = pble_rsrc->dev;
+	enum irdma_status_code ret_code = 0;
+	struct sd_pd_idx *idx = &info->idx;
+	struct irdma_chunk *chunk = info->chunk;
+	struct irdma_hmc_info *hmc_info = info->hmc_info;
+	struct irdma_hmc_sd_entry *sd_entry = info->sd_entry;
+	u32 offset = 0;
+
+	if (!sd_entry->valid) {
+		ret_code = irdma_add_sd_table_entry(dev->hw, hmc_info,
+						    info->idx.sd_idx,
+						    IRDMA_SD_TYPE_DIRECT,
+						    IRDMA_HMC_DIRECT_BP_SIZE);
+		if (ret_code)
+			return ret_code;
+
+		chunk->type = PBLE_SD_CONTIGOUS;
+	}
+
+	offset = idx->rel_pd_idx << HMC_PAGED_BP_SHIFT;
+	chunk->size = info->pages << HMC_PAGED_BP_SHIFT;
+	chunk->vaddr = (uintptr_t)sd_entry->u.bp.addr.va + offset;
+	chunk->fpm_addr = pble_rsrc->next_fpm_addr;
+	ibdev_dbg(to_ibdev(dev),
+		  "PBLE: chunk_size[%lld] = 0x%llx vaddr=0x%llx fpm_addr = %llx\n",
+		  chunk->size, chunk->size, chunk->vaddr, chunk->fpm_addr);
+
+	return 0;
+}
+
+/**
+ * fpm_to_idx - given fpm address, get pble index
+ * @pble_rsrc: pble resource management
+ * @addr: fpm address for index
+ */
+static u32 fpm_to_idx(struct irdma_hmc_pble_rsrc *pble_rsrc, u64 addr)
+{
+	u64 idx;
+
+	idx = (addr - (pble_rsrc->fpm_base_addr)) >> 3;
+
+	return (u32)idx;
+}
+
+/**
+ * add_bp_pages - add backing pages for sd
+ * @pble_rsrc: pble resource management
+ * @info: page info for sd
+ */
+static enum irdma_status_code
+add_bp_pages(struct irdma_hmc_pble_rsrc *pble_rsrc,
+	     struct irdma_add_page_info *info)
+{
+	struct irdma_sc_dev *dev = pble_rsrc->dev;
+	u8 *addr;
+	struct irdma_dma_mem mem;
+	struct irdma_hmc_pd_entry *pd_entry;
+	struct irdma_hmc_sd_entry *sd_entry = info->sd_entry;
+	struct irdma_hmc_info *hmc_info = info->hmc_info;
+	struct irdma_chunk *chunk = info->chunk;
+	enum irdma_status_code status = 0;
+	u32 rel_pd_idx = info->idx.rel_pd_idx;
+	u32 pd_idx = info->idx.pd_idx;
+	u32 i;
+
+	if (irdma_pble_get_paged_mem(chunk, info->pages))
+		return IRDMA_ERR_NO_MEMORY;
+
+	status = irdma_add_sd_table_entry(dev->hw, hmc_info, info->idx.sd_idx,
+					  IRDMA_SD_TYPE_PAGED,
+					  IRDMA_HMC_DIRECT_BP_SIZE);
+	if (status)
+		goto error;
+
+	addr = (u8 *)(uintptr_t)chunk->vaddr;
+	for (i = 0; i < info->pages; i++) {
+		mem.pa = (u64)chunk->dmainfo.dmaaddrs[i];
+		mem.size = 4096;
+		mem.va = addr;
+		pd_entry = &sd_entry->u.pd_table.pd_entry[rel_pd_idx++];
+		if (!pd_entry->valid) {
+			status = irdma_add_pd_table_entry(dev, hmc_info,
+							  pd_idx++, &mem);
+			if (status)
+				goto error;
+
+			addr += 4096;
+		}
+	}
+
+	chunk->fpm_addr = pble_rsrc->next_fpm_addr;
+	return 0;
+
+error:
+	irdma_pble_free_paged_mem(chunk);
+
+	return status;
+}
+
+/**
+ * irdma_get_type - add a sd entry type for sd
+ * @dev: irdma_sc_dev struct
+ * @idx: index of sd
+ * @pages: pages in the sd
+ */
+static enum irdma_sd_entry_type irdma_get_type(struct irdma_sc_dev *dev,
+					       struct sd_pd_idx *idx, u32 pages)
+{
+	enum irdma_sd_entry_type sd_entry_type;
+
+	sd_entry_type = !idx->rel_pd_idx && pages == IRDMA_HMC_PD_CNT_IN_SD && dev->privileged ?
+			IRDMA_SD_TYPE_DIRECT : IRDMA_SD_TYPE_PAGED;
+
+	return sd_entry_type;
+}
+
+/**
+ * add_pble_prm - add a sd entry for pble resoure
+ * @pble_rsrc: pble resource management
+ */
+static enum irdma_status_code
+add_pble_prm(struct irdma_hmc_pble_rsrc *pble_rsrc)
+{
+	struct irdma_sc_dev *dev = pble_rsrc->dev;
+	struct irdma_hmc_sd_entry *sd_entry;
+	struct irdma_hmc_info *hmc_info;
+	struct irdma_chunk *chunk;
+	struct irdma_add_page_info info;
+	struct sd_pd_idx *idx = &info.idx;
+	enum irdma_status_code ret_code = 0;
+	enum irdma_sd_entry_type sd_entry_type;
+	u64 sd_reg_val = 0;
+	struct irdma_virt_mem chunkmem;
+	u32 pages;
+
+	if (pble_rsrc->unallocated_pble < PBLE_PER_PAGE)
+		return IRDMA_ERR_NO_MEMORY;
+
+	if (pble_rsrc->next_fpm_addr & 0xfff)
+		return IRDMA_ERR_INVALID_PAGE_DESC_INDEX;
+
+	chunkmem.size = sizeof(*chunk);
+	chunkmem.va = kzalloc(chunkmem.size, GFP_KERNEL);
+	if (!chunkmem.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	chunk = chunkmem.va;
+	chunk->chunkmem = chunkmem;
+	hmc_info = dev->hmc_info;
+	chunk->dev = dev;
+	chunk->fpm_addr = pble_rsrc->next_fpm_addr;
+	get_sd_pd_idx(pble_rsrc, idx);
+	sd_entry = &hmc_info->sd_table.sd_entry[idx->sd_idx];
+	pages = (idx->rel_pd_idx) ? (IRDMA_HMC_PD_CNT_IN_SD - idx->rel_pd_idx) :
+				    IRDMA_HMC_PD_CNT_IN_SD;
+	pages = min(pages, pble_rsrc->unallocated_pble >> PBLE_512_SHIFT);
+	info.chunk = chunk;
+	info.hmc_info = hmc_info;
+	info.pages = pages;
+	info.sd_entry = sd_entry;
+	if (!sd_entry->valid)
+		sd_entry_type = irdma_get_type(dev, idx, pages);
+	else
+		sd_entry_type = sd_entry->entry_type;
+
+	ibdev_dbg(to_ibdev(dev),
+		  "PBLE: pages = %d, unallocated_pble[%d] current_fpm_addr = %llx\n",
+		  pages, pble_rsrc->unallocated_pble,
+		  pble_rsrc->next_fpm_addr);
+	ibdev_dbg(to_ibdev(dev), "PBLE: sd_entry_type = %d\n", sd_entry_type);
+	if (sd_entry_type == IRDMA_SD_TYPE_DIRECT)
+		ret_code = add_sd_direct(pble_rsrc, &info);
+
+	if (ret_code)
+		sd_entry_type = IRDMA_SD_TYPE_PAGED;
+	else
+		pble_rsrc->stats_direct_sds++;
+
+	if (sd_entry_type == IRDMA_SD_TYPE_PAGED) {
+		ret_code = add_bp_pages(pble_rsrc, &info);
+		if (ret_code)
+			goto error;
+		else
+			pble_rsrc->stats_paged_sds++;
+	}
+
+	ret_code = irdma_prm_add_pble_mem(&pble_rsrc->pinfo, chunk);
+	if (ret_code)
+		goto error;
+
+	pble_rsrc->next_fpm_addr += chunk->size;
+	ibdev_dbg(to_ibdev(dev),
+		  "PBLE: next_fpm_addr = %llx chunk_size[%llu] = 0x%llx\n",
+		  pble_rsrc->next_fpm_addr, chunk->size, chunk->size);
+	pble_rsrc->unallocated_pble -= (u32)(chunk->size >> 3);
+	list_add(&chunk->list, &pble_rsrc->pinfo.clist);
+	sd_reg_val = (sd_entry_type == IRDMA_SD_TYPE_PAGED) ?
+			     sd_entry->u.pd_table.pd_page_addr.pa :
+			     sd_entry->u.bp.addr.pa;
+	if (sd_entry->valid)
+		return 0;
+
+	if (dev->privileged) {
+		ret_code = irdma_hmc_sd_one(dev, hmc_info->hmc_fn_id,
+					    sd_reg_val, idx->sd_idx,
+					    sd_entry->entry_type, true);
+		if (ret_code)
+			goto error;
+	}
+
+	sd_entry->valid = true;
+	return 0;
+
+error:
+	if (chunk->bitmapbuf)
+		kfree(chunk->bitmapmem.va);
+	kfree(chunk->chunkmem.va);
+
+	return ret_code;
+}
+
+/**
+ * free_lvl2 - fee level 2 pble
+ * @pble_rsrc: pble resource management
+ * @palloc: level 2 pble allocation
+ */
+static void free_lvl2(struct irdma_hmc_pble_rsrc *pble_rsrc,
+		      struct irdma_pble_alloc *palloc)
+{
+	u32 i;
+	struct irdma_pble_level2 *lvl2 = &palloc->level2;
+	struct irdma_pble_info *root = &lvl2->root;
+	struct irdma_pble_info *leaf = lvl2->leaf;
+
+	for (i = 0; i < lvl2->leaf_cnt; i++, leaf++) {
+		if (leaf->addr)
+			irdma_prm_return_pbles(&pble_rsrc->pinfo,
+					       &leaf->chunkinfo);
+		else
+			break;
+	}
+
+	if (root->addr)
+		irdma_prm_return_pbles(&pble_rsrc->pinfo, &root->chunkinfo);
+
+	kfree(lvl2->leafmem.va);
+	lvl2->leaf = NULL;
+}
+
+/**
+ * get_lvl2_pble - get level 2 pble resource
+ * @pble_rsrc: pble resource management
+ * @palloc: level 2 pble allocation
+ */
+static enum irdma_status_code
+get_lvl2_pble(struct irdma_hmc_pble_rsrc *pble_rsrc,
+	      struct irdma_pble_alloc *palloc)
+{
+	u32 lf4k, lflast, total, i;
+	u32 pblcnt = PBLE_PER_PAGE;
+	u64 *addr;
+	struct irdma_pble_level2 *lvl2 = &palloc->level2;
+	struct irdma_pble_info *root = &lvl2->root;
+	struct irdma_pble_info *leaf;
+	enum irdma_status_code ret_code;
+	u64 fpm_addr;
+
+	/* number of full 512 (4K) leafs) */
+	lf4k = palloc->total_cnt >> 9;
+	lflast = palloc->total_cnt % PBLE_PER_PAGE;
+	total = (lflast == 0) ? lf4k : lf4k + 1;
+	lvl2->leaf_cnt = total;
+
+	lvl2->leafmem.size = (sizeof(*leaf) * total);
+	lvl2->leafmem.va = kzalloc(lvl2->leafmem.size, GFP_KERNEL);
+	if (!lvl2->leafmem.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	lvl2->leaf = lvl2->leafmem.va;
+	leaf = lvl2->leaf;
+	ret_code = irdma_prm_get_pbles(&pble_rsrc->pinfo, &root->chunkinfo,
+				       total << 3, &root->addr, &fpm_addr);
+	if (ret_code) {
+		kfree(lvl2->leafmem.va);
+		lvl2->leaf = NULL;
+		return IRDMA_ERR_NO_MEMORY;
+	}
+
+	root->idx = fpm_to_idx(pble_rsrc, fpm_addr);
+	root->cnt = total;
+	addr = (u64 *)(uintptr_t)root->addr;
+	for (i = 0; i < total; i++, leaf++) {
+		pblcnt = (lflast && ((i + 1) == total)) ?
+				lflast : PBLE_PER_PAGE;
+		ret_code = irdma_prm_get_pbles(&pble_rsrc->pinfo,
+					       &leaf->chunkinfo, pblcnt << 3,
+					       &leaf->addr, &fpm_addr);
+		if (ret_code)
+			goto error;
+
+		leaf->idx = fpm_to_idx(pble_rsrc, fpm_addr);
+
+		leaf->cnt = pblcnt;
+		*addr = (u64)leaf->idx;
+		addr++;
+	}
+
+	palloc->level = PBLE_LEVEL_2;
+	pble_rsrc->stats_lvl2++;
+	return 0;
+
+error:
+	free_lvl2(pble_rsrc, palloc);
+
+	return IRDMA_ERR_NO_MEMORY;
+}
+
+/**
+ * get_lvl1_pble - get level 1 pble resource
+ * @pble_rsrc: pble resource management
+ * @palloc: level 1 pble allocation
+ */
+static enum irdma_status_code
+get_lvl1_pble(struct irdma_hmc_pble_rsrc *pble_rsrc,
+	      struct irdma_pble_alloc *palloc)
+{
+	enum irdma_status_code ret_code;
+	u64 fpm_addr, vaddr;
+	struct irdma_pble_info *lvl1 = &palloc->level1;
+
+	ret_code = irdma_prm_get_pbles(&pble_rsrc->pinfo, &lvl1->chunkinfo,
+				       palloc->total_cnt << 3, &vaddr,
+				       &fpm_addr);
+	if (ret_code)
+		return IRDMA_ERR_NO_MEMORY;
+
+	lvl1->addr = vaddr;
+	palloc->level = PBLE_LEVEL_1;
+	lvl1->idx = fpm_to_idx(pble_rsrc, fpm_addr);
+	lvl1->cnt = palloc->total_cnt;
+	pble_rsrc->stats_lvl1++;
+
+	return 0;
+}
+
+/**
+ * get_lvl1_lvl2_pble - calls get_lvl1 and get_lvl2 pble routine
+ * @pble_rsrc: pble resources
+ * @palloc: contains all inforamtion regarding pble (idx + pble addr)
+ * @level1_only: flag for a level 1 PBLE
+ */
+static enum irdma_status_code
+get_lvl1_lvl2_pble(struct irdma_hmc_pble_rsrc *pble_rsrc,
+		   struct irdma_pble_alloc *palloc, bool level1_only)
+{
+	enum irdma_status_code status = 0;
+
+	status = get_lvl1_pble(pble_rsrc, palloc);
+	if (!status || level1_only || palloc->total_cnt <= PBLE_PER_PAGE)
+		return status;
+
+	status = get_lvl2_pble(pble_rsrc, palloc);
+
+	return status;
+}
+
+/**
+ * irdma_get_pble - allocate pbles from the prm
+ * @pble_rsrc: pble resources
+ * @palloc: contains all inforamtion regarding pble (idx + pble addr)
+ * @pble_cnt: #of pbles requested
+ * @level1_only: true if only pble level 1 to acquire
+ */
+enum irdma_status_code irdma_get_pble(struct irdma_hmc_pble_rsrc *pble_rsrc,
+				      struct irdma_pble_alloc *palloc,
+				      u32 pble_cnt, bool level1_only)
+{
+	enum irdma_status_code status = 0;
+	int max_sds = 0;
+	int i;
+
+	palloc->total_cnt = pble_cnt;
+	palloc->level = PBLE_LEVEL_0;
+
+	mutex_lock(&pble_rsrc->pble_mutex_lock);
+
+	/*check first to see if we can get pble's without acquiring
+	 * additional sd's
+	 */
+	status = get_lvl1_lvl2_pble(pble_rsrc, palloc, level1_only);
+	if (!status)
+		goto exit;
+
+	max_sds = (palloc->total_cnt >> 18) + 1;
+	for (i = 0; i < max_sds; i++) {
+		status = add_pble_prm(pble_rsrc);
+		if (status)
+			break;
+
+		status = get_lvl1_lvl2_pble(pble_rsrc, palloc, level1_only);
+		/* if level1_only, only go through it once */
+		if (!status || level1_only)
+			break;
+	}
+
+exit:
+	if (!status) {
+		pble_rsrc->allocdpbles += pble_cnt;
+		pble_rsrc->stats_alloc_ok++;
+	} else {
+		pble_rsrc->stats_alloc_fail++;
+	}
+	mutex_unlock(&pble_rsrc->pble_mutex_lock);
+
+	return status;
+}
+
+/**
+ * irdma_free_pble - put pbles back into prm
+ * @pble_rsrc: pble resources
+ * @palloc: contains all information regarding pble resource being freed
+ */
+void irdma_free_pble(struct irdma_hmc_pble_rsrc *pble_rsrc,
+		     struct irdma_pble_alloc *palloc)
+{
+	pble_rsrc->freedpbles += palloc->total_cnt;
+
+	if (palloc->level == PBLE_LEVEL_2)
+		free_lvl2(pble_rsrc, palloc);
+	else
+		irdma_prm_return_pbles(&pble_rsrc->pinfo,
+				       &palloc->level1.chunkinfo);
+	pble_rsrc->stats_alloc_freed++;
+}
diff --git a/drivers/infiniband/hw/irdma/pble.h b/drivers/infiniband/hw/irdma/pble.h
new file mode 100644
index 0000000..e4da6f5
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/pble.h
@@ -0,0 +1,136 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2019 Intel Corporation */
+#ifndef IRDMA_PBLE_H
+#define IRDMA_PBLE_H
+
+#define PBLE_SHIFT		6
+#define PBLE_PER_PAGE		512
+#define HMC_PAGED_BP_SHIFT	12
+#define PBLE_512_SHIFT		9
+#define PBLE_INVALID_IDX	0xffffffff
+
+enum irdma_pble_level {
+	PBLE_LEVEL_0 = 0,
+	PBLE_LEVEL_1 = 1,
+	PBLE_LEVEL_2 = 2,
+};
+
+enum irdma_alloc_type {
+	PBLE_NO_ALLOC	  = 0,
+	PBLE_SD_CONTIGOUS = 1,
+	PBLE_SD_PAGED	  = 2,
+};
+
+struct irdma_chunk;
+
+struct irdma_pble_chunkinfo {
+	struct irdma_chunk *pchunk;
+	u64 bit_idx;
+	u64 bits_used;
+};
+
+struct irdma_pble_info {
+	u64 addr;
+	u32 idx;
+	u32 cnt;
+	struct irdma_pble_chunkinfo chunkinfo;
+};
+
+struct irdma_pble_level2 {
+	struct irdma_pble_info root;
+	struct irdma_pble_info *leaf;
+	struct irdma_virt_mem leafmem;
+	u32 leaf_cnt;
+};
+
+struct irdma_pble_alloc {
+	u32 total_cnt;
+	enum irdma_pble_level level;
+	union {
+		struct irdma_pble_info level1;
+		struct irdma_pble_level2 level2;
+	};
+};
+
+struct sd_pd_idx {
+	u32 sd_idx;
+	u32 pd_idx;
+	u32 rel_pd_idx;
+};
+
+struct irdma_add_page_info {
+	struct irdma_chunk *chunk;
+	struct irdma_hmc_sd_entry *sd_entry;
+	struct irdma_hmc_info *hmc_info;
+	struct sd_pd_idx idx;
+	u32 pages;
+};
+
+struct irdma_chunk {
+	struct list_head list;
+	struct irdma_dma_info dmainfo;
+	void *bitmapbuf;
+
+	u32 sizeofbitmap;
+	u64 size;
+	u64 vaddr;
+	u64 fpm_addr;
+	u32 pg_cnt;
+	enum irdma_alloc_type type;
+	struct irdma_sc_dev *dev;
+	struct irdma_virt_mem bitmapmem;
+	struct irdma_virt_mem chunkmem;
+};
+
+struct irdma_pble_prm {
+	struct list_head clist;
+	spinlock_t prm_lock; /* protect prm bitmap */
+	u64 total_pble_alloc;
+	u64 free_pble_cnt;
+	u8 pble_shift;
+};
+
+struct irdma_hmc_pble_rsrc {
+	u32 unallocated_pble;
+	struct mutex pble_mutex_lock; /* protect PBLE resource */
+	struct irdma_sc_dev *dev;
+	u64 fpm_base_addr;
+	u64 next_fpm_addr;
+	struct irdma_pble_prm pinfo;
+	u64 allocdpbles;
+	u64 freedpbles;
+	u32 stats_direct_sds;
+	u32 stats_paged_sds;
+	u64 stats_alloc_ok;
+	u64 stats_alloc_fail;
+	u64 stats_alloc_freed;
+	u64 stats_lvl1;
+	u64 stats_lvl2;
+};
+
+void irdma_destroy_pble_prm(struct irdma_hmc_pble_rsrc *pble_rsrc);
+enum irdma_status_code
+irdma_hmc_init_pble(struct irdma_sc_dev *dev,
+		    struct irdma_hmc_pble_rsrc *pble_rsrc);
+void irdma_free_pble(struct irdma_hmc_pble_rsrc *pble_rsrc,
+		     struct irdma_pble_alloc *palloc);
+enum irdma_status_code irdma_get_pble(struct irdma_hmc_pble_rsrc *pble_rsrc,
+				      struct irdma_pble_alloc *palloc,
+				      u32 pble_cnt, bool level1_only);
+enum irdma_status_code irdma_prm_add_pble_mem(struct irdma_pble_prm *pprm,
+					      struct irdma_chunk *pchunk);
+enum irdma_status_code
+irdma_prm_get_pbles(struct irdma_pble_prm *pprm,
+		    struct irdma_pble_chunkinfo *chunkinfo, u32 mem_size,
+		    u64 *vaddr, u64 *fpm_addr);
+void irdma_prm_return_pbles(struct irdma_pble_prm *pprm,
+			    struct irdma_pble_chunkinfo *chunkinfo);
+void irdma_pble_acquire_lock(struct irdma_hmc_pble_rsrc *pble_rsrc,
+			     unsigned long *flags);
+void irdma_pble_release_lock(struct irdma_hmc_pble_rsrc *pble_rsrc,
+			     unsigned long *flags);
+void irdma_pble_free_paged_mem(struct irdma_chunk *chunk);
+enum irdma_status_code irdma_pble_get_paged_mem(struct irdma_chunk *chunk,
+						u32 pg_cnt);
+void irdma_prm_rem_bitmapmem(struct irdma_hw *hw, struct irdma_chunk *chunk);
+#endif /* IRDMA_PBLE_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 16/23] RDMA/irdma: Implement device supported verb APIs
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (14 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 15/23] RDMA/irdma: Add PBLE resource manager Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 17/23] RDMA/irdma: Add RoCEv2 UD OP support Shiraz Saleem
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Implement device supported verb APIs. The supported APIs
vary based on the underlying transport the ibdev is
registered as (i.e. iWARP or RoCEv2).

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/verbs.c     | 4544 +++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/verbs.h     |  225 ++
 include/uapi/rdma/ib_user_ioctl_verbs.h |    1 +
 3 files changed, 4770 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/verbs.c
 create mode 100644 drivers/infiniband/hw/irdma/verbs.h

diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
new file mode 100644
index 0000000..60ddfd5
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -0,0 +1,4544 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "main.h"
+
+/**
+ * irdma_query_device - get device attributes
+ * @ibdev: device pointer from stack
+ * @props: returning device attributes
+ * @udata: user data
+ */
+static int irdma_query_device(struct ib_device *ibdev,
+			      struct ib_device_attr *props,
+			      struct ib_udata *udata)
+{
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct pci_dev *pcidev = iwdev->rf->pcidev;
+	struct irdma_hw_attrs *hw_attrs = &rf->sc_dev.hw_attrs;
+
+	if (udata->inlen || udata->outlen)
+		return -EINVAL;
+
+	memset(props, 0, sizeof(*props));
+	ether_addr_copy((u8 *)&props->sys_image_guid, iwdev->netdev->dev_addr);
+	props->fw_ver = (u64)irdma_fw_major_ver(&rf->sc_dev) << 32 |
+			irdma_fw_minor_ver(&rf->sc_dev);
+	props->device_cap_flags = iwdev->device_cap_flags;
+	props->vendor_id = pcidev->vendor;
+	props->vendor_part_id = pcidev->device;
+
+	props->hw_ver = (u32)rf->sc_dev.pci_rev;
+	props->page_size_cap = SZ_4K | SZ_2M | SZ_1G;
+	props->max_mr_size = hw_attrs->max_mr_size;
+	props->max_qp = rf->max_qp - rf->used_qps;
+	props->max_qp_wr = hw_attrs->max_qp_wr;
+	props->max_send_sge = hw_attrs->uk_attrs.max_hw_wq_frags;
+	props->max_recv_sge = hw_attrs->uk_attrs.max_hw_wq_frags;
+	props->max_cq = rf->max_cq - rf->used_cqs;
+	props->max_cqe = rf->max_cqe;
+	props->max_mr = rf->max_mr - rf->used_mrs;
+	props->max_mw = props->max_mr;
+	props->max_pd = rf->max_pd - rf->used_pds;
+	props->max_sge_rd = hw_attrs->uk_attrs.max_hw_read_sges;
+	props->max_qp_rd_atom = hw_attrs->max_hw_ird;
+	props->max_qp_init_rd_atom = hw_attrs->max_hw_ord;
+	props->masked_atomic_cap = IB_ATOMIC_NONE;
+	if (rdma_protocol_roce(ibdev, 1))
+		props->max_pkeys = IRDMA_PKEY_TBL_SZ;
+	props->max_ah = rf->max_ah;
+	props->max_mcast_grp = rf->max_mcg;
+	props->max_mcast_qp_attach = IRDMA_MAX_MGS_PER_CTX;
+	props->max_total_mcast_qp_attach = rf->max_qp * IRDMA_MAX_MGS_PER_CTX;
+	props->max_fast_reg_page_list_len = IRDMA_MAX_PAGES_PER_FMR;
+	if (hw_attrs->uk_attrs.feature_flags & IRDMA_FEATURE_ATOMIC_OPS)
+		props->atomic_cap = IB_ATOMIC_HCA;
+#define HCA_CLOCK_TIMESTAMP_MASK 0x1ffff
+	if (hw_attrs->uk_attrs.hw_rev >= IRDMA_GEN_2)
+		props->timestamp_mask = HCA_CLOCK_TIMESTAMP_MASK;
+
+	return 0;
+}
+
+/**
+ * irdma_get_eth_speed_and_width - Get IB port speed and width from netdev speed
+ * @link_speed: netdev phy link speed
+ * @active_speed: IB port speed
+ * @active_width: IB port width
+ */
+static void irdma_get_eth_speed_and_width(u32 link_speed, u16 *active_speed,
+					  u8 *active_width)
+{
+	if (link_speed <= SPEED_1000) {
+		*active_width = IB_WIDTH_1X;
+		*active_speed = IB_SPEED_SDR;
+	} else if (link_speed <= SPEED_10000) {
+		*active_width = IB_WIDTH_1X;
+		*active_speed = IB_SPEED_FDR10;
+	} else if (link_speed <= SPEED_20000) {
+		*active_width = IB_WIDTH_4X;
+		*active_speed = IB_SPEED_DDR;
+	} else if (link_speed <= SPEED_25000) {
+		*active_width = IB_WIDTH_1X;
+		*active_speed = IB_SPEED_EDR;
+	} else if (link_speed <= SPEED_40000) {
+		*active_width = IB_WIDTH_4X;
+		*active_speed = IB_SPEED_FDR10;
+	} else {
+		*active_width = IB_WIDTH_4X;
+		*active_speed = IB_SPEED_EDR;
+	}
+}
+
+/**
+ * irdma_query_port - get port attributes
+ * @ibdev: device pointer from stack
+ * @port: port number for query
+ * @props: returning device attributes
+ */
+static int irdma_query_port(struct ib_device *ibdev, u32 port,
+			    struct ib_port_attr *props)
+{
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+	struct net_device *netdev = iwdev->netdev;
+
+	/* no need to zero out pros here. done by caller */
+
+	props->max_mtu = IB_MTU_4096;
+	props->active_mtu = ib_mtu_int_to_enum(netdev->mtu);
+	props->lid = 1;
+	props->lmc = 0;
+	props->sm_lid = 0;
+	props->sm_sl = 0;
+	if (netif_carrier_ok(netdev) && netif_running(netdev)) {
+		props->state = IB_PORT_ACTIVE;
+		props->phys_state = IB_PORT_PHYS_STATE_LINK_UP;
+	} else {
+		props->state = IB_PORT_DOWN;
+		props->phys_state = IB_PORT_PHYS_STATE_DISABLED;
+	}
+	irdma_get_eth_speed_and_width(SPEED_100000, &props->active_speed,
+				      &props->active_width);
+
+	if (rdma_protocol_roce(ibdev, 1)) {
+		props->gid_tbl_len = 32;
+		props->ip_gids = true;
+		props->pkey_tbl_len = IRDMA_PKEY_TBL_SZ;
+	} else {
+		props->gid_tbl_len = 1;
+	}
+	props->qkey_viol_cntr = 0;
+	props->port_cap_flags |= IB_PORT_CM_SUP | IB_PORT_REINIT_SUP;
+	props->max_msg_sz = iwdev->rf->sc_dev.hw_attrs.max_hw_outbound_msg_size;
+
+	return 0;
+}
+
+/**
+ * irdma_disassociate_ucontext - Disassociate user context
+ * @context: ib user context
+ */
+static void irdma_disassociate_ucontext(struct ib_ucontext *context)
+{
+}
+
+static int irdma_mmap_legacy(struct irdma_ucontext *ucontext,
+			     struct vm_area_struct *vma)
+{
+	u64 pfn;
+
+	if (vma->vm_pgoff || vma->vm_end - vma->vm_start != PAGE_SIZE)
+		return -EINVAL;
+
+	vma->vm_private_data = ucontext;
+	pfn = ((uintptr_t)ucontext->iwdev->rf->sc_dev.hw_regs[IRDMA_DB_ADDR_OFFSET] +
+	       pci_resource_start(ucontext->iwdev->rf->pcidev, 0)) >> PAGE_SHIFT;
+
+	return rdma_user_mmap_io(&ucontext->ibucontext, vma, pfn, PAGE_SIZE,
+				 pgprot_noncached(vma->vm_page_prot), NULL);
+}
+
+static void irdma_mmap_free(struct rdma_user_mmap_entry *rdma_entry)
+{
+	struct irdma_user_mmap_entry *entry = to_irdma_mmap_entry(rdma_entry);
+
+	kfree(entry);
+}
+
+static struct rdma_user_mmap_entry*
+irdma_user_mmap_entry_insert(struct irdma_ucontext *ucontext, u64 bar_offset,
+			     enum irdma_mmap_flag mmap_flag, u64 *mmap_offset)
+{
+	struct irdma_user_mmap_entry *entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	int ret;
+
+	if (!entry)
+		return NULL;
+
+	entry->bar_offset = bar_offset;
+	entry->mmap_flag = mmap_flag;
+
+	ret = rdma_user_mmap_entry_insert(&ucontext->ibucontext,
+					  &entry->rdma_entry, PAGE_SIZE);
+	if (ret) {
+		kfree(entry);
+		return NULL;
+	}
+	*mmap_offset = rdma_user_mmap_get_offset(&entry->rdma_entry);
+
+	return &entry->rdma_entry;
+}
+/**
+ * irdma_mmap - user memory map
+ * @context: context created during alloc
+ * @vma: kernel info for user memory map
+ */
+static int irdma_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
+{
+	struct rdma_user_mmap_entry *rdma_entry;
+	struct irdma_user_mmap_entry *entry;
+	struct irdma_ucontext *ucontext;
+	u64 pfn;
+	int ret;
+
+	ucontext = to_ucontext(context);
+
+	/* Legacy support for libi40iw with hard-coded mmap key */
+	if (ucontext->legacy_mode)
+		return irdma_mmap_legacy(ucontext, vma);
+
+	rdma_entry = rdma_user_mmap_entry_get(&ucontext->ibucontext, vma);
+	if (!rdma_entry) {
+		ibdev_dbg(&ucontext->iwdev->ibdev,
+			  "VERBS: pgoff[0x%lx] does not have valid entry\n",
+			  vma->vm_pgoff);
+		return -EINVAL;
+	}
+
+	entry = to_irdma_mmap_entry(rdma_entry);
+	ibdev_dbg(&ucontext->iwdev->ibdev,
+		  "VERBS: bar_offset [0x%llx] mmap_flag [%d]\n",
+		  entry->bar_offset, entry->mmap_flag);
+
+	pfn = (entry->bar_offset +
+	      pci_resource_start(ucontext->iwdev->rf->pcidev, 0)) >> PAGE_SHIFT;
+
+	switch (entry->mmap_flag) {
+	case IRDMA_MMAP_IO_NC:
+		ret = rdma_user_mmap_io(context, vma, pfn, PAGE_SIZE,
+					pgprot_noncached(vma->vm_page_prot),
+					rdma_entry);
+		break;
+	case IRDMA_MMAP_IO_WC:
+		ret = rdma_user_mmap_io(context, vma, pfn, PAGE_SIZE,
+					pgprot_writecombine(vma->vm_page_prot),
+					rdma_entry);
+		break;
+	default:
+		ret = -EINVAL;
+	}
+
+	if (ret)
+		ibdev_dbg(&ucontext->iwdev->ibdev,
+			  "VERBS: bar_offset [0x%llx] mmap_flag[%d] err[%d]\n",
+			  entry->bar_offset, entry->mmap_flag, ret);
+	rdma_user_mmap_entry_put(rdma_entry);
+
+	return ret;
+}
+
+/**
+ * irdma_alloc_push_page - allocate a push page for qp
+ * @iwqp: qp pointer
+ */
+static void irdma_alloc_push_page(struct irdma_qp *iwqp)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_sc_qp *qp = &iwqp->sc_qp;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true);
+	if (!cqp_request)
+		return;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = IRDMA_OP_MANAGE_PUSH_PAGE;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.manage_push_page.info.push_idx = 0;
+	cqp_info->in.u.manage_push_page.info.qs_handle =
+		qp->vsi->qos[qp->user_pri].qs_handle;
+	cqp_info->in.u.manage_push_page.info.free_page = 0;
+	cqp_info->in.u.manage_push_page.info.push_page_type = 0;
+	cqp_info->in.u.manage_push_page.cqp = &iwdev->rf->cqp.sc_cqp;
+	cqp_info->in.u.manage_push_page.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	if (!status && cqp_request->compl_info.op_ret_val <
+	    iwdev->rf->sc_dev.hw_attrs.max_hw_device_pages) {
+		qp->push_idx = cqp_request->compl_info.op_ret_val;
+		qp->push_offset = 0;
+	}
+
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+}
+
+/**
+ * irdma_alloc_ucontext - Allocate the user context data structure
+ * @uctx: uverbs context pointer
+ * @udata: user data
+ *
+ * This keeps track of all objects associated with a particular
+ * user-mode client.
+ */
+static int irdma_alloc_ucontext(struct ib_ucontext *uctx,
+				struct ib_udata *udata)
+{
+	struct ib_device *ibdev = uctx->device;
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+	struct irdma_alloc_ucontext_req req;
+	struct irdma_alloc_ucontext_resp uresp = {};
+	struct irdma_ucontext *ucontext = to_ucontext(uctx);
+	struct irdma_uk_attrs *uk_attrs;
+
+	if (ib_copy_from_udata(&req, udata, min(sizeof(req), udata->inlen)))
+		return -EINVAL;
+
+	if (req.userspace_ver < 4 || req.userspace_ver > IRDMA_ABI_VER)
+		goto ver_error;
+
+	ucontext->iwdev = iwdev;
+	ucontext->abi_ver = req.userspace_ver;
+
+	uk_attrs = &iwdev->rf->sc_dev.hw_attrs.uk_attrs;
+	/* GEN_1 legacy support with libi40iw */
+	if (udata->outlen < sizeof(uresp)) {
+		if (uk_attrs->hw_rev != IRDMA_GEN_1)
+			return -EOPNOTSUPP;
+
+		ucontext->legacy_mode = true;
+		uresp.max_qps = iwdev->rf->max_qp;
+		uresp.max_pds = iwdev->rf->sc_dev.hw_attrs.max_hw_pds;
+		uresp.wq_size = iwdev->rf->sc_dev.hw_attrs.max_qp_wr * 2;
+		uresp.kernel_ver = req.userspace_ver;
+		if (ib_copy_to_udata(udata, &uresp,
+				     min(sizeof(uresp), udata->outlen)))
+			return -EFAULT;
+	} else {
+		u64 bar_off = (uintptr_t)iwdev->rf->sc_dev.hw_regs[IRDMA_DB_ADDR_OFFSET];
+		ucontext->db_mmap_entry =
+			irdma_user_mmap_entry_insert(ucontext, bar_off,
+						     IRDMA_MMAP_IO_NC,
+						     &uresp.db_mmap_key);
+		if (!ucontext->db_mmap_entry)
+			return -ENOMEM;
+
+		uresp.kernel_ver = IRDMA_ABI_VER;
+		uresp.feature_flags = uk_attrs->feature_flags;
+		uresp.max_hw_wq_frags = uk_attrs->max_hw_wq_frags;
+		uresp.max_hw_read_sges = uk_attrs->max_hw_read_sges;
+		uresp.max_hw_inline = uk_attrs->max_hw_inline;
+		uresp.max_hw_rq_quanta = uk_attrs->max_hw_rq_quanta;
+		uresp.max_hw_wq_quanta = uk_attrs->max_hw_wq_quanta;
+		uresp.max_hw_sq_chunk = uk_attrs->max_hw_sq_chunk;
+		uresp.max_hw_cq_size = uk_attrs->max_hw_cq_size;
+		uresp.min_hw_cq_size = uk_attrs->min_hw_cq_size;
+		uresp.hw_rev = uk_attrs->hw_rev;
+		if (ib_copy_to_udata(udata, &uresp,
+				     min(sizeof(uresp), udata->outlen))) {
+			rdma_user_mmap_entry_remove(ucontext->db_mmap_entry);
+			return -EFAULT;
+		}
+	}
+
+	INIT_LIST_HEAD(&ucontext->cq_reg_mem_list);
+	spin_lock_init(&ucontext->cq_reg_mem_list_lock);
+	INIT_LIST_HEAD(&ucontext->qp_reg_mem_list);
+	spin_lock_init(&ucontext->qp_reg_mem_list_lock);
+
+	return 0;
+
+ver_error:
+	ibdev_err(&iwdev->ibdev,
+		  "Invalid userspace driver version detected. Detected version %d, should be %d\n",
+		  req.userspace_ver, IRDMA_ABI_VER);
+	return -EINVAL;
+}
+
+/**
+ * irdma_dealloc_ucontext - deallocate the user context data structure
+ * @context: user context created during alloc
+ */
+static void irdma_dealloc_ucontext(struct ib_ucontext *context)
+{
+	struct irdma_ucontext *ucontext = to_ucontext(context);
+
+	rdma_user_mmap_entry_remove(ucontext->db_mmap_entry);
+}
+
+/**
+ * irdma_alloc_pd - allocate protection domain
+ * @pd: PD pointer
+ * @udata: user data
+ */
+static int irdma_alloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+{
+	struct irdma_pd *iwpd = to_iwpd(pd);
+	struct irdma_device *iwdev = to_iwdev(pd->device);
+	struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct irdma_alloc_pd_resp uresp = {};
+	struct irdma_sc_pd *sc_pd;
+	u32 pd_id = 0;
+	int err;
+
+	err = irdma_alloc_rsrc(rf, rf->allocated_pds, rf->max_pd, &pd_id,
+			       &rf->next_pd);
+	if (err)
+		return err;
+
+	sc_pd = &iwpd->sc_pd;
+	if (udata) {
+		struct irdma_ucontext *ucontext =
+			rdma_udata_to_drv_context(udata, struct irdma_ucontext,
+						  ibucontext);
+		dev->iw_pd_ops->pd_init(dev, sc_pd, pd_id, ucontext->abi_ver);
+		uresp.pd_id = pd_id;
+		if (ib_copy_to_udata(udata, &uresp,
+				     min(sizeof(uresp), udata->outlen))) {
+			err = -EFAULT;
+			goto error;
+		}
+	} else {
+		dev->iw_pd_ops->pd_init(dev, sc_pd, pd_id, IRDMA_ABI_VER);
+	}
+
+	return 0;
+error:
+	irdma_free_rsrc(rf, rf->allocated_pds, pd_id);
+
+	return err;
+}
+
+/**
+ * irdma_dealloc_pd - deallocate pd
+ * @ibpd: ptr of pd to be deallocated
+ * @udata: user data
+ */
+static int irdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+{
+	struct irdma_pd *iwpd = to_iwpd(ibpd);
+	struct irdma_device *iwdev = to_iwdev(ibpd->device);
+
+	irdma_free_rsrc(iwdev->rf, iwdev->rf->allocated_pds, iwpd->sc_pd.pd_id);
+
+	return 0;
+}
+
+/**
+ * irdma_get_pbl - Retrieve pbl from a list given a virtual
+ * address
+ * @va: user virtual address
+ * @pbl_list: pbl list to search in (QP's or CQ's)
+ */
+static struct irdma_pbl *irdma_get_pbl(unsigned long va,
+				       struct list_head *pbl_list)
+{
+	struct irdma_pbl *iwpbl;
+
+	list_for_each_entry (iwpbl, pbl_list, list) {
+		if (iwpbl->user_base == va) {
+			list_del(&iwpbl->list);
+			iwpbl->on_list = false;
+			return iwpbl;
+		}
+	}
+
+	return NULL;
+}
+
+/**
+ * irdma_clean_cqes - clean cq entries for qp
+ * @iwqp: qp ptr (user or kernel)
+ * @iwcq: cq ptr
+ */
+static void irdma_clean_cqes(struct irdma_qp *iwqp, struct irdma_cq *iwcq)
+{
+	struct irdma_cq_uk *ukcq = &iwcq->sc_cq.cq_uk;
+	unsigned long flags;
+
+	spin_lock_irqsave(&iwcq->lock, flags);
+	ukcq->ops.iw_cq_clean(&iwqp->sc_qp.qp_uk, ukcq);
+	spin_unlock_irqrestore(&iwcq->lock, flags);
+}
+
+static void irdma_remove_push_mmap_entries(struct irdma_qp *iwqp)
+{
+	if (iwqp->push_db_mmap_entry) {
+		rdma_user_mmap_entry_remove(iwqp->push_db_mmap_entry);
+		iwqp->push_db_mmap_entry = NULL;
+	}
+	if (iwqp->push_wqe_mmap_entry) {
+		rdma_user_mmap_entry_remove(iwqp->push_wqe_mmap_entry);
+		iwqp->push_wqe_mmap_entry = NULL;
+	}
+}
+
+static int irdma_setup_push_mmap_entries(struct irdma_ucontext *ucontext,
+					 struct irdma_qp *iwqp,
+					 u64 *push_wqe_mmap_key,
+					 u64 *push_db_mmap_key)
+{
+	struct irdma_device *iwdev = ucontext->iwdev;
+	u64 rsvd, bar_off;
+
+	rsvd = iwdev->rf->priv_cdev_info.ftype ? IRDMA_VF_BAR_RSVD : IRDMA_PF_BAR_RSVD;
+	bar_off = (uintptr_t)iwdev->rf->sc_dev.hw_regs[IRDMA_DB_ADDR_OFFSET];
+	/* skip over db page */
+	bar_off += IRDMA_HW_PAGE_SIZE;
+	/* push wqe page */
+	bar_off += rsvd + iwqp->sc_qp.push_idx * IRDMA_HW_PAGE_SIZE;
+	iwqp->push_wqe_mmap_entry = irdma_user_mmap_entry_insert(ucontext,
+					bar_off, IRDMA_MMAP_IO_WC,
+					push_wqe_mmap_key);
+	if (!iwqp->push_wqe_mmap_entry)
+		return -ENOMEM;
+
+	/* push doorbell page */
+	bar_off += IRDMA_HW_PAGE_SIZE;
+	iwqp->push_db_mmap_entry = irdma_user_mmap_entry_insert(ucontext,
+					bar_off, IRDMA_MMAP_IO_NC,
+					push_db_mmap_key);
+	if (!iwqp->push_db_mmap_entry) {
+		rdma_user_mmap_entry_remove(iwqp->push_wqe_mmap_entry);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_destroy_qp - destroy qp
+ * @ibqp: qp's ib pointer also to get to device's qp address
+ * @udata: user data
+ */
+static int irdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
+{
+	struct irdma_qp *iwqp = to_iwqp(ibqp);
+	struct irdma_device *iwdev = iwqp->iwdev;
+
+	iwqp->sc_qp.qp_uk.destroy_pending = true;
+	if (iwqp->iwarp_state == IRDMA_QP_STATE_RTS)
+		irdma_modify_qp_to_err(&iwqp->sc_qp);
+
+	irdma_qp_rem_ref(&iwqp->ibqp);
+	wait_for_completion(&iwqp->free_qp);
+	irdma_free_lsmm_rsrc(iwqp);
+	if (!iwdev->reset)
+		irdma_cqp_qp_destroy_cmd(&iwdev->rf->sc_dev, &iwqp->sc_qp);
+	if (!iwqp->user_mode) {
+		if (iwqp->iwscq) {
+			irdma_clean_cqes(iwqp, iwqp->iwscq);
+			if (iwqp->iwrcq != iwqp->iwscq)
+				irdma_clean_cqes(iwqp, iwqp->iwrcq);
+		}
+	}
+	irdma_remove_push_mmap_entries(iwqp);
+	irdma_free_qp_rsrc(iwqp);
+
+	return 0;
+}
+
+/**
+ * irdma_setup_virt_qp - setup for allocation of virtual qp
+ * @iwdev: irdma device
+ * @iwqp: qp ptr
+ * @init_info: initialize info to return
+ */
+static int irdma_setup_virt_qp(struct irdma_device *iwdev,
+			       struct irdma_qp *iwqp,
+			       struct irdma_qp_init_info *init_info)
+{
+	struct irdma_pbl *iwpbl = iwqp->iwpbl;
+	struct irdma_qp_mr *qpmr = &iwpbl->qp_mr;
+
+	iwqp->page = qpmr->sq_page;
+	init_info->shadow_area_pa = qpmr->shadow;
+	if (iwpbl->pbl_allocated) {
+		init_info->virtual_map = true;
+		init_info->sq_pa = qpmr->sq_pbl.idx;
+		init_info->rq_pa = qpmr->rq_pbl.idx;
+	} else {
+		init_info->sq_pa = qpmr->sq_pbl.addr;
+		init_info->rq_pa = qpmr->rq_pbl.addr;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_setup_kmode_qp - setup initialization for kernel mode qp
+ * @iwdev: iwarp device
+ * @iwqp: qp ptr (user or kernel)
+ * @info: initialize info to return
+ * @init_attr: Initial QP create attributes
+ */
+static int irdma_setup_kmode_qp(struct irdma_device *iwdev,
+				struct irdma_qp *iwqp,
+				struct irdma_qp_init_info *info,
+				struct ib_qp_init_attr *init_attr)
+{
+	struct irdma_dma_mem *mem = &iwqp->kqp.dma_mem;
+	u32 sqdepth, rqdepth;
+	u8 sqshift, rqshift;
+	u32 size;
+	enum irdma_status_code status;
+	struct irdma_qp_uk_init_info *ukinfo = &info->qp_uk_init_info;
+	struct irdma_uk_attrs *uk_attrs = &iwdev->rf->sc_dev.hw_attrs.uk_attrs;
+
+	irdma_get_wqe_shift(uk_attrs,
+		uk_attrs->hw_rev >= IRDMA_GEN_2 ? ukinfo->max_sq_frag_cnt + 1 :
+						  ukinfo->max_sq_frag_cnt,
+		ukinfo->max_inline_data, &sqshift);
+	status = irdma_get_sqdepth(uk_attrs, ukinfo->sq_size, sqshift,
+				   &sqdepth);
+	if (status)
+		return -ENOMEM;
+
+	if (uk_attrs->hw_rev == IRDMA_GEN_1)
+		rqshift = IRDMA_MAX_RQ_WQE_SHIFT_GEN1;
+	else
+		irdma_get_wqe_shift(uk_attrs, ukinfo->max_rq_frag_cnt, 0,
+				    &rqshift);
+
+	status = irdma_get_rqdepth(uk_attrs, ukinfo->rq_size, rqshift,
+				   &rqdepth);
+	if (status)
+		return -ENOMEM;
+
+	iwqp->kqp.sq_wrid_mem =
+		kcalloc(sqdepth, sizeof(*iwqp->kqp.sq_wrid_mem), GFP_KERNEL);
+	if (!iwqp->kqp.sq_wrid_mem)
+		return -ENOMEM;
+
+	iwqp->kqp.rq_wrid_mem =
+		kcalloc(rqdepth, sizeof(*iwqp->kqp.rq_wrid_mem), GFP_KERNEL);
+	if (!iwqp->kqp.rq_wrid_mem) {
+		kfree(iwqp->kqp.sq_wrid_mem);
+		iwqp->kqp.sq_wrid_mem = NULL;
+		return -ENOMEM;
+	}
+
+	ukinfo->sq_wrtrk_array = iwqp->kqp.sq_wrid_mem;
+	ukinfo->rq_wrid_array = iwqp->kqp.rq_wrid_mem;
+
+	size = (sqdepth + rqdepth) * IRDMA_QP_WQE_MIN_SIZE;
+	size += (IRDMA_SHADOW_AREA_SIZE << 3);
+
+	mem->size = ALIGN(size, 256);
+	mem->va = dma_alloc_coherent(iwdev->rf->hw.device, mem->size,
+				     &mem->pa, GFP_KERNEL);
+	if (!mem->va) {
+		kfree(iwqp->kqp.sq_wrid_mem);
+		iwqp->kqp.sq_wrid_mem = NULL;
+		kfree(iwqp->kqp.rq_wrid_mem);
+		iwqp->kqp.rq_wrid_mem = NULL;
+		return -ENOMEM;
+	}
+
+	ukinfo->sq = mem->va;
+	info->sq_pa = mem->pa;
+	ukinfo->rq = &ukinfo->sq[sqdepth];
+	info->rq_pa = info->sq_pa + (sqdepth * IRDMA_QP_WQE_MIN_SIZE);
+	ukinfo->shadow_area = ukinfo->rq[rqdepth].elem;
+	info->shadow_area_pa = info->rq_pa + (rqdepth * IRDMA_QP_WQE_MIN_SIZE);
+	ukinfo->sq_size = sqdepth >> sqshift;
+	ukinfo->rq_size = rqdepth >> rqshift;
+	ukinfo->qp_id = iwqp->ibqp.qp_num;
+
+	init_attr->cap.max_send_wr = (sqdepth - IRDMA_SQ_RSVD) >> sqshift;
+	init_attr->cap.max_recv_wr = (rqdepth - IRDMA_RQ_RSVD) >> rqshift;
+
+	return 0;
+}
+
+static int irdma_cqp_create_qp_cmd(struct irdma_qp *iwqp)
+{
+	struct irdma_pci_f *rf = iwqp->iwdev->rf;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_create_qp_info *qp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_info = &cqp_request->info;
+	qp_info = &cqp_request->info.in.u.qp_create.info;
+	memset(qp_info, 0, sizeof(*qp_info));
+	qp_info->mac_valid = true;
+	qp_info->cq_num_valid = true;
+	qp_info->next_iwarp_state = IRDMA_QP_STATE_IDLE;
+
+	cqp_info->cqp_cmd = IRDMA_OP_QP_CREATE;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.qp_create.qp = &iwqp->sc_qp;
+	cqp_info->in.u.qp_create.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status ? -ENOMEM : 0;
+}
+
+static void irdma_roce_fill_and_set_qpctx_info(struct irdma_qp *iwqp,
+					       struct irdma_qp_host_ctx_info *ctx_info)
+{
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+	struct irdma_roce_offload_info *roce_info;
+	struct irdma_udp_offload_info *udp_info;
+
+	udp_info = &iwqp->udp_info;
+	udp_info->snd_mss = ib_mtu_enum_to_int(ib_mtu_int_to_enum(iwdev->vsi.mtu));
+	udp_info->cwnd = iwdev->roce_cwnd;
+	udp_info->rexmit_thresh = 2;
+	udp_info->rnr_nak_thresh = 2;
+	udp_info->src_port = 0xc000;
+	udp_info->dst_port = ROCE_V2_UDP_DPORT;
+	roce_info = &iwqp->roce_info;
+	ether_addr_copy(roce_info->mac_addr, iwdev->netdev->dev_addr);
+
+	roce_info->rd_en = true;
+	roce_info->wr_rdresp_en = true;
+	roce_info->bind_en = true;
+	roce_info->dcqcn_en = false;
+	roce_info->rtomin = 5;
+
+	roce_info->ack_credits = iwdev->roce_ackcreds;
+	roce_info->ird_size = dev->hw_attrs.max_hw_ird;
+	roce_info->ord_size = dev->hw_attrs.max_hw_ord;
+
+	if (!iwqp->user_mode) {
+		roce_info->priv_mode_en = true;
+		roce_info->fast_reg_en = true;
+		roce_info->udprivcq_en = true;
+	}
+	roce_info->roce_tver = 0;
+
+	ctx_info->roce_info = &iwqp->roce_info;
+	ctx_info->udp_info = &iwqp->udp_info;
+	dev->iw_priv_qp_ops->qp_setctx_roce(&iwqp->sc_qp, iwqp->host_ctx.va,
+					    ctx_info);
+}
+
+static void irdma_iw_fill_and_set_qpctx_info(struct irdma_qp *iwqp,
+					     struct irdma_qp_host_ctx_info *ctx_info)
+{
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+	struct irdma_iwarp_offload_info *iwarp_info;
+
+	iwarp_info = &iwqp->iwarp_info;
+	ether_addr_copy(iwarp_info->mac_addr, iwdev->netdev->dev_addr);
+	iwarp_info->rd_en = true;
+	iwarp_info->wr_rdresp_en = true;
+	iwarp_info->bind_en = true;
+	iwarp_info->ecn_en = true;
+	iwarp_info->rtomin = 5;
+
+	if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+		iwarp_info->ib_rd_en = true;
+	if (!iwqp->user_mode) {
+		iwarp_info->priv_mode_en = true;
+		iwarp_info->fast_reg_en = true;
+	}
+	iwarp_info->ddp_ver = 1;
+	iwarp_info->rdmap_ver = 1;
+
+	ctx_info->iwarp_info = &iwqp->iwarp_info;
+	ctx_info->iwarp_info_valid = true;
+	dev->iw_priv_qp_ops->qp_setctx(&iwqp->sc_qp, iwqp->host_ctx.va, ctx_info);
+	ctx_info->iwarp_info_valid = false;
+}
+
+static int irdma_validate_qp_attrs(struct ib_qp_init_attr *init_attr,
+				   struct irdma_device *iwdev)
+{
+	struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+	struct irdma_uk_attrs *uk_attrs = &dev->hw_attrs.uk_attrs;
+
+	if (init_attr->create_flags)
+		return -EOPNOTSUPP;
+
+	if (init_attr->cap.max_inline_data > uk_attrs->max_hw_inline ||
+	    init_attr->cap.max_send_sge > uk_attrs->max_hw_wq_frags ||
+	    init_attr->cap.max_recv_sge > uk_attrs->max_hw_wq_frags)
+		return -EINVAL;
+
+	if (rdma_protocol_roce(&iwdev->ibdev, 1)) {
+		if (init_attr->qp_type != IB_QPT_RC &&
+		    init_attr->qp_type != IB_QPT_UD &&
+		    init_attr->qp_type != IB_QPT_GSI)
+			return -EOPNOTSUPP;
+	} else {
+		if (init_attr->qp_type != IB_QPT_RC)
+			return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_create_qp - create qp
+ * @ibpd: ptr of pd
+ * @init_attr: attributes for qp
+ * @udata: user data for create qp
+ */
+static struct ib_qp *irdma_create_qp(struct ib_pd *ibpd,
+				     struct ib_qp_init_attr *init_attr,
+				     struct ib_udata *udata)
+{
+	struct irdma_pd *iwpd = to_iwpd(ibpd);
+	struct irdma_device *iwdev = to_iwdev(ibpd->device);
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct irdma_qp *iwqp;
+	struct irdma_create_qp_req req;
+	struct irdma_create_qp_resp uresp = {};
+	u32 qp_num = 0;
+	enum irdma_status_code ret;
+	int err_code;
+	int sq_size;
+	int rq_size;
+	struct irdma_sc_qp *qp;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_uk_attrs *uk_attrs = &dev->hw_attrs.uk_attrs;
+	struct irdma_qp_init_info init_info = {};
+	struct irdma_qp_host_ctx_info *ctx_info;
+	unsigned long flags;
+
+	err_code = irdma_validate_qp_attrs(init_attr, iwdev);
+	if (err_code)
+		return ERR_PTR(err_code);
+
+	sq_size = init_attr->cap.max_send_wr;
+	rq_size = init_attr->cap.max_recv_wr;
+
+	init_info.vsi = &iwdev->vsi;
+	init_info.qp_uk_init_info.uk_attrs = uk_attrs;
+	init_info.qp_uk_init_info.sq_size = sq_size;
+	init_info.qp_uk_init_info.rq_size = rq_size;
+	init_info.qp_uk_init_info.max_sq_frag_cnt = init_attr->cap.max_send_sge;
+	init_info.qp_uk_init_info.max_rq_frag_cnt = init_attr->cap.max_recv_sge;
+	init_info.qp_uk_init_info.max_inline_data = init_attr->cap.max_inline_data;
+
+	iwqp = kzalloc(sizeof(*iwqp), GFP_KERNEL);
+	if (!iwqp)
+		return ERR_PTR(-ENOMEM);
+
+	qp = &iwqp->sc_qp;
+	qp->qp_uk.back_qp = iwqp;
+	qp->qp_uk.lock = &iwqp->lock;
+	qp->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX;
+
+	iwqp->iwdev = iwdev;
+	iwqp->q2_ctx_mem.size = ALIGN(IRDMA_Q2_BUF_SIZE + IRDMA_QP_CTX_SIZE,
+				      256);
+	iwqp->q2_ctx_mem.va = dma_alloc_coherent(dev->hw->device,
+						 iwqp->q2_ctx_mem.size,
+						 &iwqp->q2_ctx_mem.pa,
+						 GFP_KERNEL);
+	if (!iwqp->q2_ctx_mem.va) {
+		err_code = -ENOMEM;
+		goto error;
+	}
+
+	init_info.q2 = iwqp->q2_ctx_mem.va;
+	init_info.q2_pa = iwqp->q2_ctx_mem.pa;
+	init_info.host_ctx = (__le64 *)(init_info.q2 + IRDMA_Q2_BUF_SIZE);
+	init_info.host_ctx_pa = init_info.q2_pa + IRDMA_Q2_BUF_SIZE;
+
+	if (init_attr->qp_type == IB_QPT_GSI)
+		qp_num = 1;
+	else
+		err_code = irdma_alloc_rsrc(rf, rf->allocated_qps, rf->max_qp,
+					    &qp_num, &rf->next_qp);
+	if (err_code)
+		goto error;
+
+	iwqp->iwpd = iwpd;
+	iwqp->ibqp.qp_num = qp_num;
+	qp = &iwqp->sc_qp;
+	iwqp->iwscq = to_iwcq(init_attr->send_cq);
+	iwqp->iwrcq = to_iwcq(init_attr->recv_cq);
+	iwqp->host_ctx.va = init_info.host_ctx;
+	iwqp->host_ctx.pa = init_info.host_ctx_pa;
+	iwqp->host_ctx.size = IRDMA_QP_CTX_SIZE;
+
+	init_info.pd = &iwpd->sc_pd;
+	init_info.qp_uk_init_info.qp_id = iwqp->ibqp.qp_num;
+	if (!rdma_protocol_roce(&iwdev->ibdev, 1))
+		init_info.qp_uk_init_info.first_sq_wq = 1;
+	iwqp->ctx_info.qp_compl_ctx = (uintptr_t)qp;
+	init_waitqueue_head(&iwqp->waitq);
+	init_waitqueue_head(&iwqp->mod_qp_waitq);
+
+	if (udata) {
+		err_code = ib_copy_from_udata(&req, udata,
+					      min(sizeof(req), udata->inlen));
+		if (err_code) {
+			ibdev_dbg(&iwdev->ibdev,
+				  "VERBS: ib_copy_from_data fail\n");
+			goto error;
+		}
+
+		iwqp->ctx_info.qp_compl_ctx = req.user_compl_ctx;
+		iwqp->user_mode = 1;
+		if (req.user_wqe_bufs) {
+			struct irdma_ucontext *ucontext =
+				rdma_udata_to_drv_context(udata,
+							  struct irdma_ucontext,
+							  ibucontext);
+
+			init_info.qp_uk_init_info.legacy_mode = ucontext->legacy_mode;
+			spin_lock_irqsave(&ucontext->qp_reg_mem_list_lock, flags);
+			iwqp->iwpbl = irdma_get_pbl((unsigned long)req.user_wqe_bufs,
+						    &ucontext->qp_reg_mem_list);
+			spin_unlock_irqrestore(&ucontext->qp_reg_mem_list_lock, flags);
+
+			if (!iwqp->iwpbl) {
+				err_code = -ENODATA;
+				ibdev_dbg(&iwdev->ibdev, "VERBS: no pbl info\n");
+				goto error;
+			}
+		}
+		init_info.qp_uk_init_info.abi_ver = iwpd->sc_pd.abi_ver;
+		err_code = irdma_setup_virt_qp(iwdev, iwqp, &init_info);
+	} else {
+		init_info.qp_uk_init_info.abi_ver = IRDMA_ABI_VER;
+		err_code = irdma_setup_kmode_qp(iwdev, iwqp, &init_info, init_attr);
+	}
+
+	if (err_code) {
+		ibdev_dbg(&iwdev->ibdev, "VERBS: setup qp failed\n");
+		goto error;
+	}
+
+	if (rdma_protocol_roce(&iwdev->ibdev, 1)) {
+		if (init_attr->qp_type == IB_QPT_RC) {
+			init_info.qp_uk_init_info.type = IRDMA_QP_TYPE_ROCE_RC;
+			init_info.qp_uk_init_info.qp_caps = IRDMA_SEND_WITH_IMM |
+							    IRDMA_WRITE_WITH_IMM |
+							    IRDMA_ROCE;
+		} else {
+			init_info.qp_uk_init_info.type = IRDMA_QP_TYPE_ROCE_UD;
+			init_info.qp_uk_init_info.qp_caps = IRDMA_SEND_WITH_IMM |
+							    IRDMA_ROCE;
+		}
+	} else {
+		init_info.qp_uk_init_info.type = IRDMA_QP_TYPE_IWARP;
+		init_info.qp_uk_init_info.qp_caps = IRDMA_WRITE_WITH_IMM;
+	}
+
+	if (dev->hw_attrs.uk_attrs.hw_rev > IRDMA_GEN_1)
+		init_info.qp_uk_init_info.qp_caps |= IRDMA_PUSH_MODE;
+
+	ret = dev->iw_priv_qp_ops->qp_init(qp, &init_info);
+	if (ret) {
+		err_code = -EPROTO;
+		ibdev_dbg(&iwdev->ibdev, "VERBS: qp_init fail\n");
+		goto error;
+	}
+
+	ctx_info = &iwqp->ctx_info;
+	ctx_info->send_cq_num = iwqp->iwscq->sc_cq.cq_uk.cq_id;
+	ctx_info->rcv_cq_num = iwqp->iwrcq->sc_cq.cq_uk.cq_id;
+
+	if (rdma_protocol_roce(&iwdev->ibdev, 1))
+		irdma_roce_fill_and_set_qpctx_info(iwqp, ctx_info);
+	else
+		irdma_iw_fill_and_set_qpctx_info(iwqp, ctx_info);
+
+	err_code = irdma_cqp_create_qp_cmd(iwqp);
+	if (err_code)
+		goto error;
+
+	refcount_set(&iwqp->refcnt, 1);
+	spin_lock_init(&iwqp->lock);
+	spin_lock_init(&iwqp->sc_qp.pfpdu.lock);
+	iwqp->sig_all = (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) ? 1 : 0;
+	rf->qp_table[qp_num] = iwqp;
+	iwqp->max_send_wr = sq_size;
+	iwqp->max_recv_wr = rq_size;
+
+	if (rdma_protocol_roce(&iwdev->ibdev, 1)) {
+		if (dev->ws_add(&iwdev->vsi, 0)) {
+			irdma_cqp_qp_destroy_cmd(&rf->sc_dev, &iwqp->sc_qp);
+			err_code = -EINVAL;
+			goto error;
+		}
+
+		irdma_qp_add_qos(&iwqp->sc_qp);
+	}
+
+	if (udata) {
+		/* GEN_1 legacy support with libi40iw does not have expanded uresp struct */
+		if (udata->outlen < sizeof(uresp)) {
+			uresp.lsmm = 1;
+			uresp.push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX_GEN_1;
+		} else {
+			if (rdma_protocol_iwarp(&iwdev->ibdev, 1))
+				uresp.lsmm = 1;
+		}
+		uresp.actual_sq_size = sq_size;
+		uresp.actual_rq_size = rq_size;
+		uresp.qp_id = qp_num;
+		uresp.qp_caps = qp->qp_uk.qp_caps;
+
+		err_code = ib_copy_to_udata(udata, &uresp,
+					    min(sizeof(uresp), udata->outlen));
+		if (err_code) {
+			ibdev_dbg(&iwdev->ibdev, "VERBS: copy_to_udata failed\n");
+			irdma_destroy_qp(&iwqp->ibqp, udata);
+			return ERR_PTR(err_code);
+		}
+	}
+
+	init_completion(&iwqp->free_qp);
+	return &iwqp->ibqp;
+
+error:
+	irdma_free_qp_rsrc(iwqp);
+
+	return ERR_PTR(err_code);
+}
+
+static int irdma_get_ib_acc_flags(struct irdma_qp *iwqp)
+{
+	int acc_flags = 0;
+
+	if (rdma_protocol_roce(iwqp->ibqp.device, 1)) {
+		if (iwqp->roce_info.wr_rdresp_en) {
+			acc_flags |= IB_ACCESS_LOCAL_WRITE;
+			acc_flags |= IB_ACCESS_REMOTE_WRITE;
+		}
+		if (iwqp->roce_info.rd_en)
+			acc_flags |= IB_ACCESS_REMOTE_READ;
+		if (iwqp->roce_info.bind_en)
+			acc_flags |= IB_ACCESS_MW_BIND;
+	} else {
+		if (iwqp->iwarp_info.wr_rdresp_en) {
+			acc_flags |= IB_ACCESS_LOCAL_WRITE;
+			acc_flags |= IB_ACCESS_REMOTE_WRITE;
+		}
+		if (iwqp->iwarp_info.rd_en)
+			acc_flags |= IB_ACCESS_REMOTE_READ;
+		if (iwqp->iwarp_info.bind_en)
+			acc_flags |= IB_ACCESS_MW_BIND;
+	}
+	return acc_flags;
+}
+
+/**
+ * irdma_query_qp - query qp attributes
+ * @ibqp: qp pointer
+ * @attr: attributes pointer
+ * @attr_mask: Not used
+ * @init_attr: qp attributes to return
+ */
+static int irdma_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+			  int attr_mask, struct ib_qp_init_attr *init_attr)
+{
+	struct irdma_qp *iwqp = to_iwqp(ibqp);
+	struct irdma_sc_qp *qp = &iwqp->sc_qp;
+
+	memset(attr, 0, sizeof(*attr));
+	memset(init_attr, 0, sizeof(*init_attr));
+
+	attr->qp_state = iwqp->ibqp_state;
+	attr->cur_qp_state = iwqp->ibqp_state;
+	attr->cap.max_send_wr = iwqp->max_send_wr;
+	attr->cap.max_recv_wr = iwqp->max_recv_wr;
+	attr->cap.max_inline_data = qp->qp_uk.max_inline_data;
+	attr->cap.max_send_sge = qp->qp_uk.max_sq_frag_cnt;
+	attr->cap.max_recv_sge = qp->qp_uk.max_rq_frag_cnt;
+	attr->qp_access_flags = irdma_get_ib_acc_flags(iwqp);
+	attr->port_num = 1;
+	if (rdma_protocol_roce(ibqp->device, 1)) {
+		attr->path_mtu = ib_mtu_int_to_enum(iwqp->udp_info.snd_mss);
+		attr->qkey = iwqp->roce_info.qkey;
+		attr->rq_psn = iwqp->udp_info.epsn;
+		attr->sq_psn = iwqp->udp_info.psn_nxt;
+		attr->dest_qp_num = iwqp->roce_info.dest_qp;
+		attr->pkey_index = iwqp->roce_info.p_key;
+		attr->retry_cnt = iwqp->udp_info.rexmit_thresh;
+		attr->rnr_retry = iwqp->udp_info.rnr_nak_thresh;
+		attr->max_rd_atomic = iwqp->roce_info.ord_size;
+		attr->max_dest_rd_atomic = iwqp->roce_info.ird_size;
+	}
+
+	init_attr->event_handler = iwqp->ibqp.event_handler;
+	init_attr->qp_context = iwqp->ibqp.qp_context;
+	init_attr->send_cq = iwqp->ibqp.send_cq;
+	init_attr->recv_cq = iwqp->ibqp.recv_cq;
+	init_attr->cap = attr->cap;
+
+	return 0;
+}
+
+/**
+ * irdma_query_pkey - Query partition key
+ * @ibdev: device pointer from stack
+ * @port: port number
+ * @index: index of pkey
+ * @pkey: pointer to store the pkey
+ */
+static int irdma_query_pkey(struct ib_device *ibdev, u32 port, u16 index,
+			    u16 *pkey)
+{
+	if (index >= IRDMA_PKEY_TBL_SZ)
+		return -EINVAL;
+
+	*pkey = IRDMA_DEFAULT_PKEY;
+	return 0;
+}
+
+/**
+ * irdma_modify_qp_roce - modify qp request
+ * @ibqp: qp's pointer for modify
+ * @attr: access attributes
+ * @attr_mask: state mask
+ * @udata: user data
+ */
+int irdma_modify_qp_roce(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+			 int attr_mask, struct ib_udata *udata)
+{
+	struct irdma_pd *iwpd = to_iwpd(ibqp->pd);
+	struct irdma_qp *iwqp = to_iwqp(ibqp);
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+	struct irdma_qp_host_ctx_info *ctx_info;
+	struct irdma_roce_offload_info *roce_info;
+	struct irdma_udp_offload_info *udp_info;
+	struct irdma_modify_qp_info info = {};
+	struct irdma_modify_qp_resp uresp = {};
+	struct irdma_modify_qp_req ureq = {};
+	unsigned long flags;
+	u8 issue_modify_qp = 0;
+	int ret = 0;
+
+	ctx_info = &iwqp->ctx_info;
+	roce_info = &iwqp->roce_info;
+	udp_info = &iwqp->udp_info;
+
+	if (attr_mask & ~IB_QP_ATTR_STANDARD_BITS)
+		return -EOPNOTSUPP;
+
+	if (attr_mask & IB_QP_DEST_QPN)
+		roce_info->dest_qp = attr->dest_qp_num;
+
+	if (attr_mask & IB_QP_PKEY_INDEX) {
+		ret = irdma_query_pkey(ibqp->device, 0, attr->pkey_index,
+				       &roce_info->p_key);
+		if (ret)
+			return ret;
+	}
+
+	if (attr_mask & IB_QP_QKEY)
+		roce_info->qkey = attr->qkey;
+
+	if (attr_mask & IB_QP_PATH_MTU)
+		udp_info->snd_mss = ib_mtu_enum_to_int(attr->path_mtu);
+
+	if (attr_mask & IB_QP_SQ_PSN) {
+		udp_info->psn_nxt = attr->sq_psn;
+		udp_info->lsn =  0xffff;
+		udp_info->psn_una = attr->sq_psn;
+		udp_info->psn_max = attr->sq_psn;
+	}
+
+	if (attr_mask & IB_QP_RQ_PSN)
+		udp_info->epsn = attr->rq_psn;
+
+	if (attr_mask & IB_QP_RNR_RETRY)
+		udp_info->rnr_nak_thresh = attr->rnr_retry;
+
+	if (attr_mask & IB_QP_RETRY_CNT)
+		udp_info->rexmit_thresh = attr->retry_cnt;
+
+	ctx_info->roce_info->pd_id = iwpd->sc_pd.pd_id;
+
+	if (attr_mask & IB_QP_AV) {
+		struct irdma_av *av = &iwqp->roce_ah.av;
+		const struct ib_gid_attr *sgid_attr;
+		u16 vlan_id = VLAN_N_VID;
+		u32 local_ip[4];
+
+		memset(&iwqp->roce_ah, 0, sizeof(iwqp->roce_ah));
+		if (attr->ah_attr.ah_flags & IB_AH_GRH) {
+			udp_info->ttl = attr->ah_attr.grh.hop_limit;
+			udp_info->flow_label = attr->ah_attr.grh.flow_label;
+			udp_info->tos = attr->ah_attr.grh.traffic_class;
+			irdma_qp_rem_qos(&iwqp->sc_qp);
+			dev->ws_remove(iwqp->sc_qp.vsi, ctx_info->user_pri);
+			ctx_info->user_pri = rt_tos2priority(udp_info->tos);
+			iwqp->sc_qp.user_pri = ctx_info->user_pri;
+			if (dev->ws_add(iwqp->sc_qp.vsi, ctx_info->user_pri))
+				return -ENOMEM;
+			irdma_qp_add_qos(&iwqp->sc_qp);
+		}
+		sgid_attr = attr->ah_attr.grh.sgid_attr;
+		ret = rdma_read_gid_l2_fields(sgid_attr, &vlan_id,
+					      ctx_info->roce_info->mac_addr);
+		if (ret)
+			return ret;
+
+		if (vlan_id >= VLAN_N_VID && iwdev->dcb)
+			vlan_id = 0;
+		if (vlan_id < VLAN_N_VID) {
+			udp_info->insert_vlan_tag = true;
+			udp_info->vlan_tag = vlan_id |
+				ctx_info->user_pri << VLAN_PRIO_SHIFT;
+		} else {
+			udp_info->insert_vlan_tag = false;
+		}
+
+		av->attrs = attr->ah_attr;
+		rdma_gid2ip((struct sockaddr *)&av->sgid_addr, &sgid_attr->gid);
+		rdma_gid2ip((struct sockaddr *)&av->dgid_addr, &attr->ah_attr.grh.dgid);
+		roce_info->local_qp = ibqp->qp_num;
+		if (av->sgid_addr.saddr.sa_family == AF_INET6) {
+			__be32 *daddr =
+				av->dgid_addr.saddr_in6.sin6_addr.in6_u.u6_addr32;
+			__be32 *saddr =
+				av->sgid_addr.saddr_in6.sin6_addr.in6_u.u6_addr32;
+
+			irdma_copy_ip_ntohl(&udp_info->dest_ip_addr[0], daddr);
+			irdma_copy_ip_ntohl(&udp_info->local_ipaddr[0], saddr);
+
+			udp_info->ipv4 = false;
+			irdma_copy_ip_ntohl(local_ip, daddr);
+
+			udp_info->arp_idx = irdma_arp_table(iwdev->rf,
+							    &local_ip[0],
+							    false, NULL,
+							    IRDMA_ARP_RESOLVE);
+		} else {
+			__be32 saddr = av->sgid_addr.saddr_in.sin_addr.s_addr;
+			__be32 daddr = av->dgid_addr.saddr_in.sin_addr.s_addr;
+
+			local_ip[0] = ntohl(daddr);
+
+			udp_info->ipv4 = true;
+			udp_info->dest_ip_addr[0] = 0;
+			udp_info->dest_ip_addr[1] = 0;
+			udp_info->dest_ip_addr[2] = 0;
+			udp_info->dest_ip_addr[3] = local_ip[0];
+
+			udp_info->local_ipaddr[0] = 0;
+			udp_info->local_ipaddr[1] = 0;
+			udp_info->local_ipaddr[2] = 0;
+			udp_info->local_ipaddr[3] = ntohl(saddr);
+		}
+		udp_info->arp_idx =
+			irdma_add_arp(iwdev->rf, local_ip, udp_info->ipv4,
+				      attr->ah_attr.roce.dmac);
+	}
+
+	if (attr_mask & IB_QP_MAX_QP_RD_ATOMIC) {
+		if (attr->max_rd_atomic > dev->hw_attrs.max_hw_ord) {
+			ibdev_err(&iwdev->ibdev,
+				  "rd_atomic = %d, above max_hw_ord=%d\n",
+				  attr->max_rd_atomic,
+				  dev->hw_attrs.max_hw_ord);
+			return -EINVAL;
+		}
+		if (attr->max_rd_atomic)
+			roce_info->ord_size = attr->max_rd_atomic;
+		info.ord_valid = true;
+	}
+
+	if (attr_mask & IB_QP_MAX_DEST_RD_ATOMIC) {
+		if (attr->max_dest_rd_atomic > dev->hw_attrs.max_hw_ird) {
+			ibdev_err(&iwdev->ibdev,
+				  "rd_atomic = %d, above max_hw_ird=%d\n",
+				   attr->max_rd_atomic,
+				   dev->hw_attrs.max_hw_ird);
+			return -EINVAL;
+		}
+		if (attr->max_dest_rd_atomic)
+			roce_info->ird_size = attr->max_dest_rd_atomic;
+	}
+
+	if (attr_mask & IB_QP_ACCESS_FLAGS) {
+		if (attr->qp_access_flags & IB_ACCESS_LOCAL_WRITE)
+			roce_info->wr_rdresp_en = true;
+		if (attr->qp_access_flags & IB_ACCESS_REMOTE_WRITE)
+			roce_info->wr_rdresp_en = true;
+		if (attr->qp_access_flags & IB_ACCESS_REMOTE_READ)
+			roce_info->rd_en = true;
+	}
+
+	wait_event(iwqp->mod_qp_waitq, !atomic_read(&iwqp->hw_mod_qp_pend));
+
+	ibdev_dbg(&iwdev->ibdev,
+		  "VERBS: caller: %pS qp_id=%d to_ibqpstate=%d ibqpstate=%d irdma_qpstate=%d attr_mask=0x%x\n",
+		  __builtin_return_address(0), ibqp->qp_num, attr->qp_state,
+		  iwqp->ibqp_state, iwqp->iwarp_state, attr_mask);
+
+	spin_lock_irqsave(&iwqp->lock, flags);
+	if (attr_mask & IB_QP_STATE) {
+		if (!ib_modify_qp_is_ok(iwqp->ibqp_state, attr->qp_state,
+					iwqp->ibqp.qp_type, attr_mask)) {
+			ibdev_warn(&iwdev->ibdev, "modify_qp invalid for qp_id=%d, old_state=0x%x, new_state=0x%x\n",
+				   iwqp->ibqp.qp_num, iwqp->ibqp_state,
+				   attr->qp_state);
+			ret = -EINVAL;
+			goto exit;
+		}
+		info.curr_iwarp_state = iwqp->iwarp_state;
+
+		switch (attr->qp_state) {
+		case IB_QPS_INIT:
+			if (iwqp->iwarp_state > IRDMA_QP_STATE_IDLE) {
+				ret = -EINVAL;
+				goto exit;
+			}
+
+			if (iwqp->iwarp_state == IRDMA_QP_STATE_INVALID) {
+				info.next_iwarp_state = IRDMA_QP_STATE_IDLE;
+				issue_modify_qp = 1;
+			}
+			break;
+		case IB_QPS_RTR:
+			if (iwqp->iwarp_state > IRDMA_QP_STATE_IDLE) {
+				ret = -EINVAL;
+				goto exit;
+			}
+			info.arp_cache_idx_valid = true;
+			info.cq_num_valid = true;
+			info.next_iwarp_state = IRDMA_QP_STATE_RTR;
+			issue_modify_qp = 1;
+			break;
+		case IB_QPS_RTS:
+			if (iwqp->ibqp_state < IB_QPS_RTR ||
+			    iwqp->ibqp_state == IB_QPS_ERR) {
+				ret = -EINVAL;
+				goto exit;
+			}
+
+			info.arp_cache_idx_valid = true;
+			info.cq_num_valid = true;
+			info.ord_valid = true;
+			info.next_iwarp_state = IRDMA_QP_STATE_RTS;
+			issue_modify_qp = 1;
+			if (iwdev->push_mode && udata &&
+			    iwqp->sc_qp.push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX &&
+			    dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				irdma_alloc_push_page(iwqp);
+				spin_lock_irqsave(&iwqp->lock, flags);
+			}
+			break;
+		case IB_QPS_SQD:
+			if (iwqp->iwarp_state == IRDMA_QP_STATE_SQD)
+				goto exit;
+
+			if (iwqp->iwarp_state != IRDMA_QP_STATE_RTS) {
+				ret = -EINVAL;
+				goto exit;
+			}
+
+			info.next_iwarp_state = IRDMA_QP_STATE_SQD;
+			issue_modify_qp = 1;
+			break;
+		case IB_QPS_SQE:
+		case IB_QPS_ERR:
+		case IB_QPS_RESET:
+			if (iwqp->iwarp_state == IRDMA_QP_STATE_RTS) {
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				info.next_iwarp_state = IRDMA_QP_STATE_SQD;
+				irdma_hw_modify_qp(iwdev, iwqp, &info, true);
+				spin_lock_irqsave(&iwqp->lock, flags);
+			}
+
+			if (iwqp->iwarp_state == IRDMA_QP_STATE_ERROR) {
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				if (udata) {
+					if (ib_copy_from_udata(&ureq, udata,
+					    min(sizeof(ureq), udata->inlen)))
+						return -EINVAL;
+
+					irdma_flush_wqes(iwqp,
+					    (ureq.sq_flush ? IRDMA_FLUSH_SQ : 0) |
+					    (ureq.rq_flush ? IRDMA_FLUSH_RQ : 0) |
+					    IRDMA_REFLUSH);
+				}
+				return 0;
+			}
+
+			info.next_iwarp_state = IRDMA_QP_STATE_ERROR;
+			issue_modify_qp = 1;
+			break;
+		default:
+			ret = -EINVAL;
+			goto exit;
+		}
+
+		iwqp->ibqp_state = attr->qp_state;
+	}
+
+	ctx_info->send_cq_num = iwqp->iwscq->sc_cq.cq_uk.cq_id;
+	ctx_info->rcv_cq_num = iwqp->iwrcq->sc_cq.cq_uk.cq_id;
+	dev->iw_priv_qp_ops->qp_setctx_roce(&iwqp->sc_qp,
+					    iwqp->host_ctx.va, ctx_info);
+	spin_unlock_irqrestore(&iwqp->lock, flags);
+
+	if (attr_mask & IB_QP_STATE) {
+		if (issue_modify_qp) {
+			ctx_info->rem_endpoint_idx = udp_info->arp_idx;
+			if (irdma_hw_modify_qp(iwdev, iwqp, &info, true))
+				return -EINVAL;
+			spin_lock_irqsave(&iwqp->lock, flags);
+			if (iwqp->iwarp_state == info.curr_iwarp_state) {
+				iwqp->iwarp_state = info.next_iwarp_state;
+				iwqp->ibqp_state = attr->qp_state;
+			}
+			if (iwqp->ibqp_state > IB_QPS_RTS &&
+			    !iwqp->flush_issued) {
+				iwqp->flush_issued = 1;
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				irdma_flush_wqes(iwqp, IRDMA_FLUSH_SQ |
+						       IRDMA_FLUSH_RQ |
+						       IRDMA_FLUSH_WAIT);
+			} else {
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+			}
+		} else {
+			iwqp->ibqp_state = attr->qp_state;
+		}
+		if (udata && dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+			struct irdma_ucontext *ucontext;
+
+			ucontext = rdma_udata_to_drv_context(udata,
+					struct irdma_ucontext, ibucontext);
+			if (iwqp->sc_qp.push_idx != IRDMA_INVALID_PUSH_PAGE_INDEX &&
+			    !iwqp->push_wqe_mmap_entry &&
+			    !irdma_setup_push_mmap_entries(ucontext, iwqp,
+				&uresp.push_wqe_mmap_key, &uresp.push_db_mmap_key)) {
+				uresp.push_valid = 1;
+				uresp.push_offset = iwqp->sc_qp.push_offset;
+			}
+			ret = ib_copy_to_udata(udata, &uresp, min(sizeof(uresp),
+					       udata->outlen));
+			if (ret) {
+				irdma_remove_push_mmap_entries(iwqp);
+				ibdev_dbg(&iwdev->ibdev,
+					  "VERBS: copy_to_udata failed\n");
+				return ret;
+			}
+		}
+	}
+
+	return 0;
+exit:
+	spin_unlock_irqrestore(&iwqp->lock, flags);
+
+	return ret;
+}
+
+/**
+ * irdma_modify_qp - modify qp request
+ * @ibqp: qp's pointer for modify
+ * @attr: access attributes
+ * @attr_mask: state mask
+ * @udata: user data
+ */
+int irdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask,
+		    struct ib_udata *udata)
+{
+	struct irdma_qp *iwqp = to_iwqp(ibqp);
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+	struct irdma_qp_host_ctx_info *ctx_info;
+	struct irdma_tcp_offload_info *tcp_info;
+	struct irdma_iwarp_offload_info *offload_info;
+	struct irdma_modify_qp_info info = {};
+	struct irdma_modify_qp_resp uresp = {};
+	struct irdma_modify_qp_req ureq = {};
+	u8 issue_modify_qp = 0;
+	u8 dont_wait = 0;
+	int err;
+	unsigned long flags;
+
+	if (attr_mask & ~IB_QP_ATTR_STANDARD_BITS)
+		return -EOPNOTSUPP;
+
+	ctx_info = &iwqp->ctx_info;
+	offload_info = &iwqp->iwarp_info;
+	tcp_info = &iwqp->tcp_info;
+	wait_event(iwqp->mod_qp_waitq, !atomic_read(&iwqp->hw_mod_qp_pend));
+	ibdev_dbg(&iwdev->ibdev,
+		  "VERBS: caller: %pS qp_id=%d to_ibqpstate=%d ibqpstate=%d irdma_qpstate=%d last_aeq=%d hw_tcp_state=%d hw_iwarp_state=%d attr_mask=0x%x\n",
+		  __builtin_return_address(0), ibqp->qp_num, attr->qp_state,
+		  iwqp->ibqp_state, iwqp->iwarp_state, iwqp->last_aeq,
+		  iwqp->hw_tcp_state, iwqp->hw_iwarp_state, attr_mask);
+
+	spin_lock_irqsave(&iwqp->lock, flags);
+	if (attr_mask & IB_QP_STATE) {
+		info.curr_iwarp_state = iwqp->iwarp_state;
+		switch (attr->qp_state) {
+		case IB_QPS_INIT:
+		case IB_QPS_RTR:
+			if (iwqp->iwarp_state > IRDMA_QP_STATE_IDLE) {
+				err = -EINVAL;
+				goto exit;
+			}
+
+			if (iwqp->iwarp_state == IRDMA_QP_STATE_INVALID) {
+				info.next_iwarp_state = IRDMA_QP_STATE_IDLE;
+				issue_modify_qp = 1;
+			}
+			if (iwdev->push_mode && udata &&
+			    iwqp->sc_qp.push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX &&
+			    dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				irdma_alloc_push_page(iwqp);
+				spin_lock_irqsave(&iwqp->lock, flags);
+			}
+			break;
+		case IB_QPS_RTS:
+			if (iwqp->iwarp_state > IRDMA_QP_STATE_RTS ||
+			    !iwqp->cm_id) {
+				err = -EINVAL;
+				goto exit;
+			}
+
+			issue_modify_qp = 1;
+			iwqp->hw_tcp_state = IRDMA_TCP_STATE_ESTABLISHED;
+			iwqp->hte_added = 1;
+			info.next_iwarp_state = IRDMA_QP_STATE_RTS;
+			info.tcp_ctx_valid = true;
+			info.ord_valid = true;
+			info.arp_cache_idx_valid = true;
+			info.cq_num_valid = true;
+			break;
+		case IB_QPS_SQD:
+			if (iwqp->hw_iwarp_state > IRDMA_QP_STATE_RTS) {
+				err = 0;
+				goto exit;
+			}
+
+			if (iwqp->iwarp_state == IRDMA_QP_STATE_CLOSING ||
+			    iwqp->iwarp_state < IRDMA_QP_STATE_RTS) {
+				err = 0;
+				goto exit;
+			}
+
+			if (iwqp->iwarp_state > IRDMA_QP_STATE_CLOSING) {
+				err = -EINVAL;
+				goto exit;
+			}
+
+			info.next_iwarp_state = IRDMA_QP_STATE_CLOSING;
+			issue_modify_qp = 1;
+			break;
+		case IB_QPS_SQE:
+			if (iwqp->iwarp_state >= IRDMA_QP_STATE_TERMINATE) {
+				err = -EINVAL;
+				goto exit;
+			}
+
+			info.next_iwarp_state = IRDMA_QP_STATE_TERMINATE;
+			issue_modify_qp = 1;
+			break;
+		case IB_QPS_ERR:
+		case IB_QPS_RESET:
+			if (iwqp->iwarp_state == IRDMA_QP_STATE_ERROR) {
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				if (udata) {
+					if (ib_copy_from_udata(&ureq, udata,
+					    min(sizeof(ureq), udata->inlen)))
+						return -EINVAL;
+
+					irdma_flush_wqes(iwqp,
+					    (ureq.sq_flush ? IRDMA_FLUSH_SQ : 0) |
+					    (ureq.rq_flush ? IRDMA_FLUSH_RQ : 0) |
+					    IRDMA_REFLUSH);
+				}
+				return 0;
+			}
+
+			if (iwqp->sc_qp.term_flags) {
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				irdma_terminate_del_timer(&iwqp->sc_qp);
+				spin_lock_irqsave(&iwqp->lock, flags);
+			}
+			info.next_iwarp_state = IRDMA_QP_STATE_ERROR;
+			if (iwqp->hw_tcp_state > IRDMA_TCP_STATE_CLOSED &&
+			    iwdev->iw_status &&
+			    iwqp->hw_tcp_state != IRDMA_TCP_STATE_TIME_WAIT)
+				info.reset_tcp_conn = true;
+			else
+				dont_wait = 1;
+
+			issue_modify_qp = 1;
+			info.next_iwarp_state = IRDMA_QP_STATE_ERROR;
+			break;
+		default:
+			err = -EINVAL;
+			goto exit;
+		}
+
+		iwqp->ibqp_state = attr->qp_state;
+	}
+	if (attr_mask & IB_QP_ACCESS_FLAGS) {
+		ctx_info->iwarp_info_valid = true;
+		if (attr->qp_access_flags & IB_ACCESS_LOCAL_WRITE)
+			offload_info->wr_rdresp_en = true;
+		if (attr->qp_access_flags & IB_ACCESS_REMOTE_WRITE)
+			offload_info->wr_rdresp_en = true;
+		if (attr->qp_access_flags & IB_ACCESS_REMOTE_READ)
+			offload_info->rd_en = true;
+	}
+
+	if (ctx_info->iwarp_info_valid) {
+		struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+
+		ctx_info->send_cq_num = iwqp->iwscq->sc_cq.cq_uk.cq_id;
+		ctx_info->rcv_cq_num = iwqp->iwrcq->sc_cq.cq_uk.cq_id;
+		dev->iw_priv_qp_ops->qp_setctx(&iwqp->sc_qp,
+					       iwqp->host_ctx.va,
+					       ctx_info);
+	}
+	spin_unlock_irqrestore(&iwqp->lock, flags);
+
+	if (attr_mask & IB_QP_STATE) {
+		if (issue_modify_qp) {
+			ctx_info->rem_endpoint_idx = tcp_info->arp_idx;
+			if (irdma_hw_modify_qp(iwdev, iwqp, &info, true))
+				return -EINVAL;
+		}
+
+		spin_lock_irqsave(&iwqp->lock, flags);
+		if (iwqp->iwarp_state == info.curr_iwarp_state) {
+			iwqp->iwarp_state = info.next_iwarp_state;
+			iwqp->ibqp_state = attr->qp_state;
+		}
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+	}
+
+	if (issue_modify_qp && iwqp->ibqp_state > IB_QPS_RTS) {
+		if (dont_wait) {
+			if (iwqp->cm_id && iwqp->hw_tcp_state) {
+				spin_lock_irqsave(&iwqp->lock, flags);
+				iwqp->hw_tcp_state = IRDMA_TCP_STATE_CLOSED;
+				iwqp->last_aeq = IRDMA_AE_RESET_SENT;
+				spin_unlock_irqrestore(&iwqp->lock, flags);
+				irdma_cm_disconn(iwqp);
+			}
+		} else {
+			int close_timer_started;
+
+			spin_lock_irqsave(&iwdev->cm_core.ht_lock, flags);
+
+			if (iwqp->cm_node) {
+				refcount_inc(&iwqp->cm_node->refcnt);
+				spin_unlock_irqrestore(&iwdev->cm_core.ht_lock, flags);
+				close_timer_started = atomic_inc_return(&iwqp->close_timer_started);
+				if (iwqp->cm_id && close_timer_started == 1)
+					irdma_schedule_cm_timer(iwqp->cm_node,
+						(struct irdma_puda_buf *)iwqp,
+						IRDMA_TIMER_TYPE_CLOSE, 1, 0);
+
+				irdma_rem_ref_cm_node(iwqp->cm_node);
+			} else {
+				spin_unlock_irqrestore(&iwdev->cm_core.ht_lock, flags);
+			}
+		}
+	}
+	if (attr_mask & IB_QP_STATE && udata &&
+	    dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) {
+		struct irdma_ucontext *ucontext;
+
+		ucontext = rdma_udata_to_drv_context(udata,
+					struct irdma_ucontext, ibucontext);
+		if (iwqp->sc_qp.push_idx != IRDMA_INVALID_PUSH_PAGE_INDEX &&
+		    !iwqp->push_wqe_mmap_entry &&
+		    !irdma_setup_push_mmap_entries(ucontext, iwqp,
+			&uresp.push_wqe_mmap_key, &uresp.push_db_mmap_key)) {
+			uresp.push_valid = 1;
+			uresp.push_offset = iwqp->sc_qp.push_offset;
+		}
+
+		err = ib_copy_to_udata(udata, &uresp, min(sizeof(uresp),
+				       udata->outlen));
+		if (err) {
+			irdma_remove_push_mmap_entries(iwqp);
+			ibdev_dbg(&iwdev->ibdev,
+				  "VERBS: copy_to_udata failed\n");
+			return err;
+		}
+	}
+
+	return 0;
+exit:
+	spin_unlock_irqrestore(&iwqp->lock, flags);
+
+	return err;
+}
+
+/**
+ * irdma_cq_free_rsrc - free up resources for cq
+ * @rf: RDMA PCI function
+ * @iwcq: cq ptr
+ */
+static void irdma_cq_free_rsrc(struct irdma_pci_f *rf, struct irdma_cq *iwcq)
+{
+	struct irdma_sc_cq *cq = &iwcq->sc_cq;
+
+	if (!iwcq->user_mode) {
+		dma_free_coherent(rf->sc_dev.hw->device, iwcq->kmem.size,
+				  iwcq->kmem.va, iwcq->kmem.pa);
+		iwcq->kmem.va = NULL;
+		dma_free_coherent(rf->sc_dev.hw->device,
+				  iwcq->kmem_shadow.size,
+				  iwcq->kmem_shadow.va, iwcq->kmem_shadow.pa);
+		iwcq->kmem_shadow.va = NULL;
+	}
+
+	irdma_free_rsrc(rf, rf->allocated_cqs, cq->cq_uk.cq_id);
+}
+
+/**
+ * irdma_free_cqbuf - worker to free a cq buffer
+ * @work: provides access to the cq buffer to free
+ */
+static void irdma_free_cqbuf(struct work_struct *work)
+{
+	struct irdma_cq_buf *cq_buf = container_of(work, struct irdma_cq_buf, work);
+
+	dma_free_coherent(cq_buf->hw->device, cq_buf->kmem_buf.size,
+			  cq_buf->kmem_buf.va, cq_buf->kmem_buf.pa);
+	cq_buf->kmem_buf.va = NULL;
+	kfree(cq_buf);
+}
+
+/**
+ * irdma_process_resize_list - remove resized cq buffers from the resize_list
+ * @iwcq: cq which owns the resize_list
+ * @iwdev: irdma device
+ * @lcqe_buf: the buffer where the last cqe is received
+ */
+static int irdma_process_resize_list(struct irdma_cq *iwcq,
+				     struct irdma_device *iwdev,
+				     struct irdma_cq_buf *lcqe_buf)
+{
+	struct list_head *tmp_node, *list_node;
+	struct irdma_cq_buf *cq_buf;
+	int cnt = 0;
+
+	list_for_each_safe(list_node, tmp_node, &iwcq->resize_list) {
+		cq_buf = list_entry(list_node, struct irdma_cq_buf, list);
+		if (cq_buf == lcqe_buf)
+			return cnt;
+
+		list_del(&cq_buf->list);
+		queue_work(iwdev->cleanup_wq, &cq_buf->work);
+		cnt++;
+	}
+
+	return cnt;
+}
+
+/**
+ * irdma_destroy_cq - destroy cq
+ * @ib_cq: cq pointer
+ * @udata: user data
+ */
+static int irdma_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+{
+	struct irdma_device *iwdev = to_iwdev(ib_cq->device);
+	struct irdma_cq *iwcq = to_iwcq(ib_cq);
+	struct irdma_sc_cq *cq = &iwcq->sc_cq;
+	struct irdma_sc_dev *dev = cq->dev;
+	struct irdma_sc_ceq *ceq = dev->ceq[cq->ceq_id];
+	struct irdma_ceq *iwceq = container_of(ceq, struct irdma_ceq, sc_ceq);
+	unsigned long flags;
+
+	spin_lock_irqsave(&iwcq->lock, flags);
+	if (!list_empty(&iwcq->resize_list))
+		irdma_process_resize_list(iwcq, iwdev, NULL);
+	spin_unlock_irqrestore(&iwcq->lock, flags);
+
+	irdma_cq_wq_destroy(iwdev->rf, cq);
+	irdma_cq_free_rsrc(iwdev->rf, iwcq);
+
+	spin_lock_irqsave(&iwceq->ce_lock, flags);
+	dev->ceq_ops->cleanup_ceqes(cq, ceq);
+	spin_unlock_irqrestore(&iwceq->ce_lock, flags);
+
+	return 0;
+}
+
+/**
+ * irdma_resize_cq - resize cq
+ * @ibcq: cq to be resized
+ * @entries: desired cq size
+ * @udata: user data
+ */
+static int irdma_resize_cq(struct ib_cq *ibcq, int entries,
+			   struct ib_udata *udata)
+{
+	struct irdma_cq *iwcq = to_iwcq(ibcq);
+	struct irdma_sc_dev *dev = iwcq->sc_cq.dev;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_modify_cq_info *m_info;
+	struct irdma_modify_cq_info info = {};
+	struct irdma_dma_mem kmem_buf;
+	struct irdma_cq_mr *cqmr_buf;
+	struct irdma_pbl *iwpbl_buf;
+	struct irdma_device *iwdev;
+	struct irdma_pci_f *rf;
+	struct irdma_cq_buf *cq_buf = NULL;
+	enum irdma_status_code status = 0;
+	unsigned long flags;
+	int ret;
+
+	iwdev = to_iwdev(ibcq->device);
+	rf = iwdev->rf;
+
+	if (!(rf->sc_dev.hw_attrs.uk_attrs.feature_flags &
+	    IRDMA_FEATURE_CQ_RESIZE))
+		return -EOPNOTSUPP;
+
+	if (entries > rf->max_cqe)
+		return -EINVAL;
+
+	if (!iwcq->user_mode) {
+		entries++;
+		if (rf->sc_dev.hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+			entries *= 2;
+	}
+
+	info.cq_size = max(entries, 4);
+
+	if (info.cq_size == iwcq->sc_cq.cq_uk.cq_size - 1)
+		return 0;
+
+	if (udata) {
+		struct irdma_resize_cq_req req = {};
+		struct irdma_ucontext *ucontext =
+			rdma_udata_to_drv_context(udata, struct irdma_ucontext,
+						  ibucontext);
+
+		/* CQ resize not supported with legacy GEN_1 libi40iw */
+		if (ucontext->legacy_mode)
+			return -EOPNOTSUPP;
+
+		if (ib_copy_from_udata(&req, udata,
+				       min(sizeof(req), udata->inlen)))
+			return -EINVAL;
+
+		spin_lock_irqsave(&ucontext->cq_reg_mem_list_lock, flags);
+		iwpbl_buf = irdma_get_pbl((unsigned long)req.user_cq_buffer,
+					  &ucontext->cq_reg_mem_list);
+		spin_unlock_irqrestore(&ucontext->cq_reg_mem_list_lock, flags);
+
+		if (!iwpbl_buf)
+			return -ENOMEM;
+
+		cqmr_buf = &iwpbl_buf->cq_mr;
+		if (iwpbl_buf->pbl_allocated) {
+			info.virtual_map = true;
+			info.pbl_chunk_size = 1;
+			info.first_pm_pbl_idx = cqmr_buf->cq_pbl.idx;
+		} else {
+			info.cq_pa = cqmr_buf->cq_pbl.addr;
+		}
+	} else {
+		/* Kmode CQ resize */
+		int rsize;
+
+		rsize = info.cq_size * sizeof(struct irdma_cqe);
+		kmem_buf.size = ALIGN(round_up(rsize, 256), 256);
+		kmem_buf.va = dma_alloc_coherent(dev->hw->device,
+						 kmem_buf.size, &kmem_buf.pa,
+						 GFP_KERNEL);
+		if (!kmem_buf.va)
+			return -ENOMEM;
+
+		info.cq_base = kmem_buf.va;
+		info.cq_pa = kmem_buf.pa;
+		cq_buf = kzalloc(sizeof(*cq_buf), GFP_KERNEL);
+		if (!cq_buf) {
+			ret = -ENOMEM;
+			goto error;
+		}
+	}
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
+	info.shadow_read_threshold = iwcq->sc_cq.shadow_read_threshold;
+	info.cq_resize = true;
+
+	cqp_info = &cqp_request->info;
+	m_info = &cqp_info->in.u.cq_modify.info;
+	memcpy(m_info, &info, sizeof(*m_info));
+
+	cqp_info->cqp_cmd = IRDMA_OP_CQ_MODIFY;
+	cqp_info->in.u.cq_modify.cq = &iwcq->sc_cq;
+	cqp_info->in.u.cq_modify.scratch = (uintptr_t)cqp_request;
+	cqp_info->post_sq = 1;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+	if (status) {
+		ret = -EPROTO;
+		goto error;
+	}
+
+	spin_lock_irqsave(&iwcq->lock, flags);
+	if (cq_buf) {
+		cq_buf->kmem_buf = iwcq->kmem;
+		cq_buf->hw = dev->hw;
+		memcpy(&cq_buf->cq_uk, &iwcq->sc_cq.cq_uk, sizeof(cq_buf->cq_uk));
+		INIT_WORK(&cq_buf->work, irdma_free_cqbuf);
+		list_add_tail(&cq_buf->list, &iwcq->resize_list);
+		iwcq->kmem = kmem_buf;
+	}
+
+	dev->iw_priv_cq_ops->cq_resize(&iwcq->sc_cq, &info);
+	ibcq->cqe = info.cq_size - 1;
+	spin_unlock_irqrestore(&iwcq->lock, flags);
+
+	return 0;
+error:
+	if (!udata) {
+		dma_free_coherent(dev->hw->device, kmem_buf.size, kmem_buf.va,
+				  kmem_buf.pa);
+		kmem_buf.va = NULL;
+	}
+	kfree(cq_buf);
+
+	return ret;
+}
+
+static inline int cq_validate_flags(u32 flags, u8 hw_rev)
+{
+	/* GEN1 does not support CQ create flags */
+	if (hw_rev == IRDMA_GEN_1)
+		return flags ? -EOPNOTSUPP : 0;
+
+	return flags & ~IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION ? -EOPNOTSUPP : 0;
+}
+
+/**
+ * irdma_create_cq - create cq
+ * @ibcq: CQ allocated
+ * @attr: attributes for cq
+ * @udata: user data
+ */
+static int irdma_create_cq(struct ib_cq *ibcq,
+			   const struct ib_cq_init_attr *attr,
+			   struct ib_udata *udata)
+{
+	struct ib_device *ibdev = ibcq->device;
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct irdma_cq *iwcq = to_iwcq(ibcq);
+	u32 cq_num = 0;
+	struct irdma_sc_cq *cq;
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_cq_init_info info = {};
+	enum irdma_status_code status;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_cq_uk_init_info *ukinfo = &info.cq_uk_init_info;
+	unsigned long flags;
+	int err_code;
+	int entries = attr->cqe;
+
+	err_code = cq_validate_flags(attr->flags, dev->hw_attrs.uk_attrs.hw_rev);
+	if (err_code)
+		return err_code;
+	err_code = irdma_alloc_rsrc(rf, rf->allocated_cqs, rf->max_cq, &cq_num,
+				    &rf->next_cq);
+	if (err_code)
+		return err_code;
+
+	cq = &iwcq->sc_cq;
+	cq->back_cq = iwcq;
+	spin_lock_init(&iwcq->lock);
+	INIT_LIST_HEAD(&iwcq->resize_list);
+	info.dev = dev;
+	ukinfo->cq_size = max(entries, 4);
+	ukinfo->cq_id = cq_num;
+	iwcq->ibcq.cqe = info.cq_uk_init_info.cq_size;
+	if (attr->comp_vector < rf->ceqs_count)
+		info.ceq_id = attr->comp_vector;
+	info.ceq_id_valid = true;
+	info.ceqe_mask = 1;
+	info.type = IRDMA_CQ_TYPE_IWARP;
+	info.vsi = &iwdev->vsi;
+
+	if (udata) {
+		struct irdma_ucontext *ucontext;
+		struct irdma_create_cq_req req = {};
+		struct irdma_cq_mr *cqmr;
+		struct irdma_pbl *iwpbl;
+		struct irdma_pbl *iwpbl_shadow;
+		struct irdma_cq_mr *cqmr_shadow;
+
+		iwcq->user_mode = true;
+		ucontext =
+			rdma_udata_to_drv_context(udata, struct irdma_ucontext,
+						  ibucontext);
+		if (ib_copy_from_udata(&req, udata,
+				       min(sizeof(req), udata->inlen))) {
+			err_code = -EFAULT;
+			goto cq_free_rsrc;
+		}
+
+		spin_lock_irqsave(&ucontext->cq_reg_mem_list_lock, flags);
+		iwpbl = irdma_get_pbl((unsigned long)req.user_cq_buf,
+				      &ucontext->cq_reg_mem_list);
+		spin_unlock_irqrestore(&ucontext->cq_reg_mem_list_lock, flags);
+		if (!iwpbl) {
+			err_code = -EPROTO;
+			goto cq_free_rsrc;
+		}
+
+		iwcq->iwpbl = iwpbl;
+		iwcq->cq_mem_size = 0;
+		cqmr = &iwpbl->cq_mr;
+
+		if (rf->sc_dev.hw_attrs.uk_attrs.feature_flags &
+		    IRDMA_FEATURE_CQ_RESIZE && !ucontext->legacy_mode) {
+			spin_lock_irqsave(&ucontext->cq_reg_mem_list_lock, flags);
+			iwpbl_shadow = irdma_get_pbl(
+					(unsigned long)req.user_shadow_area,
+					&ucontext->cq_reg_mem_list);
+			spin_unlock_irqrestore(&ucontext->cq_reg_mem_list_lock, flags);
+
+			if (!iwpbl_shadow) {
+				err_code = -EPROTO;
+				goto cq_free_rsrc;
+			}
+			iwcq->iwpbl_shadow = iwpbl_shadow;
+			cqmr_shadow = &iwpbl_shadow->cq_mr;
+			info.shadow_area_pa = cqmr_shadow->cq_pbl.addr;
+			cqmr->split = true;
+		} else {
+			info.shadow_area_pa = cqmr->shadow;
+		}
+		if (iwpbl->pbl_allocated) {
+			info.virtual_map = true;
+			info.pbl_chunk_size = 1;
+			info.first_pm_pbl_idx = cqmr->cq_pbl.idx;
+		} else {
+			info.cq_base_pa = cqmr->cq_pbl.addr;
+		}
+	} else {
+		/* Kmode allocations */
+		int rsize;
+
+		if (entries > rf->max_cqe) {
+			err_code = -EINVAL;
+			goto cq_free_rsrc;
+		}
+
+		entries++;
+		if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+			entries *= 2;
+		ukinfo->cq_size = entries;
+
+		rsize = info.cq_uk_init_info.cq_size * sizeof(struct irdma_cqe);
+		iwcq->kmem.size = ALIGN(round_up(rsize, 256), 256);
+		iwcq->kmem.va = dma_alloc_coherent(dev->hw->device,
+						   iwcq->kmem.size,
+						   &iwcq->kmem.pa, GFP_KERNEL);
+		if (!iwcq->kmem.va) {
+			err_code = -ENOMEM;
+			goto cq_free_rsrc;
+		}
+
+		iwcq->kmem_shadow.size = ALIGN(IRDMA_SHADOW_AREA_SIZE << 3,
+					       64);
+		iwcq->kmem_shadow.va = dma_alloc_coherent(dev->hw->device,
+							  iwcq->kmem_shadow.size,
+							  &iwcq->kmem_shadow.pa,
+							  GFP_KERNEL);
+		if (!iwcq->kmem_shadow.va) {
+			err_code = -ENOMEM;
+			goto cq_free_rsrc;
+		}
+		info.shadow_area_pa = iwcq->kmem_shadow.pa;
+		ukinfo->shadow_area = iwcq->kmem_shadow.va;
+		ukinfo->cq_base = iwcq->kmem.va;
+		info.cq_base_pa = iwcq->kmem.pa;
+	}
+
+	if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+		info.shadow_read_threshold = min(info.cq_uk_init_info.cq_size / 2,
+						 (u32)IRDMA_MAX_CQ_READ_THRESH);
+	if (dev->iw_priv_cq_ops->cq_init(cq, &info)) {
+		ibdev_dbg(&iwdev->ibdev, "VERBS: init cq fail\n");
+		err_code = -EPROTO;
+		goto cq_free_rsrc;
+	}
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request) {
+		err_code = -ENOMEM;
+		goto cq_free_rsrc;
+	}
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = IRDMA_OP_CQ_CREATE;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.cq_create.cq = cq;
+	cqp_info->in.u.cq_create.check_overflow = true;
+	cqp_info->in.u.cq_create.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+	if (status) {
+		err_code = -ENOMEM;
+		goto cq_free_rsrc;
+	}
+
+	if (udata) {
+		struct irdma_create_cq_resp resp = {};
+
+		resp.cq_id = info.cq_uk_init_info.cq_id;
+		resp.cq_size = info.cq_uk_init_info.cq_size;
+		if (ib_copy_to_udata(udata, &resp,
+				     min(sizeof(resp), udata->outlen))) {
+			ibdev_dbg(&iwdev->ibdev,
+				  "VERBS: copy to user data\n");
+			err_code = -EPROTO;
+			goto cq_destroy;
+		}
+	}
+	return 0;
+
+cq_destroy:
+	irdma_cq_wq_destroy(rf, cq);
+cq_free_rsrc:
+	irdma_cq_free_rsrc(rf, iwcq);
+
+	return err_code;
+}
+
+/**
+ * irdma_get_mr_access - get hw MR access permissions from IB access flags
+ * @access: IB access flags
+ */
+static inline u16 irdma_get_mr_access(int access)
+{
+	u16 hw_access = 0;
+
+	hw_access |= (access & IB_ACCESS_LOCAL_WRITE) ?
+		     IRDMA_ACCESS_FLAGS_LOCALWRITE : 0;
+	hw_access |= (access & IB_ACCESS_REMOTE_WRITE) ?
+		     IRDMA_ACCESS_FLAGS_REMOTEWRITE : 0;
+	hw_access |= (access & IB_ACCESS_REMOTE_READ) ?
+		     IRDMA_ACCESS_FLAGS_REMOTEREAD : 0;
+	hw_access |= (access & IB_ACCESS_MW_BIND) ?
+		     IRDMA_ACCESS_FLAGS_BIND_WINDOW : 0;
+	hw_access |= (access & IB_ZERO_BASED) ?
+		     IRDMA_ACCESS_FLAGS_ZERO_BASED : 0;
+	hw_access |= IRDMA_ACCESS_FLAGS_LOCALREAD;
+
+	return hw_access;
+}
+
+/**
+ * irdma_free_stag - free stag resource
+ * @iwdev: irdma device
+ * @stag: stag to free
+ */
+static void irdma_free_stag(struct irdma_device *iwdev, u32 stag)
+{
+	u32 stag_idx;
+
+	stag_idx = (stag & iwdev->rf->mr_stagmask) >> IRDMA_CQPSQ_STAG_IDX_S;
+	irdma_free_rsrc(iwdev->rf, iwdev->rf->allocated_mrs, stag_idx);
+}
+
+/**
+ * irdma_create_stag - create random stag
+ * @iwdev: irdma device
+ */
+static u32 irdma_create_stag(struct irdma_device *iwdev)
+{
+	u32 stag = 0;
+	u32 stag_index = 0;
+	u32 next_stag_index;
+	u32 driver_key;
+	u32 random;
+	u8 consumer_key;
+	int ret;
+
+	get_random_bytes(&random, sizeof(random));
+	consumer_key = (u8)random;
+
+	driver_key = random & ~iwdev->rf->mr_stagmask;
+	next_stag_index = (random & iwdev->rf->mr_stagmask) >> 8;
+	next_stag_index %= iwdev->rf->max_mr;
+
+	ret = irdma_alloc_rsrc(iwdev->rf, iwdev->rf->allocated_mrs,
+			       iwdev->rf->max_mr, &stag_index,
+			       &next_stag_index);
+	if (ret)
+		return stag;
+	stag = stag_index << IRDMA_CQPSQ_STAG_IDX_S;
+	stag |= driver_key;
+	stag += (u32)consumer_key;
+
+	return stag;
+}
+
+/**
+ * irdma_next_pbl_addr - Get next pbl address
+ * @pbl: pointer to a pble
+ * @pinfo: info pointer
+ * @idx: index
+ */
+static inline u64 *irdma_next_pbl_addr(u64 *pbl, struct irdma_pble_info **pinfo,
+				       u32 *idx)
+{
+	*idx += 1;
+	if (!(*pinfo) || *idx != (*pinfo)->cnt)
+		return ++pbl;
+	*idx = 0;
+	(*pinfo)++;
+
+	return (u64 *)(uintptr_t)(*pinfo)->addr;
+}
+
+/**
+ * irdma_copy_user_pgaddrs - copy user page address to pble's os locally
+ * @iwmr: iwmr for IB's user page addresses
+ * @pbl: ple pointer to save 1 level or 0 level pble
+ * @level: indicated level 0, 1 or 2
+ */
+static void irdma_copy_user_pgaddrs(struct irdma_mr *iwmr, u64 *pbl,
+				    enum irdma_pble_level level)
+{
+	struct ib_umem *region = iwmr->region;
+	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+	struct irdma_pble_alloc *palloc = &iwpbl->pble_alloc;
+	struct irdma_pble_info *pinfo;
+	struct ib_block_iter biter;
+	u32 idx = 0;
+	u32 pbl_cnt = 0;
+
+	pinfo = (level == PBLE_LEVEL_1) ? NULL : palloc->level2.leaf;
+
+	if (iwmr->type == IW_MEMREG_TYPE_QP)
+		iwpbl->qp_mr.sq_page = sg_page(region->sg_head.sgl);
+
+	rdma_umem_for_each_dma_block(region, &biter, iwmr->page_size) {
+		*pbl = rdma_block_iter_dma_address(&biter);
+		if (++pbl_cnt == palloc->total_cnt)
+			break;
+		pbl = irdma_next_pbl_addr(pbl, &pinfo, &idx);
+	}
+}
+
+/**
+ * irdma_check_mem_contiguous - check if pbls stored in arr are contiguous
+ * @arr: lvl1 pbl array
+ * @npages: page count
+ * @pg_size: page size
+ *
+ */
+static bool irdma_check_mem_contiguous(u64 *arr, u32 npages, u32 pg_size)
+{
+	u32 pg_idx;
+
+	for (pg_idx = 0; pg_idx < npages; pg_idx++) {
+		if ((*arr + (pg_size * pg_idx)) != arr[pg_idx])
+			return false;
+	}
+
+	return true;
+}
+
+/**
+ * irdma_check_mr_contiguous - check if MR is physically contiguous
+ * @palloc: pbl allocation struct
+ * @pg_size: page size
+ */
+static bool irdma_check_mr_contiguous(struct irdma_pble_alloc *palloc,
+				      u32 pg_size)
+{
+	struct irdma_pble_level2 *lvl2 = &palloc->level2;
+	struct irdma_pble_info *leaf = lvl2->leaf;
+	u64 *arr = NULL;
+	u64 *start_addr = NULL;
+	int i;
+	bool ret;
+
+	if (palloc->level == PBLE_LEVEL_1) {
+		arr = (u64 *)(uintptr_t)palloc->level1.addr;
+		ret = irdma_check_mem_contiguous(arr, palloc->total_cnt,
+						 pg_size);
+		return ret;
+	}
+
+	start_addr = (u64 *)(uintptr_t)leaf->addr;
+
+	for (i = 0; i < lvl2->leaf_cnt; i++, leaf++) {
+		arr = (u64 *)(uintptr_t)leaf->addr;
+		if ((*start_addr + (i * pg_size * PBLE_PER_PAGE)) != *arr)
+			return false;
+		ret = irdma_check_mem_contiguous(arr, leaf->cnt, pg_size);
+		if (!ret)
+			return false;
+	}
+
+	return true;
+}
+
+/**
+ * irdma_setup_pbles - copy user pg address to pble's
+ * @rf: RDMA PCI function
+ * @iwmr: mr pointer for this memory registration
+ * @use_pbles: flag if to use pble's
+ */
+static int irdma_setup_pbles(struct irdma_pci_f *rf, struct irdma_mr *iwmr,
+			     bool use_pbles)
+{
+	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+	struct irdma_pble_alloc *palloc = &iwpbl->pble_alloc;
+	struct irdma_pble_info *pinfo;
+	u64 *pbl;
+	enum irdma_status_code status;
+	enum irdma_pble_level level = PBLE_LEVEL_1;
+
+	if (use_pbles) {
+		status = irdma_get_pble(rf->pble_rsrc, palloc, iwmr->page_cnt,
+					false);
+		if (status)
+			return -ENOMEM;
+
+		iwpbl->pbl_allocated = true;
+		level = palloc->level;
+		pinfo = (level == PBLE_LEVEL_1) ? &palloc->level1 :
+						  palloc->level2.leaf;
+		pbl = (u64 *)(uintptr_t)pinfo->addr;
+	} else {
+		pbl = iwmr->pgaddrmem;
+	}
+
+	irdma_copy_user_pgaddrs(iwmr, pbl, level);
+
+	if (use_pbles)
+		iwmr->pgaddrmem[0] = *pbl;
+
+	return 0;
+}
+
+/**
+ * irdma_handle_q_mem - handle memory for qp and cq
+ * @iwdev: irdma device
+ * @req: information for q memory management
+ * @iwpbl: pble struct
+ * @use_pbles: flag to use pble
+ */
+static int irdma_handle_q_mem(struct irdma_device *iwdev,
+			      struct irdma_mem_reg_req *req,
+			      struct irdma_pbl *iwpbl, bool use_pbles)
+{
+	struct irdma_pble_alloc *palloc = &iwpbl->pble_alloc;
+	struct irdma_mr *iwmr = iwpbl->iwmr;
+	struct irdma_qp_mr *qpmr = &iwpbl->qp_mr;
+	struct irdma_cq_mr *cqmr = &iwpbl->cq_mr;
+	struct irdma_hmc_pble *hmc_p;
+	u64 *arr = iwmr->pgaddrmem;
+	u32 pg_size;
+	int err = 0;
+	int total;
+	bool ret = true;
+
+	total = req->sq_pages + req->rq_pages + req->cq_pages;
+	pg_size = iwmr->page_size;
+	err = irdma_setup_pbles(iwdev->rf, iwmr, use_pbles);
+	if (err)
+		return err;
+
+	if (use_pbles && palloc->level != PBLE_LEVEL_1) {
+		irdma_free_pble(iwdev->rf->pble_rsrc, palloc);
+		iwpbl->pbl_allocated = false;
+		return -ENOMEM;
+	}
+
+	if (use_pbles)
+		arr = (u64 *)(uintptr_t)palloc->level1.addr;
+
+	switch (iwmr->type) {
+	case IW_MEMREG_TYPE_QP:
+		hmc_p = &qpmr->sq_pbl;
+		qpmr->shadow = (dma_addr_t)arr[total];
+
+		if (use_pbles) {
+			ret = irdma_check_mem_contiguous(arr, req->sq_pages,
+							 pg_size);
+			if (ret)
+				ret = irdma_check_mem_contiguous(&arr[req->sq_pages],
+								 req->rq_pages,
+								 pg_size);
+		}
+
+		if (!ret) {
+			hmc_p->idx = palloc->level1.idx;
+			hmc_p = &qpmr->rq_pbl;
+			hmc_p->idx = palloc->level1.idx + req->sq_pages;
+		} else {
+			hmc_p->addr = arr[0];
+			hmc_p = &qpmr->rq_pbl;
+			hmc_p->addr = arr[req->sq_pages];
+		}
+		break;
+	case IW_MEMREG_TYPE_CQ:
+		hmc_p = &cqmr->cq_pbl;
+
+		if (!cqmr->split)
+			cqmr->shadow = (dma_addr_t)arr[total];
+
+		if (use_pbles)
+			ret = irdma_check_mem_contiguous(arr, req->cq_pages,
+							 pg_size);
+
+		if (!ret)
+			hmc_p->idx = palloc->level1.idx;
+		else
+			hmc_p->addr = arr[0];
+	break;
+	default:
+		ibdev_dbg(&iwdev->ibdev, "VERBS: MR type error\n");
+		err = -EINVAL;
+	}
+
+	if (use_pbles && ret) {
+		irdma_free_pble(iwdev->rf->pble_rsrc, palloc);
+		iwpbl->pbl_allocated = false;
+	}
+
+	return err;
+}
+
+/**
+ * irdma_hw_alloc_mw - create the hw memory window
+ * @iwdev: irdma device
+ * @iwmr: pointer to memory window info
+ */
+static int irdma_hw_alloc_mw(struct irdma_device *iwdev, struct irdma_mr *iwmr)
+{
+	struct irdma_mw_alloc_info *info;
+	struct irdma_pd *iwpd = to_iwpd(iwmr->ibmr.pd);
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_info = &cqp_request->info;
+	info = &cqp_info->in.u.mw_alloc.info;
+	memset(info, 0, sizeof(*info));
+	if (iwmr->ibmw.type == IB_MW_TYPE_1)
+		info->mw_wide = true;
+
+	info->page_size = PAGE_SIZE;
+	info->mw_stag_index = iwmr->stag >> IRDMA_CQPSQ_STAG_IDX_S;
+	info->pd_id = iwpd->sc_pd.pd_id;
+	info->remote_access = true;
+	cqp_info->cqp_cmd = IRDMA_OP_MW_ALLOC;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.mw_alloc.dev = &iwdev->rf->sc_dev;
+	cqp_info->in.u.mw_alloc.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+
+	return status ? -ENOMEM : 0;
+}
+
+/**
+ * irdma_alloc_mw - Allocate memory window
+ * @ibmw: Memory Window
+ * @udata: user data pointer
+ */
+static int irdma_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
+{
+	struct irdma_device *iwdev = to_iwdev(ibmw->device);
+	struct irdma_mr *iwmr = to_iwmw(ibmw);
+	int err_code;
+	u32 stag;
+
+	stag = irdma_create_stag(iwdev);
+	if (!stag)
+		return -ENOMEM;
+
+	iwmr->stag = stag;
+	ibmw->rkey = stag;
+	iwmr->type = IW_MEMREG_TYPE_MW;
+
+	err_code = irdma_hw_alloc_mw(iwdev, iwmr);
+	if (err_code) {
+		irdma_free_stag(iwdev, stag);
+		return err_code;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_dealloc_mw - Dealloc memory window
+ * @ibmw: memory window structure.
+ */
+static int irdma_dealloc_mw(struct ib_mw *ibmw)
+{
+	struct ib_pd *ibpd = ibmw->pd;
+	struct irdma_pd *iwpd = to_iwpd(ibpd);
+	struct irdma_mr *iwmr = to_iwmr((struct ib_mr *)ibmw);
+	struct irdma_device *iwdev = to_iwdev(ibmw->device);
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_dealloc_stag_info *info;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_info = &cqp_request->info;
+	info = &cqp_info->in.u.dealloc_stag.info;
+	memset(info, 0, sizeof(*info));
+	info->pd_id = iwpd->sc_pd.pd_id & 0x00007fff;
+	info->stag_idx = ibmw->rkey >> IRDMA_CQPSQ_STAG_IDX_S;
+	info->mr = false;
+	cqp_info->cqp_cmd = IRDMA_OP_DEALLOC_STAG;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.dealloc_stag.dev = &iwdev->rf->sc_dev;
+	cqp_info->in.u.dealloc_stag.scratch = (uintptr_t)cqp_request;
+	irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+	irdma_free_stag(iwdev, iwmr->stag);
+
+	return 0;
+}
+
+/**
+ * irdma_hw_alloc_stag - cqp command to allocate stag
+ * @iwdev: irdma device
+ * @iwmr: irdma mr pointer
+ */
+static int irdma_hw_alloc_stag(struct irdma_device *iwdev,
+			       struct irdma_mr *iwmr)
+{
+	struct irdma_allocate_stag_info *info;
+	struct irdma_pd *iwpd = to_iwpd(iwmr->ibmr.pd);
+	enum irdma_status_code status;
+	int err = 0;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_info = &cqp_request->info;
+	info = &cqp_info->in.u.alloc_stag.info;
+	memset(info, 0, sizeof(*info));
+	info->page_size = PAGE_SIZE;
+	info->stag_idx = iwmr->stag >> IRDMA_CQPSQ_STAG_IDX_S;
+	info->pd_id = iwpd->sc_pd.pd_id;
+	info->total_len = iwmr->len;
+	info->remote_access = true;
+	cqp_info->cqp_cmd = IRDMA_OP_ALLOC_STAG;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.alloc_stag.dev = &iwdev->rf->sc_dev;
+	cqp_info->in.u.alloc_stag.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+	if (status)
+		err = -ENOMEM;
+
+	return err;
+}
+
+/**
+ * irdma_alloc_mr - register stag for fast memory registration
+ * @pd: ibpd pointer
+ * @mr_type: memory for stag registrion
+ * @max_num_sg: man number of pages
+ */
+static struct ib_mr *irdma_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
+				    u32 max_num_sg)
+{
+	struct irdma_device *iwdev = to_iwdev(pd->device);
+	struct irdma_pble_alloc *palloc;
+	struct irdma_pbl *iwpbl;
+	struct irdma_mr *iwmr;
+	enum irdma_status_code status;
+	u32 stag;
+	int err_code = -ENOMEM;
+
+	iwmr = kzalloc(sizeof(*iwmr), GFP_KERNEL);
+	if (!iwmr)
+		return ERR_PTR(-ENOMEM);
+
+	stag = irdma_create_stag(iwdev);
+	if (!stag) {
+		err_code = -ENOMEM;
+		goto err;
+	}
+
+	iwmr->stag = stag;
+	iwmr->ibmr.rkey = stag;
+	iwmr->ibmr.lkey = stag;
+	iwmr->ibmr.pd = pd;
+	iwmr->ibmr.device = pd->device;
+	iwpbl = &iwmr->iwpbl;
+	iwpbl->iwmr = iwmr;
+	iwmr->type = IW_MEMREG_TYPE_MEM;
+	palloc = &iwpbl->pble_alloc;
+	iwmr->page_cnt = max_num_sg;
+	status = irdma_get_pble(iwdev->rf->pble_rsrc, palloc, iwmr->page_cnt,
+				true);
+	if (status)
+		goto err_get_pble;
+
+	err_code = irdma_hw_alloc_stag(iwdev, iwmr);
+	if (err_code)
+		goto err_alloc_stag;
+
+	iwpbl->pbl_allocated = true;
+
+	return &iwmr->ibmr;
+err_alloc_stag:
+	irdma_free_pble(iwdev->rf->pble_rsrc, palloc);
+err_get_pble:
+	irdma_free_stag(iwdev, stag);
+err:
+	kfree(iwmr);
+
+	return ERR_PTR(err_code);
+}
+
+/**
+ * irdma_set_page - populate pbl list for fmr
+ * @ibmr: ib mem to access iwarp mr pointer
+ * @addr: page dma address fro pbl list
+ */
+static int irdma_set_page(struct ib_mr *ibmr, u64 addr)
+{
+	struct irdma_mr *iwmr = to_iwmr(ibmr);
+	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+	struct irdma_pble_alloc *palloc = &iwpbl->pble_alloc;
+	u64 *pbl;
+
+	if (unlikely(iwmr->npages == iwmr->page_cnt))
+		return -ENOMEM;
+
+	pbl = (u64 *)(uintptr_t)palloc->level1.addr;
+	pbl[iwmr->npages++] = addr;
+
+	return 0;
+}
+
+/**
+ * irdma_map_mr_sg - map of sg list for fmr
+ * @ibmr: ib mem to access iwarp mr pointer
+ * @sg: scatter gather list
+ * @sg_nents: number of sg pages
+ * @sg_offset: scatter gather list for fmr
+ */
+static int irdma_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
+			   int sg_nents, unsigned int *sg_offset)
+{
+	struct irdma_mr *iwmr = to_iwmr(ibmr);
+
+	iwmr->npages = 0;
+
+	return ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, irdma_set_page);
+}
+
+/**
+ * irdma_hwreg_mr - send cqp command for memory registration
+ * @iwdev: irdma device
+ * @iwmr: irdma mr pointer
+ * @access: access for MR
+ */
+static int irdma_hwreg_mr(struct irdma_device *iwdev, struct irdma_mr *iwmr,
+			  u16 access)
+{
+	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+	struct irdma_reg_ns_stag_info *stag_info;
+	struct irdma_pd *iwpd = to_iwpd(iwmr->ibmr.pd);
+	struct irdma_pble_alloc *palloc = &iwpbl->pble_alloc;
+	enum irdma_status_code status;
+	int err = 0;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_info = &cqp_request->info;
+	stag_info = &cqp_info->in.u.mr_reg_non_shared.info;
+	memset(stag_info, 0, sizeof(*stag_info));
+	stag_info->va = iwpbl->user_base;
+	stag_info->stag_idx = iwmr->stag >> IRDMA_CQPSQ_STAG_IDX_S;
+	stag_info->stag_key = (u8)iwmr->stag;
+	stag_info->total_len = iwmr->len;
+	stag_info->access_rights = irdma_get_mr_access(access);
+	stag_info->pd_id = iwpd->sc_pd.pd_id;
+	if (stag_info->access_rights & IRDMA_ACCESS_FLAGS_ZERO_BASED)
+		stag_info->addr_type = IRDMA_ADDR_TYPE_ZERO_BASED;
+	else
+		stag_info->addr_type = IRDMA_ADDR_TYPE_VA_BASED;
+	stag_info->page_size = iwmr->page_size;
+
+	if (iwpbl->pbl_allocated) {
+		if (palloc->level == PBLE_LEVEL_1) {
+			stag_info->first_pm_pbl_index = palloc->level1.idx;
+			stag_info->chunk_size = 1;
+		} else {
+			stag_info->first_pm_pbl_index = palloc->level2.root.idx;
+			stag_info->chunk_size = 3;
+		}
+	} else {
+		stag_info->reg_addr_pa = iwmr->pgaddrmem[0];
+	}
+
+	cqp_info->cqp_cmd = IRDMA_OP_MR_REG_NON_SHARED;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.mr_reg_non_shared.dev = &iwdev->rf->sc_dev;
+	cqp_info->in.u.mr_reg_non_shared.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+	if (status)
+		err = -ENOMEM;
+
+	return err;
+}
+
+/**
+ * irdma_reg_user_mr - Register a user memory region
+ * @pd: ptr of pd
+ * @start: virtual start address
+ * @len: length of mr
+ * @virt: virtual address
+ * @access: access of mr
+ * @udata: user data
+ */
+static struct ib_mr *irdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 len,
+				       u64 virt, int access,
+				       struct ib_udata *udata)
+{
+	struct irdma_device *iwdev = to_iwdev(pd->device);
+	struct irdma_ucontext *ucontext;
+	struct irdma_pble_alloc *palloc;
+	struct irdma_pbl *iwpbl;
+	struct irdma_mr *iwmr;
+	struct ib_umem *region;
+	struct irdma_mem_reg_req req;
+	u32 stag = 0;
+	bool use_pbles = false;
+	unsigned long flags;
+	int err = -EINVAL;
+	int ret;
+
+	if (len > iwdev->rf->sc_dev.hw_attrs.max_mr_size)
+		return ERR_PTR(-EINVAL);
+
+	region = ib_umem_get(pd->device, start, len, access);
+
+	if (IS_ERR(region)) {
+		ibdev_dbg(&iwdev->ibdev,
+			  "VERBS: Failed to create ib_umem region\n");
+		return (struct ib_mr *)region;
+	}
+
+	if (ib_copy_from_udata(&req, udata, min(sizeof(req), udata->inlen))) {
+		ib_umem_release(region);
+		return ERR_PTR(-EFAULT);
+	}
+
+	iwmr = kzalloc(sizeof(*iwmr), GFP_KERNEL);
+	if (!iwmr) {
+		ib_umem_release(region);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	iwpbl = &iwmr->iwpbl;
+	iwpbl->iwmr = iwmr;
+	iwmr->region = region;
+	iwmr->ibmr.pd = pd;
+	iwmr->ibmr.device = pd->device;
+	iwmr->ibmr.iova = virt;
+	iwmr->page_size = PAGE_SIZE;
+
+	if (req.reg_type == IW_MEMREG_TYPE_MEM)
+		iwmr->page_size = ib_umem_find_best_pgsz(region,
+							 SZ_4K | SZ_2M | SZ_1G,
+							 virt);
+	iwmr->len = region->length;
+	iwpbl->user_base = virt;
+	palloc = &iwpbl->pble_alloc;
+	iwmr->type = req.reg_type;
+	iwmr->page_cnt = ib_umem_num_dma_blocks(region, iwmr->page_size);
+
+	switch (req.reg_type) {
+	case IW_MEMREG_TYPE_QP:
+		use_pbles = ((req.sq_pages + req.rq_pages) > 2);
+		err = irdma_handle_q_mem(iwdev, &req, iwpbl, use_pbles);
+		if (err)
+			goto error;
+
+		ucontext = rdma_udata_to_drv_context(udata, struct irdma_ucontext,
+						     ibucontext);
+		spin_lock_irqsave(&ucontext->qp_reg_mem_list_lock, flags);
+		list_add_tail(&iwpbl->list, &ucontext->qp_reg_mem_list);
+		iwpbl->on_list = true;
+		spin_unlock_irqrestore(&ucontext->qp_reg_mem_list_lock, flags);
+		break;
+	case IW_MEMREG_TYPE_CQ:
+		use_pbles = (req.cq_pages > 1);
+		err = irdma_handle_q_mem(iwdev, &req, iwpbl, use_pbles);
+		if (err)
+			goto error;
+
+		ucontext = rdma_udata_to_drv_context(udata, struct irdma_ucontext,
+						     ibucontext);
+		spin_lock_irqsave(&ucontext->cq_reg_mem_list_lock, flags);
+		list_add_tail(&iwpbl->list, &ucontext->cq_reg_mem_list);
+		iwpbl->on_list = true;
+		spin_unlock_irqrestore(&ucontext->cq_reg_mem_list_lock, flags);
+		break;
+	case IW_MEMREG_TYPE_MEM:
+		use_pbles = (iwmr->page_cnt != 1);
+
+		err = irdma_setup_pbles(iwdev->rf, iwmr, use_pbles);
+		if (err)
+			goto error;
+
+		if (use_pbles) {
+			ret = irdma_check_mr_contiguous(palloc,
+							iwmr->page_size);
+			if (ret) {
+				irdma_free_pble(iwdev->rf->pble_rsrc, palloc);
+				iwpbl->pbl_allocated = false;
+			}
+		}
+
+		stag = irdma_create_stag(iwdev);
+		if (!stag) {
+			err = -ENOMEM;
+			goto error;
+		}
+
+		iwmr->stag = stag;
+		iwmr->ibmr.rkey = stag;
+		iwmr->ibmr.lkey = stag;
+		err = irdma_hwreg_mr(iwdev, iwmr, access);
+		if (err) {
+			irdma_free_stag(iwdev, stag);
+			goto error;
+		}
+
+		break;
+	default:
+		goto error;
+	}
+
+	iwmr->type = req.reg_type;
+
+	return &iwmr->ibmr;
+
+error:
+	if (palloc->level != PBLE_LEVEL_0 && iwpbl->pbl_allocated)
+		irdma_free_pble(iwdev->rf->pble_rsrc, palloc);
+	ib_umem_release(region);
+	kfree(iwmr);
+
+	return ERR_PTR(err);
+}
+
+/**
+ * irdma_reg_phys_mr - register kernel physical memory
+ * @pd: ibpd pointer
+ * @addr: physical address of memory to register
+ * @size: size of memory to register
+ * @access: Access rights
+ * @iova_start: start of virtual address for physical buffers
+ */
+struct ib_mr *irdma_reg_phys_mr(struct ib_pd *pd, u64 addr, u64 size, int access,
+				u64 *iova_start)
+{
+	struct irdma_device *iwdev = to_iwdev(pd->device);
+	struct irdma_pbl *iwpbl;
+	struct irdma_mr *iwmr;
+	enum irdma_status_code status;
+	u32 stag;
+	int ret;
+
+	iwmr = kzalloc(sizeof(*iwmr), GFP_KERNEL);
+	if (!iwmr)
+		return ERR_PTR(-ENOMEM);
+
+	iwmr->ibmr.pd = pd;
+	iwmr->ibmr.device = pd->device;
+	iwpbl = &iwmr->iwpbl;
+	iwpbl->iwmr = iwmr;
+	iwmr->type = IW_MEMREG_TYPE_MEM;
+	iwpbl->user_base = *iova_start;
+	stag = irdma_create_stag(iwdev);
+	if (!stag) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	iwmr->stag = stag;
+	iwmr->ibmr.iova = *iova_start;
+	iwmr->ibmr.rkey = stag;
+	iwmr->ibmr.lkey = stag;
+	iwmr->page_cnt = 1;
+	iwmr->pgaddrmem[0] = addr;
+	iwmr->len = size;
+	iwmr->page_size = SZ_4K;
+	status = irdma_hwreg_mr(iwdev, iwmr, access);
+	if (status) {
+		irdma_free_stag(iwdev, stag);
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	return &iwmr->ibmr;
+
+err:
+	kfree(iwmr);
+
+	return ERR_PTR(ret);
+}
+
+/**
+ * irdma_get_dma_mr - register physical mem
+ * @pd: ptr of pd
+ * @acc: access for memory
+ */
+static struct ib_mr *irdma_get_dma_mr(struct ib_pd *pd, int acc)
+{
+	u64 kva = 0;
+
+	return irdma_reg_phys_mr(pd, 0, 0, acc, &kva);
+}
+
+/**
+ * irdma_del_memlist - Deleting pbl list entries for CQ/QP
+ * @iwmr: iwmr for IB's user page addresses
+ * @ucontext: ptr to user context
+ */
+static void irdma_del_memlist(struct irdma_mr *iwmr,
+			      struct irdma_ucontext *ucontext)
+{
+	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+	unsigned long flags;
+
+	switch (iwmr->type) {
+	case IW_MEMREG_TYPE_CQ:
+		spin_lock_irqsave(&ucontext->cq_reg_mem_list_lock, flags);
+		if (iwpbl->on_list) {
+			iwpbl->on_list = false;
+			list_del(&iwpbl->list);
+		}
+		spin_unlock_irqrestore(&ucontext->cq_reg_mem_list_lock, flags);
+		break;
+	case IW_MEMREG_TYPE_QP:
+		spin_lock_irqsave(&ucontext->qp_reg_mem_list_lock, flags);
+		if (iwpbl->on_list) {
+			iwpbl->on_list = false;
+			list_del(&iwpbl->list);
+		}
+		spin_unlock_irqrestore(&ucontext->qp_reg_mem_list_lock, flags);
+		break;
+	default:
+		break;
+	}
+}
+
+/**
+ * irdma_dereg_mr - deregister mr
+ * @ib_mr: mr ptr for dereg
+ * @udata: user data
+ */
+static int irdma_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
+{
+	struct ib_pd *ibpd = ib_mr->pd;
+	struct irdma_pd *iwpd = to_iwpd(ibpd);
+	struct irdma_mr *iwmr = to_iwmr(ib_mr);
+	struct irdma_device *iwdev = to_iwdev(ib_mr->device);
+	struct irdma_dealloc_stag_info *info;
+	struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+	struct irdma_pble_alloc *palloc = &iwpbl->pble_alloc;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+
+	if (iwmr->type != IW_MEMREG_TYPE_MEM) {
+		if (iwmr->region) {
+			struct irdma_ucontext *ucontext;
+
+			ucontext = rdma_udata_to_drv_context(udata,
+						struct irdma_ucontext,
+						ibucontext);
+			irdma_del_memlist(iwmr, ucontext);
+		}
+		goto done;
+	}
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_info = &cqp_request->info;
+	info = &cqp_info->in.u.dealloc_stag.info;
+	memset(info, 0, sizeof(*info));
+	info->pd_id = iwpd->sc_pd.pd_id & 0x00007fff;
+	info->stag_idx = ib_mr->rkey >> IRDMA_CQPSQ_STAG_IDX_S;
+	info->mr = true;
+	if (iwpbl->pbl_allocated)
+		info->dealloc_pbl = true;
+
+	cqp_info->cqp_cmd = IRDMA_OP_DEALLOC_STAG;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.dealloc_stag.dev = &iwdev->rf->sc_dev;
+	cqp_info->in.u.dealloc_stag.scratch = (uintptr_t)cqp_request;
+	irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+	irdma_free_stag(iwdev, iwmr->stag);
+done:
+	if (iwpbl->pbl_allocated)
+		irdma_free_pble(iwdev->rf->pble_rsrc, palloc);
+	ib_umem_release(iwmr->region);
+	kfree(iwmr);
+
+	return 0;
+}
+
+/**
+ * irdma_copy_sg_list - copy sg list for qp
+ * @sg_list: copied into sg_list
+ * @sgl: copy from sgl
+ * @num_sges: count of sg entries
+ */
+static void irdma_copy_sg_list(struct irdma_sge *sg_list, struct ib_sge *sgl,
+			       int num_sges)
+{
+	unsigned int i;
+
+	for (i = 0; (i < num_sges) && (i < IRDMA_MAX_WQ_FRAGMENT_COUNT); i++) {
+		sg_list[i].tag_off = sgl[i].addr;
+		sg_list[i].len = sgl[i].length;
+		sg_list[i].stag = sgl[i].lkey;
+	}
+}
+
+/**
+ * irdma_post_send -  kernel application wr
+ * @ibqp: qp ptr for wr
+ * @ib_wr: work request ptr
+ * @bad_wr: return of bad wr if err
+ */
+static int irdma_post_send(struct ib_qp *ibqp,
+			   const struct ib_send_wr *ib_wr,
+			   const struct ib_send_wr **bad_wr)
+{
+	struct irdma_qp *iwqp;
+	struct irdma_qp_uk *ukqp;
+	struct irdma_sc_dev *dev;
+	struct irdma_post_sq_info info;
+	enum irdma_status_code ret;
+	int err = 0;
+	unsigned long flags;
+	bool inv_stag;
+	struct irdma_ah *ah;
+	bool reflush = false;
+
+	iwqp = to_iwqp(ibqp);
+	ukqp = &iwqp->sc_qp.qp_uk;
+	dev = &iwqp->iwdev->rf->sc_dev;
+
+	spin_lock_irqsave(&iwqp->lock, flags);
+	if (iwqp->flush_issued && ukqp->sq_flush_complete)
+		reflush = true;
+	while (ib_wr) {
+		memset(&info, 0, sizeof(info));
+		inv_stag = false;
+		info.wr_id = (ib_wr->wr_id);
+		if ((ib_wr->send_flags & IB_SEND_SIGNALED) || iwqp->sig_all)
+			info.signaled = true;
+		if (ib_wr->send_flags & IB_SEND_FENCE)
+			info.read_fence = true;
+		switch (ib_wr->opcode) {
+		case IB_WR_SEND_WITH_IMM:
+			if (ukqp->qp_caps & IRDMA_SEND_WITH_IMM) {
+				info.imm_data_valid = true;
+				info.imm_data = ntohl(ib_wr->ex.imm_data);
+			} else {
+				err = -EINVAL;
+				break;
+			}
+			fallthrough;
+		case IB_WR_SEND:
+		case IB_WR_SEND_WITH_INV:
+			if (ib_wr->opcode == IB_WR_SEND ||
+			    ib_wr->opcode == IB_WR_SEND_WITH_IMM) {
+				if (ib_wr->send_flags & IB_SEND_SOLICITED)
+					info.op_type = IRDMA_OP_TYPE_SEND_SOL;
+				else
+					info.op_type = IRDMA_OP_TYPE_SEND;
+			} else {
+				if (ib_wr->send_flags & IB_SEND_SOLICITED)
+					info.op_type = IRDMA_OP_TYPE_SEND_SOL_INV;
+				else
+					info.op_type = IRDMA_OP_TYPE_SEND_INV;
+				info.stag_to_inv = ib_wr->ex.invalidate_rkey;
+			}
+
+			if (ib_wr->send_flags & IB_SEND_INLINE) {
+				info.op.inline_send.data = (void *)(unsigned long)
+							   ib_wr->sg_list[0].addr;
+				info.op.inline_send.len = ib_wr->sg_list[0].length;
+				if (iwqp->ibqp.qp_type == IB_QPT_UD ||
+				    iwqp->ibqp.qp_type == IB_QPT_GSI) {
+					ah = to_iwah(ud_wr(ib_wr)->ah);
+					info.op.inline_send.ah_id = ah->sc_ah.ah_info.ah_idx;
+					info.op.inline_send.qkey = ud_wr(ib_wr)->remote_qkey;
+					info.op.inline_send.dest_qp = ud_wr(ib_wr)->remote_qpn;
+				}
+				ret = ukqp->qp_ops.iw_inline_send(ukqp, &info,
+								  false);
+			} else {
+				info.op.send.num_sges = ib_wr->num_sge;
+				info.op.send.sg_list = (struct irdma_sge *)
+						       ib_wr->sg_list;
+				if (iwqp->ibqp.qp_type == IB_QPT_UD ||
+				    iwqp->ibqp.qp_type == IB_QPT_GSI) {
+					ah = to_iwah(ud_wr(ib_wr)->ah);
+					info.op.send.ah_id = ah->sc_ah.ah_info.ah_idx;
+					info.op.send.qkey = ud_wr(ib_wr)->remote_qkey;
+					info.op.send.dest_qp = ud_wr(ib_wr)->remote_qpn;
+				}
+				ret = ukqp->qp_ops.iw_send(ukqp, &info, false);
+			}
+
+			if (ret) {
+				if (ret == IRDMA_ERR_QP_TOOMANY_WRS_POSTED)
+					err = -ENOMEM;
+				else
+					err = -EINVAL;
+			}
+			break;
+		case IB_WR_RDMA_WRITE_WITH_IMM:
+			if (ukqp->qp_caps & IRDMA_WRITE_WITH_IMM) {
+				info.imm_data_valid = true;
+				info.imm_data = ntohl(ib_wr->ex.imm_data);
+			} else {
+				err = -EINVAL;
+				break;
+			}
+			fallthrough;
+		case IB_WR_RDMA_WRITE:
+			if (ib_wr->send_flags & IB_SEND_SOLICITED)
+				info.op_type = IRDMA_OP_TYPE_RDMA_WRITE_SOL;
+			else
+				info.op_type = IRDMA_OP_TYPE_RDMA_WRITE;
+
+			if (ib_wr->send_flags & IB_SEND_INLINE) {
+				info.op.inline_rdma_write.data = (void *)(uintptr_t)ib_wr->sg_list[0].addr;
+				info.op.inline_rdma_write.len = ib_wr->sg_list[0].length;
+				info.op.inline_rdma_write.rem_addr.tag_off = rdma_wr(ib_wr)->remote_addr;
+				info.op.inline_rdma_write.rem_addr.stag = rdma_wr(ib_wr)->rkey;
+				ret = ukqp->qp_ops.iw_inline_rdma_write(ukqp, &info, false);
+			} else {
+				info.op.rdma_write.lo_sg_list = (void *)ib_wr->sg_list;
+				info.op.rdma_write.num_lo_sges = ib_wr->num_sge;
+				info.op.rdma_write.rem_addr.tag_off = rdma_wr(ib_wr)->remote_addr;
+				info.op.rdma_write.rem_addr.stag = rdma_wr(ib_wr)->rkey;
+				ret = ukqp->qp_ops.iw_rdma_write(ukqp, &info, false);
+			}
+
+			if (ret) {
+				if (ret == IRDMA_ERR_QP_TOOMANY_WRS_POSTED)
+					err = -ENOMEM;
+				else
+					err = -EINVAL;
+			}
+			break;
+		case IB_WR_RDMA_READ_WITH_INV:
+			inv_stag = true;
+			fallthrough;
+		case IB_WR_RDMA_READ:
+			if (ib_wr->num_sge >
+			    dev->hw_attrs.uk_attrs.max_hw_read_sges) {
+				err = -EINVAL;
+				break;
+			}
+			info.op_type = IRDMA_OP_TYPE_RDMA_READ;
+			info.op.rdma_read.rem_addr.tag_off = rdma_wr(ib_wr)->remote_addr;
+			info.op.rdma_read.rem_addr.stag = rdma_wr(ib_wr)->rkey;
+			info.op.rdma_read.lo_sg_list = (void *)ib_wr->sg_list;
+			info.op.rdma_read.num_lo_sges = ib_wr->num_sge;
+
+			ret = ukqp->qp_ops.iw_rdma_read(ukqp, &info, inv_stag,
+							false);
+			if (ret) {
+				if (ret == IRDMA_ERR_QP_TOOMANY_WRS_POSTED)
+					err = -ENOMEM;
+				else
+					err = -EINVAL;
+			}
+			break;
+		case IB_WR_LOCAL_INV:
+			info.op_type = IRDMA_OP_TYPE_INV_STAG;
+			info.op.inv_local_stag.target_stag = ib_wr->ex.invalidate_rkey;
+			ret = ukqp->qp_ops.iw_stag_local_invalidate(ukqp, &info, true);
+			if (ret)
+				err = -ENOMEM;
+			break;
+		case IB_WR_REG_MR: {
+			struct irdma_mr *iwmr = to_iwmr(reg_wr(ib_wr)->mr);
+			struct irdma_pble_alloc *palloc = &iwmr->iwpbl.pble_alloc;
+			struct irdma_fast_reg_stag_info stag_info = {};
+
+			stag_info.signaled = info.signaled;
+			stag_info.read_fence = info.read_fence;
+			stag_info.access_rights = irdma_get_mr_access(reg_wr(ib_wr)->access);
+			stag_info.stag_key = reg_wr(ib_wr)->key & 0xff;
+			stag_info.stag_idx = reg_wr(ib_wr)->key >> 8;
+			stag_info.page_size = reg_wr(ib_wr)->mr->page_size;
+			stag_info.wr_id = ib_wr->wr_id;
+			stag_info.addr_type = IRDMA_ADDR_TYPE_VA_BASED;
+			stag_info.va = (void *)(uintptr_t)iwmr->ibmr.iova;
+			stag_info.total_len = iwmr->ibmr.length;
+			stag_info.reg_addr_pa = *((u64 *)(uintptr_t)palloc->level1.addr);
+			stag_info.first_pm_pbl_index = palloc->level1.idx;
+			stag_info.local_fence = ib_wr->send_flags & IB_SEND_FENCE;
+			if (iwmr->npages > IRDMA_MIN_PAGES_PER_FMR)
+				stag_info.chunk_size = 1;
+			ret = dev->iw_priv_qp_ops->iw_mr_fast_register(&iwqp->sc_qp,
+								       &stag_info,
+								       true);
+			if (ret)
+				err = -ENOMEM;
+			break;
+		}
+		default:
+			err = -EINVAL;
+			ibdev_dbg(&iwqp->iwdev->ibdev,
+				  "VERBS: upost_send bad opcode = 0x%x\n",
+				  ib_wr->opcode);
+			break;
+		}
+
+		if (err)
+			break;
+		ib_wr = ib_wr->next;
+	}
+
+	if (!iwqp->flush_issued && iwqp->hw_iwarp_state <= IRDMA_QP_STATE_RTS) {
+		ukqp->qp_ops.iw_qp_post_wr(ukqp);
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+	} else if (reflush) {
+		ukqp->sq_flush_complete = false;
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+		irdma_flush_wqes(iwqp, IRDMA_FLUSH_SQ | IRDMA_REFLUSH);
+	} else {
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+	}
+	if (err)
+		*bad_wr = ib_wr;
+
+	return err;
+}
+
+/**
+ * irdma_post_recv - post receive wr for kernel application
+ * @ibqp: ib qp pointer
+ * @ib_wr: work request for receive
+ * @bad_wr: bad wr caused an error
+ */
+static int irdma_post_recv(struct ib_qp *ibqp,
+			   const struct ib_recv_wr *ib_wr,
+			   const struct ib_recv_wr **bad_wr)
+{
+	struct irdma_qp *iwqp;
+	struct irdma_qp_uk *ukqp;
+	struct irdma_post_rq_info post_recv = {};
+	struct irdma_sge sg_list[IRDMA_MAX_WQ_FRAGMENT_COUNT];
+	enum irdma_status_code ret = 0;
+	unsigned long flags;
+	int err = 0;
+	bool reflush = false;
+
+	iwqp = to_iwqp(ibqp);
+	ukqp = &iwqp->sc_qp.qp_uk;
+
+	spin_lock_irqsave(&iwqp->lock, flags);
+	if (iwqp->flush_issued && ukqp->rq_flush_complete)
+		reflush = true;
+	while (ib_wr) {
+		post_recv.num_sges = ib_wr->num_sge;
+		post_recv.wr_id = ib_wr->wr_id;
+		irdma_copy_sg_list(sg_list, ib_wr->sg_list, ib_wr->num_sge);
+		post_recv.sg_list = sg_list;
+		ret = ukqp->qp_ops.iw_post_receive(ukqp, &post_recv);
+		if (ret) {
+			ibdev_dbg(&iwqp->iwdev->ibdev,
+				  "VERBS: post_recv err %d\n", ret);
+			if (ret == IRDMA_ERR_QP_TOOMANY_WRS_POSTED)
+				err = -ENOMEM;
+			else
+				err = -EINVAL;
+			goto out;
+		}
+
+		ib_wr = ib_wr->next;
+	}
+
+out:
+	if (reflush) {
+		ukqp->rq_flush_complete = false;
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+		irdma_flush_wqes(iwqp, IRDMA_FLUSH_RQ | IRDMA_REFLUSH);
+	} else {
+		spin_unlock_irqrestore(&iwqp->lock, flags);
+	}
+
+	if (err)
+		*bad_wr = ib_wr;
+
+	return err;
+}
+
+/**
+ * irdma_flush_err_to_ib_wc_status - return change flush error code to IB status
+ * @opcode: iwarp flush code
+ */
+static enum ib_wc_status irdma_flush_err_to_ib_wc_status(enum irdma_flush_opcode opcode)
+{
+	switch (opcode) {
+	case FLUSH_PROT_ERR:
+		return IB_WC_LOC_PROT_ERR;
+	case FLUSH_REM_ACCESS_ERR:
+		return IB_WC_REM_ACCESS_ERR;
+	case FLUSH_LOC_QP_OP_ERR:
+		return IB_WC_LOC_QP_OP_ERR;
+	case FLUSH_REM_OP_ERR:
+		return IB_WC_REM_OP_ERR;
+	case FLUSH_LOC_LEN_ERR:
+		return IB_WC_LOC_LEN_ERR;
+	case FLUSH_GENERAL_ERR:
+		return IB_WC_WR_FLUSH_ERR;
+	case FLUSH_FATAL_ERR:
+	default:
+		return IB_WC_FATAL_ERR;
+	}
+}
+
+/**
+ * irdma_process_cqe - process cqe info
+ * @entry: processed cqe
+ * @cq_poll_info: cqe info
+ */
+static void irdma_process_cqe(struct ib_wc *entry,
+			      struct irdma_cq_poll_info *cq_poll_info)
+{
+	struct irdma_qp *iwqp;
+	struct irdma_sc_qp *qp;
+
+	entry->wc_flags = 0;
+	entry->pkey_index = 0;
+	entry->wr_id = cq_poll_info->wr_id;
+
+	qp = cq_poll_info->qp_handle;
+	iwqp = qp->qp_uk.back_qp;
+	entry->qp = qp->qp_uk.back_qp;
+
+	if (cq_poll_info->error) {
+		entry->status = (cq_poll_info->comp_status == IRDMA_COMPL_STATUS_FLUSHED) ?
+				irdma_flush_err_to_ib_wc_status(cq_poll_info->minor_err) : IB_WC_GENERAL_ERR;
+
+		entry->vendor_err = cq_poll_info->major_err << 16 |
+				    cq_poll_info->minor_err;
+	} else {
+		entry->status = IB_WC_SUCCESS;
+		if (cq_poll_info->imm_valid) {
+			entry->ex.imm_data = htonl(cq_poll_info->imm_data);
+			entry->wc_flags |= IB_WC_WITH_IMM;
+		}
+		if (cq_poll_info->ud_smac_valid) {
+			ether_addr_copy(entry->smac, cq_poll_info->ud_smac);
+			entry->wc_flags |= IB_WC_WITH_SMAC;
+		}
+
+		if (cq_poll_info->ud_vlan_valid) {
+			entry->vlan_id = cq_poll_info->ud_vlan & VLAN_VID_MASK;
+			entry->wc_flags |= IB_WC_WITH_VLAN;
+			entry->sl = cq_poll_info->ud_vlan >> VLAN_PRIO_SHIFT;
+		} else {
+			entry->sl = 0;
+		}
+	}
+
+	switch (cq_poll_info->op_type) {
+	case IRDMA_OP_TYPE_RDMA_WRITE:
+	case IRDMA_OP_TYPE_RDMA_WRITE_SOL:
+		entry->opcode = IB_WC_RDMA_WRITE;
+		break;
+	case IRDMA_OP_TYPE_RDMA_READ_INV_STAG:
+	case IRDMA_OP_TYPE_RDMA_READ:
+		entry->opcode = IB_WC_RDMA_READ;
+		break;
+	case IRDMA_OP_TYPE_SEND_INV:
+	case IRDMA_OP_TYPE_SEND_SOL:
+	case IRDMA_OP_TYPE_SEND_SOL_INV:
+	case IRDMA_OP_TYPE_SEND:
+		entry->opcode = IB_WC_SEND;
+		break;
+	case IRDMA_OP_TYPE_FAST_REG_NSMR:
+		entry->opcode = IB_WC_REG_MR;
+		break;
+	case IRDMA_OP_TYPE_INV_STAG:
+		entry->opcode = IB_WC_LOCAL_INV;
+		break;
+	case IRDMA_OP_TYPE_REC_IMM:
+	case IRDMA_OP_TYPE_REC:
+		entry->opcode = cq_poll_info->op_type == IRDMA_OP_TYPE_REC_IMM ?
+			IB_WC_RECV_RDMA_WITH_IMM : IB_WC_RECV;
+		if (qp->qp_uk.qp_type != IRDMA_QP_TYPE_ROCE_UD &&
+		    cq_poll_info->stag_invalid_set) {
+			entry->ex.invalidate_rkey = cq_poll_info->inv_stag;
+			entry->wc_flags |= IB_WC_WITH_INVALIDATE;
+		}
+		break;
+	default:
+		ibdev_err(&iwqp->iwdev->ibdev,
+			  "Invalid opcode = %d in CQE\n", cq_poll_info->op_type);
+		entry->status = IB_WC_GENERAL_ERR;
+		return;
+	}
+
+	if (qp->qp_uk.qp_type == IRDMA_QP_TYPE_ROCE_UD) {
+		entry->src_qp = cq_poll_info->ud_src_qpn;
+		entry->slid = 0;
+		entry->wc_flags |=
+			(IB_WC_GRH | IB_WC_WITH_NETWORK_HDR_TYPE);
+		entry->network_hdr_type = cq_poll_info->ipv4 ?
+						  RDMA_NETWORK_IPV4 :
+						  RDMA_NETWORK_IPV6;
+	} else {
+		entry->src_qp = cq_poll_info->qp_id;
+	}
+
+		entry->byte_len = cq_poll_info->bytes_xfered;
+}
+
+/**
+ * irdma_poll_one - poll one entry of the CQ
+ * @ukcq: ukcq to poll
+ * @cur_cqe: current CQE info to be filled in
+ * @entry: ibv_wc object to be filled for non-extended CQ or NULL for extended CQ
+ *
+ * Returns the internal irdma device error code or 0 on success
+ */
+static inline int irdma_poll_one(struct irdma_cq_uk *ukcq,
+				 struct irdma_cq_poll_info *cur_cqe,
+				 struct ib_wc *entry)
+{
+	int ret = ukcq->ops.iw_cq_poll_cmpl(ukcq, cur_cqe);
+
+	if (ret)
+		return ret;
+
+	irdma_process_cqe(entry, cur_cqe);
+
+	return 0;
+}
+
+/**
+ * __irdma_poll_cq - poll cq for completion (kernel apps)
+ * @iwcq: cq to poll
+ * @num_entries: number of entries to poll
+ * @entry: wr of a completed entry
+ */
+static int __irdma_poll_cq(struct irdma_cq *iwcq, int num_entries, struct ib_wc *entry)
+{
+	struct list_head *tmp_node, *list_node;
+	struct irdma_cq_buf *last_buf = NULL;
+	struct irdma_cq_poll_info *cur_cqe = &iwcq->cur_cqe;
+	struct irdma_cq_buf *cq_buf;
+	enum irdma_status_code ret;
+	struct irdma_device *iwdev;
+	struct irdma_cq_uk *ukcq;
+	bool cq_new_cqe = false;
+	int resized_bufs = 0;
+	int npolled = 0;
+
+	iwdev = to_iwdev(iwcq->ibcq.device);
+	ukcq = &iwcq->sc_cq.cq_uk;
+
+	/* go through the list of previously resized CQ buffers */
+	list_for_each_safe(list_node, tmp_node, &iwcq->resize_list) {
+		cq_buf = container_of(list_node, struct irdma_cq_buf, list);
+		while (npolled < num_entries) {
+			ret = irdma_poll_one(&cq_buf->cq_uk, cur_cqe, entry + npolled);
+			if (!ret) {
+				++npolled;
+				cq_new_cqe = true;
+				continue;
+			}
+			if (ret == IRDMA_ERR_Q_EMPTY)
+				break;
+			 /* QP using the CQ is destroyed. Skip reporting this CQE */
+			if (ret == IRDMA_ERR_Q_DESTROYED) {
+				cq_new_cqe = true;
+				continue;
+			}
+			goto error;
+		}
+
+		/* save the resized CQ buffer which received the last cqe */
+		if (cq_new_cqe)
+			last_buf = cq_buf;
+		cq_new_cqe = false;
+	}
+
+	/* check the current CQ for new cqes */
+	while (npolled < num_entries) {
+		ret = irdma_poll_one(ukcq, cur_cqe, entry + npolled);
+		if (!ret) {
+			++npolled;
+			cq_new_cqe = true;
+			continue;
+		}
+
+		if (ret == IRDMA_ERR_Q_EMPTY)
+			break;
+		/* QP using the CQ is destroyed. Skip reporting this CQE */
+		if (ret == IRDMA_ERR_Q_DESTROYED) {
+			cq_new_cqe = true;
+			continue;
+		}
+		goto error;
+	}
+
+	if (cq_new_cqe)
+		/* all previous CQ resizes are complete */
+		resized_bufs = irdma_process_resize_list(iwcq, iwdev, NULL);
+	else if (last_buf)
+		/* only CQ resizes up to the last_buf are complete */
+		resized_bufs = irdma_process_resize_list(iwcq, iwdev, last_buf);
+	if (resized_bufs)
+		/* report to the HW the number of complete CQ resizes */
+		ukcq->ops.iw_cq_set_resized_cnt(ukcq, resized_bufs);
+
+	return npolled;
+error:
+	ibdev_dbg(&iwdev->ibdev, "%s: Error polling CQ, irdma_err: %d\n",
+		  __func__, ret);
+
+	return -EINVAL;
+}
+
+/**
+ * irdma_poll_cq - poll cq for completion (kernel apps)
+ * @ibcq: cq to poll
+ * @num_entries: number of entries to poll
+ * @entry: wr of a completed entry
+ */
+static int irdma_poll_cq(struct ib_cq *ibcq, int num_entries,
+			 struct ib_wc *entry)
+{
+	struct irdma_cq *iwcq;
+	unsigned long flags;
+	int ret;
+
+	iwcq = to_iwcq(ibcq);
+
+	spin_lock_irqsave(&iwcq->lock, flags);
+	ret = __irdma_poll_cq(iwcq, num_entries, entry);
+	spin_unlock_irqrestore(&iwcq->lock, flags);
+
+	return ret;
+}
+
+/**
+ * irdma_req_notify_cq - arm cq kernel application
+ * @ibcq: cq to arm
+ * @notify_flags: notofication flags
+ */
+static int irdma_req_notify_cq(struct ib_cq *ibcq,
+			       enum ib_cq_notify_flags notify_flags)
+{
+	struct irdma_cq *iwcq;
+	struct irdma_cq_uk *ukcq;
+	unsigned long flags;
+	enum irdma_cmpl_notify cq_notify = IRDMA_CQ_COMPL_EVENT;
+
+	iwcq = to_iwcq(ibcq);
+	ukcq = &iwcq->sc_cq.cq_uk;
+	if (notify_flags == IB_CQ_SOLICITED)
+		cq_notify = IRDMA_CQ_COMPL_SOLICITED;
+	spin_lock_irqsave(&iwcq->lock, flags);
+	ukcq->ops.iw_cq_request_notification(ukcq, cq_notify);
+	spin_unlock_irqrestore(&iwcq->lock, flags);
+
+	return 0;
+}
+
+static int irdma_roce_port_immutable(struct ib_device *ibdev, u32 port_num,
+				     struct ib_port_immutable *immutable)
+{
+	struct ib_port_attr attr;
+	int err;
+
+	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
+	err = ib_query_port(ibdev, port_num, &attr);
+	if (err)
+		return err;
+
+	immutable->max_mad_size = IB_MGMT_MAD_SIZE;
+	immutable->pkey_tbl_len = attr.pkey_tbl_len;
+	immutable->gid_tbl_len = attr.gid_tbl_len;
+
+	return 0;
+}
+
+static int irdma_iw_port_immutable(struct ib_device *ibdev, u32 port_num,
+				   struct ib_port_immutable *immutable)
+{
+	struct ib_port_attr attr;
+	int err;
+
+	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;
+	err = ib_query_port(ibdev, port_num, &attr);
+	if (err)
+		return err;
+	immutable->gid_tbl_len = 1;
+
+	return 0;
+}
+
+static const char *const irdma_hw_stat_names[] = {
+	/* 32bit names */
+	[IRDMA_HW_STAT_INDEX_RXVLANERR] = "rxVlanErrors",
+	[IRDMA_HW_STAT_INDEX_IP4RXDISCARD] = "ip4InDiscards",
+	[IRDMA_HW_STAT_INDEX_IP4RXTRUNC] = "ip4InTruncatedPkts",
+	[IRDMA_HW_STAT_INDEX_IP4TXNOROUTE] = "ip4OutNoRoutes",
+	[IRDMA_HW_STAT_INDEX_IP6RXDISCARD] = "ip6InDiscards",
+	[IRDMA_HW_STAT_INDEX_IP6RXTRUNC] = "ip6InTruncatedPkts",
+	[IRDMA_HW_STAT_INDEX_IP6TXNOROUTE] = "ip6OutNoRoutes",
+	[IRDMA_HW_STAT_INDEX_TCPRTXSEG] = "tcpRetransSegs",
+	[IRDMA_HW_STAT_INDEX_TCPRXOPTERR] = "tcpInOptErrors",
+	[IRDMA_HW_STAT_INDEX_TCPRXPROTOERR] = "tcpInProtoErrors",
+	[IRDMA_HW_STAT_INDEX_RXRPCNPHANDLED] = "cnpHandled",
+	[IRDMA_HW_STAT_INDEX_RXRPCNPIGNORED] = "cnpIgnored",
+	[IRDMA_HW_STAT_INDEX_TXNPCNPSENT] = "cnpSent",
+
+	/* 64bit names */
+	[IRDMA_HW_STAT_INDEX_IP4RXOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4InOctets",
+	[IRDMA_HW_STAT_INDEX_IP4RXPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4InPkts",
+	[IRDMA_HW_STAT_INDEX_IP4RXFRAGS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4InReasmRqd",
+	[IRDMA_HW_STAT_INDEX_IP4RXMCOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4InMcastOctets",
+	[IRDMA_HW_STAT_INDEX_IP4RXMCPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4InMcastPkts",
+	[IRDMA_HW_STAT_INDEX_IP4TXOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4OutOctets",
+	[IRDMA_HW_STAT_INDEX_IP4TXPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4OutPkts",
+	[IRDMA_HW_STAT_INDEX_IP4TXFRAGS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4OutSegRqd",
+	[IRDMA_HW_STAT_INDEX_IP4TXMCOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4OutMcastOctets",
+	[IRDMA_HW_STAT_INDEX_IP4TXMCPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip4OutMcastPkts",
+	[IRDMA_HW_STAT_INDEX_IP6RXOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6InOctets",
+	[IRDMA_HW_STAT_INDEX_IP6RXPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6InPkts",
+	[IRDMA_HW_STAT_INDEX_IP6RXFRAGS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6InReasmRqd",
+	[IRDMA_HW_STAT_INDEX_IP6RXMCOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6InMcastOctets",
+	[IRDMA_HW_STAT_INDEX_IP6RXMCPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6InMcastPkts",
+	[IRDMA_HW_STAT_INDEX_IP6TXOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6OutOctets",
+	[IRDMA_HW_STAT_INDEX_IP6TXPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6OutPkts",
+	[IRDMA_HW_STAT_INDEX_IP6TXFRAGS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6OutSegRqd",
+	[IRDMA_HW_STAT_INDEX_IP6TXMCOCTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6OutMcastOctets",
+	[IRDMA_HW_STAT_INDEX_IP6TXMCPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"ip6OutMcastPkts",
+	[IRDMA_HW_STAT_INDEX_TCPRXSEGS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"tcpInSegs",
+	[IRDMA_HW_STAT_INDEX_TCPTXSEG + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"tcpOutSegs",
+	[IRDMA_HW_STAT_INDEX_RDMARXRDS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwInRdmaReads",
+	[IRDMA_HW_STAT_INDEX_RDMARXSNDS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwInRdmaSends",
+	[IRDMA_HW_STAT_INDEX_RDMARXWRS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwInRdmaWrites",
+	[IRDMA_HW_STAT_INDEX_RDMATXRDS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwOutRdmaReads",
+	[IRDMA_HW_STAT_INDEX_RDMATXSNDS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwOutRdmaSends",
+	[IRDMA_HW_STAT_INDEX_RDMATXWRS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwOutRdmaWrites",
+	[IRDMA_HW_STAT_INDEX_RDMAVBND + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwRdmaBnd",
+	[IRDMA_HW_STAT_INDEX_RDMAVINV + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"iwRdmaInv",
+	[IRDMA_HW_STAT_INDEX_UDPRXPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"RxUDP",
+	[IRDMA_HW_STAT_INDEX_UDPTXPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"TxUDP",
+	[IRDMA_HW_STAT_INDEX_RXNPECNMARKEDPKTS + IRDMA_HW_STAT_INDEX_MAX_32] =
+		"RxECNMrkd",
+};
+
+static void irdma_get_dev_fw_str(struct ib_device *dev, char *str)
+{
+	struct irdma_device *iwdev = to_iwdev(dev);
+
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%u.%u",
+		 irdma_fw_major_ver(&iwdev->rf->sc_dev),
+		 irdma_fw_minor_ver(&iwdev->rf->sc_dev));
+}
+
+/**
+ * irdma_alloc_hw_stats - Allocate a hw stats structure
+ * @ibdev: device pointer from stack
+ * @port_num: port number
+ */
+static struct rdma_hw_stats *irdma_alloc_hw_stats(struct ib_device *ibdev,
+						  u32 port_num)
+{
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+	struct irdma_sc_dev *dev = &iwdev->rf->sc_dev;
+	int num_counters = IRDMA_HW_STAT_INDEX_MAX_32 +
+			   IRDMA_HW_STAT_INDEX_MAX_64;
+	unsigned long lifespan = RDMA_HW_STATS_DEFAULT_LIFESPAN;
+
+	BUILD_BUG_ON(ARRAY_SIZE(irdma_hw_stat_names) !=
+		     (IRDMA_HW_STAT_INDEX_MAX_32 + IRDMA_HW_STAT_INDEX_MAX_64));
+
+	/*
+	 * PFs get the default update lifespan, but VFs only update once
+	 * per second
+	 */
+	if (!dev->privileged)
+		lifespan = 1000;
+
+	return rdma_alloc_hw_stats_struct(irdma_hw_stat_names, num_counters,
+					  lifespan);
+}
+
+/**
+ * irdma_get_hw_stats - Populates the rdma_hw_stats structure
+ * @ibdev: device pointer from stack
+ * @stats: stats pointer from stack
+ * @port_num: port number
+ * @index: which hw counter the stack is requesting we update
+ */
+static int irdma_get_hw_stats(struct ib_device *ibdev,
+			      struct rdma_hw_stats *stats, u32 port_num,
+			      int index)
+{
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+	struct irdma_dev_hw_stats *hw_stats = &iwdev->vsi.pestat->hw_stats;
+
+	if (iwdev->rf->rdma_ver >= IRDMA_GEN_2)
+		irdma_cqp_gather_stats_cmd(&iwdev->rf->sc_dev, iwdev->vsi.pestat, true);
+	else
+		irdma_cqp_gather_stats_gen1(&iwdev->rf->sc_dev, iwdev->vsi.pestat);
+
+	memcpy(&stats->value[0], hw_stats, sizeof(*hw_stats));
+
+	return stats->num_counters;
+}
+
+/**
+ * irdma_query_gid - Query port GID
+ * @ibdev: device pointer from stack
+ * @port: port number
+ * @index: Entry index
+ * @gid: Global ID
+ */
+static int irdma_query_gid(struct ib_device *ibdev, u32 port, int index,
+			   union ib_gid *gid)
+{
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+
+	memset(gid->raw, 0, sizeof(gid->raw));
+	ether_addr_copy(gid->raw, iwdev->netdev->dev_addr);
+
+	return 0;
+}
+
+/**
+ * mcast_list_add -  Add a new mcast item to list
+ * @rf: RDMA PCI function
+ * @new_elem: pointer to element to add
+ */
+static void mcast_list_add(struct irdma_pci_f *rf,
+			   struct mc_table_list *new_elem)
+{
+	list_add(&new_elem->list, &rf->mc_qht_list.list);
+}
+
+/**
+ * mcast_list_del - Remove an mcast item from list
+ * @mc_qht_elem: pointer to mcast table list element
+ */
+static void mcast_list_del(struct mc_table_list *mc_qht_elem)
+{
+	if (mc_qht_elem)
+		list_del(&mc_qht_elem->list);
+}
+
+/**
+ * mcast_list_lookup_ip - Search mcast list for address
+ * @rf: RDMA PCI function
+ * @ip_mcast: pointer to mcast IP address
+ */
+static struct mc_table_list *mcast_list_lookup_ip(struct irdma_pci_f *rf,
+						  u32 *ip_mcast)
+{
+	struct mc_table_list *mc_qht_el;
+	struct list_head *pos, *q;
+
+	list_for_each_safe (pos, q, &rf->mc_qht_list.list) {
+		mc_qht_el = list_entry(pos, struct mc_table_list, list);
+		if (!memcmp(mc_qht_el->mc_info.dest_ip, ip_mcast,
+			    sizeof(mc_qht_el->mc_info.dest_ip)))
+			return mc_qht_el;
+	}
+
+	return NULL;
+}
+
+/**
+ * irdma_mcast_cqp_op - perform a mcast cqp operation
+ * @iwdev: irdma device
+ * @mc_grp_ctx: mcast group info
+ * @op: operation
+ *
+ * returns error status
+ */
+static int irdma_mcast_cqp_op(struct irdma_device *iwdev,
+			      struct irdma_mcast_grp_info *mc_grp_ctx, u8 op)
+{
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_cqp_request *cqp_request;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_request->info.in.u.mc_create.info = *mc_grp_ctx;
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = op;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.mc_create.scratch = (uintptr_t)cqp_request;
+	cqp_info->in.u.mc_create.cqp = &iwdev->rf->cqp.sc_cqp;
+	status = irdma_handle_cqp_op(iwdev->rf, cqp_request);
+	irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request);
+	if (status)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/**
+ * irdma_mcast_mac - Get the multicast MAC for an IP address
+ * @ip_addr: IPv4 or IPv6 address
+ * @mac: pointer to result MAC address
+ * @ipv4: flag indicating IPv4 or IPv6
+ *
+ */
+void irdma_mcast_mac(u32 *ip_addr, u8 *mac, bool ipv4)
+{
+	u8 *ip = (u8 *)ip_addr;
+
+	if (ipv4) {
+		unsigned char mac4[ETH_ALEN] = {0x01, 0x00, 0x5E, 0x00,
+						0x00, 0x00};
+
+		mac4[3] = ip[2] & 0x7F;
+		mac4[4] = ip[1];
+		mac4[5] = ip[0];
+		ether_addr_copy(mac, mac4);
+	} else {
+		unsigned char mac6[ETH_ALEN] = {0x33, 0x33, 0x00, 0x00,
+						0x00, 0x00};
+
+		mac6[2] = ip[3];
+		mac6[3] = ip[2];
+		mac6[4] = ip[1];
+		mac6[5] = ip[0];
+		ether_addr_copy(mac, mac6);
+	}
+}
+
+/**
+ * irdma_attach_mcast - attach a qp to a multicast group
+ * @ibqp: ptr to qp
+ * @ibgid: pointer to global ID
+ * @lid: local ID
+ *
+ * returns error status
+ */
+static int irdma_attach_mcast(struct ib_qp *ibqp, union ib_gid *ibgid, u16 lid)
+{
+	struct irdma_qp *iwqp = to_iwqp(ibqp);
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct mc_table_list *mc_qht_elem;
+	struct irdma_mcast_grp_ctx_entry_info mcg_info = {};
+	unsigned long flags;
+	u32 ip_addr[4] = {};
+	u32 mgn;
+	u32 no_mgs;
+	int ret = 0;
+	bool ipv4;
+	u16 vlan_id;
+	union {
+		struct sockaddr saddr;
+		struct sockaddr_in saddr_in;
+		struct sockaddr_in6 saddr_in6;
+	} sgid_addr;
+	unsigned char dmac[ETH_ALEN];
+
+	rdma_gid2ip((struct sockaddr *)&sgid_addr, ibgid);
+
+	if (!ipv6_addr_v4mapped((struct in6_addr *)ibgid)) {
+		irdma_copy_ip_ntohl(ip_addr,
+				    sgid_addr.saddr_in6.sin6_addr.in6_u.u6_addr32);
+		irdma_netdev_vlan_ipv6(ip_addr, &vlan_id, NULL);
+		ipv4 = false;
+		ibdev_dbg(&iwdev->ibdev,
+			  "VERBS: qp_id=%d, IP6address=%pI6\n", ibqp->qp_num,
+			  ip_addr);
+		irdma_mcast_mac(ip_addr, dmac, false);
+	} else {
+		ip_addr[0] = ntohl(sgid_addr.saddr_in.sin_addr.s_addr);
+		ipv4 = true;
+		vlan_id = irdma_get_vlan_ipv4(ip_addr);
+		irdma_mcast_mac(ip_addr, dmac, true);
+		ibdev_dbg(&iwdev->ibdev,
+			  "VERBS: qp_id=%d, IP4address=%pI4, MAC=%pM\n",
+			  ibqp->qp_num, ip_addr, dmac);
+	}
+
+	spin_lock_irqsave(&rf->qh_list_lock, flags);
+	mc_qht_elem = mcast_list_lookup_ip(rf, ip_addr);
+	if (!mc_qht_elem) {
+		struct irdma_dma_mem *dma_mem_mc;
+
+		spin_unlock_irqrestore(&rf->qh_list_lock, flags);
+		mc_qht_elem = kzalloc(sizeof(*mc_qht_elem), GFP_KERNEL);
+		if (!mc_qht_elem)
+			return -ENOMEM;
+
+		mc_qht_elem->mc_info.ipv4_valid = ipv4;
+		memcpy(mc_qht_elem->mc_info.dest_ip, ip_addr,
+		       sizeof(mc_qht_elem->mc_info.dest_ip));
+		ret = irdma_alloc_rsrc(rf, rf->allocated_mcgs, rf->max_mcg,
+				       &mgn, &rf->next_mcg);
+		if (ret) {
+			kfree(mc_qht_elem);
+			return -ENOMEM;
+		}
+
+		mc_qht_elem->mc_info.mgn = mgn;
+		dma_mem_mc = &mc_qht_elem->mc_grp_ctx.dma_mem_mc;
+		dma_mem_mc->size = ALIGN(sizeof(u64) * IRDMA_MAX_MGS_PER_CTX,
+					 IRDMA_HW_PAGE_SIZE);
+		dma_mem_mc->va = dma_alloc_coherent(rf->hw.device,
+						    dma_mem_mc->size,
+						    &dma_mem_mc->pa,
+						    GFP_KERNEL);
+		if (!dma_mem_mc->va) {
+			irdma_free_rsrc(rf, rf->allocated_mcgs, mgn);
+			kfree(mc_qht_elem);
+			return -ENOMEM;
+		}
+
+		mc_qht_elem->mc_grp_ctx.mg_id = (u16)mgn;
+		memcpy(mc_qht_elem->mc_grp_ctx.dest_ip_addr, ip_addr,
+		       sizeof(mc_qht_elem->mc_grp_ctx.dest_ip_addr));
+		mc_qht_elem->mc_grp_ctx.ipv4_valid = ipv4;
+		mc_qht_elem->mc_grp_ctx.vlan_id = vlan_id;
+		if (vlan_id < VLAN_N_VID)
+			mc_qht_elem->mc_grp_ctx.vlan_valid = true;
+		mc_qht_elem->mc_grp_ctx.hmc_fcn_id = iwdev->vsi.fcn_id;
+		mc_qht_elem->mc_grp_ctx.qs_handle =
+			iwqp->sc_qp.vsi->qos[iwqp->sc_qp.user_pri].qs_handle;
+		ether_addr_copy(mc_qht_elem->mc_grp_ctx.dest_mac_addr, dmac);
+
+		spin_lock_irqsave(&rf->qh_list_lock, flags);
+		mcast_list_add(rf, mc_qht_elem);
+	} else {
+		if (mc_qht_elem->mc_grp_ctx.no_of_mgs ==
+		    IRDMA_MAX_MGS_PER_CTX) {
+			spin_unlock_irqrestore(&rf->qh_list_lock, flags);
+			return -ENOMEM;
+		}
+	}
+
+	mcg_info.qp_id = iwqp->ibqp.qp_num;
+	no_mgs = mc_qht_elem->mc_grp_ctx.no_of_mgs;
+	rf->sc_dev.iw_uda_ops->mcast_grp_add(&mc_qht_elem->mc_grp_ctx,
+					     &mcg_info);
+	spin_unlock_irqrestore(&rf->qh_list_lock, flags);
+
+	/* Only if there is a change do we need to modify or create */
+	if (!no_mgs) {
+		ret = irdma_mcast_cqp_op(iwdev, &mc_qht_elem->mc_grp_ctx,
+					 IRDMA_OP_MC_CREATE);
+	} else if (no_mgs != mc_qht_elem->mc_grp_ctx.no_of_mgs) {
+		ret = irdma_mcast_cqp_op(iwdev, &mc_qht_elem->mc_grp_ctx,
+					 IRDMA_OP_MC_MODIFY);
+	} else {
+		return 0;
+	}
+
+	if (ret)
+		goto error;
+
+	return 0;
+
+error:
+	rf->sc_dev.iw_uda_ops->mcast_grp_del(&mc_qht_elem->mc_grp_ctx,
+					     &mcg_info);
+	if (!mc_qht_elem->mc_grp_ctx.no_of_mgs) {
+		mcast_list_del(mc_qht_elem);
+		dma_free_coherent(rf->hw.device,
+				  mc_qht_elem->mc_grp_ctx.dma_mem_mc.size,
+				  mc_qht_elem->mc_grp_ctx.dma_mem_mc.va,
+				  mc_qht_elem->mc_grp_ctx.dma_mem_mc.pa);
+		mc_qht_elem->mc_grp_ctx.dma_mem_mc.va = NULL;
+		irdma_free_rsrc(rf, rf->allocated_mcgs,
+				mc_qht_elem->mc_grp_ctx.mg_id);
+		kfree(mc_qht_elem);
+	}
+
+	return ret;
+}
+
+/**
+ * irdma_detach_mcast - detach a qp from a multicast group
+ * @ibqp: ptr to qp
+ * @ibgid: pointer to global ID
+ * @lid: local ID
+ *
+ * returns error status
+ */
+static int irdma_detach_mcast(struct ib_qp *ibqp, union ib_gid *ibgid, u16 lid)
+{
+	struct irdma_qp *iwqp = to_iwqp(ibqp);
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_pci_f *rf = iwdev->rf;
+	u32 ip_addr[4] = {};
+	struct mc_table_list *mc_qht_elem;
+	struct irdma_mcast_grp_ctx_entry_info mcg_info = {};
+	int ret;
+	unsigned long flags;
+	union {
+		struct sockaddr saddr;
+		struct sockaddr_in saddr_in;
+		struct sockaddr_in6 saddr_in6;
+	} sgid_addr;
+
+	rdma_gid2ip((struct sockaddr *)&sgid_addr, ibgid);
+	if (!ipv6_addr_v4mapped((struct in6_addr *)ibgid))
+		irdma_copy_ip_ntohl(ip_addr,
+				    sgid_addr.saddr_in6.sin6_addr.in6_u.u6_addr32);
+	else
+		ip_addr[0] = ntohl(sgid_addr.saddr_in.sin_addr.s_addr);
+
+	spin_lock_irqsave(&rf->qh_list_lock, flags);
+	mc_qht_elem = mcast_list_lookup_ip(rf, ip_addr);
+	if (!mc_qht_elem) {
+		spin_unlock_irqrestore(&rf->qh_list_lock, flags);
+		ibdev_dbg(&iwdev->ibdev,
+			  "VERBS: address not found MCG\n");
+		return 0;
+	}
+
+	mcg_info.qp_id = iwqp->ibqp.qp_num;
+	rf->sc_dev.iw_uda_ops->mcast_grp_del(&mc_qht_elem->mc_grp_ctx,
+					     &mcg_info);
+	if (!mc_qht_elem->mc_grp_ctx.no_of_mgs) {
+		mcast_list_del(mc_qht_elem);
+		spin_unlock_irqrestore(&rf->qh_list_lock, flags);
+		ret = irdma_mcast_cqp_op(iwdev, &mc_qht_elem->mc_grp_ctx,
+					 IRDMA_OP_MC_DESTROY);
+		if (ret) {
+			ibdev_dbg(&iwdev->ibdev,
+				  "VERBS: failed MC_DESTROY MCG\n");
+			spin_lock_irqsave(&rf->qh_list_lock, flags);
+			mcast_list_add(rf, mc_qht_elem);
+			spin_unlock_irqrestore(&rf->qh_list_lock, flags);
+			return -EAGAIN;
+		}
+
+		dma_free_coherent(rf->hw.device,
+				  mc_qht_elem->mc_grp_ctx.dma_mem_mc.size,
+				  mc_qht_elem->mc_grp_ctx.dma_mem_mc.va,
+				  mc_qht_elem->mc_grp_ctx.dma_mem_mc.pa);
+		mc_qht_elem->mc_grp_ctx.dma_mem_mc.va = NULL;
+		irdma_free_rsrc(rf, rf->allocated_mcgs,
+				mc_qht_elem->mc_grp_ctx.mg_id);
+		kfree(mc_qht_elem);
+	} else {
+		spin_unlock_irqrestore(&rf->qh_list_lock, flags);
+		ret = irdma_mcast_cqp_op(iwdev, &mc_qht_elem->mc_grp_ctx,
+					 IRDMA_OP_MC_MODIFY);
+		if (ret) {
+			ibdev_dbg(&iwdev->ibdev,
+				  "VERBS: failed Modify MCG\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_create_ah - create address handle
+ * @ibah: address handle
+ * @attr: address handle attributes
+ * @udata: User data
+ *
+ * returns 0 on success, error otherwise
+ */
+static int irdma_create_ah(struct ib_ah *ibah,
+			   struct rdma_ah_init_attr *attr,
+			   struct ib_udata *udata)
+{
+	struct irdma_pd *pd = to_iwpd(ibah->pd);
+	struct irdma_ah *ah = container_of(ibah, struct irdma_ah, ibah);
+	struct rdma_ah_attr *ah_attr = attr->ah_attr;
+	const struct ib_gid_attr *sgid_attr;
+	struct irdma_device *iwdev = to_iwdev(ibah->pd->device);
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct irdma_sc_ah *sc_ah;
+	u32 ah_id = 0;
+	struct irdma_ah_info *ah_info;
+	struct irdma_create_ah_resp uresp;
+	union {
+		struct sockaddr saddr;
+		struct sockaddr_in saddr_in;
+		struct sockaddr_in6 saddr_in6;
+	} sgid_addr, dgid_addr;
+	int err;
+	u8 dmac[ETH_ALEN];
+
+	err = irdma_alloc_rsrc(rf, rf->allocated_ahs, rf->max_ah, &ah_id,
+			       &rf->next_ah);
+	if (err)
+		return err;
+
+	ah->pd = pd;
+	sc_ah = &ah->sc_ah;
+	sc_ah->ah_info.ah_idx = ah_id;
+	sc_ah->ah_info.vsi = &iwdev->vsi;
+	iwdev->rf->sc_dev.iw_uda_ops->init_ah(&rf->sc_dev, sc_ah);
+	ah->sgid_index = ah_attr->grh.sgid_index;
+	sgid_attr = ah_attr->grh.sgid_attr;
+	memcpy(&ah->dgid, &ah_attr->grh.dgid, sizeof(ah->dgid));
+	rdma_gid2ip((struct sockaddr *)&sgid_addr, &sgid_attr->gid);
+	rdma_gid2ip((struct sockaddr *)&dgid_addr, &ah_attr->grh.dgid);
+	ah->av.attrs = *ah_attr;
+	ah->av.net_type = rdma_gid_attr_network_type(sgid_attr);
+	ah->av.sgid_addr.saddr = sgid_addr.saddr;
+	ah->av.dgid_addr.saddr = dgid_addr.saddr;
+	ah_info = &sc_ah->ah_info;
+	ah_info->ah_idx = ah_id;
+	ah_info->pd_idx = pd->sc_pd.pd_id;
+	if (ah_attr->ah_flags & IB_AH_GRH) {
+		ah_info->flow_label = ah_attr->grh.flow_label;
+		ah_info->hop_ttl = ah_attr->grh.hop_limit;
+		ah_info->tc_tos = ah_attr->grh.traffic_class;
+	}
+
+	ether_addr_copy(dmac, ah_attr->roce.dmac);
+	if (rdma_gid_attr_network_type(sgid_attr) == RDMA_NETWORK_IPV4) {
+		ah_info->ipv4_valid = true;
+		ah_info->dest_ip_addr[0] =
+			ntohl(dgid_addr.saddr_in.sin_addr.s_addr);
+		ah_info->src_ip_addr[0] =
+			ntohl(sgid_addr.saddr_in.sin_addr.s_addr);
+		ah_info->do_lpbk = irdma_ipv4_is_lpb(ah_info->src_ip_addr[0],
+						     ah_info->dest_ip_addr[0]);
+		if (ipv4_is_multicast(dgid_addr.saddr_in.sin_addr.s_addr)) {
+			ah_info->do_lpbk = true;
+			irdma_mcast_mac(ah_info->dest_ip_addr, dmac, true);
+		}
+	} else {
+		irdma_copy_ip_ntohl(ah_info->dest_ip_addr,
+				    dgid_addr.saddr_in6.sin6_addr.in6_u.u6_addr32);
+		irdma_copy_ip_ntohl(ah_info->src_ip_addr,
+				    sgid_addr.saddr_in6.sin6_addr.in6_u.u6_addr32);
+		ah_info->do_lpbk = irdma_ipv6_is_lpb(ah_info->src_ip_addr,
+						     ah_info->dest_ip_addr);
+		if (rdma_is_multicast_addr(&dgid_addr.saddr_in6.sin6_addr)) {
+			ah_info->do_lpbk = true;
+			irdma_mcast_mac(ah_info->dest_ip_addr, dmac, false);
+		}
+	}
+
+	err = rdma_read_gid_l2_fields(sgid_attr, &ah_info->vlan_tag,
+				      ah_info->mac_addr);
+	if (err)
+		goto error;
+
+	ah_info->dst_arpindex = irdma_add_arp(iwdev->rf, ah_info->dest_ip_addr,
+					      ah_info->ipv4_valid, dmac);
+
+	if (ah_info->dst_arpindex == -1) {
+		err = -EINVAL;
+		goto error;
+	}
+
+	if (ah_info->vlan_tag >= VLAN_N_VID && iwdev->dcb)
+		ah_info->vlan_tag = 0;
+
+	if (ah_info->vlan_tag < VLAN_N_VID) {
+		ah_info->insert_vlan_tag = true;
+		ah_info->vlan_tag |=
+			rt_tos2priority(ah_info->tc_tos) << VLAN_PRIO_SHIFT;
+	}
+
+	err = irdma_ah_cqp_op(iwdev->rf, sc_ah, IRDMA_OP_AH_CREATE,
+			      attr->flags & RDMA_CREATE_AH_SLEEPABLE,
+			      irdma_gsi_ud_qp_ah_cb, sc_ah);
+
+	if (err) {
+		ibdev_dbg(&iwdev->ibdev,
+			  "VERBS: CQP-OP Create AH fail");
+		goto error;
+	}
+
+	if (!(attr->flags & RDMA_CREATE_AH_SLEEPABLE)) {
+		int cnt = CQP_COMPL_WAIT_TIME_MS * CQP_TIMEOUT_THRESHOLD;
+
+		do {
+			irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq);
+			mdelay(1);
+		} while (!sc_ah->ah_info.ah_valid && --cnt);
+
+		if (!cnt) {
+			ibdev_dbg(&iwdev->ibdev,
+				  "VERBS: CQP create AH timed out");
+			err = -ETIMEDOUT;
+			goto error;
+		}
+	}
+
+	if (udata) {
+		uresp.ah_id = ah->sc_ah.ah_info.ah_idx;
+		err = ib_copy_to_udata(udata, &uresp,
+				       min(sizeof(uresp), udata->outlen));
+	}
+	return 0;
+
+error:
+	irdma_free_rsrc(iwdev->rf, iwdev->rf->allocated_ahs, ah_id);
+
+	return err;
+}
+
+/**
+ * irdma_destroy_ah - Destroy address handle
+ * @ibah: pointer to address handle
+ * @ah_flags: flags for sleepable
+ */
+static int irdma_destroy_ah(struct ib_ah *ibah, u32 ah_flags)
+{
+	struct irdma_device *iwdev = to_iwdev(ibah->device);
+	struct irdma_ah *ah = to_iwah(ibah);
+
+	irdma_ah_cqp_op(iwdev->rf, &ah->sc_ah, IRDMA_OP_AH_DESTROY,
+			false, NULL, ah);
+
+	irdma_free_rsrc(iwdev->rf, iwdev->rf->allocated_ahs,
+			ah->sc_ah.ah_info.ah_idx);
+
+	return 0;
+}
+
+/**
+ * irdma_query_ah - Query address handle
+ * @ibah: pointer to address handle
+ * @ah_attr: address handle attributes
+ */
+static int irdma_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
+{
+	struct irdma_ah *ah = to_iwah(ibah);
+
+	memset(ah_attr, 0, sizeof(*ah_attr));
+	if (ah->av.attrs.ah_flags & IB_AH_GRH) {
+		ah_attr->ah_flags = IB_AH_GRH;
+		ah_attr->grh.flow_label = ah->sc_ah.ah_info.flow_label;
+		ah_attr->grh.traffic_class = ah->sc_ah.ah_info.tc_tos;
+		ah_attr->grh.hop_limit = ah->sc_ah.ah_info.hop_ttl;
+		ah_attr->grh.sgid_index = ah->sgid_index;
+		ah_attr->grh.sgid_index = ah->sgid_index;
+		memcpy(&ah_attr->grh.dgid, &ah->dgid,
+		       sizeof(ah_attr->grh.dgid));
+	}
+
+	return 0;
+}
+
+static enum rdma_link_layer irdma_get_link_layer(struct ib_device *ibdev,
+						 u32 port_num)
+{
+	return IB_LINK_LAYER_ETHERNET;
+}
+
+static __be64 irdma_mac_to_guid(struct net_device *ndev)
+{
+	unsigned char *mac = ndev->dev_addr;
+	__be64 guid;
+	unsigned char *dst = (unsigned char *)&guid;
+
+	dst[0] = mac[0] ^ 2;
+	dst[1] = mac[1];
+	dst[2] = mac[2];
+	dst[3] = 0xff;
+	dst[4] = 0xfe;
+	dst[5] = mac[3];
+	dst[6] = mac[4];
+	dst[7] = mac[5];
+
+	return guid;
+}
+
+static const struct ib_device_ops irdma_roce_dev_ops = {
+	.attach_mcast = irdma_attach_mcast,
+	.create_ah = irdma_create_ah,
+	.create_user_ah = irdma_create_ah,
+	.destroy_ah = irdma_destroy_ah,
+	.detach_mcast = irdma_detach_mcast,
+	.get_link_layer = irdma_get_link_layer,
+	.get_port_immutable = irdma_roce_port_immutable,
+	.modify_qp = irdma_modify_qp_roce,
+	.query_ah = irdma_query_ah,
+	.query_pkey = irdma_query_pkey,
+};
+
+static const struct ib_device_ops irdma_iw_dev_ops = {
+	.modify_qp = irdma_modify_qp,
+	.get_port_immutable = irdma_iw_port_immutable,
+	.query_gid = irdma_query_gid,
+};
+
+static const struct ib_device_ops irdma_dev_ops = {
+	.owner = THIS_MODULE,
+	.driver_id = RDMA_DRIVER_IRDMA,
+	.uverbs_abi_ver = IRDMA_ABI_VER,
+
+	.alloc_hw_stats = irdma_alloc_hw_stats,
+	.alloc_mr = irdma_alloc_mr,
+	.alloc_mw = irdma_alloc_mw,
+	.alloc_pd = irdma_alloc_pd,
+	.alloc_ucontext = irdma_alloc_ucontext,
+	.create_cq = irdma_create_cq,
+	.create_qp = irdma_create_qp,
+	.dealloc_driver = irdma_ib_dealloc_device,
+	.dealloc_mw = irdma_dealloc_mw,
+	.dealloc_pd = irdma_dealloc_pd,
+	.dealloc_ucontext = irdma_dealloc_ucontext,
+	.dereg_mr = irdma_dereg_mr,
+	.destroy_cq = irdma_destroy_cq,
+	.destroy_qp = irdma_destroy_qp,
+	.disassociate_ucontext = irdma_disassociate_ucontext,
+	.get_dev_fw_str = irdma_get_dev_fw_str,
+	.get_dma_mr = irdma_get_dma_mr,
+	.get_hw_stats = irdma_get_hw_stats,
+	.map_mr_sg = irdma_map_mr_sg,
+	.mmap = irdma_mmap,
+	.mmap_free = irdma_mmap_free,
+	.poll_cq = irdma_poll_cq,
+	.post_recv = irdma_post_recv,
+	.post_send = irdma_post_send,
+	.query_device = irdma_query_device,
+	.query_port = irdma_query_port,
+	.query_qp = irdma_query_qp,
+	.reg_user_mr = irdma_reg_user_mr,
+	.req_notify_cq = irdma_req_notify_cq,
+	.resize_cq = irdma_resize_cq,
+	INIT_RDMA_OBJ_SIZE(ib_pd, irdma_pd, ibpd),
+	INIT_RDMA_OBJ_SIZE(ib_ucontext, irdma_ucontext, ibucontext),
+	INIT_RDMA_OBJ_SIZE(ib_ah, irdma_ah, ibah),
+	INIT_RDMA_OBJ_SIZE(ib_cq, irdma_cq, ibcq),
+	INIT_RDMA_OBJ_SIZE(ib_mw, irdma_mr, ibmw),
+};
+
+/**
+ * irdma_init_roce_device - initialization of roce rdma device
+ * @iwdev: irdma device
+ */
+static void irdma_init_roce_device(struct irdma_device *iwdev)
+{
+	iwdev->ibdev.node_type = RDMA_NODE_IB_CA;
+	iwdev->ibdev.node_guid = irdma_mac_to_guid(iwdev->netdev);
+	ib_set_device_ops(&iwdev->ibdev, &irdma_roce_dev_ops);
+}
+
+/**
+ * irdma_init_iw_device - initialization of iwarp rdma device
+ * @iwdev: irdma device
+ */
+static int irdma_init_iw_device(struct irdma_device *iwdev)
+{
+	struct net_device *netdev = iwdev->netdev;
+
+	iwdev->ibdev.node_type = RDMA_NODE_RNIC;
+	ether_addr_copy((u8 *)&iwdev->ibdev.node_guid, netdev->dev_addr);
+	iwdev->ibdev.ops.iw_add_ref = irdma_qp_add_ref;
+	iwdev->ibdev.ops.iw_rem_ref = irdma_qp_rem_ref;
+	iwdev->ibdev.ops.iw_get_qp = irdma_get_qp;
+	iwdev->ibdev.ops.iw_connect = irdma_connect;
+	iwdev->ibdev.ops.iw_accept = irdma_accept;
+	iwdev->ibdev.ops.iw_reject = irdma_reject;
+	iwdev->ibdev.ops.iw_create_listen = irdma_create_listen;
+	iwdev->ibdev.ops.iw_destroy_listen = irdma_destroy_listen;
+	memcpy(iwdev->ibdev.iw_ifname, netdev->name,
+	       sizeof(iwdev->ibdev.iw_ifname));
+	ib_set_device_ops(&iwdev->ibdev, &irdma_iw_dev_ops);
+
+	return 0;
+}
+
+/**
+ * irdma_init_rdma_device - initialization of rdma device
+ * @iwdev: irdma device
+ */
+static int irdma_init_rdma_device(struct irdma_device *iwdev)
+{
+	struct pci_dev *pcidev = iwdev->rf->pcidev;
+	int ret;
+
+	if (iwdev->roce_mode) {
+		irdma_init_roce_device(iwdev);
+	} else {
+		ret = irdma_init_iw_device(iwdev);
+		if (ret)
+			return ret;
+	}
+	iwdev->ibdev.phys_port_cnt = 1;
+	iwdev->ibdev.num_comp_vectors = iwdev->rf->ceqs_count;
+	iwdev->ibdev.dev.parent = &pcidev->dev;
+	ib_set_device_ops(&iwdev->ibdev, &irdma_dev_ops);
+
+	return 0;
+}
+
+/**
+ * irdma_port_ibevent - indicate port event
+ * @iwdev: irdma device
+ */
+void irdma_port_ibevent(struct irdma_device *iwdev)
+{
+	struct ib_event event;
+
+	event.device = &iwdev->ibdev;
+	event.element.port_num = 1;
+	event.event =
+		iwdev->iw_status ? IB_EVENT_PORT_ACTIVE : IB_EVENT_PORT_ERR;
+	ib_dispatch_event(&event);
+}
+
+/**
+ * irdma_ib_unregister_device - unregister rdma device from IB
+ * core
+ * @iwdev: irdma device
+ */
+void irdma_ib_unregister_device(struct irdma_device *iwdev)
+{
+	iwdev->iw_status = 0;
+	irdma_port_ibevent(iwdev);
+	ib_unregister_device(&iwdev->ibdev);
+}
+
+/**
+ * irdma_ib_register_device - register irdma device to IB core
+ * @iwdev: irdma device
+ */
+int irdma_ib_register_device(struct irdma_device *iwdev)
+{
+	int ret;
+
+	ret = irdma_init_rdma_device(iwdev);
+	if (ret)
+		return ret;
+
+	ret = ib_device_set_netdev(&iwdev->ibdev, iwdev->netdev, 1);
+	if (ret)
+		goto error;
+	dma_set_max_seg_size(iwdev->rf->hw.device, UINT_MAX);
+	ret = ib_register_device(&iwdev->ibdev, "irdma%d", iwdev->rf->hw.device);
+	if (ret)
+		goto error;
+
+	iwdev->iw_status = 1;
+	irdma_port_ibevent(iwdev);
+
+	return 0;
+
+error:
+	if (ret)
+		ibdev_dbg(&iwdev->ibdev, "VERBS: Register RDMA device fail\n");
+
+	return ret;
+}
+
+/**
+ * irdma_ib_dealloc_device
+ * @ibdev: ib device
+ *
+ * callback from ibdev dealloc_driver to deallocate resources
+ * unber irdma device
+ */
+void irdma_ib_dealloc_device(struct ib_device *ibdev)
+{
+	struct irdma_device *iwdev = to_iwdev(ibdev);
+
+	irdma_rt_deinit_hw(iwdev);
+	irdma_ctrl_deinit_hw(iwdev->rf);
+	kfree(iwdev->rf);
+}
diff --git a/drivers/infiniband/hw/irdma/verbs.h b/drivers/infiniband/hw/irdma/verbs.h
new file mode 100644
index 0000000..5c244cd
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/verbs.h
@@ -0,0 +1,225 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#ifndef IRDMA_VERBS_H
+#define IRDMA_VERBS_H
+
+#define IRDMA_MAX_SAVED_PHY_PGADDR	4
+
+#define IRDMA_PKEY_TBL_SZ		1
+#define IRDMA_DEFAULT_PKEY		0xFFFF
+
+struct irdma_ucontext {
+	struct ib_ucontext ibucontext;
+	struct irdma_device *iwdev;
+	struct rdma_user_mmap_entry *db_mmap_entry;
+	struct list_head cq_reg_mem_list;
+	spinlock_t cq_reg_mem_list_lock; /* protect CQ memory list */
+	struct list_head qp_reg_mem_list;
+	spinlock_t qp_reg_mem_list_lock; /* protect QP memory list */
+	int abi_ver;
+	bool legacy_mode;
+};
+
+struct irdma_pd {
+	struct ib_pd ibpd;
+	struct irdma_sc_pd sc_pd;
+};
+
+struct irdma_av {
+	u8 macaddr[16];
+	struct rdma_ah_attr attrs;
+	union {
+		struct sockaddr saddr;
+		struct sockaddr_in saddr_in;
+		struct sockaddr_in6 saddr_in6;
+	} sgid_addr, dgid_addr;
+	u8 net_type;
+};
+
+struct irdma_ah {
+	struct ib_ah ibah;
+	struct irdma_sc_ah sc_ah;
+	struct irdma_pd *pd;
+	struct irdma_av av;
+	u8 sgid_index;
+	union ib_gid dgid;
+};
+
+struct irdma_hmc_pble {
+	union {
+		u32 idx;
+		dma_addr_t addr;
+	};
+};
+
+struct irdma_cq_mr {
+	struct irdma_hmc_pble cq_pbl;
+	dma_addr_t shadow;
+	bool split;
+};
+
+struct irdma_qp_mr {
+	struct irdma_hmc_pble sq_pbl;
+	struct irdma_hmc_pble rq_pbl;
+	dma_addr_t shadow;
+	struct page *sq_page;
+};
+
+struct irdma_cq_buf {
+	struct irdma_dma_mem kmem_buf;
+	struct irdma_cq_uk cq_uk;
+	struct irdma_hw *hw;
+	struct list_head list;
+	struct work_struct work;
+};
+
+struct irdma_pbl {
+	struct list_head list;
+	union {
+		struct irdma_qp_mr qp_mr;
+		struct irdma_cq_mr cq_mr;
+	};
+
+	bool pbl_allocated:1;
+	bool on_list:1;
+	u64 user_base;
+	struct irdma_pble_alloc pble_alloc;
+	struct irdma_mr *iwmr;
+};
+
+struct irdma_mr {
+	union {
+		struct ib_mr ibmr;
+		struct ib_mw ibmw;
+	};
+	struct ib_umem *region;
+	u16 type;
+	u32 page_cnt;
+	u64 page_size;
+	u32 npages;
+	u32 stag;
+	u64 len;
+	u64 pgaddrmem[IRDMA_MAX_SAVED_PHY_PGADDR];
+	struct irdma_pbl iwpbl;
+};
+
+struct irdma_cq {
+	struct ib_cq ibcq;
+	struct irdma_sc_cq sc_cq;
+	u16 cq_head;
+	u16 cq_size;
+	u16 cq_num;
+	bool user_mode;
+	u32 polled_cmpls;
+	u32 cq_mem_size;
+	struct irdma_dma_mem kmem;
+	struct irdma_dma_mem kmem_shadow;
+	spinlock_t lock; /* for poll cq */
+	struct irdma_pbl *iwpbl;
+	struct irdma_pbl *iwpbl_shadow;
+	struct list_head resize_list;
+	struct irdma_cq_poll_info cur_cqe;
+};
+
+struct disconn_work {
+	struct work_struct work;
+	struct irdma_qp *iwqp;
+};
+
+struct iw_cm_id;
+
+struct irdma_qp_kmode {
+	struct irdma_dma_mem dma_mem;
+	struct irdma_sq_uk_wr_trk_info *sq_wrid_mem;
+	u64 *rq_wrid_mem;
+};
+
+struct irdma_qp {
+	struct ib_qp ibqp;
+	struct irdma_sc_qp sc_qp;
+	struct irdma_device *iwdev;
+	struct irdma_cq *iwscq;
+	struct irdma_cq *iwrcq;
+	struct irdma_pd *iwpd;
+	struct rdma_user_mmap_entry *push_wqe_mmap_entry;
+	struct rdma_user_mmap_entry *push_db_mmap_entry;
+	struct irdma_qp_host_ctx_info ctx_info;
+	union {
+		struct irdma_iwarp_offload_info iwarp_info;
+		struct irdma_roce_offload_info roce_info;
+	};
+
+	union {
+		struct irdma_tcp_offload_info tcp_info;
+		struct irdma_udp_offload_info udp_info;
+	};
+
+	struct irdma_ah roce_ah;
+	struct list_head teardown_entry;
+	refcount_t refcnt;
+	struct iw_cm_id *cm_id;
+	struct irdma_cm_node *cm_node;
+	struct ib_mr *lsmm_mr;
+	atomic_t hw_mod_qp_pend;
+	enum ib_qp_state ibqp_state;
+	u32 qp_mem_size;
+	u32 last_aeq;
+	int max_send_wr;
+	int max_recv_wr;
+	atomic_t close_timer_started;
+	spinlock_t lock; /* serialize posting WRs to SQ/RQ */
+	struct irdma_qp_context *iwqp_context;
+	void *pbl_vbase;
+	dma_addr_t pbl_pbase;
+	struct page *page;
+	u8 active_conn : 1;
+	u8 user_mode : 1;
+	u8 hte_added : 1;
+	u8 flush_issued : 1;
+	u8 sig_all : 1;
+	u8 pau_mode : 1;
+	u8 rsvd : 1;
+	u8 iwarp_state;
+	u16 term_sq_flush_code;
+	u16 term_rq_flush_code;
+	u8 hw_iwarp_state;
+	u8 hw_tcp_state;
+	struct irdma_qp_kmode kqp;
+	struct irdma_dma_mem host_ctx;
+	struct timer_list terminate_timer;
+	struct irdma_pbl *iwpbl;
+	struct irdma_dma_mem q2_ctx_mem;
+	struct irdma_dma_mem ietf_mem;
+	struct completion free_qp;
+	wait_queue_head_t waitq;
+	wait_queue_head_t mod_qp_waitq;
+	u8 rts_ae_rcvd;
+};
+
+enum irdma_mmap_flag {
+	IRDMA_MMAP_IO_NC,
+	IRDMA_MMAP_IO_WC,
+};
+
+struct irdma_user_mmap_entry {
+	struct rdma_user_mmap_entry rdma_entry;
+	u64 bar_offset;
+	u8 mmap_flag;
+};
+
+static inline u16 irdma_fw_major_ver(struct irdma_sc_dev *dev)
+{
+	return (u16)FIELD_GET(IRDMA_FW_VER_MAJOR, dev->feature_info[IRDMA_FEATURE_FW_INFO]);
+}
+
+static inline u16 irdma_fw_minor_ver(struct irdma_sc_dev *dev)
+{
+	return (u16)FIELD_GET(IRDMA_FW_VER_MINOR, dev->feature_info[IRDMA_FEATURE_FW_INFO]);
+}
+
+void irdma_mcast_mac(u32 *ip_addr, u8 *mac, bool ipv4);
+int irdma_ib_register_device(struct irdma_device *iwdev);
+void irdma_ib_unregister_device(struct irdma_device *iwdev);
+void irdma_ib_dealloc_device(struct ib_device *ibdev);
+void irdma_ib_qp_event(struct irdma_qp *iwqp, enum irdma_qp_event_type event);
+#endif /* IRDMA_VERBS_H */
diff --git a/include/uapi/rdma/ib_user_ioctl_verbs.h b/include/uapi/rdma/ib_user_ioctl_verbs.h
index 2248379..3072e5d 100644
--- a/include/uapi/rdma/ib_user_ioctl_verbs.h
+++ b/include/uapi/rdma/ib_user_ioctl_verbs.h
@@ -240,6 +240,7 @@ enum rdma_driver_id {
 	RDMA_DRIVER_OCRDMA,
 	RDMA_DRIVER_NES,
 	RDMA_DRIVER_I40IW,
+	RDMA_DRIVER_IRDMA = RDMA_DRIVER_I40IW,
 	RDMA_DRIVER_VMW_PVRDMA,
 	RDMA_DRIVER_QEDR,
 	RDMA_DRIVER_HNS,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 17/23] RDMA/irdma: Add RoCEv2 UD OP support
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (15 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 16/23] RDMA/irdma: Implement device supported verb APIs Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 18/23] RDMA/irdma: Add user/kernel shared libraries Shiraz Saleem
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Add the header, data structures and functions
to populate the WQE descriptors and issue the
Control QP commands that support RoCEv2 UD operations.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/uda.c   | 379 ++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/uda.h   |  63 ++++++
 drivers/infiniband/hw/irdma/uda_d.h | 128 ++++++++++++
 3 files changed, 570 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/uda.c
 create mode 100644 drivers/infiniband/hw/irdma/uda.h
 create mode 100644 drivers/infiniband/hw/irdma/uda_d.h

diff --git a/drivers/infiniband/hw/irdma/uda.c b/drivers/infiniband/hw/irdma/uda.c
new file mode 100644
index 0000000..da9fc71
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/uda.c
@@ -0,0 +1,379 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2016 - 2021 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "hmc.h"
+#include "defs.h"
+#include "type.h"
+#include "protos.h"
+#include "uda.h"
+#include "uda_d.h"
+
+/**
+ * irdma_sc_init_ah - initialize sc ah struct
+ * @dev: sc device struct
+ * @ah: sc ah ptr
+ */
+static void irdma_sc_init_ah(struct irdma_sc_dev *dev, struct irdma_sc_ah *ah)
+{
+	ah->dev = dev;
+}
+
+/**
+ * irdma_sc_access_ah() - Create, modify or delete AH
+ * @cqp: struct for cqp hw
+ * @info: ah information
+ * @op: Operation
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_access_ah(struct irdma_sc_cqp *cqp,
+						 struct irdma_ah_info *info,
+						 u32 op, u64 scratch)
+{
+	__le64 *wqe;
+	u64 qw1, qw2;
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe)
+		return IRDMA_ERR_RING_FULL;
+
+	set_64bit_val(wqe, 0, ether_addr_to_u64(info->mac_addr) << 16);
+	qw1 = FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_PDINDEXLO, info->pd_idx) |
+	      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_TC, info->tc_tos) |
+	      FIELD_PREP(IRDMA_UDAQPC_VLANTAG, info->vlan_tag);
+
+	qw2 = FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ARPINDEX, info->dst_arpindex) |
+	      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_FLOWLABEL, info->flow_label) |
+	      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_HOPLIMIT, info->hop_ttl) |
+	      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_PDINDEXHI, info->pd_idx >> 16);
+
+	if (!info->ipv4_valid) {
+		set_64bit_val(wqe, 40,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR0, info->dest_ip_addr[0]) |
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR1, info->dest_ip_addr[1]));
+		set_64bit_val(wqe, 32,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR2, info->dest_ip_addr[2]) |
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR3, info->dest_ip_addr[3]));
+
+		set_64bit_val(wqe, 56,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR0, info->src_ip_addr[0]) |
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR1, info->src_ip_addr[1]));
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR2, info->src_ip_addr[2]) |
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR3, info->src_ip_addr[3]));
+	} else {
+		set_64bit_val(wqe, 32,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR3, info->dest_ip_addr[0]));
+
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR3, info->src_ip_addr[0]));
+	}
+
+	set_64bit_val(wqe, 8, qw1);
+	set_64bit_val(wqe, 16, qw2);
+
+	dma_wmb(); /* need write block before writing WQE header */
+
+	set_64bit_val(
+		wqe, 24,
+		FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_WQEVALID, cqp->polarity) |
+		FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_OPCODE, op) |
+		FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_DOLOOPBACKK, info->do_lpbk) |
+		FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_IPV4VALID, info->ipv4_valid) |
+		FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_AVIDX, info->ah_idx) |
+		FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_INSERTVLANTAG, info->insert_vlan_tag));
+
+	print_hex_dump_debug("WQE: MANAGE_AH WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_create_ah() - Create AH
+ * @cqp: struct for cqp hw
+ * @info: ah information
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_create_ah(struct irdma_sc_cqp *cqp,
+						 struct irdma_ah_info *info,
+						 u64 scratch)
+{
+	return irdma_sc_access_ah(cqp, info, IRDMA_CQP_OP_CREATE_ADDR_HANDLE,
+				  scratch);
+}
+
+/**
+ * irdma_sc_modify_ah() - Modify AH
+ * @cqp: struct for cqp hw
+ * @info: ah information
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_modify_ah(struct irdma_sc_cqp *cqp,
+						 struct irdma_ah_info *info,
+						 u64 scratch)
+{
+	return irdma_sc_access_ah(cqp, info, IRDMA_CQP_OP_MODIFY_ADDR_HANDLE,
+				  scratch);
+}
+
+/**
+ * irdma_sc_destroy_ah() - Delete AH
+ * @cqp: struct for cqp hw
+ * @info: ah information
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code irdma_sc_destroy_ah(struct irdma_sc_cqp *cqp,
+						  struct irdma_ah_info *info,
+						  u64 scratch)
+{
+	return irdma_sc_access_ah(cqp, info, IRDMA_CQP_OP_DESTROY_ADDR_HANDLE,
+				  scratch);
+}
+
+/**
+ * irdma_create_mg_ctx() - create a mcg context
+ * @info: multicast group context info
+ */
+static enum irdma_status_code
+irdma_create_mg_ctx(struct irdma_mcast_grp_info *info)
+{
+	struct irdma_mcast_grp_ctx_entry_info *entry_info = NULL;
+	u8 idx = 0; /* index in the array */
+	u8 ctx_idx = 0; /* index in the MG context */
+
+	memset(info->dma_mem_mc.va, 0, IRDMA_MAX_MGS_PER_CTX * sizeof(u64));
+
+	for (idx = 0; idx < IRDMA_MAX_MGS_PER_CTX; idx++) {
+		entry_info = &info->mg_ctx_info[idx];
+		if (entry_info->valid_entry) {
+			set_64bit_val((__le64 *)info->dma_mem_mc.va,
+				      ctx_idx * sizeof(u64),
+				      FIELD_PREP(IRDMA_UDA_MGCTX_DESTPORT, entry_info->dest_port) |
+				      FIELD_PREP(IRDMA_UDA_MGCTX_VALIDENT, entry_info->valid_entry) |
+				      FIELD_PREP(IRDMA_UDA_MGCTX_QPID, entry_info->qp_id));
+			ctx_idx++;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_access_mcast_grp() - Access mcast group based on op
+ * @cqp: Control QP
+ * @info: multicast group context info
+ * @op: operation to perform
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_access_mcast_grp(struct irdma_sc_cqp *cqp,
+		       struct irdma_mcast_grp_info *info, u32 op, u64 scratch)
+{
+	__le64 *wqe;
+	enum irdma_status_code ret_code = 0;
+
+	if (info->mg_id >= IRDMA_UDA_MAX_FSI_MGS) {
+		ibdev_dbg(to_ibdev(cqp->dev), "WQE: mg_id out of range\n");
+		return IRDMA_ERR_PARAM;
+	}
+
+	wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch);
+	if (!wqe) {
+		ibdev_dbg(to_ibdev(cqp->dev), "WQE: ring full\n");
+		return IRDMA_ERR_RING_FULL;
+	}
+
+	ret_code = irdma_create_mg_ctx(info);
+	if (ret_code)
+		return ret_code;
+
+	set_64bit_val(wqe, 32, info->dma_mem_mc.pa);
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_MG_VLANID, info->vlan_id) |
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_QS_HANDLE, info->qs_handle));
+	set_64bit_val(wqe, 0, ether_addr_to_u64(info->dest_mac_addr));
+	set_64bit_val(wqe, 8,
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_MG_HMC_FCN_ID, info->hmc_fcn_id));
+
+	if (!info->ipv4_valid) {
+		set_64bit_val(wqe, 56,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR0, info->dest_ip_addr[0]) |
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR1, info->dest_ip_addr[1]));
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR2, info->dest_ip_addr[2]) |
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR3, info->dest_ip_addr[3]));
+	} else {
+		set_64bit_val(wqe, 48,
+			      FIELD_PREP(IRDMA_UDA_CQPSQ_MAV_ADDR3, info->dest_ip_addr[0]));
+	}
+
+	dma_wmb(); /* need write memory block before writing the WQE header. */
+
+	set_64bit_val(wqe, 24,
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_MG_WQEVALID, cqp->polarity) |
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_MG_OPCODE, op) |
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_MG_MGIDX, info->mg_id) |
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_MG_VLANVALID, info->vlan_valid) |
+		      FIELD_PREP(IRDMA_UDA_CQPSQ_MG_IPV4VALID, info->ipv4_valid));
+
+	print_hex_dump_debug("WQE: MANAGE_MCG WQE", DUMP_PREFIX_OFFSET, 16, 8,
+			     wqe, IRDMA_CQP_WQE_SIZE * 8, false);
+	print_hex_dump_debug("WQE: MCG_HOST CTX WQE", DUMP_PREFIX_OFFSET, 16,
+			     8, info->dma_mem_mc.va,
+			     IRDMA_MAX_MGS_PER_CTX * 8, false);
+	irdma_sc_cqp_post_sq(cqp);
+
+	return 0;
+}
+
+/**
+ * irdma_sc_create_mcast_grp() - Create mcast group.
+ * @cqp: Control QP
+ * @info: multicast group context info
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_create_mcast_grp(struct irdma_sc_cqp *cqp,
+			  struct irdma_mcast_grp_info *info, u64 scratch)
+{
+	return irdma_access_mcast_grp(cqp, info, IRDMA_CQP_OP_CREATE_MCAST_GRP,
+				      scratch);
+}
+
+/**
+ * irdma_sc_modify_mcast_grp() - Modify mcast group
+ * @cqp: Control QP
+ * @info: multicast group context info
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_modify_mcast_grp(struct irdma_sc_cqp *cqp,
+			  struct irdma_mcast_grp_info *info, u64 scratch)
+{
+	return irdma_access_mcast_grp(cqp, info, IRDMA_CQP_OP_MODIFY_MCAST_GRP,
+				      scratch);
+}
+
+/**
+ * irdma_sc_destroy_mcast_grp() - Destroys mcast group
+ * @cqp: Control QP
+ * @info: multicast group context info
+ * @scratch: u64 saved to be used during cqp completion
+ */
+static enum irdma_status_code
+irdma_sc_destroy_mcast_grp(struct irdma_sc_cqp *cqp,
+			   struct irdma_mcast_grp_info *info, u64 scratch)
+{
+	return irdma_access_mcast_grp(cqp, info, IRDMA_CQP_OP_DESTROY_MCAST_GRP,
+				      scratch);
+}
+
+/**
+ * irdma_compare_mgs - Compares two multicast group structures
+ * @entry1: Multcast group info
+ * @entry2: Multcast group info in context
+ */
+static bool irdma_compare_mgs(struct irdma_mcast_grp_ctx_entry_info *entry1,
+			      struct irdma_mcast_grp_ctx_entry_info *entry2)
+{
+	if (entry1->dest_port == entry2->dest_port &&
+	    entry1->qp_id == entry2->qp_id)
+		return true;
+
+	return false;
+}
+
+/**
+ * irdma_sc_add_mcast_grp - Allocates mcast group entry in ctx
+ * @ctx: Multcast group context
+ * @mg: Multcast group info
+ */
+static enum irdma_status_code
+irdma_sc_add_mcast_grp(struct irdma_mcast_grp_info *ctx,
+		       struct irdma_mcast_grp_ctx_entry_info *mg)
+{
+	u32 idx;
+	bool free_entry_found = false;
+	u32 free_entry_idx = 0;
+
+	/* find either an identical or a free entry for a multicast group */
+	for (idx = 0; idx < IRDMA_MAX_MGS_PER_CTX; idx++) {
+		if (ctx->mg_ctx_info[idx].valid_entry) {
+			if (irdma_compare_mgs(&ctx->mg_ctx_info[idx], mg)) {
+				ctx->mg_ctx_info[idx].use_cnt++;
+				return 0;
+			}
+			continue;
+		}
+		if (!free_entry_found) {
+			free_entry_found = true;
+			free_entry_idx = idx;
+		}
+	}
+
+	if (free_entry_found) {
+		ctx->mg_ctx_info[free_entry_idx] = *mg;
+		ctx->mg_ctx_info[free_entry_idx].valid_entry = true;
+		ctx->mg_ctx_info[free_entry_idx].use_cnt = 1;
+		ctx->no_of_mgs++;
+		return 0;
+	}
+
+	return IRDMA_ERR_NO_MEMORY;
+}
+
+/**
+ * irdma_sc_del_mcast_grp - Delete mcast group
+ * @ctx: Multcast group context
+ * @mg: Multcast group info
+ *
+ * Finds and removes a specific mulicast group from context, all
+ * parameters must match to remove a multicast group.
+ */
+static enum irdma_status_code
+irdma_sc_del_mcast_grp(struct irdma_mcast_grp_info *ctx,
+		       struct irdma_mcast_grp_ctx_entry_info *mg)
+{
+	u32 idx;
+
+	/* find an entry in multicast group context */
+	for (idx = 0; idx < IRDMA_MAX_MGS_PER_CTX; idx++) {
+		if (!ctx->mg_ctx_info[idx].valid_entry)
+			continue;
+
+		if (irdma_compare_mgs(mg, &ctx->mg_ctx_info[idx])) {
+			ctx->mg_ctx_info[idx].use_cnt--;
+
+			if (!ctx->mg_ctx_info[idx].use_cnt) {
+				ctx->mg_ctx_info[idx].valid_entry = false;
+				ctx->no_of_mgs--;
+				/* Remove gap if element was not the last */
+				if (idx != ctx->no_of_mgs &&
+				    ctx->no_of_mgs > 0) {
+					memcpy(&ctx->mg_ctx_info[idx],
+					       &ctx->mg_ctx_info[ctx->no_of_mgs - 1],
+					       sizeof(ctx->mg_ctx_info[idx]));
+					ctx->mg_ctx_info[ctx->no_of_mgs - 1].valid_entry = false;
+				}
+			}
+
+			return 0;
+		}
+	}
+
+	return IRDMA_ERR_PARAM;
+}
+
+const struct irdma_uda_ops irdma_uda_ops = {
+	.create_ah = irdma_sc_create_ah,
+	.destroy_ah = irdma_sc_destroy_ah,
+	.init_ah = irdma_sc_init_ah,
+	.mcast_grp_add = irdma_sc_add_mcast_grp,
+	.mcast_grp_create = irdma_sc_create_mcast_grp,
+	.mcast_grp_del = irdma_sc_del_mcast_grp,
+	.mcast_grp_destroy = irdma_sc_destroy_mcast_grp,
+	.mcast_grp_modify = irdma_sc_modify_mcast_grp,
+	.modify_ah = irdma_sc_modify_ah,
+};
diff --git a/drivers/infiniband/hw/irdma/uda.h b/drivers/infiniband/hw/irdma/uda.h
new file mode 100644
index 0000000..4f6591d
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/uda.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2016 - 2021 Intel Corporation */
+#ifndef IRDMA_UDA_H
+#define IRDMA_UDA_H
+
+extern const struct irdma_uda_ops irdma_uda_ops;
+
+#define IRDMA_UDA_MAX_FSI_MGS	4096
+#define IRDMA_UDA_MAX_PFS	16
+#define IRDMA_UDA_MAX_VFS	128
+
+struct irdma_sc_cqp;
+
+struct irdma_ah_info {
+	struct irdma_sc_vsi *vsi;
+	u32 pd_idx;
+	u32 dst_arpindex;
+	u32 dest_ip_addr[4];
+	u32 src_ip_addr[4];
+	u32 flow_label;
+	u32 ah_idx;
+	u16 vlan_tag;
+	u8 insert_vlan_tag;
+	u8 tc_tos;
+	u8 hop_ttl;
+	u8 mac_addr[ETH_ALEN];
+	bool ah_valid:1;
+	bool ipv4_valid:1;
+	bool do_lpbk:1;
+};
+
+struct irdma_sc_ah {
+	struct irdma_sc_dev *dev;
+	struct irdma_ah_info ah_info;
+};
+
+struct irdma_uda_ops {
+	void (*init_ah)(struct irdma_sc_dev *dev, struct irdma_sc_ah *ah);
+	enum irdma_status_code (*create_ah)(struct irdma_sc_cqp *cqp,
+					    struct irdma_ah_info *info,
+					    u64 scratch);
+	enum irdma_status_code (*modify_ah)(struct irdma_sc_cqp *cqp,
+					    struct irdma_ah_info *info,
+					    u64 scratch);
+	enum irdma_status_code (*destroy_ah)(struct irdma_sc_cqp *cqp,
+					     struct irdma_ah_info *info,
+					     u64 scratch);
+	/* multicast */
+	enum irdma_status_code (*mcast_grp_create)(struct irdma_sc_cqp *cqp,
+						   struct irdma_mcast_grp_info *info,
+						   u64 scratch);
+	enum irdma_status_code (*mcast_grp_modify)(struct irdma_sc_cqp *cqp,
+						   struct irdma_mcast_grp_info *info,
+						   u64 scratch);
+	enum irdma_status_code (*mcast_grp_destroy)(struct irdma_sc_cqp *cqp,
+						    struct irdma_mcast_grp_info *info,
+						    u64 scratch);
+	enum irdma_status_code (*mcast_grp_add)(struct irdma_mcast_grp_info *ctx,
+						struct irdma_mcast_grp_ctx_entry_info *mg);
+	enum irdma_status_code (*mcast_grp_del)(struct irdma_mcast_grp_info *ctx,
+						struct irdma_mcast_grp_ctx_entry_info *mg);
+};
+#endif /* IRDMA_UDA_H */
diff --git a/drivers/infiniband/hw/irdma/uda_d.h b/drivers/infiniband/hw/irdma/uda_d.h
new file mode 100644
index 0000000..bfc81ca
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/uda_d.h
@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2016 - 2021 Intel Corporation */
+#ifndef IRDMA_UDA_D_H
+#define IRDMA_UDA_D_H
+
+/* L4 packet type */
+#define IRDMA_E_UDA_SQ_L4T_UNKNOWN	0
+#define IRDMA_E_UDA_SQ_L4T_TCP		1
+#define IRDMA_E_UDA_SQ_L4T_SCTP		2
+#define IRDMA_E_UDA_SQ_L4T_UDP		3
+
+/* Inner IP header type */
+#define IRDMA_E_UDA_SQ_IIPT_UNKNOWN		0
+#define IRDMA_E_UDA_SQ_IIPT_IPV6		1
+#define IRDMA_E_UDA_SQ_IIPT_IPV4_NO_CSUM	2
+#define IRDMA_E_UDA_SQ_IIPT_IPV4_CSUM		3
+#define IRDMA_UDA_QPSQ_PUSHWQE BIT_ULL(56)
+#define IRDMA_UDA_QPSQ_INLINEDATAFLAG BIT_ULL(57)
+#define IRDMA_UDA_QPSQ_INLINEDATALEN GENMASK_ULL(55, 48)
+#define IRDMA_UDA_QPSQ_ADDFRAGCNT GENMASK_ULL(41, 38)
+#define IRDMA_UDA_QPSQ_IPFRAGFLAGS GENMASK_ULL(43, 42)
+#define IRDMA_UDA_QPSQ_NOCHECKSUM BIT_ULL(45)
+#define IRDMA_UDA_QPSQ_AHIDXVALID BIT_ULL(46)
+#define IRDMA_UDA_QPSQ_LOCAL_FENCE BIT_ULL(61)
+#define IRDMA_UDA_QPSQ_AHIDX GENMASK_ULL(16, 0)
+#define IRDMA_UDA_QPSQ_PROTOCOL GENMASK_ULL(23, 16)
+#define IRDMA_UDA_QPSQ_EXTHDRLEN GENMASK_ULL(40, 32)
+#define IRDMA_UDA_QPSQ_MULTICAST BIT_ULL(63)
+#define IRDMA_UDA_QPSQ_MACLEN GENMASK_ULL(62, 56)
+#define IRDMA_UDA_QPSQ_MACLEN_LINE 2
+#define IRDMA_UDA_QPSQ_IPLEN GENMASK_ULL(54, 48)
+#define IRDMA_UDA_QPSQ_IPLEN_LINE 2
+#define IRDMA_UDA_QPSQ_L4T GENMASK_ULL(31, 30)
+#define IRDMA_UDA_QPSQ_L4T_LINE 2
+#define IRDMA_UDA_QPSQ_IIPT GENMASK_ULL(29, 28)
+#define IRDMA_UDA_QPSQ_IIPT_LINE 2
+
+#define IRDMA_UDA_QPSQ_DO_LPB_LINE 3
+#define IRDMA_UDA_QPSQ_FWD_PROG_CONFIRM BIT_ULL(45)
+#define IRDMA_UDA_QPSQ_FWD_PROG_CONFIRM_LINE 3
+#define IRDMA_UDA_QPSQ_IMMDATA GENMASK_ULL(63, 0)
+
+/* Byte Offset 0 */
+#define IRDMA_UDAQPC_IPV4_M BIT_ULL(3)
+#define IRDMA_UDAQPC_INSERTVLANTAG BIT_ULL(5)
+#define IRDMA_UDAQPC_ISQP1 BIT_ULL(6)
+
+#define IRDMA_UDAQPC_ECNENABLE BIT_ULL(14)
+#define IRDMA_UDAQPC_PDINDEXHI GENMASK_ULL(21, 20)
+#define IRDMA_UDAQPC_DCTCPENABLE BIT_ULL(25)
+
+#define IRDMA_UDAQPC_RCVTPHEN IRDMAQPC_RCVTPHEN
+#define IRDMA_UDAQPC_XMITTPHEN IRDMAQPC_XMITTPHEN
+#define IRDMA_UDAQPC_RQTPHEN IRDMAQPC_RQTPHEN
+#define IRDMA_UDAQPC_SQTPHEN IRDMAQPC_SQTPHEN
+#define IRDMA_UDAQPC_PPIDX IRDMAQPC_PPIDX
+#define IRDMA_UDAQPC_PMENA IRDMAQPC_PMENA
+#define IRDMA_UDAQPC_INSERTTAG2 BIT_ULL(11)
+#define IRDMA_UDAQPC_INSERTTAG3 BIT_ULL(14)
+
+#define IRDMA_UDAQPC_RQSIZE IRDMAQPC_RQSIZE
+#define IRDMA_UDAQPC_SQSIZE IRDMAQPC_SQSIZE
+#define IRDMA_UDAQPC_TXCQNUM IRDMAQPC_TXCQNUM
+#define IRDMA_UDAQPC_RXCQNUM IRDMAQPC_RXCQNUM
+#define IRDMA_UDAQPC_QPCOMPCTX IRDMAQPC_QPCOMPCTX
+#define IRDMA_UDAQPC_SQTPHVAL IRDMAQPC_SQTPHVAL
+#define IRDMA_UDAQPC_RQTPHVAL IRDMAQPC_RQTPHVAL
+#define IRDMA_UDAQPC_QSHANDLE IRDMAQPC_QSHANDLE
+#define IRDMA_UDAQPC_RQHDRRINGBUFSIZE GENMASK_ULL(49, 48)
+#define IRDMA_UDAQPC_SQHDRRINGBUFSIZE GENMASK_ULL(33, 32)
+#define IRDMA_UDAQPC_PRIVILEGEENABLE BIT_ULL(25)
+#define IRDMA_UDAQPC_USE_STATISTICS_INSTANCE BIT_ULL(26)
+#define IRDMA_UDAQPC_STATISTICS_INSTANCE_INDEX GENMASK_ULL(6, 0)
+#define IRDMA_UDAQPC_PRIVHDRGENENABLE BIT_ULL(0)
+#define IRDMA_UDAQPC_RQHDRSPLITENABLE BIT_ULL(3)
+#define IRDMA_UDAQPC_RQHDRRINGBUFENABLE BIT_ULL(2)
+#define IRDMA_UDAQPC_SQHDRRINGBUFENABLE BIT_ULL(1)
+#define IRDMA_UDAQPC_IPID GENMASK_ULL(47, 32)
+#define IRDMA_UDAQPC_SNDMSS GENMASK_ULL(29, 16)
+#define IRDMA_UDAQPC_VLANTAG GENMASK_ULL(15, 0)
+
+#define IRDMA_UDA_CQPSQ_MAV_PDINDEXHI GENMASK_ULL(21, 20)
+#define IRDMA_UDA_CQPSQ_MAV_PDINDEXLO GENMASK_ULL(63, 48)
+#define IRDMA_UDA_CQPSQ_MAV_SRCMACADDRINDEX GENMASK_ULL(29, 24)
+#define IRDMA_UDA_CQPSQ_MAV_ARPINDEX GENMASK_ULL(63, 48)
+#define IRDMA_UDA_CQPSQ_MAV_TC GENMASK_ULL(39, 32)
+#define IRDMA_UDA_CQPSQ_MAV_HOPLIMIT GENMASK_ULL(39, 32)
+#define IRDMA_UDA_CQPSQ_MAV_FLOWLABEL GENMASK_ULL(19, 0)
+#define IRDMA_UDA_CQPSQ_MAV_ADDR0 GENMASK_ULL(63, 32)
+#define IRDMA_UDA_CQPSQ_MAV_ADDR1 GENMASK_ULL(31, 0)
+#define IRDMA_UDA_CQPSQ_MAV_ADDR2 GENMASK_ULL(63, 32)
+#define IRDMA_UDA_CQPSQ_MAV_ADDR3 GENMASK_ULL(31, 0)
+#define IRDMA_UDA_CQPSQ_MAV_WQEVALID BIT_ULL(63)
+#define IRDMA_UDA_CQPSQ_MAV_OPCODE GENMASK_ULL(37, 32)
+#define IRDMA_UDA_CQPSQ_MAV_DOLOOPBACKK BIT_ULL(62)
+#define IRDMA_UDA_CQPSQ_MAV_IPV4VALID BIT_ULL(59)
+#define IRDMA_UDA_CQPSQ_MAV_AVIDX GENMASK_ULL(16, 0)
+#define IRDMA_UDA_CQPSQ_MAV_INSERTVLANTAG BIT_ULL(60)
+#define IRDMA_UDA_MGCTX_VFFLAG BIT_ULL(29)
+#define IRDMA_UDA_MGCTX_DESTPORT GENMASK_ULL(47, 32)
+#define IRDMA_UDA_MGCTX_VFID GENMASK_ULL(28, 22)
+#define IRDMA_UDA_MGCTX_VALIDENT BIT_ULL(31)
+#define IRDMA_UDA_MGCTX_PFID GENMASK_ULL(21, 18)
+#define IRDMA_UDA_MGCTX_FLAGIGNOREDPORT BIT_ULL(30)
+#define IRDMA_UDA_MGCTX_QPID GENMASK_ULL(17, 0)
+#define IRDMA_UDA_CQPSQ_MG_WQEVALID BIT_ULL(63)
+#define IRDMA_UDA_CQPSQ_MG_OPCODE GENMASK_ULL(37, 32)
+#define IRDMA_UDA_CQPSQ_MG_MGIDX GENMASK_ULL(12, 0)
+#define IRDMA_UDA_CQPSQ_MG_IPV4VALID BIT_ULL(60)
+#define IRDMA_UDA_CQPSQ_MG_VLANVALID BIT_ULL(59)
+#define IRDMA_UDA_CQPSQ_MG_HMC_FCN_ID GENMASK_ULL(5, 0)
+#define IRDMA_UDA_CQPSQ_MG_VLANID GENMASK_ULL(43, 32)
+#define IRDMA_UDA_CQPSQ_QS_HANDLE GENMASK_ULL(9, 0)
+#define IRDMA_UDA_CQPSQ_QHASH_QPN GENMASK_ULL(49, 32)
+#define IRDMA_UDA_CQPSQ_QHASH_ BIT_ULL(0)
+#define IRDMA_UDA_CQPSQ_QHASH_SRC_PORT GENMASK_ULL(31, 16)
+#define IRDMA_UDA_CQPSQ_QHASH_DEST_PORT GENMASK_ULL(15, 0)
+#define IRDMA_UDA_CQPSQ_QHASH_ADDR0 GENMASK_ULL(63, 32)
+#define IRDMA_UDA_CQPSQ_QHASH_ADDR1 GENMASK_ULL(31, 0)
+#define IRDMA_UDA_CQPSQ_QHASH_ADDR2 GENMASK_ULL(63, 32)
+#define IRDMA_UDA_CQPSQ_QHASH_ADDR3 GENMASK_ULL(31, 0)
+#define IRDMA_UDA_CQPSQ_QHASH_WQEVALID BIT_ULL(63)
+#define IRDMA_UDA_CQPSQ_QHASH_OPCODE GENMASK_ULL(37, 32)
+#define IRDMA_UDA_CQPSQ_QHASH_MANAGE GENMASK_ULL(62, 61)
+#define IRDMA_UDA_CQPSQ_QHASH_IPV4VALID GENMASK_ULL(60, 60)
+#define IRDMA_UDA_CQPSQ_QHASH_LANFWD GENMASK_ULL(59, 59)
+#define IRDMA_UDA_CQPSQ_QHASH_ENTRYTYPE GENMASK_ULL(44, 42)
+#endif /* IRDMA_UDA_D_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 18/23] RDMA/irdma: Add user/kernel shared libraries
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (16 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 17/23] RDMA/irdma: Add RoCEv2 UD OP support Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 19/23] RDMA/irdma: Add miscellaneous utility definitions Shiraz Saleem
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Building the WQE descriptors for different verb
operations are similar in kernel and user-space.
Add these shared libraries.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/uk.c   | 1737 ++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/user.h |  464 ++++++++++
 2 files changed, 2201 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/uk.c
 create mode 100644 drivers/infiniband/hw/irdma/user.h

diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c
new file mode 100644
index 0000000..ca0b70e
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/uk.c
@@ -0,0 +1,1737 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "defs.h"
+#include "user.h"
+#include "irdma.h"
+
+/**
+ * irdma_set_fragment - set fragment in wqe
+ * @wqe: wqe for setting fragment
+ * @offset: offset value
+ * @sge: sge length and stag
+ * @valid: The wqe valid
+ */
+static void irdma_set_fragment(__le64 *wqe, u32 offset, struct irdma_sge *sge,
+			       u8 valid)
+{
+	if (sge) {
+		set_64bit_val(wqe, offset,
+			      FIELD_PREP(IRDMAQPSQ_FRAG_TO, sge->tag_off));
+		set_64bit_val(wqe, offset + 8,
+			      FIELD_PREP(IRDMAQPSQ_VALID, valid) |
+			      FIELD_PREP(IRDMAQPSQ_FRAG_LEN, sge->len) |
+			      FIELD_PREP(IRDMAQPSQ_FRAG_STAG, sge->stag));
+	} else {
+		set_64bit_val(wqe, offset, 0);
+		set_64bit_val(wqe, offset + 8,
+			      FIELD_PREP(IRDMAQPSQ_VALID, valid));
+	}
+}
+
+/**
+ * irdma_set_fragment_gen_1 - set fragment in wqe
+ * @wqe: wqe for setting fragment
+ * @offset: offset value
+ * @sge: sge length and stag
+ * @valid: wqe valid flag
+ */
+static void irdma_set_fragment_gen_1(__le64 *wqe, u32 offset,
+				     struct irdma_sge *sge, u8 valid)
+{
+	if (sge) {
+		set_64bit_val(wqe, offset,
+			      FIELD_PREP(IRDMAQPSQ_FRAG_TO, sge->tag_off));
+		set_64bit_val(wqe, offset + 8,
+			      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_LEN, sge->len) |
+			      FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_STAG, sge->stag));
+	} else {
+		set_64bit_val(wqe, offset, 0);
+		set_64bit_val(wqe, offset + 8, 0);
+	}
+}
+
+/**
+ * irdma_nop_1 - insert a NOP wqe
+ * @qp: hw qp ptr
+ */
+static enum irdma_status_code irdma_nop_1(struct irdma_qp_uk *qp)
+{
+	u64 hdr;
+	__le64 *wqe;
+	u32 wqe_idx;
+	bool signaled = false;
+
+	if (!qp->sq_ring.head)
+		return IRDMA_ERR_PARAM;
+
+	wqe_idx = IRDMA_RING_CURRENT_HEAD(qp->sq_ring);
+	wqe = qp->sq_base[wqe_idx].elem;
+
+	qp->sq_wrtrk_array[wqe_idx].quanta = IRDMA_QP_WQE_MIN_QUANTA;
+
+	set_64bit_val(wqe, 0, 0);
+	set_64bit_val(wqe, 8, 0);
+	set_64bit_val(wqe, 16, 0);
+
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_NOP) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	/* make sure WQE is written before valid bit is set */
+	dma_wmb();
+
+	set_64bit_val(wqe, 24, hdr);
+
+	return 0;
+}
+
+/**
+ * irdma_clr_wqes - clear next 128 sq entries
+ * @qp: hw qp ptr
+ * @qp_wqe_idx: wqe_idx
+ */
+void irdma_clr_wqes(struct irdma_qp_uk *qp, u32 qp_wqe_idx)
+{
+	__le64 *wqe;
+	u32 wqe_idx;
+
+	if (!(qp_wqe_idx & 0x7F)) {
+		wqe_idx = (qp_wqe_idx + 128) % qp->sq_ring.size;
+		wqe = qp->sq_base[wqe_idx].elem;
+		if (wqe_idx)
+			memset(wqe, qp->swqe_polarity ? 0 : 0xFF, 0x1000);
+		else
+			memset(wqe, qp->swqe_polarity ? 0xFF : 0, 0x1000);
+	}
+}
+
+/**
+ * irdma_qp_post_wr - ring doorbell
+ * @qp: hw qp ptr
+ */
+void irdma_qp_post_wr(struct irdma_qp_uk *qp)
+{
+	u64 temp;
+	u32 hw_sq_tail;
+	u32 sw_sq_head;
+
+	/* valid bit is written and loads completed before reading shadow */
+	mb();
+
+	/* read the doorbell shadow area */
+	get_64bit_val(qp->shadow_area, 0, &temp);
+
+	hw_sq_tail = (u32)FIELD_GET(IRDMA_QP_DBSA_HW_SQ_TAIL, temp);
+	sw_sq_head = IRDMA_RING_CURRENT_HEAD(qp->sq_ring);
+	if (sw_sq_head != qp->initial_ring.head) {
+		if (qp->push_dropped) {
+			writel(qp->qp_id, qp->wqe_alloc_db);
+			qp->push_dropped = false;
+		} else if (sw_sq_head != hw_sq_tail) {
+			if (sw_sq_head > qp->initial_ring.head) {
+				if (hw_sq_tail >= qp->initial_ring.head &&
+				    hw_sq_tail < sw_sq_head)
+					writel(qp->qp_id, qp->wqe_alloc_db);
+			} else {
+				if (hw_sq_tail >= qp->initial_ring.head ||
+				    hw_sq_tail < sw_sq_head)
+					writel(qp->qp_id, qp->wqe_alloc_db);
+			}
+		}
+	}
+
+	qp->initial_ring.head = qp->sq_ring.head;
+}
+
+/**
+ * irdma_qp_ring_push_db -  ring qp doorbell
+ * @qp: hw qp ptr
+ * @wqe_idx: wqe index
+ */
+static void irdma_qp_ring_push_db(struct irdma_qp_uk *qp, u32 wqe_idx)
+{
+	set_32bit_val(qp->push_db, 0,
+		      FIELD_PREP(IRDMA_WQEALLOC_WQE_DESC_INDEX, wqe_idx >> 3) | qp->qp_id);
+	qp->initial_ring.head = qp->sq_ring.head;
+	qp->push_mode = true;
+	qp->push_dropped = false;
+}
+
+void irdma_qp_push_wqe(struct irdma_qp_uk *qp, __le64 *wqe, u16 quanta,
+		       u32 wqe_idx, bool post_sq)
+{
+	__le64 *push;
+
+	if (IRDMA_RING_CURRENT_HEAD(qp->initial_ring) !=
+		    IRDMA_RING_CURRENT_TAIL(qp->sq_ring) &&
+	    !qp->push_mode) {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	} else {
+		push = (__le64 *)((uintptr_t)qp->push_wqe +
+				  (wqe_idx & 0x7) * 0x20);
+		memcpy(push, wqe, quanta * IRDMA_QP_WQE_MIN_SIZE);
+		irdma_qp_ring_push_db(qp, wqe_idx);
+	}
+}
+
+/**
+ * irdma_qp_get_next_send_wqe - pad with NOP if needed, return where next WR should go
+ * @qp: hw qp ptr
+ * @wqe_idx: return wqe index
+ * @quanta: size of WR in quanta
+ * @total_size: size of WR in bytes
+ * @info: info on WR
+ */
+__le64 *irdma_qp_get_next_send_wqe(struct irdma_qp_uk *qp, u32 *wqe_idx,
+				   u16 quanta, u32 total_size,
+				   struct irdma_post_sq_info *info)
+{
+	__le64 *wqe;
+	__le64 *wqe_0 = NULL;
+	u32 nop_wqe_idx;
+	u16 avail_quanta;
+	u16 i;
+
+	avail_quanta = qp->uk_attrs->max_hw_sq_chunk -
+		       (IRDMA_RING_CURRENT_HEAD(qp->sq_ring) %
+		       qp->uk_attrs->max_hw_sq_chunk);
+	if (quanta <= avail_quanta) {
+		/* WR fits in current chunk */
+		if (quanta > IRDMA_SQ_RING_FREE_QUANTA(qp->sq_ring))
+			return NULL;
+	} else {
+		/* Need to pad with NOP */
+		if (quanta + avail_quanta >
+			IRDMA_SQ_RING_FREE_QUANTA(qp->sq_ring))
+			return NULL;
+
+		nop_wqe_idx = IRDMA_RING_CURRENT_HEAD(qp->sq_ring);
+		for (i = 0; i < avail_quanta; i++) {
+			irdma_nop_1(qp);
+			IRDMA_RING_MOVE_HEAD_NOCHECK(qp->sq_ring);
+		}
+		if (qp->push_db && info->push_wqe)
+			irdma_qp_push_wqe(qp, qp->sq_base[nop_wqe_idx].elem,
+					  avail_quanta, nop_wqe_idx, true);
+	}
+
+	*wqe_idx = IRDMA_RING_CURRENT_HEAD(qp->sq_ring);
+	if (!*wqe_idx)
+		qp->swqe_polarity = !qp->swqe_polarity;
+
+	IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(qp->sq_ring, quanta);
+
+	wqe = qp->sq_base[*wqe_idx].elem;
+	if (qp->uk_attrs->hw_rev == IRDMA_GEN_1 && quanta == 1 &&
+	    (IRDMA_RING_CURRENT_HEAD(qp->sq_ring) & 1)) {
+		wqe_0 = qp->sq_base[IRDMA_RING_CURRENT_HEAD(qp->sq_ring)].elem;
+		wqe_0[3] = cpu_to_le64(FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity ? 0 : 1));
+	}
+	qp->sq_wrtrk_array[*wqe_idx].wrid = info->wr_id;
+	qp->sq_wrtrk_array[*wqe_idx].wr_len = total_size;
+	qp->sq_wrtrk_array[*wqe_idx].quanta = quanta;
+
+	return wqe;
+}
+
+/**
+ * irdma_qp_get_next_recv_wqe - get next qp's rcv wqe
+ * @qp: hw qp ptr
+ * @wqe_idx: return wqe index
+ */
+__le64 *irdma_qp_get_next_recv_wqe(struct irdma_qp_uk *qp, u32 *wqe_idx)
+{
+	__le64 *wqe;
+	enum irdma_status_code ret_code;
+
+	if (IRDMA_RING_FULL_ERR(qp->rq_ring))
+		return NULL;
+
+	IRDMA_ATOMIC_RING_MOVE_HEAD(qp->rq_ring, *wqe_idx, ret_code);
+	if (ret_code)
+		return NULL;
+
+	if (!*wqe_idx)
+		qp->rwqe_polarity = !qp->rwqe_polarity;
+	/* rq_wqe_size_multiplier is no of 32 byte quanta in one rq wqe */
+	wqe = qp->rq_base[*wqe_idx * qp->rq_wqe_size_multiplier].elem;
+
+	return wqe;
+}
+
+/**
+ * irdma_rdma_write - rdma write operation
+ * @qp: hw qp ptr
+ * @info: post sq information
+ * @post_sq: flag to post sq
+ */
+static enum irdma_status_code irdma_rdma_write(struct irdma_qp_uk *qp,
+					       struct irdma_post_sq_info *info,
+					       bool post_sq)
+{
+	u64 hdr;
+	__le64 *wqe;
+	struct irdma_rdma_write *op_info;
+	u32 i, wqe_idx;
+	u32 total_size = 0, byte_off;
+	enum irdma_status_code ret_code;
+	u32 frag_cnt, addl_frag_cnt;
+	bool read_fence = false;
+	u16 quanta;
+
+	info->push_wqe = qp->push_db ? true : false;
+
+	op_info = &info->op.rdma_write;
+	if (op_info->num_lo_sges > qp->max_sq_frag_cnt)
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+
+	for (i = 0; i < op_info->num_lo_sges; i++)
+		total_size += op_info->lo_sg_list[i].len;
+
+	read_fence |= info->read_fence;
+
+	if (info->imm_data_valid)
+		frag_cnt = op_info->num_lo_sges + 1;
+	else
+		frag_cnt = op_info->num_lo_sges;
+	addl_frag_cnt = frag_cnt > 1 ? (frag_cnt - 1) : 0;
+	ret_code = irdma_fragcnt_to_quanta_sq(frag_cnt, &quanta);
+	if (ret_code)
+		return ret_code;
+
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size,
+					 info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMAQPSQ_FRAG_TO, op_info->rem_addr.tag_off));
+
+	if (info->imm_data_valid) {
+		set_64bit_val(wqe, 0,
+			      FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data));
+		i = 0;
+	} else {
+		qp->wqe_ops.iw_set_fragment(wqe, 0,
+					    op_info->lo_sg_list,
+					    qp->swqe_polarity);
+		i = 1;
+	}
+
+	for (byte_off = 32; i < op_info->num_lo_sges; i++) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off,
+					    &op_info->lo_sg_list[i],
+					    qp->swqe_polarity);
+		byte_off += 16;
+	}
+
+	/* if not an odd number set valid bit in next fragment */
+	if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(frag_cnt & 0x01) &&
+	    frag_cnt) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL,
+					    qp->swqe_polarity);
+		if (qp->uk_attrs->hw_rev == IRDMA_GEN_2)
+			++addl_frag_cnt;
+	}
+
+	hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, op_info->rem_addr.stag) |
+	      FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) |
+	      FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG, info->imm_data_valid) |
+	      FIELD_PREP(IRDMAQPSQ_REPORTRTT, info->report_rtt) |
+	      FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+	if (info->push_wqe) {
+		irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_rdma_read - rdma read command
+ * @qp: hw qp ptr
+ * @info: post sq information
+ * @inv_stag: flag for inv_stag
+ * @post_sq: flag to post sq
+ */
+static enum irdma_status_code irdma_rdma_read(struct irdma_qp_uk *qp,
+					      struct irdma_post_sq_info *info,
+					      bool inv_stag, bool post_sq)
+{
+	struct irdma_rdma_read *op_info;
+	enum irdma_status_code ret_code;
+	u32 i, byte_off, total_size = 0;
+	bool local_fence = false;
+	u32 addl_frag_cnt;
+	__le64 *wqe;
+	u32 wqe_idx;
+	u16 quanta;
+	u64 hdr;
+
+	info->push_wqe = qp->push_db ? true : false;
+
+	op_info = &info->op.rdma_read;
+	if (qp->max_sq_frag_cnt < op_info->num_lo_sges)
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+
+	for (i = 0; i < op_info->num_lo_sges; i++)
+		total_size += op_info->lo_sg_list[i].len;
+
+	ret_code = irdma_fragcnt_to_quanta_sq(op_info->num_lo_sges, &quanta);
+	if (ret_code)
+		return ret_code;
+
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size,
+					 info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	addl_frag_cnt = op_info->num_lo_sges > 1 ?
+			(op_info->num_lo_sges - 1) : 0;
+	local_fence |= info->local_fence;
+
+	qp->wqe_ops.iw_set_fragment(wqe, 0, op_info->lo_sg_list,
+				    qp->swqe_polarity);
+	for (i = 1, byte_off = 32; i < op_info->num_lo_sges; ++i) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off,
+					    &op_info->lo_sg_list[i],
+					    qp->swqe_polarity);
+		byte_off += 16;
+	}
+
+	/* if not an odd number set valid bit in next fragment */
+	if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 &&
+	    !(op_info->num_lo_sges & 0x01) && op_info->num_lo_sges) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL,
+					    qp->swqe_polarity);
+		if (qp->uk_attrs->hw_rev == IRDMA_GEN_2)
+			++addl_frag_cnt;
+	}
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMAQPSQ_FRAG_TO, op_info->rem_addr.tag_off));
+	hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, op_info->rem_addr.stag) |
+	      FIELD_PREP(IRDMAQPSQ_REPORTRTT, (info->report_rtt ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) |
+	      FIELD_PREP(IRDMAQPSQ_OPCODE,
+			 (inv_stag ? IRDMAQP_OP_RDMA_READ_LOC_INV : IRDMAQP_OP_RDMA_READ)) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+	if (info->push_wqe) {
+		irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_send - rdma send command
+ * @qp: hw qp ptr
+ * @info: post sq information
+ * @post_sq: flag to post sq
+ */
+static enum irdma_status_code irdma_send(struct irdma_qp_uk *qp,
+					 struct irdma_post_sq_info *info,
+					 bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_post_send *op_info;
+	u64 hdr;
+	u32 i, wqe_idx, total_size = 0, byte_off;
+	enum irdma_status_code ret_code;
+	u32 frag_cnt, addl_frag_cnt;
+	bool read_fence = false;
+	u16 quanta;
+
+	info->push_wqe = qp->push_db ? true : false;
+
+	op_info = &info->op.send;
+	if (qp->max_sq_frag_cnt < op_info->num_sges)
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+
+	for (i = 0; i < op_info->num_sges; i++)
+		total_size += op_info->sg_list[i].len;
+
+	if (info->imm_data_valid)
+		frag_cnt = op_info->num_sges + 1;
+	else
+		frag_cnt = op_info->num_sges;
+	ret_code = irdma_fragcnt_to_quanta_sq(frag_cnt, &quanta);
+	if (ret_code)
+		return ret_code;
+
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size,
+					 info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	read_fence |= info->read_fence;
+	addl_frag_cnt = frag_cnt > 1 ? (frag_cnt - 1) : 0;
+	if (info->imm_data_valid) {
+		set_64bit_val(wqe, 0,
+			      FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data));
+		i = 0;
+	} else {
+		qp->wqe_ops.iw_set_fragment(wqe, 0, op_info->sg_list,
+					    qp->swqe_polarity);
+		i = 1;
+	}
+
+	for (byte_off = 32; i < op_info->num_sges; i++) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off, &op_info->sg_list[i],
+					    qp->swqe_polarity);
+		byte_off += 16;
+	}
+
+	/* if not an odd number set valid bit in next fragment */
+	if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(frag_cnt & 0x01) &&
+	    frag_cnt) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL,
+					    qp->swqe_polarity);
+		if (qp->uk_attrs->hw_rev == IRDMA_GEN_2)
+			++addl_frag_cnt;
+	}
+
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMAQPSQ_DESTQKEY, op_info->qkey) |
+		      FIELD_PREP(IRDMAQPSQ_DESTQPN, op_info->dest_qp));
+	hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, info->stag_to_inv) |
+	      FIELD_PREP(IRDMAQPSQ_AHID, op_info->ah_id) |
+	      FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG,
+			 (info->imm_data_valid ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_REPORTRTT, (info->report_rtt ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) |
+	      FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_UDPHEADER, info->udp_hdr) |
+	      FIELD_PREP(IRDMAQPSQ_L4LEN, info->l4len) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+	if (info->push_wqe) {
+		irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_set_mw_bind_wqe_gen_1 - set mw bind wqe
+ * @wqe: wqe for setting fragment
+ * @op_info: info for setting bind wqe values
+ */
+static void irdma_set_mw_bind_wqe_gen_1(__le64 *wqe,
+					struct irdma_bind_window *op_info)
+{
+	set_64bit_val(wqe, 0, (uintptr_t)op_info->va);
+	set_64bit_val(wqe, 8,
+		      FIELD_PREP(IRDMAQPSQ_PARENTMRSTAG, op_info->mw_stag) |
+		      FIELD_PREP(IRDMAQPSQ_MWSTAG, op_info->mr_stag));
+	set_64bit_val(wqe, 16, op_info->bind_len);
+}
+
+/**
+ * irdma_copy_inline_data_gen_1 - Copy inline data to wqe
+ * @dest: pointer to wqe
+ * @src: pointer to inline data
+ * @len: length of inline data to copy
+ * @polarity: compatibility parameter
+ */
+static void irdma_copy_inline_data_gen_1(u8 *dest, u8 *src, u32 len,
+					 u8 polarity)
+{
+	if (len <= 16) {
+		memcpy(dest, src, len);
+	} else {
+		memcpy(dest, src, 16);
+		src += 16;
+		dest = dest + 32;
+		memcpy(dest, src, len - 16);
+	}
+}
+
+/**
+ * irdma_inline_data_size_to_quanta_gen_1 - based on inline data, quanta
+ * @data_size: data size for inline
+ *
+ * Gets the quanta based on inline and immediate data.
+ */
+static inline u16 irdma_inline_data_size_to_quanta_gen_1(u32 data_size)
+{
+	return data_size <= 16 ? IRDMA_QP_WQE_MIN_QUANTA : 2;
+}
+
+/**
+ * irdma_set_mw_bind_wqe - set mw bind in wqe
+ * @wqe: wqe for setting mw bind
+ * @op_info: info for setting wqe values
+ */
+static void irdma_set_mw_bind_wqe(__le64 *wqe,
+				  struct irdma_bind_window *op_info)
+{
+	set_64bit_val(wqe, 0, (uintptr_t)op_info->va);
+	set_64bit_val(wqe, 8,
+		      FIELD_PREP(IRDMAQPSQ_PARENTMRSTAG, op_info->mr_stag) |
+		      FIELD_PREP(IRDMAQPSQ_MWSTAG, op_info->mw_stag));
+	set_64bit_val(wqe, 16, op_info->bind_len);
+}
+
+/**
+ * irdma_copy_inline_data - Copy inline data to wqe
+ * @dest: pointer to wqe
+ * @src: pointer to inline data
+ * @len: length of inline data to copy
+ * @polarity: polarity of wqe valid bit
+ */
+static void irdma_copy_inline_data(u8 *dest, u8 *src, u32 len, u8 polarity)
+{
+	u8 inline_valid = polarity << IRDMA_INLINE_VALID_S;
+	u32 copy_size;
+
+	dest += 8;
+	if (len <= 8) {
+		memcpy(dest, src, len);
+		return;
+	}
+
+	*((u64 *)dest) = *((u64 *)src);
+	len -= 8;
+	src += 8;
+	dest += 24; /* point to additional 32 byte quanta */
+
+	while (len) {
+		copy_size = len < 31 ? len : 31;
+		memcpy(dest, src, copy_size);
+		*(dest + 31) = inline_valid;
+		len -= copy_size;
+		dest += 32;
+		src += copy_size;
+	}
+}
+
+/**
+ * irdma_inline_data_size_to_quanta - based on inline data, quanta
+ * @data_size: data size for inline
+ *
+ * Gets the quanta based on inline and immediate data.
+ */
+static u16 irdma_inline_data_size_to_quanta(u32 data_size)
+{
+	if (data_size <= 8)
+		return IRDMA_QP_WQE_MIN_QUANTA;
+	else if (data_size <= 39)
+		return 2;
+	else if (data_size <= 70)
+		return 3;
+	else if (data_size <= 101)
+		return 4;
+	else if (data_size <= 132)
+		return 5;
+	else if (data_size <= 163)
+		return 6;
+	else if (data_size <= 194)
+		return 7;
+	else
+		return 8;
+}
+
+/**
+ * irdma_inline_rdma_write - inline rdma write operation
+ * @qp: hw qp ptr
+ * @info: post sq information
+ * @post_sq: flag to post sq
+ */
+static enum irdma_status_code
+irdma_inline_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info,
+			bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_inline_rdma_write *op_info;
+	u64 hdr = 0;
+	u32 wqe_idx;
+	bool read_fence = false;
+	u16 quanta;
+
+	info->push_wqe = qp->push_db ? true : false;
+	op_info = &info->op.inline_rdma_write;
+
+	if (op_info->len > qp->max_inline_data)
+		return IRDMA_ERR_INVALID_INLINE_DATA_SIZE;
+
+	quanta = qp->wqe_ops.iw_inline_data_size_to_quanta(op_info->len);
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, op_info->len,
+					 info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	read_fence |= info->read_fence;
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMAQPSQ_FRAG_TO, op_info->rem_addr.tag_off));
+
+	hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, op_info->rem_addr.stag) |
+	      FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) |
+	      FIELD_PREP(IRDMAQPSQ_INLINEDATALEN, op_info->len) |
+	      FIELD_PREP(IRDMAQPSQ_REPORTRTT, info->report_rtt ? 1 : 0) |
+	      FIELD_PREP(IRDMAQPSQ_INLINEDATAFLAG, 1) |
+	      FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG, info->imm_data_valid ? 1 : 0) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe ? 1 : 0) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	if (info->imm_data_valid)
+		set_64bit_val(wqe, 0,
+			      FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data));
+
+	qp->wqe_ops.iw_copy_inline_data((u8 *)wqe, op_info->data, op_info->len,
+					qp->swqe_polarity);
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	if (info->push_wqe) {
+		irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_inline_send - inline send operation
+ * @qp: hw qp ptr
+ * @info: post sq information
+ * @post_sq: flag to post sq
+ */
+static enum irdma_status_code irdma_inline_send(struct irdma_qp_uk *qp,
+						struct irdma_post_sq_info *info,
+						bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_post_inline_send *op_info;
+	u64 hdr;
+	u32 wqe_idx;
+	bool read_fence = false;
+	u16 quanta;
+
+	info->push_wqe = qp->push_db ? true : false;
+	op_info = &info->op.inline_send;
+
+	if (op_info->len > qp->max_inline_data)
+		return IRDMA_ERR_INVALID_INLINE_DATA_SIZE;
+
+	quanta = qp->wqe_ops.iw_inline_data_size_to_quanta(op_info->len);
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, op_info->len,
+					 info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	set_64bit_val(wqe, 16,
+		      FIELD_PREP(IRDMAQPSQ_DESTQKEY, op_info->qkey) |
+		      FIELD_PREP(IRDMAQPSQ_DESTQPN, op_info->dest_qp));
+
+	read_fence |= info->read_fence;
+	hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, info->stag_to_inv) |
+	      FIELD_PREP(IRDMAQPSQ_AHID, op_info->ah_id) |
+	      FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) |
+	      FIELD_PREP(IRDMAQPSQ_INLINEDATALEN, op_info->len) |
+	      FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG,
+			 (info->imm_data_valid ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_REPORTRTT, (info->report_rtt ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_INLINEDATAFLAG, 1) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_UDPHEADER, info->udp_hdr) |
+	      FIELD_PREP(IRDMAQPSQ_L4LEN, info->l4len) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	if (info->imm_data_valid)
+		set_64bit_val(wqe, 0,
+			      FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data));
+	qp->wqe_ops.iw_copy_inline_data((u8 *)wqe, op_info->data, op_info->len,
+					qp->swqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	if (info->push_wqe) {
+		irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_stag_local_invalidate - stag invalidate operation
+ * @qp: hw qp ptr
+ * @info: post sq information
+ * @post_sq: flag to post sq
+ */
+static enum irdma_status_code
+irdma_stag_local_invalidate(struct irdma_qp_uk *qp,
+			    struct irdma_post_sq_info *info, bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_inv_local_stag *op_info;
+	u64 hdr;
+	u32 wqe_idx;
+	bool local_fence = false;
+	struct irdma_sge sge = {};
+
+	info->push_wqe = qp->push_db ? true : false;
+	op_info = &info->op.inv_local_stag;
+	local_fence = info->local_fence;
+
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, IRDMA_QP_WQE_MIN_QUANTA,
+					 0, info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	sge.stag = op_info->target_stag;
+	qp->wqe_ops.iw_set_fragment(wqe, 0, &sge, 0);
+
+	set_64bit_val(wqe, 16, 0);
+
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMA_OP_TYPE_INV_STAG) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	if (info->push_wqe) {
+		irdma_qp_push_wqe(qp, wqe, IRDMA_QP_WQE_MIN_QUANTA, wqe_idx,
+				  post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_mw_bind - bind Memory Window
+ * @qp: hw qp ptr
+ * @info: post sq information
+ * @post_sq: flag to post sq
+ */
+static enum irdma_status_code irdma_mw_bind(struct irdma_qp_uk *qp,
+					    struct irdma_post_sq_info *info,
+					    bool post_sq)
+{
+	__le64 *wqe;
+	struct irdma_bind_window *op_info;
+	u64 hdr;
+	u32 wqe_idx;
+	bool local_fence = false;
+
+	info->push_wqe = qp->push_db ? true : false;
+	op_info = &info->op.bind_window;
+	local_fence |= info->local_fence;
+
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, IRDMA_QP_WQE_MIN_QUANTA,
+					 0, info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	qp->wqe_ops.iw_set_mw_bind_wqe(wqe, op_info);
+
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMA_OP_TYPE_BIND_MW) |
+	      FIELD_PREP(IRDMAQPSQ_STAGRIGHTS,
+			 ((op_info->ena_reads << 2) | (op_info->ena_writes << 3))) |
+	      FIELD_PREP(IRDMAQPSQ_VABASEDTO,
+			 (op_info->addressing_type == IRDMA_ADDR_TYPE_VA_BASED ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_MEMWINDOWTYPE,
+			 (op_info->mem_window_type_1 ? 1 : 0)) |
+	      FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) |
+	      FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) |
+	      FIELD_PREP(IRDMAQPSQ_LOCALFENCE, local_fence) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	if (info->push_wqe) {
+		irdma_qp_push_wqe(qp, wqe, IRDMA_QP_WQE_MIN_QUANTA, wqe_idx,
+				  post_sq);
+	} else {
+		if (post_sq)
+			irdma_qp_post_wr(qp);
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_post_receive - post receive wqe
+ * @qp: hw qp ptr
+ * @info: post rq information
+ */
+static enum irdma_status_code
+irdma_post_receive(struct irdma_qp_uk *qp, struct irdma_post_rq_info *info)
+{
+	u32 total_size = 0, wqe_idx, i, byte_off;
+	u32 addl_frag_cnt;
+	__le64 *wqe;
+	u64 hdr;
+
+	if (qp->max_rq_frag_cnt < info->num_sges)
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+
+	for (i = 0; i < info->num_sges; i++)
+		total_size += info->sg_list[i].len;
+
+	wqe = irdma_qp_get_next_recv_wqe(qp, &wqe_idx);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	qp->rq_wrid_array[wqe_idx] = info->wr_id;
+	addl_frag_cnt = info->num_sges > 1 ? (info->num_sges - 1) : 0;
+	qp->wqe_ops.iw_set_fragment(wqe, 0, info->sg_list,
+				    qp->rwqe_polarity);
+
+	for (i = 1, byte_off = 32; i < info->num_sges; i++) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off, &info->sg_list[i],
+					    qp->rwqe_polarity);
+		byte_off += 16;
+	}
+
+	/* if not an odd number set valid bit in next fragment */
+	if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(info->num_sges & 0x01) &&
+	    info->num_sges) {
+		qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL,
+					    qp->rwqe_polarity);
+		if (qp->uk_attrs->hw_rev == IRDMA_GEN_2)
+			++addl_frag_cnt;
+	}
+
+	set_64bit_val(wqe, 16, 0);
+	hdr = FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->rwqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+
+	return 0;
+}
+
+/**
+ * irdma_cq_resize - reset the cq buffer info
+ * @cq: cq to resize
+ * @cq_base: new cq buffer addr
+ * @cq_size: number of cqes
+ */
+static void irdma_cq_resize(struct irdma_cq_uk *cq, void *cq_base, int cq_size)
+{
+	cq->cq_base = cq_base;
+	cq->cq_size = cq_size;
+	IRDMA_RING_INIT(cq->cq_ring, cq->cq_size);
+	cq->polarity = 1;
+}
+
+/**
+ * irdma_cq_set_resized_cnt - record the count of the resized buffers
+ * @cq: cq to resize
+ * @cq_cnt: the count of the resized cq buffers
+ */
+static void irdma_cq_set_resized_cnt(struct irdma_cq_uk *cq, u16 cq_cnt)
+{
+	u64 temp_val;
+	u16 sw_cq_sel;
+	u8 arm_next_se;
+	u8 arm_next;
+	u8 arm_seq_num;
+
+	get_64bit_val(cq->shadow_area, 32, &temp_val);
+
+	sw_cq_sel = (u16)FIELD_GET(IRDMA_CQ_DBSA_SW_CQ_SELECT, temp_val);
+	sw_cq_sel += cq_cnt;
+
+	arm_seq_num = (u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_SEQ_NUM, temp_val);
+	arm_next_se = (u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_NEXT_SE, temp_val);
+	arm_next = (u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_NEXT, temp_val);
+
+	temp_val = FIELD_PREP(IRDMA_CQ_DBSA_ARM_SEQ_NUM, arm_seq_num) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_SW_CQ_SELECT, sw_cq_sel) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT_SE, arm_next_se) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT, arm_next);
+
+	set_64bit_val(cq->shadow_area, 32, temp_val);
+}
+
+/**
+ * irdma_cq_request_notification - cq notification request (door bell)
+ * @cq: hw cq
+ * @cq_notify: notification type
+ */
+static void irdma_cq_request_notification(struct irdma_cq_uk *cq,
+					  enum irdma_cmpl_notify cq_notify)
+{
+	u64 temp_val;
+	u16 sw_cq_sel;
+	u8 arm_next_se = 0;
+	u8 arm_next = 0;
+	u8 arm_seq_num;
+
+	get_64bit_val(cq->shadow_area, 32, &temp_val);
+	arm_seq_num = (u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_SEQ_NUM, temp_val);
+	arm_seq_num++;
+	sw_cq_sel = (u16)FIELD_GET(IRDMA_CQ_DBSA_SW_CQ_SELECT, temp_val);
+	arm_next_se = (u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_NEXT_SE, temp_val);
+	arm_next_se |= 1;
+	if (cq_notify == IRDMA_CQ_COMPL_EVENT)
+		arm_next = 1;
+	temp_val = FIELD_PREP(IRDMA_CQ_DBSA_ARM_SEQ_NUM, arm_seq_num) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_SW_CQ_SELECT, sw_cq_sel) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT_SE, arm_next_se) |
+		   FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT, arm_next);
+
+	set_64bit_val(cq->shadow_area, 32, temp_val);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	writel(cq->cq_id, cq->cqe_alloc_db);
+}
+
+/**
+ * irdma_cq_post_entries - update tail in shadow memory
+ * @cq: hw cq
+ * @count: # of entries processed
+ */
+static enum irdma_status_code irdma_cq_post_entries(struct irdma_cq_uk *cq,
+						    u8 count)
+{
+	IRDMA_RING_MOVE_TAIL_BY_COUNT(cq->cq_ring, count);
+	set_64bit_val(cq->shadow_area, 0,
+		      IRDMA_RING_CURRENT_HEAD(cq->cq_ring));
+
+	return 0;
+}
+/**
+ * irdma_cq_poll_cmpl - get cq completion info
+ * @cq: hw cq
+ * @info: cq poll information returned
+ */
+static enum irdma_status_code
+irdma_cq_poll_cmpl(struct irdma_cq_uk *cq, struct irdma_cq_poll_info *info)
+{
+	u64 comp_ctx, qword0, qword2, qword3;
+	__le64 *cqe;
+	struct irdma_qp_uk *qp;
+	struct irdma_ring *pring = NULL;
+	u32 wqe_idx, q_type;
+	enum irdma_status_code ret_code;
+	bool move_cq_head = true;
+	u8 polarity;
+	bool ext_valid;
+	__le64 *ext_cqe;
+
+	if (cq->avoid_mem_cflct)
+		cqe = IRDMA_GET_CURRENT_EXTENDED_CQ_ELEM(cq);
+	else
+		cqe = IRDMA_GET_CURRENT_CQ_ELEM(cq);
+
+	get_64bit_val(cqe, 24, &qword3);
+	polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword3);
+	if (polarity != cq->polarity)
+		return IRDMA_ERR_Q_EMPTY;
+
+	/* Ensure CQE contents are read after valid bit is checked */
+	dma_rmb();
+
+	ext_valid = (bool)FIELD_GET(IRDMA_CQ_EXTCQE, qword3);
+	if (ext_valid) {
+		u64 qword6, qword7;
+		u32 peek_head;
+
+		if (cq->avoid_mem_cflct) {
+			ext_cqe = (__le64 *)((u8 *)cqe + 32);
+			get_64bit_val(ext_cqe, 24, &qword7);
+			polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword3);
+		} else {
+			peek_head = (cq->cq_ring.head + 1) % cq->cq_ring.size;
+			ext_cqe = cq->cq_base[peek_head].buf;
+			get_64bit_val(ext_cqe, 24, &qword7);
+			polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword3);
+			if (!peek_head)
+				polarity ^= 1;
+		}
+		if (polarity != cq->polarity)
+			return IRDMA_ERR_Q_EMPTY;
+
+		/* Ensure ext CQE contents are read after ext valid bit is checked */
+		dma_rmb();
+
+		info->imm_valid = (bool)FIELD_GET(IRDMA_CQ_IMMVALID, qword7);
+		if (info->imm_valid) {
+			u64 qword4;
+
+			get_64bit_val(ext_cqe, 0, &qword4);
+			info->imm_data = (u32)FIELD_GET(IRDMA_CQ_IMMDATALOW32, qword4);
+		}
+		info->ud_smac_valid = (bool)FIELD_GET(IRDMA_CQ_UDSMACVALID, qword7);
+		info->ud_vlan_valid = (bool)FIELD_GET(IRDMA_CQ_UDVLANVALID, qword7);
+		if (info->ud_smac_valid || info->ud_vlan_valid) {
+			get_64bit_val(ext_cqe, 16, &qword6);
+			if (info->ud_vlan_valid)
+				info->ud_vlan = (u16)FIELD_GET(IRDMA_CQ_UDVLAN, qword6);
+			if (info->ud_smac_valid) {
+				info->ud_smac[5] = qword6 & 0xFF;
+				info->ud_smac[4] = (qword6 >> 8) & 0xFF;
+				info->ud_smac[3] = (qword6 >> 16) & 0xFF;
+				info->ud_smac[2] = (qword6 >> 24) & 0xFF;
+				info->ud_smac[1] = (qword6 >> 32) & 0xFF;
+				info->ud_smac[0] = (qword6 >> 40) & 0xFF;
+			}
+		}
+	} else {
+		info->imm_valid = false;
+		info->ud_smac_valid = false;
+		info->ud_vlan_valid = false;
+	}
+
+	q_type = (u8)FIELD_GET(IRDMA_CQ_SQ, qword3);
+	info->error = (bool)FIELD_GET(IRDMA_CQ_ERROR, qword3);
+	info->push_dropped = (bool)FIELD_GET(IRDMACQ_PSHDROP, qword3);
+	info->ipv4 = (bool)FIELD_GET(IRDMACQ_IPV4, qword3);
+	if (info->error) {
+		info->major_err = FIELD_GET(IRDMA_CQ_MAJERR, qword3);
+		info->minor_err = FIELD_GET(IRDMA_CQ_MINERR, qword3);
+		if (info->major_err == IRDMA_FLUSH_MAJOR_ERR) {
+			info->comp_status = IRDMA_COMPL_STATUS_FLUSHED;
+			/* Set the min error to standard flush error code for remaining cqes */
+			if (info->minor_err != FLUSH_GENERAL_ERR) {
+				qword3 &= ~IRDMA_CQ_MINERR;
+				qword3 |= FIELD_PREP(IRDMA_CQ_MINERR, FLUSH_GENERAL_ERR);
+				set_64bit_val(cqe, 24, qword3);
+			}
+		} else {
+			info->comp_status = IRDMA_COMPL_STATUS_UNKNOWN;
+		}
+	} else {
+		info->comp_status = IRDMA_COMPL_STATUS_SUCCESS;
+	}
+
+	get_64bit_val(cqe, 0, &qword0);
+	get_64bit_val(cqe, 16, &qword2);
+
+	info->tcp_seq_num_rtt = (u32)FIELD_GET(IRDMACQ_TCPSEQNUMRTT, qword0);
+	info->qp_id = (u32)FIELD_GET(IRDMACQ_QPID, qword2);
+	info->ud_src_qpn = (u32)FIELD_GET(IRDMACQ_UDSRCQPN, qword2);
+
+	get_64bit_val(cqe, 8, &comp_ctx);
+
+	info->solicited_event = (bool)FIELD_GET(IRDMACQ_SOEVENT, qword3);
+	qp = (struct irdma_qp_uk *)(unsigned long)comp_ctx;
+	if (!qp || qp->destroy_pending) {
+		ret_code = IRDMA_ERR_Q_DESTROYED;
+		goto exit;
+	}
+	wqe_idx = (u32)FIELD_GET(IRDMA_CQ_WQEIDX, qword3);
+	info->qp_handle = (irdma_qp_handle)(unsigned long)qp;
+
+	if (q_type == IRDMA_CQE_QTYPE_RQ) {
+		u32 array_idx;
+
+		array_idx = wqe_idx / qp->rq_wqe_size_multiplier;
+
+		if (info->comp_status == IRDMA_COMPL_STATUS_FLUSHED ||
+		    info->comp_status == IRDMA_COMPL_STATUS_UNKNOWN) {
+			if (!IRDMA_RING_MORE_WORK(qp->rq_ring)) {
+				ret_code = IRDMA_ERR_Q_EMPTY;
+				goto exit;
+			}
+
+			info->wr_id = qp->rq_wrid_array[qp->rq_ring.tail];
+			array_idx = qp->rq_ring.tail;
+		} else {
+			info->wr_id = qp->rq_wrid_array[array_idx];
+		}
+
+		info->bytes_xfered = (u32)FIELD_GET(IRDMACQ_PAYLDLEN, qword0);
+
+		if (info->imm_valid)
+			info->op_type = IRDMA_OP_TYPE_REC_IMM;
+		else
+			info->op_type = IRDMA_OP_TYPE_REC;
+		if (qword3 & IRDMACQ_STAG) {
+			info->stag_invalid_set = true;
+			info->inv_stag = (u32)FIELD_GET(IRDMACQ_INVSTAG, qword2);
+		} else {
+			info->stag_invalid_set = false;
+		}
+		IRDMA_RING_SET_TAIL(qp->rq_ring, array_idx + 1);
+		if (info->comp_status == IRDMA_COMPL_STATUS_FLUSHED) {
+			qp->rq_flush_seen = true;
+			if (!IRDMA_RING_MORE_WORK(qp->rq_ring))
+				qp->rq_flush_complete = true;
+			else
+				move_cq_head = false;
+		}
+		pring = &qp->rq_ring;
+	} else { /* q_type is IRDMA_CQE_QTYPE_SQ */
+		if (qp->first_sq_wq) {
+			if (wqe_idx + 1 >= qp->conn_wqes)
+				qp->first_sq_wq = false;
+
+			if (wqe_idx < qp->conn_wqes && qp->sq_ring.head == qp->sq_ring.tail) {
+				IRDMA_RING_MOVE_HEAD_NOCHECK(cq->cq_ring);
+				IRDMA_RING_MOVE_TAIL(cq->cq_ring);
+				set_64bit_val(cq->shadow_area, 0,
+					      IRDMA_RING_CURRENT_HEAD(cq->cq_ring));
+				memset(info, 0,
+				       sizeof(struct irdma_cq_poll_info));
+				return irdma_cq_poll_cmpl(cq, info);
+			}
+		}
+		/*cease posting push mode on push drop*/
+		if (info->push_dropped) {
+			qp->push_mode = false;
+			qp->push_dropped = true;
+		}
+		if (info->comp_status != IRDMA_COMPL_STATUS_FLUSHED) {
+			info->wr_id = qp->sq_wrtrk_array[wqe_idx].wrid;
+			if (!info->comp_status)
+				info->bytes_xfered = qp->sq_wrtrk_array[wqe_idx].wr_len;
+			info->op_type = (u8)FIELD_GET(IRDMACQ_OP, qword3);
+			IRDMA_RING_SET_TAIL(qp->sq_ring,
+					    wqe_idx + qp->sq_wrtrk_array[wqe_idx].quanta);
+		} else {
+			if (!IRDMA_RING_MORE_WORK(qp->sq_ring)) {
+				ret_code = IRDMA_ERR_Q_EMPTY;
+				goto exit;
+			}
+
+			do {
+				__le64 *sw_wqe;
+				u64 wqe_qword;
+				u8 op_type;
+				u32 tail;
+
+				tail = qp->sq_ring.tail;
+				sw_wqe = qp->sq_base[tail].elem;
+				get_64bit_val(sw_wqe, 24,
+					      &wqe_qword);
+				op_type = (u8)FIELD_GET(IRDMAQPSQ_OPCODE, wqe_qword);
+				info->op_type = op_type;
+				IRDMA_RING_SET_TAIL(qp->sq_ring,
+						    tail + qp->sq_wrtrk_array[tail].quanta);
+				if (op_type != IRDMAQP_OP_NOP) {
+					info->wr_id = qp->sq_wrtrk_array[tail].wrid;
+					info->bytes_xfered = qp->sq_wrtrk_array[tail].wr_len;
+					break;
+				}
+			} while (1);
+			qp->sq_flush_seen = true;
+			if (!IRDMA_RING_MORE_WORK(qp->sq_ring))
+				qp->sq_flush_complete = true;
+		}
+		pring = &qp->sq_ring;
+	}
+
+	ret_code = 0;
+
+exit:
+	if (!ret_code && info->comp_status == IRDMA_COMPL_STATUS_FLUSHED)
+		if (pring && IRDMA_RING_MORE_WORK(*pring))
+			move_cq_head = false;
+
+	if (move_cq_head) {
+		IRDMA_RING_MOVE_HEAD_NOCHECK(cq->cq_ring);
+		if (!IRDMA_RING_CURRENT_HEAD(cq->cq_ring))
+			cq->polarity ^= 1;
+
+		if (ext_valid && !cq->avoid_mem_cflct) {
+			IRDMA_RING_MOVE_HEAD_NOCHECK(cq->cq_ring);
+			if (!IRDMA_RING_CURRENT_HEAD(cq->cq_ring))
+				cq->polarity ^= 1;
+		}
+
+		IRDMA_RING_MOVE_TAIL(cq->cq_ring);
+		if (!cq->avoid_mem_cflct && ext_valid)
+			IRDMA_RING_MOVE_TAIL(cq->cq_ring);
+		set_64bit_val(cq->shadow_area, 0,
+			      IRDMA_RING_CURRENT_HEAD(cq->cq_ring));
+	} else {
+		qword3 &= ~IRDMA_CQ_WQEIDX;
+		qword3 |= FIELD_PREP(IRDMA_CQ_WQEIDX, pring->tail);
+		set_64bit_val(cqe, 24, qword3);
+	}
+
+	return ret_code;
+}
+
+/**
+ * irdma_qp_round_up - return round up qp wq depth
+ * @wqdepth: wq depth in quanta to round up
+ */
+static int irdma_qp_round_up(u32 wqdepth)
+{
+	int scount = 1;
+
+	for (wqdepth--; scount <= 16; scount *= 2)
+		wqdepth |= wqdepth >> scount;
+
+	return ++wqdepth;
+}
+
+/**
+ * irdma_get_wqe_shift - get shift count for maximum wqe size
+ * @uk_attrs: qp HW attributes
+ * @sge: Maximum Scatter Gather Elements wqe
+ * @inline_data: Maximum inline data size
+ * @shift: Returns the shift needed based on sge
+ *
+ * Shift can be used to left shift the wqe size based on number of SGEs and inlind data size.
+ * For 1 SGE or inline data <= 8, shift = 0 (wqe size of 32
+ * bytes). For 2 or 3 SGEs or inline data <= 39, shift = 1 (wqe
+ * size of 64 bytes).
+ * For 4-7 SGE's and inline <= 101 Shift of 2 otherwise (wqe
+ * size of 256 bytes).
+ */
+void irdma_get_wqe_shift(struct irdma_uk_attrs *uk_attrs, u32 sge,
+			 u32 inline_data, u8 *shift)
+{
+	*shift = 0;
+	if (uk_attrs->hw_rev >= IRDMA_GEN_2) {
+		if (sge > 1 || inline_data > 8) {
+			if (sge < 4 && inline_data <= 39)
+				*shift = 1;
+			else if (sge < 8 && inline_data <= 101)
+				*shift = 2;
+			else
+				*shift = 3;
+		}
+	} else if (sge > 1 || inline_data > 16) {
+		*shift = (sge < 4 && inline_data <= 48) ? 1 : 2;
+	}
+}
+
+/*
+ * irdma_get_sqdepth - get SQ depth (quanta)
+ * @uk_attrs: qp HW attributes
+ * @sq_size: SQ size
+ * @shift: shift which determines size of WQE
+ * @sqdepth: depth of SQ
+ *
+ */
+enum irdma_status_code irdma_get_sqdepth(struct irdma_uk_attrs *uk_attrs,
+					 u32 sq_size, u8 shift, u32 *sqdepth)
+{
+	*sqdepth = irdma_qp_round_up((sq_size << shift) + IRDMA_SQ_RSVD);
+
+	if (*sqdepth < (IRDMA_QP_SW_MIN_WQSIZE << shift))
+		*sqdepth = IRDMA_QP_SW_MIN_WQSIZE << shift;
+	else if (*sqdepth > uk_attrs->max_hw_wq_quanta)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	return 0;
+}
+
+/*
+ * irdma_get_rqdepth - get RQ depth (quanta)
+ * @uk_attrs: qp HW attributes
+ * @rq_size: RQ size
+ * @shift: shift which determines size of WQE
+ * @rqdepth: depth of RQ
+ */
+enum irdma_status_code irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs,
+					 u32 rq_size, u8 shift, u32 *rqdepth)
+{
+	*rqdepth = irdma_qp_round_up((rq_size << shift) + IRDMA_RQ_RSVD);
+
+	if (*rqdepth < (IRDMA_QP_SW_MIN_WQSIZE << shift))
+		*rqdepth = IRDMA_QP_SW_MIN_WQSIZE << shift;
+	else if (*rqdepth > uk_attrs->max_hw_rq_quanta)
+		return IRDMA_ERR_INVALID_SIZE;
+
+	return 0;
+}
+
+static const struct irdma_qp_uk_ops iw_qp_uk_ops = {
+	.iw_inline_rdma_write = irdma_inline_rdma_write,
+	.iw_inline_send = irdma_inline_send,
+	.iw_mw_bind = irdma_mw_bind,
+	.iw_post_nop = irdma_nop,
+	.iw_post_receive = irdma_post_receive,
+	.iw_qp_post_wr = irdma_qp_post_wr,
+	.iw_qp_ring_push_db = irdma_qp_ring_push_db,
+	.iw_rdma_read = irdma_rdma_read,
+	.iw_rdma_write = irdma_rdma_write,
+	.iw_send = irdma_send,
+	.iw_stag_local_invalidate = irdma_stag_local_invalidate,
+};
+
+static const struct irdma_wqe_uk_ops iw_wqe_uk_ops = {
+	.iw_copy_inline_data = irdma_copy_inline_data,
+	.iw_inline_data_size_to_quanta = irdma_inline_data_size_to_quanta,
+	.iw_set_fragment = irdma_set_fragment,
+	.iw_set_mw_bind_wqe = irdma_set_mw_bind_wqe,
+};
+
+static const struct irdma_wqe_uk_ops iw_wqe_uk_ops_gen_1 = {
+	.iw_copy_inline_data = irdma_copy_inline_data_gen_1,
+	.iw_inline_data_size_to_quanta = irdma_inline_data_size_to_quanta_gen_1,
+	.iw_set_fragment = irdma_set_fragment_gen_1,
+	.iw_set_mw_bind_wqe = irdma_set_mw_bind_wqe_gen_1,
+};
+
+static const struct irdma_cq_ops iw_cq_ops = {
+	.iw_cq_clean = irdma_clean_cq,
+	.iw_cq_poll_cmpl = irdma_cq_poll_cmpl,
+	.iw_cq_post_entries = irdma_cq_post_entries,
+	.iw_cq_request_notification = irdma_cq_request_notification,
+	.iw_cq_resize = irdma_cq_resize,
+	.iw_cq_set_resized_cnt = irdma_cq_set_resized_cnt,
+};
+
+static const struct irdma_device_uk_ops iw_device_uk_ops = {
+	.iw_cq_uk_init = irdma_cq_uk_init,
+	.iw_qp_uk_init = irdma_qp_uk_init,
+};
+
+/**
+ * irdma_setup_connection_wqes - setup WQEs necessary to complete
+ * connection.
+ * @qp: hw qp (user and kernel)
+ * @info: qp initialization info
+ */
+static void irdma_setup_connection_wqes(struct irdma_qp_uk *qp,
+					struct irdma_qp_uk_init_info *info)
+{
+	u16 move_cnt = 1;
+
+	if (!info->legacy_mode &&
+	    (qp->uk_attrs->feature_flags & IRDMA_FEATURE_RTS_AE))
+		move_cnt = 3;
+
+	qp->conn_wqes = move_cnt;
+	IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(qp->sq_ring, move_cnt);
+	IRDMA_RING_MOVE_TAIL_BY_COUNT(qp->sq_ring, move_cnt);
+	IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(qp->initial_ring, move_cnt);
+}
+
+/**
+ * irdma_qp_uk_init - initialize shared qp
+ * @qp: hw qp (user and kernel)
+ * @info: qp initialization info
+ *
+ * initializes the vars used in both user and kernel mode.
+ * size of the wqe depends on numbers of max. fragements
+ * allowed. Then size of wqe * the number of wqes should be the
+ * amount of memory allocated for sq and rq.
+ */
+enum irdma_status_code irdma_qp_uk_init(struct irdma_qp_uk *qp,
+					struct irdma_qp_uk_init_info *info)
+{
+	enum irdma_status_code ret_code = 0;
+	u32 sq_ring_size;
+	u8 sqshift, rqshift;
+
+	qp->uk_attrs = info->uk_attrs;
+	if (info->max_sq_frag_cnt > qp->uk_attrs->max_hw_wq_frags ||
+	    info->max_rq_frag_cnt > qp->uk_attrs->max_hw_wq_frags)
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+
+	irdma_get_wqe_shift(qp->uk_attrs, info->max_rq_frag_cnt, 0, &rqshift);
+	if (qp->uk_attrs->hw_rev == IRDMA_GEN_1) {
+		irdma_get_wqe_shift(qp->uk_attrs, info->max_sq_frag_cnt,
+				    info->max_inline_data, &sqshift);
+		if (info->abi_ver > 4)
+			rqshift = IRDMA_MAX_RQ_WQE_SHIFT_GEN1;
+	} else {
+		irdma_get_wqe_shift(qp->uk_attrs, info->max_sq_frag_cnt + 1,
+				    info->max_inline_data, &sqshift);
+	}
+	qp->qp_caps = info->qp_caps;
+	qp->sq_base = info->sq;
+	qp->rq_base = info->rq;
+	qp->qp_type = info->type ? info->type : IRDMA_QP_TYPE_IWARP;
+	qp->shadow_area = info->shadow_area;
+	qp->sq_wrtrk_array = info->sq_wrtrk_array;
+
+	qp->rq_wrid_array = info->rq_wrid_array;
+	qp->wqe_alloc_db = info->wqe_alloc_db;
+	qp->qp_id = info->qp_id;
+	qp->sq_size = info->sq_size;
+	qp->push_mode = false;
+	qp->max_sq_frag_cnt = info->max_sq_frag_cnt;
+	sq_ring_size = qp->sq_size << sqshift;
+	IRDMA_RING_INIT(qp->sq_ring, sq_ring_size);
+	IRDMA_RING_INIT(qp->initial_ring, sq_ring_size);
+	if (info->first_sq_wq) {
+		irdma_setup_connection_wqes(qp, info);
+		qp->swqe_polarity = 1;
+		qp->first_sq_wq = true;
+	} else {
+		qp->swqe_polarity = 0;
+	}
+	qp->swqe_polarity_deferred = 1;
+	qp->rwqe_polarity = 0;
+	qp->rq_size = info->rq_size;
+	qp->max_rq_frag_cnt = info->max_rq_frag_cnt;
+	qp->max_inline_data = info->max_inline_data;
+	qp->rq_wqe_size = rqshift;
+	IRDMA_RING_INIT(qp->rq_ring, qp->rq_size);
+	qp->rq_wqe_size_multiplier = 1 << rqshift;
+	qp->qp_ops = iw_qp_uk_ops;
+	if (qp->uk_attrs->hw_rev == IRDMA_GEN_1)
+		qp->wqe_ops = iw_wqe_uk_ops_gen_1;
+	else
+		qp->wqe_ops = iw_wqe_uk_ops;
+	return ret_code;
+}
+
+/**
+ * irdma_cq_uk_init - initialize shared cq (user and kernel)
+ * @cq: hw cq
+ * @info: hw cq initialization info
+ */
+enum irdma_status_code irdma_cq_uk_init(struct irdma_cq_uk *cq,
+					struct irdma_cq_uk_init_info *info)
+{
+	cq->cq_base = info->cq_base;
+	cq->cq_id = info->cq_id;
+	cq->cq_size = info->cq_size;
+	cq->cqe_alloc_db = info->cqe_alloc_db;
+	cq->cq_ack_db = info->cq_ack_db;
+	cq->shadow_area = info->shadow_area;
+	cq->avoid_mem_cflct = info->avoid_mem_cflct;
+	IRDMA_RING_INIT(cq->cq_ring, cq->cq_size);
+	cq->polarity = 1;
+	cq->ops = iw_cq_ops;
+
+	return 0;
+}
+
+/**
+ * irdma_device_init_uk - setup routines for iwarp shared device
+ * @dev: iwarp shared (user and kernel)
+ */
+void irdma_device_init_uk(struct irdma_dev_uk *dev)
+{
+	dev->ops_uk = iw_device_uk_ops;
+}
+
+/**
+ * irdma_clean_cq - clean cq entries
+ * @q: completion context
+ * @cq: cq to clean
+ */
+void irdma_clean_cq(void *q, struct irdma_cq_uk *cq)
+{
+	__le64 *cqe;
+	u64 qword3, comp_ctx;
+	u32 cq_head;
+	u8 polarity, temp;
+
+	cq_head = cq->cq_ring.head;
+	temp = cq->polarity;
+	do {
+		if (cq->avoid_mem_cflct)
+			cqe = ((struct irdma_extended_cqe *)(cq->cq_base))[cq_head].buf;
+		else
+			cqe = cq->cq_base[cq_head].buf;
+		get_64bit_val(cqe, 24, &qword3);
+		polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword3);
+
+		if (polarity != temp)
+			break;
+
+		get_64bit_val(cqe, 8, &comp_ctx);
+		if ((void *)(unsigned long)comp_ctx == q)
+			set_64bit_val(cqe, 8, 0);
+
+		cq_head = (cq_head + 1) % cq->cq_ring.size;
+		if (!cq_head)
+			temp ^= 1;
+	} while (true);
+}
+
+/**
+ * irdma_nop - post a nop
+ * @qp: hw qp ptr
+ * @wr_id: work request id
+ * @signaled: signaled for completion
+ * @post_sq: ring doorbell
+ */
+enum irdma_status_code irdma_nop(struct irdma_qp_uk *qp, u64 wr_id,
+				 bool signaled, bool post_sq)
+{
+	__le64 *wqe;
+	u64 hdr;
+	u32 wqe_idx;
+	struct irdma_post_sq_info info = {};
+
+	info.push_wqe = false;
+	info.wr_id = wr_id;
+	wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, IRDMA_QP_WQE_MIN_QUANTA,
+					 0, &info);
+	if (!wqe)
+		return IRDMA_ERR_QP_TOOMANY_WRS_POSTED;
+
+	irdma_clr_wqes(qp, wqe_idx);
+
+	set_64bit_val(wqe, 0, 0);
+	set_64bit_val(wqe, 8, 0);
+	set_64bit_val(wqe, 16, 0);
+
+	hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_NOP) |
+	      FIELD_PREP(IRDMAQPSQ_SIGCOMPL, signaled) |
+	      FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity);
+
+	dma_wmb(); /* make sure WQE is populated before valid bit is set */
+
+	set_64bit_val(wqe, 24, hdr);
+	if (post_sq)
+		irdma_qp_post_wr(qp);
+
+	return 0;
+}
+
+/**
+ * irdma_fragcnt_to_quanta_sq - calculate quanta based on fragment count for SQ
+ * @frag_cnt: number of fragments
+ * @quanta: quanta for frag_cnt
+ */
+enum irdma_status_code irdma_fragcnt_to_quanta_sq(u32 frag_cnt, u16 *quanta)
+{
+	switch (frag_cnt) {
+	case 0:
+	case 1:
+		*quanta = IRDMA_QP_WQE_MIN_QUANTA;
+		break;
+	case 2:
+	case 3:
+		*quanta = 2;
+		break;
+	case 4:
+	case 5:
+		*quanta = 3;
+		break;
+	case 6:
+	case 7:
+		*quanta = 4;
+		break;
+	case 8:
+	case 9:
+		*quanta = 5;
+		break;
+	case 10:
+	case 11:
+		*quanta = 6;
+		break;
+	case 12:
+	case 13:
+		*quanta = 7;
+		break;
+	case 14:
+	case 15: /* when immediate data is present */
+		*quanta = 8;
+		break;
+	default:
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_fragcnt_to_wqesize_rq - calculate wqe size based on fragment count for RQ
+ * @frag_cnt: number of fragments
+ * @wqe_size: size in bytes given frag_cnt
+ */
+enum irdma_status_code irdma_fragcnt_to_wqesize_rq(u32 frag_cnt, u16 *wqe_size)
+{
+	switch (frag_cnt) {
+	case 0:
+	case 1:
+		*wqe_size = 32;
+		break;
+	case 2:
+	case 3:
+		*wqe_size = 64;
+		break;
+	case 4:
+	case 5:
+	case 6:
+	case 7:
+		*wqe_size = 128;
+		break;
+	case 8:
+	case 9:
+	case 10:
+	case 11:
+	case 12:
+	case 13:
+	case 14:
+		*wqe_size = 256;
+		break;
+	default:
+		return IRDMA_ERR_INVALID_FRAG_COUNT;
+	}
+
+	return 0;
+}
diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h
new file mode 100644
index 0000000..a81cced
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/user.h
@@ -0,0 +1,464 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2020 Intel Corporation */
+#ifndef IRDMA_USER_H
+#define IRDMA_USER_H
+
+#define irdma_handle void *
+#define irdma_adapter_handle irdma_handle
+#define irdma_qp_handle irdma_handle
+#define irdma_cq_handle irdma_handle
+#define irdma_pd_id irdma_handle
+#define irdma_stag_handle irdma_handle
+#define irdma_stag_index u32
+#define irdma_stag u32
+#define irdma_stag_key u8
+#define irdma_tagged_offset u64
+#define irdma_access_privileges u32
+#define irdma_physical_fragment u64
+#define irdma_address_list u64 *
+#define irdma_sgl struct irdma_sge *
+
+#define	IRDMA_MAX_MR_SIZE       0x200000000000ULL
+
+#define IRDMA_ACCESS_FLAGS_LOCALREAD		0x01
+#define IRDMA_ACCESS_FLAGS_LOCALWRITE		0x02
+#define IRDMA_ACCESS_FLAGS_REMOTEREAD_ONLY	0x04
+#define IRDMA_ACCESS_FLAGS_REMOTEREAD		0x05
+#define IRDMA_ACCESS_FLAGS_REMOTEWRITE_ONLY	0x08
+#define IRDMA_ACCESS_FLAGS_REMOTEWRITE		0x0a
+#define IRDMA_ACCESS_FLAGS_BIND_WINDOW		0x10
+#define IRDMA_ACCESS_FLAGS_ZERO_BASED		0x20
+#define IRDMA_ACCESS_FLAGS_ALL			0x3f
+
+#define IRDMA_OP_TYPE_RDMA_WRITE		0x00
+#define IRDMA_OP_TYPE_RDMA_READ			0x01
+#define IRDMA_OP_TYPE_SEND			0x03
+#define IRDMA_OP_TYPE_SEND_INV			0x04
+#define IRDMA_OP_TYPE_SEND_SOL			0x05
+#define IRDMA_OP_TYPE_SEND_SOL_INV		0x06
+#define IRDMA_OP_TYPE_RDMA_WRITE_SOL		0x0d
+#define IRDMA_OP_TYPE_BIND_MW			0x08
+#define IRDMA_OP_TYPE_FAST_REG_NSMR		0x09
+#define IRDMA_OP_TYPE_INV_STAG			0x0a
+#define IRDMA_OP_TYPE_RDMA_READ_INV_STAG	0x0b
+#define IRDMA_OP_TYPE_NOP			0x0c
+#define IRDMA_OP_TYPE_REC	0x3e
+#define IRDMA_OP_TYPE_REC_IMM	0x3f
+
+#define IRDMA_FLUSH_MAJOR_ERR	1
+
+enum irdma_device_caps_const {
+	IRDMA_WQE_SIZE =			4,
+	IRDMA_CQP_WQE_SIZE =			8,
+	IRDMA_CQE_SIZE =			4,
+	IRDMA_EXTENDED_CQE_SIZE =		8,
+	IRDMA_AEQE_SIZE =			2,
+	IRDMA_CEQE_SIZE =			1,
+	IRDMA_CQP_CTX_SIZE =			8,
+	IRDMA_SHADOW_AREA_SIZE =		8,
+	IRDMA_QUERY_FPM_BUF_SIZE =		176,
+	IRDMA_COMMIT_FPM_BUF_SIZE =		176,
+	IRDMA_GATHER_STATS_BUF_SIZE =		1024,
+	IRDMA_MIN_IW_QP_ID =			0,
+	IRDMA_MAX_IW_QP_ID =			262143,
+	IRDMA_MIN_CEQID =			0,
+	IRDMA_MAX_CEQID =			1023,
+	IRDMA_CEQ_MAX_COUNT =			IRDMA_MAX_CEQID + 1,
+	IRDMA_MIN_CQID =			0,
+	IRDMA_MAX_CQID =			524287,
+	IRDMA_MIN_AEQ_ENTRIES =			1,
+	IRDMA_MAX_AEQ_ENTRIES =			524287,
+	IRDMA_MIN_CEQ_ENTRIES =			1,
+	IRDMA_MAX_CEQ_ENTRIES =			262143,
+	IRDMA_MIN_CQ_SIZE =			1,
+	IRDMA_MAX_CQ_SIZE =			1048575,
+	IRDMA_DB_ID_ZERO =			0,
+	IRDMA_MAX_WQ_FRAGMENT_COUNT =		13,
+	IRDMA_MAX_SGE_RD =			13,
+	IRDMA_MAX_OUTBOUND_MSG_SIZE =		2147483647,
+	IRDMA_MAX_INBOUND_MSG_SIZE =		2147483647,
+	IRDMA_MAX_PUSH_PAGE_COUNT =		1024,
+	IRDMA_MAX_PE_ENA_VF_COUNT =		32,
+	IRDMA_MAX_VF_FPM_ID =			47,
+	IRDMA_MAX_SQ_PAYLOAD_SIZE =		2145386496,
+	IRDMA_MAX_INLINE_DATA_SIZE =		101,
+	IRDMA_MAX_WQ_ENTRIES =			32768,
+	IRDMA_Q2_BUF_SIZE =			256,
+	IRDMA_QP_CTX_SIZE =			256,
+	IRDMA_MAX_PDS =				262144,
+};
+
+enum irdma_addressing_type {
+	IRDMA_ADDR_TYPE_ZERO_BASED = 0,
+	IRDMA_ADDR_TYPE_VA_BASED   = 1,
+};
+
+enum irdma_flush_opcode {
+	FLUSH_INVALID = 0,
+	FLUSH_GENERAL_ERR,
+	FLUSH_PROT_ERR,
+	FLUSH_REM_ACCESS_ERR,
+	FLUSH_LOC_QP_OP_ERR,
+	FLUSH_REM_OP_ERR,
+	FLUSH_LOC_LEN_ERR,
+	FLUSH_FATAL_ERR,
+};
+
+enum irdma_cmpl_status {
+	IRDMA_COMPL_STATUS_SUCCESS = 0,
+	IRDMA_COMPL_STATUS_FLUSHED,
+	IRDMA_COMPL_STATUS_INVALID_WQE,
+	IRDMA_COMPL_STATUS_QP_CATASTROPHIC,
+	IRDMA_COMPL_STATUS_REMOTE_TERMINATION,
+	IRDMA_COMPL_STATUS_INVALID_STAG,
+	IRDMA_COMPL_STATUS_BASE_BOUND_VIOLATION,
+	IRDMA_COMPL_STATUS_ACCESS_VIOLATION,
+	IRDMA_COMPL_STATUS_INVALID_PD_ID,
+	IRDMA_COMPL_STATUS_WRAP_ERROR,
+	IRDMA_COMPL_STATUS_STAG_INVALID_PDID,
+	IRDMA_COMPL_STATUS_RDMA_READ_ZERO_ORD,
+	IRDMA_COMPL_STATUS_QP_NOT_PRIVLEDGED,
+	IRDMA_COMPL_STATUS_STAG_NOT_INVALID,
+	IRDMA_COMPL_STATUS_INVALID_PHYS_BUF_SIZE,
+	IRDMA_COMPL_STATUS_INVALID_PHYS_BUF_ENTRY,
+	IRDMA_COMPL_STATUS_INVALID_FBO,
+	IRDMA_COMPL_STATUS_INVALID_LEN,
+	IRDMA_COMPL_STATUS_INVALID_ACCESS,
+	IRDMA_COMPL_STATUS_PHYS_BUF_LIST_TOO_LONG,
+	IRDMA_COMPL_STATUS_INVALID_VIRT_ADDRESS,
+	IRDMA_COMPL_STATUS_INVALID_REGION,
+	IRDMA_COMPL_STATUS_INVALID_WINDOW,
+	IRDMA_COMPL_STATUS_INVALID_TOTAL_LEN,
+	IRDMA_COMPL_STATUS_UNKNOWN,
+};
+
+enum irdma_cmpl_notify {
+	IRDMA_CQ_COMPL_EVENT     = 0,
+	IRDMA_CQ_COMPL_SOLICITED = 1,
+};
+
+enum irdma_qp_caps {
+	IRDMA_WRITE_WITH_IMM = 1,
+	IRDMA_SEND_WITH_IMM  = 2,
+	IRDMA_ROCE	     = 4,
+	IRDMA_PUSH_MODE      = 8,
+};
+
+struct irdma_qp_uk;
+struct irdma_cq_uk;
+struct irdma_qp_uk_init_info;
+struct irdma_cq_uk_init_info;
+
+struct irdma_sge {
+	irdma_tagged_offset tag_off;
+	u32 len;
+	irdma_stag stag;
+};
+
+struct irdma_ring {
+	u32 head;
+	u32 tail;
+	u32 size;
+};
+
+struct irdma_cqe {
+	__le64 buf[IRDMA_CQE_SIZE];
+};
+
+struct irdma_extended_cqe {
+	__le64 buf[IRDMA_EXTENDED_CQE_SIZE];
+};
+
+struct irdma_post_send {
+	irdma_sgl sg_list;
+	u32 num_sges;
+	u32 qkey;
+	u32 dest_qp;
+	u32 ah_id;
+};
+
+struct irdma_post_inline_send {
+	void *data;
+	u32 len;
+	u32 qkey;
+	u32 dest_qp;
+	u32 ah_id;
+};
+
+struct irdma_post_rq_info {
+	u64 wr_id;
+	irdma_sgl sg_list;
+	u32 num_sges;
+};
+
+struct irdma_rdma_write {
+	irdma_sgl lo_sg_list;
+	u32 num_lo_sges;
+	struct irdma_sge rem_addr;
+};
+
+struct irdma_inline_rdma_write {
+	void *data;
+	u32 len;
+	struct irdma_sge rem_addr;
+};
+
+struct irdma_rdma_read {
+	irdma_sgl lo_sg_list;
+	u32 num_lo_sges;
+	struct irdma_sge rem_addr;
+};
+
+struct irdma_bind_window {
+	irdma_stag mr_stag;
+	u64 bind_len;
+	void *va;
+	enum irdma_addressing_type addressing_type;
+	bool ena_reads:1;
+	bool ena_writes:1;
+	irdma_stag mw_stag;
+	bool mem_window_type_1:1;
+};
+
+struct irdma_inv_local_stag {
+	irdma_stag target_stag;
+};
+
+struct irdma_post_sq_info {
+	u64 wr_id;
+	u8 op_type;
+	u8 l4len;
+	bool signaled:1;
+	bool read_fence:1;
+	bool local_fence:1;
+	bool inline_data:1;
+	bool imm_data_valid:1;
+	bool push_wqe:1;
+	bool report_rtt:1;
+	bool udp_hdr:1;
+	bool defer_flag:1;
+	u32 imm_data;
+	u32 stag_to_inv;
+	union {
+		struct irdma_post_send send;
+		struct irdma_rdma_write rdma_write;
+		struct irdma_rdma_read rdma_read;
+		struct irdma_bind_window bind_window;
+		struct irdma_inv_local_stag inv_local_stag;
+		struct irdma_inline_rdma_write inline_rdma_write;
+		struct irdma_post_inline_send inline_send;
+	} op;
+};
+
+struct irdma_cq_poll_info {
+	u64 wr_id;
+	irdma_qp_handle qp_handle;
+	u32 bytes_xfered;
+	u32 tcp_seq_num_rtt;
+	u32 qp_id;
+	u32 ud_src_qpn;
+	u32 imm_data;
+	irdma_stag inv_stag; /* or L_R_Key */
+	enum irdma_cmpl_status comp_status;
+	u16 major_err;
+	u16 minor_err;
+	u16 ud_vlan;
+	u8 ud_smac[6];
+	u8 op_type;
+	bool stag_invalid_set:1; /* or L_R_Key set */
+	bool push_dropped:1;
+	bool error:1;
+	bool solicited_event:1;
+	bool ipv4:1;
+	bool ud_vlan_valid:1;
+	bool ud_smac_valid:1;
+	bool imm_valid:1;
+};
+
+struct irdma_qp_uk_ops {
+	enum irdma_status_code (*iw_inline_rdma_write)(struct irdma_qp_uk *qp,
+						       struct irdma_post_sq_info *info,
+						       bool post_sq);
+	enum irdma_status_code (*iw_inline_send)(struct irdma_qp_uk *qp,
+						 struct irdma_post_sq_info *info,
+						 bool post_sq);
+	enum irdma_status_code (*iw_mw_bind)(struct irdma_qp_uk *qp,
+					     struct irdma_post_sq_info *info,
+					     bool post_sq);
+	enum irdma_status_code (*iw_post_nop)(struct irdma_qp_uk *qp, u64 wr_id,
+					      bool signaled, bool post_sq);
+	enum irdma_status_code (*iw_post_receive)(struct irdma_qp_uk *qp,
+						  struct irdma_post_rq_info *info);
+	void (*iw_qp_post_wr)(struct irdma_qp_uk *qp);
+	void (*iw_qp_ring_push_db)(struct irdma_qp_uk *qp, u32 wqe_index);
+	enum irdma_status_code (*iw_rdma_read)(struct irdma_qp_uk *qp,
+					       struct irdma_post_sq_info *info,
+					       bool inv_stag, bool post_sq);
+	enum irdma_status_code (*iw_rdma_write)(struct irdma_qp_uk *qp,
+						struct irdma_post_sq_info *info,
+						bool post_sq);
+	enum irdma_status_code (*iw_send)(struct irdma_qp_uk *qp,
+					  struct irdma_post_sq_info *info,
+					  bool post_sq);
+	enum irdma_status_code (*iw_stag_local_invalidate)(struct irdma_qp_uk *qp,
+							   struct irdma_post_sq_info *info,
+							   bool post_sq);
+};
+
+struct irdma_wqe_uk_ops {
+	void (*iw_copy_inline_data)(u8 *dest, u8 *src, u32 len, u8 polarity);
+	u16 (*iw_inline_data_size_to_quanta)(u32 data_size);
+	void (*iw_set_fragment)(__le64 *wqe, u32 offset, struct irdma_sge *sge,
+				u8 valid);
+	void (*iw_set_mw_bind_wqe)(__le64 *wqe,
+				   struct irdma_bind_window *op_info);
+};
+
+struct irdma_cq_ops {
+	void (*iw_cq_clean)(void *q, struct irdma_cq_uk *cq);
+	enum irdma_status_code (*iw_cq_poll_cmpl)(struct irdma_cq_uk *cq,
+						  struct irdma_cq_poll_info *info);
+	enum irdma_status_code (*iw_cq_post_entries)(struct irdma_cq_uk *cq,
+						     u8 count);
+	void (*iw_cq_request_notification)(struct irdma_cq_uk *cq,
+					   enum irdma_cmpl_notify cq_notify);
+	void (*iw_cq_resize)(struct irdma_cq_uk *cq, void *cq_base, int size);
+	void (*iw_cq_set_resized_cnt)(struct irdma_cq_uk *qp, u16 cnt);
+};
+
+struct irdma_dev_uk;
+
+struct irdma_device_uk_ops {
+	enum irdma_status_code (*iw_cq_uk_init)(struct irdma_cq_uk *cq,
+						struct irdma_cq_uk_init_info *info);
+	enum irdma_status_code (*iw_qp_uk_init)(struct irdma_qp_uk *qp,
+						struct irdma_qp_uk_init_info *info);
+};
+
+struct irdma_dev_uk {
+	struct irdma_device_uk_ops ops_uk;
+};
+
+struct irdma_sq_uk_wr_trk_info {
+	u64 wrid;
+	u32 wr_len;
+	u16 quanta;
+	u8 reserved[2];
+};
+
+struct irdma_qp_quanta {
+	__le64 elem[IRDMA_WQE_SIZE];
+};
+
+struct irdma_qp_uk {
+	struct irdma_qp_quanta *sq_base;
+	struct irdma_qp_quanta *rq_base;
+	struct irdma_uk_attrs *uk_attrs;
+	u32 __iomem *wqe_alloc_db;
+	struct irdma_sq_uk_wr_trk_info *sq_wrtrk_array;
+	u64 *rq_wrid_array;
+	__le64 *shadow_area;
+	__le32 *push_db;
+	__le64 *push_wqe;
+	struct irdma_ring sq_ring;
+	struct irdma_ring rq_ring;
+	struct irdma_ring initial_ring;
+	u32 qp_id;
+	u32 qp_caps;
+	u32 sq_size;
+	u32 rq_size;
+	u32 max_sq_frag_cnt;
+	u32 max_rq_frag_cnt;
+	u32 max_inline_data;
+	struct irdma_qp_uk_ops qp_ops;
+	struct irdma_wqe_uk_ops wqe_ops;
+	u16 conn_wqes;
+	u8 qp_type;
+	u8 swqe_polarity;
+	u8 swqe_polarity_deferred;
+	u8 rwqe_polarity;
+	u8 rq_wqe_size;
+	u8 rq_wqe_size_multiplier;
+	bool deferred_flag:1;
+	bool push_mode:1; /* whether the last post wqe was pushed */
+	bool push_dropped:1;
+	bool first_sq_wq:1;
+	bool sq_flush_complete:1; /* Indicates flush was seen and SQ was empty after the flush */
+	bool rq_flush_complete:1; /* Indicates flush was seen and RQ was empty after the flush */
+	bool destroy_pending:1; /* Indicates the QP is being destroyed */
+	void *back_qp;
+	spinlock_t *lock;
+	u8 dbg_rq_flushed;
+	u8 sq_flush_seen;
+	u8 rq_flush_seen;
+};
+
+struct irdma_cq_uk {
+	struct irdma_cqe *cq_base;
+	u32 __iomem *cqe_alloc_db;
+	u32 __iomem *cq_ack_db;
+	__le64 *shadow_area;
+	u32 cq_id;
+	u32 cq_size;
+	struct irdma_ring cq_ring;
+	u8 polarity;
+	struct irdma_cq_ops ops;
+	bool avoid_mem_cflct:1;
+};
+
+struct irdma_qp_uk_init_info {
+	struct irdma_qp_quanta *sq;
+	struct irdma_qp_quanta *rq;
+	struct irdma_uk_attrs *uk_attrs;
+	u32 __iomem *wqe_alloc_db;
+	__le64 *shadow_area;
+	struct irdma_sq_uk_wr_trk_info *sq_wrtrk_array;
+	u64 *rq_wrid_array;
+	u32 qp_id;
+	u32 qp_caps;
+	u32 sq_size;
+	u32 rq_size;
+	u32 max_sq_frag_cnt;
+	u32 max_rq_frag_cnt;
+	u32 max_inline_data;
+	u8 first_sq_wq;
+	u8 type;
+	int abi_ver;
+	bool legacy_mode;
+};
+
+struct irdma_cq_uk_init_info {
+	u32 __iomem *cqe_alloc_db;
+	u32 __iomem *cq_ack_db;
+	struct irdma_cqe *cq_base;
+	__le64 *shadow_area;
+	u32 cq_size;
+	u32 cq_id;
+	bool avoid_mem_cflct;
+};
+
+void irdma_device_init_uk(struct irdma_dev_uk *dev);
+void irdma_qp_post_wr(struct irdma_qp_uk *qp);
+__le64 *irdma_qp_get_next_send_wqe(struct irdma_qp_uk *qp, u32 *wqe_idx,
+				   u16 quanta, u32 total_size,
+				   struct irdma_post_sq_info *info);
+__le64 *irdma_qp_get_next_recv_wqe(struct irdma_qp_uk *qp, u32 *wqe_idx);
+enum irdma_status_code irdma_cq_uk_init(struct irdma_cq_uk *cq,
+					struct irdma_cq_uk_init_info *info);
+enum irdma_status_code irdma_qp_uk_init(struct irdma_qp_uk *qp,
+					struct irdma_qp_uk_init_info *info);
+void irdma_clean_cq(void *q, struct irdma_cq_uk *cq);
+enum irdma_status_code irdma_nop(struct irdma_qp_uk *qp, u64 wr_id,
+				 bool signaled, bool post_sq);
+enum irdma_status_code irdma_fragcnt_to_quanta_sq(u32 frag_cnt, u16 *quanta);
+enum irdma_status_code irdma_fragcnt_to_wqesize_rq(u32 frag_cnt, u16 *wqe_size);
+void irdma_get_wqe_shift(struct irdma_uk_attrs *uk_attrs, u32 sge,
+			 u32 inline_data, u8 *shift);
+enum irdma_status_code irdma_get_sqdepth(struct irdma_uk_attrs *uk_attrs,
+					 u32 sq_size, u8 shift, u32 *wqdepth);
+enum irdma_status_code irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs,
+					 u32 rq_size, u8 shift, u32 *wqdepth);
+void irdma_qp_push_wqe(struct irdma_qp_uk *qp, __le64 *wqe, u16 quanta,
+		       u32 wqe_idx, bool post_sq);
+void irdma_clr_wqes(struct irdma_qp_uk *qp, u32 qp_wqe_idx);
+#endif /* IRDMA_USER_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 19/23] RDMA/irdma: Add miscellaneous utility definitions
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (17 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 18/23] RDMA/irdma: Add user/kernel shared libraries Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 20/23] RDMA/irdma: Add dynamic tracing for CM Shiraz Saleem
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Add miscellaneous utility functions and headers.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/osdep.h  |   86 ++
 drivers/infiniband/hw/irdma/protos.h |  116 ++
 drivers/infiniband/hw/irdma/status.h |   71 +
 drivers/infiniband/hw/irdma/utils.c  | 2543 ++++++++++++++++++++++++++++++++++
 4 files changed, 2816 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/osdep.h
 create mode 100644 drivers/infiniband/hw/irdma/protos.h
 create mode 100644 drivers/infiniband/hw/irdma/status.h
 create mode 100644 drivers/infiniband/hw/irdma/utils.c

diff --git a/drivers/infiniband/hw/irdma/osdep.h b/drivers/infiniband/hw/irdma/osdep.h
new file mode 100644
index 0000000..b2ab523
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/osdep.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#ifndef IRDMA_OSDEP_H
+#define IRDMA_OSDEP_H
+
+#include <linux/pci.h>
+#include <linux/bitfield.h>
+#include <crypto/hash.h>
+#include <rdma/ib_verbs.h>
+
+#define STATS_TIMER_DELAY	60000
+
+struct irdma_dma_info {
+	dma_addr_t *dmaaddrs;
+};
+
+struct irdma_dma_mem {
+	void *va;
+	dma_addr_t pa;
+	u32 size;
+} __packed;
+
+struct irdma_virt_mem {
+	void *va;
+	u32 size;
+} __packed;
+
+struct irdma_sc_vsi;
+struct irdma_sc_dev;
+struct irdma_sc_qp;
+struct irdma_puda_buf;
+struct irdma_puda_cmpl_info;
+struct irdma_update_sds_info;
+struct irdma_hmc_fcn_info;
+struct irdma_manage_vf_pble_info;
+struct irdma_hw;
+struct irdma_pci_f;
+
+struct ib_device *to_ibdev(struct irdma_sc_dev *dev);
+u8 __iomem *irdma_get_hw_addr(void *dev);
+void irdma_ieq_mpa_crc_ae(struct irdma_sc_dev *dev, struct irdma_sc_qp *qp);
+enum irdma_status_code irdma_vf_wait_vchnl_resp(struct irdma_sc_dev *dev);
+bool irdma_vf_clear_to_send(struct irdma_sc_dev *dev);
+void irdma_add_dev_ref(struct irdma_sc_dev *dev);
+void irdma_put_dev_ref(struct irdma_sc_dev *dev);
+enum irdma_status_code irdma_ieq_check_mpacrc(struct shash_desc *desc,
+					      void *addr, u32 len, u32 val);
+struct irdma_sc_qp *irdma_ieq_get_qp(struct irdma_sc_dev *dev,
+				     struct irdma_puda_buf *buf);
+void irdma_send_ieq_ack(struct irdma_sc_qp *qp);
+void irdma_ieq_update_tcpip_info(struct irdma_puda_buf *buf, u16 len,
+				 u32 seqnum);
+void irdma_free_hash_desc(struct shash_desc *hash_desc);
+enum irdma_status_code irdma_init_hash_desc(struct shash_desc **hash_desc);
+enum irdma_status_code
+irdma_puda_get_tcpip_info(struct irdma_puda_cmpl_info *info,
+			  struct irdma_puda_buf *buf);
+enum irdma_status_code irdma_cqp_sds_cmd(struct irdma_sc_dev *dev,
+					 struct irdma_update_sds_info *info);
+enum irdma_status_code
+irdma_cqp_manage_hmc_fcn_cmd(struct irdma_sc_dev *dev,
+			     struct irdma_hmc_fcn_info *hmcfcninfo,
+			     u16 *pmf_idx);
+enum irdma_status_code
+irdma_cqp_query_fpm_val_cmd(struct irdma_sc_dev *dev,
+			    struct irdma_dma_mem *val_mem, u8 hmc_fn_id);
+enum irdma_status_code
+irdma_cqp_commit_fpm_val_cmd(struct irdma_sc_dev *dev,
+			     struct irdma_dma_mem *val_mem, u8 hmc_fn_id);
+enum irdma_status_code irdma_alloc_query_fpm_buf(struct irdma_sc_dev *dev,
+						 struct irdma_dma_mem *mem);
+void *irdma_remove_cqp_head(struct irdma_sc_dev *dev);
+void irdma_term_modify_qp(struct irdma_sc_qp *qp, u8 next_state, u8 term,
+			  u8 term_len);
+void irdma_terminate_done(struct irdma_sc_qp *qp, int timeout_occurred);
+void irdma_terminate_start_timer(struct irdma_sc_qp *qp);
+void irdma_terminate_del_timer(struct irdma_sc_qp *qp);
+void irdma_hw_stats_start_timer(struct irdma_sc_vsi *vsi);
+void irdma_hw_stats_stop_timer(struct irdma_sc_vsi *vsi);
+void wr32(struct irdma_hw *hw, u32 reg, u32 val);
+u32 rd32(struct irdma_hw *hw, u32 reg);
+u64 rd64(struct irdma_hw *hw, u32 reg);
+enum irdma_status_code irdma_map_vm_page_list(struct irdma_hw *hw, void *va,
+					      dma_addr_t *pg_dma, u32 pg_cnt);
+void irdma_unmap_vm_page_list(struct irdma_hw *hw, dma_addr_t *pg_dma, u32 pg_cnt);
+#endif /* IRDMA_OSDEP_H */
diff --git a/drivers/infiniband/hw/irdma/protos.h b/drivers/infiniband/hw/irdma/protos.h
new file mode 100644
index 0000000..60a6523
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/protos.h
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2016 - 2021 Intel Corporation */
+#ifndef IRDMA_PROTOS_H
+#define IRDMA_PROTOS_H
+
+#define PAUSE_TIMER_VAL		0xffff
+#define REFRESH_THRESHOLD	0x7fff
+#define HIGH_THRESHOLD		0x800
+#define LOW_THRESHOLD		0x200
+#define ALL_TC2PFC		0xff
+#define CQP_COMPL_WAIT_TIME_MS	10
+#define CQP_TIMEOUT_THRESHOLD	500
+
+/* init operations */
+enum irdma_status_code irdma_sc_ctrl_init(enum irdma_vers ver,
+					  struct irdma_sc_dev *dev,
+					  struct irdma_device_init_info *info);
+void irdma_sc_rt_init(struct irdma_sc_dev *dev);
+void irdma_sc_cqp_post_sq(struct irdma_sc_cqp *cqp);
+__le64 *irdma_sc_cqp_get_next_send_wqe(struct irdma_sc_cqp *cqp, u64 scratch);
+enum irdma_status_code
+irdma_sc_mr_fast_register(struct irdma_sc_qp *qp,
+			  struct irdma_fast_reg_stag_info *info, bool post_sq);
+/* HMC/FPM functions */
+enum irdma_status_code irdma_sc_init_iw_hmc(struct irdma_sc_dev *dev,
+					    u8 hmc_fn_id);
+/* stats misc */
+enum irdma_status_code
+irdma_cqp_gather_stats_cmd(struct irdma_sc_dev *dev,
+			   struct irdma_vsi_pestat *pestat, bool wait);
+void irdma_cqp_gather_stats_gen1(struct irdma_sc_dev *dev,
+				 struct irdma_vsi_pestat *pestat);
+void irdma_hw_stats_read_all(struct irdma_vsi_pestat *stats,
+			     struct irdma_dev_hw_stats *stats_values,
+			     u64 *hw_stats_regs_32, u64 *hw_stats_regs_64,
+			     u8 hw_rev);
+enum irdma_status_code
+irdma_cqp_ws_node_cmd(struct irdma_sc_dev *dev, u8 cmd,
+		      struct irdma_ws_node_info *node_info);
+enum irdma_status_code irdma_cqp_up_map_cmd(struct irdma_sc_dev *dev, u8 cmd,
+					    struct irdma_up_info *map_info);
+enum irdma_status_code irdma_cqp_ceq_cmd(struct irdma_sc_dev *dev,
+					 struct irdma_sc_ceq *sc_ceq, u8 op);
+enum irdma_status_code irdma_cqp_aeq_cmd(struct irdma_sc_dev *dev,
+					 struct irdma_sc_aeq *sc_aeq, u8 op);
+enum irdma_status_code
+irdma_cqp_stats_inst_cmd(struct irdma_sc_vsi *vsi, u8 cmd,
+			 struct irdma_stats_inst_info *stats_info);
+u16 irdma_alloc_ws_node_id(struct irdma_sc_dev *dev);
+void irdma_free_ws_node_id(struct irdma_sc_dev *dev, u16 node_id);
+void irdma_update_stats(struct irdma_dev_hw_stats *hw_stats,
+			struct irdma_gather_stats *gather_stats,
+			struct irdma_gather_stats *last_gather_stats);
+/* vsi functions */
+enum irdma_status_code irdma_vsi_stats_init(struct irdma_sc_vsi *vsi,
+					    struct irdma_vsi_stats_info *info);
+void irdma_vsi_stats_free(struct irdma_sc_vsi *vsi);
+void irdma_sc_vsi_init(struct irdma_sc_vsi *vsi,
+		       struct irdma_vsi_init_info *info);
+enum irdma_status_code irdma_sc_add_cq_ctx(struct irdma_sc_ceq *ceq,
+					   struct irdma_sc_cq *cq);
+void irdma_sc_remove_cq_ctx(struct irdma_sc_ceq *ceq, struct irdma_sc_cq *cq);
+/* misc L2 param change functions */
+void irdma_change_l2params(struct irdma_sc_vsi *vsi,
+			   struct irdma_l2params *l2params);
+void irdma_sc_suspend_resume_qps(struct irdma_sc_vsi *vsi, u8 suspend);
+enum irdma_status_code irdma_cqp_qp_suspend_resume(struct irdma_sc_qp *qp,
+						   u8 cmd);
+void irdma_qp_add_qos(struct irdma_sc_qp *qp);
+void irdma_qp_rem_qos(struct irdma_sc_qp *qp);
+struct irdma_sc_qp *irdma_get_qp_from_list(struct list_head *head,
+					   struct irdma_sc_qp *qp);
+void irdma_reinitialize_ieq(struct irdma_sc_vsi *vsi);
+u16 irdma_alloc_ws_node_id(struct irdma_sc_dev *dev);
+void irdma_free_ws_node_id(struct irdma_sc_dev *dev, u16 node_id);
+/* terminate functions*/
+void irdma_terminate_send_fin(struct irdma_sc_qp *qp);
+
+void irdma_terminate_connection(struct irdma_sc_qp *qp,
+				struct irdma_aeqe_info *info);
+
+void irdma_terminate_received(struct irdma_sc_qp *qp,
+			      struct irdma_aeqe_info *info);
+/* dynamic memory allocation */
+/* misc */
+u8 irdma_get_encoded_wqe_size(u32 wqsize, enum irdma_queue_type queue_type);
+void irdma_modify_qp_to_err(struct irdma_sc_qp *sc_qp);
+enum irdma_status_code
+irdma_sc_static_hmc_pages_allocated(struct irdma_sc_cqp *cqp, u64 scratch,
+				    u8 hmc_fn_id, bool post_sq,
+				    bool poll_registers);
+enum irdma_status_code irdma_cfg_fpm_val(struct irdma_sc_dev *dev,
+					 u32 qp_count);
+enum irdma_status_code irdma_get_rdma_features(struct irdma_sc_dev *dev);
+void free_sd_mem(struct irdma_sc_dev *dev);
+enum irdma_status_code irdma_process_cqp_cmd(struct irdma_sc_dev *dev,
+					     struct cqp_cmds_info *pcmdinfo);
+enum irdma_status_code irdma_process_bh(struct irdma_sc_dev *dev);
+enum irdma_status_code irdma_cqp_sds_cmd(struct irdma_sc_dev *dev,
+					 struct irdma_update_sds_info *info);
+enum irdma_status_code
+irdma_cqp_query_fpm_val_cmd(struct irdma_sc_dev *dev,
+			    struct irdma_dma_mem *val_mem, u8 hmc_fn_id);
+enum irdma_status_code
+irdma_cqp_commit_fpm_val_cmd(struct irdma_sc_dev *dev,
+			     struct irdma_dma_mem *val_mem, u8 hmc_fn_id);
+enum irdma_status_code irdma_alloc_query_fpm_buf(struct irdma_sc_dev *dev,
+						 struct irdma_dma_mem *mem);
+enum irdma_status_code
+irdma_cqp_manage_hmc_fcn_cmd(struct irdma_sc_dev *dev,
+			     struct irdma_hmc_fcn_info *hmcfcninfo,
+			     u16 *pmf_idx);
+void irdma_add_dev_ref(struct irdma_sc_dev *dev);
+void irdma_put_dev_ref(struct irdma_sc_dev *dev);
+void *irdma_remove_cqp_head(struct irdma_sc_dev *dev);
+#endif /* IRDMA_PROTOS_H */
diff --git a/drivers/infiniband/hw/irdma/status.h b/drivers/infiniband/hw/irdma/status.h
new file mode 100644
index 0000000..22ea388
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/status.h
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2015 - 2020 Intel Corporation */
+#ifndef IRDMA_STATUS_H
+#define IRDMA_STATUS_H
+
+/* Error Codes */
+enum irdma_status_code {
+	IRDMA_SUCCESS				= 0,
+	IRDMA_ERR_NVM				= -1,
+	IRDMA_ERR_NVM_CHECKSUM			= -2,
+	IRDMA_ERR_CFG				= -4,
+	IRDMA_ERR_PARAM				= -5,
+	IRDMA_ERR_DEVICE_NOT_SUPPORTED		= -6,
+	IRDMA_ERR_RESET_FAILED			= -7,
+	IRDMA_ERR_SWFW_SYNC			= -8,
+	IRDMA_ERR_NO_MEMORY			= -9,
+	IRDMA_ERR_BAD_PTR			= -10,
+	IRDMA_ERR_INVALID_PD_ID			= -11,
+	IRDMA_ERR_INVALID_QP_ID			= -12,
+	IRDMA_ERR_INVALID_CQ_ID			= -13,
+	IRDMA_ERR_INVALID_CEQ_ID		= -14,
+	IRDMA_ERR_INVALID_AEQ_ID		= -15,
+	IRDMA_ERR_INVALID_SIZE			= -16,
+	IRDMA_ERR_INVALID_ARP_INDEX		= -17,
+	IRDMA_ERR_INVALID_FPM_FUNC_ID		= -18,
+	IRDMA_ERR_QP_INVALID_MSG_SIZE		= -19,
+	IRDMA_ERR_QP_TOOMANY_WRS_POSTED		= -20,
+	IRDMA_ERR_INVALID_FRAG_COUNT		= -21,
+	IRDMA_ERR_Q_EMPTY			= -22,
+	IRDMA_ERR_INVALID_ALIGNMENT		= -23,
+	IRDMA_ERR_FLUSHED_Q			= -24,
+	IRDMA_ERR_INVALID_PUSH_PAGE_INDEX	= -25,
+	IRDMA_ERR_INVALID_INLINE_DATA_SIZE	= -26,
+	IRDMA_ERR_TIMEOUT			= -27,
+	IRDMA_ERR_OPCODE_MISMATCH		= -28,
+	IRDMA_ERR_CQP_COMPL_ERROR		= -29,
+	IRDMA_ERR_INVALID_VF_ID			= -30,
+	IRDMA_ERR_INVALID_HMCFN_ID		= -31,
+	IRDMA_ERR_BACKING_PAGE_ERROR		= -32,
+	IRDMA_ERR_NO_PBLCHUNKS_AVAILABLE	= -33,
+	IRDMA_ERR_INVALID_PBLE_INDEX		= -34,
+	IRDMA_ERR_INVALID_SD_INDEX		= -35,
+	IRDMA_ERR_INVALID_PAGE_DESC_INDEX	= -36,
+	IRDMA_ERR_INVALID_SD_TYPE		= -37,
+	IRDMA_ERR_MEMCPY_FAILED			= -38,
+	IRDMA_ERR_INVALID_HMC_OBJ_INDEX		= -39,
+	IRDMA_ERR_INVALID_HMC_OBJ_COUNT		= -40,
+	IRDMA_ERR_BUF_TOO_SHORT			= -43,
+	IRDMA_ERR_BAD_IWARP_CQE			= -44,
+	IRDMA_ERR_NVM_BLANK_MODE		= -45,
+	IRDMA_ERR_NOT_IMPL			= -46,
+	IRDMA_ERR_PE_DOORBELL_NOT_ENA		= -47,
+	IRDMA_ERR_NOT_READY			= -48,
+	IRDMA_NOT_SUPPORTED			= -49,
+	IRDMA_ERR_FIRMWARE_API_VER		= -50,
+	IRDMA_ERR_RING_FULL			= -51,
+	IRDMA_ERR_MPA_CRC			= -61,
+	IRDMA_ERR_NO_TXBUFS			= -62,
+	IRDMA_ERR_SEQ_NUM			= -63,
+	IRDMA_ERR_list_empty			= -64,
+	IRDMA_ERR_INVALID_MAC_ADDR		= -65,
+	IRDMA_ERR_BAD_STAG			= -66,
+	IRDMA_ERR_CQ_COMPL_ERROR		= -67,
+	IRDMA_ERR_Q_DESTROYED			= -68,
+	IRDMA_ERR_INVALID_FEAT_CNT		= -69,
+	IRDMA_ERR_REG_CQ_FULL			= -70,
+	IRDMA_ERR_VF_MSG_ERROR			= -71,
+	IRDMA_ERR_NO_INTR			= -72,
+	IRDMA_ERR_REG_QSET			= -73,
+};
+#endif /* IRDMA_STATUS_H */
diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c
new file mode 100644
index 0000000..e10e44a
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/utils.c
@@ -0,0 +1,2543 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2015 - 2021 Intel Corporation */
+#include "main.h"
+
+/**
+ * irdma_arp_table -manage arp table
+ * @rf: RDMA PCI function
+ * @ip_addr: ip address for device
+ * @ipv4: IPv4 flag
+ * @mac_addr: mac address ptr
+ * @action: modify, delete or add
+ */
+int irdma_arp_table(struct irdma_pci_f *rf, u32 *ip_addr, bool ipv4,
+		    u8 *mac_addr, u32 action)
+{
+	unsigned long flags;
+	int arp_index;
+	u32 ip[4] = {};
+
+	if (ipv4)
+		ip[0] = *ip_addr;
+	else
+		memcpy(ip, ip_addr, sizeof(ip));
+
+	spin_lock_irqsave(&rf->arp_lock, flags);
+	for (arp_index = 0; (u32)arp_index < rf->arp_table_size; arp_index++) {
+		if (!memcmp(rf->arp_table[arp_index].ip_addr, ip, sizeof(ip)))
+			break;
+	}
+
+	switch (action) {
+	case IRDMA_ARP_ADD:
+		if (arp_index != rf->arp_table_size) {
+			arp_index = -1;
+			break;
+		}
+
+		arp_index = 0;
+		if (irdma_alloc_rsrc(rf, rf->allocated_arps, rf->arp_table_size,
+				     (u32 *)&arp_index, &rf->next_arp_index)) {
+			arp_index = -1;
+			break;
+		}
+
+		memcpy(rf->arp_table[arp_index].ip_addr, ip,
+		       sizeof(rf->arp_table[arp_index].ip_addr));
+		ether_addr_copy(rf->arp_table[arp_index].mac_addr, mac_addr);
+		break;
+	case IRDMA_ARP_RESOLVE:
+		if (arp_index == rf->arp_table_size)
+			arp_index = -1;
+		break;
+	case IRDMA_ARP_DELETE:
+		if (arp_index == rf->arp_table_size) {
+			arp_index = -1;
+			break;
+		}
+
+		memset(rf->arp_table[arp_index].ip_addr, 0,
+		       sizeof(rf->arp_table[arp_index].ip_addr));
+		eth_zero_addr(rf->arp_table[arp_index].mac_addr);
+		irdma_free_rsrc(rf, rf->allocated_arps, arp_index);
+		break;
+	default:
+		arp_index = -1;
+		break;
+	}
+
+	spin_unlock_irqrestore(&rf->arp_lock, flags);
+	return arp_index;
+}
+
+/**
+ * irdma_add_arp - add a new arp entry if needed
+ * @rf: RDMA function
+ * @ip: IP address
+ * @ipv4: IPv4 flag
+ * @mac: MAC address
+ */
+int irdma_add_arp(struct irdma_pci_f *rf, u32 *ip, bool ipv4, u8 *mac)
+{
+	int arpidx;
+
+	arpidx = irdma_arp_table(rf, &ip[0], ipv4, NULL, IRDMA_ARP_RESOLVE);
+	if (arpidx >= 0) {
+		if (ether_addr_equal(rf->arp_table[arpidx].mac_addr, mac))
+			return arpidx;
+
+		irdma_manage_arp_cache(rf, rf->arp_table[arpidx].mac_addr, ip,
+				       ipv4, IRDMA_ARP_DELETE);
+	}
+
+	irdma_manage_arp_cache(rf, mac, ip, ipv4, IRDMA_ARP_ADD);
+
+	return irdma_arp_table(rf, ip, ipv4, NULL, IRDMA_ARP_RESOLVE);
+}
+
+/**
+ * wr32 - write 32 bits to hw register
+ * @hw: hardware information including registers
+ * @reg: register offset
+ * @val: value to write to register
+ */
+inline void wr32(struct irdma_hw *hw, u32 reg, u32 val)
+{
+	writel(val, hw->hw_addr + reg);
+}
+
+/**
+ * rd32 - read a 32 bit hw register
+ * @hw: hardware information including registers
+ * @reg: register offset
+ *
+ * Return value of register content
+ */
+inline u32 rd32(struct irdma_hw *hw, u32 reg)
+{
+	return readl(hw->hw_addr + reg);
+}
+
+/**
+ * rd64 - read a 64 bit hw register
+ * @hw: hardware information including registers
+ * @reg: register offset
+ *
+ * Return value of register content
+ */
+inline u64 rd64(struct irdma_hw *hw, u32 reg)
+{
+	return readq(hw->hw_addr + reg);
+}
+
+static void irdma_gid_change_event(struct ib_device *ibdev)
+{
+	struct ib_event ib_event;
+
+	ib_event.event = IB_EVENT_GID_CHANGE;
+	ib_event.device = ibdev;
+	ib_event.element.port_num = 1;
+	ib_dispatch_event(&ib_event);
+}
+
+/**
+ * irdma_inetaddr_event - system notifier for ipv4 addr events
+ * @notifier: not used
+ * @event: event for notifier
+ * @ptr: if address
+ */
+int irdma_inetaddr_event(struct notifier_block *notifier, unsigned long event,
+			 void *ptr)
+{
+	struct in_ifaddr *ifa = ptr;
+	struct net_device *netdev = ifa->ifa_dev->dev;
+	struct irdma_device *iwdev;
+	struct ib_device *ibdev;
+	u32 local_ipaddr;
+
+	ibdev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_IRDMA);
+	if (!ibdev)
+		return NOTIFY_DONE;
+
+	iwdev = to_iwdev(ibdev);
+	local_ipaddr = ntohl(ifa->ifa_address);
+	ibdev_dbg(&iwdev->ibdev,
+		  "DEV: netdev %p event %lu local_ip=%pI4 MAC=%pM\n", netdev,
+		  event, &local_ipaddr, netdev->dev_addr);
+	switch (event) {
+	case NETDEV_DOWN:
+		irdma_manage_arp_cache(iwdev->rf, netdev->dev_addr,
+				       &local_ipaddr, true, IRDMA_ARP_DELETE);
+		irdma_if_notify(iwdev, netdev, &local_ipaddr, true, false);
+		irdma_gid_change_event(&iwdev->ibdev);
+		break;
+	case NETDEV_UP:
+	case NETDEV_CHANGEADDR:
+		irdma_add_arp(iwdev->rf, &local_ipaddr, true, netdev->dev_addr);
+		irdma_if_notify(iwdev, netdev, &local_ipaddr, true, true);
+		irdma_gid_change_event(&iwdev->ibdev);
+		break;
+	default:
+		break;
+	}
+
+	ib_device_put(ibdev);
+
+	return NOTIFY_DONE;
+}
+
+/**
+ * irdma_inet6addr_event - system notifier for ipv6 addr events
+ * @notifier: not used
+ * @event: event for notifier
+ * @ptr: if address
+ */
+int irdma_inet6addr_event(struct notifier_block *notifier, unsigned long event,
+			  void *ptr)
+{
+	struct inet6_ifaddr *ifa = ptr;
+	struct net_device *netdev = ifa->idev->dev;
+	struct irdma_device *iwdev;
+	struct ib_device *ibdev;
+	u32 local_ipaddr6[4];
+
+	ibdev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_IRDMA);
+	if (!ibdev)
+		return NOTIFY_DONE;
+
+	iwdev = to_iwdev(ibdev);
+	irdma_copy_ip_ntohl(local_ipaddr6, ifa->addr.in6_u.u6_addr32);
+	ibdev_dbg(&iwdev->ibdev,
+		  "DEV: netdev %p event %lu local_ip=%pI6 MAC=%pM\n", netdev,
+		  event, local_ipaddr6, netdev->dev_addr);
+	switch (event) {
+	case NETDEV_DOWN:
+		irdma_manage_arp_cache(iwdev->rf, netdev->dev_addr,
+				       local_ipaddr6, false, IRDMA_ARP_DELETE);
+		irdma_if_notify(iwdev, netdev, local_ipaddr6, false, false);
+		irdma_gid_change_event(&iwdev->ibdev);
+		break;
+	case NETDEV_UP:
+	case NETDEV_CHANGEADDR:
+		irdma_add_arp(iwdev->rf, local_ipaddr6, false,
+			      netdev->dev_addr);
+		irdma_if_notify(iwdev, netdev, local_ipaddr6, false, true);
+		irdma_gid_change_event(&iwdev->ibdev);
+		break;
+	default:
+		break;
+	}
+
+	ib_device_put(ibdev);
+
+	return NOTIFY_DONE;
+}
+
+/**
+ * irdma_net_event - system notifier for net events
+ * @notifier: not used
+ * @event: event for notifier
+ * @ptr: neighbor
+ */
+int irdma_net_event(struct notifier_block *notifier, unsigned long event,
+		    void *ptr)
+{
+	struct neighbour *neigh = ptr;
+	struct irdma_device *iwdev;
+	struct ib_device *ibdev;
+	__be32 *p;
+	u32 local_ipaddr[4] = {};
+	bool ipv4 = true;
+
+	ibdev = ib_device_get_by_netdev((struct net_device *)neigh->dev,
+					RDMA_DRIVER_IRDMA);
+	if (!ibdev)
+		return NOTIFY_DONE;
+
+	iwdev = to_iwdev(ibdev);
+
+	switch (event) {
+	case NETEVENT_NEIGH_UPDATE:
+		p = (__be32 *)neigh->primary_key;
+		if (neigh->tbl->family == AF_INET6) {
+			ipv4 = false;
+			irdma_copy_ip_ntohl(local_ipaddr, p);
+		} else {
+			local_ipaddr[0] = ntohl(*p);
+		}
+
+		ibdev_dbg(&iwdev->ibdev,
+			  "DEV: netdev %p state %d local_ip=%pI4 MAC=%pM\n",
+			  iwdev->netdev, neigh->nud_state, local_ipaddr,
+			  neigh->ha);
+
+		if (neigh->nud_state & NUD_VALID)
+			irdma_add_arp(iwdev->rf, local_ipaddr, ipv4, neigh->ha);
+
+		else
+			irdma_manage_arp_cache(iwdev->rf, neigh->ha,
+					       local_ipaddr, ipv4,
+					       IRDMA_ARP_DELETE);
+		break;
+	default:
+		break;
+	}
+
+	ib_device_put(ibdev);
+
+	return NOTIFY_DONE;
+}
+
+/**
+ * irdma_netdevice_event - system notifier for netdev events
+ * @notifier: not used
+ * @event: event for notifier
+ * @ptr: netdev
+ */
+int irdma_netdevice_event(struct notifier_block *notifier, unsigned long event,
+			  void *ptr)
+{
+	struct irdma_device *iwdev;
+	struct ib_device *ibdev;
+	struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
+
+	ibdev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_IRDMA);
+	if (!ibdev)
+		return NOTIFY_DONE;
+
+	iwdev = to_iwdev(ibdev);
+	iwdev->iw_status = 1;
+	switch (event) {
+	case NETDEV_DOWN:
+		iwdev->iw_status = 0;
+		fallthrough;
+	case NETDEV_UP:
+		irdma_port_ibevent(iwdev);
+		break;
+	default:
+		break;
+	}
+	ib_device_put(ibdev);
+
+	return NOTIFY_DONE;
+}
+/**
+ * irdma_add_ipv6_addr - add ipv6 address to the hw arp table
+ * @iwdev: irdma device
+ */
+static void irdma_add_ipv6_addr(struct irdma_device *iwdev)
+{
+	struct net_device *ip_dev;
+	struct inet6_dev *idev;
+	struct inet6_ifaddr *ifp, *tmp;
+	u32 local_ipaddr6[4];
+
+	rcu_read_lock();
+	for_each_netdev_rcu (&init_net, ip_dev) {
+		if (((rdma_vlan_dev_vlan_id(ip_dev) < 0xFFFF &&
+		      rdma_vlan_dev_real_dev(ip_dev) == iwdev->netdev) ||
+		      ip_dev == iwdev->netdev) &&
+		      (READ_ONCE(ip_dev->flags) & IFF_UP)) {
+			idev = __in6_dev_get(ip_dev);
+			if (!idev) {
+				ibdev_err(&iwdev->ibdev, "ipv6 inet device not found\n");
+				break;
+			}
+			list_for_each_entry_safe (ifp, tmp, &idev->addr_list,
+						  if_list) {
+				ibdev_dbg(&iwdev->ibdev,
+					  "INIT: IP=%pI6, vlan_id=%d, MAC=%pM\n",
+					  &ifp->addr,
+					  rdma_vlan_dev_vlan_id(ip_dev),
+					  ip_dev->dev_addr);
+
+				irdma_copy_ip_ntohl(local_ipaddr6,
+						    ifp->addr.in6_u.u6_addr32);
+				irdma_manage_arp_cache(iwdev->rf,
+						       ip_dev->dev_addr,
+						       local_ipaddr6, false,
+						       IRDMA_ARP_ADD);
+			}
+		}
+	}
+	rcu_read_unlock();
+}
+
+/**
+ * irdma_add_ipv4_addr - add ipv4 address to the hw arp table
+ * @iwdev: irdma device
+ */
+static void irdma_add_ipv4_addr(struct irdma_device *iwdev)
+{
+	struct net_device *dev;
+	struct in_device *idev;
+	u32 ip_addr;
+
+	rcu_read_lock();
+	for_each_netdev_rcu (&init_net, dev) {
+		if (((rdma_vlan_dev_vlan_id(dev) < 0xFFFF &&
+		      rdma_vlan_dev_real_dev(dev) == iwdev->netdev) ||
+		      dev == iwdev->netdev) && (READ_ONCE(dev->flags) & IFF_UP)) {
+			const struct in_ifaddr *ifa;
+
+			idev = __in_dev_get_rcu(dev);
+			if (!idev)
+				continue;
+
+			in_dev_for_each_ifa_rcu(ifa, idev) {
+				ibdev_dbg(&iwdev->ibdev, "CM: IP=%pI4, vlan_id=%d, MAC=%pM\n",
+					  &ifa->ifa_address, rdma_vlan_dev_vlan_id(dev),
+					  dev->dev_addr);
+
+				ip_addr = ntohl(ifa->ifa_address);
+				irdma_manage_arp_cache(iwdev->rf, dev->dev_addr,
+						       &ip_addr, true,
+						       IRDMA_ARP_ADD);
+			}
+		}
+	}
+	rcu_read_unlock();
+}
+
+/**
+ * irdma_add_ip - add ip addresses
+ * @iwdev: irdma device
+ *
+ * Add ipv4/ipv6 addresses to the arp cache
+ */
+void irdma_add_ip(struct irdma_device *iwdev)
+{
+	irdma_add_ipv4_addr(iwdev);
+	irdma_add_ipv6_addr(iwdev);
+}
+
+/**
+ * irdma_alloc_and_get_cqp_request - get cqp struct
+ * @cqp: device cqp ptr
+ * @wait: cqp to be used in wait mode
+ */
+struct irdma_cqp_request *irdma_alloc_and_get_cqp_request(struct irdma_cqp *cqp,
+							  bool wait)
+{
+	struct irdma_cqp_request *cqp_request = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cqp->req_lock, flags);
+	if (!list_empty(&cqp->cqp_avail_reqs)) {
+		cqp_request = list_entry(cqp->cqp_avail_reqs.next,
+					 struct irdma_cqp_request, list);
+		list_del_init(&cqp_request->list);
+	}
+	spin_unlock_irqrestore(&cqp->req_lock, flags);
+	if (!cqp_request) {
+		cqp_request = kzalloc(sizeof(*cqp_request), GFP_ATOMIC);
+		if (cqp_request) {
+			cqp_request->dynamic = true;
+			if (wait)
+				init_waitqueue_head(&cqp_request->waitq);
+		}
+	}
+	if (!cqp_request) {
+		ibdev_dbg(to_ibdev(cqp->sc_cqp.dev), "ERR: CQP Request Fail: No Memory");
+		return NULL;
+	}
+
+	cqp_request->waiting = wait;
+	refcount_set(&cqp_request->refcnt, 1);
+	memset(&cqp_request->compl_info, 0, sizeof(cqp_request->compl_info));
+
+	return cqp_request;
+}
+
+/**
+ * irdma_get_cqp_request - increase refcount for cqp_request
+ * @cqp_request: pointer to cqp_request instance
+ */
+static inline void irdma_get_cqp_request(struct irdma_cqp_request *cqp_request)
+{
+	refcount_inc(&cqp_request->refcnt);
+}
+
+/**
+ * irdma_free_cqp_request - free cqp request
+ * @cqp: cqp ptr
+ * @cqp_request: to be put back in cqp list
+ */
+void irdma_free_cqp_request(struct irdma_cqp *cqp,
+			    struct irdma_cqp_request *cqp_request)
+{
+	unsigned long flags;
+
+	if (cqp_request->dynamic) {
+		kfree(cqp_request);
+	} else {
+		cqp_request->request_done = false;
+		cqp_request->callback_fcn = NULL;
+		cqp_request->waiting = false;
+
+		spin_lock_irqsave(&cqp->req_lock, flags);
+		list_add_tail(&cqp_request->list, &cqp->cqp_avail_reqs);
+		spin_unlock_irqrestore(&cqp->req_lock, flags);
+	}
+	wake_up(&cqp->remove_wq);
+}
+
+/**
+ * irdma_put_cqp_request - dec ref count and free if 0
+ * @cqp: cqp ptr
+ * @cqp_request: to be put back in cqp list
+ */
+void irdma_put_cqp_request(struct irdma_cqp *cqp,
+			   struct irdma_cqp_request *cqp_request)
+{
+	if (refcount_dec_and_test(&cqp_request->refcnt))
+		irdma_free_cqp_request(cqp, cqp_request);
+}
+
+/**
+ * irdma_free_pending_cqp_request -free pending cqp request objs
+ * @cqp: cqp ptr
+ * @cqp_request: to be put back in cqp list
+ */
+static void
+irdma_free_pending_cqp_request(struct irdma_cqp *cqp,
+			       struct irdma_cqp_request *cqp_request)
+{
+	if (cqp_request->waiting) {
+		cqp_request->compl_info.error = true;
+		cqp_request->request_done = true;
+		wake_up(&cqp_request->waitq);
+	}
+	wait_event_timeout(cqp->remove_wq,
+			   refcount_read(&cqp_request->refcnt) == 1, 1000);
+	irdma_put_cqp_request(cqp, cqp_request);
+}
+
+/**
+ * irdma_cleanup_pending_cqp_op - clean-up cqp with no
+ * completions
+ * @rf: RDMA PCI function
+ */
+void irdma_cleanup_pending_cqp_op(struct irdma_pci_f *rf)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct irdma_cqp *cqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request = NULL;
+	struct cqp_cmds_info *pcmdinfo = NULL;
+	u32 i, pending_work, wqe_idx;
+
+	pending_work = IRDMA_RING_USED_QUANTA(cqp->sc_cqp.sq_ring);
+	wqe_idx = IRDMA_RING_CURRENT_TAIL(cqp->sc_cqp.sq_ring);
+	for (i = 0; i < pending_work; i++) {
+		cqp_request = (struct irdma_cqp_request *)(unsigned long)
+				      cqp->scratch_array[wqe_idx];
+		if (cqp_request)
+			irdma_free_pending_cqp_request(cqp, cqp_request);
+		wqe_idx = (wqe_idx + 1) % IRDMA_RING_SIZE(cqp->sc_cqp.sq_ring);
+	}
+
+	while (!list_empty(&dev->cqp_cmd_head)) {
+		pcmdinfo = irdma_remove_cqp_head(dev);
+		cqp_request =
+			container_of(pcmdinfo, struct irdma_cqp_request, info);
+		if (cqp_request)
+			irdma_free_pending_cqp_request(cqp, cqp_request);
+	}
+}
+
+/**
+ * irdma_wait_event - wait for completion
+ * @rf: RDMA PCI function
+ * @cqp_request: cqp request to wait
+ */
+static enum irdma_status_code irdma_wait_event(struct irdma_pci_f *rf,
+					       struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_cqp_timeout cqp_timeout = {};
+	bool cqp_error = false;
+	enum irdma_status_code err_code = 0;
+
+	cqp_timeout.compl_cqp_cmds = rf->sc_dev.cqp_cmd_stats[IRDMA_OP_CMPL_CMDS];
+	do {
+		irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq);
+		if (wait_event_timeout(cqp_request->waitq,
+				       cqp_request->request_done,
+				       msecs_to_jiffies(CQP_COMPL_WAIT_TIME_MS)))
+			break;
+
+		rf->sc_dev.cqp_ops->check_cqp_progress(&cqp_timeout, &rf->sc_dev);
+
+		if (cqp_timeout.count < CQP_TIMEOUT_THRESHOLD)
+			continue;
+
+		if (!rf->reset) {
+			rf->reset = true;
+			rf->gen_ops.request_reset(rf);
+		}
+		return IRDMA_ERR_TIMEOUT;
+	} while (1);
+
+	cqp_error = cqp_request->compl_info.error;
+	if (cqp_error) {
+		err_code = IRDMA_ERR_CQP_COMPL_ERROR;
+		if (cqp_request->compl_info.maj_err_code == 0xFFFF &&
+		    cqp_request->compl_info.min_err_code == 0x8029) {
+			if (!rf->reset) {
+				rf->reset = true;
+				rf->gen_ops.request_reset(rf);
+			}
+		}
+	}
+
+	return err_code;
+}
+
+static const char *const irdma_cqp_cmd_names[IRDMA_MAX_CQP_OPS] = {
+	[IRDMA_OP_CEQ_DESTROY] = "Destroy CEQ Cmd",
+	[IRDMA_OP_AEQ_DESTROY] = "Destroy AEQ Cmd",
+	[IRDMA_OP_DELETE_ARP_CACHE_ENTRY] = "Delete ARP Cache Cmd",
+	[IRDMA_OP_MANAGE_APBVT_ENTRY] = "Manage APBV Table Entry Cmd",
+	[IRDMA_OP_CEQ_CREATE] = "CEQ Create Cmd",
+	[IRDMA_OP_AEQ_CREATE] = "AEQ Destroy Cmd",
+	[IRDMA_OP_MANAGE_QHASH_TABLE_ENTRY] = "Manage Quad Hash Table Entry Cmd",
+	[IRDMA_OP_QP_MODIFY] = "Modify QP Cmd",
+	[IRDMA_OP_QP_UPLOAD_CONTEXT] = "Upload Context Cmd",
+	[IRDMA_OP_CQ_CREATE] = "Create CQ Cmd",
+	[IRDMA_OP_CQ_DESTROY] = "Destroy CQ Cmd",
+	[IRDMA_OP_QP_CREATE] = "Create QP Cmd",
+	[IRDMA_OP_QP_DESTROY] = "Destroy QP Cmd",
+	[IRDMA_OP_ALLOC_STAG] = "Allocate STag Cmd",
+	[IRDMA_OP_MR_REG_NON_SHARED] = "Register Non-Shared MR Cmd",
+	[IRDMA_OP_DEALLOC_STAG] = "Deallocate STag Cmd",
+	[IRDMA_OP_MW_ALLOC] = "Allocate Memory Window Cmd",
+	[IRDMA_OP_QP_FLUSH_WQES] = "Flush QP Cmd",
+	[IRDMA_OP_ADD_ARP_CACHE_ENTRY] = "Add ARP Cache Cmd",
+	[IRDMA_OP_MANAGE_PUSH_PAGE] = "Manage Push Page Cmd",
+	[IRDMA_OP_UPDATE_PE_SDS] = "Update PE SDs Cmd",
+	[IRDMA_OP_MANAGE_HMC_PM_FUNC_TABLE] = "Manage HMC PM Function Table Cmd",
+	[IRDMA_OP_SUSPEND] = "Suspend QP Cmd",
+	[IRDMA_OP_RESUME] = "Resume QP Cmd",
+	[IRDMA_OP_MANAGE_VF_PBLE_BP] = "Manage VF PBLE Backing Pages Cmd",
+	[IRDMA_OP_QUERY_FPM_VAL] = "Query FPM Values Cmd",
+	[IRDMA_OP_COMMIT_FPM_VAL] = "Commit FPM Values Cmd",
+	[IRDMA_OP_AH_CREATE] = "Create Address Handle Cmd",
+	[IRDMA_OP_AH_MODIFY] = "Modify Address Handle Cmd",
+	[IRDMA_OP_AH_DESTROY] = "Destroy Address Handle Cmd",
+	[IRDMA_OP_MC_CREATE] = "Create Multicast Group Cmd",
+	[IRDMA_OP_MC_DESTROY] = "Destroy Multicast Group Cmd",
+	[IRDMA_OP_MC_MODIFY] = "Modify Multicast Group Cmd",
+	[IRDMA_OP_STATS_ALLOCATE] = "Add Statistics Instance Cmd",
+	[IRDMA_OP_STATS_FREE] = "Free Statistics Instance Cmd",
+	[IRDMA_OP_STATS_GATHER] = "Gather Statistics Cmd",
+	[IRDMA_OP_WS_ADD_NODE] = "Add Work Scheduler Node Cmd",
+	[IRDMA_OP_WS_MODIFY_NODE] = "Modify Work Scheduler Node Cmd",
+	[IRDMA_OP_WS_DELETE_NODE] = "Delete Work Scheduler Node Cmd",
+	[IRDMA_OP_SET_UP_MAP] = "Set UP-UP Mapping Cmd",
+	[IRDMA_OP_GEN_AE] = "Generate AE Cmd",
+	[IRDMA_OP_QUERY_RDMA_FEATURES] = "RDMA Get Features Cmd",
+	[IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY] = "Allocal Local MAC Entry Cmd",
+	[IRDMA_OP_ADD_LOCAL_MAC_ENTRY] = "Add Local MAC Entry Cmd",
+	[IRDMA_OP_DELETE_LOCAL_MAC_ENTRY] = "Delete Local MAC Entry Cmd",
+	[IRDMA_OP_CQ_MODIFY] = "CQ Modify Cmd",
+};
+
+static const struct irdma_cqp_err_info irdma_noncrit_err_list[] = {
+	{0xffff, 0x8006, "Flush No Wqe Pending"},
+	{0xffff, 0x8007, "Modify QP Bad Close"},
+	{0xffff, 0x8009, "LLP Closed"},
+	{0xffff, 0x800a, "Reset Not Sent"}
+};
+
+/**
+ * irdma_cqp_crit_err - check if CQP error is critical
+ * @dev: pointer to dev structure
+ * @cqp_cmd: code for last CQP operation
+ * @maj_err_code: major error code
+ * @min_err_code: minot error code
+ */
+bool irdma_cqp_crit_err(struct irdma_sc_dev *dev, u8 cqp_cmd,
+			u16 maj_err_code, u16 min_err_code)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(irdma_noncrit_err_list); ++i) {
+		if (maj_err_code == irdma_noncrit_err_list[i].maj &&
+		    min_err_code == irdma_noncrit_err_list[i].min) {
+			ibdev_dbg(to_ibdev(dev),
+				  "CQP: [%s Error][%s] maj=0x%x min=0x%x\n",
+				  irdma_noncrit_err_list[i].desc,
+				  irdma_cqp_cmd_names[cqp_cmd], maj_err_code,
+				  min_err_code);
+			return false;
+		}
+	}
+	return true;
+}
+
+/**
+ * irdma_handle_cqp_op - process cqp command
+ * @rf: RDMA PCI function
+ * @cqp_request: cqp request to process
+ */
+enum irdma_status_code irdma_handle_cqp_op(struct irdma_pci_f *rf,
+					   struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_sc_dev *dev = &rf->sc_dev;
+	struct cqp_cmds_info *info = &cqp_request->info;
+	enum irdma_status_code status;
+	bool put_cqp_request = true;
+
+	if (rf->reset)
+		return IRDMA_ERR_NOT_READY;
+
+	irdma_get_cqp_request(cqp_request);
+	status = irdma_process_cqp_cmd(dev, info);
+	if (status)
+		goto err;
+
+	if (cqp_request->waiting) {
+		put_cqp_request = false;
+		status = irdma_wait_event(rf, cqp_request);
+		if (status)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	if (irdma_cqp_crit_err(dev, info->cqp_cmd,
+			       cqp_request->compl_info.maj_err_code,
+			       cqp_request->compl_info.min_err_code))
+		ibdev_err(&rf->iwdev->ibdev,
+			  "[%s Error][op_code=%d] status=%d waiting=%d completion_err=%d maj=0x%x min=0x%x\n",
+			  irdma_cqp_cmd_names[info->cqp_cmd], info->cqp_cmd, status, cqp_request->waiting,
+			  cqp_request->compl_info.error, cqp_request->compl_info.maj_err_code,
+			  cqp_request->compl_info.min_err_code);
+
+	if (put_cqp_request)
+		irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+void irdma_qp_add_ref(struct ib_qp *ibqp)
+{
+	struct irdma_qp *iwqp = (struct irdma_qp *)ibqp;
+
+	refcount_inc(&iwqp->refcnt);
+}
+
+void irdma_qp_rem_ref(struct ib_qp *ibqp)
+{
+	struct irdma_qp *iwqp = to_iwqp(ibqp);
+	struct irdma_device *iwdev = iwqp->iwdev;
+	u32 qp_num;
+	unsigned long flags;
+
+	spin_lock_irqsave(&iwdev->rf->qptable_lock, flags);
+	if (!refcount_dec_and_test(&iwqp->refcnt)) {
+		spin_unlock_irqrestore(&iwdev->rf->qptable_lock, flags);
+		return;
+	}
+
+	qp_num = iwqp->ibqp.qp_num;
+	iwdev->rf->qp_table[qp_num] = NULL;
+	spin_unlock_irqrestore(&iwdev->rf->qptable_lock, flags);
+	complete(&iwqp->free_qp);
+}
+
+struct ib_device *to_ibdev(struct irdma_sc_dev *dev)
+{
+	return &(container_of(dev, struct irdma_pci_f, sc_dev))->iwdev->ibdev;
+}
+
+/**
+ * irdma_get_qp - get qp address
+ * @device: iwarp device
+ * @qpn: qp number
+ */
+struct ib_qp *irdma_get_qp(struct ib_device *device, int qpn)
+{
+	struct irdma_device *iwdev = to_iwdev(device);
+
+	if (qpn < IW_FIRST_QPN || qpn >= iwdev->rf->max_qp)
+		return NULL;
+
+	return &iwdev->rf->qp_table[qpn]->ibqp;
+}
+
+/**
+ * irdma_get_hw_addr - return hw addr
+ * @par: points to shared dev
+ */
+u8 __iomem *irdma_get_hw_addr(void *par)
+{
+	struct irdma_sc_dev *dev = par;
+
+	return dev->hw->hw_addr;
+}
+
+/**
+ * irdma_remove_cqp_head - return head entry and remove
+ * @dev: device
+ */
+void *irdma_remove_cqp_head(struct irdma_sc_dev *dev)
+{
+	struct list_head *entry;
+	struct list_head *list = &dev->cqp_cmd_head;
+
+	if (list_empty(list))
+		return NULL;
+
+	entry = list->next;
+	list_del(entry);
+
+	return entry;
+}
+
+/**
+ * irdma_cqp_sds_cmd - create cqp command for sd
+ * @dev: hardware control device structure
+ * @sdinfo: information for sd cqp
+ *
+ */
+enum irdma_status_code irdma_cqp_sds_cmd(struct irdma_sc_dev *dev,
+					 struct irdma_update_sds_info *sdinfo)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	memcpy(&cqp_info->in.u.update_pe_sds.info, sdinfo,
+	       sizeof(cqp_info->in.u.update_pe_sds.info));
+	cqp_info->cqp_cmd = IRDMA_OP_UPDATE_PE_SDS;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.update_pe_sds.dev = dev;
+	cqp_info->in.u.update_pe_sds.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_qp_suspend_resume - cqp command for suspend/resume
+ * @qp: hardware control qp
+ * @op: suspend or resume
+ */
+enum irdma_status_code irdma_cqp_qp_suspend_resume(struct irdma_sc_qp *qp,
+						   u8 op)
+{
+	struct irdma_sc_dev *dev = qp->dev;
+	struct irdma_cqp_request *cqp_request;
+	struct irdma_sc_cqp *cqp = dev->cqp;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, false);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = op;
+	cqp_info->in.u.suspend_resume.cqp = cqp;
+	cqp_info->in.u.suspend_resume.qp = qp;
+	cqp_info->in.u.suspend_resume.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_term_modify_qp - modify qp for term message
+ * @qp: hardware control qp
+ * @next_state: qp's next state
+ * @term: terminate code
+ * @term_len: length
+ */
+void irdma_term_modify_qp(struct irdma_sc_qp *qp, u8 next_state, u8 term,
+			  u8 term_len)
+{
+	struct irdma_qp *iwqp;
+
+	iwqp = qp->qp_uk.back_qp;
+	irdma_next_iw_state(iwqp, next_state, 0, term, term_len);
+};
+
+/**
+ * irdma_terminate_done - after terminate is completed
+ * @qp: hardware control qp
+ * @timeout_occurred: indicates if terminate timer expired
+ */
+void irdma_terminate_done(struct irdma_sc_qp *qp, int timeout_occurred)
+{
+	struct irdma_qp *iwqp;
+	u8 hte = 0;
+	bool first_time;
+	unsigned long flags;
+
+	iwqp = qp->qp_uk.back_qp;
+	spin_lock_irqsave(&iwqp->lock, flags);
+	if (iwqp->hte_added) {
+		iwqp->hte_added = 0;
+		hte = 1;
+	}
+	first_time = !(qp->term_flags & IRDMA_TERM_DONE);
+	qp->term_flags |= IRDMA_TERM_DONE;
+	spin_unlock_irqrestore(&iwqp->lock, flags);
+	if (first_time) {
+		if (!timeout_occurred)
+			irdma_terminate_del_timer(qp);
+
+		irdma_next_iw_state(iwqp, IRDMA_QP_STATE_ERROR, hte, 0, 0);
+		irdma_cm_disconn(iwqp);
+	}
+}
+
+static void irdma_terminate_timeout(struct timer_list *t)
+{
+	struct irdma_qp *iwqp = from_timer(iwqp, t, terminate_timer);
+	struct irdma_sc_qp *qp = &iwqp->sc_qp;
+
+	irdma_terminate_done(qp, 1);
+	irdma_qp_rem_ref(&iwqp->ibqp);
+}
+
+/**
+ * irdma_terminate_start_timer - start terminate timeout
+ * @qp: hardware control qp
+ */
+void irdma_terminate_start_timer(struct irdma_sc_qp *qp)
+{
+	struct irdma_qp *iwqp;
+
+	iwqp = qp->qp_uk.back_qp;
+	irdma_qp_add_ref(&iwqp->ibqp);
+	timer_setup(&iwqp->terminate_timer, irdma_terminate_timeout, 0);
+	iwqp->terminate_timer.expires = jiffies + HZ;
+
+	add_timer(&iwqp->terminate_timer);
+}
+
+/**
+ * irdma_terminate_del_timer - delete terminate timeout
+ * @qp: hardware control qp
+ */
+void irdma_terminate_del_timer(struct irdma_sc_qp *qp)
+{
+	struct irdma_qp *iwqp;
+	int ret;
+
+	iwqp = qp->qp_uk.back_qp;
+	ret = del_timer(&iwqp->terminate_timer);
+	if (ret)
+		irdma_qp_rem_ref(&iwqp->ibqp);
+}
+
+/**
+ * irdma_cqp_query_fpm_val_cmd - send cqp command for fpm
+ * @dev: function device struct
+ * @val_mem: buffer for fpm
+ * @hmc_fn_id: function id for fpm
+ */
+enum irdma_status_code
+irdma_cqp_query_fpm_val_cmd(struct irdma_sc_dev *dev,
+			    struct irdma_dma_mem *val_mem, u8 hmc_fn_id)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_request->param = NULL;
+	cqp_info->in.u.query_fpm_val.cqp = dev->cqp;
+	cqp_info->in.u.query_fpm_val.fpm_val_pa = val_mem->pa;
+	cqp_info->in.u.query_fpm_val.fpm_val_va = val_mem->va;
+	cqp_info->in.u.query_fpm_val.hmc_fn_id = hmc_fn_id;
+	cqp_info->cqp_cmd = IRDMA_OP_QUERY_FPM_VAL;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.query_fpm_val.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_commit_fpm_val_cmd - commit fpm values in hw
+ * @dev: hardware control device structure
+ * @val_mem: buffer with fpm values
+ * @hmc_fn_id: function id for fpm
+ */
+enum irdma_status_code
+irdma_cqp_commit_fpm_val_cmd(struct irdma_sc_dev *dev,
+			     struct irdma_dma_mem *val_mem, u8 hmc_fn_id)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_request->param = NULL;
+	cqp_info->in.u.commit_fpm_val.cqp = dev->cqp;
+	cqp_info->in.u.commit_fpm_val.fpm_val_pa = val_mem->pa;
+	cqp_info->in.u.commit_fpm_val.fpm_val_va = val_mem->va;
+	cqp_info->in.u.commit_fpm_val.hmc_fn_id = hmc_fn_id;
+	cqp_info->cqp_cmd = IRDMA_OP_COMMIT_FPM_VAL;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.commit_fpm_val.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_cq_create_cmd - create a cq for the cqp
+ * @dev: device pointer
+ * @cq: pointer to created cq
+ */
+enum irdma_status_code irdma_cqp_cq_create_cmd(struct irdma_sc_dev *dev,
+					       struct irdma_sc_cq *cq)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = IRDMA_OP_CQ_CREATE;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.cq_create.cq = cq;
+	cqp_info->in.u.cq_create.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(iwcqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_qp_create_cmd - create a qp for the cqp
+ * @dev: device pointer
+ * @qp: pointer to created qp
+ */
+enum irdma_status_code irdma_cqp_qp_create_cmd(struct irdma_sc_dev *dev,
+					       struct irdma_sc_qp *qp)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_create_qp_info *qp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	qp_info = &cqp_request->info.in.u.qp_create.info;
+	memset(qp_info, 0, sizeof(*qp_info));
+	qp_info->cq_num_valid = true;
+	qp_info->next_iwarp_state = IRDMA_QP_STATE_RTS;
+	cqp_info->cqp_cmd = IRDMA_OP_QP_CREATE;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.qp_create.qp = qp;
+	cqp_info->in.u.qp_create.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(iwcqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_dealloc_push_page - free a push page for qp
+ * @rf: RDMA PCI function
+ * @qp: hardware control qp
+ */
+static void irdma_dealloc_push_page(struct irdma_pci_f *rf,
+				    struct irdma_sc_qp *qp)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	if (qp->push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX)
+		return;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, false);
+	if (!cqp_request)
+		return;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = IRDMA_OP_MANAGE_PUSH_PAGE;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.manage_push_page.info.push_idx = qp->push_idx;
+	cqp_info->in.u.manage_push_page.info.qs_handle = qp->qs_handle;
+	cqp_info->in.u.manage_push_page.info.free_page = 1;
+	cqp_info->in.u.manage_push_page.info.push_page_type = 0;
+	cqp_info->in.u.manage_push_page.cqp = &rf->cqp.sc_cqp;
+	cqp_info->in.u.manage_push_page.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	if (!status)
+		qp->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX;
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+}
+
+/**
+ * irdma_free_qp_rsrc - free up memory resources for qp
+ * @iwqp: qp ptr (user or kernel)
+ */
+void irdma_free_qp_rsrc(struct irdma_qp *iwqp)
+{
+	struct irdma_device *iwdev = iwqp->iwdev;
+	struct irdma_pci_f *rf = iwdev->rf;
+	u32 qp_num = iwqp->ibqp.qp_num;
+
+	irdma_ieq_cleanup_qp(iwdev->vsi.ieq, &iwqp->sc_qp);
+	irdma_dealloc_push_page(rf, &iwqp->sc_qp);
+	if (iwqp->sc_qp.vsi) {
+		irdma_qp_rem_qos(&iwqp->sc_qp);
+		iwqp->sc_qp.dev->ws_remove(iwqp->sc_qp.vsi,
+					   iwqp->sc_qp.user_pri);
+	}
+
+	if (qp_num > 2)
+		irdma_free_rsrc(rf, rf->allocated_qps, qp_num);
+	dma_free_coherent(rf->sc_dev.hw->device, iwqp->q2_ctx_mem.size,
+			  iwqp->q2_ctx_mem.va, iwqp->q2_ctx_mem.pa);
+	iwqp->q2_ctx_mem.va = NULL;
+	dma_free_coherent(rf->sc_dev.hw->device, iwqp->kqp.dma_mem.size,
+			  iwqp->kqp.dma_mem.va, iwqp->kqp.dma_mem.pa);
+	iwqp->kqp.dma_mem.va = NULL;
+	kfree(iwqp->kqp.sq_wrid_mem);
+	iwqp->kqp.sq_wrid_mem = NULL;
+	kfree(iwqp->kqp.rq_wrid_mem);
+	iwqp->kqp.rq_wrid_mem = NULL;
+	kfree(iwqp);
+}
+
+/**
+ * irdma_cq_wq_destroy - send cq destroy cqp
+ * @rf: RDMA PCI function
+ * @cq: hardware control cq
+ */
+void irdma_cq_wq_destroy(struct irdma_pci_f *rf, struct irdma_sc_cq *cq)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request)
+		return;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = IRDMA_OP_CQ_DESTROY;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.cq_destroy.cq = cq;
+	cqp_info->in.u.cq_destroy.scratch = (uintptr_t)cqp_request;
+
+	irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+}
+
+/**
+ * irdma_hw_modify_qp_callback - handle state for modifyQPs that don't wait
+ * @cqp_request: modify QP completion
+ */
+static void irdma_hw_modify_qp_callback(struct irdma_cqp_request *cqp_request)
+{
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_qp *iwqp;
+
+	cqp_info = &cqp_request->info;
+	iwqp = cqp_info->in.u.qp_modify.qp->qp_uk.back_qp;
+	atomic_dec(&iwqp->hw_mod_qp_pend);
+	wake_up(&iwqp->mod_qp_waitq);
+}
+
+/**
+ * irdma_hw_modify_qp - setup cqp for modify qp
+ * @iwdev: RDMA device
+ * @iwqp: qp ptr (user or kernel)
+ * @info: info for modify qp
+ * @wait: flag to wait or not for modify qp completion
+ */
+enum irdma_status_code irdma_hw_modify_qp(struct irdma_device *iwdev,
+					  struct irdma_qp *iwqp,
+					  struct irdma_modify_qp_info *info,
+					  bool wait)
+{
+	enum irdma_status_code status;
+	struct irdma_pci_f *rf = iwdev->rf;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_modify_qp_info *m_info;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, wait);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	if (!wait) {
+		cqp_request->callback_fcn = irdma_hw_modify_qp_callback;
+		atomic_inc(&iwqp->hw_mod_qp_pend);
+	}
+	cqp_info = &cqp_request->info;
+	m_info = &cqp_info->in.u.qp_modify.info;
+	memcpy(m_info, info, sizeof(*m_info));
+	cqp_info->cqp_cmd = IRDMA_OP_QP_MODIFY;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.qp_modify.qp = &iwqp->sc_qp;
+	cqp_info->in.u.qp_modify.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+	if (status) {
+		if (rdma_protocol_roce(&iwdev->ibdev, 1))
+			return status;
+
+		switch (m_info->next_iwarp_state) {
+			struct irdma_gen_ae_info ae_info;
+
+		case IRDMA_QP_STATE_RTS:
+		case IRDMA_QP_STATE_IDLE:
+		case IRDMA_QP_STATE_TERMINATE:
+		case IRDMA_QP_STATE_CLOSING:
+			if (info->curr_iwarp_state == IRDMA_QP_STATE_IDLE)
+				irdma_send_reset(iwqp->cm_node);
+			else
+				iwqp->sc_qp.term_flags = IRDMA_TERM_DONE;
+			if (!wait) {
+				ae_info.ae_code = IRDMA_AE_BAD_CLOSE;
+				ae_info.ae_src = 0;
+				irdma_gen_ae(rf, &iwqp->sc_qp, &ae_info, false);
+			} else {
+				cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp,
+									      wait);
+				if (!cqp_request)
+					return IRDMA_ERR_NO_MEMORY;
+
+				cqp_info = &cqp_request->info;
+				m_info = &cqp_info->in.u.qp_modify.info;
+				memcpy(m_info, info, sizeof(*m_info));
+				cqp_info->cqp_cmd = IRDMA_OP_QP_MODIFY;
+				cqp_info->post_sq = 1;
+				cqp_info->in.u.qp_modify.qp = &iwqp->sc_qp;
+				cqp_info->in.u.qp_modify.scratch = (uintptr_t)cqp_request;
+				m_info->next_iwarp_state = IRDMA_QP_STATE_ERROR;
+				m_info->reset_tcp_conn = true;
+				irdma_handle_cqp_op(rf, cqp_request);
+				irdma_put_cqp_request(&rf->cqp, cqp_request);
+			}
+			break;
+		case IRDMA_QP_STATE_ERROR:
+		default:
+			break;
+		}
+	}
+
+	return status;
+}
+
+/**
+ * irdma_cqp_cq_destroy_cmd - destroy the cqp cq
+ * @dev: device pointer
+ * @cq: pointer to cq
+ */
+void irdma_cqp_cq_destroy_cmd(struct irdma_sc_dev *dev, struct irdma_sc_cq *cq)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+
+	irdma_cq_wq_destroy(rf, cq);
+}
+
+/**
+ * irdma_cqp_qp_destroy_cmd - destroy the cqp
+ * @dev: device pointer
+ * @qp: pointer to qp
+ */
+enum irdma_status_code irdma_cqp_qp_destroy_cmd(struct irdma_sc_dev *dev, struct irdma_sc_qp *qp)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	memset(cqp_info, 0, sizeof(*cqp_info));
+	cqp_info->cqp_cmd = IRDMA_OP_QP_DESTROY;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.qp_destroy.qp = qp;
+	cqp_info->in.u.qp_destroy.scratch = (uintptr_t)cqp_request;
+	cqp_info->in.u.qp_destroy.remove_hash_idx = true;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_ieq_mpa_crc_ae - generate AE for crc error
+ * @dev: hardware control device structure
+ * @qp: hardware control qp
+ */
+void irdma_ieq_mpa_crc_ae(struct irdma_sc_dev *dev, struct irdma_sc_qp *qp)
+{
+	struct irdma_gen_ae_info info = {};
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+
+	ibdev_dbg(&rf->iwdev->ibdev, "AEQ: Generate MPA CRC AE\n");
+	info.ae_code = IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR;
+	info.ae_src = IRDMA_AE_SOURCE_RQ;
+	irdma_gen_ae(rf, qp, &info, false);
+}
+
+/**
+ * irdma_init_hash_desc - initialize hash for crc calculation
+ * @desc: cryption type
+ */
+enum irdma_status_code irdma_init_hash_desc(struct shash_desc **desc)
+{
+	struct crypto_shash *tfm;
+	struct shash_desc *tdesc;
+
+	tfm = crypto_alloc_shash("crc32c", 0, 0);
+	if (IS_ERR(tfm))
+		return IRDMA_ERR_MPA_CRC;
+
+	tdesc = kzalloc(sizeof(*tdesc) + crypto_shash_descsize(tfm),
+			GFP_KERNEL);
+	if (!tdesc) {
+		crypto_free_shash(tfm);
+		return IRDMA_ERR_MPA_CRC;
+	}
+
+	tdesc->tfm = tfm;
+	*desc = tdesc;
+
+	return 0;
+}
+
+/**
+ * irdma_free_hash_desc - free hash desc
+ * @desc: to be freed
+ */
+void irdma_free_hash_desc(struct shash_desc *desc)
+{
+	if (desc) {
+		crypto_free_shash(desc->tfm);
+		kfree(desc);
+	}
+}
+
+/**
+ * irdma_ieq_check_mpacrc - check if mpa crc is OK
+ * @desc: desc for hash
+ * @addr: address of buffer for crc
+ * @len: length of buffer
+ * @val: value to be compared
+ */
+enum irdma_status_code irdma_ieq_check_mpacrc(struct shash_desc *desc,
+					      void *addr, u32 len, u32 val)
+{
+	u32 crc = 0;
+	int ret;
+	enum irdma_status_code ret_code = 0;
+
+	crypto_shash_init(desc);
+	ret = crypto_shash_update(desc, addr, len);
+	if (!ret)
+		crypto_shash_final(desc, (u8 *)&crc);
+	if (crc != val)
+		ret_code = IRDMA_ERR_MPA_CRC;
+
+	return ret_code;
+}
+
+/**
+ * irdma_ieq_get_qp - get qp based on quad in puda buffer
+ * @dev: hardware control device structure
+ * @buf: receive puda buffer on exception q
+ */
+struct irdma_sc_qp *irdma_ieq_get_qp(struct irdma_sc_dev *dev,
+				     struct irdma_puda_buf *buf)
+{
+	struct irdma_qp *iwqp;
+	struct irdma_cm_node *cm_node;
+	struct irdma_device *iwdev = buf->vsi->back_vsi;
+	u32 loc_addr[4] = {};
+	u32 rem_addr[4] = {};
+	u16 loc_port, rem_port;
+	struct ipv6hdr *ip6h;
+	struct iphdr *iph = (struct iphdr *)buf->iph;
+	struct tcphdr *tcph = (struct tcphdr *)buf->tcph;
+
+	if (iph->version == 4) {
+		loc_addr[0] = ntohl(iph->daddr);
+		rem_addr[0] = ntohl(iph->saddr);
+	} else {
+		ip6h = (struct ipv6hdr *)buf->iph;
+		irdma_copy_ip_ntohl(loc_addr, ip6h->daddr.in6_u.u6_addr32);
+		irdma_copy_ip_ntohl(rem_addr, ip6h->saddr.in6_u.u6_addr32);
+	}
+	loc_port = ntohs(tcph->dest);
+	rem_port = ntohs(tcph->source);
+	cm_node = irdma_find_node(&iwdev->cm_core, rem_port, rem_addr, loc_port,
+				  loc_addr, buf->vlan_valid ? buf->vlan_id : 0xFFFF);
+	if (!cm_node)
+		return NULL;
+
+	iwqp = cm_node->iwqp;
+	irdma_rem_ref_cm_node(cm_node);
+
+	return &iwqp->sc_qp;
+}
+
+/**
+ * irdma_send_ieq_ack - ACKs for duplicate or OOO partials FPDUs
+ * @qp: qp ptr
+ */
+void irdma_send_ieq_ack(struct irdma_sc_qp *qp)
+{
+	struct irdma_cm_node *cm_node = ((struct irdma_qp *)qp->qp_uk.back_qp)->cm_node;
+	struct irdma_puda_buf *buf = qp->pfpdu.lastrcv_buf;
+	struct tcphdr *tcph = (struct tcphdr *)buf->tcph;
+
+	cm_node->tcp_cntxt.rcv_nxt = qp->pfpdu.nextseqnum;
+	cm_node->tcp_cntxt.loc_seq_num = ntohl(tcph->ack_seq);
+
+	irdma_send_ack(cm_node);
+}
+
+/**
+ * irdma_puda_ieq_get_ah_info - get AH info from IEQ buffer
+ * @qp: qp pointer
+ * @ah_info: AH info pointer
+ */
+void irdma_puda_ieq_get_ah_info(struct irdma_sc_qp *qp,
+				struct irdma_ah_info *ah_info)
+{
+	struct irdma_puda_buf *buf = qp->pfpdu.ah_buf;
+	struct iphdr *iph;
+	struct ipv6hdr *ip6h;
+
+	memset(ah_info, 0, sizeof(*ah_info));
+	ah_info->do_lpbk = true;
+	ah_info->vlan_tag = buf->vlan_id;
+	ah_info->insert_vlan_tag = buf->vlan_valid;
+	ah_info->ipv4_valid = buf->ipv4;
+	ah_info->vsi = qp->vsi;
+
+	if (buf->smac_valid)
+		ether_addr_copy(ah_info->mac_addr, buf->smac);
+
+	if (buf->ipv4) {
+		ah_info->ipv4_valid = true;
+		iph = (struct iphdr *)buf->iph;
+		ah_info->hop_ttl = iph->ttl;
+		ah_info->tc_tos = iph->tos;
+		ah_info->dest_ip_addr[0] = ntohl(iph->daddr);
+		ah_info->src_ip_addr[0] = ntohl(iph->saddr);
+	} else {
+		ip6h = (struct ipv6hdr *)buf->iph;
+		ah_info->hop_ttl = ip6h->hop_limit;
+		ah_info->tc_tos = ip6h->priority;
+		irdma_copy_ip_ntohl(ah_info->dest_ip_addr,
+				    ip6h->daddr.in6_u.u6_addr32);
+		irdma_copy_ip_ntohl(ah_info->src_ip_addr,
+				    ip6h->saddr.in6_u.u6_addr32);
+	}
+
+	ah_info->dst_arpindex = irdma_arp_table(dev_to_rf(qp->dev),
+						ah_info->dest_ip_addr,
+						ah_info->ipv4_valid,
+						NULL, IRDMA_ARP_RESOLVE);
+}
+
+/**
+ * irdma_gen1_ieq_update_tcpip_info - update tcpip in the buffer
+ * @buf: puda to update
+ * @len: length of buffer
+ * @seqnum: seq number for tcp
+ */
+static void irdma_gen1_ieq_update_tcpip_info(struct irdma_puda_buf *buf,
+					     u16 len, u32 seqnum)
+{
+	struct tcphdr *tcph;
+	struct iphdr *iph;
+	u16 iphlen;
+	u16 pktsize;
+	u8 *addr = buf->mem.va;
+
+	iphlen = (buf->ipv4) ? 20 : 40;
+	iph = (struct iphdr *)(addr + buf->maclen);
+	tcph = (struct tcphdr *)(addr + buf->maclen + iphlen);
+	pktsize = len + buf->tcphlen + iphlen;
+	iph->tot_len = htons(pktsize);
+	tcph->seq = htonl(seqnum);
+}
+
+/**
+ * irdma_ieq_update_tcpip_info - update tcpip in the buffer
+ * @buf: puda to update
+ * @len: length of buffer
+ * @seqnum: seq number for tcp
+ */
+void irdma_ieq_update_tcpip_info(struct irdma_puda_buf *buf, u16 len,
+				 u32 seqnum)
+{
+	struct tcphdr *tcph;
+	u8 *addr;
+
+	if (buf->vsi->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+		return irdma_gen1_ieq_update_tcpip_info(buf, len, seqnum);
+
+	addr = buf->mem.va;
+	tcph = (struct tcphdr *)addr;
+	tcph->seq = htonl(seqnum);
+}
+
+/**
+ * irdma_gen1_puda_get_tcpip_info - get tcpip info from puda
+ * buffer
+ * @info: to get information
+ * @buf: puda buffer
+ */
+static enum irdma_status_code
+irdma_gen1_puda_get_tcpip_info(struct irdma_puda_cmpl_info *info,
+			       struct irdma_puda_buf *buf)
+{
+	struct iphdr *iph;
+	struct ipv6hdr *ip6h;
+	struct tcphdr *tcph;
+	u16 iphlen;
+	u16 pkt_len;
+	u8 *mem = buf->mem.va;
+	struct ethhdr *ethh = buf->mem.va;
+
+	if (ethh->h_proto == htons(0x8100)) {
+		info->vlan_valid = true;
+		buf->vlan_id = ntohs(((struct vlan_ethhdr *)ethh)->h_vlan_TCI) &
+			       VLAN_VID_MASK;
+	}
+
+	buf->maclen = (info->vlan_valid) ? 18 : 14;
+	iphlen = (info->l3proto) ? 40 : 20;
+	buf->ipv4 = (info->l3proto) ? false : true;
+	buf->iph = mem + buf->maclen;
+	iph = (struct iphdr *)buf->iph;
+	buf->tcph = buf->iph + iphlen;
+	tcph = (struct tcphdr *)buf->tcph;
+
+	if (buf->ipv4) {
+		pkt_len = ntohs(iph->tot_len);
+	} else {
+		ip6h = (struct ipv6hdr *)buf->iph;
+		pkt_len = ntohs(ip6h->payload_len) + iphlen;
+	}
+
+	buf->totallen = pkt_len + buf->maclen;
+
+	if (info->payload_len < buf->totallen) {
+		ibdev_dbg(to_ibdev(buf->vsi->dev),
+			  "ERR: payload_len = 0x%x totallen expected0x%x\n",
+			  info->payload_len, buf->totallen);
+		return IRDMA_ERR_INVALID_SIZE;
+	}
+
+	buf->tcphlen = tcph->doff << 2;
+	buf->datalen = pkt_len - iphlen - buf->tcphlen;
+	buf->data = buf->datalen ? buf->tcph + buf->tcphlen : NULL;
+	buf->hdrlen = buf->maclen + iphlen + buf->tcphlen;
+	buf->seqnum = ntohl(tcph->seq);
+
+	return 0;
+}
+
+/**
+ * irdma_puda_get_tcpip_info - get tcpip info from puda buffer
+ * @info: to get information
+ * @buf: puda buffer
+ */
+enum irdma_status_code
+irdma_puda_get_tcpip_info(struct irdma_puda_cmpl_info *info,
+			  struct irdma_puda_buf *buf)
+{
+	struct tcphdr *tcph;
+	u32 pkt_len;
+	u8 *mem;
+
+	if (buf->vsi->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+		return irdma_gen1_puda_get_tcpip_info(info, buf);
+
+	mem = buf->mem.va;
+	buf->vlan_valid = info->vlan_valid;
+	if (info->vlan_valid)
+		buf->vlan_id = info->vlan;
+
+	buf->ipv4 = info->ipv4;
+	if (buf->ipv4)
+		buf->iph = mem + IRDMA_IPV4_PAD;
+	else
+		buf->iph = mem;
+
+	buf->tcph = mem + IRDMA_TCP_OFFSET;
+	tcph = (struct tcphdr *)buf->tcph;
+	pkt_len = info->payload_len;
+	buf->totallen = pkt_len;
+	buf->tcphlen = tcph->doff << 2;
+	buf->datalen = pkt_len - IRDMA_TCP_OFFSET - buf->tcphlen;
+	buf->data = buf->datalen ? buf->tcph + buf->tcphlen : NULL;
+	buf->hdrlen = IRDMA_TCP_OFFSET + buf->tcphlen;
+	buf->seqnum = ntohl(tcph->seq);
+
+	if (info->smac_valid) {
+		ether_addr_copy(buf->smac, info->smac);
+		buf->smac_valid = true;
+	}
+
+	return 0;
+}
+
+/**
+ * irdma_hw_stats_timeout - Stats timer-handler which updates all HW stats
+ * @t: timer_list pointer
+ */
+static void irdma_hw_stats_timeout(struct timer_list *t)
+{
+	struct irdma_vsi_pestat *pf_devstat =
+		from_timer(pf_devstat, t, stats_timer);
+	struct irdma_sc_vsi *sc_vsi = pf_devstat->vsi;
+
+	if (sc_vsi->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1)
+		irdma_cqp_gather_stats_gen1(sc_vsi->dev, sc_vsi->pestat);
+	else
+		irdma_cqp_gather_stats_cmd(sc_vsi->dev, sc_vsi->pestat, false);
+
+	mod_timer(&pf_devstat->stats_timer,
+		  jiffies + msecs_to_jiffies(STATS_TIMER_DELAY));
+}
+
+/**
+ * irdma_hw_stats_start_timer - Start periodic stats timer
+ * @vsi: vsi structure pointer
+ */
+void irdma_hw_stats_start_timer(struct irdma_sc_vsi *vsi)
+{
+	struct irdma_vsi_pestat *devstat = vsi->pestat;
+
+	timer_setup(&devstat->stats_timer, irdma_hw_stats_timeout, 0);
+	mod_timer(&devstat->stats_timer,
+		  jiffies + msecs_to_jiffies(STATS_TIMER_DELAY));
+}
+
+/**
+ * irdma_hw_stats_stop_timer - Delete periodic stats timer
+ * @vsi: pointer to vsi structure
+ */
+void irdma_hw_stats_stop_timer(struct irdma_sc_vsi *vsi)
+{
+	struct irdma_vsi_pestat *devstat = vsi->pestat;
+
+	del_timer_sync(&devstat->stats_timer);
+}
+
+/**
+ * irdma_process_stats - Checking for wrap and update stats
+ * @pestat: stats structure pointer
+ */
+static void irdma_process_stats(struct irdma_vsi_pestat *pestat)
+{
+	struct irdma_sc_dev *dev = pestat->vsi->dev;
+
+	dev->iw_vsi_ops->vsi_update_stats(pestat->vsi);
+}
+
+/**
+ * irdma_cqp_gather_stats_gen1 - Gather stats
+ * @dev: pointer to device structure
+ * @pestat: statistics structure
+ */
+void irdma_cqp_gather_stats_gen1(struct irdma_sc_dev *dev,
+				 struct irdma_vsi_pestat *pestat)
+{
+	struct irdma_gather_stats *gather_stats =
+		pestat->gather_info.gather_stats_va;
+	u32 stats_inst_offset_32;
+	u32 stats_inst_offset_64;
+
+	stats_inst_offset_32 = (pestat->gather_info.use_stats_inst) ?
+				       pestat->gather_info.stats_inst_index :
+				       pestat->hw->hmc.hmc_fn_id;
+	stats_inst_offset_32 *= 4;
+	stats_inst_offset_64 = stats_inst_offset_32 * 2;
+
+	gather_stats->rxvlanerr =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_RXVLANERR]
+		     + stats_inst_offset_32);
+	gather_stats->ip4rxdiscard =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_IP4RXDISCARD]
+		     + stats_inst_offset_32);
+	gather_stats->ip4rxtrunc =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_IP4RXTRUNC]
+		     + stats_inst_offset_32);
+	gather_stats->ip4txnoroute =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_IP4TXNOROUTE]
+		     + stats_inst_offset_32);
+	gather_stats->ip6rxdiscard =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_IP6RXDISCARD]
+		     + stats_inst_offset_32);
+	gather_stats->ip6rxtrunc =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_IP6RXTRUNC]
+		     + stats_inst_offset_32);
+	gather_stats->ip6txnoroute =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_IP6TXNOROUTE]
+		     + stats_inst_offset_32);
+	gather_stats->tcprtxseg =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_TCPRTXSEG]
+		     + stats_inst_offset_32);
+	gather_stats->tcprxopterr =
+		rd32(dev->hw,
+		     dev->hw_stats_regs_32[IRDMA_HW_STAT_INDEX_TCPRXOPTERR]
+		     + stats_inst_offset_32);
+
+	gather_stats->ip4rxocts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4RXOCTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip4rxpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4RXPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip4txfrag =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4RXFRAGS]
+		     + stats_inst_offset_64);
+	gather_stats->ip4rxmcpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4RXMCPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip4txocts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4TXOCTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip4txpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4TXPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip4txfrag =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4TXFRAGS]
+		     + stats_inst_offset_64);
+	gather_stats->ip4txmcpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP4TXMCPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6rxocts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6RXOCTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6rxpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6RXPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6txfrags =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6RXFRAGS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6rxmcpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6RXMCPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6txocts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6TXOCTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6txpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6TXPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6txfrags =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6TXFRAGS]
+		     + stats_inst_offset_64);
+	gather_stats->ip6txmcpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_IP6TXMCPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->tcprxsegs =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_TCPRXSEGS]
+		     + stats_inst_offset_64);
+	gather_stats->tcptxsegs =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_TCPTXSEG]
+		     + stats_inst_offset_64);
+	gather_stats->rdmarxrds =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMARXRDS]
+		     + stats_inst_offset_64);
+	gather_stats->rdmarxsnds =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMARXSNDS]
+		     + stats_inst_offset_64);
+	gather_stats->rdmarxwrs =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMARXWRS]
+		     + stats_inst_offset_64);
+	gather_stats->rdmatxrds =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMATXRDS]
+		     + stats_inst_offset_64);
+	gather_stats->rdmatxsnds =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMATXSNDS]
+		     + stats_inst_offset_64);
+	gather_stats->rdmatxwrs =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMATXWRS]
+		     + stats_inst_offset_64);
+	gather_stats->rdmavbn =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMAVBND]
+		     + stats_inst_offset_64);
+	gather_stats->rdmavinv =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_RDMAVINV]
+		     + stats_inst_offset_64);
+	gather_stats->udprxpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_UDPRXPKTS]
+		     + stats_inst_offset_64);
+	gather_stats->udptxpkts =
+		rd64(dev->hw,
+		     dev->hw_stats_regs_64[IRDMA_HW_STAT_INDEX_UDPTXPKTS]
+		     + stats_inst_offset_64);
+
+	irdma_process_stats(pestat);
+}
+
+/**
+ * irdma_process_cqp_stats - Checking for wrap and update stats
+ * @cqp_request: cqp_request structure pointer
+ */
+static void irdma_process_cqp_stats(struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_vsi_pestat *pestat = cqp_request->param;
+
+	irdma_process_stats(pestat);
+}
+
+/**
+ * irdma_cqp_gather_stats_cmd - Gather stats
+ * @dev: pointer to device structure
+ * @pestat: pointer to stats info
+ * @wait: flag to wait or not wait for stats
+ */
+enum irdma_status_code
+irdma_cqp_gather_stats_cmd(struct irdma_sc_dev *dev,
+			   struct irdma_vsi_pestat *pestat, bool wait)
+
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, wait);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	memset(cqp_info, 0, sizeof(*cqp_info));
+	cqp_info->cqp_cmd = IRDMA_OP_STATS_GATHER;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.stats_gather.info = pestat->gather_info;
+	cqp_info->in.u.stats_gather.scratch = (uintptr_t)cqp_request;
+	cqp_info->in.u.stats_gather.cqp = &rf->cqp.sc_cqp;
+	cqp_request->param = pestat;
+	if (!wait)
+		cqp_request->callback_fcn = irdma_process_cqp_stats;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	if (wait)
+		irdma_process_stats(pestat);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_stats_inst_cmd - Allocate/free stats instance
+ * @vsi: pointer to vsi structure
+ * @cmd: command to allocate or free
+ * @stats_info: pointer to allocate stats info
+ */
+enum irdma_status_code
+irdma_cqp_stats_inst_cmd(struct irdma_sc_vsi *vsi, u8 cmd,
+			 struct irdma_stats_inst_info *stats_info)
+{
+	struct irdma_pci_f *rf = dev_to_rf(vsi->dev);
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+	bool wait = false;
+
+	if (cmd == IRDMA_OP_STATS_ALLOCATE)
+		wait = true;
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, wait);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	memset(cqp_info, 0, sizeof(*cqp_info));
+	cqp_info->cqp_cmd = cmd;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.stats_manage.info = *stats_info;
+	cqp_info->in.u.stats_manage.scratch = (uintptr_t)cqp_request;
+	cqp_info->in.u.stats_manage.cqp = &rf->cqp.sc_cqp;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	if (wait)
+		stats_info->stats_idx = cqp_request->compl_info.op_ret_val;
+	irdma_put_cqp_request(iwcqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_ceq_cmd - Create/Destroy CEQ's after CEQ 0
+ * @dev: pointer to device info
+ * @sc_ceq: pointer to ceq structure
+ * @op: Create or Destroy
+ */
+enum irdma_status_code irdma_cqp_ceq_cmd(struct irdma_sc_dev *dev,
+					 struct irdma_sc_ceq *sc_ceq, u8 op)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->post_sq = 1;
+	cqp_info->cqp_cmd = op;
+	cqp_info->in.u.ceq_create.ceq = sc_ceq;
+	cqp_info->in.u.ceq_create.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_aeq_cmd - Create/Destroy AEQ
+ * @dev: pointer to device info
+ * @sc_aeq: pointer to aeq structure
+ * @op: Create or Destroy
+ */
+enum irdma_status_code irdma_cqp_aeq_cmd(struct irdma_sc_dev *dev,
+					 struct irdma_sc_aeq *sc_aeq, u8 op)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->post_sq = 1;
+	cqp_info->cqp_cmd = op;
+	cqp_info->in.u.aeq_create.aeq = sc_aeq;
+	cqp_info->in.u.aeq_create.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_ws_node_cmd - Add/modify/delete ws node
+ * @dev: pointer to device structure
+ * @cmd: Add, modify or delete
+ * @node_info: pointer to ws node info
+ */
+enum irdma_status_code
+irdma_cqp_ws_node_cmd(struct irdma_sc_dev *dev, u8 cmd,
+		      struct irdma_ws_node_info *node_info)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_sc_cqp *cqp = &iwcqp->sc_cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+	bool poll;
+
+	if (!rf->sc_dev.ceq_valid)
+		poll = true;
+	else
+		poll = false;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, !poll);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	memset(cqp_info, 0, sizeof(*cqp_info));
+	cqp_info->cqp_cmd = cmd;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.ws_node.info = *node_info;
+	cqp_info->in.u.ws_node.cqp = cqp;
+	cqp_info->in.u.ws_node.scratch = (uintptr_t)cqp_request;
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	if (status)
+		goto exit;
+
+	if (poll) {
+		struct irdma_ccq_cqe_info compl_info;
+
+		status = cqp->dev->cqp_ops->poll_for_cqp_op_done(cqp,
+								 IRDMA_CQP_OP_WORK_SCHED_NODE,
+								 &compl_info);
+		node_info->qs_handle = compl_info.op_ret_val;
+		ibdev_dbg(&rf->iwdev->ibdev, "DCB: opcode=%d, compl_info.retval=%d\n",
+			  compl_info.op_code, compl_info.op_ret_val);
+	} else {
+		node_info->qs_handle = cqp_request->compl_info.op_ret_val;
+	}
+
+exit:
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_cqp_up_map_cmd - Set the up-up mapping
+ * @dev: pointer to device structure
+ * @cmd: map command
+ * @map_info: pointer to up map info
+ */
+enum irdma_status_code irdma_cqp_up_map_cmd(struct irdma_sc_dev *dev, u8 cmd,
+					    struct irdma_up_info *map_info)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	struct irdma_cqp *iwcqp = &rf->cqp;
+	struct irdma_sc_cqp *cqp = &iwcqp->sc_cqp;
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(iwcqp, false);
+	if (!cqp_request)
+		return IRDMA_ERR_NO_MEMORY;
+
+	cqp_info = &cqp_request->info;
+	memset(cqp_info, 0, sizeof(*cqp_info));
+	cqp_info->cqp_cmd = cmd;
+	cqp_info->post_sq = 1;
+	cqp_info->in.u.up_map.info = *map_info;
+	cqp_info->in.u.up_map.cqp = cqp;
+	cqp_info->in.u.up_map.scratch = (uintptr_t)cqp_request;
+
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	return status;
+}
+
+/**
+ * irdma_ah_cqp_op - perform an AH cqp operation
+ * @rf: RDMA PCI function
+ * @sc_ah: address handle
+ * @cmd: AH operation
+ * @wait: wait if true
+ * @callback_fcn: Callback function on CQP op completion
+ * @cb_param: parameter for callback function
+ *
+ * returns errno
+ */
+int irdma_ah_cqp_op(struct irdma_pci_f *rf, struct irdma_sc_ah *sc_ah, u8 cmd,
+		    bool wait,
+		    void (*callback_fcn)(struct irdma_cqp_request *),
+		    void *cb_param)
+{
+	struct irdma_cqp_request *cqp_request;
+	struct cqp_cmds_info *cqp_info;
+	enum irdma_status_code status;
+
+	if (cmd != IRDMA_OP_AH_CREATE && cmd != IRDMA_OP_AH_DESTROY)
+		return -EINVAL;
+
+	cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, wait);
+	if (!cqp_request)
+		return -ENOMEM;
+
+	cqp_info = &cqp_request->info;
+	cqp_info->cqp_cmd = cmd;
+	cqp_info->post_sq = 1;
+	if (cmd == IRDMA_OP_AH_CREATE) {
+		cqp_info->in.u.ah_create.info = sc_ah->ah_info;
+		cqp_info->in.u.ah_create.scratch = (uintptr_t)cqp_request;
+		cqp_info->in.u.ah_create.cqp = &rf->cqp.sc_cqp;
+	} else if (cmd == IRDMA_OP_AH_DESTROY) {
+		cqp_info->in.u.ah_destroy.info = sc_ah->ah_info;
+		cqp_info->in.u.ah_destroy.scratch = (uintptr_t)cqp_request;
+		cqp_info->in.u.ah_destroy.cqp = &rf->cqp.sc_cqp;
+	}
+
+	if (!wait) {
+		cqp_request->callback_fcn = callback_fcn;
+		cqp_request->param = cb_param;
+	}
+	status = irdma_handle_cqp_op(rf, cqp_request);
+	irdma_put_cqp_request(&rf->cqp, cqp_request);
+
+	if (status)
+		return -ENOMEM;
+
+	if (wait)
+		sc_ah->ah_info.ah_valid = (cmd == IRDMA_OP_AH_CREATE);
+
+	return 0;
+}
+
+/**
+ * irdma_ieq_ah_cb - callback after creation of AH for IEQ
+ * @cqp_request: pointer to cqp_request of create AH
+ */
+static void irdma_ieq_ah_cb(struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_sc_qp *qp = cqp_request->param;
+	struct irdma_sc_ah *sc_ah = qp->pfpdu.ah;
+	unsigned long flags;
+
+	spin_lock_irqsave(&qp->pfpdu.lock, flags);
+	if (!cqp_request->compl_info.op_ret_val) {
+		sc_ah->ah_info.ah_valid = true;
+		irdma_ieq_process_fpdus(qp, qp->vsi->ieq);
+	} else {
+		sc_ah->ah_info.ah_valid = false;
+		irdma_ieq_cleanup_qp(qp->vsi->ieq, qp);
+	}
+	spin_unlock_irqrestore(&qp->pfpdu.lock, flags);
+}
+
+/**
+ * irdma_ilq_ah_cb - callback after creation of AH for ILQ
+ * @cqp_request: pointer to cqp_request of create AH
+ */
+static void irdma_ilq_ah_cb(struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_cm_node *cm_node = cqp_request->param;
+	struct irdma_sc_ah *sc_ah = cm_node->ah;
+
+	sc_ah->ah_info.ah_valid = !cqp_request->compl_info.op_ret_val;
+	irdma_add_conn_est_qh(cm_node);
+}
+
+/**
+ * irdma_puda_create_ah - create AH for ILQ/IEQ qp's
+ * @dev: device pointer
+ * @ah_info: Address handle info
+ * @wait: When true will wait for operation to complete
+ * @type: ILQ/IEQ
+ * @cb_param: Callback param when not waiting
+ * @ah_ret: Returned pointer to address handle if created
+ *
+ */
+enum irdma_status_code irdma_puda_create_ah(struct irdma_sc_dev *dev,
+					    struct irdma_ah_info *ah_info,
+					    bool wait, enum puda_rsrc_type type,
+					    void *cb_param,
+					    struct irdma_sc_ah **ah_ret)
+{
+	struct irdma_sc_ah *ah;
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	int err;
+
+	ah = kzalloc(sizeof(*ah), GFP_ATOMIC);
+	*ah_ret = ah;
+	if (!ah)
+		return IRDMA_ERR_NO_MEMORY;
+
+	err = irdma_alloc_rsrc(rf, rf->allocated_ahs, rf->max_ah,
+			       &ah_info->ah_idx, &rf->next_ah);
+	if (err)
+		goto err_free;
+
+	ah->dev = dev;
+	ah->ah_info = *ah_info;
+
+	if (type == IRDMA_PUDA_RSRC_TYPE_ILQ)
+		err = irdma_ah_cqp_op(rf, ah, IRDMA_OP_AH_CREATE, wait,
+				      irdma_ilq_ah_cb, cb_param);
+	else
+		err = irdma_ah_cqp_op(rf, ah, IRDMA_OP_AH_CREATE, wait,
+				      irdma_ieq_ah_cb, cb_param);
+
+	if (err)
+		goto error;
+	return 0;
+
+error:
+	irdma_free_rsrc(rf, rf->allocated_ahs, ah->ah_info.ah_idx);
+err_free:
+	kfree(ah);
+	*ah_ret = NULL;
+	return IRDMA_ERR_NO_MEMORY;
+}
+
+/**
+ * irdma_puda_free_ah - free a puda address handle
+ * @dev: device pointer
+ * @ah: The address handle to free
+ */
+void irdma_puda_free_ah(struct irdma_sc_dev *dev, struct irdma_sc_ah *ah)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+
+	if (!ah)
+		return;
+
+	if (ah->ah_info.ah_valid) {
+		irdma_ah_cqp_op(rf, ah, IRDMA_OP_AH_DESTROY, false, NULL, NULL);
+		irdma_free_rsrc(rf, rf->allocated_ahs, ah->ah_info.ah_idx);
+	}
+
+	kfree(ah);
+}
+
+/**
+ * irdma_gsi_ud_qp_ah_cb - callback after creation of AH for GSI/ID QP
+ * @cqp_request: pointer to cqp_request of create AH
+ */
+void irdma_gsi_ud_qp_ah_cb(struct irdma_cqp_request *cqp_request)
+{
+	struct irdma_sc_ah *sc_ah = cqp_request->param;
+
+	if (!cqp_request->compl_info.op_ret_val)
+		sc_ah->ah_info.ah_valid = true;
+	else
+		sc_ah->ah_info.ah_valid = false;
+}
+
+/**
+ * irdma_prm_add_pble_mem - add moemory to pble resources
+ * @pprm: pble resource manager
+ * @pchunk: chunk of memory to add
+ */
+enum irdma_status_code irdma_prm_add_pble_mem(struct irdma_pble_prm *pprm,
+					      struct irdma_chunk *pchunk)
+{
+	u64 sizeofbitmap;
+
+	if (pchunk->size & 0xfff)
+		return IRDMA_ERR_PARAM;
+
+	sizeofbitmap = (u64)pchunk->size >> pprm->pble_shift;
+
+	pchunk->bitmapmem.size = sizeofbitmap >> 3;
+	pchunk->bitmapmem.va = kzalloc(pchunk->bitmapmem.size, GFP_KERNEL);
+
+	if (!pchunk->bitmapmem.va)
+		return IRDMA_ERR_NO_MEMORY;
+
+	pchunk->bitmapbuf = pchunk->bitmapmem.va;
+	bitmap_zero(pchunk->bitmapbuf, sizeofbitmap);
+
+	pchunk->sizeofbitmap = sizeofbitmap;
+	/* each pble is 8 bytes hence shift by 3 */
+	pprm->total_pble_alloc += pchunk->size >> 3;
+	pprm->free_pble_cnt += pchunk->size >> 3;
+
+	return 0;
+}
+
+/**
+ * irdma_prm_get_pbles - get pble's from prm
+ * @pprm: pble resource manager
+ * @chunkinfo: nformation about chunk where pble's were acquired
+ * @mem_size: size of pble memory needed
+ * @vaddr: returns virtual address of pble memory
+ * @fpm_addr: returns fpm address of pble memory
+ */
+enum irdma_status_code
+irdma_prm_get_pbles(struct irdma_pble_prm *pprm,
+		    struct irdma_pble_chunkinfo *chunkinfo, u32 mem_size,
+		    u64 *vaddr, u64 *fpm_addr)
+{
+	u64 bits_needed;
+	u64 bit_idx = PBLE_INVALID_IDX;
+	struct irdma_chunk *pchunk = NULL;
+	struct list_head *chunk_entry = pprm->clist.next;
+	u32 offset;
+	unsigned long flags;
+	*vaddr = 0;
+	*fpm_addr = 0;
+
+	bits_needed = (mem_size + (1 << pprm->pble_shift) - 1) >> pprm->pble_shift;
+
+	spin_lock_irqsave(&pprm->prm_lock, flags);
+	while (chunk_entry != &pprm->clist) {
+		pchunk = (struct irdma_chunk *)chunk_entry;
+		bit_idx = bitmap_find_next_zero_area(pchunk->bitmapbuf,
+						     pchunk->sizeofbitmap, 0,
+						     bits_needed, 0);
+		if (bit_idx < pchunk->sizeofbitmap)
+			break;
+
+		/* list.next used macro */
+		chunk_entry = pchunk->list.next;
+	}
+
+	if (!pchunk || bit_idx >= pchunk->sizeofbitmap) {
+		spin_unlock_irqrestore(&pprm->prm_lock, flags);
+		return IRDMA_ERR_NO_MEMORY;
+	}
+
+	bitmap_set(pchunk->bitmapbuf, bit_idx, bits_needed);
+	offset = bit_idx << pprm->pble_shift;
+	*vaddr = pchunk->vaddr + offset;
+	*fpm_addr = pchunk->fpm_addr + offset;
+
+	chunkinfo->pchunk = pchunk;
+	chunkinfo->bit_idx = bit_idx;
+	chunkinfo->bits_used = bits_needed;
+	/* 3 is sizeof pble divide */
+	pprm->free_pble_cnt -= chunkinfo->bits_used << (pprm->pble_shift - 3);
+	spin_unlock_irqrestore(&pprm->prm_lock, flags);
+
+	return 0;
+}
+
+/**
+ * irdma_prm_return_pbles - return pbles back to prm
+ * @pprm: pble resource manager
+ * @chunkinfo: chunk where pble's were acquired and to be freed
+ */
+void irdma_prm_return_pbles(struct irdma_pble_prm *pprm,
+			    struct irdma_pble_chunkinfo *chunkinfo)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&pprm->prm_lock, flags);
+	pprm->free_pble_cnt += chunkinfo->bits_used << (pprm->pble_shift - 3);
+	bitmap_clear(chunkinfo->pchunk->bitmapbuf, chunkinfo->bit_idx,
+		     chunkinfo->bits_used);
+	spin_unlock_irqrestore(&pprm->prm_lock, flags);
+}
+
+enum irdma_status_code irdma_map_vm_page_list(struct irdma_hw *hw, void *va,
+					      dma_addr_t *pg_dma, u32 pg_cnt)
+{
+	struct page *vm_page;
+	int i;
+	u8 *addr;
+
+	addr = (u8 *)(uintptr_t)va;
+	for (i = 0; i < pg_cnt; i++) {
+		vm_page = vmalloc_to_page(addr);
+		if (!vm_page)
+			goto err;
+
+		pg_dma[i] = dma_map_page(hw->device, vm_page, 0, PAGE_SIZE,
+					 DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(hw->device, pg_dma[i]))
+			goto err;
+
+		addr += PAGE_SIZE;
+	}
+
+	return 0;
+
+err:
+	irdma_unmap_vm_page_list(hw, pg_dma, i);
+	return IRDMA_ERR_NO_MEMORY;
+}
+
+void irdma_unmap_vm_page_list(struct irdma_hw *hw, dma_addr_t *pg_dma, u32 pg_cnt)
+{
+	int i;
+
+	for (i = 0; i < pg_cnt; i++)
+		dma_unmap_page(hw->device, pg_dma[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
+}
+
+/**
+ * irdma_pble_free_paged_mem - free virtual paged memory
+ * @chunk: chunk to free with paged memory
+ */
+void irdma_pble_free_paged_mem(struct irdma_chunk *chunk)
+{
+	if (!chunk->pg_cnt)
+		goto done;
+
+	irdma_unmap_vm_page_list(chunk->dev->hw, chunk->dmainfo.dmaaddrs,
+				 chunk->pg_cnt);
+
+done:
+	kfree(chunk->dmainfo.dmaaddrs);
+	chunk->dmainfo.dmaaddrs = NULL;
+	vfree((void *)(uintptr_t)chunk->vaddr);
+	chunk->vaddr = 0;
+	chunk->type = 0;
+}
+
+/**
+ * irdma_pble_get_paged_mem -allocate paged memory for pbles
+ * @chunk: chunk to add for paged memory
+ * @pg_cnt: number of pages needed
+ */
+enum irdma_status_code irdma_pble_get_paged_mem(struct irdma_chunk *chunk,
+						u32 pg_cnt)
+{
+	u32 size;
+	void *va;
+
+	chunk->dmainfo.dmaaddrs = kzalloc(pg_cnt << 3, GFP_KERNEL);
+	if (!chunk->dmainfo.dmaaddrs)
+		return IRDMA_ERR_NO_MEMORY;
+
+	size = PAGE_SIZE * pg_cnt;
+	va = vmalloc(size);
+	if (!va)
+		goto err;
+
+	if (irdma_map_vm_page_list(chunk->dev->hw, va, chunk->dmainfo.dmaaddrs,
+				   pg_cnt)) {
+		vfree(va);
+		goto err;
+	}
+	chunk->vaddr = (uintptr_t)va;
+	chunk->size = size;
+	chunk->pg_cnt = pg_cnt;
+	chunk->type = PBLE_SD_PAGED;
+
+	return 0;
+err:
+	kfree(chunk->dmainfo.dmaaddrs);
+	chunk->dmainfo.dmaaddrs = NULL;
+
+	return IRDMA_ERR_NO_MEMORY;
+}
+
+/**
+ * irdma_alloc_ws_node_id - Allocate a tx scheduler node ID
+ * @dev: device pointer
+ */
+u16 irdma_alloc_ws_node_id(struct irdma_sc_dev *dev)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+	u32 next = 1;
+	u32 node_id;
+
+	if (irdma_alloc_rsrc(rf, rf->allocated_ws_nodes, rf->max_ws_node_id,
+			     &node_id, &next))
+		return IRDMA_WS_NODE_INVALID;
+
+	return (u16)node_id;
+}
+
+/**
+ * irdma_free_ws_node_id - Free a tx scheduler node ID
+ * @dev: device pointer
+ * @node_id: Work scheduler node ID
+ */
+void irdma_free_ws_node_id(struct irdma_sc_dev *dev, u16 node_id)
+{
+	struct irdma_pci_f *rf = dev_to_rf(dev);
+
+	irdma_free_rsrc(rf, rf->allocated_ws_nodes, (u32)node_id);
+}
+
+/**
+ * irdma_modify_qp_to_err - Modify a QP to error
+ * @sc_qp: qp structure
+ */
+void irdma_modify_qp_to_err(struct irdma_sc_qp *sc_qp)
+{
+	struct irdma_qp *qp = sc_qp->qp_uk.back_qp;
+	struct ib_qp_attr attr;
+
+	if (qp->iwdev->reset)
+		return;
+	attr.qp_state = IB_QPS_ERR;
+
+	if (rdma_protocol_roce(qp->ibqp.device, 1))
+		irdma_modify_qp_roce(&qp->ibqp, &attr, IB_QP_STATE, NULL);
+	else
+		irdma_modify_qp(&qp->ibqp, &attr, IB_QP_STATE, NULL);
+}
+
+void irdma_ib_qp_event(struct irdma_qp *iwqp, enum irdma_qp_event_type event)
+{
+	struct ib_event ibevent;
+
+	if (!iwqp->ibqp.event_handler)
+		return;
+
+	switch (event) {
+	case IRDMA_QP_EVENT_CATASTROPHIC:
+		ibevent.event = IB_EVENT_QP_FATAL;
+		break;
+	case IRDMA_QP_EVENT_ACCESS_ERR:
+		ibevent.event = IB_EVENT_QP_ACCESS_ERR;
+		break;
+	}
+	ibevent.device = iwqp->ibqp.device;
+	ibevent.element.qp = &iwqp->ibqp;
+	iwqp->ibqp.event_handler(&ibevent, iwqp->ibqp.qp_context);
+}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 20/23] RDMA/irdma: Add dynamic tracing for CM
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (18 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 19/23] RDMA/irdma: Add miscellaneous utility definitions Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 21/23] RDMA/irdma: Add ABI definitions Shiraz Saleem
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Michael J. Ruhl, Shiraz Saleem

From: "Michael J. Ruhl" <michael.j.ruhl@intel.com>

Add dynamic tracing functionality to debug connection
management issues.

Signed-off-by: "Michael J. Ruhl" <michael.j.ruhl@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/trace.c    | 112 ++++++++
 drivers/infiniband/hw/irdma/trace.h    |   3 +
 drivers/infiniband/hw/irdma/trace_cm.h | 458 +++++++++++++++++++++++++++++++++
 3 files changed, 573 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/trace.c
 create mode 100644 drivers/infiniband/hw/irdma/trace.h
 create mode 100644 drivers/infiniband/hw/irdma/trace_cm.h

diff --git a/drivers/infiniband/hw/irdma/trace.c b/drivers/infiniband/hw/irdma/trace.c
new file mode 100644
index 0000000..b5133f4
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/trace.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
+/* Copyright (c) 2019 Intel Corporation */
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
+const char *print_ip_addr(struct trace_seq *p, u32 *addr, u16 port, bool ipv4)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+
+	if (ipv4) {
+		__be32 myaddr = htonl(*addr);
+
+		trace_seq_printf(p, "%pI4:%d", &myaddr, htons(port));
+	} else {
+		trace_seq_printf(p, "%pI6:%d", addr, htons(port));
+	}
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+const char *parse_iw_event_type(enum iw_cm_event_type iw_type)
+{
+	switch (iw_type) {
+	case IW_CM_EVENT_CONNECT_REQUEST:
+		return "IwRequest";
+	case IW_CM_EVENT_CONNECT_REPLY:
+		return "IwReply";
+	case IW_CM_EVENT_ESTABLISHED:
+		return "IwEstablished";
+	case IW_CM_EVENT_DISCONNECT:
+		return "IwDisconnect";
+	case IW_CM_EVENT_CLOSE:
+		return "IwClose";
+	}
+
+	return "Unknown";
+}
+
+const char *parse_cm_event_type(enum irdma_cm_event_type cm_type)
+{
+	switch (cm_type) {
+	case IRDMA_CM_EVENT_ESTABLISHED:
+		return "CmEstablished";
+	case IRDMA_CM_EVENT_MPA_REQ:
+		return "CmMPA_REQ";
+	case IRDMA_CM_EVENT_MPA_CONNECT:
+		return "CmMPA_CONNECT";
+	case IRDMA_CM_EVENT_MPA_ACCEPT:
+		return "CmMPA_ACCEPT";
+	case IRDMA_CM_EVENT_MPA_REJECT:
+		return "CmMPA_REJECT";
+	case IRDMA_CM_EVENT_MPA_ESTABLISHED:
+		return "CmMPA_ESTABLISHED";
+	case IRDMA_CM_EVENT_CONNECTED:
+		return "CmConnected";
+	case IRDMA_CM_EVENT_RESET:
+		return "CmReset";
+	case IRDMA_CM_EVENT_ABORTED:
+		return "CmAborted";
+	case IRDMA_CM_EVENT_UNKNOWN:
+		return "none";
+	}
+	return "Unknown";
+}
+
+const char *parse_cm_state(enum irdma_cm_node_state state)
+{
+	switch (state) {
+	case IRDMA_CM_STATE_UNKNOWN:
+		return "UNKNOWN";
+	case IRDMA_CM_STATE_INITED:
+		return "INITED";
+	case IRDMA_CM_STATE_LISTENING:
+		return "LISTENING";
+	case IRDMA_CM_STATE_SYN_RCVD:
+		return "SYN_RCVD";
+	case IRDMA_CM_STATE_SYN_SENT:
+		return "SYN_SENT";
+	case IRDMA_CM_STATE_ONE_SIDE_ESTABLISHED:
+		return "ONE_SIDE_ESTABLISHED";
+	case IRDMA_CM_STATE_ESTABLISHED:
+		return "ESTABLISHED";
+	case IRDMA_CM_STATE_ACCEPTING:
+		return "ACCEPTING";
+	case IRDMA_CM_STATE_MPAREQ_SENT:
+		return "MPAREQ_SENT";
+	case IRDMA_CM_STATE_MPAREQ_RCVD:
+		return "MPAREQ_RCVD";
+	case IRDMA_CM_STATE_MPAREJ_RCVD:
+		return "MPAREJ_RECVD";
+	case IRDMA_CM_STATE_OFFLOADED:
+		return "OFFLOADED";
+	case IRDMA_CM_STATE_FIN_WAIT1:
+		return "FIN_WAIT1";
+	case IRDMA_CM_STATE_FIN_WAIT2:
+		return "FIN_WAIT2";
+	case IRDMA_CM_STATE_CLOSE_WAIT:
+		return "CLOSE_WAIT";
+	case IRDMA_CM_STATE_TIME_WAIT:
+		return "TIME_WAIT";
+	case IRDMA_CM_STATE_LAST_ACK:
+		return "LAST_ACK";
+	case IRDMA_CM_STATE_CLOSING:
+		return "CLOSING";
+	case IRDMA_CM_STATE_LISTENER_DESTROYED:
+		return "LISTENER_DESTROYED";
+	case IRDMA_CM_STATE_CLOSED:
+		return "CLOSED";
+	}
+	return ("Bad state");
+}
diff --git a/drivers/infiniband/hw/irdma/trace.h b/drivers/infiniband/hw/irdma/trace.h
new file mode 100644
index 0000000..702e4ef
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/trace.h
@@ -0,0 +1,3 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2019 Intel Corporation */
+#include "trace_cm.h"
diff --git a/drivers/infiniband/hw/irdma/trace_cm.h b/drivers/infiniband/hw/irdma/trace_cm.h
new file mode 100644
index 0000000..74ad2b6
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/trace_cm.h
@@ -0,0 +1,458 @@
+/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */
+/* Copyright (c) 2019 - 2021 Intel Corporation */
+#if !defined(__TRACE_CM_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __TRACE_CM_H
+
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#include "main.h"
+
+const char *print_ip_addr(struct trace_seq *p, u32 *addr, u16 port, bool ivp4);
+const char *parse_iw_event_type(enum iw_cm_event_type iw_type);
+const char *parse_cm_event_type(enum irdma_cm_event_type cm_type);
+const char *parse_cm_state(enum irdma_cm_node_state);
+#define __print_ip_addr(addr, port, ipv4) print_ip_addr(p, addr, port, ipv4)
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM irdma_cm
+
+TRACE_EVENT(irdma_create_listen,
+	    TP_PROTO(struct irdma_device *iwdev, struct irdma_cm_info *cm_info),
+	    TP_ARGS(iwdev, cm_info),
+	    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+			     __dynamic_array(u32, laddr, 4)
+			     __field(u16, lport)
+			     __field(bool, ipv4)
+		    ),
+	    TP_fast_assign(__entry->iwdev = iwdev;
+			   __entry->lport = cm_info->loc_port;
+			   __entry->ipv4 = cm_info->ipv4;
+			   memcpy(__get_dynamic_array(laddr),
+				  cm_info->loc_addr, 4);
+		    ),
+	    TP_printk("iwdev=%p  loc: %s",
+		      __entry->iwdev,
+		      __print_ip_addr(__get_dynamic_array(laddr),
+				      __entry->lport, __entry->ipv4)
+		    )
+);
+
+TRACE_EVENT(irdma_dec_refcnt_listen,
+	    TP_PROTO(struct irdma_cm_listener *listener, void *caller),
+	    TP_ARGS(listener, caller),
+	    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+		    __field(u32, refcnt)
+		    __dynamic_array(u32, laddr, 4)
+		    __field(u16, lport)
+		    __field(bool, ipv4)
+		    __field(void *, caller)
+		    ),
+	    TP_fast_assign(__entry->iwdev = listener->iwdev;
+			   __entry->lport = listener->loc_port;
+			   __entry->ipv4 = listener->ipv4;
+			   memcpy(__get_dynamic_array(laddr),
+				  listener->loc_addr, 4);
+		    ),
+	    TP_printk("iwdev=%p  caller=%pS  loc: %s",
+		      __entry->iwdev,
+		      __entry->caller,
+		      __print_ip_addr(__get_dynamic_array(laddr),
+				      __entry->lport, __entry->ipv4)
+		    )
+);
+
+DECLARE_EVENT_CLASS(listener_template,
+		    TP_PROTO(struct irdma_cm_listener *listener),
+		    TP_ARGS(listener),
+		    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+				     __field(u16, lport)
+				     __field(u16, vlan_id)
+				     __field(bool, ipv4)
+				     __field(enum irdma_cm_listener_state,
+					     state)
+				     __dynamic_array(u32, laddr, 4)
+			    ),
+		    TP_fast_assign(__entry->iwdev = listener->iwdev;
+				   __entry->lport = listener->loc_port;
+				   __entry->vlan_id = listener->vlan_id;
+				   __entry->ipv4 = listener->ipv4;
+				   __entry->state = listener->listener_state;
+				   memcpy(__get_dynamic_array(laddr),
+					  listener->loc_addr, 4);
+			    ),
+		    TP_printk("iwdev=%p  vlan=%d  loc: %s",
+			      __entry->iwdev,
+			      __entry->vlan_id,
+			      __print_ip_addr(__get_dynamic_array(laddr),
+					      __entry->lport, __entry->ipv4)
+			    )
+);
+
+DEFINE_EVENT(listener_template, irdma_find_listener,
+	     TP_PROTO(struct irdma_cm_listener *listener),
+	     TP_ARGS(listener));
+
+DEFINE_EVENT(listener_template, irdma_del_multiple_qhash,
+	     TP_PROTO(struct irdma_cm_listener *listener),
+	     TP_ARGS(listener));
+
+TRACE_EVENT(irdma_negotiate_mpa_v2,
+	    TP_PROTO(struct irdma_cm_node *cm_node),
+	    TP_ARGS(cm_node),
+	    TP_STRUCT__entry(__field(struct irdma_cm_node *, cm_node)
+			     __field(u16, ord_size)
+			     __field(u16, ird_size)
+		    ),
+	    TP_fast_assign(__entry->cm_node = cm_node;
+			   __entry->ord_size = cm_node->ord_size;
+			   __entry->ird_size = cm_node->ird_size;
+		    ),
+	    TP_printk("MPVA2 Negotiated cm_node=%p ORD:[%d], IRD:[%d]",
+		      __entry->cm_node,
+		      __entry->ord_size,
+		      __entry->ird_size
+		    )
+);
+
+DECLARE_EVENT_CLASS(tos_template,
+		    TP_PROTO(struct irdma_device *iwdev, u8 tos, u8 user_pri),
+		    TP_ARGS(iwdev, tos, user_pri),
+		    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+				     __field(u8, tos)
+				     __field(u8, user_pri)
+			    ),
+		    TP_fast_assign(__entry->iwdev = iwdev;
+				   __entry->tos = tos;
+				   __entry->user_pri = user_pri;
+			    ),
+		    TP_printk("iwdev=%p  TOS:[%d]  UP:[%d]",
+			      __entry->iwdev,
+			      __entry->tos,
+			      __entry->user_pri
+			    )
+);
+
+DEFINE_EVENT(tos_template, irdma_listener_tos,
+	     TP_PROTO(struct irdma_device *iwdev, u8 tos, u8 user_pri),
+	     TP_ARGS(iwdev, tos, user_pri));
+
+DEFINE_EVENT(tos_template, irdma_dcb_tos,
+	     TP_PROTO(struct irdma_device *iwdev, u8 tos, u8 user_pri),
+	     TP_ARGS(iwdev, tos, user_pri));
+
+DECLARE_EVENT_CLASS(qhash_template,
+		    TP_PROTO(struct irdma_device *iwdev,
+			     struct irdma_cm_listener *listener,
+			     char *dev_addr),
+		    TP_ARGS(iwdev, listener, dev_addr),
+		    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+				     __field(u16, lport)
+				     __field(u16, vlan_id)
+				     __field(bool, ipv4)
+				     __dynamic_array(u32, laddr, 4)
+				     __dynamic_array(u32, mac, ETH_ALEN)
+			    ),
+		    TP_fast_assign(__entry->iwdev = iwdev;
+				   __entry->lport = listener->loc_port;
+				   __entry->vlan_id = listener->vlan_id;
+				   __entry->ipv4 = listener->ipv4;
+				   memcpy(__get_dynamic_array(laddr),
+					  listener->loc_addr, 4);
+				   ether_addr_copy(__get_dynamic_array(mac),
+						   dev_addr);
+			    ),
+		    TP_printk("iwdev=%p  vlan=%d  MAC=%pM  loc: %s",
+			      __entry->iwdev,
+			      __entry->vlan_id,
+			      __get_dynamic_array(mac),
+			      __print_ip_addr(__get_dynamic_array(laddr),
+					      __entry->lport, __entry->ipv4)
+		    )
+);
+
+DEFINE_EVENT(qhash_template, irdma_add_mqh_6,
+	     TP_PROTO(struct irdma_device *iwdev,
+		      struct irdma_cm_listener *listener, char *dev_addr),
+	     TP_ARGS(iwdev, listener, dev_addr));
+
+DEFINE_EVENT(qhash_template, irdma_add_mqh_4,
+	     TP_PROTO(struct irdma_device *iwdev,
+		      struct irdma_cm_listener *listener, char *dev_addr),
+	     TP_ARGS(iwdev, listener, dev_addr));
+
+TRACE_EVENT(irdma_addr_resolve,
+	    TP_PROTO(struct irdma_device *iwdev, char *dev_addr),
+	    TP_ARGS(iwdev, dev_addr),
+	    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+		    __dynamic_array(u8, mac, ETH_ALEN)
+		    ),
+	    TP_fast_assign(__entry->iwdev = iwdev;
+		    ether_addr_copy(__get_dynamic_array(mac), dev_addr);
+		    ),
+	    TP_printk("iwdev=%p   MAC=%pM", __entry->iwdev,
+		      __get_dynamic_array(mac)
+		    )
+);
+
+TRACE_EVENT(irdma_send_cm_event,
+	    TP_PROTO(struct irdma_cm_node *cm_node, struct iw_cm_id *cm_id,
+		     enum iw_cm_event_type type, int status, void *caller),
+	    TP_ARGS(cm_node, cm_id, type, status, caller),
+	    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+			     __field(struct irdma_cm_node *, cm_node)
+			     __field(struct iw_cm_id *, cm_id)
+			     __field(u32, refcount)
+			     __field(u16, lport)
+			     __field(u16, rport)
+			     __field(enum irdma_cm_node_state, state)
+			     __field(bool, ipv4)
+			     __field(u16, vlan_id)
+			     __field(int, accel)
+			     __field(enum iw_cm_event_type, type)
+			     __field(int, status)
+			     __field(void *, caller)
+			     __dynamic_array(u32, laddr, 4)
+			     __dynamic_array(u32, raddr, 4)
+		    ),
+	    TP_fast_assign(__entry->iwdev = cm_node->iwdev;
+			   __entry->cm_node = cm_node;
+			   __entry->cm_id = cm_id;
+			   __entry->refcount = refcount_read(&cm_node->refcnt);
+			   __entry->state = cm_node->state;
+			   __entry->lport = cm_node->loc_port;
+			   __entry->rport = cm_node->rem_port;
+			   __entry->ipv4 = cm_node->ipv4;
+			   __entry->vlan_id = cm_node->vlan_id;
+			   __entry->accel = cm_node->accelerated;
+			   __entry->type = type;
+			   __entry->status = status;
+			   __entry->caller = caller;
+			   memcpy(__get_dynamic_array(laddr),
+				  cm_node->loc_addr, 4);
+			   memcpy(__get_dynamic_array(raddr),
+				  cm_node->rem_addr, 4);
+		    ),
+	    TP_printk("iwdev=%p  caller=%pS  cm_id=%p  node=%p  refcnt=%d  vlan_id=%d  accel=%d  state=%s  event_type=%s  status=%d  loc: %s  rem: %s",
+		      __entry->iwdev,
+		      __entry->caller,
+		      __entry->cm_id,
+		      __entry->cm_node,
+		      __entry->refcount,
+		      __entry->vlan_id,
+		      __entry->accel,
+		      parse_cm_state(__entry->state),
+		      parse_iw_event_type(__entry->type),
+		      __entry->status,
+		      __print_ip_addr(__get_dynamic_array(laddr),
+				      __entry->lport, __entry->ipv4),
+		      __print_ip_addr(__get_dynamic_array(raddr),
+				      __entry->rport, __entry->ipv4)
+		    )
+);
+
+TRACE_EVENT(irdma_send_cm_event_no_node,
+	    TP_PROTO(struct iw_cm_id *cm_id, enum iw_cm_event_type type,
+		     int status, void *caller),
+	    TP_ARGS(cm_id, type, status, caller),
+	    TP_STRUCT__entry(__field(struct iw_cm_id *, cm_id)
+			     __field(enum iw_cm_event_type, type)
+			     __field(int, status)
+			     __field(void *, caller)
+		    ),
+	    TP_fast_assign(__entry->cm_id = cm_id;
+			   __entry->type = type;
+			   __entry->status = status;
+			   __entry->caller = caller;
+		    ),
+	    TP_printk("cm_id=%p  caller=%pS  event_type=%s  status=%d",
+		      __entry->cm_id,
+		      __entry->caller,
+		      parse_iw_event_type(__entry->type),
+		      __entry->status
+		    )
+);
+
+DECLARE_EVENT_CLASS(cm_node_template,
+		    TP_PROTO(struct irdma_cm_node *cm_node,
+			     enum irdma_cm_event_type type, void *caller),
+		    TP_ARGS(cm_node, type, caller),
+		    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+				     __field(struct irdma_cm_node *, cm_node)
+				     __field(u32, refcount)
+				     __field(u16, lport)
+				     __field(u16, rport)
+				     __field(enum irdma_cm_node_state, state)
+				     __field(bool, ipv4)
+				     __field(u16, vlan_id)
+				     __field(int, accel)
+				     __field(enum irdma_cm_event_type, type)
+				     __field(void *, caller)
+				     __dynamic_array(u32, laddr, 4)
+				     __dynamic_array(u32, raddr, 4)
+			    ),
+		    TP_fast_assign(__entry->iwdev = cm_node->iwdev;
+				   __entry->cm_node = cm_node;
+				   __entry->refcount = refcount_read(&cm_node->refcnt);
+				   __entry->state = cm_node->state;
+				   __entry->lport = cm_node->loc_port;
+				   __entry->rport = cm_node->rem_port;
+				   __entry->ipv4 = cm_node->ipv4;
+				   __entry->vlan_id = cm_node->vlan_id;
+				   __entry->accel = cm_node->accelerated;
+				   __entry->type = type;
+				   __entry->caller = caller;
+				   memcpy(__get_dynamic_array(laddr),
+					  cm_node->loc_addr, 4);
+				   memcpy(__get_dynamic_array(raddr),
+					  cm_node->rem_addr, 4);
+			    ),
+		    TP_printk("iwdev=%p  caller=%pS  node=%p  refcnt=%d  vlan_id=%d  accel=%d  state=%s  event_type=%s  loc: %s  rem: %s",
+			      __entry->iwdev,
+			      __entry->caller,
+			      __entry->cm_node,
+			      __entry->refcount,
+			      __entry->vlan_id,
+			      __entry->accel,
+			      parse_cm_state(__entry->state),
+			      parse_cm_event_type(__entry->type),
+			      __print_ip_addr(__get_dynamic_array(laddr),
+					      __entry->lport, __entry->ipv4),
+			      __print_ip_addr(__get_dynamic_array(raddr),
+					      __entry->rport, __entry->ipv4)
+		    )
+);
+
+DEFINE_EVENT(cm_node_template, irdma_create_event,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+DEFINE_EVENT(cm_node_template, irdma_accept,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+DEFINE_EVENT(cm_node_template, irdma_connect,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+DEFINE_EVENT(cm_node_template, irdma_reject,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+DEFINE_EVENT(cm_node_template, irdma_find_node,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+DEFINE_EVENT(cm_node_template, irdma_send_reset,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+DEFINE_EVENT(cm_node_template, irdma_rem_ref_cm_node,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+DEFINE_EVENT(cm_node_template, irdma_cm_event_handler,
+	     TP_PROTO(struct irdma_cm_node *cm_node,
+		      enum irdma_cm_event_type type, void *caller),
+	     TP_ARGS(cm_node, type, caller));
+
+TRACE_EVENT(open_err_template,
+	    TP_PROTO(struct irdma_cm_node *cm_node, bool reset, void *caller),
+	    TP_ARGS(cm_node, reset, caller),
+	    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+			     __field(struct irdma_cm_node *, cm_node)
+			     __field(enum irdma_cm_node_state, state)
+			     __field(bool, reset)
+			     __field(void *, caller)
+		    ),
+	    TP_fast_assign(__entry->iwdev = cm_node->iwdev;
+			   __entry->cm_node = cm_node;
+			   __entry->state = cm_node->state;
+			   __entry->reset = reset;
+			   __entry->caller = caller;
+		    ),
+	    TP_printk("iwdev=%p  caller=%pS  node%p reset=%d  state=%s",
+		      __entry->iwdev,
+		      __entry->caller,
+		      __entry->cm_node,
+		      __entry->reset,
+		      parse_cm_state(__entry->state)
+		    )
+);
+
+DEFINE_EVENT(open_err_template, irdma_active_open_err,
+	     TP_PROTO(struct irdma_cm_node *cm_node, bool reset, void *caller),
+	     TP_ARGS(cm_node, reset, caller));
+
+DEFINE_EVENT(open_err_template, irdma_passive_open_err,
+	     TP_PROTO(struct irdma_cm_node *cm_node, bool reset, void *caller),
+	     TP_ARGS(cm_node, reset, caller));
+
+DECLARE_EVENT_CLASS(cm_node_ah_template,
+		    TP_PROTO(struct irdma_cm_node *cm_node),
+		    TP_ARGS(cm_node),
+		    TP_STRUCT__entry(__field(struct irdma_device *, iwdev)
+				     __field(struct irdma_cm_node *, cm_node)
+				     __field(struct irdma_sc_ah *, ah)
+				     __field(u32, refcount)
+				     __field(u16, lport)
+				     __field(u16, rport)
+				     __field(enum irdma_cm_node_state, state)
+				     __field(bool, ipv4)
+				     __field(u16, vlan_id)
+				     __field(int, accel)
+				     __dynamic_array(u32, laddr, 4)
+				     __dynamic_array(u32, raddr, 4)
+			    ),
+		    TP_fast_assign(__entry->iwdev = cm_node->iwdev;
+				   __entry->cm_node = cm_node;
+				   __entry->ah = cm_node->ah;
+				   __entry->refcount = refcount_read(&cm_node->refcnt);
+				   __entry->lport = cm_node->loc_port;
+				   __entry->rport = cm_node->rem_port;
+				   __entry->state = cm_node->state;
+				   __entry->ipv4 = cm_node->ipv4;
+				   __entry->vlan_id = cm_node->vlan_id;
+				   __entry->accel = cm_node->accelerated;
+				   memcpy(__get_dynamic_array(laddr),
+					  cm_node->loc_addr, 4);
+				   memcpy(__get_dynamic_array(raddr),
+					  cm_node->rem_addr, 4);
+			    ),
+		    TP_printk("iwdev=%p  node=%p  ah=%p  refcnt=%d  vlan_id=%d  accel=%d  state=%s loc: %s  rem: %s",
+			      __entry->iwdev,
+			      __entry->cm_node,
+			      __entry->ah,
+			      __entry->refcount,
+			      __entry->vlan_id,
+			      __entry->accel,
+			      parse_cm_state(__entry->state),
+			      __print_ip_addr(__get_dynamic_array(laddr),
+					      __entry->lport, __entry->ipv4),
+			      __print_ip_addr(__get_dynamic_array(raddr),
+					      __entry->rport, __entry->ipv4)
+		    )
+);
+
+DEFINE_EVENT(cm_node_ah_template, irdma_cm_free_ah,
+	     TP_PROTO(struct irdma_cm_node *cm_node),
+	     TP_ARGS(cm_node));
+
+DEFINE_EVENT(cm_node_ah_template, irdma_create_ah,
+	     TP_PROTO(struct irdma_cm_node *cm_node),
+	     TP_ARGS(cm_node));
+
+#endif  /* __TRACE_CM_H */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_cm
+#include <trace/define_trace.h>
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 21/23] RDMA/irdma: Add ABI definitions
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (19 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 20/23] RDMA/irdma: Add dynamic tracing for CM Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 22/23] RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw Shiraz Saleem
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen,
	Mustafa Ismail, Shiraz Saleem

From: Mustafa Ismail <mustafa.ismail@intel.com>

Add ABI definitions for irdma.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 include/uapi/rdma/irdma-abi.h | 116 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 116 insertions(+)
 create mode 100644 include/uapi/rdma/irdma-abi.h

diff --git a/include/uapi/rdma/irdma-abi.h b/include/uapi/rdma/irdma-abi.h
new file mode 100644
index 0000000..d994b0b
--- /dev/null
+++ b/include/uapi/rdma/irdma-abi.h
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */
+/*
+ * Copyright (c) 2006 - 2021 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2005 Topspin Communications.  All rights reserved.
+ * Copyright (c) 2005 Cisco Systems.  All rights reserved.
+ * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
+ */
+
+#ifndef IRDMA_ABI_H
+#define IRDMA_ABI_H
+
+#include <linux/types.h>
+
+/* irdma must support legacy GEN_1 i40iw kernel
+ * and user-space whose last ABI ver is 5
+ */
+#define IRDMA_ABI_VER 5
+
+enum irdma_memreg_type {
+	IW_MEMREG_TYPE_MEM  = 0,
+	IW_MEMREG_TYPE_QP   = 1,
+	IW_MEMREG_TYPE_CQ   = 2,
+	IW_MEMREG_TYPE_RSVD = 3,
+	IW_MEMREG_TYPE_MW   = 4,
+};
+
+struct irdma_alloc_ucontext_req {
+	__u32 rsvd32;
+	__u8 userspace_ver;
+	__u8 rsvd8[3];
+};
+
+struct irdma_alloc_ucontext_resp {
+	__u32 max_pds;
+	__u32 max_qps;
+	__u32 wq_size; /* size of the WQs (SQ+RQ) in the mmaped area */
+	__u8 kernel_ver;
+	__u8 rsvd[3];
+	__aligned_u64 feature_flags;
+	__aligned_u64 db_mmap_key;
+	__u32 max_hw_wq_frags;
+	__u32 max_hw_read_sges;
+	__u32 max_hw_inline;
+	__u32 max_hw_rq_quanta;
+	__u32 max_hw_wq_quanta;
+	__u32 min_hw_cq_size;
+	__u32 max_hw_cq_size;
+	__u32 rsvd1[7];
+	__u16 max_hw_sq_chunk;
+	__u16 rsvd2[11];
+	__u8 hw_rev;
+	__u8 rsvd3[7];
+};
+
+struct irdma_alloc_pd_resp {
+	__u32 pd_id;
+	__u8 rsvd[4];
+};
+
+struct irdma_resize_cq_req {
+	__aligned_u64 user_cq_buffer;
+};
+
+struct irdma_create_cq_req {
+	__aligned_u64 user_cq_buf;
+	__aligned_u64 user_shadow_area;
+};
+
+struct irdma_create_qp_req {
+	__aligned_u64 user_wqe_bufs;
+	__aligned_u64 user_compl_ctx;
+};
+
+struct irdma_mem_reg_req {
+	__u16 reg_type; /* Memory, QP or CQ */
+	__u16 cq_pages;
+	__u16 rq_pages;
+	__u16 sq_pages;
+};
+
+struct irdma_modify_qp_req {
+	__u8 sq_flush;
+	__u8 rq_flush;
+	__u8 rsvd[6];
+};
+
+struct irdma_create_cq_resp {
+	__u32 cq_id;
+	__u32 cq_size;
+};
+
+struct irdma_create_qp_resp {
+	__u32 qp_id;
+	__u32 actual_sq_size;
+	__u32 actual_rq_size;
+	__u32 irdma_drv_opt;
+	__u16 push_idx;
+	__u8 lsmm;
+	__u8 rsvd;
+	__u32 qp_caps;
+	__u16 rsvd2[2];
+};
+
+struct irdma_modify_qp_resp {
+	__aligned_u64 push_wqe_mmap_key;
+	__aligned_u64 push_db_mmap_key;
+	__u16 push_offset;
+	__u8 push_valid;
+	__u8 rsvd[5];
+};
+
+struct irdma_create_ah_resp {
+	__u32 ah_id;
+	__u8 rsvd[4];
+};
+#endif /* IRDMA_ABI_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 22/23] RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (20 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 21/23] RDMA/irdma: Add ABI definitions Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:01 ` [PATCH v4 23/23] RDMA/irdma: Update MAINTAINERS file Shiraz Saleem
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

Add Kconfig and Makefile to build irdma driver.

Remove i40iw driver and add an alias in irdma.
irdma is the replacement driver that supports X722.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 Documentation/ABI/stable/sysfs-class-infiniband |   20 -
 MAINTAINERS                                     |    8 -
 drivers/infiniband/Kconfig                      |    2 +-
 drivers/infiniband/hw/Makefile                  |    2 +-
 drivers/infiniband/hw/i40iw/Kconfig             |    9 -
 drivers/infiniband/hw/i40iw/Makefile            |    9 -
 drivers/infiniband/hw/i40iw/i40iw.h             |  602 ---
 drivers/infiniband/hw/i40iw/i40iw_cm.c          | 4419 -------------------
 drivers/infiniband/hw/i40iw/i40iw_cm.h          |  462 --
 drivers/infiniband/hw/i40iw/i40iw_ctrl.c        | 5243 -----------------------
 drivers/infiniband/hw/i40iw/i40iw_d.h           | 1746 --------
 drivers/infiniband/hw/i40iw/i40iw_hmc.c         |  821 ----
 drivers/infiniband/hw/i40iw/i40iw_hmc.h         |  241 --
 drivers/infiniband/hw/i40iw/i40iw_hw.c          |  851 ----
 drivers/infiniband/hw/i40iw/i40iw_main.c        | 2066 ---------
 drivers/infiniband/hw/i40iw/i40iw_osdep.h       |  195 -
 drivers/infiniband/hw/i40iw/i40iw_p.h           |  129 -
 drivers/infiniband/hw/i40iw/i40iw_pble.c        |  613 ---
 drivers/infiniband/hw/i40iw/i40iw_pble.h        |  131 -
 drivers/infiniband/hw/i40iw/i40iw_puda.c        | 1496 -------
 drivers/infiniband/hw/i40iw/i40iw_puda.h        |  188 -
 drivers/infiniband/hw/i40iw/i40iw_register.h    | 1030 -----
 drivers/infiniband/hw/i40iw/i40iw_status.h      |  101 -
 drivers/infiniband/hw/i40iw/i40iw_type.h        | 1358 ------
 drivers/infiniband/hw/i40iw/i40iw_uk.c          | 1200 ------
 drivers/infiniband/hw/i40iw/i40iw_user.h        |  422 --
 drivers/infiniband/hw/i40iw/i40iw_utils.c       | 1518 -------
 drivers/infiniband/hw/i40iw/i40iw_verbs.c       | 2652 ------------
 drivers/infiniband/hw/i40iw/i40iw_verbs.h       |  179 -
 drivers/infiniband/hw/i40iw/i40iw_vf.c          |   85 -
 drivers/infiniband/hw/i40iw/i40iw_vf.h          |   62 -
 drivers/infiniband/hw/i40iw/i40iw_virtchnl.c    |  759 ----
 drivers/infiniband/hw/i40iw/i40iw_virtchnl.h    |  124 -
 drivers/infiniband/hw/irdma/Kconfig             |   11 +
 drivers/infiniband/hw/irdma/Makefile            |   27 +
 include/uapi/rdma/i40iw-abi.h                   |  107 -
 36 files changed, 40 insertions(+), 28848 deletions(-)
 delete mode 100644 drivers/infiniband/hw/i40iw/Kconfig
 delete mode 100644 drivers/infiniband/hw/i40iw/Makefile
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_cm.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_cm.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_ctrl.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_d.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_hmc.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_hmc.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_hw.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_main.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_osdep.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_p.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_pble.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_pble.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_puda.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_puda.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_register.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_status.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_type.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_uk.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_user.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_utils.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_verbs.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_verbs.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_vf.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_vf.h
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_virtchnl.c
 delete mode 100644 drivers/infiniband/hw/i40iw/i40iw_virtchnl.h
 create mode 100644 drivers/infiniband/hw/irdma/Kconfig
 create mode 100644 drivers/infiniband/hw/irdma/Makefile
 delete mode 100644 include/uapi/rdma/i40iw-abi.h

diff --git a/Documentation/ABI/stable/sysfs-class-infiniband b/Documentation/ABI/stable/sysfs-class-infiniband
index 348c4ac..9b1bdfa 100644
--- a/Documentation/ABI/stable/sysfs-class-infiniband
+++ b/Documentation/ABI/stable/sysfs-class-infiniband
@@ -731,26 +731,6 @@ Description:
 		is the irq number of "sdma3", and M is irq number of "sdma4" in
 		the /proc/interrupts file.
 
-
-sysfs interface for Intel(R) X722 iWARP i40iw driver
-----------------------------------------------------
-
-What:		/sys/class/infiniband/i40iwX/hw_rev
-What:		/sys/class/infiniband/i40iwX/hca_type
-What:		/sys/class/infiniband/i40iwX/board_id
-Date:		Jan, 2016
-KernelVersion:	v4.10
-Contact:	linux-rdma@vger.kernel.org
-Description:
-		=============== ==== ========================
-		hw_rev:		(RO) Hardware revision number
-
-		hca_type:	(RO) Show HCA type (I40IW)
-
-		board_id:	(RO) I40IW board ID
-		=============== ==== ========================
-
-
 sysfs interface for QLogic qedr NIC Driver
 ------------------------------------------
 
diff --git a/MAINTAINERS b/MAINTAINERS
index 27e424e..276cadf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9169,14 +9169,6 @@ L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	drivers/cpufreq/intel_pstate.c
 
-INTEL RDMA RNIC DRIVER
-M:	Faisal Latif <faisal.latif@intel.com>
-M:	Shiraz Saleem <shiraz.saleem@intel.com>
-L:	linux-rdma@vger.kernel.org
-S:	Supported
-F:	drivers/infiniband/hw/i40iw/
-F:	include/uapi/rdma/i40iw-abi.h
-
 INTEL SCU DRIVERS
 M:	Mika Westerberg <mika.westerberg@linux.intel.com>
 S:	Maintained
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 04a78d9..33d3ce9 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -82,7 +82,7 @@ source "drivers/infiniband/hw/mthca/Kconfig"
 source "drivers/infiniband/hw/qib/Kconfig"
 source "drivers/infiniband/hw/cxgb4/Kconfig"
 source "drivers/infiniband/hw/efa/Kconfig"
-source "drivers/infiniband/hw/i40iw/Kconfig"
+source "drivers/infiniband/hw/irdma/Kconfig"
 source "drivers/infiniband/hw/mlx4/Kconfig"
 source "drivers/infiniband/hw/mlx5/Kconfig"
 source "drivers/infiniband/hw/ocrdma/Kconfig"
diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile
index 0aeccd9..fba0b3b 100644
--- a/drivers/infiniband/hw/Makefile
+++ b/drivers/infiniband/hw/Makefile
@@ -3,7 +3,7 @@ obj-$(CONFIG_INFINIBAND_MTHCA)		+= mthca/
 obj-$(CONFIG_INFINIBAND_QIB)		+= qib/
 obj-$(CONFIG_INFINIBAND_CXGB4)		+= cxgb4/
 obj-$(CONFIG_INFINIBAND_EFA)		+= efa/
-obj-$(CONFIG_INFINIBAND_I40IW)		+= i40iw/
+obj-$(CONFIG_INFINIBAND_IRDMA)		+= irdma/
 obj-$(CONFIG_MLX4_INFINIBAND)		+= mlx4/
 obj-$(CONFIG_MLX5_INFINIBAND)		+= mlx5/
 obj-$(CONFIG_INFINIBAND_OCRDMA)		+= ocrdma/
diff --git a/drivers/infiniband/hw/i40iw/Kconfig b/drivers/infiniband/hw/i40iw/Kconfig
deleted file mode 100644
index 7476f3b..0000000
diff --git a/drivers/infiniband/hw/i40iw/Makefile b/drivers/infiniband/hw/i40iw/Makefile
deleted file mode 100644
index 34da9eb..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw.h b/drivers/infiniband/hw/i40iw/i40iw.h
deleted file mode 100644
index be4094a..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c b/drivers/infiniband/hw/i40iw/i40iw_cm.c
deleted file mode 100644
index 2450b7d..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.h b/drivers/infiniband/hw/i40iw/i40iw_cm.h
deleted file mode 100644
index 6e43e4d..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
deleted file mode 100644
index eaea5d5..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_d.h b/drivers/infiniband/hw/i40iw/i40iw_d.h
deleted file mode 100644
index 86d5a33..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_hmc.c b/drivers/infiniband/hw/i40iw/i40iw_hmc.c
deleted file mode 100644
index b44bfc1..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_hmc.h b/drivers/infiniband/hw/i40iw/i40iw_hmc.h
deleted file mode 100644
index 4c3fdd8..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_hw.c b/drivers/infiniband/hw/i40iw/i40iw_hw.c
deleted file mode 100644
index d167ac1..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_main.c b/drivers/infiniband/hw/i40iw/i40iw_main.c
deleted file mode 100644
index 2ebcbb1..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_osdep.h b/drivers/infiniband/hw/i40iw/i40iw_osdep.h
deleted file mode 100644
index d938ccb..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_p.h b/drivers/infiniband/hw/i40iw/i40iw_p.h
deleted file mode 100644
index 4c42956..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_pble.c b/drivers/infiniband/hw/i40iw/i40iw_pble.c
deleted file mode 100644
index 53e5cd1..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_pble.h b/drivers/infiniband/hw/i40iw/i40iw_pble.h
deleted file mode 100644
index 7b1851d..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_puda.c b/drivers/infiniband/hw/i40iw/i40iw_puda.c
deleted file mode 100644
index 88fb68e..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_puda.h b/drivers/infiniband/hw/i40iw/i40iw_puda.h
deleted file mode 100644
index 53a7d58c..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_register.h b/drivers/infiniband/hw/i40iw/i40iw_register.h
deleted file mode 100644
index 5776818..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_status.h b/drivers/infiniband/hw/i40iw/i40iw_status.h
deleted file mode 100644
index 36a19c4..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_type.h b/drivers/infiniband/hw/i40iw/i40iw_type.h
deleted file mode 100644
index 394e182..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c b/drivers/infiniband/hw/i40iw/i40iw_uk.c
deleted file mode 100644
index f521be1..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_user.h b/drivers/infiniband/hw/i40iw/i40iw_user.h
deleted file mode 100644
index 93fc308..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c b/drivers/infiniband/hw/i40iw/i40iw_utils.c
deleted file mode 100644
index 9ff825f..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
deleted file mode 100644
index b876d72..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.h b/drivers/infiniband/hw/i40iw/i40iw_verbs.h
deleted file mode 100644
index bab71f3..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_vf.c b/drivers/infiniband/hw/i40iw/i40iw_vf.c
deleted file mode 100644
index e33d481..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_vf.h b/drivers/infiniband/hw/i40iw/i40iw_vf.h
deleted file mode 100644
index 4359559..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_virtchnl.c b/drivers/infiniband/hw/i40iw/i40iw_virtchnl.c
deleted file mode 100644
index e34a152..0000000
diff --git a/drivers/infiniband/hw/i40iw/i40iw_virtchnl.h b/drivers/infiniband/hw/i40iw/i40iw_virtchnl.h
deleted file mode 100644
index 24886ef..0000000
diff --git a/drivers/infiniband/hw/irdma/Kconfig b/drivers/infiniband/hw/irdma/Kconfig
new file mode 100644
index 0000000..3fe621b
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config INFINIBAND_IRDMA
+	tristate "Intel(R) Ethernet Protocol Driver for RDMA"
+	depends on INET
+	depends on IPV6 || !IPV6
+	depends on PCI
+	select GENERIC_ALLOCATOR
+	select CONFIG_AUXILIARY_BUS
+	help
+	  This is an Intel(R) Ethernet Protocol Driver for RDMA driver
+	  that support E810 (iWARP/RoCE) and X722 (iWARP) network devices.
diff --git a/drivers/infiniband/hw/irdma/Makefile b/drivers/infiniband/hw/irdma/Makefile
new file mode 100644
index 0000000..48c3854
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/Makefile
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+# Copyright (c) 2019, Intel Corporation.
+
+#
+# Makefile for the Intel(R) Ethernet Connection RDMA Linux Driver
+#
+
+obj-$(CONFIG_INFINIBAND_IRDMA) += irdma.o
+
+irdma-objs := cm.o        \
+              ctrl.o      \
+              hmc.o       \
+              hw.o        \
+              i40iw_hw.o  \
+              i40iw_if.o  \
+              icrdma_hw.o \
+              main.o      \
+              pble.o      \
+              puda.o      \
+              trace.o     \
+              uda.o       \
+              uk.o        \
+              utils.o     \
+              verbs.o     \
+              ws.o        \
+
+CFLAGS_trace.o = -I$(src)
diff --git a/include/uapi/rdma/i40iw-abi.h b/include/uapi/rdma/i40iw-abi.h
deleted file mode 100644
index 79890ba..0000000
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 23/23] RDMA/irdma: Update MAINTAINERS file
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (21 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 22/23] RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw Shiraz Saleem
@ 2021-04-06 21:01 ` Shiraz Saleem
  2021-04-06 21:05 ` [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Saleem, Shiraz
  2021-04-06 23:15 ` Jason Gunthorpe
  24 siblings, 0 replies; 52+ messages in thread
From: Shiraz Saleem @ 2021-04-06 21:01 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, david.m.ertman, anthony.l.nguyen, Shiraz Saleem

Add maintainer entry for irdma driver.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 276cadf..f1a9752 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8957,6 +8957,14 @@ F:	drivers/net/ethernet/intel/*/
 F:	include/linux/avf/virtchnl.h
 F:	include/linux/net/intel/iidc.h
 
+INTEL ETHERNET PROTOCOL DRIVER FOR RDMA
+M:	Mustafa Ismail <mustafa.ismail@intel.com>
+M:	Shiraz Saleem <shiraz.saleem@intel.com>
+L:	linux-rdma@vger.kernel.org
+S:	Supported
+F:	drivers/infiniband/hw/irdma/
+F:	include/uapi/rdma/irdma-abi.h
+
 INTEL FRAMEBUFFER DRIVER (excluding 810 and 815)
 M:	Maik Broemme <mbroemme@libmpq.org>
 L:	linux-fbdev@vger.kernel.org
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (22 preceding siblings ...)
  2021-04-06 21:01 ` [PATCH v4 23/23] RDMA/irdma: Update MAINTAINERS file Shiraz Saleem
@ 2021-04-06 21:05 ` Saleem, Shiraz
  2021-04-06 23:15 ` Jason Gunthorpe
  24 siblings, 0 replies; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-06 21:05 UTC (permalink / raw)
  To: dledford, jgg, kuba, davem
  Cc: linux-rdma, netdev, Ertman, David M, Nguyen, Anthony L

> Subject: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
> 
> The following patch series introduces a unified Intel Ethernet Protocol Driver for
> RDMA (irdma) for the X722 iWARP device and a new E810 device which supports
> iWARP and RoCEv2. The irdma module replaces the legacy i40iw module for X722
> and extends the ABI already defined for i40iw. It is backward compatible with
> legacy X722 rdma-core provider (libi40iw).
> 
> X722 and E810 are PCI network devices that are RDMA capable. The RDMA block
> of this parent device is represented via an auxiliary device exported to 'irdma'
> using the core auxiliary bus infrastructure recently added for 5.11 kernel.
> The parent PCI netdev drivers 'i40e' and 'ice' register auxiliary RDMA devices with
> private data/ops encapsulated that bind to auxiliary drivers registered in irdma
> module.
> 
> This patchset was initially submitted as an RFC where in we got feedback to come
> up with a generic scheme for RDMA drivers to attach to a PCI device owned by
> netdev PCI driver [1]. Solutions using platform bus and MFD were explored but
> rejected by the community and the consensus was to add a new bus infrastructure
> to support this usage model.
> 
> Further revisions of this series along with the auxiliary bus were submitted [2]. At
> this point, Greg KH requested that we take the auxiliary bus review and revision
> process to an internal mailing list and garner the buy-in of a respected kernel
> contributor, along with consensus of all major stakeholders including Nvidia (for
> mlx5 sub-function use-case) and Intel sound driver. This process took a while and
> stalled further development/review of this netdev/irdma series.
> The auxiliary bus was eventually merged in 5.11.
> 
> Between v1 to v2 of this submission, the IIDC went through a major re-write based
> on the feedback and we hope it is now more in alignment with what the community
> wants.
> 
> This series is built against rdma for-next and currently includes the netdev patches
> for ease of review. This includes updates to 'ice' driver to provide RDMA support
> and converts 'i40e' driver to use the auxiliary bus infrastructure.
> Once the patches are closer to merging, a shared pull request will be submitted.
> 
> v3-->v4:
> * Fixup W=1 warnings in ice patches
> * Fix issues uncovered by pyverbs for create user AH and multicast
> * Fix descriptor set issue for fast register introduced during port to FIELD_PREP
> in v2 submission
> 
> v2-->v3:
> * rebase rdma for-next. Adapt to core change '1fb7f8973f51 ("RDMA: Support
> more than 255 rdma ports")'
> * irdma Kconfig updates to conform with linux coding style.
> * Fix a 0-day build issue
> * Remove rdma_resource_limits selector devlink param. Follow on patch to be
> submitted for it with suggestion from Parav to use devlink resource.
> * Capitalize abbreviations in ice idc. e.g. 'aux' to 'AUX'
> 
> v1-->v2:
> * Remove IIDC channel OPs - open, close, peer_register and peer_unregister.
>   And all its associated FSM in ice PCI core driver.
> * Use device_lock in ice PCI core driver while issuing IIDC ops callbacks.
> * Remove peer_* verbiage from shared IIDC header and rename the structs and
> channel ops
>   with iidc_core*/iidc_auxiliary*.
> * Allocate ib_device at start and register it at the end of drv.probe() in irdma gen2
> auxiliary driver.
> * Use ibdev_* printing extensively throughout most of the driver
>   Remove idev_to_dev, ihw_to_dev macros as no longer required in new print
> scheme.
> * Do not bump ABI ver. to 6 in irdma. Maintain irdma ABI ver. at 5 for legacy i40iw
> user-provider compatibility.
> * Add a boundary check in irdma_alloc_ucontext to fail binding with < 4 user-space
> provider version.
> * Remove devlink from irdma. Add 2 new rdma-related devlink parameters added
> to ice PCI core driver.
> * Use FIELD_PREP/FIELD_GET/GENMASK on get/set of descriptor fields versus
> home grown ones LS_*/RS_*.
> * Bind 2 separate auxiliary drivers in irdma - one for gen1 and one for gen2 and
> future devices.
> * Misc. driver fixes in irdma
> 
> [1] https://patchwork.kernel.org/project/linux-rdma/patch/20190215171107.6464-2-
> shiraz.saleem@intel.com/
> [2] https://lore.kernel.org/linux-rdma/20200520070415.3392210-1-
> jeffrey.t.kirsher@intel.com/
> 

The irdma rdma-core provider will be send out shortly.

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
  2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
                   ` (23 preceding siblings ...)
  2021-04-06 21:05 ` [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Saleem, Shiraz
@ 2021-04-06 23:15 ` Jason Gunthorpe
  2021-04-06 23:30   ` Saleem, Shiraz
  24 siblings, 1 reply; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-06 23:15 UTC (permalink / raw)
  To: Shiraz Saleem
  Cc: dledford, kuba, davem, linux-rdma, netdev, david.m.ertman,
	anthony.l.nguyen

On Tue, Apr 06, 2021 at 04:01:02PM -0500, Shiraz Saleem wrote:
> Dave Ertman (4):
>   iidc: Introduce iidc.h
>   ice: Initialize RDMA support
>   ice: Implement iidc operations
>   ice: Register auxiliary device to provide RDMA
> 
> Michael J. Ruhl (1):
>   RDMA/irdma: Add dynamic tracing for CM
> 
> Mustafa Ismail (13):
>   RDMA/irdma: Register auxiliary driver and implement private channel
>     OPs
>   RDMA/irdma: Implement device initialization definitions
>   RDMA/irdma: Implement HW Admin Queue OPs
>   RDMA/irdma: Add HMC backing store setup functions
>   RDMA/irdma: Add privileged UDA queue implementation
>   RDMA/irdma: Add QoS definitions
>   RDMA/irdma: Add connection manager
>   RDMA/irdma: Add PBLE resource manager
>   RDMA/irdma: Implement device supported verb APIs
>   RDMA/irdma: Add RoCEv2 UD OP support
>   RDMA/irdma: Add user/kernel shared libraries
>   RDMA/irdma: Add miscellaneous utility definitions
>   RDMA/irdma: Add ABI definitions
> 
> Shiraz Saleem (5):
>   ice: Add devlink params support
>   i40e: Prep i40e header for aux bus conversion
>   i40e: Register auxiliary devices to provide RDMA
>   RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
>   RDMA/irdma: Update MAINTAINERS file

This doesn't apply, and I don't really know why:

Applying: iidc: Introduce iidc.h
Applying: ice: Initialize RDMA support
Applying: ice: Implement iidc operations
Applying: ice: Register auxiliary device to provide RDMA
Applying: ice: Add devlink params support
Applying: i40e: Prep i40e header for aux bus conversion
Applying: i40e: Register auxiliary devices to provide RDMA
Applying: RDMA/irdma: Register auxiliary driver and implement private channel OPs
Applying: RDMA/irdma: Implement device initialization definitions
Applying: RDMA/irdma: Implement HW Admin Queue OPs
Applying: RDMA/irdma: Add HMC backing store setup functions
Applying: RDMA/irdma: Add privileged UDA queue implementation
Applying: RDMA/irdma: Add QoS definitions
Applying: RDMA/irdma: Add connection manager
Applying: RDMA/irdma: Add PBLE resource manager
Applying: RDMA/irdma: Implement device supported verb APIs
Applying: RDMA/irdma: Add RoCEv2 UD OP support
Applying: RDMA/irdma: Add user/kernel shared libraries
Applying: RDMA/irdma: Add miscellaneous utility definitions
Applying: RDMA/irdma: Add dynamic tracing for CM
Applying: RDMA/irdma: Add ABI definitions
Applying: RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
Using index info to reconstruct a base tree...
error: removal patch leaves file contents
error: drivers/infiniband/hw/i40iw/Kconfig: patch does not apply

Can you investigate and fix it? Perhaps using a 9 year old version of
git is the problem?

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
  2021-04-06 23:15 ` Jason Gunthorpe
@ 2021-04-06 23:30   ` Saleem, Shiraz
  2021-04-07  0:18     ` Saleem, Shiraz
  2021-04-07 11:31     ` Jason Gunthorpe
  0 siblings, 2 replies; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-06 23:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

> Subject: Re: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA
> (irdma)
> 
> On Tue, Apr 06, 2021 at 04:01:02PM -0500, Shiraz Saleem wrote:
> > Dave Ertman (4):
> >   iidc: Introduce iidc.h
> >   ice: Initialize RDMA support
> >   ice: Implement iidc operations
> >   ice: Register auxiliary device to provide RDMA
> >
> > Michael J. Ruhl (1):
> >   RDMA/irdma: Add dynamic tracing for CM
> >
> > Mustafa Ismail (13):
> >   RDMA/irdma: Register auxiliary driver and implement private channel
> >     OPs
> >   RDMA/irdma: Implement device initialization definitions
> >   RDMA/irdma: Implement HW Admin Queue OPs
> >   RDMA/irdma: Add HMC backing store setup functions
> >   RDMA/irdma: Add privileged UDA queue implementation
> >   RDMA/irdma: Add QoS definitions
> >   RDMA/irdma: Add connection manager
> >   RDMA/irdma: Add PBLE resource manager
> >   RDMA/irdma: Implement device supported verb APIs
> >   RDMA/irdma: Add RoCEv2 UD OP support
> >   RDMA/irdma: Add user/kernel shared libraries
> >   RDMA/irdma: Add miscellaneous utility definitions
> >   RDMA/irdma: Add ABI definitions
> >
> > Shiraz Saleem (5):
> >   ice: Add devlink params support
> >   i40e: Prep i40e header for aux bus conversion
> >   i40e: Register auxiliary devices to provide RDMA
> >   RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
> >   RDMA/irdma: Update MAINTAINERS file
> 
> This doesn't apply, and I don't really know why:
> 
> Applying: iidc: Introduce iidc.h
> Applying: ice: Initialize RDMA support
> Applying: ice: Implement iidc operations
> Applying: ice: Register auxiliary device to provide RDMA
> Applying: ice: Add devlink params support
> Applying: i40e: Prep i40e header for aux bus conversion
> Applying: i40e: Register auxiliary devices to provide RDMA
> Applying: RDMA/irdma: Register auxiliary driver and implement private channel
> OPs
> Applying: RDMA/irdma: Implement device initialization definitions
> Applying: RDMA/irdma: Implement HW Admin Queue OPs
> Applying: RDMA/irdma: Add HMC backing store setup functions
> Applying: RDMA/irdma: Add privileged UDA queue implementation
> Applying: RDMA/irdma: Add QoS definitions
> Applying: RDMA/irdma: Add connection manager
> Applying: RDMA/irdma: Add PBLE resource manager
> Applying: RDMA/irdma: Implement device supported verb APIs
> Applying: RDMA/irdma: Add RoCEv2 UD OP support
> Applying: RDMA/irdma: Add user/kernel shared libraries
> Applying: RDMA/irdma: Add miscellaneous utility definitions
> Applying: RDMA/irdma: Add dynamic tracing for CM
> Applying: RDMA/irdma: Add ABI definitions
> Applying: RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw Using index
> info to reconstruct a base tree...
> error: removal patch leaves file contents
> error: drivers/infiniband/hw/i40iw/Kconfig: patch does not apply
> 
> Can you investigate and fix it? Perhaps using a 9 year old version of git is the
> problem?
> 

Hi Jason - I think its because I used --irreversible-delete flag in git format-patch for review that this one doesn't apply.

I can resend without it if your trying to apply.

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
  2021-04-06 23:30   ` Saleem, Shiraz
@ 2021-04-07  0:18     ` Saleem, Shiraz
  2021-04-07 11:31     ` Jason Gunthorpe
  1 sibling, 0 replies; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-07  0:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

> Subject: RE: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA
> (irdma)
> 
> > Subject: Re: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for
> > RDMA
> > (irdma)
> >
> > On Tue, Apr 06, 2021 at 04:01:02PM -0500, Shiraz Saleem wrote:
> > > Dave Ertman (4):
> > >   iidc: Introduce iidc.h
> > >   ice: Initialize RDMA support
> > >   ice: Implement iidc operations
> > >   ice: Register auxiliary device to provide RDMA
> > >
> > > Michael J. Ruhl (1):
> > >   RDMA/irdma: Add dynamic tracing for CM
> > >
> > > Mustafa Ismail (13):
> > >   RDMA/irdma: Register auxiliary driver and implement private channel
> > >     OPs
> > >   RDMA/irdma: Implement device initialization definitions
> > >   RDMA/irdma: Implement HW Admin Queue OPs
> > >   RDMA/irdma: Add HMC backing store setup functions
> > >   RDMA/irdma: Add privileged UDA queue implementation
> > >   RDMA/irdma: Add QoS definitions
> > >   RDMA/irdma: Add connection manager
> > >   RDMA/irdma: Add PBLE resource manager
> > >   RDMA/irdma: Implement device supported verb APIs
> > >   RDMA/irdma: Add RoCEv2 UD OP support
> > >   RDMA/irdma: Add user/kernel shared libraries
> > >   RDMA/irdma: Add miscellaneous utility definitions
> > >   RDMA/irdma: Add ABI definitions
> > >
> > > Shiraz Saleem (5):
> > >   ice: Add devlink params support
> > >   i40e: Prep i40e header for aux bus conversion
> > >   i40e: Register auxiliary devices to provide RDMA
> > >   RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
> > >   RDMA/irdma: Update MAINTAINERS file
> >
> > This doesn't apply, and I don't really know why:
> >
> > Applying: iidc: Introduce iidc.h
> > Applying: ice: Initialize RDMA support
> > Applying: ice: Implement iidc operations
> > Applying: ice: Register auxiliary device to provide RDMA
> > Applying: ice: Add devlink params support
> > Applying: i40e: Prep i40e header for aux bus conversion
> > Applying: i40e: Register auxiliary devices to provide RDMA
> > Applying: RDMA/irdma: Register auxiliary driver and implement private
> > channel OPs
> > Applying: RDMA/irdma: Implement device initialization definitions
> > Applying: RDMA/irdma: Implement HW Admin Queue OPs
> > Applying: RDMA/irdma: Add HMC backing store setup functions
> > Applying: RDMA/irdma: Add privileged UDA queue implementation
> > Applying: RDMA/irdma: Add QoS definitions
> > Applying: RDMA/irdma: Add connection manager
> > Applying: RDMA/irdma: Add PBLE resource manager
> > Applying: RDMA/irdma: Implement device supported verb APIs
> > Applying: RDMA/irdma: Add RoCEv2 UD OP support
> > Applying: RDMA/irdma: Add user/kernel shared libraries
> > Applying: RDMA/irdma: Add miscellaneous utility definitions
> > Applying: RDMA/irdma: Add dynamic tracing for CM
> > Applying: RDMA/irdma: Add ABI definitions
> > Applying: RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
> > Using index info to reconstruct a base tree...
> > error: removal patch leaves file contents
> > error: drivers/infiniband/hw/i40iw/Kconfig: patch does not apply
> >
> > Can you investigate and fix it? Perhaps using a 9 year old version of
> > git is the problem?
> >
> 
> Hi Jason - I think its because I used --irreversible-delete flag in git format-patch for
> review that this one doesn't apply.
> 
> I can resend without it if your trying to apply.
> 

I resend it without using --irreversible-delete. Hopefully, it applies cleanly now.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
  2021-04-06 23:30   ` Saleem, Shiraz
  2021-04-07  0:18     ` Saleem, Shiraz
@ 2021-04-07 11:31     ` Jason Gunthorpe
  2021-04-07 15:06       ` Saleem, Shiraz
  1 sibling, 1 reply; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 11:31 UTC (permalink / raw)
  To: Saleem, Shiraz
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

On Tue, Apr 06, 2021 at 11:30:23PM +0000, Saleem, Shiraz wrote:

> Hi Jason - I think its because I used --irreversible-delete flag in git format-patch for review that this one doesn't apply.

I doubt it

> I can resend without it if your trying to apply.

Now it is too big to go to the mailing lists

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-06 21:01 ` [PATCH v4 05/23] ice: Add devlink params support Shiraz Saleem
@ 2021-04-07 14:57   ` Jason Gunthorpe
  2021-04-07 20:58     ` Saleem, Shiraz
  0 siblings, 1 reply; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 14:57 UTC (permalink / raw)
  To: Shiraz Saleem
  Cc: dledford, kuba, davem, linux-rdma, netdev, david.m.ertman,
	anthony.l.nguyen

On Tue, Apr 06, 2021 at 04:01:07PM -0500, Shiraz Saleem wrote:
> Add a new generic runtime devlink parameter 'rdma_protocol'
> and use it in ice PCI driver. Configuration changes
> result in unplugging the auxiliary RDMA device and re-plugging
> it with updated values for irdma auxiiary driver to consume at
> drv.probe()
> 
> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
>  .../networking/devlink/devlink-params.rst          |  6 ++
>  Documentation/networking/devlink/ice.rst           | 13 +++
>  drivers/net/ethernet/intel/ice/ice_devlink.c       | 92 +++++++++++++++++++++-
>  drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
>  drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
>  include/net/devlink.h                              |  4 +
>  net/core/devlink.c                                 |  5 ++
>  7 files changed, 125 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
> index 54c9f10..0b454c3 100644
> +++ b/Documentation/networking/devlink/devlink-params.rst
> @@ -114,3 +114,9 @@ own name.
>         will NACK any attempt of other host to reset the device. This parameter
>         is useful for setups where a device is shared by different hosts, such
>         as multi-host setup.
> +   * - ``rdma_protocol``
> +     - string
> +     - Selects the RDMA protocol selected for multi-protocol devices.
> +        - ``iwarp`` iWARP
> +	- ``roce`` RoCE
> +	- ``ib`` Infiniband

I'm still not sure this belongs in devlink. 

What about devices that support roce and iwarp concurrently?

There is nothing at the protocol level that precludes this - doesn't
this device allow it?

I know Parav is looking at the general problem of how to customize
what aux devices are created, that may be a better fit for this.

Can you remove the devlink parts to make progress?

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma)
  2021-04-07 11:31     ` Jason Gunthorpe
@ 2021-04-07 15:06       ` Saleem, Shiraz
  0 siblings, 0 replies; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-07 15:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

> Subject: Re: [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA
> (irdma)
> 
> On Tue, Apr 06, 2021 at 11:30:23PM +0000, Saleem, Shiraz wrote:
> 
> > Hi Jason - I think its because I used --irreversible-delete flag in git format-patch
> for review that this one doesn't apply.
> 
> I doubt it

The documentation hints at it.
https://git-scm.com/docs/git-format-patch

-D
--irreversible-delete
Omit the preimage for deletes, i.e. print only the header but not the diff between the preimage and /dev/null. The resulting patch is not meant to be applied with patch or git apply; this is solely for people who want to just concentrate on reviewing the text after the change. In addition, the output obviously lacks enough information to apply such a patch in reverse, even manually, hence the name of the option.

> 
> > I can resend without it if your trying to apply.
> 
> Now it is too big to go to the mailing lists
> 

It is showing now.
https://patchwork.kernel.org/project/linux-rdma/patch/20210407001502.1890-23-shiraz.saleem@intel.com/

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-06 21:01 ` [PATCH v4 01/23] iidc: Introduce iidc.h Shiraz Saleem
@ 2021-04-07 15:44   ` Jason Gunthorpe
  2021-04-07 20:58     ` Saleem, Shiraz
  2021-04-07 17:35   ` Jason Gunthorpe
  1 sibling, 1 reply; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 15:44 UTC (permalink / raw)
  To: Shiraz Saleem
  Cc: dledford, kuba, davem, linux-rdma, netdev, david.m.ertman,
	anthony.l.nguyen

On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:

> +/* Following APIs are implemented by core PCI driver */
> +struct iidc_core_ops {
> +	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
> +	 * completion queues, Tx/Rx queues, etc...
> +	 */
> +	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
> +			 struct iidc_res *res,
> +			 int partial_acceptable);
> +	int (*free_res)(struct iidc_core_dev_info *cdev_info,
> +			struct iidc_res *res);
> +
> +	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
> +			     enum iidc_reset_type reset_type);
> +
> +	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
> +				   u16 vport_id, bool enable);
> +	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg,
> +		       u16 len);
> +};

What is this? There is only one implementation:

static const struct iidc_core_ops ops = {
	.alloc_res			= ice_cdev_info_alloc_res,
	.free_res			= ice_cdev_info_free_res,
	.request_reset			= ice_cdev_info_request_reset,
	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
	.vc_send			= ice_cdev_info_vc_send,
};

So export and call the functions directly.

We just had this very same discussion with Broadcom.

I notice there is no module dependency between irdma and the ethernet
driver because the above ops are avoiding it.

This entire idea was already NAK'd once:

https://lore.kernel.org/linux-rdma/20180522203831.20624-1-jeffrey.t.kirsher@intel.com/

So please remove it.

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-06 21:01 ` [PATCH v4 01/23] iidc: Introduce iidc.h Shiraz Saleem
  2021-04-07 15:44   ` Jason Gunthorpe
@ 2021-04-07 17:35   ` Jason Gunthorpe
  2021-04-12 14:51     ` Saleem, Shiraz
  1 sibling, 1 reply; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 17:35 UTC (permalink / raw)
  To: Shiraz Saleem
  Cc: dledford, kuba, davem, linux-rdma, netdev, david.m.ertman,
	anthony.l.nguyen

On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:

> +struct iidc_res_base {
> +	/* Union for future provision e.g. other res_type */
> +	union {
> +		struct iidc_rdma_qset_params qsets;
> +	} res;

Use an anonymous union?

There is alot of confusiong provisioning for future types, do you have
concrete plans here? I'm a bit confused why this is so different from
how mlx5 ended up when it already has multiple types.

> +};
> +
> +struct iidc_res {
> +	/* Type of resource. */
> +	enum iidc_res_type res_type;
> +	/* Count requested */
> +	u16 cnt_req;
> +
> +	/* Number of resources allocated. Filled in by callee.
> +	 * Based on this value, caller to fill up "resources"
> +	 */
> +	u16 res_allocated;
> +
> +	/* Unique handle to resources allocated. Zero if call fails.
> +	 * Allocated by callee and for now used by caller for internal
> +	 * tracking purpose.
> +	 */
> +	u32 res_handle;
> +
> +	/* Peer driver has to allocate sufficient memory, to accommodate
> +	 * cnt_requested before calling this function.

Calling what function?

> +	 * Memory has to be zero initialized. It is input/output param.
> +	 * As a result of alloc_res API, this structures will be populated.
> +	 */
> +	struct iidc_res_base res[1];

So it is a wrongly defined flex array? Confused

The usages are all using this as some super-complicated function argument:

	struct iidc_res rdma_qset_res = {};

	rdma_qset_res.res_allocated = 1;
	rdma_qset_res.res_type = IIDC_RDMA_QSETS_TXSCHED;
	rdma_qset_res.res[0].res.qsets.vport_id = vsi->vsi_idx;
	rdma_qset_res.res[0].res.qsets.teid = tc_node->l2_sched_node_id;
	rdma_qset_res.res[0].res.qsets.qs_handle = tc_node->qs_handle;

	if (cdev_info->ops->free_res(cdev_info, &rdma_qset_res))

So the answer here is to make your function calls sane and well
architected. If you have to pass a union to call a function then
something is very wrong with the design.

You aren't trying to achieve ABI decoupling of the rdma/ethernet
modules with an obfuscated complicated function pointer indirection,
are you?

Please use sane function calls 

> +/* Following APIs are implemented by auxiliary drivers and invoked by core PCI
> + * driver
> + */
> +struct iidc_auxiliary_ops {
> +	/* This event_handler is meant to be a blocking call.  For instance,
> +	 * when a BEFORE_MTU_CHANGE event comes in, the event_handler will not
> +	 * return until the auxiliary driver is ready for the MTU change to
> +	 * happen.
> +	 */
> +	void (*event_handler)(struct iidc_core_dev_info *cdev_info,
> +			      struct iidc_event *event);
> +
> +	int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id,
> +			  u8 *msg, u16 len);
> +};

This is not the normal pattern:

> +struct iidc_auxiliary_drv {
> +	struct auxiliary_driver adrv;
> +	struct iidc_auxiliary_ops *ops;
> +};

Just put the two functions above in the drv directly:

struct iidc_auxiliary_drv {
        struct auxilary_driver adrv;
	void (*event_handler)(struct iidc_core_dev_info *cdev_info, *cdev_info,
			      struct iidc_event *event);

	int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id,
			  u8 *msg, u16 len);
}

> +
> +#define IIDC_RDMA_NAME	"intel_rdma"
> +#define IIDC_RDMA_ID	0x00000010
> +#define IIDC_MAX_NUM_AUX	4
> +
> +/* The const struct that instantiates cdev_info_id needs to be initialized
> + * in the .c with the macro ASSIGN_IIDC_INFO.
> + * For example:
> + * static const struct cdev_info_id cdev_info_ids[] = ASSIGN_IIDC_INFO;
> + */
> +struct cdev_info_id {
> +	char *name;
> +	int id;
> +};
> +
> +#define IIDC_RDMA_INFO   { .name = IIDC_RDMA_NAME,  .id = IIDC_RDMA_ID },
> +
> +#define ASSIGN_IIDC_INFO	\
> +{				\
> +	IIDC_RDMA_INFO		\
> +}

I tried to figure out what all this was for and came up short. There
is only one user and all this seems unnecessary in this series, add it
later when you need it.

> +
> +#define iidc_priv(x) ((x)->auxiliary_priv)

Use a static inline function

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-07 14:57   ` Jason Gunthorpe
@ 2021-04-07 20:58     ` Saleem, Shiraz
  2021-04-07 22:46       ` Jason Gunthorpe
  0 siblings, 1 reply; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-07 20:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

> Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> 
> On Tue, Apr 06, 2021 at 04:01:07PM -0500, Shiraz Saleem wrote:
> > Add a new generic runtime devlink parameter 'rdma_protocol'
> > and use it in ice PCI driver. Configuration changes result in
> > unplugging the auxiliary RDMA device and re-plugging it with updated
> > values for irdma auxiiary driver to consume at
> > drv.probe()
> >
> > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> >  .../networking/devlink/devlink-params.rst          |  6 ++
> >  Documentation/networking/devlink/ice.rst           | 13 +++
> >  drivers/net/ethernet/intel/ice/ice_devlink.c       | 92 +++++++++++++++++++++-
> >  drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
> >  drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
> >  include/net/devlink.h                              |  4 +
> >  net/core/devlink.c                                 |  5 ++
> >  7 files changed, 125 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/networking/devlink/devlink-params.rst
> > b/Documentation/networking/devlink/devlink-params.rst
> > index 54c9f10..0b454c3 100644
> > +++ b/Documentation/networking/devlink/devlink-params.rst
> > @@ -114,3 +114,9 @@ own name.
> >         will NACK any attempt of other host to reset the device. This parameter
> >         is useful for setups where a device is shared by different hosts, such
> >         as multi-host setup.
> > +   * - ``rdma_protocol``
> > +     - string
> > +     - Selects the RDMA protocol selected for multi-protocol devices.
> > +        - ``iwarp`` iWARP
> > +	- ``roce`` RoCE
> > +	- ``ib`` Infiniband
> 
> I'm still not sure this belongs in devlink.

I believe you suggested we use devlink for protocol switch.

> 
> What about devices that support roce and iwarp concurrently?
> 
> There is nothing at the protocol level that precludes this - doesn't this device allow
> it?

Nope. This device doesn’t support both protocols concurrently on same PCI function.

Maybe then it makes sense to move this protocol switch as driver specific devlink?

> 
> I know Parav is looking at the general problem of how to customize what aux
> devices are created, that may be a better fit for this.
> 
> Can you remove the devlink parts to make progress?
> 

It is important since otherwise the customer will have no way to use RoCEv2 on this device.

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-07 15:44   ` Jason Gunthorpe
@ 2021-04-07 20:58     ` Saleem, Shiraz
  2021-04-07 22:43       ` Jason Gunthorpe
  0 siblings, 1 reply; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-07 20:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> 
> On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:
> 
> > +/* Following APIs are implemented by core PCI driver */ struct
> > +iidc_core_ops {
> > +	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
> > +	 * completion queues, Tx/Rx queues, etc...
> > +	 */
> > +	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
> > +			 struct iidc_res *res,
> > +			 int partial_acceptable);
> > +	int (*free_res)(struct iidc_core_dev_info *cdev_info,
> > +			struct iidc_res *res);
> > +
> > +	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
> > +			     enum iidc_reset_type reset_type);
> > +
> > +	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
> > +				   u16 vport_id, bool enable);
> > +	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg,
> > +		       u16 len);
> > +};
> 
> What is this? There is only one implementation:
> 
> static const struct iidc_core_ops ops = {
> 	.alloc_res			= ice_cdev_info_alloc_res,
> 	.free_res			= ice_cdev_info_free_res,
> 	.request_reset			= ice_cdev_info_request_reset,
> 	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
> 	.vc_send			= ice_cdev_info_vc_send,
> };
> 
> So export and call the functions directly.

No. Then we end up requiring ice to be loaded even when just want to use irdma with x722 [whose ethernet driver is "i40e"].
This will not allow us to be a unified RDMA driver.

> 
> We just had this very same discussion with Broadcom.

Yes, but they only have a single ethernet driver today I presume.

Here we have a single RDMA driver and 2 ethernet drivers, i40e and ice.

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-07 20:58     ` Saleem, Shiraz
@ 2021-04-07 22:43       ` Jason Gunthorpe
  2021-04-08  7:14         ` Leon Romanovsky
  2021-04-12 14:50         ` Saleem, Shiraz
  0 siblings, 2 replies; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 22:43 UTC (permalink / raw)
  To: Saleem, Shiraz
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote:
> > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> > 
> > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:
> > 
> > > +/* Following APIs are implemented by core PCI driver */ struct
> > > +iidc_core_ops {
> > > +	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
> > > +	 * completion queues, Tx/Rx queues, etc...
> > > +	 */
> > > +	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
> > > +			 struct iidc_res *res,
> > > +			 int partial_acceptable);
> > > +	int (*free_res)(struct iidc_core_dev_info *cdev_info,
> > > +			struct iidc_res *res);
> > > +
> > > +	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
> > > +			     enum iidc_reset_type reset_type);
> > > +
> > > +	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
> > > +				   u16 vport_id, bool enable);
> > > +	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg,
> > > +		       u16 len);
> > > +};
> > 
> > What is this? There is only one implementation:
> > 
> > static const struct iidc_core_ops ops = {
> > 	.alloc_res			= ice_cdev_info_alloc_res,
> > 	.free_res			= ice_cdev_info_free_res,
> > 	.request_reset			= ice_cdev_info_request_reset,
> > 	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
> > 	.vc_send			= ice_cdev_info_vc_send,
> > };
> > 
> > So export and call the functions directly.
> 
> No. Then we end up requiring ice to be loaded even when just want to
> use irdma with x722 [whose ethernet driver is "i40e"].

So what? What does it matter to load a few extra kb of modules?

If you had both drivers use the same ops interface you'd have an
argument.

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-07 20:58     ` Saleem, Shiraz
@ 2021-04-07 22:46       ` Jason Gunthorpe
  2021-04-12 14:50         ` Saleem, Shiraz
  0 siblings, 1 reply; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 22:46 UTC (permalink / raw)
  To: Saleem, Shiraz
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

On Wed, Apr 07, 2021 at 08:58:25PM +0000, Saleem, Shiraz wrote:
> > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> > 
> > On Tue, Apr 06, 2021 at 04:01:07PM -0500, Shiraz Saleem wrote:
> > > Add a new generic runtime devlink parameter 'rdma_protocol'
> > > and use it in ice PCI driver. Configuration changes result in
> > > unplugging the auxiliary RDMA device and re-plugging it with updated
> > > values for irdma auxiiary driver to consume at
> > > drv.probe()
> > >
> > > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> > >  .../networking/devlink/devlink-params.rst          |  6 ++
> > >  Documentation/networking/devlink/ice.rst           | 13 +++
> > >  drivers/net/ethernet/intel/ice/ice_devlink.c       | 92 +++++++++++++++++++++-
> > >  drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
> > >  drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
> > >  include/net/devlink.h                              |  4 +
> > >  net/core/devlink.c                                 |  5 ++
> > >  7 files changed, 125 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/Documentation/networking/devlink/devlink-params.rst
> > > b/Documentation/networking/devlink/devlink-params.rst
> > > index 54c9f10..0b454c3 100644
> > > +++ b/Documentation/networking/devlink/devlink-params.rst
> > > @@ -114,3 +114,9 @@ own name.
> > >         will NACK any attempt of other host to reset the device. This parameter
> > >         is useful for setups where a device is shared by different hosts, such
> > >         as multi-host setup.
> > > +   * - ``rdma_protocol``
> > > +     - string
> > > +     - Selects the RDMA protocol selected for multi-protocol devices.
> > > +        - ``iwarp`` iWARP
> > > +	- ``roce`` RoCE
> > > +	- ``ib`` Infiniband
> > 
> > I'm still not sure this belongs in devlink.
> 
> I believe you suggested we use devlink for protocol switch.

Yes, devlink is the right place, but selecting a *single* protocol
doesn't seem right, or general enough.

Parav is talking about generic ways to customize the aux devices
created and that would seem to serve the same function as this.

> > I know Parav is looking at the general problem of how to customize what aux
> > devices are created, that may be a better fit for this.
> > 
> > Can you remove the devlink parts to make progress?
> 
> It is important since otherwise the customer will have no way to use RoCEv2 on this device.

I'm not saying to not having it eventually, I'm just getting tired of
looking at 23 patches. You can argue it out after

I'm also half thinking of applying this under driver/staging or
CONFIG_BROKEN or something just because I am getting sick of looking
at it.

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-07 22:43       ` Jason Gunthorpe
@ 2021-04-08  7:14         ` Leon Romanovsky
  2021-04-09  1:38           ` Saleem, Shiraz
  2021-04-12 14:50         ` Saleem, Shiraz
  1 sibling, 1 reply; 52+ messages in thread
From: Leon Romanovsky @ 2021-04-08  7:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Saleem, Shiraz, dledford, kuba, davem, linux-rdma, netdev,
	Ertman, David M, Nguyen, Anthony L

On Wed, Apr 07, 2021 at 07:43:24PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote:
> > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> > > 
> > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:
> > > 
> > > > +/* Following APIs are implemented by core PCI driver */ struct
> > > > +iidc_core_ops {
> > > > +	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
> > > > +	 * completion queues, Tx/Rx queues, etc...
> > > > +	 */
> > > > +	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
> > > > +			 struct iidc_res *res,
> > > > +			 int partial_acceptable);
> > > > +	int (*free_res)(struct iidc_core_dev_info *cdev_info,
> > > > +			struct iidc_res *res);
> > > > +
> > > > +	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
> > > > +			     enum iidc_reset_type reset_type);
> > > > +
> > > > +	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
> > > > +				   u16 vport_id, bool enable);
> > > > +	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg,
> > > > +		       u16 len);
> > > > +};
> > > 
> > > What is this? There is only one implementation:
> > > 
> > > static const struct iidc_core_ops ops = {
> > > 	.alloc_res			= ice_cdev_info_alloc_res,
> > > 	.free_res			= ice_cdev_info_free_res,
> > > 	.request_reset			= ice_cdev_info_request_reset,
> > > 	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
> > > 	.vc_send			= ice_cdev_info_vc_send,
> > > };
> > > 
> > > So export and call the functions directly.
> > 
> > No. Then we end up requiring ice to be loaded even when just want to
> > use irdma with x722 [whose ethernet driver is "i40e"].
> 
> So what? What does it matter to load a few extra kb of modules?

And if user cares about it, he will blacklist that module anyway.

Thanks

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-08  7:14         ` Leon Romanovsky
@ 2021-04-09  1:38           ` Saleem, Shiraz
  2021-04-11 11:48             ` Leon Romanovsky
  0 siblings, 1 reply; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-09  1:38 UTC (permalink / raw)
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L

> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> 
> On Wed, Apr 07, 2021 at 07:43:24PM -0300, Jason Gunthorpe wrote:
> > On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote:
> > > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> > > >
> > > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:
> > > >
> > > > > +/* Following APIs are implemented by core PCI driver */ struct
> > > > > +iidc_core_ops {
> > > > > +	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
> > > > > +	 * completion queues, Tx/Rx queues, etc...
> > > > > +	 */
> > > > > +	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
> > > > > +			 struct iidc_res *res,
> > > > > +			 int partial_acceptable);
> > > > > +	int (*free_res)(struct iidc_core_dev_info *cdev_info,
> > > > > +			struct iidc_res *res);
> > > > > +
> > > > > +	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
> > > > > +			     enum iidc_reset_type reset_type);
> > > > > +
> > > > > +	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
> > > > > +				   u16 vport_id, bool enable);
> > > > > +	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8
> *msg,
> > > > > +		       u16 len);
> > > > > +};
> > > >
> > > > What is this? There is only one implementation:
> > > >
> > > > static const struct iidc_core_ops ops = {
> > > > 	.alloc_res			= ice_cdev_info_alloc_res,
> > > > 	.free_res			= ice_cdev_info_free_res,
> > > > 	.request_reset			= ice_cdev_info_request_reset,
> > > > 	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
> > > > 	.vc_send			= ice_cdev_info_vc_send,
> > > > };
> > > >
> > > > So export and call the functions directly.
> > >
> > > No. Then we end up requiring ice to be loaded even when just want to
> > > use irdma with x722 [whose ethernet driver is "i40e"].
> >
> > So what? What does it matter to load a few extra kb of modules?
> 
> And if user cares about it, he will blacklist that module anyway.
> 
 blacklist ice when you just have an x722 card? How does that solve anything? You wont be able to load irdma then.

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-09  1:38           ` Saleem, Shiraz
@ 2021-04-11 11:48             ` Leon Romanovsky
  0 siblings, 0 replies; 52+ messages in thread
From: Leon Romanovsky @ 2021-04-11 11:48 UTC (permalink / raw)
  To: Saleem, Shiraz
  Cc: Jason Gunthorpe, dledford, kuba, davem, linux-rdma, netdev,
	Ertman, David M, Nguyen, Anthony L

On Fri, Apr 09, 2021 at 01:38:37AM +0000, Saleem, Shiraz wrote:
> > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> > 
> > On Wed, Apr 07, 2021 at 07:43:24PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote:
> > > > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> > > > >
> > > > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:
> > > > >
> > > > > > +/* Following APIs are implemented by core PCI driver */ struct
> > > > > > +iidc_core_ops {
> > > > > > +	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
> > > > > > +	 * completion queues, Tx/Rx queues, etc...
> > > > > > +	 */
> > > > > > +	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
> > > > > > +			 struct iidc_res *res,
> > > > > > +			 int partial_acceptable);
> > > > > > +	int (*free_res)(struct iidc_core_dev_info *cdev_info,
> > > > > > +			struct iidc_res *res);
> > > > > > +
> > > > > > +	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
> > > > > > +			     enum iidc_reset_type reset_type);
> > > > > > +
> > > > > > +	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
> > > > > > +				   u16 vport_id, bool enable);
> > > > > > +	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8
> > *msg,
> > > > > > +		       u16 len);
> > > > > > +};
> > > > >
> > > > > What is this? There is only one implementation:
> > > > >
> > > > > static const struct iidc_core_ops ops = {
> > > > > 	.alloc_res			= ice_cdev_info_alloc_res,
> > > > > 	.free_res			= ice_cdev_info_free_res,
> > > > > 	.request_reset			= ice_cdev_info_request_reset,
> > > > > 	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
> > > > > 	.vc_send			= ice_cdev_info_vc_send,
> > > > > };
> > > > >
> > > > > So export and call the functions directly.
> > > >
> > > > No. Then we end up requiring ice to be loaded even when just want to
> > > > use irdma with x722 [whose ethernet driver is "i40e"].
> > >
> > > So what? What does it matter to load a few extra kb of modules?
> > 
> > And if user cares about it, he will blacklist that module anyway.
> > 
>  blacklist ice when you just have an x722 card? How does that solve anything? You wont be able to load irdma then.

You will blacklist i40e if you want solely irdma functionality.

Thanks

> 
> Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-07 22:43       ` Jason Gunthorpe
  2021-04-08  7:14         ` Leon Romanovsky
@ 2021-04-12 14:50         ` Saleem, Shiraz
  2021-04-12 16:12           ` Jason Gunthorpe
  1 sibling, 1 reply; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-12 14:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L, Williams, Dan J, Hefty, Sean, Lacombe, John S

> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> 
> On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote:
> > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> > >
> > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:
> > >
> > > > +/* Following APIs are implemented by core PCI driver */ struct
> > > > +iidc_core_ops {
> > > > +	/* APIs to allocate resources such as VEB, VSI, Doorbell queues,
> > > > +	 * completion queues, Tx/Rx queues, etc...
> > > > +	 */
> > > > +	int (*alloc_res)(struct iidc_core_dev_info *cdev_info,
> > > > +			 struct iidc_res *res,
> > > > +			 int partial_acceptable);
> > > > +	int (*free_res)(struct iidc_core_dev_info *cdev_info,
> > > > +			struct iidc_res *res);
> > > > +
> > > > +	int (*request_reset)(struct iidc_core_dev_info *cdev_info,
> > > > +			     enum iidc_reset_type reset_type);
> > > > +
> > > > +	int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info,
> > > > +				   u16 vport_id, bool enable);
> > > > +	int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg,
> > > > +		       u16 len);
> > > > +};
> > >
> > > What is this? There is only one implementation:
> > >
> > > static const struct iidc_core_ops ops = {
> > > 	.alloc_res			= ice_cdev_info_alloc_res,
> > > 	.free_res			= ice_cdev_info_free_res,
> > > 	.request_reset			= ice_cdev_info_request_reset,
> > > 	.update_vport_filter		= ice_cdev_info_update_vsi_filter,
> > > 	.vc_send			= ice_cdev_info_vc_send,
> > > };
> > >
> > > So export and call the functions directly.
> >
> > No. Then we end up requiring ice to be loaded even when just want to
> > use irdma with x722 [whose ethernet driver is "i40e"].
> 
> So what? What does it matter to load a few extra kb of modules?

Because it is an unnecessary thing to force a user to build/load drivers for
which they don't have the HW for? The problem gets compounded if we have to
do it for all future HW Intel PCI drivers, i.e. depends on ICE && ....

IIDC is Intel's converged and generic RDMA <--> PCI driver channel interface; which
we intend to use moving forward. And these .ops callbacks are part of this interface which will
have different implementations by each HW generation PCI core driver. It is extensible
with new ops added to the table for new HW and where implementations of the certain ops on some
HW will be NULL.

There is a near-term Intel ethernet VF driver which will use IIDC to provide RDMA in the VF,
and implement some of these .ops callbacks. There is also intent to move i40e to IIDC. 

And yes, it allows to load a unified irdma driver without having all the mulit-gen PCI core drivers to be
built/loaded as a pre-requisite which is solving a pain-point to the user and does not unnecessarily
add a memory foot-print.

In the past, with i40e <-> i40iw, I acknowledge such a dependency was decoupled
for the wrong reasons [1] and understand where your frustration is coming from. But in
a unified irdma driver model connecting to multiple PCI gen drivers, I do think it serves
a reason. This has also been voiced over the years in some of our discussions [2] leading to
the auxiliary bus and been part of our submissions from the get go. In fact, use of such domain
specific .ops from the parent device is an assumption baked into the design when the auxiliary bus
was conceived and in the documentation [3] (See Example Usage).

[1] https://lore.kernel.org/linux-rdma/20180522205612.GD7502@mellanox.com/
[2] https://lore.kernel.org/linux-rdma/2B0E3F215D1AB84DA946C8BEE234CCC97B2E1D29@ORSMSX101.amr.corp.intel.com/
[3] https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-07 22:46       ` Jason Gunthorpe
@ 2021-04-12 14:50         ` Saleem, Shiraz
  2021-04-12 19:07           ` Parav Pandit
  0 siblings, 1 reply; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-12 14:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, Lacombe, John S, netdev,
	Ertman, David M, Nguyen, Anthony L, Williams, Dan J, Hefty, Sean,
	Keller, Jacob E

> Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> 
> On Wed, Apr 07, 2021 at 08:58:25PM +0000, Saleem, Shiraz wrote:
> > > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> > >
> > > On Tue, Apr 06, 2021 at 04:01:07PM -0500, Shiraz Saleem wrote:
> > > > Add a new generic runtime devlink parameter 'rdma_protocol'
> > > > and use it in ice PCI driver. Configuration changes result in
> > > > unplugging the auxiliary RDMA device and re-plugging it with
> > > > updated values for irdma auxiiary driver to consume at
> > > > drv.probe()
> > > >
> > > > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> > > >  .../networking/devlink/devlink-params.rst          |  6 ++
> > > >  Documentation/networking/devlink/ice.rst           | 13 +++
> > > >  drivers/net/ethernet/intel/ice/ice_devlink.c       | 92
> +++++++++++++++++++++-
> > > >  drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
> > > >  drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
> > > >  include/net/devlink.h                              |  4 +
> > > >  net/core/devlink.c                                 |  5 ++
> > > >  7 files changed, 125 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/Documentation/networking/devlink/devlink-params.rst
> > > > b/Documentation/networking/devlink/devlink-params.rst
> > > > index 54c9f10..0b454c3 100644
> > > > +++ b/Documentation/networking/devlink/devlink-params.rst
> > > > @@ -114,3 +114,9 @@ own name.
> > > >         will NACK any attempt of other host to reset the device. This parameter
> > > >         is useful for setups where a device is shared by different hosts, such
> > > >         as multi-host setup.
> > > > +   * - ``rdma_protocol``
> > > > +     - string
> > > > +     - Selects the RDMA protocol selected for multi-protocol devices.
> > > > +        - ``iwarp`` iWARP
> > > > +	- ``roce`` RoCE
> > > > +	- ``ib`` Infiniband
> > >
> > > I'm still not sure this belongs in devlink.
> >
> > I believe you suggested we use devlink for protocol switch.
> 
> Yes, devlink is the right place, but selecting a *single* protocol doesn't seem right,
> or general enough.
> 
> Parav is talking about generic ways to customize the aux devices created and that
> would seem to serve the same function as this.

Is there an RFC or something posted for us to look at?

> 
> > > I know Parav is looking at the general problem of how to customize
> > > what aux devices are created, that may be a better fit for this.
> > >
> > > Can you remove the devlink parts to make progress?
> >
> > It is important since otherwise the customer will have no way to use RoCEv2 on
> this device.
> 
> I'm not saying to not having it eventually, I'm just getting tired of looking at 23
> patches. You can argue it out after
> 

Since this device cannot do concurrent protocols per function,
is having a driver specific devlink to switch between the 2 also a NACK?

There is something similar in another PCI driver.
https://www.kernel.org/doc/html/latest/networking/devlink/qed.html

Can we not adapt to Parav's solution (if it's a good fit) when it becomes available?

Shiraz





^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-07 17:35   ` Jason Gunthorpe
@ 2021-04-12 14:51     ` Saleem, Shiraz
  0 siblings, 0 replies; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-12 14:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L, Williams, Dan J, Hefty, Sean

> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> 
> On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote:
> 
> > +struct iidc_res_base {
> > +	/* Union for future provision e.g. other res_type */
> > +	union {
> > +		struct iidc_rdma_qset_params qsets;
> > +	} res;
> 
> Use an anonymous union?
> 
> There is alot of confusiong provisioning for future types, do you have concrete
> plans here? I'm a bit confused why this is so different from how mlx5 ended up
> when it already has multiple types.

It was initially designed to be extensible for more resource types. But at this point,
there is no concrete plan and hence it doesn't need to be a union. 

> 
> > +};
> > +
> > +struct iidc_res {
> > +	/* Type of resource. */
> > +	enum iidc_res_type res_type;
> > +	/* Count requested */
> > +	u16 cnt_req;
> > +
> > +	/* Number of resources allocated. Filled in by callee.
> > +	 * Based on this value, caller to fill up "resources"
> > +	 */
> > +	u16 res_allocated;
> > +
> > +	/* Unique handle to resources allocated. Zero if call fails.
> > +	 * Allocated by callee and for now used by caller for internal
> > +	 * tracking purpose.
> > +	 */
> > +	u32 res_handle;
> > +
> > +	/* Peer driver has to allocate sufficient memory, to accommodate
> > +	 * cnt_requested before calling this function.
> 
> Calling what function?

Left over cruft from the re-write of IIDC in v2.
> 
> > +	 * Memory has to be zero initialized. It is input/output param.
> > +	 * As a result of alloc_res API, this structures will be populated.
> > +	 */
> > +	struct iidc_res_base res[1];
> 
> So it is a wrongly defined flex array? Confused

Needs fixing.

> 
> The usages are all using this as some super-complicated function argument:
> 
> 	struct iidc_res rdma_qset_res = {};
> 
> 	rdma_qset_res.res_allocated = 1;
> 	rdma_qset_res.res_type = IIDC_RDMA_QSETS_TXSCHED;
> 	rdma_qset_res.res[0].res.qsets.vport_id = vsi->vsi_idx;
> 	rdma_qset_res.res[0].res.qsets.teid = tc_node->l2_sched_node_id;
> 	rdma_qset_res.res[0].res.qsets.qs_handle = tc_node->qs_handle;
> 
> 	if (cdev_info->ops->free_res(cdev_info, &rdma_qset_res))
> 
> So the answer here is to make your function calls sane and well architected. If you
> have to pass a union to call a function then something is very wrong with the
> design.
> 

Based on previous comment, the union will be removed.

> You aren't trying to achieve ABI decoupling of the rdma/ethernet modules with an
> obfuscated complicated function pointer indirection, are you?

As discussed in other thread, this is part of the IIDC interface exporting the core device .ops callbacks.
> 
> > +/* Following APIs are implemented by auxiliary drivers and invoked by
> > +core PCI
> > + * driver
> > + */
> > +struct iidc_auxiliary_ops {
> > +	/* This event_handler is meant to be a blocking call.  For instance,
> > +	 * when a BEFORE_MTU_CHANGE event comes in, the event_handler will
> not
> > +	 * return until the auxiliary driver is ready for the MTU change to
> > +	 * happen.
> > +	 */
> > +	void (*event_handler)(struct iidc_core_dev_info *cdev_info,
> > +			      struct iidc_event *event);
> > +
> > +	int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id,
> > +			  u8 *msg, u16 len);
> > +};
> 
> This is not the normal pattern:
> 
> > +struct iidc_auxiliary_drv {
> > +	struct auxiliary_driver adrv;
> > +	struct iidc_auxiliary_ops *ops;
> > +};
> 
> Just put the two functions above in the drv directly:

Ok.


> 
> struct iidc_auxiliary_drv {
>         struct auxilary_driver adrv;
> 	void (*event_handler)(struct iidc_core_dev_info *cdev_info, *cdev_info,
> 			      struct iidc_event *event);
> 
> 	int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id,
> 			  u8 *msg, u16 len);
> }
> 
> > +
> > +#define IIDC_RDMA_NAME	"intel_rdma"
> > +#define IIDC_RDMA_ID	0x00000010
> > +#define IIDC_MAX_NUM_AUX	4
> > +
> > +/* The const struct that instantiates cdev_info_id needs to be
> > +initialized
> > + * in the .c with the macro ASSIGN_IIDC_INFO.
> > + * For example:
> > + * static const struct cdev_info_id cdev_info_ids[] =
> > +ASSIGN_IIDC_INFO;  */ struct cdev_info_id {
> > +	char *name;
> > +	int id;
> > +};
> > +
> > +#define IIDC_RDMA_INFO   { .name = IIDC_RDMA_NAME,  .id =
> IIDC_RDMA_ID },
> > +
> > +#define ASSIGN_IIDC_INFO	\
> > +{				\
> > +	IIDC_RDMA_INFO		\
> > +}
> 
> I tried to figure out what all this was for and came up short. There is only one user
> and all this seems unnecessary in this series, add it later when you need it.

No plan for new user, so this should go.

> 
> > +
> > +#define iidc_priv(x) ((x)->auxiliary_priv)
> 
> Use a static inline function
> 
Ok

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-12 14:50         ` Saleem, Shiraz
@ 2021-04-12 16:12           ` Jason Gunthorpe
  2021-04-15 17:36             ` Saleem, Shiraz
  0 siblings, 1 reply; 52+ messages in thread
From: Jason Gunthorpe @ 2021-04-12 16:12 UTC (permalink / raw)
  To: Saleem, Shiraz
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L, Williams, Dan J, Hefty, Sean, Lacombe, John S

On Mon, Apr 12, 2021 at 02:50:43PM +0000, Saleem, Shiraz wrote:

> Because it is an unnecessary thing to force a user to build/load drivers for
> which they don't have the HW for? 

Happens automatically in all distros, so I don't agree with this.

> The problem gets compounded if we have to do it for all future HW
> Intel PCI drivers, i.e. depends on ICE && ....

Then someday build a proper pluggable abstraction and put all your
ethernet drivers under it. 

Today you haven't done that and all we have a set of ops that only
work with one eth driver and a totally different set of functions that
only work with a different driver.

It is all just dead code until it gets finished and process is to not
merge dead code.

> There is a near-term Intel ethernet VF driver which will use IIDC to
> provide RDMA in the VF, and implement some of these .ops
> callbacks. There is also intent to move i40e to IIDC.

"near-term" We are now on year three of Intel talking about this
driver!

Get the bulk of the thing merged and deal with the rest in followup
patches.

> But in a unified irdma driver model connecting to multiple PCI gen
> drivers, I do think it serves a reason.

It is fine as a high level idea, but the implementation has to meet
kernel standards.

Jason

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-12 14:50         ` Saleem, Shiraz
@ 2021-04-12 19:07           ` Parav Pandit
  2021-04-13  4:03             ` Parav Pandit
  2021-04-13 14:40             ` Saleem, Shiraz
  0 siblings, 2 replies; 52+ messages in thread
From: Parav Pandit @ 2021-04-12 19:07 UTC (permalink / raw)
  To: Saleem, Shiraz, Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, Lacombe, John S, netdev,
	Ertman, David M, Nguyen, Anthony L, Williams, Dan J, Hefty, Sean,
	Keller, Jacob E



> From: Saleem, Shiraz <shiraz.saleem@intel.com>
> Sent: Monday, April 12, 2021 8:21 PM
> 
> > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> >
> > On Wed, Apr 07, 2021 at 08:58:25PM +0000, Saleem, Shiraz wrote:
> > > > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> > > >
> > > > On Tue, Apr 06, 2021 at 04:01:07PM -0500, Shiraz Saleem wrote:
> > > > > Add a new generic runtime devlink parameter 'rdma_protocol'
> > > > > and use it in ice PCI driver. Configuration changes result in
> > > > > unplugging the auxiliary RDMA device and re-plugging it with
> > > > > updated values for irdma auxiiary driver to consume at
> > > > > drv.probe()
> > > > >
> > > > > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> > > > >  .../networking/devlink/devlink-params.rst          |  6 ++
> > > > >  Documentation/networking/devlink/ice.rst           | 13 +++
> > > > >  drivers/net/ethernet/intel/ice/ice_devlink.c       | 92
> > +++++++++++++++++++++-
> > > > >  drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
> > > > >  drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
> > > > >  include/net/devlink.h                              |  4 +
> > > > >  net/core/devlink.c                                 |  5 ++
> > > > >  7 files changed, 125 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/Documentation/networking/devlink/devlink-params.rst
> > > > > b/Documentation/networking/devlink/devlink-params.rst
> > > > > index 54c9f10..0b454c3 100644
> > > > > +++ b/Documentation/networking/devlink/devlink-params.rst
> > > > > @@ -114,3 +114,9 @@ own name.
> > > > >         will NACK any attempt of other host to reset the device. This
> parameter
> > > > >         is useful for setups where a device is shared by different hosts,
> such
> > > > >         as multi-host setup.
> > > > > +   * - ``rdma_protocol``
> > > > > +     - string
> > > > > +     - Selects the RDMA protocol selected for multi-protocol devices.
> > > > > +        - ``iwarp`` iWARP
> > > > > +	- ``roce`` RoCE
> > > > > +	- ``ib`` Infiniband
> > > >
> > > > I'm still not sure this belongs in devlink.
> > >
> > > I believe you suggested we use devlink for protocol switch.
> >
> > Yes, devlink is the right place, but selecting a *single* protocol
> > doesn't seem right, or general enough.
> >
> > Parav is talking about generic ways to customize the aux devices
> > created and that would seem to serve the same function as this.
> 
> Is there an RFC or something posted for us to look at?
I do not have polished RFC content ready yet.
But coping the full config sequence snippet from the internal draft (changed for ice example) here as I like to discuss with you in this context.

# (1) show auxiliary device types supported by a given devlink device.
# applies to pci pf,vf,sf. (in general at devlink instance).
$ devlink dev auxdev show pci/0000:06.00.0
pci/0000:06.00.0:
  current:
    roce eth
  new:
  supported:
    roce eth iwarp

# (2) enable iwarp and ethernet type of aux devices and disable roce.
$ devlink dev auxdev set pci/0000:06:00.0 roce off iwarp on

# (3) now see which aux devices will be enable on next reload.
$ devlink dev auxdev show pci/0000:06:00.0
pci/0000:06:00.0:
  current:
    roce eth
  new:
    eth iwarp
  supported:
    roce eth iwarp

# (4) now reload the device and see which aux devices are created.
At this point driver undergoes reconfig for removal of roce and adding iwarp.
$ devlink reload pci/0000:06:00.0

# (5) verify which are the aux devices now activated.
$ devlink dev auxdev show pci/0000:06:00.0
pci/0000:06:00.0:
  current:
    roce eth
  new:
  supported:
    roce eth iwarp

Above 'new' section is only shown when its set. (similar to devlink resource).
In command two vendor driver can fail the call when iwarp and roce both enabled by user.
mlx5_core doesn't have iwarp, but its vdpa type of device. And for other cases we just want to enable roce disabling eth and vdpa.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-12 19:07           ` Parav Pandit
@ 2021-04-13  4:03             ` Parav Pandit
  2021-04-13 14:40             ` Saleem, Shiraz
  1 sibling, 0 replies; 52+ messages in thread
From: Parav Pandit @ 2021-04-13  4:03 UTC (permalink / raw)
  To: Parav Pandit, Saleem, Shiraz, Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, Lacombe, John S, netdev,
	Ertman, David M, Nguyen, Anthony L, Williams, Dan J, Hefty, Sean,
	Keller, Jacob E



> From: Parav Pandit <parav@nvidia.com>
> Sent: Tuesday, April 13, 2021 12:38 AM
> 
> > From: Saleem, Shiraz <shiraz.saleem@intel.com>
> > Sent: Monday, April 12, 2021 8:21 PM
> >
> > > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> > >
> > > On Wed, Apr 07, 2021 at 08:58:25PM +0000, Saleem, Shiraz wrote:
> > > > > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> > > > >
> > > > > On Tue, Apr 06, 2021 at 04:01:07PM -0500, Shiraz Saleem wrote:
> > > > > > Add a new generic runtime devlink parameter 'rdma_protocol'
> > > > > > and use it in ice PCI driver. Configuration changes result in
> > > > > > unplugging the auxiliary RDMA device and re-plugging it with
> > > > > > updated values for irdma auxiiary driver to consume at
> > > > > > drv.probe()
> > > > > >
> > > > > > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> > > > > >  .../networking/devlink/devlink-params.rst          |  6 ++
> > > > > >  Documentation/networking/devlink/ice.rst           | 13 +++
> > > > > >  drivers/net/ethernet/intel/ice/ice_devlink.c       | 92
> > > +++++++++++++++++++++-
> > > > > >  drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
> > > > > >  drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
> > > > > >  include/net/devlink.h                              |  4 +
> > > > > >  net/core/devlink.c                                 |  5 ++
> > > > > >  7 files changed, 125 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git
> > > > > > a/Documentation/networking/devlink/devlink-params.rst
> > > > > > b/Documentation/networking/devlink/devlink-params.rst
> > > > > > index 54c9f10..0b454c3 100644
> > > > > > +++ b/Documentation/networking/devlink/devlink-params.rst
> > > > > > @@ -114,3 +114,9 @@ own name.
> > > > > >         will NACK any attempt of other host to reset the
> > > > > > device. This
> > parameter
> > > > > >         is useful for setups where a device is shared by
> > > > > > different hosts,
> > such
> > > > > >         as multi-host setup.
> > > > > > +   * - ``rdma_protocol``
> > > > > > +     - string
> > > > > > +     - Selects the RDMA protocol selected for multi-protocol devices.
> > > > > > +        - ``iwarp`` iWARP
> > > > > > +	- ``roce`` RoCE
> > > > > > +	- ``ib`` Infiniband
> > > > >
> > > > > I'm still not sure this belongs in devlink.
> > > >
> > > > I believe you suggested we use devlink for protocol switch.
> > >
> > > Yes, devlink is the right place, but selecting a *single* protocol
> > > doesn't seem right, or general enough.
> > >
> > > Parav is talking about generic ways to customize the aux devices
> > > created and that would seem to serve the same function as this.
> >
> > Is there an RFC or something posted for us to look at?
> I do not have polished RFC content ready yet.
> But coping the full config sequence snippet from the internal draft (changed
> for ice example) here as I like to discuss with you in this context.
> 
> # (1) show auxiliary device types supported by a given devlink device.
> # applies to pci pf,vf,sf. (in general at devlink instance).
> $ devlink dev auxdev show pci/0000:06.00.0
> pci/0000:06.00.0:
>   current:
>     roce eth
>   new:
>   supported:
>     roce eth iwarp
> 
> # (2) enable iwarp and ethernet type of aux devices and disable roce.
> $ devlink dev auxdev set pci/0000:06:00.0 roce off iwarp on
> 
> # (3) now see which aux devices will be enable on next reload.
> $ devlink dev auxdev show pci/0000:06:00.0
> pci/0000:06:00.0:
>   current:
>     roce eth
>   new:
>     eth iwarp
>   supported:
>     roce eth iwarp
> 
> # (4) now reload the device and see which aux devices are created.
> At this point driver undergoes reconfig for removal of roce and adding iwarp.
> $ devlink reload pci/0000:06:00.0
> 
> # (5) verify which are the aux devices now activated.
> $ devlink dev auxdev show pci/0000:06:00.0
> pci/0000:06:00.0:
>   current:
>     roce eth
This was copy paste error from my command #1.
It should be "roce eth" should be "iwarp eth".
This is because in step #3, iwarp was enabled.

>   new:
>   supported:
>     roce eth iwarp
> 
> Above 'new' section is only shown when its set. (similar to devlink resource).
> In command two vendor driver can fail the call when iwarp and roce both
> enabled by user.
> mlx5_core doesn't have iwarp, but its vdpa type of device. And for other
> cases we just want to enable roce disabling eth and vdpa.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-12 19:07           ` Parav Pandit
  2021-04-13  4:03             ` Parav Pandit
@ 2021-04-13 14:40             ` Saleem, Shiraz
  2021-04-13 17:36               ` Parav Pandit
  1 sibling, 1 reply; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-13 14:40 UTC (permalink / raw)
  To: Parav Pandit, Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, Lacombe, John S, netdev,
	Ertman, David M, Nguyen, Anthony L, Williams, Dan J, Hefty, Sean,
	Keller, Jacob E

> Subject: RE: [PATCH v4 05/23] ice: Add devlink params support
> 
> 
> 
> > From: Saleem, Shiraz <shiraz.saleem@intel.com>
> > Sent: Monday, April 12, 2021 8:21 PM
> >
> > > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> > >
> > > On Wed, Apr 07, 2021 at 08:58:25PM +0000, Saleem, Shiraz wrote:
> > > > > Subject: Re: [PATCH v4 05/23] ice: Add devlink params support
> > > > >
> > > > > On Tue, Apr 06, 2021 at 04:01:07PM -0500, Shiraz Saleem wrote:
> > > > > > Add a new generic runtime devlink parameter 'rdma_protocol'
> > > > > > and use it in ice PCI driver. Configuration changes result in
> > > > > > unplugging the auxiliary RDMA device and re-plugging it with
> > > > > > updated values for irdma auxiiary driver to consume at
> > > > > > drv.probe()
> > > > > >
> > > > > > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> > > > > >  .../networking/devlink/devlink-params.rst          |  6 ++
> > > > > >  Documentation/networking/devlink/ice.rst           | 13 +++
> > > > > >  drivers/net/ethernet/intel/ice/ice_devlink.c       | 92
> > > +++++++++++++++++++++-
> > > > > >  drivers/net/ethernet/intel/ice/ice_devlink.h       |  5 ++
> > > > > >  drivers/net/ethernet/intel/ice/ice_main.c          |  2 +
> > > > > >  include/net/devlink.h                              |  4 +
> > > > > >  net/core/devlink.c                                 |  5 ++
> > > > > >  7 files changed, 125 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git
> > > > > > a/Documentation/networking/devlink/devlink-params.rst
> > > > > > b/Documentation/networking/devlink/devlink-params.rst
> > > > > > index 54c9f10..0b454c3 100644
> > > > > > +++ b/Documentation/networking/devlink/devlink-params.rst
> > > > > > @@ -114,3 +114,9 @@ own name.
> > > > > >         will NACK any attempt of other host to reset the
> > > > > > device. This
> > parameter
> > > > > >         is useful for setups where a device is shared by
> > > > > > different hosts,
> > such
> > > > > >         as multi-host setup.
> > > > > > +   * - ``rdma_protocol``
> > > > > > +     - string
> > > > > > +     - Selects the RDMA protocol selected for multi-protocol devices.
> > > > > > +        - ``iwarp`` iWARP
> > > > > > +	- ``roce`` RoCE
> > > > > > +	- ``ib`` Infiniband
> > > > >
> > > > > I'm still not sure this belongs in devlink.
> > > >
> > > > I believe you suggested we use devlink for protocol switch.
> > >
> > > Yes, devlink is the right place, but selecting a *single* protocol
> > > doesn't seem right, or general enough.
> > >
> > > Parav is talking about generic ways to customize the aux devices
> > > created and that would seem to serve the same function as this.
> >
> > Is there an RFC or something posted for us to look at?
> I do not have polished RFC content ready yet.
> But coping the full config sequence snippet from the internal draft (changed for ice
> example) here as I like to discuss with you in this context.

Thanks Parav! Some comments below.

> 
> # (1) show auxiliary device types supported by a given devlink device.
> # applies to pci pf,vf,sf. (in general at devlink instance).
> $ devlink dev auxdev show pci/0000:06.00.0
> pci/0000:06.00.0:
>   current:
>     roce eth
>   new:
>   supported:
>     roce eth iwarp
> 
> # (2) enable iwarp and ethernet type of aux devices and disable roce.
> $ devlink dev auxdev set pci/0000:06:00.0 roce off iwarp on
> 
> # (3) now see which aux devices will be enable on next reload.
> $ devlink dev auxdev show pci/0000:06:00.0
> pci/0000:06:00.0:
>   current:
>     roce eth
>   new:
>     eth iwarp
>   supported:
>     roce eth iwarp
> 
> # (4) now reload the device and see which aux devices are created.
> At this point driver undergoes reconfig for removal of roce and adding iwarp.
> $ devlink reload pci/0000:06:00.0

I see this is modeled like devlink resource.

Do we really to need a PCI driver re-init to switch the type of the auxdev hanging off the PCI dev?

Why not just allow the setting to apply dynamically during a 'set' itself with an unplug/plug
of the auxdev with correct type.

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-13 14:40             ` Saleem, Shiraz
@ 2021-04-13 17:36               ` Parav Pandit
  2021-04-14  0:21                 ` Saleem, Shiraz
  0 siblings, 1 reply; 52+ messages in thread
From: Parav Pandit @ 2021-04-13 17:36 UTC (permalink / raw)
  To: Saleem, Shiraz, Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, Lacombe, John S, netdev,
	Ertman, David M, Nguyen, Anthony L, Williams, Dan J, Hefty, Sean,
	Keller, Jacob E



> From: Saleem, Shiraz <shiraz.saleem@intel.com>
> Sent: Tuesday, April 13, 2021 8:11 PM
[..]

> > > > Parav is talking about generic ways to customize the aux devices
> > > > created and that would seem to serve the same function as this.
> > >
> > > Is there an RFC or something posted for us to look at?
> > I do not have polished RFC content ready yet.
> > But coping the full config sequence snippet from the internal draft
> > (changed for ice
> > example) here as I like to discuss with you in this context.
> 
> Thanks Parav! Some comments below.
> 
> >
> > # (1) show auxiliary device types supported by a given devlink device.
> > # applies to pci pf,vf,sf. (in general at devlink instance).
> > $ devlink dev auxdev show pci/0000:06.00.0
> > pci/0000:06.00.0:
> >   current:
> >     roce eth
> >   new:
> >   supported:
> >     roce eth iwarp
> >
> > # (2) enable iwarp and ethernet type of aux devices and disable roce.
> > $ devlink dev auxdev set pci/0000:06:00.0 roce off iwarp on
> >
> > # (3) now see which aux devices will be enable on next reload.
> > $ devlink dev auxdev show pci/0000:06:00.0
> > pci/0000:06:00.0:
> >   current:
> >     roce eth
> >   new:
> >     eth iwarp
> >   supported:
> >     roce eth iwarp
> >
> > # (4) now reload the device and see which aux devices are created.
> > At this point driver undergoes reconfig for removal of roce and adding
> iwarp.
> > $ devlink reload pci/0000:06:00.0
> 
> I see this is modeled like devlink resource.
> 
> Do we really to need a PCI driver re-init to switch the type of the auxdev
> hanging off the PCI dev?
> 
I don't see a need to re-init the whole PCI driver. Since only aux device config is changed only that piece to get reloaded.

> Why not just allow the setting to apply dynamically during a 'set' itself with an
> unplug/plug of the auxdev with correct type.
> 
This suggestion came up in the internal discussion too.
However such task needs to synchronize with devlink reload command and also with driver remove() sequence.
So locking wise and depending on amount of config change, it is close to what reload will do.
For example other resource config or other params setting also to take effect.
So to avoid defining multiple config sequence, doing as part of already existing devlink reload, it brings simple sequence to user.

For example,
1. enable/disable desired aux devices
2. configure device resources
3. set some device params
4. do devlink reload and apply settings done in #1 to #3

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-13 17:36               ` Parav Pandit
@ 2021-04-14  0:21                 ` Saleem, Shiraz
  2021-04-14  5:27                   ` Parav Pandit
  2021-04-18 11:51                   ` Leon Romanovsky
  0 siblings, 2 replies; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-14  0:21 UTC (permalink / raw)
  To: Parav Pandit, Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, Lacombe, John S, netdev,
	Ertman, David M, Nguyen, Anthony L, Williams, Dan J, Hefty, Sean,
	Keller, Jacob E

> Subject: RE: [PATCH v4 05/23] ice: Add devlink params support
> 
> 
> 
> > From: Saleem, Shiraz <shiraz.saleem@intel.com>
> > Sent: Tuesday, April 13, 2021 8:11 PM
> [..]
> 
> > > > > Parav is talking about generic ways to customize the aux devices
> > > > > created and that would seem to serve the same function as this.
> > > >
> > > > Is there an RFC or something posted for us to look at?
> > > I do not have polished RFC content ready yet.
> > > But coping the full config sequence snippet from the internal draft
> > > (changed for ice
> > > example) here as I like to discuss with you in this context.
> >
> > Thanks Parav! Some comments below.
> >
> > >
> > > # (1) show auxiliary device types supported by a given devlink device.
> > > # applies to pci pf,vf,sf. (in general at devlink instance).
> > > $ devlink dev auxdev show pci/0000:06.00.0
> > > pci/0000:06.00.0:
> > >   current:
> > >     roce eth
> > >   new:
> > >   supported:
> > >     roce eth iwarp
> > >
> > > # (2) enable iwarp and ethernet type of aux devices and disable roce.
> > > $ devlink dev auxdev set pci/0000:06:00.0 roce off iwarp on
> > >
> > > # (3) now see which aux devices will be enable on next reload.
> > > $ devlink dev auxdev show pci/0000:06:00.0
> > > pci/0000:06:00.0:
> > >   current:
> > >     roce eth
> > >   new:
> > >     eth iwarp
> > >   supported:
> > >     roce eth iwarp
> > >
> > > # (4) now reload the device and see which aux devices are created.
> > > At this point driver undergoes reconfig for removal of roce and
> > > adding
> > iwarp.
> > > $ devlink reload pci/0000:06:00.0
> >
> > I see this is modeled like devlink resource.
> >
> > Do we really to need a PCI driver re-init to switch the type of the
> > auxdev hanging off the PCI dev?
> >
> I don't see a need to re-init the whole PCI driver. Since only aux device config is
> changed only that piece to get reloaded.

But that is what mlx5 and other implementations does on reload no? i.e. a PCI driver reinit.
I can see an ice implementation of reload morphing to similar over time to support a new config
that requires a true reinit of PCI driver entities.

> 
> > Why not just allow the setting to apply dynamically during a 'set'
> > itself with an unplug/plug of the auxdev with correct type.
> >
> This suggestion came up in the internal discussion too.
> However such task needs to synchronize with devlink reload command and also
> with driver remove() sequence.
> So locking wise and depending on amount of config change, it is close to what
> reload will do.

Holding this mutex across the auxiliary device unplug/plug in "set" wont cut it?
https://elixir.bootlin.com/linux/v5.12-rc7/source/drivers/net/ethernet/mellanox/mlx5/core/main.c#L1304

> For example other resource config or other params setting also to take effect.
> So to avoid defining multiple config sequence, doing as part of already existing
> devlink reload, it brings simple sequence to user.
> 
> For example,
> 1. enable/disable desired aux devices
> 2. configure device resources
> 3. set some device params
> 4. do devlink reload and apply settings done in #1 to #3

Sure. But a user might also just want to operate on just an auxiliary device configuration change. As in #1.
And he ends up having everything hanging off the PF to get blown out, including potentially
the VFs. That feels like too big a hammer.

Shiraz


^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-14  0:21                 ` Saleem, Shiraz
@ 2021-04-14  5:27                   ` Parav Pandit
  2021-04-18 11:51                   ` Leon Romanovsky
  1 sibling, 0 replies; 52+ messages in thread
From: Parav Pandit @ 2021-04-14  5:27 UTC (permalink / raw)
  To: Saleem, Shiraz, Jason Gunthorpe, Jiri Pirko
  Cc: dledford, kuba, davem, linux-rdma, Lacombe, John S, netdev,
	Ertman, David M, Nguyen, Anthony L, Williams, Dan J, Hefty, Sean,
	Keller, Jacob E

+ Jiri.

> From: Saleem, Shiraz <shiraz.saleem@intel.com>
> Sent: Wednesday, April 14, 2021 5:51 AM
> 
> > Subject: RE: [PATCH v4 05/23] ice: Add devlink params support
> >
> >
> >
> > > From: Saleem, Shiraz <shiraz.saleem@intel.com>
> > > Sent: Tuesday, April 13, 2021 8:11 PM
> > [..]
> >
> > > > > > Parav is talking about generic ways to customize the aux
> > > > > > devices created and that would seem to serve the same function as
> this.
> > > > >
> > > > > Is there an RFC or something posted for us to look at?
> > > > I do not have polished RFC content ready yet.
> > > > But coping the full config sequence snippet from the internal
> > > > draft (changed for ice
> > > > example) here as I like to discuss with you in this context.
> > >
> > > Thanks Parav! Some comments below.
> > >
> > > >
> > > > # (1) show auxiliary device types supported by a given devlink device.
> > > > # applies to pci pf,vf,sf. (in general at devlink instance).
> > > > $ devlink dev auxdev show pci/0000:06.00.0
> > > > pci/0000:06.00.0:
> > > >   current:
> > > >     roce eth
> > > >   new:
> > > >   supported:
> > > >     roce eth iwarp
> > > >
> > > > # (2) enable iwarp and ethernet type of aux devices and disable roce.
> > > > $ devlink dev auxdev set pci/0000:06:00.0 roce off iwarp on
> > > >
> > > > # (3) now see which aux devices will be enable on next reload.
> > > > $ devlink dev auxdev show pci/0000:06:00.0
> > > > pci/0000:06:00.0:
> > > >   current:
> > > >     roce eth
> > > >   new:
> > > >     eth iwarp
> > > >   supported:
> > > >     roce eth iwarp
> > > >
> > > > # (4) now reload the device and see which aux devices are created.
> > > > At this point driver undergoes reconfig for removal of roce and
> > > > adding
> > > iwarp.
> > > > $ devlink reload pci/0000:06:00.0
> > >
> > > I see this is modeled like devlink resource.
> > >
> > > Do we really to need a PCI driver re-init to switch the type of the
> > > auxdev hanging off the PCI dev?
> > >
> > I don't see a need to re-init the whole PCI driver. Since only aux
> > device config is changed only that piece to get reloaded.
> 
> But that is what mlx5 and other implementations does on reload no? i.e. a
> PCI driver reinit.
Currently yes, reload does PCI re-init.
However I am not seeing the value of reload if no config (param, resource, auxdev) is changed.

> I can see an ice implementation of reload morphing to similar over time to
> support a new config that requires a true reinit of PCI driver entities.
> 
Sure.

> >
> > > Why not just allow the setting to apply dynamically during a 'set'
> > > itself with an unplug/plug of the auxdev with correct type.
> > >
> > This suggestion came up in the internal discussion too.
> > However such task needs to synchronize with devlink reload command and
> > also with driver remove() sequence.
> > So locking wise and depending on amount of config change, it is close
> > to what reload will do.
> 
> Holding this mutex across the auxiliary device unplug/plug in "set" wont cut
> it?
> https://elixir.bootlin.com/linux/v5.12-
> rc7/source/drivers/net/ethernet/mellanox/mlx5/core/main.c#L1304
> 
Currently devlink reload for mlx5 is source of lockdep assert, use after free access and a deadlock in net ns. :-(
Multiple of us (Leon, Saeed, Moshe) working on it resolve it.
So I want to stay away from intf_mutex for now.

> > For example other resource config or other params setting also to take
> effect.
> > So to avoid defining multiple config sequence, doing as part of
> > already existing devlink reload, it brings simple sequence to user.
> >
> > For example,
> > 1. enable/disable desired aux devices
> > 2. configure device resources
> > 3. set some device params
> > 4. do devlink reload and apply settings done in #1 to #3
> 
> Sure. But a user might also just want to operate on just an auxiliary device
> configuration change. As in #1.
> And he ends up having everything hanging off the PF to get blown out,
> including potentially the VFs. That feels like too big a hammer.
This is certainly not desired.

If we want aux device enable/disable to take effect when its done without reload than above flow should be redefined as,

1. configure device resources (optional)
2. set some device params (optional)
3. enable/disable desired aux devices

Step-3 needs to apply the settings of (1) and (2) without user doing devlink reload.
devlink core doesn't know on step #3, that reload_down() and reload_up() to be done.
So driver internally needs to implement reload_down(), up() on callback of #3.
This builds parallel framework to devlink reload.

Jiri,
What do you think of it?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH v4 01/23] iidc: Introduce iidc.h
  2021-04-12 16:12           ` Jason Gunthorpe
@ 2021-04-15 17:36             ` Saleem, Shiraz
  0 siblings, 0 replies; 52+ messages in thread
From: Saleem, Shiraz @ 2021-04-15 17:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, kuba, davem, linux-rdma, netdev, Ertman, David M,
	Nguyen, Anthony L, Williams, Dan J, Hefty, Sean, Lacombe, John S

> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h
> 
> On Mon, Apr 12, 2021 at 02:50:43PM +0000, Saleem, Shiraz wrote:
> 

[....]

> > There is a near-term Intel ethernet VF driver which will use IIDC to
> > provide RDMA in the VF, and implement some of these .ops callbacks.
> > There is also intent to move i40e to IIDC.
> 
> "near-term" We are now on year three of Intel talking about this driver!
> 
> Get the bulk of the thing merged and deal with the rest in followup patches.

We will submit with symbols exported from ice and direct calls from irdma.

Shiraz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 05/23] ice: Add devlink params support
  2021-04-14  0:21                 ` Saleem, Shiraz
  2021-04-14  5:27                   ` Parav Pandit
@ 2021-04-18 11:51                   ` Leon Romanovsky
  1 sibling, 0 replies; 52+ messages in thread
From: Leon Romanovsky @ 2021-04-18 11:51 UTC (permalink / raw)
  To: Saleem, Shiraz
  Cc: Parav Pandit, Jason Gunthorpe, dledford, kuba, davem, linux-rdma,
	Lacombe, John S, netdev, Ertman, David M, Nguyen, Anthony L,
	Williams, Dan J, Hefty, Sean, Keller, Jacob E

On Wed, Apr 14, 2021 at 12:21:08AM +0000, Saleem, Shiraz wrote:
> > Subject: RE: [PATCH v4 05/23] ice: Add devlink params support

<...>

> > > Why not just allow the setting to apply dynamically during a 'set'
> > > itself with an unplug/plug of the auxdev with correct type.
> > >
> > This suggestion came up in the internal discussion too.
> > However such task needs to synchronize with devlink reload command and also
> > with driver remove() sequence.
> > So locking wise and depending on amount of config change, it is close to what
> > reload will do.
> 
> Holding this mutex across the auxiliary device unplug/plug in "set" wont cut it?
> https://elixir.bootlin.com/linux/v5.12-rc7/source/drivers/net/ethernet/mellanox/mlx5/core/main.c#L1304

Like Parav said, we are working to fix it and already have one working
solution, unfortunately it has one eyebrow raising change and we are
trying another one.

You can take a look here to get sense of the scope:
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=devlink-core

Thanks

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2021-04-18 11:51 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-06 21:01 [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 01/23] iidc: Introduce iidc.h Shiraz Saleem
2021-04-07 15:44   ` Jason Gunthorpe
2021-04-07 20:58     ` Saleem, Shiraz
2021-04-07 22:43       ` Jason Gunthorpe
2021-04-08  7:14         ` Leon Romanovsky
2021-04-09  1:38           ` Saleem, Shiraz
2021-04-11 11:48             ` Leon Romanovsky
2021-04-12 14:50         ` Saleem, Shiraz
2021-04-12 16:12           ` Jason Gunthorpe
2021-04-15 17:36             ` Saleem, Shiraz
2021-04-07 17:35   ` Jason Gunthorpe
2021-04-12 14:51     ` Saleem, Shiraz
2021-04-06 21:01 ` [PATCH v4 02/23] ice: Initialize RDMA support Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 03/23] ice: Implement iidc operations Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 04/23] ice: Register auxiliary device to provide RDMA Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 05/23] ice: Add devlink params support Shiraz Saleem
2021-04-07 14:57   ` Jason Gunthorpe
2021-04-07 20:58     ` Saleem, Shiraz
2021-04-07 22:46       ` Jason Gunthorpe
2021-04-12 14:50         ` Saleem, Shiraz
2021-04-12 19:07           ` Parav Pandit
2021-04-13  4:03             ` Parav Pandit
2021-04-13 14:40             ` Saleem, Shiraz
2021-04-13 17:36               ` Parav Pandit
2021-04-14  0:21                 ` Saleem, Shiraz
2021-04-14  5:27                   ` Parav Pandit
2021-04-18 11:51                   ` Leon Romanovsky
2021-04-06 21:01 ` [PATCH v4 06/23] i40e: Prep i40e header for aux bus conversion Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 07/23] i40e: Register auxiliary devices to provide RDMA Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 08/23] RDMA/irdma: Register auxiliary driver and implement private channel OPs Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 09/23] RDMA/irdma: Implement device initialization definitions Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 10/23] RDMA/irdma: Implement HW Admin Queue OPs Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 11/23] RDMA/irdma: Add HMC backing store setup functions Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 12/23] RDMA/irdma: Add privileged UDA queue implementation Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 13/23] RDMA/irdma: Add QoS definitions Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 14/23] RDMA/irdma: Add connection manager Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 15/23] RDMA/irdma: Add PBLE resource manager Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 16/23] RDMA/irdma: Implement device supported verb APIs Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 17/23] RDMA/irdma: Add RoCEv2 UD OP support Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 18/23] RDMA/irdma: Add user/kernel shared libraries Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 19/23] RDMA/irdma: Add miscellaneous utility definitions Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 20/23] RDMA/irdma: Add dynamic tracing for CM Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 21/23] RDMA/irdma: Add ABI definitions Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 22/23] RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw Shiraz Saleem
2021-04-06 21:01 ` [PATCH v4 23/23] RDMA/irdma: Update MAINTAINERS file Shiraz Saleem
2021-04-06 21:05 ` [PATCH v4 00/23] Add Intel Ethernet Protocol Driver for RDMA (irdma) Saleem, Shiraz
2021-04-06 23:15 ` Jason Gunthorpe
2021-04-06 23:30   ` Saleem, Shiraz
2021-04-07  0:18     ` Saleem, Shiraz
2021-04-07 11:31     ` Jason Gunthorpe
2021-04-07 15:06       ` Saleem, Shiraz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).